[
https://issues.apache.org/jira/browse/HADOOP-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301081#comment-15301081
]
Junping Du commented on HADOOP-10048:
-------------------------------------
bq. No, that could trigger an array bounds exception if we update it to a
value that is past the number of directories in the other, unrelated context.
Also we don't need to worry about this particular race.
We can simply move {{dirNum % numDirs}} ahead of {{ctx.dirDF[dirNum]}} to get
rid of the array-out-of-bounds issue. However, I agree that this particular
race is not important, given that the value of dirNumLastAccessed could mean
something different in a different context.
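
To illustrate the reordering, here is a minimal sketch. The class, the
{{available}} array (standing in for {{ctx.dirDF}}) and the {{pick}} method are
hypothetical names for illustration only, not the actual LocalDirAllocator
code; the point is just that the index is wrapped before it is used.

```java
class WrapBeforeIndex {
    // Hypothetical stand-in for the per-context state; not the real Hadoop class.
    static final class Ctx {
        final long[] available;          // plays the role of ctx.dirDF (free bytes per dir)
        volatile int dirNumLastAccessed; // may hold a value larger than available.length

        Ctx(long[] available, int lastAccessed) {
            this.available = available;
            this.dirNumLastAccessed = lastAccessed;
        }
    }

    // Returns the index of the first directory, starting from the last accessed
    // one, that has more than `size` bytes available, or -1 if none does.
    static int pick(Ctx ctx, long size) {
        int numDirs = ctx.available.length;
        int dirNum = ctx.dirNumLastAccessed;
        for (int tried = 0; tried < numDirs; tried++) {
            // Wrap BEFORE indexing, so a stale or foreign value cannot overrun the array.
            dirNum = Math.floorMod(dirNum, numDirs);
            if (ctx.available[dirNum] > size) {
                ctx.dirNumLastAccessed = dirNum + 1;
                return dirNum;
            }
            dirNum++;
        }
        return -1;
    }
}
```

With three directories and a stored value of 7, the first index probed is
7 % 3 = 1 rather than an out-of-range 7.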
Within the same context, marking dirNumLastAccessed as volatile could still
leave multiple threads with the same dirNumLastAccessed if {{int dirNum =
ctx.dirNumLastAccessed;}} is executed by them at almost the same time. In that
case, the previous round-robin selection among disks with available capacity is
broken, so we may want to pick randomly instead. Otherwise, disk accesses could
pile up on one particular disk. Thoughts?
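
A rough sketch of the random alternative, assuming a hypothetical
{{availableBytes}} snapshot of per-directory free space (the class and method
names are made up for illustration and are not part of the patch). Each call
starts its scan at a random index, so threads that arrive together do not all
begin at the same directory, and no shared counter is read or written.

```java
import java.util.concurrent.ThreadLocalRandom;

class RandomStartPick {
    // Returns the index of a directory with more than `size` bytes available,
    // scanning from a random starting point, or -1 if no directory qualifies.
    static int pick(long[] availableBytes, long size) {
        int numDirs = availableBytes.length;
        if (numDirs == 0) {
            return -1;
        }
        // Per-thread random start; avoids sharing a last-accessed index.
        int start = ThreadLocalRandom.current().nextInt(numDirs);
        for (int i = 0; i < numDirs; i++) {
            int dirNum = (start + i) % numDirs;
            if (availableBytes[dirNum] > size) {
                return dirNum;
            }
        }
        return -1;
    }
}
```

This gives up strict round-robin ordering in exchange for not having to
coordinate on dirNumLastAccessed at all.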
bq. Since this is not related to this change and could degrade the error
diagnostics in some corner cases, I'm tempted to leave it as-is. If we feel
it's important to fix it then we can tackle it in a followup JIRA where it does
the full file stat first, checks the corner cases, then calls mkdirs if
necessary.
That sounds like a reasonable plan. We can discuss this later in another JIRA.
> LocalDirAllocator should avoid holding locks while accessing the filesystem
> ---------------------------------------------------------------------------
>
> Key: HADOOP-10048
> URL: https://issues.apache.org/jira/browse/HADOOP-10048
> Project: Hadoop Common
> Issue Type: Improvement
> Affects Versions: 2.3.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: HADOOP-10048.003.patch, HADOOP-10048.004.patch,
> HADOOP-10048.005.patch, HADOOP-10048.patch, HADOOP-10048.trunk.patch
>
>
> As noted in MAPREDUCE-5584 and HADOOP-7016, LocalDirAllocator can be a
> bottleneck for multithreaded setups like the ShuffleHandler. We should
> consider moving to a lockless design or minimizing the critical sections to a
> very small amount of time that does not involve I/O operations.