[ 
https://issues.apache.org/jira/browse/HADOOP-10048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15301081#comment-15301081
 ] 

Junping Du commented on HADOOP-10048:
-------------------------------------

bq.  No, that could trigger an array bounds exception if we update it to a 
value that is past the number of directories in the other, unrelated context. 
Also we don't need to worry about this particular race. 
We can simply move {{dirNum % numDirs}} ahead of {{ctx.dirDF[dirNum]}} to get 
rid of the array-out-of-bounds issue. However, I agree that this particular 
race is not important, given that the value of dirNumLastAccessed could mean 
something different in a different context. 
Within the same context, marking dirNumLastAccessed as volatile could still 
leave multiple threads with the same dirNumLastAccessed if {{int dirNum = 
ctx.dirNumLastAccessed;}} is executed at almost the same time. In that case, 
the previous round-robin selection among disks with available capacity is 
broken, so we may want to use a random pick instead. Otherwise, disk accesses 
could concentrate on one particular disk. Thoughts?
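
To make the two points above concrete, here is a minimal sketch, not the 
patch itself. DirPicker, availableBytes, pickFrom and pick are hypothetical 
names used only for illustration; availableBytes stands in for the per-disk 
free space that {{ctx.dirDF}} would report. It shows the wrap being applied 
before the array is indexed, and a random starting directory replacing the 
shared counter.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-in for the per-context state discussed above. This is
// not the LocalDirAllocator code; every name here is illustrative.
class DirPicker {
    private final long[] availableBytes; // assumed free space per directory

    DirPicker(long[] availableBytes) {
        this.availableBytes = availableBytes.clone();
    }

    // Scans all directories beginning at an arbitrary starting value and
    // returns the index of the first one with at least `size` bytes free,
    // or -1 if none qualifies.
    int pickFrom(int start, long size) {
        int numDirs = availableBytes.length;
        if (numDirs == 0) {
            return -1;
        }
        // Wrap first, index second: a stale, negative, or too-large start
        // value can never produce an out-of-bounds index.
        int base = Math.floorMod(start, numDirs);
        for (int i = 0; i < numDirs; i++) {
            int dirNum = (base + i) % numDirs;
            if (availableBytes[dirNum] >= size) {
                return dirNum;
            }
        }
        return -1;
    }

    // Random starting point: there is no shared counter for threads to read
    // at the same time, so concurrent callers are spread across directories
    // on average rather than in strict round-robin order.
    int pick(long size) {
        int numDirs = availableBytes.length;
        if (numDirs == 0) {
            return -1;
        }
        return pickFrom(ThreadLocalRandom.current().nextInt(numDirs), size);
    }
}
```

The trade-off is that a random start gives up strict rotation in exchange 
for not needing any synchronization on the last-accessed index.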

bq. Since this is not related to this change and could degrade the error 
diagnostics in some corner cases, I'm tempted to leave it as-is.  If we feel 
it's important to fix it then we can tackle it in a followup JIRA where it does 
the full file stat first, checks the corner cases, then calls mkdirs if 
necessary.
That sounds like a reasonable plan. We can discuss this later in another JIRA.

> LocalDirAllocator should avoid holding locks while accessing the filesystem
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10048
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10048
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 2.3.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: HADOOP-10048.003.patch, HADOOP-10048.004.patch, 
> HADOOP-10048.005.patch, HADOOP-10048.patch, HADOOP-10048.trunk.patch
>
>
> As noted in MAPREDUCE-5584 and HADOOP-7016, LocalDirAllocator can be a 
> bottleneck for multithreaded setups like the ShuffleHandler.  We should 
> consider moving to a lockless design or minimizing the critical sections to a 
> very small amount of time that does not involve I/O operations.



