[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810914#comment-13810914
 ] 

Sandy Ryza commented on MAPREDUCE-5601:
---------------------------------------

bq. Yea... though could be problematic due to parallel fetches.
Right.  It would help a little if the amount remaining was less than the total 
fetched in each request, but wouldn't solve the bigger problem.

bq. I think best would be to add a new field to TaskAttemptCompletionEventProto 
which contains the size of the completed map output.
That's been my thinking too.  Unfortunately the task umbilical protocol is 
still on Writables, so there could be compatibility issues.

> ShuffleHandler fadvises file regions as DONTNEED even when fetch fails
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5601
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5601
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 2.2.0
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>         Attachments: MAPREDUCE-5601.patch, MAPREDUCE-5601.patch, 
> MAPREDUCE-5601.patch
>
>
> When a reducer initiates a fetch request, it does not know whether it will be 
> able to fit the fetched data in memory.  The first part of the response tells 
> how much data will be coming.  If space is not currently available, the 
> reduce will abandon its request and try again later.  When this occurs, the 
> ShuffleHandler still fadvises the file region as DONTNEED.  Meaning that the 
> next time it's asked for, it will definitely be read from disk, even if it 
> happened to be in the page cache before the request.
> I noticed this when trying to figure out why my job was doing so much more 
> disk IO in MR2 than in MR1.  When I turned the fadvise stuff off, I found 
> that disk reads went to nearly 0 on machines that had enough memory to fit 
> map outputs into the page cache.  I then straced the NodeManager and noticed 
> that there were over four times as many fadvise DONTNEED calls as map-reduce 
> pairs.  Further logging showed the same map outputs being fetched about this 
> many times.
> This is a regression from MR1, which only did the fadvise DONTNEED after all 
> the bytes were transferred.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Reply via email to