[
https://issues.apache.org/jira/browse/MAPREDUCE-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sandy Ryza updated MAPREDUCE-5601:
----------------------------------
Issue Type: Improvement (was: Bug)
> Fetches when reducer can't fit them result in unnecessary reads on shuffle
> server
> ---------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5601
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5601
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Affects Versions: 2.2.0
> Reporter: Sandy Ryza
> Assignee: Sandy Ryza
>
> When a reducer initiates a fetch request, it does not know whether it will be
> able to fit the fetched data in memory. The first part of the response tells
> how much data will be coming. If space is not currently available, the
> reduce will abandon its request and try again later. Unfortunately, this has
> some consequences on the server side - it forces unnecessary disk and network
> IO as the server begins to read the output data that will go nowhere. Also,
> when the channel is closed, it triggers an fadvise DONTNEED that causes the
> data region to be evicted from the OS page cache. Meaning that the next time
> it's asked for, it will definitely be read from disk, even if it happened to
> be in the page cache before the request.
> I noticed this when trying to figure out why my job was doing so much more
> disk IO in MR2 than in MR1. When I turned the fadvise stuff off, I found
> that disk reads went to nearly 0 on machines that had enough memory to fit
> map outputs into the page cache. I then straced the NodeManager noticed that
> there were over four times as many fadvise DONTNEED calls as map-reduce
> pairs. Further logging showed the same map outputs being fetched about this
> many times.
> The fix would be to reserve space in the reducer before fetching the data.
> Currently the fetching the size of the data and fetching the actual data
> happen in the same HTTP request. Fixing it would require doing these in
> separate HTTP requests. Or transferring the sizes through the AM.
--
This message was sent by Atlassian JIRA
(v6.1#6144)