[
https://issues.apache.org/jira/browse/MAPREDUCE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated MAPREDUCE-4946:
----------------------------------
Attachment: MAPREDUCE-4946.patch
Attaching a patch that works around the problem by caching the type conversion
of the map completion event.
It's a bit messy since previously we relied on the fact that updating the task
completion event during a fetch failure implicitly updated map completion event
since they were the same object. When it's cached separately, we need a way to
quickly lookup the cached map event from the task event so it can be updated as
well.
Manually tested this on a large cluster, and it seems to make things better
wrt. a thundering herd of reducers trying to connect.
A cleaner fix would be to convert TaskUmbilicalProtocol to use
TaskAttemptCompletionEvents directly so we don't need to convert them at all
except when honoring the
org.apache.hadoop.mapreduce.Job.getTaskAttemptCompletionEvents interface which
is unlikely to be called in a performance-critical scenario.
There's also the performance issue of the locking within the PBImpl classes,
but that's an issue probably best left to another JIRA.
> Type conversion of map completion events leads to performance problems with
> large jobs
> --------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4946
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4946
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mr-am
> Affects Versions: 2.0.2-alpha, 0.23.5
> Reporter: Jason Lowe
> Priority: Critical
> Attachments: MAPREDUCE-4946.patch
>
>
> We've seen issues with large jobs (e.g.: 13,000 maps and 3,500 reduces) where
> reducers fail to connect back to the AM after being launched due to
> connection timeout. Looking at stack traces of the AM during this time we
> see a lot of IPC servers stuck waiting for a lock to get the application ID
> while type converting the map completion events. What's odd is that normally
> getting the application ID should be very cheap, but in this case we're
> type-converting thousands of map completion events for *each* reducer
> connecting. That means we end up type-converting the map completion events
> over 45 million times during the lifetime of the example job (13,000 * 3,500).
> We either need to make the type conversion much cheaper (i.e.: lockless or at
> least read-write locked) or, even better, store the completion events in a
> form that does not require type conversion when serving them up to reducers.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira