[
https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017771#comment-14017771
]
Carlo Curino commented on MAPREDUCE-5196:
-----------------------------------------
Answering Remus:
(I am not 100% sure, as I wrote this code over a year ago, but let me try to
recall)
As part of the preemption work we explored doing HDFS-based shuffling.
The benefits of this were:
1) performance enhancements on certain data size ranges (stream-merge on the
reducers)
2) the reducer checkpoint state was much smaller (no data, just the offset of
the last read key from each map)
That was an initial experiment, but making it robust was non-trivial (missing
map outputs were hard to recover), so we didn't push it yet. In that context,
the mapOutput was not on the local FS but on HDFS, and the change you pointed
out was fixing that. But this clearly does not work on Windows. My guess is
that reverting that part should be fine here.
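
To make point 2) above concrete, here is a rough sketch of what an
"offsets only" reducer checkpoint could look like. This is not the code from
the patch: the class name, method names, and the text serialization format are
all made up for illustration, and it assumes each map output can be re-read
from a byte offset.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch (not the MAPREDUCE-5196 code): a reducer checkpoint
 * that stores no shuffled data, only how far each map output has been read.
 */
public class ReducerCheckpointSketch {
    // Map task id -> offset of the last key read from that map's output.
    private final Map<String, Long> offsets = new TreeMap<>();

    /** Record progress on one map output; offsets never move backwards. */
    public void recordProgress(String mapId, long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("negative offset: " + offset);
        }
        offsets.merge(mapId, offset, Math::max);
    }

    /** Offset to resume from after a restart; 0 if the map was never read. */
    public long resumeOffset(String mapId) {
        return offsets.getOrDefault(mapId, 0L);
    }

    /** Serialize as one "mapId=offset" line per map (illustrative format). */
    public String serialize() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : offsets.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Rebuild a checkpoint from the output of serialize(). */
    public static ReducerCheckpointSketch deserialize(String text) {
        ReducerCheckpointSketch cp = new ReducerCheckpointSketch();
        for (String line : text.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            int sep = line.lastIndexOf('=');
            cp.recordProgress(line.substring(0, sep),
                              Long.parseLong(line.substring(sep + 1)));
        }
        return cp;
    }
}
```

The checkpoint size here grows with the number of maps rather than with the
amount of shuffled data, which is the property described above. It only helps
if the map outputs are still readable after the reducer restarts.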
> CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing
> ------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5196
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am, mrv2
> Reporter: Carlo Curino
> Assignee: Carlo Curino
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch,
> MAPREDUCE-5196.3.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch
>
>
> This JIRA tracks a checkpoint-based AM preemption policy. The policy handles
> propagation of the preemption requests received from the RM to the
> appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the
> task state is handled in upcoming JIRAs.
--
This message was sent by Atlassian JIRA
(v6.2#6252)