[
https://issues.apache.org/jira/browse/MAPREDUCE-5196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017641#comment-14017641
]
Remus Rusanu commented on MAPREDUCE-5196:
-----------------------------------------
Hi [~curino],
Can you shed some light on the rationale of this change:
{code}
@@ -1098,8 +1120,8 @@ private long calculateOutputSize() throws IOException {
if (isMapTask() && conf.getNumReduceTasks() > 0) {
try {
Path mapOutput = mapOutputFile.getOutputFile();
- FileSystem localFS = FileSystem.getLocal(conf);
- return localFS.getFileStatus(mapOutput).getLen();
+ FileSystem fs = mapOutput.getFileSystem(conf);
+ return fs.getFileStatus(mapOutput).getLen();
} catch (IOException e) {
LOG.warn ("Could not find output size " , e);
}
{code}
This breaks Windows deployments, as the local files get routed through HDFS:
{code}
c:/Hadoop/Data/Hadoop/local/usercache/HadoopUser/appcache/application_1401693085139_0001/output/attempt_1401693085139_0001_m_000000_0/file.out is not a valid DFS filename.
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:187)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:101)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1024)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1020)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1020)
at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:1124)
at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:1102)
{code}
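To illustrate what I think is happening, here is a minimal plain-Java sketch. It rests on an assumption that the patch itself does not confirm: a path with no URI scheme is served by the cluster's default file system, whereas the old code always asked for the local one. The class name, method name, default URI and sample paths are all hypothetical, and this is a simplified model rather than Hadoop code:

```java
import java.net.URI;

/** Simplified, hypothetical model of scheme-based file system selection (not Hadoop code). */
final class FsSchemeSketch {

    /**
     * Returns the scheme of the file system that would serve the path:
     * the path's own scheme if it has one, otherwise the default file system's.
     */
    static String schemeFor(String path, URI defaultFs) {
        int colon = path.indexOf(':');
        int slash = path.indexOf('/');
        // Assumption: a single letter before the colon is a Windows drive, not a scheme.
        boolean windowsDrive = colon == 1 && Character.isLetter(path.charAt(0));
        boolean hasScheme = colon > 0 && !windowsDrive && (slash == -1 || colon < slash);
        return hasScheme ? path.substring(0, colon) : defaultFs.getScheme();
    }

    public static void main(String[] args) {
        // Assumed cluster default; a real deployment would read this from its configuration.
        URI defaultFs = URI.create("hdfs://namenode:8020");

        // Scheme-less local path, shaped like the one in the stack trace above.
        System.out.println(schemeFor("c:/tmp/output/file.out", defaultFs));

        // The same file with an explicit scheme stays on the local file system.
        System.out.println(schemeFor("file:///c:/tmp/output/file.out", defaultFs));
    }
}
```

If this model matches the real behaviour, either qualifying the map output path with an explicit local scheme or going back to FileSystem.getLocal(conf) should keep the lookup off HDFS.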
> CheckpointAMPreemptionPolicy implements preemption in MR AM via checkpointing
> ------------------------------------------------------------------------------
>
> Key: MAPREDUCE-5196
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5196
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am, mrv2
> Reporter: Carlo Curino
> Assignee: Carlo Curino
> Fix For: 3.0.0
>
> Attachments: MAPREDUCE-5196.1.patch, MAPREDUCE-5196.2.patch,
> MAPREDUCE-5196.3.patch, MAPREDUCE-5196.patch, MAPREDUCE-5196.patch
>
>
> This JIRA tracks a checkpoint-based AM preemption policy. The policy handles
> propagation of the preemption requests received from the RM to the
> appropriate tasks, and bookkeeping of checkpoints. Actual checkpointing of the
> task state is handled in upcoming JIRAs.