[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030522#comment-14030522 ]

Hudson commented on MAPREDUCE-5912:
-----------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #582 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/582/])
MAPREDUCE-5912. Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196. Contributed by Remus Rusanu. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602282)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java

Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
---------------------------------------------------------------------------

Key: MAPREDUCE-5912
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5912
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: client
Affects Versions: 3.0.0
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Fix For: 3.0.0
Attachments: MAPREDUCE-5912.1.patch

{code}
@@ -1098,8 +1120,8 @@ private long calculateOutputSize() throws IOException {
     if (isMapTask() && conf.getNumReduceTasks() > 0) {
       try {
         Path mapOutput = mapOutputFile.getOutputFile();
-        FileSystem localFS = FileSystem.getLocal(conf);
-        return localFS.getFileStatus(mapOutput).getLen();
+        FileSystem fs = mapOutput.getFileSystem(conf);
+        return fs.getFileStatus(mapOutput).getLen();
       } catch (IOException e) {
         LOG.warn ("Could not find output size " , e);
       }
{code}

causes Windows local output files to be routed through HDFS:

{code}
2014-06-02 00:14:53,891 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.IllegalArgumentException: Pathname /c:/Hadoop/Data/Hadoop/local/usercache/HadoopUser/appcache/application_1401693085139_0001/output/attempt_1401693085139_0001_m_00_0/file.out from c:/Hadoop/Data/Hadoop/local/usercache/HadoopUser/appcache/application_1401693085139_0001/output/attempt_1401693085139_0001_m_00_0/file.out is not a valid DFS filename.
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:187)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:101)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1024)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1020)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1020)
    at org.apache.hadoop.mapred.Task.calculateOutputSize(Task.java:1124)
    at org.apache.hadoop.mapred.Task.sendLastUpdate(Task.java:1102)
    at org.apache.hadoop.mapred.Task.done(Task.java:1048)
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
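The failure chain in the report above can be modelled without Hadoop on the classpath. The sketch below is only an illustration built on java.net.URI. The class name, the namenode address, the shortened path, and the simplified name check (absolute path, no component containing a colon) are all assumptions made for the example and are not taken from the Hadoop sources. It shows why a path without a scheme ends up on the default file system, and why a Windows drive letter then makes the name unacceptable there.

```java
import java.net.URI;

// Illustrative model only; these are not Hadoop classes.
class DefaultFsFallbackSketch {
    // Assumed default file system, standing in for a cluster's fs.defaultFS setting.
    static final URI DEFAULT_FS = URI.create("hdfs://namenode.example:8020");

    // Use the path's own scheme if it has one, otherwise fall back to the default FS.
    static String resolveScheme(URI path) {
        return path.getScheme() != null ? path.getScheme() : DEFAULT_FS.getScheme();
    }

    // Simplified name check (assumption): absolute, and no component contains ':'.
    static boolean looksLikeValidDfsName(String name) {
        if (!name.startsWith("/")) {
            return false;
        }
        for (String component : name.split("/")) {
            if (component.contains(":")) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened form of the map output path from the log; no scheme.
        URI mapOutput = URI.create("/c:/Hadoop/Data/Hadoop/local/file.out");
        System.out.println(resolveScheme(mapOutput));                   // prints "hdfs"
        System.out.println(looksLikeValidDfsName(mapOutput.getPath())); // prints "false"
    }
}
```

With the pre-MAPREDUCE-5196 code, the local file system was requested explicitly through FileSystem.getLocal(conf), so no scheme lookup was involved and the drive letter never reached HDFS.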
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030649#comment-14030649 ]

Hudson commented on MAPREDUCE-5912:
-----------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1773 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1773/])
MAPREDUCE-5912. Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196. Contributed by Remus Rusanu. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602282)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14030719#comment-14030719 ]

Hudson commented on MAPREDUCE-5912:
-----------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1800 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1800/])
MAPREDUCE-5912. Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196. Contributed by Remus Rusanu. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602282)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029597#comment-14029597 ]

Chris Nauroth commented on MAPREDUCE-5912:
------------------------------------------

+1 for this patch.

[~rusanu], [~curino] and [~chris.douglas], my understanding is that MAPREDUCE-5196 accidentally introduced this bug, but this part of the change is not strictly necessary for the goals of MAPREDUCE-5196. Based on that, I'm in favor of committing this patch to revert just the part of MAPREDUCE-5196 that caused the bug. The alternative patch on the {{Path}} class posted in HADOOP-10663 has some other potential side effects, so I prefer doing a localized fix here in MR. (I'll enter more details on HADOOP-10663.)

If in the future we want to revisit the idea of map outputs going somewhere different than the local file system, then I think we'd need a different patch. I think we'd want to make sure that the map output's {{Path}} instance contains an explicit scheme, so that the code here doesn't need to assume local vs. default vs. something else.

Can you let me know if you agree with committing this and not committing HADOOP-10663? I'll hold off on committing until I hear from one of you.
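The "explicit scheme" idea in the comment above can be sketched as a guard that refuses to guess. This is not the patch under discussion and not Hadoop code. It is a minimal model on java.net.URI, with a hypothetical class name, method name, and shortened paths, showing what it would mean for calling code to rely on the scheme carried by the path instead of assuming local or default.

```java
import java.net.URI;

// Illustrative model only; not Hadoop code.
class ExplicitSchemeSketch {
    // Return the path's scheme, or fail instead of guessing local vs. default.
    static String requireScheme(URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            throw new IllegalArgumentException("No explicit scheme in path: " + path);
        }
        return scheme;
    }

    public static void main(String[] args) {
        // A path that names its file system can be dispatched without assumptions.
        System.out.println(requireScheme(URI.create("file:///c:/Hadoop/local/file.out")));
        try {
            // A scheme-less path is rejected rather than silently sent to a default.
            requireScheme(URI.create("/c:/Hadoop/local/file.out"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```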
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029625#comment-14029625 ]

Chris Douglas commented on MAPREDUCE-5912:
------------------------------------------

bq. If in the future we want to revisit the idea of map outputs going somewhere different than the local file system, then I think we'd need a different patch. I think we'd want to make sure that the map output's Path instance contains an explicit scheme, so that the code here doesn't need to assume local vs. default vs. something else.

Agreed. MAPREDUCE-5269 changed all {{Path}} instances returned from {{YARNOutputFiles}} to be fully qualified, but the two changes were separated.

+1 for committing the workaround until HADOOP-10663 is ready.
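As a rough picture of what "fully qualified" means in the comment above, the sketch below attaches a file system's scheme and authority to a path that lacks them. It is an assumption-laden model using java.net.URI only. The class and method names are hypothetical, the path is shortened, and it does not reproduce what MAPREDUCE-5269 or Hadoop's own path qualification actually does.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative model only; not Hadoop code.
class QualifyPathSketch {
    // Attach the owning file system's scheme and authority to a scheme-less path.
    static URI qualify(URI path, URI fileSystem) {
        if (path.getScheme() != null) {
            return path; // already qualified, leave untouched
        }
        try {
            return new URI(fileSystem.getScheme(), fileSystem.getAuthority(),
                           path.getPath(), null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot qualify " + path, e);
        }
    }

    public static void main(String[] args) {
        URI localFs = URI.create("file:///");
        URI qualified = qualify(URI.create("/c:/Hadoop/local/file.out"), localFs);
        // The scheme now travels with the path, so no default has to be assumed.
        System.out.println(qualified.getScheme() + " " + qualified.getPath());
    }
}
```

Once every map output path is qualified this way, a lookup keyed on the path's scheme would select the local file system without the caller having to know that map outputs are local.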
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029739#comment-14029739 ]

Hudson commented on MAPREDUCE-5912:
-----------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5691 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5691/])
MAPREDUCE-5912. Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196. Contributed by Remus Rusanu. (cnauroth: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1602282)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018694#comment-14018694 ]

Remus Rusanu commented on MAPREDUCE-5912:
-----------------------------------------

I also posted a patch that solves HADOOP-10663. I guess if that is accepted, this is obsolete.
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017828#comment-14017828 ]

Hadoop QA commented on MAPREDUCE-5912:
--------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12648337/MAPREDUCE-5912.1.patch
  against trunk revision .

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core.

    {color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4642//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4642//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017852#comment-14017852 ]

Remus Rusanu commented on MAPREDUCE-5912:
-----------------------------------------

No new tests included because this is a revert of an earlier breaking change. Manually validated the change on Windows.
[jira] [Commented] (MAPREDUCE-5912) Task.calculateOutputSize does not handle Windows files after MAPREDUCE-5196
[ https://issues.apache.org/jira/browse/MAPREDUCE-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018246#comment-14018246 ]

Chris Douglas commented on MAPREDUCE-5912:
------------------------------------------

As you identified in HADOOP-10663, returning the default filesystem for local paths is not correct.