[jira] [Commented] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
[ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595680#comment-13595680 ] Harsh J commented on MAPREDUCE-2911: Where exactly did all that naming you refer to happen? I've not noticed it on the lists, and there have been a few asks there as well (IIRC), but no negativity ever came in on the responses. I do not see any 'bile-spewing' on this very ticket either. So what community are you pointing this at? Thanks for still working on getting this available, though; there are several people interested in this! Hamster: Hadoop And Mpi on the same cluSTER --- Key: MAPREDUCE-2911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.0 Environment: All Unix-Environments Reporter: Milind Bhandarkar Assignee: Ralph H Castain Original Estimate: 336h Remaining Estimate: 336h MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/) is a popular BSD-licensed version of MPI. In the past, running MPI applications on a Hadoop cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was kludgy. After the resource-manager separation from the JobTracker in Hadoop, we have all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently working on a patch to make MPI an application master. An initial version of this patch will be available soon (hopefully before September 10). This jira will track the development of Hamster: the application master for MPI. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595767#comment-13595767 ] Hudson commented on MAPREDUCE-3685: --- Integrated in Hadoop-Yarn-trunk #148 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/148/]) MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is appropriately used and that on-disk segments are correctly sorted on file-size. Contributed by Anty Rao and Ravi Prakash. (Revision 1453365) Result = SUCCESS acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453365 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java There are some bugs in implementation of MergeManager - Key: MAPREDUCE-3685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: anty.rao Assignee: anty Priority: Critical Fix For: 0.23.7, 2.0.4-beta Attachments: MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch
[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595832#comment-13595832 ] Hudson commented on MAPREDUCE-3685: --- Integrated in Hadoop-Hdfs-0.23-Build #546 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/546/]) MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is appropriately used and that on-disk segments are correctly sorted on file-size. Contributed by Anty Rao and Ravi Prakash. (Revision 1453373) Result = UNSTABLE acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453373 Files : * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MapOutput.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManager.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java There are some bugs in implementation of MergeManager - Key: MAPREDUCE-3685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: anty.rao Assignee: anty Priority: Critical Fix For: 0.23.7, 2.0.4-beta Attachments: MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch
[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595854#comment-13595854 ] Hudson commented on MAPREDUCE-3685: --- Integrated in Hadoop-Hdfs-trunk #1337 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1337/]) MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is appropriately used and that on-disk segments are correctly sorted on file-size. Contributed by Anty Rao and Ravi Prakash. (Revision 1453365) Result = SUCCESS acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453365 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java There are some bugs in implementation of MergeManager - Key: MAPREDUCE-3685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: anty.rao Assignee: anty Priority: Critical Fix For: 0.23.7, 2.0.4-beta Attachments: MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch
[jira] [Commented] (MAPREDUCE-2911) Hamster: Hadoop And Mpi on the same cluSTER
[ https://issues.apache.org/jira/browse/MAPREDUCE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595881#comment-13595881 ] Arun C Murthy commented on MAPREDUCE-2911: -- I have the same questions as Harsh. On May 17, 2012 Ralph said he was close to committing this to OpenMPI, as mentioned on this jira: http://s.apache.org/uY Where is this 'bile-spewing' and when did it start? I'm still looking forward to playing with this. Hamster: Hadoop And Mpi on the same cluSTER --- Key: MAPREDUCE-2911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2911 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv2 Affects Versions: 0.23.0 Environment: All Unix-Environments Reporter: Milind Bhandarkar Assignee: Ralph H Castain Original Estimate: 336h Remaining Estimate: 336h MPI is commonly used for many machine-learning applications. OpenMPI (http://www.open-mpi.org/) is a popular BSD-licensed version of MPI. In the past, running MPI applications on a Hadoop cluster was achieved using Hadoop Streaming (http://videolectures.net/nipsworkshops2010_ye_gbd/), but it was kludgy. After the resource-manager separation from the JobTracker in Hadoop, we have all the tools needed to make MPI a first-class citizen on a Hadoop cluster. I am currently working on a patch to make MPI an application master. An initial version of this patch will be available soon (hopefully before September 10). This jira will track the development of Hamster: the application master for MPI.
[jira] [Commented] (MAPREDUCE-3685) There are some bugs in implementation of MergeManager
[ https://issues.apache.org/jira/browse/MAPREDUCE-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595902#comment-13595902 ] Hudson commented on MAPREDUCE-3685: --- Integrated in Hadoop-Mapreduce-trunk #1365 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1365/]) MAPREDUCE-3685. Fix bugs in MergeManager to ensure compression codec is appropriately used and that on-disk segments are correctly sorted on file-size. Contributed by Anty Rao and Ravi Prakash. (Revision 1453365) Result = SUCCESS acmurthy : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1453365 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Merger.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/MergeManagerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/OnDiskMapOutput.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/test/java/org/apache/hadoop/mapreduce/task/reduce/TestMergeManager.java There are some bugs in implementation of MergeManager - Key: MAPREDUCE-3685 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3685 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.1 Reporter: anty.rao Assignee: anty Priority: Critical Fix For: 0.23.7, 2.0.5-beta Attachments: MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685-branch-0.23.1.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.branch-0.23.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch, MAPREDUCE-3685.patch
[jira] [Commented] (MAPREDUCE-4993) AM thinks it was killed when an error occurs setting up a task container launch context
[ https://issues.apache.org/jira/browse/MAPREDUCE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595972#comment-13595972 ] Jason Lowe commented on MAPREDUCE-4993: --- Yes, the AM cannot succeed if it cannot create the common task container launch context. However, the whole point of this JIRA is that it should mark the job as FAILED or ERROR with an appropriate diagnostic message for the application, rather than marking the job as KILLED. The latter status leads users to believe someone or something killed the job, which is not the case. AM thinks it was killed when an error occurs setting up a task container launch context --- Key: MAPREDUCE-4993 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4993 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Abhishek Kapoor If an IOException occurs while setting up a container launch context for a task, then the AM exits with a KILLED status and no diagnostics. The job should be marked as FAILED (or maybe ERROR) with a useful diagnostics message indicating the nature of the error.
[jira] [Created] (MAPREDUCE-5050) Cannot find partition.lst in Terasort on Hadoop/Local File System
Matt Parker created MAPREDUCE-5050: -- Summary: Cannot find partition.lst in Terasort on Hadoop/Local File System Key: MAPREDUCE-5050 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5050 Project: Hadoop Map/Reduce Issue Type: Bug Components: examples Affects Versions: 0.20.2 Environment: Cloudera VM CDH3u4, VMWare, Linux, Java SE 1.6.0_31-b04 Reporter: Matt Parker Priority: Minor I'm trying to simulate running Hadoop on Lustre by configuring it to use the local file system, using a single Cloudera VM (cdh3u4). I can generate the data just fine, but when running the sorting portion of the program, I get an error about not being able to find the _partition.lst file. It exists in the generated data directory. Perusing the Terasort code, I see that the main method has a Path reference to partition.lst which is created with the parent directory:

public int run(String[] args) throws Exception {
  LOG.info("starting");
  JobConf job = (JobConf) getConf();
  Path inputDir = new Path(args[0]);
  inputDir = inputDir.makeQualified(inputDir.getFileSystem(job));
  Path partitionFile = new Path(inputDir, TeraInputFormat.PARTITION_FILENAME);
  URI partitionUri = new URI(partitionFile.toString() + "#" + TeraInputFormat.PARTITION_FILENAME);
  TeraInputFormat.setInputPaths(job, new Path(args[0]));
  FileOutputFormat.setOutputPath(job, new Path(args[1]));
  job.setJobName("TeraSort");
  job.setJarByClass(TeraSort.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(Text.class);
  job.setInputFormat(TeraInputFormat.class);
  job.setOutputFormat(TeraOutputFormat.class);
  job.setPartitionerClass(TotalOrderPartitioner.class);
  TeraInputFormat.writePartitionFile(job, partitionFile);
  DistributedCache.addCacheFile(partitionUri, job);
  DistributedCache.createSymlink(job);
  job.setInt("dfs.replication", 1);
  TeraOutputFormat.setFinalSync(job, true);
  JobClient.runJob(job);
  LOG.info("done");
  return 0;
}

But in the configure method, the Path isn't created with the parent directory reference:

public void configure(JobConf job) {
  try {
    FileSystem fs = FileSystem.getLocal(job);
    Path partFile = new Path(TeraInputFormat.PARTITION_FILENAME);
    splitPoints = readPartitions(fs, partFile, job);
    trie = buildTrie(splitPoints, 0, splitPoints.length, new Text(), 2);
  } catch (IOException ie) {
    throw new IllegalArgumentException("can't read partitions file", ie);
  }
}

I modified the code as follows, and now the sorting portion of the Terasort test works using the general file system. I think the above code is a bug.

public void configure(JobConf job) {
  try {
    FileSystem fs = FileSystem.getLocal(job);
    Path[] inputPaths = TeraInputFormat.getInputPaths(job);
    Path partFile = new Path(inputPaths[0], TeraInputFormat.PARTITION_FILENAME);
    splitPoints = readPartitions(fs, partFile, job);
    trie = buildTrie(splitPoints, 0, splitPoints.length, new Text(), 2);
  } catch (IOException ie) {
    throw new IllegalArgumentException("can't read partitions file", ie);
  }
}
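The fix above boils down to qualifying the partition file name against the job's input directory instead of leaving it relative to the task's working directory. A minimal stand-alone sketch of that distinction, using java.nio paths as a stand-in for Hadoop's Path (the directory name here is hypothetical):

```java
import java.nio.file.Paths;

// Plain-Java illustration of the TeraSort path bug: a bare relative file
// name resolves against whatever the working directory happens to be,
// while a name qualified with the input directory always finds the file.
public class PartitionPathDemo {
    static final String PARTITION_FILENAME = "_partition.lst";

    // What the broken configure() did: just the bare file name.
    static String brokenPath() {
        return Paths.get(PARTITION_FILENAME).toString();
    }

    // What run() did and the fix restores: qualify with the input dir.
    static String fixedPath(String inputDir) {
        return Paths.get(inputDir).resolve(PARTITION_FILENAME).toString();
    }

    public static void main(String[] args) {
        System.out.println(brokenPath());                      // relative, fragile
        System.out.println(fixedPath("/data/terasort-input")); // fully qualified
    }
}
```

On HDFS the bare name happened to work because of the DistributedCache symlink; on a local or shared file system there is no symlink, which is why only the qualified form is reliable.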
[jira] [Created] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0
Damien Hardy created MAPREDUCE-5051: --- Summary: Combiner not used when NUM_REDUCES=0 Key: MAPREDUCE-5051 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 2.0.2-alpha Environment: CDH4.1.2 MR1 Reporter: Damien Hardy We have an M/R job that uses a Mapper + Combiner but has nothing to do in the Reducer: bulk indexing of HBase data in ElasticSearch, where the Map output K / V is #bulk / json_data_to_be_indexed. So the job is launched, maps work, combiners index, and a reducer is created for nothing (sometimes waiting for another M/R job to free a tasktracker slot for the reducer, cf. MAPREDUCE-5019). When we put ```job.setNumReduceTasks(0);``` in our job's .run(), mappers are started but combiners are not used.
[jira] [Created] (MAPREDUCE-5052) Job History UI and web services confusing job start time and job submit time
Kendall Thrapp created MAPREDUCE-5052: - Summary: Job History UI and web services confusing job start time and job submit time Key: MAPREDUCE-5052 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5052 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp The Start Time column shown on the Job History server's main webpage (http://host:port/jobhistory) is actually showing the *submit* time for jobs. However, when you drill down to an individual job's page, there the Start Time really does refer to when the job actually started. This is also true for the web services REST API, where the Jobs listing returns the submit times as startTime, but the single Job API returns the start time as startTime. The two different times being referred to by the same name is confusing. However, it is useful to have both times, as the difference between the submit time and the start time can show how long a job was stuck waiting in a queue. The column on the main job history page should be changed to Submit Time, and the individual job's page should show both the submit time and the start time. The web services REST API should be updated with these changes as well.
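The queue-wait metric mentioned above is just the gap between the two timestamps. A tiny sketch, with hypothetical epoch-millisecond values for illustration:

```java
// Queue wait = start time minus submit time, both in epoch milliseconds,
// as exposed (once fixed) by the history server's job listing and job page.
public class QueueWait {
    static long queueWaitMs(long submitTime, long startTime) {
        return startTime - submitTime;
    }

    public static void main(String[] args) {
        long submit = 1362700000000L; // job accepted by the scheduler (hypothetical)
        long start  = 1362700090000L; // first attempt actually launched (hypothetical)
        System.out.println(queueWaitMs(submit, start) + " ms"); // 90000 ms
    }
}
```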
[jira] [Assigned] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reassigned MAPREDUCE-5023: --- Assignee: Ravi Prakash History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES
[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596115#comment-13596115 ] Tom White commented on MAPREDUCE-5038: -- +1 old API CombineFileInputFormat missing fixes that are in new API - Key: MAPREDUCE-5038 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch The following changes patched the CombineFileInputFormat in mapreduce but neglected the one in mapred: MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files; MAPREDUCE-2021 solved returning duplicate hostnames in split locations; MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS. In trunk this is not an issue, as the one in mapred extends the one in mapreduce.
[jira] [Commented] (MAPREDUCE-5049) CombineFileInputFormat counts all compressed files non-splitable
[ https://issues.apache.org/jira/browse/MAPREDUCE-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596117#comment-13596117 ] Tom White commented on MAPREDUCE-5049: -- +1 CombineFileInputFormat counts all compressed files non-splitable Key: MAPREDUCE-5049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5049 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: MAPREDUCE-5049.patch In branch-1, CombineFileInputFormat doesn't take SplittableCompressionCodec into account and thinks that all compressed input files aren't splittable. This is a regression from when handling for non-splitable compression codecs was originally added in MAPREDUCE-1597, and it seems to have somehow gotten in when the code was pulled from 0.22 to branch-1.
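For readers following along, the splittability decision at issue can be modeled in a few lines. This is a self-contained sketch: the interfaces below only mirror the names of Hadoop's CompressionCodec and SplittableCompressionCodec so the logic can be shown without Hadoop on the classpath, and the codec classes here are stand-ins, not the real ones.

```java
// Minimal model of the check MAPREDUCE-5049 is about: uncompressed input
// (no codec) is splittable, and compressed input is splittable only when
// its codec advertises splittability.
public class SplittabilityCheck {
    interface CompressionCodec {}
    interface SplittableCompressionCodec extends CompressionCodec {}

    static class GzipLikeCodec implements CompressionCodec {}            // not splittable
    static class Bzip2LikeCodec implements SplittableCompressionCodec {} // splittable

    static boolean isSplitable(CompressionCodec codec) {
        if (codec == null) {
            return true; // plain, uncompressed file
        }
        // The branch-1 regression amounted to skipping this instanceof test
        // and treating every compressed file as non-splittable.
        return codec instanceof SplittableCompressionCodec;
    }

    public static void main(String[] args) {
        System.out.println(isSplitable(null));                // true
        System.out.println(isSplitable(new GzipLikeCodec()));  // false
        System.out.println(isSplitable(new Bzip2LikeCodec())); // true
    }
}
```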
[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated MAPREDUCE-5023: Attachment: MAPREDUCE-5023.patch Simple enough patch. There's some code duplication between getCounters() in CountersBlock.java and JobCounterInfo.java, but I don't know if it can be helped. History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Attachments: MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES
[jira] [Commented] (MAPREDUCE-3916) various issues with running yarn proxyserver
[ https://issues.apache.org/jira/browse/MAPREDUCE-3916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596136#comment-13596136 ] Suresh Srinivas commented on MAPREDUCE-3916: Alejandro, all I see in this patch is just a change to the description in yarn-default.xml. How does this solve the problem? If it is just a doc update, should the title of this jira be updated? various issues with running yarn proxyserver Key: MAPREDUCE-3916 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3916 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2, resourcemanager, webapps Affects Versions: 0.23.1, 2.0.0-alpha, 3.0.0 Reporter: Roman Shaposhnik Assignee: Devaraj K Priority: Critical Labels: mrv2 Fix For: 2.0.0-alpha Attachments: MAPREDUCE-3916.patch Seems like the yarn proxyserver is not operational when running out of the 0.23.1 RC2 tarball. # Setting yarn.web-proxy.address to match yarn.resourcemanager.address doesn't disable the proxyserver (although not setting yarn.web-proxy.address at all correctly disables it and produces a message: org.apache.hadoop.yarn.YarnException: yarn.web-proxy.address is not set so the proxy will not run). 
This contradicts the documentation provided for yarn.web-proxy.address in yarn-default.xml # Setting yarn.web-proxy.address and running the service results in the following: {noformat} $ ./sbin/yarn-daemon.sh start proxyserver starting proxyserver, logging to /tmp/hadoop-0.23.1/logs/yarn-rvs-proxyserver-ahmed-laptop.out /usr/java/64/jdk1.6.0_22/bin/java -Dproc_proxyserver -Xmx1000m -Dhadoop.log.dir=/tmp/hadoop-0.23.1/logs -Dyarn.log.dir=/tmp/hadoop-0.23.1/logs -Dhadoop.log.file=yarn-rvs-proxyserver-ahmed-laptop.log -Dyarn.log.file=yarn-rvs-proxyserver-ahmed-laptop.log -Dyarn.home.dir= -Dyarn.id.str=rvs -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/tmp/hadoop-0.23.1/lib/native -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/tmp/hadoop-0.23.1/logs -Dyarn.log.dir=/tmp/hadoop-0.23.1/logs -Dhadoop.log.file=yarn-rvs-proxyserver-ahmed-laptop.log -Dyarn.log.file=yarn-rvs-proxyserver-ahmed-laptop.log -Dyarn.home.dir=/tmp/hadoop-0.23.1 -Dhadoop.root.logger=INFO,DRFA -Dyarn.root.logger=INFO,DRFA -Djava.library.path=/tmp/hadoop-0.23.1/lib/native -classpath /tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/etc/hadoop:/tmp/hadoop-0.23.1/share/hadoop/common/lib/*:/tmp/hadoop-0.23.1/share/hadoop/common/*:/tmp/hadoop-0.23.1/share/hadoop/hdfs:/tmp/hadoop-0.23.1/share/hadoop/hdfs/lib/*:/tmp/hadoop-0.23.1/share/hadoop/hdfs/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/lib/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/*:/tmp/hadoop-0.23.1/share/hadoop/mapreduce/lib/* org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer {noformat} with the following message found in the logs: {noformat} 2012-02-24 09:26:31,099 FATAL org.apache.hadoop.yarn.server.webproxy.WebAppProxy: Could not start proxy web server java.io.FileNotFoundException: webapps/proxy not found in CLASSPATH at org.apache.hadoop.http.HttpServer.getWebAppsPath(HttpServer.java:532) at 
org.apache.hadoop.http.HttpServer.&lt;init&gt;(HttpServer.java:224) at org.apache.hadoop.http.HttpServer.&lt;init&gt;(HttpServer.java:164) at org.apache.hadoop.yarn.server.webproxy.WebAppProxy.start(WebAppProxy.java:85) at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.main(WebAppProxyServer.java:76) {noformat}
[jira] [Updated] (MAPREDUCE-3900) mr-jobhistory-daemon.sh should rely on MAPREDUCE env. variables instead of the YARN ones
[ https://issues.apache.org/jira/browse/MAPREDUCE-3900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated MAPREDUCE-3900: --- Issue Type: Improvement (was: Bug) mr-jobhistory-daemon.sh should rely on MAPREDUCE env. variables instead of the YARN ones Key: MAPREDUCE-3900 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3900 Project: Hadoop Map/Reduce Issue Type: Improvement Components: jobhistoryserver Affects Versions: 0.23.0 Reporter: Roman Shaposhnik Assignee: Roman Shaposhnik It is nice to see yarn-daemon.sh split into a separate script for managing MR service(s), but once that has happened we should go all the way and make it configurable as an MR entity.
[jira] [Resolved] (MAPREDUCE-3980) mr-jobhistory-daemon.sh should look for mapred script in HADOOP_MAPRED_HOME
[ https://issues.apache.org/jira/browse/MAPREDUCE-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas resolved MAPREDUCE-3980. Resolution: Duplicate Fix Version/s: 2.0.2-alpha Resolving it as a duplicate, as this has been addressed by MAPREDUCE-4649 for 2.0.2-alpha. mr-jobhistory-daemon.sh should look for mapred script in HADOOP_MAPRED_HOME --- Key: MAPREDUCE-3980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3980 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.1 Reporter: Roman Shaposhnik Fix For: 2.0.2-alpha The following: {noformat} nohup nice -n $YARN_NICENESS $YARN_HOME/bin/mapred --config $YARN_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null {noformat} should be this instead: {noformat} nohup nice -n $YARN_NICENESS $HADOOP_MAPRED_HOME/bin/mapred --config $YARN_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null {noformat}
[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated MAPREDUCE-5023: Status: Patch Available (was: Open) History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Attachments: MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES
[jira] [Commented] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0
[ https://issues.apache.org/jira/browse/MAPREDUCE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596218#comment-13596218 ] Robert Joseph Evans commented on MAPREDUCE-5051: Damien, The combiner only runs as part of the shuffle phase. The shuffle phase only runs when there is a reducer that needs the data to be shuffled. So your indexing works just fine if all of the indexes for a given key are not in the same file? If you want just a combiner to run with no reducers configured, you are going to have to write something for that yourself. Combiner not used when NUM_REDUCES=0 Key: MAPREDUCE-5051 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 2.0.2-alpha Environment: CDH4.1.2 MR1 Reporter: Damien Hardy We have an M/R job that uses a Mapper + Combiner but has nothing to do in the Reducer: bulk indexing of HBase data in ElasticSearch, where the Map output is K / V : #bulk / json_data_to_be_indexed. So the job is launched, maps work, combiners index, and a reducer is created for nothing (sometimes waiting for another M/R job to free a tasktracker slot for the reducer, cf. MAPREDUCE-5019). When we put ```job.setNumReduceTasks(0);``` in our job's .run(), mappers are started but combiners are not used.
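Bob's point that the combiner is tied to the shuffle suggests the usual workaround when reduces are set to 0: fold the combining into the mapper itself. A minimal, JDK-only sketch of the "in-mapper combining" pattern (class and method names are illustrative, not Hadoop's Combiner API):

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Accumulates (key -> count) in memory during the "map" phase and
 * emits the combined records once at the end, mimicking what a
 * combiner would have done before the (absent) shuffle.
 */
public class InMapperCombiner {
    private final Map<String, Long> buffer = new HashMap<>();

    // Called once per input record, like Mapper.map().
    public void map(String key) {
        buffer.merge(key, 1L, Long::sum);
    }

    // Called once at the end, like Mapper.cleanup() / close();
    // returns the pre-combined output to write directly.
    public Map<String, Long> flush() {
        return buffer;
    }
}
```

In a real map-only job the same idea fits in `Mapper.close()` (old API) or `Mapper.cleanup()` (new API), writing the buffered records to the output collector there.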
[jira] [Resolved] (MAPREDUCE-5051) Combiner not used when NUM_REDUCES=0
[ https://issues.apache.org/jira/browse/MAPREDUCE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans resolved MAPREDUCE-5051. Resolution: Won't Fix If you feel strongly that this should be supported you can reopen this JIRA as new feature work. Combiner not used when NUM_REDUCES=0 Key: MAPREDUCE-5051 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5051 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 2.0.2-alpha Environment: CDH4.1.2 MR1 Reporter: Damien Hardy We have an M/R job that uses a Mapper + Combiner but has nothing to do in the Reducer: bulk indexing of HBase data in ElasticSearch, where the Map output is K / V : #bulk / json_data_to_be_indexed. So the job is launched, maps work, combiners index, and a reducer is created for nothing (sometimes waiting for another M/R job to free a tasktracker slot for the reducer, cf. MAPREDUCE-5019). When we put ```job.setNumReduceTasks(0);``` in our job's .run(), mappers are started but combiners are not used.
[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596228#comment-13596228 ] Hadoop QA commented on MAPREDUCE-5023: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572564/MAPREDUCE-5023.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3390//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3390//console This message is automatically generated. 
History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Attachments: MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596289#comment-13596289 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-5042: In my prelim security work, I once had the JobClient generate the secret and then later had the MR AM generate the tokens and reupload the tokens file into the submit directory. That was another hop to DFS and we changed that since, but this recovery code bug fell through. So there are multiple solutions: - Have a single secret but let the client generate it - Have a single secret but upload the tokens file for future app-attempts - Have multiple tokens It's future proof to separate the task and shuffle security secrets, but not sure that is tied in directly to this one if we consider the reupload solution. I don't feel strongly about any solution, but one thing we should keep in mind is to move as much stuff into the AM so that the client is thinner and enables us to do submits via web services. Reducer unable to fetch for a map task that was recovered - Key: MAPREDUCE-5042 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, security Affects Versions: 0.23.7, 2.0.5-beta Reporter: Jason Lowe Assignee: Jason Lowe Priority: Blocker Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch If an application attempt fails and is relaunched the AM will try to recover previously completed tasks. 
If a reducer needs to fetch the output of a map task attempt that was recovered then it will fail with a 401 error like this: {noformat} java.io.IOException: Server returned HTTP response code: 401 for URL: http://xx:xx/mapOutput?job=job_1361569180491_21845&reduce=0&map=attempt_1361569180491_21845_m_16_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156) {noformat} Looking at the corresponding NM's logs, we see the shuffle failed due to "Verification of the hashReply failed".
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596332#comment-13596332 ] Jason Lowe commented on MAPREDUCE-5042: --- I thought about the upload-to-staging-for-future-attempts solution but it seemed passing the secret in the job credentials was a bit cleaner and avoided the extra HDFS operations. As for splitting the job token into shuffle and task, I didn't want to change the current task authentication behavior. Allowing an old task attempt to authenticate with a new app attempt seemed like it would be a problem waiting to happen. But we need the shuffle secret to persist across app attempts, hence the push to split them as part of this change. Reducer unable to fetch for a map task that was recovered - Key: MAPREDUCE-5042 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, security Affects Versions: 0.23.7, 2.0.5-beta Reporter: Jason Lowe Assignee: Jason Lowe Priority: Blocker Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch If an application attempt fails and is relaunched the AM will try to recover previously completed tasks. If a reducer needs to fetch the output of a map task attempt that was recovered then it will fail with a 401 error like this: {noformat} java.io.IOException: Server returned HTTP response code: 401 for URL: http://xx:xx/mapOutput?job=job_1361569180491_21845&reduce=0&map=attempt_1361569180491_21845_m_16_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156) {noformat} Looking at the corresponding NM's logs, we see the shuffle failed due to "Verification of the hashReply failed".
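The root cause discussed above is that each app attempt minted its own shuffle secret, so the NodeManager's HMAC check of an attempt-1 map output fails under attempt 2's key, producing the 401. A JDK-only sketch of that handshake (helper names are hypothetical; the real code lives in Hadoop's shuffle handler and secret managers) showing why the secret must be generated once and shared across attempts:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class ShuffleSecretDemo {
    // Generate a job-wide shuffle secret once (e.g. at submission) and
    // carry it in the job credentials so every app attempt sees the same bytes.
    public static byte[] newSecret() {
        byte[] key = new byte[20];
        new SecureRandom().nextBytes(key);
        return key;
    }

    // HMAC over the map-output request, roughly what the shuffle handshake signs.
    public static byte[] sign(byte[] secret, String url) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(secret, "HmacSHA1"));
            return mac.doFinal(url.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // The NM-side check: succeeds only when both sides share the secret.
    public static boolean verify(byte[] secret, String url, byte[] hash) {
        return Arrays.equals(sign(secret, url), hash);
    }
}
```

With a single persisted secret, a recovered map's output verifies under any attempt; a freshly generated secret (a new attempt that did not reuse the old one) fails verification, which is exactly the 401 in the stack trace.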
[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596342#comment-13596342 ] Thomas Graves commented on MAPREDUCE-5023: -- Ravi can you please make sure this works for the AM also. History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Attachments: MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated MAPREDUCE-5023: Attachment: MAPREDUCE-5023.patch Thanks Tom. This new patch makes it work for both the HS and AM History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-4885: - Target Version/s: 3.0.0, trunk-win (was: trunk-win) Affects Version/s: 3.0.0 streaming tests have multiple failures on Windows - Key: MAPREDUCE-4885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming, test Affects Versions: 3.0.0, trunk-win Reporter: Chris Nauroth Assignee: Chris Nauroth There are multiple test failures due to Queue configuration missing child queue names for root. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-4885: - Attachment: MAPREDUCE-4885.1.patch With this patch, all of the streaming tests pass consistently on Windows. Note that to see the tests pass, you'll also need the patch for MAPREDUCE-5006, which hasn't been committed yet. The problems were: # The now-infamous problem of attempting to use paths rooted on test.build.data with HDFS, which rejects paths containing ':', such as the Windows drive spec. The patch implements our standard work-around to allow overriding the test path to /tmp/<test name>. # There was an assumption of Unix-style commands available for use as streaming mapper and reducer functions. To work around this, I introduced some cmd scripts that roughly approximate Unix cat and xargs cat. # There was one actual bug in {{StreamJob}}. It was attempting to pass a string file path into the {{URI}} constructor. On Windows, this would contain a drive spec, and {{URI}} would consider it invalid and throw an error. The only reason we needed the {{URI}} was to pass it in to the constructor of {{Path}}. Fortunately, we already have the logic in the {{Path}} constructor now to handle this case correctly cross-platform, so the simple fix is just to call the {{Path}} constructor with the string file path directly. # I've increased a few test timeouts. The old timeout values were borderline in my environment, sometimes causing the tests to fail sporadically on timeouts. This was not a Windows-specific problem. I've tested this patch on Mac and Windows.
streaming tests have multiple failures on Windows - Key: MAPREDUCE-4885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming, test Affects Versions: 3.0.0, trunk-win Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: MAPREDUCE-4885.1.patch There are multiple test failures due to Queue configuration missing child queue names for root. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
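The {{StreamJob}} bug described in this thread is easy to reproduce with plain JDK classes: a Windows-style path fed straight to java.net.URI is rejected because backslashes are illegal in URIs, and even with forward slashes the drive letter parses as a bogus one-letter scheme. A small sketch of the failure mode (file names are made up; the actual fix passes the string to Hadoop's Path constructor instead):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class DrivePathDemo {
    // Returns true only if the string is accepted by the URI constructor.
    public static boolean isValidUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;  // e.g. backslashes in a Windows path
        }
    }

    // Shows how a drive letter is misread as a URI scheme; null if unparseable.
    public static String schemeOf(String s) {
        try {
            return new URI(s).getScheme();
        } catch (URISyntaxException e) {
            return null;
        }
    }
}
```

This is why the cross-platform handling belongs in the Path constructor rather than in a raw URI round-trip.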
[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-4885: - Status: Patch Available (was: Open) streaming tests have multiple failures on Windows - Key: MAPREDUCE-4885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming, test Affects Versions: 3.0.0, trunk-win Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: MAPREDUCE-4885.1.patch There are multiple test failures due to Queue configuration missing child queue names for root. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596412#comment-13596412 ] Hadoop QA commented on MAPREDUCE-5023: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572613/MAPREDUCE-5023.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3391//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3391//console This message is automatically generated. 
History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4885) streaming tests have multiple failures on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596438#comment-13596438 ] Hadoop QA commented on MAPREDUCE-4885: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572615/MAPREDUCE-4885.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 11 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:red}-1 release audit{color}. The applied patch generated 2 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-tools/hadoop-streaming: org.apache.hadoop.streaming.TestStreamReduceNone org.apache.hadoop.streaming.TestStreamXmlRecordReader {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3392//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3392//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3392//console This message is automatically generated. 
streaming tests have multiple failures on Windows - Key: MAPREDUCE-4885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming, test Affects Versions: 3.0.0, trunk-win Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: MAPREDUCE-4885.1.patch There are multiple test failures due to Queue configuration missing child queue names for root. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4885) streaming tests have multiple failures on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596469#comment-13596469 ] Chris Nauroth commented on MAPREDUCE-4885: -- The test failures reported by Hudson are unrelated and will be resolved by the patch on MAPREDUCE-5006. streaming tests have multiple failures on Windows - Key: MAPREDUCE-4885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming, test Affects Versions: 3.0.0, trunk-win Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: MAPREDUCE-4885.1.patch There are multiple test failures due to Queue configuration missing child queue names for root. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-5038: -- Resolution: Fixed Fix Version/s: 1.2.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Sandy. Thanks Tom for reviewing. Committed to branch-1. old API CombineFileInputFormat missing fixes that are in new API - Key: MAPREDUCE-5038 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.2.0 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files MAPREDUCE-2021 solved returning duplicate hostnames in split locations MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS In trunk this is not an issue as the one in mapred extends the one in mapreduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5049) CombineFileInputFormat counts all compressed files non-splitable
[ https://issues.apache.org/jira/browse/MAPREDUCE-5049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-5049: -- Resolution: Fixed Fix Version/s: 1.2.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Sandy. Thanks Tom for reviewing. Committed to branch-1. CombineFileInputFormat counts all compressed files non-splitable Key: MAPREDUCE-5049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5049 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.2.0 Attachments: MAPREDUCE-5049.patch In branch-1, CombineFileInputFormat doesn't take SplittableCompressionCodec into account and treats all compressed input files as non-splittable. This is a regression from when handling for non-splittable compression codecs was originally added in MAPREDUCE-1597, and seems to have crept in when the code was pulled from 0.22 to branch-1.
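The regression above is in the isSplitable-style check: branch-1's CombineFileInputFormat treats every compressed file as non-splittable instead of special-casing codecs that support splitting (bzip2 being the common example). A self-contained sketch of the intended decision, using tiny stand-in interfaces rather than Hadoop's real codec classes:

```java
public class SplitCheckDemo {
    // Simplified stand-ins for Hadoop's codec interfaces.
    public interface CompressionCodec {}
    public interface SplittableCompressionCodec extends CompressionCodec {}

    public static class GzipCodec implements CompressionCodec {}
    public static class BZip2Codec implements SplittableCompressionCodec {}

    // A null codec means the file is not compressed at all.
    public static boolean isSplitable(CompressionCodec codec) {
        if (codec == null) {
            return true;  // uncompressed input: always splittable
        }
        // Splittable codecs (e.g. bzip2) can be split; others (e.g. gzip) cannot.
        return codec instanceof SplittableCompressionCodec;
    }
}
```

The bug amounts to dropping the `instanceof SplittableCompressionCodec` branch, so the method returns false for every non-null codec.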
[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596501#comment-13596501 ] Sangjin Lee commented on MAPREDUCE-5038: I think MAPREDUCE-5046 can be closed, as it is a subset of this patch. old API CombineFileInputFormat missing fixes that are in new API - Key: MAPREDUCE-5038 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 1.1.1 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.2.0 Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files MAPREDUCE-2021 solved returning duplicate hostnames in split locations MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on default FS In trunk this is not an issue as the one in mapred extends the one in mapreduce. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (MAPREDUCE-5046) backport MAPREDUCE-1423 to mapred.lib.CombineFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee resolved MAPREDUCE-5046. Resolution: Fixed Fix Version/s: 1.2.0 It was fixed as part of MAPREDUCE-5038. backport MAPREDUCE-1423 to mapred.lib.CombineFileInputFormat Key: MAPREDUCE-5046 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5046 Project: Hadoop Map/Reduce Issue Type: Bug Components: client Affects Versions: 1.1.1 Reporter: Sangjin Lee Fix For: 1.2.0 The CombineFileInputFormat class in org.apache.hadoop.mapred.lib (the old API) has a couple of issues. These issues were addressed in the new API (MAPREDUCE-1423), but the old class was not fixed. The main issue the JIRA refers to is a performance problem. However, IMO there is a more serious problem which is a thread-safety issue (rackToNodes) which was fixed alongside. What is the policy on addressing issues in the old API? Can we backport this to the old class? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596522#comment-13596522 ] Thomas Graves commented on MAPREDUCE-5023: -- Thanks Ravi! +1 on the first patch. It works for me on AM and HS. I'll commit shortly History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-5023: - Resolution: Fixed Fix Version/s: 2.0.4-alpha 0.23.7 3.0.0 Status: Resolved (was: Patch Available) History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Fix For: 3.0.0, 0.23.7, 2.0.4-alpha Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5023) History Server Web Services missing Job Counters
[ https://issues.apache.org/jira/browse/MAPREDUCE-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596543#comment-13596543 ] Hudson commented on MAPREDUCE-5023: --- Integrated in Hadoop-trunk-Commit #3437 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3437/]) MAPREDUCE-5023. History Server Web Services missing Job Counters (Ravi Prakash via tgraves) (Revision 1454156) Result = SUCCESS tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1454156 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/webapp/dao/JobCounterInfo.java History Server Web Services missing Job Counters Key: MAPREDUCE-5023 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5023 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, webapps Affects Versions: 0.23.6 Reporter: Kendall Thrapp Assignee: Ravi Prakash Priority: Critical Fix For: 3.0.0, 0.23.7, 2.0.4-alpha Attachments: MAPREDUCE-5023.patch, MAPREDUCE-5023.patch The History Server's Job Counters API is not returning all the counters seen on the Job's Counters webpage. Specifically, I'm not seeing any of the counters in the org.apache.hadoop.mapreduce.JobCounter group: TOTAL_LAUNCHED_MAPS TOTAL_LAUNCHED_REDUCES OTHER_LOCAL_MAPS SLOTS_MILLIS_MAPS SLOTS_MILLIS_REDUCES
[jira] [Updated] (MAPREDUCE-4885) streaming tests have multiple failures on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated MAPREDUCE-4885: - Target Version/s: 3.0.0 (was: 3.0.0, trunk-win) Affects Version/s: (was: trunk-win) streaming tests have multiple failures on Windows - Key: MAPREDUCE-4885 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4885 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/streaming, test Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: MAPREDUCE-4885.1.patch There are multiple test failures due to Queue configuration missing child queue names for root.
[jira] [Created] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail
Robert Parker created MAPREDUCE-5053: Summary: java.lang.InternalError from decompression codec cause reducer to fail Key: MAPREDUCE-5053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.5, 2.0.3-alpha, trunk Reporter: Robert Parker Assignee: Robert Parker The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError. This exception causes the reducer to fail and bypasses all of the fetch-failure logic. These decompression errors should instead be treated as fetch failures.
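The proposed behavior can be sketched as a guard around the decompress step: catch the codec's InternalError and rethrow it as an IOException so the existing fetch-failure handling (retrying the map output from another source) takes over instead of the reducer dying. This is an illustrative sketch only; the `ShuffleDecompressGuard` and `Decompressor` names are invented for the example and are not Hadoop's actual Fetcher code.

```java
import java.io.IOException;

// Hedged sketch of the proposed fix: native decompressors may surface
// corrupted input as java.lang.InternalError (an Error, not an Exception),
// which escapes the reducer's normal IOException handling. Converting it to
// an IOException lets the shuffle's fetch-failure path handle it.
public class ShuffleDecompressGuard {

    // Stand-in for a codec's decompressor interface.
    interface Decompressor {
        byte[] decompress(byte[] in);
    }

    static byte[] safeDecompress(Decompressor codec, byte[] in) throws IOException {
        try {
            return codec.decompress(in);
        } catch (InternalError e) {
            // Treat codec-internal failures like any other fetch failure.
            throw new IOException("Decompression failed; treating as fetch failure", e);
        }
    }

    public static void main(String[] args) {
        // Simulate a codec hitting corrupted shuffle data.
        Decompressor broken = in -> { throw new InternalError("lz4: corrupted input"); };
        try {
            safeDecompress(broken, new byte[] {1, 2, 3});
        } catch (IOException expected) {
            System.out.println("Converted to fetch failure: " + expected.getMessage());
        }
    }
}
```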
[jira] [Commented] (MAPREDUCE-207) Computing Input Splits on the MR Cluster
[ https://issues.apache.org/jira/browse/MAPREDUCE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596692#comment-13596692 ] Sandy Ryza commented on MAPREDUCE-207: -- Arun, are you still planning on working on this? If not, do you mind if I pick it up? Computing Input Splits on the MR Cluster Key: MAPREDUCE-207 URL: https://issues.apache.org/jira/browse/MAPREDUCE-207 Project: Hadoop Map/Reduce Issue Type: New Feature Components: applicationmaster, mrv2 Reporter: Philip Zeyliger Assignee: Arun C Murthy Attachments: MAPREDUCE-207.patch Instead of computing the input splits as part of job submission, Hadoop could have a separate job task type that computes the input splits, therefore allowing that computation to happen on the cluster.
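The split computation the issue proposes to relocate is itself straightforward; a toy version of it is below. The `ClusterSplitComputation` class and its types are invented for illustration (real Hadoop splits come from InputFormat.getSplits); the point is only that this loop runs client-side at submission time today, and under the proposal would run as a dedicated task on the cluster.

```java
import java.util.*;

// Conceptual sketch of the computation in question: dividing an input file
// into block-aligned splits. Today this happens during client-side job
// submission; the proposal would move it into a cluster-side task.
public class ClusterSplitComputation {

    static class InputSplit {
        final long offset, length;
        InputSplit(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // One split per block, with the final split covering the remainder.
    static List<InputSplit> computeSplits(long fileLength, long blockSize) {
        List<InputSplit> splits = new ArrayList<>();
        for (long off = 0; off < fileLength; off += blockSize) {
            splits.add(new InputSplit(off, Math.min(blockSize, fileLength - off)));
        }
        return splits;
    }

    public static void main(String[] args) {
        // A 300-byte file with 128-byte blocks yields splits of 128, 128, and 44 bytes.
        System.out.println(computeSplits(300, 128).size());
    }
}
```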
[jira] [Updated] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5053: - Status: Patch Available (was: Open) java.lang.InternalError from decompression codec cause reducer to fail -- Key: MAPREDUCE-5053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.23.5, 2.0.3-alpha, trunk Reporter: Robert Parker Assignee: Robert Parker Attachments: MAPREDUCE-5053-1.patch The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError. This exception causes the reducer to fail and bypasses all of the fetch-failure logic. These decompression errors should instead be treated as fetch failures.
[jira] [Updated] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Parker updated MAPREDUCE-5053: - Attachment: MAPREDUCE-5053-1.patch java.lang.InternalError from decompression codec cause reducer to fail -- Key: MAPREDUCE-5053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: trunk, 2.0.3-alpha, 0.23.5 Reporter: Robert Parker Assignee: Robert Parker Attachments: MAPREDUCE-5053-1.patch The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError. This exception causes the reducer to fail and bypasses all of the fetch-failure logic. These decompression errors should instead be treated as fetch failures.
[jira] [Commented] (MAPREDUCE-5053) java.lang.InternalError from decompression codec cause reducer to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-5053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596724#comment-13596724 ] Hadoop QA commented on MAPREDUCE-5053: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572678/MAPREDUCE-5053-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3393//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3393//console This message is automatically generated. java.lang.InternalError from decompression codec cause reducer to fail -- Key: MAPREDUCE-5053 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5053 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: trunk, 2.0.3-alpha, 0.23.5 Reporter: Robert Parker Assignee: Robert Parker Attachments: MAPREDUCE-5053-1.patch The lz4, snappy, zlib, and lzo Decompressors throw only java.lang.InternalError.
This exception causes the reducer to fail and bypasses all of the fetch-failure logic. These decompression errors should instead be treated as fetch failures.
[jira] [Updated] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry Chen updated MAPREDUCE-4961: -- Attachment: MAPREDUCE-4961.patch Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations - Key: MAPREDUCE-4961 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: trunk Reporter: Jerry Chen Assignee: Jerry Chen Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch Original Estimate: 72h Remaining Estimate: 72h MAPREDUCE-4049 provides the ability for a pluggable Shuffle, and MAPREDUCE-4080 extends Shuffle to be able to provide different MergeManager implementations. While using these pluggable features, I found that when a MapReduce job runs locally, a RawKeyValueIterator is returned directly from a static call to Merger.merge, which breaks the assumption that the Shuffle may provide different merge methods, even though there is no copy phase in this situation. The use case: when implementing a hash-based MergeManager, we don't need a map-side sort, but when the job runs locally the hash-based MergeManager never gets a chance to be used, because execution goes directly to Merger.merge. This makes the pluggable Shuffle and MergeManager incomplete. So we need to move the code that calls Merger.merge from ReduceTask into the ShuffleConsumerPlugin implementation, so that the Shuffle implementation can decide how to do the merge and return the corresponding iterator.
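The restructuring the issue describes can be illustrated with simplified stand-ins: the reduce side always asks the plugin for its merge iterator instead of hard-coding a static Merger.merge call, so an alternative plugin still gets control in local mode. The interfaces below are deliberately toy versions (String records instead of key/value pairs) and are not Hadoop's real ShuffleConsumerPlugin API.

```java
import java.util.*;

// Illustrative sketch of the proposed change: ReduceTask delegates the merge
// to the plugin in all cases, rather than calling a static merge directly
// when running locally. A hash-based plugin could then skip sorting entirely.
public class LocalMergeViaPlugin {

    // Toy stand-in for Hadoop's ShuffleConsumerPlugin.
    interface ShuffleConsumerPlugin {
        Iterator<String> merge(List<List<String>> mapOutputs);
    }

    // Default plugin: sort-based merge, analogous to Merger.merge.
    static class SortMergePlugin implements ShuffleConsumerPlugin {
        public Iterator<String> merge(List<List<String>> mapOutputs) {
            List<String> all = new ArrayList<>();
            mapOutputs.forEach(all::addAll);
            Collections.sort(all);
            return all.iterator();
        }
    }

    // Reduce-side entry point: always goes through the plugin, local or not,
    // so a replacement plugin is never bypassed.
    static List<String> runReduce(ShuffleConsumerPlugin plugin,
                                  List<List<String>> mapOutputs) {
        List<String> reduced = new ArrayList<>();
        plugin.merge(mapOutputs).forEachRemaining(reduced::add);
        return reduced;
    }

    public static void main(String[] args) {
        List<List<String>> outputs = Arrays.asList(
            Arrays.asList("b", "d"), Arrays.asList("a", "c"));
        System.out.println(runReduce(new SortMergePlugin(), outputs));
    }
}
```

A hash-based plugin would implement the same interface but return records in bucket order, which is exactly the flexibility that a direct static Merger.merge call in local mode takes away.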
[jira] [Updated] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry Chen updated MAPREDUCE-4961: -- Status: Open (was: Patch Available) Cancelling the current patch to take Asokan's advice. Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations - Key: MAPREDUCE-4961 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: trunk Reporter: Jerry Chen Assignee: Jerry Chen Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch Original Estimate: 72h Remaining Estimate: 72h MAPREDUCE-4049 provides the ability for a pluggable Shuffle, and MAPREDUCE-4080 extends Shuffle to be able to provide different MergeManager implementations. While using these pluggable features, I found that when a MapReduce job runs locally, a RawKeyValueIterator is returned directly from a static call to Merger.merge, which breaks the assumption that the Shuffle may provide different merge methods, even though there is no copy phase in this situation. The use case: when implementing a hash-based MergeManager, we don't need a map-side sort, but when the job runs locally the hash-based MergeManager never gets a chance to be used, because execution goes directly to Merger.merge. This makes the pluggable Shuffle and MergeManager incomplete. So we need to move the code that calls Merger.merge from ReduceTask into the ShuffleConsumerPlugin implementation, so that the Shuffle implementation can decide how to do the merge and return the corresponding iterator.
[jira] [Updated] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry Chen updated MAPREDUCE-4961: -- Status: Patch Available (was: Open) New patch submitted. It removes the changes related to MergeManager and keeps that responsibility in the ShuffleConsumerPlugin interface. Please help review. Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations - Key: MAPREDUCE-4961 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: trunk Reporter: Jerry Chen Assignee: Jerry Chen Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch Original Estimate: 72h Remaining Estimate: 72h MAPREDUCE-4049 provides the ability for a pluggable Shuffle, and MAPREDUCE-4080 extends Shuffle to be able to provide different MergeManager implementations. While using these pluggable features, I found that when a MapReduce job runs locally, a RawKeyValueIterator is returned directly from a static call to Merger.merge, which breaks the assumption that the Shuffle may provide different merge methods, even though there is no copy phase in this situation. The use case: when implementing a hash-based MergeManager, we don't need a map-side sort, but when the job runs locally the hash-based MergeManager never gets a chance to be used, because execution goes directly to Merger.merge. This makes the pluggable Shuffle and MergeManager incomplete. So we need to move the code that calls Merger.merge from ReduceTask into the ShuffleConsumerPlugin implementation, so that the Shuffle implementation can decide how to do the merge and return the corresponding iterator.
[jira] [Commented] (MAPREDUCE-4961) Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations
[ https://issues.apache.org/jira/browse/MAPREDUCE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596916#comment-13596916 ] Hadoop QA commented on MAPREDUCE-4961: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12572711/MAPREDUCE-4961.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3394//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3394//console This message is automatically generated.
Map reduce running local should also go through ShuffleConsumerPlugin for enabling different MergeManager implementations - Key: MAPREDUCE-4961 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4961 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: trunk Reporter: Jerry Chen Assignee: Jerry Chen Attachments: MAPREDUCE-4961.patch, MAPREDUCE-4961.patch Original Estimate: 72h Remaining Estimate: 72h MAPREDUCE-4049 provides the ability for a pluggable Shuffle, and MAPREDUCE-4080 extends Shuffle to be able to provide different MergeManager implementations. While using these pluggable features, I found that when a MapReduce job runs locally, a RawKeyValueIterator is returned directly from a static call to Merger.merge, which breaks the assumption that the Shuffle may provide different merge methods, even though there is no copy phase in this situation. The use case: when implementing a hash-based MergeManager, we don't need a map-side sort, but when the job runs locally the hash-based MergeManager never gets a chance to be used, because execution goes directly to Merger.merge. This makes the pluggable Shuffle and MergeManager incomplete. So we need to move the code that calls Merger.merge from ReduceTask into the ShuffleConsumerPlugin implementation, so that the Shuffle implementation can decide how to do the merge and return the corresponding iterator.