[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505291#comment-13505291 ] Vinod Kumar Vavilapalli commented on MAPREDUCE-4819: Haven't read through the whole discussion yet, but it looks to me that the following will solve the issue: What we need to ensure is that the final JobHistoryEvent, i.e., JobFinishedEvent, is logged and flushed before changing the job state to SUCCEEDED. JobHistory is our commit log. In case the RM reruns an application, we need to check whether there is a final JobFinishedEvent and avoid rerunning if there is one. AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt. Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed. If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
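The commit-log idea in the comment above could be sketched roughly as follows. This is not the MR AM code: the class `CommitLogSketch`, its methods, and the one-line text record `JOB_FINISHED` are hypothetical stand-ins, and a local file replaces the real job history file. The sketch only shows the ordering (write and force the final record, then change state) and the check a later attempt might make.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical illustration: the history file is treated as a commit log.
final class CommitLogSketch {
    // Stand-in for the final JobFinishedEvent record (assumed format).
    static final String FINAL_EVENT = "JOB_FINISHED";

    private final Path historyFile;
    private String state = "RUNNING";

    CommitLogSketch(Path historyFile) {
        this.historyFile = historyFile;
    }

    // Log and flush the final event first; only then change the job state.
    void markFinished() {
        try (FileChannel channel = FileChannel.open(historyFile,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer record = ByteBuffer.wrap(
                    (FINAL_EVENT + "\n").getBytes(StandardCharsets.UTF_8));
            while (record.hasRemaining()) {
                channel.write(record);
            }
            channel.force(true); // push the record to storage before proceeding
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        state = "SUCCEEDED";
    }

    String state() {
        return state;
    }

    // A later attempt reruns the job only if no final event was recorded.
    static boolean shouldRerun(Path historyFile) {
        if (!Files.exists(historyFile)) {
            return true;
        }
        try {
            List<String> lines =
                    Files.readAllLines(historyFile, StandardCharsets.UTF_8);
            return !lines.contains(FINAL_EVENT);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the process dies between the forced write and the state change, a restart still sees the final record and skips the rerun, which is the property the comment asks for.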
[jira] [Updated] (MAPREDUCE-4798) TestJobHistoryServer fails some times with 'java.lang.AssertionError: Address already in use'
[ https://issues.apache.org/jira/browse/MAPREDUCE-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam liu updated MAPREDUCE-4798: --- Attachment: (was: MAPREDUCE-4798.patch) TestJobHistoryServer fails some times with 'java.lang.AssertionError: Address already in use' - Key: MAPREDUCE-4798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4798 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, test Affects Versions: 1.0.3 Environment: Red Hat Ent Server 6.2 Reporter: sam liu Priority: Minor Labels: patch Fix For: 1.0.3 Original Estimate: 3h Remaining Estimate: 3h UT Failure in IHC 1.0.3: org.apache.hadoop.mapred.TestJobHistoryServer. This UT fails sometimes. The error message is: 'Testcase: testHistoryServerStandalone took 5.376 sec Caused an ERROR Address already in use java.lang.AssertionError: Address already in use at org.apache.hadoop.mapred.TestJobHistoryServer.testHistoryServerStandalone(TestJobHistoryServer.java:113)' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4798) TestJobHistoryServer fails some times with 'java.lang.AssertionError: Address already in use'
[ https://issues.apache.org/jira/browse/MAPREDUCE-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam liu updated MAPREDUCE-4798: --- Attachment: MAPREDUCE-4798_branch-1.patch MAPREDUCE-4798.patch TestJobHistoryServer fails some times with 'java.lang.AssertionError: Address already in use' - Key: MAPREDUCE-4798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4798 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, test Affects Versions: 1.0.3 Environment: Red Hat Ent Server 6.2 Reporter: sam liu Priority: Minor Labels: patch Fix For: 1.0.3 Attachments: MAPREDUCE-4798_branch-1.patch, MAPREDUCE-4798.patch Original Estimate: 3h Remaining Estimate: 3h UT Failure in IHC 1.0.3: org.apache.hadoop.mapred.TestJobHistoryServer. This UT fails sometimes. The error message is: 'Testcase: testHistoryServerStandalone took 5.376 sec Caused an ERROR Address already in use java.lang.AssertionError: Address already in use at org.apache.hadoop.mapred.TestJobHistoryServer.testHistoryServerStandalone(TestJobHistoryServer.java:113)' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4798) TestJobHistoryServer fails some times with 'java.lang.AssertionError: Address already in use'
[ https://issues.apache.org/jira/browse/MAPREDUCE-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505329#comment-13505329 ] sam liu commented on MAPREDUCE-4798: Eric, I updated the patch for 1.0.3 and uploaded the patch for branch-1. For trunk, there is no TestJobHistoryServer.java, so I did not generate another patch for trunk. Thanks! TestJobHistoryServer fails some times with 'java.lang.AssertionError: Address already in use' - Key: MAPREDUCE-4798 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4798 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver, test Affects Versions: 1.0.3 Environment: Red Hat Ent Server 6.2 Reporter: sam liu Priority: Minor Labels: patch Fix For: 1.0.3 Attachments: MAPREDUCE-4798_branch-1.patch, MAPREDUCE-4798.patch Original Estimate: 3h Remaining Estimate: 3h UT Failure in IHC 1.0.3: org.apache.hadoop.mapred.TestJobHistoryServer. This UT fails sometimes. The error message is: 'Testcase: testHistoryServerStandalone took 5.376 sec Caused an ERROR Address already in use java.lang.AssertionError: Address already in use at org.apache.hadoop.mapred.TestJobHistoryServer.testHistoryServerStandalone(TestJobHistoryServer.java:113)'
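The thread does not say how the attached patches avoid the 'Address already in use' failure. One common way to keep a test from colliding with a fixed port, shown here only as an assumption and not as the content of the patch, is to let the operating system choose an unused port by binding to port 0. The class name `FreePortSketch` is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical test helper, not taken from the MAPREDUCE-4798 patches.
final class FreePortSketch {
    // Binding to port 0 asks the OS for any currently unused port.
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The port is released when the socket closes, so another process could still take it before the server under test binds to it; the approach narrows the race rather than removing it.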
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505331#comment-13505331 ] Avner BenHanoch commented on MAPREDUCE-4049: Arun & Alejandro - thanks for your clarifications! To summarize, I'll implement all your comments, with the following emphases: * Leave class names as is: both for *Shuffle* and *ShuffleConsumerPlugin*. * ShuffleConsumerPlugin will be an *interface* instead of an abstract class * The property name will be *mapreduce.job.reduce.shuffle.class* * ShuffleConsumerPlugin & ShuffleContext need to be *@LimitedPrivate* (without @unstable, since it is an interface for 3rd-party vendors) Please ACK! plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider & ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof.
Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suited to very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]
[jira] [Commented] (MAPREDUCE-4815) FileOutputCommitter.commitJob can be very slow for jobs with many output files
[ https://issues.apache.org/jira/browse/MAPREDUCE-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505479#comment-13505479 ] eric baldeschwieler commented on MAPREDUCE-4815: Bikas: Anyone who wants to use S3 as a target is going to have trouble with this. Of course, that is the case with the MR1 implementation too, so this is not a regression. But we're going to need to put energy into providing an alternative approach that does work with cloud stores at some point. It would be great if a solution emerged here that did not involve moving files. That would reduce the burden on the HDFS metadata system too, which would be a good thing for scalability. FileOutputCommitter.commitJob can be very slow for jobs with many output files -- Key: MAPREDUCE-4815 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4815 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv2 Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Arun C Murthy If a job generates many files to commit then the commitJob method call at the end of the job can take minutes. This is a performance regression from 1.x, as 1.x had the tasks commit directly to the final output directory as they were completing and commitJob had very little to do. The commit work was processed in parallel and overlapped the processing of outstanding tasks. In 0.23/2.x, the commit is single-threaded and waits until all tasks have completed before commencing.
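To make the single-threaded versus parallel contrast in the description concrete, here is a small sketch that moves a list of files into a final directory using a thread pool. It is not a proposed fix and does not use Hadoop's FileSystem API: it works on the local file system via java.nio, and `ParallelCommitSketch` and `commitAll` are hypothetical names. It also still moves files, which the comment above would prefer to avoid.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of committing many outputs in parallel.
final class ParallelCommitSketch {
    // Moves every task output into finalDir and returns the number moved.
    static int commitAll(List<Path> taskOutputs, Path finalDir, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (Path source : taskOutputs) {
                Callable<Path> move = () ->
                        Files.move(source, finalDir.resolve(source.getFileName()));
                pending.add(pool.submit(move));
            }
            int moved = 0;
            for (Future<Path> result : pending) {
                result.get(); // surfaces any failure from the worker thread
                moved++;
            }
            return moved;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With `threads` set to 1 this degenerates to the serial behaviour described for 0.23/2.x; larger values overlap the per-file work, at the cost of issuing more concurrent metadata operations.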
[jira] [Updated] (MAPREDUCE-4778) Fair scheduler event log is only written if directory exists on HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-4778: - Resolution: Fixed Fix Version/s: 2.0.3-alpha 1.2.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1 I just committed this. Thanks, Sandy! Fair scheduler event log is only written if directory exists on HDFS Key: MAPREDUCE-4778 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4778 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker, scheduler Affects Versions: 1.1.0, 2.0.2-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.2.0, 2.0.3-alpha Attachments: MAPREDUCE-4778.branch1.patch, MAPREDUCE-4778.patch The fair scheduler event log is supposed to be written to the local filesystem, at {hadoop.log.dir}/fairscheduler. The event log will not be written unless this directory exists on HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4778) Fair scheduler event log is only written if directory exists on HDFS
[ https://issues.apache.org/jira/browse/MAPREDUCE-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505505#comment-13505505 ] Hudson commented on MAPREDUCE-4778: --- Integrated in Hadoop-trunk-Commit #3068 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3068/]) MAPREDUCE-4778. Fair scheduler event log is only written if directory exists on HDFS. Contributed by Sandy Ryza. (Revision 1414729) Result = SUCCESS tomwhite : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1414729 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerEventLog.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerEventLog.java Fair scheduler event log is only written if directory exists on HDFS Key: MAPREDUCE-4778 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4778 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobtracker, scheduler Affects Versions: 1.1.0, 2.0.2-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.2.0, 2.0.3-alpha Attachments: MAPREDUCE-4778.branch1.patch, MAPREDUCE-4778.patch The fair scheduler event log is supposed to be written to the local filesystem, at {hadoop.log.dir}/fairscheduler. The event log will not be written unless this directory exists on HDFS.
[jira] [Commented] (MAPREDUCE-2374) Text File Busy errors launching MR tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505518#comment-13505518 ] Marc Reichman commented on MAPREDUCE-2374: -- Please consider backporting this to the stable branch, as I am seeing this regularly in 1.0.3/1.0.4. I believe this is the true fix for the original condition (not what was fixed, see the last few comments) of MAPREDUCE-4003. At the very least, if someone could provide a patched Hadoop 1.0.3 jar with this bash fix I would try it out. Text File Busy errors launching MR tasks -- Key: MAPREDUCE-2374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Andy Isaacson Fix For: 0.23.3, 2.0.2-alpha Attachments: failed_taskjvmsh.strace, mapreduce-2374-2.txt, mapreduce-2374-branch-1.patch, mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace Some very small percentage of tasks fail with a "Text file busy" error. The following was the original diagnosis: {quote} Our use of PrintWriter in TaskController.writeCommand is unsafe, since that class swallows all IO exceptions. We're not currently checking for errors, which I'm seeing result in occasional task failures with the message "Text file busy" - assumedly because the close() call is failing silently for some reason. {quote} .. but it turned out to be another issue as well (see below)
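The PrintWriter behaviour mentioned in the quoted diagnosis can be shown with plain JDK classes. The sketch below is not the TaskController code: `PrintWriterErrorSketch`, `FailingStream` and this `writeCommand` are hypothetical, and the failing stream merely simulates a write error. It shows that println() and close() do not throw, and that checkError() is how a caller finds out a write failed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

// Hypothetical illustration of PrintWriter swallowing IOExceptions.
final class PrintWriterErrorSketch {
    // A stream that always fails, standing in for a failing disk write.
    static final class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    // Returns true if writing the command failed.
    static boolean writeCommand(OutputStream out, String command) {
        PrintWriter writer = new PrintWriter(out);
        writer.println(command); // buffered; never throws IOException
        writer.close();          // flushes; an IOException here is swallowed too
        return writer.checkError(); // reports whether any write or flush failed
    }
}
```

A caller that ignores the returned flag reproduces the silent failure described in the diagnosis; writing through a stream or writer that propagates IOException is another way to avoid it.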
[jira] [Created] (MAPREDUCE-4826) backport MAPREDUCE-2374 fix to 1.0.x stable
Marc Reichman created MAPREDUCE-4826: Summary: backport MAPREDUCE-2374 fix to 1.0.x stable Key: MAPREDUCE-4826 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4826 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task-controller Affects Versions: 1.0.3 Environment: Linux CentOS 6.3 amd64 Reporter: Marc Reichman Please consider backporting this fix to 1.0.x. I am running into it frequently, and it seems to be the original situation of MAPREDUCE-4003, which was marked fixed for a different item (see the last few comments).
[jira] [Updated] (MAPREDUCE-4826) backport MAPREDUCE-2374 fix to 1.0.x stable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marc Reichman updated MAPREDUCE-4826: - Description: Please consider backporting the fix for MAPREDUCE-2374 to 1.0.x. I am running into it frequently, and it seems to be the original situation of MAPREDUCE-4003, which was marked fixed for a different item (see the last few comments). (was: Please consider backporting this fix to 1.0.x. I am running into it frequently, and it seems to be the original situation of MAPREDUCE-4003, which was marked fixed for a different item (see the last few comments).) backport MAPREDUCE-2374 fix to 1.0.x stable --- Key: MAPREDUCE-4826 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4826 Project: Hadoop Map/Reduce Issue Type: Improvement Components: task-controller Affects Versions: 1.0.3 Environment: Linux CentOS 6.3 amd64 Reporter: Marc Reichman Original Estimate: 48h Remaining Estimate: 48h Please consider backporting the fix for MAPREDUCE-2374 to 1.0.x. I am running into it frequently, and it seems to be the original situation of MAPREDUCE-4003, which was marked fixed for a different item (see the last few comments).
[jira] [Updated] (MAPREDUCE-4809) Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-4809: - Resolution: Fixed Fix Version/s: (was: 2.0.3-alpha) MR-2454 Status: Resolved (was: Patch Available) +1. I've just committed this to MR-2454 branch, thanks Asokan! Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate) Key: MAPREDUCE-4809 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4809 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: MR-2454 Attachments: MAPREDUCE-4809-1.patch, mapreduce-4809.patch, mapreduce-4809.patch, mapreduce-4809.patch Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4824) Provide a mechanism for jobs to indicate they should not be recovered on restart
[ https://issues.apache.org/jira/browse/MAPREDUCE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated MAPREDUCE-4824: - Attachment: MAPREDUCE-4824.patch Thanks for the feedback. Here's an updated patch with the improved message. I didn't add the property to mapred-default.xml, since it is a job-specific property and these are generally not added there. There's no way to have true job-specific properties, since if someone adds the property to the jobtracker's mapred-site.xml file then it will be picked up. I'm not sure there's an easy way around this. Provide a mechanism for jobs to indicate they should not be recovered on restart Key: MAPREDUCE-4824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4824 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv1 Affects Versions: 1.1.0 Reporter: Tom White Assignee: Tom White Attachments: MAPREDUCE-4824.patch, MAPREDUCE-4824.patch Some jobs (like Sqoop or HBase jobs) are not idempotent, so should not be recovered on jobtracker restart. MAPREDUCE-2702 solves this problem for MR2, however the approach there is not applicable for MR1, since even if we only use the job-level part of the patch and add an isRecoverySupported method to OutputCommitter, there is no way to use that information from the JT (which initiates recovery), since the JT does not instantiate OutputCommitters - and it shouldn't since they are user-level code. (In MR2 it's OK since the MR AM calls the method.) Instead, we can add an MR configuration property to say that a job is not recoverable, and the JT could safely read this from the job conf.
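The configuration-property approach described above might look roughly like this. The thread does not name the property, so the key `example.job.recoverable` is invented for illustration, and java.util.Properties stands in for the job conf; `RecoverableFlagSketch` is likewise hypothetical. The point is only that the recovery decision reads a plain configuration value and never loads user-level code.

```java
import java.util.Properties;

// Hypothetical illustration: the restart logic consults a job-level flag.
final class RecoverableFlagSketch {
    // Invented key; the real property name is not given in the thread.
    static final String RECOVERABLE_KEY = "example.job.recoverable";

    // Jobs are treated as recoverable unless they explicitly opt out.
    static boolean shouldRecover(Properties jobConf) {
        return Boolean.parseBoolean(jobConf.getProperty(RECOVERABLE_KEY, "true"));
    }
}
```

Defaulting to "true" keeps existing jobs on their current behaviour, so only jobs that set the flag to "false" are skipped on restart.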
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-4807: - Status: Open (was: Patch Available) The patch looks reasonable, some comments: # The Context should just be passed into the ctor rather than ctor/init pairs - they don't buy us much. # Please keep the member fields in MapOutputBuffer/DirectMapOutputCollector, this way your patch is *much* smaller. Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4824) Provide a mechanism for jobs to indicate they should not be recovered on restart
[ https://issues.apache.org/jira/browse/MAPREDUCE-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505596#comment-13505596 ] Harsh J commented on MAPREDUCE-4824: bq. I didn't add the property to mapred-default.xml, since it is a job-specific property and these are generally not added there. We do have several job-specific properties with proper defaults listed in that file. Unless someone overrides them manually, what is the harm in doing this, and must we remove the ones already present? The file just helps serve as good documentation for the config feature, because otherwise there's no doc reference to this in the patch. Provide a mechanism for jobs to indicate they should not be recovered on restart Key: MAPREDUCE-4824 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4824 Project: Hadoop Map/Reduce Issue Type: New Feature Components: mrv1 Affects Versions: 1.1.0 Reporter: Tom White Assignee: Tom White Attachments: MAPREDUCE-4824.patch, MAPREDUCE-4824.patch Some jobs (like Sqoop or HBase jobs) are not idempotent, so should not be recovered on jobtracker restart. MAPREDUCE-2702 solves this problem for MR2, however the approach there is not applicable for MR1, since even if we only use the job-level part of the patch and add an isRecoverySupported method to OutputCommitter, there is no way to use that information from the JT (which initiates recovery), since the JT does not instantiate OutputCommitters - and it shouldn't since they are user-level code. (In MR2 it's OK since the MR AM calls the method.) Instead, we can add an MR configuration property to say that a job is not recoverable, and the JT could safely read this from the job conf.
[jira] [Updated] (MAPREDUCE-4817) Hardcoded task ping timeout kills tasks localizing large amounts of data
[ https://issues.apache.org/jira/browse/MAPREDUCE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4817: - Status: Patch Available (was: Open) Hardcoded task ping timeout kills tasks localizing large amounts of data Key: MAPREDUCE-4817 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4817 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mr-am Affects Versions: 0.23.3, 2.0.3-alpha Reporter: Jason Lowe Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-4817.patch, MAPREDUCE-4817.patch When a task is launched and spends more than 5 minutes localizing files, the AM will kill the task due to ping timeout. The AM's TaskHeartbeatHandler currently tracks tasks via a progress timeout and a ping timeout. The progress timeout can be controlled via mapreduce.task.timeout and even disabled by setting the property to 0. The ping timeout, however, is hardcoded to 5 minutes and cannot be configured. Therefore if the task takes too long localizing, it never gets running in order to ping back to the AM and the AM kills it due to ping timeout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4817) Hardcoded task ping timeout kills tasks localizing large amounts of data
[ https://issues.apache.org/jira/browse/MAPREDUCE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4817: - Attachment: MAPREDUCE-4817.patch This patch removes the ping timeout check from the AM task heartbeat handler. If we want to remove the other side from each Task we can do that in a separate JIRA. Hardcoded task ping timeout kills tasks localizing large amounts of data Key: MAPREDUCE-4817 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4817 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mr-am Affects Versions: 0.23.3, 2.0.3-alpha Reporter: Jason Lowe Assignee: Thomas Graves Priority: Critical Attachments: MAPREDUCE-4817.patch, MAPREDUCE-4817.patch When a task is launched and spends more than 5 minutes localizing files, the AM will kill the task due to ping timeout. The AM's TaskHeartbeatHandler currently tracks tasks via a progress timeout and a ping timeout. The progress timeout can be controlled via mapreduce.task.timeout and even disabled by setting the property to 0. The ping timeout, however, is hardcoded to 5 minutes and cannot be configured. Therefore if the task takes too long localizing, it never gets running in order to ping back to the AM and the AM kills it due to ping timeout.
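The difference between a configurable timeout and a hardcoded one can be illustrated with a small check. This is not the TaskHeartbeatHandler code: `TimeoutCheckSketch` and `isTimedOut` are hypothetical, and treating any non-positive value as "disabled" is an assumption that generalises the "set the property to 0" behaviour described for mapreduce.task.timeout.

```java
// Hypothetical illustration of a timeout check that can be disabled.
final class TimeoutCheckSketch {
    // Returns true when the task has been silent for longer than the timeout.
    // A timeout of 0 (or less, by assumption) disables the check entirely.
    static boolean isTimedOut(long lastReportMillis, long nowMillis, long timeoutMillis) {
        if (timeoutMillis <= 0) {
            return false;
        }
        return nowMillis - lastReportMillis > timeoutMillis;
    }
}
```

With the timeout passed in as a parameter, a task that spends six minutes localizing is killed under a five-minute limit but survives if the limit is raised or set to 0, which is the flexibility the hardcoded ping timeout lacks.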
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505617#comment-13505617 ] Jason Lowe commented on MAPREDUCE-4819: --- We have to be careful about the fact that the job history log is moved to the done intermediate directory during shutdown after notifying the client. Therefore there's a window of opportunity where we can fail after notifying the client and moving the job history file but before unregistering from the RM. When the app attempt restarts in that case, the job history file won't be found and we'll end up re-running the job from scratch. We either need to unregister from the RM first (and rely on the FINISHING grace period to buy us enough time to move the file) or explicitly *not* delete the file when we copy it to done intermediate and instead wait for the staging directory to be removed later to clean it up. AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt. Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed.
If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505618#comment-13505618 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- (Arun, hopefully your family's health issues are on the right track) Avner, * I'm OK with leaving *Shuffle* as it is, though I don't like the *Consumer* in the *ShuffleConsumerPlugin* interface; I'd be OK with *ShufflePlugin*. * The property name should reflect the final name of the *ShuffleConsumerPlugin* interface. * Please make ShuffleContext a static inner class of the *ShuffleConsumerPlugin* interface called *Context*. While I'm not religious about names, I do care. In this case, we have the opportunity to have a consistent set of names and APIs (i.e., inner Context) for a set of related plugins (all the ones affected by MAPREDUCE-2454). plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider & ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance.
# Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suited to very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505625#comment-13505625 ]

Alejandro Abdelnur commented on MAPREDUCE-4807:
---

Arun,

Regarding your #1 comment, I don't think it is a good idea, given that the MOC is instantiated using ReflectionUtils.newInstance(). Thus you cannot pass the context to a constructor; you need the init(). It's the same pattern used in MAPREDUCE-4049.

{code}
private <KEY, VALUE> MapOutputCollector<KEY, VALUE>
    createMapOutputCollector(JobConf job, TaskReporter reporter)
    throws IOException, ClassNotFoundException {
  // Instantiate the configured collector class reflectively (no-arg constructor).
  MapOutputCollector<KEY, VALUE> collector =
      (MapOutputCollector<KEY, VALUE>) ReflectionUtils.newInstance(
          job.getClass(JobContext.MAP_OUTPUT_COLLECTOR_CLASS_ATTR,
              MapOutputBuffer.class, MapOutputCollector.class), job);
  LOG.info("Map output collector class = " + collector.getClass().getName());
  // The context can only be handed over after construction, hence init().
  MapOutputCollector.Context context =
      new MapOutputCollector.Context(this, job, reporter);
  collector.init(context);
  return collector;
}
{code}

Allow MapOutputBuffer to be pluggable
-

Key: MAPREDUCE-4807
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807
Project: Hadoop Map/Reduce
Issue Type: Sub-task
Affects Versions: 2.0.2-alpha
Reporter: Arun C Murthy
Assignee: Mariappan Asokan
Fix For: 2.0.3-alpha
Attachments: COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch

Allow MapOutputBuffer to be pluggable
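For readers who want to try the "instantiate reflectively, then init(context)" pattern from the comment above outside of Hadoop, here is a minimal standalone sketch. Everything in it is hypothetical (the `Collector` interface, the `sketch.collector.class` key, the `Map`-based configuration); it only mimics the shape of the pattern and is not the MapOutputCollector API.

```java
import java.util.HashMap;
import java.util.Map;

public class PluggableCollectorSketch {

    /** Hypothetical plugin interface; stands in for a pluggable collector. */
    public interface Collector {
        void init(Context context);

        String describe();

        /** Bundles everything the plugin needs, so init() takes one argument. */
        final class Context {
            private final String taskId;
            private final Map<String, String> conf;

            public Context(String taskId, Map<String, String> conf) {
                this.taskId = taskId;
                this.conf = conf;
            }

            public String getTaskId() { return taskId; }

            public Map<String, String> getConf() { return conf; }
        }
    }

    /** Default implementation; reflection requires the public no-arg constructor. */
    public static class DefaultCollector implements Collector {
        private String taskId;

        public DefaultCollector() { }

        @Override
        public void init(Context context) {
            this.taskId = context.getTaskId();
        }

        @Override
        public String describe() {
            return "DefaultCollector for " + taskId;
        }
    }

    /** Hypothetical configuration key naming the implementation class. */
    public static final String COLLECTOR_CLASS_KEY = "sketch.collector.class";

    public static Collector create(Map<String, String> conf, String taskId)
            throws ReflectiveOperationException {
        String className =
            conf.getOrDefault(COLLECTOR_CLASS_KEY, DefaultCollector.class.getName());
        Class<? extends Collector> clazz =
            Class.forName(className).asSubclass(Collector.class);
        // Step 1: construct with no arguments, as a reflective factory would.
        Collector collector = clazz.getDeclaredConstructor().newInstance();
        // Step 2: hand over the context afterwards.
        collector.init(new Collector.Context(taskId, conf));
        return collector;
    }

    public static void main(String[] args) throws Exception {
        Collector collector = create(new HashMap<>(), "task_1");
        System.out.println(collector.describe());
    }
}
```

The trade-off is that the object exists briefly in an uninitialized state, which is why the factory calls init() before returning the instance to anyone else.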
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505628#comment-13505628 ] Robert Joseph Evans commented on MAPREDUCE-4819: My vote would be to leave it around until we are done done and staging is removed. It seems simpler.
[jira] [Commented] (MAPREDUCE-4817) Hardcoded task ping timeout kills tasks localizing large amounts of data
[ https://issues.apache.org/jira/browse/MAPREDUCE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505636#comment-13505636 ]

Robert Joseph Evans commented on MAPREDUCE-4817:

The patch is simple and straightforward. I am +1, assuming that Jenkins is OK with it.

I am not sure that we need to update the task. The ping is used to check whether the task can still reach the AM. If you want to remove it, go ahead and file a JIRA, but it may have further ramifications.

Hardcoded task ping timeout kills tasks localizing large amounts of data

Key: MAPREDUCE-4817
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4817
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: applicationmaster, mr-am
Affects Versions: 0.23.3, 2.0.3-alpha
Reporter: Jason Lowe
Assignee: Thomas Graves
Priority: Critical
Attachments: MAPREDUCE-4817.patch, MAPREDUCE-4817.patch

When a task is launched and spends more than 5 minutes localizing files, the AM will kill the task due to ping timeout. The AM's TaskHeartbeatHandler currently tracks tasks via a progress timeout and a ping timeout. The progress timeout can be controlled via mapreduce.task.timeout and even disabled by setting the property to 0. The ping timeout, however, is hardcoded to 5 minutes and cannot be configured. Therefore if the task takes too long localizing, it never gets running in order to ping back to the AM and the AM kills it due to ping timeout.
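To illustrate the difference between a hardcoded and a configurable ping timeout as described in the issue, here is a small sketch. The class, the property name `sketch.task.ping.timeout.ms`, and the "0 disables the check" convention for pings are all assumptions made for illustration; this is not the TaskHeartbeatHandler code and not the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

public class PingTimeoutSketch {

    /** Hypothetical property name, used only in this sketch. */
    public static final String PING_TIMEOUT_KEY = "sketch.task.ping.timeout.ms";

    /** The 5-minute value the issue describes as hardcoded; here it is only a default. */
    public static final long DEFAULT_PING_TIMEOUT_MS = 5 * 60 * 1000L;

    private final long pingTimeoutMs;
    private final Map<String, Long> lastPingMs = new HashMap<>();

    public PingTimeoutSketch(Map<String, String> conf) {
        String value = conf.get(PING_TIMEOUT_KEY);
        this.pingTimeoutMs = (value == null) ? DEFAULT_PING_TIMEOUT_MS : Long.parseLong(value);
    }

    /** Records the launch time, or a later ping, for a task. */
    public void ping(String taskId, long nowMs) {
        lastPingMs.put(taskId, nowMs);
    }

    /** Assumption: a timeout of 0 or less disables the check, mirroring the progress timeout. */
    public boolean isTimedOut(String taskId, long nowMs) {
        Long last = lastPingMs.get(taskId);
        return last != null && pingTimeoutMs > 0 && nowMs - last > pingTimeoutMs;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PING_TIMEOUT_KEY, String.valueOf(15 * 60 * 1000L));
        PingTimeoutSketch handler = new PingTimeoutSketch(conf);
        handler.ping("task_1", 0L);
        // Six minutes of localization no longer exceeds the configured timeout.
        System.out.println(handler.isTimedOut("task_1", 6 * 60 * 1000L));
    }
}
```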
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505644#comment-13505644 ] Jason Lowe commented on MAPREDUCE-4819: --- bq. My vote would be to leave it around until we are done done and staging is removed. It seems simpler. Agreed, although we would also need to make sure we only delete the staging directory after unregistering from the RM. Something we need to do anyway, see YARN-244.
[jira] [Created] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
Radim Kolar created MAPREDUCE-4827:
--

Summary: Increase hash quality of HashPartitioner
Key: MAPREDUCE-4827
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Radim Kolar

The hash partitioner uses object.hashCode() for splitting keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated by commons-lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution.
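As a rough illustration of the idea in this issue, the sketch below compares partitioning on the raw hashCode() with partitioning after a mixing step. The mixing function (a multiplicative step with the constant 0x9E3779B9 followed by a shift-xor) is an assumption chosen for the example; the contents of the attached patch are not shown in this thread, so this should not be read as the proposed change.

```java
public class HashSpreadSketch {

    /** Partitioning on the raw hash: mask the sign bit, then take the remainder. */
    public static int plainPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /**
     * Illustrative mixing step (an assumption, not the attached patch):
     * multiply by 0x9E3779B9, then fold the high bits into the low bits.
     */
    public static int mix(int h) {
        h *= 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    /** Same partitioning formula, applied to the mixed hash. */
    public static int mixedPartition(Object key, int numPartitions) {
        return (mix(key.hashCode()) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 8;
        // Integer keys hash to their own value, so multiples of 8 all collide
        // in the plain scheme; the mixed scheme can tell them apart.
        for (int key = 0; key <= 64; key += 8) {
            System.out.println(key + " plain=" + plainPartition(key, numPartitions)
                + " mixed=" + mixedPartition(key, numPartitions));
        }
    }
}
```

Keys whose hash codes share a common factor with the partition count are the classic bad case for the plain scheme, since the remainder then only looks at the low bits.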
[jira] [Updated] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Radim Kolar updated MAPREDUCE-4827: --- Attachment: betterhash1.txt
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505659#comment-13505659 ]

Avner BenHanoch commented on MAPREDUCE-4049:

Thanks Alejandro,

I'll submit the patch next week, based on ALL your (and Arun's) comments:

* Shuffle - leave the class name as is
* ShufflePlugin - instead of ShuffleConsumerPlugin
* ShufflePlugin will be an interface
* property name will be: *mapreduce.job.reduce.shuffle.plugin.class* (Kindly let me know ASAP if you prefer other name, or in case you consulted mapred-default.xml and preferred names like mapreduce.reduce.shuffle... OR mapreduce.shuffle... )
* ShuffleContext - ShufflePlugin.Context - a static inner class
* ShufflePlugin will be @LimitedPrivate (without @unstable)

Cheers,
Avner
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505673#comment-13505673 ] Robert Joseph Evans commented on MAPREDUCE-4827: I can see that there may be a need to improve the hashing of some poor quality implementations and the patch looks OK. I am not an expert on hash functions but from what I know it looks good. Do you have some concrete numbers that we can see how it improved the distribution in some specific cases?
[jira] [Commented] (MAPREDUCE-4817) Hardcoded task ping timeout kills tasks localizing large amounts of data
[ https://issues.apache.org/jira/browse/MAPREDUCE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505677#comment-13505677 ]

Hadoop QA commented on MAPREDUCE-4817:
--

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12555186/MAPREDUCE-4817.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3073//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3073//console

This message is automatically generated.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505679#comment-13505679 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- Avner, everything looks good except your last bullet, ShufflePlugin Context must be marked as @LimitedPrivate for MapReduce and as @Unstable.
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505683#comment-13505683 ]

Robert Joseph Evans commented on MAPREDUCE-4819:

Yes, but going off of Koji's comments, we also want to be sure that if the previous attempt's edit log does not exist, we don't know what state we were in, and we should just assume we need to unregister and exit.
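The decision being discussed on this issue (treat the job history as a commit log, skip the rerun if a final event was recorded, and bail out if the previous attempt's log is missing) can be sketched as a small pure function. The event name, the `Action` enum, and the list-of-strings log representation are hypothetical; this is one reading of the thread, not the eventual fix.

```java
import java.util.Arrays;
import java.util.List;

public class ReattemptDecisionSketch {

    /** Hypothetical outcomes for an AM attempt at startup. */
    public enum Action { RUN_JOB, UNREGISTER_AND_EXIT }

    /** Hypothetical marker for the final, flushed history event. */
    public static final String FINAL_EVENT = "JOB_FINISHED";

    /**
     * @param attempt        1 for the first AM attempt, higher for re-attempts
     * @param previousEvents events logged by the previous attempt, or null if
     *                       its log cannot be found
     */
    public static Action decide(int attempt, List<String> previousEvents) {
        if (attempt <= 1) {
            return Action.RUN_JOB;             // nothing to check on the first attempt
        }
        if (previousEvents == null) {
            return Action.UNREGISTER_AND_EXIT; // unknown state: do not risk a rerun
        }
        if (previousEvents.contains(FINAL_EVENT)) {
            return Action.UNREGISTER_AND_EXIT; // final status was already reported
        }
        return Action.RUN_JOB;                 // previous attempt died mid-job
    }

    public static void main(String[] args) {
        System.out.println(decide(2, Arrays.asList("JOB_SUBMITTED", "JOB_FINISHED")));
        System.out.println(decide(2, Arrays.asList("JOB_SUBMITTED")));
        System.out.println(decide(2, null));
    }
}
```

This only works if the final event is flushed before the job state is reported to the client, which is the ordering Vinod asks for earlier in the thread.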
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505687#comment-13505687 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- Avner, one more thing, please make sure the patch applies to branch MR-2454 (https://svn.apache.org/repos/asf/hadoop/common/branches/MR-2454). thx
[jira] [Commented] (MAPREDUCE-4825) JobImpl.finished doesn't expect ERROR as a final job state
[ https://issues.apache.org/jira/browse/MAPREDUCE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505701#comment-13505701 ]

Robert Joseph Evans commented on MAPREDUCE-4825:

The patch looks fine to me. +1 I'll check it in.

JobImpl.finished doesn't expect ERROR as a final job state
--

Key: MAPREDUCE-4825
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4825
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mr-am
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Jason Lowe
Attachments: MAPREDUCE-4825.patch

TestMRApp.testJobError is causing AsyncDispatcher to exit with System.exit due to an exception being thrown. From the console output from testJobError:

{noformat}
2012-11-27 18:46:15,240 ERROR [AsyncDispatcher event handler] impl.TaskImpl (TaskImpl.java:internalError(665)) - Invalid event T_SCHEDULE on Task task_0__m_00
2012-11-27 18:46:15,242 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(132)) - Error in dispatcher thread
java.lang.IllegalArgumentException: Illegal job state: ERROR
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.finished(JobImpl.java:838)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InternalErrorTransition.transition(JobImpl.java:1622)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InternalErrorTransition.transition(JobImpl.java:1)
	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$3(StateMachineFactory.java:287)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:723)
	at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:1)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:974)
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
	at java.lang.Thread.run(Thread.java:662)
2012-11-27 18:46:15,242 INFO [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(135)) - Exiting, bbye..
{noformat}
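The stack trace above shows a finish handler rejecting ERROR as a final state. A minimal sketch of that failure mode, and of one plausible way to accept ERROR, follows. The enum, the counters, and the choice to count ERROR together with FAILED are assumptions for illustration; the actual change is in the attached MAPREDUCE-4825.patch, which is not reproduced in this thread.

```java
public class FinalStateSketch {

    /** Hypothetical job states; only the final ones are valid in finished(). */
    public enum JobState { RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }

    private int succeeded;
    private int failed;
    private int killed;

    /** Records the final state and returns it; rejects non-final states. */
    public JobState finished(JobState finalState) {
        switch (finalState) {
            case SUCCEEDED:
                succeeded++;
                break;
            case KILLED:
                killed++;
                break;
            case ERROR:   // assumption: an internal error is accounted for as a failure
            case FAILED:
                failed++;
                break;
            default:
                // Without the ERROR case above, ERROR would land here and throw,
                // which is the exception seen in the console output.
                throw new IllegalArgumentException("Illegal job state: " + finalState);
        }
        return finalState;
    }

    public int getFailed() { return failed; }

    public static void main(String[] args) {
        FinalStateSketch job = new FinalStateSketch();
        System.out.println(job.finished(JobState.ERROR));
    }
}
```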
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505707#comment-13505707 ]

Radim Kolar commented on MAPREDUCE-4827:

It's Knuth's formula, commonly used in hash tables to improve hashing. Java's hashtable is using it too.
[jira] [Updated] (MAPREDUCE-4825) JobImpl.finished doesn't expect ERROR as a final job state
[ https://issues.apache.org/jira/browse/MAPREDUCE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-4825: --- Resolution: Fixed Fix Version/s: 0.23.6 2.0.3-alpha 3.0.0 Status: Resolved (was: Patch Available) Thanks Jason, I put this in trunk, branch-2, and branch-0.23
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505728#comment-13505728 ]

Arun C Murthy commented on MAPREDUCE-4049:
--

Looks good. Some more I've noted previously:

# Context should have get/set apis
# I don't see a need to replace all member fields in Shuffle.java, just init them from the passed-in context.
[jira] [Commented] (MAPREDUCE-4825) JobImpl.finished doesn't expect ERROR as a final job state
[ https://issues.apache.org/jira/browse/MAPREDUCE-4825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505730#comment-13505730 ] Hudson commented on MAPREDUCE-4825: --- Integrated in Hadoop-trunk-Commit #3069 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3069/]) MAPREDUCE-4825. JobImpl.finished doesn't expect ERROR as a final job state (jlowe via bobby) (Revision 1414840) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1414840 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java JobImpl.finished doesn't expect ERROR as a final job state -- Key: MAPREDUCE-4825 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4825 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 2.0.3-alpha, 0.23.5 Reporter: Jason Lowe Assignee: Jason Lowe Fix For: 3.0.0, 2.0.3-alpha, 0.23.6 Attachments: MAPREDUCE-4825.patch TestMRApp.testJobError is causing AsyncDispatcher to exit with System.exit due to an exception being thrown. 
From the console output from testJobError:
{noformat}
2012-11-27 18:46:15,240 ERROR [AsyncDispatcher event handler] impl.TaskImpl (TaskImpl.java:internalError(665)) - Invalid event T_SCHEDULE on Task task_0__m_00
2012-11-27 18:46:15,242 FATAL [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(132)) - Error in dispatcher thread
java.lang.IllegalArgumentException: Illegal job state: ERROR
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.finished(JobImpl.java:838)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InternalErrorTransition.transition(JobImpl.java:1622)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InternalErrorTransition.transition(JobImpl.java:1)
    at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:359)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:299)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$3(StateMachineFactory.java:287)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:445)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:723)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:1)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:974)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:128)
    at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:77)
    at java.lang.Thread.run(Thread.java:662)
2012-11-27 18:46:15,242 INFO [AsyncDispatcher event handler] event.AsyncDispatcher (AsyncDispatcher.java:dispatch(135)) - Exiting, bbye..
{noformat}
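The exception in the MAPREDUCE-4825 trace ("Illegal job state: ERROR" thrown from JobImpl.finished) is the kind of failure that arises when a method handling final states has no branch for one of them. The following is a minimal sketch of that failure mode and one possible fix. It is not the code from MAPREDUCE-4825.patch: the class FinishedStateSketch, the enum JobStateSketch, and both methods are hypothetical names introduced only for illustration.

```java
// Hypothetical illustration; not the actual JobImpl code.
final class FinishedStateSketch {
    // Assumed set of states for the sketch.
    enum JobStateSketch { RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }

    // "Before": ERROR is a final state, but it falls through to the default branch.
    static String finishedBefore(JobStateSketch finalState) {
        switch (finalState) {
            case SUCCEEDED: return "succeeded";
            case FAILED:    return "failed";
            case KILLED:    return "killed";
            default:
                throw new IllegalArgumentException("Illegal job state: " + finalState);
        }
    }

    // "After": ERROR is accepted as a final state like the others.
    static String finishedAfter(JobStateSketch finalState) {
        switch (finalState) {
            case SUCCEEDED: return "succeeded";
            case FAILED:    return "failed";
            case KILLED:    return "killed";
            case ERROR:     return "error";
            default:
                throw new IllegalArgumentException("Illegal job state: " + finalState);
        }
    }
}
```

In a dispatcher that exits on any uncaught exception, as the log shows AsyncDispatcher doing, the "before" variant would take the whole process down on an internal error transition.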
[jira] [Updated] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated MAPREDUCE-4819: -- Attachment: MAPREDUCE-4819.1.patch AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical Attachments: MAPREDUCE-4819.1.patch If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt. Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed. If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505740#comment-13505740 ] Arun C Murthy commented on MAPREDUCE-4807: -- Good point, agreed. Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505743#comment-13505743 ] Arun C Murthy commented on MAPREDUCE-4807: -- Also, the function needs to be renamed to 'createSortingCollector' or some such since it isn't creating the DirectMapOutputCollector - equivalently, we can move the creation of DirectMapOutputCollector there too. Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505742#comment-13505742 ] Mariappan Asokan commented on MAPREDUCE-4807: - Hi Arun, Thanks for your comments. I agree with Alejandro on #1. On #2, I agree with you. The patch will definitely get smaller. I will go ahead and make the changes. -- Asokan Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505747#comment-13505747 ] Bikas Saha commented on MAPREDUCE-4819: --- Attaching a patch based on discussions with Vinod and implementing what is in his comment above. I was testing it by making the AM die during MRAppMaster.shutdownJob() after successful job completion but the second attempt could not find the history file during recoveryService.parse() File does not exist: /tmp/hadoop-yarn/staging/bikas/.staging/job_1354125268052_0001_1.jhist bq. the job history log is moved to the done intermediate dir Can this explain why I am seeing the above error? Any pointers? AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical Attachments: MAPREDUCE-4819.1.patch If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt. Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed. If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit. -- This message is automatically generated by JIRA. 
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505792#comment-13505792 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- Hey Arun, changing the Context methods to get*() makes sense. Adding set*() methods is not needed at this point, right? plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. 
Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505801#comment-13505801 ] Avner BenHanoch commented on MAPREDUCE-4049: Arun/Alejandro, can you pls delink it? plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. 
Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505806#comment-13505806 ] Jason Lowe commented on MAPREDUCE-4819: --- See JobHistoryEventHandler.closeEventWriter and moveToDoneNow. That's what's moving the job history file from the staging directory to the done intermediate directory so the history server picks it up. We need to not delete the file after we move it. AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical Attachments: MAPREDUCE-4819.1.patch If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt. Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed. If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4817) Hardcoded task ping timeout kills tasks localizing large amounts of data
[ https://issues.apache.org/jira/browse/MAPREDUCE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated MAPREDUCE-4817: - Resolution: Fixed Fix Version/s: 0.23.6 2.0.3-alpha 3.0.0 Target Version/s: 3.0.0, 2.0.3-alpha, 0.23.6 Status: Resolved (was: Patch Available) Thanks Bobby, I've committed this. Hardcoded task ping timeout kills tasks localizing large amounts of data Key: MAPREDUCE-4817 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4817 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mr-am Affects Versions: 0.23.3, 2.0.3-alpha Reporter: Jason Lowe Assignee: Thomas Graves Priority: Critical Fix For: 3.0.0, 2.0.3-alpha, 0.23.6 Attachments: MAPREDUCE-4817.patch, MAPREDUCE-4817.patch When a task is launched and spends more than 5 minutes localizing files, the AM will kill the task due to ping timeout. The AM's TaskHeartbeatHandler currently tracks tasks via a progress timeout and a ping timeout. The progress timeout can be controlled via mapreduce.task.timeout and even disabled by setting the property to 0. The ping timeout, however, is hardcoded to 5 minutes and cannot be configured. Therefore if the task takes too long localizing, it never gets running in order to ping back to the AM and the AM kills it due to ping timeout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
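The MAPREDUCE-4817 description distinguishes two timers: a progress timeout that is configurable and can be disabled with 0, and a ping timeout that is fixed at 5 minutes. A minimal sketch of a checker in which both are configurable is shown here. It is not the committed patch and not the real TaskHeartbeatHandler: the class PingTimeoutSketch, its constructor parameters, and the rule that 0 disables the ping timeout are assumptions made for illustration.

```java
// Hypothetical illustration; not the actual TaskHeartbeatHandler.
final class PingTimeoutSketch {
    enum Verdict { ALIVE, PING_TIMEOUT, PROGRESS_TIMEOUT }

    private final long pingTimeoutMs;      // assumed: 0 disables the check
    private final long progressTimeoutMs;  // 0 disables the check, as for mapreduce.task.timeout

    PingTimeoutSketch(long pingTimeoutMs, long progressTimeoutMs) {
        this.pingTimeoutMs = pingTimeoutMs;
        this.progressTimeoutMs = progressTimeoutMs;
    }

    // Decides whether a task should be killed, given the current time and the
    // times of its last ping and last reported progress (all in milliseconds).
    Verdict check(long nowMs, long lastPingMs, long lastProgressMs) {
        if (pingTimeoutMs > 0 && nowMs - lastPingMs > pingTimeoutMs) {
            return Verdict.PING_TIMEOUT;
        }
        if (progressTimeoutMs > 0 && nowMs - lastProgressMs > progressTimeoutMs) {
            return Verdict.PROGRESS_TIMEOUT;
        }
        return Verdict.ALIVE;
    }
}
```

With a 5-minute ping timeout, a task that has spent 6 minutes localizing and has never pinged is reported as PING_TIMEOUT even when the progress timeout is disabled, which is the behaviour the issue describes.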
[jira] [Commented] (MAPREDUCE-4817) Hardcoded task ping timeout kills tasks localizing large amounts of data
[ https://issues.apache.org/jira/browse/MAPREDUCE-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505817#comment-13505817 ] Hudson commented on MAPREDUCE-4817: --- Integrated in Hadoop-trunk-Commit #3070 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3070/]) MAPREDUCE-4817. Hardcoded task ping timeout kills tasks localizing large amounts of data (tgraves) (Revision 1414873) Result = FAILURE tgraves : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1414873 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/TaskAttemptListenerImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/TaskHeartbeatHandler.java Hardcoded task ping timeout kills tasks localizing large amounts of data Key: MAPREDUCE-4817 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4817 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster, mr-am Affects Versions: 0.23.3, 2.0.3-alpha Reporter: Jason Lowe Assignee: Thomas Graves Priority: Critical Fix For: 3.0.0, 2.0.3-alpha, 0.23.6 Attachments: MAPREDUCE-4817.patch, MAPREDUCE-4817.patch When a task is launched and spends more than 5 minutes localizing files, the AM will kill the task due to ping timeout. The AM's TaskHeartbeatHandler currently tracks tasks via a progress timeout and a ping timeout. The progress timeout can be controlled via mapreduce.task.timeout and even disabled by setting the property to 0. The ping timeout, however, is hardcoded to 5 minutes and cannot be configured. Therefore if the task takes too long localizing, it never gets running in order to ping back to the AM and the AM kills it due to ping timeout. -- This message is automatically generated by JIRA. 
[jira] [Updated] (MAPREDUCE-4813) AM timing out during job commit
[ https://issues.apache.org/jira/browse/MAPREDUCE-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-4813: -- Attachment: MAPREDUCE-4813.patch Patch that fixes the unit test failures and adds some testing of the new COMMITTING state. As a bonus, most of the tests in TestJobImpl actually test a JobImpl object rather than a mock of it. AM timing out during job commit --- Key: MAPREDUCE-4813 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4813 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-4813.patch, MAPREDUCE-4813.patch The AM calls the output committer's {{commitJob}} method synchronously during JobImpl state transitions, which means the JobImpl write lock is held the entire time the job is being committed. Holding the write lock prevents the RM allocator thread from heartbeating to the RM. Therefore if committing the job takes too long (e.g.: the job has tons of files to commit and/or the namenode is bogged down) then the AM appears to be unresponsive to the RM and the RM kills the AM attempt. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
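The problem described in MAPREDUCE-4813 is that the commit runs while the job's write lock is held, so other threads cannot read job state until it finishes. One way to avoid that, sketched here under assumptions, is to enter an intermediate COMMITTING state, release the lock, and run the commit on a separate thread. This is not the attached patch: the class AsyncCommitSketch, its State enum, and its methods are hypothetical, and the commit is modelled as a plain Runnable.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration; not the actual JobImpl code.
final class AsyncCommitSketch {
    enum State { RUNNING, COMMITTING, SUCCEEDED, FAILED }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final ExecutorService committer = Executors.newSingleThreadExecutor();
    private State state = State.RUNNING;
    private volatile Future<?> pending;

    // The write lock is held only long enough to record COMMITTING;
    // the (possibly slow) commit itself runs on the committer thread.
    void startCommit(Runnable commitJob) {
        setState(State.COMMITTING);
        pending = committer.submit(() -> {
            State result;
            try {
                commitJob.run();
                result = State.SUCCEEDED;
            } catch (RuntimeException e) {
                result = State.FAILED;
            }
            setState(result);
        });
    }

    // Readers (for example a heartbeat thread) take only the read lock.
    State getState() {
        lock.readLock().lock();
        try {
            return state;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Blocks until the commit task has finished, then returns the final state.
    State awaitFinalState() {
        Future<?> task = pending;
        if (task != null) {
            try {
                task.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        }
        return getState();
    }

    void shutdown() {
        committer.shutdown();
    }

    private void setState(State next) {
        lock.writeLock().lock();
        try {
            state = next;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

While the commit is still running, getState() returns COMMITTING without blocking, so a thread that needs job state to heartbeat is not stalled for the duration of the commit.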
[jira] [Updated] (MAPREDUCE-4813) AM timing out during job commit
[ https://issues.apache.org/jira/browse/MAPREDUCE-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated MAPREDUCE-4813: -- Target Version/s: 2.0.3-alpha, 0.23.6 Status: Patch Available (was: Open) AM timing out during job commit --- Key: MAPREDUCE-4813 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4813 Project: Hadoop Map/Reduce Issue Type: Bug Components: applicationmaster Affects Versions: 2.0.1-alpha, 0.23.3 Reporter: Jason Lowe Assignee: Jason Lowe Priority: Critical Attachments: MAPREDUCE-4813.patch, MAPREDUCE-4813.patch The AM calls the output committer's {{commitJob}} method synchronously during JobImpl state transitions, which means the JobImpl write lock is held the entire time the job is being committed. Holding the write lock prevents the RM allocator thread from heartbeating to the RM. Therefore if committing the job takes too long (e.g.: the job has tons of files to commit and/or the namenode is bogged down) then the AM appears to be unresponsive to the RM and the RM kills the AM attempt. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505831#comment-13505831 ] Bikas Saha commented on MAPREDUCE-4819: --- Yeah. Got the same info from Vinod in an offline conversation. It looks like the patch solves half the problem: making sure that history is fully saved before changing to the succeeded state. The other half is to make sure the recovery data is available to the restarted app. Since the RM can restart FAILED/KILLED/SUCCEEDED apps, it looks like we will need to wait for state data to be saved for all of them and not just the succeeded state (which is what the patch does). Otherwise, the RM could restart a failed app, which would run again and fail again. The solutions to the second half could be:
1) Don't delete the original in the staging dirs. But this suffers from the problem that the final staging dir cleanup would end up cleaning it for a successful app, and then the AM could crash.
2) Have the recovery service look at both the temp and done locations. But this suffers from race conditions when the AM does a partial move to the done dir and then dies, so part of the data is in temp and part in done.
3) Before moving from temp to done, create a marker file in done. Upon restart, check if the marker file exists. If it does, then don't do anything, because the job was done (failed/killed/successful) and the AM died sometime after that.
AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical Attachments: MAPREDUCE-4819.1.patch If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt.
Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed. If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
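Option 3 in Bikas Saha's comment on MAPREDUCE-4819 (write a marker into the done location before moving the history file, and check for it on restart) can be sketched with the standard java.nio.file API. This is only an illustration on a local filesystem, not Hadoop code and not the attached patch: the class CompletionMarkerSketch, the marker name, and both method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of the marker-file idea; not Hadoop code.
final class CompletionMarkerSketch {
    // Assumed marker name; the thread does not specify one.
    static final String MARKER_NAME = "_JOB_DONE_MARKER";

    // Creates the marker in doneDir first, then moves the history file there.
    // A crash between the two steps still leaves the marker behind.
    static void finishJob(Path stagingHistoryFile, Path doneDir) throws IOException {
        Files.createDirectories(doneDir);
        Path marker = doneDir.resolve(MARKER_NAME);
        if (Files.notExists(marker)) {
            Files.createFile(marker);
        }
        Files.move(stagingHistoryFile, doneDir.resolve(stagingHistoryFile.getFileName()));
    }

    // A restarted attempt reruns the job only if no earlier attempt left a marker.
    static boolean shouldRerun(Path doneDir) {
        return Files.notExists(doneDir.resolve(MARKER_NAME));
    }
}
```

A restarted attempt would call shouldRerun before doing any work. The sketch does not address the concern raised later in the thread that the history server may move files out of the done intermediate directory in the interim.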
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505863#comment-13505863 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- Delink? Do you mean remove it as a sub-task? If so, I'd like it to stay as a subtask, as they are related. Thx plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge.
Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505864#comment-13505864 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- And don't worry about being a subtask delaying it; I'll review it as soon as you post a patch and commit it when ready. The same is happening with the other subtasks, so things should be in quite quickly. Thx plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof.
Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505890#comment-13505890 ] Radim Kolar commented on MAPREDUCE-4827: I have no numbers available. Increase hash quality of HashPartitioner Key: MAPREDUCE-4827 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Radim Kolar Attachments: betterhash1.txt The hash partitioner uses object.hashCode() for splitting keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated by commons lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505900#comment-13505900 ] Jason Lowe commented on MAPREDUCE-4819: --- We can't have the AM looking for the file in done_intermediate. The history server could have moved it out of there in the interim. And I don't think we want the AM to know how to find its file in the final done location the history server puts it in, either. Too much coupling between those systems, IMHO. I think leaving it in the staging directory is the correct solution. As I mentioned, we need to make sure we don't delete the staging directory before unregistering with the RM. That prevents subsequent AM re-attempts right off the bat. And deleting the staging directory before unregistering is happening today as discussed in YARN-244, so that problem is not specific to this fix. Leaving it in staging is straightforward. No need for extra markers, racing with the history server, etc. And if the staging directory is gone, well, the AM can't relaunch in the first place, so there are no issues of re-running and re-committing there. We could still have a discrepancy between the client thinking the job succeeded (which it basically did re: its output data) but the RM saying it failed, but this is fixable by moving the removal of the staging directory to after we unregister from the RM when we fix YARN-244. AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical Attachments: MAPREDUCE-4819.1.patch If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt. 
Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed. If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4819) AM can rerun job after reporting final job status to the client
[ https://issues.apache.org/jira/browse/MAPREDUCE-4819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505935#comment-13505935 ] Jason Lowe commented on MAPREDUCE-4819: --- Took a look at the patch, and I think we are missing some critical corner cases. For example, if we finish committing the job and the committer is using a marker of sorts (e.g.: _SUCCESS), then we could trigger downstream jobs to run *before* the job history is completely closed. I believe Oozie is polling for the _SUCCESS marker, for example. If we crash after committing but before writing the job finished record then we could end up re-committing again while another job is attempting to consume our output, leading to potential data loss even though both jobs would have SUCCEEDED. That's a Bad Thing. I think the crux of the issue is that we must not commit twice. The act of committing is what could trigger downstream jobs or in itself not be repeatable/recoverable, so we should treat AM crashes during job commit much like we treat non-crashing failures during job commit today, i.e.: it should fail the job without re-running and re-committing. Worst-case we have a false negative where the output did commit successfully but we thought the job failed, and I agree with Koji that a false negative beats a false positive in this case. This means we need a marker noting when we start and stop committing sync'd to the job history file. If the AM relaunches and finds we crashed during commit, we should treat it as we do a committer failure and fail the job. If the re-attempt finds we finished committing then we simply need to unregister from the RM without re-running. 
AM can rerun job after reporting final job status to the client --- Key: MAPREDUCE-4819 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4819 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am Affects Versions: 0.23.3, 2.0.1-alpha Reporter: Jason Lowe Assignee: Bikas Saha Priority: Critical Attachments: MAPREDUCE-4819.1.patch If the AM reports final job status to the client but then crashes before unregistering with the RM then the RM can run another AM attempt. Currently AM re-attempts assume that the previous attempts did not reach a final job state, and that causes the job to rerun (from scratch, if the output format doesn't support recovery). Re-running the job when we've already told the client the final status of the job is bad for a number of reasons. If the job failed, it's confusing at best since the client was already told the job failed but the subsequent attempt could succeed. If the job succeeded there could be data loss, as a subsequent job launched by the client tries to consume the job's output as input just as the re-attempt starts removing output files in preparation for the output commit. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
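The recovery rule proposed in the comment above can be sketched as a small decision function. This is only an illustration, not the MAPREDUCE-4819 patch: the class, enum, and method names are hypothetical, and it assumes a re-attempt can learn from the job history file whether a commit-start and a commit-end marker were synced by the previous attempt.

```java
public class CommitRecoverySketch {
    /** What a new AM attempt should do on startup (hypothetical names). */
    public enum RecoveryAction { RUN_JOB, FAIL_JOB, UNREGISTER_ONLY }

    /**
     * Decide how to recover, given the markers found in the previous
     * attempt's job history (assumed to be readable by the re-attempt).
     */
    public static RecoveryAction decide(boolean commitStarted, boolean commitFinished) {
        if (commitFinished) {
            // Output is already committed: never commit twice, just unregister.
            return RecoveryAction.UNREGISTER_ONLY;
        }
        if (commitStarted) {
            // Crashed mid-commit: treat it like a committer failure.
            return RecoveryAction.FAIL_JOB;
        }
        // Commit never began, so re-running cannot disturb committed output.
        return RecoveryAction.RUN_JOB;
    }

    public static void main(String[] args) {
        System.out.println(decide(false, false));
        System.out.println(decide(true, false));
        System.out.println(decide(true, true));
    }
}
```

The middle case is the false negative discussed above: the output may in fact have been committed, but the job is reported as failed rather than risking a second commit.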
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505940#comment-13505940 ] Robert Joseph Evans commented on MAPREDUCE-4827: That is very interesting. I can see it in java.util.HashMap but it looks like java.util.Hashtable does not. Assuming that Jenkins comes back with a +1 I am OK with putting this in. I would like to have some numbers, because this is a performance improvement, but the citation of the code in HashMap.java, which is almost identical to this patch, is good enough for me. +1 Increase hash quality of HashPartitioner Key: MAPREDUCE-4827 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Radim Kolar Attachments: betterhash1.txt hash partitioner is using object.hashCode() for splitting keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated from by commons lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
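For readers without the JDK source at hand, the HashMap code referred to above can be sketched as follows. The `spread` function is the supplemental hash from java.util.HashMap in JDK 6; whether betterhash1.txt applies exactly this function is an assumption, and the class and method names here are hypothetical. `plainPartition` mirrors the existing HashPartitioner formula.

```java
public class SpreadHashPartitionSketch {
    /** Supplemental hash as found in java.util.HashMap (JDK 6). */
    public static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    /** Existing behaviour: clear the sign bit, then take the modulus. */
    public static int plainPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Same formula, but with the hash code spread first (assumed variant). */
    public static int spreadPartition(Object key, int numPartitions) {
        return (spread(key.hashCode()) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int partitions = 16;
        // Integer keys that are multiples of 16 all collide without spreading.
        for (int k = 0; k <= 64; k += 16) {
            Integer key = Integer.valueOf(k);
            System.out.println(k + ": plain=" + plainPartition(key, partitions)
                    + " spread=" + spreadPartition(key, partitions));
        }
    }
}
```

With 16 partitions, the keys 0, 16, 32, 48 and 64 all land in partition 0 under the plain formula, while the spread variant sends them to partitions 0 through 4. The same example shows that the two formulas can assign a given key to different partitions.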
[jira] [Commented] (MAPREDUCE-2374) Text File Busy errors launching MR tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505955#comment-13505955 ] Andy Isaacson commented on MAPREDUCE-2374: -- The fix has been merged to branch-1, but unfortunately not to branch-1.1, so it's not included in the 1.1.1 release which is currently being voted on. Text File Busy errors launching MR tasks -- Key: MAPREDUCE-2374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Andy Isaacson Fix For: 0.23.3, 2.0.2-alpha Attachments: failed_taskjvmsh.strace, mapreduce-2374-2.txt, mapreduce-2374-branch-1.patch, mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace Some very small percentage of tasks fail with a Text file busy error. The following was the original diagnosis: {quote} Our use of PrintWriter in TaskController.writeCommand is unsafe, since that class swallows all IO exceptions. We're not currently checking for errors, which I'm seeing result in occasional task failures with the message Text file busy - assumedly because the close() call is failing silently for some reason. {quote} .. but turned out to be another issue as well (see below) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
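The PrintWriter behaviour described in the original diagnosis can be reproduced in isolation. The sketch below is not the TaskController code; `FailingStream` and `writeAndCheck` are hypothetical names used only to show that PrintWriter records an I/O failure instead of throwing, and that the caller has to ask for it with checkError().

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

public class PrintWriterErrorSketch {
    /** Stream that fails on every write, standing in for a failing file. */
    static class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    /** Writes a line and reports whether PrintWriter silently recorded an error. */
    public static boolean writeAndCheck(String line) {
        PrintWriter out = new PrintWriter(new FailingStream());
        out.println(line);                 // no exception reaches the caller
        out.flush();                       // the underlying IOException is swallowed here
        boolean failed = out.checkError(); // the only way to observe the failure
        out.close();                       // close() does not throw either
        return failed;
    }

    public static void main(String[] args) {
        System.out.println("error swallowed: " + writeAndCheck("echo hello"));
    }
}
```

Code that must not miss a failed write or close can either call checkError() after each step or use a Writer that propagates IOException directly.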
[jira] [Commented] (MAPREDUCE-2374) Text File Busy errors launching MR tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505957#comment-13505957 ] Marc Reichman commented on MAPREDUCE-2374: -- Andy, Thank you for your comment. I apologize for my lack of understanding of the hadoop release process, but does this mean the fix will be included in a future 1.0.5 release? Text File Busy errors launching MR tasks -- Key: MAPREDUCE-2374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Andy Isaacson Fix For: 0.23.3, 2.0.2-alpha Attachments: failed_taskjvmsh.strace, mapreduce-2374-2.txt, mapreduce-2374-branch-1.patch, mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace Some very small percentage of tasks fail with a Text file busy error. The following was the original diagnosis: {quote} Our use of PrintWriter in TaskController.writeCommand is unsafe, since that class swallows all IO exceptions. We're not currently checking for errors, which I'm seeing result in occasional task failures with the message Text file busy - assumedly because the close() call is failing silently for some reason. {quote} .. but turned out to be another issue as well (see below) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: COMBO-mapreduce-4809-4807.patch mapreduce-4807.patch Hi Arun, I addressed the following issues: *Copied fields from {{Context}} to local copies to reduce the size of the patch. *Opted to change the method name to {{createSortingCollector().}} I cannot use this to create {{DirectMapOutputCollector()}} (based on whether it is a map-only job) since the call to this method from {{NewOutputCollector}} always expects a sorting collector. *Prefixed *get* in the method signatures of {{Context}} class. Please review the uploaded patch. Thanks. -- Asokan Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505979#comment-13505979 ] Doug Cutting commented on MAPREDUCE-4827: - This is an incompatible change; it will change the output of jobs. In most cases this shouldn't matter, but there might be applications which expect, e.g., the key '1' to go to the output file numbered '1'. This could be avoided by, instead of modifying HashPartitioner, adding a new partitioner. Increase hash quality of HashPartitioner Key: MAPREDUCE-4827 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Radim Kolar Attachments: betterhash1.txt hash partitioner is using object.hashCode() for splitting keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated from by commons lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4809) Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505980#comment-13505980 ] Mariappan Asokan commented on MAPREDUCE-4809: - Hi Arun and Alejandro, Thanks for all your help in making this happen. -- Asokan Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate) Key: MAPREDUCE-4809 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4809 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: MR-2454 Attachments: MAPREDUCE-4809-1.patch, mapreduce-4809.patch, mapreduce-4809.patch, mapreduce-4809.patch Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Status: Patch Available (was: Open) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505991#comment-13505991 ] Radim Kolar commented on MAPREDUCE-4827: If an application requires stable partitioning, then it needs to provide its own partitioner, because hashCode() for Object is not the same across JVMs. No need to push backward compatibility that hard. I have never seen such an app, and we have about 2 million lines of mapred stuff. Increase hash quality of HashPartitioner Key: MAPREDUCE-4827 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Radim Kolar Attachments: betterhash1.txt The hash partitioner uses object.hashCode() for splitting keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated by commons lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4828) Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x
Amir Sanjar created MAPREDUCE-4828: -- Summary: Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x Key: MAPREDUCE-4828 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4828 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.0 Environment: Fedora 17 RHEL 6.3, x86_64, IBM JAVA 7 Reporter: Amir Sanjar Priority: Critical Fix For: 1.1.0 Problem is caused by JUnit3 based testcases ran in Junit4 environment configured by ant 1.8.4.. in this case @Ignore tag is not getting ignored. This testcase has been removed from trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505996#comment-13505996 ] Mariappan Asokan commented on MAPREDUCE-4807: - Sorry for the botched formatting:( Here we go. Hi Arun, I addressed the following issues: * Copied fields from {{Context}} to local copies to reduce the size of the patch. * Opted to change the method name to {{createSortingCollector().}} I cannot use this to create {{DirectMapOutputCollector()}} (based on whether it is a map-only job) since the call to this method from {{NewOutputCollector}} always expects a sorting collector. * Prefixed *get* in the method signatures of {{Context}} class. Please review the uploaded patch. Thanks. – Asokan Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (MAPREDUCE-4829) Unit Test: TestMiniMRMapRedDebugScript fails when ran with ant-1.8.4 and not 1.7.x
Amir Sanjar created MAPREDUCE-4829: -- Summary: Unit Test: TestMiniMRMapRedDebugScript fails when ran with ant-1.8.4 and not 1.7.x Key: MAPREDUCE-4829 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4829 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.0 Environment: Fedora 17 RHEL 6.3, x86_64, IBM JAVA 7 Reporter: Amir Sanjar Priority: Critical Fix For: 1.1.0 Problem is caused by JUnit3 based testcases ran in Junit4 environment configured by ant 1.8.4.. in this case @Ignore tag is not getting ignored. This testcase has been removed from trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2058) FairScheduler:NullPointerException in web interface when JobTracker not initialized
[ https://issues.apache.org/jira/browse/MAPREDUCE-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-2058: - Affects Version/s: 1.0.4 FairScheduler:NullPointerException in web interface when JobTracker not initialized --- Key: MAPREDUCE-2058 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2058 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/fair-share Affects Versions: 0.22.0, 1.0.4 Reporter: Dan Adkins Attachments: MAPREDUCE-2058.patch When I contact the jobtracker web interface prior to the job tracker being fully initialized (say, if hdfs is still in safe mode), I get the following error: 10/09/09 18:06:02 ERROR mortbay.log: /jobtracker.jsp java.lang.NullPointerException at org.apache.hadoop.mapred.FairScheduler.getJobs(FairScheduler.java:909) at org.apache.hadoop.mapred.JobTracker.getJobsFromQueue(JobTracker.java:4357) at org.apache.hadoop.mapred.JobTracker.getQueueInfoArray(JobTracker.java:4334) at org.apache.hadoop.mapred.JobTracker.getRootQueues(JobTracker.java:4295) at org.apache.hadoop.mapred.jobtracker_jsp.generateSummaryTable(jobtracker_jsp.java:44) at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:176) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124) at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:857) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:324) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4828) Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amir Sanjar updated MAPREDUCE-4828: --- Attachment: MAPREDUCE-4828-release-1.1.0.patch Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x -- Key: MAPREDUCE-4828 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4828 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.0 Environment: Fedora 17 RHEL 6.3, x86_64, IBM JAVA 7 Reporter: Amir Sanjar Priority: Critical Fix For: 1.1.0 Attachments: MAPREDUCE-4828-release-1.1.0.patch Problem is caused by JUnit3 based testcases ran in Junit4 environment configured by ant 1.8.4.. in this case @Ignore tag is not getting ignored. This testcase has been removed from trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4828) Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amir Sanjar updated MAPREDUCE-4828: --- Attachment: MAPREDUCE-4828-branch1.patch Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x -- Key: MAPREDUCE-4828 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4828 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.0 Environment: Fedora 17 RHEL 6.3, x86_64, IBM JAVA 7 Reporter: Amir Sanjar Priority: Critical Fix For: 1.1.0 Attachments: MAPREDUCE-4828-branch1.patch, MAPREDUCE-4828-release-1.1.0.patch Problem is caused by JUnit3 based testcases ran in Junit4 environment configured by ant 1.8.4.. in this case @Ignore tag is not getting ignored. This testcase has been removed from trunk -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4828) Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x
[ https://issues.apache.org/jira/browse/MAPREDUCE-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506012#comment-13506012 ] Amir Sanjar commented on MAPREDUCE-4828: This failure has been seen in multiple F17 and RHEL 6.3 Hadoop development environments. Unit Test: TestTaskTrackerLocalization fails when ran with ant-1.8.4 and not 1.7.x -- Key: MAPREDUCE-4828 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4828 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.0 Environment: Fedora 17 RHEL 6.3, x86_64, IBM JAVA 7 Reporter: Amir Sanjar Priority: Critical Fix For: 1.1.0 Attachments: MAPREDUCE-4828-branch1.patch, MAPREDUCE-4828-release-1.1.0.patch Problem is caused by JUnit3-based testcases run in a JUnit4 environment configured by ant 1.8.4. In this case the @Ignore tag is not getting ignored. This testcase has been removed from trunk. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-2058) FairScheduler:NullPointerException in web interface when JobTracker not initialized
[ https://issues.apache.org/jira/browse/MAPREDUCE-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated MAPREDUCE-2058: - Attachment: MAPREDUCE-2058-branch-1.patch web threads have to be synchronized with the initialization otherwise there is no proper happens-before. FairScheduler:NullPointerException in web interface when JobTracker not initialized --- Key: MAPREDUCE-2058 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2058 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/fair-share Affects Versions: 0.22.0, 1.0.4 Reporter: Dan Adkins Attachments: MAPREDUCE-2058-branch-1.patch, MAPREDUCE-2058.patch When I contact the jobtracker web interface prior to the job tracker being fully initialized (say, if hdfs is still in safe mode), I get the following error: 10/09/09 18:06:02 ERROR mortbay.log: /jobtracker.jsp java.lang.NullPointerException at org.apache.hadoop.mapred.FairScheduler.getJobs(FairScheduler.java:909) at org.apache.hadoop.mapred.JobTracker.getJobsFromQueue(JobTracker.java:4357) at org.apache.hadoop.mapred.JobTracker.getQueueInfoArray(JobTracker.java:4334) at org.apache.hadoop.mapred.JobTracker.getRootQueues(JobTracker.java:4295) at org.apache.hadoop.mapred.jobtracker_jsp.generateSummaryTable(jobtracker_jsp.java:44) at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:176) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124) at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:857) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361) at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:324) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
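The happens-before point in the comment above can be illustrated with a minimal sketch. It is not the attached branch-1 patch; the class name, fields, and the choice of returning an empty list before initialization are all assumptions made for illustration. Both the initializing thread and the web threads take the same lock, so a reader either sees fully initialized state or sees that initialization has not finished.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class GuardedSchedulerSketch {
    private final Object lock = new Object();
    private List<String> jobs;      // stays null until start() completes
    private boolean initialized;

    /** Called once by the initializing thread (hypothetical stand-in for scheduler start-up). */
    public void start() {
        synchronized (lock) {
            jobs = new ArrayList<String>();
            jobs.add("job_1");
            initialized = true;
        }
    }

    /** Called by web threads; never dereferences uninitialized state. */
    public List<String> getJobs() {
        synchronized (lock) {
            if (!initialized) {
                return Collections.emptyList();
            }
            return new ArrayList<String>(jobs);
        }
    }

    public static void main(String[] args) {
        GuardedSchedulerSketch scheduler = new GuardedSchedulerSketch();
        System.out.println("before start: " + scheduler.getJobs());
        scheduler.start();
        System.out.println("after start: " + scheduler.getJobs());
    }
}
```

Releasing the lock in start() happens-before any later acquisition in getJobs(), which is what makes the writes to both fields visible to the web threads.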
[jira] [Commented] (MAPREDUCE-2374) Text File Busy errors launching MR tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506022#comment-13506022 ] Matt Foley commented on MAPREDUCE-2374: --- Marc, being in branch-1, it will be in 1.2.0 when we make that release in December. Andy, please go ahead and commit it to branch-1.1 also, so it will be in 1.1.2 when that patch release is made. Marc, you can request it be committed to branch-1.0 also, but at this time there are no plans to produce a 1.0.5 release. Are you able to move to 1.1.1 instead? 1.1.1 passed vote yesterday, and I will have it published and announced in the next day or two. Text File Busy errors launching MR tasks -- Key: MAPREDUCE-2374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Andy Isaacson Fix For: 0.23.3, 2.0.2-alpha Attachments: failed_taskjvmsh.strace, mapreduce-2374-2.txt, mapreduce-2374-branch-1.patch, mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace Some very small percentage of tasks fail with a Text file busy error. The following was the original diagnosis: {quote} Our use of PrintWriter in TaskController.writeCommand is unsafe, since that class swallows all IO exceptions. We're not currently checking for errors, which I'm seeing result in occasional task failures with the message Text file busy - assumedly because the close() call is failing silently for some reason. {quote} .. but turned out to be another issue as well (see below) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2374) Text File Busy errors launching MR tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506026#comment-13506026 ] Marc Reichman commented on MAPREDUCE-2374: -- Matt, Thank you for your response. I will be able to move to 1.1.x. I was hoping to not have to move to 2.x soon. Does 1.1 move to stable when 1.2 gets released (beta?) in December? I apologize for the improper forum for these questions. Thanks, Marc Text File Busy errors launching MR tasks -- Key: MAPREDUCE-2374 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2374 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Andy Isaacson Fix For: 0.23.3, 2.0.2-alpha Attachments: failed_taskjvmsh.strace, mapreduce-2374-2.txt, mapreduce-2374-branch-1.patch, mapreduce-2374-on-20sec.txt, mapreduce-2374.txt, mapreduce-2374.txt, mapreduce-2374.txt, successfull_taskjvmsh.strace Some very small percentage of tasks fail with a Text file busy error. The following was the original diagnosis: {quote} Our use of PrintWriter in TaskController.writeCommand is unsafe, since that class swallows all IO exceptions. We're not currently checking for errors, which I'm seeing result in occasional task failures with the message Text file busy - assumedly because the close() call is failing silently for some reason. {quote} .. but turned out to be another issue as well (see below) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-2058) FairScheduler:NullPointerException in web interface when JobTracker not initialized
[ https://issues.apache.org/jira/browse/MAPREDUCE-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506028#comment-13506028 ] Hadoop QA commented on MAPREDUCE-2058: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12555264/MAPREDUCE-2058-branch-1.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3076//console This message is automatically generated. FairScheduler:NullPointerException in web interface when JobTracker not initialized --- Key: MAPREDUCE-2058 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2058 Project: Hadoop Map/Reduce Issue Type: Bug Components: contrib/fair-share Affects Versions: 0.22.0, 1.0.4 Reporter: Dan Adkins Attachments: MAPREDUCE-2058-branch-1.patch, MAPREDUCE-2058.patch When I contact the jobtracker web interface prior to the job tracker being fully initialized (say, if hdfs is still in safe mode), I get the following error: 10/09/09 18:06:02 ERROR mortbay.log: /jobtracker.jsp java.lang.NullPointerException at org.apache.hadoop.mapred.FairScheduler.getJobs(FairScheduler.java:909) at org.apache.hadoop.mapred.JobTracker.getJobsFromQueue(JobTracker.java:4357) at org.apache.hadoop.mapred.JobTracker.getQueueInfoArray(JobTracker.java:4334) at org.apache.hadoop.mapred.JobTracker.getRootQueues(JobTracker.java:4295) at org.apache.hadoop.mapred.jobtracker_jsp.generateSummaryTable(jobtracker_jsp.java:44) at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:176) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1124) at 
org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:857) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:324) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403) at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506032#comment-13506032 ] Doug Cutting commented on MAPREDUCE-4827: - Integer#hashCode() is documented to be the integer value. http://docs.oracle.com/javase/7/docs/api/java/lang/Integer.html#hashCode() Similarly, the hashCode() implementations for String, Double, Float, Long, etc. are specified and do not change from one JVM to another. Also, I didn't veto this change. I just observed that it was not back-compatible. That should be taken into account if/when it is committed. Increase hash quality of HashPartitioner Key: MAPREDUCE-4827 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Radim Kolar Attachments: betterhash1.txt The hash partitioner uses Object.hashCode() to split keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated by commons lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution.
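The back-compatibility concern in this comment can be made concrete with a small, self-contained sketch. It assumes the partitioner computes `(hashCode & Integer.MAX_VALUE) % numPartitions`, which is how Hadoop's HashPartitioner is commonly described. The `mix` step is a hypothetical stand-in for "some transformation on top of hashCode()" (here, the MurmurHash3 32-bit finalizer); it is not the content of betterhash1.txt, and the class and method names are illustrative only.

```java
public class PartitionSketch {
    // Assumed HashPartitioner-style formula: clear the sign bit, then take the remainder.
    public static int partition(int hash, int numPartitions) {
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }

    // Hypothetical mixing step (MurmurHash3 fmix32 finalizer); an assumption, not the attached patch.
    public static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        int reducers = 10;
        for (int key : new int[] {0, 10, 20, 30, 42}) {
            // Integer.hashCode() is documented to equal the int value itself.
            int h = Integer.valueOf(key).hashCode();
            System.out.println(key + " -> plain partition " + partition(h, reducers)
                    + ", mixed partition " + partition(mix(h), reducers));
        }
    }
}
```

With the plain formula, keys that are multiples of the reducer count all land in partition 0, which is the kind of skew the issue describes. Any mixing step changes which partition a given key maps to, which is why the change is not back-compatible for jobs that rely on the existing assignment.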
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506050#comment-13506050 ] Hadoop QA commented on MAPREDUCE-4807: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12555252/COMBO-mapreduce-4809-4807.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3075//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3075//console This message is automatically generated. 
Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: (was: mapreduce-4807.patch) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: mapreduce-4807.patch Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506052#comment-13506052 ] Hadoop QA commented on MAPREDUCE-4807: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12555268/mapreduce-4807.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3077//console This message is automatically generated. Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506054#comment-13506054 ] Radim Kolar commented on MAPREDUCE-4827: This one is platform dependent and more or less random. Most writables do not implement hashCode(). http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode%28%29 Increase hash quality of HashPartitioner Key: MAPREDUCE-4827 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Radim Kolar Attachments: betterhash1.txt The hash partitioner uses Object.hashCode() to split keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated by commons lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution.
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506059#comment-13506059 ] Arun C Murthy commented on MAPREDUCE-4049: -- Alejandro - there seems to be some lingering history between the protagonists here and in MAPREDUCE-2454. There is no point trying to force each upon the other. Since it's different people working on it (who don't the same horizon) let's de-link them and take off ferrets, ok? plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Sub-task Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. 
Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506061#comment-13506061 ] Arun C Murthy commented on MAPREDUCE-4049: -- Avner, I can't seem to make you a 'contributor' and assign this jira to you. Some weird issue with JIRA, fyi. plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. 
Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated MAPREDUCE-4049: - Issue Type: Improvement (was: Sub-task) Parent: (was: MAPREDUCE-2454) plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. 
Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4827) Increase hash quality of HashPartitioner
[ https://issues.apache.org/jira/browse/MAPREDUCE-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506064#comment-13506064 ] Doug Cutting commented on MAPREDUCE-4827: - "Most writables do not implement hashCode()" All WritableComparable (i.e., key) implementations included with Hadoop implement hashCode(). Moreover, a WritableComparable would be a poor key implementation if it did not implement hashCode() and was used with HashPartitioner, since it wouldn't send equivalent values to the same reducer. The WritableComparable documentation specifically advises implementing hashCode(). http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/WritableComparable.html Increase hash quality of HashPartitioner Key: MAPREDUCE-4827 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4827 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Radim Kolar Attachments: betterhash1.txt The hash partitioner uses Object.hashCode() to split keys into partitions. This results in bad distributions because hashCode() quality is poor. These hashCode() functions are sometimes written by hand (very poor quality) and sometimes generated by commons lang code (poor quality). Applying some transformation on top of hashCode() provides better distribution.
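The point about keys that do not implement hashCode() can be sketched without any Hadoop dependency. The two key classes below are hypothetical stand-ins for a WritableComparable key, and `partition` again assumes the `(hashCode & Integer.MAX_VALUE) % numPartitions` formula. A key that inherits Object.hashCode() gets an identity-based hash, so two equal keys are not guaranteed to reach the same partition; a key that derives its hash from its contents always does.

```java
public class KeyHashSketch {
    /** Hypothetical key that defines equals() but inherits Object.hashCode(). */
    public static final class NoHashKey {
        final String id;
        public NoHashKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof NoHashKey && ((NoHashKey) o).id.equals(id);
        }
        // hashCode() deliberately not overridden: falls back to the identity-based default.
    }

    /** Hypothetical key whose hashCode() is derived from its contents. */
    public static final class HashedKey {
        final String id;
        public HashedKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof HashedKey && ((HashedKey) o).id.equals(id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Assumed HashPartitioner-style formula.
    public static int partition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        int reducers = 7;
        // Equal keys with a content-based hash always agree on the partition.
        System.out.println("HashedKey: " + partition(new HashedKey("a"), reducers)
                + " and " + partition(new HashedKey("a"), reducers));
        // Equal keys with the default hash may or may not agree; the result can vary per run.
        System.out.println("NoHashKey: " + partition(new NoHashKey("a"), reducers)
                + " and " + partition(new NoHashKey("a"), reducers));
    }
}
```

The second pair of numbers can differ from run to run and from JVM to JVM, which is the behaviour that makes such a class a poor key for a hash-based partitioner.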
[jira] [Assigned] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur reassigned MAPREDUCE-4049: - Assignee: Avner BenHanoch plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Assignee: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. 
(Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13506069#comment-13506069 ] Alejandro Abdelnur commented on MAPREDUCE-4049: --- Arun, as I said before, the works is related thus it should be done together. If there was some lingering history this seems to be in past because now there seems to be a full synergy between the work done in the different JIRAs. We are community, we have disagreements and we address them, this is how we suppose to work. Avner, just sorted out the JIRA glitch, and assigned the JIRA to you. plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Assignee: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support generic shuffle service as set of two plugins: ShuffleProvider ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges. Hence, getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. 
Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with suggested Top Level Design for both plugins (currently, based on 1.0 branch) # I am providing link for downloading UDA - Mellanox's open source plugin that implements generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone that wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suit in very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-3772) MultipleOutputs output lost if baseOutputPath starts with ../
[ https://issues.apache.org/jira/browse/MAPREDUCE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506107#comment-13506107 ] Priyo Mustafi commented on MAPREDUCE-3772: -- MultipleOutputs exposes two methods. 1) public <K,V> void write(String namedOutput, K key, V value) 2) public <K,V> void write(String namedOutput, K key, V value, String baseOutputPath) where namedOutput - the named output name; baseOutputPath - base-output path to write the record to. Note: Framework will generate unique filename for the baseOutputPath. We use the second one, which allows you to provide a baseOutputPath where the data needs to be written. I don't see anywhere in the javadoc that baseOutputPath shouldn't be a fully qualified path. So the Jira is definitely valid. Either the Javadoc needs to be fixed or the code needs to be fixed, and I would prefer the latter as we have developed extensive data pipelines based on this. If it is not fixed, we have to change the absolute paths to sub-directory paths and then, once the job is done, move all those directories out to the expected locations. Aside from that, if we provide baseOutputPath as abc/def/xyz then it puts the directory under the main output directory, i.e. you get files like main-output-dir/abc/def/xyz-r-0. If instead you use baseOutputPath as /abc/def/xyz, where the path isn't a subdirectory of the main output directory, then the problem is seen.
MultipleOutputs output lost if baseOutputPath starts with ../ - Key: MAPREDUCE-3772 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3772 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 0.20.203.0, 0.22.0 Environment: FreeBSD Reporter: Radim Kolar Let's say you have the output directory set: FileOutputFormat.setOutputPath(job, "/tmp/multi1/out"); and want to place output from MultipleOutputs into /tmp/multi1/extra. I expect the following code to work: mos = new MultipleOutputs<Text, IntWritable>(context); mos.write(new Text("zrr"), value, "../extra/"); but no Exception is thrown and the expected output directory /tmp/multi1/extra does not even exist. All data written to this output vanishes without a trace. To make it work, the full path must be used: mos.write(new Text("zrr"), value, "/tmp/multi1/extra/"); Output is listed in statistics from MultipleOutputs correctly: org.apache.hadoop.mapreduce.lib.output.MultipleOutputs ../gaja1/=1 (* everything is lost *) /tmp/multi1/out/../ksd34/=1 (* this using full path works *) list1=6667
[jira] [Commented] (MAPREDUCE-4809) Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate)
[ https://issues.apache.org/jira/browse/MAPREDUCE-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506115#comment-13506115 ] Alejandro Abdelnur commented on MAPREDUCE-4809: --- BTW, I've had to revert and recommit the patch as it was incorrect. I had to do this twice as the first time I had some stuff uncommitted. Now it should be OK. Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate) Key: MAPREDUCE-4809 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4809 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: MR-2454 Attachments: MAPREDUCE-4809-1.patch, mapreduce-4809.patch, mapreduce-4809.patch, mapreduce-4809.patch Make classes required for MAPREDUCE-2454 to be java public (with LimitedPrivate)
[jira] [Comment Edited] (MAPREDUCE-4049) plugin for generic shuffle service
[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506069#comment-13506069 ] Alejandro Abdelnur edited comment on MAPREDUCE-4049 at 11/29/12 1:17 AM: - Arun, as I said before, the work is related, thus it should be done together. If there was some lingering history, this seems to be in the past because now there is full synergy between the work done in the different JIRAs. We are a community, we have disagreements and we address them; this is how we are supposed to work. Avner, just sorted out the JIRA glitch, and assigned the JIRA to you. was (Author: tucu00): Arun, as I said before, the works is related thus it should be done together. If there was some lingering history this seems to be in past because now there seems to be a full synergy between the work done in the different JIRAs. We are community, we have disagreements and we address them, this is how we suppose to work. Avner, just sorted out the JIRA glitch, and assigned the JIRA to you. plugin for generic shuffle service -- Key: MAPREDUCE-4049 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049 Project: Hadoop Map/Reduce Issue Type: Improvement Components: performance, task, tasktracker Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0 Reporter: Avner BenHanoch Assignee: Avner BenHanoch Labels: merge, plugin, rdma, shuffle Fix For: trunk Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch Support a generic shuffle service as a set of two plugins: ShuffleProvider and ShuffleConsumer. This will satisfy the following needs: # Better shuffle and merge performance. For example: we are working on a shuffle plugin that performs shuffle over RDMA in fast networks (10gE, 40gE, or Infiniband) instead of using the current HTTP shuffle.
Based on the fast RDMA shuffle, the plugin can also utilize a suitable merge approach during the intermediate merges, hence getting much better performance. # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden dependency of NodeManager with a specific version of mapreduce shuffle (currently targeted to 0.24.0). References: # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu from Auburn University with others, [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf] # I am attaching 2 documents with a suggested Top Level Design for both plugins (currently based on the 1.0 branch) # I am providing a link for downloading UDA - Mellanox's open source plugin that implements a generic shuffle service using RDMA and levitated merge. Note: At this phase, the code is in C++ through JNI and you should consider it as beta only. Still, it can serve anyone who wants to implement or contribute to levitated merge. (Please be advised that levitated merge is mostly suited to very fast networks) - [http://www.mellanox.com/content/pages.php?pg=products_dynproduct_family=144menu_section=69]
[jira] [Commented] (MAPREDUCE-3772) MultipleOutputs output lost if baseOutputPath starts with ../
[ https://issues.apache.org/jira/browse/MAPREDUCE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506121#comment-13506121 ] Radim Kolar commented on MAPREDUCE-3772: If you have the budget, I can fix it for you. MultipleOutputs output lost if baseOutputPath starts with ../ - Key: MAPREDUCE-3772 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3772 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 0.20.203.0, 0.22.0 Environment: FreeBSD Reporter: Radim Kolar Let's say you have the output directory set with FileOutputFormat.setOutputPath(job, "/tmp/multi1/out"); and want to place output from MultipleOutputs into /tmp/multi1/extra. I expect the following code to work: mos = new MultipleOutputs<Text, IntWritable>(context); mos.write(new Text(zrr), value, "../extra/"); but no exception is thrown and the expected output directory /tmp/multi1/extra does not even exist. All data written to this output vanishes without a trace. To make it work, the full path must be used: mos.write(new Text(zrr), value, "/tmp/multi1/extra/"); The output is listed correctly in the statistics from MultipleOutputs: org.apache.hadoop.mapreduce.lib.output.MultipleOutputs ../gaja1/=1 (* everything is lost *) /tmp/multi1/out/../ksd34/=1 (* this using full path works *) list1=6667
[jira] [Commented] (MAPREDUCE-3772) MultipleOutputs output lost if baseOutputPath starts with ../
[ https://issues.apache.org/jira/browse/MAPREDUCE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506123#comment-13506123 ] Alejandro Abdelnur commented on MAPREDUCE-3772: --- I think that, at least, the javadocs should be updated to reflect that if you want to use speculative execution, the baseOutputPath must not be a path but a name. I would prefer to enforce it, as IMO it is a bug that it is not enforced. MultipleOutputs output lost if baseOutputPath starts with ../ - Key: MAPREDUCE-3772 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3772 Project: Hadoop Map/Reduce Issue Type: Bug Components: mrv1 Affects Versions: 0.20.203.0, 0.22.0 Environment: FreeBSD Reporter: Radim Kolar Let's say you have the output directory set with FileOutputFormat.setOutputPath(job, "/tmp/multi1/out"); and want to place output from MultipleOutputs into /tmp/multi1/extra. I expect the following code to work: mos = new MultipleOutputs<Text, IntWritable>(context); mos.write(new Text(zrr), value, "../extra/"); but no exception is thrown and the expected output directory /tmp/multi1/extra does not even exist. All data written to this output vanishes without a trace. To make it work, the full path must be used: mos.write(new Text(zrr), value, "/tmp/multi1/extra/"); The output is listed correctly in the statistics from MultipleOutputs: org.apache.hadoop.mapreduce.lib.output.MultipleOutputs ../gaja1/=1 (* everything is lost *) /tmp/multi1/out/../ksd34/=1 (* this using full path works *) list1=6667
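The enforcement suggested in this comment could look roughly like the sketch below. It is only an illustration: BaseOutputNameCheck, isPlainName and requirePlainName are hypothetical names, not part of Hadoop, and the rule "no '/' and not '.' or '..'" is one assumed reading of "a name, not a path".

```java
class BaseOutputNameCheck {
    /** True only for a plain name: non-empty, no '/' separators, and not "." or "..". */
    static boolean isPlainName(String baseOutputPath) {
        return baseOutputPath != null
            && !baseOutputPath.isEmpty()
            && baseOutputPath.indexOf('/') < 0
            && !baseOutputPath.equals(".")
            && !baseOutputPath.equals("..");
    }

    /** Hypothetical guard: fail fast instead of silently losing output. */
    static void requirePlainName(String baseOutputPath) {
        if (!isPlainName(baseOutputPath)) {
            throw new IllegalArgumentException(
                "baseOutputPath must be a name, not a path: " + baseOutputPath);
        }
    }

    public static void main(String[] args) {
        requirePlainName("list1"); // a plain name passes the check
        try {
            requirePlainName("../extra/"); // the value from the report is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing at write time would turn the silent data loss described in the issue into an immediate, visible error, at the cost of rejecting absolute paths that currently work.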
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Status: Open (was: Patch Available) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: (was: COMBO-mapreduce-4809-4807.patch) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Status: Patch Available (was: Open) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: (was: mapreduce-4807.patch) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: mapreduce-4807.patch COMBO-mapreduce-4809-4807.patch Sorry about the confusion. QA picked up the incremental patch as well. Resubmitting the patch files together. -- Asokan Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Commented] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506164#comment-13506164 ] Hadoop QA commented on MAPREDUCE-4807: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12555297/mapreduce-4807.patch against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3078//console This message is automatically generated. Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: COMBO-mapreduce-4809-4807.patch Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Attachment: (was: COMBO-mapreduce-4809-4807.patch) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Status: Open (was: Patch Available) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable
[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable
[ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mariappan Asokan updated MAPREDUCE-4807: Status: Patch Available (was: Open) Allow MapOutputBuffer to be pluggable - Key: MAPREDUCE-4807 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807 Project: Hadoop Map/Reduce Issue Type: Sub-task Affects Versions: 2.0.2-alpha Reporter: Arun C Murthy Assignee: Mariappan Asokan Fix For: 2.0.3-alpha Attachments: COMBO-mapreduce-4809-4807.patch, COMBO-mapreduce-4809-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch, mapreduce-4807.patch Allow MapOutputBuffer to be pluggable