[jira] [Updated] (MAPREDUCE-5073) TestJobStatusPersistency.testPersistency fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5073:
Status: Patch Available (was: Open)

TestJobStatusPersistency.testPersistency fails on JDK7

Key: MAPREDUCE-5073
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5073
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: test
Affects Versions: 1.1.2
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Fix For: 1.2.0
Attachments: MAPREDUCE-5073.patch

TestJobStatusPersistency is sensitive to the order in which its tests are run. If testLocalPersistency runs before testPersistency, testPersistency will fail.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
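The failure mode described above is a common pattern: JDK7 stopped returning reflected methods in declaration order, so JUnit tests that leak state into a shared resource can start failing when the run order changes. The sketch below illustrates the pattern and its usual fix (resetting shared state before each test); the class, field, and method bodies are illustrative stand-ins, not taken from TestJobStatusPersistency.

```java
// Illustrative sketch (not the real TestJobStatusPersistency): one test leaves
// state behind in a shared resource; a later test fails unless that state is
// reset before each test. Under JDK7, reflection no longer returns test
// methods in declaration order, so the failing order can occur at any time.
public class OrderIndependentTests {
    // Stand-in for state shared across tests (e.g. persisted job status on disk).
    static String sharedState = null;

    // The usual fix: reset shared state in setUp() so test order cannot matter.
    public static void setUp() {
        sharedState = null;
    }

    public static void testLocalPersistency() {
        sharedState = "local";   // leaks state if never cleaned up
    }

    public static boolean testPersistency() {
        // Passes only when no leftover state is visible.
        return sharedState == null;
    }

    public static void main(String[] args) {
        testLocalPersistency();  // JDK7 may choose to run this one first
        setUp();                 // with the reset, the later test still passes
        System.out.println(testPersistency());
    }
}
```

Without the reset in setUp(), testPersistency() would return false whenever testLocalPersistency() happened to run first, which matches the order sensitivity reported here.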
[jira] [Updated] (MAPREDUCE-5073) TestJobStatusPersistency.testPersistency fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated MAPREDUCE-5073:
Attachment: MAPREDUCE-5073.patch
[jira] [Commented] (MAPREDUCE-5073) TestJobStatusPersistency.testPersistency fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603188#comment-13603188 ] Hadoop QA commented on MAPREDUCE-5073:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12573837/MAPREDUCE-5073.patch against trunk revision .
{color:red}-1 patch{color}. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3417//console
This message is automatically generated.
[jira] [Updated] (MAPREDUCE-5062) MR AM should read max-retries information from the RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated MAPREDUCE-5062:
Attachment: MAPREDUCE-5062.1.patch

This patch is split from the one in YARN-378 and is restricted to the mapreduce scope. It includes the following features:
1. Defining the max-attempts configuration in MRJobConfig and mapred-default.xml.
2. Sending the number through ApplicationSubmissionContext.
3. The MR AM reading the number from the environment.

MR AM should read max-retries information from the RM

Key: MAPREDUCE-5062
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5062
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Zhijie Shen
Attachments: MAPREDUCE-5062.1.patch

Change the MR AM to use the app-retry maximum limit that the RM makes available after YARN-378.
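The first feature in the list above boils down to resolving a max-attempts value from job configuration with a fallback default. A minimal sketch of that resolution pattern, using java.util.Properties as a stand-in for Hadoop's Configuration class; the key name mirrors mapreduce.am.max-attempts, but the default value and all plumbing here are assumptions for illustration only.

```java
import java.util.Properties;

// Hedged sketch: resolve an AM max-attempts setting from job configuration,
// falling back to a site default when the job does not set it. Properties is
// a stand-in for Hadoop's Configuration; the default of 2 is assumed here.
public class MaxAttemptsConfig {
    static final String MAX_ATTEMPTS_KEY = "mapreduce.am.max-attempts";
    static final int DEFAULT_MAX_ATTEMPTS = 2;  // illustrative default

    public static int resolveMaxAttempts(Properties jobConf) {
        String v = jobConf.getProperty(MAX_ATTEMPTS_KEY);
        return (v == null) ? DEFAULT_MAX_ATTEMPTS : Integer.parseInt(v.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(resolveMaxAttempts(conf));   // falls back to default
        conf.setProperty(MAX_ATTEMPTS_KEY, "4");
        System.out.println(resolveMaxAttempts(conf));   // explicit value wins
    }
}
```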
[jira] [Updated] (MAPREDUCE-5062) MR AM should read max-retries information from the RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated MAPREDUCE-5062:
Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-5062) MR AM should read max-retries information from the RM
[ https://issues.apache.org/jira/browse/MAPREDUCE-5062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603193#comment-13603193 ] Hadoop QA commented on MAPREDUCE-5062:
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12573838/MAPREDUCE-5062.1.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files.
{color:green}+1 tests included appear to have a timeout.{color}
{color:red}-1 javac{color}. The patch appears to cause the build to fail.
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3418//console
This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5068) Fair Scheduler preemption fails if the other queue has a mapreduce job with some tasks in excess of cluster capacity
[ https://issues.apache.org/jira/browse/MAPREDUCE-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603239#comment-13603239 ] Vitaly Kruglikov commented on MAPREDUCE-5068:
[~sandyr]: Confirmed -- the same issue is also present in the latest Hadoop CDH4.2.0 distribution.

Fair Scheduler preemption fails if the other queue has a mapreduce job with some tasks in excess of cluster capacity

Key: MAPREDUCE-5068
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5068
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2, scheduler
Environment: Mac OS X; CDH4.1.2
Reporter: Vitaly Kruglikov
Labels: hadoop

This is reliably reproduced while running CDH4.1.2 on a single Mac OS X machine.
# Two queues are configured: cjmQ and slotsQ. Both queues are configured with tiny minResources. The intention is for the task(s) of the job in cjmQ to be able to preempt tasks of the job in slotsQ.
# yarn.nodemanager.resource.memory-mb = 24576
# First, a long-running 6-map-task (0 reducers) mapreduce job is started in slotsQ with mapreduce.map.memory.mb=4096. Because MRAppMaster's container consumes some memory, only 5 of its 6 map tasks are able to start; the 6th is pending but will never run.
# Then, a short-running 1-map-task (0 reducers) mapreduce job is submitted via cjmQ with mapreduce.map.memory.mb=2048.

Expected behavior: At this point, because the minimum share of cjmQ has not been met, I expected Fair Scheduler to preempt one of the executing map tasks of the single slotsQ mapreduce job to make room for the single map task of the cjmQ mapreduce job. However, Fair Scheduler didn't preempt any of the running map tasks of the slotsQ job. Instead, the cjmQ job was starved perpetually. Since slotsQ had far more than its minimum share already allocated and running, while cjmQ was far below its minimum share (0, actually), Fair Scheduler should have started preempting, regardless of there being one task container from the slotsQ job (the 6th map container) that was not being allocated.

Additional useful info:
# If I submit a second 1-map-task mapreduce job via cjmQ, the first cjmQ mapreduce job in that queue gets scheduled and its state changes to RUNNING; once that first job completes, the second job submitted via cjmQ is starved until a third job is submitted into cjmQ, and so on. This happens regardless of the values of maxRunningApps in the queue configurations.
# If, instead of requesting 6 map tasks for the slotsQ job, I only request 5 so that everything fits into yarn.nodemanager.resource.memory-mb - without that 6th pending-but-not-running task - then preemption works as I would have expected. However, I cannot rely on this arrangement: in a production cluster running at full capacity, if a machine dies, the slotsQ mapreduce job will request new containers for the failed tasks, and because the cluster was already at capacity, those containers will end up pending and never run, recreating my original scenario of the starving cjmQ job.
# I initially wrote this up on https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!topic/cdh-user/0zv62pkN5lM, so it would be good to update that group with the resolution.

Configuration: In yarn-site.xml:
{code}
<property>
  <description>Scheduler plug-in class to use instead of the default scheduler.</description>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
{code}
fair-scheduler.xml:
{code}
<configuration>
  <!-- Site specific FairScheduler configuration properties -->
  <property>
    <description>Absolute path to allocation file. An allocation file is an XML manifest describing queues and their properties, in addition to certain policy defaults. This file must be in XML format as described in http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html.</description>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>[obfuscated]/current/conf/site/default/hadoop/fair-scheduler-allocations.xml</value>
  </property>
  <property>
    <description>Whether to use preemption. Note that preemption is experimental in the current version. Defaults to false.</description>
    <name>yarn.scheduler.fair.preemption</name>
    <value>true</value>
  </property>
  <property>
    <description>Whether to allow multiple container assignments in one heartbeat. Defaults to false.</description>
    <name>yarn.scheduler.fair.assignmultiple</name>
    <value>true</value>
  </property>
</configuration>
{code}
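The min-share preemption rule the report relies on can be stated as a small predicate: a queue that has been below its minResources for at least minSharePreemptionTimeout seconds should be allowed to preempt, independent of whether the over-share queue also has unallocated (pending) container requests. The sketch below is a simplified illustration of that rule, not the actual FairScheduler code; all names and the memory-only model are assumptions.

```java
// Simplified illustration (not the actual FairScheduler implementation) of
// min-share preemption: a queue below its min share for longer than the
// configured timeout should trigger preemption, regardless of pending
// requests elsewhere in the cluster.
public class MinSharePreemption {
    public static boolean shouldPreempt(long usedMb, long minShareMb,
                                        long secondsBelowMinShare,
                                        long minSharePreemptionTimeout) {
        return usedMb < minShareMb
                && secondsBelowMinShare >= minSharePreemptionTimeout;
    }

    public static void main(String[] args) {
        // cjmQ from the report: 0 MB allocated, minResources 2048 MB,
        // starved well past the 5-second timeout -> preemption expected.
        System.out.println(shouldPreempt(0, 2048, 60, 5));
        // slotsQ: far above its min share -> must not trigger preemption.
        System.out.println(shouldPreempt(20480, 2048, 0, 5));
    }
}
```

Under this reading, the 6th never-to-run slotsQ container is irrelevant to the decision, which is exactly the behavior the reporter expected and did not observe.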
[jira] [Updated] (MAPREDUCE-5068) Fair Scheduler preemption fails if the other queue has a mapreduce job with some tasks in excess of cluster capacity
[ https://issues.apache.org/jira/browse/MAPREDUCE-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitaly Kruglikov updated MAPREDUCE-5068:
Environment: Mac OS X; CDH4.1.2; CDH4.2.0 (was: Mac OS X; CDH4.1.2)
[jira] [Commented] (MAPREDUCE-5068) Fair Scheduler preemption fails if the other queue has a mapreduce job with some tasks in excess of cluster capacity
[ https://issues.apache.org/jira/browse/MAPREDUCE-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603245#comment-13603245 ] Vitaly Kruglikov commented on MAPREDUCE-5068:
Per Sandy's recommendation, I opened the CDH JIRA https://issues.cloudera.org/browse/DISTRO-466
[jira] [Updated] (MAPREDUCE-5068) Fair Scheduler preemption fails if the other queue has a mapreduce job with some tasks in excess of cluster capacity
[ https://issues.apache.org/jira/browse/MAPREDUCE-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitaly Kruglikov updated MAPREDUCE-5068:
Description: This is reliably reproduced while running CDH4.1.2 or CDH4.2.0 on a single Mac OS X machine. [...] My fair-scheduler-allocations.xml:
{code}
<allocations>
  <queue name="cjmQ">
    <!-- minimum amount of aggregate memory; TODO which units??? -->
    <minResources>2048</minResources>
    <!-- limit the number of apps from the queue to run at once -->
    <maxRunningApps>1</maxRunningApps>
    <!-- either fifo or fair depending on the in-queue scheduling policy desired -->
    <schedulingMode>fifo</schedulingMode>
    <!-- Number of seconds after which the pool can preempt other pools' tasks to achieve its min share. Requires preemption to be enabled in mapred-site.xml by setting mapred.fairscheduler.preemption to true. Defaults to infinity (no preemption). -->
    <minSharePreemptionTimeout>5</minSharePreemptionTimeout>
{code}
[jira] [Updated] (MAPREDUCE-4978) Add a updateJobWithSplit() method for new-api job
[ https://issues.apache.org/jira/browse/MAPREDUCE-4978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Liang updated MAPREDUCE-4978:
Affects Version/s: (was: 1.1.1) 1.1.2

Add a updateJobWithSplit() method for new-api job

Key: MAPREDUCE-4978
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4978
Project: Hadoop Map/Reduce
Issue Type: Improvement
Affects Versions: 1.1.2
Reporter: Liyin Liang
Assignee: Liyin Liang
Attachments: 4978-1.diff

HADOOP-1230 adds a method updateJobWithSplit(), which only works for old-api jobs. It would be better to add a counterpart method for new-api jobs.
[jira] [Updated] (MAPREDUCE-4978) Add a updateJobWithSplit() method for new-api job
[ https://issues.apache.org/jira/browse/MAPREDUCE-4978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liyin Liang updated MAPREDUCE-4978:
Fix Version/s: 1.2.0
[jira] [Commented] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603418#comment-13603418 ] Mithun Radhakrishnan commented on MAPREDUCE-5065:
I'm with you on the need for a blocksize-independent checksum. I wasn't convinced that combining CRC32 checksums together to form a higher-level checksum could be correct. (Thanks for the explanation.)
{quote} instruct her to run with -pb, not -skipCrc. {quote}
Yep, that should take care of #2 (above), but not #1. The user will still need to fail first and rerun, because she's unlikely to know that some of her source files might have non-default block sizes. Unless the checksum calculation is fixed (or -pb is the default), I don't think DistCp should enforce a check that's guaranteed to fail under unforeseeable circumstances.

DistCp should skip checksum comparisons if block-sizes are different on source/target.

Key: MAPREDUCE-5065
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5065
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distcp
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
Attachments: MAPREDUCE-5065.branch23.patch, MAPREDUCE-5065.branch2.patch

When copying files between 2 clusters with different default block-sizes, one sees that the copy fails with a checksum mismatch, even though the files have identical contents. The reason is that on HDFS, a file's checksum is unfortunately a function of the block-size of the file. So 2 different files with identical contents (but different block-sizes) can have different checksums. (Thus, it's also possible for DistCp to fail to copy files on the same file-system, if the source file's block-size differs from the HDFS default and -pb isn't used.)

I propose that we skip checksum comparisons under the following conditions:
1. -skipCrc is specified.
2. File-size is 0 (in which case the call to the checksum-servlet is moot).
3. source.getBlockSize() != target.getBlockSize(), since the checksums are guaranteed to differ in this case.

I have a patch for #3.
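The three proposed conditions amount to a small predicate over the copy options and the two files' metadata. A hedged sketch follows; the class and method names are illustrative and do not reflect DistCp's actual API.

```java
// Hedged sketch of the three proposed skip conditions for DistCp checksum
// comparison; names are illustrative, not DistCp's real classes or methods.
public class ChecksumSkip {
    public static boolean shouldSkipChecksum(boolean skipCrcFlag, long fileSize,
                                             long sourceBlockSize, long targetBlockSize) {
        if (skipCrcFlag) return true;     // 1. -skipCrc was specified
        if (fileSize == 0) return true;   // 2. empty file: nothing to compare
        // 3. different block sizes: HDFS checksums are guaranteed to differ
        //    even for identical contents, so comparing them is meaningless
        return sourceBlockSize != targetBlockSize;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // identical contents, different block sizes: comparison would always fail
        System.out.println(shouldSkipChecksum(false, 10 * mb, 64 * mb, 128 * mb));
        // same block size, non-empty file: the comparison is meaningful
        System.out.println(shouldSkipChecksum(false, 10 * mb, 128 * mb, 128 * mb));
    }
}
```

Note that condition 3 only skips the comparison; it does not make the copied file's checksum match. Preserving the block size with -pb remains the way to get verifiable copies.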
[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated MAPREDUCE-5065:
Attachment: (was: MAPREDUCE-5065.branch2.patch)
[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mithun Radhakrishnan updated MAPREDUCE-5065:
Attachment: (was: MAPREDUCE-5065.branch23.patch)
[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated MAPREDUCE-5065:
--------------------------------------------

    Attachment: MAPREDUCE-5065.branch-2.patch
                MAPREDUCE-5065.branch-0.23.patch

Updated patches so that post-copy checksum comparisons are dropped with -skipCrc, on Hadoop-0.23. This brings the 0.23 implementation to parity with 2.0.
[jira] [Commented] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603437#comment-13603437 ]

Hadoop QA commented on MAPREDUCE-5065:
--------------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12573882/MAPREDUCE-5065.branch-2.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-distcp.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3419//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3419//console

This message is automatically generated.
[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated MAPREDUCE-5065:
--------------------------------------------

    Status: Open (was: Patch Available)

Sorry it took so long, but I think I see your argument now, Doug. We'd rather have the false positive (and re-run) than silently skip CRC-checks and risk a bad data-copy. Making -pb the default is probably still a bad idea (because there'd be no option *not* to preserve block-size), and the cost of the re-run can be mitigated with -update. I'll change the patches.
[jira] [Commented] (MAPREDUCE-5028) Maps fail when io.sort.mb is set to high value
[ https://issues.apache.org/jira/browse/MAPREDUCE-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603458#comment-13603458 ]

Karthik Kambatla commented on MAPREDUCE-5028:
---------------------------------------------

[~chris.douglas], will you be able to take a look at the latest patch? Your valuable insights from your MAPREDUCE-64 experience would surely help.

Maps fail when io.sort.mb is set to high value
----------------------------------------------

                 Key: MAPREDUCE-5028
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5028
             Project: Hadoop Map/Reduce
          Issue Type: Bug
    Affects Versions: 1.1.1, 2.0.3-alpha, 0.23.5
            Reporter: Karthik Kambatla
            Assignee: Karthik Kambatla
            Priority: Critical
             Fix For: 1.2.0
         Attachments: mr-5028-branch1.patch, mr-5028-branch1.patch, mr-5028-branch1.patch, mr-5028-trunk.patch

Verified the problem exists on branch-1 with the following configuration:

Pseudo-dist mode: 2 maps / 1 reduce, mapred.child.java.opts=-Xmx2048m, io.sort.mb=1280, dfs.block.size=2147483648

Run teragen to generate 4 GB of data. Maps fail when you run wordcount on this configuration, with the following error:

{noformat}
java.io.IOException: Spill failed
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1031)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
	at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
	at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:45)
	at org.apache.hadoop.examples.WordCount$TokenizerMapper.map(WordCount.java:34)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.hadoop.io.IntWritable.readFields(IntWritable.java:38)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:67)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:116)
	at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
	at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1505)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1438)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:855)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1346)
{noformat}
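One way to see why very large io.sort.mb values are hazardous (a general illustration of the 32-bit arithmetic involved, not the specific MapOutputBuffer fix): converting megabytes to bytes with int arithmetic silently wraps once the product exceeds Integer.MAX_VALUE, and internal buffer-offset arithmetic can wrap even earlier.

```java
public class SortBufferSize {
    // Illustrative only: convert an io.sort.mb value to a byte count.
    static int bytesAsInt(int mb) {
        return mb << 20;        // silently wraps negative for mb >= 2048
    }

    static long bytesAsLong(int mb) {
        return (long) mb << 20; // widen before shifting: always correct
    }
}
```

Note that io.sort.mb=1280 itself still fits in an int (1342177280 bytes), consistent with the failure above arising from offset arithmetic inside the buffer rather than from the size computation alone.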
[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated MAPREDUCE-5065:
--------------------------------------------

    Attachment: (was: MAPREDUCE-5065.branch-0.23.patch)
[jira] [Updated] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mithun Radhakrishnan updated MAPREDUCE-5065:
--------------------------------------------

    Attachment: (was: MAPREDUCE-5065.branch-2.patch)
[jira] [Commented] (MAPREDUCE-5069) add concrete common implementations of CombineFileInputFormat
[ https://issues.apache.org/jira/browse/MAPREDUCE-5069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603476#comment-13603476 ]

Robert Joseph Evans commented on MAPREDUCE-5069:
------------------------------------------------

+1 on the idea too.

add concrete common implementations of CombineFileInputFormat
-------------------------------------------------------------

                 Key: MAPREDUCE-5069
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5069
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: mrv1, mrv2
    Affects Versions: 2.0.3-alpha
            Reporter: Sangjin Lee
            Priority: Minor

CombineFileInputFormat is abstract, and its specific equivalents to TextInputFormat, SequenceFileInputFormat, etc. are currently not in the hadoop code base. These sound like a very common need wherever CombineFileInputFormat is used, and different folks would write the same code over and over to achieve the same goal. It sounds very natural for hadoop to provide at least the text and sequence-file implementations of the CombineFileInputFormat class.
[jira] [Commented] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603493#comment-13603493 ]

Kihwal Lee commented on MAPREDUCE-5065:
---------------------------------------

bq. Another option might be to implement a checksum that's blocksize-independent...

Reading whole metadata may be too much, especially for huge files. It will be better if we make computation happen where the data is. :)

Most hashing is incremental, so if DFSClient feeds the last state of hash into the next datanode and lets it continue updating it, the result will be independent of block size. The current way of doing file checksum allows calculating individual block checksums in parallel, but we are not taking advantage of it in DFSClient anyway. So I don't think there won't be any significant changes in performance or overhead.

We should probably continue this discussion in a separate jira.
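Kihwal's suggestion — carrying the incremental hash state from one block to the next so the final digest does not depend on block size — can be sketched with a plain MessageDigest (illustrative only, not HDFS/DFSClient code):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChainedChecksum {
    // Feed the data through the digest in chunks of the given size,
    // carrying the digest state across chunk boundaries. Because the
    // state is chained, the result does not depend on chunkSize (the
    // stand-in for "block size" here).
    static byte[] digestInChunks(byte[] data, int chunkSize)
            throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        for (int off = 0; off < data.length; off += chunkSize) {
            md.update(data, off, Math.min(chunkSize, data.length - off));
        }
        return md.digest();
    }
}
```

As the comment notes, the trade-off is that chaining serializes the per-block work that independent per-block checksums could in principle compute in parallel.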
[jira] [Commented] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603494#comment-13603494 ]

Kihwal Lee commented on MAPREDUCE-5065:
---------------------------------------

bq. So I don't think there won't be any significant changes in performance or overhead.

Sorry, unintended double negation.
[jira] [Commented] (MAPREDUCE-5065) DistCp should skip checksum comparisons if block-sizes are different on source/target.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603502#comment-13603502 ]

Kihwal Lee commented on MAPREDUCE-5065:
---------------------------------------

Filed HDFS-4605 for block-size independent FileChecksum in HDFS.
[jira] [Updated] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-4987:
-------------------------------------

    Attachment: MAPREDUCE-4987.1.patch

I'm attaching a patch. This fixes the issue of symlink handling on Windows by copying the files instead of truly symlinking, similar to the approach taken in prior patches like HADOOP-9061. It also fixes the logic for bundling the classpath into a jar manifest by guaranteeing that localized resources get added to the classpath, even if those localized resources don't exist in the container path yet. (The classpath jar must get created before the container launch script runs to symlink or copy files from filecache, so this was a chicken-and-egg problem.) With these changes in place, {{TestMRJobs#testDistributedCache}} passes on Mac and Windows.

Here is a summary of the changes in each file:

{{FileUtil#createJarWithClassPath}} - Accept an environment provided by the caller, because YARN will construct an environment different from the current system environment. Provide a way to retain a classpath entry with a trailing '/' even though the directory doesn't exist yet, because the container launch script hasn't run.

{{TestFileUtil#testCreateJarWithClassPath}} - Change the test to cover the new logic.

{{TestMRJobs}} - Initialize {{MiniDFSCluster}} in a @BeforeClass method instead of a static initialization block. This test uses an inner class, {{DistributedCacheChecker}}, as the job's mapper. Since this is an inner class, it has a back-reference to the {{TestMRJobs}} class, which means the {{TestMRJobs}} static initialization runs for each mapper task in addition to running in the JUnit runner. That would start multiple instances of {{MiniDFSCluster}} pointing at the same directories, which would sometimes cause deadlocks. Moving the initialization to a @BeforeClass method prevents it from running in the mappers. I also needed to add a special check that a path is a symlinked directory, because {{FileUtils#isSymlink}} does not work as expected on Windows.

{{ContainerLaunch}} - Copy files instead of symlinking on Windows. Guarantee that localized resources get added to the classpath correctly, even if the paths do not exist yet.

TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks
---------------------------------------------------------------------------------------

                 Key: MAPREDUCE-4987
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: distributed-cache, nodemanager
    Affects Versions: 3.0.0
            Reporter: Chris Nauroth
            Assignee: Chris Nauroth
         Attachments: MAPREDUCE-4987.1.patch

On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while checking the length of a symlink. It expects to see the length of the target of the symlink, but Java 6 on Windows always reports that a symlink has length 0.
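The manifest-classpath technique described for {{FileUtil#createJarWithClassPath}} can be sketched with the stock java.util.jar API (a minimal stand-in, not the Hadoop implementation): write an otherwise-empty jar whose Class-Path attribute lists the entries verbatim, so a directory keeps its trailing '/' even though nothing has been localized there yet.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ClassPathJar {
    // Write a jar containing only a manifest whose Class-Path lists the
    // given entries verbatim (e.g. "filecache/" survives with its slash,
    // even though the directory does not exist yet at creation time).
    static Path write(Path jar, String... entries) throws IOException {
        Manifest mf = new Manifest();
        Attributes main = mf.getMainAttributes();
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0"); // required, or nothing is written
        main.put(Attributes.Name.CLASS_PATH, String.join(" ", entries));
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, mf)) {
            // no entries besides the manifest itself
        }
        return jar;
    }
}
```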
[jira] [Updated] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-4987:
-------------------------------------

    Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-5070) TestClusterStatus.testClusterMetrics fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603573#comment-13603573 ]

Alejandro Abdelnur commented on MAPREDUCE-5070:
-----------------------------------------------

+1

TestClusterStatus.testClusterMetrics fails on JDK7
--------------------------------------------------

                 Key: MAPREDUCE-5070
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5070
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: test
    Affects Versions: 1.1.2
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza
         Attachments: MAPREDUCE-5070.patch

TestClusterStatus is sensitive to the order that the tests are run in. If testReservedSlots is called before testClusterMetrics, testClusterMetrics will fail.
[jira] [Updated] (MAPREDUCE-5070) TestClusterStatus.testClusterMetrics fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-5070:
------------------------------------------

      Resolution: Fixed
   Fix Version/s: 1.3.0
    Hadoop Flags: Reviewed
          Status: Resolved (was: Patch Available)

Thanks Sandy. Committed to branch-1.
[jira] [Commented] (MAPREDUCE-5072) TestDelegationTokenRenewal.testDTRenewal fails in MR1 on jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603580#comment-13603580 ]

Alejandro Abdelnur commented on MAPREDUCE-5072:
-----------------------------------------------

+1

TestDelegationTokenRenewal.testDTRenewal fails in MR1 on jdk7
-------------------------------------------------------------

                 Key: MAPREDUCE-5072
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5072
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: test
    Affects Versions: 1.1.2
            Reporter: Sandy Ryza
            Assignee: Sandy Ryza
             Fix For: 1.2.0
         Attachments: MAPREDUCE-5072.patch

TestDelegationTokenRenewal.testDTRenewal fails in MR1 for the same reasons that TestDelegationTokenRenewer.testDTRenewal fails, as described in YARN-31. The fix is the same.
[jira] [Commented] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603585#comment-13603585 ]

Alejandro Abdelnur commented on MAPREDUCE-4716:
-----------------------------------------------

Typo in the commit message: it said '(tucu)' when it should have been '(sandyr via tucu)'. Sorry about that.

TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
--------------------------------------------------------------------

                 Key: MAPREDUCE-4716
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: jobhistoryserver
    Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha
            Reporter: Thomas Graves
            Assignee: Thomas Graves
              Labels: java7
         Attachments: MAPREDUCE-4716.patch

Using jdk7, TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails. It looks like the exception-message string changed from "const class" to "constant" in jdk7.

{noformat}
Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec FAILURE!
testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery)  Time elapsed: 0.371 sec  FAILURE!
java.lang.AssertionError: exception message doesn't match, got: No enum constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState expected: No enum const class org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState
	at org.junit.Assert.fail(Assert.java:91)
	at org.junit.Assert.assertTrue(Assert.java:43)
	at org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77)
	at org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{noformat}
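The JDK behavior behind this failure is easy to reproduce with any enum (the enum below is a local stand-in, not Hadoop's JobState): Enum.valueOf throws IllegalArgumentException for an unknown name, and the message prefix changed from "No enum const class" in JDK 6 to "No enum constant" in JDK 7.

```java
public class EnumMessageDemo {
    enum JobState { NEW, RUNNING, SUCCEEDED, FAILED, KILLED }

    // Returns the exception message produced when parsing an invalid state
    // name, which is what a web-service layer would echo back to the client.
    static String invalidStateMessage(String name) {
        try {
            JobState.valueOf(name);
            return "parsed unexpectedly";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }
}
```

On JDK 7 and later this yields a message beginning with "No enum constant", which is why the test's hard-coded "No enum const class" expectation broke.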
[jira] [Issue Comment Deleted] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated MAPREDUCE-4716:
------------------------------------------

    Comment: was deleted (was: typo in commit message, said '(tucu)' when it should have been '(sandyr via tucu)', sorry about that.)
[jira] [Commented] (MAPREDUCE-5070) TestClusterStatus.testClusterMetrics fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603589#comment-13603589 ] Alejandro Abdelnur commented on MAPREDUCE-5070: --- typo in commit message, said '(tucu)' when it should have been '(sandyr via tucu)' TestClusterStatus.testClusterMetrics fails on JDK7 -- Key: MAPREDUCE-5070 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5070 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.2 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.3.0 Attachments: MAPREDUCE-5070.patch TestClusterStatus is sensitive to the order that the tests are run in. If testReservedSlots is called before testClusterMetrics, testClusterMetrics will fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
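The order-sensitivity pattern behind several of these JDK 7 failures can be sketched as follows (names are hypothetical, not the actual TestClusterStatus code): JDK 7 no longer returns reflected test methods in declaration order, so tests that leak shared mutable state into each other start failing when the runner picks a different order. Resetting the shared state before every test removes the dependence:

```java
// Hypothetical sketch of an order-dependent test pair and its fix.
// One test mutates shared static state; the other assumes a clean slate.
// A per-test reset (JUnit's @Before would do this) makes any order safe.
public class OrderIndependentTests {
    static int reservedSlots = 0;               // shared state the tests mutate

    static void setUp() { reservedSlots = 0; }  // reset before every test

    static void testReservedSlots() {
        reservedSlots = 2;                      // this test changes the state
        assertEquals(2, reservedSlots);
    }

    static void testClusterMetrics() {
        // Without the setUp() reset, this fails whenever it runs second.
        assertEquals(0, reservedSlots);
    }

    static void assertEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Whatever order the runner chooses, the per-test reset keeps results stable.
        setUp(); testReservedSlots();
        setUp(); testClusterMetrics();
        setUp(); testClusterMetrics();
        System.out.println("tests pass in both orders");
    }
}
```

The same reset-before-each-test approach applies to the TestJobStatusPersistency and TestClusterStatus fixes discussed in this thread.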
[jira] [Updated] (MAPREDUCE-5072) TestDelegationTokenRenewal.testDTRenewal fails in MR1 on jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-5072: -- Resolution: Fixed Fix Version/s: (was: 1.2.0) 1.3.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Sandy. Committed to branch-1. TestDelegationTokenRenewal.testDTRenewal fails in MR1 on jdk7 - Key: MAPREDUCE-5072 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5072 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.2 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.3.0 Attachments: MAPREDUCE-5072.patch TestDelegationTokenRenewal.testDTRenewal fails in MR1 for the reasons that TestDelegationTokenRenewer.testDTRenewal fails described in YARN-31. The fix is the same. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4571) TestHsWebServicesJobs fails on jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603605#comment-13603605 ] Sandy Ryza commented on MAPREDUCE-4571: --- I just tried the patch on top of trunk and it still applies and works. TestHsWebServicesJobs fails on jdk7 --- Key: MAPREDUCE-4571 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4571 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Attachments: MAPREDUCE-4571.patch TestHsWebServicesJobs fails on jdk7. Tests run: 22, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.561 sec FAILURE! testJobIdSlash(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs) Time elapsed: 0.334 sec FAILURE! java.lang.AssertionError: mapsTotal incorrect expected:<0> but was:<1> -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603608#comment-13603608 ] Sandy Ryza commented on MAPREDUCE-4716: --- I just tried the patch on top of trunk and it still applies and works. TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7 Key: MAPREDUCE-4716 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Attachments: MAPREDUCE-4716.patch Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails. It looks like the string changed from const class to constant in jdk7. Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec FAILURE! testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery) Time elapsed: 0.371 sec FAILURE! java.lang.AssertionError: exception message doesn't match, got: No enum constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState expected: No enum const class org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState at org.junit.Assert.fail(Assert.java:91)at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77) at org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
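The message change behind MAPREDUCE-4716 comes from java.lang.Enum#valueOf: JDK 6 threw IllegalArgumentException with "No enum const class X.Y", while JDK 7 says "No enum constant X.Y". A version-tolerant check can match the common prefix of both forms; this is an illustrative sketch (with a stand-in enum), not the committed patch:

```java
// Sketch: Enum.valueOf's exception text changed between JDK 6
// ("No enum const class ...") and JDK 7 ("No enum constant ...").
// Matching only the shared prefix keeps a test green on both JDKs.
public class EnumMessageCheck {
    // Stand-in for org.apache.hadoop.mapreduce.v2.api.records.JobState.
    enum JobState { NEW, RUNNING, SUCCEEDED }

    public static void main(String[] args) {
        try {
            JobState.valueOf("InvalidState");
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException e) {
            String msg = e.getMessage();
            // "No enum const" is the common prefix of the JDK 6 and JDK 7 texts.
            if (!msg.startsWith("No enum const")) {
                throw new AssertionError("unexpected message: " + msg);
            }
            System.out.println("ok: " + msg);
        }
    }
}
```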
[jira] [Created] (MAPREDUCE-5074) Remove limits on number of counters and counter groups in MapReduce
Ravi Prakash created MAPREDUCE-5074: --- Summary: Remove limits on number of counters and counter groups in MapReduce Key: MAPREDUCE-5074 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5074 Project: Hadoop Map/Reduce Issue Type: Improvement Components: mr-am, mrv2 Affects Versions: 2.0.3-alpha, 3.0.0, 0.23.6 Reporter: Ravi Prakash Can we please consider removing limits on the number of counters and counter groups now that it is all user code? Thanks to the much better architecture of YARN in which there is no single Job Tracker we have to worry about overloading, I feel we should do away with this (now arbitrary) constraint on users' capabilities. Thoughts? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4571) TestHsWebServicesJobs fails on jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603679#comment-13603679 ] Alejandro Abdelnur commented on MAPREDUCE-4571: --- +1 TestHsWebServicesJobs fails on jdk7 --- Key: MAPREDUCE-4571 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4571 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Attachments: MAPREDUCE-4571.patch TestHsWebServicesJobs fails on jdk7. Tests run: 22, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.561 sec FAILURE! testJobIdSlash(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs) Time elapsed: 0.334 sec FAILURE! java.lang.AssertionError: mapsTotal incorrect expected:<0> but was:<1> -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4571) TestHsWebServicesJobs fails on jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-4571: -- Resolution: Fixed Fix Version/s: 2.0.5-beta Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Thomas (and Sandy for verifying it still applies/works). Committed to trunk and branch-2. TestHsWebServicesJobs fails on jdk7 --- Key: MAPREDUCE-4571 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4571 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Fix For: 2.0.5-beta Attachments: MAPREDUCE-4571.patch TestHsWebServicesJobs fails on jdk7. Tests run: 22, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.561 sec FAILURE!testJobIdSlash(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs) Time elapsed: 0.334 sec FAILURE! java.lang.AssertionError: mapsTotal incorrect expected:0 but was:1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603692#comment-13603692 ] Alejandro Abdelnur commented on MAPREDUCE-4716: --- +1 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7 Key: MAPREDUCE-4716 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Attachments: MAPREDUCE-4716.patch Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails. It looks like the string changed from const class to constant in jdk7. Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec FAILURE! testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery) Time elapsed: 0.371 sec FAILURE! java.lang.AssertionError: exception message doesn't match, got: No enum constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState expected: No enum const class org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState at org.junit.Assert.fail(Assert.java:91)at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77) at org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-4716: -- Resolution: Fixed Fix Version/s: 2.0.5-beta Target Version/s: 2.0.3-alpha, 3.0.0, 0.23.7 (was: 3.0.0, 2.0.3-alpha, 0.23.7) Status: Resolved (was: Patch Available) Thanks Thomas (and Sandy for verifying still applies/works). Committed to trunk and branch-2. TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7 Key: MAPREDUCE-4716 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Fix For: 2.0.5-beta Attachments: MAPREDUCE-4716.patch Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails. It looks like the string changed from const class to constant in jdk7. Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec FAILURE! testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery) Time elapsed: 0.371 sec FAILURE! java.lang.AssertionError: exception message doesn't match, got: No enum constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState expected: No enum const class org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState at org.junit.Assert.fail(Assert.java:91)at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77) at org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603700#comment-13603700 ] Hadoop QA commented on MAPREDUCE-4987: -- {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12573897/MAPREDUCE-4987.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3420//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3420//console This message is automatically generated. 
TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks --- Key: MAPREDUCE-4987 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache, nodemanager Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: MAPREDUCE-4987.1.patch On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while checking the length of a symlink. It expects to see the length of the target of the symlink, but Java 6 on Windows always reports that a symlink has length 0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
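For reference, Java 7's java.nio.file API makes the link-versus-target distinction explicit, which avoids the ambiguity of File#length on platforms where a symlink reports length 0 under Java 6. This is a small illustrative sketch, not the attached patch:

```java
import java.io.IOException;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Sketch: Files.size follows symlinks by default (target length), while
// NOFOLLOW_LINKS asks about the link itself -- so a test can state
// explicitly which length it means instead of relying on File#length.
public class SymlinkSize {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-demo");
        Path target = Files.write(dir.resolve("target.txt"), "hello".getBytes());
        Path link = dir.resolve("link.txt");
        try {
            Files.createSymbolicLink(link, target);
        } catch (UnsupportedOperationException | FileSystemException e) {
            // e.g. Windows without the symlink privilege
            System.out.println("symlinks not supported here: " + e);
            return;
        }
        // Following the link (the default) yields the target's length ...
        long followed = Files.size(link);
        // ... while NOFOLLOW_LINKS describes the link entry itself.
        BasicFileAttributes linkAttrs =
            Files.readAttributes(link, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        System.out.println("followed=" + followed
            + " isSymbolicLink=" + linkAttrs.isSymbolicLink());
    }
}
```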
[jira] [Commented] (MAPREDUCE-4716) TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603704#comment-13603704 ] Hudson commented on MAPREDUCE-4716: --- Integrated in Hadoop-trunk-Commit #3481 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3481/]) MAPREDUCE-4716. TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7. (tgraves via tucu) (Revision 1457065) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1457065 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/webapp/TestHsWebServicesJobsQuery.java TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails with jdk7 Key: MAPREDUCE-4716 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4716 Project: Hadoop Map/Reduce Issue Type: Bug Components: jobhistoryserver Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Fix For: 2.0.5-beta Attachments: MAPREDUCE-4716.patch Using jdk7 TestHsWebServicesJobsQuery.testJobsQueryStateInvalid fails. It looks like the string changed from const class to constant in jdk7. Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.713 sec FAILURE! testJobsQueryStateInvalid(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery) Time elapsed: 0.371 sec FAILURE! 
java.lang.AssertionError: exception message doesn't match, got: No enum constant org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState expected: No enum const class org.apache.hadoop.mapreduce.v2.api.records.JobState.InvalidState at org.junit.Assert.fail(Assert.java:91) at org.junit.Assert.assertTrue(Assert.java:43) at org.apache.hadoop.yarn.webapp.WebServicesTestUtils.checkStringMatch(WebServicesTestUtils.java:77) at org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobsQuery.testJobsQueryStateInvalid(TestHsWebServicesJobsQuery.java:286) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4571) TestHsWebServicesJobs fails on jdk7
[ https://issues.apache.org/jira/browse/MAPREDUCE-4571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603705#comment-13603705 ] Hudson commented on MAPREDUCE-4571: --- Integrated in Hadoop-trunk-Commit #3481 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3481/]) MAPREDUCE-4571. TestHsWebServicesJobs fails on jdk7. (tgraves via tucu) (Revision 1457061) Result = SUCCESS tucu : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1457061 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/MockHistoryJobs.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs/src/test/java/org/apache/hadoop/mapreduce/v2/hs/webapp/TestHsWebServicesJobs.java TestHsWebServicesJobs fails on jdk7 --- Key: MAPREDUCE-4571 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4571 Project: Hadoop Map/Reduce Issue Type: Bug Components: webapps Affects Versions: 0.23.3, 3.0.0, 2.0.2-alpha Reporter: Thomas Graves Assignee: Thomas Graves Labels: java7 Fix For: 2.0.5-beta Attachments: MAPREDUCE-4571.patch TestHsWebServicesJobs fails on jdk7. Tests run: 22, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.561 sec FAILURE! testJobIdSlash(org.apache.hadoop.mapreduce.v2.hs.webapp.TestHsWebServicesJobs) Time elapsed: 0.334 sec FAILURE! java.lang.AssertionError: mapsTotal incorrect expected:<0> but was:<1> -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Issue Type: Test (was: Improvement) Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch The maven surefire plugin supports parallel testing feature. By using it, the tests can be run more faster. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
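For context, the surefire parallel-execution feature MAPREDUCE-4980 relies on is enabled in a module's pom.xml roughly like this (the values are illustrative, not the patch's actual settings):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- run test classes concurrently; requires the tests to be
         order- and state-independent (see the JDK7 issues above) -->
    <parallel>classes</parallel>
    <threadCount>4</threadCount>
  </configuration>
</plugin>
```

Parallel execution tends to surface the same shared-state bugs as JDK 7's method reordering, which is why the two threads of work in this digest are related.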
[jira] [Commented] (MAPREDUCE-4987) TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks
[ https://issues.apache.org/jira/browse/MAPREDUCE-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603718#comment-13603718 ] Arpit Agarwal commented on MAPREDUCE-4987: -- +1 Chris explained to me offline about the change in FileUtil#createJarWithClassPath. Quoting here since I found it helpful to understand the change. {quote} In the method sanitizeEnv, you'll see that nodemanager does various things to set up a new environment for the container to be launched. The final state of this environment will be different from the environment of the currently running process (the nodemanager itself). The most glaring problem with this bug was the setting of PWD to the new container work directory. There are various classpath entries for the distributed cache files that are of the form $PWD/file on Mac or %PWD%/file on Windows, and FileUtil#createJarWithClassPath needs to expand this to container_dir/file. Without this change, the variable expansion would be incorrect: nodemanager_working_dir/file on Mac or just /file on Windows (since Windows doesn't intrinsically have %PWD% defined until nodemanager sets it in sanitizeEnv). {quote} TestMRJobs#testDistributedCache fails on Windows due to unexpected behavior of symlinks --- Key: MAPREDUCE-4987 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4987 Project: Hadoop Map/Reduce Issue Type: Bug Components: distributed-cache, nodemanager Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: MAPREDUCE-4987.1.patch On Windows, {{TestMRJobs#testDistributedCache}} fails on an assertion while checking the length of a symlink. It expects to see the length of the target of the symlink, but Java 6 on Windows always reports that a symlink has length 0. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
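Arpit's quoted explanation can be illustrated with a toy expansion routine (purely illustrative; Hadoop's real FileUtil#createJarWithClassPath is more involved). The point is that entries like $PWD/file or %PWD%/file must be expanded against the environment being built for the container, not the nodemanager's own environment:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy sketch of classpath-entry variable expansion. Expanding against the
// container's environment yields container_dir/file; expanding against the
// nodemanager's environment would give the wrong directory (or, on Windows,
// nothing at all, since %PWD% is not intrinsically defined there).
public class ClasspathExpansion {
    // Matches Unix-style $VAR and Windows-style %VAR% references.
    static final Pattern VAR = Pattern.compile("\\$(\\w+)|%(\\w+)%");

    static String expand(String entry, Map<String, String> env) {
        Matcher m = VAR.matcher(entry);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1) != null ? m.group(1) : m.group(2);
            String val = env.getOrDefault(name, "");  // undefined vars vanish
            m.appendReplacement(out, Matcher.quoteReplacement(val));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> containerEnv = Map.of("PWD", "/container/work");
        System.out.println(expand("$PWD/job.jar", containerEnv));   // /container/work/job.jar
        System.out.println(expand("%PWD%/job.jar", containerEnv));  // /container/work/job.jar
    }
}
```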
[jira] [Commented] (MAPREDUCE-5073) TestJobStatusPersistency.testPersistency fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603789#comment-13603789 ] Alejandro Abdelnur commented on MAPREDUCE-5073: --- +1 TestJobStatusPersistency.testPersistency fails on JDK7 -- Key: MAPREDUCE-5073 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5073 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.2 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.2.0 Attachments: MAPREDUCE-5073.patch TestJobStatusPersistency is sensitive to the order that the tests are run in. If testLocalPersistency runs before testPersistency, testPersistency will fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5073) TestJobStatusPersistency.testPersistency fails on JDK7
[ https://issues.apache.org/jira/browse/MAPREDUCE-5073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated MAPREDUCE-5073: -- Resolution: Fixed Fix Version/s: (was: 1.2.0) 1.3.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Sandy. Committed to branch-1. TestJobStatusPersistency.testPersistency fails on JDK7 -- Key: MAPREDUCE-5073 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5073 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 1.1.2 Reporter: Sandy Ryza Assignee: Sandy Ryza Fix For: 1.3.0 Attachments: MAPREDUCE-5073.patch TestJobStatusPersistency is sensitive to the order that the tests are run in. If testLocalPersistency runs before testPersistency, testPersistency will fail. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4875) coverage fixing for org.apache.hadoop.mapred
[ https://issues.apache.org/jira/browse/MAPREDUCE-4875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603834#comment-13603834 ] Robert Joseph Evans commented on MAPREDUCE-4875: For the most part everything looks good here. My only concern is that in some places it looks like we are just testing dead code. Things like TaskLog are used by pipes, but only part of it are used by pipes so ripping out the unused code I think would be preferable. coverage fixing for org.apache.hadoop.mapred Key: MAPREDUCE-4875 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4875 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0, 2.0.3-alpha, 0.23.6 Reporter: Aleksey Gorshkov Fix For: 3.0.0, 0.23.6, 2.0.5-beta Attachments: MAPREDUCE-4875-branch-0.23.patch, MAPREDUCE-4875-trunk.patch added some tests for org.apache.hadoop.mapred MAPREDUCE-4875-trunk.patch for trunk and branch-2 MAPREDUCE-4875-branch-0.23.patch for branch-0.23 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Joseph Evans updated MAPREDUCE-5042: --- Resolution: Fixed Fix Version/s: 2.0.5-beta 0.23.7 3.0.0 Status: Resolved (was: Patch Available) Thanks Jason, I put this into trunk, branch-2, and branch-0.23 Reducer unable to fetch for a map task that was recovered - Key: MAPREDUCE-5042 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, security Affects Versions: 0.23.7, 2.0.4-alpha Reporter: Jason Lowe Assignee: Jason Lowe Priority: Blocker Fix For: 3.0.0, 0.23.7, 2.0.5-beta Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch, MAPREDUCE-5042.patch, MAPREDUCE-5042.patch If an application attempt fails and is relaunched the AM will try to recover previously completed tasks. If a reducer needs to fetch the output of a map task attempt that was recovered then it will fail with a 401 error like this: {noformat} java.io.IOException: Server returned HTTP response code: 401 for URL: http://xx:xx/mapOutput?job=job_1361569180491_21845reduce=0map=attempt_1361569180491_21845_m_16_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156) {noformat} Looking at the corresponding NM's logs, we see the shuffle failed due to Verification of the hashReply failed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-5042) Reducer unable to fetch for a map task that was recovered
[ https://issues.apache.org/jira/browse/MAPREDUCE-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13603872#comment-13603872 ] Hudson commented on MAPREDUCE-5042: --- Integrated in Hadoop-trunk-Commit #3483 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3483/]) MAPREDUCE-5042. Reducer unable to fetch for a map task that was recovered (Jason Lowe via bobby) (Revision 1457119) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1457119 Files : * /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/YarnChild.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/MRAppMaster.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/JobImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/TestRecovery.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TestJobImpl.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/Task.java * 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/JobSubmitter.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/security/TokenCache.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Fetcher.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java * /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/pipes/TestPipeApplication.java Reducer unable to fetch for a map task that was recovered - Key: MAPREDUCE-5042 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5042 Project: Hadoop Map/Reduce Issue Type: Bug Components: mr-am, security Affects Versions: 0.23.7, 2.0.4-alpha Reporter: Jason Lowe Assignee: Jason Lowe Priority: Blocker Fix For: 3.0.0, 0.23.7, 2.0.5-beta Attachments: MAPREDUCE-5042.patch, MAPREDUCE-5042.patch, MAPREDUCE-5042.patch, MAPREDUCE-5042.patch If an application attempt fails and is relaunched the AM will try to recover previously completed tasks. 
If a reducer needs to fetch the output of a map task attempt that was recovered then it will fail with a 401 error like this: {noformat} java.io.IOException: Server returned HTTP response code: 401 for URL: http://xx:xx/mapOutput?job=job_1361569180491_21845&reduce=0&map=attempt_1361569180491_21845_m_16_0 at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1615) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:231) at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:156) {noformat} Looking at the corresponding NM's logs, we see the shuffle failed due to "Verification of the hashReply failed". -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url
[ https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah updated MAPREDUCE-5066:
-----------------------------------
Affects Version/s: 2.0.3-alpha

JobTracker should set a timeout when calling into job.end.notification.url
--------------------------------------------------------------------------

Key: MAPREDUCE-5066
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5066
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1-win, 2.0.3-alpha, 1.3.0
Reporter: Ivan Mitic
Assignee: Ivan Mitic

In the current code, no timeout is specified when the JobTracker (JobEndNotifier) calls into the notification URL. When the given URL points to a server that does not respond for a long time, job notifications are completely stuck (given that there is only a single thread processing all notifications). We've seen this cause noticeable delays in job execution for components that rely on job end notifications (like Oozie workflows). I propose we introduce a configurable timeout option and set the default to a reasonably small value. If we want, we can also introduce a configurable number of workers processing the notification queue (not sure this is needed at this point). I will prepare a patch soon. Please comment back.
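The proposed fix amounts to bounding the blocking phases of the notification HTTP call so one unresponsive endpoint cannot stall the single notifier thread. A rough sketch, assuming an HttpURLConnection-based notifier; the helper names and the 5000 ms default are illustrative assumptions, not the property names or defaults actually committed:

```java
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch of a bounded job-end notification: both the TCP connect and the
// response read are capped, so a hung server fails the attempt quickly
// instead of blocking the notification queue indefinitely.
public class NotificationTimeoutSketch {
    static final int DEFAULT_TIMEOUT_MS = 5000;  // assumed "reasonably small" default

    /** Parse a configured timeout, falling back to the default on bad values. */
    public static int timeoutMs(String configured, int defaultMs) {
        try {
            int t = Integer.parseInt(configured);
            return t > 0 ? t : defaultMs;
        } catch (NumberFormatException e) {
            return defaultMs;
        }
    }

    /** Fire the notification with both timeouts applied; returns the HTTP status. */
    public static int notifyUrl(String endpoint, int timeoutMs) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setConnectTimeout(timeoutMs);  // bound TCP connect
        conn.setReadTimeout(timeoutMs);     // bound waiting for the response
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        System.out.println(timeoutMs("3000", DEFAULT_TIMEOUT_MS));   // prints 3000
        System.out.println(timeoutMs("bogus", DEFAULT_TIMEOUT_MS));  // prints 5000
    }
}
```

A separate worker pool, as the report suggests, would further isolate slow endpoints, but a timeout alone already bounds the worst-case delay per notification.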
[jira] [Commented] (MAPREDUCE-5066) JobTracker should set a timeout when calling into job.end.notification.url
[ https://issues.apache.org/jira/browse/MAPREDUCE-5066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603885#comment-13603885 ]

Hitesh Shah commented on MAPREDUCE-5066:
----------------------------------------

Job end notification also exists in 2.x, which may face the same set of issues.
[jira] [Created] (MAPREDUCE-5075) DistCp leaks input file handles
Chris Nauroth created MAPREDUCE-5075:
-------------------------------------

Summary: DistCp leaks input file handles
Key: MAPREDUCE-5075
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5075
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: distcp
Affects Versions: 3.0.0
Reporter: Chris Nauroth
Assignee: Chris Nauroth

DistCp wraps the {{InputStream}} for each input file it reads in an instance of {{ThrottledInputStream}}. This class does not close the wrapped {{InputStream}}. {{RetriableFileCopyCommand}} guarantees that the {{ThrottledInputStream}} gets closed, but without closing the underlying wrapped stream, it still leaks a file handle.
[jira] [Commented] (MAPREDUCE-5075) DistCp leaks input file handles
[ https://issues.apache.org/jira/browse/MAPREDUCE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603903#comment-13603903 ]

Chris Nauroth commented on MAPREDUCE-5075:
------------------------------------------

I discovered this while testing on Windows, where file locking is enforced more strictly. The DistCp tests would fail sporadically because the temp files could not be deleted. I have a patch in progress.
[jira] [Commented] (MAPREDUCE-5038) old API CombineFileInputFormat missing fixes that are in new API
[ https://issues.apache.org/jira/browse/MAPREDUCE-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603940#comment-13603940 ]

Sandy Ryza commented on MAPREDUCE-5038:
---------------------------------------

It looks like I messed this up and left out part of MAPREDUCE-1597. Working on a replacement patch.

old API CombineFileInputFormat missing fixes that are in new API
----------------------------------------------------------------

Key: MAPREDUCE-5038
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5038
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Fix For: 1.2.0
Attachments: MAPREDUCE-5038-1.patch, MAPREDUCE-5038.patch

The following changes patched the CombineFileInputFormat in mapreduce, but neglected the one in mapred:
* MAPREDUCE-1597 enabled the CombineFileInputFormat to work on splittable files
* MAPREDUCE-2021 solved returning duplicate hostnames in split locations
* MAPREDUCE-1806 CombineFileInputFormat does not work with paths not on the default FS

In trunk this is not an issue, as the one in mapred extends the one in mapreduce.
[jira] [Updated] (MAPREDUCE-5075) DistCp leaks input file handles
[ https://issues.apache.org/jira/browse/MAPREDUCE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5075:
-------------------------------------
Attachment: MAPREDUCE-5075.1.patch

Here is a patch with the following changes:
# {{RetriableFileCopyCommand}} - This is just code clean-up. The {{copyBytes}} private method accepted a flag as an argument to control whether or not to close the streams after copying. This method was only ever called from {{copyToTmpFile}} with a hard-coded true, so I removed the flag from the method signature and changed the code to close the streams unconditionally.
# {{ThrottledInputStream}} - Override {{close}} so that it closes the wrapped stream.
# {{TestIntegration}} - This code was not creating the target file correctly. {{target}} contains a fully qualified path, and {{createFiles}} prepends the test root again, producing two fully qualified paths appended to each other. On Windows, the result would look like C:\project\target\C:\project\target, and the second ':' makes the filename invalid.

With this patch, all DistCp tests pass consistently on Mac and Windows.
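The second change, propagating {{close}} through the wrapper, can be illustrated with a toy stand-in (class names are hypothetical and the throttling logic is omitted entirely; this is not the real DistCp code):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Toy demonstration of the leak and the fix: a wrapper stream that fails to
// delegate close() leaves the wrapped stream (and its file handle) open even
// after the wrapper itself is "closed". FixedWrapper forwards close(), as the
// patch does for ThrottledInputStream.
public class StreamCloseSketch {
    /** An in-memory stream that records whether close() ever reached it. */
    public static class TrackingStream extends ByteArrayInputStream {
        public boolean closed = false;
        public TrackingStream(byte[] buf) { super(buf); }
        @Override public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Minimal wrapper with the fix applied: close() reaches the wrapped stream. */
    public static class FixedWrapper extends InputStream {
        private final InputStream raw;
        public FixedWrapper(InputStream raw) { this.raw = raw; }
        @Override public int read() throws IOException { return raw.read(); }
        @Override public void close() throws IOException { raw.close(); }  // the fix
    }

    public static void main(String[] args) throws IOException {
        TrackingStream raw = new TrackingStream(new byte[] {1, 2, 3});
        new FixedWrapper(raw).close();
        System.out.println(raw.closed);  // prints true: the handle is released
    }
}
```

Without the override in {{FixedWrapper}}, {{raw.closed}} stays false after closing the wrapper, which is exactly the leaked handle the strict Windows file locking exposed.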
[jira] [Updated] (MAPREDUCE-5075) DistCp leaks input file handles
[ https://issues.apache.org/jira/browse/MAPREDUCE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5075:
-------------------------------------
Status: Patch Available (was: Open)
[jira] [Commented] (MAPREDUCE-5075) DistCp leaks input file handles
[ https://issues.apache.org/jira/browse/MAPREDUCE-5075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13603992#comment-13603992 ]

Hadoop QA commented on MAPREDUCE-5075:
--------------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12573975/MAPREDUCE-5075.1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 tests included appear to have a timeout.{color}
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-tools/hadoop-distcp.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3421//testReport/
Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/3421//console

This message is automatically generated.
[jira] [Created] (MAPREDUCE-5076) CombineFileInputFormat with maxSplitSize can omit data
Sandy Ryza created MAPREDUCE-5076:
----------------------------------

Summary: CombineFileInputFormat with maxSplitSize can omit data
Key: MAPREDUCE-5076
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5076
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Sandy Ryza
Assignee: Sandy Ryza

I ran a local job with CombineFileInputFormat using an 80 MB file and a max split size of 32 MB (the default local FS block size). The job ran with two splits of 32 MB, and the last 16 MB were simply omitted. This appears to be caused by a subtle bug in getMoreSplits: the code that generates the splits from the blocks expects the 16 MB block to be at the end of the block list, but the code that generates the blocks does not respect this.
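The failure mode can be sketched with a toy split calculation (illustrative only, not the real getMoreSplits logic): any remainder smaller than maxSplitSize must still be emitted as a final split, or its data is silently dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of grouping a file's bytes into splits of at most maxSplitSize.
// Omitting the final "leftover" branch reproduces the reported data loss:
// an 80 MB file with a 32 MB max split would yield two 32 MB splits and the
// trailing 16 MB would simply vanish.
public class SplitSketch {
    public static List<Long> splitSizes(long fileLen, long maxSplitSize) {
        List<Long> splits = new ArrayList<>();
        long remaining = fileLen;
        while (remaining >= maxSplitSize) {
            splits.add(maxSplitSize);
            remaining -= maxSplitSize;
        }
        if (remaining > 0) {
            splits.add(remaining);  // omitting this branch silently drops the tail
        }
        return splits;
    }

    public static void main(String[] args) {
        // Sizes in MB for readability: 80 MB file, 32 MB max split.
        System.out.println(splitSizes(80, 32));  // prints [32, 32, 16]
    }
}
```

The real bug is subtler (it concerns the position of the short block in the block list rather than a missing branch), but the invariant is the same: every byte must land in exactly one split regardless of where the short block appears.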