[jira] [Commented] (MAPREDUCE-6709) Add configurable flag to allow MapReduce AM to specify the 'ensureExecutionType' of a Request
[ https://issues.apache.org/jira/browse/MAPREDUCE-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798568#comment-16798568 ] Hadoop QA commented on MAPREDUCE-6709: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 6m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 18s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 31s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 12s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 43s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}123m 13s{color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 42s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 1m 0s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}242m 46s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | MAPREDUCE-6709 | | JIRA
[jira] [Updated] (MAPREDUCE-6709) Add configurable flag to allow MapReduce AM to specify the 'ensureExecutionType' of a Request
[ https://issues.apache.org/jira/browse/MAPREDUCE-6709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated MAPREDUCE-6709: - Attachment: MAPREDUCE-6709.003.patch > Add configurable flag to allow MapReduce AM to specify the > 'ensureExecutionType' of a Request > - > > Key: MAPREDUCE-6709 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6709 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Arun Suresh >Assignee: Abhishek Modi >Priority: Major > Attachments: MAPREDUCE-6709.001.patch, MAPREDUCE-6709.002.patch, > MAPREDUCE-6709.003.patch > > > MAPREDUCE-6703 allows users to configure the ratio of Map tasks to be > requested as OPPORTUNISTIC. This JIRA proposes to expose configuration that > allows users to additionally specify the value of the *ensureExecutionType* flag > (introduced in YARN-5180). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-7169) Speculative attempts should not run on the same node
[ https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798119#comment-16798119 ] Hadoop QA commented on MAPREDUCE-7169: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 22s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 56s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 21s{color} | {color:orange} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app: The patch generated 15 new + 463 unchanged - 1 fixed = 478 total (was 464) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 12 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 12s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 18s{color} | {color:red} hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 39s{color} | {color:green} hadoop-mapreduce-client-app in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 59m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f | | JIRA Issue | MAPREDUCE-7169 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12963268/MAPREDUCE-7169-001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 7d98ec9b3b24 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 9f1c017 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_191 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/7605/artifact/out/diff-checkstyle-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt | | whitespace |
[jira] [Updated] (MAPREDUCE-7169) Speculative attempts should not run on the same node
[ https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated MAPREDUCE-7169: - Attachment: MAPREDUCE-7169-001.patch > Speculative attempts should not run on the same node > > > Key: MAPREDUCE-7169 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7169 > Project: Hadoop Map/Reduce > Issue Type: New Feature > Components: yarn >Affects Versions: 2.7.2 >Reporter: Lee chen >Assignee: Bilwa S T >Priority: Major > Attachments: MAPREDUCE-7169-001.patch, > image-2018-12-03-09-54-07-859.png > > > I found that in all versions of YARN, speculative execution may place the > speculative attempt on the same node as the original task attempt. From what I have read, > it only tries to launch one more task attempt; I have not seen anything that prevents > that attempt from running on the same node. This is unreasonable: if the node has a problem that makes > task execution very slow, placing the speculative attempt on the same > node cannot help the slow task. > In our cluster (version 2.7.2, 2700 nodes), this happens > almost every day. > !image-2018-12-03-09-54-07-859.png!
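The behavior requested in MAPREDUCE-7169 can be pictured as a filter over candidate nodes. The following is a minimal sketch under stated assumptions, not the attached patch and not code from the MapReduce AM: the class name, the method pickNode, and the node names are all hypothetical, and nodes are modeled as plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration only; none of these names come from the MapReduce AM. */
public class SpeculativePlacementSketch {

  /**
   * Returns the first candidate node that does not already host an attempt
   * of the task, or an empty Optional if every candidate is excluded.
   */
  static Optional<String> pickNode(List<String> candidates, Set<String> nodesWithAttempts) {
    return candidates.stream()
        .filter(node -> !nodesWithAttempts.contains(node))
        .findFirst();
  }

  public static void main(String[] args) {
    // The original attempt runs on node1, so node1 is excluded.
    Set<String> nodesWithAttempts = Collections.singleton("node1");

    List<String> candidates = Arrays.asList("node1", "node2", "node3");
    System.out.println(pickNode(candidates, nodesWithAttempts).orElse("none")); // prints "node2"

    // If only the original node is on offer, no speculative placement is made.
    System.out.println(pickNode(Arrays.asList("node1"), nodesWithAttempts).orElse("none")); // prints "none"
  }
}
```

Returning an empty result when only the original node is available reflects the argument in the report: a speculative attempt on the same slow node is unlikely to finish sooner than the original.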
[jira] [Commented] (MAPREDUCE-7169) Speculative attempts should not run on the same node
[ https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798074#comment-16798074 ] Bilwa S T commented on MAPREDUCE-7169: -- cc [~bibinchundatt] [~jlowe]
[jira] [Updated] (MAPREDUCE-7169) Speculative attempts should not run on the same node
[ https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T updated MAPREDUCE-7169: - Status: Patch Available (was: Open)
[jira] [Resolved] (MAPREDUCE-7090) BigMapOutput example doesn't work with paths off cluster fs
[ https://issues.apache.org/jira/browse/MAPREDUCE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7090. --- Resolution: Duplicate Fix Version/s: 3.3.0 > BigMapOutput example doesn't work with paths off cluster fs > --- > > Key: MAPREDUCE-7090 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7090 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: examples >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 3.3.0 > > > You can't pass an object store path to bigmapoutput, because it uses the > default fs, not the path FS, to work with the directories.
[jira] [Resolved] (MAPREDUCE-7099) Daily test result fails in MapReduce JobClient though there isn't any error
[ https://issues.apache.org/jira/browse/MAPREDUCE-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7099. --- Resolution: Cannot Reproduce doesn't seem to be happening; closing > Daily test result fails in MapReduce JobClient though there isn't any error > --- > > Key: MAPREDUCE-7099 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7099 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: build, test >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Critical > > Looks like the test result in MapReduce JobClient always fails lately. Please > see the results of hadoop-qbt-trunk-java8-linux-x86: > > [https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/]/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient.txt > {noformat} > [INFO] Results: > [INFO] > [WARNING] Tests run: 565, Failures: 0, Errors: 0, Skipped: 10 > [INFO] > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 02:06 h > [INFO] Finished at: 2018-05-30T12:32:39+00:00 > [INFO] Final Memory: 25M/645M > [INFO] > > [WARNING] The requested profile "parallel-tests" could not be activated > because it does not exist. > [WARNING] The requested profile "shelltest" could not be activated because it > does not exist. > [WARNING] The requested profile "native" could not be activated because it > does not exist. > [WARNING] The requested profile "yarn-ui" could not be activated because it > does not exist. > [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on > project hadoop-mapreduce-client-jobclient: There was a timeout or other error > in the fork -> [Help 1] > {noformat}
[jira] [Resolved] (MAPREDUCE-7092) MR examples to work better against cloud stores
[ https://issues.apache.org/jira/browse/MAPREDUCE-7092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7092. --- Resolution: Duplicate Assignee: Steve Loughran Fix Version/s: 3.3.0 > MR examples to work better against cloud stores > --- > > Key: MAPREDUCE-7092 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7092 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: examples >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Fix For: 3.3.0 > > > Some of the MR examples either don't work or underperform on cloud > infrastructure, all straightforward to fix. Of course, that means the cloud > connectors all get an opportunity to add more integration tests...
[jira] [Resolved] (MAPREDUCE-7091) Terasort on S3A to switch to new committers
[ https://issues.apache.org/jira/browse/MAPREDUCE-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved MAPREDUCE-7091. --- Resolution: Duplicate Fix Version/s: 3.3.0 > Terasort on S3A to switch to new committers > --- > > Key: MAPREDUCE-7091 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7091 > Project: Hadoop Map/Reduce > Issue Type: Sub-task > Components: examples >Affects Versions: 3.1.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Minor > Fix For: 3.3.0 > > > Terasort is very slow on S3, because it still uses the classic > rename-to-commit algorithm on the sort, even while teragen and the reporting > can use the new committer. > Reason: {{org.apache.hadoop.examples.terasort.TeraOutputFormat}} has > overridden {{getOutputCommitter}} even though it doesn't need to.
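The reason given in MAPREDUCE-7091 is an ordinary method-override effect, which a toy model may make easier to see. The sketch below does not use Hadoop classes: BaseFormat, OverridingFormat, committerFor, and the committer labels are hypothetical stand-ins, and the rule "pick the committer from the URI scheme" is an assumption made only for illustration.

```java
import java.net.URI;

/** Toy model of the override problem; class and committer names are hypothetical. */
public class CommitterOverrideSketch {

  /** Stands in for a base output format that picks a committer from the destination. */
  static class BaseFormat {
    String committerFor(URI output) {
      // Assumed rule for this sketch: object-store schemes get their own committer.
      return "s3a".equals(output.getScheme())
          ? "store-specific committer"
          : "rename-based committer";
    }
  }

  /** Stands in for a subclass that overrides the lookup without needing to. */
  static class OverridingFormat extends BaseFormat {
    @Override
    String committerFor(URI output) {
      // The override pins the classic committer and bypasses the base-class choice.
      return "rename-based committer";
    }
  }

  public static void main(String[] args) {
    URI out = URI.create("s3a://bucket/terasort-out");
    System.out.println(new BaseFormat().committerFor(out));       // prints "store-specific committer"
    System.out.println(new OverridingFormat().committerFor(out)); // prints "rename-based committer"
  }
}
```

In this model, deleting the override is enough for the subclass to inherit the destination-aware choice, which matches the report's remark that the override is not needed.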
[jira] [Assigned] (MAPREDUCE-7195) Mapreduce task timeout to zero could cause too many status update
[ https://issues.apache.org/jira/browse/MAPREDUCE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bilwa S T reassigned MAPREDUCE-7195: Assignee: Bilwa S T > Mapreduce task timeout to zero could cause too many status update > - > > Key: MAPREDUCE-7195 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7195 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bilwa S T >Priority: Major > Attachments: screenshot-1.png > > > * mapreduce.task.timeout=0 > Could create too many status update
> {code}
> public static long getTaskProgressReportInterval(final Configuration conf) {
>   long taskHeartbeatTimeOut = conf.getLong(
>       MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
>   return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
>       (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut));
> }
> {code}
> mapreduce timeout=0 is used to disable timeout feature
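The arithmetic behind MAPREDUCE-7195 can be checked with a stand-alone model of the quoted method. This is a sketch under stated assumptions rather than Hadoop code: a Map stands in for Configuration, the ratio of 0.01, the default timeout of 300000 ms, and the report-interval key name are assumed values, and guardedReportInterval is a hypothetical guard shown for comparison, not necessarily the fix adopted for the issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone model of the quoted computation; it does not use Hadoop classes. */
public class ReportIntervalSketch {
  // Assumed values for illustration; check MRJobConfig and MRJobConfUtil in your Hadoop version.
  static final long DEFAULT_TASK_TIMEOUT_MILLIS = 5 * 60 * 1000L;
  static final double TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO = 0.01;

  static final String TASK_TIMEOUT = "mapreduce.task.timeout";
  // Assumed key name for the explicit report-interval setting.
  static final String TASK_PROGRESS_REPORT_INTERVAL = "mapreduce.task.progress-report.interval";

  /** Mirrors the quoted method, with a Map standing in for Configuration. */
  static long reportInterval(Map<String, Long> conf) {
    long timeout = conf.getOrDefault(TASK_TIMEOUT, DEFAULT_TASK_TIMEOUT_MILLIS);
    return conf.getOrDefault(TASK_PROGRESS_REPORT_INTERVAL,
        (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * timeout));
  }

  /** Hypothetical guard: a disabled timeout (0) falls back to the default as the basis. */
  static long guardedReportInterval(Map<String, Long> conf) {
    long timeout = conf.getOrDefault(TASK_TIMEOUT, DEFAULT_TASK_TIMEOUT_MILLIS);
    long basis = timeout > 0 ? timeout : DEFAULT_TASK_TIMEOUT_MILLIS;
    return conf.getOrDefault(TASK_PROGRESS_REPORT_INTERVAL,
        (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * basis));
  }

  public static void main(String[] args) {
    Map<String, Long> conf = new HashMap<>();
    System.out.println(reportInterval(conf));        // prints 3000 (default timeout)

    conf.put(TASK_TIMEOUT, 0L);                      // timeout disabled
    System.out.println(reportInterval(conf));        // prints 0: no delay between reports
    System.out.println(guardedReportInterval(conf)); // prints 3000
  }
}
```

Whatever the real ratio is, multiplying it by a timeout of zero gives a zero interval unless the report interval is configured explicitly, which is consistent with the "too many status update" symptom in the report.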
[jira] [Commented] (MAPREDUCE-6190) If a task stucks before its first heartbeat, it never timeouts and the MR job becomes stuck
[ https://issues.apache.org/jira/browse/MAPREDUCE-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16797892#comment-16797892 ] Bibin A Chundatt commented on MAPREDUCE-6190: - [~uranus] During one of our tests, we found that the mapreduce.task.timeout=0 configuration, which is used to disable the timeout, no longer works. If the task timeout is configured as zero and the TaskStatus is null, the task fails with the stuck timeout.
{code}
if (sendProgress) {
  // we need to send progress update
  updateCounters();
  checkTaskLimits();
  taskStatus.statusUpdate(taskProgress.get(), taskProgress.toString(), counters);
  amFeedback = umbilical.statusUpdate(taskId, taskStatus);
  taskFound = amFeedback.getTaskFound();
  taskStatus.clearStatus();
} else {
  // send ping
  amFeedback = umbilical.statusUpdate(taskId, null);
  taskFound = amFeedback.getTaskFound();
}
{code}
> If a task stucks before its first heartbeat, it never timeouts and the MR job > becomes stuck > --- > > Key: MAPREDUCE-6190 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6190 > Project: Hadoop Map/Reduce > Issue Type: Bug >Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.1 >Reporter: Ankit Malhotra >Assignee: Zhaohui Xin >Priority: Major > Fix For: 3.3.0 > > Attachments: MAPREDUCE-6190.001.patch, MAPREDUCE-6190.002.patch, > MAPREDUCE-6190.003.patch, MAPREDUCE-6190.004.patch, MAPREDUCE-6190.005.patch > > > Trying to figure out a weird issue we started seeing on our CDH5.1.0 cluster > with map reduce jobs on YARN. > We had a job stuck for hours because one of the mappers never started up > fully. Basically, the map task had 2 attempts, the first one failed and the > AM tried to schedule a second one and the second attempt was stuck on STATE: > STARTING, STATUS: NEW. A node never got assigned and the task along with the > job was stuck indefinitely. 
> The AM logs had this being logged again and again: > {code} > 2014-12-09 19:25:12,347 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down 0 > 2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Received > completed container container_1408745633994_450952_02_003807 > 2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce preemption > successful attempt_1408745633994_450952_r_48_1000 > 2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down all > scheduled reduces:0 > 2014-12-09 19:25:13,352 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Going to preempt 1 > 2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Preempting > attempt_1408745633994_450952_r_50_1000 > 2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating > schedule, headroom=0 > 2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: > completedMapPercent 0.99968 totalMemLimit:1722880 finalMapMemLimit:2560 > finalReduceMemLimit:1720320 netScheduledMapMem:2560 > netScheduledReduceMem:1722880 > 2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Ramping down 0 > 2014-12-09 19:25:13,353 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: > PendingReds:77 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 > AssignedReds:673 CompletedMaps:3124 CompletedReds:0 ContAlloc:4789 > ContRel:798 HostLocal:2944 RackLocal:155 > 2014-12-09 19:25:14,353 INFO [RMCommunicator Allocator] > 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before > Scheduling: PendingReds:78 ScheduledMaps:1 ScheduledReds:0 AssignedMaps:0 > AssignedReds:673 CompletedMaps:3124 CompletedReds:0 ContAlloc:4789 > ContRel:798 HostLocal:2944 RackLocal:155 > 2014-12-09 19:25:14,359 INFO [RMCommunicator Allocator] > org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating > schedule, headroom=0 > {code} > On killing the task manually, the AM started up the task again, scheduled and > ran it successfully completing the task and the job with it. > Some quick code grepping led us here: >
[jira] [Updated] (MAPREDUCE-7195) Mapreduce task timeout to zero could cause too many status update
[ https://issues.apache.org/jira/browse/MAPREDUCE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated MAPREDUCE-7195: Description: * mapreduce.task.timeout=0 Could create too many status update {code} public static long getTaskProgressReportInterval(final Configuration conf) { long taskHeartbeatTimeOut = conf.getLong( MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS); return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL, (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut)); } {code} mapreduce timeout=0 is used to disable timeout feature was: * mapreduce.task.timeout=0 Could create too many status update {code} public static long getTaskProgressReportInterval(final Configuration conf) { long taskHeartbeatTimeOut = conf.getLong( MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS); return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL, (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut)); } {code} !screenshot-1.png! 
> Mapreduce task timeout to zero could cause too many status update > - > > Key: MAPREDUCE-7195 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7195 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Major > Attachments: screenshot-1.png > > > * mapreduce.task.timeout=0 > Could create too many status update > {code} > public static long getTaskProgressReportInterval(final Configuration conf) { > long taskHeartbeatTimeOut = conf.getLong( > MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS); > return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL, > (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * > taskHeartbeatTimeOut)); > } > {code} > mapreduce timeout=0 is used to disable timeout feature -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7195) Mapreduce task timeout to zero could cause too many status update
[ https://issues.apache.org/jira/browse/MAPREDUCE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated MAPREDUCE-7195: Description: * mapreduce.task.timeout=0 Could create too many status update {code} public static long getTaskProgressReportInterval(final Configuration conf) { long taskHeartbeatTimeOut = conf.getLong( MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS); return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL, (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut)); } {code} !screenshot-1.png! was: * mapreduce.task.timeout=0 Could create too many status update {code} public static long getTaskProgressReportInterval(final Configuration conf) { long taskHeartbeatTimeOut = conf.getLong( MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS); return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL, (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut)); } {code} > Mapreduce task timeout to zero could cause too many status update > - > > Key: MAPREDUCE-7195 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7195 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Major > Attachments: screenshot-1.png > > > * mapreduce.task.timeout=0 > Could create too many status update > {code} > public static long getTaskProgressReportInterval(final Configuration conf) { > long taskHeartbeatTimeOut = conf.getLong( > MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS); > return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL, > (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * > taskHeartbeatTimeOut)); > } > {code} > !screenshot-1.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-7195) Mapreduce task timeout to zero could cause too many status update
[ https://issues.apache.org/jira/browse/MAPREDUCE-7195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated MAPREDUCE-7195: Attachment: screenshot-1.png > Mapreduce task timeout to zero could cause too many status update > - > > Key: MAPREDUCE-7195 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7195 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Bibin A Chundatt >Priority: Major > Attachments: screenshot-1.png > > > * mapreduce.task.timeout=0 > Could create too many status update
> {code}
> public static long getTaskProgressReportInterval(final Configuration conf) {
>   long taskHeartbeatTimeOut = conf.getLong(
>       MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
>   return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
>       (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut));
> }
> {code}
[jira] [Created] (MAPREDUCE-7195) Mapreduce task timeout to zero could cause too many status update
Bibin A Chundatt created MAPREDUCE-7195: --- Summary: Mapreduce task timeout to zero could cause too many status update Key: MAPREDUCE-7195 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7195 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Bibin A Chundatt * mapreduce.task.timeout=0 Could create too many status update
{code}
public static long getTaskProgressReportInterval(final Configuration conf) {
  long taskHeartbeatTimeOut = conf.getLong(
      MRJobConfig.TASK_TIMEOUT, MRJobConfig.DEFAULT_TASK_TIMEOUT_MILLIS);
  return conf.getLong(MRJobConfig.TASK_PROGRESS_REPORT_INTERVAL,
      (long) (TASK_REPORT_INTERVAL_TO_TIMEOUT_RATIO * taskHeartbeatTimeOut));
}
{code}