[jira] [Commented] (MAPREDUCE-6546) reconcile the two versions of the timeline service performance tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184381#comment-15184381 ]

Naganarasimha G R commented on MAPREDUCE-6546:
----------------------------------------------

Thanks [~sjlee0], I will wait a while longer and then commit.

> reconcile the two versions of the timeline service performance tests
> --------------------------------------------------------------------
>
> Key: MAPREDUCE-6546
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6546
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Affects Versions: YARN-2928
> Reporter: Sangjin Lee
> Assignee: Sangjin Lee
> Priority: Minor
> Labels: yarn-2928-1st-milestone
> Attachments: MAPREDUCE-6546-YARN-2928.01.patch, MAPREDUCE-6546-YARN-2928.02.patch, MAPREDUCE-6546-YARN-2928.03.patch
>
>
> The trunk now has a version of the timeline service performance test (YARN-2556). The timeline service v.2 (YARN-2928) also has a performance test, and these two versions are quite similar (by design).
> We need to reconcile the two.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6546) reconcile the two versions of the timeline service performance tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184358#comment-15184358 ]

Sangjin Lee commented on MAPREDUCE-6546:
----------------------------------------

Oops sorry about that. I should have printed out the message before cutting the patch. Yes, please feel free to add the closing parenthesis. Thanks!
[jira] [Commented] (MAPREDUCE-6546) reconcile the two versions of the timeline service performance tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184351#comment-15184351 ]

Naganarasimha G R commented on MAPREDUCE-6546:
----------------------------------------------

Thanks [~sjlee0] for the patch. Just a small nit (I can correct it while applying the patch):
{code}
[-v] timeline service version (default: 1
1. version 1.x
2. version 2.x
[-mtype ] (default: 1
{code}
The bracket is not closed. Other than that, the patch is fine. If there are no more comments, I will go ahead and commit.
[jira] [Commented] (MAPREDUCE-6546) reconcile the two versions of the timeline service performance tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184284#comment-15184284 ]

Hadoop QA commented on MAPREDUCE-6546:
--------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 15 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 38s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 30s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 101m 7s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 107m 1s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed with JDK v1.7.0_95. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 26s {color} | {color:red} Patch generated 19 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 224m 9s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.mapreduce.v2.TestMRJobsWithProfiler |
| | hadoop.mapred.TestNetworkedJob |
| JDK v1.7.0_95 Failed junit tests | hadoop.mapred.TestNetworkedJob |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12791850/MAPREDUCE-6546-YARN-2928.03.patch |
| JIRA Issue | MAPREDUCE-6546 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux d50c329094a2 3.13.0-36-lowlatency #63-Ubuntu SMP
[jira] [Updated] (MAPREDUCE-6546) reconcile the two versions of the timeline service performance tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sangjin Lee updated MAPREDUCE-6546:
----------------------------------
Attachment: MAPREDUCE-6546-YARN-2928.03.patch

Posted patch v.3. Addressed Naga's feedback. Thanks for the review!
[jira] [Commented] (MAPREDUCE-6546) reconcile the two versions of the timeline service performance tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183480#comment-15183480 ]

Naganarasimha G R commented on MAPREDUCE-6546:
----------------------------------------------

Thanks [~sjlee0] for the latest patch. A few more nits:
# It's better to capture the defaults for {{timeline service version}} and {{-mtype}}.
# {{\[-d \] root path of job history files}} => {{\[-d \] hdfs root path of job history files}}
# I had configured the cluster for V2 (ATS), and when submitting a job I forgot to specify the {{-v}} option. The job succeeded with the following counter stats:
{code}
	org.apache.hadoop.mapreduce.TimelineServicePerformance$PerfCounters
		TIMELINE_SERVICE_WRITE_COUNTER=100
		TIMELINE_SERVICE_WRITE_FAILURES=100
		TIMELINE_SERVICE_WRITE_KBS=100
		TIMELINE_SERVICE_WRITE_TIME=13
	File Input Format Counters
		Bytes Read=0
	File Output Format Counters
		Bytes Written=0
TRANSACTION RATE (per mapper): 7692.307692307692 ops/s
IO RATE (per mapper): 7692.307692307692 KB/s
TRANSACTION RATE (total): 7692.307692307692 ops/s
IO RATE (total): 7692.307692307692 KB/s
{code}
So I am wondering whether we can add a check to confirm that \#failures != write counters before declaring success when the {{-mtype}} option is 1?
Apart from this, I ran it for all V2 options and it is running fine.
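The rates in the output above are consistent with dividing each counter by the write time in milliseconds and scaling to seconds: 100 writes in 13 ms gives 100 × 1000 / 13 ≈ 7692.3 ops/s, even though every one of those writes failed. A minimal sketch of the suggested check follows, assuming the counter values have already been read from the job. The class `PerfResultCheck`, its method names, and the rate formula are illustrative assumptions, not code from the patch.

```java
/**
 * Hypothetical helper (not from the patch): derives the reported rates from
 * the PerfCounters values and refuses to call a run successful when every
 * write failed.
 */
public class PerfResultCheck {

  /** Assumed formula: events per second, given a duration in milliseconds. */
  public static double ratePerSecond(long count, long timeMillis) {
    return count * 1000.0 / timeMillis;
  }

  /**
   * The check proposed in the comment: a run is only a success if something
   * was written and the failure count differs from the write count.
   */
  public static boolean isSuccessful(long writes, long failures) {
    return writes > 0 && failures != writes;
  }

  public static void main(String[] args) {
    // Counter values copied from the job output quoted above.
    long writes = 100;    // TIMELINE_SERVICE_WRITE_COUNTER
    long failures = 100;  // TIMELINE_SERVICE_WRITE_FAILURES
    long kbs = 100;       // TIMELINE_SERVICE_WRITE_KBS
    long timeMs = 13;     // TIMELINE_SERVICE_WRITE_TIME (assumed milliseconds)

    System.out.println("TRANSACTION RATE: " + ratePerSecond(writes, timeMs) + " ops/s");
    System.out.println("IO RATE: " + ratePerSecond(kbs, timeMs) + " KB/s");
    if (!isSuccessful(writes, failures)) {
      System.out.println("All " + failures + " writes failed; not a valid run.");
    }
  }
}
```

With the quoted counters, the check rejects the run, which is the behavior the comment asks for when the version option is missing.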
[jira] [Created] (MAPREDUCE-6649) getFailureInfo not returning any failure info
Eric Badger created MAPREDUCE-6649:
-----------------------------------

Summary: getFailureInfo not returning any failure info
Key: MAPREDUCE-6649
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Eric Badger
Assignee: Eric Badger

The following command does not produce any failure info as to why the job failed.

{noformat}
$HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1
{noformat}

{noformat}
2016-03-07 10:34:58,112 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with state FAILED due to:
{noformat}

In contrast, here is a command and its command-line output showing a failed job that gives the correct failure info.

{noformat}
$HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar sleep -Dyarn.app.mapreduce.am.command-opts=-goober -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3
{noformat}

{noformat}
2016-03-07 10:30:13,103 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with state FAILED due to: Application application_1457364518683_0003 failed 3 times due to AM Container for appattempt_1457364518683_0003_03 exited with exitCode: 1
Failing this attempt.Diagnostics: Exception from container-launch.
Container id: container_1457364518683_0003_03_01
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:927)
	at org.apache.hadoop.util.Shell.run(Shell.java:838)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
{noformat}
[jira] [Work started] (MAPREDUCE-6633) AM should retry map attempts if the reduce task encounters compression related errors.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on MAPREDUCE-6633 started by Rushabh S Shah.
-------------------------------------------------
> AM should retry map attempts if the reduce task encounters compression related errors.
> --------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-6633
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6633
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.7.2
> Reporter: Rushabh S Shah
> Assignee: Rushabh S Shah
>
> When a reduce task encounters compression-related errors, the AM doesn't retry the corresponding map task.
> In one of the cases we encountered, here is the stack trace.
> {noformat}
> 2016-01-27 13:44:28,915 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#29
>     at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1694)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>     at com.hadoop.compression.lzo.LzoDecompressor.setInput(LzoDecompressor.java:196)
>     at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:104)
>     at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>     at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
>     at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:537)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:336)
>     at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:193)
> {noformat}
> In this case, the node on which the map task ran had a bad drive.
> If the AM had retried running that map task somewhere else, the job definitely would have succeeded.
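The stack trace hints at why the failure is pinned on the reducer rather than the map: the decompressor throws `ArrayIndexOutOfBoundsException`, which is unchecked, so any handler that catches only `IOException` would let it through. One possible approach, sketched below, is to treat any exception raised while reading a single map output as a fetch failure for that map attempt, so the map can be reported for a rerun. `ShuffleRetrySketch`, `MapOutputReader`, and this `copyMapOutput` are hypothetical stand-ins for illustration; they are not the Hadoop `Fetcher` API and not necessarily the fix adopted for this issue.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ShuffleRetrySketch {

  /** Hypothetical stand-in for reading and decompressing one map output. */
  public interface MapOutputReader {
    void read(String mapAttemptId) throws IOException;
  }

  /** Map attempts whose output could not be read; candidates for a rerun. */
  private final List<String> failedMaps = new ArrayList<>();

  /**
   * Reads one map output. Returns true on success; on any exception, records
   * the map attempt as failed instead of letting the error kill the reducer.
   */
  public boolean copyMapOutput(String mapAttemptId, MapOutputReader reader) {
    try {
      reader.read(mapAttemptId);
      return true;
    } catch (IOException | RuntimeException e) {
      // Covers both I/O errors and unchecked errors from a decompressor,
      // such as ArrayIndexOutOfBoundsException on corrupt input.
      failedMaps.add(mapAttemptId);
      return false;
    }
  }

  public List<String> getFailedMaps() {
    return failedMaps;
  }

  public static void main(String[] args) {
    ShuffleRetrySketch fetcher = new ShuffleRetrySketch();
    // Simulate corrupt map output, e.g. from a node with a bad drive.
    boolean ok = fetcher.copyMapOutput("attempt_m_000001_0", id -> {
      throw new ArrayIndexOutOfBoundsException();
    });
    System.out.println("copied=" + ok + ", maps to retry=" + fetcher.getFailedMaps());
  }
}
```

In a real job the list of failed maps would be reported back to the AM, which decides whether to reschedule the map attempt on another node.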