[jira] [Commented] (MAPREDUCE-7324) ClientHSSecurityInfo class is in wrong META-INF file
[ https://issues.apache.org/jira/browse/MAPREDUCE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17346469#comment-17346469 ] Eric Badger commented on MAPREDUCE-7324: Ah crap. That's my fault. Sorry about messing up the number ordering :( > ClientHSSecurityInfo class is in wrong META-INF file > > > Key: MAPREDUCE-7324 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7324 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3 > > Attachments: MAPREDUCE-7324.001.patch > > > {{ClientHSSecurityInfo}} is located in > {noformat} > ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/resources/META-INF/services/org.apache.hadoop.security.SecurityInfo > {noformat} > But the actual class exists in > {noformat} > hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common > {noformat} > Because of this issue, there is an ordering dependency between the > client-jobclient and client-common that can cause failures if the ordering is > not correct. Namely, if client-common is in the classpath _after_ > client-jobclient, the JVM won't find {{ClientHSSecurityInfo}} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
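The mismatch described above, where a services file names a provider whose class lives in a different module, can be checked mechanically. The following is a minimal sketch only, assuming the standard provider-configuration file format (one class name per line, `#` starting a comment). `ServiceEntryCheck` and its methods are hypothetical helpers written for illustration and are not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Hadoop): finds providers a services file names but a loader cannot resolve. */
class ServiceEntryCheck {

  /** Parses provider-configuration text: '#' starts a comment, blank lines are ignored. */
  static List<String> providerNames(String servicesFileText) {
    List<String> names = new ArrayList<>();
    for (String line : servicesFileText.split("\n")) {
      int hash = line.indexOf('#');
      String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  /** Returns the provider names that the given class loader cannot load. */
  static List<String> unresolvable(String servicesFileText, ClassLoader loader) {
    List<String> missing = new ArrayList<>();
    for (String name : providerNames(servicesFileText)) {
      try {
        // initialize=false: only check that the class can be found, do not run static initializers
        Class.forName(name, false, loader);
      } catch (ClassNotFoundException | LinkageError e) {
        missing.add(name);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Sample entries for illustration; org.example.DoesNotExist is deliberately absent.
    String text = "# sample entries\njava.lang.String\norg.example.DoesNotExist\n";
    System.out.println(unresolvable(text, ServiceEntryCheck.class.getClassLoader()));
  }
}
```

Running such a check against each module's own classpath would flag a services entry whose class is only supplied by another module.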
[jira] [Assigned] (MAPREDUCE-6759) JobSubmitter/JobResourceUploader should parallelize upload of -libjars, -files, -archives
[ https://issues.apache.org/jira/browse/MAPREDUCE-6759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned MAPREDUCE-6759: -- Assignee: Christos Karampeazis-Papadakis > JobSubmitter/JobResourceUploader should parallelize upload of -libjars, > -files, -archives > - > > Key: MAPREDUCE-6759 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6759 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: job submission >Reporter: Dennis Huo >Assignee: Christos Karampeazis-Papadakis >Priority: Major > > During job submission, the {{JobResourceUploader}} currently iterates over > for-loops of {{-libjars}}, {{-files}}, and {{-archives}} sequentially, which > can significantly slow down job startup time when a large number of files > need to be uploaded, especially if staging the files to a cloud object-store > based FileSystem implementation like S3, GCS, WABS, etc., where round-trip > latencies may be higher than HDFS despite having good throughput when > parallelized: > {code:title=JobResourceUploader.java} > if (files != null) { > FileSystem.mkdirs(jtFs, filesDir, mapredSysPerms); > String[] fileArr = files.split(","); > for (String tmpFile : fileArr) { > URI tmpURI = null; > try { > tmpURI = new URI(tmpFile); > } catch (URISyntaxException e) { > throw new IllegalArgumentException(e); > } > Path tmp = new Path(tmpURI); > Path newPath = copyRemoteFiles(filesDir, tmp, conf, replication); > try { > URI pathURI = getPathURI(newPath, tmpURI.getFragment()); > DistributedCache.addCacheFile(pathURI, conf); > } catch (URISyntaxException ue) { > // should not throw a uri exception > throw new IOException("Failed to create uri for " + tmpFile, ue); > } > } > } > if (libjars != null) { > FileSystem.mkdirs(jtFs, libjarsDir, mapredSysPerms); > String[] libjarsArr = libjars.split(","); > for (String tmpjars : libjarsArr) { > Path tmp = new Path(tmpjars); > Path newPath = copyRemoteFiles(libjarsDir, tmp, conf, replication); > 
DistributedCache.addFileToClassPath( > new Path(newPath.toUri().getPath()), conf, jtFs); > } > } > if (archives != null) { > FileSystem.mkdirs(jtFs, archivesDir, mapredSysPerms); > String[] archivesArr = archives.split(","); > for (String tmpArchives : archivesArr) { > URI tmpURI; > try { > tmpURI = new URI(tmpArchives); > } catch (URISyntaxException e) { > throw new IllegalArgumentException(e); > } > Path tmp = new Path(tmpURI); > Path newPath = copyRemoteFiles(archivesDir, tmp, conf, replication); > try { > URI pathURI = getPathURI(newPath, tmpURI.getFragment()); > DistributedCache.addCacheArchive(pathURI, conf); > } catch (URISyntaxException ue) { > // should not throw an uri excpetion > throw new IOException("Failed to create uri for " + tmpArchives, > ue); > } > } > } > {code} > Parallelizing the upload of these files would improve job submission time. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
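One possible shape for the parallelization requested above is sketched here. It is not the patch for this issue: `ParallelUploadSketch`, `uploadAll`, and the `copy` function are hypothetical stand-ins (the real code would call {{copyRemoteFiles}}), and the sketch assumes each copy is independent of the others.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch: run independent copies on a thread pool, keep results in input order. */
class ParallelUploadSketch {

  static List<String> uploadAll(String commaSeparated, Function<String, String> copy, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit every copy first so they can overlap.
      List<Future<String>> pending = new ArrayList<>();
      for (String entry : commaSeparated.split(",")) {
        Callable<String> task = () -> copy.apply(entry);
        pending.add(pool.submit(task));
      }
      // Collect in submission order, so later per-file bookkeeping stays deterministic.
      List<String> uploaded = new ArrayList<>();
      for (Future<String> f : pending) {
        uploaded.add(f.get());
      }
      return uploaded;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while uploading", e);
    } catch (ExecutionException e) {
      // Surface the first failed copy to the caller.
      throw new IllegalStateException("upload failed", e.getCause());
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    // The lambda stands in for a real remote copy; "/staging/" is an illustrative path.
    System.out.println(uploadAll("a.jar,b.jar,c.jar", name -> "/staging/" + name, 3));
  }
}
```

In this sketch only the copies run concurrently. The per-file cache registration (the {{DistributedCache}} calls in the quoted code) would still happen sequentially on the returned list, which avoids concurrent writes to the job configuration.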
[jira] [Updated] (MAPREDUCE-7302) Upgrading to JUnit 4.13 causes testcase TestFetcher.testCorruptedIFile() to fail
[ https://issues.apache.org/jira/browse/MAPREDUCE-7302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7302: --- Fix Version/s: 3.2.3 2.10.2 3.3.1 3.1.5 I've cherry-picked this back all the way to branch-2.10. There were small import conflicts that I fixed because of us shading some jars in the newer branches, but not the older ones. This has now been committed to trunk (3.4), branch-3.3, branch-3.2, branch-3.1, and branch-2.10 > Upgrading to JUnit 4.13 causes testcase TestFetcher.testCorruptedIFile() to > fail > > > Key: MAPREDUCE-7302 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7302 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3 > > Attachments: MAPREDUCE-7302-001.patch, MAPREDUCE-7302-002.patch, > MAPREDUCE-7302-003.patch > > > See related ticket YARN-10460. JUnit 4.13 causes the same failure: > {noformat} > [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 1.851 s <<< FAILURE! - in org.apache.hadoop.mapreduce.task.reduce.TestFetcher > [ERROR] > testCorruptedIFile(org.apache.hadoop.mapreduce.task.reduce.TestFetcher) Time > elapsed: 0.15 s <<< ERROR! 
> java.lang.IllegalThreadStateException > at java.lang.ThreadGroup.addUnstarted(ThreadGroup.java:867) > at java.lang.Thread.init(Thread.java:405) > at java.lang.Thread.init(Thread.java:349) > at java.lang.Thread.<init>(Thread.java:678) > at > java.util.concurrent.Executors$DefaultThreadFactory.newThread(Executors.java:613) > at > org.apache.hadoop.thirdparty.com.google.common.util.concurrent.ThreadFactoryBuilder$1.newThread(ThreadFactoryBuilder.java:163) > at > java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:619) > at > java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:932) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1367) > at > org.apache.hadoop.io.ReadaheadPool.submitReadahead(ReadaheadPool.java:159) > at > org.apache.hadoop.io.ReadaheadPool.readaheadStream(ReadaheadPool.java:141) > at > org.apache.hadoop.mapred.IFileInputStream.doReadahead(IFileInputStream.java:159) > at > org.apache.hadoop.mapred.IFileInputStream.<init>(IFileInputStream.java:88) > at > org.apache.hadoop.mapreduce.task.reduce.TestFetcher.testCorruptedIFile(TestFetcher.java:587) > {noformat}
[jira] [Updated] (MAPREDUCE-7324) ClientHSSecurityInfo class is in wrong META-INF file
[ https://issues.apache.org/jira/browse/MAPREDUCE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7324: --- Resolution: Fixed Status: Resolved (was: Patch Available) > ClientHSSecurityInfo class is in wrong META-INF file > > > Key: MAPREDUCE-7324 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7324 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3 > > Attachments: MAPREDUCE-7324.001.patch > > > {{ClientHSSecurityInfo}} is located in > {noformat} > ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/resources/META-INF/services/org.apache.hadoop.security.SecurityInfo > {noformat} > But the actual class exists in > {noformat} > hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common > {noformat} > Because of this issue, there is an ordering dependency between the > client-jobclient and client-common that can cause failures if the ordering is > not correct. Namely, if client-common is in the classpath _after_ > client-jobclient, the JVM won't find {{ClientHSSecurityInfo}}
[jira] [Updated] (MAPREDUCE-7324) ClientHSSecurityInfo class is in wrong META-INF file
[ https://issues.apache.org/jira/browse/MAPREDUCE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7324: --- Fix Version/s: 3.2.3 2.10.2 3.3.1 3.1.5 3.4.0 Thanks for the review, [~daryn]. I've committed this to trunk (3.4), branch-3.3, branch-3.2, branch-3.1, and branch-2.10. > ClientHSSecurityInfo class is in wrong META-INF file > > > Key: MAPREDUCE-7324 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7324 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3 > > Attachments: MAPREDUCE-7324.001.patch > > > {{ClientHSSecurityInfo}} is located in > {noformat} > ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/resources/META-INF/services/org.apache.hadoop.security.SecurityInfo > {noformat} > But the actual class exists in > {noformat} > hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common > {noformat} > Because of this issue, there is an ordering dependency between the > client-jobclient and client-common that can cause failures if the ordering is > not correct. Namely, if client-common is in the classpath _after_ > client-jobclient, the JVM won't find {{ClientHSSecurityInfo}}
[jira] [Updated] (MAPREDUCE-7324) ClientHSSecurityInfo class is in wrong META-INF file
[ https://issues.apache.org/jira/browse/MAPREDUCE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7324: --- Attachment: MAPREDUCE-7324.001.patch > ClientHSSecurityInfo class is in wrong META-INF file > > > Key: MAPREDUCE-7324 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7324 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: MAPREDUCE-7324.001.patch > > > {{ClientHSSecurityInfo}} is located in > {noformat} > ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/resources/META-INF/services/org.apache.hadoop.security.SecurityInfo > {noformat} > But the actual class exists in > {noformat} > hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common > {noformat} > Because of this issue, there is an ordering dependency between the > client-jobclient and client-common that can cause failures if the ordering is > not correct. Namely, if client-common is in the classpath _after_ > client-jobclient, the JVM won't find {{ClientHSSecurityInfo}}
[jira] [Updated] (MAPREDUCE-7324) ClientHSSecurityInfo class is in wrong META-INF file
[ https://issues.apache.org/jira/browse/MAPREDUCE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7324: --- Status: Patch Available (was: Open) > ClientHSSecurityInfo class is in wrong META-INF file > > > Key: MAPREDUCE-7324 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7324 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Major > Attachments: MAPREDUCE-7324.001.patch > > > {{ClientHSSecurityInfo}} is located in > {noformat} > ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/resources/META-INF/services/org.apache.hadoop.security.SecurityInfo > {noformat} > But the actual class exists in > {noformat} > hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common > {noformat} > Because of this issue, there is an ordering dependency between the > client-jobclient and client-common that can cause failures if the ordering is > not correct. Namely, if client-common is in the classpath _after_ > client-jobclient, the JVM won't find {{ClientHSSecurityInfo}}
[jira] [Created] (MAPREDUCE-7324) ClientHSSecurityInfo class is in wrong META-INF file
Eric Badger created MAPREDUCE-7324: -- Summary: ClientHSSecurityInfo class is in wrong META-INF file Key: MAPREDUCE-7324 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7324 Project: Hadoop Map/Reduce Issue Type: Improvement Reporter: Eric Badger Assignee: Eric Badger {{ClientHSSecurityInfo}} is located in {noformat} ./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/resources/META-INF/services/org.apache.hadoop.security.SecurityInfo {noformat} But the actual class exists in {noformat} hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common {noformat} Because of this issue, there is an ordering dependency between the client-jobclient and client-common that can cause failures if the ordering is not correct. Namely, if client-common is in the classpath _after_ client-jobclient, the JVM won't find {{ClientHSSecurityInfo}}
[jira] [Commented] (MAPREDUCE-7320) ClusterMapReduceTestCase does not clean directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17288590#comment-17288590 ] Eric Badger commented on MAPREDUCE-7320: I think leaving the test logs around is a feature, not a bug. I agree with [~Jim_Brennan] on keeping them around and deleting them at the start of the next run of unit tests. > ClusterMapReduceTestCase does not clean directories > --- > > Key: MAPREDUCE-7320 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7320 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Running JUnit tests that extend {{ClusterMapReduceTestCase}} generates lots of > directories and folders without cleaning them up. > For example: > {code:bash} > mvn test -Dtest=TestMRJobClient{code} > generates the following directories: > {code:bash} > - target >-+ ConfigurableMiniMRCluster_315090884 >-+ ConfigurableMiniMRCluster_1335188990 >-+ ConfigurableMiniMRCluster_1973037511 >-+ test-dir > -+ dfs > -+ hadoop-XYZ-01 > -+ hadoop-XYZ-02 > -+ hadoop-XYZ-03 > {code}
[jira] [Updated] (MAPREDUCE-7319) Log list of mappers at trace level in ShuffleHandler audit log
[ https://issues.apache.org/jira/browse/MAPREDUCE-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7319: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Log list of mappers at trace level in ShuffleHandler audit log > -- > > Key: MAPREDUCE-7319 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7319 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: yarn >Affects Versions: 3.4.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3 > > Attachments: MAPREDUCE-7319.001.patch > > > [MAPREDUCE-6958] added the content length to ShuffleHandler audit log, which > is logged at DEBUG level. After enabling it, we found that the list of > mappers for large jobs was filling up our audit logs. It would be good to > move the list of mappers to TRACE level to reduce the logging impact without > disabling the log message entirely. > For example a log message like this: > {noformat} > 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 reducer 241 length 482072 mappers: > [attempt_1512479762132_1318600_1_00_004852_0_10003, > attempt_1512479762132_1318600_1_00_004190_0_10003, > attempt_1512479762132_1318600_1_00_004393_0_10003, > attempt_1512479762132_1318600_1_00_005057_0_10003, > attempt_1512479762132_1318600_1_00_004855_0_10002, > attempt_1512479762132_1318600_1_00_003976_0_10003, > attempt_1512479762132_1318600_1_00_004058_0_10003, > attempt_1512479762132_1318600_1_00_004355_0_10003, > attempt_1512479762132_1318600_1_00_004436_0_10002, > attempt_1512479762132_1318600_1_00_004854_0_10003, > attempt_1512479762132_1318600_1_00_005174_0_10004, > attempt_1512479762132_1318600_1_00_003972_0_10002, > attempt_1512479762132_1318600_1_00_004853_0_10002, > attempt_1512479762132_1318600_1_00_004856_0_10002] > {noformat} > Would become this with > 
{{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=DEBUG}}: > {noformat} > 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 reducer 241 length 482072 > {noformat} > And this with > {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=TRACE}}: > {noformat} > 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 reducer 241 length 482072 > 2018-01-25 23:43:02,669 [New I/O worker #1] TRACE ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 mappers: > [attempt_1512479762132_1318600_1_00_004852_0_10003, > attempt_1512479762132_1318600_1_00_004190_0_10003, > attempt_1512479762132_1318600_1_00_004393_0_10003, > attempt_1512479762132_1318600_1_00_005057_0_10003, > attempt_1512479762132_1318600_1_00_004855_0_10002, > attempt_1512479762132_1318600_1_00_003976_0_10003, > attempt_1512479762132_1318600_1_00_004058_0_10003, > attempt_1512479762132_1318600_1_00_004355_0_10003, > attempt_1512479762132_1318600_1_00_004436_0_10002, > attempt_1512479762132_1318600_1_00_004854_0_10003, > attempt_1512479762132_1318600_1_00_005174_0_10004, > attempt_1512479762132_1318600_1_00_003972_0_10002, > attempt_1512479762132_1318600_1_00_004853_0_10002, > attempt_1512479762132_1318600_1_00_004856_0_10002] > {noformat} > One question is whether there are any downstream consumers of this audit log > that might have a problem with this change? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
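The proposed split between the two levels can be sketched as a small pure function. This is an illustration only, not the ShuffleHandler code: `AuditLogSplitSketch`, its `Level` enum, and `auditLines` are hypothetical names, and the level is passed in explicitly instead of being read from a logging framework.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the proposed audit-log split: summary at DEBUG, mapper list at TRACE. */
class AuditLogSplitSketch {

  /** Declared from least to most verbose, so compareTo reflects verbosity. */
  enum Level { INFO, DEBUG, TRACE }

  static List<String> auditLines(Level enabled, String jobId, int reducer, long length,
      List<String> mappers) {
    List<String> lines = new ArrayList<>();
    if (enabled.compareTo(Level.DEBUG) >= 0) {
      // Short summary line, emitted at DEBUG and above.
      lines.add("DEBUG shuffle for " + jobId + " reducer " + reducer + " length " + length);
    }
    if (enabled.compareTo(Level.TRACE) >= 0) {
      // The potentially very long mapper list is only emitted at TRACE.
      lines.add("TRACE shuffle for " + jobId + " mappers: " + mappers);
    }
    return lines;
  }

  public static void main(String[] args) {
    // Shortened, made-up identifiers for illustration.
    List<String> mappers = Arrays.asList("attempt_a", "attempt_b");
    System.out.println(auditLines(Level.DEBUG, "job_1", 241, 482072L, mappers));
    System.out.println(auditLines(Level.TRACE, "job_1", 241, 482072L, mappers));
  }
}
```

With DEBUG enabled the sketch yields one line per shuffle request. With TRACE enabled it yields the same summary line followed by the mapper list, matching the before and after output quoted above.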
[jira] [Updated] (MAPREDUCE-7319) Log list of mappers at trace level in ShuffleHandler audit log
[ https://issues.apache.org/jira/browse/MAPREDUCE-7319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7319: --- Fix Version/s: 3.2.3 2.10.2 3.3.1 3.1.5 3.4.0 +1 Thanks for the patch, [~Jim_Brennan]! I've committed this to trunk (3.4), branch-3.3, branch-3.2, branch-3.1, and branch-2.10 > Log list of mappers at trace level in ShuffleHandler audit log > -- > > Key: MAPREDUCE-7319 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7319 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: yarn >Affects Versions: 3.4.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 3.4.0, 3.1.5, 3.3.1, 2.10.2, 3.2.3 > > Attachments: MAPREDUCE-7319.001.patch > > > [MAPREDUCE-6958] added the content length to ShuffleHandler audit log, which > is logged at DEBUG level. After enabling it, we found that the list of > mappers for large jobs was filling up our audit logs. It would be good to > move the list of mappers to TRACE level to reduce the logging impact without > disabling the log message entirely. 
> For example a log message like this: > {noformat} > 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 reducer 241 length 482072 mappers: > [attempt_1512479762132_1318600_1_00_004852_0_10003, > attempt_1512479762132_1318600_1_00_004190_0_10003, > attempt_1512479762132_1318600_1_00_004393_0_10003, > attempt_1512479762132_1318600_1_00_005057_0_10003, > attempt_1512479762132_1318600_1_00_004855_0_10002, > attempt_1512479762132_1318600_1_00_003976_0_10003, > attempt_1512479762132_1318600_1_00_004058_0_10003, > attempt_1512479762132_1318600_1_00_004355_0_10003, > attempt_1512479762132_1318600_1_00_004436_0_10002, > attempt_1512479762132_1318600_1_00_004854_0_10003, > attempt_1512479762132_1318600_1_00_005174_0_10004, > attempt_1512479762132_1318600_1_00_003972_0_10002, > attempt_1512479762132_1318600_1_00_004853_0_10002, > attempt_1512479762132_1318600_1_00_004856_0_10002] > {noformat} > Would become this with > {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=DEBUG}}: > {noformat} > 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 reducer 241 length 482072 > {noformat} > And this with > {{log4j.logger.org.apache.hadoop.mapred.ShuffleHandler.audit=TRACE}}: > {noformat} > 2018-01-25 23:43:02,669 [New I/O worker #1] DEBUG ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 reducer 241 length 482072 > 2018-01-25 23:43:02,669 [New I/O worker #1] TRACE ShuffleHandler.audit: > shuffle for job_1512479762132_1318600 mappers: > [attempt_1512479762132_1318600_1_00_004852_0_10003, > attempt_1512479762132_1318600_1_00_004190_0_10003, > attempt_1512479762132_1318600_1_00_004393_0_10003, > attempt_1512479762132_1318600_1_00_005057_0_10003, > attempt_1512479762132_1318600_1_00_004855_0_10002, > attempt_1512479762132_1318600_1_00_003976_0_10003, > attempt_1512479762132_1318600_1_00_004058_0_10003, > 
attempt_1512479762132_1318600_1_00_004355_0_10003, > attempt_1512479762132_1318600_1_00_004436_0_10002, > attempt_1512479762132_1318600_1_00_004854_0_10003, > attempt_1512479762132_1318600_1_00_005174_0_10004, > attempt_1512479762132_1318600_1_00_003972_0_10002, > attempt_1512479762132_1318600_1_00_004853_0_10002, > attempt_1512479762132_1318600_1_00_004856_0_10002] > {noformat} > One question is whether there are any downstream consumers of this audit log > that might have a problem with this change? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-7285) Junit class missing from hadoop-mapreduce-client-jobclient-*-tests jar
Eric Badger created MAPREDUCE-7285: -- Summary: Junit class missing from hadoop-mapreduce-client-jobclient-*-tests jar Key: MAPREDUCE-7285 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7285 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Eric Badger {noformat} [ebadger@foo bin]$ $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar sleep -Dyarn.app.mapreduce.am.env="HADOOP_MAPRED_HOME=$HADOOP_HOME" -Dmapreduce.admin.user.env="HADOOP_MAPRED_HOME=$HADOOP_HOME" -mt 1 -rt 1 -m 1 -r 1 WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX. java.lang.NoClassDefFoundError: junit/framework/TestCase at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:763) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) at java.net.URLClassLoader.access$100(URLClassLoader.java:74) at java.net.URLClassLoader$1.run(URLClassLoader.java:369) at java.net.URLClassLoader$1.run(URLClassLoader.java:363) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:362) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at org.apache.hadoop.test.MapredTestDriver.<init>(MapredTestDriver.java:109) at org.apache.hadoop.test.MapredTestDriver.<init>(MapredTestDriver.java:61) at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:147) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:323) at org.apache.hadoop.util.RunJar.main(RunJar.java:236) Caused by: java.lang.ClassNotFoundException: junit.framework.TestCase at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 21 more {noformat} The sleep job continues to run after the error and completes successfully, but the error shouldn't be there. Something must have removed a jar or added an unfulfilled dependency on JUnit.
[jira] [Updated] (MAPREDUCE-7277) IndexCache totalMemoryUsed differs from cache contents.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-7277: --- Fix Version/s: (was: 3.4.0) (was: 2.10.1) (was: 3.2.2) (was: 3.1.4) (was: 2.9.3) (was: 3.3.0) (was: 2.8.6) We found a deadlock bug in this code that caused shuffles to hang. I have reverted this from 2.8, 2.9, 2.10, 3.1, 3.2, 3.3, and trunk. > IndexCache totalMemoryUsed differs from cache contents. > --- > > Key: MAPREDUCE-7277 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7277 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Jonathan Turner Eagles >Assignee: Jonathan Turner Eagles >Priority: Major > Attachments: IndexCacheActualSize.png, MAPREDUCE-7277.001.patch, > MAPREDUCE-7277.002.patch, MAPREDUCE-7277.003.patch, > MAPREDUCE-7277.004-branch-2.10.patch, MAPREDUCE-7277.004.patch > > > It was observed recently in a nodemanager OOM that the memory was filled with > SpillRecords. However, the IndexCache was only 15% full (1.5MB used on a 10MB > configured cache size). In particular, it was noted that the bookkeeping variable > totalMemoryUsed was out of sync with the contents of the cache, showing 96% > full, thereby drastically reducing the effectiveness of the cache.
[jira] [Reopened] (MAPREDUCE-7277) IndexCache totalMemoryUsed differs from cache contents.
[ https://issues.apache.org/jira/browse/MAPREDUCE-7277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reopened MAPREDUCE-7277: > IndexCache totalMemoryUsed differs from cache contents. > --- > > Key: MAPREDUCE-7277 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7277 > Project: Hadoop Map/Reduce > Issue Type: Improvement >Reporter: Jonathan Turner Eagles >Assignee: Jonathan Turner Eagles >Priority: Major > Fix For: 2.8.6, 3.3.0, 2.9.3, 3.1.4, 3.2.2, 2.10.1, 3.4.0 > > Attachments: IndexCacheActualSize.png, MAPREDUCE-7277.001.patch, > MAPREDUCE-7277.002.patch, MAPREDUCE-7277.003.patch, > MAPREDUCE-7277.004-branch-2.10.patch, MAPREDUCE-7277.004.patch > > > It was observed recently in a nodemanager OOM that the memory was filled with > SpillRecords. However, the IndexCache was only 15% full (1.5MB used on a 10MB > configured cache size). In particular, it was noted that the bookkeeping variable > totalMemoryUsed was out of sync with the contents of the cache, showing 96% > full, thereby drastically reducing the effectiveness of the cache.
[jira] [Commented] (MAPREDUCE-7272) TaskAttemptListenerImpl excessive log messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17082534#comment-17082534 ] Eric Badger commented on MAPREDUCE-7272: [~epayne], yea that would be great. > TaskAttemptListenerImpl excessive log messages > -- > > Key: MAPREDUCE-7272 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-7272 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Ahmed Hussein >Assignee: Ahmed Hussein >Priority: Major > Attachments: MAPREDUCE-7272-branch-2.10.001.patch, > MAPREDUCE-7272-branch-2.10.002.patch, MAPREDUCE-7272-branch-2.10.003.patch, > MAPREDUCE-7272-branch-2.10.004.patch, MAPREDUCE-7272.001.patch, > MAPREDUCE-7272.002.patch, MAPREDUCE-7272.003.patch, MAPREDUCE-7272.004.patch > > > {{TaskAttemptListenerImpl.statusUpdate()}} bloats log files. > On every call, the listener uses {{LOG.info()}} to print out the progress of > the {{TaskAttempt}}. > {code:java} > taskAttemptStatus.progress = taskStatus.getProgress(); > LOG.info("Progress of TaskAttempt " + taskAttemptID + " is : " > + taskStatus.getProgress()); > {code} > > {code:bash} > 2020-04-07 10:20:50,708 INFO [IPC Server handler 17 on 43926] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1586003420099_716645_m_007783_0 is : 0.40713295 > 2020-04-07 10:20:50,717 INFO [IPC Server handler 7 on 43926] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1586003420099_716645_m_020681_0 is : 0.55573714 > 2020-04-07 10:20:50,717 INFO [IPC Server handler 26 on 43926] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1586003420099_716645_m_024371_0 is : 0.54190344 > 2020-04-07 10:20:50,738 INFO [IPC Server handler 15 on 43926] > org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1586003420099_716645_m_033182_0 is : 0.50264555 > 2020-04-07 10:20:50,748 INFO [IPC Server handler 3 on 43926] > 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt > attempt_1586003420099_716645_m_022375_0 is : 0.5495565 > {code} > After discussing this issue with [~nroberts], [~ebadger], and [~epayne], we > thought that while it is helpful to have a log print of task progress, it is > still excessive to log the progress on every update. > This Jira is to suppress the excessive logging from TaskAttemptListener > without affecting the frequency of progress updates. > There are two flags: > * {{-Dmapreduce.task.log.progress.delta.threshold=0.10}}: means that the > task progress will be logged every 10% of delta progress. Default is 5%. > * {{-Dmapreduce.task.log.progress.wait.interval-seconds=120}}: means that > the listener will log the progress every 2 minutes. This is helpful for > long-running tasks that take a long time to achieve the delta threshold. Default is > 1 minute. > The listener will log whichever of {{delta.threshold}} and > {{wait.interval-seconds}} is reached first. > Enabling {{LOG.DEBUG}} for {{TaskAttemptListenerImpl}} will override > those two flags and log the task progress on every update.
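The "whichever is reached first" rule described above can be sketched as follows. This is not the code from the attached patches: `ProgressLogThrottle` and `shouldLog` are hypothetical names, the sketch assumes the first update is always logged, and it does not model the DEBUG override.

```java
/** Hypothetical sketch: log when progress moved by a delta or a wait interval elapsed, whichever comes first. */
class ProgressLogThrottle {
  private final double deltaThreshold;   // e.g. 0.05 for 5% of progress
  private final long waitIntervalMs;     // e.g. 60_000 for one minute
  private double lastLoggedProgress;
  private long lastLoggedAtMs;
  private boolean loggedOnce;

  ProgressLogThrottle(double deltaThreshold, long waitIntervalSeconds) {
    this.deltaThreshold = deltaThreshold;
    this.waitIntervalMs = waitIntervalSeconds * 1000L;
  }

  /** Returns true if this update should be logged, and records it as the new baseline. */
  boolean shouldLog(double progress, long nowMs) {
    boolean due = !loggedOnce   // assumption: always log the first update
        || progress - lastLoggedProgress >= deltaThreshold
        || nowMs - lastLoggedAtMs >= waitIntervalMs;
    if (due) {
      loggedOnce = true;
      lastLoggedProgress = progress;
      lastLoggedAtMs = nowMs;
    }
    return due;
  }

  public static void main(String[] args) {
    ProgressLogThrottle throttle = new ProgressLogThrottle(0.05, 60);
    System.out.println(throttle.shouldLog(0.00, 0));       // first update
    System.out.println(throttle.shouldLog(0.02, 1_000));   // small delta, little time elapsed
    System.out.println(throttle.shouldLog(0.06, 2_000));   // delta threshold reached
    System.out.println(throttle.shouldLog(0.07, 62_000));  // wait interval reached
  }
}
```

Progress updates themselves are unaffected in this sketch. Only the decision to emit a log line is throttled, which is the separation the description asks for.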
[jira] [Commented] (MAPREDUCE-7272) TaskAttemptListenerImpl excessive log messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17081018#comment-17081018 ]

Eric Badger commented on MAPREDUCE-7272:

{noformat}
new Double(PROGRESS_MIN_DELTA_FACTOR
    * conf.getDouble(MRJobConfig.TASK_LOG_PROGRESS_DELTA_THRESHOLD,
        MRJobConfig.TASK_LOG_PROGRESS_DELTA_THRESHOLD_DEFAULT));
{noformat}
Minor nit, you don't need to create a new double. You're already multiplying 2 doubles.

Other than that I'm +1 on this patch. But I'll let [~epayne] give his seal of approval as well since he's been reviewing.
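The nit about {{new Double}} can be shown with a small sketch. It assumes the multiplication involves two primitive {{double}} values, as the review comment states; the class name {{DeltaThresholdSketch}}, the method name, and the factor value of 100.0 are hypothetical, since the thread does not show the real constant.

```java
/** Hypothetical sketch of the "no need for new Double" review nit. */
final class DeltaThresholdSketch {
  // Hypothetical value; the real constant's value is not shown in the thread.
  static final double PROGRESS_MIN_DELTA_FACTOR = 100.0;

  /**
   * Multiplying two primitive doubles already yields a double, so the result
   * can be returned or stored directly without wrapping it in a Double.
   */
  static double scaledThreshold(double configuredThreshold) {
    return PROGRESS_MIN_DELTA_FACTOR * configuredThreshold;
  }
}
```

If a boxed value is ever required, assigning the result to a {{Double}} variable autoboxes it; the {{Double(double)}} constructor has been deprecated since Java 9.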
[jira] [Commented] (MAPREDUCE-7272) TaskAttemptListenerImpl excessive log messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080035#comment-17080035 ]

Eric Badger commented on MAPREDUCE-7272:

I see. Thanks for the explanation, [~epayne].

> TaskAttemptListenerImpl excessive log messages
>
> Key: MAPREDUCE-7272
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7272
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Ahmed Hussein
> Assignee: Ahmed Hussein
> Priority: Major
> Attachments: MAPREDUCE-7272-branch-2.10.001.patch,
> MAPREDUCE-7272.001.patch
>
> {{TaskAttemptListenerImpl.statusUpdate()}} bloats the log files.
> On every call, the listener uses {{LOG.info()}} to print out the progress of
> the {{TaskAttempt}}.
> {code:java}
> taskAttemptStatus.progress = taskStatus.getProgress();
> LOG.info("Progress of TaskAttempt " + taskAttemptID + " is : "
>     + taskStatus.getProgress());
> {code}
>
> {code:bash}
> 2020-04-07 10:20:50,708 INFO [IPC Server handler 17 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1586003420099_716645_m_007783_0 is : 0.40713295
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 7 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1586003420099_716645_m_020681_0 is : 0.55573714
> 2020-04-07 10:20:50,717 INFO [IPC Server handler 26 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1586003420099_716645_m_024371_0 is : 0.54190344
> 2020-04-07 10:20:50,738 INFO [IPC Server handler 15 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1586003420099_716645_m_033182_0 is : 0.50264555
> 2020-04-07 10:20:50,748 INFO [IPC Server handler 3 on 43926] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1586003420099_716645_m_022375_0 is : 0.5495565
> {code}
> After discussing this issue with [~nroberts], [~ebadger], and [~epayne], we
> thought that while it is helpful to have a log print of task progress, it is
> still excessive to log the progress on every update.
> This Jira is to suppress the excessive logging from TaskAttemptListener
> without affecting the frequency of progress updates.
> There are two flags:
> * {{-Dmapreduce.task.progress.min.delta.threshold=0.10}}: means that the
> task progress will be logged every 10% of delta progress. Default is 5%.
> * {{-Dmapreduce.task.progress.wait.delta.time.threshold=3}}: means that
> the listener will log the progress every 3 minutes. This is helpful for
> long-running tasks that take a long time to achieve the delta threshold.
> Default is 1 minute.
> The listener will log whichever of {{delta.threshold}} and
> {{wait.delta.time}} is reached first.
> Enabling {{LOG.DEBUG}} for {{TaskAttemptListenerImpl}} will override
> those two flags and log the task progress on every update.
[jira] [Commented] (MAPREDUCE-7272) TaskAttemptListenerImpl excessive log messages
[ https://issues.apache.org/jira/browse/MAPREDUCE-7272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17080026#comment-17080026 ]

Eric Badger commented on MAPREDUCE-7272:

Thanks for the patch, [~ahussein]! I have a few comments.

bq. -Dmapreduce.task.progress.wait.delta.time.threshold=3: means that the listener will log the progress every 3 minutes.

To give more granularity, I think this config should be in seconds or milliseconds. Also it's helpful to add "secs" or "ms" or whatever it ends up being into the config name so it's immediately known what the unit is.

{noformat}
+ * The taskAttemptId of that hostory record.
{noformat}
Typo on history.

{noformat}
if (LOG.isDebugEnabled()) {
{noformat}
I believe this is unnecessary. Pretty sure the {{log.debug}} will only log if debug is enabled.

{noformat}
// It is helpful to log the progress when it reaches 1.0F.
resetLog(result || (Float.compare(progress, 1.0f) == 0),
    progress, processedProgress, currentTime);
{noformat}
When the task reaches 100%, can we also remove it from {{TASK_ATTEMPT_PROGRESS_LOG_STAMPS}} to limit heap usage?
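The point about the debug guard can be illustrated with a small, self-contained sketch. It uses {{java.util.logging}} as a stand-in because it ships with the JDK; the logger used by {{TaskAttemptListenerImpl}} is a different API, so the names below ({{DebugGuardSketch}}, {{logEagerly}}, {{logLazily}}) and the use of level FINE in place of debug are assumptions. The sketch shows that the logger itself drops a disabled debug-level message, so a guard is not needed for correctness; what a guard (or a lazy supplier) saves is the cost of building the message string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch using java.util.logging as a stand-in logger. */
final class DebugGuardSketch {
  /** Keeps published messages in memory so the sketch prints nothing. */
  static final class CollectingHandler extends Handler {
    final List<String> messages = new ArrayList<>();

    @Override public void publish(LogRecord record) {
      messages.add(record.getMessage());
    }

    @Override public void flush() { }

    @Override public void close() { }
  }

  /** Counts how often the log message string is actually built. */
  static final AtomicInteger MESSAGES_BUILT = new AtomicInteger();

  static String buildMessage(float progress) {
    MESSAGES_BUILT.incrementAndGet();
    return "Progress is : " + progress;
  }

  static Logger newLogger(String name, Level level, Handler handler) {
    Logger logger = Logger.getLogger(name);
    logger.setUseParentHandlers(false); // keep output away from the console
    logger.addHandler(handler);
    logger.setLevel(level);
    return logger;
  }

  /** No guard: the message is built first, then dropped if FINE is disabled. */
  static void logEagerly(Logger logger, float progress) {
    logger.fine(buildMessage(progress));
  }

  /** Lazy supplier: the message is built only if FINE is enabled. */
  static void logLazily(Logger logger, float progress) {
    logger.fine(() -> buildMessage(progress));
  }
}
```

Whether the guard in the patch is worth keeping therefore depends on how expensive the message arguments are, which this thread does not settle.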
[jira] [Commented] (MAPREDUCE-7079) JobHistory#ServiceStop implementation is incorrect
[ https://issues.apache.org/jira/browse/MAPREDUCE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17023236#comment-17023236 ]

Eric Badger commented on MAPREDUCE-7079:

Does this approach work even if the job is run inside of a docker container such as HadoopQA?

> JobHistory#ServiceStop implementation is incorrect
>
> Key: MAPREDUCE-7079
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7079
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Jason Darrell Lowe
> Assignee: Ahmed Hussein
> Priority: Major
> Attachments: 2020-01-10-MRApp-stack-dump.txt,
> 2020-01-10-org.apache.hadoop.mapred.TestMRIntermediateDataEncryption-version-14.txt,
> MAPREDUCE-7079.001.patch, MAPREDUCE-7079.002.patch,
> MAPREDUCE-7079.003.patch, MAPREDUCE-7079.004.patch, MAPREDUCE-7079.005.patch,
> MAPREDUCE-7079.006.patch, MAPREDUCE-7079.007.patch, MAPREDUCE-7079.008.patch,
> MAPREDUCE-7079.009.patch
>
> {{JobHistory.serviceStop}} skips waiting for the thread pool to terminate.
> The problem is due to an incorrect while condition that evaluates to false
> on the first iteration of the loop.
> {code:java}
> scheduledExecutor.shutdown();
> boolean interrupted = false;
> long currentTime = System.currentTimeMillis();
> while (!scheduledExecutor.isShutdown()
>     && System.currentTimeMillis() > currentTime + 1000l &&
>     !interrupted) {
>   try {
>     Thread.sleep(20);
>   } catch (InterruptedException e) {
>     interrupted = true;
>   }
> }
> {code}
> The expression "{{System.currentTimeMillis() > currentTime + 1000L}}" is
> false because currentTime was just initialized with
> {{System.currentTimeMillis()}}. As a result, the thread won't wait until
> the executor is terminated. Instead, it will force a shutdown immediately.
> *TestMRIntermediateDataEncryption is failing in precommit builds*
> TestMRIntermediateDataEncryption is either timing out or tearing down the JVM,
> which causes the unit tests in jobclient to not pass cleanly during precommit
> builds. From sample precommit console output, note the lack of a test results
> line when the test is run:
> {noformat}
> [INFO] Running org.apache.hadoop.mapred.TestSequenceFileInputFormat
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.976 s - in org.apache.hadoop.mapred.TestSequenceFileInputFormat
> [INFO] Running org.apache.hadoop.mapred.TestMRIntermediateDataEncryption
> [INFO] Running org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
> [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.659 s - in org.apache.hadoop.mapred.TestSpecialCharactersInOutputPath
> [...]
> [INFO]
> [INFO] BUILD FAILURE
> [INFO]
> [INFO] Total time: 02:14 h
> [INFO] Finished at: 2018-04-12T04:27:06+00:00
> [INFO] Final Memory: 24M/594M
> [INFO]
> [WARNING] The requested profile "parallel-tests" could not be activated because it does not exist.
> [WARNING] The requested profile "native" could not be activated because it does not exist.
> [WARNING] The requested profile "yarn-ui" could not be activated because it does not exist.
> [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on project hadoop-mapreduce-client-jobclient: There was a timeout or other error in the fork -> [Help 1]
> {noformat}
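As a rough illustration of the condition described in the issue, the sketch below contrasts the quoted time check with a bounded wait built on {{ExecutorService.awaitTermination}}. This is not the change made in the attached patches, whose contents are not shown in this thread. The class name {{ExecutorStopSketch}} and its methods are hypothetical, and the "elapsed time is still below the timeout" reading of the intended check is an assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; not the fix from the MAPREDUCE-7079 patches. */
final class ExecutorStopSketch {
  /** The time check as quoted in the issue: false right after startMs is taken. */
  static boolean originalTimeCheck(long nowMs, long startMs) {
    return nowMs > startMs + 1000L;
  }

  /** Assumed intent: keep waiting while the timeout has not yet elapsed. */
  static boolean withinTimeout(long nowMs, long startMs, long timeoutMs) {
    return nowMs < startMs + timeoutMs;
  }

  /**
   * Shuts the executor down and waits up to timeoutMs for running tasks to
   * finish, forcing a shutdown only if they have not finished by then.
   * Returns true if the executor terminated within the timeout.
   */
  static boolean stopAndWait(ExecutorService executor, long timeoutMs) {
    executor.shutdown();
    boolean terminated = false;
    try {
      terminated = executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
    }
    if (!terminated) {
      executor.shutdownNow();
    }
    return terminated;
  }
}
```

Using {{awaitTermination}} avoids a hand-written sleep loop and makes the timeout explicit, at the cost of blocking the calling thread for up to that timeout.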
[jira] [Commented] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
[ https://issues.apache.org/jira/browse/MAPREDUCE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119108#comment-16119108 ]

Eric Badger commented on MAPREDUCE-6927:

Thanks, [~jlowe] for the review and commit!

> MR job should only set tracking url if history was successfully written
>
> Key: MAPREDUCE-6927
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6927
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Eric Badger
> Assignee: Eric Badger
> Fix For: 2.9.0, 3.0.0-beta1, 2.8.2
> Attachments: MAPREDUCE-6927.001.patch, MAPREDUCE-6927.002.patch,
> MAPREDUCE-6927.003.patch, MAPREDUCE-6927.004.patch
>
> Currently the RMCommunicator will set the tracking url during unregistration
> once a job has finished, regardless of whether it actually wrote history or
> not. If the write to history failed for whatever reason, we should leave the
> tracking url as null so that we get redirected to the AHS instead of getting
> a job not found on the JHS.
[jira] [Updated] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
[ https://issues.apache.org/jira/browse/MAPREDUCE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6927:

Attachment: MAPREDUCE-6927.004.patch

Thanks for the additional review. Thought I got rid of all of the checkstyle issues, but I must've added in new ones as I updated the patch. I'm pretty sure this one got rid of them all. I had to move {{moveTmpToDone}} to protected so that I could override it in the test. Not sure if there's another way around that.
[jira] [Updated] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
[ https://issues.apache.org/jira/browse/MAPREDUCE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6927:

Attachment: MAPREDUCE-6927.003.patch

[~jlowe], that all makes sense. Definitely a better way to do the tests than I was before. I've fixed them up and separated them. I also fixed the checkstyle, so hopefully that comes back clean.
[jira] [Updated] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
[ https://issues.apache.org/jira/browse/MAPREDUCE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6927:

Description: Currently the RMCommunicator will set the tracking url during unregistration once a job has finished, regardless of whether it actually wrote history or not. If the write to history failed for whatever reason, we should leave the tracking url as null so that we get redirected to the AHS instead of getting a job not found on the JHS.
[jira] [Updated] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
[ https://issues.apache.org/jira/browse/MAPREDUCE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6927:

Attachment: MAPREDUCE-6927.002.patch

[~jlowe], thanks for the comments. I'm not quite sure why I didn't write a description. I'll fix that.

{quote}
I think we should set the tracking URL as soon as the .jhist file is successfully put in the done folder which is a bit earlier than where it is now in the patch.
{quote}
I moved the {{setHistoryFile()}} call up to just after the .jhist file is moved to done.

bq. It would be nice if the unit test verified the job history URL was correctly rather than just any string at all.

Fixed

bq. I think the test should also verify that even if the job history event handler receives a job finished event but is unable to complete moving the history to the done directory that it does not set the tracking URL.

I had to mock up more stuff than I wanted to to get this to work. Let me know if you have a better way in mind that can accomplish the same thing.
[jira] [Updated] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
[ https://issues.apache.org/jira/browse/MAPREDUCE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6927:

Attachment: MAPREDUCE-6927.001.patch
[jira] [Updated] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
[ https://issues.apache.org/jira/browse/MAPREDUCE-6927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6927:

Status: Patch Available (was: Open)
[jira] [Created] (MAPREDUCE-6927) MR job should only set tracking url if history was successfully written
Eric Badger created MAPREDUCE-6927:

Summary: MR job should only set tracking url if history was successfully written
Key: MAPREDUCE-6927
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6927
Project: Hadoop Map/Reduce
Issue Type: Bug
Reporter: Eric Badger
Assignee: Eric Badger
[jira] [Updated] (MAPREDUCE-6911) TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6911:

Attachment: MAPREDUCE-6911-branch-2.001.patch

Forgot to name the patch against branch-2, so here's the same patch with the correct name so that hadoopqa will run correctly (hopefully).

> TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails
> consistently
>
> Key: MAPREDUCE-6911
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6911
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.9.0, 2.8.2
> Reporter: Eric Badger
> Assignee: Eric Badger
> Attachments: MAPREDUCE-6911.001.patch,
> MAPREDUCE-6911-branch-2.001.patch
>
> {noformat}
> mapred-default.xml has 2 properties missing in interface
> org.apache.hadoop.mapreduce.MRJobConfig interface
> org.apache.hadoop.mapreduce.MRConfig class
> org.apache.hadoop.mapreduce.v2.jobhistory.JHAdminConfig class
> org.apache.hadoop.mapred.ShuffleHandler class
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat class
> org.apache.hadoop.mapreduce.lib.input.FileInputFormat class
> org.apache.hadoop.mapreduce.Job class
> org.apache.hadoop.mapreduce.lib.input.NLineInputFormat class
> org.apache.hadoop.mapred.JobConf class
> org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
> mapreduce.jobtracker.system.dir
> mapreduce.jobtracker.staging.root.dir
> {noformat}
[jira] [Updated] (MAPREDUCE-6911) TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6911:

Attachment: MAPREDUCE-6911.001.patch

Attaching a patch that adds the properties to the skip list
[jira] [Updated] (MAPREDUCE-6911) TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated MAPREDUCE-6911:

Status: Patch Available (was: Open)
[jira] [Assigned] (MAPREDUCE-6911) TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger reassigned MAPREDUCE-6911:

Assignee: Eric Badger
[jira] [Commented] (MAPREDUCE-6911) TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails consistently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075520#comment-16075520 ]

Eric Badger commented on MAPREDUCE-6911:

This was broken by MAPREDUCE-6909. Looks like the test can't find any usage of the configs
[jira] [Created] (MAPREDUCE-6911) TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails consistently
Eric Badger created MAPREDUCE-6911: -- Summary: TestMapreduceConfigFields.testCompareXmlAgainstConfigurationClass fails consistently Key: MAPREDUCE-6911 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6911 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.9.0, 2.8.2 Reporter: Eric Badger {noformat} mapred-default.xml has 2 properties missing in interface org.apache.hadoop.mapreduce.MRJobConfig interface org.apache.hadoop.mapreduce.MRConfig class org.apache.hadoop.mapreduce.v2.jobhistory.JHAdminConfig class org.apache.hadoop.mapred.ShuffleHandler class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat class org.apache.hadoop.mapreduce.lib.input.FileInputFormat class org.apache.hadoop.mapreduce.Job class org.apache.hadoop.mapreduce.lib.input.NLineInputFormat class org.apache.hadoop.mapred.JobConf class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter mapreduce.jobtracker.system.dir mapreduce.jobtracker.staging.root.dir {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6882) Increase MapReduce test timeouts from 1 second to 10 seconds
[ https://issues.apache.org/jira/browse/MAPREDUCE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6882: --- Attachment: MAPREDUCE-6882.001.patch Uploading patch > Increase MapReduce test timeouts from 1 second to 10 seconds > > > Key: MAPREDUCE-6882 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6882 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6882.001.patch > > > 1 second test timeouts are susceptible to failure on overloaded or otherwise > slow machines -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-6882) Increase MapReduce test timeouts from 1 second to 10 seconds
Eric Badger created MAPREDUCE-6882: -- Summary: Increase MapReduce test timeouts from 1 second to 10 seconds Key: MAPREDUCE-6882 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6882 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Eric Badger Assignee: Eric Badger 1 second test timeouts are susceptible to failure on overloaded or otherwise slow machines
[jira] [Updated] (MAPREDUCE-6882) Increase MapReduce test timeouts from 1 second to 10 seconds
[ https://issues.apache.org/jira/browse/MAPREDUCE-6882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6882: --- Status: Patch Available (was: Open) > Increase MapReduce test timeouts from 1 second to 10 seconds > > > Key: MAPREDUCE-6882 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6882 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6882.001.patch > > > 1 second test timeouts are susceptible to failure on overloaded or otherwise > slow machines -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
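The rationale in MAPREDUCE-6882 can be illustrated with a small, self-contained sketch. It does not reproduce the attached patch, whose contents are not shown in this thread; it only demonstrates, using plain `java.util.concurrent` and a hypothetical `finishesWithin` helper, why a tight time budget fails when the work is slow while a more generous budget passes. The durations are scaled down from seconds to milliseconds so the sketch runs quickly.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutSketch {

    // Hypothetical helper (not part of Hadoop or JUnit): returns true if a task
    // that takes workMillis finishes within budgetMillis, false otherwise.
    static boolean finishesWithin(long workMillis, long budgetMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> result = pool.submit(() -> {
                Thread.sleep(workMillis); // simulated work on a slow or overloaded machine
                return true;
            });
            try {
                return result.get(budgetMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                result.cancel(true); // budget exceeded: this is the "test timed out" case
                return false;
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Same amount of work, two different budgets.
        System.out.println("tight budget passes:    " + finishesWithin(300, 100));
        System.out.println("generous budget passes: " + finishesWithin(300, 3000));
    }
}
```

The work itself is identical in both calls; only the budget changes, which is the point of raising a timeout rather than changing the test logic.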
[jira] [Commented] (MAPREDUCE-6565) Configuration to use host name in delegation token service is not read from job.xml during MapReduce job execution.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709437#comment-15709437 ] Eric Badger commented on MAPREDUCE-6565: [~djp], this seems to have broken the 2.8 build. Can you revert it from 2.8 until we can fix the patch? > Configuration to use host name in delegation token service is not read from > job.xml during MapReduce job execution. > --- > > Key: MAPREDUCE-6565 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6565 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chris Nauroth >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: MAPREDUCE-6565-trunk.001.patch > > > By default, the service field of a delegation token is populated based on > server IP address. Setting {{hadoop.security.token.service.use_ip}} to > {{false}} changes this behavior to use host name instead of IP address. > However, this configuration property is not read from job.xml. Instead, it's > read from a separate {{Configuration}} instance created during static > initialization of {{SecurityUtil}}. This does not work correctly with > MapReduce jobs if the framework is distributed by setting > {{mapreduce.application.framework.path}} and the > {{mapreduce.application.classpath}} is isolated to avoid reading > core-site.xml from the cluster nodes. MapReduce tasks will fail to > authenticate to HDFS, because they'll try to find a delegation token based on > the NameNode IP address, even though at job submission time the tokens were > generated using the host name. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
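The mismatch described in MAPREDUCE-6565 can be sketched in plain Java, assuming a simplified model in which a token is stored and looked up under a service string of the form `address:port`. The class, the method names, the host name, the IP address, and the port below are all made up for illustration; this is not the `SecurityUtil` code, only a model of why a token created under the host name cannot be found by a task that looks it up by IP address.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenServiceSketch {

    // Hypothetical stand-in for building a token's service field: "ip:port"
    // when use_ip is true, "host:port" when it is false.
    static String buildService(boolean useIp, String host, String ip, int port) {
        return (useIp ? ip : host) + ":" + port;
    }

    // Models job submission storing a token under one setting and a task
    // looking it up under a possibly different setting.
    static boolean taskFindsToken(boolean useIpAtSubmit, boolean useIpInTask) {
        String host = "nn.example.com"; // made-up NameNode host name
        String ip = "10.0.0.1";         // made-up NameNode IP address
        int port = 8020;                // made-up port

        Map<String, String> credentials = new HashMap<>();
        credentials.put(buildService(useIpAtSubmit, host, ip, port), "delegation-token");

        return credentials.containsKey(buildService(useIpInTask, host, ip, port));
    }

    public static void main(String[] args) {
        // Submission used the host name, but the task fell back to the default (IP).
        System.out.println(taskFindsToken(false, true));  // prints "false"
        // Both sides agree on the host name.
        System.out.println(taskFindsToken(false, false)); // prints "true"
    }
}
```

In this model the lookup only succeeds when both sides derive the service string from the same setting, which is why reading the property from a different `Configuration` instance than job.xml matters.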
[jira] [Commented] (MAPREDUCE-6744) Increase timeout on TestDFSIO tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15392074#comment-15392074 ] Eric Badger commented on MAPREDUCE-6744: Both unit test failures are unrelated to this patch. For sanity purposes I ran them locally and they passed. > Increase timeout on TestDFSIO tests > --- > > Key: MAPREDUCE-6744 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6744 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6744.001.patch > > > The timeout on these tests is only 3 seconds and so one of the tests in the > suite will fail on a regular basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6744) Increase timeout on TestDFSIO tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389788#comment-15389788 ] Eric Badger commented on MAPREDUCE-6744: Linking JIRA that previously increased some (but not all) timeouts on DFSIO tests. > Increase timeout on TestDFSIO tests > --- > > Key: MAPREDUCE-6744 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6744 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6744.001.patch > > > The timeout on these tests is only 3 seconds and so one of the tests in the > suite will fail on a regular basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Created] (MAPREDUCE-6744) Increase timeout on TestDFSIO tests
Eric Badger created MAPREDUCE-6744: -- Summary: Increase timeout on TestDFSIO tests Key: MAPREDUCE-6744 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6744 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Eric Badger The timeout on these tests is only 3 seconds and so one of the tests in the suite will fail on a regular basis.
[jira] [Updated] (MAPREDUCE-6744) Increase timeout on TestDFSIO tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6744: --- Status: Patch Available (was: Open) > Increase timeout on TestDFSIO tests > --- > > Key: MAPREDUCE-6744 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6744 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6744.001.patch > > > The timeout on these tests is only 3 seconds and so one of the tests in the > suite will fail on a regular basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Updated] (MAPREDUCE-6744) Increase timeout on TestDFSIO tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6744: --- Attachment: MAPREDUCE-6744.001.patch Attaching patch to increase timeout from 3 to 10 seconds. > Increase timeout on TestDFSIO tests > --- > > Key: MAPREDUCE-6744 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6744 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6744.001.patch > > > The timeout on these tests is only 3 seconds and so one of the tests in the > suite will fail on a regular basis. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Assigned] (MAPREDUCE-6744) Increase timeout on TestDFSIO tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned MAPREDUCE-6744: -- Assignee: Eric Badger > Increase timeout on TestDFSIO tests > --- > > Key: MAPREDUCE-6744 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6744 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > > The timeout on these tests is only 3 seconds and so one of the tests in the > suite will fail on a regular basis.
[jira] [Updated] (MAPREDUCE-6649) getFailureInfo not returning any failure info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6649: --- Fix Version/s: 2.8.0 > getFailureInfo not returning any failure info > - > > Key: MAPREDUCE-6649 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Fix For: 2.8.0 > > Attachments: MAPREDUCE-6649.001.patch, MAPREDUCE-6649.002.patch > > > The following command does not produce any failure info as to why the job > failed. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1 > {noformat} > {noformat} > 2016-03-07 10:34:58,112 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with > state FAILED due to: > {noformat} > To contrast, here is a command and associated command line output to show a > failed job that gives the correct failure info. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dyarn.app.mapreduce.am.command-opts=-goober > -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3 > {noformat} > {noformat} > 2016-03-07 10:30:13,103 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with > state FAILED due to: Application application_1457364518683_0003 failed 3 > times due to AM Container for appattempt_1457364518683_0003_03 exited > with exitCode: 1 > Failing this attempt.Diagnostics: Exception from container-launch. 
> Container id: container_1457364518683_0003_03_01 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6649) getFailureInfo not returning any failure info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6649: --- Resolution: Fixed Status: Resolved (was: Patch Available) Thanks, [~eepayne], for reviewing and committing this! > getFailureInfo not returning any failure info > - > > Key: MAPREDUCE-6649 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6649.001.patch, MAPREDUCE-6649.002.patch > > > The following command does not produce any failure info as to why the job > failed. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1 > {noformat} > {noformat} > 2016-03-07 10:34:58,112 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with > state FAILED due to: > {noformat} > To contrast, here is a command and associated command line output to show a > failed job that gives the correct failure info. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dyarn.app.mapreduce.am.command-opts=-goober > -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3 > {noformat} > {noformat} > 2016-03-07 10:30:13,103 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with > state FAILED due to: Application application_1457364518683_0003 failed 3 > times due to AM Container for appattempt_1457364518683_0003_03 exited > with exitCode: 1 > Failing this attempt.Diagnostics: Exception from container-launch. 
> Container id: container_1457364518683_0003_03_01 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned MAPREDUCE-6512: -- Assignee: Chang Li (was: Eric Badger) > FileOutputCommitter tasks unconditionally create parent directories > --- > > Key: MAPREDUCE-6512 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Chang Li > Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.2.patch, > MAPREDUCE-6512.patch, MAPREDUCE-6512.patch > > > If the output directory is deleted then subsequent tasks should fail. Instead > they blindly create the missing parent directories, leading the job to be > "successful" despite potentially missing almost all of the output. Task > attempts should fail if the parent app attempt directory is missing when they > go to create their task attempt directory.
[jira] [Commented] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15235278#comment-15235278 ] Eric Badger commented on MAPREDUCE-6512: [~jlowe], should we close this as won't fix? > FileOutputCommitter tasks unconditionally create parent directories > --- > > Key: MAPREDUCE-6512 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Eric Badger > Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.2.patch, > MAPREDUCE-6512.patch, MAPREDUCE-6512.patch > > > If the output directory is deleted then subsequent tasks should fail. Instead > they blindly create the missing parent directories, leading the job to be > "successful" despite potentially missing almost all of the output. Task > attempts should fail if the parent app attempt directory is missing when they > go to create their task attempt directory.
[jira] [Updated] (MAPREDUCE-6512) FileOutputCommitter tasks unconditionally create parent directories
[ https://issues.apache.org/jira/browse/MAPREDUCE-6512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6512: --- Assignee: Eric Badger (was: Chang Li) > FileOutputCommitter tasks unconditionally create parent directories > --- > > Key: MAPREDUCE-6512 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6512 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Chang Li >Assignee: Eric Badger > Attachments: MAPREDUCE-6512.2.patch, MAPREDUCE-6512.2.patch, > MAPREDUCE-6512.patch, MAPREDUCE-6512.patch > > > If the output directory is deleted then subsequent tasks should fail. Instead > they blindly create the missing parent directories, leading the job to be > "successful" despite potentially missing almost all of the output. Task > attempts should fail if the parent app attempt directory is missing when they > go to create their task attempt directory.
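The behavior requested in MAPREDUCE-6512 has a close analog on a local filesystem, sketched below with `java.nio.file`. This is not the Hadoop `FileSystem` API and not the attached patch; the class name, method names, and directory names are hypothetical. `Files.createDirectories` creates any missing parents, which mirrors the "blindly create" behavior, while `Files.createDirectory` fails when the parent does not exist, which mirrors the requested behavior.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ParentDirSketch {

    // Returns a path for a task attempt directory whose parent (a stand-in for
    // the app attempt directory) does NOT exist, as if it had been deleted.
    static Path taskDirUnderMissingParent(String taskName) {
        try {
            Path root = Files.createTempDirectory("output");
            return root.resolve("app_attempt_1").resolve(taskName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Lenient creation: recreates the missing parent, so the problem is hidden.
    static boolean lenientCreateSucceeds() {
        try {
            Files.createDirectories(taskDirUnderMissingParent("task_attempt_0"));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    // Strict creation: fails because the parent directory is missing.
    static boolean strictCreateSucceeds() {
        try {
            Files.createDirectory(taskDirUnderMissingParent("task_attempt_0"));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("lenient create succeeds: " + lenientCreateSucceeds());
        System.out.println("strict create succeeds:  " + strictCreateSucceeds());
    }
}
```

With strict creation, the missing parent surfaces as an error at the point where the task attempt directory is created, instead of the job finishing with most of its output gone.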
[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232997#comment-15232997 ] Eric Badger commented on MAPREDUCE-6658: TestContainerManagerSecurity is tracked by [YARN-4342|https://issues.apache.org/jira/browse/YARN-4342]. Sorry for the typo. > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > Attachments: MAPREDUCE-6658.001.patch > > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232840#comment-15232840 ] Eric Badger commented on MAPREDUCE-6658: TestMiniYarnClusterNodeUtilization is tracked by [YARN-4453|https://issues.apache.org/jira/browse/YARN-4453] TestMiniYarnClusterNodeUtilization is tracked by [YARN-4342|https://issues.apache.org/jira/browse/YARN-4342] It concerns me a little bit that both of these tests failed twice in a row with the addition of my patch. However, I am unable to reproduce the failure on my local Mac or Linux boxes, even under high-cpu situations (similar to Jenkins). > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > Attachments: MAPREDUCE-6658.001.patch > > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232375#comment-15232375 ] Eric Badger commented on MAPREDUCE-6658: [~eepayne], [~jlowe], [~kasha], can one of you review this patch? It's a very small change that will fix some recurring test failures. > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > Attachments: MAPREDUCE-6658.001.patch > > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229136#comment-15229136 ] Eric Badger commented on MAPREDUCE-6658: Neither of these test failures is reproducible on my local machine with the patch applied. > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > Attachments: MAPREDUCE-6658.001.patch > > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat}
[jira] [Updated] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6658: --- Attachment: MAPREDUCE-6658.001.patch [~ajisakaa], I tracked down the issue. Indeed, it was my code that broke the tests. I'm not sure why I wasn't able to reproduce the failures consistently, but maybe that was another side effect of my earlier changes. The bug was that the MiniYARNCluster was explicitly calling transitionToActive on the resource manager with index 0. This overwrote some conf properties and reset the cluster max priority property back to DEFAULT. I'm attaching a patch that fixes this issue by transitioning the RM with index 0 to active only if we are in an HA cluster. > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > Attachments: MAPREDUCE-6658.001.patch > > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat}
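The fix described in the comment above can be modeled with a small, self-contained sketch. Everything here is hypothetical: `FakeResourceManager`, the `cluster.max-priority` key, and the `startCluster` method are stand-ins invented for illustration and are not the MiniYARNCluster code or the real YARN property name. The sketch only models the reported side effect (an explicit transition to active resets a configured value to its default) and the guard that skips the transition outside an HA cluster.

```java
import java.util.HashMap;
import java.util.Map;

public class HaGuardSketch {

    // Hypothetical configuration key and default, not the real YARN names.
    static final String MAX_PRIORITY_KEY = "cluster.max-priority";
    static final int DEFAULT_MAX_PRIORITY = 0;

    // Hypothetical stand-in for a resource manager holding its own conf.
    static class FakeResourceManager {
        final Map<String, Integer> conf = new HashMap<>();

        // Models the side effect from the comment: the transition overwrites
        // conf properties and resets max priority to its default.
        void transitionToActive() {
            conf.put(MAX_PRIORITY_KEY, DEFAULT_MAX_PRIORITY);
        }
    }

    // Starts a one-RM cluster and returns the max priority seen afterwards.
    // guardOnHa=false models the old behavior (always transition RM 0);
    // guardOnHa=true models the fix (transition RM 0 only in an HA cluster).
    static int startCluster(boolean haEnabled, boolean guardOnHa, int configuredMaxPriority) {
        FakeResourceManager rm0 = new FakeResourceManager();
        rm0.conf.put(MAX_PRIORITY_KEY, configuredMaxPriority);
        if (!guardOnHa || haEnabled) {
            rm0.transitionToActive();
        }
        return rm0.conf.get(MAX_PRIORITY_KEY);
    }

    public static void main(String[] args) {
        // Non-HA cluster, old behavior: the configured value is lost.
        System.out.println(startCluster(false, false, 10)); // prints "0"
        // Non-HA cluster, guarded: the configured value survives.
        System.out.println(startCluster(false, true, 10));  // prints "10"
    }
}
```

The sketch says nothing about what the real transition does in an HA cluster; it only shows why skipping an unnecessary transition preserves the test's configured priority.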
[jira] [Updated] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6658: --- Status: Patch Available (was: Open) > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > Attachments: MAPREDUCE-6658.001.patch > > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6649) getFailureInfo not returning any failure info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15228518#comment-15228518 ] Eric Badger commented on MAPREDUCE-6649: [~eepayne], can you take a look at this patch and commit if it looks good? > getFailureInfo not returning any failure info > - > > Key: MAPREDUCE-6649 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6649.001.patch, MAPREDUCE-6649.002.patch > > > The following command does not produce any failure info as to why the job > failed. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1 > {noformat} > {noformat} > 2016-03-07 10:34:58,112 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with > state FAILED due to: > {noformat} > To contrast, here is a command and associated command line output to show a > failed job that gives the correct failure info. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dyarn.app.mapreduce.am.command-opts=-goober > -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3 > {noformat} > {noformat} > 2016-03-07 10:30:13,103 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with > state FAILED due to: Application application_1457364518683_0003 failed 3 > times due to AM Container for appattempt_1457364518683_0003_03 exited > with exitCode: 1 > Failing this attempt.Diagnostics: Exception from container-launch. 
> Container id: container_1457364518683_0003_03_01 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6649) getFailureInfo not returning any failure info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6649: --- Attachment: MAPREDUCE-6649.002.patch Fixing checkstyle in patch. > getFailureInfo not returning any failure info > - > > Key: MAPREDUCE-6649 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6649.001.patch, MAPREDUCE-6649.002.patch > > > The following command does not produce any failure info as to why the job > failed. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1 > {noformat} > {noformat} > 2016-03-07 10:34:58,112 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with > state FAILED due to: > {noformat} > To contrast, here is a command and associated command line output to show a > failed job that gives the correct failure info. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dyarn.app.mapreduce.am.command-opts=-goober > -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3 > {noformat} > {noformat} > 2016-03-07 10:30:13,103 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with > state FAILED due to: Application application_1457364518683_0003 failed 3 > times due to AM Container for appattempt_1457364518683_0003_03 exited > with exitCode: 1 > Failing this attempt.Diagnostics: Exception from container-launch. 
> Container id: container_1457364518683_0003_03_01 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6649) getFailureInfo not returning any failure info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6649: --- Status: Patch Available (was: Open) > getFailureInfo not returning any failure info > - > > Key: MAPREDUCE-6649 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6649.001.patch > > > The following command does not produce any failure info as to why the job > failed. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1 > {noformat} > {noformat} > 2016-03-07 10:34:58,112 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with > state FAILED due to: > {noformat} > To contrast, here is a command and associated command line output to show a > failed job that gives the correct failure info. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dyarn.app.mapreduce.am.command-opts=-goober > -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3 > {noformat} > {noformat} > 2016-03-07 10:30:13,103 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with > state FAILED due to: Application application_1457364518683_0003 failed 3 > times due to AM Container for appattempt_1457364518683_0003_03 exited > with exitCode: 1 > Failing this attempt.Diagnostics: Exception from container-launch. 
> Container id: container_1457364518683_0003_03_01 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6649) getFailureInfo not returning any failure info
[ https://issues.apache.org/jira/browse/MAPREDUCE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6649: --- Attachment: MAPREDUCE-6649.001.patch Uploading patch that fixes the issue and also creates a unit test to maintain the fix. > getFailureInfo not returning any failure info > - > > Key: MAPREDUCE-6649 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Eric Badger >Assignee: Eric Badger > Attachments: MAPREDUCE-6649.001.patch > > > The following command does not produce any failure info as to why the job > failed. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1 > {noformat} > {noformat} > 2016-03-07 10:34:58,112 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with > state FAILED due to: > {noformat} > To contrast, here is a command and associated command line output to show a > failed job that gives the correct failure info. > {noformat} > $HADOOP_PREFIX/bin/hadoop jar > $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar > sleep -Dyarn.app.mapreduce.am.command-opts=-goober > -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3 > {noformat} > {noformat} > 2016-03-07 10:30:13,103 INFO [main] mapreduce.Job > (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with > state FAILED due to: Application application_1457364518683_0003 failed 3 > times due to AM Container for appattempt_1457364518683_0003_03 exited > with exitCode: 1 > Failing this attempt.Diagnostics: Exception from container-launch. 
> Container id: container_1457364518683_0003_03_01 > Exit code: 1 > Stack trace: ExitCodeException exitCode=1: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211066#comment-15211066 ] Eric Badger commented on MAPREDUCE-6658: I believe that there is a race condition that causes this bug to appear and disappear with the addition of [YARN-4686]. However, there is another race condition (possibly the same one) that forces the test to fail in the cases both with and without [YARN-4686]. Earlier today I was able to get the test to fail with the patch and not without it. But now it fails whether the patch is there or not and this happens to me on both of my machines (both unix and linux). I haven't figured out where the race(s) is/are occurring yet, but I believe that they will shed light onto why the tests are failing sporadically for me. > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15210303#comment-15210303 ] Eric Badger commented on MAPREDUCE-6658: Not quite sure what has changed since yesterday, but I am now able to consistently reproduce the failure. I will look into this issue. > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208709#comment-15208709 ] Eric Badger commented on MAPREDUCE-6658: [~ajisakaa] are you sure this was broken by [YARN-4686]? I checked out trunk all the way back to the end of February and this test failed every time in the same way as you have shown above. > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MAPREDUCE-6658) TestMRJobs fails
[ https://issues.apache.org/jira/browse/MAPREDUCE-6658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned MAPREDUCE-6658: -- Assignee: Eric Badger > TestMRJobs fails > > > Key: MAPREDUCE-6658 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6658 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Akira AJISAKA >Assignee: Eric Badger > > TestMRJobs#testJobWithChangePriority fails. > {noformat} > Running org.apache.hadoop.mapreduce.v2.TestMRJobs > Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 446.855 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobs > testJobWithChangePriority(org.apache.hadoop.mapreduce.v2.TestMRJobs) Time > elapsed: 21.477 sec <<< FAILURE! > java.lang.AssertionError: expected: but was: > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.mapreduce.v2.TestMRJobs.testJobWithChangePriority(TestMRJobs.java:276) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6580) Test failure : TestMRJobsWithProfiler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15206503#comment-15206503 ] Eric Badger commented on MAPREDUCE-6580: TestUberAM and TestMRJobs both have failed recent builds in [MAPREDUCE-6315] and are unrelated to this patch. TestMRIntermediateDataEncryption fails intermittently with a timeout in recent builds [MAPREDUCE-6579], [YARN-4340]. > Test failure : TestMRJobsWithProfiler > - > > Key: MAPREDUCE-6580 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6580 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6580.001.patch > > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestMRJobsWithProfiler fails intermittently > {code} > Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 212.973 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > testDifferentProfilers(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler) > Time elapsed: 133.116 sec <<< FAILURE! > java.lang.AssertionError: expected:<4> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfilerInternal(TestMRJobsWithProfiler.java:212) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testDifferentProfilers(TestMRJobsWithProfiler.java:117) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6580) Test failure : TestMRJobsWithProfiler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6580: --- Status: Patch Available (was: Open) > Test failure : TestMRJobsWithProfiler > - > > Key: MAPREDUCE-6580 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6580 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6580.001.patch > > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestMRJobsWithProfiler fails intermittently > {code} > Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 212.973 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > testDifferentProfilers(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler) > Time elapsed: 133.116 sec <<< FAILURE! > java.lang.AssertionError: expected:<4> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfilerInternal(TestMRJobsWithProfiler.java:212) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testDifferentProfilers(TestMRJobsWithProfiler.java:117) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6580) Test failure : TestMRJobsWithProfiler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6580: --- Attachment: MAPREDUCE-6580.001.patch > Test failure : TestMRJobsWithProfiler > - > > Key: MAPREDUCE-6580 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6580 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6580.001.patch > > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestMRJobsWithProfiler fails intermittently > {code} > Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 212.973 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > testDifferentProfilers(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler) > Time elapsed: 133.116 sec <<< FAILURE! > java.lang.AssertionError: expected:<4> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfilerInternal(TestMRJobsWithProfiler.java:212) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testDifferentProfilers(TestMRJobsWithProfiler.java:117) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MAPREDUCE-6580) Test failure : TestMRJobsWithProfiler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned MAPREDUCE-6580: -- Assignee: Eric Badger > Test failure : TestMRJobsWithProfiler > - > > Key: MAPREDUCE-6580 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6580 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Rohith Sharma K S >Assignee: Eric Badger > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestMRJobsWithProfiler fails intermittently > {code} > Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 212.973 sec > <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > testDifferentProfilers(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler) > Time elapsed: 133.116 sec <<< FAILURE! > java.lang.AssertionError: expected:<4> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfilerInternal(TestMRJobsWithProfiler.java:212) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testDifferentProfilers(TestMRJobsWithProfiler.java:117) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6580) Test failure : TestMRJobsWithProfiler
[ https://issues.apache.org/jira/browse/MAPREDUCE-6580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205113#comment-15205113 ] Eric Badger commented on MAPREDUCE-6580: I am able to get TestMRJobsWithProfiler#testDifferentProfilers to fail consistently on my Mac. However, I am not able to get it to fail on my Linux box or on either box when running testDefaultProfiler. On my Mac, I have found that the problem is with the hprof option "times" in TestMRJobsWithProfiler.java:137 (-agentlib:hprof=cpu=times,heap=sites,force=n,thread=y,verbose=n," + "file=%s). This is causing the JVM to screw up and fail seemingly before it even begins to start the container. Additionally, the "times" option is incredibly expensive and slows the JVM down a ton, making the test very likely to time out on a busy Jenkins instance given the 150 second test timeout. IMO using "samples" instead of "times" would be much better. Using samples, the test passes on my Mac and speeds the test up by almost 2x (74.8s vs. 145.5s on my Linux machine). Using "samples" would get rid of the hprof error on my Mac and also speed up the test considerably, reducing the chance of test timeouts. Is there any reason that we can't use "samples" instead of "times" for the profiler? > Test failure : TestMRJobsWithProfiler > - > > Key: MAPREDUCE-6580 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6580 > Project: Hadoop Map/Reduce > Issue Type: Bug >Reporter: Rohith Sharma K S > > From > [https://builds.apache.org/job/PreCommit-YARN-Build/9976/artifact/patchprocess/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-jobclient-jdk1.8.0_66.txt] > TestMRJobsWithProfiler fails intermittently > {code} > Running org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 212.973 sec > <<< FAILURE! 
- in org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler > testDifferentProfilers(org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler) > Time elapsed: 133.116 sec <<< FAILURE! > java.lang.AssertionError: expected:<4> but was:<1> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testProfilerInternal(TestMRJobsWithProfiler.java:212) > at > org.apache.hadoop.mapreduce.v2.TestMRJobsWithProfiler.testDifferentProfilers(TestMRJobsWithProfiler.java:117) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
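The "samples" versus "times" suggestion in the comment above can be sketched as a small helper that builds the hprof agent string. This is only an illustration under stated assumptions: the class name `HprofParams` and the method `hprofParams` are hypothetical and are not taken from the attached patch; only the option string itself comes from the comment.

```java
public class HprofParams {
    // Hypothetical helper (not from MAPREDUCE-6580.001.patch): builds the hprof
    // agent option quoted in the comment, with the cpu mode made explicit.
    // The "%s" placeholder is left in place for the profile output file.
    static String hprofParams(String cpuMode) {
        if (!cpuMode.equals("samples") && !cpuMode.equals("times")) {
            throw new IllegalArgumentException("unsupported cpu mode: " + cpuMode);
        }
        return "-agentlib:hprof=cpu=" + cpuMode
            + ",heap=sites,force=n,thread=y,verbose=n,file=%s";
    }

    public static void main(String[] args) {
        // "times" is the setting the comment reports as slow and failing on a Mac.
        System.out.println(hprofParams("times"));
        // "samples" is the cheaper alternative the comment proposes.
        System.out.println(hprofParams("samples"));
    }
}
```

Only the `cpu=` field differs between the two strings, so the change proposed in the comment would leave the heap, thread and file settings as they are.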
[jira] [Created] (MAPREDUCE-6649) getFailureInfo not returning any failure info
Eric Badger created MAPREDUCE-6649: -- Summary: getFailureInfo not returning any failure info Key: MAPREDUCE-6649 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6649 Project: Hadoop Map/Reduce Issue Type: Bug Reporter: Eric Badger Assignee: Eric Badger The following command does not produce any failure info as to why the job failed. {noformat} $HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar sleep -Dmapreduce.jobtracker.split.metainfo.maxsize=10 -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1 -rt 1 {noformat} {noformat} 2016-03-07 10:34:58,112 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0004 failed with state FAILED due to: {noformat} To contrast, here is a command and associated command line output to show a failed job that gives the correct failure info. {noformat} $HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar sleep -Dyarn.app.mapreduce.am.command-opts=-goober -Dmapreduce.job.queuename=default -m 20 -r 0 -mt 3 {noformat} {noformat} 2016-03-07 10:30:13,103 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1431)) - Job job_1457364518683_0003 failed with state FAILED due to: Application application_1457364518683_0003 failed 3 times due to AM Container for appattempt_1457364518683_0003_03 exited with exitCode: 1 Failing this attempt.Diagnostics: Exception from container-launch. 
Container id: container_1457364518683_0003_03_01 Exit code: 1 Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) at org.apache.hadoop.util.Shell.run(Shell.java:838) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:227) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:319) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:88) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MAPREDUCE-6630) TestUserGroupInformation#testGetServerSideGroups fails in chroot
[ https://issues.apache.org/jira/browse/MAPREDUCE-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141693#comment-15141693 ] Eric Badger commented on MAPREDUCE-6630: The TestCopyPreserveFlag JUnit failure is related to [HADOOP-12589|https://issues.apache.org/jira/browse/HADOOP-12589] and is completely separate from the patch for this JIRA. The patch for this JIRA fixes the TestUserGroupInformation#testGetServerSideGroups issue. > TestUserGroupInformation#testGetServerSideGroups fails in chroot > > > Key: MAPREDUCE-6630 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6630 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: security, test >Affects Versions: 2.1.0-beta >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Minor > Attachments: MAPREDUCE-6630-001.patch > > > Bug fixed by [HADOOP-7811] broken by [HADOOP-8562]. Need to re-introduce the > fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MAPREDUCE-6630) TestUserGroupInformation#testGetServerSideGroups fails in chroot
Eric Badger created MAPREDUCE-6630: -- Summary: TestUserGroupInformation#testGetServerSideGroups fails in chroot Key: MAPREDUCE-6630 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6630 Project: Hadoop Map/Reduce Issue Type: Bug Components: security, test Affects Versions: 2.1.0-beta Reporter: Eric Badger Assignee: Eric Badger Priority: Minor Bug fixed by [HADOOP-7811] broken by [HADOOP-8562]. Need to re-introduce the fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6630) TestUserGroupInformation#testGetServerSideGroups fails in chroot
[ https://issues.apache.org/jira/browse/MAPREDUCE-6630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6630: --- Attachment: MAPREDUCE-6630-001.patch Re-introduce the fix from [HADOOP-7811]. > TestUserGroupInformation#testGetServerSideGroups fails in chroot > > > Key: MAPREDUCE-6630 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6630 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: security, test >Affects Versions: 2.1.0-beta >Reporter: Eric Badger >Assignee: Eric Badger >Priority: Minor > Attachments: MAPREDUCE-6630-001.patch > > > Bug fixed by [HADOOP-7811] broken by [HADOOP-8562]. Need to re-introduce the > fix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6507) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/MAPREDUCE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6507: --- Summary: MiniYARNCluster.start() returns before cluster is completely started (was: MiniYARNCluster start returns before cluster is completely started) > MiniYARNCluster.start() returns before cluster is completely started > > > Key: MAPREDUCE-6507 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6507) MiniYARNCluster start returns before cluster is completely started
[ https://issues.apache.org/jira/browse/MAPREDUCE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6507: --- Summary: MiniYARNCluster start returns before cluster is completely started (was: TestRMNMInfo fails intermittently) > MiniYARNCluster start returns before cluster is completely started > -- > > Key: MAPREDUCE-6507 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
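The symptom in the trace above ("Unexpected number of live nodes: expected:<4> but was:<3>") is what a test sees when it checks the cluster before every node has registered. One generic way a test can guard against that is to poll with a deadline, sketched below. This is an assumption-laden illustration, not the change in MAPREDUCE-6507.001.patch: the class `WaitForNodes`, the method `waitForLiveNodes`, and the simulated node counter are all hypothetical.

```java
import java.util.function.IntSupplier;

public class WaitForNodes {
    // Hypothetical helper: polls liveNodes until it reports at least `expected`
    // nodes or the timeout elapses. Returns true only if the count was reached.
    static boolean waitForLiveNodes(IntSupplier liveNodes, int expected,
                                    long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (liveNodes.getAsInt() >= expected) {
                return true;
            }
            // Subtraction keeps the comparison correct if nanoTime wraps around.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated cluster: one more node "registers" each time we poll.
        int[] registered = {0};
        boolean ready = waitForLiveNodes(() -> ++registered[0], 4, 1000, 10);
        System.out.println("cluster ready: " + ready);
    }
}
```

A poll like this only hides the window in each test that uses it; making the cluster's start method block until startup is complete would remove the window at the source.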
[jira] [Commented] (MAPREDUCE-6507) MiniYARNCluster.start() returns before cluster is completely started
[ https://issues.apache.org/jira/browse/MAPREDUCE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15133187#comment-15133187 ] Eric Badger commented on MAPREDUCE-6507: Tests are failing because of a race condition between RM startup and NM startup. In their serviceStart() methods, both spawn new threads to call start(), which introduces the race. The NM is set up with a waitCount of up to 60 seconds so that it can wait for the cluster to complete startup (even though the start method for the RM has already returned). Removing the threads fixes the race in the test that prompted this Jira (TestRMNMInfo), but causes other tests to fail. Any test that starts a MiniYARNCluster without an active RM will fail because the node managers block the main thread from transitioning one of the RMs from standby to active. This is why the threads worked: they allowed the NMs to wait while the main thread went ahead and transitioned a standby RM to active. I propose changing the MiniYARNCluster start method so that it does not return until the cluster is completely started, and always making one RM active in HA setups. This will require changes to the affected tests (TestRMFailover, TestMiniYARNClusterForHA, etc.), but it makes the code more understandable and removes the races. The tests pass right now only because excessive timeouts mask the race they are fighting. [~kasha] [~jlowe] Please advise. > MiniYARNCluster.start() returns before cluster is completely started > > > Key: MAPREDUCE-6507 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! 
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
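The proposal in the comment above (a start method that does not return until the cluster is completely started) can be illustrated with a minimal sketch. This is not the MiniYARNCluster API and not the attached patch: `SketchCluster`, `startAndCount`, the per-node delay, and the 5-second timeout are all hypothetical names and values chosen for illustration. It covers only the "block until every node has registered" part, not the RM standby-to-active transition.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingStartSketch {
    // Hypothetical stand-in for a mini cluster; NOT the MiniYARNCluster API.
    static class SketchCluster {
        private final int nodeCount;
        private final AtomicInteger liveNodes = new AtomicInteger();

        SketchCluster(int nodeCount) {
            this.nodeCount = nodeCount;
        }

        // Returns only after every node has registered; throws on timeout.
        void start(long timeoutMillis) {
            CountDownLatch registered = new CountDownLatch(nodeCount);
            for (int i = 0; i < nodeCount; i++) {
                final long delayMillis = 10L * i; // simulated, uneven startup time
                Thread node = new Thread(() -> {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    liveNodes.incrementAndGet(); // node is now "live"
                    registered.countDown();      // signal the starter
                });
                node.setDaemon(true);
                node.start();
            }
            try {
                // Block the caller until all nodes have registered.
                if (!registered.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    throw new IllegalStateException("cluster did not start in time");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while starting", e);
            }
        }

        int getLiveNodes() {
            return liveNodes.get();
        }
    }

    // Starts a cluster and reports the live-node count seen right after start().
    static int startAndCount(int nodes) {
        SketchCluster cluster = new SketchCluster(nodes);
        cluster.start(5000);
        return cluster.getLiveNodes();
    }

    public static void main(String[] args) {
        System.out.println("live nodes after start(): " + startAndCount(4));
    }
}
```

Because the caller waits on the latch, a check such as the live-node assertion in TestRMNMInfo would, under these assumptions, see every node as soon as start() returns, with no reliance on timing.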
[jira] [Updated] (MAPREDUCE-6507) TestRMNMInfo fails intermittently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6507: --- Target Version/s: 2.7.3 I noticed that TestMRTimelineEventHandling is failing in branch-2.7. I believe that this patch will also fix this failing test. > TestRMNMInfo fails intermittently > - > > Key: MAPREDUCE-6507 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6507) TestRMNMInfo fails intermittently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6507: --- Status: Patch Available (was: Open) > TestRMNMInfo fails intermittently > - > > Key: MAPREDUCE-6507 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MAPREDUCE-6507) TestRMNMInfo fails intermittently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated MAPREDUCE-6507: --- Attachment: MAPREDUCE-6507.001.patch The problem is that a separate thread starts the services while the main thread concurrently checks their state. The main thread checks whether the services are fully up, but a service can appear to be started before it has actually completed its startup (AbstractService sets the state to STARTED before calling serviceStart()). I think the easiest fix is to remove the threads spawned from within the main thread, which do not appear to accomplish anything of value. > TestRMNMInfo fails intermittently > - > > Key: MAPREDUCE-6507 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > Attachments: MAPREDUCE-6507.001.patch > > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
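The ordering described in the message above (the state is set to STARTED before serviceStart() runs) can be reproduced in isolation. The following is a minimal sketch under stated assumptions, not Hadoop code: `SketchService`, its fields, and the latches that make the interleaving deterministic are hypothetical stand-ins that only mimic the ordering the comment attributes to AbstractService.

```java
import java.util.concurrent.CountDownLatch;

class StartedTooEarlySketch {
    enum State { NOTINITED, STARTED }

    // Hypothetical stand-in for a service; NOT Hadoop's AbstractService.
    static class SketchService {
        volatile State state = State.NOTINITED;
        volatile boolean startupWorkDone = false;
        final CountDownLatch stateFlipped = new CountDownLatch(1);
        final CountDownLatch allowFinish = new CountDownLatch(1);

        void start() throws InterruptedException {
            state = State.STARTED;    // the state flips first...
            stateFlipped.countDown();
            serviceStart();           // ...and the real startup work runs afterwards
        }

        void serviceStart() throws InterruptedException {
            allowFinish.await();      // simulates slow startup work
            startupWorkDone = true;
        }
    }

    // Returns {sawStarted, workDoneWhenObserved, workDoneAfterJoin}.
    static boolean[] observe() {
        SketchService service = new SketchService();
        Thread starter = new Thread(() -> {
            try {
                service.start();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        starter.start();
        try {
            service.stateFlipped.await();
            // The observing thread polls the state, as the main thread does in the report.
            boolean sawStarted = service.state == State.STARTED;
            boolean workDoneWhenObserved = service.startupWorkDone;
            service.allowFinish.countDown(); // let the startup work finish
            starter.join();
            return new boolean[] { sawStarted, workDoneWhenObserved, service.startupWorkDone };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println("state == STARTED: " + r[0]
            + ", startup finished when observed: " + r[1]
            + ", startup finished after join: " + r[2]);
    }
}
```

In this sketch the observer sees STARTED while the startup work is still pending, which is the window in which a test could count fewer live nodes than expected. Calling start() directly on the observing thread, as the message suggests, closes that window because the check cannot run until start() has returned.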
[jira] [Assigned] (MAPREDUCE-6507) TestRMNMInfo fails intermittently
[ https://issues.apache.org/jira/browse/MAPREDUCE-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger reassigned MAPREDUCE-6507: -- Assignee: Eric Badger > TestRMNMInfo fails intermittently > - > > Key: MAPREDUCE-6507 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-6507 > Project: Hadoop Map/Reduce > Issue Type: Bug > Components: test >Reporter: Rohith Sharma K S >Assignee: Eric Badger > > TestRMNMInfo fails intermittently. Below is trace for the failure > {noformat} > testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo) Time elapsed: 0.28 > sec <<< FAILURE! > java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but > was:<3> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at > org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)