[jira] [Commented] (MAPREDUCE-5452) NPE in TaskID toString when default constructor is used
[ https://issues.apache.org/jira/browse/MAPREDUCE-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13848798#comment-13848798 ] Andrey Klochkov commented on MAPREDUCE-5452: This actually leads to non-minor issues in downstream projects. See HIVE-4216. NPE in TaskID toString when default constructor is used --- Key: MAPREDUCE-5452 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5452 Project: Hadoop Map/Reduce Issue Type: Bug Affects Versions: 2.0.5-alpha Reporter: Brock Noland Priority: Minor If you call TaskID.toString() after using the default constructor, toString() throws an NPE because taskType is null. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
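The failure mode in this report can be sketched with a small stand-in class. The sketch below is only an illustration under stated assumptions: `SketchTaskId`, its `Type` enum, the `task_..._N` string layout and the `unknown` placeholder are all hypothetical and are not taken from Hadoop's `TaskID`. It shows how a field left null by a no-arg constructor makes an unguarded `toString()` throw, and one possible way to guard against it.

```java
import java.util.Locale;

// Hypothetical stand-in for the pattern in the report; this is NOT Hadoop's TaskID.
final class SketchTaskId {
    enum Type { MAP, REDUCE }

    private final Type type; // stays null when the no-arg constructor is used
    private final int id;

    SketchTaskId() {
        this(null, 0);
    }

    SketchTaskId(Type type, int id) {
        this.type = type;
        this.id = id;
    }

    /** Unguarded form: dereferences type, so it throws NullPointerException when type is null. */
    String unsafeToString() {
        return "task_" + type.name().toLowerCase(Locale.ROOT) + "_" + id;
    }

    /** Guarded form: substitutes a placeholder instead of dereferencing null. */
    @Override
    public String toString() {
        String label = (type == null) ? "unknown" : type.name().toLowerCase(Locale.ROOT);
        return "task_" + label + "_" + id;
    }

    public static void main(String[] args) {
        SketchTaskId empty = new SketchTaskId();
        System.out.println(empty); // guarded path, no exception
        try {
            empty.unsafeToString();
        } catch (NullPointerException e) {
            System.out.println("unguarded toString threw NullPointerException");
        }
    }
}
```

Whether the real fix should use a placeholder, a default task type, or reject the default constructor is a design decision for the issue itself; the sketch only makes the null dereference visible.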
[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-3860: --- Attachment: MAPREDUCE-3860--n4.patch Jonathan, The logs don't provide much info on why the tests fail. Per your description it seems that the tests hang indefinitely, so printing thread dumps on test timeouts would probably help. I'm attaching a patch which modifies Rumen's pom.xml by adding a JUnit listener that prints thread dumps. I could not reproduce any failures in the Rumen tests; I tried 4 different machines (osx, centos, fedora on h/w nodes, and rhel on a VM). Please reproduce the failures in your environment one more time and attach the console output of Maven and all Surefire logs (not just *-output.txt). Thanks for working on this. [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860--n4.patch, MAPREDUCE-3860.patch, org.apache.hadoop.tools.rumen.TestRumenAnonymization-output.txt, org.apache.hadoop.tools.rumen.TestRumenJobTraces-output.txt, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
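The message does not show how the attached listener produces its thread dumps, so the following is only a sketch of the dump-producing part, using nothing but the JDK. The class name `ThreadDumpSketch` and the output layout are assumptions made for illustration. In a JUnit 4 setup, a method like this could be called from a `RunListener` callback such as `testFailure`, which is one plausible place to hook it in.

```java
import java.util.Map;

// Hypothetical helper; the listener in the attached patch may be implemented differently.
final class ThreadDumpSketch {

    /** Builds a plain-text dump of all live threads with their states and stack frames. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" daemon=").append(thread.isDaemon())
              .append(" state=").append(thread.getState())
              .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print to stderr so the dump ends up next to other test diagnostics.
        System.err.print(dumpAllThreads());
    }
}
```

For a hanging test, the interesting part of such a dump is usually the threads in `BLOCKED` or `WAITING` state and the frames they are parked in.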
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809577#comment-13809577 ] Andrey Klochkov commented on MAPREDUCE-3860: Also, it could be that the timeouts I set in the tests are still too low for you, if your machine is that slow. Can you increase them by up to an order of magnitude to check that? [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860--n4.patch, MAPREDUCE-3860.patch, org.apache.hadoop.tools.rumen.TestRumenAnonymization-output.txt, org.apache.hadoop.tools.rumen.TestRumenJobTraces-output.txt, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n8.patch Attaching rebased patch. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n8.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports a parallel testing feature. By using it, the tests can be run faster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809636#comment-13809636 ] Andrey Klochkov commented on MAPREDUCE-4980: The build failed due to an OOM while processing native code. This is not related to the patch. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n8.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports a parallel testing feature. By using it, the tests can be run faster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-3860: --- Attachment: MAPREDUCE-3860--n3.patch As I see in the logs, the failures in the Linux environment were caused by test timeouts being too low. Attaching a patch which fixes that. As for the failures in the Mac environment, I see that all M/R jobs failed there. I saw similar issues when running tests without JAVA_HOME set. I can't find out more from the logs. [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13807693#comment-13807693 ] Andrey Klochkov commented on MAPREDUCE-3860: Jonathan, thanks for testing this. Can you please attach the Surefire logs? I still can't see a possible reason for the failures you see. I just tried to run the 3 commands you mentioned, and all three passed on my osx with jdk7. I'm not trying to use the "works for me" argument, but I can't reproduce this, so logs would be really helpful. [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805741#comment-13805741 ] Andrey Klochkov commented on MAPREDUCE-3860: Jonathan, I'm using exactly OSX and jdk7. Can you please give more info on failures? [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13805787#comment-13805787 ] Andrey Klochkov commented on MAPREDUCE-3860: Still I can't reproduce failures, tried Linux x86_64 with java 1.7.0_13 and OSX with java 1.7.0_17. Please attach Surefire logs. [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: (was: MAPREDUCE-4980--n7.patch) Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports a parallel testing feature. By using it, the tests can be run faster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n7.patch Rebased the patch. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports a parallel testing feature. By using it, the tests can be run faster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13800884#comment-13800884 ] Andrey Klochkov commented on MAPREDUCE-3860: Ravi, Well, to begin with, the majority of Hadoop unit tests are not true unit tests. I mean all the mini-cluster-based tests and similar ones. Still, I think that with a mature and slowly evolving codebase, and priorities shifted to maturity rather than quick changes, it makes sense to have such kinds of tests, especially when the alternative is not to have any. Currently Rumen code coverage by tests is almost zero, and this has been the case for a long time. Knowing that it is probably one of the slowest changing parts of the codebase, I think having those old tests, which are indeed based on manually generated static files, is better than having none. As I understand, some of the tests like {{TestRumenJobTraces.testHadoop20JHParser}} use static pre-generated files to test compatibility with older versions of Hadoop, and in the case of {{TestRumenJobTraces}} there is {{testCurrentJHParser}}, which does use a real job to generate logs and then parse them, i.e. this is the test which works with the current version of the codebase. Having said that, I agree that improving these tests to use mini clusters as much as possible, instead of using pre-generated files, is a proper way of improving the tests further. The purpose of this task is to bring back the old tests, which as I understand were broken when switching to Yarn, and were removed just to fix the builds quickly. On the questions for {{TestRumenAnonymization}}: 1. It's waiting for 100 *milli*seconds, not 100 seconds, which is tolerable. 2. No, having an instance of {{Configuration}} as a field in the class cannot affect tests in this way. JUnit creates a dedicated instance of the class for each test execution. 3. Not sure if having a null in the username is a valid anonymization. Is it? 4. Why should those temp dirs be deleted after test runs? It makes troubleshooting more difficult. Those dirs do not interfere with each other or other tests. [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
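Point 2 in the comment above relies on JUnit constructing a fresh instance of the test class for every test method, so instance fields do not leak state between tests. The sketch below imitates that behaviour without depending on JUnit. `PerTestInstanceSketch`, `FakeTest` and the tiny "runner" are hypothetical stand-ins, and the `settings` list merely plays the role of a mutable field such as a `Configuration`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical imitation of per-test instantiation; real JUnit runners do this internally.
final class PerTestInstanceSketch {

    /** Stand-in for a test class that keeps mutable state in an instance field. */
    static final class FakeTest {
        private final List<String> settings = new ArrayList<>();

        int testA() {
            settings.add("a");
            return settings.size();
        }

        int testB() {
            settings.add("b");
            return settings.size();
        }
    }

    /** Runs each "test" on its own freshly constructed instance, as JUnit does. */
    static List<Integer> runWithFreshInstances(Supplier<FakeTest> factory) {
        List<Integer> sizes = new ArrayList<>();
        sizes.add(factory.get().testA());
        sizes.add(factory.get().testB());
        return sizes;
    }

    /** Runs both "tests" on one shared instance, which is what JUnit does NOT do by default. */
    static List<Integer> runWithSharedInstance(FakeTest shared) {
        List<Integer> sizes = new ArrayList<>();
        sizes.add(shared.testA());
        sizes.add(shared.testB());
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println("fresh instances: " + runWithFreshInstances(FakeTest::new));
        System.out.println("shared instance: " + runWithSharedInstance(new FakeTest()));
    }
}
```

With fresh instances each test sees only its own modification, while the shared-instance run accumulates state. Static fields would still be shared between tests, which is a separate concern from the instance field discussed in the comment.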
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13798459#comment-13798459 ] Andrey Klochkov commented on MAPREDUCE-3860: Ravi, Can you please provide more info? I can't reproduce it in my environment. Surefire logs would be fine. BTW my name is Andrey :-) [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n7.patch Rebasing the patch. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports a parallel testing feature. By using it, the tests can be run faster. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-3860: --- Attachment: MAPREDUCE-3860--n2.patch [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13798598#comment-13798598 ] Andrey Klochkov commented on MAPREDUCE-3860: Ravi, I reproduced the bug in {{testProcessInputArgument}} on a Linux machine, will attach a fixed patch shortly. Also I tried to reproduce the job failure in {{testCurrentJHParser}} and made a few runs of the test on 2 machines with 2.x and 3.x Linux kernels and different flavors of JDK7, but all runs succeeded. Worked on OSX too. Didn't see any issues caused by OOM. If it happens again, try giving more memory to Maven. [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13797413#comment-13797413 ] Andrey Klochkov commented on MAPREDUCE-3860: Ravi, thanks a lot for this summary. I should have written this myself when submitting the patch. Now answering your questions. The old tests were deleted by the following commit: {code} commit 00ac37838c4a55a2b855983e9730cbd26e6f3477 Author: Mahadev Konar maha...@apache.org Date: Sat Jan 21 01:15:24 2012 + MAPREDUCE-3705. ant build fails on 0.23 branch. (Thomas Graves via mahadev) git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1234227 13f79535-47bb-0310-9956-ffa450edef68 {code} So I got the tests from the preceding commit, namely: {code} commit 6b8a6a701a972f9528b9a2672401db51a31f52fb Author: Mahadev Konar maha...@apache.org Date: Sat Jan 21 00:53:02 2012 + MAPREDUCE-3549. write api documentation for web service apis for RM, NM, mapreduce app master, and job history server (Thoma git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1234222 13f79535-47bb-0310-9956-ffa450edef68 {code} The tests are under {{hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/tools/rumen}}. The data is under {{hadoop-mapreduce-project/src/test/tools/data/rumen}}. The {{rumen-test-data.tar.gz}} file contains those data files which are in binary form, so I couldn't put those into a patch file. The non-binary (non-gzipped) files are in the patch file itself. So to apply the changes properly, you need to apply the patch and also un-tar {{rumen-test-data.tar.gz}}. I did not create the {{sample-conf.file.new.xml}} file; it existed in the old tests. Also, see {{TestRumenAnonymization}}, {{TestRumenFolder}} there. I did change {{job-tracker-logs-topology-output}} to make the tests succeed. As I understand, this is caused by newer versions of Rumen doing time adjustment, so the expected data in {{job-tracker-logs-topology-output}} is not what Rumen currently produces. See the {{Folder.adjustJobTimes}} method. The change in {{WordList}} is actually a bug fix. I figured it doesn't make much sense to file a separate Jira for that. Sometimes when a WordList instance is being deserialized from disk, the size attribute is read after the words themselves are read, and so when deserializing size the words list is cleared (a bug in deserialization). I did make many changes in {{TestRumenJobTraces}} and {{TestRumenAnonymization}}; that was required due to changes in Hadoop itself. [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
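The {{WordList}} deserialization problem described in the comment above is an order-dependence bug: if setting the size resets the list, the result depends on whether the size attribute or the words attribute is read first. The sketch below models only that mechanism. `WordListSketch`, its setters and the `clearOnSetSize` switch are hypothetical and are not the real Rumen class; the actual fix in the patch is not shown in the message, so the "fixed" variant here is just one possible approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the order-dependent bug; NOT the real Rumen WordList.
final class WordListSketch {
    private final List<String> words = new ArrayList<>();
    private final boolean clearOnSetSize;

    WordListSketch(boolean clearOnSetSize) {
        this.clearOnSetSize = clearOnSetSize;
    }

    /** Called by an assumed deserializer when it reads the "words" attribute. */
    void setWords(List<String> newWords) {
        words.clear();
        words.addAll(newWords);
    }

    /** Called when the "size" attribute is read. The buggy variant resets the list. */
    void setSize(int size) {
        if (clearOnSetSize) {
            words.clear();
        }
        // The non-clearing variant ignores the stored size, since it can be derived from the list.
    }

    int getSize() {
        return words.size();
    }

    /** Replays the attribute order from the description: words first, then size. */
    static int sizeAfterWordsThenSize(boolean clearOnSetSize) {
        WordListSketch list = new WordListSketch(clearOnSetSize);
        list.setWords(Arrays.asList("alpha", "beta"));
        list.setSize(2);
        return list.getSize();
    }

    public static void main(String[] args) {
        System.out.println("clearing setter:     " + sizeAfterWordsThenSize(true));
        System.out.println("non-clearing setter: " + sizeAfterWordsThenSize(false));
    }
}
```

Because attribute order in a serialized document is generally not guaranteed, a setter with a side effect on another field tends to fail only "sometimes", which matches the intermittent behaviour described in the comment.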
[jira] [Commented] (MAPREDUCE-5387) Implement Signal.TERM on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13794770#comment-13794770 ] Andrey Klochkov commented on MAPREDUCE-5387: I uploaded the patch which implements approximations for both QUIT and TERM for Windows, via console event handlers. See [YARN-445]. Implement Signal.TERM on Windows Key: MAPREDUCE-5387 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5387 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 3.0.0, 1-win, 2.1.0-beta Reporter: Ivan Mitic Assignee: Ivan Mitic Signal.TERM is currently not supported by Hadoop on the Windows platform. Tracking Jira for the problem. A couple of things to keep in mind: - Support for process groups (JobObjects on Windows) - Solution should work for both java and other streaming Hadoop apps -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit teststoo
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov reassigned MAPREDUCE-3860: -- Assignee: Andrey Klochkov [Rumen] Bring back the removed Rumen unit teststoo -- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5387) Implement Signal.TERM on Windows
[ https://issues.apache.org/jira/browse/MAPREDUCE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13793117#comment-13793117 ] Andrey Klochkov commented on MAPREDUCE-5387: Indeed, [YARN-445] is related. Thanks to [~cnauroth] for pointing this out. I think I can put up a patch which sends Ctrl+C to all processes in the job object and makes Yarn use it as an analog of the TERM signal when running on Windows. That would be similar to how it's done with Ctrl+Break in [YARN-445]. Implement Signal.TERM on Windows Key: MAPREDUCE-5387 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5387 Project: Hadoop Map/Reduce Issue Type: Improvement Affects Versions: 3.0.0, 1-win, 2.1.0-beta Reporter: Ivan Mitic Assignee: Ivan Mitic Signal.TERM is currently not supported by Hadoop on the Windows platform. Tracking Jira for the problem. A couple of things to keep in mind: - Support for process groups (JobObjects on Windows) - Solution should work for both java and other streaming Hadoop apps -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit teststoo
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-3860: --- Attachment: MAPREDUCE-3860.patch rumen-test-data.tar.gz Attaching a patch and a tarball with gzip'ped test data. The robot wouldn't be able to run tests. [Rumen] Bring back the removed Rumen unit teststoo -- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit teststoo
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-3860: --- Target Version/s: 3.0.0, 2.3.0 Status: Patch Available (was: Open) [Rumen] Bring back the removed Rumen unit teststoo -- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-3860: --- Summary: [Rumen] Bring back the removed Rumen unit tests (was: [Rumen] Bring back the removed Rumen unit teststoo) [Rumen] Bring back the removed Rumen unit tests --- Key: MAPREDUCE-3860 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860 Project: Hadoop Map/Reduce Issue Type: Bug Components: tools/rumen Reporter: Ravi Gummadi Assignee: Andrey Klochkov Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need to be brought back: TestZombieJob.java TestRumenJobTraces.java TestRumenFolder.java TestRumenAnonymization.java TestParsedLine.java TestConcurrentRead.java -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13789457#comment-13789457 ] Andrey Klochkov commented on MAPREDUCE-5102: Thanks Nathan. Yes, that DriverForTest is not needed anymore, and TestSplitters will fail on float splitters until the bug is fixed. fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-branch-2--n5.patch, MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk--n5.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db patch MAPREDUCE-5102-trunk.patch for trunk and branch-2 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5102: --- Attachment: MAPREDUCE-5102-trunk--n4.patch MAPREDUCE-5102-branch-2--n4.patch Improved the patches according to the last comment. Also removed tests which do not have much value but add unneeded rigidity to the code. fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db patch MAPREDUCE-5102-trunk.patch for trunk and branch-2 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5102: --- Attachment: MAPREDUCE-5102-trunk--n4.patch MAPREDUCE-5102-branch-2--n4.patch fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch
[jira] [Assigned] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov reassigned MAPREDUCE-5102: -- Assignee: Andrey Klochkov (was: Aleksey Gorshkov) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch
[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db
[ https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5102: --- Attachment: MAPREDUCE-5102-trunk--n3.patch MAPREDUCE-5102-branch-2--n3.patch Updated patches for trunk and branch-2 to work with both JDK 6 and 7. Please disregard the patch for 0.23; we're targeting trunk and branch-2 only. fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db Key: MAPREDUCE-5102 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102 Project: Hadoop Map/Reduce Issue Type: Test Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha Reporter: Aleksey Gorshkov Assignee: Andrey Klochkov Attachments: MAPREDUCE-5102-branch-0.23.patch, MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk.patch, MAPREDUCE-5102-trunk-v1.patch
[jira] [Resolved] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Klochkov resolved MAPREDUCE-5501.
Resolution: Won't Fix

This is caused by a bug in MiniYARNCluster. Reported in YARN-1183.

RMContainer Allocator does not stop when cluster shutdown is performed in tests
---
Key: MAPREDUCE-5501
URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov
Attachments: hanging-rmcontainer-allocator.stdout, hanging-rmcontainer-allocator.syslog

After running MR job client tests, many MRAppMaster processes stay alive. The reason seems to be that the RMContainer Allocator thread ignores InterruptedException and keeps retrying:

{code}
2013-09-09 18:52:07,505 WARN [RMCommunicator Allocator] org.apache.hadoop.util.ThreadUtil: interrupted while sleeping
java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:149)
    at com.sun.proxy.$Proxy29.allocate(Unknown Source)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:154)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:553)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
    at java.lang.Thread.run(Thread.java:680)
2013-09-09 18:52:37,639 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-09-09 18:52:38,640 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
{code}

It takes 6 minutes for the processes to die, and this causes various issues with tests which use the same DFS dir.

{code}
2013-09-09 22:26:47,179 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Could not contact RM after 36 milliseconds.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not contact RM after 36 milliseconds.
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:563)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
    at java.lang.Thread.run(Thread.java:680)
{code}

Will attach a thread dump separately.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
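The symptom described in this report (a retry loop that keeps going after its thread is interrupted) can be illustrated with a small, self-contained sketch. This is not Hadoop code and not the fix that was applied; the issue was resolved as Won't Fix, with the root cause tracked in YARN-1183. The class `InterruptibleRetry` and its method `callWithRetries` are hypothetical names chosen for illustration. The sketch only shows one common way for a fixed-sleep retry loop to honor interruption instead of ignoring it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of Hadoop): retries a call with a fixed
 * sleep between attempts, but stops as soon as the thread is interrupted.
 */
final class InterruptibleRetry {
    private InterruptibleRetry() {}

    static <T> T callWithRetries(Callable<T> call, int maxRetries, long sleepMillis)
            throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            // Check for a pending interrupt before every attempt, so a
            // shutdown request is noticed even if no sleep is in progress.
            if (Thread.interrupted()) {
                throw new InterruptedException("interrupted before attempt " + attempt);
            }
            try {
                return call.call();
            } catch (InterruptedException e) {
                // Never swallow interruption: propagate it to the caller.
                throw e;
            } catch (Exception e) {
                last = e; // remember the failure and retry
            }
            if (attempt < maxRetries) {
                // sleep() throws InterruptedException if interrupted,
                // which ends the loop instead of starting another retry.
                TimeUnit.MILLISECONDS.sleep(sleepMillis);
            }
        }
        throw last; // all attempts failed
    }
}
```

With this shape, a thread that is interrupted during shutdown leaves the loop on the next check or sleep, rather than running through all remaining retries. Whether that behavior is appropriate for a given caller is a design decision; some code intentionally sleeps through interrupts, as the `sleepAtLeastIgnoreInterrupts` frame in the stack trace above suggests.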
[jira] [Assigned] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov reassigned MAPREDUCE-5501: -- Assignee: Andrey Klochkov RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov Assignee: Andrey Klochkov Attachments: hanging-rmcontainer-allocator.stdout, hanging-rmcontainer-allocator.syslog
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Status: Patch Available (was: Open) Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports a parallel testing feature. By using it, the tests can be run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n6.patch Rebased the patch. While testing, I made an additional fix in MiniMRYarnCluster for an issue that sometimes led to an incorrectly configured history server address. Also, a fix submitted separately in YARN-1183 makes builds much more stable. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, MAPREDUCE-4980.patch
[jira] [Created] (MAPREDUCE-5501) RMContainer Allocator loops forever after cluster shutdown in tests
Andrey Klochkov created MAPREDUCE-5501: -- Summary: RMContainer Allocator loops forever after cluster shutdown in tests Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov
[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5501: --- Summary: RMContainer Allocator does not stop when cluster shutdown is performed in tests (was: RMContainer Allocator loops forever after cluster shutdown in tests) RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov
[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5501: --- Attachment: hanging-rmcontainer-allocator.stdout RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov Attachments: hanging-rmcontainer-allocator.stdout, hanging-rmcontainer-allocator.syslog
[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5501: --- Attachment: hanging-rmcontainer-allocator.syslog RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov Attachments: hanging-rmcontainer-allocator.stdout, hanging-rmcontainer-allocator.syslog
[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5501: --- Attachment: handing-rmcontainer-allocator.syslog RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov
[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5501: --- Attachment: (was: handing-rmcontainer-allocator.syslog) RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov
[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5501: --- Attachment: handing-rmcontainer-allocator.stdout attaching thread dump RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov
[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests
[ https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-5501: --- Attachment: (was: handing-rmcontainer-allocator.stdout) RMContainer Allocator does not stop when cluster shutdown is performed in tests --- Key: MAPREDUCE-5501 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501 Project: Hadoop Map/Reduce Issue Type: Bug Components: resourcemanager Affects Versions: trunk Reporter: Andrey Klochkov
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Status: Open (was: Patch Available) The patch needs to be rebased; I'm working on it. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1303#comment-1303 ] Andrey Klochkov commented on MAPREDUCE-4980: Vinod, can you please point me to downstream components that depend on MiniMRClientClusterFactory? I can't find any with a quick grep through Core, HBase, Pig, and Hive. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n5.patch Rebasing and refreshing the patch. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Andrey Klochkov Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n4.patch Updating the patch according to changes in trunk. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633849#comment-13633849 ] Andrey Klochkov commented on MAPREDUCE-4980: The failure is expected due to the dependency on HDFS-4491. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980--n4.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Resolved] (MAPREDUCE-4534) Test failures with Container .. is running beyond virtual memory limits
[ https://issues.apache.org/jira/browse/MAPREDUCE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov resolved MAPREDUCE-4534. Resolution: Duplicate Test failures with Container .. is running beyond virtual memory limits - Key: MAPREDUCE-4534 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4534 Project: Hadoop Map/Reduce Issue Type: Bug Components: test Affects Versions: 0.23.3 Reporter: Ilya Katsov Tests org.apache.hadoop.tools.TestHadoopArchives.{testRelativePath,testPathWithSpaces} fail with the following message: {code} Container [pid=7785,containerID=container_1342495768864_0001_01_01] is running beyond virtual memory limits. Current usage: 143.6mb of 1.5gb physical memory used; 3.4gb of 3.1gb virtual memory used. Killing container. Dump of the process-tree for container_1342495768864_0001_01_01 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 7797 7785 7785 7785 (java) 573 38 3517018112 36421 /usr/java/jdk1.6.0_33/jre/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster |- 7785 7101 7785 7785 (bash) 1 1 108605440 332 /bin/bash -c /usr/java/jdk1.6.0_33/jre/bin/java -Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stdout 2>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stderr {code} This problem does not reproduce reliably, but setting MALLOC_ARENA_MAX resolves it.
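The numbers in the kill message can be checked by hand from the process-tree dump. The sketch below sums the VMEM_USAGE(BYTES) column for the two processes in the tree and compares the total with a limit derived from the 1.5gb physical allocation. It assumes the NodeManager used a virtual-to-physical ratio of 2.1 (the documented default of yarn.nodemanager.vmem-pmem-ratio), which is consistent with the reported 3.1gb limit but is not stated in the report. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of the arithmetic behind "running beyond virtual memory limits". */
final class VmemLimitSketch {
    // Assumption: default value of yarn.nodemanager.vmem-pmem-ratio.
    static final double VMEM_PMEM_RATIO = 2.1;

    /** Sums VMEM_USAGE(BYTES) over every process in the container's process tree. */
    static long totalVmemBytes(List<Long> perProcessVmemBytes) {
        long total = 0;
        for (long bytes : perProcessVmemBytes) {
            total += bytes;
        }
        return total;
    }

    /** Virtual memory limit derived from the container's physical memory allocation. */
    static double vmemLimitBytes(long physicalLimitBytes) {
        return physicalLimitBytes * VMEM_PMEM_RATIO;
    }

    /** True when the whole process tree uses more virtual memory than the derived limit. */
    static boolean isOverLimit(List<Long> perProcessVmemBytes, long physicalLimitBytes) {
        return totalVmemBytes(perProcessVmemBytes) > vmemLimitBytes(physicalLimitBytes);
    }

    public static void main(String[] args) {
        // VMEM_USAGE(BYTES) of the java (pid 7797) and bash (pid 7785) processes in the dump.
        List<Long> usage = Arrays.asList(3517018112L, 108605440L);
        long physicalLimit = 1536L * 1024 * 1024; // the 1.5gb physical allocation
        double gib = 1024.0 * 1024 * 1024;
        System.out.println("used GiB:  " + totalVmemBytes(usage) / gib);
        System.out.println("limit GiB: " + vmemLimitBytes(physicalLimit) / gib);
        System.out.println("over limit: " + isOverLimit(usage, physicalLimit));
    }
}
```

The two processes add up to 3,625,623,552 bytes, roughly the 3.4gb in the message, while physical usage is only 143.6mb. The container is therefore killed for reserved address space rather than for memory it actually touches.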
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980--n3.patch Updating the patch according to changes in trunk. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617813#comment-13617813 ] Andrey Klochkov commented on MAPREDUCE-4980: I believe something's wrong with the QA robot. The patch applies cleanly to the current trunk. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Issue Type: Test (was: Improvement) Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Test Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: (was: HADOOP-9287--N2.patch) Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: HADOOP-9287--N2.patch The patch is updated with a few additional fixes. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Attachment: MAPREDUCE-4980.patch Attaching a patch that makes fixes similar to the ones introduced in HADOOP-9287 and HDFS-4491; it depends on those JIRAs. The patch introduces multi-fork execution for the hadoop-mapreduce-client-jobclient module only. Together with HADOOP-9287 and HDFS-4491, it introduces multi-fork execution for the 3 Hadoop Core modules that take the majority of the time to test: hadoop-common, hadoop-hdfs and hadoop-mapreduce-client-jobclient. Overview of changes: 1. A new profile, parallel-tests, is introduced in hadoop-mapreduce-client-jobclient/pom.xml. 2. Tests are refactored to use the PathUtils.getTestDir/getTestPath methods to get a directory/path for test data. Previously, each of the refactored tests implemented this in a similar but slightly different way. 3. MiniMRClientClusterFactory is replaced with MiniMRClientClusterBuilder, which allows more flexible cluster configuration, in a way similar to MiniDFSCluster.Builder. 4. All usages of the deprecated class MiniMRCluster are replaced with MiniMRClientClusterBuilder. This is required to avoid FS contention when running mini MR clusters in parallel. 5. All usages of MiniDFSCluster constructors are replaced with MiniDFSCluster.Builder, for the same purpose as the previous item. The changes were tested by comparing test stability against trunk in both serial and parallel mode. No additional flakiness was noticed. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
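The idea behind routing all test data through one helper (item 2 above) can be sketched as follows. This is not Hadoop's PathUtils; the class TestDirSketch, its getTestDir signature and the forkId parameter are hypothetical, and the sketch assumes the build passes each forked JVM a distinct identifier.

```java
import java.io.File;

/** Sketch: give every test class in every forked JVM its own data directory. */
final class TestDirSketch {
    /**
     * Hypothetical helper, not Hadoop's PathUtils. baseDir and forkId are assumed
     * to be supplied by the build, for example through system properties.
     */
    static File getTestDir(File baseDir, String forkId, Class<?> testClass) {
        // <base>/fork-<id>/<fully qualified test class name>
        return new File(new File(baseDir, "fork-" + forkId), testClass.getName());
    }

    public static void main(String[] args) {
        File base = new File("target", "test-data");
        // The same test class running in two forks resolves to two different directories.
        System.out.println(getTestDir(base, "1", String.class));
        System.out.println(getTestDir(base, "2", String.class));
    }
}
```

Because the directory depends on both the fork and the test class, two tests running at the same time never write to the same location, which is the contention the refactoring aims to avoid.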
[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrey Klochkov updated MAPREDUCE-4980: --- Target Version/s: 3.0.0 Status: Patch Available (was: Open) Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core
[ https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599286#comment-13599286 ] Andrey Klochkov commented on MAPREDUCE-4980: The build fails due to the dependency on HDFS-4491, which is not in trunk yet. The patch affects a large number of tests; setting timeouts for all of them shouldn't be done as part of this patch. Parallel test execution of hadoop-mapreduce-client-core --- Key: MAPREDUCE-4980 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980 Project: Hadoop Map/Reduce Issue Type: Improvement Components: test Affects Versions: 3.0.0 Reporter: Tsuyoshi OZAWA Assignee: Tsuyoshi OZAWA Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch The Maven Surefire plugin supports parallel test execution. Using it, the tests can run faster.
[jira] [Created] (MAPREDUCE-4467) IndexCache failures due to missing synchronization
Andrey Klochkov created MAPREDUCE-4467: -- Summary: IndexCache failures due to missing synchronization Key: MAPREDUCE-4467 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4467 Project: Hadoop Map/Reduce Issue Type: Bug Components: nodemanager Affects Versions: 0.23.2 Reporter: Andrey Klochkov TestMRJobs.testSleepJob fails randomly due to synchronization error in IndexCache: {code} 2012-07-20 19:32:34,627 ERROR [New I/O server worker #2-1] mapred.ShuffleHandler (ShuffleHandler.java:exceptionCaught(528)) - Shuffle error: java.lang.IllegalMonitorStateException at java.lang.Object.wait(Native Method) at org.apache.hadoop.mapred.IndexCache.getIndexInformation(IndexCache.java:74) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendMapOutput(ShuffleHandler.java:471) at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:397) at org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:148) at org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302) at org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:522) at org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:506) at org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274) at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261) at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349) at org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280) at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {code} A related issue is MAPREDUCE-4384. The change introduced there removed the synchronized keyword, so the info.wait() call fails. The call needs to be wrapped in a synchronized block. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
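The IllegalMonitorStateException in the trace is what Object.wait() throws when the calling thread does not own the object's monitor. The sketch below reproduces that failure and shows the kind of wrapping the report asks for. It is an illustration only, not the IndexCache code: the Entry class and all method names are hypothetical.

```java
/** Sketch: Object.wait() must be called while holding the object's monitor. */
final class MonitorWaitSketch {
    /** Hypothetical stand-in for a cache entry that another thread fills in. */
    static final class Entry {
        private boolean ready;

        /** Broken variant: wait() without owning the monitor throws IllegalMonitorStateException. */
        void awaitUnsynchronized() throws InterruptedException {
            while (!ready) {
                this.wait();
            }
        }

        /** Fixed variant: the wait loop runs inside a synchronized block on the same object. */
        void awaitSynchronized() throws InterruptedException {
            synchronized (this) {
                while (!ready) {
                    this.wait();
                }
            }
        }

        /** Publishes the entry and wakes up any waiting readers. */
        void markReady() {
            synchronized (this) {
                ready = true;
                this.notifyAll();
            }
        }
    }

    /** Returns true if the unsynchronized wait fails the way the stack trace shows. */
    static boolean unsynchronizedWaitFails() {
        try {
            new Entry().awaitUnsynchronized();
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns true if the synchronized wait returns once another thread marks the entry ready. */
    static boolean synchronizedWaitCompletes() {
        Entry entry = new Entry();
        Thread filler = new Thread(entry::markReady, "filler-sketch");
        filler.start();
        try {
            entry.awaitSynchronized();
            filler.join();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("unsynchronized wait fails: " + unsynchronizedWaitFails());
        System.out.println("synchronized wait completes: " + synchronizedWaitCompletes());
    }
}
```

Checking the ready flag in a loop under the same lock that markReady takes also guards against spurious wakeups and against a notification that arrives before the reader starts waiting.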