[jira] [Commented] (MAPREDUCE-5452) NPE in TaskID toString when default constructor is used

2013-12-15 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848798#comment-13848798
 ] 

Andrey Klochkov commented on MAPREDUCE-5452:


This actually leads to non-minor issues in downstream projects; see HIVE-4216.

 NPE in TaskID toString when default constructor is used
 ---

 Key: MAPREDUCE-5452
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5452
 Project: Hadoop Map/Reduce
  Issue Type: Bug
Affects Versions: 2.0.5-alpha
Reporter: Brock Noland
Priority: Minor

 If you call TaskID.toString() on an instance created with the default 
 constructor, toString() throws an NPE because taskType is null.
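
 The failure mode described above can be sketched roughly as follows. This is 
 only an illustration under stated assumptions, not the actual Hadoop source: 
 SketchTaskId, SketchTaskType, unsafeToString() and safeToString() are 
 hypothetical names, and the placeholder-based guard is just one possible way 
 to avoid the NPE.

```java
// Hypothetical stand-ins for illustration; NOT the real Hadoop TaskID/TaskType.
class SketchTaskId {
    private SketchTaskType type; // left null by the default constructor
    private int id;

    SketchTaskId() { }

    SketchTaskId(SketchTaskType type, int id) {
        this.type = type;
        this.id = id;
    }

    // Dereferences the type unconditionally, so it throws a
    // NullPointerException on a default-constructed instance.
    String unsafeToString() {
        return "task_" + Character.toLowerCase(type.name().charAt(0)) + "_" + id;
    }

    // One possible guard: print a placeholder when the type is unset.
    String safeToString() {
        String code = (type == null)
                ? "?"
                : String.valueOf(Character.toLowerCase(type.name().charAt(0)));
        return "task_" + code + "_" + id;
    }

    public static void main(String[] args) {
        SketchTaskId empty = new SketchTaskId();
        System.out.println(empty.safeToString());
        try {
            empty.unsafeToString();
        } catch (NullPointerException e) {
            System.out.println("unsafeToString() threw an NPE");
        }
    }
}

// Hypothetical task type enum used by the sketch above.
enum SketchTaskType { MAP, REDUCE }
```

 Whether a placeholder, an empty string, or an explicit exception is the right 
 behaviour for an uninitialized ID is a design choice that this sketch does not 
 settle.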



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-30 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-3860:
---

Attachment: MAPREDUCE-3860--n4.patch

Jonathan,
The logs don't provide much information on why the tests fail. From your 
description, the tests seem to hang indefinitely, so printing thread dumps on 
test timeouts would probably help. I'm attaching a patch which modifies Rumen's 
pom.xml by adding a JUnit listener that prints thread dumps. I could not 
reproduce any failures in the Rumen tests; I tried 4 different machines (OS X, 
CentOS, Fedora on h/w nodes, and RHEL on a VM). Please reproduce the failures 
in your environment one more time and attach the Maven console output and all 
Surefire logs (not just *-output.txt). Thanks for working on this. 
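
For reference, the thread-dump part of such a listener can be sketched with the 
JDK alone. This is not the listener from the attached patch: ThreadDumpSketch 
and dumpAllThreads() are hypothetical names, and the wiring into JUnit (for 
example, calling the helper from a listener's failure callback) is left out.

```java
import java.util.Map;

// Hypothetical helper; the listener in the attached patch may look different.
class ThreadDumpSketch {

    // Builds a textual dump of every live thread and its current stack.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" daemon=").append(t.isDaemon())
              .append(" state=").append(t.getState())
              .append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print to stderr so the dump ends up in the console/Surefire output.
        System.err.print(dumpAllThreads());
    }
}
```

A dump taken at the moment a test times out shows which threads are blocked and 
where, which is usually enough to tell a hang from a merely slow test.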

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, 
 MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860--n4.patch, 
 MAPREDUCE-3860.patch, 
 org.apache.hadoop.tools.rumen.TestRumenAnonymization-output.txt, 
 org.apache.hadoop.tools.rumen.TestRumenJobTraces-output.txt, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-30 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809577#comment-13809577
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


Also, it could be that the timeouts I set in the tests are still too low for 
you, if your machine is that slow. Can you increase them by up to an order of 
magnitude to check that? 

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, 
 MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860--n4.patch, 
 MAPREDUCE-3860.patch, 
 org.apache.hadoop.tools.rumen.TestRumenAnonymization-output.txt, 
 org.apache.hadoop.tools.rumen.TestRumenJobTraces-output.txt, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-10-30 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980--n8.patch

Attaching rebased patch.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n8.patch, 
 MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, 
 the tests can be run faster.





[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-10-30 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13809636#comment-13809636
 ] 

Andrey Klochkov commented on MAPREDUCE-4980:


The build failed due to OOM while processing native code. Not related to the 
patch.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n7.patch, MAPREDUCE-4980--n8.patch, 
 MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, 
 the tests can be run faster.





[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-28 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-3860:
---

Attachment: MAPREDUCE-3860--n3.patch

From the logs, the failures in the Linux environment were caused by test 
timeouts being too low. I'm attaching a patch which fixes that.

As for the failures in the Mac environment, I see that all M/R jobs failed 
there. I saw similar issues when running tests without JAVA_HOME set. I can't 
find anything more in the logs. 

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, 
 MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860.patch, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-28 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13807693#comment-13807693
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


Jonathan, thanks for testing this. Can you please attach the Surefire logs? I 
still can't see a possible reason for the failures you get. I just ran the 3 
commands you mentioned, and all three passed on my OS X machine with JDK 7. I'm 
not trying to use the "works for me" argument, but I can't reproduce this, so 
the logs would be really helpful. 

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: linux-surefire-reports.tar, mac-surfire-reports.tar, 
 MAPREDUCE-3860--n2.patch, MAPREDUCE-3860--n3.patch, MAPREDUCE-3860.patch, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-25 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805741#comment-13805741
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


Jonathan, I'm using exactly that setup: OS X and JDK 7. Can you please give 
more information on the failures?

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-25 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13805787#comment-13805787
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


I still can't reproduce the failures; I tried Linux x86_64 with Java 1.7.0_13 
and OS X with Java 1.7.0_17. Please attach the Surefire logs.

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-10-22 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: (was: MAPREDUCE-4980--n7.patch)

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, 
 the tests can be run faster.





[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-10-22 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980--n7.patch

Rebased the patch.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980--n7.patch, MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, 
 the tests can be run faster.





[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-21 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800884#comment-13800884
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


Ravi,
Well, to begin with, the majority of Hadoop unit tests are not true unit tests; 
I mean all the mini-cluster-based tests and similar ones. Still, I think that 
with a mature and slowly evolving codebase, and with priorities shifted to 
maturity rather than quick changes, it makes sense to have such tests, 
especially when the alternative is to have none. Currently Rumen's code 
coverage by tests is almost zero, and this has been the case for a long time. 
Given that Rumen is probably one of the slowest-changing parts of the codebase, 
I think having those old tests, which are indeed based on manually generated 
static files, is better than having none.

As I understand it, some of the tests, like 
{{TestRumenJobTraces.testHadoop20JHParser}}, use static pre-generated files to 
test compatibility with older versions of Hadoop, while {{TestRumenJobTraces}} 
also has {{testCurrentJHParser}}, which does use a real job to generate logs 
and then parses them; i.e., this is the test which works with the current 
version of the codebase. Having said that, I agree that improving these tests 
to use mini clusters as much as possible, instead of pre-generated files, is 
the proper way to improve the tests further. The purpose of this task is to 
bring back the old tests, which as I understand it were broken by the switch to 
YARN and were removed just to fix the builds quickly.

On the questions about {{TestRumenAnonymization}}:
1. It's waiting for 100 *milli*seconds, not 100 seconds, which is tolerable. 
2. No, having an instance of {{Configuration}} as a field in the class cannot 
affect the tests in this way. JUnit creates a dedicated instance of the class 
for each test execution.
3. I'm not sure that having a null in the username is a valid anonymization. Is 
it?
4. Why should those temp dirs be deleted after test runs? Deleting them makes 
troubleshooting more difficult. Those dirs do not interfere with each other or 
with other tests.
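
To illustrate point 2, here is a plain-Java sketch of the per-test 
instantiation idea. It does not use JUnit itself: PerTestInstanceSketch and 
FieldStateSketch are hypothetical names, the list field merely stands in for 
the {{Configuration}} field, and the reflection loop only mimics what a test 
runner does when it creates a fresh object for every test method.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini "runner": a fresh instance for every test-like method.
class PerTestInstanceSketch {

    static List<Integer> runEachOnFreshInstance(Class<?> cls) {
        List<Integer> results = new ArrayList<>();
        try {
            for (Method m : cls.getDeclaredMethods()) {
                if (!m.getName().startsWith("test")) {
                    continue;
                }
                Constructor<?> ctor = cls.getDeclaredConstructor();
                ctor.setAccessible(true);
                m.setAccessible(true);
                Object fresh = ctor.newInstance(); // new object per method
                results.add((Integer) m.invoke(fresh));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        // Each method sees an empty field, whatever order the methods run in.
        System.out.println(runEachOnFreshInstance(FieldStateSketch.class));
    }
}

// Hypothetical test-like class; the field plays the role of the shared field.
class FieldStateSketch {
    private final List<String> conf = new ArrayList<>();

    public int testFirst() {
        conf.add("a");
        return conf.size();
    }

    public int testSecond() {
        conf.add("b");
        return conf.size();
    }
}
```

Because every method runs on its own object, a mutation of the instance field 
in one method is not visible to the other. Static fields would be a different 
matter.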

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-17 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798459#comment-13798459
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


Ravi,
Can you please provide more info? I can't reproduce it in my environment. 
Surefire logs would be fine.
BTW my name is Andrey :-)

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-10-17 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980--n7.patch

Rebasing the patch.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980--n7.patch, MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, 
 the tests can be run faster.





[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-17 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-3860:
---

Attachment: MAPREDUCE-3860--n2.patch

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-17 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13798598#comment-13798598
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


Ravi,
I reproduced the bug in {{testProcessInputArgument}} on a Linux machine, will 
attach a fixed patch shortly. Also I tried to reproduce the job failure in 
{{testCurrentJHParser}} and made a few runs of the test on 2 machines with 2.x 
and 3.x Linux kernels and different flavors of JDK7, but all runs succeeded. 
Worked on OSX too. Didn't see any issues caused by OOM. If it happens again, 
try giving more memory to Maven.

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860--n2.patch, MAPREDUCE-3860.patch, 
 rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-16 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797413#comment-13797413
 ] 

Andrey Klochkov commented on MAPREDUCE-3860:


Ravi, thanks a lot for this summary. I should have written this myself when 
submitting the patch. Now answering your questions.

The old tests were deleted by the following commit:
{code}
commit 00ac37838c4a55a2b855983e9730cbd26e6f3477
Author: Mahadev Konar maha...@apache.org
Date:   Sat Jan 21 01:15:24 2012 +

MAPREDUCE-3705. ant build fails on 0.23 branch. (Thomas Graves via mahadev)

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1234227 
13f79535-47bb-0310-9956-ffa450edef68
{code}

So I got the tests from the preceding commit, namely:

{code}
commit 6b8a6a701a972f9528b9a2672401db51a31f52fb
Author: Mahadev Konar maha...@apache.org
Date:   Sat Jan 21 00:53:02 2012 +

MAPREDUCE-3549. write api documentation for web service apis for RM, NM, 
mapreduce app master, and job history server (Thoma

git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1234222 
13f79535-47bb-0310-9956-ffa450edef68
{code}

The tests are under 
{{hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/tools/rumen}}. The 
data is under {{hadoop-mapreduce-project/src/test/tools/data/rumen}}.

The {{rumen-test-data.tar.gz}} file contains the data files that are in 
binary form, so I couldn't put those into a patch file. The non-binary 
(non-gzipped) files are in the patch file itself. So to apply the changes 
properly, you need to apply the patch and also untar 
{{rumen-test-data.tar.gz}}. 

I did not create the {{sample-conf.file.new.xml}} file; it existed in the old 
tests. Also see {{TestRumenAnonymization}} and {{TestRumenFolder}} there.

I did change {{job-tracker-logs-topology-output}} to make the tests succeed. As 
I understand it, this is needed because newer versions of Rumen do time 
adjustment, so the expected data in {{job-tracker-logs-topology-output}} is not 
what Rumen currently produces. See the {{Folder.adjustJobTimes}} method. 

The change in {{WordList}} is actually a bug fix. I figured it doesn't make 
much sense to file a separate Jira for that. Sometimes, when a WordList 
instance is being deserialized from disk, the size attribute is read after the 
words themselves, and deserializing the size then clears the words list (a bug 
in deserialization).
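
The order dependence can be sketched as follows. This is an illustration of 
the general shape of such a bug, not the actual {{WordList}} code or the fix in 
the patch: WordListSketch, setSizeBuggy() and setSizeFixed() are hypothetical 
names, and the calls stand in for a deserializer setting attributes in 
whatever order they appear in the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class for illustration; NOT the real Rumen WordList.
class WordListSketch {
    private final List<String> words = new ArrayList<>();
    private int size;

    // Buggy variant: setting the size also resets the list, which loses
    // the words if they were already deserialized.
    void setSizeBuggy(int size) {
        this.size = size;
        words.clear();
    }

    // One possible fix: record the size without touching the words.
    void setSizeFixed(int size) {
        this.size = size;
    }

    void setWords(List<String> newWords) {
        words.clear();
        words.addAll(newWords);
    }

    List<String> getWords() {
        return words;
    }

    int getSize() {
        return size;
    }

    public static void main(String[] args) {
        WordListSketch buggy = new WordListSketch();
        buggy.setWords(Arrays.asList("alpha", "beta")); // words read first
        buggy.setSizeBuggy(2);                          // size read afterwards
        System.out.println("buggy words left: " + buggy.getWords().size());

        WordListSketch fixed = new WordListSketch();
        fixed.setWords(Arrays.asList("alpha", "beta"));
        fixed.setSizeFixed(2);
        System.out.println("fixed words left: " + fixed.getWords().size());
    }
}
```

The bug only shows up when the words happen to be read before the size, which 
matches the "sometimes" in the description above.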

I did make many changes in {{TestRumenJobTraces}} and 
{{TestRumenAnonymization}}; that was required due to changes in Hadoop itself.

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-5387) Implement Signal.TERM on Windows

2013-10-14 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794770#comment-13794770
 ] 

Andrey Klochkov commented on MAPREDUCE-5387:


I uploaded the patch which implements approximations for both QUIT and TERM for 
Windows, via console event handlers. See [YARN-445].

 Implement Signal.TERM on Windows
 

 Key: MAPREDUCE-5387
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5387
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 3.0.0, 1-win, 2.1.0-beta
Reporter: Ivan Mitic
Assignee: Ivan Mitic

 Signal.TERM is currently not supported by Hadoop on the Windows platform. 
 Tracking Jira for the problem. 
 A couple of things to keep in mind:
  - Support for process groups (JobObjects on Windows)
  - Solution should work for both java and other streaming Hadoop apps





[jira] [Assigned] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit teststoo

2013-10-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov reassigned MAPREDUCE-3860:
--

Assignee: Andrey Klochkov

 [Rumen] Bring back the removed Rumen unit teststoo
 --

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov

 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-5387) Implement Signal.TERM on Windows

2013-10-11 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793117#comment-13793117
 ] 

Andrey Klochkov commented on MAPREDUCE-5387:


Indeed, [YARN-445] is related. Thanks to [~cnauroth] for pointing that out. I 
think I can put up a patch which sends Ctrl+C to all processes in the job 
object and makes YARN use it as an analog of the TERM signal when running on 
Windows. That would be similar to how it's done with Ctrl+Break in [YARN-445].

 Implement Signal.TERM on Windows
 

 Key: MAPREDUCE-5387
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5387
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
Affects Versions: 3.0.0, 1-win, 2.1.0-beta
Reporter: Ivan Mitic
Assignee: Ivan Mitic

 Signal.TERM is currently not supported by Hadoop on the Windows platform. 
 Tracking Jira for the problem. 
 A couple of things to keep in mind:
  - Support for process groups (JobObjects on Windows)
  - Solution should work for both java and other streaming Hadoop apps





[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit teststoo

2013-10-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-3860:
---

Attachment: MAPREDUCE-3860.patch
rumen-test-data.tar.gz

Attaching a patch and a tarball with gzip'ped test data. The robot wouldn't be 
able to run tests.

 [Rumen] Bring back the removed Rumen unit teststoo
 --

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit teststoo

2013-10-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-3860:
---

Target Version/s: 3.0.0, 2.3.0
  Status: Patch Available  (was: Open)

 [Rumen] Bring back the removed Rumen unit teststoo
 --

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Updated] (MAPREDUCE-3860) [Rumen] Bring back the removed Rumen unit tests

2013-10-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-3860:
---

Summary: [Rumen] Bring back the removed Rumen unit tests  (was: [Rumen] 
Bring back the removed Rumen unit teststoo)

 [Rumen] Bring back the removed Rumen unit tests
 ---

 Key: MAPREDUCE-3860
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3860
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: tools/rumen
Reporter: Ravi Gummadi
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-3860.patch, rumen-test-data.tar.gz


 MAPREDUCE-3582 did not move some of the Rumen unit tests to the new folder 
 and then MAPREDUCE-3705 deleted those unit tests. These Rumen unit tests need 
 to be brought back:
 TestZombieJob.java
 TestRumenJobTraces.java
 TestRumenFolder.java
 TestRumenAnonymization.java
 TestParsedLine.java
 TestConcurrentRead.java





[jira] [Commented] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db

2013-10-08 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789457#comment-13789457
 ] 

Andrey Klochkov commented on MAPREDUCE-5102:


Thanks, Nathan. Yes, DriverForTest is not needed anymore, and TestSplitters 
will fail on float splitters until the bug is fixed.

 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 

 Key: MAPREDUCE-5102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102
 Project: Hadoop Map/Reduce
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-5102-branch-0.23.patch, 
 MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, 
 MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-branch-2--n4.patch, 
 MAPREDUCE-5102-branch-2--n5.patch, MAPREDUCE-5102-trunk--n3.patch, 
 MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk--n4.patch, 
 MAPREDUCE-5102-trunk--n5.patch, MAPREDUCE-5102-trunk.patch, 
 MAPREDUCE-5102-trunk-v1.patch


 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 patch MAPREDUCE-5102-trunk.patch for trunk and branch-2
 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only





[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db

2013-10-07 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5102:
---

Attachment: MAPREDUCE-5102-trunk--n4.patch
MAPREDUCE-5102-branch-2--n4.patch

Improved patches according to the last comment. Also removed tests which do not 
have much value but add unneeded rigidity to the code.

 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 

 Key: MAPREDUCE-5102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102
 Project: Hadoop Map/Reduce
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-5102-branch-0.23.patch, 
 MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, 
 MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-trunk--n3.patch, 
 MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk.patch, 
 MAPREDUCE-5102-trunk-v1.patch


 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 patch MAPREDUCE-5102-trunk.patch for trunk and branch-2
 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only





[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db

2013-10-07 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5102:
---

Attachment: MAPREDUCE-5102-trunk--n4.patch
MAPREDUCE-5102-branch-2--n4.patch

 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 

 Key: MAPREDUCE-5102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102
 Project: Hadoop Map/Reduce
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-5102-branch-0.23.patch, 
 MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, 
 MAPREDUCE-5102-branch-2--n4.patch, MAPREDUCE-5102-branch-2--n4.patch, 
 MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk--n4.patch, 
 MAPREDUCE-5102-trunk--n4.patch, MAPREDUCE-5102-trunk.patch, 
 MAPREDUCE-5102-trunk-v1.patch


 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 patch MAPREDUCE-5102-trunk.patch for trunk and branch-2
 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only





[jira] [Assigned] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db

2013-10-03 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov reassigned MAPREDUCE-5102:
--

Assignee: Andrey Klochkov  (was: Aleksey Gorshkov)

 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 

 Key: MAPREDUCE-5102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102
 Project: Hadoop Map/Reduce
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-5102-branch-0.23.patch, 
 MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-trunk.patch, 
 MAPREDUCE-5102-trunk-v1.patch


 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 patch MAPREDUCE-5102-trunk.patch for trunk and branch-2
 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only





[jira] [Updated] (MAPREDUCE-5102) fix coverage org.apache.hadoop.mapreduce.lib.db and org.apache.hadoop.mapred.lib.db

2013-10-03 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5102:
---

Attachment: MAPREDUCE-5102-trunk--n3.patch
MAPREDUCE-5102-branch-2--n3.patch

Updated patches for trunk and branch-2 to work with both JDK 6 and 7. Please 
disregard the patch for 0.23; we're targeting trunk and branch-2 only.

 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 

 Key: MAPREDUCE-5102
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5102
 Project: Hadoop Map/Reduce
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-5102-branch-0.23.patch, 
 MAPREDUCE-5102-branch-0.23-v1.patch, MAPREDUCE-5102-branch-2--n3.patch, 
 MAPREDUCE-5102-trunk--n3.patch, MAPREDUCE-5102-trunk.patch, 
 MAPREDUCE-5102-trunk-v1.patch


 fix coverage  org.apache.hadoop.mapreduce.lib.db and 
 org.apache.hadoop.mapred.lib.db
 patch MAPREDUCE-5102-trunk.patch for trunk and branch-2
 patch MAPREDUCE-5102-branch-0.23.patch for branch-0.23 only





[jira] [Resolved] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov resolved MAPREDUCE-5501.


Resolution: Won't Fix

This is caused by a bug in MiniYARNCluster, reported in YARN-1183.

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov
 Attachments: hanging-rmcontainer-allocator.stdout, 
 hanging-rmcontainer-allocator.syslog


 After running MR job client tests, many MRAppMaster processes stay alive. The 
 reason seems to be that the RMContainer Allocator thread ignores 
 InterruptedException and keeps retrying:
 {code}
 2013-09-09 18:52:07,505 WARN [RMCommunicator Allocator] org.apache.hadoop.util.ThreadUtil: interrupted while sleeping
 java.lang.InterruptedException: sleep interrupted
     at java.lang.Thread.sleep(Native Method)
     at org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:149)
     at com.sun.proxy.$Proxy29.allocate(Unknown Source)
     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:154)
     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:553)
     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
     at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
     at java.lang.Thread.run(Thread.java:680)
 2013-09-09 18:52:37,639 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
 2013-09-09 18:52:38,640 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
 {code}
 It takes 6 minutes for the processes to die, and this causes various issues 
 with tests which use the same DFS dir.
 {code}
 2013-09-09 22:26:47,179 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Could not contact RM after 36 milliseconds.
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not contact RM after 36 milliseconds.
     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:563)
     at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
     at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
     at java.lang.Thread.run(Thread.java:680)
 {code}
 Will attach a thread dump separately.
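
 To illustrate the suspected cause described above: a sleep helper that 
 swallows InterruptedException lets a loop keep running after a shutdown 
 request, while one that restores the interrupt flag lets the loop exit. The 
 following is a minimal plain-Java sketch, not the Hadoop code; InterruptDemo 
 and every method name in it are hypothetical and exist only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; this is not Hadoop's ThreadUtil.
class InterruptDemo {

    // Sleeps for at least `millis`, swallowing any interrupt. An interrupt
    // only cuts one Thread.sleep call short; the caller never sees it.
    static void sleepAtLeastIgnoringInterrupts(long millis) {
        long deadline = System.currentTimeMillis() + millis;
        long remaining;
        while ((remaining = deadline - System.currentTimeMillis()) > 0) {
            try {
                Thread.sleep(remaining);
            } catch (InterruptedException e) {
                // Swallowed: the shutdown request is lost here.
            }
        }
    }

    // Interrupt-aware variant: restores the flag and reports the interrupt.
    static boolean sleepUnlessInterrupted(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Runs a 3-iteration "heartbeat" loop in a worker thread that has been
    // interrupted before its first sleep, and returns how many iterations
    // the worker still completed.
    static int runHeartbeats(boolean honorInterrupt) {
        AtomicInteger completed = new AtomicInteger();
        Thread worker = new Thread(() -> {
            // Simulate a shutdown request arriving before the first sleep.
            Thread.currentThread().interrupt();
            for (int i = 0; i < 3; i++) {
                if (honorInterrupt) {
                    if (!sleepUnlessInterrupted(20)) {
                        return; // stop promptly on shutdown
                    }
                } else {
                    sleepAtLeastIgnoringInterrupts(20);
                }
                completed.incrementAndGet();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("honoring interrupt: " + runHeartbeats(true));
        System.out.println("ignoring interrupt: " + runHeartbeats(false));
    }
}
```

 In this sketch the interrupt-aware loop completes no iterations, while the 
 interrupt-ignoring loop completes all three, which mirrors the behaviour the 
 report attributes to the allocator thread.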

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov reassigned MAPREDUCE-5501:
--

Assignee: Andrey Klochkov

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov
Assignee: Andrey Klochkov
 Attachments: hanging-rmcontainer-allocator.stdout, 
 hanging-rmcontainer-allocator.syslog





[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-09-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Status: Patch Available  (was: Open)

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, 
 the tests can run faster.



[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-09-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980--n6.patch

Rebased the patch. While testing, I made an additional fix in MiniMRYarnCluster 
for a problem that sometimes leads to an incorrectly configured history server 
address. Also, a fix submitted separately in YARN-1183 makes builds much more 
stable.



 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980--n6.patch, 
 MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, 
 the tests can run faster.



[jira] [Created] (MAPREDUCE-5501) RMContainer Allocator loops forever after cluster shutdown in tests

2013-09-10 Thread Andrey Klochkov (JIRA)
Andrey Klochkov created MAPREDUCE-5501:
--

 Summary: RMContainer Allocator loops forever after cluster 
shutdown in tests
 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov


After running MR job client tests, many MRAppMaster processes stay alive. The 
reason seems to be that the RMContainer Allocator thread ignores 
InterruptedException and keeps retrying:

{code}
2013-09-09 18:52:07,505 WARN [RMCommunicator Allocator] org.apache.hadoop.util.ThreadUtil: interrupted while sleeping
java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:149)
    at com.sun.proxy.$Proxy29.allocate(Unknown Source)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:154)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:553)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
    at java.lang.Thread.run(Thread.java:680)
2013-09-09 18:52:37,639 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-09-09 18:52:38,640 INFO [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Retrying connect to server: dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
{code}

It takes 6 minutes for the processes to die, and this causes various issues 
with tests which use the same DFS dir.

{code}
2013-09-09 22:26:47,179 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error communicating with RM: Could not contact RM after 36 milliseconds.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not contact RM after 36 milliseconds.
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:563)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
    at java.lang.Thread.run(Thread.java:680)
{code}

Will attach a thread dump separately.
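
The log above names the retry policy 
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS). As a 
rough plain-Java sketch of what a bounded retry with a fixed sleep can look 
like (this is not Hadoop's implementation; the class FixedSleepRetry, the 
method retryWithFixedSleep, and the convention of one initial attempt followed 
by up to maxRetries retries are assumptions made for illustration):

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; not Hadoop's RetryPolicies.
class FixedSleepRetry {

    // Runs `attempt` once, then retries up to `maxRetries` more times,
    // sleeping `sleepMillis` between failed attempts. Returns true as soon
    // as an attempt succeeds, false if all attempts fail or the thread is
    // interrupted while sleeping.
    static boolean retryWithFixedSleep(BooleanSupplier attempt,
                                       int maxRetries,
                                       long sleepMillis) {
        for (int retries = 0; ; retries++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            if (retries >= maxRetries) {
                return false; // retry budget exhausted
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Unlike the behaviour in the report, stop on interrupt
                // and keep the flag set for the caller.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Succeeds on the third call; a 1 ms sleep keeps the demo fast.
        boolean ok = retryWithFixedSleep(() -> ++calls[0] == 3, 10, 1);
        System.out.println("succeeded=" + ok + " after " + calls[0] + " calls");
    }
}
```

Note that a bounded policy like this limits only one call; a caller that 
invokes it again on every heartbeat can keep a process alive much longer than 
a single retry budget.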



[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-10 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5501:
---

Summary: RMContainer Allocator does not stop when cluster shutdown is 
performed in tests  (was: RMContainer Allocator loops forever after cluster 
shutdown in tests)

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov




[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-10 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5501:
---

Attachment: hanging-rmcontainer-allocator.stdout

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov
 Attachments: hanging-rmcontainer-allocator.stdout, 
 hanging-rmcontainer-allocator.syslog





[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-10 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5501:
---

Attachment: hanging-rmcontainer-allocator.syslog

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov
 Attachments: hanging-rmcontainer-allocator.stdout, 
 hanging-rmcontainer-allocator.syslog





[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-10 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5501:
---

Attachment: handing-rmcontainer-allocator.syslog

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov

 After running MR job client tests many MRAppMaster processes stay alive. The 
 reason seems that RMContainer Allocator thread ignores InterruptedException 
 and keeps retrying:
 {code}
 2013-09-09 18:52:07,505 WARN [RMCommunicator Allocator] 
 org.apache.hadoop.util.ThreadUtil: interrupted while sleeping
 java.lang.InterruptedException: sleep interrupted
 at java.lang.Thread.sleep(Native Method)
 at 
 org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
 at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:149)
 at com.sun.proxy.$Proxy29.allocate(Unknown Source)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerRequestor.makeRemoteRequest(RMContainerRequestor.java:154)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:553)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
 at java.lang.Thread.run(Thread.java:680)
 2013-09-09 18:52:37,639 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.ipc.Client: Retrying connect to server: 
 dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 0 time(s); 
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
 sleepTime=1 SECONDS)
 2013-09-09 18:52:38,640 INFO [RMCommunicator Allocator] 
 org.apache.hadoop.ipc.Client: Retrying connect to server: 
 dhcpx-197-141.corp.yahoo.com/10.73.197.141:61163. Already tried 1 time(s); 
 retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
 sleepTime=1 SECONDS)
 {code}
 It takes  6 minutes for the processes to die, and this causes various issues 
 with tests which use the same DFS dir. 
 {code}
 2013-09-09 22:26:47,179 ERROR [RMCommunicator Allocator] 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Error 
 communicating with RM: Could not contact RM after 36 milliseconds.
 org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Could not contact RM 
 after 36 milliseconds.
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:563)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:219)
 at 
 org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:236)
 at java.lang.Thread.run(Thread.java:680)
 {code}
 Will attach a thread dump separately. 
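
 The retry behaviour described above can be illustrated with a minimal sketch. 
 This is not the Hadoop code: the class and method names below are hypothetical, 
 and the sketch only assumes that the sleep between retries catches 
 InterruptedException and carries on, as the sleepAtLeastIgnoreInterrupts frame 
 in the trace suggests. The second method shows one common alternative, which 
 restores the interrupt flag and leaves the loop.

```java
// Hypothetical illustration; names are not taken from Hadoop.
class InterruptDemo {

    // Swallows the interrupt, so the retry loop runs to completion
    // even when the thread has been asked to stop.
    static int retryIgnoringInterrupts(int maxRetries, long sleepMs) {
        int attempts = 0;
        for (int i = 0; i < maxRetries; i++) {
            attempts++;
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                // Interrupt is dropped here: the loop keeps retrying.
            }
        }
        return attempts;
    }

    // Treats the interrupt as a shutdown request: restores the flag
    // for callers and stops retrying.
    static int retryHonoringInterrupts(int maxRetries, long sleepMs) {
        int attempts = 0;
        for (int i = 0; i < maxRetries; i++) {
            attempts++;
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();
        System.out.println("ignoring: " + retryIgnoringInterrupts(3, 1L));
        Thread.currentThread().interrupt();
        System.out.println("honoring: " + retryHonoringInterrupts(3, 1L));
    }
}
```

 With the first variant, a shutdown that interrupts the thread has no effect 
 until the retry policy itself gives up, which would match the delay reported 
 here.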

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-10 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5501:
---

Attachment: (was: handing-rmcontainer-allocator.syslog)

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov



[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-10 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5501:
---

Attachment: handing-rmcontainer-allocator.stdout

attaching thread dump

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov



[jira] [Updated] (MAPREDUCE-5501) RMContainer Allocator does not stop when cluster shutdown is performed in tests

2013-09-10 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-5501:
---

Attachment: (was: handing-rmcontainer-allocator.stdout)

 RMContainer Allocator does not stop when cluster shutdown is performed in 
 tests
 ---

 Key: MAPREDUCE-5501
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5501
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: trunk
Reporter: Andrey Klochkov



[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-09-06 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Status: Open  (was: Patch Available)

The patch needs to be rebased; I'm working on it.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980.patch


 The Maven Surefire plugin supports a parallel testing feature. By using it, the 
 tests can be run faster.



[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-05-24 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1303#comment-1303
 ] 

Andrey Klochkov commented on MAPREDUCE-4980:


Vinod, can you please point me to downstream components that depend on 
MiniMRClientClusterFactory? I can't find any by quickly grepping through Core, 
HBase, Pig, and Hive.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980.patch




[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-05-24 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980--n5.patch

rebasing and refreshing the patch

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Andrey Klochkov
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980--n5.patch, MAPREDUCE-4980.patch




[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-04-17 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980--n4.patch

Updating the patch according to changes in trunk

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980.patch




[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-04-17 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13633849#comment-13633849
 ] 

Andrey Klochkov commented on MAPREDUCE-4980:


The failure is expected due to dependency on HDFS-4491

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980--n4.patch, MAPREDUCE-4980.patch




[jira] [Resolved] (MAPREDUCE-4534) Test failures with Container .. is running beyond virtual memory limits

2013-04-05 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov resolved MAPREDUCE-4534.


Resolution: Duplicate

 Test failures with Container .. is running beyond virtual memory limits
 -

 Key: MAPREDUCE-4534
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4534
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: test
Affects Versions: 0.23.3
Reporter: Ilya Katsov

 Tests 
 org.apache.hadoop.tools.TestHadoopArchives.{testRelativePath,testPathWithSpaces}
  fail with the following message:
 {code}
 Container [pid=7785,containerID=container_1342495768864_0001_01_01] is 
 running beyond virtual memory limits. Current usage: 143.6mb of 1.5gb 
 physical memory used; 3.4gb of 3.1gb virtual memory used. Killing container.
 Dump of the process-tree for container_1342495768864_0001_01_01 :
   |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
 SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
   |- 7797 7785 7785 7785 (java) 573 38 3517018112 36421 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
   |- 7785 7101 7785 7785 (bash) 1 1 108605440 332 /bin/bash -c 
 /usr/java/jdk1.6.0_33/jre/bin/java 
 -Dlog4j.configuration=container-log4j.properties 
 -Dyarn.app.mapreduce.container.log.dir=/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01
  -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
 -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
 1>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stdout
 2>/var/lib/jenkins/workspace/Hadoop_gd-branch0.23_integration/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-logDir-nm-0_3/application_1342495768864_0001/container_1342495768864_0001_01_01/stderr
 {code}
 The problem does not reproduce reliably, but setting the MALLOC_ARENA_MAX 
 environment variable resolves it.
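
 The numbers in the message above can be checked with a short worked example. 
 This is a sketch, not NodeManager code: the class and method names are 
 hypothetical, and the ratio of 2.1 is an assumption (it is the usual default 
 of yarn.nodemanager.vmem-pmem-ratio; the report does not state the value in 
 use). The per-process figures are the VMEM_USAGE(BYTES) values from the 
 process-tree dump.

```java
// Hypothetical worked example of the virtual memory check.
class VmemCheck {

    // Virtual memory limit derived from the physical limit and a ratio.
    static long vmemLimitBytes(long pmemLimitBytes, double ratio) {
        return (long) (pmemLimitBytes * ratio);
    }

    // Sum of the virtual memory used by every process in the tree.
    static long totalBytes(long... usages) {
        long total = 0;
        for (long u : usages) {
            total += u;
        }
        return total;
    }

    public static void main(String[] args) {
        long pmemLimit = 1536L * 1024 * 1024;         // 1.5gb physical limit from the log
        long limit = vmemLimitBytes(pmemLimit, 2.1);  // 2.1 is an assumed ratio
        long used = totalBytes(3517018112L, 108605440L); // java + bash from the dump
        System.out.println("limit=" + limit + " used=" + used
                + " over=" + (used > limit));
    }
}
```

 Under that assumption the limit is about 3.15 GiB and the two processes 
 together use about 3.38 GiB, which is consistent with the "3.4gb of 3.1gb" 
 figure in the message. Note that the JVM heap is capped at 1024m, so the 
 excess is virtual address space outside the heap.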



[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-29 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980--n3.patch

Updating the patch according to changes in trunk

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980.patch




[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-29 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13617813#comment-13617813
 ] 

Andrey Klochkov commented on MAPREDUCE-4980:


I believe something is wrong with the QA robot. The patch applies cleanly to 
the current trunk.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980--n3.patch, 
 MAPREDUCE-4980.patch




[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-15 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Issue Type: Test  (was: Improvement)

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Test
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch




[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: (was: HADOOP-9287--N2.patch)

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch




[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: HADOOP-9287--N2.patch

Patch is updated with a few additional fixes.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch




[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Attachment: MAPREDUCE-4980.patch

Attaching a patch that makes fixes similar to those introduced in 
HADOOP-9287 and HDFS-4491; it depends on those JIRAs. The patch introduces 
multi-fork execution for the hadoop-mapreduce-client-jobclient module only.

This patch, together with HADOOP-9287 and HDFS-4491, introduces multi-fork 
execution for the 3 Hadoop Core modules that take the majority of the time to 
test: hadoop-common, hadoop-hdfs and hadoop-mapreduce-client-jobclient.

Overview of changes:
1. A new profile, parallel-tests, is introduced in 
hadoop-mapreduce-client-jobclient/pom.xml.
2. Tests are refactored to use the PathUtils.getTestDir/getTestPath methods to 
get a directory/path for test data. Previously, each refactored test 
implemented this in a similar but slightly different way.
3. MiniMRClientClusterFactory is replaced with MiniMRClientClusterBuilder, which 
makes configuring clusters more flexible, in a way similar to 
MiniDFSCluster.Builder.
4. All usages of the deprecated class MiniMRCluster are replaced with 
MiniMRClientClusterBuilder. This is required to avoid FS contention when 
running mini MR clusters in parallel.
5. All usages of MiniDFSCluster constructors are replaced with 
MiniDFSCluster.Builder, for the same purpose as the previous item.

The changes were tested by comparing test stability against trunk in both 
serial and parallel modes. No additional flakiness was noticed. 
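
The idea behind item 2 above, giving every test its own data directory so that 
forks running in parallel do not share files, can be sketched as follows. This 
is not the PathUtils class from the patch: TestDirs and its method are 
hypothetical, and the use of a "test.build.data" system property with a 
"target/test/data" fallback is an assumption about how the base directory is 
supplied.

```java
import java.io.File;

// Hypothetical helper; not the PathUtils implementation from the patch.
class TestDirs {

    // Returns a directory unique to the given test class under a common base.
    static File getTestDir(Class<?> testClass) {
        // Assumed property name and fallback; the build may use different ones.
        String base = System.getProperty("test.build.data", "target/test/data");
        return new File(base, testClass.getSimpleName());
    }

    public static void main(String[] args) {
        // Two different test classes resolve to two different directories.
        System.out.println(getTestDir(String.class));
        System.out.println(getTestDir(Integer.class));
    }
}
```

Keying the directory on the test class keeps concurrently running test classes 
apart; a build could also give each fork a different base directory through the 
same property.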

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch




[jira] [Updated] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-11 Thread Andrey Klochkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Klochkov updated MAPREDUCE-4980:
---

Target Version/s: 3.0.0
  Status: Patch Available  (was: Open)

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch




[jira] [Commented] (MAPREDUCE-4980) Parallel test execution of hadoop-mapreduce-client-core

2013-03-11 Thread Andrey Klochkov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599286#comment-13599286
 ] 

Andrey Klochkov commented on MAPREDUCE-4980:


The build fails due to the dependency on HDFS-4491, which is not in trunk yet.

The patch affects a large number of tests; setting timeouts for all of them 
shouldn't be done as part of this patch.

 Parallel test execution of hadoop-mapreduce-client-core
 ---

 Key: MAPREDUCE-4980
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4980
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: test
Affects Versions: 3.0.0
Reporter: Tsuyoshi OZAWA
Assignee: Tsuyoshi OZAWA
 Attachments: MAPREDUCE-4980.1.patch, MAPREDUCE-4980.patch




[jira] [Created] (MAPREDUCE-4467) IndexCache failures due to missing synchronization

2012-07-20 Thread Andrey Klochkov (JIRA)
Andrey Klochkov created MAPREDUCE-4467:
--

 Summary: IndexCache failures due to missing synchronization
 Key: MAPREDUCE-4467
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4467
 Project: Hadoop Map/Reduce
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 0.23.2
Reporter: Andrey Klochkov


TestMRJobs.testSleepJob fails randomly due to a synchronization error in 
IndexCache:

{code}
2012-07-20 19:32:34,627 ERROR [New I/O server worker #2-1] 
mapred.ShuffleHandler (ShuffleHandler.java:exceptionCaught(528)) - Shuffle 
error: 
java.lang.IllegalMonitorStateException
at java.lang.Object.wait(Native Method)
at 
org.apache.hadoop.mapred.IndexCache.getIndexInformation(IndexCache.java:74)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.sendMapOutput(ShuffleHandler.java:471)
at 
org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:397)
at 
org.jboss.netty.handler.stream.ChunkedWriteHandler.handleUpstream(ChunkedWriteHandler.java:148)
at 
org.jboss.netty.handler.codec.http.HttpChunkAggregator.messageReceived(HttpChunkAggregator.java:116)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:302)
at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.unfoldAndfireMessageReceived(ReplayingDecoder.java:522)
at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.callDecode(ReplayingDecoder.java:506)
at 
org.jboss.netty.handler.codec.replay.ReplayingDecoder.messageReceived(ReplayingDecoder.java:443)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:274)
at 
org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:261)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:349)
at 
org.jboss.netty.channel.socket.nio.NioWorker.processSelectedKeys(NioWorker.java:280)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:200)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
{code}

A related issue is MAPREDUCE-4384. The change introduced there removed the 
synchronized keyword, and hence the info.wait() call fails. This call needs to 
be wrapped in a synchronized block.
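
The failure mode can be reproduced in isolation with a minimal sketch. This is 
not the IndexCache code: the Info class and the method names below are 
hypothetical stand-ins. Object.wait() throws IllegalMonitorStateException when 
the calling thread does not hold the object's monitor, and succeeds once the 
call is wrapped in a synchronized block on the same object.

```java
// Hypothetical illustration; Info stands in for the cached entry.
class WaitDemo {

    static class Info {
        boolean ready;
    }

    // Calls wait() without holding the monitor; returns true if that
    // raised IllegalMonitorStateException.
    static boolean waitWithoutMonitor(Info info) {
        try {
            info.wait(10);
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Waits inside a synchronized block, looping to guard against
    // spurious wakeups.
    static boolean awaitReady(Info info) {
        synchronized (info) {
            try {
                while (!info.ready) {
                    info.wait();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return info.ready;
        }
    }

    // Publishes the state change under the same monitor and wakes waiters.
    static void markReady(Info info) {
        synchronized (info) {
            info.ready = true;
            info.notifyAll();
        }
    }

    public static void main(String[] args) {
        Info info = new Info();
        System.out.println("unsynchronized wait threw: " + waitWithoutMonitor(info));
        new Thread(() -> markReady(info)).start();
        System.out.println("ready: " + awaitReady(info));
    }
}
```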

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira