[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168277#comment-13168277 ]

Hudson commented on HBASE-4965:

Integrated in HBase-TRUNK #2541 (See [https://builds.apache.org/job/HBase-TRUNK/2541/])
HBASE-4965 Monitor the open file descriptors and the threads counters during the unit tests
HBASE-4965 Monitor the open file descriptors and the threads counters during the unit tests

stack :
Files :
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/ResourceChecker.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/ResourceCheckerJUnitRule.java

stack :
Files :
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestAcidGuarantees.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestCompare.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestDrainingServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestFSTableDescriptorForceCreation.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestFullLogReconstruction.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestGlobalMemStoreSize.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestHBaseTestingUtility.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestHRegionLocation.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestHServerAddress.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestHServerInfo.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestInfoServers.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestKeyValue.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestMultiVersions.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestSerialization.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestServerName.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestZooKeeper.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/avro/TestAvroServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/avro/TestAvroUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTrackerOnCluster.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditorNoCluster.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAdmin.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestAttributes.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestGet.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestHCM.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestHTablePool.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestHTableUtil.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestInstantSchemaChange.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestInstantSchemaChangeFailover.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestInstantSchemaChangeSplit.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestMetaMigrationRemovingHTD.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestMetaScanner.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestMultipleTimestamps.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestOperation.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestResult.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestScan.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestScannerTimeout.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestShell.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestTimestampsFilter.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/replication/TestReplicationAdmin.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestAggregateProtocol.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassLoading.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithAbort.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorExceptionWithRemove.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java
*
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167479#comment-13167479 ]

nkeywal commented on HBASE-4965:

It still blocks sometimes on your computer? I haven't been able to reproduce the issue locally on Linux/Ubuntu.

Monitor the open file descriptors and the threads counters during the unit tests

Key: HBASE-4965
URL: https://issues.apache.org/jira/browse/HBASE-4965
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 4965-v2.txt, 4965_all.patch, 4965_all.patch, 4965_all.patch, ResourceChecker.java, ResourceCheckerJUnitRule.java

We're seeing a lot of issues with hadoop-qa related to threads or file descriptors. Monitoring these counters would ease the analysis.

Note as well that
- if we want to execute the tests in the same jvm (because the test is small or because we want to share the cluster) we can't afford to leak too many resources
- if the tests leak, it's more difficult to detect a leak in the software itself.

I attach the piece of code that I used. It requires two lines in a unit test class to:
- before every test, count the threads and the open file descriptors
- after every test, compare with the previous value.

I ran it on some tests; we have for example:
- client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 threads (was 231), 390 file descriptors (was 390). = TestMultiParallel uses 232 threads!
- client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 283 file descriptors (was 282).
- client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 294), 815 file descriptors (was 461)
- client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 148), 310 file descriptors (was 307).

It's not always leaks; we can expect some pooling effects. But still...
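The before/after counting described in the issue can be sketched as follows. This is only an illustration, not the ResourceChecker.java attached to the issue: the class name `ResourceSnapshot` and its methods are hypothetical, and it assumes a JVM that exposes `com.sun.management.UnixOperatingSystemMXBean` (otherwise the file descriptor count is reported as -1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper: one reading of the thread and file descriptor counters. */
final class ResourceSnapshot {
    final int threads;
    final long openFileDescriptors; // -1 when the JVM does not expose the counter

    ResourceSnapshot(int threads, long openFileDescriptors) {
        this.threads = threads;
        this.openFileDescriptors = openFileDescriptors;
    }

    /** Reads the current counters of this JVM. */
    static ResourceSnapshot take() {
        // Live threads, counted through the stack trace map.
        int threads = Thread.getAllStackTraces().size();
        long fds = -1;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // The open descriptor count is only available on Unix-like platforms.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return new ResourceSnapshot(threads, fds);
    }

    /** Formats this reading against an earlier one, in the style of the report above. */
    String describeAgainst(ResourceSnapshot before) {
        return threads + " threads (was " + before.threads + "), "
                + openFileDescriptors + " file descriptors (was "
                + before.openFileDescriptors + ")";
    }
}
```

In a test class, one would presumably call `ResourceSnapshot.take()` before each test and again after it, then log `after.describeAgainst(before)`. A difference is not proof of a leak, since pools may legitimately keep threads or connections alive.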
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167622#comment-13167622 ]

stack commented on HBASE-4965:

At least one test continues to fail for me. Let me try more...
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167624#comment-13167624 ]

stack commented on HBASE-4965:

Blocks is the right term. It'll skip a test because it went on too long. Let me get more info. Let me run it through hadoop-qa again too.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167665#comment-13167665 ]

nkeywal commented on HBASE-4965:

I propose 1024, as it's the current limit on hadoop-qa.

On Mon, Dec 12, 2011 at 7:37 PM, stack (Commented) (JIRA)

Monitor the open file descriptors and the threads counters during the unit tests

Key: HBASE-4965
URL: https://issues.apache.org/jira/browse/HBASE-4965
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 4965-v2.txt, 4965-v2.txt, 4965_all.patch, 4965_all.patch, 4965_all.patch, ResourceChecker.java, ResourceCheckerJUnitRule.java
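The thread does not spell out what the 1024 value applies to. Reading it as a ceiling on open file descriptors, matching the hadoop-qa limit, a check could look like the sketch below. The class `FileDescriptorLimit`, its constant and its methods are hypothetical names for illustration; they are not taken from the patch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical check of the open file descriptor count against a fixed ceiling. */
final class FileDescriptorLimit {
    /** Assumed ceiling, taken from the 1024 proposed in the comment above. */
    static final long PROPOSED_LIMIT = 1024;

    /** Returns the open descriptor count of this JVM, or -1 if it is not exposed. */
    static long openCount() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** True when a known count is above the limit; an unknown count (-1) never exceeds. */
    static boolean exceeds(long open, long limit) {
        return open >= 0 && open > limit;
    }
}
```

With the figures quoted in the issue, `exceeds(815, FileDescriptorLimit.PROPOSED_LIMIT)` is false, so the TestAdmin reading of 815 descriptors would still sit under such a ceiling.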
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167733#comment-13167733 ]

stack commented on HBASE-4965:

I see this a bunch:

{code}
Running org.apache.hadoop.hbase.zookeeper.TestHQuorumPeer
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.831 sec
Running org.apache.hadoop.hbase.util.TestRegionSplitCalculator
Tests run: 14, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.064 sec FAILURE!
Running org.apache.hadoop.hbase.zookeeper.TestZooKeeperMainServerArg
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.013 sec

Results :

Failed tests: testSplitCalculatorEq(org.apache.hadoop.hbase.util.TestRegionSplitCalculator): expected:2 but was:1

Tests run: 410, Failures: 1, Errors: 0, Skipped: 0

[INFO]
[ERROR] BUILD FAILURE
[INFO]
[INFO] There are test failures.
{code}
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167735#comment-13167735 ]

stack commented on HBASE-4965:

On a macosx.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167814#comment-13167814 ]

Hadoop QA commented on HBASE-4965:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12507021/4965-v2.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 755 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -160 warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

-1 findbugs. The patch appears to introduce 75 new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestAdmin

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/489//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/489//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/489//console

This message is automatically generated.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167835#comment-13167835 ]

stack commented on HBASE-4965:

The TestAdmin fail above is because of too many open files. So, this patches hadoop qa and n's tests. Let me commit.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167858#comment-13167858 ]

Mikhail Bautin commented on HBASE-4965:

I am getting a ton of compile errors caused by this change: http://pastebin.com/Lw7HNt75. I am using hadoop-0.20.205.0 (the default one in pom.xml). Can we get this fixed (change the default Hadoop version, etc. etc.)?

Monitor the open file descriptors and the threads counters during the unit tests

Key: HBASE-4965
URL: https://issues.apache.org/jira/browse/HBASE-4965
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Fix For: 0.94.0
Attachments: 4965-v2.txt, 4965-v2.txt, 4965-v3.txt, 4965_all.patch, 4965_all.patch, 4965_all.patch, ResourceChecker.java, ResourceCheckerJUnitRule.java
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13167879#comment-13167879 ]

Mikhail Bautin commented on HBASE-4965:

Everything compiles again, thanks! It seems that only part of the files in this patch were committed the first time around.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166809#comment-13166809 ]

nkeywal commented on HBASE-4965:

All tests ok locally, including:
{noformat}
--- Test set: org.apache.hadoop.hbase.master.TestLogsCleaner ---
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.775 sec
{noformat}
{noformat}
--- Test set: org.apache.hadoop.hbase.master.TestHMasterRPCException ---
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.9 sec
{noformat}

Monitor the open file descriptors and the threads counters during the unit tests

Key: HBASE-4965
URL: https://issues.apache.org/jira/browse/HBASE-4965
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 4965-v2.txt, 4965_all.patch, 4965_all.patch, 4965_all.patch, ResourceChecker.java, ResourceCheckerJUnitRule.java

We're seeing a lot of issues with hadoop-qa related to threads or file descriptors. Monitoring these counters would ease the analysis.
Note as well that
- if we want to execute the tests in the same jvm (because the test is small or because we want to share the cluster) we can't afford to leak too many resources
- if the tests leak, it's more difficult to detect a leak in the software itself.

I attach the piece of code that I used. It requires two lines in a unit test class to:
- before every test, count the threads and the open file descriptors
- after every test, compare with the previous value.

I ran it on some tests; we have for example:
- client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 threads (was 231), 390 file descriptors (was 390). = TestMultiParallel uses 232 threads!
- client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 283 file descriptors (was 282).
- client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 294), 815 file descriptors (was 461)
- client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 148), 310 file descriptors (was 307).

It's not always leaks, we can expect some pooling effects. But still...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
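The before/after counting described in the issue text could be sketched roughly as follows. This is not the attached ResourceChecker.java; the class name `ResourceSnapshotSketch` and its methods are hypothetical, and the sketch assumes a JVM that exposes `com.sun.management.UnixOperatingSystemMXBean` (on other platforms the descriptor count is reported as -1).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Illustrative only: a hypothetical stand-in, not the attached ResourceChecker.java. */
final class ResourceSnapshotSketch {

    /** Thread and open file descriptor counts at one point in time. */
    static final class Snapshot {
        final int threads;
        final long openFds; // -1 when the JVM does not expose the counter

        Snapshot(int threads, long openFds) {
            this.threads = threads;
            this.openFds = openFds;
        }
    }

    /** Reads the current counters from the JVM management beans. */
    static Snapshot take() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long openFds = -1L;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        // The descriptor counter is only available on Unix-like JVMs.
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            openFds = ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return new Snapshot(threads, openFds);
    }

    /** Change in live thread count between two snapshots. */
    static int threadDelta(Snapshot before, Snapshot after) {
        return after.threads - before.threads;
    }

    /** Change in descriptor count, or 0 when the counter is unavailable. */
    static long fdDelta(Snapshot before, Snapshot after) {
        if (before.openFds < 0 || after.openFds < 0) {
            return 0L;
        }
        return after.openFds - before.openFds;
    }

    public static void main(String[] args) {
        Snapshot before = take(); // would run before every test
        // ... the test body would run here ...
        Snapshot after = take();  // would run after every test
        System.out.println("thread delta: " + threadDelta(before, after)
                + ", fd delta: " + fdDelta(before, after));
    }
}
```

In a JUnit setup the two `take()` calls would presumably sit in the before and after hooks of a rule, which is what the attachment name ResourceCheckerJUnitRule.java suggests, though the attachment itself is not shown here.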
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166834#comment-13166834 ]

nkeywal commented on HBASE-4965:

Second local execution ok as well.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166912#comment-13166912 ]

stack commented on HBASE-4965:

This is interesting. When I changed the hadoop-qa build to also build the long tests, I inserted another output of ulimit that runs after checkout and patch application, and this time the fd limit is 1024, as you found above, nkeywal. Let me talk to Giri, the hadoopqa master:
{code}
/home/jenkins/tools/maven/latest/bin/mvn clean test -DHbasePatchProcess
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 2048
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
{code}
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165919#comment-13165919 ]

nkeywal commented on HBASE-4965:

When I submitted the patch, yes, all of them, large included. On Hadoop-QA, it was tested with small and medium; the report says: 914 tests (+507), took 15 min.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166564#comment-13166564 ]

Hadoop QA commented on HBASE-4965:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12506785/4965_all.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 755 new or modified tests.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/481//console

This message is automatically generated.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166728#comment-13166728 ]

nkeywal commented on HBASE-4965:

Thank you very much, stack, for the v2. Hadoop-QA seems ok. Trying locally (I'm on Ubuntu) right now.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165247#comment-13165247 ]

nkeywal commented on HBASE-4965:

To summarize the main points:
- it seems that hadoop-qa is configured with 1024 file descriptors
- we have some leaks
- the patch itself could be committed imho.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165639#comment-13165639 ]

stack commented on HBASE-4965:

On hadoop-qa being set to 1024 fds only, that's weird. We dump the ulimit before the test starts and it shows:
{code}
Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 20
file size (blocks, -f) unlimited
pending signals (-i) 16382
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 6
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 2048
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
6
Running in Jenkins mode
{code}
... so 60k. So, I wonder where the disconnect between your finding and the ulimit output is? Are we running as a different user after the ulimit is output?

I love that leaks report. That's excellent.

Trying the patch locally.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165889#comment-13165889 ]

stack commented on HBASE-4965:

All tests pass for you N? I'm seeing a hang soon after we move to medium sized... let me retry.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164391#comment-13164391 ]

Hadoop QA commented on HBASE-4965:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12506433/4965_all.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 755 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated -160 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 72 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildHole
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/459//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/459//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/459//console

This message is automatically generated.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164396#comment-13164396 ]

nkeywal commented on HBASE-4965:

First, Hadoop QA seems to be configured with 1024 file descriptors:
{noformat}
2011-12-07 13:16:26,184 ERROR [main] hbase.ResourceChecker(122): Bad configuration: the operating systems file handles maximum is 1024 our is 1
{noformat}
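A configuration check of the kind that produces the log line above might look like the following sketch. It is not the code at ResourceChecker line 122; the class `FdLimitCheckSketch`, its method names, and the required threshold of 10000 are assumptions made for illustration (the threshold in the log line itself is not fully legible). The limit is read from inside the JVM, so it reflects what the test process actually gets, which can differ from a ulimit printed earlier in the build script.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Illustrative only: hypothetical names, not the real ResourceChecker code. */
final class FdLimitCheckSketch {

    /** Returns the descriptor limit seen by this JVM, or -1 if it is not exposed. */
    static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1L;
    }

    /** True when the limit is known and lower than what the tests are assumed to need. */
    static boolean isLimitTooLow(long maxFds, long required) {
        return maxFds >= 0 && maxFds < required;
    }

    public static void main(String[] args) {
        long required = 10_000L; // assumed threshold, for illustration only
        long max = maxFileDescriptors();
        if (isLimitTooLow(max, required)) {
            System.err.println("Bad configuration: file handle maximum is " + max
                    + ", wanted at least " + required);
        } else {
            System.out.println("File handle maximum: " + max);
        }
    }
}
```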
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164401#comment-13164401 ]

nkeywal commented on HBASE-4965:

The error seems unrelated to my patch. It's the same error for the 3 patches.
{noformat}
expected:[NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED] but was:[NOT_IN_META, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED, NOT_IN_META_OR_DEPLOYED]
{noformat}
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13164427#comment-13164427 ]

nkeywal commented on HBASE-4965:

Here are the possible leaks. I am going to fix some of them in a separate patch. Leaks in SmallTests are critical, because the same JVM is used for multiple tests.

This one should be studied:
client.TestAdmin#testCheckHBaseAvailableClosesConnection: 523 threads (was 298), 913 file descriptors (was 488). -thread leak?- -file handle leak?-
As the limit on hadoop-QA is 1024 open file descriptors, it is not far from hitting this limit, especially if another test is run after this one.

avro.TestAvroServer#testTableAdminAndMetadata: 140 threads (was 130), 255 file descriptors (was 253). -thread leak?- -file handle leak?-
avro.TestAvroServer#testFamilyAdminAndMetadata: 144 threads (was 140), 255 file descriptors (was 255). -thread leak?-
avro.TestAvroServer#testDML: 146 threads (was 144), 255 file descriptors (was 255). -thread leak?-
catalog.TestCatalogTrackerOnCluster#testBadOriginalRootLocation: 23 threads (was 4), 127 file descriptors (was 70). -thread leak?- -file handle leak?-
catalog.TestCatalogTracker#testThatIfMETAMovesWeAreNotified: 9 threads (was 8), 84 file descriptors (was 79). -thread leak?- -file handle leak?-
catalog.TestCatalogTracker#testInterruptWaitOnMetaAndRoot: 10 threads (was 9), 86 file descriptors (was 84). -file handle leak?-
catalog.TestCatalogTracker#testVerifyRootRegionLocationFails: 11 threads (was 9), 89 file descriptors (was 85). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditorNoCluster#testRideOverServerNotRunning: 7 threads (was 4), 85 file descriptors (was 70). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditor#testGetRegionsCatalogTables: 190 threads (was 185), 360 file descriptors (was 354). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditor#testTableExists: 191 threads (was 187), 365 file descriptors (was 360). -thread leak?- -file handle leak?-
catalog.TestMetaReaderEditor#testGetRegion: 193 threads (was 191), 370 file descriptors (was 365). -thread leak?- -file handle leak?-
client.TestAdmin#testDeleteEditUnknownColumnFamilyAndOrTable: 254 threads (was 246), 423 file descriptors (was 417). -thread leak?- -file handle leak?-
client.TestAdmin#testDisableAndEnableTable: 273 threads (was 254), 452 file descriptors (was 423). -thread leak?- -file handle leak?-
client.TestAdmin#testDisableAndEnableTables: 294 threads (was 272), 482 file descriptors (was 452). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateTable: 294 threads (was 294), 491 file descriptors (was 482). -file handle leak?-
client.TestAdmin#testOnlineChangeTableSchema: 295 threads (was 294), 494 file descriptors (was 491). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateTableWithRegions: 296 threads (was 294), 490 file descriptors (was 490). -thread leak?-
client.TestAdmin#testTableExist: 297 threads (was 296), 494 file descriptors (was 490). -thread leak?- -file handle leak?-
client.TestAdmin#testForceSplit: 303 threads (was 297), 487 file descriptors (was 494). -thread leak?-
client.TestAdmin#testForceSplitMultiFamily: 309 threads (was 293), 499 file descriptors (was 464). -thread leak?- -file handle leak?-
client.TestAdmin#testEnableDisableAddColumnDeleteColumn: 312 threads (was 309), 505 file descriptors (was 499). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateBadTables: 313 threads (was 312), 507 file descriptors (was 505). -thread leak?- -file handle leak?-
client.TestAdmin#testCreateTableRPCTimeOut: 312 threads (was 313), 526 file descriptors (was 507). -file handle leak?-
client.TestAdmin#testReadOnlyTable: 314 threads (was 312), 530 file descriptors (was 526). -thread leak?- -file handle leak?-
client.TestAdmin#testCloseRegionThatFetchesTheHRIFromMeta: 315 threads (was 312), 513 file descriptors (was 507). -thread leak?- -file handle leak?-
client.TestAdmin#testGetTableRegions: 309 threads (was 308), 512 file descriptors (was 499). -thread leak?- -file handle leak?-
client.TestAdmin#testCheckHBaseAvailableClosesConnection: 523 threads (was 298), 913 file descriptors (was 488). -thread leak?- -file handle leak?-
client.TestFromClientSide#testKeepDeletedCells: 261 threads (was 246), 437 file descriptors (was 414). -thread leak?- -file handle leak?-
client.TestFromClientSide#testRegionCacheDeSerialization: 276 threads (was 261), 485 file descriptors (was 437). -thread leak?- -file handle leak?-
client.TestFromClientSide#testRegionCachePreWarm: 277 threads (was 276), 488 file descriptors (was 485). -thread leak?- -file handle leak?-
client.TestFromClientSide#testWeirdCacheBehaviour: 285 threads (was 277), 500 file descriptors (was 488). -thread leak?- -file handle leak?-
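Since every entry in the list has the same shape, the growth per test and the remaining headroom under the 1024-descriptor limit can be computed mechanically. The following is only a sketch: `LeakReport`, `Entry`, `worstFirst` and `headroom` are hypothetical names that are not part of the attached ResourceChecker, and the regular expression assumes the exact line shape shown above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses resource-report entries and ranks them by growth. */
final class LeakReport {

    /** One report entry: the counts after the test and the "was" counts before it. */
    static final class Entry {
        final String test;
        final int threads, threadsBefore, fds, fdsBefore;

        Entry(String test, int threads, int threadsBefore, int fds, int fdsBefore) {
            this.test = test;
            this.threads = threads;
            this.threadsBefore = threadsBefore;
            this.fds = fds;
            this.fdsBefore = fdsBefore;
        }

        int threadGrowth() { return threads - threadsBefore; }

        int fdGrowth() { return fds - fdsBefore; }
    }

    // Matches "pkg.Class#method: N threads (was M), X file descriptors (was Y)".
    private static final Pattern ENTRY = Pattern.compile(
            "(\\S+#\\S+): (\\d+) threads \\(was (\\d+)\\), "
            + "(\\d+) file descriptors \\(was (\\d+)\\)");

    /** Extracts every entry found in the text, in order of appearance. */
    static List<Entry> parse(String report) {
        List<Entry> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(report);
        while (m.find()) {
            entries.add(new Entry(m.group(1),
                    Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)),
                    Integer.parseInt(m.group(4)), Integer.parseInt(m.group(5))));
        }
        return entries;
    }

    /** Returns a copy sorted by file-descriptor growth, largest first. */
    static List<Entry> worstFirst(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        Comparator<Entry> byFdGrowth = Comparator.comparingInt(Entry::fdGrowth);
        sorted.sort(byFdGrowth.reversed());
        return sorted;
    }

    /** Descriptors still available under the given limit after this test. */
    static int headroom(Entry e, int limit) { return limit - e.fds; }

    public static void main(String[] args) {
        String report = "client.TestAdmin#testCreateTable: 294 threads (was 294), "
                + "491 file descriptors (was 482). -file handle leak?- "
                + "client.TestAdmin#testCheckHBaseAvailableClosesConnection: "
                + "523 threads (was 298), 913 file descriptors (was 488).";
        for (Entry e : worstFirst(parse(report))) {
            System.out.println(e.test + ": +" + e.threadGrowth() + " threads, +"
                    + e.fdGrowth() + " file descriptors, headroom "
                    + headroom(e, 1024));
        }
    }
}
```

For testCheckHBaseAvailableClosesConnection this gives a growth of 225 threads and 425 file descriptors, with 111 descriptors left before the 1024 limit.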
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163916#comment-13163916 ]

stack commented on HBASE-4965:

Where is the code N? Should we run this in all tests for a while till we nail some of the file descriptor issues up in hadoopqa patch build?

Monitor the open file descriptors and the threads counters during the unit tests

Key: HBASE-4965
URL: https://issues.apache.org/jira/browse/HBASE-4965
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

We're seeing a lot of issues with hadoop-qa related to threads or file descriptors. Monitoring these counters would ease the analysis. Note as well that
- if we want to execute the tests in the same jvm (because the test is small or because we want to share the cluster) we can't afford to leak too many resources
- if the tests leak, it's more difficult to detect a leak in the software itself.

I attach the piece of code that I used. It requires two lines in a unit test class to:
- before every test, count the threads and the open file descriptors
- after every test, compare with the previous values.

I ran it on some tests; we have for example:
- client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses 232 threads!
- client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 283 file descriptors (was 282).
- client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 294), 815 file descriptors (was 461)
- client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 148), 310 file descriptors (was 307).

It's not always leaks; we can expect some pooling effects. But still...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
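The before/after counting described in the issue can be approximated with the JDK's management beans. This is a minimal sketch and not the attached ResourceChecker: `ResourceSnapshot` and its methods are hypothetical names, and it assumes a HotSpot/OpenJDK runtime on a Unix-like system, where the operating system bean implements `com.sun.management.UnixOperatingSystemMXBean`. On other platforms the descriptor count is reported as -1.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical sketch: a point-in-time count of live threads and open file descriptors. */
final class ResourceSnapshot {
    final int threads;
    final long openFds; // -1 when the platform does not expose the counter

    ResourceSnapshot(int threads, long openFds) {
        this.threads = threads;
        this.openFds = openFds;
    }

    /** Reads the current counters of this JVM. */
    static ResourceSnapshot take() {
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long fds = -1;
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            fds = ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return new ResourceSnapshot(threads, fds);
    }

    /** Threads gained (positive) or lost (negative) since the earlier snapshot. */
    int threadGrowthSince(ResourceSnapshot before) {
        return threads - before.threads;
    }

    /** Descriptors gained since the earlier snapshot; 0 when either count is unavailable. */
    long fdGrowthSince(ResourceSnapshot before) {
        return (openFds < 0 || before.openFds < 0) ? 0 : openFds - before.openFds;
    }

    public static void main(String[] args) {
        ResourceSnapshot before = take();
        // Simulate a test that leaves a thread behind.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        });
        leaked.setDaemon(true);
        leaked.start();
        ResourceSnapshot after = take();
        System.out.println(after.threads + " threads (was " + before.threads + "), "
                + after.openFds + " file descriptors (was " + before.openFds + ")");
    }
}
```

Both counters are JVM-wide, so threads or descriptors held by pools show up as growth too, which matches the remark above that a difference is not always a leak.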
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163937#comment-13163937 ]

nkeywal commented on HBASE-4965:

Oops. Sorry. Here they are. To use them, we need to add the following lines to the test code:

{noformat}
@Rule
public ResourceCheckerJUnitRule cu = new ResourceCheckerJUnitRule();
{noformat}

It's the least intrusive way I found. Before and after each test method, we count the number of threads and the number of open file handles, and log them. Unfortunately, I found a bug in surefire: these lines are not stored when the redirect-to-file option is activated. I fixed it locally.

Despite this, I think it makes sense to track these data. It will ease the analysis when something goes wrong, even if fixing all the current leaks would take quite a lot of time.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163947#comment-13163947 ]

stack commented on HBASE-4965:

So, you want to add it for the running of all current tests? Sounds good to me (the classes are missing the license and class comments explaining what they are about). Do we have to add the @Rule to every test method or just to each test class?
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163951#comment-13163951 ]

stack commented on HBASE-4965:

Where does the output show? In the test output? Good stuff.
[jira] [Commented] (HBASE-4965) Monitor the open file descriptors and the threads counters during the unit tests
[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163966#comment-13163966 ]

nkeywal commented on HBASE-4965:

It's one rule for each test class. With a fixed surefire, it shows as a standard log in the output. For example:

{noformat}
2011-12-06 15:03:32,982 INFO [main] hbase.HBaseTestingUtility(518): Starting up minicluster with 1 master(s) and 3 regionserver(s) and 3 datanode(s)
2011-12-06 15:03:34,052 WARN [main] impl.MetricsSystemImpl(137): Metrics system not started: Cannot locate configuration: tried hadoop-metrics2-namenode.properties, hadoop-metrics2.properties
[...]
2011-12-06 15:03:41,587 DEBUG [main] client.HTable$ClientScanner(1183): Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
2011-12-06 15:03:41,588 INFO [main] hbase.HBaseTestingUtility(561): Minicluster is up
2011-12-06 15:03:41,588 INFO [main] client.HConnectionManager$HConnectionImplementation(1805): Closed zookeeper sessionid=0x134159e31930008
2011-12-06 15:03:41,661 INFO [main] hbase.ResourceChecker(117): before org.apache.hadoop.hbase.client.TestAdmin#testDeleteEditUnknownColumnFamilyAndOrTable: 247 threads, 417 file descriptors
[...]
2011-12-06 15:03:43,282 INFO [main] client.HConnectionManager$HConnectionImplementation(1805): Closed zookeeper sessionid=0x134159e31930009
2011-12-06 15:03:43,313 INFO [main] hbase.ResourceChecker(117): after org.apache.hadoop.hbase.client.TestAdmin#testDeleteEditUnknownColumnFamilyAndOrTable: 265 threads (was 247), 450 file descriptors (was 417). -thread leak?- -file handle leak?-
[...]
{noformat}

If you're ok with the idea, I will polish the code a little and propose it as a patch.
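The "before" and "after" log lines in the example output have a regular shape that is easy to reproduce. The sketch below is an illustration only: `ResourceLogLine` is a hypothetical name, and the flagging rule (append a marker whenever a count grew) is an assumption, since the rule actually used by ResourceChecker is not shown in this thread.

```java
/** Hypothetical formatter reproducing the shape of the ResourceChecker log lines. */
final class ResourceLogLine {

    /** Line logged before a test method runs. */
    static String before(String test, int threads, int fds) {
        return "before " + test + ": " + threads + " threads, "
                + fds + " file descriptors";
    }

    /**
     * Line logged after a test method ran.
     * Assumption: a marker is appended whenever the count is higher than before.
     */
    static String after(String test, int threadsBefore, int threads,
                        int fdsBefore, int fds) {
        StringBuilder sb = new StringBuilder();
        sb.append("after ").append(test).append(": ")
          .append(threads).append(" threads (was ").append(threadsBefore).append("), ")
          .append(fds).append(" file descriptors (was ").append(fdsBefore).append(").");
        if (threads > threadsBefore) {
            sb.append(" -thread leak?-");
        }
        if (fds > fdsBefore) {
            sb.append(" -file handle leak?-");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String test = "org.apache.hadoop.hbase.client.TestAdmin"
                + "#testDeleteEditUnknownColumnFamilyAndOrTable";
        System.out.println(before(test, 247, 417));
        System.out.println(after(test, 247, 265, 417, 450));
    }
}
```

With the counts from the example (247 to 265 threads, 417 to 450 file descriptors), both markers are appended, as in the logged line.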