[ 
https://issues.apache.org/jira/browse/HBASE-12740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jurriaan Mous updated HBASE-12740:
----------------------------------
    Attachment: PROFILE_before_patch_test_fails.png
                PROFILE_after_patch.png

[~dimaspivak]

The main reason I opened this issue is that this test crashes for me 
because it reaches my machine's default Java max thread count of 2000. (I 
tried various options suggested on Stack Overflow to raise it, with no success.)

It caused many of these exceptions:
java.lang.OutOfMemoryError: unable to create new native thread

So I tried to lower this thread count to make it runnable so I could fix issues 
with the async client that this test was somehow pointing out.

I see that my very long runs were the ones that succeeded, because HBase 
recovered from the exceptions by retrying, which gave the server some time to 
time out some connections. Those runs took minutes longer, and once I fixed the 
cause the test ran in around the same times you posted.

I think the regression of a few milliseconds in your comparison is because the 
test now properly closes connections and regions, which takes a bit more 
time.

I have attached two profiling screenshots: one of the failing run with 
excessive thread creation and one of the succeeding run with much less thread 
creation.

Is it OK to still commit this patch so the test is runnable for those with 
lower thread limits?

> Improve performance of TestHBaseFsck
> ------------------------------------
>
>                 Key: HBASE-12740
>                 URL: https://issues.apache.org/jira/browse/HBASE-12740
>             Project: HBase
>          Issue Type: Bug
>          Components: util
>            Reporter: Jurriaan Mous
>            Assignee: Jurriaan Mous
>         Attachments: HBASE-12740-v1.patch, HBASE-12740-v2.patch, 
> HBASE-12740-v3.patch, HBASE-12740.patch, PROFILE_after_patch.png, 
> PROFILE_before_patch_test_fails.png
>
>
> TestHBaseFsck performs poorly on my machine. It crashes because the thread 
> count reaches the 2000-thread limit on my machine. Looking at the code, a lot 
> of optimization is possible and some API calls are used incorrectly. A lot of 
> Admin instances are created and never closed, lots of Tables are not closed, 
> ThreadPoolExecutors are not shut down, and an unlimited thread pool is used 
> that does not recycle threads.
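
To illustrate the kind of fix the description points at, here is a minimal 
sketch of a bounded, thread-recycling pool that is always shut down after use. 
It uses only java.util.concurrent and is not the code from the attached 
patches. The class name BoundedPoolSketch, the MAX_THREADS cap of 8 and the 
runTasks helper are all hypothetical names chosen for this example.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedPoolSketch {
    // Hypothetical cap, chosen only for this example.
    static final int MAX_THREADS = 8;

    // Builds a pool that never holds more than MAX_THREADS threads.
    // Extra tasks wait in the queue and reuse existing threads.
    static ThreadPoolExecutor newBoundedPool() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                MAX_THREADS, MAX_THREADS,
                60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
        // Let idle threads exit after the keep-alive time.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Runs taskCount trivial tasks and returns how many completed.
    static int runTasks(int taskCount) {
        ThreadPoolExecutor pool = newBoundedPool();
        AtomicInteger done = new AtomicInteger();
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(done::incrementAndGet);
            }
        } finally {
            // Always shut the pool down so its threads are released.
            pool.shutdown();
        }
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed=" + runTasks(100));
    }
}
```

The same try/finally discipline applies to anything else that holds threads 
or connections: whatever a test opens, it should close before it finishes, 
so that repeated test methods do not accumulate threads.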



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
