[ https://issues.apache.org/jira/browse/HBASE-4965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165639#comment-13165639 ]

stack commented on HBASE-4965:
------------------------------

On hadoop-qa being set to 1024 fds only, that's weird.  We dump the ulimit 
before the tests start and it shows:

{code}
Linux asf011.sp2.ygridcore.net 2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 
17:42:25 UTC 2011 x86_64 GNU/Linux
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 20
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 60000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 2048
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
60000
Running in Jenkins mode
{code}

... so 60k.

So, I wonder where the disconnect between your finding and the ulimit output 
is.  Are we running as a different user after ulimit is printed?
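
One way to narrow it down: have the test JVM print the limit it actually 
sees.  A quick sketch, not from the patch; the cast assumes a HotSpot JVM 
on Unix:

{code}
import java.lang.management.ManagementFactory;

import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical check, not part of the attached patch: prints the fd limit
// and current fd count as seen by this JVM. The max should print 60000 if
// the JVM inherits the ulimit above, 1024 if something resets it later.
public class FdLimitCheck {
  public static void main(String[] args) {
    UnixOperatingSystemMXBean os =
        (UnixOperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
    System.out.println("max fds:  " + os.getMaxFileDescriptorCount());
    System.out.println("open fds: " + os.getOpenFileDescriptorCount());
  }
}
{code}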

I love that leak report.  That's excellent.

Trying the patch locally....
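
For reference, my mental model of what the attached ResourceCheckerJUnitRule 
does, as a minimal sketch only (the rule wiring and class name here are my 
guess; only the MXBean calls are standard API, and the fd count again assumes 
a HotSpot JVM on Unix):

{code}
import java.lang.management.ManagementFactory;

import com.sun.management.UnixOperatingSystemMXBean;

import org.junit.rules.ExternalResource;

// Illustrative sketch, not the attached ResourceCheckerJUnitRule: records
// thread and open-fd counts before each test and reports the delta after.
public class ResourceCountRule extends ExternalResource {
  private int threadsBefore;
  private long fdsBefore;

  private static long openFds() {
    return ((UnixOperatingSystemMXBean)
        ManagementFactory.getOperatingSystemMXBean()).getOpenFileDescriptorCount();
  }

  @Override
  protected void before() {
    threadsBefore = ManagementFactory.getThreadMXBean().getThreadCount();
    fdsBefore = openFds();
  }

  @Override
  protected void after() {
    System.out.println(ManagementFactory.getThreadMXBean().getThreadCount()
        + " threads (was " + threadsBefore + "), " + openFds()
        + " file descriptors (was " + fdsBefore + ").");
  }
}
{code}

The two lines per test class would then be the field declaration plus its 
{{@Rule}} annotation, e.g. {{@Rule public ResourceCountRule resources = new 
ResourceCountRule();}}.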
                
> Monitor the open file descriptors and the threads counters during the unit 
> tests
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-4965
>                 URL: https://issues.apache.org/jira/browse/HBASE-4965
>             Project: HBase
>          Issue Type: Improvement
>          Components: test
>    Affects Versions: 0.94.0
>         Environment: all
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 4965_all.patch, ResourceChecker.java, 
> ResourceCheckerJUnitRule.java
>
>
> We're seeing a lot of issues with hadoop-qa related to threads or file 
> descriptors.
> Monitoring these counters would ease the analysis.
> Note as well that
>  - if we want to execute the tests in the same jvm (because the test is small 
> or because we want to share the cluster) we can't afford to leak too many 
> resources
>  - if the tests leak, it's more difficult to detect a leak in the software 
> itself.
> I attach a piece of code that I used. It requires two lines in a unit test 
> class to:
> - before every test, count the threads and the open file descriptors
> - after every test, compare with the previous values.
> I ran it on some tests; we have for example:
> - client.TestMultiParallel#testBatchWithManyColsInOneRowGetAndPut: 232 
> threads (was 231), 390 file descriptors (was 390). => TestMultiParallel uses 
> 232 threads!
> - client.TestMultipleTimestamps#testWithColumnDeletes: 152 threads (was 151), 
> 283 file descriptors (was 282).
> - client.TestAdmin#testCheckHBaseAvailableClosesConnection: 477 threads (was 
> 294), 815 file descriptors (was 461).
> - client.TestMetaMigrationRemovingHTD#testMetaMigration: 149 threads (was 
> 148), 310 file descriptors (was 307).
> These are not always leaks; we can expect some pooling effects. But still...
