[ https://issues.apache.org/jira/browse/HBASE-22867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Hu updated HBASE-22867:
-----------------------------
    Description: 
The thousands of spawned threads make a safepoint take 80+ s in our Master 
JVM process.
{code}
2019-08-15,19:35:35,861 INFO [main-SendThread(zjy-hadoop-prc-zk02.bj:11000)] org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 82260ms for sessionid 0x1691332e2d3aae5, closing socket connection and attempting reconnect
{code}

The JVM's stdout (it shows 9126 threads and a sync time of 80+ s):
{code}
vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
32358.859: ForceAsyncSafepoint              [    9126         67            474    ]      [     1    28 86596    87   101    ]  0
{code}
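
For anyone who wants to check the numbers, here is a minimal sketch of reading that statistics line programmatically. The class name SafepointStats is hypothetical, and the sketch assumes the time columns are in milliseconds, which matches the 80+ s figure above (sync = 86596).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: parses one safepoint statistics line.
class SafepointStats {
    // "<timestamp>: <vmop> [ total initially_running wait_to_block ]
    //  [ spin block sync cleanup vmop ] page_trap_count"
    private static final Pattern LINE = Pattern.compile(
        "^\\s*([\\d.]+):\\s+(\\S+)\\s+\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]"
        + "\\s+\\[\\s*(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s+(\\d+)\\s*\\]\\s+(\\d+)\\s*$");

    final String vmop;
    final int totalThreads;
    final long syncMillis; // assumption: the time columns are milliseconds

    private SafepointStats(String vmop, int totalThreads, long syncMillis) {
        this.vmop = vmop;
        this.totalThreads = totalThreads;
        this.syncMillis = syncMillis;
    }

    static SafepointStats parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a safepoint statistics line: " + line);
        }
        // Group 3 is the total thread count, group 8 is the sync time.
        return new SafepointStats(m.group(2),
            Integer.parseInt(m.group(3)), Long.parseLong(m.group(8)));
    }

    public static void main(String[] args) {
        SafepointStats s = parse("32358.859: ForceAsyncSafepoint              "
            + "[    9126         67            474    ]      "
            + "[     1    28 86596    87   101    ]  0");
        System.out.println(s.vmop + ": " + s.totalThreads + " threads, sync "
            + (s.syncMillis / 1000.0) + " s");
    }
}
```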

We also captured a jstack dump:
{code}
$ cat 31162.stack.1  | grep 'ForkJoinPool-1-worker' | wc -l
8648
{code}
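
The grep above only counts one pool. As a rough sketch, the same dump can be grouped by thread-name prefix to see which pools dominate. The class name ThreadDumpCounter is hypothetical, and the sketch assumes the usual jstack layout where each thread header line starts with the quoted thread name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: groups jstack threads by name prefix.
class ThreadDumpCounter {
    // Assumption: a thread header line begins with the thread name in double quotes.
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\"");

    static Map<String, Integer> countByPrefix(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                // Drop the trailing worker index, e.g.
                // "ForkJoinPool-1-worker-42" becomes "ForkJoinPool-1-worker".
                String prefix = m.group(1).replaceFirst("-?\\d+$", "");
                counts.merge(prefix, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ThreadDumpCounter 31162.stack.1
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countByPrefix(lines).forEach((prefix, n) -> System.out.println(n + "\t" + prefix));
    }
}
```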

It's a dangerous bug, so I'm marking it as a Blocker.


  was:
The thousands of spawned threads make the safepoint cost 80+s in our Master 
JVM process.
{code}
2019-08-15,19:35:35,861 INFO [main-SendThread(zjy-hadoop-prc-zk02.bj:11000)] org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 82260ms for sessionid 0x1691332e2d3aae5, closing socket connection and attempting reconnect
{code}

The stdout from JVM (can see from here there's 9126 threads & sync cost 80+s)
{code}
vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
32358.859: ForceAsyncSafepoint              [    9126         67            474    ]      [     1    28 86596    87   101    ]  0
{code}

Also we got the jstack: 
{code}
$ cat 31162.stack.1  | grep 'ForkJoinPool-1-worker' | wc -l
8648
{code}



> The ForkJoinPool in CleanerChore will spawn thousands of threads in our 
> cluster with thousands table
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-22867
>                 URL: https://issues.apache.org/jira/browse/HBASE-22867
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Zheng Hu
>            Priority: Blocker
>         Attachments: 31162.stack.1



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
