[ 
https://issues.apache.org/jira/browse/HBASE-19978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368723#comment-16368723
 ] 

Duo Zhang commented on HBASE-19978:
-----------------------------------

As for the core pool size, the default value is the larger of cpu/4 and 16. 
A user could still set it to a really small value if there are not many CPU 
cores, but I think anyone who changes this value is an advanced user and 
should know the bad effects of setting it very low.
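
To make the arithmetic concrete, here is a minimal sketch of the default 
described above. The class and method names are hypothetical and are not taken 
from the HBase code base; only the "larger of cpu/4 and 16" rule comes from 
this comment.

```java
// Hypothetical illustration only; not the actual ProcedureExecutor code.
class CorePoolSizeSketch {
    // Default core pool size as described above: the larger of cpu/4 and 16.
    static int defaultCorePoolSize(int cpus) {
        return Math.max(cpus / 4, 16);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("cpus=" + cpus
            + " defaultCorePoolSize=" + defaultCorePoolSize(cpus));
    }
}
```

Under this rule the default only rises above 16 on machines with more than 64 
cores, so a very small pool can only come from an explicit user setting.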

The intention here is to make the system more stable without making big 
changes, since 2.0 is close. I think we can open a new issue targeting 3.0 and 
try a more elegant solution there, coroutines or something similar.

I will be out for another two days. I will upload the new test soon. [~stack], 
if you think it is OK then please help commit this. I think this is useful, 
especially the test. In a production cluster we have hundreds of tables, and an 
RS can also host hundreds of regions. We need the UT to confirm that we will 
not get stuck on failover.

Thanks.

> The keepalive logic is incomplete in ProcedureExecutor
> ------------------------------------------------------
>
>                 Key: HBASE-19978
>                 URL: https://issues.apache.org/jira/browse/HBASE-19978
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>
>         Attachments: HBASE-19978-v1.patch, HBASE-19978.patch
>
>
> The worker thread just exits after the keep-alive time, and we never add it 
> back. The only way to add it back is through the stuck checker, which is not 
> correct. Instead, we should start a new worker thread if the pool is under 
> the core pool size and there are pending procedures.
> For now the default keep-alive time is Long.MAX_VALUE, which means no 
> timeout, so there is no problem, but we do allow users to set it, so we need 
> to fix it.
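
The rule in the description (start a new worker when the pool is under the core 
pool size and there is pending work) can be sketched as below. This is a 
simplified stand-alone illustration under assumed names (KeepAliveWorkerPool, 
ensureWorkers, and so on); it is not the ProcedureExecutor implementation and 
not the attached patch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: workers exit after a keep-alive timeout and are
// re-created on demand, instead of relying on a separate stuck checker.
class KeepAliveWorkerPool {
    private final BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();
    private final AtomicInteger workers = new AtomicInteger();
    private final int corePoolSize;
    private final long keepAliveMillis;

    KeepAliveWorkerPool(int corePoolSize, long keepAliveMillis) {
        this.corePoolSize = corePoolSize;
        this.keepAliveMillis = keepAliveMillis;
    }

    void submit(Runnable task) {
        pending.add(task);
        ensureWorkers();
    }

    int workerCount() {
        return workers.get();
    }

    // Start one worker if we are under the core pool size and work is pending.
    private void ensureWorkers() {
        while (!pending.isEmpty()) {
            int n = workers.get();
            if (n >= corePoolSize) {
                return;
            }
            if (workers.compareAndSet(n, n + 1)) {
                Thread t = new Thread(this::runWorker, "sketch-worker");
                t.setDaemon(true);
                t.start();
                return;
            }
        }
    }

    private void runWorker() {
        try {
            while (true) {
                Runnable task = pending.poll(keepAliveMillis, TimeUnit.MILLISECONDS);
                if (task == null) {
                    break; // keep-alive time elapsed with no work: exit
                }
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            workers.decrementAndGet();
            // A task may have been queued while this worker was timing out,
            // so re-check before the thread goes away.
            ensureWorkers();
        }
    }

    public static void main(String[] args) throws Exception {
        KeepAliveWorkerPool pool = new KeepAliveWorkerPool(2, 50);
        pool.submit(() -> System.out.println("first task"));
        Thread.sleep(300); // long enough for the idle worker to exit
        System.out.println("workers after keep-alive: " + pool.workerCount());
        pool.submit(() -> System.out.println("second task"));
        Thread.sleep(100);
    }
}
```

The re-check in the finally block covers the case where a task is submitted 
just as the last worker times out, which would otherwise leave the task queued 
with no thread to run it.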



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
