[ https://issues.apache.org/jira/browse/HBASE-14359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14729590#comment-14729590 ]
stack commented on HBASE-14359:
-------------------------------
bq. Could be a reason our tests hang over on builds.apache.org when resources
get tight.
For sure, we have been seeing cases of OOME "unable to create new native
thread" lately. Some of this has been ameliorated by our spinning up fewer
threads when testing, but there is still work to do.
> HTable#close will hang forever if unchecked error/exception thrown in
> AsyncProcess#sendMultiAction
> --------------------------------------------------------------------------------------------------
>
> Key: HBASE-14359
> URL: https://issues.apache.org/jira/browse/HBASE-14359
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.98.14, 1.1.2
> Reporter: Yu Li
> Assignee: Victor Xu
> Attachments: HBASE-14359-0.98-v1.patch,
> HBASE-14359-branch-1-v1.patch, HBASE-14359-master-branch1-v1.patch,
> HBASE-14359-master-v1.patch
>
>
> Currently, AsyncProcess#sendMultiAction catches only
> RejectedExecutionException and lets any other error/exception propagate, so
> decTaskCounter is never invoked. Meanwhile, the recommended way to use HTable
> is to close the table in a finally clause, and HTable#close calls
> flushCommits and waits until all tasks are done.
> The problem is that when an unchecked error/exception such as
> OutOfMemoryError is thrown, taskSent will never equal taskDone, so
> AsyncProcess#waitUntilDone will never return. In particular, if autoflush is
> set, so that there is no data to flush during table close, there is no rpc
> call, so rpcTimeOut cannot break the wait, and the thread waits there
> forever.
> In our production environment, the unchecked error we observed was
> "java.lang.OutOfMemoryError: unable to create new native thread", and we
> observed the client thread hanging for hours.
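
To make the failure mode in the description concrete, here is a minimal sketch of the counter accounting. It is not the HBase code and not the attached patch. The class TaskCounterSketch and its methods send, waitUntilDone and lastSubmitFailure are hypothetical names. Only the taskSent/taskDone idea and the narrow RejectedExecutionException catch come from the description. The sketch assumes the remedy is to balance the counter for any Throwable raised during submission, and it adds a bounded wait as a second safeguard.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the task accounting described above (not HBase code). */
final class TaskCounterSketch {
    private final AtomicLong taskSent = new AtomicLong();
    private final AtomicLong taskDone = new AtomicLong();
    private final Executor pool;
    private volatile Throwable lastSubmitFailure;

    TaskCounterSketch(Executor pool) {
        this.pool = pool;
    }

    /** Submits one task and guarantees taskDone is bumped exactly once for it. */
    void send(Runnable work) {
        taskSent.incrementAndGet();
        AtomicBoolean counted = new AtomicBoolean(false);
        Runnable markDone = () -> {
            if (counted.compareAndSet(false, true)) {
                taskDone.incrementAndGet();
            }
        };
        try {
            pool.execute(() -> {
                try {
                    work.run();
                } finally {
                    markDone.run(); // normal completion path
                }
            });
        } catch (Throwable t) {
            // Catching only RejectedExecutionException here would skip this
            // branch for an OutOfMemoryError, leaving taskSent > taskDone forever.
            lastSubmitFailure = t;
            markDone.run();
        }
    }

    /** Waits for taskDone to catch up with taskSent; returns false on timeout or interrupt. */
    boolean waitUntilDone(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (taskDone.get() < taskSent.get()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    /** Last error raised while submitting, or null if every submission succeeded. */
    Throwable lastSubmitFailure() {
        return lastSubmitFailure;
    }
}
```

With an Executor whose execute method throws OutOfMemoryError, send still balances the counters, so waitUntilDone returns instead of spinning. Narrowing the catch clause to RejectedExecutionException reproduces the hang described above, except that the timeout in this sketch would eventually end the wait.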