[
https://issues.apache.org/jira/browse/HBASE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195735#comment-15195735
]
Sangjin Lee commented on HBASE-15436:
-------------------------------------
Thanks [~anoop.hbase]. That sounds plausible.
This does represent a pretty critical issue then, no? If a region server is in
a state where a socket timeout is thrown in this manner, flush will be stuck
for a LONG time. In a high-throughput situation, this would have pretty severe
consequences and cause major instability on the client.
For background, we are working on using HBase for the timeline service v.2
(see YARN-4736), and node managers will be the HBase clients. If one or more
region servers are in an unhealthy state, this issue would have a pretty big
cascading effect on the client cluster, correct?
Does this behavior exist in later releases as well?
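In the meantime we will probably need to protect the node managers on our
side. What I have in mind is something along these lines, i.e. running
{{flush()}} on a worker thread and bounding how long the caller waits. This is
only a sketch (the helper name and the timeout are made up), and it does not
fix the underlying flush, which may well keep retrying in the background:
{code:java}
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import org.apache.hadoop.hbase.client.BufferedMutator;

public final class BoundedFlush {
  private BoundedFlush() {}

  /**
   * Runs mutator.flush() on a worker thread and stops waiting after
   * timeoutSeconds. Returns true if the flush completed, false if we gave up
   * waiting. On false, the flush may still be retrying in the background and
   * the buffered data is not guaranteed to be durable.
   */
  public static boolean flushWithTimeout(BufferedMutator mutator, long timeoutSeconds)
      throws IOException, InterruptedException {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      Future<?> flush = pool.submit(() -> {
        mutator.flush();
        return null;
      });
      try {
        flush.get(timeoutSeconds, TimeUnit.SECONDS);
        return true;
      } catch (TimeoutException e) {
        // Unblock the caller; interrupting the flush is best effort only.
        flush.cancel(true);
        return false;
      } catch (ExecutionException e) {
        // Surface the underlying write failure to the caller.
        if (e.getCause() instanceof IOException) {
          throw (IOException) e.getCause();
        }
        throw new IOException(e.getCause());
      }
    } finally {
      pool.shutdownNow();
    }
  }
}
{code}
Of course this only unblocks the caller; it is a band-aid until the root cause
in {{flush()}} itself is addressed.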
> BufferedMutatorImpl.flush() appears to get stuck
> ------------------------------------------------
>
> Key: HBASE-15436
> URL: https://issues.apache.org/jira/browse/HBASE-15436
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 1.0.2
> Reporter: Sangjin Lee
> Attachments: hbaseException.log, threaddump.log
>
>
> We noticed an instance where the thread that was executing a flush
> ({{BufferedMutatorImpl.flush()}}) got stuck when the (local one-node) cluster
> shut down, and it was never able to get out of that stuck state.
> The setup is a single-node HBase cluster, and apparently the cluster went
> away when the client was executing flush. The flush eventually logged a
> failure after 30+ minutes of retrying. That is understandable.
> What is unexpected is that the thread remains stuck in this state (i.e. in
> the {{flush()}} call). I would have expected the {{flush()}} call to return
> after the complete failure.
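> For reference, a stripped-down version of the client code path looks roughly
> like the following (the table, family, and qualifier names are placeholders,
> and the lowered retry settings are only there to keep the failure window
> short when testing against a dead cluster):
> {code:java}
> import java.io.IOException;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.BufferedMutator;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Put;
> import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class FlushRepro {
>   public static void main(String[] args) throws IOException {
>     Configuration conf = HBaseConfiguration.create();
>     // Lower the retry budget so a dead cluster fails the flush in seconds
>     // rather than tens of minutes; the defaults retry for a very long time.
>     conf.setInt("hbase.client.retries.number", 3);
>     conf.setLong("hbase.client.pause", 100L);
>
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          BufferedMutator mutator =
>              conn.getBufferedMutator(TableName.valueOf("testtable"))) {
>       Put put = new Put(Bytes.toBytes("row1"));
>       put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
>       mutator.mutate(put);
>       // If the cluster goes away at this point, the expectation is that
>       // flush() eventually throws (e.g. RetriesExhaustedWithDetailsException)
>       // once the retries are used up, rather than blocking indefinitely.
>       try {
>         mutator.flush();
>       } catch (RetriesExhaustedWithDetailsException e) {
>         System.err.println("flush failed after retries: " + e.getMessage());
>       }
>       // Note: closing the mutator will also attempt to flush any remaining
>       // buffered mutations.
>     }
>   }
> }
> {code}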