[jira] [Commented] (HBASE-14317) Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL

stack (JIRA) Mon, 31 Aug 2015 13:52:37 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724013#comment-14724013
 ]


stack commented on HBASE-14317:
-------------------------------

This is the new bit in your patch:

{code}
for (int i = 0; i < syncFutures.length; i++) {
  if (syncFutures[i] != null) {
    this.syncFutures[i].done(sequence, e);
  }
}
{code}

... running through every possible SyncFuture slot, even though there are as 
many slots as there are handlers? Are you thinking that we have just put a 
SyncFuture in but have not yet updated the count of futures? Is that possible, 
given that a single thread does both the SyncFuture addition and the count 
increment?
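
To make the question concrete, here is a minimal sketch of the two ways to fail outstanding futures. It is not the FSHLog code: StubFuture, SyncFutureSlotsSketch, offer, failCounted and failAllNonNull are hypothetical names made up for illustration, and the sketch assumes one thread does both the slot store and the count increment. Under that assumption the counted loop and the full null-checked scan reach the same futures.

```java
/** Hypothetical stand-in for a sync future; NOT the HBase SyncFuture class. */
final class StubFuture {
    private boolean done;
    private Throwable error;

    /** Mark this future complete, recording the failure (if any). */
    void done(long sequence, Throwable error) {
        this.done = true;
        this.error = error;
    }

    boolean isDone() {
        return done;
    }

    Throwable getError() {
        return error;
    }
}

/** Hypothetical holder with one slot per handler, written by a single thread. */
final class SyncFutureSlotsSketch {
    private final StubFuture[] slots;
    private int count;

    SyncFutureSlotsSketch(int handlerCount) {
        this.slots = new StubFuture[handlerCount];
    }

    /** Slot store and count increment happen together on the one writer thread. */
    void offer(StubFuture future) {
        slots[count++] = future;
    }

    /** Fail only the futures covered by the count. Returns how many were failed. */
    int failCounted(long sequence, Throwable e) {
        int failed = 0;
        for (int i = 0; i < count; i++) {
            slots[i].done(sequence, e);
            failed++;
        }
        return failed;
    }

    /** Fail every non-null slot, ignoring the count. Returns how many were failed. */
    int failAllNonNull(long sequence, Throwable e) {
        int failed = 0;
        for (int i = 0; i < slots.length; i++) {
            if (slots[i] != null) {
                slots[i].done(sequence, e);
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        SyncFutureSlotsSketch sketch = new SyncFutureSlotsSketch(8);
        sketch.offer(new StubFuture());
        sketch.offer(new StubFuture());
        sketch.offer(new StubFuture());
        RuntimeException cause = new RuntimeException("simulated append failure");
        System.out.println("counted loop failed: " + sketch.failCounted(1L, cause));
        System.out.println("full scan failed:    " + sketch.failAllNonNull(1L, cause));
    }
}
```

The two loops would only differ if a slot could be filled without the count moving, or if old futures were left behind in slots past the count. Whether either can happen in the real code is the question above.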

> Stuck FSHLog: bad disk (HDFS-8960) and can't roll WAL
> -----------------------------------------------------
>
>                 Key: HBASE-14317
>                 URL: https://issues.apache.org/jira/browse/HBASE-14317
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.1.1
>            Reporter: stack
>            Priority: Critical
>         Attachments: 14317.test.txt, HBASE-14317-v1.patch, HBASE-14317.patch, 
> [Java] RS stuck on WAL sync to a dead DN - Pastebin.com.html, 
> append-only-test.patch, raw.php, san_dump.txt, subset.of.rs.log
>
>
> hbase-1.1.1 and hadoop-2.7.1
> We try to roll logs because can't append (See HDFS-8960) but we get stuck. 
> See attached thread dump and associated log. What is interesting is that 
> syncers are waiting to take syncs to run and at same time we want to flush so 
> we are waiting on a safe point but there seems to be nothing in our ring 
> buffer; did we go to roll log and not add safe point sync to clear out 
> ringbuffer?
> Needs a bit of study. Try to reproduce.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
