[
https://issues.apache.org/jira/browse/HBASE-6748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13554408#comment-13554408
]
Sergey Shelukhin commented on HBASE-6748:
-----------------------------------------
Patch needs to be rebased.
{code}
if (failedDeletions.size() > 0) {
List<String> tmpPaths = new ArrayList<String>(failedDeletions);
...
failedDeletions.removeAll(tmpPaths);
}
{code}
This is not thread safe; entries can be added to failedDeletions during the deletes.
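One possible way to close that window, as a rough sketch only: it assumes failedDeletions can be a concurrent set, that the elided "..." part is what issues the deletes, and that failures are recorded from a callback. The class and method names (FailedDeletionRetrier, recordFailure, retryFailedDeletions, pending) are made up for illustration and are not from the patch. The idea is to remove the snapshot before retrying rather than after, so a path that fails again during the retry is re-added and not wiped out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical holder class; names are illustrative, not taken from the patch.
class FailedDeletionRetrier {
    // Concurrent set, so callbacks may add paths while a retry pass is running.
    private final Set<String> failedDeletions = ConcurrentHashMap.newKeySet();

    // Assumed to be called from a delete-failure callback, possibly on another thread.
    void recordFailure(String path) {
        failedDeletions.add(path);
    }

    // Snapshot the set and remove the snapshot BEFORE retrying.
    // Nothing is removed once the deletes have started, so a path that
    // fails again during the retry stays in the set for the next pass.
    int retryFailedDeletions(Consumer<String> deleteNode) {
        if (failedDeletions.isEmpty()) {
            return 0;
        }
        List<String> tmpPaths = new ArrayList<>(failedDeletions);
        failedDeletions.removeAll(tmpPaths);
        for (String path : tmpPaths) {
            deleteNode.accept(path);
        }
        return tmpPaths.size();
    }

    // Number of paths still waiting for a retry.
    int pending() {
        return failedDeletions.size();
    }
}
```

With the original ordering (delete first, removeAll afterwards), a path that failed again in the middle of the pass would be dropped by removeAll and never retried.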
In
{code}
if (needAbandonRetries(rc,
{code}
cases, the code doesn't call deleteNodeFailure etc., which each respective
method calls on failure further below.
That was the existing behavior in all cases; however, a createNodeFailure call
was added to "if (needAbandonRetries(rc, ...))" in create. Why is it
inconsistent?
> Endless recursive of deleteNode happened in
> SplitLogManager#DeleteAsyncCallback
> -------------------------------------------------------------------------------
>
> Key: HBASE-6748
> URL: https://issues.apache.org/jira/browse/HBASE-6748
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.1, 0.96.0
> Reporter: Jieshan Bean
> Assignee: Jeffrey Zhong
> Fix For: 0.96.0, 0.94.5
>
> Attachments: hbase-6748_1.patch, hbase-6748.patch
>
>
> You can easily understand the problem from the logs below:
> {code}
> [2012-09-01 11:41:02,062] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978]
> create rc =SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=3
> [2012-09-01 11:41:02,062] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978]
> create rc =SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=2
> [2012-09-01 11:41:02,063] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978]
> create rc =SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=1
> [2012-09-01 11:41:02,063] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$CreateAsyncCallback 978]
> create rc =SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=0
> [2012-09-01 11:41:02,063] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager 393] failed to create task
> node/hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> [2012-09-01 11:41:02,063] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager 353] Error splitting
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> [2012-09-01 11:41:02,063] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052]
> delete rc=SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=9223372036854775807
> [2012-09-01 11:41:02,064] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052]
> delete rc=SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=9223372036854775806
> [2012-09-01 11:41:02,064] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052]
> delete rc=SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=9223372036854775805
> [2012-09-01 11:41:02,064] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052]
> delete rc=SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=9223372036854775804
> [2012-09-01 11:41:02,065] [WARN ]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback 1052]
> delete rc=SESSIONEXPIRED for
> /hbase/splitlog/hdfs%3A%2F%2Fxh01%3A9000%2Fhbase%2F.logs%2Fxh01%2C20020%2C1339552105088-splitting%2Fxh01%252C20020%252C1339552105088.1339557014846
> remaining retries=9223372036854775803
> ...................
> [2012-09-01 11:41:03,307] [ERROR]
> [MASTER_SERVER_OPERATIONS-xh03,20000,1339549619270-1]
> [org.apache.zookeeper.ClientCnxn 623] Caught unexpected throwable
> java.lang.StackOverflowError
> {code}