Toshihiro Suzuki created HBASE-20616:
----------------------------------------
Summary: TruncateTableProcedure is stuck in retry loop in
TRUNCATE_TABLE_CREATE_FS_LAYOUT state
Key: HBASE-20616
URL: https://issues.apache.org/jira/browse/HBASE-20616
Project: HBase
Issue Type: Bug
Components: amv2
Environment: HDP-2.5.3
Reporter: Toshihiro Suzuki
Assignee: Toshihiro Suzuki
First, TruncateTableProcedure failed to write some files to HDFS in the
TRUNCATE_TABLE_CREATE_FS_LAYOUT state (the cause is unknown):
{code:java}
2018-05-15 08:00:25,346 WARN [ProcedureExecutorThread-8]
procedure.TruncateTableProcedure: Retriable error trying to truncate
table=<namespace>:<table> state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
java.io.IOException: java.util.concurrent.ExecutionException:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/apps/hbase/data/.tmp/data/<namespace>/<table>/<region>/.regioninfo could only
be replicated to 0 nodes instead of minReplication (=1). There are <the number
of DNs> datanode(s) running and no node(s) are excluded in this operation.
...
{code}
However, it seems that some of the files were written to HDFS successfully
during this attempt. After that, TruncateTableProcedure was stuck in a retry
loop in the TRUNCATE_TABLE_CREATE_FS_LAYOUT state, and the following messages
were logged repeatedly in the master log:
{code:java}
2018-05-15 08:00:25,463 WARN [ProcedureExecutorThread-8]
procedure.TruncateTableProcedure: Retriable error trying to truncate
table=<namespace>:<table> state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
java.io.IOException: java.util.concurrent.ExecutionException:
java.io.IOException: The specified region already exists on disk:
hdfs://<name>/apps/hbase/data/.tmp/data/<namespace>/<table>/<region>
...
{code}
This appears to happen because, on retry, TruncateTableProcedure tries to write
the files that were already written successfully on the first attempt.
I think we need to delete all the files and directories that were written
successfully in the previous attempt before retrying the
TRUNCATE_TABLE_CREATE_FS_LAYOUT state.
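As a rough sketch of the idea (not the actual HBase code and not a proposed patch), the retry can be made safe by wiping whatever the previous attempt left behind before recreating the layout. The example below uses java.nio.file on a local filesystem as a stand-in for HDFS. The class name IdempotentLayoutStep and the methods deleteRecursively, createRegionDir and createFsLayout are hypothetical names chosen for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * Hypothetical illustration only: uses the local filesystem instead of HDFS
 * and does not use any HBase classes.
 */
final class IdempotentLayoutStep {

    /** Recursively deletes dir if it exists, so a retry starts from a clean state. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** Mimics the non-idempotent step: fails if the region directory already exists. */
    static void createRegionDir(Path tableTmpDir, String region) throws IOException {
        Path regionDir = tableTmpDir.resolve(region);
        if (Files.exists(regionDir)) {
            throw new IOException("The specified region already exists on disk: " + regionDir);
        }
        Files.createDirectories(regionDir);
        Files.write(regionDir.resolve(".regioninfo"), new byte[0]);
    }

    /** Retry-safe variant: remove leftovers from a previous attempt, then recreate everything. */
    static void createFsLayout(Path tableTmpDir, List<String> regions) throws IOException {
        deleteRecursively(tableTmpDir);
        Files.createDirectories(tableTmpDir);
        for (String region : regions) {
            createRegionDir(tableTmpDir, region);
        }
    }
}
```

With this shape, calling createFsLayout a second time after a partial failure succeeds, whereas calling createRegionDir directly on an existing region directory still fails in the same way as the second log excerpt above.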
This issue was observed in HDP-2.5.3, but I think upstream has the same issue.
It also looks to me like CreateTableProcedure has a similar issue.