[
https://issues.apache.org/jira/browse/HBASE-10922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962356#comment-13962356
]
Jeffrey Zhong commented on HBASE-10922:
---------------------------------------
Is this a UI refresh problem or a time zone issue? I saw the following log lines:
{noformat}
hbase-jxiang-master-e1120.halxg.cloudera.com.log:2014-04-06 19:52:23,358 INFO
[RS_LOG_REPLAY_OPS-e1120:36020-0] wal.HLogSplitter: Splitting hlog:
hdfs://e1120:35802/hbase/WALs/e1320.halxg.cloudera.com,36020,1396838053131-splitting/e1320.halxg.cloudera.com%2C36020%2C1396838053131.1396838130980,
length=133872798
....
hbase-jxiang-master-e1120.halxg.cloudera.com.log:2014-04-06 21:28:12,172 INFO
[RS_LOG_REPLAY_OPS-e1120:36020-1] wal.HLogSplitter: Processed 16 edits across 1
regions; log
file=hdfs://e1120:35802/hbase/WALs/e1320.halxg.cloudera.com,36020,1396838053131-splitting/e1320.halxg.cloudera.com%2C36020%2C1396838053131.1396838130980
is corrupted = false progress failed = false
{noformat}
The log splitting started at 19:52:23 and completed at 21:28:12, because there
were some errors in between.
{noformat}
hbase-jxiang-master-e1120.halxg.cloudera.com.log:2014-04-06 20:24:28,634 INFO
[RS_LOG_REPLAY_OPS-e1120:36020-0] wal.HLogSplitter: Splitting hlog:
hdfs://e1120:35802/hbase/WALs/e1320.halxg.cloudera.com,36020,1396838053131-splitting/e1320.halxg.cloudera.com%2C36020%2C1396838053131.1396838124343,
length=133855324
...
hbase-jxiang-master-e1120.halxg.cloudera.com.log:2014-04-06 20:39:30,583 WARN
[RS_LOG_REPLAY_OPS-e1120:36020-1] regionserver.SplitLogWorker: log splitting of
WALs/e1320.halxg.cloudera.com,36020,1396838053131-splitting/e1320.halxg.cloudera.com%2C36020%2C1396838053131.1396838130980
failed, returning error
hbase-jxiang-master-e1120.halxg.cloudera.com.log:2014-04-06 21:05:30,113 WARN
[RS_LOG_REPLAY_OPS-e1120:36020-0] regionserver.SplitLogWorker: log splitting of
WALs/e1320.halxg.cloudera.com,36020,1396838053131-splitting/e1320.halxg.cloudera.com%2C36020%2C1396838053131.1396838124343
failed, returning error
hbase-jxiang-master-e1120.halxg.cloudera.com.log:2014-04-06 21:28:09,619 INFO
[main-EventThread] wal.HLogSplitter: Archived processed log
hdfs://e1120:35802/hbase/WALs/e1320.halxg.cloudera.com,36020,1396838053131-splitting/e1320.halxg.cloudera.com%2C36020%2C1396838053131.1396838124343
to
hdfs://e1120:35802/hbase/oldWALs/e1320.halxg.cloudera.com%2C36020%2C1396838053131.1396838124343
{noformat}
The log splitting above is similar: it started at 20:24:28 and was not
completed until 21:05:30, with several errors along the way. From the code in
HLogSplitter#splitLogFile, status.markComplete is called in a finally block, so
we should see the status updated at least several times.
{code}
status.markComplete(msg);
{code}
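To illustrate the point about the finally block, here is a minimal, self-contained sketch. It is not the HBase code: the Status class, the splitLogFile signature, and the "wal-1" name are hypothetical stand-ins, and the failure is simulated. It only shows that a markComplete call placed in a finally block runs once per attempt, whether the attempt fails or succeeds.

```java
import java.util.ArrayList;
import java.util.List;

public class FinallyStatusSketch {
    /** Hypothetical stand-in for the task status object; not the HBase class. */
    static class Status {
        final List<String> completions = new ArrayList<>();

        void markComplete(String msg) {
            completions.add(msg);
        }
    }

    /** Hypothetical split attempt; fails when shouldFail is true (simulated). */
    static boolean splitLogFile(String log, boolean shouldFail, Status status) {
        boolean progressFailed = false;
        try {
            if (shouldFail) {
                throw new RuntimeException("simulated failure");
            }
            return true;
        } catch (RuntimeException e) {
            progressFailed = true;
            return false;
        } finally {
            // Runs on both the success path and the failure path.
            status.markComplete("Processed " + log + ", progress failed = " + progressFailed);
        }
    }

    /** Two failed attempts followed by a successful retry of the same log. */
    static List<String> runAttempts() {
        Status status = new Status();
        splitLogFile("wal-1", true, status);
        splitLogFile("wal-1", true, status);
        splitLogFile("wal-1", false, status);
        return status.completions;
    }

    public static void main(String[] args) {
        runAttempts().forEach(System.out::println);
    }
}
```

Under these assumptions, every attempt records a completion message, so a log that is retried after errors should produce one status update per attempt.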
> Log splitting seems to hang
> ---------------------------
>
> Key: HBASE-10922
> URL: https://issues.apache.org/jira/browse/HBASE-10922
> Project: HBase
> Issue Type: Bug
> Components: wal
> Reporter: Jimmy Xiang
> Attachments: log-splitting_hang.png, master-log-grep.txt
>
>
> With distributed log replay enabled by default, I ran into an issue that log
> splitting hasn't completed after 13 hours. It seems to hang somewhere.