[ 
https://issues.apache.org/jira/browse/HBASE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13830616#comment-13830616
 ] 

stack commented on HBASE-9736:
------------------------------

Patch seems to work nicely.  Is there anything we can do to make the logs 
easier to follow?  The executor threads are named 
RS_LOG_REPLAY_OPS-c2024:60020-0 and RS_LOG_REPLAY_OPS-c2024:60020-1, which 
helps, but other threads (the WriterThread-N ones, for example) have generic 
names, so it is hard to tell which split they belong to.  I suppose that is 
not a fault of this patch; we can address it in another issue.  I'm +1 on 
commit for 0.96 and trunk (you can address my nits on commit).  This will 
help MTTR on small clusters.
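
To illustrate the thread-naming point, here is a minimal sketch of one way 
generic worker threads could inherit the name of the executor thread that 
started them.  It is not taken from the patch: NamedThreadFactory and the 
"-Writer" suffix are hypothetical names used only for illustration, and only 
standard java.util.concurrent classes are assumed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not an HBase class): names threads "<prefix>-<n>". */
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger next = new AtomicInteger(0);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        // Each new thread gets the shared prefix plus a running counter.
        Thread t = new Thread(r, prefix + "-" + next.getAndIncrement());
        t.setDaemon(true);
        return t;
    }
}

final class NamedThreadFactoryDemo {
    public static void main(String[] args) throws InterruptedException {
        // Two "splitter" threads, named like the executor threads in the log.
        ExecutorService splitters = Executors.newFixedThreadPool(
            2, new NamedThreadFactory("RS_LOG_REPLAY_OPS-c2024:60020"));
        for (int i = 0; i < 2; i++) {
            splitters.execute(() -> {
                // Writers reuse the parent's name as their prefix, so a log
                // line shows which split a given writer belongs to.
                String parent = Thread.currentThread().getName();
                Thread writer = new NamedThreadFactory(parent + "-Writer")
                    .newThread(() -> System.out.println(
                        Thread.currentThread().getName() + ": starting"));
                writer.start();
                try {
                    writer.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        splitters.shutdown();
        splitters.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

With names built this way, two concurrent splits would no longer both log 
under WriterThread-0, which is the ambiguity visible in the log below.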

{code}
2013-11-22 23:47:12,064 DEBUG [regionserver60020-EventThread] 
regionserver.SplitLogWorker: tasks arrived or departed
2013-11-22 23:47:12,123 INFO  
[SplitLogWorker-c2024.halxg.cloudera.com,60020,1385187483879] 
regionserver.SplitLogWorker: worker 
c2024.halxg.cloudera.com,60020,1385187483879 acquired task 
/hbase/splitWAL/WALs%2Fc2023.halxg.cloudera.com%2C60020%2C1385187483937-splitting%2Fc2023.halxg.cloudera.com%252C60020%252C1385187483937.1385192599487
2013-11-22 23:47:12,126 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Splitting hlog: 
hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2023.halxg.cloudera.com,60020,1385187483937-splitting/c2023.halxg.cloudera.com%2C60020%2C1385187483937.1385192599487,
 length=127509944
2013-11-22 23:47:12,126 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: DistributedLogReplay = false
2013-11-22 23:47:12,140 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-1] 
util.FSHDFSUtils: Recovering lease on dfs file 
hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2023.halxg.cloudera.com,60020,1385187483937-splitting/c2023.halxg.cloudera.com%2C60020%2C1385187483937.1385192599487
2013-11-22 23:47:12,140 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-1] 
util.FSHDFSUtils: recoverLease=true, attempt=0 on 
file=hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2023.halxg.cloudera.com,60020,1385187483937-splitting/c2023.halxg.cloudera.com%2C60020%2C1385187483937.1385192599487
 after 0ms
2013-11-22 23:47:12,180 DEBUG [WriterThread-1] wal.HLogSplitter: Writer thread 
Thread[WriterThread-1,5,main]: starting
2013-11-22 23:47:12,180 DEBUG [WriterThread-2] wal.HLogSplitter: Writer thread 
Thread[WriterThread-2,5,main]: starting
2013-11-22 23:47:12,180 DEBUG [WriterThread-0] wal.HLogSplitter: Writer thread 
Thread[WriterThread-0,5,main]: starting
2013-11-22 23:47:12,211 DEBUG [WriterThread-0] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/46430805f1342beef533afa42fe162a9/recovered.edits/0000000000001256118.temp
 region=46430805f1342beef533afa42fe162a9
2013-11-22 23:47:12,228 DEBUG [WriterThread-1] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/cfecca49fdfd79c90f015c372bf96d38/recovered.edits/0000000000001256119.temp
 region=cfecca49fdfd79c90f015c372bf96d38
2013-11-22 23:47:12,241 DEBUG [WriterThread-2] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/d7d0ba72747c98bd720bc4644dd4c39a/recovered.edits/0000000000001256121.temp
 region=d7d0ba72747c98bd720bc4644dd4c39a
2013-11-22 23:47:12,286 DEBUG [WriterThread-0] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/c660251766971d6f3033290a5b3f3087/recovered.edits/0000000000001256146.temp
 region=c660251766971d6f3033290a5b3f3087
2013-11-22 23:47:12,286 DEBUG [WriterThread-1] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/81d98d73d90d977f7b0aeccdc7bae5aa/recovered.edits/0000000000001256136.temp
 region=81d98d73d90d977f7b0aeccdc7bae5aa
2013-11-22 23:47:12,336 DEBUG [WriterThread-2] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/21b4b0afbbefc00529c9e3d07f5aa0b7/recovered.edits/0000000000001256152.temp
 region=21b4b0afbbefc00529c9e3d07f5aa0b7
2013-11-22 23:47:12,914 INFO  
[SplitLogWorker-c2024.halxg.cloudera.com,60020,1385187483879] 
regionserver.SplitLogWorker: worker 
c2024.halxg.cloudera.com,60020,1385187483879 acquired task 
/hbase/splitWAL/WALs%2Fc2023.halxg.cloudera.com%2C60020%2C1385187483937-splitting%2Fc2023.halxg.cloudera.com%252C60020%252C1385187483937.1385192691392
2013-11-22 23:47:12,918 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-0] 
wal.HLogSplitter: Splitting hlog: 
hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2023.halxg.cloudera.com,60020,1385187483937-splitting/c2023.halxg.cloudera.com%2C60020%2C1385187483937.1385192691392,
 length=140511622
2013-11-22 23:47:12,918 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-0] 
wal.HLogSplitter: DistributedLogReplay = false
2013-11-22 23:47:12,923 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-0] 
util.FSHDFSUtils: Recovering lease on dfs file 
hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2023.halxg.cloudera.com,60020,1385187483937-splitting/c2023.halxg.cloudera.com%2C60020%2C1385187483937.1385192691392
2013-11-22 23:47:12,924 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-0] 
util.FSHDFSUtils: recoverLease=true, attempt=0 on 
file=hdfs://c2020.halxg.cloudera.com:8020/hbase/WALs/c2023.halxg.cloudera.com,60020,1385187483937-splitting/c2023.halxg.cloudera.com%2C60020%2C1385187483937.1385192691392
 after 1ms
2013-11-22 23:47:12,933 DEBUG [WriterThread-0] wal.HLogSplitter: Writer thread 
Thread[WriterThread-0,5,main]: starting
2013-11-22 23:47:12,933 DEBUG [WriterThread-1] wal.HLogSplitter: Writer thread 
Thread[WriterThread-1,5,main]: starting
2013-11-22 23:47:12,933 DEBUG [WriterThread-2] wal.HLogSplitter: Writer thread 
Thread[WriterThread-2,5,main]: starting
2013-11-22 23:47:12,953 DEBUG [WriterThread-2] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/d0c5de700c2c776947177012021716c7/recovered.edits/0000000000001286681.temp
 region=d0c5de700c2c776947177012021716c7
2013-11-22 23:47:12,974 DEBUG [WriterThread-0] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/c660251766971d6f3033290a5b3f3087/recovered.edits/0000000000001286694.temp
 region=c660251766971d6f3033290a5b3f3087
2013-11-22 23:47:12,994 DEBUG [WriterThread-1] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/d23e69b1d07aa6a82f8ac0b5edc8336b/recovered.edits/0000000000001286687.temp
 region=d23e69b1d07aa6a82f8ac0b5edc8336b
2013-11-22 23:47:13,002 DEBUG [WriterThread-2] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/81d98d73d90d977f7b0aeccdc7bae5aa/recovered.edits/0000000000001286697.temp
 region=81d98d73d90d977f7b0aeccdc7bae5aa
2013-11-22 23:47:13,028 DEBUG [WriterThread-0] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/21b4b0afbbefc00529c9e3d07f5aa0b7/recovered.edits/0000000000001286699.temp
 region=21b4b0afbbefc00529c9e3d07f5aa0b7
2013-11-22 23:47:13,033 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Finishing writing output logs and closing down.
2013-11-22 23:47:13,033 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Waiting for split writer threads to finish
2013-11-22 23:47:13,034 INFO  [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Split writers finished
2013-11-22 23:47:13,034 DEBUG [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Submitting close of 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/21b4b0afbbefc00529c9e3d07f5aa0b7/recovered.edits/0000000000001256152.temp
2013-11-22 23:47:13,034 DEBUG [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Submitting close of 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/46430805f1342beef533afa42fe162a9/recovered.edits/0000000000001256118.temp
2013-11-22 23:47:13,034 DEBUG [split-log-closeStream-1] wal.HLogSplitter: 
Closing 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/21b4b0afbbefc00529c9e3d07f5aa0b7/recovered.edits/0000000000001256152.temp
2013-11-22 23:47:13,034 DEBUG [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Submitting close of 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/81d98d73d90d977f7b0aeccdc7bae5aa/recovered.edits/0000000000001256136.temp
2013-11-22 23:47:13,034 DEBUG [split-log-closeStream-2] wal.HLogSplitter: 
Closing 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/46430805f1342beef533afa42fe162a9/recovered.edits/0000000000001256118.temp
2013-11-22 23:47:13,034 DEBUG [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Submitting close of 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/c660251766971d6f3033290a5b3f3087/recovered.edits/0000000000001256146.temp
2013-11-22 23:47:13,034 DEBUG [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Submitting close of 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/cfecca49fdfd79c90f015c372bf96d38/recovered.edits/0000000000001256119.temp
2013-11-22 23:47:13,034 DEBUG [split-log-closeStream-3] wal.HLogSplitter: 
Closing 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/81d98d73d90d977f7b0aeccdc7bae5aa/recovered.edits/0000000000001256136.temp
2013-11-22 23:47:13,034 DEBUG [RS_LOG_REPLAY_OPS-c2024:60020-1] 
wal.HLogSplitter: Submitting close of 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/d7d0ba72747c98bd720bc4644dd4c39a/recovered.edits/0000000000001256121.temp
2013-11-22 23:47:13,036 DEBUG [WriterThread-1] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/46430805f1342beef533afa42fe162a9/recovered.edits/0000000000001286700.temp
 region=46430805f1342beef533afa42fe162a9
2013-11-22 23:47:13,052 INFO  [split-log-closeStream-2] wal.HLogSplitter: 
Closed wap 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/46430805f1342beef533afa42fe162a9/recovered.edits/0000000000001256118.temp
 (wrote 605 edits in 169ms)
2013-11-22 23:47:13,061 DEBUG [WriterThread-0] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/d7d0ba72747c98bd720bc4644dd4c39a/recovered.edits/0000000000001286701.temp
 region=d7d0ba72747c98bd720bc4644dd4c39a
2013-11-22 23:47:13,061 DEBUG [WriterThread-2] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/0e766ffee2d59971310cb00971f79054/recovered.edits/0000000000001286706.temp
 region=0e766ffee2d59971310cb00971f79054
2013-11-22 23:47:13,062 INFO  [split-log-closeStream-3] wal.HLogSplitter: 
Closed wap 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/81d98d73d90d977f7b0aeccdc7bae5aa/recovered.edits/0000000000001256136.temp
 (wrote 596 edits in 196ms)
2013-11-22 23:47:13,062 INFO  [split-log-closeStream-1] wal.HLogSplitter: 
Closed wap 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/21b4b0afbbefc00529c9e3d07f5aa0b7/recovered.edits/0000000000001256152.temp
 (wrote 598 edits in 209ms)
2013-11-22 23:47:13,071 DEBUG [WriterThread-1] wal.HLogSplitter: Creating 
writer 
path=hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/cfecca49fdfd79c90f015c372bf96d38/recovered.edits/0000000000001286710.temp
 region=cfecca49fdfd79c90f015c372bf96d38
2013-11-22 23:47:13,112 DEBUG [split-log-closeStream-2] wal.HLogSplitter: 
Rename 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/46430805f1342beef533afa42fe162a9/recovered.edits/0000000000001256118.temp
 to 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/46430805f1342beef533afa42fe162a9/recovered.edits/0000000000001259702
2013-11-22 23:47:13,112 DEBUG [split-log-closeStream-3] wal.HLogSplitter: 
Rename 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/81d98d73d90d977f7b0aeccdc7bae5aa/recovered.edits/0000000000001256136.temp
 to 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/81d98d73d90d977f7b0aeccdc7bae5aa/recovered.edits/0000000000001259698
2013-11-22 23:47:13,112 DEBUG [split-log-closeStream-2] wal.HLogSplitter: 
Closing 
hdfs://c2020.halxg.cloudera.com:8020/hbase/data/default/usertable/c660251766971d6f3033290a5b3f3087/recovered.edits/0000000000001256146.temp
{code}

> Allow more than one log splitter per RS
> ---------------------------------------
>
>                 Key: HBASE-9736
>                 URL: https://issues.apache.org/jira/browse/HBASE-9736
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR
>            Reporter: stack
>            Assignee: Jeffrey Zhong
>            Priority: Critical
>         Attachments: hbase-9736.patch
>
>
> IIRC, this is an idea that came from the lads at Xiaomi.
> I have a small cluster of 6 RSs and one went down.  It had a few WALs.  I see 
> this in logs:
> 2013-10-09 05:47:27,890 DEBUG org.apache.hadoop.hbase.master.SplitLogManager: 
> total tasks = 25 unassigned = 21
> WAL splitting is held up for want of slots out on the cluster to split WALs.
> We need to be careful not to overwhelm the foreground regionservers, but 
> more splitters should help get everything back online faster.
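
The trade-off described above (more split slots per server, but a bounded 
number of them) can be sketched with a fixed-size thread pool.  This is only 
an illustration under stated assumptions, not the HBase implementation: 
SplitSlotsSketch, runTasks and the sleep standing in for splitting one WAL 
are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch (not HBase code): split tasks limited to N slots. */
final class SplitSlotsSketch {

    /**
     * Runs taskCount dummy tasks on a pool of `slots` threads and returns
     * the largest number of tasks observed running at the same time.
     */
    static int runTasks(int taskCount, int slots) {
        ExecutorService pool = Executors.newFixedThreadPool(slots);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for splitting one WAL
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 25 tasks, as in the "total tasks = 25" log line quoted above.
        System.out.println("peak with 1 slot:  " + runTasks(25, 1));
        System.out.println("peak with 2 slots: " + runTasks(25, 2));
    }
}
```

The pool size is the knob: it caps how many splits one server runs at once, 
so raising it drains the task queue faster while still bounding the load on 
a server that is also handling foreground requests.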



--
This message was sent by Atlassian JIRA
(v6.1#6144)
