I recall a version of HBase 2 where MasterProcWALs didn't get cleaned up. Given that the IDs in your pv2 WAL file names are up in the 200Ks, I would venture a guess that the master is just spinning, working through a bunch of old procedures.
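
To get a feel for how large that backlog is, something like the following rough sketch could summarize the pv2 file names from a directory listing (for example the paths printed by `hdfs dfs -ls` on the MasterProcWALs directory). The function name and the second sample path are made up for illustration; only the first path comes from the log quoted below.

```python
import re

# Matches procedure WAL names such as pv2-00000000000000219476.log
PV2_NAME = re.compile(r"pv2-(\d+)\.log$")


def summarize_proc_wals(paths):
    """Return (count, lowest_id, highest_id) for the pv2 WAL files in paths.

    Paths that do not look like pv2 procedure WALs are ignored.
    """
    ids = []
    for path in paths:
        match = PV2_NAME.search(path)
        if match:
            ids.append(int(match.group(1)))
    if not ids:
        return (0, None, None)
    return (len(ids), min(ids), max(ids))


if __name__ == "__main__":
    sample = [
        # Taken from the master log in this thread.
        "hdfs://ha:8020/apps/hbase/data/MasterProcWALs/pv2-00000000000000219476.log",
        # Hypothetical second file, assumed only for the example.
        "hdfs://ha:8020/apps/hbase/data/MasterProcWALs/pv2-00000000000000219931.log",
    ]
    count, lowest, highest = summarize_proc_wals(sample)
    print(f"{count} procedure WALs, ids {lowest}..{highest}")
```

A large count together with a wide ID range would be consistent with old procedure WALs piling up rather than being cleaned.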

You could try moving them to the side, and be prepared to use HBCK2 to fix anything that is not assigned properly as a result of SCPs (ServerCrashProcedures) not running.
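
As a rough sketch of what "moving them to the side" could look like, the helper below only builds an `hdfs dfs -mv` command that renames the procedure WAL directory, so the command can be reviewed before anyone runs it. The directory is taken from the log quoted below; the helper name and the `.sidelined-<stamp>` suffix are assumptions, not an HBase convention, and you would presumably want the master stopped before running the resulting command.

```python
import shlex
from datetime import datetime, timezone


def sideline_command(proc_wal_dir, stamp=None):
    """Build (but do not run) an hdfs command that renames proc_wal_dir aside.

    The ".sidelined-<stamp>" suffix is an arbitrary choice for this sketch.
    """
    if stamp is None:
        stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    src = proc_wal_dir.rstrip("/")
    dst = f"{src}.sidelined-{stamp}"
    return ["hdfs", "dfs", "-mv", src, dst]


if __name__ == "__main__":
    # Directory as it appears in the master log from this thread.
    cmd = sideline_command("/apps/hbase/data/MasterProcWALs")
    # Print for review instead of executing anything.
    print(shlex.join(cmd))
```

Renaming rather than deleting keeps the old files around in case they need to be inspected or restored later.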

Gentle reminder: please always share the full version details when asking questions.

On 6/18/20 10:03 AM, 郭文傑 (Rock) wrote:
  Hi,

My HBase cluster has about 50TB of data.
After upgrading to 2.x, the Master takes a long time to start up.
I checked the log and saw that the master is trying to reload the WALs.
It has already been initializing for more than 75 hours, and I am afraid it will take several more days.
How can I handle this issue?

Below is the master log:
2020-06-18 21:59:14,690 INFO  [PEWorker-4] zookeeper.MetaTableLocator:
Setting hbase:meta (replicaId=0) location in ZooKeeper as
persp-16.persp.net,16020,1592295446983
2020-06-18 21:59:14,691 INFO  [PEWorker-4]
assignment.RegionTransitionProcedure: Dispatch pid=192, ppid=145,
state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; AssignProcedure
table=hbase:meta, region=1588230740
2020-06-18 21:59:14,897
INFO  [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=16000]
assignment.AssignProcedure: Retry=625296 of max=2147483647; pid=192,
ppid=145, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true;
AssignProcedure table=hbase:meta, region=1588230740; rit=OPENING, location=
persp-16.persp.net,16020,1592295446983
2020-06-18 21:59:14,924 WARN  [WALProcedureStoreSyncThread]
wal.WALProcedureStore: procedure WALs count=456 above the warning threshold
10. check running procedures to see if something is stuck.
2020-06-18 21:59:14,924 INFO  [WALProcedureStoreSyncThread]
wal.WALProcedureStore: Rolled new Procedure Store WAL, id=219931
2020-06-18 21:59:14,929 INFO  [PEWorker-6] assignment.AssignProcedure:
Starting pid=192, ppid=145, state=RUNNABLE:REGION_TRANSITION_QUEUE,
locked=true; AssignProcedure table=hbase:meta, region=1588230740;
rit=OFFLINE, location=null; forceNewPlan=true, retain=false target svr=null
2020-06-18 21:59:14,965 INFO  [WALProcedureStoreSyncThread]
wal.WALProcedureStore: Remove the oldest log
hdfs://ha:8020/apps/hbase/data/MasterProcWALs/pv2-00000000000000219476.log
2020-06-18 21:59:14,965 INFO  [WALProcedureStoreSyncThread]
wal.ProcedureWALFile: Archiving
hdfs://ha:8020/apps/hbase/data/MasterProcWALs/pv2-00000000000000219476.log
to hdfs://ha:8020/apps/hbase/data/oldWALs/pv2-00000000000000219476.log
2020-06-18 21:59:15,109 INFO  [PEWorker-1] zookeeper.MetaTableLocator:
Setting hbase:meta (replicaId=0) location in ZooKeeper as
persp-30.persp.net,16020,1592295446723
2020-06-18 21:59:15,110 INFO  [PEWorker-1]
assignment.RegionTransitionProcedure: Dispatch pid=192, ppid=145,
state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; AssignProcedure
table=hbase:meta, region=1588230740
2020-06-18 21:59:15,313
INFO  [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=16000]
assignment.AssignProcedure: Retry=625297 of max=2147483647; pid=192,
ppid=145, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true;
AssignProcedure table=hbase:meta, region=1588230740; rit=OPENING, location=
persp-30.persp.net,16020,1592295446723
2020-06-18 21:59:15,313 INFO  [PEWorker-14] assignment.AssignProcedure:
Starting pid=192, ppid=145, state=RUNNABLE:REGION_TRANSITION_QUEUE,
locked=true; AssignProcedure table=hbase:meta, region=1588230740;
rit=OFFLINE, location=null; forceNewPlan=true, retain=false target svr=null

