[
https://issues.apache.org/jira/browse/HBASE-21323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16652809#comment-16652809
]
Duo Zhang commented on HBASE-21323:
-----------------------------------
Oh, I think the problem is that we only delete the sub procedures once the root
procedure is done. Let me write a unit test (UT) first and then see how to fix it.
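
To illustrate the suspected cause, here is a minimal toy model in Java. It is
not the actual WALProcedureStore or holdingCleanupTracker code: the class
WalRetentionSketch, the simulate method and the per-WAL pid sets are all
hypothetical. It assumes that a WAL file can be removed only once no procedure
still has its latest state recorded in it, and it reuses the pids from the log
in the issue description (root 990, sub procedures 991, 992, 994, 995).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only (assumption): each WAL is represented by the set of pids
// whose most recent state was written to that WAL.
final class WalRetentionSketch {
  static final long ROOT = 990L;
  static final List<Long> SUBS = Arrays.asList(991L, 992L, 994L, 995L);

  /**
   * Returns the number of WAL files left after the given number of rolls,
   * while the root procedure keeps running and its sub procedures have
   * already finished in the first WAL.
   */
  static int simulate(int rolls, boolean deleteFinishedSubProcedures) {
    Deque<Set<Long>> wals = new ArrayDeque<>(); // oldest WAL first
    Set<Long> first = new HashSet<>(SUBS);
    first.add(ROOT);
    wals.addLast(first);

    for (int i = 0; i < rolls; i++) {
      for (Set<Long> wal : wals) {
        // The root writes a fresh update below, so older entries are superseded.
        wal.remove(ROOT);
        if (deleteFinishedSubProcedures) {
          // Hypothetical behaviour: finished sub procedures are dropped
          // without waiting for the root procedure to complete.
          wal.removeAll(SUBS);
        }
      }
      Set<Long> next = new HashSet<>();
      next.add(ROOT);
      wals.addLast(next); // roll a new WAL holding the root's latest state

      // Remove the oldest WALs that no longer hold any procedure state.
      while (wals.size() > 1 && wals.peekFirst().isEmpty()) {
        wals.removeFirst();
      }
    }
    return wals.size();
  }

  public static void main(String[] args) {
    System.out.println(simulate(5, false)); // prints 6
    System.out.println(simulate(5, true));  // prints 1
  }
}
```

In this model the finished sub procedures pin the oldest WAL for as long as the
root procedure is running, so every roll adds a file that cannot be cleaned up.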
> The holdingCleanupTracker is not updated properly and causes too many
> procedure wal files
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-21323
> URL: https://issues.apache.org/jira/browse/HBASE-21323
> Project: HBase
> Issue Type: Sub-task
> Reporter: Duo Zhang
> Priority: Major
>
> Keep seeing this
> {noformat}
> 2018-10-16,20:03:02,027 WARN [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: procedure
> WALs count=340 above the warning threshold 10. check running procedures to
> see if something is stuck.
> 2018-10-16,20:03:02,027 INFO [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Rolled new
> Procedure Store WAL, id=343
> 2018-10-16,20:03:02,027 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=991,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:02,027 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=992,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:02,027 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=994,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:02,027 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=995,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:02,870 WARN [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: procedure
> WALs count=341 above the warning threshold 10. check running procedures to
> see if something is stuck.
> 2018-10-16,20:03:02,870 INFO [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Rolled new
> Procedure Store WAL, id=344
> 2018-10-16,20:03:02,870 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=991,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:02,870 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=992,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:02,870 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=994,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:02,870 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=995,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:03,816 WARN [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: procedure
> WALs count=342 above the warning threshold 10. check running procedures to
> see if something is stuck.
> 2018-10-16,20:03:03,816 INFO [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Rolled new
> Procedure Store WAL, id=345
> 2018-10-16,20:03:03,816 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=991,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:03,816 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=992,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:03,816 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=994,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> 2018-10-16,20:03:03,816 DEBUG [Force-Update-PEWorker-0]
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Procedure pid=995,
> ppid=990, state=SUCCESS;
> org.apache.hadoop.hbase.master.replication.RefreshPeerProcedure has already
> been finished, skip force updating.
> {noformat}