[
https://issues.apache.org/jira/browse/HBASE-21490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691289#comment-16691289
]
Duo Zhang edited comment on HBASE-21490 at 11/19/18 6:15 AM:
-------------------------------------------------------------
{quote}
can we just use abort flag?
{quote}
No, we can't. As said above, the sync thread will do periodicalRoll whenever
the store is not in the loading state, and in that method we just call the
close method with abort = false. So it can happen that we fail to load
procedures and, before we actually call stop with abort = true, the sync
thread has already deleted some inactive logs based on the broken store
tracker.
So generally speaking, we should record the 'failed loading' state in the class
itself to prevent further damage, since the damage can happen before we call
stop with abort = true.
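
To make the race concrete, here is a minimal sketch of the idea, not the actual patch. The class FailedLoadGuardSketch, its fields, and the methods addLog, logCount, load and periodicRoll are all hypothetical stand-ins and are not taken from WALProcedureStore. The only point it illustrates is that a sticky 'load failed' flag, checked by the periodic roll, keeps old logs in place even when the roll runs before stop is called with abort = true.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a WAL-backed procedure store; NOT the real WALProcedureStore.
class FailedLoadGuardSketch {
    // True only while load() is running.
    private final AtomicBoolean loading = new AtomicBoolean(false);
    // Assumption: set once when load() fails and never cleared afterwards.
    private final AtomicBoolean loadFailed = new AtomicBoolean(false);
    // IDs of the log files currently kept, oldest first.
    private final List<Long> logIds = new ArrayList<>();

    synchronized void addLog(long id) {
        logIds.add(id);
    }

    synchronized int logCount() {
        return logIds.size();
    }

    // Runs the supplied loader and remembers a failure before rethrowing it.
    void load(Runnable loader) {
        loading.set(true);
        try {
            loader.run();
        } catch (RuntimeException e) {
            // Set before 'loading' is reset in finally, so the roll never
            // sees both flags as false after a failed load.
            loadFailed.set(true);
            throw e;
        } finally {
            loading.set(false);
        }
    }

    // What a sync thread would call periodically. Returns true if cleanup ran.
    synchronized boolean periodicRoll() {
        if (loading.get() || loadFailed.get()) {
            // The tracker may be broken, so do not delete anything.
            return false;
        }
        if (logIds.size() > 1) {
            // Keep only the newest log, mimicking "remove all inactive logs".
            logIds.subList(0, logIds.size() - 1).clear();
        }
        return true;
    }
}
```

With only the loading flag, a roll that fires right after a failed load would pass the check and delete logs. With the extra flag it becomes a no-op until the process is stopped.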
> WALProcedure may remove proc wal files still with active procedures
> -------------------------------------------------------------------
>
> Key: HBASE-21490
> URL: https://issues.apache.org/jira/browse/HBASE-21490
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
>
> Attachments: HBASE-21490-UT.patch, HBASE-21490-v1.patch,
> HBASE-21490.patch, HBASE-21490.patch
>
>
> This has happened to me several times: after a master restart, all the
> procedures are gone.
> The proc wal files had already been deleted before the restart. I see this in
> the master's log:
> {noformat}
> 2018-11-16,20:57:40,177 INFO [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore: Remove all
> state logs with ID less than 184, since all the active procedures are in the
> latest log
> 2018-11-16,20:57:40,177 INFO [WALProcedureStoreSyncThread]
> org.apache.hadoop.hbase.procedure2.store.wal.ProcedureWALFile: Archiving
> hdfs://c4tst-xiaomi/hbase/c4tst-sync1/MasterProcWALs/pv2-00000000000000000184.log
> to hdfs://c4tst-xiaomi/hbase/c4tst-sync1/oldWALs/pv2-00000000000000000184.log
> {noformat}
> Let me dig...