[
https://issues.apache.org/jira/browse/HBASE-21354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16658495#comment-16658495
]
Allan Yang commented on HBASE-21354:
------------------------------------
{quote}
Just add a note when building holdingCleanupTracker?
{quote}
Done in V4 patch.
{quote}
Make these info-level?
{quote}
What's your concern, sir [~stack]? I think DEBUG is enough.
{quote}
I've seen lots of chaos where I cannot clean up a Procedure because
another holds a lock but the 'other' no longer exists.
{quote}
I don't think this patch can solve that issue. If the other procedure is
holding a lock, that means the procedure was loaded normally from replay, so it
must exist somewhere.
This patch only fixes the case where procedures may be deleted improperly,
leaving their parent/child procedures 'corrupt'.
> Procedure may be deleted improperly during master restarts resulting in
> 'Corrupt'
> ---------------------------------------------------------------------------------
>
> Key: HBASE-21354
> URL: https://issues.apache.org/jira/browse/HBASE-21354
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.0, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-21354.branch-2.0.001.patch,
> HBASE-21354.branch-2.0.002.patch, HBASE-21354.branch-2.0.003.patch,
> HBASE-21354.branch-2.0.004.patch
>
>
> Good news! [~stack], [~Apache9], I may have found the root cause of the mysterious
> ‘Corrupted procedure’ errors and of procedures disappearing after master
> restarts (this happens during ITBLL).
> During a master restart, we load procedures from the logs and
> build the 'holdingCleanupTracker' from each log's tracker. We may mark
> a procedure in the oldest log as deleted if some log doesn't contain the
> procedure. This is inappropriate, since a log will not contain any info about a
> procedure that was not updated while that log was being written. We can delete the
> procedure only if it is not in the global tracker, which has the whole
> picture.
> {code}
> trackerNode = tracker.lookupClosestNode(trackerNode, procId);
> if (trackerNode == null || !trackerNode.contains(procId) ||
>     trackerNode.isModified(procId)) {
>   // the procedure was removed or modified
>   node.delete(procId);
> }
> {code}
> A test case (testProcedureShouldNotCleanOnLoad) in the patch shows clearly how
> the corruption happens.
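The failure mode in the description above can be sketched with a toy model. This is only an illustration under stated assumptions, not the actual ProcedureStoreTracker code or the patch: the class HoldingCleanupSketch, the methods buggyHolding and fixedHolding, and the Map<Long, Boolean> representation of a per-log tracker (key present = the log's tracker covers that procId, value = modified in that log) are all hypothetical names made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these names exist in HBase.
public class HoldingCleanupSketch {

    // Mirrors the quoted condition: a procId is dropped from the oldest log's
    // holding set if a newer log's partial tracker does not cover it, or if
    // that log modified it.
    public static Set<Long> buggyHolding(Set<Long> oldestLog,
                                         List<Map<Long, Boolean>> newerLogs) {
        Set<Long> holding = new TreeSet<>(oldestLog);
        for (Map<Long, Boolean> tracker : newerLogs) {
            holding.removeIf(id -> !tracker.containsKey(id) || tracker.get(id));
        }
        return holding;
    }

    // Rule described in the issue: absence from one log proves nothing, so
    // deletion is decided only by the global tracker. A per-log tracker is
    // consulted only for "modified here", which means a newer copy exists.
    public static Set<Long> fixedHolding(Set<Long> oldestLog,
                                         List<Map<Long, Boolean>> newerLogs,
                                         Set<Long> globalLive) {
        Set<Long> holding = new TreeSet<>(oldestLog);
        holding.removeIf(id -> !globalLive.contains(id));
        for (Map<Long, Boolean> tracker : newerLogs) {
            holding.removeIf(id -> Boolean.TRUE.equals(tracker.get(id)));
        }
        return holding;
    }

    public static void main(String[] args) {
        // The oldest log holds procedures 1, 2 and 3.
        Set<Long> oldestLog = Set.of(1L, 2L, 3L);
        // One newer log that only touched procedure 2.
        List<Map<Long, Boolean>> newerLogs = List.of(Map.of(2L, true));
        // Globally, procedure 3 was deleted; 1 and 2 are still alive.
        Set<Long> globalLive = Set.of(1L, 2L);

        // Procedure 1 is alive and only the oldest log has it, yet it is dropped.
        System.out.println(buggyHolding(oldestLog, newerLogs));             // prints []
        // Procedure 1 is kept, so the oldest log must not be cleaned up yet.
        System.out.println(fixedHolding(oldestLog, newerLogs, globalLive)); // prints [1]
    }
}
```

In this toy run the buggy rule empties the holding set, which would let the oldest log be removed while it still carries the only copy of procedure 1. That is the kind of loss that would leave a parent or child procedure 'corrupt' after the next restart.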