[ https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16663774#comment-16663774 ]
Duo Zhang commented on HBASE-21375:
-----------------------------------

Talked with [~allan163] offline. We think the correct way to fix the problem is to not only check the first procedure in a queue, but to iterate over all the procedures in the queue to see whether we can find a ready one. In the patch, I've also changed hasParentLock to hasAncestorLock (a long-standing TODO).

The new implementation may lead to a starvation problem: if we keep scheduling procedures that want to hold the shared lock, the procedures that want to hold the exclusive lock will never get a chance to execute. What I can imagine in real production is that a busy balancer could block every other operation on a table, for example a ModifyTableProcedure. But I do not think this is a big deal? It is not good practice to run the balancer at such a frequency.

Some UTs are broken because the behavior has changed. I will fix them tomorrow; I am uploading the patch now for review.

Ping [~allan163] and [~stack].

> Revisit the lock and queue implementation in MasterProcedureScheduler
> ---------------------------------------------------------------------
>
>                 Key: HBASE-21375
>                 URL: https://issues.apache.org/jira/browse/HBASE-21375
>             Project: HBase
>          Issue Type: Sub-task
>          Components: proc-v2
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-21375-UT.patch, HBASE-21375-UT2.patch, HBASE-21375.patch
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
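
For readers following the comment above, here is a minimal sketch of the difference between checking only the head of a queue and scanning the whole queue for a ready procedure. It is a toy model, not the patch: `QueueScanSketch`, `Proc`, `TableLock`, `pollHeadOnly` and `pollFirstReady` are hypothetical names, and the real MasterProcedureScheduler is considerably more involved. It also shows how a procedure waiting for the exclusive lock can be passed over while shared-lock procedures keep being scheduled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

// Hypothetical, simplified model; none of these names are real HBase classes.
public class QueueScanSketch {
  enum LockType { SHARED, EXCLUSIVE }

  static final class Proc {
    final String name;
    final LockType wants;
    Proc(String name, LockType wants) { this.name = name; this.wants = wants; }
  }

  // Toy lock for a single table: many shared holders, or one exclusive holder.
  static final class TableLock {
    int sharedHolders = 0;
    boolean exclusiveHeld = false;

    boolean canAcquire(LockType type) {
      if (type == LockType.SHARED) {
        return !exclusiveHeld;
      }
      return !exclusiveHeld && sharedHolders == 0;
    }

    void acquire(LockType type) {
      if (type == LockType.SHARED) {
        sharedHolders++;
      } else {
        exclusiveHeld = true;
      }
    }
  }

  // Head-only check: if the first procedure cannot run, nothing is scheduled.
  static Proc pollHeadOnly(Deque<Proc> queue, TableLock lock) {
    Proc head = queue.peekFirst();
    if (head == null || !lock.canAcquire(head.wants)) {
      return null;
    }
    queue.pollFirst();
    lock.acquire(head.wants);
    return head;
  }

  // Full scan: return the first procedure in queue order that can take its lock.
  static Proc pollFirstReady(Deque<Proc> queue, TableLock lock) {
    for (Iterator<Proc> it = queue.iterator(); it.hasNext();) {
      Proc p = it.next();
      if (lock.canAcquire(p.wants)) {
        it.remove();
        lock.acquire(p.wants);
        return p;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    TableLock lock = new TableLock();
    lock.acquire(LockType.SHARED); // e.g. a region move is already running

    Deque<Proc> queue = new ArrayDeque<>();
    queue.add(new Proc("modify-table", LockType.EXCLUSIVE));
    queue.add(new Proc("region-move", LockType.SHARED));

    Proc headOnly = pollHeadOnly(queue, lock);
    System.out.println("head-only: " + (headOnly == null ? "nothing ready" : headOnly.name));

    Proc scanned = pollFirstReady(queue, lock);
    System.out.println("full scan: " + (scanned == null ? "nothing ready" : scanned.name));

    // The exclusive procedure is still queued; it stays there for as long as
    // shared-lock procedures keep arriving and being scheduled ahead of it.
    System.out.println("still waiting: " + queue.peekFirst().name);
  }
}
```

In this model the full scan keeps the table busy, at the cost that the exclusive-lock procedure only runs once the stream of shared-lock procedures stops, which is the starvation trade-off mentioned in the comment.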