[jira] [Updated] (HBASE-20173) [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
[ https://issues.apache.org/jira/browse/HBASE-20173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20173:
    Description:
See 'Deadlock' scenario in parent issue. Doing as focused subtask since parent has a few things going on in it. Let me reproduce it below:

From HBASE-20137, 'TestRSGroups is Flakey',
https://issues.apache.org/jira/browse/HBASE-20137?focusedCommentId=16390325&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16390325

* SCP is running because a server was aborted in the test.
* SCP starts an AssignProcedure for region X from the crashed server.
* DisableTableProcedure runs because the test has finished and we're doing table delete. It queues an UnassignProcedure for region X.
* The DisableTable UnassignProcedure gets the lock on region X first.
* The SCP AssignProcedure tries to get the lock and waits on it.
* The DisableTable UnassignProcedure RPC fails because the server is down (that's why the SCP is running).
* It tries to expire the server the RPC failed against. That fails too, because the server is already being SCP'd.
* The DisableTable UnassignProcedure is suspended. It suspends with the lock on region X held.
* SCP can't run because the lock on X is held.
* Test times out.

    was: See 'Deadlock' scenario in parent issue. Doing as focused subtask since parent has a few things going on in it.

> [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
>
> Key: HBASE-20173
> URL: https://issues.apache.org/jira/browse/HBASE-20173
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
> Attachments: HBASE-20173.branch-2.001.patch, HBASE-20173.branch-2.002.patch

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
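For readers who want to see the interleaving in the description play out, here is a minimal single-threaded sketch that replays those steps. It is not HBase code: the class `DeadlockSketch`, the method `replay`, and the flag `holdLockWhileSuspended` are all hypothetical names made up for illustration, and the "release the lock on suspend" branch is only one conceivable way out, not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the scenario; none of these names exist in HBase.
class DeadlockSketch {

    /** Replays the reported interleaving and returns the event trace. */
    static List<String> replay(boolean holdLockWhileSuspended) {
        List<String> trace = new ArrayList<>();
        String lockOwner = null;            // current holder of the lock on region X
        boolean assignWaiting = false;      // SCP's AssignProcedure is parked on the lock
        boolean unassignSuspended = false;  // Disable's UnassignProcedure is suspended
        boolean scpDone = false;            // ServerCrashProcedure has finished

        // The DisableTable UnassignProcedure gets the region lock first.
        lockOwner = "Unassign";
        trace.add("Unassign locks X");

        // The SCP AssignProcedure finds the lock taken and waits.
        assignWaiting = true;
        trace.add("Assign waits for X");

        // The unassign RPC fails, expiring the server is refused because the
        // SCP is already in progress, so the UnassignProcedure suspends.
        unassignSuspended = true;
        if (!holdLockWhileSuspended) {
            lockOwner = null;               // assumption: lock given up on suspend
            trace.add("Unassign releases X");
        }
        trace.add("Unassign suspended");

        // Toy scheduler: keep running whatever is runnable until nothing moves.
        boolean progress = true;
        while (progress) {
            progress = false;
            if (assignWaiting && lockOwner == null) {
                assignWaiting = false;
                scpDone = true;
                trace.add("Assign runs, SCP completes");
                progress = true;
            }
            if (unassignSuspended && scpDone) {
                unassignSuspended = false;
                trace.add("Unassign resumes and finishes");
                progress = true;
            }
        }

        trace.add(scpDone && !unassignSuspended ? "COMPLETED" : "DEADLOCK");
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(replay(true));   // lock held across the suspend
        System.out.println(replay(false));  // lock released on suspend
    }
}
```

With the lock held across the suspend, nothing in the toy scheduler is runnable and the trace ends in DEADLOCK, which matches the test timing out. With the lock released, the assign can go ahead and both procedures finish in this model.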
[jira] [Updated] (HBASE-20173) [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
[ https://issues.apache.org/jira/browse/HBASE-20173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20173:
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

> [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
>
> Key: HBASE-20173
> URL: https://issues.apache.org/jira/browse/HBASE-20173
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
> Attachments: HBASE-20173.branch-2.001.patch, HBASE-20173.branch-2.002.patch
>
> See 'Deadlock' scenario in parent issue. Doing as focused subtask since
> parent has a few things going on in it.
[jira] [Updated] (HBASE-20173) [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
[ https://issues.apache.org/jira/browse/HBASE-20173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20173:
    Attachment: HBASE-20173.branch-2.002.patch

> [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
>
> Key: HBASE-20173
> URL: https://issues.apache.org/jira/browse/HBASE-20173
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
> Attachments: HBASE-20173.branch-2.001.patch, HBASE-20173.branch-2.002.patch
>
> See 'Deadlock' scenario in parent issue. Doing as focused subtask since
> parent has a few things going on in it.
[jira] [Updated] (HBASE-20173) [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
[ https://issues.apache.org/jira/browse/HBASE-20173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20173:
    Status: Patch Available (was: Open)

Trying hadoopqa. This one is hard to write a test for since it depends on aligning two macro procedure steps exactly. I think my best bet is to run the IntegrationTestDDLMasterFailover test on a cluster. Will try it concurrently with this hadoopqa run.

> [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
>
> Key: HBASE-20173
> URL: https://issues.apache.org/jira/browse/HBASE-20173
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
> Attachments: HBASE-20173.branch-2.001.patch
>
> See 'Deadlock' scenario in parent issue. Doing as focused subtask since
> parent has a few things going on in it.
[jira] [Updated] (HBASE-20173) [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
[ https://issues.apache.org/jira/browse/HBASE-20173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20173:
    Priority: Critical (was: Major)

> [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
>
> Key: HBASE-20173
> URL: https://issues.apache.org/jira/browse/HBASE-20173
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.0
> Attachments: HBASE-20173.branch-2.001.patch
>
> See 'Deadlock' scenario in parent issue. Doing as focused subtask since
> parent has a few things going on in it.
[jira] [Updated] (HBASE-20173) [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
[ https://issues.apache.org/jira/browse/HBASE-20173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-20173:
    Attachment: HBASE-20173.branch-2.001.patch

> [AMv2] DisableTableProcedure concurrent to ServerCrashProcedure can deadlock
>
> Key: HBASE-20173
> URL: https://issues.apache.org/jira/browse/HBASE-20173
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 2.0.0
> Attachments: HBASE-20173.branch-2.001.patch
>
> See 'Deadlock' scenario in parent issue. Doing as focused subtask since
> parent has a few things going on in it.