[jira] [Assigned] (KUDU-1779) Consensus "stuck" with all transaction trackers are at limit
[ https://issues.apache.org/jira/browse/KUDU-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon reassigned KUDU-1779:
---------------------------------

    Assignee: Todd Lipcon

> Consensus "stuck" with all transaction trackers are at limit
> ------------------------------------------------------------
>
>                 Key: KUDU-1779
>                 URL: https://issues.apache.org/jira/browse/KUDU-1779
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 1.1.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>              Labels: stability
>
> In a stress cluster, I saw one tablet get "stuck" in the following state:
> - the transaction_tracker on all three replicas is "full" (no more can be
>   submitted)
> - leader elections proceed just fine, but no leader is able to advance the
>   commit index
> The issue seems to be that a replica will respond with 'CANNOT_PREPARE' when
> its transaction tracker is full. The leader then ignores this response, and
> doesn't advance the majority-replicated watermark. The transaction tracker
> stays full forever because the in-flight transactions can't get committed.
> Notes to follow.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
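The feedback loop described in the report can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Kudu code: the `Replica` class, the `run` function, the `honor_partial` flag, and the idea that a 'CANNOT_PREPARE' response also reports the last op the replica managed to accept are all hypothetical, chosen to make the loop visible. With `honor_partial=False` the leader discards the whole response, as the report describes, and the watermark never moves. The `honor_partial=True` branch is included purely for contrast and is not a claim about how the issue was or should be fixed.

```python
from dataclasses import dataclass, field

OK = "OK"
CANNOT_PREPARE = "CANNOT_PREPARE"


@dataclass
class Replica:
    """Toy replica whose transaction tracker holds at most `limit` in-flight ops."""
    limit: int
    in_flight: set = field(default_factory=set)

    def update(self, ops, commit_index):
        # Committed ops leave the tracker and free up slots.
        self.in_flight = {i for i in self.in_flight if i > commit_index}
        last_accepted = commit_index
        for op in ops:
            if op <= commit_index:
                continue  # already committed, nothing to track
            if op not in self.in_flight:
                if len(self.in_flight) >= self.limit:
                    # Tracker is full: reject, but report what was accepted so far
                    # (hypothetical detail of this model).
                    return CANNOT_PREPARE, last_accepted
                self.in_flight.add(op)
            last_accepted = op
        return OK, last_accepted


def run(rounds, limit=2, pending_ops=3, honor_partial=False):
    """Return the commit index after `rounds` replication rounds."""
    replicas = [Replica(limit) for _ in range(3)]
    match = [0, 0, 0]  # highest op index the leader believes each replica holds
    commit_index = 0
    ops = list(range(1, pending_ops + 1))
    for _ in range(rounds):
        for peer, replica in enumerate(replicas):
            status, last_accepted = replica.update(ops, commit_index)
            # Behavior from the report: a CANNOT_PREPARE response is ignored.
            if status == OK or honor_partial:
                match[peer] = max(match[peer], last_accepted)
        # Majority-replicated watermark: the median of the three match indexes.
        commit_index = sorted(match)[1]
    return commit_index


if __name__ == "__main__":
    print("ignoring CANNOT_PREPARE:", run(10))
    print("using partial progress: ", run(10, honor_partial=True))
```

In this model every tracker fills with ops 1 and 2, op 3 is rejected on each round, and because the rejection is ignored the commit index stays where it started, so no slot is ever freed.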
[jira] [Assigned] (KUDU-1779) Consensus "stuck" with all transaction trackers are at limit
[ https://issues.apache.org/jira/browse/KUDU-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon reassigned KUDU-1779:
---------------------------------

    Assignee:     (was: Todd Lipcon)

> Consensus "stuck" with all transaction trackers are at limit
> ------------------------------------------------------------
>
>                 Key: KUDU-1779
>                 URL: https://issues.apache.org/jira/browse/KUDU-1779
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 1.1.0
>            Reporter: Todd Lipcon
>            Priority: Critical
>              Labels: stability
>
> In a stress cluster, I saw one tablet get "stuck" in the following state:
> - the transaction_tracker on all three replicas is "full" (no more can be
>   submitted)
> - leader elections proceed just fine, but no leader is able to advance the
>   commit index
> The issue seems to be that a replica will respond with 'CANNOT_PREPARE' when
> its transaction tracker is full. The leader then ignores this response, and
> doesn't advance the majority-replicated watermark. The transaction tracker
> stays full forever because the in-flight transactions can't get committed.
> Notes to follow.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)