[ 
https://issues.apache.org/jira/browse/KUDU-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713726#comment-15713726
 ] 

Todd Lipcon commented on KUDU-1779:
-----------------------------------

FWIW, restarting the servers in question unstuck the tablet, because the 
restart aborted all of the pending leader operations.

> Consensus "stuck" with all transaction trackers are at limit
> ------------------------------------------------------------
>
>                 Key: KUDU-1779
>                 URL: https://issues.apache.org/jira/browse/KUDU-1779
>             Project: Kudu
>          Issue Type: Bug
>          Components: consensus
>    Affects Versions: 1.1.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>
> In a stress cluster, I saw one tablet get "stuck" in the following state:
> - the transaction_tracker on all three replicas is "full" (no more 
> transactions can be submitted)
> - leader elections proceed just fine, but no leader is able to advance the 
> commit index
> The issue seems to be that a replica will respond with 'CANNOT_PREPARE' when 
> its transaction tracker is full. The leader then ignores this response, and 
> doesn't advance the majority-replicated watermark. The transaction tracker 
> stays full forever because the in-flight transactions can't get committed.
> Notes to follow.
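
The failure mode described above can be reproduced in a toy model. The sketch below is not Kudu code: the `Follower` class, the `run` function, the `honor_error_responses` flag, the tracker limit of 2, and the assumption that an error response still carries a last-received index are all hypothetical, chosen only to show how dropping 'CANNOT_PREPARE' responses can keep the majority-replicated watermark from ever moving. For simplicity the leader's own tracker is not limited here.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Hypothetical replica with a bounded transaction tracker."""
    limit: int
    pending: list = field(default_factory=list)  # op indexes held by the tracker

    def update(self, ops, commit_index):
        # Committed ops release their tracker slots.
        self.pending = [i for i in self.pending if i > commit_index]
        error = None
        for op in ops:
            if op in self.pending:
                continue  # already tracked from an earlier request
            if len(self.pending) >= self.limit:
                error = "CANNOT_PREPARE"  # tracker is full
                break
            self.pending.append(op)
        # Assumption: the response reports how far this replica got,
        # even when it also carries an error.
        last_received = max(self.pending, default=commit_index)
        return last_received, error


def run(rounds, honor_error_responses):
    """Return the commit index after `rounds` leader heartbeats."""
    followers = {"B": Follower(limit=2), "C": Follower(limit=2)}
    leader_ops = [1, 2, 3]             # ops the leader wants to replicate
    acked = {"A": 3, "B": 0, "C": 0}   # leader "A" already has all three ops
    commit_index = 0
    for _ in range(rounds):
        for name, follower in followers.items():
            to_send = [i for i in leader_ops if i > commit_index]
            last_received, error = follower.update(to_send, commit_index)
            if error and not honor_error_responses:
                continue  # behavior from the report: response is ignored
            acked[name] = last_received
        # With three voters, the second-highest ack is majority-replicated.
        commit_index = max(commit_index, sorted(acked.values())[1])
    return commit_index


print(run(5, honor_error_responses=False))  # watermark never advances
print(run(5, honor_error_responses=True))   # watermark reaches the last op
```

With the responses ignored, both followers keep ops 1 and 2 in their trackers on every round and reject op 3, so the commit index stays at 0. The `honor_error_responses=True` branch is only a contrast case for the model and should not be read as the change made in Kudu.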



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
