[ https://issues.apache.org/jira/browse/KUDU-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jean-Daniel Cryans resolved KUDU-1073.
--------------------------------------
Resolution: Cannot Reproduce
Fix Version/s: n/a
Target Version/s: (was: GA)
Haven't seen that in ages.
> Single TS falling too far behind hung YCSB
> ------------------------------------------
>
> Key: KUDU-1073
> URL: https://issues.apache.org/jira/browse/KUDU-1073
> Project: Kudu
> Issue Type: Bug
> Components: client, consensus
> Affects Versions: Private Beta
> Reporter: Todd Lipcon
> Assignee: Jean-Daniel Cryans
> Priority: Critical
> Fix For: n/a
>
>
> This caused a YCSB job to fail:
> - a server fell behind for some reason (root cause not yet determined;
> it may simply have been a bit slow)
> - the leader GCed the logs needed to catch it up, and thus stopped sending
> it any heartbeats or other messages
> - the server had one write pending
> - the Java client apparently just kept retrying over and over against the
> same server
> The server with the pending txn may actually have been the leader at the time
> the txn was written; otherwise it is unclear why the Java client keeps
> retrying it. Or perhaps the Java client got an error on the leader, failed
> over to try the follower, and RPCs to the follower are timing out.
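
For illustration only, here is a minimal sketch of the kind of client-side retry policy that would avoid spinning on a single server: it rotates across replicas and gives up once an overall deadline passes. This is not the Kudu Java client's code or API. The names `write_with_failover`, `ReplicaUnavailable`, `send`, and the replica labels are all hypothetical, and Python is used only to keep the sketch short.

```python
import time


class ReplicaUnavailable(Exception):
    """Hypothetical error: the replica timed out or rejected the write."""


def write_with_failover(replicas, send, deadline_s, backoff_s=0.01):
    """Try replicas round-robin until one accepts the write or the deadline passes.

    `send` is a caller-supplied function (an assumption of this sketch) that
    raises ReplicaUnavailable when a replica cannot take the write.
    Returns (replica, attempts) on success; raises TimeoutError otherwise.
    """
    if not replicas:
        raise ValueError("no replicas to try")
    start = time.monotonic()
    attempts = 0
    while time.monotonic() - start < deadline_s:
        # Rotate to the next replica instead of retrying the same one forever.
        replica = replicas[attempts % len(replicas)]
        attempts += 1
        try:
            send(replica)
            return replica, attempts
        except ReplicaUnavailable:
            time.sleep(backoff_s)  # brief pause before trying the next replica
    raise TimeoutError(f"write not accepted after {attempts} attempts")
```

With three replicas where only the second accepts writes, the call returns after two attempts. If every replica keeps failing, the caller gets a `TimeoutError` once the deadline passes, so the job fails with an error instead of hanging.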