[
https://issues.apache.org/jira/browse/KUDU-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated KUDU-694:
-------------------------------
Summary: Re-visit C++ client scan retry logic (was: Revist C++ client scan
retry logic)
> Re-visit C++ client scan retry logic
> ------------------------------------
>
> Key: KUDU-694
> URL: https://issues.apache.org/jira/browse/KUDU-694
> Project: Kudu
> Issue Type: Bug
> Components: client
> Affects Versions: Private Beta
> Reporter: Andrew Wang
>
> There are a number of remaining issues with scanner robustness, even after
> KUDU-597:
> * Once a node is marked as failed, it is not tried again for the rest of the
> call. This matters most with longer timeouts (the node is more likely to come
> back within that time) or when the scan is LEADER_ONLY (a single node being
> down is then enough to make the scan unavailable).
> * In the LEADER_ONLY case, since we don't refresh quorum information within
> the call, we won't recover when a failover happens.
> * The scanner code calls a number of other RPCs that are not retried on
> error, e.g. LookupTabletByKey, or RefreshProxy's DNS resolution in
> GetTabletServer.
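
The first two points above can be illustrated with a minimal sketch. It is not the Kudu client's code: FailureTracker, PickReplica, ScanWithRetry and the Replica struct are hypothetical names invented for this example, and the cool-down policy is an assumption about one possible fix. The idea is that a failed node is excluded only until a cool-down expires rather than for the whole call, and that replica and leader information is re-fetched before every attempt so a LEADER_ONLY scan can follow a failover.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not part of the Kudu client API.
using Clock = std::chrono::steady_clock;

struct Replica {
  std::string uuid;
  bool is_leader;
};

// Remembers failed replicas with an expiry instead of excluding them forever.
class FailureTracker {
 public:
  explicit FailureTracker(std::chrono::milliseconds cooldown)
      : cooldown_(cooldown) {}

  void MarkFailed(const std::string& uuid, Clock::time_point now) {
    retry_after_[uuid] = now + cooldown_;
  }

  // A replica is usable if it never failed or its cool-down has elapsed.
  bool IsUsable(const std::string& uuid, Clock::time_point now) const {
    auto it = retry_after_.find(uuid);
    return it == retry_after_.end() || now >= it->second;
  }

 private:
  std::chrono::milliseconds cooldown_;
  std::map<std::string, Clock::time_point> retry_after_;
};

// Returns the first usable replica, or nullptr if none qualifies.
// With leader_only set, followers are skipped.
const Replica* PickReplica(const std::vector<Replica>& replicas,
                           const FailureTracker& tracker, bool leader_only,
                           Clock::time_point now) {
  for (const Replica& r : replicas) {
    if (leader_only && !r.is_leader) continue;
    if (tracker.IsUsable(r.uuid, now)) return &r;
  }
  return nullptr;
}

// Retries until try_scan succeeds or max_attempts is used up.
// refresh() is called before every attempt so a new leader is noticed.
bool ScanWithRetry(const std::function<std::vector<Replica>()>& refresh,
                   const std::function<bool(const Replica&)>& try_scan,
                   FailureTracker& tracker, bool leader_only, int max_attempts,
                   const std::function<Clock::time_point()>& now) {
  assert(max_attempts > 0);
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    std::vector<Replica> replicas = refresh();
    const Replica* r = PickReplica(replicas, tracker, leader_only, now());
    if (r == nullptr) continue;  // a real client would back off here
    if (try_scan(*r)) return true;
    tracker.MarkFailed(r->uuid, now());
  }
  return false;
}
```

The clock and the refresh step are passed in as callables so the behavior can be exercised without a cluster. The third point (auxiliary RPCs that are not retried) could be handled by wrapping those calls in a similar bounded loop, though that is not shown here.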