[jira] [Updated] (KUDU-694) Re-visit C++ client scan retry logic
[ https://issues.apache.org/jira/browse/KUDU-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke updated KUDU-694:
-----------------------------
    Target Version/s: 1.8.0  (was: 1.5.0)

> Re-visit C++ client scan retry logic
> ------------------------------------
>
>                 Key: KUDU-694
>                 URL: https://issues.apache.org/jira/browse/KUDU-694
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: Private Beta
>            Reporter: Andrew Wang
>            Priority: Major
>
> There are a number of remaining issues with scanner robustness, even after
> KUDU-597:
> * Once a node is marked as failed, it will not be used again in the call.
> This is more of an issue with longer timeouts (since the node is more likely
> to come back), or if the scan is LEADER_ONLY (since only one node being down
> leads to unavailability).
> * In the LEADER_ONLY case, since we don't refresh quorum information within
> the call, we won't recover when a failover happens.
> * The scanner code calls a number of other RPCs that are not retried on
> error, i.e. LookupTabletByKey or RefreshProxy's DNS resolution in
> GetTabletServer.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
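For illustration only, the first point in the description (a failed node is never used again within the call) could be addressed by letting a failure mark expire after a cooldown. The sketch below is not Kudu client code: FailedReplicaTracker, its methods, and the cooldown parameter are hypothetical names chosen for this example, and replicas are identified by plain strings.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of the Kudu client API: remembers when each
// replica last failed and treats it as usable again once a cooldown elapses.
class FailedReplicaTracker {
 public:
  using Clock = std::chrono::steady_clock;

  explicit FailedReplicaTracker(std::chrono::milliseconds cooldown)
      : cooldown_(cooldown) {
    assert(cooldown.count() >= 0);
  }

  // Records that the replica failed at time `now`.
  void MarkFailed(const std::string& uuid, Clock::time_point now) {
    failed_at_[uuid] = now;
  }

  // A replica is usable if it never failed or its cooldown has elapsed.
  bool IsUsable(const std::string& uuid, Clock::time_point now) const {
    auto it = failed_at_.find(uuid);
    return it == failed_at_.end() || now - it->second >= cooldown_;
  }

  // Stores the first usable candidate in `*out` and returns true, or returns
  // false if every candidate is still cooling down.
  bool Pick(const std::vector<std::string>& candidates, Clock::time_point now,
            std::string* out) const {
    for (const auto& uuid : candidates) {
      if (IsUsable(uuid, now)) {
        *out = uuid;
        return true;
      }
    }
    return false;
  }

 private:
  std::chrono::milliseconds cooldown_;
  std::unordered_map<std::string, Clock::time_point> failed_at_;
};
```

With a single candidate, as in a LEADER_ONLY scan, Pick returns false only until the cooldown expires, so a long-running call can try the same node again instead of giving up on it permanently.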
[jira] [Updated] (KUDU-694) Re-visit C++ client scan retry logic
[ https://issues.apache.org/jira/browse/KUDU-694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Serbin updated KUDU-694:
-------------------------------
    Summary: Re-visit C++ client scan retry logic  (was: Revist C++ client scan retry logic)

> Re-visit C++ client scan retry logic
> ------------------------------------
>
>                 Key: KUDU-694
>                 URL: https://issues.apache.org/jira/browse/KUDU-694
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>    Affects Versions: Private Beta
>            Reporter: Andrew Wang
>
> There are a number of remaining issues with scanner robustness, even after
> KUDU-597:
> * Once a node is marked as failed, it will not be used again in the call.
> This is more of an issue with longer timeouts (since the node is more likely
> to come back), or if the scan is LEADER_ONLY (since only one node being down
> leads to unavailability).
> * In the LEADER_ONLY case, since we don't refresh quorum information within
> the call, we won't recover when a failover happens.
> * The scanner code calls a number of other RPCs that are not retried on
> error, i.e. LookupTabletByKey or RefreshProxy's DNS resolution in
> GetTabletServer.
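As a rough sketch of the last point in the description (auxiliary RPCs that are not retried on error), a generic wrapper could retry a transient failure with exponential backoff until the caller's deadline. Everything here is an assumption made for illustration: RpcResult, RetryUntilDeadline, and the one-second backoff cap are hypothetical and do not reflect how the Kudu client classifies errors or schedules retries.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical outcome classification; the real client uses its own status type.
enum class RpcResult { kOk, kTransientError, kPermanentError };

// Hypothetical helper: calls `op` until it stops returning a transient error
// or until another backoff sleep would run past `deadline`. Returns the result
// of the last attempt.
inline RpcResult RetryUntilDeadline(
    const std::function<RpcResult()>& op,
    std::chrono::steady_clock::time_point deadline,
    std::chrono::milliseconds initial_backoff) {
  assert(initial_backoff.count() > 0);
  std::chrono::milliseconds backoff = initial_backoff;
  const std::chrono::milliseconds max_backoff(1000);  // arbitrary cap
  while (true) {
    RpcResult result = op();
    // Success and permanent errors are returned immediately.
    if (result != RpcResult::kTransientError) return result;
    // Give up if sleeping would reach or pass the deadline.
    if (std::chrono::steady_clock::now() + backoff >= deadline) return result;
    std::this_thread::sleep_for(backoff);
    backoff = std::min<std::chrono::milliseconds>(backoff * 2, max_backoff);
  }
}
```

A lookup or DNS-resolution step could be passed in as `op`, so that a momentary failure consumes part of the scan's timeout rather than failing the whole call on the first error.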