Adar Dembo has posted comments on this change. Change subject: KUDU-1380. Fix retry for BUSY on non-FT scans ......................................................................
Patch Set 4: (11 comments) Definitely an improvement over the previous code. I particularly like how HandleError() works through the various cases and explicitly states the effects of each case. Do you have any thoughts on how the error handling could be made generic? Not all errors are generic (e.g. SCANNER_EXPIRED), but some certainly are. http://gerrit.cloudera.org:8080/#/c/2654/4/src/kudu/client/client.cc File src/kudu/client/client.cc: Line 1154: // The user has specified a timeout 'data_->timeout_' which should : // apply to the total time for each call to NextBatch(). However, : // if this is a fault-tolerant scan, it's preferable to set a shorter : // timeout (the "default RPC timeout" for each individual RPC call -- : // so that if the server is hung we have time to fail over and try a : // different server. This comment no longer applies here. Line 1165: bool allow_time_for_failover = data_->is_fault_tolerant_; : ScanRpcStatus result = data_->SendScanRpc(batch_deadline, allow_time_for_failover); Nit: combine. Line 1175: data_->projection_, : &data_->client_projection_, : make_gscoped_ptr(data_->last_response_.release_data())); Nit: indentation is a little weird. Or is this intentional? http://gerrit.cloudera.org:8080/#/c/2654/4/src/kudu/client/scanner-internal.cc File src/kudu/client/scanner-internal.cc: Line 130: // If we ended because of the overall deadline, we're done. Nit: ended up here? Line 163: // Backoff, then force a re-fetch of the tablet metadata. I presume you're trying to maintain behavioral compatibility with the old code, but why do we backoff in this case? Seems analogous to TABLET_NOT_RUNNING to me, except that the locations are clearly stale. Line 210: const MonoTime& overall_deadline, : const MonoTime& deadline) { Nit: indentation. Line 221: CHECK(controller_.error_response()); Why CHECK and not a "malformed response" error? Is this how the client reacted before your change? Line 275: MonoTime rpc_deadline; Ahh, that comment from earlier probably belongs here. http://gerrit.cloudera.org:8080/#/c/2654/4/src/kudu/client/scanner-internal.h File src/kudu/client/scanner-internal.h: Line 38: handilng typo Line 47: // The server was busy (e.g. RPC queue overflow) Nit: missing terminating period. Line 57: (eg NetworkError because the TS : // was down. Nit: missing close parens. -- To view, visit http://gerrit.cloudera.org:8080/2654 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I048d3aa2a163143d1637ae87281ed91f0fc5ac65 Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Todd Lipcon <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
