Adar Dembo has posted comments on this change.
Change subject: WIP: KUDU-1466: improve error message when writes fail at TS
Patch Set 1:
PS1, Line 22: But,
: perhaps it's actually better for this to be done in the retriable
I'm supportive of the idea, but I'm not sure how it's best implemented.
Do we use a policy that prefers one kind of error over another (e.g.
Status::TimedOut is always less interesting than other failures)? If we do
something context-free like that, it should be sufficient to pass the "last
error" around over the course of the operation, updating it whenever we see an
error of higher priority.
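To make the context-free option concrete, here is a rough sketch. It is not taken from the patch: SimpleStatus, Priority, and BestErrorTracker are hypothetical stand-ins (the real code would use kudu::Status), and the ranking is only an assumed example in which a timeout loses to every other failure.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for kudu::Status; only what the sketch needs.
enum class Code { kOk, kTimedOut, kNotFound, kNetworkError };

struct SimpleStatus {
  SimpleStatus() = default;
  SimpleStatus(Code c, std::string m) : code(c), message(std::move(m)) {}
  bool ok() const { return code == Code::kOk; }

  Code code = Code::kOk;
  std::string message;
};

// Assumed ranking (not from the patch): OK < TimedOut < everything else.
inline int Priority(Code c) {
  switch (c) {
    case Code::kOk:       return 0;
    case Code::kTimedOut: return 1;
    default:              return 2;
  }
}

// Carries the "last error" through an operation. A new error replaces the
// stored one when its priority is at least as high, so among errors of
// equal priority the most recent one wins.
class BestErrorTracker {
 public:
  void Update(SimpleStatus s) {
    if (s.ok()) return;  // Successes never displace a recorded error.
    if (Priority(s.code) >= Priority(best_.code)) {
      best_ = std::move(s);
    }
  }

  const SimpleStatus& best() const { return best_; }

 private:
  SimpleStatus best_;  // Starts out OK, i.e. no error seen yet.
};
```

With this policy, a tracker that sees a network error followed by a timeout still reports the network error. Whether ties should favor the first or the latest error is a separate choice; the sketch picks the latest.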
Or, do we prefer one error over another based on the operational phase in
which it occurred (e.g. in tserver operations, lookup failures are always less
interesting than actual tserver failures)? This approach suggests we find the
appropriate "top-level" object for each operation (e.g. for scans, the
KuduScanner itself), track the best error there, and make sure it's available
to all phases of the operation to update if necessary.
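A rough sketch of the phase-based alternative, again purely illustrative: Phase, PhasedError, and OperationErrorTracker are hypothetical names, and the ordering (lookup below tserver) just mirrors the example above. The idea is that one tracker lives in the operation's top-level object and every phase reports into it.

```cpp
#include <string>
#include <utility>

// Hypothetical phases of a client operation, ordered by how interesting
// a failure in that phase is assumed to be (higher wins).
enum class Phase { kNone = 0, kLookup = 1, kTabletServer = 2 };

struct PhasedError {
  Phase phase = Phase::kNone;
  std::string message;
};

// Would be owned by the operation's top-level object and handed to each
// phase, which records its failures here.
class OperationErrorTracker {
 public:
  void Record(Phase phase, std::string message) {
    if (phase == Phase::kNone) return;  // Not a real failure.
    // Keep the error from the most interesting phase; among errors from
    // the same phase, keep the most recent one.
    if (static_cast<int>(phase) >= static_cast<int>(best_.phase)) {
      best_.phase = phase;
      best_.message = std::move(message);
    }
  }

  bool has_error() const { return best_.phase != Phase::kNone; }
  const PhasedError& best() const { return best_; }

 private:
  PhasedError best_;
};
```

Here a lookup failure recorded after a tserver failure does not overwrite it, regardless of the status code each one carried.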
For more context, we've already got "last" error tracking in KuduScanner::Data
and RpcRetrier. If we're going to add it to a third location, let's choose
deliberately and understand how all three work together.
The location picker isn't the worst place; it'll make the error available for
scans and writes, the main culprits. But it'd be nice to make it available in
administrative operations too, which are a little more ad-hoc at the moment.
To view, visit http://gerrit.cloudera.org:8080/3326
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins