bbeaudreault commented on pull request #4180:
URL: https://github.com/apache/hbase/pull/4180#issuecomment-1081912070
@ndimiduk great feedback as well, thanks! I agree with you on the relative
ambiguity here. I'm wondering how we can make incremental progress though. In
my opinion this PR is a step in the right direction, though still leaves lots
of area for improvement as you and Andrew call out.
If it were easy to generate a reasonable Retry-After-like return value, I'd
be glad to give that a try here. There are a couple of problems I see:
1. I'm not sure how we'd calculate that based on current implementations. I
think maybe based on an EWMA rate of call queue drainage? We don't currently
track that, and implementing it here seems out of scope for what started as a
unification of exception handling in an existing feature.
2. In my experience it's typically easier to handle backoff configuration on
the client side. This is for two reasons:
1. There are often many different clients, with different SLAs. These
usually play out in the timeout, num tries, and backoff policies.
2. Clients are typically easier to configure on-demand. HBase's
`onConfigurationChange` hooks can help with this, but it's still relatively
less agile than client configs imo.
With all of that said, in terms of setting ourselves up for further
iteration in future jiras, I wonder if the current implementation here is good
enough as is or could be improved with that in mind. I'm open to suggestions on
that, but here are my thoughts:
- I don't think the addition of `server_overloaded` in protobuf is a
problem, since we can change protobufs relatively freely.
- RemoteWithExtrasException is IA.Public, so if we add the serverOverloaded
stuff here now we'd need to handle deprecations if we wanted to remove that in
the future.
- That said, I think any future improvement here might want to be
predicated on first totally refactoring RemoteWithExtrasException, so we'd need
to handle handle deprecations anyway
- The bigger sticking point might be HBaseServerException, which now exposes
an isServerOverloaded() method. IMO we could leave this here even we added
further details down the road -- for example, there's little harm in having
both an `isServerOverloaded()` _and_ `getRetryAfterMillis()`. One describes the
state of the server, and the other describes what to do about it.
Based on those thoughts, I personally think this PR is good as is without
hampering our ability to improve this more down the line. That's not based on a
desire to avoid extra work here though, so if you think there are some tactical
changes we could make within the spirit of this jira please let me know!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]