[ https://issues.apache.org/jira/browse/KUDU-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved KUDU-1430.
-------------------------------
Resolution: Fixed
Assignee: Todd Lipcon
Fix Version/s: 0.9.0
This was fixed in 8094b73147e0238cdfd96a141554b5344c801cdb for 0.9
> Java client should not back-off so aggressively from busy servers
> -----------------------------------------------------------------
>
> Key: KUDU-1430
> URL: https://issues.apache.org/jira/browse/KUDU-1430
> Project: Kudu
> Issue Type: Bug
> Components: client
> Affects Versions: 0.8.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Fix For: 0.9.0
>
>
> Per the investigation in http://getkudu.io/2016/04/26/ycsb.html, it seems
> that the 500ms backoff that the Java client performs is far too aggressive.
> Some back-of-the-envelope math shows why:
> - a typical insert batch probably takes on the order of 10ms (assuming a
> large-ish batch and a remote server... localhost as seen in the blog post is
> much faster)
> - imagine we have a soft threshold of 60GB and hard threshold of 100GB
> - if we are 1GB over the soft threshold, we are 1/40 = 2.5% of the way from
> the soft threshold to the hard one. So, 2.5% of write requests will be
> rejected with 'TOO_BUSY'. Put another way, on average one out of every 40
> write requests will be rejected.
> - With the 10ms per-request time above, this means that we'll experience a
> rejection approximately once every 400ms.
> If the rejection causes us to block for 500ms, then we're spending more time
> sleeping than sending requests -- operating at <50% of our peak throughput
> even though we are only 2.5% of the way from the soft threshold to the hard
> one. Once we are 10GB over the soft threshold (25% of the way), we'll on
> average get 4 writes in (40ms) before we get rejected (500ms), and we're now
> spending about 92% of our time sleeping.
> The intent of the probabilistic rejection was always to make the insert rate
> "smooth" as memory fills up, but it seems the current implementation with the
> Java client is nearly as bad as the original "brick wall" we were trying to
> avoid.
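
The arithmetic in the description can be checked with a short sketch. It is
not Kudu code: the function names are hypothetical, and the linear ramp of the
rejection probability between the soft and hard thresholds is an assumption
inferred from the "1/40 = 2.5%" figure above. The 10ms request time, 500ms
backoff, and 60GB/100GB thresholds are the numbers from the description.

```python
def rejection_probability(usage_gb, soft_gb=60.0, hard_gb=100.0):
    """Fraction of writes rejected with TOO_BUSY.

    Assumption: the probability ramps linearly from 0 at the soft
    threshold to 1 at the hard threshold.
    """
    if usage_gb <= soft_gb:
        return 0.0
    if usage_gb >= hard_gb:
        return 1.0
    return (usage_gb - soft_gb) / (hard_gb - soft_gb)


def sleep_fraction(usage_gb, request_ms=10.0, backoff_ms=500.0):
    """Expected fraction of wall-clock time the client spends backing off."""
    p = rejection_probability(usage_gb)
    if p == 0.0:
        return 0.0
    # On average the client sends 1/p requests per rejection, so each
    # cycle is (request_ms / p) of sending followed by one backoff.
    sending_ms = request_ms / p
    return backoff_ms / (sending_ms + backoff_ms)


if __name__ == "__main__":
    for over_gb in (1.0, 10.0):
        frac = sleep_fraction(60.0 + over_gb)
        print(f"{over_gb:g}GB over soft threshold: {frac:.1%} of time sleeping")
    # 1GB over soft threshold: 55.6% of time sleeping
    # 10GB over soft threshold: 92.6% of time sleeping
```

Passing a different backoff_ms shows how strongly the result depends on the
backoff length; any value other than 500 is purely illustrative and says
nothing about what the actual fix chose.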