kfaraz opened a new pull request, #14269:
URL: https://github.com/apache/druid/pull/14269

   ## Changes
   
   The defaults of the following config values in the 
`CoordinatorDynamicConfig` are being updated.
   
   #### 1. `maxSegmentsInNodeLoadingQueue = 500` (previous = 100)
   
   Rationale: With round-robin segment assignment now being the default 
assignment technique, the Coordinator can assign a large number of 
under-replicated/unavailable segments very quickly. Before round-robin, a large 
queue size would cause the Coordinator to get stuck in the `RunRules` duty due to 
very slow strategy-based cost computations.
   
   #### 2. `replicationThrottleLimit = 500` (previous = 10)
   Rationale: Along with the reasoning given for 
`maxSegmentsInNodeLoadingQueue`, a very low `replicationThrottleLimit` can 
cause clusters to be very slow in getting to full replication, even when there 
are loading threads sitting idle.
   
   Note: It is okay to keep this value equal to 
`maxSegmentsInNodeLoadingQueue`. Even with equal values, load queues will not 
get filled up with just replicas, and segments that are completely unavailable 
will still get a fair chance. This is because `maxSegmentsInNodeLoadingQueue` 
applies to a single server, whereas `replicationThrottleLimit` applies to each tier.
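
As a rough illustration of the note above, the following Python sketch uses a deliberately simplified model (not Druid's actual assignment logic) in which each server has its own queue limit and each tier has a single replica budget. The function name and the example tier size of 10 servers are hypothetical.

```python
def slots_left_for_unavailable(num_servers: int,
                               max_segments_in_node_loading_queue: int,
                               replication_throttle_limit: int) -> int:
    """Queue slots in one tier that replicas cannot occupy (toy model)."""
    # The queue limit is per server, so tier-wide capacity scales with server count.
    tier_capacity = num_servers * max_segments_in_node_loading_queue
    # The replication throttle is per tier, so replicas use at most this many slots.
    replica_slots = min(replication_throttle_limit, tier_capacity)
    return tier_capacity - replica_slots


if __name__ == "__main__":
    # Hypothetical tier of 10 servers with both limits set to 500.
    print(slots_left_for_unavailable(10, 500, 500))  # prints 4500
```

In this model, replicas could occupy every queue slot only in a tier with a single server.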
   
   #### 3. `maxSegmentsToMove = 100` (previous = 5)
   
   Rationale: A very low value of this config (say 5) is ineffective at 
balancing, especially when a cluster has a large number of segments and/or a 
large skew between the usages of two historical servers.
   On the other hand, a very large value can cause excessive moves every 
minute, which might have the following disadvantages:
   - Load of moving segments competing with load of 
unavailable/under-replicated segments
   - Unnecessary network costs due to constant download and delete of segments
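
To give a feel for the trade-off, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every Coordinator run moves exactly `maxSegmentsToMove` segments and that correcting a skew requires moving a fixed number of segments; the figure of 10,000 segments and the function name are hypothetical.

```python
import math


def runs_to_rebalance(segments_to_move: int, max_segments_to_move: int) -> int:
    """Number of Coordinator runs needed if each run moves the full limit."""
    return math.ceil(segments_to_move / max_segments_to_move)


if __name__ == "__main__":
    skew = 10_000  # hypothetical number of segments that must move
    print(runs_to_rebalance(skew, 5))    # prints 2000
    print(runs_to_rebalance(skew, 100))  # prints 100
```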
   
   These defaults will be revisited after #13197 is merged.
   
   ## Testing
   
   These values have been tried on different production cluster sizes, and have 
been found to give satisfactory results.
   
   #### Release note
   Update default values of the following coordinator dynamic configs:
   - `maxSegmentsInNodeLoadingQueue = 500`
   - `maxSegmentsToMove = 100`
   - `replicationThrottleLimit = 500`
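
Operators who prefer to pin these values explicitly (for example, to keep the previous behavior) could submit them as coordinator dynamic config. The Python sketch below only builds the request and does not send it. It assumes the Coordinator's dynamic configuration endpoint at `/druid/coordinator/v1/config`; the host, port, and helper name are placeholders.

```python
import json
import urllib.request

# New defaults introduced by this PR.
NEW_DEFAULTS = {
    "maxSegmentsInNodeLoadingQueue": 500,
    "replicationThrottleLimit": 500,
    "maxSegmentsToMove": 100,
}

# Defaults before this PR, as listed in the description.
PREVIOUS_DEFAULTS = {
    "maxSegmentsInNodeLoadingQueue": 100,
    "replicationThrottleLimit": 10,
    "maxSegmentsToMove": 5,
}


def build_config_request(coordinator_url: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the dynamic config."""
    return urllib.request.Request(
        url=coordinator_url.rstrip("/") + "/druid/coordinator/v1/config",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder Coordinator address; replace with a real one before sending.
    request = build_config_request("http://localhost:8081", PREVIOUS_DEFAULTS)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Sending is left to the caller, e.g. urllib.request.urlopen(request).
```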
   
   <hr>
   
   
   This PR has:
   
   - [ ] been self-reviewed.
      - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

