[GitHub] [incubator-druid] leventov commented on a change in pull request #7154: rename maintenance mode to decommission

GitBox Wed, 06 Mar 2019 14:09:13 -0800

leventov commented on a change in pull request #7154: rename maintenance mode 
to decommission
URL: https://github.com/apache/incubator-druid/pull/7154#discussion_r263155699


 ##########
 File path: docs/content/configuration/index.md
 ##########
 @@ -803,9 +803,9 @@ Issuing a GET request at the same URL will return the spec 
that is currently in
 |`killDataSourceWhitelist`|List of dataSources for which kill tasks are sent 
if property `druid.coordinator.kill.on` is true. This can be a list of 
comma-separated dataSources or a JSON array.|none|
 |`killAllDataSources`|Send kill tasks for ALL dataSources if property 
`druid.coordinator.kill.on` is true. If this is set to true then 
`killDataSourceWhitelist` must not be specified or be empty list.|false|
 |`killPendingSegmentsSkipList`|List of dataSources for which pendingSegments 
are _NOT_ cleaned up if property `druid.coordinator.kill.pendingSegments.on` is 
true. This can be a list of comma-separated dataSources or a JSON array.|none|
-|`maxSegmentsInNodeLoadingQueue`|The maximum number of segments that could be 
queued for loading to any given server. This parameter could be used to speed 
up segments loading process, especially if there are "slow" processes in the 
cluster (with low loading speed) or if too much segments scheduled to be 
replicated to some particular node (faster loading could be preferred to better 
segments distribution). Desired value depends on segments loading speed, 
acceptable replication time and number of processes. Value 1000 could be a 
start point for a rather big cluster. Default value is 0 (loading queue is 
unbounded) |0|
-|`historicalNodesInMaintenance`| List of Historical nodes in maintenance mode. 
Coordinator doesn't assign new segments on those nodes and moves segments from 
the nodes according to a specified priority.|none|
-|`nodesInMaintenancePriority`| Priority of segments from servers in 
maintenance. Coordinator takes ceil(maxSegmentsToMove * (priority / 10)) from 
servers in maitenance during balancing phase, i.e.:<br>0 - no segments from 
servers in maintenance will be processed during balancing<br>5 - 50% segments 
from servers in maintenance<br>10 - 100% segments from servers in 
maintenance<br>By leveraging the priority an operator can prevent general nodes 
from overload or decrease maitenance time instead.|7|
+|`maxSegmentsInNodeLoadingQueue`|The maximum number of segments that could be 
queued for loading to any given server. This parameter could be used to speed 
up segments loading process, especially if there are "slow" nodes in the 
cluster (with low loading speed) or if too much segments scheduled to be 
replicated to some particular node (faster loading could be preferred to better 
segments distribution). Desired value depends on segments loading speed, 
acceptable replication time and number of nodes. Value 1000 could be a start 
point for a rather big cluster. Default value is 0 (loading queue is unbounded) 
|0|
+|`decommissioningNodes`| List of 'decommissioning' historical servers. The 
Coordinator doesn't assign new segments to these servers and moves segments 
away from the 'decommissioning' servers at the maximum rate specified by 
`decommissioningVelocity`.|none|
+|`decommissioningVelocity`| Decommissioning velocity determines the maximum 
number of segments that may be moved away from 'decommissioning' servers to 
non-decommissioning (that is, active) servers during one Coordinator's run. 
This value is relative to the total maximum segment movements allowed during 
one run which is determined by the `maxSegmentsToMove` configuration. 
Specifically, the maximum is `ceil(maxSegmentsToMove * (velocity / 10))`. For 
example, if `decommissioningVelocity` is 5, no more than 
`ceil(maxSegmentsToMove * 0.5)` segments may be moved away from 
'decommissioning' servers. If `decommissioningVelocity` is 0, segments will 
neither be moved from _or to_ 'decommissioning' servers, effectively putting 
them in a sort of 'maintenance' mode that will not participate in balancing or 
assignment by load rules. Decommissioning can also become stalled if there are 
no available active servers to place the segments. By leveraging the velocity 
an operator can prevent active servers from overload by prioritizing balancing, 
or decrease decommissioning time instead. The value should be between 0 and 
10.|7|
 
 Review comment:
   Sorry for asking you to do and redo the renames and docs, but I think we 
should use "percent" rather than "velocity".
   
   The reason is that after fixing #7159, there should be a single cap, 
`maxSegmentsToMove`. There should also be a configuration parameter that 
specifies what percent of that movement cap may be spent (at maximum) on 
segment loading and dropping (this is currently specified by a separate config, 
`replicationThrottleLimit`). For that (future) configuration, I want to use 
"percent" (for example, "minGuaranteedBalancingMovesPercent"), because I think 
that a 10% step (as in "velocity") might not be precise enough, and because 
"velocity" is simply not the right term for this situation.
   
   Note also that moving segments away from decommissioning nodes looks very 
much like a temporary "drop" rule "for servers". For this reason, I want the 
configurations that specify the min guaranteed balancing quota and the max 
'decommissioning' movement quota to use the same units.
   
   The other reason why percent may be preferable is that we don't need to 
explain the units with `ceil`, `/ 10`, etc. Everybody knows what a percent 
is. So it's less likely that users will specify a wrong number because they 
misinterpret the units (e.g., they specify `decommissioningVelocity=10` because 
they think that the velocity is expressed in percent, when they actually 
wanted `decommissioningVelocity=1`).
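   
   To make the unit mismatch concrete, here is a minimal sketch of the two 
ways of computing the cap. The velocity formula is the one from the docs in 
this diff. The percent formula is only an assumption about how a percent-based 
config might be interpreted (the rounding is a guess), and the function names 
are hypothetical, not Druid code.

```python
def max_moves_from_velocity(max_segments_to_move: int, velocity: int) -> int:
    """Cap from the docs in this diff: ceil(maxSegmentsToMove * (velocity / 10))."""
    if not 0 <= velocity <= 10:
        raise ValueError("velocity must be between 0 and 10")
    # Integer ceiling division, avoids floating-point rounding.
    return (max_segments_to_move * velocity + 9) // 10


def max_moves_from_percent(max_segments_to_move: int, percent: int) -> int:
    """Hypothetical percent-based cap; rounding up is an assumption."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return (max_segments_to_move * percent + 99) // 100


# A user who wants 10% of the moves but reads "velocity" as a percent:
intended = max_moves_from_percent(100, 10)      # what they meant
misread = max_moves_from_velocity(100, 10)      # what velocity=10 actually allows
correct = max_moves_from_velocity(100, 1)       # the velocity they needed
```

   With `maxSegmentsToMove` set to 100, `velocity=10` allows all 100 moves, 
while the intended 10% corresponds to `velocity=1`.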
   
   Specifically, I suggest this configuration be called 
"maxPercentOfDecommissioningMoves". It doesn't follow the prefix principle. 
"decommissioningMaxPercentOfMoves" is probably also acceptable, but because of 
its strange word order, it's less understandable.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
