This is an automated email from the ASF dual-hosted git repository.
kfaraz pushed a commit to branch 27.0.0
in repository https://gitbox.apache.org/repos/asf/druid.git
The following commit(s) were added to refs/heads/27.0.0 by this push:
new 323a504176 Docs: Changes for coordinator improvements done in #13197
(#14590) (#14603)
323a504176 is described below
commit 323a50417611c2d573525be24f5f08c3ec0e74c4
Author: Kashif Faraz <[email protected]>
AuthorDate: Tue Jul 18 15:16:00 2023 +0530
Docs: Changes for coordinator improvements done in #13197 (#14590) (#14603)
---
docs/configuration/index.md | 33 +++++++++++++++++++++++++++------
docs/operations/metrics.md | 22 ++++++++++++----------
2 files changed, 39 insertions(+), 16 deletions(-)
diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index f0cf9c6011..073a61e9b9 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -887,7 +887,7 @@ These Coordinator static configurations can be defined in
the `coordinator/runti
|Property|Possible Values|Description|Default|
|--------|---------------|-----------|-------|
|`druid.serverview.type`|batch or http|Segment discovery method to use. "http"
enables discovering segments using HTTP instead of ZooKeeper.|http|
-|`druid.coordinator.loadqueuepeon.type`|curator or http|Whether to use "http"
or "curator" implementation to assign segment loads/drops to historical|http|
+|`druid.coordinator.loadqueuepeon.type`|curator or http|Implementation to use
to assign segment loads and drops to historicals. The Curator-based
implementation is now deprecated, so you should transition to HTTP-based
segment assignment.|http|
|`druid.coordinator.segment.awaitInitializationOnStart`|true or false|Whether
the Coordinator will wait for its view of segments to fully initialize before
starting up. If set to 'true', the Coordinator's HTTP server will not start up,
and the Coordinator will not announce itself as available, until the server
view is initialized.|true|
###### Additional config when "http" loadqueuepeon is used
@@ -907,7 +907,7 @@ These Coordinator static configurations can be defined in
the `coordinator/runti
#### Dynamic Configuration
-The Coordinator has dynamic configuration to change certain behavior on the
fly.
+The Coordinator has dynamic configurations to tune certain behavior on the
fly, without requiring a service restart.
It is recommended that you use the [web console](../operations/web-console.md)
to configure these parameters.
However, if you need to do it via HTTP, the JSON object can be submitted to
the Coordinator via a POST request at:
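As a hedged sketch of the POST described above (the host, port, and `/druid/coordinator/v1/config` path are assumptions about your deployment; verify them against your cluster before use), the dynamic-config JSON can be submitted like this:

```python
import json
from urllib import request

# Assumed Coordinator address; adjust for your cluster.
COORDINATOR = "http://localhost:8081"

# Example dynamic-config payload; any subset of the documented
# properties may be included.
payload = {
    "smartSegmentLoading": True,
    "millisToWaitBeforeDeleting": 900_000,
}

req = request.Request(
    f"{COORDINATOR}/druid/coordinator/v1/config",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# request.urlopen(req)  # uncomment to submit against a live Coordinator
print(json.loads(req.data))
```

A GET request to the same path returns the spec currently in effect, as noted below.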
@@ -949,10 +949,11 @@ Issuing a GET request at the same URL will return the
spec that is currently in
|`millisToWaitBeforeDeleting`|How long does the Coordinator need to be a
leader before it can start marking overshadowed segments as unused in metadata
storage.|900000 (15 mins)|
|`mergeBytesLimit`|The maximum total uncompressed size in bytes of segments to
merge.|524288000L|
|`mergeSegmentsLimit`|The maximum number of segments that can be in a single
[append task](../ingestion/tasks.md).|100|
+|`smartSegmentLoading`|Enables ["smart" segment loading
mode](#smart-segment-loading) which dynamically computes the optimal values of
several properties that maximize Coordinator performance.|true|
|`maxSegmentsToMove`|The maximum number of segments that can be moved at any
given time.|100|
-|`replicantLifetime`|The maximum number of Coordinator runs for a segment to
be replicated before we start alerting.|15|
-|`replicationThrottleLimit`|The maximum number of segments that can be in the
replication queue of a historical tier at any given time.|500|
-|`balancerComputeThreads`|Thread pool size for computing moving cost of
segments in segment balancing. Consider increasing this if you have a lot of
segments and moving segments starts to get stuck.|1|
+|`replicantLifetime`|The maximum number of Coordinator runs for which a
segment can wait in the load queue of a Historical before Druid raises an
alert.|15|
+|`replicationThrottleLimit`|The maximum number of segment replicas that can be
assigned to a historical tier in a single Coordinator run. This property
prevents historicals from becoming overwhelmed when loading extra replicas of
segments that are already available in the cluster.|500|
+|`balancerComputeThreads`|Thread pool size for computing moving cost of
segments during segment balancing. Consider increasing this if you have a lot
of segments and moving segments begins to stall.|1|
|`killDataSourceWhitelist`|List of specific data sources for which kill tasks
are sent if property `druid.coordinator.kill.on` is true. This can be a list of
comma-separated data source names or a JSON array.|none|
|`killPendingSegmentsSkipList`|List of data sources for which pendingSegments
are _NOT_ cleaned up if property `druid.coordinator.kill.pendingSegments.on` is
true. This can be a list of comma-separated data sources or a JSON array.|none|
|`maxSegmentsInNodeLoadingQueue`|The maximum number of segments allowed in the
load queue of any given server. Use this parameter to load segments faster if,
for example, the cluster contains slow-loading nodes or if there are too many
segments to be replicated to a particular node (when faster loading is
preferred to better segments distribution). The optimal value depends on the
loading speed of segments, acceptable replication time and number of nodes.
|500|
@@ -961,9 +962,29 @@ Issuing a GET request at the same URL will return the spec
that is currently in
|`decommissioningMaxPercentOfMaxSegmentsToMove`| Upper limit of segments the
Coordinator can move from decommissioning servers to active non-decommissioning
servers during a single run. This value is relative to the total maximum number
of segments that can be moved at any given time based upon the value of
`maxSegmentsToMove`.<br /><br />If
`decommissioningMaxPercentOfMaxSegmentsToMove` is 0, the Coordinator does not
move segments to decommissioning servers, effectively putting them in [...]
|`pauseCoordination`| Boolean flag for whether or not the coordinator should
execute its various duties of coordinating the cluster. Setting this to true
essentially pauses all coordination work while allowing the API to remain up.
Duties that are paused include all classes that implement the `CoordinatorDuty`
Interface. Such duties include: Segment balancing, Segment compaction,
Submitting kill tasks for unused segments (if enabled), Logging of used
segments in the cluster, Marking of n [...]
|`replicateAfterLoadTimeout`| Boolean flag for whether or not additional
replication is needed for segments that have failed to load due to the expiry
of `druid.coordinator.load.timeout`. If this is set to true, the coordinator
will attempt to replicate the failed segment on a different historical server.
This helps improve the segment availability if there are a few slow historicals
in the cluster. However, the slow historical may still load the segment later
and the coordinator may iss [...]
-|`maxNonPrimaryReplicantsToLoad`|This is the maximum number of non-primary
segment replicants to load per Coordination run. This number can be set to put
a hard upper limit on the number of replicants loaded. It is a tool that can
help prevent long delays in new data being available for query after events
that require many non-primary replicants to be loaded by the cluster; such as a
Historical node disconnecting from the cluster. The default value essentially
means there is no limit on [...]
+|`maxNonPrimaryReplicantsToLoad`|The maximum number of replicas that can be
assigned across all tiers in a single Coordinator run. This parameter serves
the same purpose as `replicationThrottleLimit` except this limit applies at the
cluster-level instead of per tier. The default value does not apply a limit to
the number of replicas assigned per coordination cycle. If you want to use a
non-default value for this property, you may want to start with `~20%` of the
number of segments found [...]
+##### Smart segment loading
+The `smartSegmentLoading` mode simplifies Coordinator configuration for
segment loading and balancing.
+In this mode, the Coordinator does not require the user to provide values of
the following parameters and computes them automatically instead.
+> If you enable `smartSegmentLoading` mode **and** provide values for the
following properties, Druid ignores your values. The Coordinator computes them
automatically.
+The computed values are based on the current state of the cluster and are
designed to optimize Coordinator performance.
+
+|Property|Computed value|Description|
+|--------|--------------|-----------|
+|`useRoundRobinSegmentAssignment`|true|Speeds up segment assignment.|
+|`maxSegmentsInNodeLoadingQueue`|0|Removes the limit on load queue size.|
+|`replicationThrottleLimit`|2% of used segments, minimum value 100|Prevents
aggressive replication when a historical disappears only intermittently.|
+|`replicantLifetime`|60|Allows segments to wait about an hour (assuming a
Coordinator period of 1 minute) in the load queue before an alert is raised. In
`smartSegmentLoading` mode, load queues are not limited by size. Segments might
therefore be assigned to a load queue even if the corresponding server is slow
to load them.|
+|`maxNonPrimaryReplicantsToLoad`|`Integer.MAX_VALUE` (no limit)|This
throttling is already handled by `replicationThrottleLimit`.|
+|`maxSegmentsToMove`|2% of used segments, minimum value 100, maximum value
1000|Ensures that some segments are always moving in the cluster to keep it
well balanced. The maximum value keeps the Coordinator run times bounded.|
+|`decommissioningMaxPercentOfMaxSegmentsToMove`|100|Prioritizes the move of
segments from decommissioning servers so that they can be terminated quickly.|
+
+When `smartSegmentLoading` is disabled, Druid uses the configured values of
these properties.
+Disable `smartSegmentLoading` only if you want to explicitly set the values of
any of the above properties.
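The computed values in the table above can be sketched as follows. This is a rough illustration of the documented rules only, not the Coordinator's actual implementation, and the function name is hypothetical:

```python
def smart_loading_values(used_segments: int) -> dict:
    """Sketch of the values smartSegmentLoading computes, per the table above.

    used_segments: number of used segments in the cluster.
    """
    two_percent = used_segments // 50  # 2% of used segments
    return {
        "useRoundRobinSegmentAssignment": True,
        "maxSegmentsInNodeLoadingQueue": 0,          # no load queue size limit
        "replicationThrottleLimit": max(100, two_percent),
        "replicantLifetime": 60,                     # ~1 hour at a 1-minute period
        "maxNonPrimaryReplicantsToLoad": 2**31 - 1,  # Integer.MAX_VALUE, no limit
        "maxSegmentsToMove": min(1000, max(100, two_percent)),
        "decommissioningMaxPercentOfMaxSegmentsToMove": 100,
    }

# For a cluster with 10,000 used segments, 2% works out to 200.
print(smart_loading_values(10_000))
```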
+
+##### Audit history
To view the audit history of Coordinator dynamic config, issue a GET request to
the URL -
```
diff --git a/docs/operations/metrics.md b/docs/operations/metrics.md
index b742fff795..d10bc6fa03 100644
--- a/docs/operations/metrics.md
+++ b/docs/operations/metrics.md
@@ -283,19 +283,21 @@ These metrics are for the Druid Coordinator and are reset
each time the Coordina
|Metric|Description|Dimensions|Normal Value|
|------|-----------|----------|------------|
-|`segment/assigned/count`|Number of segments assigned to be loaded in the
cluster.|`tier`|Varies|
-|`segment/moved/count`|Number of segments moved in the cluster.|`tier`|Varies|
-|`segment/unmoved/count`|Number of segments which were chosen for balancing
but were found to be already optimally placed.|`tier`|Varies|
-|`segment/dropped/count`|Number of segments chosen to be dropped from the
cluster due to being over-replicated.|`tier`|Varies|
-|`segment/deleted/count`|Number of segments marked as unused due to drop
rules.| |Varies|
-|`segment/unneeded/count`|Number of segments dropped due to being marked as
unused.|`tier`|Varies|
-|`segment/cost/raw`|Used in cost balancing. The raw cost of hosting
segments.|`tier`|Varies|
-|`segment/cost/normalization`|Used in cost balancing. The normalization of
hosting segments.|`tier`|Varies|
-|`segment/cost/normalized`|Used in cost balancing. The normalized cost of
hosting segments.|`tier`|Varies|
+|`segment/assigned/count`|Number of segments assigned to be loaded in the
cluster.|`dataSource`, `tier`|Varies|
+|`segment/moved/count`|Number of segments moved in the cluster.|`dataSource`,
`tier`|Varies|
+|`segment/dropped/count`|Number of segments chosen to be dropped from the
cluster due to being over-replicated.|`dataSource`, `tier`|Varies|
+|`segment/deleted/count`|Number of segments marked as unused due to drop
rules.|`dataSource`|Varies|
+|`segment/unneeded/count`|Number of segments dropped due to being marked as
unused.|`dataSource`, `tier`|Varies|
+|`segment/assignSkipped/count`|Number of segments that could not be assigned
to any server for loading. This can occur due to replication throttling, no
available disk space, or a full load queue.|`dataSource`, `tier`,
`description`|Varies|
+|`segment/moveSkipped/count`|Number of segments that were chosen for balancing
but could not be moved. This can occur when segments are already optimally
placed.|`dataSource`, `tier`, `description`|Varies|
+|`segment/dropSkipped/count`|Number of segments that could not be dropped from
any server.|`dataSource`, `tier`, `description`|Varies|
|`segment/loadQueue/size`|Size in bytes of segments to load.|`server`|Varies|
-|`segment/loadQueue/failed`|Number of segments that failed to load.|`server`|0|
|`segment/loadQueue/count`|Number of segments to load.|`server`|Varies|
|`segment/dropQueue/count`|Number of segments to drop.|`server`|Varies|
+|`segment/loadQueue/assigned`|Number of segments assigned for load or drop to
the load queue of a server.|`dataSource`, `server`|Varies|
+|`segment/loadQueue/success`|Number of segment assignments that completed
successfully.|`dataSource`, `server`|Varies|
+|`segment/loadQueue/failed`|Number of segment assignments that failed to
complete.|`dataSource`, `server`|0|
+|`segment/loadQueue/cancelled`|Number of segment assignments that were
canceled before completion.|`dataSource`, `server`|Varies|
|`segment/size`|Total size of used segments in a data source. Emitted only for
data sources to which at least one used segment belongs.|`dataSource`|Varies|
|`segment/count`|Number of used segments belonging to a data source. Emitted
only for data sources to which at least one used segment
belongs.|`dataSource`|< max|
|`segment/overShadowed/count`|Number of segments marked as unused due to being
overshadowed.| |Varies|
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]