jihoonson commented on a change in pull request #7306: Reconcile terminology
and method naming to 'used/unused segments'; Rename MetadataSegmentManager to
MetadataSegments
URL: https://github.com/apache/druid/pull/7306#discussion_r369332885
##########
File path: docs/design/coordinator.md
##########
@@ -33,11 +33,24 @@ For a list of API endpoints supported by the Coordinator,
see [Coordinator API](
### Overview
-The Druid Coordinator process is primarily responsible for segment management
and distribution. More specifically, the Druid Coordinator process communicates
to Historical processes to load or drop segments based on configurations. The
Druid Coordinator is responsible for loading new segments, dropping outdated
segments, managing segment replication, and balancing segment load.
-
-The Druid Coordinator runs periodically and the time between each run is a
configurable parameter. Each time the Druid Coordinator runs, it assesses the
current state of the cluster before deciding on the appropriate actions to
take. Similar to the Broker and Historical processes, the Druid Coordinator
maintains a connection to a Zookeeper cluster for current cluster information.
The Coordinator also maintains a connection to a database containing
information about available segments and rules. Available segments are stored
in a segment table and list all segments that should be loaded in the cluster.
Rules are stored in a rule table and indicate how segments should be handled.
-
-Before any unassigned segments are serviced by Historical processes, the
available Historical processes for each tier are first sorted in terms of
capacity, with least capacity servers having the highest priority. Unassigned
segments are always assigned to the processes with least capacity to maintain a
level of balance between processes. The Coordinator does not directly
communicate with a historical process when assigning it a new segment; instead
the Coordinator creates some temporary information about the new segment under
load queue path of the historical process. Once this request is seen, the
historical process will load the segment and begin servicing it.
+The Druid Coordinator process is primarily responsible for segment management
and distribution. More specifically, the
+Druid Coordinator process communicates to Historical processes to load or drop
segments based on configurations. The
+Druid Coordinator is responsible for loading new segments, dropping outdated
segments, ensuring that segments are
+"replicated" (that is, loaded on multiple different Historical nodes) proper
(configured) number of times, and moving
+("balancing") segments between Historical nodes to keep the latter evenly
loaded.
+
+The Druid Coordinator runs its duties periodically and the time between each
run is a configurable parameter. On each
+run, the Coordinator assesses the current state of the cluster before deciding
on the appropriate actions to take.
+Similar to the Broker and Historical processes, the Druid Coordinator
maintains a connection to a Zookeeper cluster for
+current cluster information. The Coordinator also maintains a connection to a
database containing information about
+"used" segments (that is, the segments that *should* be loaded in the cluster)
and the loading rules.
+
+Before any unassigned segments are serviced by Historical processes, the
Historical processes for each tier are first
+sorted in terms of capacity, with least capacity servers having the highest
priority. Unassigned segments are always
Review comment:
> Before any unassigned segments are serviced by Historical processes, the
> Historical processes for each tier are first
> sorted in terms of capacity, with least capacity servers having the highest
> priority.
It seems this statement was written in 2013. I don't think it is
true anymore. It may be better to mention `druid.coordinator.balancer.strategy` and
link to
https://github.com/apache/druid/blob/master/docs/configuration/index.md#coordinator-operation.