jihoonson commented on a change in pull request #7306: Reconcile terminology 
and method naming to 'used/unused segments'; Rename MetadataSegmentManager to 
MetadataSegments
URL: https://github.com/apache/druid/pull/7306#discussion_r369332885
 
 

 ##########
 File path: docs/design/coordinator.md
 ##########
 @@ -33,11 +33,24 @@ For a list of API endpoints supported by the Coordinator, 
see [Coordinator API](
 
 ### Overview
 
-The Druid Coordinator process is primarily responsible for segment management 
and distribution. More specifically, the Druid Coordinator process communicates 
to Historical processes to load or drop segments based on configurations. The 
Druid Coordinator is responsible for loading new segments, dropping outdated 
segments, managing segment replication, and balancing segment load.
-
-The Druid Coordinator runs periodically and the time between each run is a 
configurable parameter. Each time the Druid Coordinator runs, it assesses the 
current state of the cluster before deciding on the appropriate actions to 
take. Similar to the Broker and Historical processes, the Druid Coordinator 
maintains a connection to a Zookeeper cluster for current cluster information. 
The Coordinator also maintains a connection to a database containing 
information about available segments and rules. Available segments are stored 
in a segment table and list all segments that should be loaded in the cluster. 
Rules are stored in a rule table and indicate how segments should be handled.
-
-Before any unassigned segments are serviced by Historical processes, the 
available Historical processes for each tier are first sorted in terms of 
capacity, with least capacity servers having the highest priority. Unassigned 
segments are always assigned to the processes with least capacity to maintain a 
level of balance between processes. The Coordinator does not directly 
communicate with a historical process when assigning it a new segment; instead 
the Coordinator creates some temporary information about the new segment under 
load queue path of the historical process. Once this request is seen, the 
historical process will load the segment and begin servicing it.
+The Druid Coordinator process is primarily responsible for segment management 
and distribution. More specifically, the
+Druid Coordinator process communicates to Historical processes to load or drop 
segments based on configurations. The
+Druid Coordinator is responsible for loading new segments, dropping outdated 
segments, ensuring that segments are
+"replicated" (that is, loaded on multiple different Historical nodes) proper 
(configured) number of times, and moving
+("balancing") segments between Historical nodes to keep the latter evenly 
loaded.
+
+The Druid Coordinator runs its duties periodically and the time between each 
run is a configurable parameter. On each
+run, the Coordinator assesses the current state of the cluster before deciding 
on the appropriate actions to take.
+Similar to the Broker and Historical processes, the Druid Coordinator 
maintains a connection to a Zookeeper cluster for
+current cluster information. The Coordinator also maintains a connection to a 
database containing information about
+"used" segments (that is, the segments that *should* be loaded in the cluster) 
and the loading rules.
+
+Before any unassigned segments are serviced by Historical processes, the 
Historical processes for each tier are first
+sorted in terms of capacity, with least capacity servers having the highest 
priority. Unassigned segments are always
 
 Review comment:
   > Before any unassigned segments are serviced by Historical processes, the 
Historical processes for each tier are first
   sorted in terms of capacity, with least capacity servers having the highest 
priority. 
   
   This statement appears to have been written in 2013, and I don't think it is 
true anymore. It may be better to mention `druid.coordinator.balancer.strategy` 
and link to 
https://github.com/apache/druid/blob/master/docs/configuration/index.md#coordinator-operation.
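
   As a rough illustration of what the docs could point readers to, the 
property is set in the Coordinator's runtime properties. This is only a sketch: 
the file location in the comment and the value shown are assumptions, so the 
linked configuration reference should be checked for the strategies that are 
actually supported and for the default.

   ```properties
   # Coordinator runtime.properties (file location is an assumption).
   # Selects how the Coordinator picks Historical servers when assigning
   # and balancing segments. "cost" is used here purely as an example value.
   druid.coordinator.balancer.strategy=cost
   ```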

