Re: [PR] Revamp design page (druid)

via GitHub Mon, 04 Dec 2023 14:39:57 -0800


techdocsmith commented on code in PR #15486:
URL: https://github.com/apache/druid/pull/15486#discussion_r1414585842



##########
docs/design/broker.md:
##########
@@ -25,18 +26,18 @@ title: "Broker"
 
 ### Configuration
 
-For Apache Druid Broker Process Configuration, see [Broker 
Configuration](../configuration/index.md#broker).
+For Apache Druid Broker service configuration, see [Broker 
Configuration](../configuration/index.md#broker).
 
-For basic tuning guidance for the Broker process, see [Basic cluster 
tuning](../operations/basic-cluster-tuning.md#broker).
+For basic tuning guidance for the Broker service, see [Basic cluster 
tuning](../operations/basic-cluster-tuning.md#broker).
 
 ### HTTP endpoints
 
 For a list of API endpoints supported by the Broker, see [Broker 
API](../api-reference/legacy-metadata-api.md#broker).
 
 ### Overview
 
-The Broker is the process to route queries to if you want to run a distributed 
cluster. It understands the metadata published to ZooKeeper about what segments 
exist on what processes and routes queries such that they hit the right 
processes. This process also merges the result sets from all of the individual 
processes together.
-On start up, Historical processes announce themselves and the segments they 
are serving in Zookeeper.
+The Broker is the service to route queries to if you want to run a distributed 
cluster. It understands the metadata published to ZooKeeper about what segments 
exist on what services and routes queries such that they hit the right 
services. This service also merges the result sets from all of the individual 
services together.

Review Comment:
   Check the header levels. It seems like we're starting at level 3 here. Also, we 
could skip the ### Overview heading and just have some overview text. This 
overview needs help :( , but maybe we fix that later.



##########
docs/design/architecture.md:
##########
@@ -102,225 +178,4 @@ For more details, please see the [Metadata 
storage](../design/metadata-storage.m
 
 Used for internal service discovery, coordination, and leader election.
 
-For more details, please see the [ZooKeeper](zookeeper.md) page.
-
-
-## Storage design

Review Comment:
   Do we need to link out to the new topic? It could even go in a ## Learn more 
section at the end.



##########
docs/design/architecture.md:
##########
@@ -23,45 +23,121 @@ title: "Design"
   -->
 
 
-Druid has a distributed architecture that is designed to be cloud-friendly and 
easy to operate. You can configure and scale services independently so you have 
maximum flexibility over cluster operations. This design includes enhanced 
fault tolerance: an outage of one component does not immediately affect other 
components.
+Druid has a distributed architecture that is designed to be cloud-friendly and 
easy to operate. You can configure and scale services independently for maximum 
flexibility over cluster operations. This design includes enhanced fault 
tolerance: an outage of one component does not immediately affect other 
components.
 
-## Druid architecture
+The following diagram shows the services that make up the Druid architecture, 
their typical arrangement across servers, and how queries and data flow through 
this architecture.
 
-The following diagram shows the services that make up the Druid architecture, 
how they are typically organized into servers, and how queries and data flow 
through this architecture.
+![Druid architecture](../assets/druid-architecture.svg)
 
-![Druid architecture](../assets/druid-architecture.png)
-
-The following sections describe the components of this architecture. 
+The following sections describe the components of this architecture.
 
 ## Druid services
 
 Druid has several types of services:
 
-* [**Coordinator**](../design/coordinator.md) service manages data 
availability on the cluster.
-* [**Overlord**](../design/overlord.md) service controls the assignment of 
data ingestion workloads.
-* [**Broker**](../design/broker.md) handles queries from external clients.
-* [**Router**](../design/router.md) services are optional; they route requests 
to Brokers, Coordinators, and Overlords.
-* [**Historical**](../design/historical.md) services store queryable data.
-* [**MiddleManager**](../design/middlemanager.md) services ingest data.
+* [Coordinator](../design/coordinator.md) manages data availability on the 
cluster.
+* [Overlord](../design/overlord.md) controls the assignment of data ingestion 
workloads.
+* [Broker](../design/broker.md) handles queries from external clients.
+* [Router](../design/router.md) optionally routes requests to Brokers, 
Coordinators, and Overlords.
+* [Historical](../design/historical.md) stores queryable data.
+* [MiddleManager](../design/middlemanager.md) and [Peon](../design/peons.md) 
ingest data.
+* [Indexer](../design/indexer.md) serves an alternative to the MiddleManager + 
Peon task execution system.
 
 You can view services in the **Services** tab in the web console: 
 
 ![Druid services](../assets/services-overview.png "Services in the web 
console")
 
-
 ## Druid servers
 
-Druid services can be deployed any way you like, but for ease of deployment we 
suggest organizing them into three server types: Master, Query, and Data.
+You can deploy Druid services according to your preferences. For ease of 
deployment, we recommend organizing them into three server types: 
[Master](#master-server), [Query](#query-server), and [Data](#data-server).
+
+### Master server
+
+A Master server manages data ingestion and availability. It is responsible for 
starting new ingestion jobs and coordinating availability of data on the [Data 
server](#data-server).
+
+Master servers divide operations between Coordinator and Overlord services.
+
+#### Coordinator service
+
+[Coordinator](../design/coordinator.md) services watch over the Historical 
services on the Data servers. They are responsible for assigning segments to 
specific servers, and for ensuring segments are well-balanced across 
Historicals.
+
+#### Overlord service
+
+[Overlord](../design/overlord.md) services watch over the MiddleManager 
services on the Data servers and are the controllers of data ingestion into 
Druid. They are responsible for assigning ingestion tasks to MiddleManagers and 
for coordinating segment publishing.
+
+### Query server
+
+A Query server provides the endpoints that users and client applications 
interact with, routing queries to Data servers or other Query servers (and 
optionally proxied Master server requests).
+
+Query servers divide operations between Broker and Router services.
+
+#### Broker service
+
+[Broker](../design/broker.md) services receive queries from external clients 
and forward those queries to Data servers. When Brokers receive results from 
those subqueries, they merge those results and return them to the caller. End 
users typically query Brokers rather than querying Historical or MiddleManager 
services on Data servers directly.
+
+#### Router service (optional)

Review Comment:
   The Router is enabled by default now. I think we'll want to verify 
what this should say.



##########
docs/design/broker.md:
##########
@@ -25,18 +26,18 @@ title: "Broker"
 
 ### Configuration
 
-For Apache Druid Broker Process Configuration, see [Broker 
Configuration](../configuration/index.md#broker).
+For Apache Druid Broker service configuration, see [Broker 
Configuration](../configuration/index.md#broker).
 
-For basic tuning guidance for the Broker process, see [Basic cluster 
tuning](../operations/basic-cluster-tuning.md#broker).
+For basic tuning guidance for the Broker service, see [Basic cluster 
tuning](../operations/basic-cluster-tuning.md#broker).
 
 ### HTTP endpoints
 
 For a list of API endpoints supported by the Broker, see [Broker 
API](../api-reference/legacy-metadata-api.md#broker).
 
 ### Overview
 
-The Broker is the process to route queries to if you want to run a distributed 
cluster. It understands the metadata published to ZooKeeper about what segments 
exist on what processes and routes queries such that they hit the right 
processes. This process also merges the result sets from all of the individual 
processes together.
-On start up, Historical processes announce themselves and the segments they 
are serving in Zookeeper.
+The Broker is the service to route queries to if you want to run a distributed 
cluster. It understands the metadata published to ZooKeeper about what segments 
exist on what services and routes queries such that they hit the right 
services. This service also merges the result sets from all of the individual 
services together.
+On start up, Historical services announce themselves and the segments they are 
serving in ZooKeeper.

Review Comment:
   Why does this switch to talking about Historical?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

