ektravel commented on code in PR #15486:
URL: https://github.com/apache/druid/pull/15486#discussion_r1417688724
##########
docs/design/architecture.md:
##########
@@ -23,45 +23,121 @@ title: "Design"
 -->
-Druid has a distributed architecture that is designed to be cloud-friendly and easy to operate. You can configure and scale services independently so you have maximum flexibility over cluster operations. This design includes enhanced fault tolerance: an outage of one component does not immediately affect other components.
+Druid has a distributed architecture that is designed to be cloud-friendly and easy to operate. You can configure and scale services independently for maximum flexibility over cluster operations. This design includes enhanced fault tolerance: an outage of one component does not immediately affect other components.
-## Druid architecture
+The following diagram shows the services that make up the Druid architecture, their typical arrangement across servers, and how queries and data flow through this architecture.
-The following diagram shows the services that make up the Druid architecture, how they are typically organized into servers, and how queries and data flow through this architecture.
+
-
-
-The following sections describe the components of this architecture.
+The following sections describe the components of this architecture.
 ## Druid services
 Druid has several types of services:
-* [**Coordinator**](../design/coordinator.md) service manages data availability on the cluster.
-* [**Overlord**](../design/overlord.md) service controls the assignment of data ingestion workloads.
-* [**Broker**](../design/broker.md) handles queries from external clients.
-* [**Router**](../design/router.md) services are optional; they route requests to Brokers, Coordinators, and Overlords.
-* [**Historical**](../design/historical.md) services store queryable data.
-* [**MiddleManager**](../design/middlemanager.md) services ingest data.
+* [Coordinator](../design/coordinator.md) manages data availability on the cluster.
+* [Overlord](../design/overlord.md) controls the assignment of data ingestion workloads.
+* [Broker](../design/broker.md) handles queries from external clients.
+* [Router](../design/router.md) optionally routes requests to Brokers, Coordinators, and Overlords.
+* [Historical](../design/historical.md) stores queryable data.
+* [MiddleManager](../design/middlemanager.md) and [Peon](../design/peons.md) ingest data.
+* [Indexer](../design/indexer.md) serves an alternative to the MiddleManager + Peon task execution system.
 You can view services in the **Services** tab in the web console:
-
 ## Druid servers
-Druid services can be deployed any way you like, but for ease of deployment we suggest organizing them into three server types: Master, Query, and Data.
+You can deploy Druid services according to your preferences. For ease of deployment, we recommend organizing them into three server types: [Master](#master-server), [Query](#query-server), and [Data](#data-server).
+
+### Master server
+
+A Master server manages data ingestion and availability. It is responsible for starting new ingestion jobs and coordinating availability of data on the [Data server](#data-server).
+
+Master servers divide operations between Coordinator and Overlord services.
+
+#### Coordinator service
+
+[Coordinator](../design/coordinator.md) services watch over the Historical services on the Data servers. They are responsible for assigning segments to specific servers, and for ensuring segments are well-balanced across Historicals.
+
+#### Overlord service
+
+[Overlord](../design/overlord.md) services watch over the MiddleManager services on the Data servers and are the controllers of data ingestion into Druid. They are responsible for assigning ingestion tasks to MiddleManagers and for coordinating segment publishing.
+
+### Query server
+
+A Query server provides the endpoints that users and client applications interact with, routing queries to Data servers or other Query servers (and optionally proxied Master server requests).
+
+Query servers divide operations between Broker and Router services.
+
+#### Broker service
+
+[Broker](../design/broker.md) services receive queries from external clients and forward those queries to Data servers. When Brokers receive results from those subqueries, they merge those results and return them to the caller. End users typically query Brokers rather than querying Historical or MiddleManager services on Data servers directly.
+
+#### Router service (optional)

Review Comment:
Updated

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
