kfaraz commented on code in PR #16412:
URL: https://github.com/apache/druid/pull/16412#discussion_r1596214877

##########
docs/release-info/release-notes.md:
##########
@@ -57,6 +57,51 @@ For tips about how to write a good release note, see [Release notes](https://git
 This section contains important information about new and existing features.
+### Centralized datasource schema (alpha)
+
+You can now configure Druid to centralize schema management using the Coordinator service. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.

Review Comment:
```suggestion
You can now configure Druid to manage datasource schema centrally on the Coordinator. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.
```

##########
docs/release-info/release-notes.md:
##########
@@ -57,6 +57,51 @@ For tips about how to write a good release note, see [Release notes](https://git
 This section contains important information about new and existing features.
+### Centralized datasource schema (alpha)
+
+You can now configure Druid to centralize schema management using the Coordinator service. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.
+
+If enabled, the following changes occur:
+
+- Realtime segment schema changes get periodically pushed to the Coordinator
+- Tasks publish segment schemas and metadata to the metadata database

Review Comment:
```suggestion
- Tasks publish segment schemas and metadata to the metadata store
```

##########
docs/release-info/release-notes.md:
##########
@@ -57,6 +57,51 @@ For tips about how to write a good release note, see [Release notes](https://git
 This section contains important information about new and existing features.
+### Centralized datasource schema (alpha)
+
+You can now configure Druid to centralize schema management using the Coordinator service. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.
+
+If enabled, the following changes occur:
+
+- Realtime segment schema changes get periodically pushed to the Coordinator
+- Tasks publish segment schemas and metadata to the metadata database
+- The Coordinator service polls the schema and segment metadata to build datasource schemas

Review Comment:
```suggestion
- The Coordinator polls the schema and segment metadata to build datasource schemas
```

##########
docs/release-info/release-notes.md:
##########
@@ -57,6 +57,51 @@ For tips about how to write a good release note, see [Release notes](https://git
 This section contains important information about new and existing features.
+### Centralized datasource schema (alpha)
+
+You can now configure Druid to centralize schema management using the Coordinator service. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.
+
+If enabled, the following changes occur:
+
+- Realtime segment schema changes get periodically pushed to the Coordinator
+- Tasks publish segment schemas and metadata to the metadata database
+- The Coordinator service polls the schema and segment metadata to build datasource schemas
+- Brokers fetch datasource schemas from the Coordinator when possible. If not, the Broker builds the schema.
+
+This behavior is currently opt-in. To enable this feature, set the following configs:
+
+- In your common runtime properties, set `druid.centralizedDatasourceSchema.enabled` to true.
+- If you're using MiddleManagers, you also need to set `druid.indexer.fork.property.druid.centralizedDatasourceSchema.enabled` to true in your MiddleManager runtime properties.
+
+You can return to the previous behavior by changing the configs to false.
+
+You can configure the following properties to control how the Coordinator service handles unused segment schemas:
+
+|Name|Description|Required|Default|
+|-|-|-|-|
+|`druid.coordinator.kill.segmentSchema.on`| Boolean value for enabling automatic deletion of unused segment schemas. If set to true, the Coordinator service periodically identifies segment schemas that are not referenced by any used segment and marks them as unused. At a later point, these unused schemas are deleted. | No | True|
+|`druid.coordinator.kill.segmentSchema.period`| How often to do automatic deletion of segment schemas in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration format. Value must be equal to or greater than `druid.coordinator.period.metadataStoreManagementPeriod`. Only applies if `druid.coordinator.kill.segmentSchema.on` is set to true.| No| `P1D`|
+|`druid.coordinator.kill.segmentSchema.durationToRetain`| [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) duration for the time a segment schema is retained for from when it's marked as unused. Only applies if `druid.coordinator.kill.segmentSchema.on` is set to true.| Yes, if `druid.coordinator.kill.segmentSchema.on` is set to true.| `P90D`|
+
+In addition there are new metrics for monitoring after enabling centralized datasource schemas:

Review Comment:
```suggestion
In addition, there are new metrics available to monitor the performance of centralized schema management:
```

##########
docs/release-info/release-notes.md:
##########
@@ -211,12 +257,26 @@ If your code or tests consume task reports, don't rely on the JSON to be a singl
 [#16041](https://github.com/apache/druid/pull/16041)
+#### Improved handling of lock types

Review Comment:
```suggestion
#### Improved handling of lock types for streaming tasks
```

##########
docs/release-info/release-notes.md:
##########
@@ -57,6 +57,51 @@ For tips about how to write a good release note, see [Release notes](https://git
 This section contains important information about new and existing features.
+### Centralized datasource schema (alpha)
+
+You can now configure Druid to centralize schema management using the Coordinator service. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.
+
+If enabled, the following changes occur:
+
+- Realtime segment schema changes get periodically pushed to the Coordinator
+- Tasks publish segment schemas and metadata to the metadata database
+- The Coordinator service polls the schema and segment metadata to build datasource schemas
+- Brokers fetch datasource schemas from the Coordinator when possible. If not, the Broker builds the schema.
+
+This behavior is currently opt-in. To enable this feature, set the following configs:
+
+- In your common runtime properties, set `druid.centralizedDatasourceSchema.enabled` to true.
+- If you're using MiddleManagers, you also need to set `druid.indexer.fork.property.druid.centralizedDatasourceSchema.enabled` to true in your MiddleManager runtime properties.

Review Comment:
```suggestion
- If you are using MiddleManagers, you also need to set `druid.indexer.fork.property.druid.centralizedDatasourceSchema.enabled` to true in your MiddleManager runtime properties.
```

##########
docs/release-info/release-notes.md:
##########
@@ -57,6 +57,51 @@ For tips about how to write a good release note, see [Release notes](https://git
 This section contains important information about new and existing features.
+### Centralized datasource schema (alpha)
+
+You can now configure Druid to centralize schema management using the Coordinator service. Previously, Brokers needed to query data nodes and tasks for segment schemas. Centralizing datasource schemas can improve startup time for Brokers and the efficiency of your deployment.
+
+If enabled, the following changes occur:
+
+- Realtime segment schema changes get periodically pushed to the Coordinator
+- Tasks publish segment schemas and metadata to the metadata database
+- The Coordinator service polls the schema and segment metadata to build datasource schemas
+- Brokers fetch datasource schemas from the Coordinator when possible. If not, the Broker builds the schema.

Review Comment:
```suggestion
- Brokers fetch datasource schemas from the Coordinator when possible. If not, the Broker builds the schema itself by the existing mechanism of querying Historicals.
```

##########
docs/release-info/release-notes.md:
##########
@@ -461,12 +536,23 @@ Parallel compaction task completion reports now have `segmentsRead` and `segment
 [#15947](https://github.com/apache/druid/pull/15947)
+#### Changes to segment schema cleanup default values
+
+The following are the changes to the default values of segment schema cleanup:
+
+* The default value for `druid.coordinator.kill.segmentSchema.period` has changes from `PT1H` to `P1D`.
+* The default value for `druid.coordinator.kill.segmentSchema.durationToRetain` has changed from `PR6H` to `P90D`.
+
+[#16354](https://github.com/apache/druid/pull/16354)
+

Review Comment:
This section is not needed since these configs are being newly added in Druid 30 itself.
