FrankChen021 commented on code in PR #19447: URL: https://github.com/apache/druid/pull/19447#discussion_r3234373693
##########
docs/release-info/upgrade-notes.md:
##########
@@ -26,7 +26,165 @@ The upgrade notes assume that you are upgrading from the Druid version that imme
 For the full release notes for a specific version, see the [releases page](https://github.com/apache/druid/releases).
-## Announcements
+## 37.0.0
+
+### Upgrade notes
+
+#### Hadoop-based ingestion
+
+Support for Hadoop-based ingestion has been removed. The feature was deprecated in Druid 34.
+
+Use one of Druid's other supported ingestion methods, such as SQL-based ingestion or MiddleManager-less ingestion using Kubernetes.
+
+[#19109](https://github.com/apache/druid/pull/19109)
+
+#### Segment metadata cache on by default
+
+Starting in Druid 37, the segment metadata cache is on by default. This feature allows the Broker to cache segment metadata polled from the Coordinator, rather than having to fetch metadata for every query against the `sys.segments` table. This improves performance but increases memory usage on Brokers.
+
+The `druid.sql.planner.metadataSegmentCacheEnable` config controls this feature.
+
+[#19075](https://github.com/apache/druid/pull/19075)
+
+#### Streaming ingestion `parser`
+
+Support for the deprecated `parser` has been removed for streaming ingest tasks such as Kafka and Kinesis. Operators must now specify `inputSource`/`inputFormat` on the `ioConfig` of the supervisor spec, and the `dataSchema` must not specify a parser. Do this before upgrading to Druid 37 or newer.
+
+[#19173](https://github.com/apache/druid/pull/19173) [#19166](https://github.com/apache/druid/pull/19166)
+
+#### Rolling upgrades from Druid versions prior to version 0.23
+
+You can't perform a rolling upgrade from versions earlier than Druid 0.23.
+
+[#18961](https://github.com/apache/druid/pull/18961)
+
+#### Metadata storage for auto-compaction with compaction supervisors
+
+Automatic compaction using compaction supervisors now requires incremental segment metadata caching to be enabled on the Overlord and Coordinator in the runtime properties. Specifically, the `druid.manager.segments.useIncrementalCache` config must be set to `always` or `ifSynced`. For more information about the config, see [Segment metadata cache](https://druid.apache.org/docs/latest/configuration/#segment-metadata-cache-experimental).
+
+Additionally, metadata store changes are required for this upgrade.
+
+If you already have `druid.metadata.storage.connector.createTables` set to `true`, no action is needed.
+
+If you have this feature turned off, you will need to alter the segments table and create the `compactionStates` table. The Postgres DDL is provided below as a guide:
+
+```
+-- create the indexing states lookup table and associated indices
+CREATE TABLE druid_indexingStates (
+ created_date VARCHAR(255) NOT NULL,
+ datasource VARCHAR(255) NOT NULL,
+ fingerprint VARCHAR(255) NOT NULL,
+ payload BYTEA NOT NULL,
+ used BOOLEAN NOT NULL,
+ pending BOOLEAN NOT NULL,
+ used_status_last_updated VARCHAR(255) NOT NULL,
+ PRIMARY KEY (fingerprint),
+ );
+
+ CREATE INDEX idx_druid_compactionStates_used ON druid_compactionStates(used, used_status_last_updated);

Review Comment:
[P2] Fix the manual indexingStates DDL before publishing it

The manual Postgres DDL for operators with `createTables` disabled does not create a runnable schema: it creates `druid_indexingStates`, leaves a trailing comma after `PRIMARY KEY`, then creates the index on `druid_compactionStates`, which is not the table Druid uses. `SQLMetadataConnector` creates the indexingStates table and indexes that same table, so following this upgrade note as written will fail before the required migration is complete.
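
For illustration, here is a minimal sketch of how the two problems could be caught before publishing. It runs both versions of the DDL against an in-memory SQLite database as a rough syntax check. Several things here are assumptions rather than facts from the PR: SQLite is only a stand-in for Postgres (it accepts type names such as `BYTEA` without enforcing them), the index name `idx_druid_indexingStates_used` is hypothetical, and the helper `ddl_error` is made up for this example. The schema that `SQLMetadataConnector` actually creates remains the source of truth for column and index definitions.

```python
import sqlite3

# DDL as it appears in the diff under review: trailing comma after
# PRIMARY KEY, and the index targets a table that is never created.
ORIGINAL_DDL = """
CREATE TABLE druid_indexingStates (
  created_date VARCHAR(255) NOT NULL,
  datasource VARCHAR(255) NOT NULL,
  fingerprint VARCHAR(255) NOT NULL,
  payload BYTEA NOT NULL,
  used BOOLEAN NOT NULL,
  pending BOOLEAN NOT NULL,
  used_status_last_updated VARCHAR(255) NOT NULL,
  PRIMARY KEY (fingerprint),
);

CREATE INDEX idx_druid_compactionStates_used
  ON druid_compactionStates(used, used_status_last_updated);
"""

# Corrected sketch: no trailing comma, and the index is created on the
# same table. The index name is hypothetical, not taken from Druid.
CORRECTED_DDL = """
CREATE TABLE druid_indexingStates (
  created_date VARCHAR(255) NOT NULL,
  datasource VARCHAR(255) NOT NULL,
  fingerprint VARCHAR(255) NOT NULL,
  payload BYTEA NOT NULL,
  used BOOLEAN NOT NULL,
  pending BOOLEAN NOT NULL,
  used_status_last_updated VARCHAR(255) NOT NULL,
  PRIMARY KEY (fingerprint)
);

CREATE INDEX idx_druid_indexingStates_used
  ON druid_indexingStates(used, used_status_last_updated);
"""


def ddl_error(ddl):
    """Run the script on a fresh in-memory SQLite database.

    Returns None if every statement executes, otherwise the error message.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()


if __name__ == "__main__":
    print("original: ", ddl_error(ORIGINAL_DDL))
    print("corrected:", ddl_error(CORRECTED_DDL))
```

A check like this only confirms that the statements parse and refer to tables that exist. It does not validate Postgres-specific behavior, so the final DDL would still need a run against a real Postgres instance.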
##########
docs/release-info/upgrade-notes.md:
##########
@@ -26,7 +26,165 @@ The upgrade notes assume that you are upgrading from the Druid version that imme
 For the full release notes for a specific version, see the [releases page](https://github.com/apache/druid/releases).
-## Announcements
+## 37.0.0
+
+### Upgrade notes
+
+#### Hadoop-based ingestion
+
+Support for Hadoop-based ingestion has been removed. The feature was deprecated in Druid 34.
+
+Use one of Druid's other supported ingestion methods, such as SQL-based ingestion or MiddleManager-less ingestion using Kubernetes.
+
+[#19109](https://github.com/apache/druid/pull/19109)
+
+#### Segment metadata cache on by default
+
+Starting in Druid 37, the segment metadata cache is on by default. This feature allows the Broker to cache segment metadata polled from the Coordinator, rather than having to fetch metadata for every query against the `sys.segments` table. This improves performance but increases memory usage on Brokers.
+
+The `druid.sql.planner.metadataSegmentCacheEnable` config controls this feature.
+
+[#19075](https://github.com/apache/druid/pull/19075)
+
+#### Streaming ingestion `parser`
+
+Support for the deprecated `parser` has been removed for streaming ingest tasks such as Kafka and Kinesis. Operators must now specify `inputSource`/`inputFormat` on the `ioConfig` of the supervisor spec, and the `dataSchema` must not specify a parser. Do this before upgrading to Druid 37 or newer.
+
+[#19173](https://github.com/apache/druid/pull/19173) [#19166](https://github.com/apache/druid/pull/19166)
+
+#### Rolling upgrades from Druid versions prior to version 0.23
+
+You can't perform a rolling upgrade from versions earlier than Druid 0.23.
+
+[#18961](https://github.com/apache/druid/pull/18961)
+
+#### Metadata storage for auto-compaction with compaction supervisors
+
+Automatic compaction using compaction supervisors now requires incremental segment metadata caching to be enabled on the Overlord and Coordinator in the runtime properties. Specifically, the `druid.manager.segments.useIncrementalCache` config must be set to `always` or `ifSynced`. For more information about the config, see [Segment metadata cache](https://druid.apache.org/docs/latest/configuration/#segment-metadata-cache-experimental).

Review Comment:
[P3] Update the segment metadata cache anchor

The link points to `configuration/#segment-metadata-cache-experimental`, but the configuration page in this tree has the heading `Segment metadata cache`, which generates `#segment-metadata-cache`. This makes the upgrade note's reference jump to a missing anchor.

##########
docs/release-info/upgrade-notes.md:
##########
@@ -26,7 +26,165 @@ The upgrade notes assume that you are upgrading from the Druid version that imme
 For the full release notes for a specific version, see the [releases page](https://github.com/apache/druid/releases).
-## Announcements
+## 37.0.0
+
+### Upgrade notes
+
+#### Hadoop-based ingestion
+
+Support for Hadoop-based ingestion has been removed. The feature was deprecated in Druid 34.

Review Comment:
[P3] Correct the Hadoop deprecation version

This says Hadoop-based ingestion was deprecated in Druid 34, but the same upgrade-notes page records it as newly deprecated in 32.0 and only opt-in/scheduled-removal details were added in 34. The 37.0.0 note should say it was deprecated in Druid 32.0 to avoid contradicting the historical notes below.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
