LakshSingla commented on issue #15326: URL: https://github.com/apache/druid/issues/15326#issuecomment-1793426954
# Upgrade notes and incompatible changes

## Upgrade notes

### Upgrade Druid segments table

Druid 28.0.0 adds a new column to the Druid metadata table that requires an update to the table.

If `druid.metadata.storage.connector.createTables` is set to `true` and the metadata store user has DDL privileges, the segments table gets automatically updated at startup to include the new `used_flag_last_updated` column. No additional work is needed for the upgrade.

If either of those requirements is not met, pre-upgrade steps are required. You must make these updates before you upgrade to Druid 28.0.0, or the Coordinator and Overlord processes fail.

Although you can manually alter your table to add the new `used_flag_last_updated` column, Druid also provides a CLI tool to do it.

[#12599](https://github.com/apache/druid/pull/12599)

In the example commands below:

- `lib` is the Druid lib directory.
- `extensions` is the Druid extensions directory.
- `base` corresponds to the value of `druid.metadata.storage.tables.base` in the configuration, `druid` by default.
- The `--connectURI` parameter corresponds to the value of `druid.metadata.storage.connector.connectURI`.
- The `--user` parameter corresponds to the value of `druid.metadata.storage.connector.user`.
- The `--password` parameter corresponds to the value of `druid.metadata.storage.connector.password`.
- The `--action` parameter corresponds to the update action you are executing.
  In this case, it is `add-used-flag-last-updated-to-segments`.

#### Upgrade step for MySQL

```bash
cd ${DRUID_ROOT}
java -classpath "lib/*" -Dlog4j.configurationFile=conf/druid/cluster/_common/log4j2.xml -Ddruid.extensions.directory="extensions" -Ddruid.extensions.loadList=[\"mysql-metadata-storage\"] -Ddruid.metadata.storage.type=mysql org.apache.druid.cli.Main tools metadata-update --connectURI="<mysql-uri>" --user USER --password PASSWORD --base druid --action add-used-flag-last-updated-to-segments
```

#### Upgrade step for PostgreSQL

```bash
cd ${DRUID_ROOT}
java -classpath "lib/*" -Dlog4j.configurationFile=conf/druid/cluster/_common/log4j2.xml -Ddruid.extensions.directory="extensions" -Ddruid.extensions.loadList=[\"postgresql-metadata-storage\"] -Ddruid.metadata.storage.type=postgresql org.apache.druid.cli.Main tools metadata-update --connectURI="<postgresql-uri>" --user USER --password PASSWORD --base druid --action add-used-flag-last-updated-to-segments
```

#### Manual upgrade step

```SQL
ALTER TABLE druid_segments
ADD used_flag_last_updated varchar(255);
```

### Recommended syntax for SQL UNNEST

The recommended syntax for SQL UNNEST has changed. We recommend using CROSS JOIN instead of commas for most queries to prevent issues with precedence. For example, use:

```sql
SELECT column_alias_name1 FROM datasource CROSS JOIN UNNEST(source_expression1) AS table_alias_name1(column_alias_name1) CROSS JOIN UNNEST(source_expression2) AS table_alias_name2(column_alias_name2), ...
```

Do not use:

```sql
SELECT column_alias_name1 FROM datasource, UNNEST(source_expression1) AS table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS table_alias_name2(column_alias_name2), ...
```

### Dynamic parameters

The Apache Calcite version has been upgraded from 1.21 to 1.35. As part of the Calcite upgrade, the behavior of type inference for dynamic parameters has changed. To avoid any type inference issues, explicitly `CAST` all dynamic parameters as a specific data type in SQL queries. For example, use:

```sql
SELECT (1 * CAST(? AS DOUBLE)) / 2 AS tmp
```

Do not use:

```sql
SELECT (1 * ?) / 2 AS tmp
```

### Nested column format

`json` type columns created with Druid 28.0.0 are not backwards compatible with Druid versions older than 26.0.0.
If you are upgrading from a version prior to Druid 26.0.0 and you use `json` columns, upgrade to Druid 26.0.0 before you upgrade to Druid 28.0.0.
Additionally, to downgrade to a version older than Druid 26.0.0, any new segments created in Druid 28.0.0 should be re-ingested using Druid 26.0.0 or 27.0.0 prior to further downgrading.

When upgrading from a previous version, you can continue to write nested columns in a backwards compatible format (version 4).

In a classic batch ingestion job, include `formatVersion` in the `dimensions` list of the `dimensionsSpec` property. For example:

```json
      "dimensionsSpec": {
        "dimensions": [
          "product",
          "department",
          {
            "type": "json",
            "name": "shipTo",
            "formatVersion": 4
          }
        ]
      },
```

To set the default nested column version, set the desired format version in the common runtime properties. For example:

```java
druid.indexing.formats.nestedColumnFormatVersion=4
```

### SQL compatibility

Starting with Druid 28.0.0, the default way Druid treats nulls and booleans has changed.

For nulls, Druid now differentiates between an empty string and a record with no data as well as between an empty numerical record and `0`.
You can revert to the previous behavior by setting `druid.generic.useDefaultValueForNull` to `true`. This property affects both storage and querying, and must be set on all Druid service types to be available at both ingestion time and query time. Reverting this setting to the old value restores the previous behavior without reingestion.

For booleans, Druid now strictly uses `1` (true) or `0` (false). Previously, true and false could be represented either as `true` and `false` or as `1` and `0`, respectively. In addition, Druid now returns a null value for boolean comparisons like `True && NULL`.
You can revert to the previous behavior by setting `druid.expressions.useStrictBooleans` to `false`. This property affects both storage and querying, and must be set on all Druid service types to be available at both ingestion time and query time. Reverting this setting to the old value restores the previous behavior without reingestion.
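
As a rough illustration of the strict boolean behavior described above, the following Python sketch models `&&` and `||` with standard SQL three-valued logic. It is not Druid code. The function names are hypothetical, and the treatment of cases the notes do not spell out (for example `false && NULL` returning `0`, or `0` counting as false) is an assumption based on ordinary SQL semantics.

```python
from typing import Optional


def _truth(value) -> Optional[bool]:
    """Map an operand to True/False, or None when it stands for SQL NULL."""
    if value is None:
        return None
    # Assumption: any non-zero number is treated as true, zero as false.
    return bool(value)


def strict_and(left, right) -> Optional[int]:
    """Model of `left && right` that only ever yields 1, 0, or None (NULL)."""
    a, b = _truth(left), _truth(right)
    if a is False or b is False:
        return 0  # a definite false decides the result
    if a is None or b is None:
        return None  # otherwise an unknown operand makes the result unknown
    return 1


def strict_or(left, right) -> Optional[int]:
    """Model of `left || right` that only ever yields 1, 0, or None (NULL)."""
    a, b = _truth(left), _truth(right)
    if a is True or b is True:
        return 1  # a definite true decides the result
    if a is None or b is None:
        return None
    return 0


if __name__ == "__main__":
    print(strict_and(100, 11))     # operands are reduced to 1/0, not echoed back
    print(strict_or(100, 11))
    print(strict_and(True, None))  # unknown operand, unknown result
```

The point of the sketch is that results are always normalized to `1`, `0`, or null, which is why expressions that previously returned one of their operands now return `1`.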

The following table illustrates some example scenarios and the impact of the changes.

<details><summary>Show the table</summary>

| Query | Druid 27.0.0 and earlier | Druid 28.0.0 and later |
|-------|--------------------------|------------------------|
| Query empty string | Empty string (`''`) or null | Empty string (`''`) |
| Query null string | Null or empty | Null |
| `COUNT(*)` | All rows, including nulls | All rows, including nulls |
| `COUNT(column)` | All rows excluding empty strings | All rows including empty strings but excluding nulls |
| Expression `100 && 11` | 11 | 1 |
| Expression `100 \|\| 11` | 100 | 1 |
| Null FLOAT/DOUBLE column | 0.0 | Null |
| Null LONG column | 0 | Null |
| Null `__time` column | 0, meaning 1970-01-01 00:00:00 UTC | 1970-01-01 00:00:00 UTC |
| Null MVD column | `''` | Null |
| ARRAY | Null | Null |
| COMPLEX | none | Null |

</details>

Before upgrading to Druid 28.0.0, update your queries to account for the changed behavior as described in the following sections.

#### NULL filters

If your queries use NULL in the filter condition to match both nulls and empty strings, you should add an explicit filter clause for empty strings. For example, update `s IS NULL` to `s IS NULL OR s = ''`.

#### COUNT functions

`COUNT(column)` now counts empty strings. If you want to continue excluding empty strings from the count, replace `COUNT(column)` with `COUNT(column) FILTER(WHERE column <> '')`.
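
To make the counting and filtering differences concrete, here is a small Python sketch that mimics the semantics summarized in the table over an in-memory list. It does not call Druid. The function names and sample rows are hypothetical, and `None` stands in for SQL NULL.

```python
from typing import List, Optional

Rows = List[Optional[str]]


def count_star(rows: Rows) -> int:
    """COUNT(*): every row, in both the old and the new behavior."""
    return len(rows)


def count_column_legacy(rows: Rows) -> int:
    """COUNT(column) in Druid 27.0.0 and earlier: nulls and '' are both skipped."""
    return sum(1 for v in rows if v is not None and v != "")


def count_column_sql_compatible(rows: Rows) -> int:
    """COUNT(column) in Druid 28.0.0 and later: only nulls are skipped."""
    return sum(1 for v in rows if v is not None)


def count_column_filtered(rows: Rows) -> int:
    """COUNT(column) FILTER(WHERE column <> ''): NULL <> '' is unknown, so nulls drop out too."""
    return sum(1 for v in rows if v is not None and v != "")


def matches_null_or_empty(value: Optional[str]) -> bool:
    """The rewritten filter `s IS NULL OR s = ''`."""
    return value is None or value == ""


if __name__ == "__main__":
    sample: Rows = ["a", "", None, "b"]
    print(count_star(sample))
    print(count_column_legacy(sample))
    print(count_column_sql_compatible(sample))
    print(count_column_filtered(sample))
    print([v for v in sample if matches_null_or_empty(v)])
```

On this sample the filtered count equals the legacy count, which is the reason the rewrite preserves the old result.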
#### GroupBy queries

GroupBy queries on columns containing null values can now have additional entries as nulls can co-exist with empty strings.

### Stop Supervisors that ingest from multiple Kafka topics before downgrading

If you have added supervisors that ingest from multiple Kafka topics in Druid 28.0.0 or later, stop those supervisors before downgrading to a version prior to Druid 28.0.0 because the supervisors will fail in versions prior to Druid 28.0.0.

### `lenientAggregatorMerge` deprecated

The `lenientAggregatorMerge` property in segment metadata queries has been deprecated. It will be removed in future releases.
Use `aggregatorMergeStrategy` instead. `aggregatorMergeStrategy` also supports the `latest` and `earliest` strategies in addition to the `strict` and `lenient` strategies from `lenientAggregatorMerge`.

[#14560](https://github.com/apache/druid/pull/14560)
[#14598](https://github.com/apache/druid/pull/14598)

### Broker parallel merge config options

The paths for `druid.processing.merge.pool.*` and `druid.processing.merge.task.*` have been flattened to use `druid.processing.merge.*` instead. The legacy paths for the configs are now deprecated and will be removed in a future release. Migrate your settings to use the new paths because the old paths will be ignored in the future.

[#14695](https://github.com/apache/druid/pull/14695)

### Ingestion options for ARRAY typed columns

Starting with Druid 28.0.0, the MSQ task engine can detect and ingest arrays as ARRAY typed columns when you set the query context parameter `arrayIngestMode` to `array`.
The `arrayIngestMode` context parameter controls how ARRAY type values are stored in Druid segments.

When you set `arrayIngestMode` to `array` (recommended for SQL compliance), the MSQ task engine stores all ARRAY typed values in [ARRAY typed columns](https://druid.apache.org/docs/latest/querying/arrays) and supports storing both VARCHAR and numeric typed arrays.

For backwards compatibility, `arrayIngestMode` defaults to `mvd`. When `"arrayIngestMode":"mvd"`, Druid only supports VARCHAR typed arrays and stores them as [multi-value string columns](https://druid.apache.org/docs/latest/querying/multi-value-dimensions).

When you set `arrayIngestMode` to `none`, Druid throws an exception when trying to store arrays of any type.
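
As a minimal sketch of how the context parameter might be supplied, the following Python helper builds a request body with a `query` string and a `context` object, which is the general shape accepted by Druid's SQL-based ingestion API. The helper name, the datasource names, and the sample SQL are hypothetical. The helper only builds and validates the payload and does not contact a cluster.

```python
import json

# The three modes described in the notes above.
VALID_ARRAY_INGEST_MODES = ("array", "mvd", "none")


def build_msq_payload(sql: str, array_ingest_mode: str = "mvd") -> dict:
    """Return a request body that sets arrayIngestMode in the query context.

    The default mirrors the documented default of `mvd`.
    """
    if array_ingest_mode not in VALID_ARRAY_INGEST_MODES:
        raise ValueError(
            f"arrayIngestMode must be one of {VALID_ARRAY_INGEST_MODES}, "
            f"got {array_ingest_mode!r}"
        )
    return {"query": sql, "context": {"arrayIngestMode": array_ingest_mode}}


# Hypothetical ingestion query; table and column names are placeholders.
EXAMPLE_SQL = (
    'REPLACE INTO "array_demo" OVERWRITE ALL '
    'SELECT "__time", "tags" FROM "raw_events" '
    "PARTITIONED BY DAY"
)

if __name__ == "__main__":
    payload = build_msq_payload(EXAMPLE_SQL, array_ingest_mode="array")
    print(json.dumps(payload, indent=2))
```

Setting the mode explicitly in every ingestion request, rather than relying on the default, keeps the stored column type predictable if the default changes in a later release.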

For more information on how to ingest `ARRAY` typed columns with SQL-based ingestion, see [SQL data types](https://druid.apache.org/docs/latest/querying/sql-data-types#arrays) and [Array columns](https://druid.apache.org/docs/latest/querying/arrays).

## Incompatible changes

### Removed Hadoop 2

Support for Hadoop 2 has been removed.
Migrate to SQL-based ingestion or JSON-based batch ingestion if you are using Hadoop 2.x for ingestion today.
If migrating to Druid's built-in ingestion is not possible, you must upgrade your Hadoop infrastructure to 3.x+ before upgrading to Druid 28.0.0.

[#14763](https://github.com/apache/druid/pull/14763)

### Removed GroupBy v1

The GroupBy v1 engine has been removed. Use the GroupBy v2 engine instead, which has been the default GroupBy engine for several releases.
There should be no impact on your queries.

Additionally, `AggregatorFactory.getRequiredColumns` has been deprecated and will be removed in a future release. If you have an extension that implements `AggregatorFactory`, then this method should be removed from your implementation.

[#14866](https://github.com/apache/druid/pull/14866)

### Removed Coordinator dynamic configs

The `decommissioningMaxPercentOfMaxSegmentsToMove` config has been removed.
The use case for this config is handled by smart segment loading now, which is enabled by default.

[#14923](https://github.com/apache/druid/pull/14923)

### Removed `cachingCost` strategy

The `cachingCost` strategy for segment loading has been removed.
Use `cost` instead, which has the same benefits as `cachingCost`.

If you have `cachingCost` set, the system ignores this setting and automatically uses `cost`.

[#14798](https://github.com/apache/druid/pull/14798)

### Removed `InsertCannotOrderByDescending`

The deprecated MSQ fault `InsertCannotOrderByDescending` has been removed.

[#14588](https://github.com/apache/druid/pull/14588)

### Removed the backward compatibility code for the Handoff API

The backward compatibility code for the Handoff API in `CoordinatorBasedSegmentHandoffNotifier` has been removed.
If you are upgrading from a Druid version older than 0.14.0, upgrade to a newer version of Druid before upgrading to Druid 28.0.0.

[#14652](https://github.com/apache/druid/pull/14652)

## Developer notes

### Dependency updates

The following dependencies have had their versions bumped:

* Guava has been upgraded to `31.1-jre`. If you use an extension that has a transitive Guava dependency from Druid, it may be impacted [#14767](https://github.com/apache/druid/pull/14767)
* Google Client APIs have been upgraded from 1.26.0 to 2.0.0 [#14414](https://github.com/apache/druid/pull/14414)
* Apache Kafka has been upgraded to 3.5.1 [#14721](https://github.com/apache/druid/pull/14721)
* Calcite has been upgraded to 1.35 [#14510](https://github.com/apache/druid/pull/14510)
* `RoaringBitmap` has been upgraded from 0.9.0 to 0.9.49 [#15006](https://github.com/apache/druid/pull/15006)
* `qs` has been upgraded to 6.5.3 [#13510](https://github.com/apache/druid/pull/13510)
* `api-util` has been upgraded to 2.1.3 [#14852](https://github.com/apache/druid/pull/14852)
* `commons-cli` has been upgraded from 1.3.1 to 1.5.0 [#14837](https://github.com/apache/druid/pull/14837)
* `tukaani:xz` has been upgraded from 1.8 to 1.9 [#14839](https://github.com/apache/druid/pull/14839)
* `commons-compress` has been upgraded from 1.21 to 1.23.0 [#14820](https://github.com/apache/druid/pull/14820)
* `protobuf.version` has been upgraded from 3.21.7 to 3.24.0 [#14823](https://github.com/apache/druid/pull/14823)
* `dropwizard.metrics.version` has been upgraded from 4.0.0 to 4.2.19 [#14824](https://github.com/apache/druid/pull/14824)
* `assertj-core` has been upgraded from 3.19.0 to 3.24.2 [#14815](https://github.com/apache/druid/pull/14815)
* `maven-source-plugin` has been upgraded from 2.2.1 to 3.3.0 [#14812](https://github.com/apache/druid/pull/14812)
* `scala-library` has been upgraded from 2.13.9 to 2.13.11 [#14826](https://github.com/apache/druid/pull/14826)
* `oshi-core` has been upgraded from 6.4.2 to 6.4.4 [#14814](https://github.com/apache/druid/pull/14814)
* `maven-surefire-plugin` has been upgraded from 3.0.0-M7 to 3.1.2 [#14813](https://github.com/apache/druid/pull/14813)
* `apache-rat-plugin` has been upgraded from 0.12 to 0.15 [#14817](https://github.com/apache/druid/pull/14817)
* `jclouds.version` has been upgraded from 1.9.1 to 2.0.3 [#14746](https://github.com/apache/druid/pull/14746)
* `dropwizard.metrics:metrics-graphite` has been upgraded from 3.1.2 to 4.2.19 [#14842](https://github.com/apache/druid/pull/14842)
* `postgresql` has been upgraded from 42.4.1 to 42.6.0 [#13959](https://github.com/apache/druid/pull/13959)
* `org.mozilla:rhino` has been upgraded [#14765](https://github.com/apache/druid/pull/14765)
* `apache.curator.version` has been upgraded from 5.4.0 to 5.5.0 [#14843](https://github.com/apache/druid/pull/14843)
* `jackson-databind` has been upgraded to 2.12.7 [#14770](https://github.com/apache/druid/pull/14770)
* `icu4j` has been upgraded from 55.1 to 73.2 [#14853](https://github.com/apache/druid/pull/14853)
* `joda-time` has been upgraded from 2.12.4 to 2.12.5 [#14855](https://github.com/apache/druid/pull/14855)
* `tough-cookie` has been upgraded from 4.0.0 to 4.1.3 [#14557](https://github.com/apache/druid/pull/14557)
* `word-wrap` has been upgraded from 1.2.3 to 1.2.4 [#14613](https://github.com/apache/druid/pull/14613)
* `decode-uri-component` has been upgraded from 0.2.0 to 0.2.2 [#13481](https://github.com/apache/druid/pull/13481)
* `snappy-java` has been upgraded from 1.1.10.1 to 1.1.10.3 [#14641](https://github.com/apache/druid/pull/14641)
* Hibernate validator version has been upgraded [#14757](https://github.com/apache/druid/pull/14757)
* The Dependabot PR limit for Java dependencies has been increased [#14804](https://github.com/apache/druid/pull/14804)
* `jetty` has been upgraded from 9.4.51.v20230217 to 9.4.53.v20231009 [#15129](https://github.com/apache/druid/pull/15129)
* `netty4` has been upgraded from 4.1.94.Final to 4.1.100.Final [#15129](https://github.com/apache/druid/pull/15129)

# Credits

@2bethere
@317brian
@a2l007
@abhishekagarwal87
@abhishekrb19
@adarshsanjeev
@aho135
@AlexanderSaydakov
@AmatyaAvadhanula
@asdf2014
@benkrug
@capistrant
@clintropolis
@cristian-popa
@cryptoe
@demo-kratia
@dependabot[bot]
@ektravel
@findingrish
@gargvishesh
@georgew5656
@gianm
@giuliotal
@hardikbajaj
@hqx871
@imply-cheddar
@Jaehui-Lee
@jasonk000
@jon-wei
@kaisun2000
@kfaraz
@kgyrtkirk
@LakshSingla
@lorem--ipsum
@maytasm
@pagrawal10
@panhongan
@petermarshallio
@pranavbhole
@rash67
@rohangarg
@SamWheating
@sergioferragut
@slfan1989
@somu-imply
@suneet-s
@techdocsmith
@tejaswini-imply
@TSFenwick
@vogievetsky
@vtlim
@writer-jill
@xvrl
@yianni
@YongGang
@yuanlihan
@zachjsh

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
