ektravel commented on code in PR #15173: URL: https://github.com/apache/druid/pull/15173#discussion_r1379147559
##########
docs/do-not-merge.md:
##########
@@ -0,0 +1,815 @@
+<!--Intentionally, there's no Apache license so that the GHA fails it. This file is not meant to be merged.
+-->
+
+Apache Druid 28.0.0 contains over $NUMBER_FEATURES new features, bug fixes, performance enhancements, documentation improvements, and additional test coverage from $NUMBER_OF_CONTRIBUTORS contributors.
+
+See the [complete set of changes](https://github.com/apache/druid/issues?q=is%3Aclosed+milestone%3A28.0+sort%3Aupdated-desc+) for additional details, including bug fixes.
+
+Review the [upgrade notes](#upgrade-notes) and [incompatible changes](#incompatible-changes) before you upgrade to Druid 28.0.0.
+
+# Important changes and deprecations
+
+In Druid 28.0.0, we have made substantial improvements to querying to make the system more ANSI SQL compatible. This includes changes in handling NULL and boolean values as well as boolean logic. At the same time, the Apache Calcite library has been upgraded to the latest version. While we have documented known query behavior changes, please read the [upgrade notes](#upgrade-notes) section carefully. Test your application before rolling out to broad production scenarios while closely monitoring the query status.
+
+## SQL compatibility
+
+Druid continues to make SQL query execution more consistent with how standard SQL behaves. However, there are feature flags available to restore the old behavior if needed.
+
+### Three-valued logic
+
+Druid native filters now observe SQL [three-valued logic](https://en.wikipedia.org/wiki/Three-valued_logic#SQL) (`true`, `false`, or `unknown`) instead of Druid's classic two-state logic when you set the following configuration values:
+
+* `druid.generic.useThreeValueLogicForNativeFilters = true`
+* `druid.expressions.useStrictBooleans = true`
+* `druid.generic.useDefaultValueForNull = false`
+
+[#15058](https://github.com/apache/druid/pull/15058)
+
+### Strict booleans
+
+`druid.expressions.useStrictBooleans` is now enabled by default.
+Druid now handles booleans strictly using `1` (true) or `0` (false).
+Previously, true and false could be represented either as `true` and `false` or as `1` and `0`, respectively.
+In addition, Druid now returns a null value for Boolean comparisons like `True && NULL`.
+
+If you don't explicitly configure this property in `runtime.properties`, clusters now use LONG types for any ingested boolean values and in the output of boolean functions for transformations and query time operations.
+
+This change may impact your query results. For more information, see [SQL compatibility in the upgrade notes](#sql-compatibility-1).
+
+[#14734](https://github.com/apache/druid/pull/14734)
+
+### NULL handling
+
+`druid.generic.useDefaultValueForNull` is now disabled by default.
+Druid now differentiates between empty records and null records.
+Previously, Druid might treat empty records as empty or null.
+
+This change may impact your query results. For more information, see [SQL compatibility in the upgrade notes](#sql-compatibility-1).
+
+[#14792](https://github.com/apache/druid/pull/14792)
+
+## SQL planner improvements
+
+Druid uses Apache Calcite for SQL planning and optimization. Starting in Druid 28.0.0, the Calcite version has been upgraded from 1.21 to 1.35. This upgrade brings in many bug fixes in SQL planning from Calcite.
As part of the upgrade, the behavior of type inference for [dynamic parameters](#dynamic-parameters) and the [recommended syntax for UNNEST](#new-syntax-for-sql-unnest) have changed.
+
+### Dynamic parameters
+
+The behavior of type inference for dynamic parameters has changed. To avoid any type inference issues, explicitly `CAST` all dynamic parameters as a specific data type in SQL queries. For example, use:
+
+```sql
+SELECT (1 * CAST (? as DOUBLE))/2 as tmp
+```
+
+Do not use:
+
+```sql
+SELECT (1 * ?)/2 as tmp
+```
+
+### New syntax for SQL UNNEST
+
+The recommended syntax for SQL UNNEST has changed. We recommend using CROSS JOIN instead of commas for most queries to prevent issues with precedence. For example, use:
+
+```sql
+SELECT column_alias_name1 FROM datasource CROSS JOIN UNNEST(source_expression1) AS table_alias_name1(column_alias_name1) CROSS JOIN UNNEST(source_expression2) AS table_alias_name2(column_alias_name2), ...
+```
+
+Do not use:
+
+```sql
+SELECT column_alias_name FROM datasource, UNNEST(source_expression1) AS table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS table_alias_name2(column_alias_name2), ...
+```
+
+## Async query and query from deep storage
+
+[Query from deep storage](https://druid.apache.org/docs/latest/querying/query-deep-storage/) is no longer an experimental feature. When you query from deep storage, more data is available for queries without having to scale your Historical services to accommodate more data. To benefit from the space saving that query from deep storage offers, configure your load rules to unload data from your Historical services.
+
+## MSQ queries for realtime tasks
+
+The MSQ task engine can now include realtime segments in query results. To do this, use the `includeSegmentSource` context parameter and set it to `REALTIME`.
+
+[#15024](https://github.com/apache/druid/pull/15024)
+
+## MSQ support for UNION ALL queries
+
+You can now use the MSQ task engine to run UNION ALL queries with `UnionDataSource`.
+
+[#14981](https://github.com/apache/druid/pull/14981)
+
+## Ingest from multiple Kafka topics to a single datasource
+
+You can now ingest streaming data from multiple Kafka topics to a datasource using a single supervisor.
+You configure the topics for the supervisor spec using a regex pattern as the value for `topicPattern` in the IO config. If you add new topics to Kafka that match the regex, Druid automatically starts ingesting from those new topics.
+
+If you enable multi-topic ingestion for a datasource, downgrading will cause the Supervisor to fail.
+For more information, see [Stop supervisors that ingest from multiple Kafka topics before downgrading](#stop-supervisors-that-ingest-from-multiple-kafka-topics-before-downgrading).
+
+[#14424](https://github.com/apache/druid/pull/14424)
+[#14865](https://github.com/apache/druid/pull/14865)
+
+## SQL UNNEST and ingestion flattening
+
+The UNNEST function is no longer experimental. UNNEST lets you flatten and explode data during batch ingestion. For more information, see [UNNEST](https://druid.apache.org/docs/latest/querying/sql/#unnest) and [Unnest arrays within a column](https://druid.apache.org/docs/latest/tutorials/tutorial-unnest-arrays/).

Review Comment:
Updated

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
