kfaraz commented on code in PR #15805: URL: https://github.com/apache/druid/pull/15805#discussion_r1496916086
########## docs/release-info/release-notes.md: ########## @@ -57,50 +57,609 @@ For tips about how to write a good release note, see [Release notes](https://git This section contains important information about new and existing features. +### MSQ export statements (experimental) + +Druid 29.0.0 adds experimental support for export statements to the MSQ task engine. This allows query tasks to write data to an external destination through the [`EXTERN` function](https://druid.apache.org/docs/latest/multi-stage-query/reference#extern-function). + +[#15689](https://github.com/apache/druid/pull/15689) + +### SQL PIVOT and UNPIVOT (experimental) + +Druid 29.0.0 adds experimental support for the SQL PIVOT and UNPIVOT operators. + +The PIVOT operator carries out an aggregation and transforms rows into columns in the output. The following is the general syntax for the PIVOT operator: + +```sql +PIVOT (aggregation_function(column_to_aggregate) + FOR column_with_values_to_pivot + IN (pivoted_column1 [, pivoted_column2 ...]) +) +``` + +The UNPIVOT operator transforms existing column values into rows. The following is the general syntax for the UNPIVOT operator: + +```sql +UNPIVOT (values_column + FOR names_column + IN (unpivoted_column1 [, unpivoted_column2 ... ]) +) +``` + +### Range support in window functions (experimental) + +Window functions (experimental) now support ranges where both endpoints are unbounded or are the current row. Ranges work in strict mode, which means that Druid will fail queries that aren't supported. You can turn off strict mode for ranges by setting the context parameter `windowingStrictValidation` to `false`. 
+ +The following example shows a window expression with RANGE frame specifications: + +```sql +(ORDER BY c) +(ORDER BY c RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) +(ORDER BY c RANGE BETWEEN CURRENT ROW AND UNBOUNDED PRECEDING) +``` + +[#15703](https://github.com/apache/druid/pull/15703) [#15746](https://github.com/apache/druid/pull/15746) + +### Improved INNER joins + +Druid now supports arbitrary join conditions for INNER join. Any sub-conditions that can't be evaluated as part of the join are converted to a post-join filter. Improved join capabilities allow Druid to more effectively support applications like Tableau. + +[#15302](https://github.com/apache/druid/pull/15302) + +### Improved concurrent append and replace (experimental) + +You no longer have to manually determine the task lock type for concurrent append and replace (experimental) with the `taskLockType` task context. Instead, Druid can now determine it automatically for you. You can use the context parameter `"useConcurrentLocks": true` for individual tasks and datasources or enable concurrent append and replace at a cluster level using `druid.indexer.task.default.context`. + +[#15684](https://github.com/apache/druid/pull/15684) + +### First and last aggregators for double, float, and long data types + +Druid now supports first and last aggregators for the double, float, and long types in native and MSQ ingestion spec and MSQ queries. Previously, they were only supported for native queries. For more information, see [First and last aggregators](https://druid.apache.org/docs/latest/querying/aggregations/#first-and-last-aggregators). + +[#14462](https://github.com/apache/druid/pull/14462) + +Additionally, the following functions can now return numeric values: + +* EARLIEST and EARLIEST_BY +* LATEST and LATEST_BY + +You can use these functions as aggregators at ingestion time. 
+ +[#15607](https://github.com/apache/druid/pull/15607) + +### Support for logging audit events + +Added support for logging audit events and improved coverage of audited REST API endpoints. +To enable logging audit events, set config `druid.audit.manager.type` to `log` in both the Coordinator and Overlord or in `common.runtime.properties`. When you set `druid.audit.manager.type` to `sql`, audit events are persisted to metadata store. + +In both cases, Druid audits the following events: + +* Coordinator + * Update load rules + * Update lookups + * Update coordinator dynamic config + * Update auto-compaction config +* Overlord + * Submit a task + * Create/update a supervisor + * Update worker config +* Basic security extension + * Create user + * Delete user + * Update user credentials + * Create role + * Delete role + * Assign role to user + * Set role permissions + + +[#15480](https://github.com/apache/druid/pull/15480) [#15653](https://github.com/apache/druid/pull/15653) + +Also fixed an issue with the basic auth integration test by not persisting logs to the database. + +[#15561](https://github.com/apache/druid/pull/15561) + +### Enabled empty ingest queries + +The MSQ task engine now allows empty ingest queries by default. Previously, ingest queries that produced no data would fail with the `InsertCannotBeEmpty` MSQ fault. +For more information, see [Empty ingest queries in the upgrade notes](#enabled-empty-ingest-queries). + +[#15674](https://github.com/apache/druid/pull/15674) [#15495](https://github.com/apache/druid/pull/15495) + +In the web console, you can use a toggle to control whether an ingestion fails if the ingestion query produces no data. + +[#15627](https://github.com/apache/druid/pull/15627) + +### MSQ support for Google Cloud Storage + +The MSQ task engine now supports Google Cloud Storage (GCS). You can use durable storage with GCS. 
See [Durable storage configurations](https://druid.apache.org/docs/latest/multi-stage-query/reference#durable-storage-configurations) for more information. + +[#15398](https://github.com/apache/druid/pull/15398) + +### Experimental extensions + +Druid 29.0.0 adds the following extensions. + +#### DDSketch + +A new DDSketch extension is available as a community contribution. The DDSketch extension (`druid-ddsketch`) provides support for approximate quantile queries using the [DDSketch](https://github.com/datadog/sketches-java) library. + +[#15049](https://github.com/apache/druid/pull/15049) + +#### Spectator histogram + +A new histogram extension is available as a community contribution. The Spectator-based histogram extension (`druid-spectator-histogram`) provides approximate histogram aggregators and percentile post-aggregators based on [Spectator](https://netflix.github.io/atlas-docs/spectator/) fixed-bucket histograms. + +[#15340](https://github.com/apache/druid/pull/15340) + +#### Delta Lake + +A new Delta Lake extension is available as a community contribution. The Delta Lake extension (`druid-deltalake-extensions`) lets you use the [Delta Lake input source](https://druid.apache.org/docs/latest/development/extensions-contrib/delta-lake) to ingest data stored in a Delta Lake table into Apache Druid. + +[#15755](https://github.com/apache/druid/pull/15755) + ## Functional area and related changes This section contains detailed release notes separated by areas. ### Web console +#### Support for array types + +Added support for array types for all the ingestion wizards. + + + +[#15588](https://github.com/apache/druid/pull/15588) + +#### File inputs for query detail archive + +The **Load query detail archive** now supports loading queries by selecting a JSON file directly or dragging the file into the dialog. 
+ + + +[#15632](https://github.com/apache/druid/pull/15632) + +#### Improved lookup dialog + +The lookup dialog in the web console now includes following optional fields. See [JDBC lookup](https://druid.apache.org/docs/latest/development/extensions-core/lookups-cached-global#jdbc-lookup) for more information. + +* Jitter seconds +* Load timeout seconds +* Max heap percentage + + + +[#15472](https://github.com/apache/druid/pull/15472/) + +#### Improved time chart brush and added auto-granularity + +Improved the web console **Explore** view as follows: + +* Added the notion of timezone in the explore view. +* Time chart is now able to automatically pick a granularity if "auto" is selected (which is the default) based on the current time filter extent. +* Brush is now automatically enabled in the time chart. +* Brush interval snaps to the selected time granularity. +* Added a highlight bubble to all visualizations (except table because it has its own). + +[#14990](https://github.com/apache/druid/pull/14990) + #### Other web console improvements -### Ingestion +* Added the ability to detect multiple `EXPLAIN PLAN` queries in the workbench and run them individually [#15570](https://github.com/apache/druid/pull/15570) +* Added the ability to sort a segment table on start and end when grouping by interval [#15720](https://github.com/apache/druid/pull/15720) +* Improved the time shift for compare logic in the web console to include literals [#15433](https://github.com/apache/druid/pull/15433) +* Improved robustness of time shifting in tables in Explore view [#15359](https://github.com/apache/druid/pull/15359) +* Improved ingesting data using the web console [#15339](https://github.com/apache/druid/pull/15339) +* Improved management proxy detection [#15453](https://github.com/apache/druid/pull/15453) +* Fixed rendering on a disabled worker [#15712](https://github.com/apache/druid/pull/15712) +* Fix an issue where `waitUntilSegmentLoad` would always be set to `true` even if 
explicitly set to `false` [#15781](https://github.com/apache/druid/pull/15781) +* Enabled table driven query modification actions to work with slices [#15779](https://github.com/apache/druid/pull/15779) + +### General ingestion + +#### Added system fields to input sources + +Added the option to return system fields when defining an input source. This allows for ingestion of metadata, such as an S3 object's URI. + +[#15276](https://github.com/apache/druid/pull/15276) + +#### Changed how Druid allocates weekly segments Review Comment: This entry should also be added as a bullet in `Segment allocation improvements section`.
########## docs/release-info/release-notes.md: ########## @@ -57,50 +57,609 @@ [...] +#### Changed how Druid allocates weekly segments + +When the requested granularity is a month or larger but a segment can't be allocated, Druid resorts to day partitioning. +Unless explicitly specified, Druid skips week-granularity segments for data partitioning because these segments don't align with the end of the month or more coarse-grained intervals. Review Comment: @cryptoe , there is no way to get back to the old behaviour now.
In the old behaviour, if allocation was not possible for month, we tried week next. In the new behaviour, we skip week and try day directly. Week segments can now only be allocated if the chosen partitioning in the append task is WEEK.
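
To make the difference concrete for the release note, here is a minimal sketch of the fallback order described in this comment. It is not Druid's allocation code: the three-entry `LADDER`, the `candidates` and `allocate` helpers, and the `can_allocate` callback are all hypothetical names used only for illustration, and the fallback from WEEK to DAY when WEEK itself is requested is an assumption.

```python
from typing import Callable, List, Optional

# Illustrative ladder only, coarsest first. Druid supports more granularities
# than these three; this list is an assumption made for the sketch.
LADDER: List[str] = ["MONTH", "WEEK", "DAY"]


def candidates(requested: str, skip_week: bool = True) -> List[str]:
    """Return the granularities to try, in order, starting at the requested one.

    skip_week=True models the new behaviour: WEEK is only tried when it is
    the granularity that was explicitly requested.
    skip_week=False models the old behaviour: MONTH falls back to WEEK first.
    """
    order = LADDER[LADDER.index(requested):]
    if skip_week and requested != "WEEK":
        order = [g for g in order if g != "WEEK"]
    return order


def allocate(
    requested: str,
    can_allocate: Callable[[str], bool],
    skip_week: bool = True,
) -> Optional[str]:
    """Return the first granularity for which the hypothetical check passes."""
    for granularity in candidates(requested, skip_week):
        if can_allocate(granularity):
            return granularity
    return None
```

With a check that rejects only MONTH, `allocate("MONTH", ...)` yields DAY under the new behaviour and WEEK when `skip_week=False`, which is the difference the release note needs to call out.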
