Re: [PR] DO NOT MERGE: 29.0.0 release notes (druid)

via GitHub Tue, 06 Feb 2024 07:41:38 -0800


ektravel commented on code in PR #15805:
URL: https://github.com/apache/druid/pull/15805#discussion_r1480056632



##########
docs/release-info/release-notes.md:
##########
@@ -57,50 +57,557 @@ For tips about how to write a good release note, see 
[Release notes](https://git
 
 This section contains important information about new and existing features.
 
+### SQL PIVOT and UNPIVOT (experimental)
+
+Druid 29.0.0 adds experimental support for the SQL PIVOT and UNPIVOT operators.
+
+The PIVOT operator carries out an aggregation and transforms rows into columns 
in the output. The following is the general syntax for the PIVOT operator:
+
+```sql
+PIVOT (aggregation_function(column_to_aggregate)
+  FOR column_with_values_to_pivot
+  IN (pivoted_column1 [, pivoted_column2 ...])
+)
+```
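+
+As a rough sketch of how the operator fits into a full query, the following example pivots a hypothetical `wikipedia` datasource on its `cityName` column. The datasource, column names, and aliases are assumptions for illustration only:
+
+```sql
+-- Hypothetical datasource and columns, for illustration only.
+-- Each value in the IN list becomes an output column that combines
+-- the value alias and the aggregation alias, such as ba_sum_deleted.
+SELECT channel, ba_sum_deleted, ny_sum_deleted
+FROM "wikipedia"
+PIVOT (SUM(deleted) AS "sum_deleted"
+  FOR "cityName"
+  IN ('Buenos Aires' AS ba, 'New York' AS ny)
+)
+WHERE ba_sum_deleted IS NOT NULL OR ny_sum_deleted IS NOT NULL
+LIMIT 15
+```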
+
+The UNPIVOT operator transforms existing column values into rows. The 
following is the general syntax for the UNPIVOT operator:
+
+```sql
+UNPIVOT (values_column 
+  FOR names_column
+  IN (unpivoted_column1 [, unpivoted_column2 ... ])
+)
+```
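+
+For illustration, assuming a hypothetical `wikipedia` datasource with numeric `added` and `deleted` columns, an UNPIVOT query might look like the following sketch:
+
+```sql
+-- Hypothetical datasource and columns, for illustration only.
+-- The "added" and "deleted" columns become rows: "action" holds the
+-- original column name and "changes" holds the corresponding value.
+SELECT channel, action, SUM(changes) AS total_changes
+FROM "wikipedia"
+UNPIVOT (changes
+  FOR action
+  IN ("added", "deleted")
+)
+GROUP BY channel, action
+LIMIT 15
+```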
+
+### Range support in window functions
+
+Window functions now support ranges where both endpoints are unbounded or are 
the current row. Ranges work in strict mode, which means that Druid will fail 
queries that aren't supported. You can turn off strict mode for ranges by 
setting the context parameter `windowingStrictValidation` to `false`.
+
+The following example shows a window expression with RANGE frame 
specifications:
+
+```sql
+(ORDER BY c)
+(ORDER BY c RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
+(ORDER BY c RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+```
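+
+As a sketch of where such a frame specification appears in a complete query, the following example computes a running total per channel. The `wikipedia` datasource and its columns are assumptions for illustration:
+
+```sql
+-- Hypothetical datasource and columns, for illustration only.
+-- The RANGE frame covers every row from the start of the partition
+-- up to and including the current row.
+SELECT
+  channel,
+  __time,
+  delta,
+  SUM(delta) OVER (
+    PARTITION BY channel
+    ORDER BY __time
+    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+  ) AS running_delta
+FROM "wikipedia"
+GROUP BY channel, __time, delta
+```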
+
+[#15703](https://github.com/apache/druid/pull/15703) 
[#15746](https://github.com/apache/druid/pull/15746)
+
+### Improved INNER joins
+
+Druid now supports arbitrary join conditions for INNER join. Any 
sub-conditions that can't be evaluated as part of the join are converted to a 
post-join filter. Improved join capabilities allow Druid to more effectively 
support applications like Tableau.
+
+[#15302](https://github.com/apache/druid/pull/15302)
+
+### First and last aggregators for double, float, and long data types
+
+Druid 29.0.0 adds support for first and last aggregators for the double, 
float, and long types in an ingestion spec and MSQ queries. Previously, they 
were only supported for native queries. For more information, see [First and 
last aggregators](https://druid.apache.org/docs/latest/querying/aggregations/).
+
+[#14462](https://github.com/apache/druid/pull/14462)
+
+### Support for logging audit events
+
+Added support for logging audit events and improved coverage of audited REST 
API endpoints. To log audit events, set config `druid.audit.manager.type` to 
`log`.
+
+[#15480](https://github.com/apache/druid/pull/15480) 
[#15653](https://github.com/apache/druid/pull/15653)
+
+### Enabled empty ingest queries
+
+The MSQ task engine now allows empty ingest queries by default. Previously, 
ingest queries that produced no data would fail with the `InsertCannotBeEmpty` 
MSQ fault.
+For more information, see [Empty ingest queries in the upgrade 
notes](#enabled-empty-ingest-queries).
+
+[#15674](https://github.com/apache/druid/pull/15674) 
[#15495](https://github.com/apache/druid/pull/15495)
+
+### Support for Google Cloud Storage
+
+Added support for Google Cloud Storage (GCS). You can now use durable storage 
with GCS. See [Durable storage 
configurations](https://druid.apache.org/docs/latest/multi-stage-query/reference#durable-storage-configurations)
 for more information.
+
+[#15398](https://github.com/apache/druid/pull/15398)
+
+### Experimental extensions
+
+Druid 29.0.0 adds the following extensions.
+
+#### DDSketch
+
+A new DDSketch extension is available as a community contribution. The 
DDSketch extension (`druid-ddsketch`) provides support for approximate quantile 
queries using the [DDSketch](https://github.com/datadog/sketches-java) library.
+
+[#15049](https://github.com/apache/druid/pull/15049)
+
+#### Spectator histogram
+
+A new histogram extension is available as a community contribution. The 
Spectator-based histogram extension (`druid-spectator-histogram`) provides 
approximate histogram aggregators and percentile post-aggregators based on 
[Spectator](https://netflix.github.io/atlas-docs/spectator/) fixed-bucket 
histograms.
+
+[#15340](https://github.com/apache/druid/pull/15340)
+
+#### Delta Lake
+
+A new Delta Lake extension is available as a community contribution. The Delta 
Lake extension (`druid-deltalake-extensions`) lets you use the [Delta Lake 
input 
source](https://druid.apache.org/docs/latest/development/extensions-contrib/delta-lake)
 to ingest data stored in a Delta Lake table into Apache Druid.
+
+[#15755](https://github.com/apache/druid/pull/15755)
+
+### Removed the `auto` search strategy
+
+Removed the `auto` search strategy from the native search query. Setting 
`searchStrategy` to `auto` is now equivalent to `useIndexes`. Improvements to 
how and when indexes are computed have allowed the `useIndexes` strategy to be 
more adaptive, skipping computing expensive indexes when possible.
+
+[#15550](https://github.com/apache/druid/pull/15550)
+
 ## Functional area and related changes
 
 This section contains detailed release notes separated by areas.
 
 ### Web console
 
+#### Improved lookup dialog
+
+The lookup dialog in the web console now includes the following optional fields:
+
+* Jitter seconds
+* Load timeout seconds
+* Max heap percentage
+
+![Lookup dialog](./assets/image01.png)
+
+[#15472](https://github.com/apache/druid/pull/15472/)
+
+#### File inputs for query detail archive
+
+The **Load query detail archive** dialog now supports loading queries by selecting a 
JSON file directly or dragging the file into the dialog.
+
+![Load query detail archive](./assets/image02.png)
+
+[#15632](https://github.com/apache/druid/pull/15632)
+
+#### Improved time chart brush and added auto-granularity
+
+* Added the notion of timezone in the explore view.
+* Added `chronoshift` as a dependency.
+* The time chart now automatically picks a granularity based on the current time 
filter extent when "auto" is selected (the default).
+* Brush is now automatically enabled in the time chart.
+* Brush interval snaps to the selected time granularity.
+* Added a highlight bubble to all visualizations (except table because it has 
its own).
+
+[#14990](https://github.com/apache/druid/pull/14990)
+
+#### Toggle to fail on empty inserts
+
+Added a new toggle to fail when an ingestion query produces no data.
+
+[#15627](https://github.com/apache/druid/pull/15627)
+
 #### Other web console improvements
 
-### Ingestion
+* Added the ability to detect multiple `EXPLAIN PLAN` queries in the workbench 
and run them individually [#15570](https://github.com/apache/druid/pull/15570)
+* Added the ability to sort a segment table on start and end when grouping by 
interval [#15720](https://github.com/apache/druid/pull/15720)
+* Improved the time shift for compare logic in the web console to include 
literals [#15433](https://github.com/apache/druid/pull/15433)
+* Improved robustness of time shifting in tables in Explore view 
[#15359](https://github.com/apache/druid/pull/15359)
+* Improved ingesting data using the web console 
[#15339](https://github.com/apache/druid/pull/15339)
+* Fixed rendering on a disabled worker 
[#15712](https://github.com/apache/druid/pull/15712)
+* Enabled table driven query modification actions to work with slices 
[#15779](https://github.com/apache/druid/pull/15779)
+
+### General ingestion
+
+#### Added system fields to input sources
+
+Added the option to return system fields when defining an input source. This 
allows for ingestion of metadata such as an S3 object's URI.
+
+[#15276](https://github.com/apache/druid/pull/15276)
+
+#### Changed how Druid allocates weekly segments
+
+When the requested granularity is a month or larger but a segment can't be 
allocated, Druid resorts to day partitioning.
+Unless explicitly specified, Druid skips week-granularity segments for data 
partitioning because these segments don't align with the end of the month or 
more coarse-grained intervals.
+
+[#15589](https://github.com/apache/druid/pull/15589)
+
+#### Changed how empty or null array columns are stored
+
+Columns ingested with the auto column indexer that contain only empty or 
null-containing arrays are now stored as `ARRAY<LONG\>` instead of `COMPLEX<json\>`.
+
+[#15505](https://github.com/apache/druid/pull/15505)
+
+#### Enabled skipping compaction for datasources with partial-eternity segments
+
+Druid now skips compaction for datasources with segments that have their 
interval start or end coinciding with Eternity interval end-points.
+
+[#15542](https://github.com/apache/druid/pull/15542)
+
+#### Segment allocation improvements
+
+Improved segment allocation as follows:
+
+* Enhanced polling in segment allocation queue 
[#15590](https://github.com/apache/druid/pull/15590)
+* Fixed an issue in segment allocation that could cause loss of appended data 
when running interleaved append and replace tasks 
[#15459](https://github.com/apache/druid/pull/15459)
+
+#### Other ingestion improvements
+
+* Added a context parameter `useConcurrentLocks` for concurrent locks. You can 
set it for an individual task or at a cluster level using 
`druid.indexer.task.default.context` 
[#15684](https://github.com/apache/druid/pull/15684)
+* Added a default implementation for the `evalDimension` method in the 
RowFunction interface [#15452](https://github.com/apache/druid/pull/15452)
+* Added a configurable delay to the Peon service that determines how long a 
Peon should wait before dropping a segment 
[#15373](https://github.com/apache/druid/pull/15373)
+* Improved metadata store updates by attempting to retry updates rather than 
failing [#15141](https://github.com/apache/druid/pull/15141)
+* Fixed an issue where `systemField` values weren't properly decorated in the 
sampling response [#15536](https://github.com/apache/druid/pull/15536)
+* Fixed an issue with columnar frames always writing multi-valued columns 
where the input column had `hasMultipleValues = UNKNOWN` 
[#15300](https://github.com/apache/druid/pull/15300)
+* Fixed a race condition where there were multiple attempts to publish 
segments for the same sequence 
[#14995](https://github.com/apache/druid/pull/14995)
+* Fixed a race condition that can occur at high streaming concurrency 
[#15174](https://github.com/apache/druid/pull/15174)
+* Fixed an issue where complex types that are also numbers were assumed to 
be doubles [#15272](https://github.com/apache/druid/pull/15272)
+* Fixed an issue with unnecessary retries triggered when exceptions like 
IOException obfuscated S3 exceptions 
[#15238](https://github.com/apache/druid/pull/15238)
+* Fixed segment retrieval when the input interval does not lie within the 
years `[1000, 9999]` [#15608](https://github.com/apache/druid/pull/15608)
+* Fixed empty strings being incorrectly converted to null values 
[#15525](https://github.com/apache/druid/pull/15525)
+* Simplified `IncrementalIndex` and `OnHeapIncrementalIndex` by removing some 
parameters [#15448](https://github.com/apache/druid/pull/15448)
+* Updated active task payload access to read from memory before falling back to 
the metadata store [#15377](https://github.com/apache/druid/pull/15377)
+
+### SQL-based ingestion
 
-#### SQL-based ingestion
+#### Added `castToType` parameter
 
-##### Other SQL-based ingestion improvements
+Added optional `castToType` parameter to `auto` column schema.
 
-#### Streaming ingestion
+[#15417](https://github.com/apache/druid/pull/15417)
 
-##### Other streaming ingestion improvements
+#### Improved the EXTEND operator
+
+You can now use types like `VARCHAR ARRAY` and `BIGINT ARRAY` with the EXTEND 
operator.
+
+For example:
+
+```sql
+EXTEND (a VARCHAR ARRAY, b BIGINT ARRAY, c VARCHAR)
+```
+
+specifies an extern input with native Druid input types `ARRAY<STRING>`, 
`ARRAY<LONG>` and `STRING`.
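+
+A minimal sketch of a complete query that uses this clause, assuming hypothetical inline JSON input data:
+
+```sql
+-- Hypothetical inline input, for illustration only.
+-- The EXTEND clause declares the column names and SQL types of the external data.
+SELECT a, b, c
+FROM TABLE(
+  EXTERN(
+    '{"type":"inline","data":"{\"a\":[\"x\",\"y\"],\"b\":[1,2],\"c\":\"z\"}"}',
+    '{"type":"json"}'
+  )
+) EXTEND (a VARCHAR ARRAY, b BIGINT ARRAY, c VARCHAR)
+```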
+
+[#15458](https://github.com/apache/druid/pull/15458)
+
+#### Improved tombstone generation to honor granularity specified in a 
`REPLACE` query
+
+MSQ `REPLACE` queries now generate tombstone segments honoring the segment 
granularity specified in the query, rather than generating irregular 
tombstones. If a query generates more than 5000 tombstones, Druid returns an 
MSQ `TooManyBucketsFault` error, similar to the behavior with data segments.
+
+[#15243](https://github.com/apache/druid/pull/15243)
+
+#### Improved hash joins using filters
+
+Improved consistency of JOIN behavior for queries using either the native or 
MSQ engine to prune based on base (left-hand side) columns only.
+
+[#15299](https://github.com/apache/druid/pull/15299)
+
+#### Configurable page size limit
+
+You can now limit the page size for results of SELECT queries run using the 
MSQ engine. See `rowsPerPage` in the [SQL-based ingestion 
reference](https://druid.apache.org/docs/latest/multi-stage-query/reference).
+
+### Streaming ingestion
+
+#### Improved Amazon Kinesis automatic reset
+
+Changed Amazon Kinesis automatic reset behavior to only reset the checkpoints 
for partitions where sequence numbers are unavailable.
+
+[#15338](https://github.com/apache/druid/pull/15338)
 
 ### Querying
 
-#### Other querying improvements
+#### Added IPV6_MATCH SQL function
+
+Added the IPV6_MATCH SQL function for matching IPv6 addresses in a subnet:
+
+```sql
+IPV6_MATCH(address, subnet)
+```
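+
+For example, the following sketch keeps only rows whose address falls inside a given subnet. The `network_logs` datasource, the `client_ip` column, and the subnet literal are assumptions for illustration:
+
+```sql
+-- Hypothetical datasource, column, and subnet, for illustration only.
+SELECT client_ip, COUNT(*) AS requests
+FROM "network_logs"
+WHERE IPV6_MATCH(client_ip, '75e9:efa4:29c6:85f6::/64')
+GROUP BY client_ip
+```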
+
+[#15212](https://github.com/apache/druid/pull/15212/)
+
+#### Added JSON_QUERY_ARRAY function
+
+Added JSON_QUERY_ARRAY, which is similar to JSON_QUERY except the return type 
is always `ARRAY<COMPLEX<json>>` instead of `COMPLEX<json>`. Essentially, this 
function allows extracting arrays of objects from nested data and performing 
operations such as UNNEST, ARRAY_LENGTH, ARRAY_SLICE, or any other available 
ARRAY operations.
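+
+As an illustrative sketch, assuming a hypothetical `orders` datasource with a nested `payload` column that holds an `items` array of objects:
+
+```sql
+-- Hypothetical datasource, column, and JSON path, for illustration only.
+-- JSON_QUERY_ARRAY returns an array of JSON objects, so array functions
+-- such as ARRAY_LENGTH can operate on the result.
+SELECT
+  JSON_QUERY_ARRAY("payload", '$.items') AS items,
+  ARRAY_LENGTH(JSON_QUERY_ARRAY("payload", '$.items')) AS item_count
+FROM "orders"
+```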
+
+[#15521](https://github.com/apache/druid/pull/15521)
 
-### Cluster management
+#### Added numeric support for EARLIEST and LATEST functions
 
-#### Other cluster management improvements
+In addition to string support, the following functions can now return numeric 
values:
+
+* EARLIEST and EARLIEST_BY
+* LATEST and LATEST_BY
+
+You can also use these functions as aggregations at ingestion time.
+
+[#15607](https://github.com/apache/druid/pull/15607)
+
+#### Added support for `aggregateMultipleValues`
+
+Improved the `ANY_VALUE(expr)` function to support the boolean option 
`aggregateMultipleValues`. The `aggregateMultipleValues` option is enabled by 
default. When you run ANY_VALUE on an MVD, the function returns the stringified 
array. If `aggregateMultipleValues` is set to `false`, ANY_VALUE returns the 
first value instead.
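+
+The following sketch shows how the option might be turned off. It assumes that `aggregateMultipleValues` is passed as the third argument, after a maximum byte size, and it uses a hypothetical `events` datasource with a multi-value `tags` column. Check the aggregation functions reference for the exact signature:
+
+```sql
+-- Hypothetical datasource and column; the argument order is an assumption.
+-- With the third argument set to false, ANY_VALUE returns the first value
+-- of the multi-value dimension instead of the stringified array.
+SELECT channel, ANY_VALUE("tags", 1024, false) AS first_tag
+FROM "events"
+GROUP BY channel
+```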
+
+[#15434](https://github.com/apache/druid/pull/15434)
+
+#### Added native `array contains element` filter
+
+Added native `array contains element` filter to improve performance when using 
ARRAY_CONTAINS on array columns.
+
+[#15366](https://github.com/apache/druid/pull/15366) 
[#15455](https://github.com/apache/druid/pull/15455)
+
+#### Changed `equals` filter for native queries
+
+The [equality 
filter](https://druid.apache.org/docs/latest/querying/filters#equality-filter) 
on mixed type `auto` columns that contain arrays must now be filtered as their 
presenting type. This means that if any rows are arrays (for example, the 
segment metadata and `information_schema` reports the type as some array type), 
then the native queries must also filter as if they are some array type.
+ 
+This change impacts mixed type `auto` columns that contain both scalars and 
arrays. It doesn't impact SQL, which already has this limitation due to how the 
type presents itself.
+
+[#15503](https://github.com/apache/druid/pull/15503)
+
+#### Improved `timestamp_extract` function
+
+The `timestamp_extract(expr, unit, [timezone])` Druid native query function 
now supports dynamic values.
+
+[#15586](https://github.com/apache/druid/pull/15586)
+
+#### Improved JSON_VALUE and JSON_QUERY
+
+Added support for using expressions to compute the JSON path argument for 
JSON_VALUE and JSON_QUERY functions.
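+
+As a sketch of what this enables, the following query builds the path from another column instead of a literal. The `events` datasource and the `attributes` and `attribute_name` columns are assumptions for illustration:
+
+```sql
+-- Hypothetical datasource and columns, for illustration only.
+-- The JSON path is computed per row from the attribute_name column.
+SELECT
+  attribute_name,
+  JSON_VALUE("attributes", CONCAT('$.', attribute_name)) AS attribute_value
+FROM "events"
+```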
+
+[#15320](https://github.com/apache/druid/pull/15320)
+
+#### Improved `ExpressionPostAggregator` array handling
+
+Improved the use of `ExpressionPostAggregator` to handle ARRAY types output by 
the grouping engine. The native expression system now recognizes 
`ComparableStringArray` and `ComparableList` array wrapper types and treats 
them as ARRAY types.
+
+Updated `FunctionalExpr` to streamline the handling of class cast exceptions 
as user errors, so that users are provided with clear exception messages.
+
+[#15543](https://github.com/apache/druid/pull/15543)
+
+#### Improved lookups
+
+Enhanced lookups as follows:
+
+* Improved loading and dropping of containers for lookups to reduce 
inconsistencies during updates 
[#14806](https://github.com/apache/druid/pull/14806)
+* Changed behavior for initialization of lookups to load the first lookup as 
is, regardless of cache status 
[#15598](https://github.com/apache/druid/pull/15598)
+
+#### Enabled query request queuing by default when total laning is turned on

Review Comment:
   Yes, I think it should. Added to "Upgrade notes" and upgrade-notes.md



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

