vtlim commented on code in PR #13429: URL: https://github.com/apache/druid/pull/13429#discussion_r1042672580
########## docs/next-release-notes.md: ########## @@ -0,0 +1,517 @@ +--- +title: "WIP release notes for 25.0" +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ "License"); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +## Query engine + +### BIG_SUM SQL function + +Added SQL function `BIG_SUM` that uses the [Compressed Big Decimal](https://github.com/apache/druid/pull/10705) Druid extension. + +https://github.com/apache/druid/pull/13102 + +### Added Compressed Big Decimal min and max functions + +Added min and max functions for Compressed Big Decimal and exposed these functions via SQL: BIG_MIN and BIG_MAX. + +https://github.com/apache/druid/pull/13141 + +### Metrics used to downsample bucket + +Changed the way the MSQ task engine determines whether or not to downsample data, to improve accuracy. The task engine now uses the number of bytes instead of number of keys. + +https://github.com/apache/druid/pull/12998 + +### MSQ heap footprint + +When determining partition boundaries, the heap footprint of the sketches that MSQ uses is capped at 10% of available memory or 300 MB, whichever is lower. Previously, the cap was strictly 300 MB. 
+ +https://github.com/apache/druid/pull/13274 + +### MSQ Docker improvement + +Enabled MSQ task query engine for Docker by default. + +https://github.com/apache/druid/pull/13069 + +### Improved MSQ warnings + +For disallowed MSQ warnings of certain types, the warning is now surfaced as the error. + +https://github.com/apache/druid/pull/13198 + +### Added support for indexSpec + +The MSQ task engine now supports the `indexSpec` context parameter. This context parameter can also be configured through the web console. + +https://github.com/apache/druid/pull/13275 + +### Added task start status to the worker report + +Added `pendingTasks` and `runningTasks` fields to the worker report for the MSQ task engine. +See [Query task status information](#query-task-status-information) for related web console changes. + +https://github.com/apache/druid/pull/13263 + +### Improved handling of secrets + +When MSQ submits tasks containing SQL with sensitive keys, the keys can get logged in the file. +Druid now masks the sensitive keys in the log files using regular expressions. + +https://github.com/apache/druid/pull/13231 + +### Use worker number to communicate between tasks + +Changed the way WorkerClient communicates between the worker tasks, to abstract away the complexity of resolving the `workerNumber` to the `taskId` from the callers. +Once the WorkerClient writes its outputs to the durable storage, it adds a file with `__success` in the `workerNumber` output directory for that stage and with its `taskId`. This allows you to determine which worker has successfully written its outputs to the durable storage, and to differentiate those outputs from the partial outputs of orphan or failed worker tasks. 
+ +https://github.com/apache/druid/pull/13062 + +### Sketch merging mode + +When a query requires key statistics to generate partition boundaries, key statistics are gathered by the workers while reading rows from the datasource. You can now configure whether the MSQ task engine does this task in parallel or sequentially. Configure the behavior using the `clusterStatisticsMergeMode` context parameter. For more information, see [Sketch merging mode](https://druid.apache.org/docs/latest/multi-stage-query/reference.html#sketch-merging-mode). + +https://github.com/apache/druid/pull/13205 + +## Querying + +### Improvements to querying user experience + +This release includes several improvements for querying: + +* Exposed HTTP response headers for SQL queries (https://github.com/apache/druid/pull/13052) +* Added the `shouldFinalize` feature for HLL and quantiles sketches. Druid will no longer finalize aggregators when: Review Comment: PR: https://github.com/apache/druid/pull/13524
