Re: [PR] docs: 33.0.0 release notes (druid)

via GitHub Mon, 21 Apr 2025 21:51:37 -0700


FrankChen021 commented on code in PR #17872:
URL: https://github.com/apache/druid/pull/17872#discussion_r2053324326



##########
docs/release-info/release-notes.md:
##########
@@ -73,47 +134,226 @@ This section contains detailed release notes separated by 
areas.
 
 #### Streaming ingestion
 
+##### Query parameter for restarts
+
+You can now use an optional query parameter called `skipRestartIfUnmodified` 
for the `/druid/indexer/v1/supervisor` endpoint. You can set 
`skipRestartIfUnmodified=true` to not restart the supervisor if the spec is 
unchanged.
+
+For example:
+
+```bash
+curl -X POST --header "Content-Type: application/json" -d @supervisor.json "localhost:8888/druid/indexer/v1/supervisor?skipRestartIfUnmodified=true"
+```
+
+[#17707](https://github.com/apache/druid/pull/17707)
+
 ##### Other streaming ingestion improvements
 
+- Improved the efficiency of streaming ingestion by fetching active tasks from 
memory. This reduces the number of calls to the metadata store for active 
datasource task payloads [#16098](https://github.com/apache/druid/pull/16098)
+
 ### Querying
 
+#### Improved the query results API 
+
+The query results API (`GET /druid/v2/sql/statements/{queryId}/results`) now 
supports an optional `filename` parameter. When provided, the response 
instructs web browsers to save the results as a file instead of showing them 
inline (via the `Content-Disposition` header).
+
+[#17840](https://github.com/apache/druid/pull/17840)
+
+#### GROUP BY and ORDER BY for nulls
+
+SQL queries now support GROUP BY and ORDER BY for null types.
+
+[#16252](https://github.com/apache/druid/pull/16252)
+
 #### Other querying improvements
 
+- Queries that include functions with a large number of arguments, such as 
CASE statements, now run faster 
[#17613](https://github.com/apache/druid/pull/17613)
+
 ### Cluster management
 
+#### Controller task management
+
+You can now control how many task slots are available for MSQ task engine 
controller tasks by using the following configs:
+
+| Property | Description | Default value |
+|----------|-------------|---------------|
+| `druid.indexer.queue.controllerTaskSlotRatio` | (Optional) The proportion of available task slots that can be allocated to MSQ task engine controller tasks. This is a floating-point value between 0 and 1. | null |
+| `druid.indexer.queue.maxControllerTaskSlots` | (Optional) The maximum number of task slots that can be allocated to controller tasks. This is an integer value that defines a hard limit on the number of task slots available for MSQ task engine controller tasks. | null |
+
+[#16889](https://github.com/apache/druid/pull/16889)
+
 #### Other cluster management improvements
 
+- Improved logging for permissions issues 
[#17754](https://github.com/apache/druid/pull/17754)
+- Improved query distribution when `druid.broker.balancer.type` is set to 
`connectionCount` [#17764](https://github.com/apache/druid/pull/17764)
+
 ### Data management
 
+#### Compaction supervisors (experimental)
+
+You now configure compaction supervisors with the following Coordinator 
compaction config:
+
+- `useSupervisors` - Enable compaction to run as a supervisor on the Overlord 
instead of as a Coordinator duty
+- `engine` - Choose between `native` and `msq` to run compaction tasks. The 
`msq` setting uses the MSQ task engine and can be used only when 
`useSupervisors` is true.
+
+Previously, you used runtime properties for the Overlord. Support for these 
has been removed.
+
+[#17782](https://github.com/apache/druid/pull/17782)
+
+#### Compaction APIs
+
+You can use the following Overlord APIs to manage compaction:
+
+|Method|Path|Description|Required Permission|
+|--------|--------------------------------------------|------------|--------------------|
+|GET|`/druid/indexer/v1/compaction/config/cluster`|Get the cluster-level compaction config|Read configs|
+|POST|`/druid/indexer/v1/compaction/config/cluster`|Update the cluster-level compaction config|Write configs|
+|GET|`/druid/indexer/v1/compaction/config/datasources`|Get the compaction configs for all datasources|Read datasource|
+|GET|`/druid/indexer/v1/compaction/config/datasources/{dataSource}`|Get the compaction config of a single datasource|Read datasource|
+|POST|`/druid/indexer/v1/compaction/config/datasources/{dataSource}`|Update the compaction config of a single datasource|Write datasource|
+|GET|`/druid/indexer/v1/compaction/config/datasources/{dataSource}/history`|Get the compaction config history of a single datasource|Read datasource|
+|GET|`/druid/indexer/v1/compaction/status/datasources`|Get the compaction status of all datasources|Read datasource|
+|GET|`/druid/indexer/v1/compaction/status/datasources/{dataSource}`|Get the compaction status of a single datasource|Read datasource|
+
+#### Faster segment metadata operations
+
+Enable segment metadata caching on the Overlord with the runtime property 
`druid.manager.segments.useCache`. This feature is off by default.  
+
+You can set the property to the following values: 
+
+- `never`: Cache is disabled (default)
+- `always`: Reads are always done from the cache. Service start-up will be 
blocked until the cache has synced with the metadata store at least once. 
Transactions are blocked until the cache has synced with the metadata store at 
least once after becoming leader. 
+- `ifSynced`: Reads are done from the cache only if it has already synced with 
the metadata store. Unlike the `always` setting, this mode does not block 
service start-up or transactions.
+
+As part of this change, additional metrics have been introduced. For more 
information about these metrics, see [Segment metadata cache 
metrics](#segment-metadata-cache-metrics).
+
+[#17653](https://github.com/apache/druid/pull/17653) 
[#17824](https://github.com/apache/druid/pull/17824)
+
+#### Automatic kill task interval
+
+The Coordinator can optionally issue kill tasks for cleaning up unused 
segments. Starting with this release, individual kill tasks are limited to 
processing 30 days or fewer worth of segments per task by default. This 
improves performance of the individual kill tasks.
+
+The previous behavior (no limit on interval per kill task) can be restored by 
setting `druid.coordinator.kill.maxInterval = P0D`.
+
+[#17680](https://github.com/apache/druid/pull/17680)
+
 #### Other data management improvements
 
+- Metadata queries now return `maxIngestedEventTime`, which is the timestamp 
of the latest ingested event for the datasource. For realtime datasources, this 
may be later than `MAX(__time)` if `queryGranularity` is being used. For 
non-realtime datasources, this is equivalent to `MAX(__time)` 
[#17686](https://github.com/apache/druid/pull/17686)
+- Metadata kill queries are now more efficient. They consider a maximum end 
time since the last segment was killed 
[#17770](https://github.com/apache/druid/pull/17770)
+- Newly added segments are loaded more quickly 
[#17732](https://github.com/apache/druid/pull/17732)
+
 ### Metrics and monitoring
 
+#### Custom Histogram buckets for Prometheus
+
+You can now configure custom Histogram buckets for `timer` metrics from the 
Prometheus emitter using the `histogramBuckets` parameter. 
+
+If no custom buckets are provided, the following default buckets are used: 
`[0.1, 0.25, 0.5, 0.75, 1.0, 2.5, 5.0, 7.5, 10.0, 30.0, 60.0, 120.0, 300.0]`. 
If the user does not specify their own JSON file, a default mapping is used.
+
+[#17689](https://github.com/apache/druid/pull/17689)
+
+#### Segment metadata cache metrics
+
+The following metrics have been introduced as part of the segment metadata 
cache performance improvement:
+
+- `segment/metadataCache/sync/time`
+- `segment/metadataCache/transactions/readOnly`
+- `segment/metadataCache/transactions/writeOnly`
+- `segment/metadataCache/transactions/readWrite`
+
+For more information about the segment metadata cache, see [Faster segment 
metadata operations](#faster-segment-metadata-operations).
+
+[#17653](https://github.com/apache/druid/pull/17653)
+
+#### Streaming ingestion lag metrics
+
+The Kafka supervisor now includes additional lag metrics for how many minutes 
of data Druid is behind:
+
+|Metric|Description|Dimensions|Normal value|
+|-|-|-|-|
+|`ingest/kafka/updateOffsets/time`|Total time (in milliseconds) taken to fetch the latest offsets from the Kafka stream and the ingestion tasks.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|Generally a few seconds at most.|
+|`ingest/kafka/lag/time`|Total lag time in milliseconds between the current message sequence number consumed by the Kafka indexing tasks and the latest sequence number in Kafka across all shards. Minimum emission period for this metric is a minute. Enabled only when `pusblishLagTime` is set to true on supervisor config.|`dataSource`, `stream`, `tags`|Greater than 0, up to the maximum Kafka retention period in milliseconds.|
+|`ingest/kafka/maxLag/time`|Max lag time in milliseconds between the current message sequence number consumed by the Kafka indexing tasks and the latest sequence number in Kafka across all shards. Minimum emission period for this metric is a minute. Enabled only when `pusblishLagTime` is set to true on supervisor config.|`dataSource`, `stream`, `tags`|Greater than 0, up to the maximum Kafka retention period in milliseconds.|
+|`ingest/kafka/avgLag/time`|Average lag time in milliseconds between the current message sequence number consumed by the Kafka indexing tasks and the latest sequence number in Kafka across all shards. Minimum emission period for this metric is a minute. Enabled only when `pusblishLagTime` is set to true on supervisor config.|`dataSource`, `stream`, `tags`|Greater than 0, up to the maximum Kafka retention period in milliseconds.|
+|`ingest/kinesis/updateOffsets/time`|Total time (in milliseconds) taken to fetch the latest offsets from the Kinesis stream and the ingestion tasks.|`dataSource`, `taskId`, `taskType`, `groupId`, `tags`|Generally a few seconds at most.|
+
+[#17735](https://github.com/apache/druid/pull/17735)
+
+#### Other metrics and monitoring changes
+
+- Added the `ingest/processed/bytes` metric that tracks the total number of 
bytes processed during ingestion tasks for JSON-based batch, SQL-based batch, 
and streaming ingestion tasks 
[#17581](https://github.com/apache/druid/pull/17581)
+
 ### Extensions
 
+#### Kubernetes 
+
+- You can now ingest payloads larger than 128KiB when using HDFS as deep 
storage for Middle Manager-less ingestion 
[#17742](https://github.com/apache/druid/pull/17742)
+- You can now run task pods in a namespace different from the rest of the 
cluster [#17738](https://github.com/apache/druid/pull/17738)
+- You can now set a prefix for K8s job names using 
`druid.indexer.runner.k8sTaskPodNamePrefix` 
[#17749](https://github.com/apache/druid/pull/17749)
+- The logging level is now set to info. Previously, it was set to debug 
[#17752](https://github.com/apache/druid/pull/17752)
+- Druid now supports lazy loading of pod templates so that any config changes 
you make are deployed more quickly 
[#17701](https://github.com/apache/druid/pull/17701)
+- Removed startup probe so that peon tasks can start up properly without being 
killed by Kubernetes [#17784](https://github.com/apache/druid/pull/17784)
+
 ### Documentation improvements
 
 ## Upgrade notes and incompatible changes
 
 ### Upgrade notes
 
-#### Front-coded dictionaries
+#### `useMaxMemoryEstimates`
+
+`useMaxMemoryEstimates` is now set to false for MSQ task engine tasks. 
Additionally, the property has been deprecated and will be removed in a future 
release. Setting this to false allows for better on-heap memory estimation.
 
-<!--Carry this forward until 32. Then move it to incompatible changes -->
+[#17792](https://github.com/apache/druid/pull/17792)
 
-In Druid 32.0.0, the front coded dictionaries feature will be turned on by 
default. Front-coded dictionaries reduce storage and improve performance by 
optimizing for strings where the front part looks similar.
+#### Automatic kill tasks interval
 
-Once this feature is on, you cannot easily downgrade to an earlier version 
that does not support the feature. 
+Automatic kill tasks are now limited to 30 days or fewer worth of segments per 
task.
 
-For more information, see [Migration guide: front-coded 
dictionaries](./migr-front-coded-dict.md).
+The previous behavior (no limit on interval per kill task) can be restored by 
setting `druid.coordinator.kill.maxInterval = P0D`.
 
-If you're already using this feature, you don't need to take any action. 
+[#17680](https://github.com/apache/druid/pull/17680)
 
+#### Kubernetes deployments
 
-### Incompatible changes
+By default, the Docker image now uses the canonical hostname if you're running 
Druid in Kubernetes. Otherwise, it uses the IP address 
[#17697](https://github.com/apache/druid/pull/17697) 

Review Comment:
   ```suggestion
   By default, the Docker image now uses the canonical hostname to register 
services in ZooKeeper for internal communication if you're running Druid in 
Kubernetes. Otherwise, it uses the IP address 
[#17697](https://github.com/apache/druid/pull/17697).
   
   You can set the environment variable `DRUID_SET_HOST_IP` to `1` to restore 
the old behavior.
   ```
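
For readers who want to try the override from the suggestion, here is a minimal sketch. It only assembles and prints a `docker run` command that passes `DRUID_SET_HOST_IP=1` into the container; it does not start anything. The image tag `apache/druid:33.0.0` and the `broker` argument are assumptions for illustration. Only the variable name and its value come from the suggestion above.

```bash
# Per the suggestion, setting this to 1 restores the old behavior.
DRUID_SET_HOST_IP=1

# Flag that hands the variable to the container's environment.
DOCKER_ENV_FLAG="-e DRUID_SET_HOST_IP=${DRUID_SET_HOST_IP}"

# Dry run: print the command instead of executing it.
# Assumption: the image tag and the service argument are placeholders.
echo "docker run --rm ${DOCKER_ENV_FLAG} apache/druid:33.0.0 broker"
```

In a Kubernetes manifest, the same variable would instead be listed under the container's `env` entries.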



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

