FrankChen021 commented on code in PR #18919: URL: https://github.com/apache/druid/pull/18919#discussion_r2726156609
########## docs/release-info/release-notes.md:
##########
@@ -61,59 +61,262 @@ This section contains important information about new and existing features.
 This section contains detailed release notes separated by areas.
+#### Druid operator
+
+Druid Operator is a Kubernetes controller that manages the lifecycle of your Druid clusters. The operator simplifies cluster management with custom logic that is configurable through Kubernetes CRDs.
+
+[#18435](https://github.com/apache/druid/pull/18435)
+
+#### Cost-based autoscaling for streaming ingestion
+
+Druid now supports cost-based autoscaling for streaming ingestion, which optimizes task count by balancing lag reduction against resource efficiency. This autoscaling strategy uses the following formula:
+
+```
+totalCost = lagWeight × lagRecoveryTime + idleWeight × idlenessCost
+```
+
+where the two terms account for the time to clear the backlog and for wasted compute time:
+
+```
+lagRecoveryTime = aggregateLag / (taskCount × avgProcessingRate) — time to clear backlog
+idlenessCost = taskCount × taskDuration × predictedIdleRatio — wasted compute time
+```
+
+[#18819](https://github.com/apache/druid/pull/18819)
+
+#### Kubernetes client mode (experimental)
+
+The new experimental Kubernetes client mode uses the `fabric8` `SharedInformers` to cache Kubernetes metadata. This greatly reduces API traffic between the Overlord and the Kubernetes control plane. You can try out this feature using the following config:
+
+```
+druid.indexer.runner.useK8sSharedInformers=true
+```
+
+[#18599](https://github.com/apache/druid/pull/18599)
+
+#### cgroup v2 support
+
+cgroup v2 is now supported, and all cgroup metrics now emit `cgroupversion` to identify which version is in use.
+
+The following monitors automatically switch to v2 if v2 is detected: `CgroupCpuMonitor`, `CgroupCpuSetMonitor`, `CgroupDiskMonitor`, and `MemoryMonitor`. `CpuAcctDeltaMonitor` fails gracefully if v2 is detected.
+
+Additionally, `CgroupV2CpuMonitor` now also emits `cgroup/cpu/shares` and `cgroup/cpu/cores_quota`.
+
+[#18705](https://github.com/apache/druid/pull/18705)
+
+#### Query reports for Dart
+
+Dart now supports query reports for running and recently completed queries, which can be fetched from the `/druid/v2/sql/queries/<sqlQueryId>/reports` endpoint.
+
+The response is a JSON object with two keys, `query` and `report`. The `query` key contains the same information that is available from the existing `/druid/v2/sql/queries` endpoint. The `report` key contains a report map that includes an MSQ report.
+
+You can control the retention behavior for reports using the following configs:
+
+* `druid.msq.dart.controller.maxRetainedReportCount`: The maximum number of reports that are retained. The default is 0, meaning no reports are retained.
+* `druid.msq.dart.controller.maxRetainedReportDuration`: How long reports are retained, in ISO 8601 duration format. The default is `PT0S`, meaning time-based expiration is turned off.
+
+[#18886](https://github.com/apache/druid/pull/18886)
+
+#### New segment format
+
+The new version 10 segment format improves upon version 9. Version 10 supports partial segment downloads, a capability provided by the experimental virtual storage fabric feature. To streamline partial fetches, the base segment contents are combined into a single file, `druid.segment`.
+
+Set `druid.indexer.task.buildV10=true` to create segments in the new format.
+
+You can use the `bin/dump-segment` tool to view segment metadata. The tool outputs serialized JSON.
+
+[#18880](https://github.com/apache/druid/pull/18880) [#18901](https://github.com/apache/druid/pull/18901)
+
 ### Web console
+#### New info available in the web console
+
+The web console now includes information about the number of available processors and the total memory (in binary bytes).
+
+This information is also available through the `sys.servers` table.
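For example, you could inspect these values with a SQL query against `sys.servers`. This is a sketch only: the Router address is an assumption for a default deployment, and `SELECT *` is used because the release note doesn't name the new columns.

```shell
# Query the sys.servers table through the SQL API.
# Assumes a Router at localhost:8888; adjust the host and port for your cluster.
curl -X POST http://localhost:8888/druid/v2/sql \
  -H 'Content-Type: application/json' \
  -d '{"query": "SELECT * FROM sys.servers"}'
```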
+
+[#18613](https://github.com/apache/druid/pull/18613)
+
 #### Other web console improvements
+* Added tracking for inactive workers for MSQ execution stages [#18768](https://github.com/apache/druid/pull/18768)
+* Added a refresh button for JSON views and stage viewers [#18768](https://github.com/apache/druid/pull/18768)
+* You can now define `ARRAY` type parameters in the query view [#18586](https://github.com/apache/druid/pull/18586)
+* Changed system table queries to automatically use the native engine [#18857](https://github.com/apache/druid/pull/18857)
+* Improved time charts to support multiple measures [#18701](https://github.com/apache/druid/pull/18701)
+
 ### Ingestion
+* Added support for AWS `InternalError` code retries [#18720](https://github.com/apache/druid/pull/18720)
+* Improved ingestion resilience: ingestion tasks no longer fail if the task log upload fails with an exception [#18748](https://github.com/apache/druid/pull/18748)
+* Improved how Druid handles situations where data doesn't match the expected type [#18878](https://github.com/apache/druid/pull/18878)
+* Improved JSON ingestion so that Druid can compute JSON values directly from dictionary or index structures, allowing ingestion to skip persisting raw JSON data entirely. This reduces on-disk storage size [#18589](https://github.com/apache/druid/pull/18589)
+* You can now choose between full dictionary-based indexing and nulls-only indexing for long/double fields in a nested column [#18722](https://github.com/apache/druid/pull/18722)
+
 #### SQL-based ingestion
+##### Additional ingestion configurations
+
+You can now use the following configs to control how your data gets ingested and stored:
+
+* `maxInputFilesPerWorker`: Controls the maximum number of input files or segments per worker.
+* `maxPartitions`: Controls the maximum number of output partitions for any single stage, which affects how many segments are generated during ingestion.
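As a sketch, these could be supplied in the query context of an MSQ ingestion submission. The Router address, the `INSERT` statement, and the numeric values below are all illustrative assumptions, not recommendations:

```shell
# Submit an MSQ ingestion query with the new tuning configs in the query context.
# Assumes a Router at localhost:8888; query and values are illustrative only.
curl -X POST http://localhost:8888/druid/v2/sql/task \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "INSERT INTO wikipedia SELECT * FROM TABLE(EXTERN(...)) PARTITIONED BY DAY",
    "context": {
      "maxInputFilesPerWorker": 1000,
      "maxPartitions": 5000
    }
  }'
```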
+
+[#18826](https://github.com/apache/druid/pull/18826)
+
 ##### Other SQL-based ingestion improvements
+* Added `maxRowsInMemory` to replace `rowsInMemory`. `rowsInMemory` now functions as an alternate way to provide that config and is ignored if `maxRowsInMemory` is specified. Previously, only `rowsInMemory` existed [#18832](https://github.com/apache/druid/pull/18832)
+* Improved the parallelism for sort-merge joins [#18765](https://github.com/apache/druid/pull/18765)
+
 #### Streaming ingestion
+##### Record offset and partition
+
+You can now ingest the record offset (`offsetColumnName`) and partition (`partitionColumnName`) using the `KafkaInputFormat`. Their default names are `kafka.offset` and `kafka.partition`, respectively.
+
+[#18757](https://github.com/apache/druid/pull/18757)
+
 ##### Other streaming ingestion improvements
+* Improved supervisors so that they can't kill tasks while the supervisor is stopping [#18767](https://github.com/apache/druid/pull/18767)
+* Improved the lag-based autoscaler for streaming ingestion [#18745](https://github.com/apache/druid/pull/18745)
+* Improved the `SeekableStream` supervisor autoscaler to wait for tasks to complete before attempting subsequent scale operations. This helps prevent duplicate supervisor history entries [#18715](https://github.com/apache/druid/pull/18715)
+
 ### Querying
 #### Other querying improvements
+* Improved the user experience for invalid `regex_exp` queries: Druid now returns an error [#18762](https://github.com/apache/druid/pull/18762)
+
 ### Cluster management
+#### Dynamic capacity for Kubernetes-based deployments
+
+Druid can now dynamically tune the task runner capacity.
+
+Include the `capacity` field in a POST API call to `/druid/indexer/v1/k8s/taskrunner/executionconfig`. Setting a value this way overrides `druid.indexer.runner.capacity`.
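A minimal sketch of such a call, assuming the Overlord listens on its default port and that the request body is a JSON object carrying the `capacity` field (the payload shape beyond the field name is an assumption):

```shell
# Override druid.indexer.runner.capacity at runtime on a Kubernetes-based deployment.
# Assumes an Overlord at localhost:8090; the payload shape is an assumption.
curl -X POST http://localhost:8090/druid/indexer/v1/k8s/taskrunner/executionconfig \
  -H 'Content-Type: application/json' \
  -d '{"capacity": 10}'
```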
+
+[#18591](https://github.com/apache/druid/pull/18591)
+
+#### Server properties table
+
+The `server_properties` table exposes the runtime properties configured for each Druid server. Each row represents a single property key-value pair associated with a specific server.

Review Comment:
```suggestion
The `sys.server_properties` table exposes the runtime properties configured for each Druid server. Each row represents a single property key-value pair associated with a specific server.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
