cryptoe commented on code in PR #18231: URL: https://github.com/apache/druid/pull/18231#discussion_r2218916187
########## docs/release-info/release-notes.md: ########## @@ -57,63 +57,308 @@ For tips about how to write a good release note, see [Release notes](https://git This section contains important information about new and existing features. +#### Improved HTTP endpoints + +You can now use raw SQL in the HTTP body for `/druid/v2/sql` endpoints. You can set `Content-Type` to `text/plain` instead of `application/json`, so you can provide raw text that isn't escaped. + +[#17937](https://github.com/apache/druid/pull/17937) + +Additionally, SQL requests can now include multiple SET statements to build up context for the final statement. For example, the following query results in a statement that includes the `timeout`, `useCache`, `populateCache`, and `vectorize` query context parameters: + +```sql +SET timeout = 20000; +SET useCache = false; +SET populateCache = false; +SET vectorize = 'force'; +SELECT "channel", "page", sum("added") from "wikipedia" GROUP BY 1, 2 +``` + +This improvement also works for INSERT and REPLACE queries using the MSQ task engine. Note that JDBC isn't supported. + +[#17974](https://github.com/apache/druid/pull/17974) +### Cloning Historicals + +You can now configure clones for Historicals using the dynamic Coordinator configuration `cloneServers`. Cloned Historicals are useful for situations such as rolling updates where you want to launch a new Historical as a replacement for an existing one. + +Set the config to a map from the target Historical server to the source Historical: + +``` + "cloneServers": {"historicalClone":"historicalOriginal"} +``` + +The clone doesn't participate in regular segment assignment or balancing. Instead, the Coordinator mirrors any segment assignment made to the original Historical onto the clone, so that the clone becomes an exact copy of the source. Segments on the clone Historical do not count towards replica counts either. 
If the original Historical disappears, the clone remains in the last known state of the source server until removed from the `cloneServers` config. + +When you query your data using the native query engine, you can prefer (`preferClones`), exclude (`excludeClones`), or include (`includeClones`) clones by setting the query context parameter `cloneQueryMode`. By default, clones are excluded. + +As part of this change, new Coordinator APIs are available. For more information, see [Coordinator APIs for clones](#coordinator-apis-for-clones). + +[#17863](https://github.com/apache/druid/pull/17863) [#17899](https://github.com/apache/druid/pull/17899) [#17956](https://github.com/apache/druid/pull/17956) +### Overlord kill tasks + +You can now run kill tasks directly on the Overlord itself. Running kill tasks on the Overlord provides the following benefits: + +- Unused segments are killed as soon as they're eligible and are killed faster +- Doesn't require a task slot +- Locked intervals are automatically skipped +- Configuration is simpler +- A large number of unused segments doesn't cause issues for them + +This feature is controlled by the following configs: + +- `druid.manager.segments.killUnused.enabled` - Whether the feature is enabled or not +- `druid.manager.segments.killUnused.bufferPeriod` - The amount of time that a segment must be unused before it is able to be permanently removed from metadata and deep storage. This can serve as a buffer period to prevent data loss if data ends up being needed after being marked unused. + +As part of this feature, [new metrics](#overlord-kill-task-metrics) have been added. + Review Comment: We should also mention that we need to enable the segment metadata cache feature here; only then can embedded kill tasks be enabled. 
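The raw-SQL `text/plain` request described at the top of this hunk could be exercised roughly as follows. This is a sketch, not text from the PR: the Router address `localhost:8888` is an assumption, and the query reuses the wikipedia example from the quoted release note.

```shell
# Raw SQL body with SET statements; with Content-Type: text/plain
# the body needs no JSON escaping.
SQL_BODY='SET timeout = 20000;
SET useCache = false;
SELECT "channel", "page", SUM("added") FROM "wikipedia" GROUP BY 1, 2'
printf '%s\n' "$SQL_BODY"

# Against a running cluster (Router address is an assumption):
# curl -X POST 'http://localhost:8888/druid/v2/sql' \
#   -H 'Content-Type: text/plain' \
#   --data-binary "$SQL_BODY"
```

With `application/json`, the same statements would have to be escaped into a single JSON string field, which is exactly what the `text/plain` option avoids.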
########## docs/release-info/release-notes.md: ########## @@ -57,63 +57,308 @@ For tips about how to write a good release note, see [Release notes](https://git This section contains important information about new and existing features. +#### Improved HTTP endpoints + +You can now use raw SQL in the HTTP body for `/druid/v2/sql` endpoints. You can set `Content-Type` to `text/plain` instead of `application/json`, so you can provide raw text that isn't escaped. + +[#17937](https://github.com/apache/druid/pull/17937) + +Additionally, SQL requests can now include multiple SET statements to build up context for the final statement. For example, the following query results in a statement that includes the `timeout`, `useCache`, `populateCache`, and `vectorize` query context parameters: + +```sql +SET timeout = 20000; +SET useCache = false; +SET populateCache = false; +SET vectorize = 'force'; +SELECT "channel", "page", sum("added") from "wikipedia" GROUP BY 1, 2 +``` + +This improvement also works for INSERT and REPLACE queries using the MSQ task engine. Note that JDBC isn't supported. + +[#17974](https://github.com/apache/druid/pull/17974) +### Cloning Historicals + +You can now configure clones for Historicals using the dynamic Coordinator configuration `cloneServers`. Cloned Historicals are useful for situations such as rolling updates where you want to launch a new Historical as a replacement for an existing one. + +Set the config to a map from the target Historical server to the source Historical: + +``` + "cloneServers": {"historicalClone":"historicalOriginal"} +``` + +The clone doesn't participate in regular segment assignment or balancing. Instead, the Coordinator mirrors any segment assignment made to the original Historical onto the clone, so that the clone becomes an exact copy of the source. Segments on the clone Historical do not count towards replica counts either. 
If the original Historical disappears, the clone remains in the last known state of the source server until removed from the `cloneServers` config. + +When you query your data using the native query engine, you can prefer (`preferClones`), exclude (`excludeClones`), or include (`includeClones`) clones by setting the query context parameter `cloneQueryMode`. By default, clones are excluded. + +As part of this change, new Coordinator APIs are available. For more information, see [Coordinator APIs for clones](#coordinator-apis-for-clones). + +[#17863](https://github.com/apache/druid/pull/17863) [#17899](https://github.com/apache/druid/pull/17899) [#17956](https://github.com/apache/druid/pull/17956) +### Overlord kill tasks + +You can now run kill tasks directly on the Overlord itself. Running kill tasks on the Overlord provides the following benefits: + +- Unused segments are killed as soon as they're eligible and are killed faster +- Doesn't require a task slot +- Locked intervals are automatically skipped +- Configuration is simpler +- A large number of unused segments doesn't cause issues for them + +This feature is controlled by the following configs: + +- `druid.manager.segments.killUnused.enabled` - Whether the feature is enabled or not +- `druid.manager.segments.killUnused.bufferPeriod` - The amount of time that a segment must be unused before it is able to be permanently removed from metadata and deep storage. This can serve as a buffer period to prevent data loss if data ends up being needed after being marked unused. + +As part of this feature, [new metrics](#overlord-kill-task-metrics) have been added. + +[#18028](https://github.com/apache/druid/pull/18028) + +### Preferred tier selection +You can now configure the Broker service to prefer Historicals on a specific tier. This can help ensure Druid executes queries within the same availability zone if you have Druid deployed across multiple availability zones. 
+ +[#18136](https://github.com/apache/druid/pull/18136) + +### Dart improvements NEED TO WRITE + +Dart-specific endpoints have been removed and folded into `SqlResource`. [#18003](https://github.com/apache/druid/pull/18003) +Added a new `engine` query context parameter. The value can be `native` or `msq-dart`. The value determines the engine used to run the query. The default value is `native`. [#18003](https://github.com/apache/druid/pull/18003) + +MSQ Dart is now able to query real-time tasks by setting the query context parameter `includeSegmentSource` to `realtime`, in a similar way to MSQ tasks. [#18076](https://github.com/apache/druid/pull/18076) + +### `SegmentMetadataCache` on the Coordinator + +[#17996](https://github.com/apache/druid/pull/17996) [#17935](https://github.com/apache/druid/pull/17935) + ## Functional area and related changes This section contains detailed release notes separated by areas. ### Web console +#### SET statements + +The web console supports using SET statements to specify query context parameters. For example, if you include `SET timeout = 20000;` in your query, the `timeout` query context parameter is set. Review Comment: This should be a headline feature. Let's add an example which hits the sql endpoint, executes dart, and has timeout set. ########## docs/release-info/release-notes.md: ########## @@ -57,63 +57,308 @@ For tips about how to write a good release note, see [Release notes](https://git This section contains important information about new and existing features. +#### Improved HTTP endpoints + +You can now use raw SQL in the HTTP body for `/druid/v2/sql` endpoints. You can set `Content-Type` to `text/plain` instead of `application/json`, so you can provide raw text that isn't escaped. + +[#17937](https://github.com/apache/druid/pull/17937) + +Additionally, SQL requests can now include multiple SET statements to build up context for the final statement.
For example, the following query results in a statement that includes the `timeout`, `useCache`, `populateCache`, and `vectorize` query context parameters: + +```sql +SET timeout = 20000; +SET useCache = false; +SET populateCache = false; +SET vectorize = 'force'; +SELECT "channel", "page", sum("added") from "wikipedia" GROUP BY 1, 2 +``` + +This improvement also works for INSERT and REPLACE queries using the MSQ task engine. Note that JDBC isn't supported. + +[#17974](https://github.com/apache/druid/pull/17974) +### Cloning Historicals + +You can now configure clones for Historicals using the dynamic Coordinator configuration `cloneServers`. Cloned Historicals are useful for situations such as rolling updates where you want to launch a new Historical as a replacement for an existing one. + +Set the config to a map from the target Historical server to the source Historical: + +``` + "cloneServers": {"historicalClone":"historicalOriginal"} +``` + +The clone doesn't participate in regular segment assignment or balancing. Instead, the Coordinator mirrors any segment assignment made to the original Historical onto the clone, so that the clone becomes an exact copy of the source. Segments on the clone Historical do not count towards replica counts either. If the original Historical disappears, the clone remains in the last known state of the source server until removed from the `cloneServers` config. + +When you query your data using the native query engine, you can prefer (`preferClones`), exclude (`excludeClones`), or include (`includeClones`) clones by setting the query context parameter `cloneQueryMode`. By default, clones are excluded. + +As part of this change, new Coordinator APIs are available. For more information, see [Coordinator APIs for clones](#coordinator-apis-for-clones). 
+ +[#17863](https://github.com/apache/druid/pull/17863) [#17899](https://github.com/apache/druid/pull/17899) [#17956](https://github.com/apache/druid/pull/17956) +### Overlord kill tasks + +You can now run kill tasks directly on the Overlord itself. Running kill tasks on the Overlord provides the following benefits: + +- Unused segments are killed as soon as they're eligible and are killed faster +- Doesn't require a task slot +- Locked intervals are automatically skipped +- Configuration is simpler +- A large number of unused segments doesn't cause issues for them + +This feature is controlled by the following configs: + +- `druid.manager.segments.killUnused.enabled` - Whether the feature is enabled or not +- `druid.manager.segments.killUnused.bufferPeriod` - The amount of time that a segment must be unused before it is able to be permanently removed from metadata and deep storage. This can serve as a buffer period to prevent data loss if data ends up being needed after being marked unused. + +As part of this feature, [new metrics](#overlord-kill-task-metrics) have been added. + +[#18028](https://github.com/apache/druid/pull/18028) + +### Preferred tier selection +You can now configure the Broker service to prefer Historicals on a specific tier. This can help ensure Druid executes queries within the same availability zone if you have Druid deployed across multiple availability zones. Review Comment: Waiting on PR author for release notes : https://github.com/apache/druid/pull/18136#issuecomment-3096315977 ########## docs/release-info/release-notes.md: ########## @@ -57,63 +57,308 @@ For tips about how to write a good release note, see [Release notes](https://git This section contains important information about new and existing features. +#### Improved HTTP endpoints + +You can now use raw SQL in the HTTP body for `/druid/v2/sql` endpoints. You can set `Content-Type` to `text/plain` instead of `application/json`, so you can provide raw text that isn't escaped. 
+ +[#17937](https://github.com/apache/druid/pull/17937) + +Additionally, SQL requests can now include multiple SET statements to build up context for the final statement. For example, the following query results in a statement that includes the `timeout`, `useCache`, `populateCache`, and `vectorize` query context parameters: + +```sql +SET timeout = 20000; +SET useCache = false; +SET populateCache = false; +SET vectorize = 'force'; +SELECT "channel", "page", sum("added") from "wikipedia" GROUP BY 1, 2 +``` + +This improvement also works for INSERT and REPLACE queries using the MSQ task engine. Note that JDBC isn't supported. + +[#17974](https://github.com/apache/druid/pull/17974) +### Cloning Historicals + +You can now configure clones for Historicals using the dynamic Coordinator configuration `cloneServers`. Cloned Historicals are useful for situations such as rolling updates where you want to launch a new Historical as a replacement for an existing one. + +Set the config to a map from the target Historical server to the source Historical: + +``` + "cloneServers": {"historicalClone":"historicalOriginal"} +``` + +The clone doesn't participate in regular segment assignment or balancing. Instead, the Coordinator mirrors any segment assignment made to the original Historical onto the clone, so that the clone becomes an exact copy of the source. Segments on the clone Historical do not count towards replica counts either. If the original Historical disappears, the clone remains in the last known state of the source server until removed from the `cloneServers` config. + +When you query your data using the native query engine, you can prefer (`preferClones`), exclude (`excludeClones`), or include (`includeClones`) clones by setting the query context parameter `cloneQueryMode`. By default, clones are excluded. + +As part of this change, new Coordinator APIs are available. For more information, see [Coordinator APIs for clones](#coordinator-apis-for-clones). 
+ +[#17863](https://github.com/apache/druid/pull/17863) [#17899](https://github.com/apache/druid/pull/17899) [#17956](https://github.com/apache/druid/pull/17956) +### Overlord kill tasks Review Comment: ```suggestion ### Embedded kill tasks on the Overlord (Experimental) ``` ########## docs/release-info/release-notes.md: ########## @@ -57,63 +57,308 @@ For tips about how to write a good release note, see [Release notes](https://git This section contains important information about new and existing features. +#### Improved HTTP endpoints + +You can now use raw SQL in the HTTP body for `/druid/v2/sql` endpoints. You can set `Content-Type` to `text/plain` instead of `application/json`, so you can provide raw text that isn't escaped. + +[#17937](https://github.com/apache/druid/pull/17937) + +Additionally, SQL requests can now include multiple SET statements to build up context for the final statement. For example, the following query results in a statement that includes the `timeout`, `useCache`, `populateCache`, and `vectorize` query context parameters: + +```sql +SET timeout = 20000; +SET useCache = false; +SET populateCache = false; +SET vectorize = 'force'; +SELECT "channel", "page", sum("added") from "wikipedia" GROUP BY 1, 2 +``` + +This improvement also works for INSERT and REPLACE queries using the MSQ task engine. Note that JDBC isn't supported. + +[#17974](https://github.com/apache/druid/pull/17974) +### Cloning Historicals + +You can now configure clones for Historicals using the dynamic Coordinator configuration `cloneServers`. Cloned Historicals are useful for situations such as rolling updates where you want to launch a new Historical as a replacement for an existing one. + +Set the config to a map from the target Historical server to the source Historical: + +``` + "cloneServers": {"historicalClone":"historicalOriginal"} +``` + +The clone doesn't participate in regular segment assignment or balancing. 
Instead, the Coordinator mirrors any segment assignment made to the original Historical onto the clone, so that the clone becomes an exact copy of the source. Segments on the clone Historical do not count towards replica counts either. If the original Historical disappears, the clone remains in the last known state of the source server until removed from the `cloneServers` config. + +When you query your data using the native query engine, you can prefer (`preferClones`), exclude (`excludeClones`), or include (`includeClones`) clones by setting the query context parameter `cloneQueryMode`. By default, clones are excluded. + +As part of this change, new Coordinator APIs are available. For more information, see [Coordinator APIs for clones](#coordinator-apis-for-clones). + +[#17863](https://github.com/apache/druid/pull/17863) [#17899](https://github.com/apache/druid/pull/17899) [#17956](https://github.com/apache/druid/pull/17956) +### Overlord kill tasks + +You can now run kill tasks directly on the Overlord itself. Running kill tasks on the Overlord provides the following benefits: + +- Unused segments are killed as soon as they're eligible and are killed faster +- Doesn't require a task slot +- Locked intervals are automatically skipped +- Configuration is simpler +- A large number of unused segments doesn't cause issues for them + +This feature is controlled by the following configs: + +- `druid.manager.segments.killUnused.enabled` - Whether the feature is enabled or not +- `druid.manager.segments.killUnused.bufferPeriod` - The amount of time that a segment must be unused before it is able to be permanently removed from metadata and deep storage. This can serve as a buffer period to prevent data loss if data ends up being needed after being marked unused. + +As part of this feature, [new metrics](#overlord-kill-task-metrics) have been added. 
+ +[#18028](https://github.com/apache/druid/pull/18028) + +### Preferred tier selection +You can now configure the Broker service to prefer Historicals on a specific tier. This can help ensure Druid executes queries within the same availability zone if you have Druid deployed across multiple availability zones. + +[#18136](https://github.com/apache/druid/pull/18136) + +### Dart improvements NEED TO WRITE + +Dart-specific endpoints have been removed and folded into `SqlResource`. [#18003](https://github.com/apache/druid/pull/18003) +Added a new `engine` query context parameter. The value can be `native` or `msq-dart`. The value determines the engine used to run the query. The default value is `native`. [#18003](https://github.com/apache/druid/pull/18003) + +MSQ Dart is now able to query real-time tasks by setting the query context parameter `includeSegmentSource` to `realtime`, in a similar way to MSQ tasks. [#18076](https://github.com/apache/druid/pull/18076) Review Comment: Let's add an example where we hit the sql resource with `engine=msq-dart` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
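A sketch of the example the review comment asks for, selecting the Dart engine and a timeout through the `context` field of a `/druid/v2/sql` JSON request. The Router address and the query itself are illustrative assumptions, not taken from the PR:

```shell
# JSON body for POST /druid/v2/sql that runs the query on the Dart engine
# with a 20 s timeout via the query context.
PAYLOAD='{
  "query": "SELECT \"channel\", COUNT(*) AS cnt FROM \"wikipedia\" GROUP BY 1",
  "context": {"engine": "msq-dart", "timeout": 20000}
}'
printf '%s\n' "$PAYLOAD"

# Against a running cluster (Router address is an assumption):
# curl -X POST 'http://localhost:8888/druid/v2/sql' \
#   -H 'Content-Type: application/json' \
#   -d "$PAYLOAD"
```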

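For the `cloneServers` setting discussed earlier in the thread, the release note shows only the config fragment. A hedged sketch of applying it through the Coordinator's existing dynamic-config endpoint follows; the Coordinator address and server names are illustrative, and note that a POST to this endpoint submits the whole dynamic config, so any other fields you rely on should be included in the same body:

```shell
# Dynamic Coordinator config mapping a clone Historical to its source,
# using the server names from the quoted release note.
CONFIG='{
  "cloneServers": {"historicalClone": "historicalOriginal"}
}'
printf '%s\n' "$CONFIG"

# Applied via the Coordinator dynamic-config endpoint (address is an assumption):
# curl -X POST 'http://localhost:8081/druid/coordinator/v1/config' \
#   -H 'Content-Type: application/json' \
#   -d "$CONFIG"
```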