techdocsmith commented on code in PR #14492: URL: https://github.com/apache/druid/pull/14492#discussion_r1253694610
########## docs/api-reference/tasks-api.md: ##########
@@ -25,77 +25,1565 @@ sidebar_label: Tasks This document describes the API endpoints for task retrieval, submission, and deletion for Apache Druid.

Review Comment:
I wonder if a brief definition of a task would be appropriate here. For example: https://druid.apache.org/docs/latest/ingestion/tasks.html
However, in addition to ingestion, I think tasks are going to start doing some of the query-from-deep-storage work. Check with @317brian about those changes.

########## docs/api-reference/tasks-api.md: ##########
@@ -25,77 +25,1565 @@ sidebar_label: Tasks This document describes the API endpoints for task retrieval, submission, and deletion for Apache Druid. -## Tasks +In this document, `{domain}` is a placeholder for the server address of deployment. For example, on the quickstart configuration, replace `{domain}` with `http://localhost:8888`. -Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` -as in `2016-06-27_2016-06-28`. +For query parameters that take an interval, provide ISO 8601 strings delimited by `_` instead of `/`. For example, `2023-06-27_2023-06-28`. -`GET /druid/indexer/v1/tasks` +## Task information and retrieval -Retrieve list of tasks. Accepts query string parameters `state`, `datasource`, `createdTimeInterval`, `max`, and `type`. +### Get an array of tasks -|Query Parameter |Description | -|---|---| -|`state`|filter list of tasks by task state, valid options are `running`, `complete`, `waiting`, and `pending`.| -| `datasource`| return tasks filtered by Druid datasource.| -| `createdTimeInterval`| return tasks created within the specified interval. | -| `max`| maximum number of `"complete"` tasks to return. Only applies when `state` is set to `"complete"`.| -| `type`| filter tasks by task type.
See [task documentation](../ingestion/tasks.md) for more details.| +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/tasks` +Retrieves an array of all tasks in the Druid cluster. Each task object includes information on its ID, status, associated datasource, and other metadata. -`GET /druid/indexer/v1/completeTasks` +#### Query parameters -Retrieve list of complete tasks. Equivalent to `/druid/indexer/v1/tasks?state=complete`. +The endpoint supports a set of optional query parameters to filter results. -`GET /druid/indexer/v1/runningTasks` +|Parameter|Type|Description| +|---|---|---| +|`state`|String|Filter list of tasks by task state, valid options are `running`, `complete`, `waiting`, and `pending`.| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| -Retrieve list of running tasks. Equivalent to `/druid/indexer/v1/tasks?state=running`. +#### Responses -`GET /druid/indexer/v1/waitingTasks` +<!--DOCUSAURUS_CODE_TABS--> -Retrieve list of waiting tasks. Equivalent to `/druid/indexer/v1/tasks?state=waiting`. 
+<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of tasks* +<!--400 BAD REQUEST--> +<br/> +*Invalid `state` query parameter value* +<!--500 SERVER ERROR--> +<br/> +*Invalid query parameter* +<!--END_DOCUSAURUS_CODE_TABS--> -`GET /druid/indexer/v1/pendingTasks` +--- + +#### Sample request + +The following example shows how to retrieve a list of tasks filtered with the following query parameters: +* State: `complete` +* Datasource: `wikipedia_api` +* Time interval: between `2015-09-12` and `2015-09-13` +* Max entries returned: `10` +* Task type: `query_worker` + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12T00%3A00%3A00Z%2F2015-09-13T23%3A59%3A59Z&max=10&type=query_worker" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12T00%3A00%3A00Z%2F2015-09-13T23%3A59%3A59Z&max=10&type=query_worker HTTP/1.1 +Host: {domain} +``` + +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_worker", + "createdTime": "2023-06-22T22:11:37.012Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17897, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f-worker0_0", + "groupId": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f", + "type": "query_worker", + "createdTime": "2023-06-20T22:51:21.302Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 
16911, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-5419da7a-b270-492f-90e6-920ecfba766a-worker0_0", + "groupId": "query-5419da7a-b270-492f-90e6-920ecfba766a", + "type": "query_worker", + "createdTime": "2023-06-20T22:45:53.909Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17030, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of complete tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/completeTasks` + +Retrieves an array of completed tasks in the Druid cluster. This is functionally equivalent to `/druid/indexer/v1/tasks?state=complete`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. 
See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of complete tasks* +<!--404 NOT FOUND--> +<br/> +*Request sent to incorrect service* +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/completeTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/completeTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_worker", + "createdTime": "2023-06-22T22:11:37.012Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17897, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_controller", + "createdTime": "2023-06-22T22:11:28.367Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 30317, + "location": { + "host": "localhost", + "port": 8100, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of running tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/runningTasks` + +Retrieves an array of running task objects in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=running`. 
+ +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of running tasks* + +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/runningTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/runningTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "groupId": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "type": "query_controller", + "createdTime": "2023-06-22T22:54:43.170Z", + "queueInsertionTime": "2023-06-22T22:54:43.170Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "RUNNING", + "duration": -1, + "location": { + "host": "localhost", + "port": 8100, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of waiting tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/waitingTasks` + +Retrieves an array of waiting tasks in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=waiting`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. 
+ +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of waiting tasks* + +<!--END_DOCUSAURUS_CODE_TABS--> -Retrieve list of pending tasks. Equivalent to `/druid/indexer/v1/tasks?state=pending`. +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/waitingTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/waitingTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z", + "groupId": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:05.217Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + }, + { + "id": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z", + "groupId": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:05.548Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": 
-1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + }, + { + "id": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z", + "groupId": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:06.671Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of pending tasks + +#### URL + +<code class="getAPI">GET</code> `/druid/indexer/v1/pendingTasks` + +Retrieves an array of pending tasks in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=pending`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| -`GET /druid/indexer/v1/task/{taskId}` +#### Responses -Retrieve the 'payload' of a task. +<!--DOCUSAURUS_CODE_TABS--> -`GET /druid/indexer/v1/task/{taskId}/status` +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of pending tasks* +<!--END_DOCUSAURUS_CODE_TABS--> -Retrieve the status of a task. 
+--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/pendingTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/pendingTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67", + "groupId": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67", + "type": "query_controller", + "createdTime": "2023-06-23T19:53:06.037Z", + "queueInsertionTime": "2023-06-23T19:53:06.037Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "PENDING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36", + "groupId": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36", + "type": "query_controller", + "createdTime": "2023-06-23T19:53:06.616Z", + "queueInsertionTime": "2023-06-23T19:53:06.616Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "PENDING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> -`GET /druid/indexer/v1/task/{taskId}/segments` +### Get task payload + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}` + +Retrieves the payload of a task given the task ID. It returns a JSON object with the task ID and payload that includes task configuration details and relevant specifications associated with the execution of the task. 
+ +#### Responses +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved payload of task* +<!--404 NOT FOUND--> +<br/> +*Cannot find task with ID* + +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +The following examples shows how to retrieve the task payload of a task with the specified ID `query-32663269-ead9-405a-8eb6-0817a952ef47`. + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/task/query-32663269-ead9-405a-8eb6-0817a952ef47" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/task/query-32663269-ead9-405a-8eb6-0817a952ef47 HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +

Review Comment:
Is it possible to use an example with response payload? For demonstration purposes, all those fields aren't doing much work. Hiding the response helps, but it's still a big payload.

########## docs/api-reference/tasks-api.md: ##########
@@ -25,77 +25,1565 @@ sidebar_label: Tasks This document describes the API endpoints for task retrieval, submission, and deletion for Apache Druid. -## Tasks +In this document, `{domain}` is a placeholder for the server address of deployment. For example, on the quickstart configuration, replace `{domain}` with `http://localhost:8888`. -Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` -as in `2016-06-27_2016-06-28`. +For query parameters that take an interval, provide ISO 8601 strings delimited by `_` instead of `/`. For example, `2023-06-27_2023-06-28`. -`GET /druid/indexer/v1/tasks` +## Task information and retrieval -Retrieve list of tasks. Accepts query string parameters `state`, `datasource`, `createdTimeInterval`, `max`, and `type`.
+### Get an array of tasks -|Query Parameter |Description | -|---|---| -|`state`|filter list of tasks by task state, valid options are `running`, `complete`, `waiting`, and `pending`.| -| `datasource`| return tasks filtered by Druid datasource.| -| `createdTimeInterval`| return tasks created within the specified interval. | -| `max`| maximum number of `"complete"` tasks to return. Only applies when `state` is set to `"complete"`.| -| `type`| filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/tasks` +Retrieves an array of all tasks in the Druid cluster. Each task object includes information on its ID, status, associated datasource, and other metadata. -`GET /druid/indexer/v1/completeTasks` +#### Query parameters -Retrieve list of complete tasks. Equivalent to `/druid/indexer/v1/tasks?state=complete`. +The endpoint supports a set of optional query parameters to filter results. -`GET /druid/indexer/v1/runningTasks` +|Parameter|Type|Description| +|---|---|---| +|`state`|String|Filter list of tasks by task state, valid options are `running`, `complete`, `waiting`, and `pending`.| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| -Retrieve list of running tasks. Equivalent to `/druid/indexer/v1/tasks?state=running`. +#### Responses -`GET /druid/indexer/v1/waitingTasks` +<!--DOCUSAURUS_CODE_TABS--> -Retrieve list of waiting tasks. Equivalent to `/druid/indexer/v1/tasks?state=waiting`. 
+<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of tasks* +<!--400 BAD REQUEST--> +<br/> +*Invalid `state` query parameter value* +<!--500 SERVER ERROR--> +<br/> +*Invalid query parameter* +<!--END_DOCUSAURUS_CODE_TABS--> -`GET /druid/indexer/v1/pendingTasks` +--- + +#### Sample request + +The following example shows how to retrieve a list of tasks filtered with the following query parameters: +* State: `complete` +* Datasource: `wikipedia_api` +* Time interval: between `2015-09-12` and `2015-09-13` +* Max entries returned: `10` +* Task type: `query_worker` + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12T00%3A00%3A00Z%2F2015-09-13T23%3A59%3A59Z&max=10&type=query_worker" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12T00%3A00%3A00Z%2F2015-09-13T23%3A59%3A59Z&max=10&type=query_worker HTTP/1.1 +Host: {domain} +``` + +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_worker", + "createdTime": "2023-06-22T22:11:37.012Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17897, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f-worker0_0", + "groupId": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f", + "type": "query_worker", + "createdTime": "2023-06-20T22:51:21.302Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 
16911, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-5419da7a-b270-492f-90e6-920ecfba766a-worker0_0", + "groupId": "query-5419da7a-b270-492f-90e6-920ecfba766a", + "type": "query_worker", + "createdTime": "2023-06-20T22:45:53.909Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17030, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of complete tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/completeTasks` + +Retrieves an array of completed tasks in the Druid cluster. This is functionally equivalent to `/druid/indexer/v1/tasks?state=complete`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. 
See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of complete tasks* +<!--404 NOT FOUND--> +<br/> +*Request sent to incorrect service* +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/completeTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/completeTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_worker", + "createdTime": "2023-06-22T22:11:37.012Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17897, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_controller", + "createdTime": "2023-06-22T22:11:28.367Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 30317, + "location": { + "host": "localhost", + "port": 8100, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of running tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/runningTasks` + +Retrieves an array of running task objects in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=running`. 
+ +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of running tasks* + +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/runningTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/runningTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "groupId": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "type": "query_controller", + "createdTime": "2023-06-22T22:54:43.170Z", + "queueInsertionTime": "2023-06-22T22:54:43.170Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "RUNNING", + "duration": -1, + "location": { + "host": "localhost", + "port": 8100, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of waiting tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/waitingTasks` + +Retrieves an array of waiting tasks in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=waiting`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. 
+ +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of waiting tasks* + +<!--END_DOCUSAURUS_CODE_TABS--> -Retrieve list of pending tasks. Equivalent to `/druid/indexer/v1/tasks?state=pending`. +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/waitingTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/waitingTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z", + "groupId": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:05.217Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + }, + { + "id": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z", + "groupId": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:05.548Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": 
-1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + }, + { + "id": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z", + "groupId": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:06.671Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of pending tasks + +#### URL + +<code class="getAPI">GET</code> `/druid/indexer/v1/pendingTasks` + +Retrieves an array of pending tasks in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=pending`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| -`GET /druid/indexer/v1/task/{taskId}` +#### Responses -Retrieve the 'payload' of a task. +<!--DOCUSAURUS_CODE_TABS--> -`GET /druid/indexer/v1/task/{taskId}/status` +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of pending tasks* +<!--END_DOCUSAURUS_CODE_TABS--> -Retrieve the status of a task. 
+--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/pendingTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/pendingTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67", + "groupId": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67", + "type": "query_controller", + "createdTime": "2023-06-23T19:53:06.037Z", + "queueInsertionTime": "2023-06-23T19:53:06.037Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "PENDING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36", + "groupId": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36", + "type": "query_controller", + "createdTime": "2023-06-23T19:53:06.616Z", + "queueInsertionTime": "2023-06-23T19:53:06.616Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "PENDING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> -`GET /druid/indexer/v1/task/{taskId}/segments` +### Get task payload + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}` + +Retrieves the payload of a task given the task ID. It returns a JSON object with the task ID and payload that includes task configuration details and relevant specifications associated with the execution of the task. 
+ +#### Responses +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved payload of task* +<!--404 NOT FOUND--> +<br/> +*Cannot find task with ID* + +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +The following example shows how to retrieve the task payload of a task with the specified ID `query-32663269-ead9-405a-8eb6-0817a952ef47`. + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/task/query-32663269-ead9-405a-8eb6-0817a952ef47" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/task/query-32663269-ead9-405a-8eb6-0817a952ef47 HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + { + "task": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "payload": { + "type": "query_controller", + "id": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "spec": { + "query": { + "queryType": "scan", + "dataSource": { + "type": "external", + "inputSource": { + "type": "http", + "uris": [ + "https://druid.apache.org/data/wikipedia.json.gz" + ] + }, + "inputFormat": { + "type": "json", + "keepNullColumns": false, + "assumeNewlineDelimited": false, + "useJsonNodeReader": false + }, + "signature": [ + { + "name": "added", + "type": "LONG" + }, + { + "name": "channel", + "type": "STRING" + }, + { + "name": "cityName", + "type": "STRING" + }, + { + "name": "comment", + "type": "STRING" + }, + { + "name": "commentLength", + "type": "LONG" + }, + { + "name": "countryIsoCode", + "type": "STRING" + }, + { + "name": "countryName", + "type": "STRING" + }, + { + "name": "deleted", + "type": "LONG" + }, + { + "name": "delta", + "type": "LONG" + }, + { + "name": "deltaBucket", + "type": "STRING" + }, + { + "name": "diffUrl", + "type": "STRING" + }, + { + "name": "flags", + "type": "STRING" + }, + { + "name": "isAnonymous", + "type": "STRING" + }, + { + "name": "isMinor", + "type": "STRING" + }, + { +
"name": "isNew", + "type": "STRING" + }, + { + "name": "isRobot", + "type": "STRING" + }, + { + "name": "isUnpatrolled", + "type": "STRING" + }, + { + "name": "metroCode", + "type": "STRING" + }, + { + "name": "namespace", + "type": "STRING" + }, + { + "name": "page", + "type": "STRING" + }, + { + "name": "regionIsoCode", + "type": "STRING" + }, + { + "name": "regionName", + "type": "STRING" + }, + { + "name": "timestamp", + "type": "STRING" + }, + { + "name": "user", + "type": "STRING" + } + ] + }, + "intervals": { + "type": "intervals", + "intervals": [ + "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z" + ] + }, + "virtualColumns": [ + { + "type": "expression", + "name": "v0", + "expression": "timestamp_parse(\"timestamp\",null,'UTC')", + "outputType": "LONG" + } + ], + "resultFormat": "compactedList", + "columns": [ + "added", + "channel", + "cityName", + "comment", + "commentLength", + "countryIsoCode", + "countryName", + "deleted", + "delta", + "deltaBucket", + "diffUrl", + "flags", + "isAnonymous", + "isMinor", + "isNew", + "isRobot", + "isUnpatrolled", + "metroCode", + "namespace", + "page", + "regionIsoCode", + "regionName", + "timestamp", + "user", + "v0" + ], + "legacy": false, + "context": { + "finalize": true, + "maxNumTasks": 3, + "maxParseExceptions": 0, + "queryId": "32663269-ead9-405a-8eb6-0817a952ef47", + "scanSignature": 
"[{\"name\":\"added\",\"type\":\"LONG\"},{\"name\":\"channel\",\"type\":\"STRING\"},{\"name\":\"cityName\",\"type\":\"STRING\"},{\"name\":\"comment\",\"type\":\"STRING\"},{\"name\":\"commentLength\",\"type\":\"LONG\"},{\"name\":\"countryIsoCode\",\"type\":\"STRING\"},{\"name\":\"countryName\",\"type\":\"STRING\"},{\"name\":\"deleted\",\"type\":\"LONG\"},{\"name\":\"delta\",\"type\":\"LONG\"},{\"name\":\"deltaBucket\",\"type\":\"STRING\"},{\"name\":\"diffUrl\",\"type\":\"STRING\"},{\"name\":\"flags\",\"type\":\"STRING\"},{\"name\":\"isAnonymous\",\"type\":\"STRING\"},{\"name\":\"isMinor\",\"type\":\"STRING\"},{\"name\":\"isNew\",\"type\":\"STRING\"},{\"name\":\"isRobot\",\"type\":\"STRING\"},{\"name\":\"isUnpatrolled\",\"type\":\"STRING\"},{\"name\":\"metroCode\",\"type\":\"STRING\"},{\"name\":\"namespace\",\"type\":\"STRING\"},{\"name\":\"page\",\"type\":\"STRING\"},{\"name\":\"regionIsoCode\",\"type\":\"STRING\"},{\"name\":\"regionName\",\"type\ ":\"STRING\"},{\"name\":\"timestamp\",\"type\":\"STRING\"},{\"name\":\"user\",\"type\":\"STRING\"},{\"name\":\"v0\",\"type\":\"LONG\"}]", + "sqlInsertSegmentGranularity": "\"DAY\"", + "sqlQueryId": "32663269-ead9-405a-8eb6-0817a952ef47" + }, + "granularity": { + "type": "all" + } + }, + "columnMappings": [ + { + "queryColumn": "v0", + "outputColumn": "__time" + }, + { + "queryColumn": "added", + "outputColumn": "added" + }, + { + "queryColumn": "channel", + "outputColumn": "channel" + }, + { + "queryColumn": "cityName", + "outputColumn": "cityName" + }, + { + "queryColumn": "comment", + "outputColumn": "comment" + }, + { + "queryColumn": "commentLength", + "outputColumn": "commentLength" + }, + { + "queryColumn": "countryIsoCode", + "outputColumn": "countryIsoCode" + }, + { + "queryColumn": "countryName", + "outputColumn": "countryName" + }, + { + "queryColumn": "deleted", + "outputColumn": "deleted" + }, + { + "queryColumn": "delta", + "outputColumn": "delta" + }, + { + "queryColumn": "deltaBucket", + "outputColumn": 
"deltaBucket" + }, + { + "queryColumn": "diffUrl", + "outputColumn": "diffUrl" + }, + { + "queryColumn": "flags", + "outputColumn": "flags" + }, + { + "queryColumn": "isAnonymous", + "outputColumn": "isAnonymous" + }, + { + "queryColumn": "isMinor", + "outputColumn": "isMinor" + }, + { + "queryColumn": "isNew", + "outputColumn": "isNew" + }, + { + "queryColumn": "isRobot", + "outputColumn": "isRobot" + }, + { + "queryColumn": "isUnpatrolled", + "outputColumn": "isUnpatrolled" + }, + { + "queryColumn": "metroCode", + "outputColumn": "metroCode" + }, + { + "queryColumn": "namespace", + "outputColumn": "namespace" + }, + { + "queryColumn": "page", + "outputColumn": "page" + }, + { + "queryColumn": "regionIsoCode", + "outputColumn": "regionIsoCode" + }, + { + "queryColumn": "regionName", + "outputColumn": "regionName" + }, + { + "queryColumn": "timestamp", + "outputColumn": "timestamp" + }, + { + "queryColumn": "user", + "outputColumn": "user" + } + ], + "destination": { + "type": "dataSource", + "dataSource": "wikipedia_api", + "segmentGranularity": "DAY" + }, + "assignmentStrategy": "max", + "tuningConfig": { + "maxNumWorkers": 2, + "maxRowsInMemory": 100000, + "rowsPerSegment": 3000000 + } + }, + "sqlQuery": "\nINSERT INTO wikipedia_api \nSELECT \n TIME_PARSE(\"timestamp\") AS __time,\n * \nFROM TABLE(EXTERN(\n '{\"type\": \"http\", \"uris\": [\"https://druid.apache.org/data/wikipedia.json.gz\"]}', \n '{\"type\": \"json\"}', \n '[{\"name\": \"added\", \"type\": \"long\"}, {\"name\": \"channel\", \"type\": \"string\"}, {\"name\": \"cityName\", \"type\": \"string\"}, {\"name\": \"comment\", \"type\": \"string\"}, {\"name\": \"commentLength\", \"type\": \"long\"}, {\"name\": \"countryIsoCode\", \"type\": \"string\"}, {\"name\": \"countryName\", \"type\": \"string\"}, {\"name\": \"deleted\", \"type\": \"long\"}, {\"name\": \"delta\", \"type\": \"long\"}, {\"name\": \"deltaBucket\", \"type\": \"string\"}, {\"name\": \"diffUrl\", \"type\": \"string\"}, {\"name\": 
\"flags\", \"type\": \"string\"}, {\"name\": \"isAnonymous\", \"type\": \"string\"}, {\"name\": \"isMinor\", \"type\": \"string\"}, {\"name\": \"isNew\", \"type\": \"string\"}, {\"name\": \"isRobot\", \"type\ ": \"string\"}, {\"name\": \"isUnpatrolled\", \"type\": \"string\"}, {\"name\": \"metroCode\", \"type\": \"string\"}, {\"name\": \"namespace\", \"type\": \"string\"}, {\"name\": \"page\", \"type\": \"string\"}, {\"name\": \"regionIsoCode\", \"type\": \"string\"}, {\"name\": \"regionName\", \"type\": \"string\"}, {\"name\": \"timestamp\", \"type\": \"string\"}, {\"name\": \"user\", \"type\": \"string\"}]'\n ))\nPARTITIONED BY DAY\n", + "sqlQueryContext": { + "sqlQueryId": "32663269-ead9-405a-8eb6-0817a952ef47", + "sqlInsertSegmentGranularity": "\"DAY\"", + "maxNumTasks": 3, + "queryId": "32663269-ead9-405a-8eb6-0817a952ef47" + }, + "sqlResultsContext": { + "timeZone": "UTC", + "serializeComplexValues": true, + "stringifyArrays": true + }, + "sqlTypeNames": [ + "TIMESTAMP", + "BIGINT", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "BIGINT", + "VARCHAR", + "VARCHAR", + "BIGINT", + "BIGINT", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR" + ], + "context": { + "forceTimeChunkLock": true, + "useLineageBasedSegmentAllocation": true + }, + "groupId": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "dataSource": "wikipedia_api", + "resource": { + "availabilityGroup": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "requiredCapacity": 1 + } + } + } + ``` + +</details> + +### Get task status + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}/status` + +Retrieves the status of a task given the task ID. It returns a JSON object with the task's current state, task type, datasource, and other relevant metadata. 
+ +#### Responses +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved task status* +<!--404 NOT FOUND--> +<br/> +*Cannot find task with ID* +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +The following example shows how to retrieve the status of a task with the specified ID `query-223549f8-b993-4483-b028-1b0d54713cad`. + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/task/query-223549f8-b993-4483-b028-1b0d54713cad/status" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/task/query-223549f8-b993-4483-b028-1b0d54713cad/status HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + + +#### Sample response +<details> + <summary>Click to show sample response</summary> + + ```json + { + "task": "query-223549f8-b993-4483-b028-1b0d54713cad", + "status": { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_controller", + "createdTime": "2023-06-22T22:11:28.367Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "RUNNING", + "duration": -1, + "location": {"host": "localhost", "port": 8100, "tlsPort": -1}, + "dataSource": "wikipedia_api", + "errorMsg": null + } + } + ``` + +</details> + +### Get task segments + +#### URL + +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}/segments` > This API is deprecated and will be removed in future releases. -Retrieve information about the segments of a task. +Retrieves information about segments generated by the task given the task ID. To use this endpoint, enable the audit log config on the Overlord with `druid.indexer.auditLog.enabled = true`. + +In addition to enabling audit logs, configure a cleanup strategy to prevent overloading the metadata store with old audit logs, which may cause performance issues.
To enable automated cleanup of audit logs on the Coordinator, set `druid.coordinator.kill.audit.on`. You may also manually export the audit logs to external storage. For more information, see [Audit records](../operations/clean-metadata-store.md#audit-records). -`GET /druid/indexer/v1/task/{taskId}/reports` +#### Responses +<!--DOCUSAURUS_CODE_TABS--> -Retrieve a [task completion report](../ingestion/tasks.md#task-reports) for a task. Only works for completed tasks. +<!--200 SUCCESS--> +<br/> +*Successfully retrieved task segments* +<!--END_DOCUSAURUS_CODE_TABS--> -`POST /druid/indexer/v1/task` +--- + +#### Sample request + +The following example shows how to retrieve the segments of the task with the specified ID `query-52a8aafe-7265-4427-89fe-dc51275cc470`. + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/segments" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/segments HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +A successful request returns a `200 OK` response and an array of the task segments. + +### Get task log + +#### URL + +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}/log` + +Retrieves the event log associated with a task. It returns a list of events logged during the lifecycle of the task. The endpoint is useful for getting information about the execution of the task, including any errors or warnings raised. + +#### Query parameters +* `offset` (optional) + * Type: Int + * Excludes the given number of entries from the start of the response.
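Because `offset` is optional, a client only appends it when a value is supplied. The following sketch shows one way to build the log URL for both cases. The `task_log_url` helper is hypothetical, and the offset value is passed through unchanged with the meaning described in the list above.

```shell
#!/bin/sh
# Hypothetical helper: build the task log URL.
# Arguments: domain, task ID, optional offset.
task_log_url() {
  url="$1/druid/indexer/v1/task/$2/log"
  if [ -n "${3:-}" ]; then
    # Append the optional query parameter only when a value is given.
    url="${url}?offset=$3"
  fi
  printf '%s\n' "$url"
}
```

For example, `curl "$(task_log_url "http://localhost:8888" "$TASK_ID" 100)"` would request the log with `offset=100`, where `TASK_ID` is a variable the caller is assumed to set.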
+ +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved task log* +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +The following example shows how to retrieve the task log of a task with the specified ID `index_kafka_social_media_0e905aa31037879_nommnaeg`. + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/task/index_kafka_social_media_0e905aa31037879_nommnaeg/log" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/task/index_kafka_social_media_0e905aa31037879_nommnaeg/log HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + 2023-07-03T22:11:17,891 INFO [qtp1251996697-122] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Sequence[index_kafka_social_media_0e905aa31037879_0] end offsets updated from [{0=9223372036854775807}] to [{0=230985}]. + 2023-07-03T22:11:17,900 INFO [qtp1251996697-122] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Saved sequence metadata to disk: [SequenceMetadata{sequenceId=0, sequenceName='index_kafka_social_media_0e905aa31037879_0', assignments=[0], startOffsets={0=230985}, exclusiveStartPartitions=[], endOffsets={0=230985}, sentinel=false, checkpointed=true}] + 2023-07-03T22:11:17,901 INFO [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Received resume command, resuming ingestion. + 2023-07-03T22:11:17,901 INFO [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Finished reading partition[0], up to[230985].
+ 2023-07-03T22:11:17,902 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Resetting generation and member id due to: consumer pro-actively leaving the group + 2023-07-03T22:11:17,902 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Request joining group due to: consumer pro-actively leaving the group + 2023-07-03T22:11:17,902 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Unsubscribed all topics or patterns and assigned partitions + 2023-07-03T22:11:17,912 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted rows[0] and (estimated) bytes[0] + 2023-07-03T22:11:17,916 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Flushed in-memory data with commit metadata [AppenderatorDriverMetadata{segments={}, lastSegmentIds={}, callerMetadata={nextPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}}] for segments: + 2023-07-03T22:11:17,917 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted stats: processed rows: [0], persisted rows[0], sinks: [0], total fireHydrants (across sinks): [0], persisted fireHydrants (across sinks): [0] + 2023-07-03T22:11:17,919 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Pushing [0] segments in background + 2023-07-03T22:11:17,921 INFO [task-runner-0-priority-0] 
org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted rows[0] and (estimated) bytes[0] + 2023-07-03T22:11:17,924 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Flushed in-memory data with commit metadata [AppenderatorDriverMetadata{segments={}, lastSegmentIds={}, callerMetadata={nextPartitions=SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}, publishPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}}] for segments: + 2023-07-03T22:11:17,924 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-persist] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Persisted stats: processed rows: [0], persisted rows[0], sinks: [0], total fireHydrants (across sinks): [0], persisted fireHydrants (across sinks): [0] + 2023-07-03T22:11:17,925 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-merge] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Preparing to push (stats): processed rows: [0], sinks: [0], fireHydrants (across sinks): [0] + 2023-07-03T22:11:17,925 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-appenderator-merge] org.apache.druid.segment.realtime.appenderator.StreamAppenderator - Push complete... + 2023-07-03T22:11:17,929 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.indexing.seekablestream.SequenceMetadata - With empty segment set, start offsets [SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}] and end offsets [SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}] are the same, skipping metadata commit. 
+ 2023-07-03T22:11:17,930 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Published [0] segments with commit metadata [{nextPartitions=SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}, publishPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}] + 2023-07-03T22:11:17,930 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Published 0 segments for sequence [index_kafka_social_media_0e905aa31037879_0] with metadata [AppenderatorDriverMetadata{segments={}, lastSegmentIds={}, callerMetadata={nextPartitions=SeekableStreamStartSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}, exclusivePartitions=[]}, publishPartitions=SeekableStreamEndSequenceNumbers{stream='social_media', partitionSequenceNumberMap={0=230985}}}}]. 
+ 2023-07-03T22:11:17,931 INFO [[index_kafka_social_media_0e905aa31037879_nommnaeg]-publish] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Saved sequence metadata to disk: [] + 2023-07-03T22:11:17,932 INFO [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Handoff complete for segments: + 2023-07-03T22:11:17,932 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Resetting generation and member id due to: consumer pro-actively leaving the group + 2023-07-03T22:11:17,932 INFO [task-runner-0-priority-0] org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=consumer-kafka-supervisor-dcanhmig-1, groupId=kafka-supervisor-dcanhmig] Request joining group due to: consumer pro-actively leaving the group + 2023-07-03T22:11:17,933 INFO [task-runner-0-priority-0] org.apache.kafka.common.metrics.Metrics - Metrics scheduler closed + 2023-07-03T22:11:17,933 INFO [task-runner-0-priority-0] org.apache.kafka.common.metrics.Metrics - Closing reporter org.apache.kafka.common.metrics.JmxReporter + 2023-07-03T22:11:17,933 INFO [task-runner-0-priority-0] org.apache.kafka.common.metrics.Metrics - Metrics reporters closed + 2023-07-03T22:11:17,935 INFO [task-runner-0-priority-0] org.apache.kafka.common.utils.AppInfoParser - App info kafka.consumer for consumer-kafka-supervisor-dcanhmig-1 unregistered + 2023-07-03T22:11:17,936 INFO [task-runner-0-priority-0] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/internal-discovery/PEON/localhost:8100] + 2023-07-03T22:11:17,972 INFO [task-runner-0-priority-0] org.apache.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced self 
[{"druidNode":{"service":"druid/middleManager","host":"localhost","bindOnHost":false,"plaintextPort":8100,"port":-1,"tlsPort":-1,"enablePlaintextPort":true,"enableTlsPort":false},"nodeType":"peon","services":{"dataNodeService":{"type":"dataNodeService","tier":"_default_tier","maxSize":0,"type":"indexer-executor","serverType":"indexer-executor","priority":0},"lookupNodeService":{"type":"lookupNodeService","lookupTier":"__default"}}}]. + 2023-07-03T22:11:17,972 INFO [task-runner-0-priority-0] org.apache.druid.curator.announcement.Announcer - Unannouncing [/druid/announcements/localhost:8100] + 2023-07-03T22:11:17,996 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: { + "id" : "index_kafka_social_media_0e905aa31037879_nommnaeg", + "status" : "SUCCESS", + "duration" : 3601130, + "errorMsg" : null, + "location" : { + "host" : null, + "port" : -1, + "tlsPort" : -1 + } + } + 2023-07-03T22:11:17,998 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [ANNOUNCEMENTS] + 2023-07-03T22:11:18,005 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [SERVER] + 2023-07-03T22:11:18,009 INFO [main] org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@6491006{HTTP/1.1, (http/1.1)}{0.0.0.0:8100} + 2023-07-03T22:11:18,009 INFO [main] org.eclipse.jetty.server.session - node0 Stopped scavenging + 2023-07-03T22:11:18,012 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@742aa00a{/,null,STOPPED} + 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [NORMAL] + 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.server.coordination.ZkCoordinator - Stopping ZkCoordinator for [DruidServerMetadata{name='localhost:8100', hostAndPort='localhost:8100', hostAndTlsPort='null', maxSize=0, 
tier='_default_tier', type=indexer-executor, priority=0}] + 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.server.coordination.SegmentLoadDropHandler - Stopping... + 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.server.coordination.SegmentLoadDropHandler - Stopped. + 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Starting graceful shutdown of task[index_kafka_social_media_0e905aa31037879_nommnaeg]. + 2023-07-03T22:11:18,014 INFO [main] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Stopping forcefully (status: [PUBLISHING]) + 2023-07-03T22:11:18,019 INFO [LookupExtractorFactoryContainerProvider-MainThread] org.apache.druid.query.lookup.LookupReferencesManager - Lookup Management loop exited. Lookup notices are not handled anymore. + 2023-07-03T22:11:18,020 INFO [main] org.apache.druid.query.lookup.LookupReferencesManager - Closed lookup [name]. + 2023-07-03T22:11:18,020 INFO [Curator-Framework-0] org.apache.curator.framework.imps.CuratorFrameworkImpl - backgroundOperationsLoop exiting + 2023-07-03T22:11:18,147 INFO [main] org.apache.zookeeper.ZooKeeper - Session: 0x1000097ceaf0007 closed + 2023-07-03T22:11:18,147 INFO [main-EventThread] org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1000097ceaf0007 + 2023-07-03T22:11:18,151 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [INIT] + Finished peon task + ``` + +</details> + +### Get task completion report + +#### URL + +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}/reports` + +Retrieves a [task completion report](../ingestion/tasks.md#task-reports) for a task. It returns a JSON object with information about the number of rows ingested, and any parse exceptions that Druid raised. 
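The sample report for this endpoint carries an `unparseable` counter for each ingestion phase. As a rough sketch, a script could total those counters to check whether any rows failed to parse. The `count_unparseable` helper is hypothetical and relies on plain text matching with `grep` and `awk`, so it assumes the field names shown in the sample response. A JSON-aware tool would be more robust.

```shell
#!/bin/sh
# Hypothetical helper: read a task report on stdin and print the sum of
# every "unparseable": N field. The closing quote in the pattern keeps
# "unparseableEvents" from matching.
count_unparseable() {
  grep -o '"unparseable"[[:space:]]*:[[:space:]]*[0-9][0-9]*' |
    awk -F: '{ total += $2 } END { print total + 0 }'
}
```

A total of `0` suggests that no rows were rejected. Any other value is a hint to inspect `unparseableEvents` in the report.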
+ +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved task report* +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +The following example shows how to retrieve the completion report of a task with the specified ID `query-52a8aafe-7265-4427-89fe-dc51275cc470`. + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/reports" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/reports HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response +<details> + <summary>Click to show sample response</summary> + + ```json + { + "ingestionStatsAndErrors": { + "type": "ingestionStatsAndErrors", + "taskId": "query-52a8aafe-7265-4427-89fe-dc51275cc470", + "payload": { + "ingestionState": "COMPLETED", + "unparseableEvents": {}, + "rowStats": { + "determinePartitions": { + "processed": 0, + "processedBytes": 0, + "processedWithError": 0, + "thrownAway": 0, + "unparseable": 0 + }, + "buildSegments": { + "processed": 39244, + "processedBytes": 17106256, + "processedWithError": 0, + "thrownAway": 0, + "unparseable": 0 + } + }, + "errorMsg": null, + "segmentAvailabilityConfirmed": false, + "segmentAvailabilityWaitTimeMs": 0 + } + } + } + ``` + +</details> + +## Task operations + +### Submit a task + +#### URL + +<code class="postAPI">POST</code> `/druid/indexer/v1/task` + +Submits a task or supervisor spec to the Overlord. It returns the task ID of the submitted task. Review Comment: ```suggestion Submits a JSON-based ingestion spec or supervisor spec to the Overlord. It returns the task ID of the submitted task.
``` May want to say that for most batch ingestion use cases use: https://github.com/apache/druid/blob/master/docs/api-reference/sql-ingestion-api.md Refer to https://github.com/demo-kratia/druid/blob/tasks-api-refactor/docs/ingestion/ingestion-spec.md for payload documentation ########## docs/api-reference/tasks-api.md: ########## @@ -25,77 +25,1565 @@ sidebar_label: Tasks This document describes the API endpoints for task retrieval, submission, and deletion for Apache Druid. -## Tasks +In this document, `{domain}` is a placeholder for the server address of deployment. For example, on the quickstart configuration, replace `{domain}` with `http://localhost:8888`. -Note that all _interval_ URL parameters are ISO 8601 strings delimited by a `_` instead of a `/` -as in `2016-06-27_2016-06-28`. +For query parameters that take an interval, provide ISO 8601 strings delimited by `_` instead of `/`. For example, `2023-06-27_2023-06-28`. -`GET /druid/indexer/v1/tasks` +## Task information and retrieval -Retrieve list of tasks. Accepts query string parameters `state`, `datasource`, `createdTimeInterval`, `max`, and `type`. +### Get an array of tasks -|Query Parameter |Description | -|---|---| -|`state`|filter list of tasks by task state, valid options are `running`, `complete`, `waiting`, and `pending`.| -| `datasource`| return tasks filtered by Druid datasource.| -| `createdTimeInterval`| return tasks created within the specified interval. | -| `max`| maximum number of `"complete"` tasks to return. Only applies when `state` is set to `"complete"`.| -| `type`| filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/tasks` +Retrieves an array of all tasks in the Druid cluster. Each task object includes information on its ID, status, associated datasource, and other metadata. -`GET /druid/indexer/v1/completeTasks` +#### Query parameters -Retrieve list of complete tasks. 
Equivalent to `/druid/indexer/v1/tasks?state=complete`. +The endpoint supports a set of optional query parameters to filter results. -`GET /druid/indexer/v1/runningTasks` +|Parameter|Type|Description| +|---|---|---| +|`state`|String|Filter list of tasks by task state, valid options are `running`, `complete`, `waiting`, and `pending`.| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| -Retrieve list of running tasks. Equivalent to `/druid/indexer/v1/tasks?state=running`. +#### Responses -`GET /druid/indexer/v1/waitingTasks` +<!--DOCUSAURUS_CODE_TABS--> -Retrieve list of waiting tasks. Equivalent to `/druid/indexer/v1/tasks?state=waiting`. +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of tasks* +<!--400 BAD REQUEST--> +<br/> +*Invalid `state` query parameter value* +<!--500 SERVER ERROR--> +<br/> +*Invalid query parameter* +<!--END_DOCUSAURUS_CODE_TABS--> -`GET /druid/indexer/v1/pendingTasks` +--- + +#### Sample request + +The following example shows how to retrieve a list of tasks filtered with the following query parameters: +* State: `complete` +* Datasource: `wikipedia_api` +* Time interval: between `2015-09-12` and `2015-09-13` +* Max entries returned: `10` +* Task type: `query_worker` + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12T00%3A00%3A00Z%2F2015-09-13T23%3A59%3A59Z&max=10&type=query_worker" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12T00%3A00%3A00Z%2F2015-09-13T23%3A59%3A59Z&max=10&type=query_worker HTTP/1.1 
+Host: {domain} +``` + +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_worker", + "createdTime": "2023-06-22T22:11:37.012Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17897, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f-worker0_0", + "groupId": "query-fa82fa40-4c8c-4777-b832-cabbee5f519f", + "type": "query_worker", + "createdTime": "2023-06-20T22:51:21.302Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 16911, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-5419da7a-b270-492f-90e6-920ecfba766a-worker0_0", + "groupId": "query-5419da7a-b270-492f-90e6-920ecfba766a", + "type": "query_worker", + "createdTime": "2023-06-20T22:45:53.909Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17030, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of complete tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/completeTasks` + +Retrieves an array of completed tasks in the Druid cluster. This is functionally equivalent to `/druid/indexer/v1/tasks?state=complete`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. 
+ +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of complete tasks* +<!--404 NOT FOUND--> +<br/> +*Request sent to incorrect service* +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/completeTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/completeTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_worker", + "createdTime": "2023-06-22T22:11:37.012Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 17897, + "location": { + "host": "localhost", + "port": 8101, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-223549f8-b993-4483-b028-1b0d54713cad", + "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad", + "type": "query_controller", + "createdTime": "2023-06-22T22:11:28.367Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "SUCCESS", + "status": "SUCCESS", + "runnerStatusCode": "NONE", + "duration": 30317, + "location": { + "host": "localhost", + "port": 8100, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + 
+</details> + +### Get an array of running tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/runningTasks` + +Retrieves an array of running task objects in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=running`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of running tasks* + +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/runningTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/runningTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "groupId": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "type": "query_controller", + "createdTime": "2023-06-22T22:54:43.170Z", + "queueInsertionTime": "2023-06-22T22:54:43.170Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "RUNNING", + "duration": -1, + "location": { + "host": "localhost", + "port": 8100, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of waiting tasks + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/waitingTasks` + +Retrieves an array of waiting tasks in the 
Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=waiting`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| + +#### Responses + +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of waiting tasks* + +<!--END_DOCUSAURUS_CODE_TABS--> -Retrieve list of pending tasks. Equivalent to `/druid/indexer/v1/tasks?state=pending`. +--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/waitingTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/waitingTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z", + "groupId": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:05.217Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + }, + { + "id": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z", + "groupId": "index_parallel_wikipedia_auto_afggfiec_2023-06-26T21:08:05.546Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:05.548Z", + 
"queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + }, + { + "id": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z", + "groupId": "index_parallel_wikipedia_auto_jmmddihf_2023-06-26T21:08:06.644Z", + "type": "index_parallel", + "createdTime": "2023-06-26T21:08:06.671Z", + "queueInsertionTime": "1970-01-01T00:00:00.000Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "WAITING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_auto", + "errorMsg": null + } + ] + ``` + +</details> + +### Get an array of pending tasks + +#### URL + +<code class="getAPI">GET</code> `/druid/indexer/v1/pendingTasks` + +Retrieves an array of pending tasks in the Druid cluster. It is functionally equivalent to `/druid/indexer/v1/tasks?state=pending`. + +#### Query parameters + +The endpoint supports a set of optional query parameters to filter results. + +|Parameter|Type|Description| +|---|---|---| +| `datasource`|String| Return tasks filtered by Druid datasource.| +| `createdTimeInterval`|String (ISO-8601)| Return tasks created within the specified interval. | +| `max`|Integer|Maximum number of `complete` tasks to return. Only applies when `state` is set to `complete`.| +| `type`|String|Filter tasks by task type. See [task documentation](../ingestion/tasks.md) for more details.| -`GET /druid/indexer/v1/task/{taskId}` +#### Responses -Retrieve the 'payload' of a task. +<!--DOCUSAURUS_CODE_TABS--> -`GET /druid/indexer/v1/task/{taskId}/status` +<!--200 SUCCESS--> +<br/> +*Successfully retrieved list of pending tasks* +<!--END_DOCUSAURUS_CODE_TABS--> -Retrieve the status of a task. 
+--- + +#### Sample request + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/pendingTasks" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/pendingTasks HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + [ + { + "id": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67", + "groupId": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67", + "type": "query_controller", + "createdTime": "2023-06-23T19:53:06.037Z", + "queueInsertionTime": "2023-06-23T19:53:06.037Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "PENDING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + }, + { + "id": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36", + "groupId": "query-544f0c41-f81d-4504-b98b-f9ab8b36ef36", + "type": "query_controller", + "createdTime": "2023-06-23T19:53:06.616Z", + "queueInsertionTime": "2023-06-23T19:53:06.616Z", + "statusCode": "RUNNING", + "status": "RUNNING", + "runnerStatusCode": "PENDING", + "duration": -1, + "location": { + "host": null, + "port": -1, + "tlsPort": -1 + }, + "dataSource": "wikipedia_api", + "errorMsg": null + } + ] + ``` + +</details> -`GET /druid/indexer/v1/task/{taskId}/segments` +### Get task payload + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}` + +Retrieves the payload of a task given the task ID. It returns a JSON object with the task ID and payload that includes task configuration details and relevant specifications associated with the execution of the task. 
+ +#### Responses +<!--DOCUSAURUS_CODE_TABS--> + +<!--200 SUCCESS--> +<br/> +*Successfully retrieved payload of task* +<!--404 NOT FOUND--> +<br/> +*Cannot find task with ID* + +<!--END_DOCUSAURUS_CODE_TABS--> + +--- + +#### Sample request + +The following examples shows how to retrieve the task payload of a task with the specified ID `query-32663269-ead9-405a-8eb6-0817a952ef47`. + +<!--DOCUSAURUS_CODE_TABS--> + +<!--cURL--> +```shell +curl "{domain}/druid/indexer/v1/task/query-32663269-ead9-405a-8eb6-0817a952ef47" +``` +<!--HTTP--> +```HTTP +GET /druid/indexer/v1/task/query-32663269-ead9-405a-8eb6-0817a952ef47 HTTP/1.1 +Host: {domain} +``` +<!--END_DOCUSAURUS_CODE_TABS--> + + +#### Sample response + +<details> + <summary>Click to show sample response</summary> + + ```json + { + "task": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "payload": { + "type": "query_controller", + "id": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "spec": { + "query": { + "queryType": "scan", + "dataSource": { + "type": "external", + "inputSource": { + "type": "http", + "uris": [ + "https://druid.apache.org/data/wikipedia.json.gz" + ] + }, + "inputFormat": { + "type": "json", + "keepNullColumns": false, + "assumeNewlineDelimited": false, + "useJsonNodeReader": false + }, + "signature": [ + { + "name": "added", + "type": "LONG" + }, + { + "name": "channel", + "type": "STRING" + }, + { + "name": "cityName", + "type": "STRING" + }, + { + "name": "comment", + "type": "STRING" + }, + { + "name": "commentLength", + "type": "LONG" + }, + { + "name": "countryIsoCode", + "type": "STRING" + }, + { + "name": "countryName", + "type": "STRING" + }, + { + "name": "deleted", + "type": "LONG" + }, + { + "name": "delta", + "type": "LONG" + }, + { + "name": "deltaBucket", + "type": "STRING" + }, + { + "name": "diffUrl", + "type": "STRING" + }, + { + "name": "flags", + "type": "STRING" + }, + { + "name": "isAnonymous", + "type": "STRING" + }, + { + "name": "isMinor", + "type": "STRING" + }, + { + 
"name": "isNew", + "type": "STRING" + }, + { + "name": "isRobot", + "type": "STRING" + }, + { + "name": "isUnpatrolled", + "type": "STRING" + }, + { + "name": "metroCode", + "type": "STRING" + }, + { + "name": "namespace", + "type": "STRING" + }, + { + "name": "page", + "type": "STRING" + }, + { + "name": "regionIsoCode", + "type": "STRING" + }, + { + "name": "regionName", + "type": "STRING" + }, + { + "name": "timestamp", + "type": "STRING" + }, + { + "name": "user", + "type": "STRING" + } + ] + }, + "intervals": { + "type": "intervals", + "intervals": [ + "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z" + ] + }, + "virtualColumns": [ + { + "type": "expression", + "name": "v0", + "expression": "timestamp_parse(\"timestamp\",null,'UTC')", + "outputType": "LONG" + } + ], + "resultFormat": "compactedList", + "columns": [ + "added", + "channel", + "cityName", + "comment", + "commentLength", + "countryIsoCode", + "countryName", + "deleted", + "delta", + "deltaBucket", + "diffUrl", + "flags", + "isAnonymous", + "isMinor", + "isNew", + "isRobot", + "isUnpatrolled", + "metroCode", + "namespace", + "page", + "regionIsoCode", + "regionName", + "timestamp", + "user", + "v0" + ], + "legacy": false, + "context": { + "finalize": true, + "maxNumTasks": 3, + "maxParseExceptions": 0, + "queryId": "32663269-ead9-405a-8eb6-0817a952ef47", + "scanSignature": 
"[{\"name\":\"added\",\"type\":\"LONG\"},{\"name\":\"channel\",\"type\":\"STRING\"},{\"name\":\"cityName\",\"type\":\"STRING\"},{\"name\":\"comment\",\"type\":\"STRING\"},{\"name\":\"commentLength\",\"type\":\"LONG\"},{\"name\":\"countryIsoCode\",\"type\":\"STRING\"},{\"name\":\"countryName\",\"type\":\"STRING\"},{\"name\":\"deleted\",\"type\":\"LONG\"},{\"name\":\"delta\",\"type\":\"LONG\"},{\"name\":\"deltaBucket\",\"type\":\"STRING\"},{\"name\":\"diffUrl\",\"type\":\"STRING\"},{\"name\":\"flags\",\"type\":\"STRING\"},{\"name\":\"isAnonymous\",\"type\":\"STRING\"},{\"name\":\"isMinor\",\"type\":\"STRING\"},{\"name\":\"isNew\",\"type\":\"STRING\"},{\"name\":\"isRobot\",\"type\":\"STRING\"},{\"name\":\"isUnpatrolled\",\"type\":\"STRING\"},{\"name\":\"metroCode\",\"type\":\"STRING\"},{\"name\":\"namespace\",\"type\":\"STRING\"},{\"name\":\"page\",\"type\":\"STRING\"},{\"name\":\"regionIsoCode\",\"type\":\"STRING\"},{\"name\":\"regionName\",\"type\ ":\"STRING\"},{\"name\":\"timestamp\",\"type\":\"STRING\"},{\"name\":\"user\",\"type\":\"STRING\"},{\"name\":\"v0\",\"type\":\"LONG\"}]", + "sqlInsertSegmentGranularity": "\"DAY\"", + "sqlQueryId": "32663269-ead9-405a-8eb6-0817a952ef47" + }, + "granularity": { + "type": "all" + } + }, + "columnMappings": [ + { + "queryColumn": "v0", + "outputColumn": "__time" + }, + { + "queryColumn": "added", + "outputColumn": "added" + }, + { + "queryColumn": "channel", + "outputColumn": "channel" + }, + { + "queryColumn": "cityName", + "outputColumn": "cityName" + }, + { + "queryColumn": "comment", + "outputColumn": "comment" + }, + { + "queryColumn": "commentLength", + "outputColumn": "commentLength" + }, + { + "queryColumn": "countryIsoCode", + "outputColumn": "countryIsoCode" + }, + { + "queryColumn": "countryName", + "outputColumn": "countryName" + }, + { + "queryColumn": "deleted", + "outputColumn": "deleted" + }, + { + "queryColumn": "delta", + "outputColumn": "delta" + }, + { + "queryColumn": "deltaBucket", + "outputColumn": 
"deltaBucket" + }, + { + "queryColumn": "diffUrl", + "outputColumn": "diffUrl" + }, + { + "queryColumn": "flags", + "outputColumn": "flags" + }, + { + "queryColumn": "isAnonymous", + "outputColumn": "isAnonymous" + }, + { + "queryColumn": "isMinor", + "outputColumn": "isMinor" + }, + { + "queryColumn": "isNew", + "outputColumn": "isNew" + }, + { + "queryColumn": "isRobot", + "outputColumn": "isRobot" + }, + { + "queryColumn": "isUnpatrolled", + "outputColumn": "isUnpatrolled" + }, + { + "queryColumn": "metroCode", + "outputColumn": "metroCode" + }, + { + "queryColumn": "namespace", + "outputColumn": "namespace" + }, + { + "queryColumn": "page", + "outputColumn": "page" + }, + { + "queryColumn": "regionIsoCode", + "outputColumn": "regionIsoCode" + }, + { + "queryColumn": "regionName", + "outputColumn": "regionName" + }, + { + "queryColumn": "timestamp", + "outputColumn": "timestamp" + }, + { + "queryColumn": "user", + "outputColumn": "user" + } + ], + "destination": { + "type": "dataSource", + "dataSource": "wikipedia_api", + "segmentGranularity": "DAY" + }, + "assignmentStrategy": "max", + "tuningConfig": { + "maxNumWorkers": 2, + "maxRowsInMemory": 100000, + "rowsPerSegment": 3000000 + } + }, + "sqlQuery": "\nINSERT INTO wikipedia_api \nSELECT \n TIME_PARSE(\"timestamp\") AS __time,\n * \nFROM TABLE(EXTERN(\n '{\"type\": \"http\", \"uris\": [\"https://druid.apache.org/data/wikipedia.json.gz\"]}', \n '{\"type\": \"json\"}', \n '[{\"name\": \"added\", \"type\": \"long\"}, {\"name\": \"channel\", \"type\": \"string\"}, {\"name\": \"cityName\", \"type\": \"string\"}, {\"name\": \"comment\", \"type\": \"string\"}, {\"name\": \"commentLength\", \"type\": \"long\"}, {\"name\": \"countryIsoCode\", \"type\": \"string\"}, {\"name\": \"countryName\", \"type\": \"string\"}, {\"name\": \"deleted\", \"type\": \"long\"}, {\"name\": \"delta\", \"type\": \"long\"}, {\"name\": \"deltaBucket\", \"type\": \"string\"}, {\"name\": \"diffUrl\", \"type\": \"string\"}, {\"name\": 
\"flags\", \"type\": \"string\"}, {\"name\": \"isAnonymous\", \"type\": \"string\"}, {\"name\": \"isMinor\", \"type\": \"string\"}, {\"name\": \"isNew\", \"type\": \"string\"}, {\"name\": \"isRobot\", \"type\ ": \"string\"}, {\"name\": \"isUnpatrolled\", \"type\": \"string\"}, {\"name\": \"metroCode\", \"type\": \"string\"}, {\"name\": \"namespace\", \"type\": \"string\"}, {\"name\": \"page\", \"type\": \"string\"}, {\"name\": \"regionIsoCode\", \"type\": \"string\"}, {\"name\": \"regionName\", \"type\": \"string\"}, {\"name\": \"timestamp\", \"type\": \"string\"}, {\"name\": \"user\", \"type\": \"string\"}]'\n ))\nPARTITIONED BY DAY\n", + "sqlQueryContext": { + "sqlQueryId": "32663269-ead9-405a-8eb6-0817a952ef47", + "sqlInsertSegmentGranularity": "\"DAY\"", + "maxNumTasks": 3, + "queryId": "32663269-ead9-405a-8eb6-0817a952ef47" + }, + "sqlResultsContext": { + "timeZone": "UTC", + "serializeComplexValues": true, + "stringifyArrays": true + }, + "sqlTypeNames": [ + "TIMESTAMP", + "BIGINT", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "BIGINT", + "VARCHAR", + "VARCHAR", + "BIGINT", + "BIGINT", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR", + "VARCHAR" + ], + "context": { + "forceTimeChunkLock": true, + "useLineageBasedSegmentAllocation": true + }, + "groupId": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "dataSource": "wikipedia_api", + "resource": { + "availabilityGroup": "query-32663269-ead9-405a-8eb6-0817a952ef47", + "requiredCapacity": 1 + } + } + } + ``` + +</details> + +### Get task status + +#### URL +<code class="getAPI">GET</code> `/druid/indexer/v1/task/{taskId}/status` + +Retrieves the status of a task given the task ID. It returns a JSON object with the task's current state, task type, datasource, and other relevant metadata. 
Review Comment: Not sure if it should be here, but we should list the different statuses and what they mean: For example `RUNNING -- actively executing?`, `COMPLETE--finished w/success?`, `FAILED--finished with error`, `PENDING ---waiting on a slot?`
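One way to ground that list of statuses: a rough Python sketch that tallies the `statusCode` and `runnerStatusCode` pairs seen in the sample responses quoted in this diff. The task objects below are trimmed copies of those samples, and `count_status_pairs` is a hypothetical helper name, not part of Druid. It only shows which values the samples contain; it does not define what each status means.

```python
from collections import Counter

# Trimmed copies of task objects from the sample responses in this diff
# (complete, running, waiting, and pending tasks, one of each).
SAMPLE_TASKS = [
    {"id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0",
     "statusCode": "SUCCESS", "runnerStatusCode": "NONE"},
    {"id": "query-32663269-ead9-405a-8eb6-0817a952ef47",
     "statusCode": "RUNNING", "runnerStatusCode": "RUNNING"},
    {"id": "index_parallel_wikipedia_auto_biahcbmf_2023-06-26T21:08:05.216Z",
     "statusCode": "RUNNING", "runnerStatusCode": "WAITING"},
    {"id": "query-7b37c315-50a0-4b68-aaa8-b1ef1f060e67",
     "statusCode": "RUNNING", "runnerStatusCode": "PENDING"},
]


def count_status_pairs(tasks):
    """Count (statusCode, runnerStatusCode) pairs across task objects.

    Hypothetical helper for illustration only.
    """
    return Counter((t["statusCode"], t["runnerStatusCode"]) for t in tasks)


pairs = count_status_pairs(SAMPLE_TASKS)
for (status, runner_status), n in sorted(pairs.items()):
    # One row per distinct pair: task status, runner status, count.
    print(f"{status:8} {runner_status:8} {n}")
```

In these samples, the running, waiting, and pending tasks all report `statusCode` `RUNNING`, and only `runnerStatusCode` tells them apart, so the status descriptions may need to cover both fields.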
