Junge-401 opened a new issue, #14390:
URL: https://github.com/apache/druid/issues/14390
When compacting multiple days of data, if one day has no data, the entire
compact task fails.
### Affected Version
0.22.0
### Description
For example, I have two datasources, TEST_SRC and TEST_DEST, where TEST_DEST
loads several days of data from TEST_SRC through a reindex task. This is the
reindex spec:
```json
{
"type": "index_parallel",
"spec": {
"ioConfig": {
"type": "index_parallel",
"inputSource": {
"type": "druid",
"dataSource": "TEST_SRC",
"interval": "2023-05-04T00:00:00/2023-05-19T00:00:00"
}
},
"tuningConfig": {
"type": "index_parallel",
"partitionsSpec": {
"type": "dynamic"
},
"maxNumConcurrentSubTasks": 3
},
"dataSchema": {
"timestampSpec": {
"column": "__time",
"format": "millis"
},
"granularitySpec": {
"rollup": true,
"queryGranularity": "minute",
"segmentGranularity": "hour"
},
"dimensionsSpec": {
"dimensions": [
{
"name": "mobile_app_id",
"type": "string"
},
{
"name": "country_id",
"type": "string"
}
]
},
"metricsSpec": [
{
"type": "longSum",
"name": "view_count",
"fieldName": "view_count",
"expression": null
}
],
"dataSource": "TEST_DEST"
}
}
}
```
The whole reindex process runs smoothly. Every day in the range has data
except 2023-05-07.
Then I try to compact all segments in TEST_DEST and roll them up at the same time.
```json
{
"type":"compact",
"dataSource":"TEST_DEST",
"granularitySpec":{
"type":"uniform",
"segmentGranularity":{
"type":"period",
"period":"PT6H"
},
"queryGranularity":"HOUR",
"intervals":[
"2023-05-04T00:00:00/2023-05-19T00:00:00"
]
},
"tuningConfig":{
"type":"index_parallel",
"maxNumConcurrentSubTasks":3,
"forceGuaranteedRollup":"true",
"partitionsSpec":{
"type":"single_dim",
"targetRowsPerSegment":3000000,
"partitionDimension":"mobile_app_id"
}
},
"ioConfig":{
"type":"compact",
"inputSpec":{
"type":"interval",
"interval":"2023-05-04T00:00:00/2023-05-19T00:00:00"
},
"appendToExisting":false
}
}
```
Then the whole compact task fails very quickly, and I see log messages like
this in the Overlord log:
```text
2023-06-08T12:08:34,491 INFO [qtp617662116-138]
org.apache.druid.indexing.overlord.TaskLockbox -
Task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] already present in
TaskLock[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]
2023-06-08T12:08:57,532 INFO [qtp617662116-136]
org.apache.druid.indexing.overlord.TaskLockbox - Cannot create a new
taskLockPosse for request[TimeChunkLockRequest{lockType=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-04T00:00:00.000Z/2023-05-19T00:00:00.000Z,
preferredVersion='null', priority=25, revoked=false}] because existing
locks[[TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-04T00:00:00.000Z/2023-05-05T12:00:00.000Z,
version='2023-06-08T12:08:34.294Z', priority=25, revoked=false},
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]},
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-06T00:00:00.000Z/2023-05-06T06:00:00.000Z,
version='2023-06-08T12:08:34.300Z', priority=25, revoked=
false}, taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]},
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-08T00:00:00.000Z/2023-05-08T12:00:00.000Z,
version='2023-06-08T12:08:34.309Z', priority=25, revoked=false},
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]},
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-09T00:00:00.000Z/2023-05-09T12:00:00.000Z,
version='2023-06-08T12:08:34.315Z', priority=25, revoked=false},
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]},
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-10T00:00:00.000Z/2023-05-10T18:00:00.000Z,
version='2023-06-08T12:08:34.321Z', priority=25, revoked=false}, ta
skIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]},
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-11T00:00:00.000Z/2023-05-11T06:00:00.000Z,
version='2023-06-08T12:08:34.329Z', priority=25, revoked=false},
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]},
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-15T00:00:00.000Z/2023-05-19T00:00:00.000Z,
version='2023-06-08T12:08:34.336Z', priority=25, revoked=false},
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}]] have same or
higher priorities
2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3]
org.apache.druid.indexing.overlord.RemoteTaskRunner -
Worker[wukong-v3650-micro.syh.com:8091] wrote FAILED status for task
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] on
[TaskLocation{host='wukong-v3650-micro.syh.com', port=8091, tlsPort=-1}]
2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3]
org.apache.druid.indexing.overlord.RemoteTaskRunner -
Worker[wukong-v3650-micro.syh.com:8091] completed
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] with status[FAILED]
2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3]
org.apache.druid.indexing.overlord.TaskQueue - Received FAILED status for task:
compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z
2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3]
org.apache.druid.indexing.overlord.RemoteTaskRunner - Shutdown
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] because: [notified status
change from task]
2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3]
org.apache.druid.indexing.overlord.RemoteTaskRunner - Cleaning up
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] on
worker[wukong-v3650-micro.syh.com:8091]
2023-06-08T12:08:57,588 INFO [Curator-PathChildrenCache-3]
org.apache.druid.indexing.overlord.TaskLockbox - Removing
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] from activeTasks
2023-06-08T12:08:57,588 INFO [Curator-PathChildrenCache-3]
org.apache.druid.indexing.overlord.TaskLockbox - Removing
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] from
TaskLock[TimeChunkLock{type=EXCLUSIVE,
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z',
dataSource='TEST_DEST',
interval=2023-05-04T00:00:00.000Z/2023-05-05T12:00:00.000Z,
version='2023-06-08T12:08:34.294Z', priority=25, revoked=false}]
```
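The exclusive locks in the log above all cover multiples of six hours, which matches the `PT6H` `segmentGranularity` in the compaction spec. As a rough sketch of how an interval is bucketed (assuming epoch-aligned UTC buckets, which is the default alignment for Druid period granularities; this is illustrative code, not Druid's implementation):

```python
from datetime import datetime, timedelta

def six_hour_chunks(start, end):
    """Bucket [start, end) into UTC-epoch-aligned 6-hour time chunks."""
    chunk = timedelta(hours=6)
    # Align the start down to the nearest 00/06/12/18 boundary.
    aligned = start.replace(minute=0, second=0, microsecond=0)
    aligned -= timedelta(hours=aligned.hour % 6)
    chunks = []
    while aligned < end:
        chunks.append((aligned, aligned + chunk))
        aligned += chunk
    return chunks

chunks = six_hour_chunks(datetime(2023, 5, 4), datetime(2023, 5, 5))
for lo, hi in chunks:
    print(lo.isoformat(), "/", hi.isoformat())
```

So a single day yields four lockable chunks, and the lock intervals in the log (e.g. `2023-05-04T00:00:00.000Z/2023-05-05T12:00:00.000Z`) are unions of such chunks.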
And I see this in index.log:
```text
2023-06-08T12:08:57,518 INFO
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
org.apache.druid.indexing.common.task.CompactionTask - Generated [1]
compaction task specs
2023-06-08T12:08:57,518 INFO
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
org.apache.druid.indexing.common.task.AbstractBatchIndexTask - Using timeChunk
lock for perfect rollup
2023-06-08T12:08:57,532 WARN
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
org.apache.druid.indexing.common.task.CompactionTask - indexSpec is not ready:
[{
"id" : "compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
"groupId" : "compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
"availabilityGroup" :
"compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
"appenderatorTrackingTaskId" :
"compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
2023-06-08T12:08:57,532 INFO
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
org.apache.druid.indexing.common.task.CompactionTask - Ran [1] specs, [0]
succeeded, [1] failed
2023-06-08T12:08:57,532 ERROR [threading-task-runner-executor-31]
org.apache.druid.segment.realtime.appenderator.UnifiedIndexerAppenderatorsManager
- Could not find datasource bundle for [TEST_DEST], task
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]
2023-06-08T12:08:57,558 INFO [threading-task-runner-executor-31]
org.apache.druid.indexing.overlord.ThreadingTaskRunner - Removed task
directory: var/druid/task/compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z
2023-06-08T12:08:57,583 INFO [WorkerTaskManager-NoticeHandler]
org.apache.druid.indexing.worker.WorkerTaskManager - Task
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] completed with status
[FAILED].
```
But when I compact the same range excluding 2023-05-07, the day with no data,
the whole compact task runs normally!
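As a workaround sketch: if you already know which days are empty, you can split the full range into contiguous non-empty sub-intervals and submit one compaction task per sub-interval (each payload would then be POSTed to the Overlord's `/druid/indexer/v1/task` endpoint; the payloads below carry only the fields needed to illustrate the split, and the empty-day set is supplied by hand, not discovered automatically):

```python
from datetime import date, timedelta

def contiguous_intervals(start, end, empty_days):
    """Split [start, end) into maximal runs of days not in empty_days."""
    intervals, run_start = [], None
    d = start
    while d < end:
        if d in empty_days:
            if run_start is not None:
                intervals.append((run_start, d))
                run_start = None
        elif run_start is None:
            run_start = d
        d += timedelta(days=1)
    if run_start is not None:
        intervals.append((run_start, end))
    return intervals

def compact_task(datasource, interval):
    """Build a minimal compact-task payload for one sub-interval."""
    start, end = interval
    iso = f"{start.isoformat()}T00:00:00/{end.isoformat()}T00:00:00"
    return {
        "type": "compact",
        "dataSource": datasource,
        "ioConfig": {
            "type": "compact",
            "inputSpec": {"type": "interval", "interval": iso},
        },
    }

intervals = contiguous_intervals(
    date(2023, 5, 4), date(2023, 5, 19), {date(2023, 5, 7)}
)
tasks = [compact_task("TEST_DEST", iv) for iv in intervals]
for t in tasks:
    print(t["ioConfig"]["inputSpec"]["interval"])
```

For the range above this produces two tasks, one for `2023-05-04/2023-05-07` and one for `2023-05-08/2023-05-19`, so no task's interval covers the empty day.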
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]