Junge-401 opened a new issue, #14390:
URL: https://github.com/apache/druid/issues/14390

   
   When compacting multiple days of data where one of the days has no data at all, the entire compact task fails.
   
   ### Affected Version
   
   0.22.0
   
   ### Description
   For example, I have two datasources, TEST_SRC and TEST_DEST, where TEST_DEST loads several days of data from TEST_SRC through a reindex task. This is the reindex spec:
   ``` json
   {
     "type": "index_parallel",
     "spec": {
       "ioConfig": {
         "type": "index_parallel",
         "inputSource": {
           "type": "druid",
           "dataSource": "TEST_SRC",
           "interval": "2023-05-04T00:00:00/2023-05-19T00:00:00"
         }
       },
       "tuningConfig": {
         "type": "index_parallel",
         "partitionsSpec": {
           "type": "dynamic"
         },
         "maxNumConcurrentSubTasks": 3
       },
       "dataSchema": {
         "timestampSpec": {
           "column": "__time",
           "format": "millis"
         },
         "granularitySpec": {
           "rollup": true,
           "queryGranularity": "minute",
           "segmentGranularity": "hour"
         },
         "dimensionsSpec": {
           "dimensions": [
             {
               "name": "mobile_app_id",
               "type": "string"
             },
             {
               "name": "country_id",
               "type": "string"
             }
           ]
         },
         "metricsSpec": [
           {
             "type": "longSum",
             "name": "view_count",
             "fieldName": "view_count",
             "expression": null
           }
         ],
         "dataSource": "TEST_DEST"
       }
     }
   }
   ```
   The whole reindex process runs smoothly. Every day in the interval has data except 2023-05-07.
   
   Then I try to compact all segments in TEST_DEST and roll them up at the same time.
   
   ```json
   {
     "type":"compact",
     "dataSource":"TEST_DEST",
     "granularitySpec":{
       "type":"uniform",
       "segmentGranularity":{
         "type":"period",
         "period":"PT6H"
       },
       "queryGranularity":"HOUR",
       "intervals":[
         "2023-05-04T00:00:00/2023-05-19T00:00:00"
       ]
     },
     "tuningConfig":{
       "type":"index_parallel",
       "maxNumConcurrentSubTasks":3,
       "forceGuaranteedRollup":"true",
       "partitionsSpec":{
         "type":"single_dim",
         "targetRowsPerSegment":3000000,
         "partitionDimension":"mobile_app_id"
       }
     },
     "ioConfig":{
       "type":"compact",
       "inputSpec":{
         "type":"interval",
         "interval":"2023-05-04T00:00:00/2023-05-19T00:00:00"
       },
       "appendToExisting":false
     }
   }
   ```
   The whole compact task then fails very quickly, and I see log messages like this in the overlord log:
   ```text
   2023-06-08T12:08:34,491 INFO [qtp617662116-138] 
org.apache.druid.indexing.overlord.TaskLockbox - 
Task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] already present in 
TaskLock[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]
   2023-06-08T12:08:57,532 INFO [qtp617662116-136] 
org.apache.druid.indexing.overlord.TaskLockbox - Cannot create a new 
taskLockPosse for request[TimeChunkLockRequest{lockType=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-04T00:00:00.000Z/2023-05-19T00:00:00.000Z, 
preferredVersion='null', priority=25, revoked=false}] because existing 
locks[[TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-04T00:00:00.000Z/2023-05-05T12:00:00.000Z, 
version='2023-06-08T12:08:34.294Z', priority=25, revoked=false}, 
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}, 
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-06T00:00:00.000Z/2023-05-06T06:00:00.000Z, 
version='2023-06-08T12:08:34.300Z', priority=25, revoked=false}, taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}, 
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-08T00:00:00.000Z/2023-05-08T12:00:00.000Z, 
version='2023-06-08T12:08:34.309Z', priority=25, revoked=false}, 
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}, 
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-09T00:00:00.000Z/2023-05-09T12:00:00.000Z, 
version='2023-06-08T12:08:34.315Z', priority=25, revoked=false}, 
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}, 
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-10T00:00:00.000Z/2023-05-10T18:00:00.000Z, 
version='2023-06-08T12:08:34.321Z', priority=25, revoked=false}, taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}, 
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-11T00:00:00.000Z/2023-05-11T06:00:00.000Z, 
version='2023-06-08T12:08:34.329Z', priority=25, revoked=false}, 
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}, 
TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-15T00:00:00.000Z/2023-05-19T00:00:00.000Z, 
version='2023-06-08T12:08:34.336Z', priority=25, revoked=false}, 
taskIds=[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]}]] have same or 
higher priorities
   2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3] 
org.apache.druid.indexing.overlord.RemoteTaskRunner - 
Worker[wukong-v3650-micro.syh.com:8091] wrote FAILED status for task 
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] on 
[TaskLocation{host='wukong-v3650-micro.syh.com', port=8091, tlsPort=-1}]
   2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3] 
org.apache.druid.indexing.overlord.RemoteTaskRunner - 
Worker[wukong-v3650-micro.syh.com:8091] completed 
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] with status[FAILED]
   2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3] 
org.apache.druid.indexing.overlord.TaskQueue - Received FAILED status for task: 
compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z
   2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3] 
org.apache.druid.indexing.overlord.RemoteTaskRunner - Shutdown 
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] because: [notified status 
change from task]
   2023-06-08T12:08:57,583 INFO [Curator-PathChildrenCache-3] 
org.apache.druid.indexing.overlord.RemoteTaskRunner - Cleaning up 
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] on 
worker[wukong-v3650-micro.syh.com:8091]
   2023-06-08T12:08:57,588 INFO [Curator-PathChildrenCache-3] 
org.apache.druid.indexing.overlord.TaskLockbox - Removing 
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] from activeTasks
   2023-06-08T12:08:57,588 INFO [Curator-PathChildrenCache-3] 
org.apache.druid.indexing.overlord.TaskLockbox - Removing 
task[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] from 
TaskLock[TimeChunkLock{type=EXCLUSIVE, 
groupId='compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z', 
dataSource='TEST_DEST', 
interval=2023-05-04T00:00:00.000Z/2023-05-05T12:00:00.000Z, 
version='2023-06-08T12:08:34.294Z', priority=25, revoked=false}]
   ```
   And this in the task log (index.log):
   ```text
   2023-06-08T12:08:57,518 INFO 
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
 org.apache.druid.indexing.common.task.CompactionTask - Generated [1] 
compaction task specs
   2023-06-08T12:08:57,518 INFO 
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
 org.apache.druid.indexing.common.task.AbstractBatchIndexTask - Using timeChunk 
lock for perfect rollup
   2023-06-08T12:08:57,532 WARN 
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
 org.apache.druid.indexing.common.task.CompactionTask - indexSpec is not ready: 
[{
     "id" : "compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
     "groupId" : "compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
       "availabilityGroup" : 
"compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
       "appenderatorTrackingTaskId" : 
"compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z",
   2023-06-08T12:08:57,532 INFO 
[[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]-threading-task-runner-executor-31]
 org.apache.druid.indexing.common.task.CompactionTask - Ran [1] specs, [0] 
succeeded, [1] failed
   2023-06-08T12:08:57,532 ERROR [threading-task-runner-executor-31] 
org.apache.druid.segment.realtime.appenderator.UnifiedIndexerAppenderatorsManager
 - Could not find datasource bundle for [TEST_DEST], task 
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z]
   2023-06-08T12:08:57,558 INFO [threading-task-runner-executor-31] 
org.apache.druid.indexing.overlord.ThreadingTaskRunner - Removed task 
directory: var/druid/task/compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z
   2023-06-08T12:08:57,583 INFO [WorkerTaskManager-NoticeHandler] 
org.apache.druid.indexing.worker.WorkerTaskManager - Task 
[compact_TEST_DEST_nadfdjkg_2023-06-08T12:08:34.278Z] completed with status 
[FAILED].
   ```
   But when I compact the same range while excluding 2023-05-07, the day with no data, the whole compact task runs normally!
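   
   For reference, this is roughly how the working run looks. The split into two tasks is my own reconstruction of that workaround (the compaction ioConfig of type "interval" takes a single interval, so skipping 2023-05-07 means submitting one task per contiguous range); the tuningConfig is unchanged from the failing spec above. This is the task for the days after the gap:
   
   ```json
   {
     "type":"compact",
     "dataSource":"TEST_DEST",
     "granularitySpec":{
       "type":"uniform",
       "segmentGranularity":{
         "type":"period",
         "period":"PT6H"
       },
       "queryGranularity":"HOUR",
       "intervals":[
         "2023-05-08T00:00:00/2023-05-19T00:00:00"
       ]
     },
     "tuningConfig":{
       "type":"index_parallel",
       "maxNumConcurrentSubTasks":3,
       "forceGuaranteedRollup":"true",
       "partitionsSpec":{
         "type":"single_dim",
         "targetRowsPerSegment":3000000,
         "partitionDimension":"mobile_app_id"
       }
     },
     "ioConfig":{
       "type":"compact",
       "inputSpec":{
         "type":"interval",
         "interval":"2023-05-08T00:00:00/2023-05-19T00:00:00"
       },
       "appendToExisting":false
     }
   }
   ```
   A second task with interval "2023-05-04T00:00:00/2023-05-07T00:00:00" would cover the days before the empty day, and run this way the compaction succeeds.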

