nozjkoitop opened a new issue, #17660:
URL: https://github.com/apache/druid/issues/17660

   ### Affected Version
   
   Tested with version 30.0.0. The related code looks unchanged on master.
   
   ### Description
   
   Compacting partitioned segments appears to break segment-level locking. The partial segment tasks have not been updated to work correctly with segment locks: during initialization, the compaction task acquires segment locks, but every partial task expects to obtain unique intervals from the LockPosses. This discrepancy causes several failures during the compaction process.
   
   ### Current Behavior
   
   The [problematic method](https://github.com/apache/druid/blob/a964220260dd41d4084e653158fd7ec45092d716/indexing-service/src/main/java/org/apache/druid/indexing/common/task/CachingLocalSegmentAllocator.java#L105) receives three locks per segment plus a compaction-stage lock, and throws a duplicate-key exception:
   ```
   java.lang.IllegalStateException: Duplicate key 2025-01-13T00:00:00.000Z/2025-01-14T00:00:00.000Z (attempted merging values 2025-01-13T00:00:04.948Z and 2025-01-13T00:00:04.948Z)
        at java.base/java.util.stream.Collectors.duplicateKeyException(Collectors.java:133) ~[?:?]
        at java.base/java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:180) ~[?:?]
        at java.base/java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) ~[?:?]
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655) ~[?:?]
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?]
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?]
        at org.apache.druid.indexing.common.task.CachingLocalSegmentAllocator.createVersionFinder(CachingLocalSegmentAllocator.java:114) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.CachingLocalSegmentAllocator.<init>(CachingLocalSegmentAllocator.java:85) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.SegmentAllocators.forNonLinearPartitioning(SegmentAllocators.java:95) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialRangeSegmentGenerateTask.createSegmentAllocator(PartialRangeSegmentGenerateTask.java:184) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentGenerateTask.generateSegments(PartialSegmentGenerateTask.java:193) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentGenerateTask.runTask(PartialSegmentGenerateTask.java:123) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:179) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:478) [druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:450) [druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131) [guava-32.0.1-jre.jar:?]
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:75) [guava-32.0.1-jre.jar:?]
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82) [guava-32.0.1-jre.jar:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.base/java.lang.Thread.run(Thread.java:829) [?:?]
   ```
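   For illustration, here is a minimal, self-contained sketch of the failure mode (plain strings stand in for Druid's interval and version types; none of this is actual Druid code). The strict `Collectors.toMap` call fails the same way `createVersionFinder` does when segment locks make an interval appear more than once, while a merge function that keeps the maximum version collapses the duplicates:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class VersionFinderSketch
{
  // Hypothetical (interval, version) pairs standing in for the LockPosses
  // handed to the allocator: with segment locks, the same interval shows up
  // once per lock instead of once overall.
  static final List<Map.Entry<String, String>> LOCKS = List.of(
      Map.entry("2025-01-13/2025-01-14", "2025-01-13T00:00:04.948Z"),
      Map.entry("2025-01-13/2025-01-14", "2025-01-13T00:00:04.948Z"),
      Map.entry("2025-01-13/2025-01-14", "2025-01-13T00:00:04.948Z")
  );

  // Strict mapping: no merge function, so a repeated interval triggers
  // IllegalStateException("Duplicate key ...") just like the trace above.
  static Map<String, String> strict(List<Map.Entry<String, String>> locks)
  {
    return locks.stream().collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
  }

  // Workaround shape: group by interval and keep the greatest version string
  // (Druid versions are ISO timestamps, so string order matches time order).
  static Map<String, String> mergeByMaxVersion(List<Map.Entry<String, String>> locks)
  {
    return locks.stream().collect(Collectors.toMap(
        Map.Entry::getKey,
        Map.Entry::getValue,
        (a, b) -> a.compareTo(b) >= 0 ? a : b
    ));
  }

  public static void main(String[] args)
  {
    try {
      strict(LOCKS);
    }
    catch (IllegalStateException e) {
      System.out.println("strict mapping fails: " + e.getMessage());
    }
    System.out.println("merged: " + mergeByMaxVersion(LOCKS));
  }
}
```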
   #### Debugging notes
   
   `forceTimeChunkLock=false`:
   
   
![Image](https://github.com/user-attachments/assets/6c3c2d6e-e97c-4ad0-996e-475c640d46cd)
   
   The first three locks are created at startup of the main compaction task and are treated as time-chunk locks, even though segment-level locks were created at the very beginning.
   
   `forceTimeChunkLock=true`:
   
   
![Image](https://github.com/user-attachments/assets/c8cf42ab-c044-405f-add8-8a15df072588)
   
   After implementing a simple workaround in the problematic method of `CachingLocalSegmentAllocator` (grouping by interval and selecting the maximum version), `partial_range_index_generate` succeeds but the compaction still fails. The error occurs in the validation stage of `partial_index_generic_merge`, which throws the following exception:
   ```
   org.apache.druid.java.util.common.ISE: Unexpected state: Two versions([2025-01-22T12:19:31.907Z], [2025-01-22T12:19:31.907Z]) for the same interval[2016-06-27T00:00:00.000Z/2016-06-28T00:00:00.000Z]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.lambda$runTask$2(PartialSegmentMergeTask.java:164) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1541) ~[?:?]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.runTask(PartialSegmentMergeTask.java:154) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialGenericSegmentMergeTask.runTask(PartialGenericSegmentMergeTask.java:46) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:179) ~[druid-indexing-service-30.0.0-patch.jar:30.0.0-patch]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:478) [druid-indexing-service-30.0.0-patch.jar:30.0.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:450) [druid-indexing-service-30.0.0-patch.jar:30.0.0]
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131) [guava-32.0.1-jre.jar:?]
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:75) [guava-32.0.1-jre.jar:?]
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82) [guava-32.0.1-jre.jar:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.base/java.lang.Thread.run(Thread.java:829) [?:?]
   ```
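   The second failure has the same underlying shape: the merge task rejects any interval it sees more than once, even when the recorded versions are identical strings. A hypothetical sketch of that kind of check (not the actual `PartialSegmentMergeTask` code) shows why duplicated segment locks trip it:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeValidationSketch
{
  // Validates that each interval maps to exactly one version. A second entry
  // for an interval is treated as a conflict even when the version string is
  // byte-for-byte identical, mirroring the "Two versions(...) for the same
  // interval[...]" error above.
  static Map<String, String> validate(List<String[]> intervalVersionPairs)
  {
    final Map<String, String> intervalToVersion = new HashMap<>();
    for (String[] pair : intervalVersionPairs) {
      final String previous = intervalToVersion.put(pair[0], pair[1]);
      if (previous != null) {
        throw new IllegalStateException(
            "Two versions([" + previous + "], [" + pair[1] + "]) for the same interval[" + pair[0] + "]"
        );
      }
    }
    return intervalToVersion;
  }

  public static void main(String[] args)
  {
    // One lock per interval passes; duplicated segment locks do not.
    System.out.println(validate(List.of(
        new String[]{"2016-06-27/2016-06-28", "2025-01-22T12:19:31.907Z"}
    )));
  }
}
```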
   ### Expected Behavior
   
   The partial segment tasks should correctly handle segment-level locks, 
ensuring proper lock management during compaction, and the compaction task 
should complete successfully.
   
   ### Steps to Reproduce
   
   * Run a compaction task with `forceTimeChunkLock=false` over an interval that contains partitioned segments.
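
   For reference, a compaction task of roughly the following shape triggers the issue. This is a sketch assuming the standard `compact` task spec with range partitioning; the datasource, interval, and partition dimension are placeholders, and `forceTimeChunkLock=false` in the task context is the key setting:

```json
{
  "type": "compact",
  "dataSource": "my_datasource",
  "ioConfig": {
    "type": "compact",
    "inputSpec": { "type": "interval", "interval": "2025-01-13/2025-01-14" }
  },
  "tuningConfig": {
    "type": "index_parallel",
    "forceGuaranteedRollup": true,
    "partitionsSpec": {
      "type": "range",
      "partitionDimensions": ["some_dim"],
      "targetRowsPerSegment": 5000000
    }
  },
  "context": {
    "forceTimeChunkLock": false
  }
}
```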
   

