nozjkoitop opened a new pull request, #18099:
URL: https://github.com/apache/druid/pull/18099

   ### Description
   
   #### The Issue
   REPLACE tasks (single-phase parallel index) that should delete rows were 
marked successful but did nothing - the row count stayed the same even though 
the thrownAway metric was positive.
   Logs showed that segments created by these tasks had a version of epoch (0).
   
   After some debugging, the flow appears to be:
   
   * When an existing lock can’t be obtained, the task [acquires a new 
one](https://github.com/apache/druid/blob/89cf9b263dc16238bd1493414826ea087f50d168/indexing-service/src/main/java/org/apache/druid/indexing/common/task/AbstractBatchIndexTask.java#L867).
   * TaskLockAcquireAction [builds the lock request without 
preferredVersion](https://github.com/apache/druid/blob/89cf9b263dc16238bd1493414826ea087f50d168/indexing-service/src/main/java/org/apache/druid/indexing/common/actions/TimeChunkLockTryAcquireAction.java#L76).
   * The request [returns a version of epoch 
(0)](https://github.com/apache/druid/blob/89cf9b263dc16238bd1493414826ea087f50d168/indexing-service/src/main/java/org/apache/druid/indexing/common/actions/TaskLocks.java#L90)
 for [an APPEND 
lock](https://github.com/apache/druid/blob/89cf9b263dc16238bd1493414826ea087f50d168/indexing-service/src/main/java/org/apache/druid/indexing/overlord/TimeChunkLockRequest.java#L118).
   * With concurrent locks enabled, the lock type is [always 
APPEND](https://github.com/apache/druid/blob/89cf9b263dc16238bd1493414826ea087f50d168/indexing-service/src/main/java/org/apache/druid/indexing/common/actions/TaskLocks.java#L158),
 even for REPLACE tasks.
   
   #### Possible fixes
   __1st option__ – Allow a REPLACE lock in 
[determineLockTypeForAppend](https://github.com/apache/druid/blob/12c1536e441011cd7d55dd05752d3567bd97fb16/indexing-service/src/main/java/org/apache/druid/indexing/common/actions/TaskLocks.java#L150).
   _Workaround-ish, since it needs an explicit lock type in the task context._
   
   _was_ 
   ```
   public static TaskLockType determineLockTypeForAppend(Map<String, Object> 
taskContext) {
     final boolean useConcurrentLocks = (boolean) taskContext.getOrDefault(
         Tasks.USE_CONCURRENT_LOCKS,
         Tasks.DEFAULT_USE_CONCURRENT_LOCKS
     );
     if (useConcurrentLocks) {
       return TaskLockType.APPEND;
     }
     final Object lockType = taskContext.get(Tasks.TASK_LOCK_TYPE);
     if (lockType == null) {
       final boolean useSharedLock = (boolean) 
taskContext.getOrDefault(Tasks.USE_SHARED_LOCK, false);
       return useSharedLock ? TaskLockType.SHARED : TaskLockType.EXCLUSIVE;
     } else {
       return TaskLockType.valueOf(lockType.toString());
     }
   }
   ```
   _became_ 
   ```
   public static TaskLockType determineLockTypeForAppend(Map<String, Object> 
taskContext) {
     final boolean useConcurrentLocks = (boolean) taskContext.getOrDefault(
         Tasks.USE_CONCURRENT_LOCKS,
         Tasks.DEFAULT_USE_CONCURRENT_LOCKS
     );
     final Object rawLock = taskContext.get(Tasks.TASK_LOCK_TYPE);
     TaskLockType lockType = (rawLock != null) ? 
TaskLockType.valueOf(rawLock.toString()) : null;
   
     if (useConcurrentLocks) {
       return (lockType == TaskLockType.REPLACE) ? TaskLockType.REPLACE : 
TaskLockType.APPEND;
     }
     if (lockType == null) {
       final boolean useSharedLock = (boolean) 
taskContext.getOrDefault(Tasks.USE_SHARED_LOCK, false);
       return useSharedLock ? TaskLockType.SHARED : TaskLockType.EXCLUSIVE;
     } else {
       return TaskLockType.valueOf(lockType.toString());
     }
   }
   
   ```
   __2nd option – The solution implemented in this PR.__
   It follows the TaskLocks comment about deduplicating 
determineLockTypeForAppend, making the logic cleaner and ensuring REPLACE tasks 
get the correct lock type and version.
   
   This PR has:
   
   - [X] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] a release note entry in the PR description.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [X] been tested in a test Druid cluster.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to