chia7712 commented on a change in pull request #8657:
URL: https://github.com/apache/kafka/pull/8657#discussion_r482753529



##########
File path: 
core/src/test/scala/unit/kafka/coordinator/AbstractCoordinatorConcurrencyTest.scala
##########
@@ -201,8 +201,8 @@ object AbstractCoordinatorConcurrencyTest {
         }
       }
       val producerRequestKeys = 
entriesPerPartition.keys.map(TopicPartitionOperationKey(_)).toSeq
-      watchKeys ++= producerRequestKeys
       producePurgatory.tryCompleteElseWatch(delayedProduce, 
producerRequestKeys)
+      watchKeys ++= producerRequestKeys

Review comment:
       According to above case, there is a potential deadlock.
   
   ```
       var watchCreated = false
       for(key <- watchKeys) {
         // If the operation is already completed, stop adding it to the rest 
of the watcher list.
         if (operation.isCompleted)
           return false
         watchForOperation(key, operation)
   
         if (!watchCreated) {
           watchCreated = true
           estimatedTotalOperations.incrementAndGet()
         }
       }
   
       isCompletedByMe = operation.safeTryComplete()
       if (isCompletedByMe)
         return true
   
   ```
   
   ```safeTryComplete()``` is executed after updating ```watchKey```. Hence, it 
is possible that the lock of this request is held by **another thread**. The 
deadlock happens if this ```tryCompleteElseWatch``` is holding the **lock** 
required by **another thread**.
   
   It seems to me the simple approach is to remove 
```operation.safeTryComplete```. That should be fine since we have called 
```tryComplete``` before.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to