Sachin Mittal created KAFKA-4848:
------------------------------------

             Summary: Stream thread getting into deadlock state while trying to 
get rocksdb lock in retryWithBackoff
                 Key: KAFKA-4848
                 URL: https://issues.apache.org/jira/browse/KAFKA-4848
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 0.10.2.0
            Reporter: Sachin Mittal
         Attachments: thr-1

We see a deadlock when a stream thread takes longer than 
MAX_POLL_INTERVAL_MS_CONFIG to process a task. In that case the thread's 
partitions, and with them the RocksDB lock, are assigned to some other thread. 
When the original thread tries to process its next task, it cannot get the 
RocksDB lock and simply keeps waiting for that lock forever.

In retryWithBackoff for AbstractTaskCreator we have backoffTimeMs = 50L.
If the thread does not get the lock, we simply increase the time by 10x and 
keep trying inside the while (true) loop.

We need an upper bound for this backoffTimeMs. If the time is greater than 
MAX_POLL_INTERVAL_MS_CONFIG and the thread still hasn't got the lock, it means 
this thread's partitions have been moved somewhere else and it may never get 
the lock again.
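
A minimal sketch of the proposed bound, not the actual Kafka Streams code: the 
class BoundedBackoff, the method retryWithBoundedBackoff, and the maxWaitMs 
parameter are hypothetical names used only for illustration. The tryLock 
supplier stands in for the attempt to take the RocksDB state directory lock, 
and maxWaitMs stands in for the value of MAX_POLL_INTERVAL_MS_CONFIG.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Kafka Streams.
class BoundedBackoff {

    /**
     * Calls tryLock repeatedly, sleeping between attempts. The sleep starts at
     * initialBackoffMs and grows 10x per failed attempt, but the total time
     * slept never exceeds maxWaitMs. Returns true if the lock was obtained,
     * false if the bound was reached or the thread was interrupted.
     */
    static boolean retryWithBoundedBackoff(BooleanSupplier tryLock,
                                           long initialBackoffMs,
                                           long maxWaitMs) {
        long backoffMs = initialBackoffMs;
        long waitedMs = 0L;
        while (true) {
            if (tryLock.getAsBoolean()) {
                return true;
            }
            if (waitedMs >= maxWaitMs) {
                // Give up instead of waiting forever; the caller can then
                // treat the task as migrated to another thread.
                return false;
            }
            // Never sleep past the remaining budget.
            long sleepMs = Math.min(backoffMs, maxWaitMs - waitedMs);
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            waitedMs += sleepMs;
            // Grow 10x, capped at maxWaitMs so the multiplication cannot overflow.
            backoffMs = (backoffMs > maxWaitMs / 10) ? maxWaitMs : backoffMs * 10;
        }
    }

    public static void main(String[] args) {
        // A lock that is never released: the call returns false after ~200 ms.
        boolean locked = retryWithBoundedBackoff(() -> false, 50L, 200L);
        System.out.println("lock acquired: " + locked);
    }
}
```

With a bound like this the caller gets a definite failure once the budget is 
used up and can decide what to do with the task, rather than spinning inside 
the retry loop.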



