apoorvmittal10 commented on PR #19759: URL: https://github.com/apache/kafka/pull/19759#issuecomment-2897258835
> > This was a tricky bug to find, so nice catch.
>
> @apoorvmittal10 : My understanding is that this PR is not fixing a bug. Without the PR, we still guarantee that onComplete() will be called exactly once. So, this PR just saves some unnecessary work after the operation completes. Is that correct?

It does both. Without the PR, we do guarantee that `onComplete` is called exactly once, but one thread can be inside `onComplete` (when the timeout occurs) while a separate thread is invoking `tryComplete`.

For DelayedShareFetch, prior to this PR, a thread that has already marked `completed` as `true` in DelayedOperation can be waiting in `onComplete` for the lock while a parallel thread is running `tryComplete` and acquires some partitions. The thread in `tryComplete` cannot invoke `onComplete` because `completed` is already `true` in DelayedOperation, so it releases the partitions it acquired. The other thread, which was waiting in `onComplete` for the lock, then starts executing and releases the same acquired topic partitions again. This PR fixes this issue as well.
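
To make the interleaving concrete, here is a rough, single-threaded replay of it. This is only a sketch: `DelayedOpRaceSketch`, `Op`, `releaseAcquired`, and the partition name `topic-0` are hypothetical and are not Kafka's classes. It assumes the acquired partitions stay recorded on the operation after `tryComplete` releases them, and that the problem is avoided by flipping `completed` only while holding the same lock that guards `tryComplete`. The comment does not spell out the PR's exact mechanism, so treat the second method as one plausible ordering rather than the actual change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class DelayedOpRaceSketch {

    /** Hypothetical, simplified model of a delayed operation; not Kafka's DelayedOperation. */
    static class Op {
        final AtomicBoolean completed = new AtomicBoolean(false);
        final List<String> acquired = new ArrayList<>();
        final List<String> releases = new ArrayList<>();

        // Records one release entry for every partition this operation has acquired.
        void releaseAcquired(String caller) {
            for (String partition : acquired) {
                releases.add(caller + ":" + partition);
            }
        }
    }

    // Replays the ordering from the comment: the flag is flipped before the lock is taken.
    static List<String> racyInterleaving() {
        Op op = new Op();
        // Timeout thread: marks the operation completed, then waits for the lock.
        boolean timeoutWon = op.completed.compareAndSet(false, true);
        // Request thread (holding the lock): tryComplete acquires a partition...
        op.acquired.add("topic-0");
        // ...but cannot complete because the flag is already set, so it releases.
        if (!op.completed.compareAndSet(false, true)) {
            op.releaseAcquired("tryComplete");
        }
        // Timeout thread gets the lock, runs onComplete, and releases the same partition.
        if (timeoutWon) {
            op.releaseAcquired("onComplete");
        }
        return op.releases;
    }

    // Assumed alternative: the flag is only flipped while holding the lock.
    static List<String> serializedInterleaving() {
        Op op = new Op();
        // Timeout thread waits for the lock before touching the flag.
        // Request thread (holding the lock): tryComplete acquires and completes.
        op.acquired.add("topic-0");
        if (op.completed.compareAndSet(false, true)) {
            op.releaseAcquired("onComplete");
        }
        // Timeout thread gets the lock, finds the flag set, and skips onComplete.
        if (op.completed.compareAndSet(false, true)) {
            op.releaseAcquired("onComplete");
        }
        return op.releases;
    }

    public static void main(String[] args) {
        System.out.println("racy: " + racyInterleaving());
        System.out.println("serialized: " + serializedInterleaving());
    }
}
```

In the first ordering the same partition is released twice, once from the `tryComplete` path and once from `onComplete`. In the second it is released once.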