David Jacot created KAFKA-17501:
-----------------------------------

             Summary: WriteTxnMarkers API must not return until markers are 
written and materialized in group coordinator's cache
                 Key: KAFKA-17501
                 URL: https://issues.apache.org/jira/browse/KAFKA-17501
             Project: Kafka
          Issue Type: Bug
            Reporter: David Jacot
            Assignee: David Jacot


We have observed the following error in some clusters:

Uncaught exception in scheduled task 'handleTxnCompletion-902667' 
exception.message:Trying to complete a transactional offset commit for 
producerId 902667 and groupId lkc-g1dwm_insertion-order-budget-depletor even 
though the offset commit record itself hasn't been appended to the log.

When a transaction is completed, the transaction coordinator sends a 
WriteTxnMarkers request to all the partitions involved in the transaction to 
write the markers to them. When the broker receives it, it writes the markers 
and if markers are written to the __consumer_offsets partitions, it informs the 
group coordinator that it can materialize the pending transactional offsets in 
its main cache. The group coordinator has done this asynchronously since Apache 
Kafka 2.0, see this 
[patch|https://github.com/apache/kafka/commit/c53e274d3128bc92f0e8b6a79c407cf764f16f7b].

The above error happens when the asynchronous operation is executed by the 
scheduler and the operation finds that there are pending transactional offsets 
that were not written yet. How come?

There is actually an issue in the steps described above. The group coordinator 
does not wait for the asynchronous operation to complete before returning to 
the API layer. Hence the WriteTxnMarkers response may be sent back to the 
transaction coordinator before the async operation has actually completed, and 
the next transactional produce may also start before the operation has 
completed. This could explain why the group coordinator has pending 
transactional offsets that are not written yet.
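
To make the difference concrete, here is a minimal sketch of the two shapes of 
the handler. It is not the group coordinator's actual code: the class, fields 
and method names are all hypothetical, offsets are reduced to one long per 
producer id, and a single-threaded executor stands in for the scheduler. The 
first method returns as soon as the work is scheduled; the second returns a 
future that completes only once the pending offsets have been moved into the 
main cache, so the caller can hold back the response until then.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical, simplified model; none of these names are Kafka's real classes.
public class TxnMarkerSketch {
    // Pending transactional offsets, keyed by producer id (assumption: one offset each).
    public final Map<Long, Long> pendingOffsets = new ConcurrentHashMap<>();
    // Offsets visible in the main cache, keyed by producer id for brevity.
    public final Map<Long, Long> mainCache = new ConcurrentHashMap<>();
    // Stand-in for the scheduler; a daemon thread so the JVM can exit freely.
    public final ExecutorService scheduler = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "sketch-scheduler");
        t.setDaemon(true);
        return t;
    });

    // Racy shape: schedules the materialization and returns immediately,
    // so the caller cannot know when the main cache has been updated.
    public void onCommitMarkerFireAndForget(long producerId) {
        scheduler.submit(() -> materialize(producerId));
    }

    // Waiting shape: the returned future completes only after the pending
    // offsets have been materialized, so a response can be sent from there.
    public CompletableFuture<Void> onCommitMarker(long producerId) {
        return CompletableFuture.runAsync(() -> materialize(producerId), scheduler);
    }

    // Moves the producer's pending offset, if any, into the main cache.
    private void materialize(long producerId) {
        Long offset = pendingOffsets.remove(producerId);
        if (offset != null) {
            mainCache.put(producerId, offset);
        }
    }
}
```

With the second shape, the API layer would send the WriteTxnMarkers response 
from a callback on the future, for example 
onCommitMarker(producerId).thenRun(sendResponse), where sendResponse is again a 
hypothetical name.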

There is a similar issue when the transaction is aborted. However, on this 
path, we don’t have any checks to verify whether all the pending transactional 
offsets have been written, so we don’t see any errors in our logs. Due to the 
same race condition, it is possible to actually remove the wrong pending 
transactional offsets.
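
The abort-path interleaving can be reproduced deterministically with a small 
model. This is only an illustration under stated assumptions: the class and 
method names are hypothetical, pending offsets are reduced to one string per 
producer id, and a manual task queue replaces the scheduler so that the order 
of events is fixed.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of the abort path; names are illustrative only.
public class AbortRaceSketch {
    // Pending transactional offsets, keyed by producer id.
    public final Map<Long, String> pendingOffsets = new HashMap<>();
    // Manual task queue standing in for the scheduler, to fix the interleaving.
    public final Queue<Runnable> scheduled = new ArrayDeque<>();

    // Abort marker written: the cleanup is only queued, and it does not check
    // which transaction the pending offsets belong to.
    public void onAbortMarker(long producerId) {
        scheduled.add(() -> pendingOffsets.remove(producerId));
    }

    // A transactional offset commit records pending offsets for the producer.
    public void onTxnOffsetCommit(long producerId, String offsets) {
        pendingOffsets.put(producerId, offsets);
    }

    // Runs everything the "scheduler" has queued so far, in order.
    public void runScheduledTasks() {
        Runnable task;
        while ((task = scheduled.poll()) != null) {
            task.run();
        }
    }
}
```

Calling onTxnOffsetCommit(1L, "txn-1"), then onAbortMarker(1L), then 
onTxnOffsetCommit(1L, "txn-2"), and only then runScheduledTasks() leaves 
pendingOffsets empty: the late cleanup for the first transaction has silently 
removed the second transaction's pending offsets.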



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
