Hi Rowland,

Thank you for the feedback.  For the 2PC cases, the expectation is that the
timeout on the client would be set to "effectively infinite", exceeding all
practical 2PC delays.  On reflection, this flexibility is confusing and can
be misused, so I have updated the KIP to simply say that if 2PC is used, the
transaction never expires.
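
To make the updated semantics concrete, a client opting into 2PC would
configure roughly the following.  This is only a sketch: the broker address
and transactional.id are placeholders, and it uses plain java.util.Properties
so it does not depend on the Kafka client library.

```java
import java.util.Properties;

public class TwoPcClientConfig {
    public static void main(String[] args) {
        // Sketch of producer settings under KIP-939 (placeholders, not a
        // definitive configuration).
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // placeholder
        props.put("transactional.id", "external-coordinator-1"); // placeholder

        // Opt in to two-phase commit.
        props.put("transaction.two.phase.commit.enable", "true");

        // transaction.timeout.ms is deliberately NOT set: with 2PC enabled
        // and no client-side timeout, the transaction never expires on the
        // broker.
        System.out.println(props.containsKey("transaction.timeout.ms"));
        System.out.println(props.getProperty("transaction.two.phase.commit.enable"));
    }
}
```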

-Artem

On Thu, Jan 4, 2024 at 6:14 PM Rowland Smith <rowl...@gmail.com> wrote:

> It is probably me. I copied the original message subject into a new email.
> Perhaps that is not enough to link them.
>
> It was not my understanding from reading KIP-939 that we are doing away
> with any transactional timeout in the Kafka broker. As I understand it, we
> are allowing the application to set the transaction timeout to a value
> that exceeds the *transaction.max.timeout.ms* setting on the broker, and
> having no timeout if the application does not set *transaction.timeout.ms*
> on the producer. The KIP says that the semantics of
> *transaction.timeout.ms* are not being changed, so I take that to mean
> that the broker will continue to enforce a timeout if provided, and abort
> transactions that exceed it. From the KIP:
>
> Client Configuration Changes
>
> *transaction.two.phase.commit.enable* The default would be ‘false’.  If
> set to ‘true’, then the broker is informed that the client is
> participating in the two-phase commit protocol and can set the transaction
> timeout to values that exceed the *transaction.max.timeout.ms* setting on
> the broker (if the timeout is not set explicitly on the client and two
> phase commit is set to ‘true’, then the transaction never expires).
>
> *transaction.timeout.ms* The semantics are not changed, but it can be set
> to values that exceed *transaction.max.timeout.ms* if
> two.phase.commit.enable is set to ‘true’.
>
>
> Thinking about this more, I believe we would also have a possible race
> condition if the broker is unaware that a transaction has been prepared.
> The application might call prepare and get a positive response, but the
> broker might have already aborted the transaction for exceeding the
> timeout. It is a general rule of 2PC that once a transaction has been
> prepared, it must remain possible for it to be committed or aborted. It
> seems that in this case a prepared transaction might already have been
> aborted by the broker, so it would be impossible to commit.
>
> I hope this is making sense and I am not misunderstanding the KIP. Please
> let me know if I am.
>
> - Rowland
>
>
> On Thu, Jan 4, 2024 at 12:56 PM Justine Olshan
> <jols...@confluent.io.invalid>
> wrote:
>
> > Hey Rowland,
> >
> > Not sure why this message showed up in a different thread from the other
> > KIP-939 discussion (is it just me?)
> >
> > In KIP-939, we do away with having any transactional timeout on the
> > Kafka side. The external coordinator is fully responsible for
> > controlling whether the transaction completes.
> >
> > While I think there is some use in having a prepare stage, I just
> > wanted to clarify what the current KIP is proposing.
> >
> > Thanks,
> > Justine
> >
> > On Wed, Jan 3, 2024 at 7:49 PM Rowland Smith <rowl...@gmail.com> wrote:
> >
> > > Hi Artem,
> > >
> > > I saw your response in the thread I started discussing Kafka
> > > distributed transaction support and the XA interface. I would like to
> > > work with you to add XA support to Kafka on top of the excellent
> > > foundational work that you have started with KIP-939. I agree that
> > > explicit XA support should not be included in the Kafka codebase as
> > > long as the right set of basic operations is provided. I will begin
> > > pulling together a KIP to follow KIP-939.
> > >
> > > I did have one comment on KIP-939 itself. I see that you considered
> > > an explicit "prepare" RPC, but decided not to add it. If I understand
> > > your design correctly, that would mean that a 2PC transaction would
> > > have a single timeout that would need to be long enough to ensure
> > > that prepared transactions are not aborted when an external
> > > coordinator fails. However, this also means that an unprepared
> > > transaction would not be aborted without waiting for the same
> > > timeout. Since long-running transactions block transactional
> > > consumers, having a long timeout for all transactions could be
> > > disruptive. An explicit "prepare" RPC would allow the server to abort
> > > unprepared transactions after a relatively short timeout, and apply a
> > > much longer timeout only to prepared transactions. The explicit
> > > "prepare" RPC would make the Kafka server more resilient to client
> > > failure at the cost of an extra synchronous RPC call. I think it's
> > > worth reconsidering this.
> > >
> > > With an XA implementation this might become a more significant issue,
> > > since the transaction coordinator has no memory of unprepared
> > > transactions across restarts. Such transactions would need to be
> > > cleared by hand through the admin client even when the transaction
> > > coordinator restarts successfully.
> > >
> > > - Rowland
> > >
> >
>
