Hi Tom,

I updated the KIP with a note about our plans for performance testing:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-ExactlyOnceDeliveryandTransactionalMessaging-Performance

Thanks for pointing that out.

Regards,
Apurva

On Mon, Feb 6, 2017 at 7:40 AM, Tom Crayford <tcrayf...@heroku.com> wrote:

> I think the updated wiki page makes sense with respect to ACLs; there
> seems to be little potential for abuse there (other than the noted and
> known issues).
>
> I am going to note that this is a major complexity increase for Kafka,
> and that I'm concerned about the performance impact (the JVM is quite
> pedantic about method size, for example, and even adding conditionals
> to larger methods could impact this). The KIP doesn't note plans for
> performance testing.
>
> I'm also concerned about the impact on non-JVM client libraries:
> writing a client for Kafka is already a very complicated undertaking,
> and this adds to that complexity significantly.
>
> However, the approach seems ok enough. It does also violate the "Kafka
> has dumb brokers and smart clients" principle (which I realize is in
> direct contradiction of the previous statement about client
> implementation being hard). I'd love to see some discussion in either
> the doc or the wiki as to why much of this transactional work isn't a
> client-side part of Kafka Streams.
>
> On Sat, Feb 4, 2017 at 3:38 AM, Jason Gustafson <ja...@confluent.io>
> wrote:
>
> > One additional note on the authorization. The WriteTxnMarker API is
> > restricted to inter-broker usage, so it requires Cluster
> > authorization (just like other inter-broker APIs). I've updated the
> > document and wiki to reflect this.
> >
> > Also, I have renamed GroupCoordinatorRequest to
> > FindCoordinatorRequest since there is no group for transactional
> > producers. Let me know if there are any concerns.
> >
> > -Jason
> >
> > On Fri, Feb 3, 2017 at 2:35 PM, Jason Gustafson <ja...@confluent.io>
> > wrote:
> >
> > > Hi Tom,
> > >
> > >> I said this in the voting thread, but can the authors include a
> > >> section about new ACLs if there are going to be ACLs for
> > >> TransactionalId. It's mentioned in the google doc, but I think new
> > >> ACLs should be in a KIP directly.
> > >
> > > We've updated the wiki. Can you take a look and let us know if you
> > > have additional concerns?
> > >
> > > Thanks,
> > > Jason
> > >
> > > On Fri, Feb 3, 2017 at 1:52 PM, Rajini Sivaram
> > > <rajinisiva...@gmail.com> wrote:
> > >
> > >> Hi Jason,
> > >>
> > >> Thank you for the responses. Agree that authorizing
> > >> transactional.id in the producer requests will be good enough for
> > >> version 1. And making it tighter in future based on delegation
> > >> tokens sounds good too.
> > >>
> > >> Regards,
> > >>
> > >> Rajini
> > >>
> > >> On Fri, Feb 3, 2017 at 8:04 PM, Jason Gustafson
> > >> <ja...@confluent.io> wrote:
> > >>
> > >> > Hey Rajini,
> > >> >
> > >> > Thanks for the questions. Responses below:
> > >> >
> > >> > > 1. Will the transaction coordinator check topic ACLs based on
> > >> > > the requesting client's credentials? Access to transaction
> > >> > > logs, topics being added for transaction, etc.?
> > >> >
> > >> > Good question. I think it makes sense to check topic Write
> > >> > permission when adding partitions to the transaction. I'll add
> > >> > this to the document. Perhaps authorization to the transaction
> > >> > log itself, however, can be assumed from having access to the
> > >> > ProducerTransactionalId resource? This would be similar to how
> > >> > access to __consumer_offsets is assumed if the client has access
> > >> > to the Group resource.
> > >> >
> > >> > > 2.
> > >> > > If I create a transactional produce request (by hand, not
> > >> > > using the producer API) with a random PID (random, hence
> > >> > > unlikely to be in use), will the broker append a transactional
> > >> > > message to the logs, preventing the LSO from moving forward?
> > >> > > What validation will the broker do for PIDs?
> > >> >
> > >> > Yes, that is correct. Validation of the TransactionalId-to-PID
> > >> > binding is a known gap in the current proposal, and is discussed
> > >> > in the design document. Now that I'm thinking about it a bit
> > >> > more, I think there is a good case for including the
> > >> > TransactionalId in the ProduceRequest (I think Jun suggested
> > >> > this previously). Verifying it does not ensure that the included
> > >> > PID is correct, but it does ensure that the client is authorized
> > >> > to use transactions. If the client wanted to do an "endless
> > >> > transaction attack," having Write access to the topic and an
> > >> > authorized TransactionalId is all they would need anyway, even
> > >> > if we could authorize the PID itself. This seems like a
> > >> > worthwhile improvement.
> > >> >
> > >> > For future work, my half-baked idea to authorize the PID binding
> > >> > is to leverage the delegation work in KIP-48. When the PID is
> > >> > generated, we can give the producer a token which is then used
> > >> > in produce requests (say, an HMAC covering the TransactionalId
> > >> > and PID).
> > >> >
> > >> > > 3. Will every broker check that a client sending transactional
> > >> > > produce requests at least has write access to the transaction
> > >> > > log topic, since it is not validating transactional.id (for
> > >> > > every produce request)?
> > >> > >
> > >> > > 4. I understand that brokers cannot authorize the
> > >> > > transactional.id for each produce request since requests
> > >> > > contain only the PID.
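A minimal sketch of that token idea in Python (the names and the hard-coded secret are hypothetical; Kafka itself is Java, and the real key distribution would come from the KIP-48 delegation-token machinery): the coordinator issues an HMAC binding the TransactionalId to the PID when the PID is generated, and brokers verify it on each produce request.

```python
import hashlib
import hmac

# Hypothetical shared secret known to all brokers; in practice this would
# be managed by the KIP-48 delegation-token work, not a constant.
SECRET = b"cluster-shared-secret"

def issue_token(transactional_id: str, pid: int) -> bytes:
    """Coordinator side: bind the TransactionalId to its assigned PID."""
    msg = transactional_id.encode("utf-8") + pid.to_bytes(8, "big")
    return hmac.new(SECRET, msg, hashlib.sha256).digest()

def verify_token(transactional_id: str, pid: int, token: bytes) -> bool:
    """Broker side: reject produce requests whose token does not match."""
    return hmac.compare_digest(issue_token(transactional_id, pid), token)
```

A producer fabricating a random PID would then fail verification on every broker, without the broker needing to look up the TransactionalId-to-PID binding itself.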
> > >> > > But since there is a one-to-one mapping between PID and
> > >> > > transactional.id, and a connection is never expected to change
> > >> > > its transactional.id, perhaps it is feasible to add
> > >> > > authorization and cache the results in the Session? Perhaps
> > >> > > not for version 1, but it feels like it would be good to close
> > >> > > the security gap here. Obviously it would be simpler if
> > >> > > transactional.id was in the produce request, if the overhead
> > >> > > was acceptable.
> > >> >
> > >> > I think my response above addresses both of these. We should
> > >> > include the TransactionalId in the ProduceRequest. Of course it
> > >> > need not be included in the message format, so I'm not too
> > >> > concerned about the additional overhead it adds.
> > >> >
> > >> > Thanks,
> > >> > Jason
> > >> >
> > >> > On Fri, Feb 3, 2017 at 6:52 AM, Ismael Juma <ism...@juma.me.uk>
> > >> > wrote:
> > >> >
> > >> > > Comments inline.
> > >> > >
> > >> > > On Thu, Feb 2, 2017 at 6:28 PM, Jason Gustafson
> > >> > > <ja...@confluent.io> wrote:
> > >> > >
> > >> > > > Took me a while to remember why we didn't do this. The
> > >> > > > timestamp that is included at the message set level is the
> > >> > > > max timestamp of all messages in the message set, as is the
> > >> > > > case in the current message format (I will update the
> > >> > > > document to make this explicit). We could make the message
> > >> > > > timestamps relative to the max timestamp, but that makes
> > >> > > > serialization a bit awkward since the timestamps are not
> > >> > > > assumed to be increasing sequentially or monotonically. Once
> > >> > > > the messages in the message set had been determined, we
> > >> > > > would need to go back and adjust the relative timestamps.
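Rajini's Session-caching suggestion could look roughly like this (a Python sketch with hypothetical names; the real broker tracks per-connection state in a Scala Session object): the ACL check runs once per transactional.id per connection, and subsequent produce requests on that connection hit the cache.

```python
# Sketch: authorize a transactional.id once per connection and cache the
# result, since a connection is not expected to change its transactional.id.
class Session:
    def __init__(self, principal):
        self.principal = principal
        self.authorized_txn_ids = set()  # transactional.ids already approved

def is_txn_authorized(session, transactional_id, acl_check):
    """Run the (possibly expensive) ACL check at most once per
    transactional.id on this connection; serve repeats from the cache."""
    if transactional_id in session.authorized_txn_ids:
        return True
    if acl_check(session.principal, transactional_id):
        session.authorized_txn_ids.add(transactional_id)
        return True
    return False
```

Denied results are deliberately not cached here, so a client whose ACLs are later fixed is re-checked; caching denials too would be a further design choice.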
> > >> > >
> > >> > > Yes, I thought this would be a bit tricky, and hence why I
> > >> > > mentioned the option of adding a new field at the message set
> > >> > > level for the first timestamp, even though that's not ideal
> > >> > > either.
> > >> > >
> > >> > > > Here's one idea. We let the timestamps in the messages be
> > >> > > > varints, but we make their values relative to the timestamp
> > >> > > > of the previous message, with the timestamp of the first
> > >> > > > message being absolute. For example, if we had timestamps
> > >> > > > 500, 501, 499, then we would write 500 for the first
> > >> > > > message, 1 for the next, and -2 for the final message. Would
> > >> > > > that work? Let me think a bit about it and see if there are
> > >> > > > any problems.
> > >> > >
> > >> > > It's an interesting idea. Compared to the option of having the
> > >> > > first timestamp in the message set, it's a little more space
> > >> > > efficient, as we don't have both a full timestamp in the
> > >> > > message set _and_ a varint in the first message (which would
> > >> > > always be 0, so we avoid the extra byte), and the deltas could
> > >> > > also be a little smaller in the common case. The main downside
> > >> > > is that it introduces a semantic inconsistency between the
> > >> > > first message and the rest. Not ideal, but maybe we can live
> > >> > > with that.
> > >> > >
> > >> > > Ismael
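The delta scheme discussed above (first timestamp absolute, each subsequent timestamp relative to its predecessor) can be sketched in a few lines of Python, reproducing Jason's 500/501/499 example. Helper names are made up, and the actual on-wire form would be zigzag varints rather than plain ints:

```python
def encode_timestamps(timestamps):
    """First timestamp is absolute; each later one is a delta from its
    predecessor, so out-of-order stamps become small negative deltas."""
    out, prev = [], None
    for ts in timestamps:
        out.append(ts if prev is None else ts - prev)
        prev = ts
    return out

def decode_timestamps(encoded):
    """Invert the encoding by accumulating deltas."""
    out, prev = [], None
    for v in encoded:
        prev = v if prev is None else prev + v
        out.append(prev)
    return out
```

For timestamps 500, 501, 499 this yields 500, 1, -2, matching the example in the thread; decoding accumulates the deltas back to the original values.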