Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging

Gwen Shapira Fri, 10 Feb 2017 18:55:20 -0800

Understood, thank you!

On Fri, Feb 10, 2017 at 2:54 PM, Apurva Mehta <apu...@confluent.io> wrote:
> The point about management of transactionalId was raised before, and Jason
> tackled it here (note it was called 'AppId' back then):
> https://www.mail-archive.com/dev@kafka.apache.org/msg61200.html
>
> 1. Essentially, the full transactional functionality really only makes
> sense for a stateful application. Further, in a stateful and partitioned
> application, each shard must have a stable unique id. This can be used
> directly as the transactionalId.
>
> 2. For non-stateful applications which want transactions within a single
> session of a producer, they can generate a UUID per session, and use that
> as the transactionalId.
>
> So yes, managing transactionalIds are not trivial, but stateful
> applications will have the means to do so.
>
> The final note is that the primary way to use these transactional apis is
> through Kafka streams, and in that context, the transactionalId will be a
> combination of the streams appId and the streams taskId. Note that this is
> essentially the shardId alluded to in point 1 above, and Streams can take
> care of it transparently.
>
> Thanks,
> Apurva
>
>
>
> On Fri, Feb 10, 2017 at 2:33 PM, Gwen Shapira <g...@confluent.io> wrote:
>
>> Thank you so much for the detailed explanation!
>> I got the READ COMMITTED behavior. It is very odd that we will
>> routinely see parts of a committed transaction but not other parts.
>> But I understand the reasons.
>>
>> Regarding transactional.id: I read the document but I'm still not 100%
>> clear on how unique transactional ids can happen in practice.
>> I'm sure you've thought of this, but I'd like to understand:
>> Right now, I'm writing an consume-process-produce app, I give it a
>> consumer group.id, and it is safe to deploy the app multiple times
>> with the same configuration - because all the instances will have the
>> same group.id. But it sounds like each instance will need its own
>> transactional.id, so it can commit its own transactions? Is the idea
>> that I'll read the configuration from a file and an admin will be
>> responsible to come up with unique transactional IDs while deploying?
>> This sounds a bit more complex than I'd like. Maybe you have a simple
>> solution I am not seeing...
>>
>> Gwen
>>
>>
>>
>>
>> On Fri, Feb 10, 2017 at 12:25 AM, Jason Gustafson <ja...@confluent.io>
>> wrote:
>> > Hey Gwen,
>> >
>> > Thanks for the questions and comments. Responses below:
>> >
>> > I not sure I'm clear on the expected behavior of READ_COMMITTED in
>> >> some interleaved cases:
>> >> * If a transaction starts, sends few messages and someone writes
>> >> non-transactional event into the same topic/partition, few more events
>> >> from that transaction and then a commit. I think the producer will
>> >> block until the commit, but I'm not sure in what order I'll see events
>> >> after that.
>> >
>> > * Same for transaction A starts, transactions B starts, transaction B
>> >> commits, transaction A commits... when will we unblock? and what will
>> >> we see?
>> >
>> >
>> > In READ_COMMITTED (as well as in READ_UNCOMMITTED), the messages are
>> always
>> > returned in the order of their offsets. What controls the visibility of
>> > messages in READ_COMMITTED is the last stable offset (LSO). This is
>> defined
>> > as the last offset in the log such that all messages with smaller offsets
>> > have been decided (either committed or aborted).
>> >
>> > Take the first case: non-transactional data interleaved with
>> transactional
>> > data. Let N be a non-transactional message, T be a transactional message,
>> > and C be a commit marker. Suppose we have the following state of the log
>> > (the first row is the offsets, the second is the messages):
>> >
>> > 0, 1, 2
>> > N, T, N
>> >
>> > The LSO in this log is 1, which means the non-transactional message at
>> > offset 0 can be read by a consumer, but the one at offset 2 cannot
>> because
>> > the transaction beginning at offset 1 has not completed. Later we get
>> some
>> > more data:
>> >
>> > 0, 1, 2, 3, 4
>> > N, T, N, N, T
>> >
>> > The LSO still cannot advance because the transaction beginning at offset
>> 1
>> > has not completed. Finally we get the commit marker:
>> >
>> > 0, 1, 2, 3, 4, 5
>> > N, T, N, N, T, C
>> >
>> > At this point, the LSO advances to 6, and all of the messages from
>> offset 1
>> > to 5 are returned, in that order, to consumers.
>> >
>> > Now consider interleaved transactions. Let T1 be a transactional message
>> > from producer 1, and T2 be a transactional message from producer 2. Both
>> > producers begin a transaction:
>> >
>> > 0, 1, 2
>> > T1, T2, T1
>> >
>> > The LSO is 0 since the transaction starting from that offset has not been
>> > decided. Now suppose T1 commits.
>> >
>> > 0, 1, 2, 3
>> > T1, T2, T1, C1
>> >
>> > The LSO then advances to offset 1. The message at offset 0 becomes
>> visible
>> > to consumers in READ_COMMITTED mode, but the remaining messages from the
>> > transaction are blocked until T2 is decided. Now T2 commits:
>> >
>> > 0, 1, 2, 3, 4, 5
>> > T1, T2, T1, C1, T2, C2
>> >
>> > The LSO advances to 6, and just like above, all of the data from offsets
>> 1
>> > to 5 becomes visible.
>> >
>> > This ordering is what we refer to as "offset order." We always return
>> > messages in the order of their offsets. Another interesting option to
>> > consider is "transaction order." Consider again the state after T1
>> > committed:
>> >
>> > 0, 1, 2, 3
>> > T1, T2, T1, C1
>> >
>> > In transaction order, all of the messages from T1 become visible
>> > immediately upon receiving the commit marker. Later when T2 commits, its
>> > messages will also becomes visible. The main advantage is lower latency,
>> > but it is quite a bit more complicated to implement (see the design
>> > document for the discussion). We went with the simpler approach and left
>> > this for possible future work.
>> >
>> > i'm concerned about transactional.id uniqueness in real-world
>> conditions.
>> >> When users were forced to manually define unique broker ids, we saw
>> >> lots of errors with "duplicate broker id" on startup show up on the
>> >> mailing list.
>> >> When we added a persistent (to filesystem) auto-generated broker ID,
>> >> the number of issues dropped dramatically.
>> >> I wonder if this makes sense here too.
>> >
>> >
>> > There has been a lot of discussion about this. I will link you to the
>> > section in the design document:
>> > https://docs.google.com/document/d/11Jqy_GjUGtdXJK94XGsEIK7CP1SnQGdp2eF
>> 0wSw9ra8/edit#heading=h.lizhp2urqn59
>> > .
>> >
>> > It sounds like producers will now be idempotent by default. I.e. - if
>> >> we upgrade and do nothing different, we have idempotent producer? and
>> >> absolutely no way to get the old behavior back?
>> >
>> >
>> > This may have been a case of the KIP wiki getting out of sync with the
>> > design document. I've updated it to include the `enable.idempotence`
>> > producer config which was added previously to the design document. By
>> > default, idempotence will be disabled. Down the road, it may make sense
>> to
>> > change that, but we're uncomfortable turning it on in the first version
>> > because of the subtle changes in behavior (in particular, the fact that
>> we
>> > raise the fatal OutOfSequenceException in the case of message loss or PID
>> > expiration).
>> >
>> > Thanks,
>> > Jason
>> >
>> > On Thu, Feb 9, 2017 at 10:41 PM, Gwen Shapira <g...@confluent.io> wrote:
>> >
>> >> It sounds like producers will now be idempotent by default. I.e. - if
>> >> we upgrade and do nothing different, we have idempotent producer? and
>> >> absolutely no way to get the old behavior back?
>> >>
>> >> I don't think anyone needs non-idempotent producer, just want to
>> clarify :)
>> >>
>> >> On Tue, Feb 7, 2017 at 3:10 PM, Jason Gustafson <ja...@confluent.io>
>> >> wrote:
>> >> > A couple follow-ups from Ismael's previous comments.
>> >> >
>> >> > 1. We have removed the individual message CRC field from the message
>> >> > format. Because this field can already change on the broker in some
>> >> > situations, we feel it is probably not wise to let clients depend on
>> it,
>> >> > and removing it saves some space and the redundant verification on the
>> >> > broker. We have also proposed to deprecate the checksum() APIs which
>> are
>> >> > exposed in RecordMetadata and ConsumerRecord.
>> >> > 2. We changed the message timestamp field to be variable length. The
>> >> values
>> >> > are computed relative to the timestamp of the first message in the
>> set.
>> >> We
>> >> > considered my previous suggestion to make the first message timestamp
>> >> > absolute with later messages relative to their previous timestamp, but
>> >> > ultimately felt it was simpler to stick with the same pattern that is
>> >> used
>> >> > for relative offsets. To enable this, we added an initial timestamp
>> field
>> >> > to the message set.
>> >> >
>> >> > Thanks,
>> >> > Jason
>> >> >
>> >> > On Mon, Feb 6, 2017 at 6:01 PM, Apurva Mehta <apu...@confluent.io>
>> >> wrote:
>> >> >
>> >> >> Hello,
>> >> >>
>> >> >> I have gone ahead and updated the KIP wiki with a summary of the
>> >> changes to
>> >> >> the RPC protocol. The KIP wiki should now have _all_ the public
>> facing
>> >> >> changes being proposed.
>> >> >>
>> >> >> The proposed changes were always in the Google doc, and now we are
>> >> simply
>> >> >> making good on our promise to copy them over to the wiki since the
>> >> design
>> >> >> is almost finalized.
>> >> >>
>> >> >> Thanks,
>> >> >> Apurva
>> >> >>
>> >> >> On Mon, Feb 6, 2017 at 4:02 PM, Jason Gustafson <ja...@confluent.io>
>> >> >> wrote:
>> >> >>
>> >> >> > Hey Tom,
>> >> >> >
>> >> >> > Re; complexity. This is always a tradeoff with new features. The
>> >> changes
>> >> >> > we've made during the design and review process have greatly
>> >> simplified
>> >> >> the
>> >> >> > implementation for clients, and especially for the consumer, but
>> >> there is
>> >> >> > nontrivial work needed here to support transactions on the
>> producer.
>> >> I'm
>> >> >> > not sure how it could be otherwise and we've spent a ton of time
>> >> thinking
>> >> >> > about this. It's also worth mentioning that there's a relatively
>> low
>> >> bar
>> >> >> to
>> >> >> > support the idempotent producer while keeping the client thin (it
>> >> >> requires
>> >> >> > support for one new request type and some sequence bookkeeping).
>> >> >> >
>> >> >> > Ultimately, we have to decide whether the improved semantics are
>> worth
>> >> >> the
>> >> >> > cost of the complexity. In my opinion, they are. The benefit of
>> having
>> >> >> > exactly-once processing in streaming applications is huge. And to
>> >> manage
>> >> >> > the complexity, we've intentionally used patterns that were already
>> >> >> > familiar in the codebase (e.g. our approach to maintaining
>> transaction
>> >> >> > state through a coordinator is copied from how consumer offsets are
>> >> >> > managed). We've moved as much work from the clients to the broker
>> as
>> >> >> > possible, and we have resisted at every turn complicating the
>> client
>> >> APIs
>> >> >> > even where it may have simplified some internals.
>> >> >> >
>> >> >> > -Jason
>> >> >> >
>> >> >> > On Mon, Feb 6, 2017 at 2:55 PM, Apurva Mehta <apu...@confluent.io>
>> >> >> wrote:
>> >> >> >
>> >> >> > > Hi Tom,
>> >> >> > >
>> >> >> > > I updated the KIP with a note our plans for performance testing:
>> >> >> > >
>> >> >> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> >> >> > > 98+-+Exactly+Once+Delivery+and+Transactional+Messaging#KIP-98-
>> >> >> > > ExactlyOnceDeliveryandTransactionalMessaging-Performance
>> >> >> > >
>> >> >> > > Thanks for pointing that out.
>> >> >> > >
>> >> >> > > Regards,
>> >> >> > > Apurva
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > >
>> >> >> > > On Mon, Feb 6, 2017 at 7:40 AM, Tom Crayford <
>> tcrayf...@heroku.com>
>> >> >> > wrote:
>> >> >> > >
>> >> >> > > > I think the updated wiki page makes sense with respect to ACLs,
>> >> there
>> >> >> > > seems
>> >> >> > > > to be little potential for abuse there (other than the noted
>> and
>> >> >> known
>> >> >> > > > issues).
>> >> >> > > >
>> >> >> > > > I am going to note that this is a major complexity increase for
>> >> >> Kafka,
>> >> >> > > and
>> >> >> > > > that I'm concerned about performance impact (the JVM is quite…
>> >> >> pedantic
>> >> >> > > > about method size, for example, and even adding conditionals to
>> >> >> larger
>> >> >> > > > methods could impact this). The KIP doesn't note plans for
>> >> >> performance
>> >> >> > > > testing.
>> >> >> > > >
>> >> >> > > > I'm also concerned about the impact on non-JVM client
>> libraries -
>> >> >> > > writing a
>> >> >> > > > client for Kafka is already a very complicated undertaking, and
>> >> this
>> >> >> > adds
>> >> >> > > > to that complexity significantly.
>> >> >> > > >
>> >> >> > > > However, the approach seems ok enough. It does also violate the
>> >> >> "Kafka
>> >> >> > > has
>> >> >> > > > dumb brokers and smart clients" (which I realize is in direct
>> >> >> > > contradiction
>> >> >> > > > of the previous statement about client implementation being
>> hard).
>> >> >> I'd
>> >> >> > > love
>> >> >> > > > to see some discussion in either the doc or the wiki as to why
>> >> much
>> >> >> of
>> >> >> > > this
>> >> >> > > > transactional work isn't a client-side part of Kafka Streams.
>> >> >> > > >
>> >> >> > > > On Sat, Feb 4, 2017 at 3:38 AM, Jason Gustafson <
>> >> ja...@confluent.io>
>> >> >> > > > wrote:
>> >> >> > > >
>> >> >> > > > > One additional note on the authorization. The WriteTxnMarker
>> >> API is
>> >> >> > > > > restricted to inter-broker usage, so it requires Cluster
>> >> >> > authorization
>> >> >> > > > > (just like other inter-broker APIs). I've updated the
>> document
>> >> and
>> >> >> > wiki
>> >> >> > > > to
>> >> >> > > > > reflect this.
>> >> >> > > > >
>> >> >> > > > > Also, I have renamed GroupCoordinatorRequest to
>> >> >> > FindCoordinatorRequest
>> >> >> > > > > since there is no group for transactional producers. Let me
>> >> know if
>> >> >> > > there
>> >> >> > > > > are any concerns.
>> >> >> > > > >
>> >> >> > > > > -Jason
>> >> >> > > > >
>> >> >> > > > > On Fri, Feb 3, 2017 at 2:35 PM, Jason Gustafson <
>> >> >> ja...@confluent.io>
>> >> >> > > > > wrote:
>> >> >> > > > >
>> >> >> > > > > > Hi Tom,
>> >> >> > > > > >
>> >> >> > > > > > I said this in the voting thread, but can the authors
>> include
>> >> a
>> >> >> > > section
>> >> >> > > > > >> about new ACLs if there are going to be ACLs for
>> >> >> TransactionalId.
>> >> >> > > It's
>> >> >> > > > > >> mentioned in the google doc, but I think new ACLs should
>> be
>> >> in a
>> >> >> > KIP
>> >> >> > > > > >> directly.
>> >> >> > > > > >
>> >> >> > > > > >
>> >> >> > > > > > We've updated the wiki. Can you take a look and let us
>> know if
>> >> >> you
>> >> >> > > have
>> >> >> > > > > > additional concerns?
>> >> >> > > > > >
>> >> >> > > > > > Thanks,
>> >> >> > > > > > Jason
>> >> >> > > > > >
>> >> >> > > > > > On Fri, Feb 3, 2017 at 1:52 PM, Rajini Sivaram <
>> >> >> > > > rajinisiva...@gmail.com>
>> >> >> > > > > > wrote:
>> >> >> > > > > >
>> >> >> > > > > >> Hi Jason,
>> >> >> > > > > >>
>> >> >> > > > > >> Thank you for the responses. Agree that authorizing
>> >> >> > > transactional.id
>> >> >> > > > in
>> >> >> > > > > >> the
>> >> >> > > > > >> producer requests will be good enough for version 1. And
>> >> making
>> >> >> it
>> >> >> > > > > tighter
>> >> >> > > > > >> in future based on delegation tokens sounds good too.
>> >> >> > > > > >>
>> >> >> > > > > >> Regards,
>> >> >> > > > > >>
>> >> >> > > > > >> Rajini
>> >> >> > > > > >>
>> >> >> > > > > >>
>> >> >> > > > > >> On Fri, Feb 3, 2017 at 8:04 PM, Jason Gustafson <
>> >> >> > ja...@confluent.io
>> >> >> > > >
>> >> >> > > > > >> wrote:
>> >> >> > > > > >>
>> >> >> > > > > >> > Hey Rajini,
>> >> >> > > > > >> >
>> >> >> > > > > >> > Thanks for the questions. Responses below:
>> >> >> > > > > >> >
>> >> >> > > > > >> >
>> >> >> > > > > >> > > 1. Will the transaction coordinator check topic ACLs
>> >> based
>> >> >> on
>> >> >> > > the
>> >> >> > > > > >> > > requesting client's credentials? Access to transaction
>> >> logs,
>> >> >> > > > topics
>> >> >> > > > > >> being
>> >> >> > > > > >> > > added for transaction etc?
>> >> >> > > > > >> >
>> >> >> > > > > >> >
>> >> >> > > > > >> > Good question. I think it makes sense to check topic
>> Write
>> >> >> > > > permission
>> >> >> > > > > >> when
>> >> >> > > > > >> > adding partitions to the transaction. I'll add this to
>> the
>> >> >> > > document.
>> >> >> > > > > >> > Perhaps authorization to the transaction log itself,
>> >> however,
>> >> >> > can
>> >> >> > > be
>> >> >> > > > > >> > assumed from having access to the
>> ProducerTransactionalId
>> >> >> > > resource?
>> >> >> > > > > This
>> >> >> > > > > >> > would be similar to how access to __consumer_offsets is
>> >> >> assumed
>> >> >> > if
>> >> >> > > > the
>> >> >> > > > > >> > client has access to the Group resource.
>> >> >> > > > > >> >
>> >> >> > > > > >> > 2. If I create a transactional produce request (by hand,
>> >> not
>> >> >> > using
>> >> >> > > > the
>> >> >> > > > > >> > > producer API) with a random PID (random, hence
>> unlikely
>> >> to
>> >> >> be
>> >> >> > in
>> >> >> > > > > use),
>> >> >> > > > > >> > will
>> >> >> > > > > >> > > the broker append a transactional message to the logs,
>> >> >> > > preventing
>> >> >> > > > > LSO
>> >> >> > > > > >> > from
>> >> >> > > > > >> > > moving forward? What validation will broker do for
>> PIDs?
>> >> >> > > > > >> >
>> >> >> > > > > >> >
>> >> >> > > > > >> > Yes, that is correct. Validation of the TransactionalId
>> to
>> >> PID
>> >> >> > > > binding
>> >> >> > > > > >> is a
>> >> >> > > > > >> > known gap in the current proposal, and is discussed in
>> the
>> >> >> > design
>> >> >> > > > > >> document.
>> >> >> > > > > >> > Now that I'm thinking about it a bit more, I think there
>> >> is a
>> >> >> > good
>> >> >> > > > > case
>> >> >> > > > > >> for
>> >> >> > > > > >> > including the TransactionalId in the ProduceRequest (I
>> >> think
>> >> >> Jun
>> >> >> > > > > >> suggested
>> >> >> > > > > >> > this previously). Verifying it does not ensure that the
>> >> >> included
>> >> >> > > PID
>> >> >> > > > > is
>> >> >> > > > > >> > correct, but it does ensure that the client is
>> authorized
>> >> to
>> >> >> use
>> >> >> > > > > >> > transactions. If the client wanted to do an "endless
>> >> >> transaction
>> >> >> > > > > >> attack,"
>> >> >> > > > > >> > having Write access to the topic and an authorized
>> >> >> > transactionalID
>> >> >> > > > is
>> >> >> > > > > >> all
>> >> >> > > > > >> > they would need anyway even if we could authorize the
>> PID
>> >> >> > itself.
>> >> >> > > > This
>> >> >> > > > > >> > seems like a worthwhile improvement.
>> >> >> > > > > >> >
>> >> >> > > > > >> > For future work, my half-baked idea to authorize the PID
>> >> >> binding
>> >> >> > > is
>> >> >> > > > to
>> >> >> > > > > >> > leverage the delegation work in KIP-48. When the PID is
>> >> >> > generated,
>> >> >> > > > we
>> >> >> > > > > >> can
>> >> >> > > > > >> > give the producer a token which is then used in produce
>> >> >> requests
>> >> >> > > > (say
>> >> >> > > > > an
>> >> >> > > > > >> > hmac covering the TransactionalId and PID).
>> >> >> > > > > >> >
>> >> >> > > > > >> >
>> >> >> > > > > >> > > 3. Will every broker check that a client sending
>> >> >> transactional
>> >> >> > > > > produce
>> >> >> > > > > >> > > requests at least has write access to transaction log
>> >> topic
>> >> >> > > since
>> >> >> > > > it
>> >> >> > > > > >> is
>> >> >> > > > > >> > not
>> >> >> > > > > >> > > validating transactional.id (for every produce
>> request)?
>> >> >> > > > > >> >
>> >> >> > > > > >> >  4. I understand that brokers cannot authorize the
>> >> >> transactional
>> >> >> > > id
>> >> >> > > > > for
>> >> >> > > > > >> > each
>> >> >> > > > > >> > > produce request since requests contain only the PID.
>> But
>> >> >> since
>> >> >> > > > there
>> >> >> > > > > >> is a
>> >> >> > > > > >> > > one-to-one mapping between PID and transactional.id,
>> >> and a
>> >> >> > > > > >> connection is
>> >> >> > > > > >> > > never expected to change its transactional.id,
>> perhaps
>> >> it
>> >> >> is
>> >> >> > > > > >> feasible to
>> >> >> > > > > >> > > add authorization and cache the results in the
>> Session?
>> >> >> > Perhaps
>> >> >> > > > not
>> >> >> > > > > >> for
>> >> >> > > > > >> > > version 1, but feels like it will be good to close the
>> >> >> > security
>> >> >> > > > gap
>> >> >> > > > > >> here.
>> >> >> > > > > >> > > Obviously it would be simpler if transactional.id
>> was in
>> >> >> the
>> >> >> > > > > produce
>> >> >> > > > > >> > > request if the overhead was acceptable.
>> >> >> > > > > >> >
>> >> >> > > > > >> >
>> >> >> > > > > >> > I think my response above addresses both of these. We
>> >> should
>> >> >> > > include
>> >> >> > > > > the
>> >> >> > > > > >> > TransactionalId in the ProduceRequest. Of course it need
>> >> not
>> >> >> be
>> >> >> > > > > >> included in
>> >> >> > > > > >> > the message format, so I'm not too concerned about the
>> >> >> > additional
>> >> >> > > > > >> overhead
>> >> >> > > > > >> > it adds.
>> >> >> > > > > >> >
>> >> >> > > > > >> > Thanks,
>> >> >> > > > > >> > Jason
>> >> >> > > > > >> >
>> >> >> > > > > >> >
>> >> >> > > > > >> > On Fri, Feb 3, 2017 at 6:52 AM, Ismael Juma <
>> >> >> ism...@juma.me.uk>
>> >> >> > > > > wrote:
>> >> >> > > > > >> >
>> >> >> > > > > >> > > Comments inline.
>> >> >> > > > > >> > >
>> >> >> > > > > >> > > On Thu, Feb 2, 2017 at 6:28 PM, Jason Gustafson <
>> >> >> > > > ja...@confluent.io
>> >> >> > > > > >
>> >> >> > > > > >> > > wrote:
>> >> >> > > > > >> > >
>> >> >> > > > > >> > > > Took me a while to remember why we didn't do this.
>> The
>> >> >> > > timestamp
>> >> >> > > > > >> that
>> >> >> > > > > >> > is
>> >> >> > > > > >> > > > included at the message set level is the max
>> timestamp
>> >> of
>> >> >> > all
>> >> >> > > > > >> messages
>> >> >> > > > > >> > in
>> >> >> > > > > >> > > > the message set as is the case in the current
>> message
>> >> >> format
>> >> >> > > (I
>> >> >> > > > > will
>> >> >> > > > > >> > > update
>> >> >> > > > > >> > > > the document to make this explicit). We could make
>> the
>> >> >> > message
>> >> >> > > > > >> > timestamps
>> >> >> > > > > >> > > > relative to the max timestamp, but that makes
>> >> >> serialization
>> >> >> > a
>> >> >> > > > bit
>> >> >> > > > > >> > awkward
>> >> >> > > > > >> > > > since the timestamps are not assumed to be
>> increasing
>> >> >> > > > sequentially
>> >> >> > > > > >> or
>> >> >> > > > > >> > > > monotonically. Once the messages in the message set
>> had
>> >> >> been
>> >> >> > > > > >> > determined,
>> >> >> > > > > >> > > we
>> >> >> > > > > >> > > > would need to go back and adjust the relative
>> >> timestamps.
>> >> >> > > > > >> > > >
>> >> >> > > > > >> > >
>> >> >> > > > > >> > > Yes, I thought this would be a bit tricky and hence
>> why I
>> >> >> > > > mentioned
>> >> >> > > > > >> the
>> >> >> > > > > >> > > option of adding a new field at the message set level
>> for
>> >> >> the
>> >> >> > > > first
>> >> >> > > > > >> > > timestamp even though that's not ideal either.
>> >> >> > > > > >> > >
>> >> >> > > > > >> > > Here's one idea. We let the timestamps in the
>> messages be
>> >> >> > > varints,
>> >> >> > > > > >> but we
>> >> >> > > > > >> > > > make their values be relative to the timestamp of
>> the
>> >> >> > previous
>> >> >> > > > > >> message,
>> >> >> > > > > >> > > > with the timestamp of the first message being
>> absolute.
>> >> >> For
>> >> >> > > > > >> example, if
>> >> >> > > > > >> > > we
>> >> >> > > > > >> > > > had timestamps 500, 501, 499, then we would write
>> 500
>> >> for
>> >> >> > the
>> >> >> > > > > first
>> >> >> > > > > >> > > > message, 1 for the next, and -2 for the final
>> message.
>> >> >> Would
>> >> >> > > > that
>> >> >> > > > > >> work?
>> >> >> > > > > >> > > Let
>> >> >> > > > > >> > > > me think a bit about it and see if there are any
>> >> problems.
>> >> >> > > > > >> > > >
>> >> >> > > > > >> > >
>> >> >> > > > > >> > > It's an interesting idea. Comparing to the option of
>> >> having
>> >> >> > the
>> >> >> > > > > first
>> >> >> > > > > >> > > timestamp in the message set, It's a little more space
>> >> >> > efficient
>> >> >> > > > as
>> >> >> > > > > we
>> >> >> > > > > >> > > don't have both a full timestamp in the message set
>> >> _and_ a
>> >> >> > > varint
>> >> >> > > > > in
>> >> >> > > > > >> the
>> >> >> > > > > >> > > first message (which would always be 0, so we avoid
>> the
>> >> >> extra
>> >> >> > > > byte)
>> >> >> > > > > >> and
>> >> >> > > > > >> > > also the deltas could be a little smaller in the
>> common
>> >> >> case.
>> >> >> > > The
>> >> >> > > > > main
>> >> >> > > > > >> > > downside is that it introduces a semantics
>> inconsistency
>> >> >> > between
>> >> >> > > > the
>> >> >> > > > > >> > first
>> >> >> > > > > >> > > message and the rest. Not ideal, but maybe we can live
>> >> with
>> >> >> > > that.
>> >> >> > > > > >> > >
>> >> >> > > > > >> > > Ismael
>> >> >> > > > > >> > >
>> >> >> > > > > >> >
>> >> >> > > > > >>
>> >> >> > > > > >
>> >> >> > > > > >
>> >> >> > > > >
>> >> >> > > >
>> >> >> > >
>> >> >> >
>> >> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Gwen Shapira
>> >> Product Manager | Confluent
>> >> 650.450.2760 | @gwenshap
>> >> Follow us: Twitter | blog
>> >>
>>
>>
>>
>> --
>> Gwen Shapira
>> Product Manager | Confluent
>> 650.450.2760 | @gwenshap
>> Follow us: Twitter | blog
>>




-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

Re: [DISCUSS] KIP-98: Exactly Once Delivery and Transactional Messaging

Reply via email to