Looks great!

A few questions:

   1. What is the relationship between transaction.app.id and the existing
   config application.id in streams?
   2. The initTransactions() call is a little annoying. Can we get rid of
   it and instead initialize automatically on the first message send when a
   transaction.app.id is set, as we do with metadata? Arguably we should have
   included a general connect() or init() call in the producer, but given that
   we didn't, it seems weird that the cluster metadata initializes
   automatically on demand and the transaction metadata doesn't.
   3. The database equivalent of what we call "fetch.mode" is called
   "isolation level" and takes values like "serializable", "read
   committed", "read uncommitted". Since we went with transaction as the name
   for the thing in between the begin/commit, it might make sense to use this
   terminology for the concept and levels? I think the behavior we are
   planning is "read committed" and the alternative re-ordering behavior is
   equivalent to "serializable"?
   4. Can the PID be made 4 bytes if we handle roll-over gracefully? 2
   billion concurrent producers should be enough for anyone, right?
   5. One implication of factoring out the message set seems to be that you
   can't ever "repack" messages to improve compression beyond what is done by
   the producer. We'd talked about doing this either by buffering when writing
   or during log cleaning. This isn't a show stopper, but I think it means we
   won't be able to do this. Furthermore, with log cleaning you'd assume
   that over time ALL messages would collapse down to a single wrapper as
   compaction removes the others.
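
To make question 2 concrete, here is a rough sketch of what on-demand initialization could look like. This is not the Kafka producer API: LazyTxnProducer, its fields, and its methods are hypothetical stand-ins, and the real initTransactions() would talk to the broker rather than flip a flag. The only point is that the first send() triggers initialization when a transaction.app.id is configured, much as cluster metadata is fetched on demand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: this class is NOT the Kafka producer API.
class LazyTxnProducer {
    // Stand-in for the transaction.app.id config; null means "not transactional".
    private final String transactionAppId;
    private boolean txnInitialized = false;
    private int initCalls = 0;
    private final List<String> sent = new ArrayList<>();

    LazyTxnProducer(String transactionAppId) {
        this.transactionAppId = transactionAppId;
    }

    // Stand-in for the explicit initTransactions() call in the proposal.
    // Here it only records that initialization happened.
    private void initTransactions() {
        initCalls++;
        txnInitialized = true;
    }

    // Initialize transaction state on the first send, and only when a
    // transaction.app.id was configured.
    synchronized void send(String message) {
        if (transactionAppId != null && !txnInitialized) {
            initTransactions();
        }
        sent.add(message);
    }

    synchronized int initCalls() { return initCalls; }

    synchronized int sentCount() { return sent.size(); }

    public static void main(String[] args) {
        LazyTxnProducer producer = new LazyTxnProducer("my-app");
        producer.send("first");
        producer.send("second");
        System.out.println("init calls: " + producer.initCalls());
    }
}
```

One trade-off of this approach is that any initialization failure would surface from the first send() rather than from a dedicated call, which may or may not be acceptable.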

-Jay

On Wed, Nov 30, 2016 at 2:19 PM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hi all,
>
> I have just created KIP-98 to enhance Kafka with exactly once delivery
> semantics:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
>
> This KIP adds a transactional messaging mechanism along with an idempotent
> producer implementation to make sure that 1) duplicated messages sent from
> the same identified producer can be detected on the broker side, and 2) a
> group of messages sent within a transaction will atomically be either
> reflected and fetchable to consumers or not as a whole.
>
> The above wiki page provides a high-level view of the proposed changes as
> well as summarized guarantees. Initial draft of the detailed implementation
> design is described in this Google doc:
>
> https://docs.google.com/document/d/11Jqy_GjUGtdXJK94XGsEIK7CP1SnQGdp2eF0wSw9ra8
>
>
> We would love to hear your comments and suggestions.
>
> Thanks,
>
> -- Guozhang
>
