I said this in the voting thread, but can the authors include a section about new ACLs if there are going to be ACLs for TransactionalId. It's mentioned in the google doc, but I think new ACLs should be in a KIP directly.
On Thu, Feb 2, 2017 at 2:42 PM, Ismael Juma <ism...@juma.me.uk> wrote: > Thanks for the responses and updates to the document, Guozhang and Jason. > They look good. One follow-up and one additional comment: > > 1. I did some benchmarking and CRC32C seems to be a massive win when using > the hardware instruction (particularly for messages larger than 65k), so > I'm keen on taking advantage of the message format version bump to add > support for it. I can write a separate KIP for this as it's not tied to > Exactly-once, but it would be good to include the code change in the same > PR that bumps the message format version. The benchmark and results can be > found in the following link: > https://gist.github.com/ijuma/e58ad79d489cb831b290e83b46a7d7bb. > > 2. The message timestamp field is 8 bytes. Did we consider storing the > first timestamp in the message set and then storing deltas using varints in > the messages like we do for offsets (the difference would be the usage of > signed varints)? It seems like the deltas would be quite a bit smaller in > the common case (potentially 0 for log append time, so we could even not > store them at all using attributes like we do for key/value lengths). An > alternative is using MaxTimestamp that is already present in the message > set and computing deltas from that, but that seems more complicated. In any > case, details aside, was this idea considered and rejected or is it worth > exploring further? > > Ismael > > On Thu, Feb 2, 2017 at 1:02 AM, Jason Gustafson <ja...@confluent.io> > wrote: > > > Ismael, > > > > Thanks for the comments. A few responses below: > > > > > > > 2. `ProducerAppId` is a new authorization resource type. This > introduces > > a > > > compatibility issue with regards to existing third-party authorizers. > It > > > would be good to highlight this in the migration/compatibility section. > > > > > > Ack. I added a note in the migration section. > > > > 4. The Migration plan is relatively brief at the moment. Have we > > considered > > > if there's any additional work required due to KIP-97 (introduced in > > > 0.10.2.0)? > > > > > > Thanks, I added a few notes about client compatibility to the migration > > section. I covered the main issues that come to mind, but let me know if > > you think of others. > > > > 7. It seems like there is a bit of inconsistency when it comes to naming > > > convention. For example, we have `InitPIDRequest`, `PidSnapshot` > > > and `InvalidPidMapping`. The latter two match Kafka's naming > conventions. > > > There are a few other examples like that and it would be good to clean > > them > > > up. > > > > > > Let's go with InitPidRequest for consistency. Haha, "InitPIdRequest" > seems > > like a compromise which satisfies no one. > > > > > > -Jason > > > > On Wed, Feb 1, 2017 at 4:14 PM, Guozhang Wang <wangg...@gmail.com> > wrote: > > > > > Ismael, thanks for your feedbacks. Replied inline. > > > > > > On Wed, Feb 1, 2017 at 8:03 AM, Ismael Juma <ism...@juma.me.uk> wrote: > > > > > > > Hi all, > > > > > > > > A few comments follow: > > > > > > > > 1. The document states "inter-broker communications will be increased > > by > > > M > > > > * N * P round trips per sec. We need to conduct some system > performance > > > > test to make sure this additional inter-broker traffic would not > > largely > > > > impact the broker cluster". Has this testing been done? And if not, > are > > > we > > > > planning to do it soon? It seems important to validate this sooner > > rather > > > > than later. This applies more generally too, it would be great to > > > > understand how the new message format affects the producer with small > > > > messages, for example. > > > > > > > > > > > Yes we are conducting the perf tests with the message format changes in > > the > > > first stage; then the inter-broker communication with minimal > transaction > > > coordinator implementations in the second stage. > > > > > > > > > > 2. `ProducerAppId` is a new authorization resource type. This > > introduces > > > a > > > > compatibility issue with regards to existing third-party authorizers. > > It > > > > would be good to highlight this in the migration/compatibility > section. > > > > > > > > 3. I was happy to see that default values for the new configs have > been > > > > added to the document since I last checked it. It would be good to > > > explain > > > > the motivation for the choices. > > > > > > > > > > > Updated doc. > > > > > > > > > > 4. The Migration plan is relatively brief at the moment. Have we > > > considered > > > > if there's any additional work required due to KIP-97 (introduced in > > > > 0.10.2.0)? > > > > > > > > 5. transactional.id sounds good > > > > > > > > 6. Since we are keeping per message CRCs for auditing apps, have we > > > > considered mitigating the performance cost by using the more > performant > > > > CRC32c in the new message format version? > > > > > > > > > > > We have not discussed about this before. But I think it should be > doable > > as > > > long as we can include the additional conversion logic in the migration > > > plan. > > > > > > > > > > Nits: > > > > > > > > 7. It seems like there is a bit of inconsistency when it comes to > > naming > > > > convention. For example, we have `InitPIDRequest`, `PidSnapshot` > > > > and `InvalidPidMapping`. The latter two match Kafka's naming > > conventions. > > > > There are a few other examples like that and it would be good to > clean > > > them > > > > up. > > > > > > > > > > > I agree with the inconsistency issue. About the name itself though, > > should > > > it be "InitPIdRequest", "PIdSnapshot" "InvalidPIdMapping" though, since > > we > > > need to capitalize "I" right? > > > > > > > > > > 8. The document states "The first four fields of a message set in > this > > > > format must to be the same as the existing format because any fields > > > before > > > > the magic byte cannot be changed in order to provide a path for > > upgrades > > > > following a similar approach as was used in KIP-32". This makes > things > > > > easier, but it seems to me that the only strict requirement is that > the > > > > magic byte remains in the same offset and with the same size. > > > > > > > > > > > I agree theoretically it is not required, but I think in practice it is > > > actually better to make it more restrict: the three fields before magic > > > byte are offset, length, and crc. Among them, crc needs to be before > > magic > > > byte if it wants to cover the magic byte fields; length would better be > > > before the magic byte as well for pre-allocate memory to > deser/decompress > > > the message set, and the only field that does not matter too much to be > > > after magic byte is offset, but in KIP-98 we will use it as the base > > offset > > > for message set and some validation checks can be optimized to not scan > > > through the whole message with this field in front of the format. > > > > > > > > > > On Mon, Jan 30, 2017 at 10:37 PM, Guozhang Wang <wangg...@gmail.com> > > > > wrote: > > > > > > > > > Hello Folks, > > > > > > > > > > We have addressed all the comments collected so far, and would like > > to > > > > > propose a voting thread this Wednesday. If you have any further > > > comments > > > > on > > > > > this KIP, please feel free to continue sending them on this thread > > > before > > > > > that. > > > > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > On Mon, Jan 30, 2017 at 1:10 PM, Jason Gustafson < > ja...@confluent.io > > > > > > > > wrote: > > > > > > > > > > > +1 for transactional.id. > > > > > > > > > > > > -Jason > > > > > > > > > > > > On Mon, Jan 30, 2017 at 1:05 PM, Guozhang Wang < > wangg...@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > > If I have to choose between app.id and > transactional.instance.id > > , > > > > I'd > > > > > > > choose the latter. > > > > > > > > > > > > > > Renaming transactional.instance.id to transactional.id sounds > > even > > > > > > better. > > > > > > > > > > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > > > > > > > On Mon, Jan 30, 2017 at 11:49 AM, Apurva Mehta < > > > apu...@confluent.io> > > > > > > > wrote: > > > > > > > > > > > > > > > > Bumping one suggestion from Apurva above. The name "AppID" > > has > > > > > caused > > > > > > > > some > > > > > > > > > confusion. We're considering the following renaming: > > > > > > > > > > > > > > > > > > 1. AppID -> ProducerId (transaction.app.id -> producer.id) > > > > > > > > > 2. PID -> IPID (internal producer ID) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > How about AppId -> TransactionalId (transaction.app.id -> > > > > > > > transactional.id > > > > > > > > ) > > > > > > > > > > > > > > > > This makes it clear that this id just needs to be set when > the > > > > > > > application > > > > > > > > wishes to use transactions. I also think it is more intuitive > > in > > > > the > > > > > > > > context of how this id is used, viz. to maintain transactions > > > > across > > > > > > > > producer sessions. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Apurva > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > -- Guozhang > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > -- Guozhang > > > > > > > > > > > > > > > > > > > > > -- > > > -- Guozhang > > > > > >