Hi Anna, That sounds good to me - Becket/others any thoughts?
Thanks, Joel On Fri, Jan 8, 2016 at 12:41 PM, Anna Povzner <a...@confluent.io> wrote: > Hi Becket and everyone, > > Could we please add the following functionality to this KIP. I think it > would be very useful for the broker to return the timestamp in the ack to > the producer (in response: timestamp per partition) and propagate it back > to client in RecordMetadata. This way, if timestamp type is LogAppendTime, > the producer client will see what timestamp was actually set -- and it > would match the timestamp that consumer sees. Also, returning the timestamp > in RecordMetadata is also useful for timestamp type = CreateTime, since > timestamp could be also set in KafkaProducer (if client set timestamp in > ProducerRecord to 0). > > Since this requires protocol change as well, it will be better to implement > this as part of KIP-32, rather than proposing a new KIP. > > Thanks, > Anna > > > On Fri, Jan 8, 2016 at 12:53 PM, Joel Koshy <jjkosh...@gmail.com> wrote: > > > +1 from me > > > > Looking through this thread it seems there was some confusion on the > > migration discussion. This discussion in fact happened in the KIP-31 > > discuss thread, not so much in the KIP hangout. There is considerable > > overlap in discussions between KIP-3[1,2,3] so it makes sense to > > cross-reference all of these. > > > > I'm finding the Apache list archive a little cumbersome to use (e.g., the > > current link in KIP-31 points to the beginning of September archives) but > > the emails discussing migration were in October: > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/thread > > > > Markmail has a better interface but interestingly it has not indexed any > of > > the emails from August, September and early October ( > > > > > http://markmail.org/search/?q=list%3Aorg.apache.incubator.kafka-dev+date%3A201509-201511+order%3Adate-backward > > ). > > Perhaps KIPs should include a permalink to the first message of the > DISCUSS > > thread. E.g., > > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm5jvL_dPeZWnfBD-vONgSZWOq1VL1Ss8OSUOCPXmtg8rQ%40mail.gmail.com%3E > > > > Also, just to clarify Jay's comments on the content of KIPs: I think > having > > a pseudo-code spec/implementation guide is useful (especially for > > client-side KIPs). While the motivation should definitely capture “why we > > are doing the KIP” it probably shouldn’t have to exhaustively capture > “why > > we are doing the KIP *this way*”. i.e., some of the discussions are > > extremely nuanced and in this case spans multiple KIPs so links to other > > KIPs and the discuss threads and KIP hangout recordings are perhaps > > sufficient to fill this gap - or maybe a new section that summarizes the > > discussions. > > > > Thanks, > > > > Joel > > > > On Wed, Jan 6, 2016 at 9:29 AM, Jun Rao <j...@confluent.io> wrote: > > > > > Hi, Jiangjie, > > > > > > 52. Replacing MessageSet with o.a.k.common.record will be ideal. > > > Unfortunately, we use MessageSet in SimpleConsumer, which is part of > the > > > public api. Replacing MessageSet with o.a.k.common.record will be an > > > incompatible api change. So, we probably should do this after we > > deprecate > > > SimpleConsumer. > > > > > > My original question is actually whether we just bump up magic byte in > > > Message once to incorporate both the offset and the timestamp change. > It > > > seems that the answer is yes. Could you reflect that in the KIP? > > > > > > Thanks, > > > > > > Jun > > > > > > > > > On Wed, Jan 6, 2016 at 7:01 AM, Becket Qin <becket....@gmail.com> > wrote: > > > > > > > Thanks a lot for the careful reading, Jun. > > > > Please see inline replies. > > > > > > > > > > > > > On Jan 6, 2016, at 3:24 AM, Jun Rao <j...@confluent.io> wrote: > > > > > > > > > > Jiangjie, > > > > > > > > > > Thanks for the updated KIP. Overall, a +1 on the proposal. A few > > minor > > > > > comments on the KIP. > > > > > > > > > > KIP-32: > > > > > 50. 6.c says "The log rolling has to depend on the earliest > > timestamp", > > > > > which is inconsistent with KIP-33. > > > > Corrected. > > > > > > > > > > 51. 8.b "If the time difference threshold is set to 0. The > timestamp > > in > > > > the > > > > > message is equivalent to LogAppendTime." If the time difference is > 0 > > > and > > > > > CreateTime is used, all messages will likely be rejected in this > > > > proposal. > > > > > So, it's not equivalent to LogAppendTime. > > > > Corrected. > > > > > > > > > > 52. Could you include the new value of magic byte in message format > > > > change? > > > > > Also, do we have a single new message format that includes both the > > > > offset > > > > > change (relative offset for inner messages) and the addition of > > > > timestamp? > > > > I am actually thinking about this when I am writing the patch. > > > > The timestamp will be added to the o.a.k.common.record.Record and > > > > Kafka.message.Message. The offset change is in > > > > o.a.k.common.record.MemoryRecords and Kafka.message.MessageSet. To > > avoid > > > > unnecessary changes, my current patch did not merge them together but > > > > simply make sure the version of Record(Message) and > > > > MemoryRecords(MessageSet) matches. > > > > > > > > Currently new clients uses classes in o.a.k.common.record, and the > > broker > > > > and old clients uses classes in kafka.message. > > > > I am thinking about doing the followings: > > > > 1. Migrate broker to use o.a.k.common.record. > > > > 2. Add message format V0 and V1 to o.a.k.common.protocol.Protocols. > > > > Ideally we should be able to define all the wire protocols in > > > > o.a.k.common.protocol.Protocol. So instead of having Record class to > > > parse > > > > byte arrays by itself, we can use Schema to parse the records. > > > > > > > > Would that be better? > > > > > > > > > > 53. Could you document the changes in ProducerRequest V2 and > > > FetchRequest > > > > > V2 (and the responses)? > > > > Done. > > > > > > > > > > 54. In migration phase 1, step 2, does internal ApiVersion mean > > > > > inter.broker.protocol.version? > > > > Yes. > > > > > > > > > > 55. In canary step 2.b, it says "It will only see > > > > > ProduceRequest/FetchRequest V1 from other brokers and clietns.". > But > > in > > > > > phase 2, a broker will receive FetchRequest V2 from other brokers. > > > > I meant when we canary a broker in phase 2, there will be only one > > broker > > > > entering phase 2, the other brokers will remain at phase 1. > > > > > > > > > > > > > > > KIP-33: > > > > > 60. The KIP still says maintaining index at "at minute granularity" > > > even > > > > > though the index interval is configurable now. > > > > Corrected. > > > > > > > > > > 61. In this design, it's possible for a log segment to have an > empty > > > time > > > > > index. In the worse case, we may have to scan more than the active > > > > segment > > > > > to recover the latest timestamp. > > > > Corrected. > > > > > > > > > > Thanks, > > > > > > > > > > Jun > > > > > > > > > > On Mon, Jan 4, 2016 at 11:37 AM, Aditya Auradkar < > > > > > aaurad...@linkedin.com.invalid> wrote: > > > > > > > > > >> Hey Becket/Anna - > > > > >> > > > > >> I have a few comments about the KIP. > > > > >> > > > > >> 1. (Minor) Can we rename the KIP? It's currently "Add CreateTime > and > > > > >> LogAppendTime etc..". This is actually the title of the now > rejected > > > > Option > > > > >> 1. > > > > >> 2. (Minor) Can we rename the proposed option? It isn't really > > "option > > > 4" > > > > >> anymore. > > > > >> 3. I'm not clear on what exactly happens to compressed messages > > > > >> when message.timestamp.type=LogAppendTime? Does every batch get > > > > >> recompressed because the inner message gets rewritten with the > > server > > > > >> timestamp? Or does the message set on disk have the timestamp set > to > > > > -1. In > > > > >> that case, what do we use as timestamp for the message? > > > > >> 4. Do message.timestamp.type and max.message.time.difference.ms > > need > > > > to be > > > > >> per-topic configs? It seems that this is really a client config > > i.e. a > > > > >> client is the source of timestamps not a topic. It could also be a > > > > >> broker-level config to keep things simple. > > > > >> 5. The "Proposed Changes" section in the KIP tries to build a > > > time-based > > > > >> index for query but that is a separate proposal (KIP-33). Can we > > more > > > > >> crisply identify what exactly will change when this KIP (and 31) > is > > > > >> implemented? It isn't super clear to me at this point. > > > > >> > > > > >> Aside from that, I think the "Rejected Alternatives" section of > the > > > KIP > > > > is > > > > >> excellent. Very good insight into what options were discussed and > > > > rejected. > > > > >> > > > > >> Aditya > > > > >> > > > > >>> On Mon, Dec 28, 2015 at 3:57 PM, Becket Qin < > becket....@gmail.com> > > > > wrote: > > > > >>> > > > > >>> Thanks Guozhang, Gwen and Neha for the comments. Sorry for late > > reply > > > > >>> because I only have occasional gmail access from my phone... > > > > >>> > > > > >>> I just updated the wiki for KIP-32. > > > > >>> > > > > >>> Gwen, > > > > >>> > > > > >>> Yes, the migration plan is what you described. > > > > >>> > > > > >>> I agree with your comments on the version. > > > > >>> I changed message.format.version to use the release version. > > > > >>> I did not change the internal version, we can discuss this in a > > > > separate > > > > >>> thread. > > > > >>> > > > > >>> Thanks, > > > > >>> > > > > >>> Jiangjie (Becket) Qin > > > > >>> > > > > >>> > > > > >>> > > > > >>>> On Dec 24, 2015, at 5:38 AM, Guozhang Wang <wangg...@gmail.com> > > > > wrote: > > > > >>>> > > > > >>>> Also I agree with Gwen that such changes may worth a 0.10 > release > > or > > > > >> even > > > > >>>> 1.0, having it in 0.9.1 would be quite confusing to users. > > > > >>>> > > > > >>>> Guozhang > > > > >>>> > > > > >>>>> On Wed, Dec 23, 2015 at 1:36 PM, Guozhang Wang < > > wangg...@gmail.com > > > > > > > > >>> wrote: > > > > >>>>> > > > > >>>>> Becket, > > > > >>>>> > > > > >>>>> Please let us know once you have updated the wiki page > regarding > > > the > > > > >>>>> migration plan. Thanks! > > > > >>>>> > > > > >>>>> Guozhang > > > > >>>>> > > > > >>>>>> On Wed, Dec 23, 2015 at 11:52 AM, Gwen Shapira < > > g...@confluent.io > > > > > > > > >>> wrote: > > > > >>>>>> > > > > >>>>>> Thanks Becket, Anne and Neha for responding to my concern. > > > > >>>>>> > > > > >>>>>> I had an offline discussion with Anne where she helped me > > > understand > > > > >>> the > > > > >>>>>> migration process. It isn't as bad as it looks in the KIP :) > > > > >>>>>> > > > > >>>>>> If I understand it correctly, the process (for users) will be: > > > > >>>>>> > > > > >>>>>> 1. Prepare for upgrade (set format.version = 0, ApiVersion = > > > 0.9.0) > > > > >>>>>> 2. Rolling upgrade of brokers > > > > >>>>>> 3. Bump ApiVersion to 0.9.0-1, so fetch requests between > brokers > > > > will > > > > >>> use > > > > >>>>>> the new protocol > > > > >>>>>> 4. Start upgrading clients > > > > >>>>>> 5. When "enough" clients are upgraded, bump format.version to > 1 > > > > >>> (rolling). > > > > >>>>>> > > > > >>>>>> Becket, can you confirm? > > > > >>>>>> > > > > >>>>>> Assuming this is the process, I'm +1 on the change. > > > > >>>>>> > > > > >>>>>> Reminder to coders and reviewers that pull-requests with > > > user-facing > > > > >>>>>> changes should include documentation changes as well as code > > > > changes. > > > > >>>>>> And a polite request to try to be helpful to users on when to > > use > > > > >>>>>> create-time and when to use log-append-time as configuration - > > > this > > > > >> is > > > > >>> not > > > > >>>>>> a trivial decision. > > > > >>>>>> > > > > >>>>>> A separate point I'm going to raise in a different thread is > > that > > > we > > > > >>> need > > > > >>>>>> to streamline our versions a bit: > > > > >>>>>> 1. I'm afraid that 0.9.0-1 will be confusing to users who care > > > about > > > > >>>>>> released versions (what if we forget to change it before the > > > > release? > > > > >>> Is > > > > >>>>>> it > > > > >>>>>> meaningful enough to someone running off trunk?), we need to > > come > > > up > > > > >>> with > > > > >>>>>> something that will work for both LinkedIn and everyone else. > > > > >>>>>> 2. ApiVersion has real version numbers. message.format.version > > has > > > > >>>>>> sequence > > > > >>>>>> numbers. This makes us look pretty silly :) > > > > >>>>>> > > > > >>>>>> My version concerns can be addressed separately and should not > > > hold > > > > >>> back > > > > >>>>>> this KIP. > > > > >>>>>> > > > > >>>>>> Gwen > > > > >>>>>> > > > > >>>>>> > > > > >>>>>> > > > > >>>>>> On Tue, Dec 22, 2015 at 11:01 PM, Becket Qin < > > > becket....@gmail.com> > > > > >>>>>> wrote: > > > > >>>>>> > > > > >>>>>>> Hi Anna, > > > > >>>>>>> > > > > >>>>>>> Thanks for initiating the voting process. I did not start the > > > > voting > > > > >>>>>>> process because there were still some ongoing discussion with > > Jun > > > > >>> about > > > > >>>>>> the > > > > >>>>>>> timestamp regarding compressed messages. That is why the wiki > > > page > > > > >>>>>> hasn't > > > > >>>>>>> reflected the latest conversation as Guozhang pointed out. > > > > >>>>>>> > > > > >>>>>>> Like Neha said I think we have reached general agreement on > > this > > > > >> KIP. > > > > >>> So > > > > >>>>>>> it is probably fine to start the KIP voting. At least we draw > > > more > > > > >>>>>>> attention to the KIP even if there are some new discussion to > > > bring > > > > >>> up. > > > > >>>>>>> > > > > >>>>>>> Regarding the upgrade plan, given we decided to implement > > KIP-31 > > > > and > > > > >>>>>>> KIP-32 in the same patch to avoid change binary protocol > twice, > > > the > > > > >>>>>> upgrade > > > > >>>>>>> plan was mostly discussed on the discussion thread of KIP-31. > > > > >>>>>>> > > > > >>>>>>> Since the voting has been initiated, I will update the wiki > > with > > > > >>> latest > > > > >>>>>>> conversation to avoid further confusion. > > > > >>>>>>> > > > > >>>>>>> BTW, I actually have started coding work on KIP-31 and KIP-32 > > and > > > > >> will > > > > >>>>>>> focus on the patch before I return from vacation in mid Jan > > > because > > > > >> I > > > > >>>>>> have > > > > >>>>>>> no LInkedIn VPN access in China anyway... > > > > >>>>>>> > > > > >>>>>>> Thanks, > > > > >>>>>>> > > > > >>>>>>> Jiangjie > > > > >>>>>>> > > > > >>>>>>>> On Dec 23, 2015, at 12:31 PM, Anna Povzner < > a...@confluent.io > > > > > > > >>> wrote: > > > > >>>>>>>> > > > > >>>>>>>> Hi Gwen, > > > > >>>>>>>> > > > > >>>>>>>> I just wanted to point out that I just started the vote. > > Becket > > > > >> wrote > > > > >>>>>> the > > > > >>>>>>>> proposal and led the discussions. > > > > >>>>>>>> > > > > >>>>>>>> What I understood from reading the discussion thread, the > > > > migration > > > > >>>>>> plan > > > > >>>>>>>> was discussed at the KIP meeting, and not much on the > mailing > > > list > > > > >>>>>>> itself. > > > > >>>>>>>> > > > > >>>>>>>> My question about the migration plan was same as Guozhang > > wrote: > > > > >> The > > > > >>>>>> case > > > > >>>>>>>> when an upgraded broker receives an old producer request. > The > > > > >>>>>> proposal is > > > > >>>>>>>> for the broker to fill in the timestamp field with the > current > > > > time > > > > >>> at > > > > >>>>>>> the > > > > >>>>>>>> broker. Cons: it goes against the definition of CreateTime > > type > > > of > > > > >>> the > > > > >>>>>>>> timestamp (we are "over-writing" it at the broker). Pros: It > > > looks > > > > >>>>>> like > > > > >>>>>>>> most of the use-cases would actually want that behavior, > > because > > > > >>>>>>> otherwise > > > > >>>>>>>> timestamp is useless and they will need to support an old, > > > > >>>>>> pre-timestamp, > > > > >>>>>>>> behavior. E.g., if we modify log retention policy to use the > > > > >>>>>> timestamp, > > > > >>>>>>> we > > > > >>>>>>>> would need to support an old implementation (the one that > does > > > not > > > > >>> use > > > > >>>>>>>> timestamps in the message). So I actually have a preference > > for > > > > the > > > > >>>>>>>> proposed approach. > > > > >>>>>>>> > > > > >>>>>>>> Thanks, > > > > >>>>>>>> Anna > > > > >>>>>>>> > > > > >>>>>>>>> On Tue, Dec 22, 2015 at 8:02 PM, Neha Narkhede < > > > > n...@confluent.io > > > > >>> > > > > >>>>>>> wrote: > > > > >>>>>>>>> > > > > >>>>>>>>> Hey Gwen, > > > > >>>>>>>>> > > > > >>>>>>>>> Migration plan wasn't really discussed a ton in the > previous > > > > >>> threads. > > > > >>>>>>> So it > > > > >>>>>>>>> will be great to dive deep and see if there are gaps > there. I > > > had > > > > >>>>>> some > > > > >>>>>>>>> questions, but the details listed on the KIP are great. > > > > >>>>>>>>> > > > > >>>>>>>>> It is complex, though the plan outlined in the wiki > assumes a > > > > zero > > > > >>>>>>> downtime > > > > >>>>>>>>> upgrade assuming that producers and consumers can't be > > upgraded > > > > in > > > > >>>>>>> tandem. > > > > >>>>>>>>> This is typical for companies that have a significant Kafka > > > > >>>>>> footprint, > > > > >>>>>>> like > > > > >>>>>>>>> LinkedIn. > > > > >>>>>>>>> > > > > >>>>>>>>> Thanks, > > > > >>>>>>>>> Neha > > > > >>>>>>>>> > > > > >>>>>>>>>> On Tue, Dec 22, 2015 at 7:48 PM, Gwen Shapira < > > > > g...@confluent.io > > > > >>> > > > > >>>>>>> wrote: > > > > >>>>>>>>>> > > > > >>>>>>>>>> Hi Anna, > > > > >>>>>>>>>> > > > > >>>>>>>>>> Thanks for the KIP, especially for the details on all the > > > > >>>>>> alternatives > > > > >>>>>>>>> and > > > > >>>>>>>>>> how we arrived at the proposal. Its really great! > > > > >>>>>>>>>> > > > > >>>>>>>>>> Can you point me at where the migration plan was > discussed? > > It > > > > >>> looks > > > > >>>>>>>>> overly > > > > >>>>>>>>>> complex and I have a bunch of questions, but if there was > a > > > > >>>>>> discussion, > > > > >>>>>>>>> I'd > > > > >>>>>>>>>> like to read up rather than repeating it :) > > > > >>>>>>>>>> > > > > >>>>>>>>>> Gwen > > > > >>>>>>>>>> > > > > >>>>>>>>>>> On Tue, Dec 22, 2015 at 12:34 PM, Anna Povzner < > > > > >> a...@confluent.io > > > > >>>> > > > > >>>>>>>>>> wrote: > > > > >>>>>>>>>> > > > > >>>>>>>>>>> Hi, > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> I am opening the voting thread for KIP-32: Add CreateTime > > and > > > > >>>>>>>>>>> LogAppendTime to Kafka message. > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> For reference, here's the KIP wiki: > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+CreateTime+and+LogAppendTime+to+Kafka+message > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> And the mailing list threads: > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> September: > > > > >> > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201509.mbox/%3CCAHrRUm6NMg%3DPh4HAJdxr%3DpmZhfFcD5OEV2yxj3fg%2BXnEBTW%2B3w%40mail.gmail.com%3E > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> October: > > > > >> > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201510.mbox/%3CCAHrRUm7RiBAJxwO15s1tztz%3D15oibO-QJ%2B_w8AxafTnuw3jjCw%40mail.gmail.com%3E > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> December: > > > > >> > > > > > > > > > > http://mail-archives.apache.org/mod_mbox/kafka-dev/201512.mbox/%3CCAHrRUm4ugxDYzyy26MGRGKpK4hsjT4EKTuu18M3wztYq4PE%3DaQ%40mail.gmail.com%3E > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> > > > > >>>>>>>>>>> Thanks > > > > >>>>>>>>>>> Anna > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> -- > > > > >>>>>>>>> Thanks, > > > > >>>>>>>>> Neha > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> -- > > > > >>>>> -- Guozhang > > > > >>>> > > > > >>>> > > > > >>>> > > > > >>>> -- > > > > >>>> -- Guozhang > > > > >> > > > > > > > > > >