Hi Jay, I just modified the KIP. The only concern I have about this change is that it will break existing deployments. And we need to change the command line argument format for other tools as well. It is definitely better that we conform to the unix standard; I am just not sure the change is worth it given we have been using this argument format for a while.
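To make the convention under discussion concrete - this is a hypothetical helper, not part of the KIP or of any Kafka tool - both option styles mentioned in the thread (dotted and camelCase) can be normalized mechanically to the unix-dash form:

```java
// Hypothetical illustration (not part of the KIP): how the two option
// styles in question ("consumer.rebalance.listener", "abortOnSendFail")
// would normalize to the unix-style dashed form proposed below.
class OptionNames {
    static String toUnixStyle(String name) {
        // Insert a dash at camelCase boundaries, then turn dots into dashes.
        return name.replaceAll("([a-z0-9])([A-Z])", "$1-$2")
                   .replace('.', '-')
                   .toLowerCase();
    }
}
```

A sketch like this also shows why the migration concern is real: every existing script using the old spellings would need the same one-time rewrite.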
Jiangjie (Becket) Qin

On 2/26/15, 8:40 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:
> Can we change the command line arguments for mm to match the command line arguments elsewhere? This proposal seems to have two formats: *--consumer.rebalance.listener* and *--abortOnSendFail*. The '.' separators for command line options predate this JIRA, but I think the new camelCase option is a new invention. All the other command line tools, as well as pretty much all of unix, use dashes like this: *--consumer-rebalance-listener*. I don't really know the history of this, but let's move it to normal unix dashes across the board as well as examine the options for any other inconsistencies.
>
> -Jay
>
> On Thu, Feb 26, 2015 at 11:57 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:
>> Hi Neha,
>>
>> Thanks for the comment. That's a really good point.
>>
>> Originally I was thinking about allowing users to tweak some parameters as needed. For example, some users might want to have pipelining enabled and can tolerate reordering, some might want to use acks=1 or acks=0, and some might want to move forward when an error is encountered in the callback. So we don't want to enforce all the settings of no.data.loss. Meanwhile, we want to make life easier for the users who want no data loss, so they don't need to set the configs one by one; that is why we created this option.
>>
>> But as you suggested, we can probably make the no.data.loss settings the default and remove the --no.data.loss option, so if people want to tweak the settings they can just change them; otherwise they get the default no-data-loss settings.
>>
>> I'll modify the KIP.
>>
>> Thanks.
>>
>> Jiangjie (Becket) Qin
>>
>> On 2/26/15, 8:58 AM, "Neha Narkhede" <n...@confluent.io> wrote:
>>> Hey Becket,
>>>
>>> The KIP proposes addition of a --no.data.loss command line option to the MirrorMaker. Though when would the user not want that option?
I'm >> >wondering >> >what the benefit of providing that option is if every user would want >>that >> >for correct mirroring behavior. >> > >> >Other than that, the KIP looks great! >> > >> >Thanks, >> >Neha >> > >> >On Wed, Feb 25, 2015 at 3:56 PM, Jiangjie Qin >><j...@linkedin.com.invalid> >> >wrote: >> > >> >> For 1), the current design allow you to do it. The customizable >>message >> >> handler takes in a ConsumerRecord and spit a List<ProducerRecord>, >>you >> >>can >> >> just put a topic for the ProducerRecord different from >>ConsumerRecord. >> >> >> >> WRT performance, we did some test in LinkedIn, the performance looks >> >>good >> >> to us. >> >> >> >> Jiangjie (Becket) Qin >> >> >> >> On 2/25/15, 3:41 PM, "Bhavesh Mistry" <mistry.p.bhav...@gmail.com> >> >>wrote: >> >> >> >> >Hi Jiangjie, >> >> > >> >> >It might be too late. But, I wanted to bring-up following use case >>for >> >> >adopting new MM: >> >> > >> >> >1) Ability to publish message rom src topic to different >>destination >> >> >topic >> >> >via --overidenTopics=srcTopic:newDestinationTopic >> >> > >> >> >In order to adopt, new MM enhancement customer will compare >> >>performance of >> >> >new MM and data quality while running old MM against same >>destination >> >> >cluster in Prd. >> >> > >> >> >Let me know if you agree to that or not. Also, If yes, will be >>able to >> >> >able to provide this feature in release version. >> >> > >> >> >Thanks, >> >> > >> >> >Bhavesh >> >> > >> >> > >> >> >On Tue, Feb 24, 2015 at 5:31 PM, Jiangjie Qin >> >><j...@linkedin.com.invalid> >> >> >wrote: >> >> > >> >> >> Sure! Just created the voting thread :) >> >> >> >> >> >> On 2/24/5, 4:44 PM, "Jay Kreps" <j...@confluent.io> wrote: >> >> >> >> >> >> >Hey Jiangjie, >> >> >> > >> >> >> >Let's do an official vote so that we know hat we are voting on >>and >> >>we >> >> >>are >> >> >> >crisp on what the outcome was. 
>>>>>>> This thread is very long :)
>>>>>>>
>>>>>>> -Jay
>>>>>>>
>>>>>>> On Tue, Feb 24, 2015 at 2:53 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:
>>>>>>>> I updated the KIP page based on the discussion we had.
>>>>>>>>
>>>>>>>> Should I launch another vote, or can we consider this mail thread to have already included a vote?
>>>>>>>>
>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>
>>>>>>>> On 2/11/15, 5:15 PM, "Neha Narkhede" <n...@confluent.io> wrote:
>>>>>>>>> Thanks for the explanation, Joel! Would love to see the results of the throughput experiment, and I'm a +1 on everything else, including the rebalance callback and record handler.
>>>>>>>>>
>>>>>>>>> -Neha
>>>>>>>>>
>>>>>>>>> On Wed, Feb 11, 2015 at 1:13 PM Jay Kreps <jay.kreps@gmail.com> wrote:
>>>>>>>>>> Cool, I agree with all that.
>>>>>>>>>>
>>>>>>>>>> I agree about the need for a rebalancing callback.
>>>>>>>>>>
>>>>>>>>>> Totally agree about the record handler.
>>>>>>>>>>
>>>>>>>>>> It would be great to see if a prototype of this is workable.
>>>>>>>>>>
>>>>>>>>>> Thanks guys!
>>>>>>>>>>
>>>>>>>>>> -Jay
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 11, 2015 at 12:36 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
>>>>>>>>>>> Hey Jay,
>>>>>>>>>>>
>>>>>>>>>>> Guozhang, Becket and I got together to discuss this and we think:
>>>>>>>>>>>
>>>>>>>>>>> - It seems that your proposal based on the new consumer and flush call should work.
>>>>>>>>>>> - We would likely need to call the poll with a timeout that matches the offset commit interval in order to deal with low volume mirroring pipelines.
>>>>>>>>>>> - We will still need a rebalance callback to reduce duplicates - the rebalance callback would need to flush and commit offsets.
>>>>>>>>>>> - The only remaining question is if the overall throughput is sufficient. I think someone at LinkedIn (I don't remember who) did some experiments with data channel size == 1 and ran into issues. That was not thoroughly investigated though.
>>>>>>>>>>> - The addition of flush may actually make this solution viable for the current mirror-maker (with the old consumer). We can prototype that offline and if it works out well we can redo KAFKA-1650 (i.e., refactor the current mirror maker). The flush call and the new consumer didn't exist at the time we did KAFKA-1650, so this did not occur to us.
>>>>>>>>>>> - We think the RecordHandler is still a useful small addition for the use-cases mentioned earlier in this thread.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Joel
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Feb 11, 2015 at 09:05:39AM -0800, Jay Kreps wrote:
>>>>>>>>>>>> Guozhang, I agree with 1-3; I do think what I was proposing was simpler, but perhaps there are gaps in that?
>>>>>>>>>>>>
>>>>>>>>>>>> Hey Joel - here was a sketch of what I was proposing. I do think this gets rid of manual offset tracking, especially doing so across threads with dedicated commit threads, which I think is pretty complex.
>>>>>>>>>>>> while(true) {
>>>>>>>>>>>>     val recs = consumer.poll(Long.MaxValue);
>>>>>>>>>>>>     for (rec <- recs)
>>>>>>>>>>>>         producer.send(rec, logErrorCallback)
>>>>>>>>>>>>     if(System.currentTimeMillis - lastCommit > commitInterval) {
>>>>>>>>>>>>         producer.flush()
>>>>>>>>>>>>         consumer.commit()
>>>>>>>>>>>>         lastCommit = System.currentTimeMillis
>>>>>>>>>>>>     }
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> (See the previous email for details.) I think the question is: is there any reason - performance, correctness, etc. - that this won't work? Basically I think you guys have thought about this more, so I may be missing something. If so, let's flag it while we still have leeway on the consumer.
>>>>>>>>>>>>
>>>>>>>>>>>> If we think that will work, well, I do think it is conceptually a lot simpler than the current code, though I suppose one could disagree on that.
>>>>>>>>>>>>
>>>>>>>>>>>> -Jay
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 11, 2015 at 5:53 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
>>>>>>>>>>>>> Hi Jay,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The data channels are actually a big part of the complexity of the zero data loss design, though, right?
>>>>>>>>>>>>>> Because then you need some reverse channel to flow the acks back to the consumer based on where you are, versus just acking what you have read and written (as in the code snippet I put up).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure if we are on the same page. Even if the data channel was not there, the current handling for zero data loss would remain very similar - you would need to maintain lists of unacked source offsets. I'm wondering if the KIP needs more detail on how it is currently implemented; or are you suggesting a different approach (in which case I have not fully understood)? I'm not sure what you mean by flowing acks back to the consumer - the MM commits offsets after the producer ack has been received. There is some additional complexity introduced in reducing duplicates on a rebalance - this is actually optional (since duplicates are currently a given). The reason that was done anyway is that with auto-commit turned off, duplicates are almost guaranteed on a rebalance.
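The "lists of unacked source offsets" Joel mentions can be made concrete. A minimal sketch of the bookkeeping, simplified to a single partition (the names here are made up for illustration; the real mirror maker tracks this per topic-partition): the safe commit point is the smallest offset still awaiting a producer ack, so a restart can only re-deliver, never skip.

```java
import java.util.TreeSet;

// Minimal single-partition version of the unacked-offset bookkeeping
// described above: offsets are recorded on send and removed on producer ack;
// the safe commit point is the smallest offset still awaiting an ack.
class UnackedOffsets {
    private final TreeSet<Long> unacked = new TreeSet<>();
    private long highWatermark = -1L;  // highest offset handed to the producer

    void sent(long offset) {
        unacked.add(offset);
        highWatermark = Math.max(highWatermark, offset);
    }

    void acked(long offset) {
        unacked.remove(offset);
    }

    // Offset safe to commit: everything strictly below it has been acked.
    long committableOffset() {
        return unacked.isEmpty() ? highWatermark + 1 : unacked.first();
    }
}
```

Note how acks arriving out of order (offset 2 before 1) hold the commit point back at 1 - which is exactly the complexity the thread attributes to the consume-then-produce pattern itself, independent of any data channel.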
>>>>>>>>>>>>>> I think the point that Neha and I were trying to make was that the motivation to embed stuff into MM is kind of related to how complex a simple "consume and produce" with good throughput will be. If it is simple to write such a thing in a few lines, the pain of embedding a bunch of stuff won't be worth it; if it has to be as complex as the current mm, then of course we will need all kinds of plug-ins because no one will be able to write such a thing. I don't have a huge concern with a simple plug-in, but I think if it turns into something more complex with filtering and aggregation or whatever, we really need to stop and think a bit about the design.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I agree - I don't think there is a use case for any complex plug-in. It is pretty much what Becket has described currently for the message handler - i.e., take an incoming record and return a list of outgoing records (which could be empty if you filter).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So here is my take on the MM:
>>>>>>>>>>>>> - Bare bones: simple consumer-producer pairs (0.7 style). This is ideal, but does not handle no data loss.
>>>>>>>>>>>>> - Above plus support for no data loss.
>>>>>>>>>>>>> This actually adds quite a bit of complexity.
>>>>>>>>>>>>> - Above plus the message handler. This is a trivial addition, I think, that makes the MM usable in a few other mirroring-like applications.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Joel
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Feb 10, 2015 at 12:31 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
>>>>>>>>>>>>>>> On Tue, Feb 10, 2015 at 12:13:46PM -0800, Neha Narkhede wrote:
>>>>>>>>>>>>>>>> I think all of us agree that we want to design MirrorMaker for 0 data loss. With the absence of the data channel, 0 data loss will be much simpler to implement.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The data channel is irrelevant to the implementation of zero data loss. The complexity in the implementation of no data loss that you are seeing in mirror-maker affects all consume-then-produce patterns, whether or not there is a data channel. You still need to maintain a list of unacked offsets. What I meant earlier is that we can brainstorm completely different approaches to supporting no data loss, but the current implementation is the only solution we are aware of.
>>>>>>>>>>>>>>>>> My arguments for adding a message handler are that:
>>>>>>>>>>>>>>>>> 1. It is more efficient to do something in common for all the clients in the pipeline than letting each client do the same thing many times. And there are concrete use cases for the message handler already.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What are the concrete use cases?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think Becket already described a couple of use cases earlier in the thread.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> <quote>
>>>>>>>>>>>>>>> 1. Format conversion. We have a use case where clients of the source cluster use an internal schema and clients of the target cluster use a different public schema.
>>>>>>>>>>>>>>> 2. Message filtering: Of the messages published to the source cluster, there are some messages private to source cluster clients that should not be exposed to target cluster clients. It would be difficult to publish those messages into different partitions because they need to be ordered.
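The filtering use case quoted here fits the handler shape described earlier in the thread (record in, list of records out). A sketch, with made-up stand-in types - the `Message` class and the `"private."` key prefix are assumptions for illustration only; the real handler would operate on ConsumerRecord/ProducerRecord:

```java
import java.util.Collections;
import java.util.List;

// Sketch of the handler shape from this thread: messages private to the
// source cluster are dropped by returning an empty list; everything else
// passes through unchanged. The Message type and the "private." key prefix
// are illustrative stand-ins, not real Kafka classes or conventions.
class FilteringHandler {
    static final class Message {
        final String key, value;
        Message(String key, String value) { this.key = key; this.value = value; }
    }

    // Empty list => the record never leaves the source-side mirror maker,
    // so no cross-colo bandwidth is spent on it.
    static List<Message> handle(Message in) {
        if (in.key != null && in.key.startsWith("private."))
            return Collections.emptyList();
        return Collections.singletonList(in);
    }
}
```

Because the handler runs before the record crosses to the target cluster, the bandwidth argument below follows directly: a dropped record costs nothing downstream.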
>>>>>>>>>>>>>>> I agree that we can always filter/convert messages after they are copied to the target cluster, but that costs network bandwidth unnecessarily, especially if it is a cross-colo mirror. With the handler, we can co-locate the mirror maker with the source cluster and save that cost. Also, imagine there are many downstream consumers consuming from the target cluster: filtering/reformatting the messages before they reach the target cluster is much more efficient than having each of the consumers do this individually on their own.
>>>>>>>>>>>>>>> </quote>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also, the KIP still refers to the data channel in a few places (the Motivation and "On consumer rebalance" sections). Can you update the wiki so it is easier to review the new design, especially the data loss part.
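The duplicate-reduction mechanism mentioned repeatedly in this thread hinges on one ordering: on a rebalance, flush the producer before committing consumer offsets, so an offset is never committed for a record that has not reached the target cluster. A stand-in sketch - `Producer` and `Committer` here are hypothetical interfaces, not the real client APIs:

```java
import java.util.ArrayList;
import java.util.List;

// The flush-then-commit ordering discussed in this thread, in isolation.
// Producer and Committer are illustrative stand-ins, not Kafka client APIs.
class RebalanceFlushCommit {
    interface Producer { void flush(); }
    interface Committer { void commitOffsets(); }

    static List<String> onPartitionsRevoked(Producer producer, Committer committer) {
        List<String> calls = new ArrayList<>();
        producer.flush();              // drain every in-flight send first...
        calls.add("flush");
        committer.commitOffsets();     // ...then record the offsets as done
        calls.add("commit");
        return calls;                  // returned so the order can be checked
    }
}
```

Reversing the two calls is exactly the data-loss window the KIP is trying to close: a committed offset with an unsent record behind it.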
>>>>>>>>>>>>>>>> On Tue, Feb 10, 2015 at 10:36 AM, Joel Koshy <jjkosh...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> I think the message handler adds little to no complexity to the mirror maker. Jay/Neha, the MM became scary due to the rearchitecture we did for 0.8 because of performance issues compared with 0.7 - we should remove the data channel if it can match the current throughput. I agree it is worth prototyping and testing that so the MM architecture is simplified.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The MM became a little scarier in KAFKA-1650 in order to support no data loss. I think the implementation for no data loss will remain about the same even in the new model (even without the data channel) - we can probably brainstorm more if there is a better/simpler way to do it (maybe there is in the absence of the data channel), but at the time it was the best we (i.e., Becket, myself, Jun and Guozhang, who participated in the review) could come up with.
>>>>>>>>>>>>>>>>> So I'm definitely +1 on whatever it takes to support no data loss. I think most people would want that out of the box.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As for the message handler, as Becket wrote and I agree with, it is really a trivial addition that would benefit some users (perhaps not most, but at least some). So I'm personally +1 on that as well. That said, I'm also okay with it not being there. I think the MM is fairly stand-alone and simple enough that it is entirely reasonable and absolutely feasible for companies to fork/re-implement the mirror maker for their own needs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So in summary, I'm +1 on the KIP.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Joel
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Mon, Feb 09, 2015 at 09:19:57PM +0000, Jiangjie Qin wrote:
>>>>>>>>>>>>>>>>>> I just updated the KIP page and incorporated Jay and Neha's suggestions.
>>>>>>>>>>>>>>>>>> As a brief summary of where we are:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Consensus reached: Have N independent mirror maker threads, each of which has its own consumer but shares a producer. The mirror maker threads will be responsible for decompression, compression and offset commit. No data channel or separate offset commit thread is needed. The consumer rebalance callback will be used to avoid duplicates on rebalance.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Still under discussion: Whether a message handler is needed.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> My arguments for adding a message handler are that:
>>>>>>>>>>>>>>>>>> 1. It is more efficient to do something in common for all the clients in the pipeline than letting each client do the same thing many times. And there are concrete use cases for the message handler already.
>>>>>>>>>>>>>>>>>> 2. It is not a big complicated add-on to mirror maker.
>>>>>>>>>>>>>>>>>> 3. Without a message handler, customers that need this have to re-implement all the logic of mirror maker by themselves just in order to add this handling to the pipeline.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Any thoughts?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> --Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 2/8/15, 6:35 PM, "Jiangjie Qin" <j...@linkedin.com> wrote:
>>>>>>>>>>>>>>>>>>> Hi Jay, thanks a lot for the comments.
>>>>>>>>>>>>>>>>>>> I think this solution is better. We probably don't need the data channel anymore. It can be replaced with a list of producers if we need more sender threads.
>>>>>>>>>>>>>>>>>>> I'll update the KIP page.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The reasoning about the message handler is mainly for efficiency purposes.
>>>>>>>>>>>>>>>>>>> I'm thinking that if something can be done in the pipeline for all the clients, such as filtering/reformatting, it is probably better to do it in the pipeline than to ask 100 clients to do the same thing 100 times.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> --Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 2/8/15, 4:59 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> Yeah, I second Neha's comments. The current mm code has taken something pretty simple and made it pretty scary with callbacks and wait/notify stuff. Do we believe this works? I can't tell by looking at it, which is kind of bad for something important like this.
>>>>>>>>>>>>>>>>>>>> I don't mean this as criticism; I know the history: we added in-memory queues to help with other performance problems without thinking about correctness, then we added stuff to work around the in-memory queues not losing data, and so on.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Can we instead do the opposite exercise and start with the basics of what mm should do, and think about what deficiencies prevent this approach from working? Then let's make sure the currently in-flight work will remove those deficiencies. After all, mm is kind of the prototypical kafka use case, so if we can't make our clients do this, probably no one else can.
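The loop Jay sketched earlier in this thread drives flush and commit off a wall-clock interval. That timing decision can be isolated into a few lines - a minimal sketch, assuming times are passed in as milliseconds and the Kafka calls themselves are made by the caller:

```java
// The commit-interval decision from the poll/send/flush/commit loop quoted
// earlier in the thread, isolated from the Kafka calls themselves: flush and
// commit only once more than commitIntervalMs has elapsed since the last one.
class CommitTimer {
    private final long commitIntervalMs;
    private long lastCommitMs;

    CommitTimer(long commitIntervalMs, long startMs) {
        this.commitIntervalMs = commitIntervalMs;
        this.lastCommitMs = startMs;
    }

    // True when the caller should producer.flush() then consumer.commit().
    boolean shouldCommit(long nowMs) {
        return nowMs - lastCommitMs > commitIntervalMs;
    }

    void markCommitted(long nowMs) {
        lastCommitMs = nowMs;
    }
}
```

This also makes visible the interaction Joel raised: poll's timeout must be no longer than the commit interval, or a quiet pipeline will block in poll and blow past its commit deadline.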
>>>>>>>>>>>>>>>>>>>> I think mm should just be N independent threads, each of which has its own consumer but shares a producer, and each of which looks like this:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> while(true) {
>>>>>>>>>>>>>>>>>>>>     val recs = consumer.poll(Long.MaxValue);
>>>>>>>>>>>>>>>>>>>>     for (rec <- recs)
>>>>>>>>>>>>>>>>>>>>         producer.send(rec, logErrorCallback)
>>>>>>>>>>>>>>>>>>>>     if(System.currentTimeMillis - lastCommit > commitInterval) {
>>>>>>>>>>>>>>>>>>>>         producer.flush()
>>>>>>>>>>>>>>>>>>>>         consumer.commit()
>>>>>>>>>>>>>>>>>>>>         lastCommit = System.currentTimeMillis
>>>>>>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This will depend on setting the retry count in the producer to something high with a largish backoff so that a failed send attempt doesn't drop data.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> We will need to use the callback to force a flush and offset commit on rebalance.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This approach may have a few more TCP connections due to using multiple consumers, but I think it is a lot easier to reason about, and the total number of mm instances is always going to be small.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Let's talk about where this simple approach falls short; I think that will help us understand your motivations for the additional elements.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Another advantage of this is that it is so simple I don't think we really even need to bother making mm extensible, because writing your own code that does custom processing or transformation is just ten lines, and no plug-in system is going to make it simpler.
-Jay

On Sun, Feb 8, 2015 at 2:40 PM, Neha Narkhede <n...@confluent.io> wrote:

Few comments -

1. Why do we need the message handler? Do you have concrete use cases in
mind? If not, we should consider adding it in the future when/if we do
have use cases for it. The purpose of the mirror maker is to be a simple
tool for setting up Kafka cluster replicas. I don't see why we need to
include a message handler for doing stream transformations or filtering.
You can always write a simple process for doing that once the data is
copied as-is into the target cluster.
2. Why keep both designs? We should prefer the simpler design unless it is
not feasible due to the performance issue that we previously had. Did you
get a chance to run some tests to see if that is really still a problem or
not? It will be easier to think about the design and also make the KIP
complete if we make a call on the design first.
3. Can you explain the need for keeping a list of unacked offsets per
partition? Consider adding a section on retries and how you plan to handle
the case when the producer runs out of all retries.

Thanks,
Neha

On Sun, Feb 8, 2015 at 2:06 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

Hi Neha,

Yes, I've updated the KIP so the entire KIP is based on the new consumer
now. I've put both designs, with and without the data channel, in the KIP,
as I still feel we might need the data channel to provide more
flexibility, especially after the message handler is introduced. I've put
my thinking on the pros and cons of the two designs in the KIP as well.
It'll be great if you can give it a review and comment.

Thanks.

Jiangjie (Becket) Qin

On 2/6/15, 7:30 PM, "Neha Narkhede" <n...@confluent.io> wrote:

Hey Becket,

What are the next steps on this KIP? As per your comment earlier on the
thread -

"I do agree it makes more sense to avoid duplicate effort and plan based
on the new consumer. I'll modify the KIP."

Did you get a chance to think about the simplified design that we proposed
earlier? Do you plan to update the KIP with that proposal?
Thanks,
Neha

On Wed, Feb 4, 2015 at 12:12 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

In mirror maker we do not do de-serialization of the messages. Mirror
maker uses the source TopicPartition hash to choose a producer to send
messages from the same source partition. The partitions those messages end
up in are decided by the Partitioner class in KafkaProducer (assuming you
are using the new producer), which uses the hash code of the byte[].

If deserialization is needed, it has to be done in the message handler.

Thanks.
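To make the two hash-based choices above concrete, here is an illustrative sketch only (not MirrorMaker's actual code; Arrays.hashCode and String.hashCode stand in for whatever hash functions the real code uses):

```java
import java.util.Arrays;

// Sketch: picking a producer from the source TopicPartition, and picking a
// target partition from the raw key bytes when no deserializer is present.
public class PartitionChoice {
    // All messages from one source partition go through the same producer.
    public static int producerFor(String sourceTopic, int sourcePartition, int numProducers) {
        int hash = (sourceTopic + "-" + sourcePartition).hashCode();
        return Math.abs(hash % numProducers);
    }

    // Without a key deserializer, the partition is derived from the raw
    // bytes, so the same serialized key always lands in the same partition.
    public static int partitionFor(byte[] keyBytes, int numPartitions) {
        return Math.abs(Arrays.hashCode(keyBytes) % numPartitions);
    }
}
```

Note the consequence discussed in the thread: two keys that deserialize to equal objects but serialize to different bytes would not necessarily land in the same partition; fixing that requires deserializing in the message handler.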
Jiangjie (Becket) Qin

On 2/4/15, 11:33 AM, "Bhavesh Mistry" <mistry.p.bhav...@gmail.com> wrote:

Hi Jiangjie,

Thanks for entertaining my questions so far. The last question I have is
about serialization of the message key. If the key deserialization class
is not present at the MM instance, does it use the raw byte hashcode to
determine the partition? How are you going to address the situation where
the key needs to be deserialized and the actual hashcode computed?

Thanks,

Bhavesh

On Fri, Jan 30, 2015 at 1:41 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

Hi Bhavesh,

Please see inline comments.

Jiangjie (Becket) Qin

On 1/29/15, 7:00 PM, "Bhavesh Mistry" <mistry.p.bhav...@gmail.com> wrote:

> Hi Jiangjie,
>
> Thanks for the input.
>
> a) Will the MM producer ack be attached to the producer instance or set
> per topic? The use case is that one instance of MM needs to handle both
> strong acks and also acks=0 for some topics. Or would it be better to
> set up another instance of MM?

The acks setting is a producer-level setting instead of a topic-level
setting. In this case you probably need to set up another instance.

> b) Regarding TCP connections, why does each producer instance attach to
> its own TCP connections? Is it possible to use a broker-connection TCP
> pool, where a producer just checks out a TCP connection to a broker, so
> the number of producer instances does not correlate to broker
> connections? Is this possible?

In the new producer, each producer maintains a connection to each broker
within the producer instance.
Making producer instances share the TCP connections would be a very big
change to the current design, so I suppose we won't be able to do that.

> Thanks,
>
> Bhavesh

On Thu, Jan 29, 2015 at 11:50 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

Hi Bhavesh,

I think it is the right discussion to have when we are talking about the
new design for MM. Please see the inline comments.
Jiangjie (Becket) Qin

On 1/28/15, 10:48 PM, "Bhavesh Mistry" <mistry.p.bhav...@gmail.com> wrote:

> Hi Jiangjie,
>
> I just wanted to let you know about our use case and stress the point
> that the local data center broker cluster has fewer partitions than the
> destination offline broker cluster, because we do a batch pull from
> CAMUS and need to drain data faster than the injection rate (from four
> DCs for the same topic).

Keeping the same partition number in source and target cluster will be an
option but will not be enforced by default.

> We are facing the following issues (probably due to configuration):
>
> 1) We occasionally lose data because the message batch size is too large
> (2MB) on the target side (we are using the old producer, but I think the
> new producer will solve this problem to some extent).

We do see this issue at LinkedIn as well. The new producer might also have
this issue. There are some proposals for solutions, but no real work has
started yet. For now, as a workaround, setting a more aggressive batch
size on the producer side should work.

> 2) Since only one instance is set up to MM the data, we are not able to
> set up acks per topic; instead acks are attached to the producer
> instance.

I don't quite get the question here.
> 3) How are you going to address the two-phase commit problem if acks is
> set to strongest, but auto commit is on for the consumer (meaning the
> producer does not get an ack, but the consumer auto-committed the offset
> of that message)? Is there a transactional (Kafka transaction is in
> process) based ack and offset commit?

Auto offset commit should be turned off in this case. The offset will only
be committed once by the offset commit thread. So there is no two-phase
commit.
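In consumer config terms, the answer above corresponds to something like the following sketch (new-consumer property name; turning this off leaves commits entirely to the mirror maker's offset commit thread):

```properties
# Do not let the consumer auto-commit; the mirror maker's offset commit
# thread commits only after the producer has acked the messages.
enable.auto.commit=false
```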
> 4) How are you planning to avoid duplicated messages? (Is the broker
> going to have a moving window of messages collected and de-dupe?)
> Possibly we get this from retries set to 5…?

We are not trying to completely avoid duplicates. The duplicates will
still be there if:
1. The producer retries on failure.
2. The mirror maker is hard killed.
Currently, dedup is expected to be done by the user if necessary.
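A hedged sketch of the user-side dedup mentioned above. It assumes each message carries a unique id (an assumption; e.g. embedded by the upstream producer), so that copies created by producer retries or a hard-killed mirror maker can be dropped on consumption:

```java
import java.util.HashSet;
import java.util.Set;

// Illustration only: a real implementation would bound or persist this
// set rather than keep an unbounded in-memory HashSet.
public class Dedup {
    private final Set<String> seen = new HashSet<>();

    // Returns true the first time an id is observed, false for duplicates.
    public boolean firstDelivery(String messageId) {
        return seen.add(messageId);
    }
}
```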
> 5) Lastly, is there any warning or insight you can provide from the MM
> component when the data injection rate into destination partitions is
> NOT evenly distributed, regardless of keyed or non-keyed messages?
> (Hence there is a ripple effect: data arrives late or out of order in
> terms of timestamp, sometimes early, and CAMUS creates a huge file count
> on HDFS due to the uneven injection rate. The Camus job is configured to
> run every 3 minutes.)

I think uneven data distribution is typically caused by server-side
imbalance, rather than something the mirror maker could control. In the
new mirror maker, however, there is a customizable message handler that
might be able to help a little bit. In the message handler, you can
explicitly set the partition that you want to produce the message to. So
if you know the uneven data distribution in the target cluster, you may
offset it there. But that probably only works for non-keyed messages.
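A hedged sketch of what such a message handler could do. The Record class and handle() signature here are stand-ins (the real handler interface in the KIP may differ); for non-keyed messages it assigns target partitions round-robin to counteract a known skew:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustration: a handler that overrides the target partition so non-keyed
// traffic is spread evenly, regardless of what the default partitioner
// would have chosen.
public class RebalancingHandler {
    public static class Record {
        public final byte[] value;
        public Integer targetPartition;   // null = let the partitioner decide
        public Record(byte[] value) { this.value = value; }
    }

    private final int numTargetPartitions;
    private final AtomicLong counter = new AtomicLong();

    public RebalancingHandler(int numTargetPartitions) {
        this.numTargetPartitions = numTargetPartitions;
    }

    // Spread records evenly over the target partitions.
    public Record handle(Record rec) {
        rec.targetPartition = (int) (counter.getAndIncrement() % numTargetPartitions);
        return rec;
    }
}
```

As noted above, this is only safe for non-keyed messages; for keyed messages it would break the per-key partition placement.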
> I am not sure if this is the right discussion forum to bring these to
> your/Kafka dev team's attention. This might be off track.
>
> Thanks,
>
> Bhavesh

On Wed, Jan 28, 2015 at 11:07 AM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

I've updated the KIP page. Feedback is welcome.

Regarding the simple mirror maker design, I thought it over and have some
worries. There are two things that might be worth thinking about:
1. One of the enhancements to mirror maker is adding a message handler to
do things like reformatting. I think we might potentially want to have
more threads processing the messages than the number of consumers. If we
follow the simple mirror maker solution, we lose this flexibility.
2. This might not matter too much, but creating more consumers means a
bigger footprint in TCP connections / memory.

Any thoughts on this?

Thanks.

Jiangjie (Becket) Qin

On 1/26/15, 10:35 AM, "Jiangjie Qin" <j...@linkedin.com> wrote:

Hi Jay and Neha,

Thanks a lot for the reply and explanation. I do agree it makes more sense
to avoid duplicate effort and plan based on the new consumer. I'll modify
the KIP.
To Jay's question on message ordering - the data channel selection makes sure that messages from the same source partition will be sent by the same producer. So the order of the messages is guaranteed with proper producer settings (MaxInFlightRequests=1, retries=Integer.MaxValue, etc.).
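For illustration, the "proper producer settings" mentioned here could be collected into a config like the sketch below. The keys are the new (Java) producer's config names; the bootstrap address is a placeholder, and the exact values are only one reasonable choice, not the KIP's definitive settings.

```java
import java.util.Properties;

public class OrderingProducerConfig {
    // Build producer settings that preserve per-partition send order:
    // a single in-flight request per connection means a retried batch
    // cannot be reordered behind a later one.
    static Properties orderingSafeProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "target-cluster:9092"); // placeholder address
        props.put("acks", "all");                               // wait for full ISR acknowledgement
        props.put("retries", String.valueOf(Integer.MAX_VALUE)); // retry effectively forever
        props.put("max.in.flight.requests.per.connection", "1"); // no reordering on retry
        return props;
    }

    public static void main(String[] args) {
        Properties p = orderingSafeProps();
        System.out.println(p.getProperty("max.in.flight.requests.per.connection"));
    }
}
```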
For keyed messages, because they come from the same source partition and will end up in the same target partition, as long as they are sent by the same producer, the order is guaranteed.
For non-keyed messages, the messages coming from the same source partition might go to different target partitions. The order is only guaranteed within each partition.

Anyway, I'll modify the KIP and the data channel will go away.

Thanks.
Jiangjie (Becket) Qin

On 1/25/15, 4:34 PM, "Neha Narkhede" <n...@confluent.io> wrote:

I think there is some value in investigating if we can go back to the simple mirror maker design, as Jay points out. Here you have N threads, each of which has a consumer and a producer.

The reason why we had to move away from that was a combination of the difference in throughput between the consumer and the old producer, and the deficiency of the consumer rebalancing that limits the total number of mirror maker threads. So the only option available was to increase the throughput of the limited # of mirror maker threads that could be deployed.

That queuing design may not make sense now, given that the new producer's throughput is almost similar to the consumer's AND the new round-robin based consumer rebalancing can allow a very high number of mirror maker instances to exist.

This is the end state that the mirror maker should be in once the new consumer is complete, so it wouldn't hurt to see if we can just move to that right now.
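The "simple mirror maker" design discussed here - N independent threads, each pairing one consumer with one producer and no shared queues - might be sketched as follows. The `Supplier`/`Consumer` stand-ins below are placeholders for the Kafka client APIs, not the real interfaces; this is only an illustration of the threading shape, not actual MirrorMaker code.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Supplier;

public class MirrorThreadSketch implements Runnable {
    private final Supplier<List<String>> source; // stands in for a consumer's poll()
    private final Consumer<String> sink;         // stands in for a producer's send()

    public MirrorThreadSketch(Supplier<List<String>> source, Consumer<String> sink) {
        this.source = source;
        this.sink = sink;
    }

    @Override
    public void run() {
        // Internally single-threaded: consume, then produce, in one loop,
        // with no queues shared between threads. (One pass for the sketch.)
        for (String record : source.get()) {
            sink.accept(record);
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> mirrored = new CopyOnWriteArrayList<>();
        // N = 2 fully independent mirror threads.
        Thread t1 = new Thread(new MirrorThreadSketch(() -> List.of("a", "b"), mirrored::add));
        Thread t2 = new Thread(new MirrorThreadSketch(() -> List.of("c"), mirrored::add));
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(mirrored.size());
    }
}
```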
On Fri, Jan 23, 2015 at 8:40 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

QQ: If we ever use a different technique for the data channel selection than for the producer partitioning, won't that break ordering? How can we ensure these things stay in sync?

With respect to the new consumer--I really do want to encourage people to think through how MM will work with the new consumer. I mean this isn't very far off, maybe a few months if we hustle? I could imagine us getting this mm fix done maybe sooner, maybe in a month? So I guess this buys us an extra month before we rip it out and throw it away? Maybe two? This bug has been there for a while, though, right? Is it worth it? Probably it is, but it still kind of sucks to have the duplicate effort.

So anyhow let's definitely think about how things will work with the new consumer. I think we can probably just have N threads, where each thread has a producer and consumer and is internally single threaded. Any reason this wouldn't work?
-Jay

On Wed, Jan 21, 2015 at 5:29 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

Hi Jay,

Thanks for the comments. Please see inline responses.

Jiangjie (Becket) Qin

On 1/21/15, 1:33 PM, "Jay Kreps" <jay.kr...@gmail.com> wrote:

>Hey guys,
>
>A couple questions/comments:
>
>1. The callback and user-controlled commit offset functionality is already in the new consumer which we are working on in parallel. If we accelerated that work it might help concentrate efforts. I admit this might take slightly longer in calendar time but could still probably get done this quarter. Have you guys considered that approach?

Yes, I totally agree that ideally we should put our efforts into the new consumer. The main reason for still working on the old consumer is that we expect it would still be used at LinkedIn for quite a while before the new consumer could be fully rolled out. And we have recently been suffering a lot from the mirror maker data loss issue. So our current plan is to make the necessary changes to make the current mirror maker stable in production.
Then we can test and roll out the new consumer gradually without getting burnt.

>2. I think partitioning on the hash of the topic partition is not a very good idea because that will make the case of going from a cluster with fewer partitions to one with more partitions not work. I think an intuitive way to do this would be the following:
>a. Default behavior: Just do what the producer does. I.e. if you specify a key, use it for partitioning; if not, just partition in a round-robin fashion.
>b. Add a --preserve-partition option that will explicitly inherit the partition from the source irrespective of whether there is a key or which partition that key would hash to.

Sorry that I did not explain this clearly enough.
The hash of the topic partition is only used when deciding which mirror maker data channel queue the consumer thread should put the message into. It only tries to make sure that messages from the same partition are sent by the same producer thread to guarantee the sending order. This is not at all related to which partition in the target cluster the messages end up in. That is still decided by the producer.

>3. You don't actually give the ConsumerRebalanceListener interface. What is that going to look like?

Good point! I should have put it in the wiki. I just added it.

>4. What is MirrorMakerRecord? I think ideally the MirrorMakerMessageHandler interface would take a ConsumerRecord as input and return a ProducerRecord, right?
That would >> >>allow >> >> >>you >> >> >> >>to >> >> >> >> >> > > > transform the >> >> >> >> >> > > > > > > > key, >> >> >> >> >> > > > > > > > > >>> value, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >>partition, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>or >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >destination >>topic... >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > MirrorMakerRecord is >> >> >> >>introduced >> >> >> >> >>in >> >> >> >> >> > > > > > KAFKA-1650, >> >> >> >> >> > > > > > > > > >>>which is >> >> >> >> >> > > > > > > > > >>> > >> >>exactly >> >> >> >> >> > > > > > > > > >>> > >> >> >> >>the >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>same >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > as ConsumerRecord in >> >> >> >>KAFKA-1760. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > private[kafka] class >> >> >> >> >> > MirrorMakerRecord >> >> >> >> >> > > > (val >> >> >> >> >> > > > > > > > > >>> sourceTopic: >> >> >> >> >> > > > > > > > > >>> > >> >> >>String, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > val >>sourcePartition: >> >> >>Int, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > val sourceOffset: >> >>Long, >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > val key: >>Array[Byte], >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > val value: >> >>Array[Byte]) >> >> >>{ >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > def size = >> >>value.length >> >> >>+ >> >> >> >>{if >> >> >> >> >> (key >> >> >> >> >> > == >> >> >> >> >> > > > > > null) 0 >> >> >> >> >> > > > > > > > > >>>else >> >> >> >> >> > > > > > > > > >>> > >> >> >>key.length} >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > } >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > However, because >>source >> >> >> >> >>partition >> >> >> >> >> and >> >> >> >> >> > > > offset >> >> >> >> >> > > > > > is >> >> >> >> >> > > > > > > > > 
>>>needed >> >> >> >> >> > > > > > > > > >>> in >> >> >> >> >> > > > > > > > > >>> > >> >> >>producer >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>thread >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > for consumer offsets >> >> >> >> >>bookkeeping, >> >> >> >> >> the >> >> >> >> >> > > > record >> >> >> >> >> > > > > > > > > >>>returned >> >> >> >> >> > > > > > > > > >>> by >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >> >>MirrorMakerMessageHandler >> >> >> >>needs >> >> >> >> >>to >> >> >> >> >> > > > contain >> >> >> >> >> > > > > > those >> >> >> >> >> > > > > > > > > >>> > >> >>information. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>Therefore >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > ProducerRecord does >>not >> >> >>work >> >> >> >> >>here. >> >> >> >> >> We >> >> >> >> >> > > > could >> >> >> >> >> > > > > > > > > >>>probably >> >> >> >> >> > > > > > > > > >>> let >> >> >> >> >> > > > > > > > > >>> > >> >> >>message >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>handler >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > take ConsumerRecord >>for >> >> >>both >> >> >> >> >>input >> >> >> >> >> > and >> >> >> >> >> > > > > > output. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > > >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >5. Have you guys >> >>thought >> >> >> >>about >> >> >> >> >> what >> >> >> >> >> > the >> >> >> >> >> > > > > > > > > >>>implementation >> >> >> >> >> > > > > > > > > >>> > >>will >> >> >> >> >> > > > > > > > > >>> > >> >> >>look >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>like in >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >terms of threading >> >> >> >>architecture >> >> >> >> >> etc >> >> >> >> >> > with >> >> >> >> >> > > > > > the new >> >> >> >> >> > > > > > > > > >>> > >>consumer? 
>> >> >> >> >> > > > > > > > > >>> > >> >> >>That >> >> >> >> >> > > > > > > > > >>> > >> >> >> >>will >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>be >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >soon so even if we >> >>aren't >> >> >> >> >>starting >> >> >> >> >> > with >> >> >> >> >> > > > that >> >> >> >> >> > > > > > > > let's >> >> >> >> >> > > > > > > > > >>> make >> >> >> >> >> > > > > > > > > >>> > >> >>sure >> >> >> >> >> > > > > > > > > >>> > >> >> >>we >> >> >> >> >> > > > > > > > > >>> > >> >> >> >>can >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>get >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >rid >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >of a lot of the >> >>current >> >> >> >>mirror >> >> >> >> >> maker >> >> >> >> >> > > > > > accidental >> >> >> >> >> > > > > > > > > >>> > >>complexity >> >> >> >> >> > > > > > > > > >>> > >> >>in >> >> >> >> >> > > > > > > > > >>> > >> >> >> >>terms >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>of >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > >threads and queues >> >>when >> >> >>we >> >> >> >> >>move to >> >> >> >> >> > that. >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > I haven¹t thought >> >>about it >> >> >> >> >> > throughly. The >> >> >> >> >> > > > > > quick >> >> >> >> >> > > > > > > > > >>>idea is >> >> >> >> >> > > > > > > > > >>> > >> >>after >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>migration >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> to >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > the new consumer, >>it is >> >> >> >>probably >> >> >> >> >> > better >> >> >> >> >> > > > to >> >> >> >> >> > > > > > use a >> >> >> >> >> > > > > > > > > >>>single >> >> >> >> >> > > > > > > > > >>> > >> >> >>consumer >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>thread. 
>> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > If multithread is >> >>needed, >> >> >> >> >> decoupling >> >> >> >> >> > > > > > consumption >> >> >> >> >> > > > > > > > > >>>and >> >> >> >> >> > > > > > > > > >>> > >> >>processing >> >> >> >> >> > > > > > > > > >>> > >> >> >> >>might >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>be >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > used. MirrorMaker >> >> >>definitely >> >> >> >> >>needs >> >> >> >> >> > to be >> >> >> >> >> > > > > > changed >> >> >> >> >> > > > > > > > > >>>after >> >> >> >> >> > > > > > > > > >>> > >>new >> >> >> >> >> > > > > > > > > >>> > >> >> >> >>consumer >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>get >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > checked in. I¹ll >> >>document >> >> >>the >> >> >> >> >> changes >> >> >> >> >> > > > and can >> >> >> >> >> > > > > > > > > >>>submit >> >> >> >> >> > > > > > > > > >>> > >>follow >> >> >> >> >> > > > > > > > > >>> > >> >>up >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>>patches >> >> >> >> >> > > > > > > > > >>> > >> >> >> >> >>> > after the new >>consumer >> >>is >> >> >> >> >> available. 
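The data-channel selection described under point 2 above (hash of the source topic-partition picks the queue, so one producer thread always handles a given partition) could be sketched roughly as below. This is an illustration only, with made-up names, not the actual MirrorMaker implementation.

```java
import java.util.Objects;

public class DataChannelSelect {
    // Map a source topic-partition to a data-channel queue index.
    // Every message from the same source partition lands in the same
    // queue, hence the same producer thread, preserving send order.
    static int selectQueue(String topic, int partition, int numQueues) {
        int h = Objects.hash(topic, partition);
        return (h & 0x7fffffff) % numQueues; // mask sign bit for a non-negative index
    }

    public static void main(String[] args) {
        int q1 = selectQueue("clicks", 3, 4);
        int q2 = selectQueue("clicks", 3, 4);
        // The same source partition always maps to the same queue.
        System.out.println(q1 == q2);
    }
}
```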
>> >
>> >-Jay
>> >
>> >On Tue, Jan 20, 2015 at 4:31 PM, Jiangjie Qin <j...@linkedin.com.invalid>
>> >wrote:
>> >
>> >> Hi Kafka Devs,
>> >>
>> >> We are working on Kafka Mirror Maker enhancement. A KIP is posted to
>> >> document and discuss the followings:
>> >> 1. KAFKA-1650: No data loss mirror maker change
>> >> 2. KAFKA-1839: To allow partition aware mirror.
>> >> 3. KAFKA-1840: To allow message filtering/format conversion.
>> >> Feedbacks are welcome. Please let us know if you have any questions or
>> >> concerns.
>> >>
>> >> Thanks.
>> >>
>> >> Jiangjie (Becket) Qin
>> >
>> >--
>> >Thanks,
>> >Neha