I'm pretty satisfied with the current workarounds (Avro container format), so I'm not too excited about the extra work required to do headers in Kafka. I absolutely don't mind it if you do it... I think the Apache convention for "good idea, but not willing to put any work toward it" is +0.5? Anyway, that's what I was trying to convey :)
On Thu, Dec 1, 2016 at 3:05 PM, Todd Palino <tpal...@gmail.com> wrote:
> Well I guess my question for you, then, is what is holding you back from
> full support for headers? What’s the bit that you’re missing that has you
> under a full +1?
>
> -Todd
>
>
> On Thu, Dec 1, 2016 at 1:59 PM, Gwen Shapira <g...@confluent.io> wrote:
>
>> I know why people who support headers support them, and I've seen what
>> the discussion is like.
>>
>> This is why I'm asking people who are against headers (especially
>> committers) what will make them change their mind - so we can get this
>> part over one way or another.
>>
>> If I sound frustrated it is not at Radai, Jun or you (Todd)... I am
>> just looking for something concrete we can do to move the discussion
>> along to the yummy design details (which is the argument I really am
>> looking forward to).
>>
>> On Thu, Dec 1, 2016 at 1:53 PM, Todd Palino <tpal...@gmail.com> wrote:
>> > So, Gwen, to your question (even though I’m not a committer)...
>> >
>> > I have always been a strong supporter of introducing the concept of an
>> > envelope to messages, which headers accomplishes. The message key is
>> > already an example of a piece of envelope information. By providing a
>> > means to do this within Kafka itself, and not relying on use-case
>> > specific implementations, you make it much easier for components to
>> > interoperate. It simplifies development of all these things (message
>> > routing, auditing, encryption, etc.) because each one does not have to
>> > reinvent the wheel.
>> >
>> > It also makes it much easier from a client point of view if the headers
>> > are defined as part of the protocol and/or message format in general
>> > because you can easily produce and consume messages without having to
>> > take into account specific cases.
>> > For example, I want to route messages, but client A doesn’t support
>> > the way audit implemented headers, and client B doesn’t support the way
>> > encryption or routing implemented headers, so now my application has to
>> > create some really fragile (my autocorrect just tried to make that
>> > “tragic”, which is probably appropriate too) code to strip everything
>> > off, rather than just consuming the messages, picking out the 1 or 2
>> > headers it’s interested in, and performing its function.
>> >
>> > Honestly, this discussion has been going on for a long time, and it’s
>> > always “Oh, you came up with 2 use cases, and yeah, those use cases are
>> > real things that someone would want to do. Here’s an alternate way to
>> > implement them so let’s not do headers.” If we have a few use cases
>> > that we actually came up with, you can be sure that over the next year
>> > there’s a dozen others that we didn’t think of that someone would like
>> > to do. I really think it’s time to stop rehashing this discussion and
>> > instead focus on a workable standard that we can adopt.
>> >
>> > -Todd
>> >
>> >
>> > On Thu, Dec 1, 2016 at 1:39 PM, Todd Palino <tpal...@gmail.com> wrote:
>> >
>> >> C. per message encryption
>> >>> One drawback of this approach is that this significantly reduce the
>> >>> effectiveness of compression, which happens on a set of serialized
>> >>> messages. An alternative is to enable SSL for wire encryption and rely
>> >>> on the storage system (e.g. LUKS) for at rest encryption.
>> >>
>> >>
>> >> Jun, this is not sufficient. While this does cover the case of removing
>> >> a drive from the system, it will not satisfy most compliance
>> >> requirements for encryption of data as whoever has access to the broker
>> >> itself still has access to the unencrypted data.
>> >> For end-to-end encryption you need to encrypt at the producer, before
>> >> it enters the system, and decrypt at the consumer, after it exits the
>> >> system.
>> >>
>> >> -Todd
>> >>
>> >>
>> >> On Thu, Dec 1, 2016 at 1:03 PM, radai <radai.rosenbl...@gmail.com> wrote:
>> >>
>> >>> another big plus of headers in the protocol is that it would enable
>> >>> rapid iteration on ideas outside of core kafka and would reduce the
>> >>> number of future wire format changes required.
>> >>>
>> >>> a lot of what is currently a KIP represents use cases that are not 100%
>> >>> relevant to all users, and some of them require rather invasive wire
>> >>> protocol changes. i think a good recent example of this is kip-98.
>> >>> tx-utilizing traffic is expected to be a very small fraction of total
>> >>> traffic and yet the changes are invasive.
>> >>>
>> >>> every such wire format change translates into painful and slow
>> >>> adoption of new versions.
>> >>>
>> >>> i think a lot of functionality currently in KIPs could be "spun out"
>> >>> and implemented as opt-in plugins transmitting data over headers. this
>> >>> would keep the core wire format stable(r), core codebase smaller, and
>> >>> avoid the "burden of proof" thats sometimes required to prove a certain
>> >>> feature is useful enough for a wide-enough audience to warrant a wire
>> >>> format change and code complexity additions.
>> >>>
>> >>> (to be clear - kip-98 goes beyond "mere" wire format changes and im not
>> >>> saying it could have been completely done with headers, but
>> >>> exactly-once delivery certainly could)
>> >>>
>> >>> On Thu, Dec 1, 2016 at 11:20 AM, Gwen Shapira <g...@confluent.io> wrote:
>> >>>
>> >>> > On Thu, Dec 1, 2016 at 10:24 AM, radai <radai.rosenbl...@gmail.com> wrote:
>> >>> > > "For use cases within an organization, one could always use other
>> >>> > > approaches such as company-wise containers"
>> >>> > > this is what linkedin has traditionally done but there are now
>> >>> > > cases (read - topics) where this is not acceptable. this makes
>> >>> > > headers useful even within single orgs for cases where
>> >>> > > one-container-fits-all cannot apply.
>> >>> > >
>> >>> > > as for the particular use cases listed, i dont want this to devolve
>> >>> > > to a discussion of particular use cases - i think its enough that
>> >>> > > some of them
>> >>> >
>> >>> > I think a main point of contention is that: We identified few
>> >>> > use-cases where headers are useful, do we want Kafka to be a system
>> >>> > that supports those use-cases?
>> >>> >
>> >>> > For example, Jun said:
>> >>> > "Not sure how widely useful record-level lineage is though since the
>> >>> > overhead could be significant."
>> >>> >
>> >>> > We know NiFi supports record level lineage. I don't think it was
>> >>> > developed for lols, I think it is safe to assume that the NSA needed
>> >>> > that functionality. We also know that certain financial institutes
>> >>> > need to track tampering with records at a record level and there are
>> >>> > federal regulations that absolutely require this. They also need to
>> >>> > prove that routing apps that "touches" the messages and either reads
>> >>> > or updates headers couldn't have possibly modified the payload
>> >>> > itself.
>> >>> > They use record level encryption to do that - apps can read and
>> >>> > (sometimes) modify headers but can't touch the payload.
>> >>> >
>> >>> > We can totally say "those are corner cases and not worth adding
>> >>> > headers to Kafka for", they should use a different pubsub message for
>> >>> > that (Nifi or one of the other 1000 that cater specifically to the
>> >>> > financial industry).
>> >>> >
>> >>> > But this gets us into a catch 22:
>> >>> > If we discuss a specific use-case, someone can always say it isn't
>> >>> > interesting enough for Kafka. If we discuss more general trends,
>> >>> > others can say "well, we are not sure any of them really needs
>> >>> > headers specifically. This is just hand waving and not interesting.".
>> >>> >
>> >>> > I think discussing use-cases in specifics is super important to
>> >>> > decide implementation details for headers (my use-cases lean toward
>> >>> > numerical keys with namespaces and object values, others differ), but
>> >>> > I think we need to answer the general "Are we going to have headers"
>> >>> > question first.
>> >>> >
>> >>> > I'd love to hear from the other committers in the discussion:
>> >>> > What would it take to convince you that headers in Kafka are a good
>> >>> > idea in general, so we can move ahead and try to agree on the
>> >>> > details?
>> >>> >
>> >>> > I feel like we keep moving the goal posts and this is truly
>> >>> > exhausting.
>> >>> >
>> >>> > For the record, I mildly support adding headers to Kafka (+0.5?).
>> >>> > The community can continue to find workarounds to the issue and there
>> >>> > are some benefits to keeping the message format and clients simpler.
>> >>> > But I see the usefulness of headers to many use-cases and if we can
>> >>> > find a good and generally useful way to add it to Kafka, it will make
>> >>> > Kafka easier to use for many - worthy goal in my eyes.
>> >>> >
>> >>> > > are interesting/feasible, but:
>> >>> > > A+B. i think there are use cases for polyglot topics. especially if
>> >>> > > kafka is being used to "trunk" something else.
>> >>> > > D. multiple topics would make it harder to write portable consumer
>> >>> > > code. partition remapping would mess with locality of consumption
>> >>> > > guarantees.
>> >>> > > E+F. a use case I see for lineage/metadata is billing/chargeback.
>> >>> > > for that use case it is not enough to simply record the point of
>> >>> > > origin, but every replication stop (think mirror maker) must also
>> >>> > > add a record to form a "transit log".
>> >>> > >
>> >>> > > as for stream processing on top of kafka - i know samza has a
>> >>> > > metadata map which they carry around in addition to user values.
>> >>> > > headers are the perfect fit for these things.
>> >>> > >
>> >>> > >
>> >>> > > On Wed, Nov 30, 2016 at 6:50 PM, Jun Rao <j...@confluent.io> wrote:
>> >>> > >
>> >>> > >> Hi, Michael,
>> >>> > >>
>> >>> > >> In order to answer the first two questions, it would be helpful
>> >>> > >> if we could identify 1 or 2 strong use cases for headers in the
>> >>> > >> space for third-party vendors. For use cases within an
>> >>> > >> organization, one could always use other approaches such as
>> >>> > >> company-wise containers to get around w/o headers. I went through
>> >>> > >> the use cases in the KIP and in Radai's wiki (
>> >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers
>> >>> > >> ).
>> >>> > >> The following are the ones that I understand and could be in the
>> >>> > >> third-party use case category.
>> >>> > >>
>> >>> > >> A. content-type
>> >>> > >> It seems that in general, content-type should be set at the topic
>> >>> > >> level. Not sure if mixing messages with different content types
>> >>> > >> should be encouraged.
>> >>> > >>
>> >>> > >> B. schema id
>> >>> > >> Since the value is mostly useless without schema id, it seems that
>> >>> > >> storing the schema id together with serialized bytes in the value
>> >>> > >> is better?
>> >>> > >>
>> >>> > >> C. per message encryption
>> >>> > >> One drawback of this approach is that this significantly reduce
>> >>> > >> the effectiveness of compression, which happens on a set of
>> >>> > >> serialized messages. An alternative is to enable SSL for wire
>> >>> > >> encryption and rely on the storage system (e.g. LUKS) for at rest
>> >>> > >> encryption.
>> >>> > >>
>> >>> > >> D. cluster ID for mirroring across Kafka clusters
>> >>> > >> This is actually interesting. Today, to avoid introducing cycles
>> >>> > >> when doing mirroring across data centers, one would either have to
>> >>> > >> set up two Kafka clusters (a local and an aggregate) per data
>> >>> > >> center or rename topics. Neither is ideal. With headers, the
>> >>> > >> producer could tag each message with the producing cluster ID in
>> >>> > >> the header. MirrorMaker could then avoid mirroring messages to a
>> >>> > >> cluster if they are tagged with the same cluster id.
>> >>> > >>
>> >>> > >> However, an alternative approach is to introduce sth like
>> >>> > >> hierarchical topic and store messages from different clusters in
>> >>> > >> different partitions under the same topic. This approach avoids
>> >>> > >> filtering out unneeded data and makes offset preserving easier to
>> >>> > >> support. It may make compaction trickier though since the same key
>> >>> > >> may show up in different partitions.
>> >>> > >>
>> >>> > >> E. record-level lineage
>> >>> > >> For example, a source connector could store in the message the
>> >>> > >> metadata (e.g. UUID) of the source record.
>> >>> > >> Similarly, if a stream job transforms messages from topic A to
>> >>> > >> topic B, the library could include the source message offset in
>> >>> > >> each of the transformed message in the header. Not sure how widely
>> >>> > >> useful record-level lineage is though since the overhead could be
>> >>> > >> significant.
>> >>> > >>
>> >>> > >> F. auditing metadata
>> >>> > >> We could put things like clientId/host/user in the header in each
>> >>> > >> message for auditing. These metadata are really at the producer
>> >>> > >> level though. So, a more efficient way is to only include a
>> >>> > >> "producerId" per message and send the producerId -> metadata
>> >>> > >> mapping independently. KIP-98 is actually proposing including such
>> >>> > >> a producerId natively in the message.
>> >>> > >>
>> >>> > >> So, overall, I am not sure that I am fully convinced of the strong
>> >>> > >> third-party use cases of headers yet. Perhaps we could discuss a
>> >>> > >> bit more to make one or two really convincing use cases.
>> >>> > >>
>> >>> > >> Another orthogonal question is whether header should be exposed
>> >>> > >> in stream processing systems such as Kafka stream, Samza, and Spark
>> >>> > >> streaming. Currently, those systems just deal with key/value
>> >>> > >> pairs. Should we expose a third thing header there too or somehow
>> >>> > >> map header to key or value?
>> >>> > >>
>> >>> > >> Thanks,
>> >>> > >>
>> >>> > >> Jun
>> >>> > >>
>> >>> > >>
>> >>> > >> On Tue, Nov 29, 2016 at 3:35 AM, Michael Pearce <michael.pea...@ig.com>
>> >>> > >> wrote:
>> >>> > >>
>> >>> > >> > I assume, that after a period of a week, that there is no
>> >>> > >> > concerns now with points 1, and 2 and now we have agreement that
>> >>> > >> > headers are useful and needed in Kafka.
>> >>> > >> > As such if put to a KIP vote, this wouldn’t be a reason to
>> >>> > >> > reject.
>> >>> > >> >
>> >>> > >> > @Ignacio on point 4).
>> >>> > >> > I think for purpose of getting this KIP moving past this, we can
>> >>> > >> > state the key will be a 4 bytes space that will be naturally
>> >>> > >> > interpreted as an Int32 (if namespacing is later wanted you can
>> >>> > >> > easily split this into two int16 spaces), from the wire protocol
>> >>> > >> > implementation this makes no difference I don’t believe. Is this
>> >>> > >> > reasonable to all?
>> >>> > >> >
>> >>> > >> > On 5) as per point 4 therefore happy we keep with 32 bits.
>> >>> > >> >
>> >>> > >> >
>> >>> > >> > On 18/11/2016, 20:34, "ignacio.so...@gmail.com on behalf of Ignacio
>> >>> > >> > Solis" <ignacio.so...@gmail.com on behalf of iso...@igso.net> wrote:
>> >>> > >> >
>> >>> > >> > Summary:
>> >>> > >> >
>> >>> > >> > 3) Yes - Header value as byte[]
>> >>> > >> >
>> >>> > >> > 4a) Int,Int - No
>> >>> > >> > 4b) Int - Yes
>> >>> > >> > 4c) String - Reluctant maybe
>> >>> > >> >
>> >>> > >> > 5) I believe the header system should take a single int. I think
>> >>> > >> > 32bits is a good size, if you want to interpret this as two 16bit
>> >>> > >> > numbers in the layer above go right ahead. If somebody wants to
>> >>> > >> > argue for 16 bits or 64 bits of header key space I would listen.
>> >>> > >> >
>> >>> > >> >
>> >>> > >> > Discussion:
>> >>> > >> > Dividing the key space into sub_key_1 and sub_key_2 makes no
>> >>> > >> > sense to me at this layer. Are we going to start providing APIs
>> >>> > >> > to get all the sub_key_1s? or all the sub_key_2s?
>> >>> > >> > If there is no distinguishing functions that are applied to each
>> >>> > >> > one then they should be a single value. At this layer all we're
>> >>> > >> > doing is equality.
>> >>> > >> > If the above layer wants to interpret this as 2, 3 or more
>> >>> > >> > values that's a different question. I personally think it's all
>> >>> > >> > one keyspace that is getting assigned using some structure, but
>> >>> > >> > if you want to sub-assign parts of it then that's fine.
>> >>> > >> >
>> >>> > >> > The same discussion applies to strings. If somebody argued for
>> >>> > >> > strings, would we be arguing to divide the strings with dots
>> >>> > >> > ('.') as a requirement?
>> >>> > >> > Would we want them to give us the different name segments
>> >>> > >> > separately?
>> >>> > >> > Would we be performing any actions on this key other than
>> >>> > >> > matching?
>> >>> > >> >
>> >>> > >> > Nacho
>> >>> > >> >
>> >>> > >> >
>> >>> > >> > On Fri, Nov 18, 2016 at 9:30 AM, Michael Pearce <michael.pea...@ig.com>
>> >>> > >> > wrote:
>> >>> > >> >
>> >>> > >> > > #jay #jun any concerns on 1 and 2 still?
>> >>> > >> > >
>> >>> > >> > > @all
>> >>> > >> > > To get this moving along a bit more I'd also like to ask to
>> >>> > >> > > get clarity on the below last points:
>> >>> > >> > >
>> >>> > >> > > 3) I believe we're all roughly happy with the header value
>> >>> > >> > > being a byte[]?
>> >>> > >> > >
>> >>> > >> > > 4) I believe consensus has been for an namespace based int
>> >>> > >> > > approach {int,int} for the key. Any objections if this is what
>> >>> > >> > > we go with?
>> >>> > >> > >
>> >>> > >> > > 5) as we have if assumption in (4) is correct, {int,int} keys.
>> >>> > >> > > Should both int's be int16 or int32?
>> >>> > >> > > I'm for them being int16(2 bytes) as combined is space of
>> >>> > >> > > 4bytes as per original and gives plenty of combinations for
>> >>> > >> > > the foreseeable, and keeps the overhead small.
>> >>> > >> > >
>> >>> > >> > > Do we see any benefit in another kip call to discuss these at
>> >>> > >> > > all?
>> >>> > >> > >
>> >>> > >> > > Cheers
>> >>> > >> > > Mike
>> >>> > >> > > ________________________________________
>> >>> > >> > > From: K Burstev <k.burs...@yandex.com>
>> >>> > >> > > Sent: Friday, November 18, 2016 7:07:07 AM
>> >>> > >> > > To: dev@kafka.apache.org
>> >>> > >> > > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>> >>> > >> > >
>> >>> > >> > > For what it is worth also i agree. As a user:
>> >>> > >> > >
>> >>> > >> > > 1) Yes - Headers are worthwhile
>> >>> > >> > > 2) Yes - Headers should be a top level option
>> >>> > >> > >
>> >>> > >> > > 14.11.2016, 21:15, "Ignacio Solis" <iso...@igso.net>:
>> >>> > >> > > > 1) Yes - Headers are worthwhile
>> >>> > >> > > > 2) Yes - Headers should be a top level option
>> >>> > >> > > >
>> >>> > >> > > > On Mon, Nov 14, 2016 at 9:16 AM, Michael Pearce <michael.pea...@ig.com>
>> >>> > >> > > > wrote:
>> >>> > >> > > >
>> >>> > >> > > >> Hi Roger,
>> >>> > >> > > >>
>> >>> > >> > > >> The kip details/examples the original proposal for key
>> >>> > >> > > >> spacing, not the new mentioned as per discussion namespace
>> >>> > >> > > >> idea.
>> >>> > >> > > >>
>> >>> > >> > > >> We will need to update the kip, when we get agreement this
>> >>> > >> > > >> is a better approach (which seems to be the case if I have
>> >>> > >> > > >> understood the general feeling in the conversation)
>> >>> > >> > > >>
>> >>> > >> > > >> Re the variable ints, at very early stage we did think
>> >>> > >> > > >> about this.
>> >>> > >> > > >> I think the added complexity for the saving isn't worth it.
>> >>> > >> > > >> I'd rather go with, if we want to reduce overheads and size
>> >>> > >> > > >> int16 (2bytes) keys as it keeps it simple.
>> >>> > >> > > >>
>> >>> > >> > > >> On the note of no headers, there is as per the kip as we
>> >>> > >> > > >> use an attribute bit to denote if headers are present or not
>> >>> > >> > > >> as such provides a zero overhead currently if headers are
>> >>> > >> > > >> not used.
>> >>> > >> > > >>
>> >>> > >> > > >> I think as radai mentions would be good first if we can get
>> >>> > >> > > >> clarity if do we now have general consensus that (1) headers
>> >>> > >> > > >> are worthwhile and useful, and (2) we want it as a top level
>> >>> > >> > > >> entity.
>> >>> > >> > > >>
>> >>> > >> > > >> Just to state the obvious i believe (1) headers are
>> >>> > >> > > >> worthwhile and (2) agree as a top level entity.
>> >>> > >> > > >>
>> >>> > >> > > >> Cheers
>> >>> > >> > > >> Mike
>> >>> > >> > > >> ________________________________________
>> >>> > >> > > >> From: Roger Hoover <roger.hoo...@gmail.com>
>> >>> > >> > > >> Sent: Wednesday, November 9, 2016 9:10:47 PM
>> >>> > >> > > >> To: dev@kafka.apache.org
>> >>> > >> > > >> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>> >>> > >> > > >>
>> >>> > >> > > >> Sorry for going a little in the weeds but thanks for the
>> >>> > >> > > >> replies regarding varint.
>> >>> > >> > > >>
>> >>> > >> > > >> Agreed that a prefix and {int, int} can be the same. It
>> >>> > >> > > >> doesn't look like that's what the KIP is saying the "Open"
>> >>> > >> > > >> section.
>> >>> > >> > > >> The example shows 2100001 for New Relic and 210002 for App
>> >>> > >> > > >> Dynamics implying that the New Relic organization will have
>> >>> > >> > > >> only a single header id to work with. Or is 2100001 a
>> >>> > >> > > >> prefix? The main point of a namespace or prefix is to reduce
>> >>> > >> > > >> the overhead of config mapping or registration depending on
>> >>> > >> > > >> how namespaces/prefixes are managed.
>> >>> > >> > > >>
>> >>> > >> > > >> Would love to hear more feedback on the higher-level
>> >>> > >> > > >> questions though...
>> >>> > >> > > >>
>> >>> > >> > > >> Cheers,
>> >>> > >> > > >>
>> >>> > >> > > >> Roger
>> >>> > >> > > >>
>> >>> > >> > > >> On Wed, Nov 9, 2016 at 11:38 AM, radai <radai.rosenbl...@gmail.com> wrote:
>> >>> > >> > > >>
>> >>> > >> > > >> > I think this discussion is getting a bit into the weeds
>> >>> > >> > > >> > on technical implementation details.
>> >>> > >> > > >> > I'd like to step back a minute and try and establish
>> >>> > >> > > >> > where we are in the larger picture:
>> >>> > >> > > >> >
>> >>> > >> > > >> > (re-wording nacho's last paragraph)
>> >>> > >> > > >> > 1. are we all in agreement that headers are a worthwhile
>> >>> > >> > > >> > and useful addition to have? this was contested early on
>> >>> > >> > > >> > 2. are we all in agreement on headers as top level entity
>> >>> > >> > > >> > vs headers squirreled-away in V?
>> >>> > >> > > >> >
>> >>> > >> > > >> > if there are still concerns around these #2 points
>> >>> > >> > > >> > (#jay? #jun?)?
>> >>> > >> > > >> >
>> >>> > >> > > >> > (and now back to our normal programming ...)
>> >>> > >> > > >> >
>> >>> > >> > > >> > varints are nice.
>> >>> > >> > > >> > having said that, its adding complexity (see
>> >>> > >> > > >> > https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/util/Varint.java
>> >>> > >> > > >> > as 1st google result) and would require anyone writing
>> >>> > >> > > >> > other clients (C? Python? Go? Bash? ;-) ) to get/implement
>> >>> > >> > > >> > the same, and for relatively little gain (int vs string is
>> >>> > >> > > >> > order of magnitude, this isnt).
>> >>> > >> > > >> >
>> >>> > >> > > >> > int namespacing vs {int, int} namespacing are basically
>> >>> > >> > > >> > the same thing - youre just namespacing an int64 and giving
>> >>> > >> > > >> > people whole 2^32 ranges at a time. the part i like about
>> >>> > >> > > >> > this is letting people have a large swath of numbers with
>> >>> > >> > > >> > one registration so they dont have to come back for every
>> >>> > >> > > >> > single plugin/header they want to "reserve".
>> >>> > >> > > >> >
>> >>> > >> > > >> >
>> >>> > >> > > >> > On Wed, Nov 9, 2016 at 11:01 AM, Roger Hoover <roger.hoo...@gmail.com>
>> >>> > >> > > >> > wrote:
>> >>> > >> > > >> >
>> >>> > >> > > >> > > Since some of the debate has been about overhead +
>> >>> > >> > > >> > > performance, I'm wondering if we have considered a varint
>> >>> > >> > > >> > > encoding
>> >>> > >> > > >> > > (https://developers.google.com/protocol-buffers/docs/encoding#varints)
>> >>> > >> > > >> > > for the header length field (int32 in the proposal) and
>> >>> > >> > > >> > > for header ids?
>> >>> > >> > > >> > > If you don't use headers, the overhead would be a single
>> >>> > >> > > >> > > byte and for each header id < 128 would also need only a
>> >>> > >> > > >> > > single byte?
>> >>> > >> > > >> > >
>> >>> > >> > > >> > >
>> >>> > >> > > >> > > On Wed, Nov 9, 2016 at 6:43 AM, radai <radai.rosenbl...@gmail.com>
>> >>> > >> > > >> > > wrote:
>> >>> > >> > > >> > >
>> >>> > >> > > >> > > > @magnus - and very dangerous (youre essentially
>> >>> > >> > > >> > > > downloading and executing arbitrary code off the
>> >>> > >> > > >> > > > internet on your servers ... bad idea without a sandbox,
>> >>> > >> > > >> > > > even with)
>> >>> > >> > > >> > > >
>> >>> > >> > > >> > > > as for it being a purely administrative task - i
>> >>> > >> > > >> > > > disagree.
>> >>> > >> > > >> > > >
>> >>> > >> > > >> > > > i wish it would, really, because then my earlier point
>> >>> > >> > > >> > > > on the complexity of the remapping process would be
>> >>> > >> > > >> > > > invalid, but at linkedin, for example, we (the team im
>> >>> > >> > > >> > > > in) run kafka as a service. we dont really know what our
>> >>> > >> > > >> > > > users (developing applications that use kafka) are up to
>> >>> > >> > > >> > > > at any given moment. it is very possible (given the
>> >>> > >> > > >> > > > existence of headers and a corresponding plugin
>> >>> > >> > > >> > > > ecosystem) for some application to "equip" their
>> >>> > >> > > >> > > > producers and consumers with the required plugin without
>> >>> > >> > > >> > > > us knowing.
i >> dont >> >>> > mean >> >>> > >> > to imply >> >>> > >> > > >> thats >> >>> > >> > > >> > > > bad, i just want to make the point that its not >> as >> >>> > simple >> >>> > >> > > keeping it >> >>> > >> > > >> in >> >>> > >> > > >> > > > sync across a large-enough organization. >> >>> > >> > > >> > > > >> >>> > >> > > >> > > > >> >>> > >> > > >> > > > On Wed, Nov 9, 2016 at 6:17 AM, Magnus Edenhill >> < >> >>> > >> > > mag...@edenhill.se> >> >>> > >> > > >> > > > wrote: >> >>> > >> > > >> > > > >> >>> > >> > > >> > > > > I think there is a piece missing in the >> Strings >> >>> > >> > discussion, >> >>> > >> > > where >> >>> > >> > > >> > > > > pro-Stringers >> >>> > >> > > >> > > > > reason that by providing unique string >> >>> identifiers >> >>> > for >> >>> > >> > each >> >>> > >> > > header >> >>> > >> > > >> > > > > everything will just >> >>> > >> > > >> > > > > magically work for all parts of the stream >> >>> pipeline. >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > But the strings dont mean anything by >> themselves, >> >>> > and >> >>> > >> > while we >> >>> > >> > > >> could >> >>> > >> > > >> > > > > probably envision >> >>> > >> > > >> > > > > some auto plugin loader that downloads, >> compiles, >> >>> > links >> >>> > >> > and >> >>> > >> > > runs >> >>> > >> > > >> > > plugins >> >>> > >> > > >> > > > > on-demand >> >>> > >> > > >> > > > > as soon as they're seen by a consumer, I dont >> >>> really >> >>> > >> see >> >>> > >> > a >> >>> > >> > > use-case >> >>> > >> > > >> > for >> >>> > >> > > >> > > > > something >> >>> > >> > > >> > > > > so dynamic (and fragile) in practice. >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > In the real world an application will be >> >>> configured >> >>> > >> with >> >>> > >> > a set >> >>> > >> > > of >> >>> > >> > > >> > > plugins >> >>> > >> > > >> > > > > to either add (producer) >> >>> > >> > > >> > > > > or read (consumer) headers. 
>> >>> > >> > > >> > > > > This is an administrative task based on what >> >>> > features a >> >>> > >> > client >> >>> > >> > > >> > > > > needs/provides and results in >> >>> > >> > > >> > > > > some sort of configuration to enable and >> >>> configure >> >>> > the >> >>> > >> > desired >> >>> > >> > > >> > plugins. >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > Since this needs to be kept somewhat in sync >> >>> across >> >>> > an >> >>> > >> > > organisation >> >>> > >> > > >> > > > (there >> >>> > >> > > >> > > > > is no point in having producers >> >>> > >> > > >> > > > > add headers no consumers will read, and vice >> >>> versa), >> >>> > >> the >> >>> > >> > added >> >>> > >> > > >> > > complexity >> >>> > >> > > >> > > > > of assigning an id namespace >> >>> > >> > > >> > > > > for each plugin as it is being configured >> should >> >>> be >> >>> > >> > tolerable. >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > /Magnus >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > 2016-11-09 13:06 GMT+01:00 Michael Pearce < >> >>> > >> > > michael.pea...@ig.com>: >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > > Just following/catching up on what seems to >> be >> >>> an >> >>> > >> > active >> >>> > >> > > night :) >> >>> > >> > > >> > > > > > >> >>> > >> > > >> > > > > > @Radai sorry if it may seem obvious but what >> >>> does >> >>> > MD >> >>> > >> > stand >> >>> > >> > > for? >> >>> > >> > > >> > > > > > >> >>> > >> > > >> > > > > > My take on String vs Int: >> >>> > >> > > >> > > > > > >> >>> > >> > > >> > > > > > I will state first I am pro Int (16 or 32). >> >>> > >> > > >> > > > > > >> >>> > >> > > >> > > > > > I do though playing devils advocate see a >> big >> >>> plus >> >>> > >> > with the >> >>> > >> > > >> > argument >> >>> > >> > > >> > > of >> >>> > >> > > >> > > > > > String keys, this is around integrating >> into an >> >>> > >> > existing >> >>> > >> > > >> > eco-system. 
As many other systems use String based headers (Flume, JMS), String keys make it much easier for these to be incorporated/integrated.

With Int based headers, how could we provide a way/guidance to make this integration simple / easy for flows transitioning over to kafka?

* tough luck buddy you're on your own
* simply hash the string into int code and hope for no collisions (how to convert back though?)
* http2 style as mentioned by nacho.

cheers,
Mike

________________________________________
From: radai <radai.rosenbl...@gmail.com>
Sent: Wednesday, November 9, 2016 8:12 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers

thinking about it some more, the best way to transmit the header remapping data to consumers would be to put it in the MD response payload, so maybe it should be discussed now.
On Wed, Nov 9, 2016 at 12:09 AM, radai <radai.rosenbl...@gmail.com> wrote:

im not opposed to the idea of namespace mapping. all im saying is that its not part of the "mvp" and, since it requires no wire format change, can always be added later.
also, its not as simple as just configuring MM to do the transform: lets say i've implemented large message support as {666,1} and on some mirror target cluster its been remapped to {999,1}. the consumer plugin code would also need to be told to look for the large message "part X of Y" header under {999,1}. doable, but tricky.
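The remapping concern above can be made concrete with a minimal sketch. Everything here is an assumption for illustration: `remap_namespaces` and `LargeMessagePlugin` are hypothetical names, not part of Kafka, MirrorMaker, or KIP-82, and headers are modelled as a plain dictionary keyed by a (namespace, id) pair.

```python
# Illustrative sketch only. The names below are hypothetical and are not part
# of Kafka, MirrorMaker, or KIP-82.
from typing import Dict, Optional, Tuple

HeaderKey = Tuple[int, int]  # (namespace, id), as in the {666,1} example


def remap_namespaces(headers: Dict[HeaderKey, bytes],
                     mapping: Dict[int, int]) -> Dict[HeaderKey, bytes]:
    """Rewrite header namespaces the way a mirroring step might."""
    return {(mapping.get(ns, ns), hid): value
            for (ns, hid), value in headers.items()}


class LargeMessagePlugin:
    """Consumer-side plugin that looks for a "part X of Y" header."""

    def __init__(self, namespace: int = 666) -> None:
        self.part_key: HeaderKey = (namespace, 1)

    def part_info(self, headers: Dict[HeaderKey, bytes]) -> Optional[bytes]:
        return headers.get(self.part_key)


source_headers = {(666, 1): b"part 2 of 5"}
mirrored_headers = remap_namespaces(source_headers, {666: 999})

# A plugin still configured for namespace 666 no longer finds its header...
stale = LargeMessagePlugin().part_info(mirrored_headers)
# ...while one told about the remapped namespace does.
updated = LargeMessagePlugin(namespace=999).part_info(mirrored_headers)
```

The point of the sketch is that remapping in the mirroring step alone is not enough: the consumer-side plugin needs the same mapping, which is why the thread considers shipping it to consumers.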
On Tue, Nov 8, 2016 at 10:29 PM, Gwen Shapira <g...@confluent.io> wrote:

While you can do whatever you want with a namespace and your code, what I'd expect is for each app to make namespaces configurable...

So if I accidentally used 666 for my HR department, and still want to run RadaiApp, I can config "namespace=42" for RadaiApp and everything will look normal.

This means you only need to sync usage inside your own organization. Still hard, but somewhat easier than syncing with the entire world.

On Tue, Nov 8, 2016 at 10:07 PM, radai <radai.rosenbl...@gmail.com> wrote:

and we can start with {namespace, id} and no re-mapping support and always add it later on if/when collisions actually happen (i dont think they'd be a problem).
every interested party (so orgs or individuals) could then register a prefix (0 = reserved, 1 = confluent ... 666 = me :-) ) and do whatever with the 2nd ID - so once linkedin registers, say 3, then linkedin devs are free to use {3, *} with a reasonable expectation not to collide with anything else. further partitioning of that * becomes linkedin's problem, but the "upstream registration" of a namespace only has to happen once.
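As a rough sketch of the "upstream registration" idea, assuming a simple in-memory table: `NamespaceRegistry` is a hypothetical name, the prefixes are the ones used as examples in the message above, and nothing here reflects an actual Kafka mechanism.

```python
# Hypothetical sketch of a namespace prefix registry; not an actual Kafka API.
from typing import Dict, Tuple


class NamespaceRegistry:
    def __init__(self) -> None:
        # Prefixes taken from the example in the thread; 0 is reserved.
        self._owners: Dict[int, str] = {0: "reserved", 1: "confluent", 666: "radai"}

    def register(self, prefix: int, owner: str) -> None:
        """Claim a prefix once; a second claim on the same prefix is rejected."""
        if prefix in self._owners:
            raise ValueError(
                f"prefix {prefix} already registered to {self._owners[prefix]}")
        self._owners[prefix] = owner

    def owner_of(self, key: Tuple[int, int]) -> str:
        """Look up who owns the namespace part of a (namespace, id) key."""
        namespace, _local_id = key
        return self._owners.get(namespace, "unregistered")


registry = NamespaceRegistry()
registry.register(3, "linkedin")

# After the one-time registration, the second ID is the owner's business.
audit_key = (3, 17)
large_message_key = (3, 42)
```

Only the prefix is coordinated globally; how `{3, *}` is divided up stays inside the owning organization.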
On Tue, Nov 8, 2016 at 9:03 PM, James Cheng <wushuja...@gmail.com> wrote:

> On Nov 8, 2016, at 5:54 PM, Gwen Shapira <g...@confluent.io> wrote:
>
> Thank you so much for this clear and fair summary of the arguments.
>
> I'm in favor of ints. Not a deal-breaker, but in favor.
>
> Even more in favor of Magnus's decentralized suggestion with Roger's tweak: add a namespace for headers. This will allow each app to just use whatever IDs it wants internally, and then let the admin deploying the app figure out an available namespace ID for the app to live in.
> So io.confluent.schema-registry can be namespace 0x01 on my deployment and 0x57 on yours, and the poor guys developing the app don't need to worry about that.

Gwen, if I understand your example right, an application deployer might decide to use 0x01 in one deployment, and that means that once the message is written into the broker, it will be saved on the broker with that specific namespace (0x01).

If you were to mirror that message into another cluster, the 0x01 would accompany the message, right? What if the deployers of the same app in the other cluster use 0x57? They won't understand each other?

I'm not sure that's an avoidable problem.
I think it simply means that in order to share data, you have to also have a shared (agreed upon) understanding of what the namespaces mean. Which I think makes sense, because the alternative (sharing *nothing* at all) would mean that there would be no way to understand each other.

-James

> Gwen

On Tue, Nov 8, 2016 at 4:23 PM, radai <radai.rosenbl...@gmail.com> wrote:

+1 for sean's document.
it covers pretty much all the trade-offs and provides concrete figures to argue about :-)
(nit-picking - used the same xkcd twice, also trove has been superseded for purposes of high performance collections: look at https://github.com/leventov/Koloboke)

so to sum up the string vs int debate:

performance - you can do 140k ops/sec _per thread_ with string headers. you could do x2-3 better with ints. there's no arguing the relative diff between the two, there's only the question of whether or not _the rest of kafka_ operates fast enough to care. if we want to make choices solely based on performance we need ints.
if we are willing to settle/compromise for a nicer (to some) API then strings are good enough for the current state of affairs.

message size - with batching and compression it comes down to a ~5% difference (internal testing, not in the doc. maybe would help adding if this becomes a point of contention?). this means it won't really affect kafka in "throughput mode" (large, compressed batches). in "low latency" mode (meaning less/no batching and compression) the difference can be extreme (it'll easily be an order of magnitude with small payloads like stock ticks and header keys of the form "com.acme.infraTeam.kafka.hiMom.auditPlugin").
we have a few such topics at linkedin where actual payloads are ~2 ints and are eclipsed by our in-house audit "header" which is why we liked ints to begin with.

"ease of use" - strings would probably still require _some_ degree of partitioning by convention (imagine if everyone used the key "infra"...) but its very intuitive for java devs to do anyway (reverse-domain is ingrained into java developers at a young age :-) ). also most java devs find Map<String, whatever> more intuitive than Map<Integer, whatever> - probably because of other text-based protocols like http. ints would require a number registry.
if you think number registries are hard just look at the wiki page for KIPs (specifically the number for next available KIP) and think again - we are probably talking about the same volume of requests. also this would only be "required" (good citizenship, more like) if you want to publish your plugin for others to use. within your org do whatever you want - just know that if you use [some "reserved" range] and a future kafka update breaks it its your problem. RTFM.

personally im in favor of ints.

having said that (and like nacho) I will settle if int vs string remains the only obstacle to this.
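The message-size argument in the summary above can be illustrated with a back-of-the-envelope sketch. The layout is an assumption: it simply adds up payload, header key, and header value bytes, ignores any length prefixes or record framing, and uses a made-up 8-byte audit value and a made-up integer key. It is not the wire format proposed in KIP-82.

```python
# Back-of-the-envelope size comparison; framing overhead is deliberately ignored.
import struct

payload = struct.pack(">ii", 101, 2500)         # a "~2 ints" payload: 8 bytes
audit_value = struct.pack(">q", 1478649600000)  # hypothetical 8-byte audit value

string_key = "com.acme.infraTeam.kafka.hiMom.auditPlugin".encode("utf-8")
int_key = struct.pack(">i", 7)                  # hypothetical 4-byte integer key


def record_size(key: bytes) -> int:
    """Payload plus one header (key and value), without any framing bytes."""
    return len(payload) + len(key) + len(audit_value)


size_with_string_key = record_size(string_key)
size_with_int_key = record_size(int_key)
key_ratio = len(string_key) / len(int_key)
```

With an uncompressed, unbatched record this small, the header key dominates the bytes on the wire, which is the "low latency" case described above; with large compressed batches the repeated key strings compress well and the gap shrinks.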
On Tue, Nov 8, 2016 at 3:53 PM, Nacho Solis <nso...@linkedin.com.invalid> wrote:

I think it's well known I've been pushing for ints (and I could switch to 16 bit shorts if pressed).

- efficient (space)
- efficient (processing)
- easily partitionable

However, if the only thing that is keeping us from adopting headers is the use of strings vs ints as keys, then I would cave in and accept strings. If we do so, I would like to limit string keys to 128 bytes in length.
This way 1) I could use a 3 letter string if I wanted (effectively using 4 total bytes), 2) limit overall impact of possible keys (don't really want people to send a 16K header string key).

Nacho

On Tue, Nov 8, 2016 at 3:35 PM, Gwen Shapira <g...@confluent.io> wrote:

Forgot to mention: Thank you for quantifying the trade-off - it is helpful and important regardless of what we end up deciding.

On Tue, Nov 8, 2016 at 3:12 PM, Sean McCauliff <smccaul...@linkedin.com.invalid> wrote:

On Tue, Nov 8, 2016 at 2:15 PM, Gwen Shapira <g...@confluent.io> wrote:

> Since Kafka specifically targets high-throughput, low-latency use-cases, I don't think we should trade them off that easily.

I find these kinds of design goals not really helpful unless they are quantified in some way, because it's always possible to argue against something as either being not performant or just an implementation detail.

These are single-threaded benchmarks, so all the measurements are per thread.
For 1M messages/s/thread, if header keys are int and you had even a single header key, value pair, then it's still about 2^-2 (0.25) microseconds, which means you only have another 0.75 microseconds to do everything else you want to do with a message (1M messages/s means 1 microsecond per message). With string header keys there is still 0.5 microseconds to process a message.
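The arithmetic in the paragraph above, written out as a short sketch. The integer-key cost of 2^-2 microseconds and the 0.5 microseconds left over with string keys are the figures quoted in the message; the string-key cost of 0.5 microseconds is inferred from them rather than stated directly.

```python
# Per-message time budget at 1M messages/s/thread, using the figures quoted above.
MESSAGES_PER_SECOND = 1_000_000

budget_us = 1e6 / MESSAGES_PER_SECOND   # microseconds available per message

int_header_cost_us = 2 ** -2            # quoted: about 2^-2 microseconds
string_header_cost_us = 0.5             # inferred from "still 0.5 microseconds" remaining

remaining_with_int_us = budget_us - int_header_cost_us
remaining_with_string_us = budget_us - string_header_cost_us

# Relative cost of string keys versus int keys under these figures.
string_vs_int_ratio = string_header_cost_us / int_header_cost_us
```

Under these numbers string keys cost twice as much as int keys per message, which sits at the low end of the speedup range reported for the benchmark.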
> I love strings as much as the next guy (we had them in Flume), but I was convinced by Magnus/Michael/Radai that strings don't actually have strong benefits as opposed to ints (you'll need a string registry anyway - otherwise, how will you know what the "profile_id" header refers to?) and I want to keep closer to our original design goals for Kafka.
"confluent.profile_id"

> If someone likes strings in the headers and doesn't do millions of messages a sec, they probably have lots of other systems they can use instead.

None of them will scale like Kafka. Horizontal scaling is still good.

On Tue, Nov 8, 2016 at 1:22 PM, Sean McCauliff <smccaul...@linkedin.com.invalid> wrote:

+1 for String keys.
I've been doing some benchmarking and it seems like the speedup for using integer keys is about 2-5x depending on the length of the strings and what collections are being used. The overall amount of time spent parsing a set of header key, value pairs probably does not matter unless you are getting close to 1M messages/s per consumer, in which case you probably shouldn't use headers. There is also the option to use very short strings; some that are even shorter than integers.
Partitioning the string key space will be easier than partitioning an integer key space. We won't need a global registry. Kafka internally can reserve some prefix like "_" as its namespace. Everyone else can use their company or project name as namespace prefix and life should be good.
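A minimal sketch of how the prefix convention above might be checked on the client side, assuming the "_" reserved prefix from this message and the 128-byte key limit suggested elsewhere in the thread. `validate_user_key` is a hypothetical helper, not a Kafka function, and neither rule is confirmed as part of the final design.

```python
# Hypothetical client-side check for string header keys; not a Kafka API.
KAFKA_RESERVED_PREFIX = "_"  # prefix suggested above for Kafka-internal headers
MAX_KEY_BYTES = 128          # limit suggested elsewhere in the thread


def validate_user_key(key: str) -> str:
    """Return the key unchanged if a user plugin may use it, else raise."""
    if key.startswith(KAFKA_RESERVED_PREFIX):
        raise ValueError(f"{key!r} uses the reserved prefix")
    if len(key.encode("utf-8")) > MAX_KEY_BYTES:
        raise ValueError(f"{key!r} is longer than {MAX_KEY_BYTES} bytes")
    return key


accepted = validate_user_key("confluent.profile_id")
```

Keys are namespaced purely by convention here (company or project name first), so no central table is consulted.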
Here's the link to some of the benchmarking info:
https://docs.google.com/document/d/1tfT-6SZdnKOLyWGDH82kS30PnUkmgb7nPLdw6p65pAI/edit?usp=sharing

--
Sean McCauliff
Staff Software Engineer
Kafka

smccaul...@linkedin.com
linkedin.com/in/sean-mccauliff-b563192

On Mon, Nov 7, 2016 at 11:51 PM, Michael Pearce <michael.pea...@ig.com> wrote:

+1 on this slimmer version of our proposal

I def think the Id space we can reduce from the proposed int32(4bytes) down to int16(2bytes) it saves on space and as headers we wouldn't expect the number of headers being
used >> >>> > >> > concurrently >> >>> > >> > > >> being >> >>> > >> > > >> > > that >> >>> > >> > > >> > > > > > high. >> >>> > >> > > >> > > > > > >> >> >>>>>>>> >> >>> > >> > > >> > > > > > >> >> >>>>>>>> I would wonder if we should >> make >> >>> > the >> >>> > >> > value >> >>> > >> > > byte >> >>> > >> > > >> > array >> >>> > >> > > >> > > > > length >> >>> > >> > > >> > > > > > >> still >> >>> > >> > > >> > > > > > >> >> >>>> int32 >> >>> > >> > > >> > > > > > >> >> >>>>>>>> though as This is the >> standard >> >>> Max >> >>> > >> array >> >>> > >> > > length in >> >>> > >> > > >> > > Java >> >>> > >> > > >> > > > > > saying >> >>> > >> > > >> > > > > > >> >> that >> >>> > >> > > >> > > > > > >> >> >>>> it >> >>> > >> > > >> > > > > > >> >> >>>>>> is a >> >>> > >> > > >> > > > > > >> >> >>>>>>>> header and I guess limiting >> the >> >>> > size >> >>> > >> is >> >>> > >> > > sensible >> >>> > >> > > >> and >> >>> > >> > > >> > > > would >> >>> > >> > > >> > > > > > >> work >> >>> > >> > > >> > > > > > >> >> for >> >>> > >> > > >> > > > > > >> >> >>>> all >> >>> > >> > > >> > > > > > >> >> >>>>>> the >> >>> > >> > > >> > > > > > >> >> >>>>>>>> use cases we have in mind so >> >>> happy >> >>> > >> with >> >>> > >> > > limiting >> >>> > >> > > >> > this. >> >>> > >> > > >> > > > > > >> >> >>>>>>>> >> >>> > >> > > >> > > > > > >> >> >>>>>>>> Do people generally concur on >> >>> > Magnus's >> >>> > >> > slimmer >> >>> > >> > > >> > > version? >> >>> > >> > > >> > > > > > >> Anyone see >> >>> > >> > > >> > > > > > >> >> >>>> any >> >>> > >> > > >> > > > > > >> >> >>>>>>>> issues if we moved from >> int32 to >> >>> > >> int16? 
>
> Re configurable ids per plugin over a global registry: that would also work for us. If this has better consensus than the proposed global registry, I'd be happy to change that.
>
> I was already sold on ints over strings for keys ;)
>
> Cheers
> Mike
>
> ________________________________________
> From: Magnus Edenhill <mag...@edenhill.se>
> Sent: Monday, November 7, 2016 10:10:21 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>
> Hi,
>
> I'm +1 for adding generic message headers, but I do share the concerns previously aired on this thread and during the KIP meeting.
>
> So let me propose a slimmer alternative that does not require any sort of global header registry, does not affect broker performance or operations, and adds as little overhead as possible.
>
>
> Message
> ------------
> The protocol Message type is extended with a Headers array consisting of Tags, where a Tag is defined as:
>    int16 Id
>    int16 Len          // binary_data length
>    binary_data[Len]   // opaque binary data
>
>
> Ids
> ---
> The Id space is not centrally managed, so whenever an application needs to add headers, or use an eco-system plugin that does, its Id allocation will need to be manually configured.
> This moves the allocation concern from the global space down to organization level and avoids the risk of id conflicts.
> Example pseudo-config for some app:
>     sometrackerplugin.tag.sourcev3.id=1000
>     dbthing.tag.tablename.id=1001
>     myschemareg.tag.schemaname.id=1002
>     myschemareg.tag.schemaversion.id=1003
>
>
> Each header-writing or header-reading plugin must provide means (typically through configuration) to specify the tag for each header it uses. Defaults should be avoided.
> A consumer silently ignores tags it does not have a mapping for (since the binary_data can't be parsed without knowing what it is).
>
> Id range 0..999 is reserved for future use by the broker and must not be used by plugins.
>
>
> Broker
> ---------
> The broker does not process the tags (other than the standard protocol syntax verification), it simply stores and forwards them as opaque data.
>
> Standard message translation (removal of Headers) kicks in for older clients.
>
>
> Why not string ids?
> -------------------------
> String ids might seem like a good idea, but they:
>  * do not really solve uniqueness
>  * consume a lot of space (2 byte string length + string, per header) to be meaningful
>  * don't really say anything about how to parse the tag's data, so they are in effect useless on their own.
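The Tag layout proposed above (int16 Id, int16 Len, then Len bytes of opaque data) can be sketched roughly as follows. This is an illustration only: big-endian byte order, back-to-back tags with no count prefix, and the names `encode_tags`, `decode_tags` and `TAG_IDS` are assumptions made for the example, not something the proposal specifies.

```python
import struct

RESERVED_MAX_ID = 999  # the proposal reserves ids 0..999 for the broker

# Hypothetical per-application id mapping, in the spirit of the pseudo-config.
TAG_IDS = {
    "sometrackerplugin.sourcev3": 1000,
    "dbthing.tablename": 1001,
}


def encode_tags(tags):
    """Serialize (id, payload) pairs as int16 Id, int16 Len, binary_data[Len]."""
    out = bytearray()
    for tag_id, payload in tags:
        if tag_id <= RESERVED_MAX_ID:
            raise ValueError(f"id {tag_id} is in the reserved range 0..{RESERVED_MAX_ID}")
        # ">hh" = two big-endian signed 16-bit integers (byte order is an assumption)
        out += struct.pack(">hh", tag_id, len(payload)) + payload
    return bytes(out)


def decode_tags(buf, known_ids):
    """Return {name: payload}; tags without a local mapping are skipped silently."""
    names = {tag_id: name for name, tag_id in known_ids.items()}
    found = {}
    offset = 0
    while offset < len(buf):
        tag_id, length = struct.unpack_from(">hh", buf, offset)
        offset += 4
        payload = buf[offset:offset + length]
        offset += length
        if tag_id in names:
            found[names[tag_id]] = payload
    return found
```

Each tag costs 4 bytes of framing on top of its payload, and a reader that lacks a mapping for an id can still step over it because the length is always present.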
>
> Regards,
> Magnus
>
> 2016-11-07 18:32 GMT+01:00 Michael Pearce <michael.pea...@ig.com>:
>
> Hi Roger,
>
> Thanks for the support.
>
> I think the key thing is to have a common key space to make an ecosystem; there does have to be some level of contract for people to play nicely.
>
> Having map<String, byte[]> or, as currently proposed in the KIP, a numerical key space of map<int, byte[]> is a level of contract that most people would expect.
>
> I think the example someone else made in a previous comment, linking to the AWS blog and the implemented API, where originally they didn't have a header space but now they do, and where keys are uniform but the value can be a string, an int, or anything, is a good example.
>
> Having a custom MetadataSerializer is something we had played with but discounted: if you want everyone to work the same way in the ecosystem, having this also be customizable makes it a bit harder. Think about making the whole message record custom serializable; this would make it fairly tricky (though not impossible) to make work nicely. Having the value customizable is, we thought, a reasonable tradeoff here of flexibility over contract of interaction between different parties.
>
> Is there a particular case or benefit of having serialization customizable that you have in mind?
>
> That said, it is obviously something that could be implemented if there is a need. If we did go down this avenue, I think a default serializer implementation should exist so that, per the 80:20 rule, people can just have the broker and clients get default behavior.
>
> Cheers
> Mike
>
> On 11/6/16, 5:25 PM, "radai" <radai.rosenbl...@gmail.com> wrote:
>
> making header _key_ serialization configurable potentially undermines the broad usefulness of the feature (any point along the path must be able to read the header keys. the values may be whatever and require more intimate knowledge of the code that produced specific headers, but keys should be universally readable).
>
> it would also make it hard to write really portable plugins - say i wrote a large message splitter/combiner - if i rely on key "largeMessage" and values of the form "1/20", someone who uses (contrived example) Map<Byte[], Double> wouldn't be able to re-use my code.
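To make the splitter/combiner example concrete, here is a rough sketch of such a plugin. It assumes headers are a plain dict of string keys to byte values; the "largeMessage" key and the "1/20" value form come from the message above, while the function names and the record shape `(headers, value)` are hypothetical.

```python
def split_message(payload, chunk_size):
    """Split payload into (headers, chunk) records tagged "largeMessage": b"i/n"."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    total = len(chunks)
    return [({"largeMessage": f"{n}/{total}".encode("ascii")}, chunk)
            for n, chunk in enumerate(chunks, start=1)]


def combine_messages(records):
    """Reassemble chunks from (headers, value) records, in any arrival order."""
    parts = {}
    total = None
    for headers, value in records:
        index, _, count = headers["largeMessage"].decode("ascii").partition("/")
        parts[int(index)] = value
        total = int(count)
    if total is None or len(parts) != total:
        raise ValueError("missing chunks")
    return b"".join(parts[i] for i in range(1, total + 1))
```

The combiner only works if every hop agrees that the key is the string "largeMessage", which is the portability concern being raised: a deployment with a differently serialized key type could not reuse this code unchanged.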
>
> not the end of the world within an organization, but problematic if you want to enable an ecosystem
>
> On Thu, Nov 3, 2016 at 2:04 PM, Roger Hoover <roger.hoo...@gmail.com> wrote:
>
> As others have laid out, I see strong reasons for a common message metadata structure for the Kafka ecosystem. In particular, I've seen that even within a single organization, infrastructure teams often own the message metadata while application teams own the application-level data format. Allowing metadata and content to have different structure and evolve separately is very helpful for this. Also, I think there's a lot of value to having a common metadata structure shared across the Kafka ecosystem so that tools which leverage metadata can more easily be shared across organizations and integrated together.
>
> The question is, where does the metadata structure belong? Here's my take:
>
> We change the Kafka wire and on-disk format from a (key, value) model to a (key, metadata, value) model where all three are byte arrays from the broker's point of view. The primary reason for this is that it provides a backward compatible migration path forward. Producers can start populating metadata fields before all consumers understand the metadata structure. For people who already have custom envelope structures, they can populate their existing structure and the new structure for a while as they make the transition.
>
> We could stop there and let the clients plug in a KeySerializer, MetadataSerializer, and ValueSerializer but I think it is also useful to have a default MetadataSerializer that implements a key-value model similar to AMQP or HTTP headers.
> Or we could go even further and prescribe a Map<String, byte[]> or Map<String, String> data model for headers in the clients (while still allowing custom serialization of the header data model).
>
> I think this would address Radai's concerns:
> 1. All client code would not need to be updated to know about the container.
> 2. Middleware friendly clients would have a standard header data model to work with.
> 3. KIP is required both b/c of broker changes and because of client API changes.
>
> Cheers,
>
> Roger
>
>
> On Wed, Nov 2, 2016 at 4:38 PM, radai <radai.rosenbl...@gmail.com> wrote:
>
> my biggest issues with a "standard" wrapper format:
>
> 1. _ALL_ client _CODE_ (as opposed to kafka lib version) must be updated to know about the container, because any old naive code trying to directly deserialize its own payload would keel over and die (it needs to know to deserialize a container, and then dig in there for its payload).
> 2. in order to write middleware-friendly clients that utilize such a container one would basically have to write their own producer/consumer API on top of the open source kafka one.
> 3. if you were going to go with a wrapper format you really don't need to bother with a kip (just open source your own client stack from #2 above so others could stop re-inventing it)
>
> On Wed, Nov 2, 2016 at 4:25 PM, James Cheng <wushuja...@gmail.com> wrote:
>
> How exactly would this work? Or maybe that's out of scope for this email.
>
> The information contained in this email is strictly confidential and for the use of the addressee only, unless otherwise indicated. If you are not the intended recipient, please do not read, copy, use or disclose to others this message or any attachment. Please also notify the sender by replying to this email or by telephone (+44(020 7896 0011) and then delete the email and any copies of it. Opinions, conclusion (etc) that do not relate to the official business of this company shall be understood as neither given nor endorsed by it. IG is a trading name of IG Markets Limited (a company registered in England and Wales, company number 04008957) and IG Index Limited (a company registered in England and Wales, company number 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill, London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG Index Limited (register number 114059) are authorised and regulated by the Financial Conduct Authority.
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>
> --
> Nacho (Ignacio) Solis
> Kafka
> nso...@linkedin.com
>
> --
> Nacho - Ignacio Solis - iso...@igso.net
>
> --
> *Todd Palino*
> Staff Site Reliability Engineer
> Data Infrastructure Streaming
>
> linkedin.com/in/toddpalino

--
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog