Well I guess my question for you, then, is what is holding you back from full support for headers? What’s the bit that you’re missing that has you under a full +1?
-Todd

On Thu, Dec 1, 2016 at 1:59 PM, Gwen Shapira <[email protected]> wrote:

> I know why people who support headers support them, and I've seen what
> the discussion is like.
>
> This is why I'm asking people who are against headers (especially
> committers) what will make them change their mind - so we can get this
> part over one way or another.
>
> If I sound frustrated it is not at Radai, Jun or you (Todd)... I am
> just looking for something concrete we can do to move the discussion
> along to the yummy design details (which is the argument I really am
> looking forward to).
>
> On Thu, Dec 1, 2016 at 1:53 PM, Todd Palino <[email protected]> wrote:
> > So, Gwen, to your question (even though I’m not a committer)...
> >
> > I have always been a strong supporter of introducing the concept of an
> > envelope to messages, which headers accomplishes. The message key is
> > already an example of a piece of envelope information. By providing a means
> > to do this within Kafka itself, and not relying on use-case specific
> > implementations, you make it much easier for components to interoperate. It
> > simplifies development of all these things (message routing, auditing,
> > encryption, etc.) because each one does not have to reinvent the wheel.
> >
> > It also makes it much easier from a client point of view if the headers are
> > defined as part of the protocol and/or message format in general because
> > you can easily produce and consume messages without having to take into
> > account specific cases.
> > For example, I want to route messages, but client A
> > doesn’t support the way audit implemented headers, and client B doesn’t
> > support the way encryption or routing implemented headers, so now my
> > application has to create some really fragile (my autocorrect just tried to
> > make that “tragic”, which is probably appropriate too) code to strip
> > everything off, rather than just consuming the messages, picking out the 1
> > or 2 headers it’s interested in, and performing its function.
> >
> > Honestly, this discussion has been going on for a long time, and it’s
> > always “Oh, you came up with 2 use cases, and yeah, those use cases are
> > real things that someone would want to do. Here’s an alternate way to
> > implement them so let’s not do headers.” If we have a few use cases that we
> > actually came up with, you can be sure that over the next year there’s a
> > dozen others that we didn’t think of that someone would like to do. I
> > really think it’s time to stop rehashing this discussion and instead focus
> > on a workable standard that we can adopt.
> >
> > -Todd
> >
> >
> > On Thu, Dec 1, 2016 at 1:39 PM, Todd Palino <[email protected]> wrote:
> >
> >> C. per message encryption
> >>> One drawback of this approach is that this significantly reduces the
> >>> effectiveness of compression, which happens on a set of serialized
> >>> messages. An alternative is to enable SSL for wire encryption and rely on
> >>> the storage system (e.g. LUKS) for at rest encryption.
> >>
> >>
> >> Jun, this is not sufficient. While this does cover the case of removing a
> >> drive from the system, it will not satisfy most compliance requirements for
> >> encryption of data as whoever has access to the broker itself still has
> >> access to the unencrypted data. For end-to-end encryption you need to
> >> encrypt at the producer, before it enters the system, and decrypt at the
> >> consumer, after it exits the system.
> >> > >> -Todd > >> > >> > >> On Thu, Dec 1, 2016 at 1:03 PM, radai <[email protected]> > wrote: > >> > >>> another big plus of headers in the protocol is that it would enable > rapid > >>> iteration on ideas outside of core kafka and would reduce the number of > >>> future wire format changes required. > >>> > >>> a lot of what is currently a KIP represents use cases that are not 100% > >>> relevant to all users, and some of them require rather invasive wire > >>> protocol changes. a thing a good recent example of this is kip-98. > >>> tx-utilizing traffic is expected to be a very small fraction of total > >>> traffic and yet the changes are invasive. > >>> > >>> every such wire format change translates into painful and slow > adoption of > >>> new versions. > >>> > >>> i think a lot of functionality currently in KIPs could be "spun out" > and > >>> implemented as opt-in plugins transmitting data over headers. this > would > >>> keep the core wire format stable(r), core codebase smaller, and avoid > the > >>> "burden of proof" thats sometimes required to prove a certain feature > is > >>> useful enough for a wide-enough audience to warrant a wire format > change > >>> and code complexity additions. > >>> > >>> (to be clear - kip-98 goes beyond "mere" wire format changes and im not > >>> saying it could have been completely done with headers, but > exactly-once > >>> delivery certainly could) > >>> > >>> On Thu, Dec 1, 2016 at 11:20 AM, Gwen Shapira <[email protected]> > wrote: > >>> > >>> > On Thu, Dec 1, 2016 at 10:24 AM, radai <[email protected]> > >>> wrote: > >>> > > "For use cases within an organization, one could always use other > >>> > > approaches such as company-wise containers" > >>> > > this is what linkedin has traditionally done but there are now > cases > >>> > (read > >>> > > - topics) where this is not acceptable. this makes headers useful > even > >>> > > within single orgs for cases where one-container-fits-all cannot > >>> apply. 
> >>> > > > >>> > > as for the particular use cases listed, i dont want this to devolve > >>> to a > >>> > > discussion of particular use cases - i think its enough that some > of > >>> them > >>> > > >>> > I think a main point of contention is that: We identified few > >>> > use-cases where headers are useful, do we want Kafka to be a system > >>> > that supports those use-cases? > >>> > > >>> > For example, Jun said: > >>> > "Not sure how widely useful record-level lineage is though since the > >>> > overhead could > >>> > be significant." > >>> > > >>> > We know NiFi supports record level lineage. I don't think it was > >>> > developed for lols, I think it is safe to assume that the NSA needed > >>> > that functionality. We also know that certain financial institutes > >>> > need to track tampering with records at a record level and there are > >>> > federal regulations that absolutely require this. They also need to > >>> > prove that routing apps that "touches" the messages and either reads > >>> > or updates headers couldn't have possibly modified the payload > itself. > >>> > They use record level encryption to do that - apps can read and > >>> > (sometimes) modify headers but can't touch the payload. > >>> > > >>> > We can totally say "those are corner cases and not worth adding > >>> > headers to Kafka for", they should use a different pubsub message for > >>> > that (Nifi or one of the other 1000 that cater specifically to the > >>> > financial industry). > >>> > > >>> > But this gets us into a catch 22: > >>> > If we discuss a specific use-case, someone can always say it isn't > >>> > interesting enough for Kafka. If we discuss more general trends, > >>> > others can say "well, we are not sure any of them really needs > headers > >>> > specifically. This is just hand waving and not interesting.". 
> >>> > > >>> > I think discussing use-cases in specifics is super important to > decide > >>> > implementation details for headers (my use-cases lean toward > numerical > >>> > keys with namespaces and object values, others differ), but I think > we > >>> > need to answer the general "Are we going to have headers" question > >>> > first. > >>> > > >>> > I'd love to hear from the other committers in the discussion: > >>> > What would it take to convince you that headers in Kafka are a good > >>> > idea in general, so we can move ahead and try to agree on the > details? > >>> > > >>> > I feel like we keep moving the goal posts and this is truly > exhausting. > >>> > > >>> > For the record, I mildly support adding headers to Kafka (+0.5?). > >>> > The community can continue to find workarounds to the issue and there > >>> > are some benefits to keeping the message format and clients simpler. > >>> > But I see the usefulness of headers to many use-cases and if we can > >>> > find a good and generally useful way to add it to Kafka, it will make > >>> > Kafka easier to use for many - worthy goal in my eyes. > >>> > > >>> > > are interesting/feasible, but: > >>> > > A+B. i think there are use cases for polyglot topics. especially if > >>> kafka > >>> > > is being used to "trunk" something else. > >>> > > D. multiple topics would make it harder to write portable consumer > >>> code. > >>> > > partition remapping would mess with locality of consumption > >>> guarantees. > >>> > > E+F. a use case I see for lineage/metadata is billing/chargeback. > for > >>> > that > >>> > > use case it is not enough to simply record the point of origin, but > >>> every > >>> > > replication stop (think mirror maker) must also add a record to > form a > >>> > > "transit log". > >>> > > > >>> > > as for stream processing on top of kafka - i know samza has a > metadata > >>> > map > >>> > > which they carry around in addition to user values. 
headers are the > >>> > perfect > >>> > > fit for these things. > >>> > > > >>> > > > >>> > > > >>> > > On Wed, Nov 30, 2016 at 6:50 PM, Jun Rao <[email protected]> wrote: > >>> > > > >>> > >> Hi, Michael, > >>> > >> > >>> > >> In order to answer the first two questions, it would be helpful > if we > >>> > could > >>> > >> identify 1 or 2 strong use cases for headers in the space for > >>> > third-party > >>> > >> vendors. For use cases within an organization, one could always > use > >>> > other > >>> > >> approaches such as company-wise containers to get around w/o > >>> headers. I > >>> > >> went through the use cases in the KIP and in Radai's wiki ( > >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/A+ > >>> > Case+for+Kafka+Headers > >>> > >> ). > >>> > >> The following are the ones that that I understand and could be in > the > >>> > >> third-party use case category. > >>> > >> > >>> > >> A. content-type > >>> > >> It seems that in general, content-type should be set at the topic > >>> level. > >>> > >> Not sure if mixing messages with different content types should be > >>> > >> encouraged. > >>> > >> > >>> > >> B. schema id > >>> > >> Since the value is mostly useless without schema id, it seems that > >>> > storing > >>> > >> the schema id together with serialized bytes in the value is > better? > >>> > >> > >>> > >> C. per message encryption > >>> > >> One drawback of this approach is that this significantly reduce > the > >>> > >> effectiveness of compression, which happens on a set of serialized > >>> > >> messages. An alternative is to enable SSL for wire encryption and > >>> rely > >>> > on > >>> > >> the storage system (e.g. LUKS) for at rest encryption. > >>> > >> > >>> > >> D. cluster ID for mirroring across Kafka clusters > >>> > >> This is actually interesting. 
Today, to avoid introducing cycles > when > >>> > doing > >>> > >> mirroring across data centers, one would either have to set up two > >>> Kafka > >>> > >> clusters (a local and an aggregate) per data center or rename > topics. > >>> > >> Neither is ideal. With headers, the producer could tag each > message > >>> with > >>> > >> the producing cluster ID in the header. MirrorMaker could then > avoid > >>> > >> mirroring messages to a cluster if they are tagged with the same > >>> cluster > >>> > >> id. > >>> > >> > >>> > >> However, an alternative approach is to introduce sth like > >>> hierarchical > >>> > >> topic and store messages from different clusters in different > >>> partitions > >>> > >> under the same topic. This approach avoids filtering out unneeded > >>> data > >>> > and > >>> > >> makes offset preserving easier to support. It may make compaction > >>> > trickier > >>> > >> though since the same key may show up in different partitions. > >>> > >> > >>> > >> E. record-level lineage > >>> > >> For example, a source connector could store in the message the > >>> metadata > >>> > >> (e.g. UUID) of the source record. Similarly, if a stream job > >>> transforms > >>> > >> messages from topic A to topic B, the library could include the > >>> source > >>> > >> message offset in each of the transformed message in the header. > Not > >>> > sure > >>> > >> how widely useful record-level lineage is though since the > overhead > >>> > could > >>> > >> be significant. > >>> > >> > >>> > >> F. auditing metadata > >>> > >> We could put things like clientId/host/user in the header in each > >>> > message > >>> > >> for auditing. These metadata are really at the producer level > though. > >>> > So, a > >>> > >> more efficient way is to only include a "producerId" per message > and > >>> > send > >>> > >> the producerId -> metadata mapping independently. KIP-98 is > actually > >>> > >> proposing including such a producerId natively in the message. 
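
[Editorial sketch, not part of the original thread] Item D above describes tagging each message with the producing cluster's ID so that mirroring can skip messages that would loop back to their origin. A minimal illustration of that filtering rule follows. Everything named here is an assumption made for the example: the header key `CLUSTER_ID_KEY`, the dict-based header map, and both function names are hypothetical and are not Kafka or MirrorMaker APIs.

```python
from typing import Dict

# Hypothetical integer header key reserved for "originating cluster id".
CLUSTER_ID_KEY = 1


def tag_origin(headers: Dict[int, bytes], cluster_id: bytes) -> Dict[int, bytes]:
    """Return a copy of headers tagged with the producing cluster id.

    An existing tag is preserved, so a mirrored message keeps its true origin.
    """
    tagged = dict(headers)
    tagged.setdefault(CLUSTER_ID_KEY, cluster_id)
    return tagged


def should_mirror(headers: Dict[int, bytes], target_cluster_id: bytes) -> bool:
    """Mirror a message unless it originated in the target cluster."""
    return headers.get(CLUSTER_ID_KEY) != target_cluster_id


# A message produced in cluster "dc-a" may be mirrored to "dc-b",
# but must not be mirrored back into "dc-a".
msg_headers = tag_origin({}, b"dc-a")
forward_ok = should_mirror(msg_headers, b"dc-b")
loop_back_ok = should_mirror(msg_headers, b"dc-a")
```

The sketch only shows the equality check on one header; it says nothing about how offsets or compaction would behave, which is the trade-off Jun raises against the hierarchical-topic alternative.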
> >>> > >> > >>> > >> So, overall, I not sure that I am fully convinced of the strong > >>> > third-party > >>> > >> use cases of headers yet. Perhaps we could discuss a bit more to > make > >>> > one > >>> > >> or two really convincing use cases. > >>> > >> > >>> > >> Another orthogonal question is whether header should be exposed > in > >>> > stream > >>> > >> processing systems such Kafka stream, Samza, and Spark streaming. > >>> > >> Currently, those systems just deal with key/value pairs. Should we > >>> > expose a > >>> > >> third thing header there too or somehow map header to key or > value? > >>> > >> > >>> > >> Thanks, > >>> > >> > >>> > >> Jun > >>> > >> > >>> > >> > >>> > >> On Tue, Nov 29, 2016 at 3:35 AM, Michael Pearce < > >>> [email protected]> > >>> > >> wrote: > >>> > >> > >>> > >> > I assume, that after a period of a week, that there is no > concerns > >>> now > >>> > >> > with points 1, and 2 and now we have agreement that headers are > >>> useful > >>> > >> and > >>> > >> > needed in Kafka. As such if put to a KIP vote, this wouldn’t be > a > >>> > reason > >>> > >> to > >>> > >> > reject. > >>> > >> > > >>> > >> > @ > >>> > >> > Ignacio on point 4). > >>> > >> > I think for purpose of getting this KIP moving past this, we can > >>> state > >>> > >> the > >>> > >> > key will be a 4 bytes space that can will be naturally > interpreted > >>> as > >>> > an > >>> > >> > Int32 (if namespacing is later wanted you can easily split this > >>> into > >>> > two > >>> > >> > int16 spaces), from the wire protocol implementation this makes > no > >>> > >> > difference I don’t believe. Is this reasonable to all? > >>> > >> > > >>> > >> > On 5) as per point 4 therefor happy we keep with 32 bits. 
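
[Editorial sketch, not part of the original thread] The point made above, that a 4-byte key read as one Int32 can later be split into two int16 spaces without changing the wire format, can be checked with a few lines. The function names are hypothetical, and the key is treated as an unsigned 32-bit big-endian value purely for illustration; the thread does not fix signedness or byte order.

```python
import struct
from typing import Tuple


def join_key(namespace: int, ident: int) -> int:
    """Combine a 16-bit namespace and a 16-bit id into one 32-bit key."""
    if not (0 <= namespace <= 0xFFFF and 0 <= ident <= 0xFFFF):
        raise ValueError("namespace and ident must each fit in 16 bits")
    return (namespace << 16) | ident


def split_key(key: int) -> Tuple[int, int]:
    """Interpret a 32-bit key as a (namespace, id) pair of 16-bit values."""
    if not 0 <= key <= 0xFFFFFFFF:
        raise ValueError("key must fit in 32 bits")
    return key >> 16, key & 0xFFFF


# The {666,1} pair used elsewhere in the thread, as a single 32-bit key.
key = join_key(666, 1)

# Both views serialize to the same four bytes (big-endian assumed).
as_one_int = struct.pack(">I", key)
as_two_shorts = struct.pack(">HH", 666, 1)
```

Since `as_one_int` and `as_two_shorts` are byte-for-byte identical, the choice between "one int" and "{int,int}" is a matter of interpretation in the layer above, which matches the argument that the lower layer only performs equality matching.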
> >>> > >> > > >>> > >> > > >>> > >> > > >>> > >> > > >>> > >> > > >>> > >> > > >>> > >> > On 18/11/2016, 20:34, "[email protected] on behalf of > >>> Ignacio > >>> > >> > Solis" <[email protected] on behalf of [email protected]> > >>> wrote: > >>> > >> > > >>> > >> > Summary: > >>> > >> > > >>> > >> > 3) Yes - Header value as byte[] > >>> > >> > > >>> > >> > 4a) Int,Int - No > >>> > >> > 4b) Int - Yes > >>> > >> > 4c) String - Reluctant maybe > >>> > >> > > >>> > >> > 5) I believe the header system should take a single int. I > >>> think > >>> > >> > 32bits is > >>> > >> > a good size, if you want to interpret this as to 16bit > numbers > >>> in > >>> > the > >>> > >> > layer > >>> > >> > above go right ahead. If somebody wants to argue for 16 > bits > >>> or > >>> > 64 > >>> > >> > bits of > >>> > >> > header key space I would listen. > >>> > >> > > >>> > >> > > >>> > >> > Discussion: > >>> > >> > Dividing the key space into sub_key_1 and sub_key_2 makes no > >>> > sense to > >>> > >> > me at > >>> > >> > this layer. Are we going to start providing APIs to get all > >>> the > >>> > >> > sub_key_1s? or all the sub_key_2s? If there is no > >>> distinguishing > >>> > >> > functions > >>> > >> > that are applied to each one then they should be a single > >>> value. > >>> > At > >>> > >> > this > >>> > >> > layer all we're doing is equality. > >>> > >> > If the above layer wants to interpret this as 2, 3 or more > >>> values > >>> > >> > that's a > >>> > >> > different question. I personally think it's all one > keyspace > >>> > that is > >>> > >> > getting assigned using some structure, but if you want to > >>> > sub-assign > >>> > >> > parts > >>> > >> > of it then that's fine. > >>> > >> > > >>> > >> > The same discussion applies to strings. If somebody argued > for > >>> > >> > strings, > >>> > >> > would we be arguing to divide the strings with dots ('.') > as a > >>> > >> > requirement? 
> >>> > >> > Would we want them to give us the different name segments > >>> > separately? > >>> > >> > Would we be performing any actions on this key other than > >>> > matching? > >>> > >> > > >>> > >> > Nacho > >>> > >> > > >>> > >> > > >>> > >> > > >>> > >> > On Fri, Nov 18, 2016 at 9:30 AM, Michael Pearce < > >>> > >> [email protected] > >>> > >> > > > >>> > >> > wrote: > >>> > >> > > >>> > >> > > #jay #jun any concerns on 1 and 2 still? > >>> > >> > > > >>> > >> > > @all > >>> > >> > > To get this moving along a bit more I'd also like to ask > to > >>> get > >>> > >> > clarity on > >>> > >> > > the below last points: > >>> > >> > > > >>> > >> > > 3) I believe we're all roughly happy with the header value > >>> > being a > >>> > >> > byte[]? > >>> > >> > > > >>> > >> > > 4) I believe consensus has been for an namespace based int > >>> > approach > >>> > >> > > {int,int} for the key. Any objections if this is what we > go > >>> > with? > >>> > >> > > > >>> > >> > > 5) as we have if assumption in (4) is correct, {int,int} > >>> keys. > >>> > >> > > Should both int's be int16 or int32? > >>> > >> > > I'm for them being int16(2 bytes) as combined is space of > >>> > 4bytes as > >>> > >> > per > >>> > >> > > original and gives plenty of combinations for the > >>> foreseeable, > >>> > and > >>> > >> > keeps > >>> > >> > > the overhead small. > >>> > >> > > > >>> > >> > > Do we see any benefit in another kip call to discuss > these at > >>> > all? > >>> > >> > > > >>> > >> > > Cheers > >>> > >> > > Mike > >>> > >> > > ________________________________________ > >>> > >> > > From: K Burstev <[email protected]> > >>> > >> > > Sent: Friday, November 18, 2016 7:07:07 AM > >>> > >> > > To: [email protected] > >>> > >> > > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers > >>> > >> > > > >>> > >> > > For what it is worth also i agree. 
As a user: > >>> > >> > > > >>> > >> > > 1) Yes - Headers are worthwhile > >>> > >> > > 2) Yes - Headers should be a top level option > >>> > >> > > > >>> > >> > > 14.11.2016, 21:15, "Ignacio Solis" <[email protected]>: > >>> > >> > > > 1) Yes - Headers are worthwhile > >>> > >> > > > 2) Yes - Headers should be a top level option > >>> > >> > > > > >>> > >> > > > On Mon, Nov 14, 2016 at 9:16 AM, Michael Pearce < > >>> > >> > [email protected]> > >>> > >> > > > wrote: > >>> > >> > > > > >>> > >> > > >> Hi Roger, > >>> > >> > > >> > >>> > >> > > >> The kip details/examples the original proposal for key > >>> > spacing > >>> > >> , > >>> > >> > not > >>> > >> > > the > >>> > >> > > >> new mentioned as per discussion namespace idea. > >>> > >> > > >> > >>> > >> > > >> We will need to update the kip, when we get agreement > >>> this > >>> > is a > >>> > >> > better > >>> > >> > > >> approach (which seems to be the case if I have > understood > >>> > the > >>> > >> > general > >>> > >> > > >> feeling in the conversation) > >>> > >> > > >> > >>> > >> > > >> Re the variable ints, at very early stage we did think > >>> about > >>> > >> > this. I > >>> > >> > > think > >>> > >> > > >> the added complexity for the saving isn't worth it. > I'd > >>> > rather > >>> > >> go > >>> > >> > > with, if > >>> > >> > > >> we want to reduce overheads and size int16 (2bytes) > keys > >>> as > >>> > it > >>> > >> > keeps it > >>> > >> > > >> simple. > >>> > >> > > >> > >>> > >> > > >> On the note of no headers, there is as per the kip as > we > >>> > use an > >>> > >> > > attribute > >>> > >> > > >> bit to denote if headers are present or not as such > >>> > provides a > >>> > >> > zero > >>> > >> > > >> overhead currently if headers are not used. 
> >>> > >> > > >> > >>> > >> > > >> I think as radai mentions would be good first if we > can > >>> get > >>> > >> > clarity if > >>> > >> > > do > >>> > >> > > >> we now have general consensus that (1) headers are > >>> > worthwhile > >>> > >> and > >>> > >> > > useful, > >>> > >> > > >> and (2) we want it as a top level entity. > >>> > >> > > >> > >>> > >> > > >> Just to state the obvious i believe (1) headers are > >>> > worthwhile > >>> > >> > and (2) > >>> > >> > > >> agree as a top level entity. > >>> > >> > > >> > >>> > >> > > >> Cheers > >>> > >> > > >> Mike > >>> > >> > > >> ________________________________________ > >>> > >> > > >> From: Roger Hoover <[email protected]> > >>> > >> > > >> Sent: Wednesday, November 9, 2016 9:10:47 PM > >>> > >> > > >> To: [email protected] > >>> > >> > > >> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers > >>> > >> > > >> > >>> > >> > > >> Sorry for going a little in the weeds but thanks for > the > >>> > >> replies > >>> > >> > > regarding > >>> > >> > > >> varint. > >>> > >> > > >> > >>> > >> > > >> Agreed that a prefix and {int, int} can be the same. > It > >>> > doesn't > >>> > >> > look > >>> > >> > > like > >>> > >> > > >> that's what the KIP is saying the "Open" section. The > >>> > example > >>> > >> > shows > >>> > >> > > >> 2100001 > >>> > >> > > >> for New Relic and 210002 for App Dynamics implying > that > >>> the > >>> > New > >>> > >> > Relic > >>> > >> > > >> organization will have only a single header id to work > >>> > with. Or > >>> > >> > is > >>> > >> > > 2100001 > >>> > >> > > >> a prefix? The main point of a namespace or prefix is > to > >>> > reduce > >>> > >> > the > >>> > >> > > >> overhead of config mapping or registration depending > on > >>> how > >>> > >> > > >> namespaces/prefixes are managed. > >>> > >> > > >> > >>> > >> > > >> Would love to hear more feedback on the higher-level > >>> > questions > >>> > >> > > though... 
> >>> > >> > > >> > >>> > >> > > >> Cheers, > >>> > >> > > >> > >>> > >> > > >> Roger > >>> > >> > > >> > >>> > >> > > >> On Wed, Nov 9, 2016 at 11:38 AM, radai < > >>> > >> > [email protected]> > >>> > >> > > wrote: > >>> > >> > > >> > >>> > >> > > >> > I think this discussion is getting a bit into the > >>> weeds on > >>> > >> > technical > >>> > >> > > >> > implementation details. > >>> > >> > > >> > I'd liek to step back a minute and try and establish > >>> > where we > >>> > >> > are in > >>> > >> > > the > >>> > >> > > >> > larger picture: > >>> > >> > > >> > > >>> > >> > > >> > (re-wording nacho's last paragraph) > >>> > >> > > >> > 1. are we all in agreement that headers are a > >>> worthwhile > >>> > and > >>> > >> > useful > >>> > >> > > >> > addition to have? this was contested early on > >>> > >> > > >> > 2. are we all in agreement on headers as top level > >>> entity > >>> > vs > >>> > >> > headers > >>> > >> > > >> > squirreled-away in V? > >>> > >> > > >> > > >>> > >> > > >> > if there are still concerns around these #2 points > >>> (#jay? > >>> > >> > #jun?)? > >>> > >> > > >> > > >>> > >> > > >> > (and now back to our normal programming ...) > >>> > >> > > >> > > >>> > >> > > >> > varints are nice. having said that, its adding > >>> complexity > >>> > >> (see > >>> > >> > > >> > https://github.com/addthis/ > stream-lib/blob/master/src/ > >>> > >> > > >> > main/java/com/clearspring/ > analytics/util/Varint.java > >>> > >> > > >> > as 1st google result) and would require anyone > writing > >>> > other > >>> > >> > clients > >>> > >> > > (C? > >>> > >> > > >> > Python? Go? Bash? ;-) ) to get/implement the same, > and > >>> for > >>> > >> > relatively > >>> > >> > > >> > little gain (int vs string is order of magnitude, > this > >>> > isnt). 
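
[Editorial sketch, not part of the original thread] For readers weighing the complexity argument above, this is roughly what a base-128 varint codec looks like, following the unsigned scheme described in the Protocol Buffers encoding documentation linked in the thread. It is an illustration only, not the encoding proposed in the KIP; the function names are hypothetical and negative values are deliberately not handled.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a base-128 varint (low 7 bits first)."""
    if value < 0:
        raise ValueError("only non-negative values are supported in this sketch")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)         # final byte has the high bit clear
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one varint from data at offset; return (value, next_offset)."""
    result = 0
    shift = 0
    for i in range(offset, len(data)):
        byte = data[i]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, i + 1
        shift += 7
    raise ValueError("truncated varint")


# Header ids below 128 take one byte; the KIP's example id 2100001 takes four.
small = encode_varint(127)
large = encode_varint(2100001)
```

The codec is short, but every client implementation would need its own copy, which is the cost being weighed against the per-header byte savings.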
> >>> > >> > > >> > > >>> > >> > > >> > int namespacing vs {int, int} namespacing are > basically > >>> > the > >>> > >> > same > >>> > >> > > thing - > >>> > >> > > >> > youre just namespacing an int64 and giving people > while > >>> > 2^32 > >>> > >> > ranges > >>> > >> > > at a > >>> > >> > > >> > time. the part i like about this is letting people > >>> have a > >>> > >> large > >>> > >> > > swath of > >>> > >> > > >> > numbers with one registration so they dont have to > come > >>> > back > >>> > >> > for > >>> > >> > > every > >>> > >> > > >> > single plugin/header they want to "reserve". > >>> > >> > > >> > > >>> > >> > > >> > > >>> > >> > > >> > On Wed, Nov 9, 2016 at 11:01 AM, Roger Hoover < > >>> > >> > > [email protected]> > >>> > >> > > >> > wrote: > >>> > >> > > >> > > >>> > >> > > >> > > Since some of the debate has been about overhead + > >>> > >> > performance, I'm > >>> > >> > > >> > > wondering if we have considered a varint encoding > ( > >>> > >> > > >> > > https://developers.google.com/ > protocol-buffers/docs/ > >>> > >> > > encoding#varints) > >>> > >> > > >> > for > >>> > >> > > >> > > the header length field (int32 in the proposal) > and > >>> for > >>> > >> > header > >>> > >> > > ids? If > >>> > >> > > >> > you > >>> > >> > > >> > > don't use headers, the overhead would be a single > >>> byte > >>> > and > >>> > >> > for each > >>> > >> > > >> > header > >>> > >> > > >> > > id < 128 would also need only a single byte? > >>> > >> > > >> > > > >>> > >> > > >> > > > >>> > >> > > >> > > > >>> > >> > > >> > > On Wed, Nov 9, 2016 at 6:43 AM, radai < > >>> > >> > [email protected]> > >>> > >> > > >> > wrote: > >>> > >> > > >> > > > >>> > >> > > >> > > > @magnus - and very dangerous (youre essentially > >>> > >> > downloading and > >>> > >> > > >> > executing > >>> > >> > > >> > > > arbitrary code off the internet on your servers > ... 
> >>> > bad > >>> > >> > idea > >>> > >> > > without > >>> > >> > > >> a > >>> > >> > > >> > > > sandbox, even with) > >>> > >> > > >> > > > > >>> > >> > > >> > > > as for it being a purely administrative task - i > >>> > >> disagree. > >>> > >> > > >> > > > > >>> > >> > > >> > > > i wish it would, really, because then my earlier > >>> > point on > >>> > >> > the > >>> > >> > > >> > complexity > >>> > >> > > >> > > of > >>> > >> > > >> > > > the remapping process would be invalid, but at > >>> > linkedin, > >>> > >> > for > >>> > >> > > example, > >>> > >> > > >> > we > >>> > >> > > >> > > > (the team im in) run kafka as a service. we dont > >>> > really > >>> > >> > know > >>> > >> > > what our > >>> > >> > > >> > > users > >>> > >> > > >> > > > (developing applications that use kafka) are up > to > >>> at > >>> > any > >>> > >> > given > >>> > >> > > >> moment. > >>> > >> > > >> > > it > >>> > >> > > >> > > > is very possible (given the existance of headers > >>> and a > >>> > >> > > corresponding > >>> > >> > > >> > > plugin > >>> > >> > > >> > > > ecosystem) for some application to "equip" their > >>> > >> producers > >>> > >> > and > >>> > >> > > >> > consumers > >>> > >> > > >> > > > with the required plugin without us knowing. i > dont > >>> > mean > >>> > >> > to imply > >>> > >> > > >> thats > >>> > >> > > >> > > > bad, i just want to make the point that its not > as > >>> > simple > >>> > >> > > keeping it > >>> > >> > > >> in > >>> > >> > > >> > > > sync across a large-enough organization. 
> >>> > >> > > >> > > > > >>> > >> > > >> > > > > >>> > >> > > >> > > > On Wed, Nov 9, 2016 at 6:17 AM, Magnus Edenhill > < > >>> > >> > > [email protected]> > >>> > >> > > >> > > > wrote: > >>> > >> > > >> > > > > >>> > >> > > >> > > > > I think there is a piece missing in the > Strings > >>> > >> > discussion, > >>> > >> > > where > >>> > >> > > >> > > > > pro-Stringers > >>> > >> > > >> > > > > reason that by providing unique string > >>> identifiers > >>> > for > >>> > >> > each > >>> > >> > > header > >>> > >> > > >> > > > > everything will just > >>> > >> > > >> > > > > magically work for all parts of the stream > >>> pipeline. > >>> > >> > > >> > > > > > >>> > >> > > >> > > > > But the strings dont mean anything by > themselves, > >>> > and > >>> > >> > while we > >>> > >> > > >> could > >>> > >> > > >> > > > > probably envision > >>> > >> > > >> > > > > some auto plugin loader that downloads, > compiles, > >>> > links > >>> > >> > and > >>> > >> > > runs > >>> > >> > > >> > > plugins > >>> > >> > > >> > > > > on-demand > >>> > >> > > >> > > > > as soon as they're seen by a consumer, I dont > >>> really > >>> > >> see > >>> > >> > a > >>> > >> > > use-case > >>> > >> > > >> > for > >>> > >> > > >> > > > > something > >>> > >> > > >> > > > > so dynamic (and fragile) in practice. > >>> > >> > > >> > > > > > >>> > >> > > >> > > > > In the real world an application will be > >>> configured > >>> > >> with > >>> > >> > a set > >>> > >> > > of > >>> > >> > > >> > > plugins > >>> > >> > > >> > > > > to either add (producer) > >>> > >> > > >> > > > > or read (consumer) headers. > >>> > >> > > >> > > > > This is an administrative task based on what > >>> > features a > >>> > >> > client > >>> > >> > > >> > > > > needs/provides and results in > >>> > >> > > >> > > > > some sort of configuration to enable and > >>> configure > >>> > the > >>> > >> > desired > >>> > >> > > >> > plugins. 
> >>> > >> > > >> > > > > > >>> > >> > > >> > > > > Since this needs to be kept somewhat in sync > >>> across > >>> > an > >>> > >> > > organisation > >>> > >> > > >> > > > (there > >>> > >> > > >> > > > > is no point in having producers > >>> > >> > > >> > > > > add headers no consumers will read, and vice > >>> versa), > >>> > >> the > >>> > >> > added > >>> > >> > > >> > > complexity > >>> > >> > > >> > > > > of assigning an id namespace > >>> > >> > > >> > > > > for each plugin as it is being configured > should > >>> be > >>> > >> > tolerable. > >>> > >> > > >> > > > > > >>> > >> > > >> > > > > > >>> > >> > > >> > > > > /Magnus > >>> > >> > > >> > > > > > >>> > >> > > >> > > > > 2016-11-09 13:06 GMT+01:00 Michael Pearce < > >>> > >> > > [email protected]>: > >>> > >> > > >> > > > > > >>> > >> > > >> > > > > > Just following/catching up on what seems to > be > >>> an > >>> > >> > active > >>> > >> > > night :) > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > @Radai sorry if it may seem obvious but what > >>> does > >>> > MD > >>> > >> > stand > >>> > >> > > for? > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > My take on String vs Int: > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > I will state first I am pro Int (16 or 32). > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > I do though playing devils advocate see a > big > >>> plus > >>> > >> > with the > >>> > >> > > >> > argument > >>> > >> > > >> > > of > >>> > >> > > >> > > > > > String keys, this is around integrating > into an > >>> > >> > existing > >>> > >> > > >> > eco-system. > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > As many other systems use String based > headers > >>> > >> (Flume, > >>> > >> > JMS) > >>> > >> > > it > >>> > >> > > >> > makes > >>> > >> > > >> > > > it > >>> > >> > > >> > > > > > much easier for these to be > >>> > incorporated/integrated > >>> > >> > into. 
> >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > How with Int based headers could we provide > a > >>> > >> > way/guidence to > >>> > >> > > >> make > >>> > >> > > >> > > this > >>> > >> > > >> > > > > > integration simple / easy with transition > flows > >>> > over > >>> > >> to > >>> > >> > > kafka? > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > * tough luck buddy you're on your own > >>> > >> > > >> > > > > > * simply hash the string into int code and > hope > >>> > for > >>> > >> no > >>> > >> > > collisions > >>> > >> > > >> > > (how > >>> > >> > > >> > > > to > >>> > >> > > >> > > > > > convert back though?) > >>> > >> > > >> > > > > > * http2 style as mentioned by nacho. > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > cheers, > >>> > >> > > >> > > > > > Mike > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > ________________________________________ > >>> > >> > > >> > > > > > From: radai <[email protected]> > >>> > >> > > >> > > > > > Sent: Wednesday, November 9, 2016 8:12 AM > >>> > >> > > >> > > > > > To: [email protected] > >>> > >> > > >> > > > > > Subject: Re: [DISCUSS] KIP-82 - Add Record > >>> Headers > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > thinking about it some more, the best way to > >>> > transmit > >>> > >> > the > >>> > >> > > header > >>> > >> > > >> > > > > remapping > >>> > >> > > >> > > > > > data to consumers would be to put it in the > MD > >>> > >> response > >>> > >> > > payload, > >>> > >> > > >> so > >>> > >> > > >> > > > maybe > >>> > >> > > >> > > > > > it should be discussed now. > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > On Wed, Nov 9, 2016 at 12:09 AM, radai < > >>> > >> > > >> [email protected] > >>> > >> > > >> > > > >>> > >> > > >> > > > > wrote: > >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > > im not opposed to the idea of namespace > >>> mapping. 
> >>> > >> > > >> > > > > > > all im saying is that its not part of the "mvp" and, since it requires no wire format change, can always be added later.
> >>> > >> > > >> > > > > > > also, its not as simple as just configuring MM to do the transform: lets say i've implemented large message support as {666,1} and on some mirror target cluster its been remapped to {999,1}. the consumer plugin code would also need to be told to look for the large message "part X of Y" header under {999,1}. doable, but tricky.
> >>> > >> > > >> > > > > > >
> >>> > >> > > >> > > > > > > On Tue, Nov 8, 2016 at 10:29 PM, Gwen Shapira <[email protected]> wrote:
> >>> > >> > > >> > > > > > >
> >>> > >> > > >> > > > > > >> While you can do whatever you want with a namespace and your code, what I'd expect is for each app to make namespaces configurable...
> >>> > >> > > >> > > > > > >>
> >>> > >> > > >> > > > > > >> So if I accidentally used 666 for my HR department, and still want to run RadaiApp, I can config "namespace=42" for RadaiApp and everything will look normal.
> >>> > >> > > >> > > > > > >>
> >>> > >> > > >> > > > > > >> This means you only need to sync usage inside your own organization. Still hard, but somewhat easier than syncing with the entire world.
> >>> > >> > > >> > > > > > >>
> >>> > >> > > >> > > > > > >> On Tue, Nov 8, 2016 at 10:07 PM, radai <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> > and we can start with {namespace, id} and no re-mapping support and always add it later on if/when collisions actually happen (i dont think they'd be a problem).
> >>> > >> > > >> > > > > > >> >
> >>> > >> > > >> > > > > > >> > every interested party (so orgs or individuals) could then register a prefix (0 = reserved, 1 = confluent ... 666 = me :-) ) and do whatever with the 2nd ID - so once linkedin registers, say 3, then linkedin devs are free to use {3, *} with a reasonable expectation not to collide with anything else. further partitioning of that * becomes linkedin's problem, but the "upstream registration" of a namespace only has to happen once.
> >>> > >> > > >> > > > > > >> >
> >>> > >> > > >> > > > > > >> > On Tue, Nov 8, 2016 at 9:03 PM, James Cheng <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >
> >>> > >> > > >> > > > > > >> >>
> >>> > >> > > >> > > > > > >> >> > On Nov 8, 2016, at 5:54 PM, Gwen Shapira <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >
> >>> > >> > > >> > > > > > >> >> > Thank you so much for this clear and fair summary of the arguments.
> >>> > >> > > >> > > > > > >> >> >
> >>> > >> > > >> > > > > > >> >> > I'm in favor of ints. Not a deal-breaker, but in favor.
> >>> > >> > > >> > > > > > >> >> >
> >>> > >> > > >> > > > > > >> >> > Even more in favor of Magnus's decentralized suggestion with Roger's tweak: add a namespace for headers. This will allow each app to just use whatever IDs it wants internally, and then let the admin deploying the app figure out an available namespace ID for the app to live in. So io.confluent.schema-registry can be namespace 0x01 on my deployment and 0x57 on yours, and the poor guys developing the app don't need to worry about that.
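[Editorial sketch] The per-deployment namespace idea in the paragraph above can be illustrated roughly as follows. This is only a sketch under assumptions, not anything proposed in the thread: `HeaderKey`, `NamespacedApp`, the "audit-plugin" app and the namespace value 0x02 are hypothetical names chosen for illustration; only the {namespace, id} pairing and the 0x01/0x57 example come from the discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeaderKey:
    """A header key modelled as a {namespace, id} pair (hypothetical type)."""
    namespace: int
    local_id: int


class NamespacedApp:
    """Hypothetical helper: the app only knows its own local ids."""

    def __init__(self, name: str, namespace: int) -> None:
        self.name = name
        # The namespace is configuration picked by whoever deploys the app.
        self.namespace = namespace

    def key(self, local_id: int) -> HeaderKey:
        # Combine the deploy-time namespace with the app-internal id.
        return HeaderKey(self.namespace, local_id)


# The same app in two deployments, each admin picking a free namespace id.
mine = NamespacedApp("io.confluent.schema-registry", namespace=0x01)
yours = NamespacedApp("io.confluent.schema-registry", namespace=0x57)

# A second (made-up) app in "my" deployment reuses local id 1 without colliding.
audit = NamespacedApp("audit-plugin", namespace=0x02)

print(mine.key(1))
print(yours.key(1))
print(mine.key(1) == audit.key(1))
```

The app code only ever passes local ids; the namespace is plain configuration. The flip side is that the same header gets a different full key in each deployment, so anything that moves messages between deployments has to know both values.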
> >>> > >> > > >> > > > > > >> >> >
> >>> > >> > > >> > > > > > >> >>
> >>> > >> > > >> > > > > > >> >> Gwen, if I understand your example right, an application deployer might decide to use 0x01 in one deployment, and that means that once the message is written into the broker, it will be saved on the broker with that specific namespace (0x01).
> >>> > >> > > >> > > > > > >> >>
> >>> > >> > > >> > > > > > >> >> If you were to mirror that message into another cluster, the 0x01 would accompany the message, right? What if the deployers of the same app in the other cluster use 0x57? They won't understand each other?
> >>> > >> > > >> > > > > > >> >>
> >>> > >> > > >> > > > > > >> >> I'm not sure that's an avoidable problem. I think it simply means that in order to share data, you have to also have a shared (agreed upon) understanding of what the namespaces mean.
> >>> > >> > > >> > > > > > >> >> Which I think makes sense, because the alternative (sharing *nothing* at all) would mean that there would be no way to understand each other.
> >>> > >> > > >> > > > > > >> >>
> >>> > >> > > >> > > > > > >> >> -James
> >>> > >> > > >> > > > > > >> >>
> >>> > >> > > >> > > > > > >> >> > Gwen
> >>> > >> > > >> > > > > > >> >> >
> >>> > >> > > >> > > > > > >> >> > On Tue, Nov 8, 2016 at 4:23 PM, radai <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >> +1 for sean's document. it covers pretty much all the trade-offs and provides concrete figures to argue about :-)
> >>> > >> > > >> > > > > > >> >> >> (nit-picking - used the same xkcd twice, also trove has been superseded for purposes of high performance collections: look at https://github.com/leventov/Koloboke)
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >> so to sum up the string vs int debate:
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >> performance - you can do 140k ops/sec _per thread_ with string headers. you could do x2-3 better with ints.
> >>> > >> > > >> > > > > > >> >> >> there's no arguing the relative diff between the two, there's only the question of whether or not _the rest of kafka_ operates fast enough to care. if we want to make choices solely based on performance we need ints. if we are willing to settle/compromise for a nicer (to some) API then strings are good enough for the current state of affairs.
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >> message size - with batching and compression it comes down to a ~5% difference (internal testing, not in the doc. maybe would help adding if this becomes a point of contention?). this means it wont really affect kafka in "throughput mode" (large, compressed batches).
> >>> > >> > > >> > > > > > >> >> >> in "low latency" mode (meaning less/no batching and compression) the difference can be extreme (it'll easily be an order of magnitude with small payloads like stock ticks and header keys of the form "com.acme.infraTeam.kafka.hiMom.auditPlugin"). we have a few such topics at linkedin where actual payloads are ~2 ints and are eclipsed by our in-house audit "header" which is why we liked ints to begin with.
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >> "ease of use" - strings would probably still require _some_ degree of partitioning by convention (imagine if everyone used the key "infra"...) but its very intuitive for java devs to do anyway (reverse-domain is ingrained into java developers at a young age :-) ).
> >>> > >> > > >> > > > > > >> >> >> also most java devs find Map<String, whatever> more intuitive than Map<Integer, whatever> - probably because of other text-based protocols like http. ints would require a number registry. if you think number registries are hard just look at the wiki page for KIPs (specifically the number for next available KIP) and think again - we are probably talking about the same volume of requests. also this would only be "required" (good citizenship, more like) if you want to publish your plugin for others to use. within your org do whatever you want - just know that if you use [some "reserved" range] and a future kafka update breaks it its your problem. RTFM.
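[Editorial sketch] The "message size" point made earlier in this message (tiny payloads eclipsed by long string keys when there is no batching or compression) can be made concrete with a rough byte count. This is a sketch under assumptions: "~2 ints" is taken to mean two 4-byte ints, the int key is taken to be 4 bytes, the tick values and the id 666 are arbitrary, and length prefixes and header values are ignored entirely.

```python
import struct

# Payload of two 4-byte big-endian ints, standing in for a stock tick.
# The values themselves are made up.
PAYLOAD = struct.pack(">ii", 101, 2050)

# The reverse-domain style string key quoted in the thread, as UTF-8 bytes.
STRING_KEY = "com.acme.infraTeam.kafka.hiMom.auditPlugin".encode("utf-8")

# A 4-byte int key; the id is arbitrary.
INT_KEY = struct.pack(">i", 666)


def overhead_ratio(key: bytes, payload: bytes) -> float:
    """Size of the header key relative to the payload it travels with."""
    return len(key) / len(payload)


print(len(PAYLOAD), len(STRING_KEY), len(INT_KEY))
print(overhead_ratio(STRING_KEY, PAYLOAD))  # string key vs payload
print(overhead_ratio(INT_KEY, PAYLOAD))     # int key vs payload
print(len(STRING_KEY) / len(INT_KEY))       # string key vs int key
```

Under these assumptions the string key alone is several times larger than the payload and roughly ten times the size of the int key, which is in line with the "order of magnitude" remark for uncompressed, unbatched messages.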
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >> personally im in favor of ints.
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >> having said that (and like nacho) I will settle if int vs string remains the only obstacle to this.
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >> On Tue, Nov 8, 2016 at 3:53 PM, Nacho Solis <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >>
> >>> > >> > > >> > > > > > >> >> >>> I think it's well known I've been pushing for ints (and I could switch to 16 bit shorts if pressed).
> >>> > >> > > >> > > > > > >> >> >>>
> >>> > >> > > >> > > > > > >> >> >>> - efficient (space)
> >>> > >> > > >> > > > > > >> >> >>> - efficient (processing)
> >>> > >> > > >> > > > > > >> >> >>> - easily partitionable
> >>> > >> > > >> > > > > > >> >> >>>
> >>> > >> > > >> > > > > > >> >> >>> However, if the only thing that is keeping us from adopting headers is the use of strings vs ints as keys, then I would cave in and accept strings. If we do so, I would like to limit string keys to 128 bytes in length.
> >>> > >> > > >> > > > > > >> >> >>> This way 1) I could use a 3 letter string if I wanted (effectively using 4 total bytes), 2) limit overall impact of possible keys (don't really want people to send a 16K header string key).
> >>> > >> > > >> > > > > > >> >> >>>
> >>> > >> > > >> > > > > > >> >> >>> Nacho
> >>> > >> > > >> > > > > > >> >> >>>
> >>> > >> > > >> > > > > > >> >> >>> On Tue, Nov 8, 2016 at 3:35 PM, Gwen Shapira <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >>>
> >>> > >> > > >> > > > > > >> >> >>>> Forgot to mention: Thank you for quantifying the trade-off - it is helpful and important regardless of what we end up deciding.
> >>> > >> > > >> > > > > > >> >> >>>>
> >>> > >> > > >> > > > > > >> >> >>>> On Tue, Nov 8, 2016 at 3:12 PM, Sean McCauliff <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >>>>> On Tue, Nov 8, 2016 at 2:15 PM, Gwen Shapira <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>> Since Kafka specifically targets high-throughput, low-latency use-cases, I don't think we should trade them off that easily.
> >>> > >> > > >> > > > > > >> >> >>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>> I find these kinds of design goals not really helpful unless they're quantified in some way, because it's always possible to argue against something as either being not performant or just an implementation detail.
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>> These are single-threaded benchmarks, so all the measurements are per thread.
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>> For 1M messages/s/thread if header keys are int and you had even a single header key, value pair then it's still about 2^-2 microseconds which means you only have another 0.75 microseconds to do everything else you want to do with a message (1M messages/s means 1 microsecond per message). With string header keys there is still 0.5 microseconds to process a message.
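[Editorial sketch] The per-message time budget in the paragraph above works out as follows. The figures (1M messages/s/thread, about 2^-2 microseconds for one int-keyed header, 0.5 microseconds left over with a string key) are taken as quoted from the message and not re-measured; the variable names are purely illustrative.

```python
MESSAGES_PER_SECOND = 1_000_000

# Time available per message on one thread, in microseconds.
budget_us = 1_000_000 / MESSAGES_PER_SECOND

# Quoted cost of handling a single int-keyed header (about 2^-2 microseconds).
INT_HEADER_COST_US = 2 ** -2

# Quoted time left over per message when the header key is a string.
STRING_REMAINING_US = 0.5

# What is left for "everything else" with an int key.
int_remaining_us = budget_us - INT_HEADER_COST_US

# Implied cost of the string-keyed header, and how it compares to the int one.
string_header_cost_us = budget_us - STRING_REMAINING_US
slowdown = string_header_cost_us / INT_HEADER_COST_US

print(budget_us, int_remaining_us, string_header_cost_us, slowdown)
```

Read this way, the quoted numbers imply the string-keyed header costs about twice as much as the int-keyed one, at the low end of the 2-5 speedup range reported in the benchmarking message quoted further down the thread.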
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>> I love strings as much as the next guy (we had them in Flume), but I was convinced by Magnus/Michael/Radai that strings don't actually have strong benefits as opposed to ints (you'll need a string registry anyway - otherwise, how will you know what the "profile_id" header refers to?) and I want to keep closer to our original design goals for Kafka.
> >>> > >> > > >> > > > > > >> >> >>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>> "confluent.profile_id"
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>> If someone likes strings in the headers and doesn't do millions of messages a sec, they probably have lots of other systems they can use instead.
> >>> > >> > > >> > > > > > >> >> >>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>> None of them will scale like Kafka. Horizontal scaling is still good.
> >>> > >> > > >> > > > > > >> >> >>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>> On Tue, Nov 8, 2016 at 1:22 PM, Sean McCauliff <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >>>>>>> +1 for String keys.
> >>> > >> > > >> > > > > > >> >> >>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>> I've been doing some benchmarking and it seems like the speedup for using integer keys is about 2-5x depending on the length of the strings and what collections are being used.
> >>> > >> > > >> > > > > > >> >> >>>>>>> The overall amount of time spent parsing a set of header key, value pairs probably does not matter unless you are getting close to 1M messages per consumer. In which case probably don't use headers. There is also the option to use very short strings; some that are even shorter than integers.
> >>> > >> > > >> > > > > > >> >> >>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>> Partitioning the string key space will be easier than partitioning an integer key space. We won't need a global registry. Kafka internally can reserve some prefix like "_" as its namespace.
> >>> > >> > > >> > > > > > >> >> >>>>>>> Everyone else can use their company or project name as namespace prefix and life should be good.
> >>> > >> > > >> > > > > > >> >> >>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>> Here's the link to some of the benchmarking info:
> >>> > >> > > >> > > > > > >> >> >>>>>>> https://docs.google.com/document/d/1tfT-6SZdnKOLyWGDH82kS30PnUkmgb7nPLdw6p65pAI/edit?usp=sharing
> >>> > >> > > >> > > > > > >> >> >>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>> --
> >>> > >> > > >> > > > > > >> >> >>>>>>> Sean McCauliff
> >>> > >> > > >> > > > > > >> >> >>>>>>> Staff Software Engineer
> >>> > >> > > >> > > > > > >> >> >>>>>>> Kafka
> >>> > >> > > >> > > > > > >> >> >>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>> [email protected]
> >>> > >> > > >> > > > > > >> >> >>>>>>> linkedin.com/in/sean-mccauliff-b563192
> >>> > >> > > >> > > > > > >> >> >>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>> On Mon, Nov 7, 2016 at 11:51 PM, Michael Pearce <[email protected]> wrote:
> >>> > >> > > >> > > > > > >> >> >>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>>> +1 on this slimmer version of our proposal
> >>> > >> > > >> > > > > > >> >> >>>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>>> I def think we can reduce the Id space from the proposed int32 (4 bytes) down to int16 (2 bytes); it saves on space, and we wouldn't expect the number of headers being used concurrently to be that high.
> >>> > >> > > >> > > > > > >> >> >>>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>>> I do wonder if we should still make the value byte array length int32, though, as this is the standard max array length in Java. That said, it is a header, and I guess limiting the size is sensible and would work for all the use cases we have in mind, so I'm happy with limiting this.
> >>> > >> > > >> > > > > > >> >> >>>>>>>>
> >>> > >> > > >> > > > > > >> >> >>>>>>>> Do people generally concur on Magnus's slimmer version? Anyone see any issues if we moved from int32 to int16?
> >>> > >> > > >> > > > > > >> >> >>>>>>>> > >>> > >> > > >> > > > > > >> >> >>>>>>>> Re configurable ids per > plugin > >>> > over a > >>> > >> > global > >>> > >> > > >> > registry > >>> > >> > > >> > > > also > >>> > >> > > >> > > > > > >> would > >>> > >> > > >> > > > > > >> >> >>> work > >>> > >> > > >> > > > > > >> >> >>>>>> for > >>> > >> > > >> > > > > > >> >> >>>>>>>> us. As such if this has > better > >>> > >> > concensus over > >>> > >> > > the > >>> > >> > > >> > > > > proposed > >>> > >> > > >> > > > > > >> global > >>> > >> > > >> > > > > > >> >> >>>>>> registry > >>> > >> > > >> > > > > > >> >> >>>>>>>> I'd be happy to change that. > >>> > >> > > >> > > > > > >> >> >>>>>>>> > >>> > >> > > >> > > > > > >> >> >>>>>>>> I was already sold on ints > over > >>> > >> strings > >>> > >> > for > >>> > >> > > keys > >>> > >> > > >> ;) > >>> > >> > > >> > > > > > >> >> >>>>>>>> > >>> > >> > > >> > > > > > >> >> >>>>>>>> Cheers > >>> > >> > > >> > > > > > >> >> >>>>>>>> Mike > >>> > >> > > >> > > > > > >> >> >>>>>>>> > >>> > >> > > >> > > > > > >> >> >>>>>>>> > ______________________________ > >>> > >> > __________ > >>> > >> > > >> > > > > > >> >> >>>>>>>> From: Magnus Edenhill < > >>> > >> > [email protected]> > >>> > >> > > >> > > > > > >> >> >>>>>>>> Sent: Monday, November 7, > 2016 > >>> > >> 10:10:21 > >>> > >> > PM > >>> > >> > > >> > > > > > >> >> >>>>>>>> To: [email protected] > >>> > >> > > >> > > > > > >> >> >>>>>>>> Subject: Re: [DISCUSS] > KIP-82 - > >>> Add > >>> > >> > Record > >>> > >> > > Headers > >>> > >> > > >> > > > > > >> >> >>>>>>>> > >>> > >> > > >> > > > > > >> >> >>>>>>>> Hi, > >>> > >> > > >> > > > > > >> >> >>>>>>>> > >>> > >> > > >> > > > > > >> >> >>>>>>>> I'm +1 for adding generic > >>> message > >>> > >> > headers, > >>> > >> > > but I > >>> > >> > > >> do > >>> > >> > > >> > > > share > >>> > >> > > >> > > > > > the > >>> > >> > > >> > > > > > >> >> >>>> concerns > >>> > >> > > >> > > > > > >> >> >>>>>>>> previously aired on this 
> thread and during the KIP meeting.
>
> So let me propose a slimmer alternative that does not require any sort
> of global header registry, does not affect broker performance or
> operations, and adds as little overhead as possible.
>
>
> Message
> ------------
> The protocol Message type is extended with a Headers array consisting of
> Tags, where a Tag is defined as:
>    int16 Id
>    int16 Len              // binary_data length
>    binary_data[Len]       // opaque binary data
>
>
> Ids
> ---
> The Id space is not centrally managed, so whenever an application needs to
> add headers, or use an eco-system plugin that does, its Id allocation
> will need to be manually configured.
> This moves the allocation concern from the global space down to
> organization level and avoids the risk for id conflicts.
> Example pseudo-config for some app:
>     sometrackerplugin.tag.sourcev3.id=1000
>     dbthing.tag.tablename.id=1001
>     myschemareg.tag.schemaname.id=1002
>     myschemareg.tag.schemaversion.id=1003
>
>
> Each header-writing or header-reading plugin must provide means
> (typically through configuration) to specify the tag for each header it
> uses. Defaults should be avoided.
> A consumer silently ignores tags it does not have a mapping for (since
> the binary_data can't be parsed without knowing what it is).
>
> Id range 0..999 is reserved for future use by the broker and must not
> be used by plugins.
>
>
> Broker
> ---------
> The broker does not process the tags (other than the standard protocol
> syntax verification), it simply stores and forwards them as opaque data.
>
> Standard message translation (removal of Headers) kicks in for older
> clients.
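As a rough illustration of the Tag layout and the consumer rule described above, here is a minimal Python sketch. It is not taken from the KIP. It assumes big-endian (network order) fields, and because the proposal does not spell out how the Headers array is framed, it simply reads Tags until the buffer is exhausted. The names encode_headers, decode_headers and RESERVED_MAX_ID are hypothetical, and the id-to-name mapping stands in for the per-plugin configuration.

```python
import struct

# Ids 0..999 are reserved for the broker in the proposal above.
RESERVED_MAX_ID = 999


def encode_headers(tags):
    """Serialize (id, payload) pairs as repeated [int16 Id][int16 Len][bytes].

    Big-endian packing is an assumption; the thread does not state byte order.
    """
    out = bytearray()
    for tag_id, payload in tags:
        if tag_id <= RESERVED_MAX_ID:
            raise ValueError(f"tag id {tag_id} is in the reserved range")
        out += struct.pack(">hh", tag_id, len(payload))
        out += payload
    return bytes(out)


def decode_headers(buf, id_to_name):
    """Return {name: payload} for mapped ids; silently skip unmapped tags."""
    known = {}
    offset = 0
    while offset < len(buf):
        tag_id, length = struct.unpack_from(">hh", buf, offset)
        offset += 4
        payload = buf[offset:offset + length]
        offset += length
        name = id_to_name.get(tag_id)
        if name is not None:
            known[name] = payload
    return known


# Mapping as it might come from config such as dbthing.tag.tablename.id=1001
mapping = {1001: "dbthing.tablename"}
wire = encode_headers([(1001, b"orders"), (1003, b"\x00\x07")])
headers = decode_headers(wire, mapping)  # tag 1003 has no mapping and is skipped
```

The length prefix is what lets a consumer skip a Tag it cannot interpret, which is why unknown ids can be ignored without parsing their data.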
>
>
> Why not string ids?
> -------------------------
> String ids might seem like a good idea, but:
>  * does not really solve uniqueness
>  * consumes a lot of space (2 byte string length + string, per header)
>    to be meaningful
>  * doesn't really say anything about how to parse the tag's data, so it
>    is in effect useless on its own.
>
>
> Regards,
> Magnus
>
>
> 2016-11-07 18:32 GMT+01:00 Michael Pearce <[email protected]>:
>
>> Hi Roger,
>>
>> Thanks for the support.
>>
>> I think the key thing is to have a common key space to make an
>> ecosystem; there does have to be some level of contract for people to
>> play nicely.
>>
>> Having map<String, byte[]> or, as currently proposed in the KIP, a
>> numerical key space of map<int, byte[]> is a level of contract that
>> most people would expect.
>>
>> I think the example in a previous comment someone else made, linking to
>> the AWS blog and the implemented API, where originally they didn't have
>> a header space but now they do, and where keys are uniform but the value
>> can be string, int, anything, is a good example.
>>
>> Having a custom MetadataSerializer is something we had played with, but
>> discounted the idea, as if you wanted everyone to work the same way in
>> the ecosystem, having to have this also customizable makes it a bit
>> harder.
>> Think about making the whole message record custom serializable; this
>> would make it fairly tricky (though it would not be impossible) to
>> make work nicely. Having the value customizable we thought is a
>> reasonable tradeoff here of flexibility over contract of interaction
>> between different parties.
>>
>> Is there a particular case or benefit of having serialization
>> customizable that you have in mind?
>>
>> Saying this, it is obviously something that could be implemented, if
>> there is a need.
>> If we did go this avenue I think a defaulted serializer implementation
>> should exist so for the 80:20 rule, people can just have the broker and
>> clients get default behavior.
>>
>> Cheers
>> Mike
>>
>> On 11/6/16, 5:25 PM, "radai" <[email protected]> wrote:
>>
>>     making header _key_ serialization configurable potentially
>>     undermines the broad usefulness of the feature (any point along the
>>     path must be able to read the header keys.
>>     the values may be whatever and require more intimate knowledge of
>>     the code that produced specific headers, but keys should be
>>     universally readable).
>>
>>     it would also make it hard to write really portable plugins - say i
>>     wrote a large message splitter/combiner - if i rely on key
>>     "largeMessage" and values of the form "1/20" someone who uses
>>     (contrived example) Map<Byte[], Double> wouldn't be able to re-use
>>     my code.
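The splitter/combiner example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: headers are modeled as a dict from string keys to bytes (the Map<String, byte[]> shape discussed in this thread), the "i/n" value is ASCII text, and the function names split_message and combine_messages are hypothetical.

```python
LARGE_MESSAGE_KEY = "largeMessage"  # header key taken from the example above


def split_message(payload, chunk_size):
    """Return a list of (headers, chunk) pairs with header values like "1/3"."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    total = len(chunks)
    return [({LARGE_MESSAGE_KEY: f"{index}/{total}".encode("ascii")}, chunk)
            for index, chunk in enumerate(chunks, start=1)]


def combine_messages(records):
    """Reassemble chunks in any arrival order, reading only the shared key."""
    parts = {}
    total = None
    for headers, chunk in records:
        index, count = (int(field) for field in
                        headers[LARGE_MESSAGE_KEY].decode("ascii").split("/"))
        if total is None:
            total = count
        elif count != total:
            raise ValueError("chunks disagree on the total count")
        parts[index] = chunk
    if total is None or set(parts) != set(range(1, total + 1)):
        raise ValueError("missing chunks")
    return b"".join(parts[i] for i in range(1, total + 1))


records = split_message(b"abcdefghij", 4)
restored = combine_messages(reversed(records))
```

The combiner works only because both sides agree on the key type and on how the key is serialized, which is the portability concern raised above.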
>>
>>     not the end of the world within an organization, but problematic
>>     if you want to enable an ecosystem
>>
>>     On Thu, Nov 3, 2016 at 2:04 PM, Roger Hoover <[email protected]>
>>     wrote:
>>
>>> As others have laid out, I see strong reasons for a common message
>>> metadata structure for the Kafka ecosystem. In particular, I've seen
>>> that even within a single organization, infrastructure teams often own
>>> the message metadata while application teams own the application-level
>>> data format.
>>> Allowing metadata and content to have different structure and evolve
>>> separately is very helpful for this. Also, I think there's a lot of
>>> value to having a common metadata structure shared across the Kafka
>>> ecosystem so that tools which leverage metadata can more easily be
>>> shared across organizations and integrated together.
>>>
>>> The question is, where does the metadata structure belong?
>>> Here's my take:
>>>
>>> We change the Kafka wire and on-disk format from a (key, value) model
>>> to a (key, metadata, value) model where all three are byte arrays from
>>> the broker's point of view. The primary reason for this is that it
>>> provides a backward compatible migration path forward. Producers can
>>> start populating metadata fields before all consumers understand the
>>> metadata structure.
>>> For people who already have custom envelope structures, they can
>>> populate their existing structure and the new structure for a while as
>>> they make the transition.
>>>
>>> We could stop there and let the clients plug in a KeySerializer,
>>> MetadataSerializer, and ValueSerializer but I think it would also be
>>> useful to have a default MetadataSerializer that implements a key-value
>>> model similar to AMQP or HTTP headers.
>>> Or we could go even further and prescribe a Map<String, byte[]> or
>>> Map<String, String> data model for headers in the clients (while still
>>> allowing custom serialization of the header data model).
>>>
>>> I think this would address Radai's concerns:
>>> 1. All client code would not need to be updated to know about the
>>> container.
>>> 2. Middleware friendly clients would have a standard header data model
>>> to work with.
>>> 3. KIP is required both b/c of broker changes and because of client API
>>> changes.
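To make the "default MetadataSerializer" idea more concrete, here is a minimal Python sketch of a key-value metadata serializer whose output the broker would treat as one opaque byte array. Everything in it is an assumption for illustration: the class name, the method names, and the byte layout (int32 entry count, then for each entry an int16 key length, a UTF-8 key, an int32 value length and the raw value, all big-endian) are not part of any proposal in this thread.

```python
import struct


class DefaultMetadataSerializer:
    """Hypothetical default serializer for a Map<String, byte[]> style model."""

    def serialize(self, headers):
        """Turn {str: bytes} into a single opaque byte string."""
        out = bytearray(struct.pack(">i", len(headers)))
        for key, value in headers.items():
            key_bytes = key.encode("utf-8")
            out += struct.pack(">h", len(key_bytes)) + key_bytes
            out += struct.pack(">i", len(value)) + value
        return bytes(out)

    def deserialize(self, data):
        """Inverse of serialize: rebuild the {str: bytes} mapping."""
        (count,) = struct.unpack_from(">i", data, 0)
        offset = 4
        headers = {}
        for _ in range(count):
            (key_len,) = struct.unpack_from(">h", data, offset)
            offset += 2
            key = data[offset:offset + key_len].decode("utf-8")
            offset += key_len
            (value_len,) = struct.unpack_from(">i", data, offset)
            offset += 4
            headers[key] = data[offset:offset + value_len]
            offset += value_len
        return headers


serializer = DefaultMetadataSerializer()
original = {"trace-id": b"abc", "schema": b"\x01"}
metadata_bytes = serializer.serialize(original)   # what the broker would store
round_tripped = serializer.deserialize(metadata_bytes)
```

A client that needs a different envelope could swap in its own serializer, while tools that only want the common case would rely on the default.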
>>>
>>> Cheers,
>>>
>>> Roger
>>>
>>>
>>> On Wed, Nov 2, 2016 at 4:38 PM, radai <[email protected]> wrote:
>>>
>>>> my biggest issues with a "standard" wrapper format:
>>>>
>>>> 1. _ALL_ client _CODE_ (as opposed to kafka lib version) must be
>>>> updated to know about the container, because any old naive code trying
>>>> to directly deserialize its own payload would keel over and die (it
>>>> needs to know to deserialize a container, and then dig in there for
>>>> its payload).
>>>> 2. in order to write middleware-friendly clients that utilize such a
>>>> container one would basically have to write their own producer/consumer
>>>> API on top of the open source kafka one.
>>>> 3. if you were going to go with a wrapper format you really don't need
>>>> to bother with a kip (just open source your own client stack from #2
>>>> above so others could stop re-inventing it)
>>>>
>>>> On Wed, Nov 2, 2016 at 4:25 PM, James Cheng <[email protected]>
>>>> wrote:
>>>>
>>>>> How exactly would this work?
>>>>> Or maybe that's out of scope for this email.
>>>>
>>>
>>
>>
>> The information contained in this email is strictly confidential and
>> for the use of the addressee only, unless otherwise indicated. If you
>> are not the intended recipient, please do not read, copy, use or
>> disclose to others this message or any attachment. Please also notify
>> the sender by replying to this email or by telephone (+44(020 7896
>> 0011) and then delete the email and any copies of it. Opinions,
>> conclusion (etc) that do not relate to the official business of this
>> company shall be understood as neither given nor endorsed by it. IG is
>> a trading name of IG Markets Limited (a company registered in England
>> and Wales, company number 04008957) and IG Index Limited (a company
>> registered in England and Wales, company number 01190902). Registered
>> address at Cannon Bridge House, 25 Dowgate Hill, London EC4R 2YA.
>> Both IG Markets Limited (register number 195355) and IG Index Limited
>> (register number 114059) are authorised and regulated by the Financial
>> Conduct Authority.
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>
> --
>>> Nacho (Ignacio) Solis > >>> > >> > > >> > > > > > >> >> >>> Kafka > >>> > >> > > >> > > > > > >> >> >>> [email protected] > >>> > >> > > >> > > > > > >> >> >>> > >>> > >> > > >> > > > > > >> >> > > >>> > >> > > >> > > > > > >> >> > > >>> > >> > > >> > > > > > >> >> > > >>> > >> > > >> > > > > > >> >> > -- > >>> > >> > > >> > > > > > >> >> > Gwen Shapira > >>> > >> > > >> > > > > > >> >> > Product Manager | Confluent > >>> > >> > > >> > > > > > >> >> > 650.450.2760 <(650)%20450-2760> | > >>> @gwenshap > >>> > >> > > >> > > > > > >> >> > Follow us: Twitter | blog > >>> > >> > > >> > > > > > >> >> > >>> > >> > > >> > > > > > >> >> > >>> > >> > > >> > > > > > >> > >>> > >> > > >> > > > > > >> > >>> > >> > > >> > > > > > >> > >>> > >> > > >> > > > > > >> -- > >>> > >> > > >> > > > > > >> Gwen Shapira > >>> > >> > > >> > > > > > >> Product Manager | Confluent > >>> > >> > > >> > > > > > >> 650.450.2760 <(650)%20450-2760> | > @gwenshap > >>> > >> > > >> > > > > > >> Follow us: Twitter | blog > >>> > >> > > >> > > > > > >> > >>> > >> > > >> > > > > > > > >>> > >> > > >> > > > > > > > >>> > >> > > >> > > > > > The information contained in this email is > >>> > strictly > >>> > >> > > confidential > >>> > >> > > >> > and > >>> > >> > > >> > > > for > >>> > >> > > >> > > > > > the use of the addressee only, unless > otherwise > >>> > >> > indicated. > >>> > >> > > If you > >>> > >> > > >> > are > >>> > >> > > >> > > > not > >>> > >> > > >> > > > > > the intended recipient, please do not read, > >>> copy, > >>> > use > >>> > >> > or > >>> > >> > > disclose > >>> > >> > > >> > to > >>> > >> > > >> > > > > others > >>> > >> > > >> > > > > > this message or any attachment. 
Please also > >>> notify > >>> > >> the > >>> > >> > > sender by > >>> > >> > > >> > > > replying > >>> > >> > > >> > > > > > to this email or by telephone (+44(020 7896 > >>> 0011) > >>> > and > >>> > >> > then > >>> > >> > > delete > >>> > >> > > >> > the > >>> > >> > > >> > > > > email > >>> > >> > > >> > > > > > and any copies of it. Opinions, conclusion > >>> (etc) > >>> > that > >>> > >> > do not > >>> > >> > > >> relate > >>> > >> > > >> > > to > >>> > >> > > >> > > > > the > >>> > >> > > >> > > > > > official business of this company shall be > >>> > understood > >>> > >> > as > >>> > >> > > neither > >>> > >> > > >> > > given > >>> > >> > > >> > > > > nor > >>> > >> > > >> > > > > > endorsed by it. IG is a trading name of IG > >>> Markets > >>> > >> > Limited (a > >>> > >> > > >> > company > >>> > >> > > >> > > > > > registered in England and Wales, company > number > >>> > >> > 04008957) > >>> > >> > > and IG > >>> > >> > > >> > > Index > >>> > >> > > >> > > > > > Limited (a company registered in England and > >>> > Wales, > >>> > >> > company > >>> > >> > > >> number > >>> > >> > > >> > > > > > 01190902). Registered address at Cannon > Bridge > >>> > House, > >>> > >> > 25 > >>> > >> > > Dowgate > >>> > >> > > >> > > Hill, > >>> > >> > > >> > > > > > London EC4R 2YA. Both IG Markets Limited > >>> (register > >>> > >> > number > >>> > >> > > 195355) > >>> > >> > > >> > and > >>> > >> > > >> > > > IG > >>> > >> > > >> > > > > > Index Limited (register number 114059) are > >>> > authorised > >>> > >> > and > >>> > >> > > >> regulated > >>> > >> > > >> > > by > >>> > >> > > >> > > > > the > >>> > >> > > >> > > > > > Financial Conduct Authority. 
> >>> > >> > > >> > > > > > > >>> > >> > > >> > > > > > >>> > >> > > >> > > > > >>> > >> > > >> > > > >>> > >> > > >> > > >>> > >> > > >> The information contained in this email is strictly > >>> > >> confidential > >>> > >> > and > >>> > >> > > for > >>> > >> > > >> the use of the addressee only, unless otherwise > >>> indicated. > >>> > If > >>> > >> > you are > >>> > >> > > not > >>> > >> > > >> the intended recipient, please do not read, copy, use > or > >>> > >> > disclose to > >>> > >> > > others > >>> > >> > > >> this message or any attachment. Please also notify the > >>> > sender > >>> > >> by > >>> > >> > > replying > >>> > >> > > >> to this email or by telephone (+44(020 7896 0011) and > >>> then > >>> > >> > delete the > >>> > >> > > email > >>> > >> > > >> and any copies of it. Opinions, conclusion (etc) that > do > >>> not > >>> > >> > relate to > >>> > >> > > the > >>> > >> > > >> official business of this company shall be understood > as > >>> > >> neither > >>> > >> > given > >>> > >> > > nor > >>> > >> > > >> endorsed by it. IG is a trading name of IG Markets > >>> Limited > >>> > (a > >>> > >> > company > >>> > >> > > >> registered in England and Wales, company number > 04008957) > >>> > and > >>> > >> IG > >>> > >> > Index > >>> > >> > > >> Limited (a company registered in England and Wales, > >>> company > >>> > >> > number > >>> > >> > > >> 01190902). Registered address at Cannon Bridge House, > 25 > >>> > >> Dowgate > >>> > >> > Hill, > >>> > >> > > >> London EC4R 2YA. Both IG Markets Limited (register > number > >>> > >> > 195355) and > >>> > >> > > IG > >>> > >> > > >> Index Limited (register number 114059) are authorised > and > >>> > >> > regulated by > >>> > >> > > the > >>> > >> > > >> Financial Conduct Authority. 
> >>> > >> > > > > >>> > >> > > > -- > >>> > >> > > > Nacho - Ignacio Solis - [email protected] > >>> > >> > > The information contained in this email is strictly > >>> confidential > >>> > >> and > >>> > >> > for > >>> > >> > > the use of the addressee only, unless otherwise > indicated. If > >>> > you > >>> > >> > are not > >>> > >> > > the intended recipient, please do not read, copy, use or > >>> > disclose > >>> > >> to > >>> > >> > others > >>> > >> > > this message or any attachment. Please also notify the > >>> sender by > >>> > >> > replying > >>> > >> > > to this email or by telephone (+44(020 7896 0011) and then > >>> > delete > >>> > >> > the email > >>> > >> > > and any copies of it. Opinions, conclusion (etc) that do > not > >>> > relate > >>> > >> > to the > >>> > >> > > official business of this company shall be understood as > >>> neither > >>> > >> > given nor > >>> > >> > > endorsed by it. IG is a trading name of IG Markets > Limited (a > >>> > >> company > >>> > >> > > registered in England and Wales, company number 04008957) > >>> and IG > >>> > >> > Index > >>> > >> > > Limited (a company registered in England and Wales, > company > >>> > number > >>> > >> > > 01190902). Registered address at Cannon Bridge House, 25 > >>> Dowgate > >>> > >> > Hill, > >>> > >> > > London EC4R 2YA. Both IG Markets Limited (register number > >>> > 195355) > >>> > >> > and IG > >>> > >> > > Index Limited (register number 114059) are authorised and > >>> > regulated > >>> > >> > by the > >>> > >> > > Financial Conduct Authority. > >>> > >> > > > >>> > >> > > >>> > >> > > >>> > >> > > >>> > >> > -- > >>> > >> > Nacho - Ignacio Solis - [email protected] > >>> > >> > > >>> > >> > > >>> > >> > The information contained in this email is strictly confidential > >>> and > >>> > for > >>> > >> > the use of the addressee only, unless otherwise indicated. 
--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming

linkedin.com/in/toddpalino
