Got it. As an ops guy, I'm not very happy with the workaround. Avro means that I have to be concerned with the format of the messages in order to run the infrastructure (audit, mirroring, etc.). That means that I have to handle the schemas, and I have to enforce rules about good formats. This is not something I want to be in the business of, because I should be able to run a service infrastructure without needing to be in the weeds of dealing with customer data formats.
Trust me, a sizable portion of my support time is spent dealing with schema issues. I really would like to get away from that. Maybe I'd have more time for other hobbies. Like writing. ;)

-Todd

On Thu, Dec 1, 2016 at 4:04 PM Gwen Shapira <g...@confluent.io> wrote:

> I'm pretty satisfied with the current workarounds (Avro container
> format), so I'm not too excited about the extra work required to do
> headers in Kafka. I absolutely don't mind it if you do it...
> I think the Apache convention for "good idea, but not willing to put
> any work toward it" is +0.5? anyway, that's what I was trying to
> convey :)
>
> On Thu, Dec 1, 2016 at 3:05 PM, Todd Palino <tpal...@gmail.com> wrote:
> > Well I guess my question for you, then, is what is holding you back from
> > full support for headers? What’s the bit that you’re missing that has you
> > under a full +1?
> >
> > -Todd
> >
> >
> > On Thu, Dec 1, 2016 at 1:59 PM, Gwen Shapira <g...@confluent.io> wrote:
> >
> >> I know why people who support headers support them, and I've seen what
> >> the discussion is like.
> >>
> >> This is why I'm asking people who are against headers (especially
> >> committers) what will make them change their mind - so we can get this
> >> part over one way or another.
> >>
> >> If I sound frustrated it is not at Radai, Jun or you (Todd)... I am
> >> just looking for something concrete we can do to move the discussion
> >> along to the yummy design details (which is the argument I really am
> >> looking forward to).
> >>
> >> On Thu, Dec 1, 2016 at 1:53 PM, Todd Palino <tpal...@gmail.com> wrote:
> >> > So, Gwen, to your question (even though I’m not a committer)...
> >> >
> >> > I have always been a strong supporter of introducing the concept of an
> >> > envelope to messages, which headers accomplishes. The message key is
> >> > already an example of a piece of envelope information.
> >> > By providing a means to do this within Kafka itself, and not relying
> >> > on use-case specific implementations, you make it much easier for
> >> > components to interoperate. It simplifies development of all these
> >> > things (message routing, auditing, encryption, etc.) because each one
> >> > does not have to reinvent the wheel.
> >> >
> >> > It also makes it much easier from a client point of view if the headers
> >> > are defined as part of the protocol and/or message format in general
> >> > because you can easily produce and consume messages without having to
> >> > take into account specific cases. For example, I want to route messages,
> >> > but client A doesn’t support the way audit implemented headers, and
> >> > client B doesn’t support the way encryption or routing implemented
> >> > headers, so now my application has to create some really fragile (my
> >> > autocorrect just tried to make that “tragic”, which is probably
> >> > appropriate too) code to strip everything off, rather than just
> >> > consuming the messages, picking out the 1 or 2 headers it’s interested
> >> > in, and performing its function.
> >> >
> >> > Honestly, this discussion has been going on for a long time, and it’s
> >> > always “Oh, you came up with 2 use cases, and yeah, those use cases are
> >> > real things that someone would want to do. Here’s an alternate way to
> >> > implement them so let’s not do headers.” If we have a few use cases that
> >> > we actually came up with, you can be sure that over the next year
> >> > there’s a dozen others that we didn’t think of that someone would like
> >> > to do. I really think it’s time to stop rehashing this discussion and
> >> > instead focus on a workable standard that we can adopt.
> >> >
> >> > -Todd
> >> >
> >> >
> >> > On Thu, Dec 1, 2016 at 1:39 PM, Todd Palino <tpal...@gmail.com> wrote:
> >> >
> >> >> C. per message encryption
> >> >>> One drawback of this approach is that this significantly reduces the
> >> >>> effectiveness of compression, which happens on a set of serialized
> >> >>> messages. An alternative is to enable SSL for wire encryption and rely
> >> >>> on the storage system (e.g. LUKS) for at rest encryption.
> >> >>
> >> >>
> >> >> Jun, this is not sufficient. While this does cover the case of removing
> >> >> a drive from the system, it will not satisfy most compliance
> >> >> requirements for encryption of data as whoever has access to the broker
> >> >> itself still has access to the unencrypted data. For end-to-end
> >> >> encryption you need to encrypt at the producer, before it enters the
> >> >> system, and decrypt at the consumer, after it exits the system.
> >> >>
> >> >> -Todd
> >> >>
> >> >>
> >> >> On Thu, Dec 1, 2016 at 1:03 PM, radai <radai.rosenbl...@gmail.com> wrote:
> >> >>
> >> >>> another big plus of headers in the protocol is that it would enable
> >> >>> rapid iteration on ideas outside of core kafka and would reduce the
> >> >>> number of future wire format changes required.
> >> >>>
> >> >>> a lot of what is currently a KIP represents use cases that are not 100%
> >> >>> relevant to all users, and some of them require rather invasive wire
> >> >>> protocol changes. i think a good recent example of this is kip-98.
> >> >>> tx-utilizing traffic is expected to be a very small fraction of total
> >> >>> traffic and yet the changes are invasive.
> >> >>>
> >> >>> every such wire format change translates into painful and slow
> >> >>> adoption of new versions.
> >> >>>
> >> >>> i think a lot of functionality currently in KIPs could be "spun out"
> >> >>> and implemented as opt-in plugins transmitting data over headers.
this > >> would > >> >>> keep the core wire format stable(r), core codebase smaller, and > avoid > >> the > >> >>> "burden of proof" thats sometimes required to prove a certain > feature > >> is > >> >>> useful enough for a wide-enough audience to warrant a wire format > >> change > >> >>> and code complexity additions. > >> >>> > >> >>> (to be clear - kip-98 goes beyond "mere" wire format changes and im > not > >> >>> saying it could have been completely done with headers, but > >> exactly-once > >> >>> delivery certainly could) > >> >>> > >> >>> On Thu, Dec 1, 2016 at 11:20 AM, Gwen Shapira <g...@confluent.io> > >> wrote: > >> >>> > >> >>> > On Thu, Dec 1, 2016 at 10:24 AM, radai < > radai.rosenbl...@gmail.com> > >> >>> wrote: > >> >>> > > "For use cases within an organization, one could always use > other > >> >>> > > approaches such as company-wise containers" > >> >>> > > this is what linkedin has traditionally done but there are now > >> cases > >> >>> > (read > >> >>> > > - topics) where this is not acceptable. this makes headers > useful > >> even > >> >>> > > within single orgs for cases where one-container-fits-all cannot > >> >>> apply. > >> >>> > > > >> >>> > > as for the particular use cases listed, i dont want this to > devolve > >> >>> to a > >> >>> > > discussion of particular use cases - i think its enough that > some > >> of > >> >>> them > >> >>> > > >> >>> > I think a main point of contention is that: We identified few > >> >>> > use-cases where headers are useful, do we want Kafka to be a > system > >> >>> > that supports those use-cases? > >> >>> > > >> >>> > For example, Jun said: > >> >>> > "Not sure how widely useful record-level lineage is though since > the > >> >>> > overhead could > >> >>> > be significant." > >> >>> > > >> >>> > We know NiFi supports record level lineage. I don't think it was > >> >>> > developed for lols, I think it is safe to assume that the NSA > needed > >> >>> > that functionality. 
We also know that certain financial institutes > >> >>> > need to track tampering with records at a record level and there > are > >> >>> > federal regulations that absolutely require this. They also need > to > >> >>> > prove that routing apps that "touches" the messages and either > reads > >> >>> > or updates headers couldn't have possibly modified the payload > >> itself. > >> >>> > They use record level encryption to do that - apps can read and > >> >>> > (sometimes) modify headers but can't touch the payload. > >> >>> > > >> >>> > We can totally say "those are corner cases and not worth adding > >> >>> > headers to Kafka for", they should use a different pubsub message > for > >> >>> > that (Nifi or one of the other 1000 that cater specifically to the > >> >>> > financial industry). > >> >>> > > >> >>> > But this gets us into a catch 22: > >> >>> > If we discuss a specific use-case, someone can always say it isn't > >> >>> > interesting enough for Kafka. If we discuss more general trends, > >> >>> > others can say "well, we are not sure any of them really needs > >> headers > >> >>> > specifically. This is just hand waving and not interesting.". > >> >>> > > >> >>> > I think discussing use-cases in specifics is super important to > >> decide > >> >>> > implementation details for headers (my use-cases lean toward > >> numerical > >> >>> > keys with namespaces and object values, others differ), but I > think > >> we > >> >>> > need to answer the general "Are we going to have headers" question > >> >>> > first. > >> >>> > > >> >>> > I'd love to hear from the other committers in the discussion: > >> >>> > What would it take to convince you that headers in Kafka are a > good > >> >>> > idea in general, so we can move ahead and try to agree on the > >> details? > >> >>> > > >> >>> > I feel like we keep moving the goal posts and this is truly > >> exhausting. > >> >>> > > >> >>> > For the record, I mildly support adding headers to Kafka (+0.5?). 
> >> >>> > The community can continue to find workarounds to the issue and > there > >> >>> > are some benefits to keeping the message format and clients > simpler. > >> >>> > But I see the usefulness of headers to many use-cases and if we > can > >> >>> > find a good and generally useful way to add it to Kafka, it will > make > >> >>> > Kafka easier to use for many - worthy goal in my eyes. > >> >>> > > >> >>> > > are interesting/feasible, but: > >> >>> > > A+B. i think there are use cases for polyglot topics. > especially if > >> >>> kafka > >> >>> > > is being used to "trunk" something else. > >> >>> > > D. multiple topics would make it harder to write portable > consumer > >> >>> code. > >> >>> > > partition remapping would mess with locality of consumption > >> >>> guarantees. > >> >>> > > E+F. a use case I see for lineage/metadata is > billing/chargeback. > >> for > >> >>> > that > >> >>> > > use case it is not enough to simply record the point of origin, > but > >> >>> every > >> >>> > > replication stop (think mirror maker) must also add a record to > >> form a > >> >>> > > "transit log". > >> >>> > > > >> >>> > > as for stream processing on top of kafka - i know samza has a > >> metadata > >> >>> > map > >> >>> > > which they carry around in addition to user values. headers are > the > >> >>> > perfect > >> >>> > > fit for these things. > >> >>> > > > >> >>> > > > >> >>> > > > >> >>> > > On Wed, Nov 30, 2016 at 6:50 PM, Jun Rao <j...@confluent.io> > wrote: > >> >>> > > > >> >>> > >> Hi, Michael, > >> >>> > >> > >> >>> > >> In order to answer the first two questions, it would be helpful > >> if we > >> >>> > could > >> >>> > >> identify 1 or 2 strong use cases for headers in the space for > >> >>> > third-party > >> >>> > >> vendors. For use cases within an organization, one could always > >> use > >> >>> > other > >> >>> > >> approaches such as company-wise containers to get around w/o > >> >>> headers. 
I > >> >>> > >> went through the use cases in the KIP and in Radai's wiki ( > >> >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/A+ > >> >>> > Case+for+Kafka+Headers > >> >>> > >> ). > >> >>> > >> The following are the ones that that I understand and could be > in > >> the > >> >>> > >> third-party use case category. > >> >>> > >> > >> >>> > >> A. content-type > >> >>> > >> It seems that in general, content-type should be set at the > topic > >> >>> level. > >> >>> > >> Not sure if mixing messages with different content types > should be > >> >>> > >> encouraged. > >> >>> > >> > >> >>> > >> B. schema id > >> >>> > >> Since the value is mostly useless without schema id, it seems > that > >> >>> > storing > >> >>> > >> the schema id together with serialized bytes in the value is > >> better? > >> >>> > >> > >> >>> > >> C. per message encryption > >> >>> > >> One drawback of this approach is that this significantly reduce > >> the > >> >>> > >> effectiveness of compression, which happens on a set of > serialized > >> >>> > >> messages. An alternative is to enable SSL for wire encryption > and > >> >>> rely > >> >>> > on > >> >>> > >> the storage system (e.g. LUKS) for at rest encryption. > >> >>> > >> > >> >>> > >> D. cluster ID for mirroring across Kafka clusters > >> >>> > >> This is actually interesting. Today, to avoid introducing > cycles > >> when > >> >>> > doing > >> >>> > >> mirroring across data centers, one would either have to set up > two > >> >>> Kafka > >> >>> > >> clusters (a local and an aggregate) per data center or rename > >> topics. > >> >>> > >> Neither is ideal. With headers, the producer could tag each > >> message > >> >>> with > >> >>> > >> the producing cluster ID in the header. MirrorMaker could then > >> avoid > >> >>> > >> mirroring messages to a cluster if they are tagged with the > same > >> >>> cluster > >> >>> > >> id. 
> >> >>> > >>
> >> >>> > >> However, an alternative approach is to introduce something like
> >> >>> > >> hierarchical topic and store messages from different clusters in
> >> >>> > >> different partitions under the same topic. This approach avoids
> >> >>> > >> filtering out unneeded data and makes offset preserving easier to
> >> >>> > >> support. It may make compaction trickier though since the same key
> >> >>> > >> may show up in different partitions.
> >> >>> > >>
> >> >>> > >> E. record-level lineage
> >> >>> > >> For example, a source connector could store in the message the
> >> >>> > >> metadata (e.g. UUID) of the source record. Similarly, if a stream
> >> >>> > >> job transforms messages from topic A to topic B, the library could
> >> >>> > >> include the source message offset in each of the transformed
> >> >>> > >> messages in the header. Not sure how widely useful record-level
> >> >>> > >> lineage is though since the overhead could be significant.
> >> >>> > >>
> >> >>> > >> F. auditing metadata
> >> >>> > >> We could put things like clientId/host/user in the header in each
> >> >>> > >> message for auditing. These metadata are really at the producer
> >> >>> > >> level though. So, a more efficient way is to only include a
> >> >>> > >> "producerId" per message and send the producerId -> metadata
> >> >>> > >> mapping independently. KIP-98 is actually proposing including such
> >> >>> > >> a producerId natively in the message.
> >> >>> > >>
> >> >>> > >> So, overall, I am not sure that I am fully convinced of the strong
> >> >>> > >> third-party use cases of headers yet. Perhaps we could discuss a
> >> >>> > >> bit more to make one or two really convincing use cases.
> >> >>> > >>
> >> >>> > >> Another orthogonal question is whether header should be exposed in
> >> >>> > >> stream processing systems such as Kafka Streams, Samza, and Spark
> >> >>> > >> Streaming. Currently, those systems just deal with key/value pairs.
> >> >>> > >> Should we expose a third thing header there too or somehow map
> >> >>> > >> header to key or value?
> >> >>> > >>
> >> >>> > >> Thanks,
> >> >>> > >>
> >> >>> > >> Jun
> >> >>> > >>
> >> >>> > >>
> >> >>> > >> On Tue, Nov 29, 2016 at 3:35 AM, Michael Pearce <michael.pea...@ig.com>
> >> >>> > >> wrote:
> >> >>> > >>
> >> >>> > >> > I assume that, after a period of a week, there are no concerns
> >> >>> > >> > now with points 1 and 2, and that we now have agreement that
> >> >>> > >> > headers are useful and needed in Kafka. As such, if put to a KIP
> >> >>> > >> > vote, this wouldn’t be a reason to reject.
> >> >>> > >> >
> >> >>> > >> > @Ignacio on point 4).
> >> >>> > >> > I think for the purpose of getting this KIP moving past this, we
> >> >>> > >> > can state the key will be a 4 byte space that will be naturally
> >> >>> > >> > interpreted as an Int32 (if namespacing is later wanted you can
> >> >>> > >> > easily split this into two int16 spaces). From the wire protocol
> >> >>> > >> > implementation this makes no difference, I don’t believe. Is this
> >> >>> > >> > reasonable to all?
> >> >>> > >> >
> >> >>> > >> > On 5), as per point 4, I am therefore happy we keep with 32 bits.
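
Point 4 above (a 4-byte key that can later be read as two int16 spaces with no change on the wire) can be illustrated with a small sketch. Assumptions: the function names are hypothetical, values are treated as unsigned for simplicity, and big-endian byte order is used for the wire form.

```python
import struct


def pack_key(namespace: int, local_id: int) -> int:
    """Combine two 16-bit values into one 32-bit header key."""
    if not (0 <= namespace <= 0xFFFF and 0 <= local_id <= 0xFFFF):
        raise ValueError("namespace and local_id must each fit in 16 bits")
    return (namespace << 16) | local_id


def split_key(key: int) -> tuple:
    """Read a 32-bit header key as (namespace, local_id)."""
    return (key >> 16) & 0xFFFF, key & 0xFFFF


def key_to_wire(key: int) -> bytes:
    """Serialize the key as 4 big-endian bytes (unsigned in this sketch)."""
    return struct.pack(">I", key)
```

The 4 bytes produced by `key_to_wire(pack_key(2, 1))` are identical to two consecutive 16-bit fields holding 2 and 1, which is why the split can be left to the layer above.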
> >> >>> > >> > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > On 18/11/2016, 20:34, "ignacio.so...@gmail.com on behalf of > >> >>> Ignacio > >> >>> > >> > Solis" <ignacio.so...@gmail.com on behalf of iso...@igso.net > > > >> >>> wrote: > >> >>> > >> > > >> >>> > >> > Summary: > >> >>> > >> > > >> >>> > >> > 3) Yes - Header value as byte[] > >> >>> > >> > > >> >>> > >> > 4a) Int,Int - No > >> >>> > >> > 4b) Int - Yes > >> >>> > >> > 4c) String - Reluctant maybe > >> >>> > >> > > >> >>> > >> > 5) I believe the header system should take a single > int. I > >> >>> think > >> >>> > >> > 32bits is > >> >>> > >> > a good size, if you want to interpret this as to 16bit > >> numbers > >> >>> in > >> >>> > the > >> >>> > >> > layer > >> >>> > >> > above go right ahead. If somebody wants to argue for 16 > >> bits > >> >>> or > >> >>> > 64 > >> >>> > >> > bits of > >> >>> > >> > header key space I would listen. > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > Discussion: > >> >>> > >> > Dividing the key space into sub_key_1 and sub_key_2 > makes no > >> >>> > sense to > >> >>> > >> > me at > >> >>> > >> > this layer. Are we going to start providing APIs to get > all > >> >>> the > >> >>> > >> > sub_key_1s? or all the sub_key_2s? If there is no > >> >>> distinguishing > >> >>> > >> > functions > >> >>> > >> > that are applied to each one then they should be a single > >> >>> value. > >> >>> > At > >> >>> > >> > this > >> >>> > >> > layer all we're doing is equality. > >> >>> > >> > If the above layer wants to interpret this as 2, 3 or > more > >> >>> values > >> >>> > >> > that's a > >> >>> > >> > different question. I personally think it's all one > >> keyspace > >> >>> > that is > >> >>> > >> > getting assigned using some structure, but if you want to > >> >>> > sub-assign > >> >>> > >> > parts > >> >>> > >> > of it then that's fine. > >> >>> > >> > > >> >>> > >> > The same discussion applies to strings. 
If somebody > argued > >> for > >> >>> > >> > strings, > >> >>> > >> > would we be arguing to divide the strings with dots ('.') > >> as a > >> >>> > >> > requirement? > >> >>> > >> > Would we want them to give us the different name segments > >> >>> > separately? > >> >>> > >> > Would we be performing any actions on this key other than > >> >>> > matching? > >> >>> > >> > > >> >>> > >> > Nacho > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > > >> >>> > >> > On Fri, Nov 18, 2016 at 9:30 AM, Michael Pearce < > >> >>> > >> michael.pea...@ig.com > >> >>> > >> > > > >> >>> > >> > wrote: > >> >>> > >> > > >> >>> > >> > > #jay #jun any concerns on 1 and 2 still? > >> >>> > >> > > > >> >>> > >> > > @all > >> >>> > >> > > To get this moving along a bit more I'd also like to > ask > >> to > >> >>> get > >> >>> > >> > clarity on > >> >>> > >> > > the below last points: > >> >>> > >> > > > >> >>> > >> > > 3) I believe we're all roughly happy with the header > value > >> >>> > being a > >> >>> > >> > byte[]? > >> >>> > >> > > > >> >>> > >> > > 4) I believe consensus has been for an namespace based > int > >> >>> > approach > >> >>> > >> > > {int,int} for the key. Any objections if this is what > we > >> go > >> >>> > with? > >> >>> > >> > > > >> >>> > >> > > 5) as we have if assumption in (4) is correct, > {int,int} > >> >>> keys. > >> >>> > >> > > Should both int's be int16 or int32? > >> >>> > >> > > I'm for them being int16(2 bytes) as combined is space > of > >> >>> > 4bytes as > >> >>> > >> > per > >> >>> > >> > > original and gives plenty of combinations for the > >> >>> foreseeable, > >> >>> > and > >> >>> > >> > keeps > >> >>> > >> > > the overhead small. > >> >>> > >> > > > >> >>> > >> > > Do we see any benefit in another kip call to discuss > >> these at > >> >>> > all? 
> >> >>> > >> > > > >> >>> > >> > > Cheers > >> >>> > >> > > Mike > >> >>> > >> > > ________________________________________ > >> >>> > >> > > From: K Burstev <k.burs...@yandex.com> > >> >>> > >> > > Sent: Friday, November 18, 2016 7:07:07 AM > >> >>> > >> > > To: dev@kafka.apache.org > >> >>> > >> > > Subject: Re: [DISCUSS] KIP-82 - Add Record Headers > >> >>> > >> > > > >> >>> > >> > > For what it is worth also i agree. As a user: > >> >>> > >> > > > >> >>> > >> > > 1) Yes - Headers are worthwhile > >> >>> > >> > > 2) Yes - Headers should be a top level option > >> >>> > >> > > > >> >>> > >> > > 14.11.2016, 21:15, "Ignacio Solis" <iso...@igso.net>: > >> >>> > >> > > > 1) Yes - Headers are worthwhile > >> >>> > >> > > > 2) Yes - Headers should be a top level option > >> >>> > >> > > > > >> >>> > >> > > > On Mon, Nov 14, 2016 at 9:16 AM, Michael Pearce < > >> >>> > >> > michael.pea...@ig.com> > >> >>> > >> > > > wrote: > >> >>> > >> > > > > >> >>> > >> > > >> Hi Roger, > >> >>> > >> > > >> > >> >>> > >> > > >> The kip details/examples the original proposal for > key > >> >>> > spacing > >> >>> > >> , > >> >>> > >> > not > >> >>> > >> > > the > >> >>> > >> > > >> new mentioned as per discussion namespace idea. > >> >>> > >> > > >> > >> >>> > >> > > >> We will need to update the kip, when we get > agreement > >> >>> this > >> >>> > is a > >> >>> > >> > better > >> >>> > >> > > >> approach (which seems to be the case if I have > >> understood > >> >>> > the > >> >>> > >> > general > >> >>> > >> > > >> feeling in the conversation) > >> >>> > >> > > >> > >> >>> > >> > > >> Re the variable ints, at very early stage we did > think > >> >>> about > >> >>> > >> > this. I > >> >>> > >> > > think > >> >>> > >> > > >> the added complexity for the saving isn't worth it. 
> >> I'd > >> >>> > rather > >> >>> > >> go > >> >>> > >> > > with, if > >> >>> > >> > > >> we want to reduce overheads and size int16 (2bytes) > >> keys > >> >>> as > >> >>> > it > >> >>> > >> > keeps it > >> >>> > >> > > >> simple. > >> >>> > >> > > >> > >> >>> > >> > > >> On the note of no headers, there is as per the kip > as > >> we > >> >>> > use an > >> >>> > >> > > attribute > >> >>> > >> > > >> bit to denote if headers are present or not as such > >> >>> > provides a > >> >>> > >> > zero > >> >>> > >> > > >> overhead currently if headers are not used. > >> >>> > >> > > >> > >> >>> > >> > > >> I think as radai mentions would be good first if we > >> can > >> >>> get > >> >>> > >> > clarity if > >> >>> > >> > > do > >> >>> > >> > > >> we now have general consensus that (1) headers are > >> >>> > worthwhile > >> >>> > >> and > >> >>> > >> > > useful, > >> >>> > >> > > >> and (2) we want it as a top level entity. > >> >>> > >> > > >> > >> >>> > >> > > >> Just to state the obvious i believe (1) headers are > >> >>> > worthwhile > >> >>> > >> > and (2) > >> >>> > >> > > >> agree as a top level entity. > >> >>> > >> > > >> > >> >>> > >> > > >> Cheers > >> >>> > >> > > >> Mike > >> >>> > >> > > >> ________________________________________ > >> >>> > >> > > >> From: Roger Hoover <roger.hoo...@gmail.com> > >> >>> > >> > > >> Sent: Wednesday, November 9, 2016 9:10:47 PM > >> >>> > >> > > >> To: dev@kafka.apache.org > >> >>> > >> > > >> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers > >> >>> > >> > > >> > >> >>> > >> > > >> Sorry for going a little in the weeds but thanks > for > >> the > >> >>> > >> replies > >> >>> > >> > > regarding > >> >>> > >> > > >> varint. > >> >>> > >> > > >> > >> >>> > >> > > >> Agreed that a prefix and {int, int} can be the > same. > >> It > >> >>> > doesn't > >> >>> > >> > look > >> >>> > >> > > like > >> >>> > >> > > >> that's what the KIP is saying the "Open" section. 
> >> >>> > >> > > >> The example shows 2100001 for New Relic and 210002 for App
> >> >>> > >> > > >> Dynamics implying that the New Relic organization will have
> >> >>> > >> > > >> only a single header id to work with. Or is 2100001 a prefix?
> >> >>> > >> > > >> The main point of a namespace or prefix is to reduce the
> >> >>> > >> > > >> overhead of config mapping or registration depending on how
> >> >>> > >> > > >> namespaces/prefixes are managed.
> >> >>> > >> > > >>
> >> >>> > >> > > >> Would love to hear more feedback on the higher-level questions
> >> >>> > >> > > >> though...
> >> >>> > >> > > >>
> >> >>> > >> > > >> Cheers,
> >> >>> > >> > > >>
> >> >>> > >> > > >> Roger
> >> >>> > >> > > >>
> >> >>> > >> > > >> On Wed, Nov 9, 2016 at 11:38 AM, radai <radai.rosenbl...@gmail.com>
> >> >>> > >> > > >> wrote:
> >> >>> > >> > > >>
> >> >>> > >> > > >> > I think this discussion is getting a bit into the weeds on
> >> >>> > >> > > >> > technical implementation details.
> >> >>> > >> > > >> > I'd like to step back a minute and try and establish where
> >> >>> > >> > > >> > we are in the larger picture:
> >> >>> > >> > > >> >
> >> >>> > >> > > >> > (re-wording nacho's last paragraph)
> >> >>> > >> > > >> > 1. are we all in agreement that headers are a worthwhile and
> >> >>> > >> > > >> > useful addition to have? this was contested early on
> >> >>> > >> > > >> > 2. are we all in agreement on headers as top level entity vs
> >> >>> > >> > > >> > headers squirreled-away in V?
> >> >>> > >> > > >> >
> >> >>> > >> > > >> > if there are still concerns around these #2 points (#jay?
> >> >>> > >> > > >> > #jun?)?
> >> >>> > >> > > >> >
> >> >>> > >> > > >> > (and now back to our normal programming ...)
> >> >>> > >> > > >> >
> >> >>> > >> > > >> > varints are nice. having said that, its adding complexity (see
> >> >>> > >> > > >> > https://github.com/addthis/stream-lib/blob/master/src/main/java/com/clearspring/analytics/util/Varint.java
> >> >>> > >> > > >> > as 1st google result) and would require anyone writing other
> >> >>> > >> > > >> > clients (C? Python? Go? Bash? ;-) ) to get/implement the
> >> >>> > >> > > >> > same, and for relatively little gain (int vs string is order
> >> >>> > >> > > >> > of magnitude, this isnt).
> >> >>> > >> > > >> >
> >> >>> > >> > > >> > int namespacing vs {int, int} namespacing are basically the
> >> >>> > >> > > >> > same thing - youre just namespacing an int64 and giving
> >> >>> > >> > > >> > people whole 2^32 ranges at a time. the part i like about
> >> >>> > >> > > >> > this is letting people have a large swath of numbers with
> >> >>> > >> > > >> > one registration so they dont have to come back for every
> >> >>> > >> > > >> > single plugin/header they want to "reserve".
> >> >>> > >> > > >> > > >> >>> > >> > > >> > > >> >>> > >> > > >> > On Wed, Nov 9, 2016 at 11:01 AM, Roger Hoover < > >> >>> > >> > > roger.hoo...@gmail.com> > >> >>> > >> > > >> > wrote: > >> >>> > >> > > >> > > >> >>> > >> > > >> > > Since some of the debate has been about > overhead + > >> >>> > >> > performance, I'm > >> >>> > >> > > >> > > wondering if we have considered a varint > encoding > >> ( > >> >>> > >> > > >> > > https://developers.google.com/ > >> protocol-buffers/docs/ > >> >>> > >> > > encoding#varints) > >> >>> > >> > > >> > for > >> >>> > >> > > >> > > the header length field (int32 in the proposal) > >> and > >> >>> for > >> >>> > >> > header > >> >>> > >> > > ids? If > >> >>> > >> > > >> > you > >> >>> > >> > > >> > > don't use headers, the overhead would be a > single > >> >>> byte > >> >>> > and > >> >>> > >> > for each > >> >>> > >> > > >> > header > >> >>> > >> > > >> > > id < 128 would also need only a single byte? > >> >>> > >> > > >> > > > >> >>> > >> > > >> > > > >> >>> > >> > > >> > > > >> >>> > >> > > >> > > On Wed, Nov 9, 2016 at 6:43 AM, radai < > >> >>> > >> > radai.rosenbl...@gmail.com> > >> >>> > >> > > >> > wrote: > >> >>> > >> > > >> > > > >> >>> > >> > > >> > > > @magnus - and very dangerous (youre > essentially > >> >>> > >> > downloading and > >> >>> > >> > > >> > executing > >> >>> > >> > > >> > > > arbitrary code off the internet on your > servers > >> ... > >> >>> > bad > >> >>> > >> > idea > >> >>> > >> > > without > >> >>> > >> > > >> a > >> >>> > >> > > >> > > > sandbox, even with) > >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > as for it being a purely administrative task > - i > >> >>> > >> disagree. 
> >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > i wish it would, really, because then my > earlier > >> >>> > point on > >> >>> > >> > the > >> >>> > >> > > >> > complexity > >> >>> > >> > > >> > > of > >> >>> > >> > > >> > > > the remapping process would be invalid, but > at > >> >>> > linkedin, > >> >>> > >> > for > >> >>> > >> > > example, > >> >>> > >> > > >> > we > >> >>> > >> > > >> > > > (the team im in) run kafka as a service. we > dont > >> >>> > really > >> >>> > >> > know > >> >>> > >> > > what our > >> >>> > >> > > >> > > users > >> >>> > >> > > >> > > > (developing applications that use kafka) are > up > >> to > >> >>> at > >> >>> > any > >> >>> > >> > given > >> >>> > >> > > >> moment. > >> >>> > >> > > >> > > it > >> >>> > >> > > >> > > > is very possible (given the existance of > headers > >> >>> and a > >> >>> > >> > > corresponding > >> >>> > >> > > >> > > plugin > >> >>> > >> > > >> > > > ecosystem) for some application to "equip" > their > >> >>> > >> producers > >> >>> > >> > and > >> >>> > >> > > >> > consumers > >> >>> > >> > > >> > > > with the required plugin without us knowing. > i > >> dont > >> >>> > mean > >> >>> > >> > to imply > >> >>> > >> > > >> thats > >> >>> > >> > > >> > > > bad, i just want to make the point that its > not > >> as > >> >>> > simple > >> >>> > >> > > keeping it > >> >>> > >> > > >> in > >> >>> > >> > > >> > > > sync across a large-enough organization. 
> >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > On Wed, Nov 9, 2016 at 6:17 AM, Magnus > Edenhill > >> < > >> >>> > >> > > mag...@edenhill.se> > >> >>> > >> > > >> > > > wrote: > >> >>> > >> > > >> > > > > >> >>> > >> > > >> > > > > I think there is a piece missing in the > >> Strings > >> >>> > >> > discussion, > >> >>> > >> > > where > >> >>> > >> > > >> > > > > pro-Stringers > >> >>> > >> > > >> > > > > reason that by providing unique string > >> >>> identifiers > >> >>> > for > >> >>> > >> > each > >> >>> > >> > > header > >> >>> > >> > > >> > > > > everything will just > >> >>> > >> > > >> > > > > magically work for all parts of the stream > >> >>> pipeline. > >> >>> > >> > > >> > > > > > >> >>> > >> > > >> > > > > But the strings dont mean anything by > >> themselves, > >> >>> > and > >> >>> > >> > while we > >> >>> > >> > > >> could > >> >>> > >> > > >> > > > > probably envision > >> >>> > >> > > >> > > > > some auto plugin loader that downloads, > >> compiles, > >> >>> > links > >> >>> > >> > and > >> >>> > >> > > runs > >> >>> > >> > > >> > > plugins > >> >>> > >> > > >> > > > > on-demand > >> >>> > >> > > >> > > > > as soon as they're seen by a consumer, I > dont > >> >>> really > >> >>> > >> see > >> >>> > >> > a > >> >>> > >> > > use-case > >> >>> > >> > > >> > for > >> >>> > >> > > >> > > > > something > >> >>> > >> > > >> > > > > so dynamic (and fragile) in practice. > >> >>> > >> > > >> > > > > > >> >>> > >> > > >> > > > > In the real world an application will be > >> >>> configured > >> >>> > >> with > >> >>> > >> > a set > >> >>> > >> > > of > >> >>> > >> > > >> > > plugins > >> >>> > >> > > >> > > > > to either add (producer) > >> >>> > >> > > >> > > > > or read (consumer) headers. 
This is an administrative task based on what features a client needs/provides and results in some sort of configuration to enable and configure the desired plugins.

Since this needs to be kept somewhat in sync across an organisation (there is no point in having producers add headers no consumers will read, and vice versa), the added complexity of assigning an id namespace for each plugin as it is being configured should be tolerable.

/Magnus

2016-11-09 13:06 GMT+01:00 Michael Pearce <michael.pea...@ig.com>:

Just following/catching up on what seems to be an active night :)

@Radai sorry if it may seem obvious but what does MD stand for?

My take on String vs Int:

I will state first I am pro Int (16 or 32).

Playing devil's advocate, though, I do see a big plus in the argument for String keys: integrating into an existing eco-system.

As many other systems use String based headers (Flume, JMS), String keys make it much easier to integrate with them.

With Int based headers, how could we provide a way/guidance to make this integration simple/easy for flows transitioning over to kafka?

* tough luck buddy you're on your own
* simply hash the string into int code and hope for no collisions (how to convert back though?)
* http2 style as mentioned by nacho.
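
A minimal sketch (not from the thread) of the "hash the string into int code" option in the list above. It assumes CRC32 truncated to 16 bits as the hash and a side registry for converting back; the function names and the sample header names are illustrative only, and nothing here is part of KIP-82.

```python
import zlib


def key_to_int(key: str, bits: int = 16) -> int:
    """Hash a string header key to an unsigned int of the given bit width."""
    return zlib.crc32(key.encode("utf-8")) & ((1 << bits) - 1)


def build_registry(keys, bits: int = 16):
    """Return (code -> key registry, list of colliding key pairs).

    The registry answers "how to convert back": a hash is one-way, so the
    reverse mapping has to be kept (and shared) somewhere.
    """
    registry = {}
    collisions = []
    for key in keys:
        code = key_to_int(key, bits)
        if code in registry and registry[code] != key:
            collisions.append((registry[code], key))
        else:
            registry[code] = key
    return registry, collisions


# Illustrative String headers as they might arrive from a JMS or Flume flow.
keys = ["JMSCorrelationID", "JMSReplyTo", "timestamp"]

registry, collisions = build_registry(keys)

# With an artificially tiny 1-bit code space, 3 keys must collide.
_, forced = build_registry(keys, bits=1)
```

The narrower the int, the sooner two distinct strings map to the same code, so "hope for no collisions" only holds while the set of keys is small and known in advance.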

cheers,
Mike

________________________________________
From: radai <radai.rosenbl...@gmail.com>
Sent: Wednesday, November 9, 2016 8:12 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers

thinking about it some more, the best way to transmit the header remapping data to consumers would be to put it in the MD response payload, so maybe it should be discussed now.

On Wed, Nov 9, 2016 at 12:09 AM, radai <radai.rosenbl...@gmail.com> wrote:

im not opposed to the idea of namespace mapping. all im saying is that its not part of the "mvp" and, since it requires no wire format change, can always be added later.
also, its not as simple as just configuring MM to do the transform: lets say i've implemented large message support as {666,1} and on some mirror target cluster its been remapped to {999,1}. the consumer plugin code would also need to be told to look for the large message "part X of Y" header under {999,1}. doable, but tricky.

On Tue, Nov 8, 2016 at 10:29 PM, Gwen Shapira <g...@confluent.io> wrote:

While you can do whatever you want with a namespace and your code, what I'd expect is for each app to make its namespace configurable...

So if I accidentally used 666 for my HR department, and still want to run RadaiApp, I can config "namespace=42" for RadaiApp and everything will look normal.

This means you only need to sync usage inside your own organization. Still hard, but somewhat easier than syncing with the entire world.

On Tue, Nov 8, 2016 at 10:07 PM, radai <radai.rosenbl...@gmail.com> wrote:

and we can start with {namespace, id} and no re-mapping support and always add it later on if/when collisions actually happen (i dont think they'd be a problem).

every interested party (so orgs or individuals) could then register a prefix (0 = reserved, 1 = confluent ...
666 = me :-) ) and do whatever with the 2nd ID - so once linkedin registers, say 3, then linkedin devs are free to use {3, *} with a reasonable expectation not to collide with anything else. further partitioning of that * becomes linkedin's problem, but the "upstream registration" of a namespace only has to happen once.

On Tue, Nov 8, 2016 at 9:03 PM, James Cheng <wushuja...@gmail.com> wrote:

> On Nov 8, 2016, at 5:54 PM, Gwen Shapira <g...@confluent.io> wrote:
>
> Thank you so much for this clear and fair summary of the arguments.
>
> I'm in favor of ints.
> Not a deal-breaker, but in favor.
>
> Even more in favor of Magnus's decentralized suggestion with Roger's tweak: add a namespace for headers. This will allow each app to just use whatever IDs it wants internally, and then let the admin deploying the app figure out an available namespace ID for the app to live in.
> So io.confluent.schema-registry can be namespace 0x01 on my deployment and 0x57 on yours, and the poor guys developing the app don't need to worry about that.
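
A minimal sketch (not from the thread) of the per-deployment namespace idea described above: the app fixes its own internal header IDs in code, and the deployer supplies the namespace as configuration. The `HeaderKey` and `SchemaRegistryPlugin` names, the `SCHEMA_ID` value, and the dict standing in for record headers are all hypothetical; no actual Kafka or Confluent API is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeaderKey:
    namespace: int  # assigned per deployment by the admin
    local_id: int   # chosen by the app developers, fixed in code


class SchemaRegistryPlugin:
    SCHEMA_ID = 1  # hypothetical app-internal header ID

    def __init__(self, namespace: int):
        # The namespace comes from deployment config, not from the code.
        self.namespace = namespace

    def write(self, headers: dict, schema_id: bytes) -> None:
        headers[HeaderKey(self.namespace, self.SCHEMA_ID)] = schema_id

    def read(self, headers: dict):
        return headers.get(HeaderKey(self.namespace, self.SCHEMA_ID))


# Same app code, two deployments with different namespace assignments.
mine = SchemaRegistryPlugin(namespace=0x01)
yours = SchemaRegistryPlugin(namespace=0x57)

record_headers = {}
mine.write(record_headers, b"schema-7")
```

In this sketch `mine.read(record_headers)` finds the header, while `yours.read(record_headers)` returns `None`: the on-the-wire key carries the writer's namespace, so a reader configured with a different namespace does not see it.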

Gwen, if I understand your example right, an application deployer might decide to use 0x01 in one deployment, and that means that once the message is written into the broker, it will be saved on the broker with that specific namespace (0x01).

If you were to mirror that message into another cluster, the 0x01 would accompany the message, right? What if the deployers of the same app in the other cluster use 0x57? They won't understand each other?

I'm not sure that's an avoidable problem. I think it simply means that in order to share data, you have to also have a shared (agreed upon) understanding of what the namespaces mean.
Which I think makes sense, because the alternative (sharing *nothing* at all) would mean that there would be no way to understand each other.

-James

> Gwen
>
> On Tue, Nov 8, 2016 at 4:23 PM, radai <radai.rosenbl...@gmail.com> wrote:
>> +1 for sean's document. it covers pretty much all the trade-offs and provides concrete figures to argue about :-)
>> (nit-picking - used the same xkcd twice, also trove has been superseded