> On Oct 25, 2016, at 10:23 PM, Michael Pearce <michael.pea...@ig.com> wrote:
> 
> Hi All,
> 
> In case you hadn't noticed, regarding the compaction issue for non-null
> values, I have created a separate KIP-87. If you could all contribute to its
> discussion, it would be much appreciated.
> 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
> 
> Secondly, focusing back on KIP-82: one of the actions agreed on the KIP
> call was for others in the group at the meeting to put forward additional
> alternative solution proposals, on top of those already detailed in the KIP
> wiki and the subsequently linked wiki pages.
> 
> I haven't seen any activity on this. Does this mean there aren't any further
> proposals and everyone, in hindsight, actually thinks the solution currently
> proposed in the KIP is the front runner? (I assume this isn't the case; I
> just want to nudge everyone.)
> 

I have been meaning to respond, but I haven't had the time. In the next couple 
days, I will try to write up the container format that TiVo is using, and we 
can discuss it.

-James

> Also, I am just copying across the KIP call thread to keep everything in one
> thread and avoid the discussion diverging into multiple threads.
> 
> Cheers
> Mike
> 
> ________________________________________
> From: Mayuresh Gharat <gharatmayures...@gmail.com>
> Sent: Monday, October 24, 2016 6:17 PM
> To: dev@kafka.apache.org
> Subject: Re: Kafka KIP meeting Oct 19 at 11:00am PST
> 
> I agree with Nacho.
> +1 for the KIP.
> 
> Thanks,
> 
> Mayuresh
> 
> On Fri, Oct 21, 2016 at 11:46 AM, Nacho Solis <nso...@linkedin.com.invalid>
> wrote:
> 
>> I think a separate KIP is a good idea as well.  Note however that potential
>> decisions in this KIP could affect the other KIP.
>> 
>> Nacho
>> 
>> On Fri, Oct 21, 2016 at 10:23 AM, Jun Rao <j...@confluent.io> wrote:
>> 
>>> Michael,
>>> 
>>> Yes, doing a separate KIP to address the null payload issue for compacted
>>> topics is a good idea.
>>> 
>>> Thanks,
>>> 
>>> Jun
>>> 
>>> On Fri, Oct 21, 2016 at 12:57 AM, Michael Pearce <michael.pea...@ig.com>
>>> wrote:
>>> 
>>>> I had noted that, whatever the solution, it was agreed that having
>>>> compaction based on a null payload isn't elegant.
>>>> 
>>>> Shall we raise another KIP to propose, as discussed, using an attribute
>>>> bit as a delete/compaction flag, as well as or instead of a null value,
>>>> and updating the compaction logic to look at that delete/compaction
>>>> attribute?
>>>> 
>>>> I believe this is less contentious, so at least we could get that done,
>>>> alleviating some concerns whilst the below gets discussed further.
>>>> 
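As a rough illustration of the attribute-bit idea above, here is a minimal
Python sketch. It is not taken from any KIP: the bit position TOMBSTONE_FLAG,
the function names, and the (key, attributes, value) record shape are all
assumptions, and real compaction retains tombstones for a while rather than
dropping them immediately.

```python
from typing import Optional

# Hypothetical bit position for the delete/compaction flag; the real
# layout would be whatever the KIP settles on.
TOMBSTONE_FLAG = 0x10


def is_tombstone_legacy(value: Optional[bytes]) -> bool:
    """Behaviour described in the thread: a null value means delete."""
    return value is None


def is_tombstone_flagged(attributes: int, value: Optional[bytes]) -> bool:
    """Proposed behaviour: check the attribute bit, while still honouring
    a null value for backwards compatibility."""
    return bool(attributes & TOMBSTONE_FLAG) or value is None


def compact(records):
    """Keep the latest record per key, then drop keys whose latest record
    is a tombstone. Each record is a (key, attributes, value) tuple."""
    latest = {}
    for key, attributes, value in records:
        latest[key] = (attributes, value)
    return {
        key: value
        for key, (attributes, value) in latest.items()
        if not is_tombstone_flagged(attributes, value)
    }
```

With a flag, a delete marker can still carry a non-null value (for example
audit data), which the null-payload convention cannot express.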
>>>> ________________________________________
>>>> From: Jun Rao <j...@confluent.io>
>>>> Sent: Wednesday, October 19, 2016 8:56:52 PM
>>>> To: dev@kafka.apache.org
>>>> Subject: Re: Kafka KIP meeting Oct 19 at 11:00am PST
>>>> 
>>>> The following are the notes from today's KIP discussion.
>>>> 
>>>> 
>>>>   - KIP-82 - add record header: We agreed that there are use cases for
>>>>   third-party vendors building tools around Kafka. We haven't reached
>>>>   a conclusion on whether the use cases justify the added complexity. We
>>>>   will follow up on the mailing list with use cases, container formats
>>>>   people have been using, and details on the proposal.
>>>> 
>>>> 
>>>> The video will be uploaded soon to
>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals .
>>>> 
>>>> Thanks,
>>>> 
>>>> Jun
>>>> 
>>>> On Mon, Oct 17, 2016 at 10:49 AM, Jun Rao <j...@confluent.io> wrote:
>>>> 
>>>>> Hi, Everyone,
>>>>> 
>>>>> We plan to have a Kafka KIP meeting this coming Wednesday at 11:00am
>>>>> PST. If you plan to attend but haven't received an invite, please let
>>>>> me know. The following is the tentative agenda.
>>>>> 
>>>>> Agenda:
>>>>> KIP-82: add record header
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Jun
>>>>> 
>>>> The information contained in this email is strictly confidential and
>>>> for the use of the addressee only, unless otherwise indicated. If you are
>>>> not the intended recipient, please do not read, copy, use or disclose to
>>>> others this message or any attachment. Please also notify the sender by
>>>> replying to this email or by telephone (+44(020 7896 0011) and then delete
>>>> the email and any copies of it. Opinions, conclusion (etc) that do not
>>>> relate to the official business of this company shall be understood as
>>>> neither given nor endorsed by it. IG is a trading name of IG Markets
>>>> Limited (a company registered in England and Wales, company number
>>>> 04008957) and IG Index Limited (a company registered in England and Wales,
>>>> company number 01190902). Registered address at Cannon Bridge House, 25
>>>> Dowgate Hill, London EC4R 2YA. Both IG Markets Limited (register number
>>>> 195355) and IG Index Limited (register number 114059) are authorised and
>>>> regulated by the Financial Conduct Authority.
>>>> 
>>> 
>> 
>> 
>> 
>> --
>> Nacho (Ignacio) Solis
>> Kafka
>> nso...@linkedin.com
>> 
> 
> 
> 
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
> 
> 
> ________________________________________
> From: Michael Pearce <michael.pea...@ig.com>
> Sent: Monday, October 17, 2016 7:48 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> 
> Hi Jun,
> 
> Sounds good.
> 
> Look forward to the invite.
> 
> Cheers,
> Mike
> ________________________________________
> From: Jun Rao <j...@confluent.io>
> Sent: Monday, October 17, 2016 5:55:57 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> 
> Hi, Michael,
> 
> We do have online KIP discussion meetings from time to time. How about we
> discuss this KIP Wed (Oct 19) at 11:00am PST? I will send out an invite (we
> typically do the meeting through Zoom and will post the video recording to
> Kafka wiki).
> 
> Thanks,
> 
> Jun
> 
> On Wed, Oct 12, 2016 at 1:22 AM, Michael Pearce <michael.pea...@ig.com>
> wrote:
> 
>> @Jay and Dana
>> 
>> We have had a few internal discussions about how we might address this with
>> a common Apache Kafka message wrapper for headers that is used on the
>> client side only, and about how to address the compaction issue.
>> I have detailed this solution separately and linked it from the main KIP-82
>> wiki.
>> 
>> Here’s a direct link –
>> https://cwiki.apache.org/confluence/display/KAFKA/Headers+Value+Message+Wrapper
>> 
>> We feel, though, that this solution still doesn’t address all the use cases
>> being mentioned, and it also has some compatibility drawbacks, e.g.
>> backwards/forwards compatibility, especially across different language
>> clients. Also, with this solution we still need to address the compaction
>> issue / tombstones, so we would still need to make server-side changes and
>> just as many message/record version changes.
>> 
>> We believe the solution proposed in KIP-82 does address all these needs,
>> is cleaner, and has more benefits.
>> Please have a read and comment. Also, if you have any improvements to the
>> proposed KIP-82 or an alternative solution/option, your input is appreciated.
>> 
>> @All
>> As Joel has mentioned, to get this moving along and to be able to discuss
>> it more fluidly, it would be great if we could organize a virtual meeting
>> online, e.g. via webex or something similar.
>> I am aware that the majority are based in America; I am in the UK.
>> @Kostya I assume you’re in Eastern Europe or Russia based on your email
>> address (please correct this assumption). I hope the time difference isn’t
>> too great and that the time below would suit you if you wish to join.
>> 
>> Can I propose that we try to meet online next Wednesday, 19th October, at
>> 18:30 BST, 10:30 PST, 20:30 MSK?
>> 
>> Would this date/time suit the majority?
>> Also, what is the preferred method? I can host via an Adobe Connect style
>> webex (which my company uses), but it isn’t the best IMHO, so I am more than
>> happy to have someone suggest a better alternative.
>> 
>> Best,
>> Mike
>> 
>> 
>> 
>> 
>> On 10/8/16, 7:26 AM, "Michael Pearce" <michael.pea...@ig.com> wrote:
>> 
>>>> I agree with the critique of compaction not having a value. I think
>>>> we should consider fixing that directly.
>> 
>>> Agree that the compaction issue is troubling: compacted "null" deletes
>>> are incompatible w/ headers that must be packed into the message
>>> value. Are there any alternatives on compaction delete semantics that
>>> could address this? The KIP wiki discussion I think mostly assumes
>>> that compaction-delete is what it is and can't be changed/fixed.
>> 
>>    This KIP is about dealing with quite a few use cases and issues;
>>    please see both the KIP use cases detailed by myself and the
>>    additional use cases wiki added by LinkedIn, linked from the main KIP.
>> 
>>    Compaction is something that happily is addressed with headers,
>>    but it most definitely isn't the sole reason or use case for them;
>>    headers solve many issues and use cases. Hence their elegance and
>>    simplicity, and why they're so common and so successful in transport
>>    mechanisms such as, as stated, HTTP, TCP and JMS.
>> 
>>    ________________________________________
>>    From: Dana Powers <dana.pow...@gmail.com>
>>    Sent: Friday, October 7, 2016 11:09 PM
>>    To: dev@kafka.apache.org
>>    Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>> 
>>> I agree with the critique of compaction not having a value. I think
>>> we should consider fixing that directly.
>> 
>>    Agree that the compaction issue is troubling: compacted "null" deletes
>>    are incompatible w/ headers that must be packed into the message
>>    value. Are there any alternatives on compaction delete semantics that
>>    could address this? The KIP wiki discussion I think mostly assumes
>>    that compaction-delete is what it is and can't be changed/fixed.
>> 
>>    -Dana
>> 
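To make the incompatibility described above concrete, here is a minimal
Python sketch of a client-side wrapper that packs headers into the message
value. The envelope layout (4-byte length, JSON headers, raw payload) and the
names wrap/unwrap are made up for illustration; this is not the format on the
linked wiki page.

```python
import json
import struct
from typing import Dict, Optional, Tuple


def wrap(headers: Dict[str, str], payload: bytes) -> bytes:
    """Pack headers and payload into one value: a 4-byte big-endian
    header length, the headers as JSON, then the raw payload."""
    encoded = json.dumps(headers, sort_keys=True).encode("utf-8")
    return struct.pack(">I", len(encoded)) + encoded + payload


def unwrap(value: Optional[bytes]) -> Tuple[Dict[str, str], Optional[bytes]]:
    """Inverse of wrap. A null value (a compaction delete) has no bytes
    to hold an envelope, so any headers are necessarily lost."""
    if value is None:
        return {}, None
    (length,) = struct.unpack(">I", value[:4])
    headers = json.loads(value[4:4 + length].decode("utf-8"))
    return headers, value[4 + length:]
```

A delete has to be sent as a null value for compaction to act on it, so in
this scheme a tombstone can never carry headers.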
>>    On Fri, Oct 7, 2016 at 1:38 PM, Michael Pearce <michael.pea...@ig.com>
>> wrote:
>>> 
>>> Hi Jay,
>>> 
>>> Thanks for the comments and feedback.
>>> 
>>> I think it's quite clear that if a problem keeps arising, then it needs
>>> resolving and addressing properly.
>>> 
>>> Fair enough, at LinkedIn, and historically for the very first use
>>> cases, addressing this may not have been a big priority. But as Kafka is
>>> now Apache open source and is being picked up by many, including my
>>> company, it is evident that this is a requirement and an issue that now
>>> needs to be addressed to meet these needs.
>>> 
>>> The fact that almost every transport mechanism I've worked with in the
>>> enterprise, including networking layers, has had headers clearly shows,
>>> I think, their need and success in a transport mechanism.
>>> 
>>> I understand some concerns with regard to the impact on others who do
>>> not need it.
>>> 
>>> What we are proposing is a flexible solution that adds no overhead
>>> on the storage or network traffic layers if you choose not to use headers,
>>> but does enable those who need or want headers to use them.
>>> 
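One way to read the "no overhead if unused" claim above is an attribute bit
that says whether a header section follows. The Python sketch below shows
that idea only; the bit HAS_HEADERS, the field widths, and the function names
are assumptions and do not reflect the actual Kafka message format or the
KIP-82 draft.

```python
import struct
from typing import Dict, Optional, Tuple

HAS_HEADERS = 0x01  # hypothetical attribute bit, chosen for illustration


def encode(value: bytes, headers: Optional[Dict[int, bytes]] = None) -> bytes:
    """Write an attributes byte, a header section only if headers exist,
    then the length-prefixed value."""
    headers = headers or {}
    out = bytearray([HAS_HEADERS if headers else 0])
    if headers:
        out += struct.pack(">H", len(headers))            # header count
        for key, data in sorted(headers.items()):
            out += struct.pack(">iI", key, len(data)) + data
    out += struct.pack(">I", len(value)) + value
    return bytes(out)


def decode(buf: bytes) -> Tuple[Dict[int, bytes], bytes]:
    """Inverse of encode: read headers only when the bit is set."""
    attributes, pos = buf[0], 1
    headers: Dict[int, bytes] = {}
    if attributes & HAS_HEADERS:
        (count,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        for _ in range(count):
            key, size = struct.unpack_from(">iI", buf, pos)
            pos += 8
            headers[key] = buf[pos:pos + size]
            pos += size
    (size,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    return headers, buf[pos:pos + size]
```

A record without headers costs no extra bytes beyond the attributes byte it
already has. The trade-off is that the header section is present or absent
depending on another field.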
>>> 
>>> On your response to 1): there is nothing saying that a field should be
>>> put in any faster or without diligence, and the same KIP process can still
>>> apply for adding Kafka-scope headers. Having headers just makes them
>>> easier to add, without constant message and record changes. Timestamp is a
>>> clear, real example of what should actually be in a header (along with
>>> other fields), but instead the whole message/record object needed to be
>>> changed to add it, as it will for any further headers deemed needed by
>>> Kafka.
>>> 
>>> On your response to 2): why, as a platform designer within my company,
>>> should I enforce that all teams use the same serialization for their
>>> payloads? What I do need is for some core cross-cutting concerns and
>>> information to be addressed at my platform level, without imposing them on
>>> my development teams. This is the same argument for why byte[] is the
>>> exposed value and key type: as a messaging platform, you don't want to
>>> impose a format on my company.
>>> 
>>> On your response to 3): actually this isn't true. There are many 3rd party
>>> tools that we need to hook into our messaging flows, and they only build
>>> on standardised interfaces, as obviously the cost of a custom
>>> implementation for every company would be very high.
>>> APM tooling is a clear case in point: every enterprise-level APM
>>> tool on the market is able to stitch together a transaction flow end to
>>> end across a platform over HTTP and JMS, because it can stitch in some
>>> "magic" data in a uniform/standardised way; for the two mentioned, it
>>> stitches this into the headers. In its current form, they cannot do this
>>> with Kafka. Providing a standardised interface will, I believe, actually
>>> benefit the project, as commercial companies like these will then be able
>>> to plug in their tooling uniformly, making it attractive and possible.
>>> 
>>> Some of your other concerns, as Joel mentions, are more
>>> implementation details that I think should be agreed upon, but I think
>>> they can be addressed.
>>> 
>>> e.g. re your concern on the hashmap:
>>> it is more than possible to avoid every record having a
>>> hashmap unless it actually has a header (just as we have managed to do in
>>> the serialized message), if there's a concern about the in-memory record
>>> size for those using Kafka without headers.
>>> 
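The point above about not giving every record a hashmap can be sketched as
lazy allocation. This is a minimal Python illustration, not the proposed
ProducerRecord/ConsumerRecord API; the class Record and its method names are
hypothetical.

```python
from typing import Dict, Optional


class Record:
    """A record whose header map is only allocated on first write."""

    __slots__ = ("key", "value", "_headers")

    def __init__(self, key: bytes, value: bytes) -> None:
        self.key = key
        self.value = value
        # No map is created until a header is actually set.
        self._headers: Optional[Dict[int, bytes]] = None

    def set_header(self, header_key: int, data: bytes) -> None:
        if self._headers is None:
            self._headers = {}
        self._headers[header_key] = data

    def header(self, header_key: int) -> Optional[bytes]:
        if self._headers is None:
            return None
        return self._headers.get(header_key)

    def has_headers(self) -> bool:
        return bool(self._headers)
```

Records that never use headers pay only for one empty reference slot.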
>>> On your second-to-last comment about every team choosing their own
>>> format: actually we do want this a little. As first mentioned, no, we
>>> don't want a free-for-all, but some freedom is needed, as different
>>> serializations have different benefits and drawbacks across our business.
>>> I can enumerate these if needed. One of the use cases for headers provided
>>> by LinkedIn on top of my KIP even shows where headers could be beneficial
>>> here: a header could be used to detail which data format the message is
>>> serialized in, allowing me to consume different formats.
>>> 
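The data-format use case above could look roughly like the Python sketch
below. The header key FORMAT_HEADER_KEY, the format labels, and the function
name are assumptions made for illustration; none of them come from the KIP
or the LinkedIn use cases page.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical numeric header key and format labels.
FORMAT_HEADER_KEY = 1

DESERIALIZERS: Dict[bytes, Callable[[bytes], Any]] = {
    b"application/json": lambda raw: json.loads(raw.decode("utf-8")),
    b"text/plain": lambda raw: raw.decode("utf-8"),
}


def deserialize(headers: Dict[int, bytes], value: bytes) -> Any:
    """Pick a deserializer from the format header; pass the bytes through
    untouched when the header is missing or names an unknown format."""
    decoder = DESERIALIZERS.get(headers.get(FORMAT_HEADER_KEY, b""))
    return decoder(value) if decoder else value
```

The consumer never has to inspect the payload to find out how to decode it,
which is what allows opaque or third-party payloads to coexist on a topic.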
>>> Also, we have some systems that we need to integrate where it is pretty
>>> near impossible to wrap or touch their binary payloads, or where we’re not
>>> allowed to touch them (historic systems, or inter/intra-corporate).
>>> 
>>> Headers really give us a solution for providing a pluggable platform,
>>> and a standardisation that allows users to build platforms that adapt to
>>> their needs.
>>> 
>>> 
>>> Cheers
>>> Mike
>>> 
>>> 
>>> ________________________________________
>>> From: Jay Kreps <j...@confluent.io>
>>> Sent: Friday, October 7, 2016 4:45 PM
>>> To: dev@kafka.apache.org
>>> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
>>> 
>>> Hey guys,
>>> 
>>> This discussion has come up a number of times and we've always
>>> passed.
>>> 
>>> One of the things that has helped keep Kafka simple is not adding in new
>>> abstractions and concepts except when the proposal is really elegant
>>> and makes things simpler.
>>> 
>>> Consider three use cases for headers:
>>> 
>>>   1. Kafka-scope: We want to add a feature to Kafka that needs a
>>>   particular field.
>>>   2. Company-scope: You want to add a header to be shared by
>>>   everyone in your company.
>>>   3. World-wide scope: You are building a third party tool and want
>>>   to add some kind of header.
>>> 
>>> For the case of (1) you should not use headers, you should just add
>>> a field to the record format. Having a second way of encoding things
>>> doesn't make sense. Occasionally people have complained that adding to
>>> the record format is hard and it would be nice to just shove lots of
>>> things in quickly. I think a better solution would be to make it easy to
>>> add to the record format, and I think we've made progress on that. I also
>>> think we should be insanely focused on the simplicity of the abstraction
>>> and not adding in new thingies often---we thought about time for years
>>> before adding a timestamp and I guarantee you we would have goofed it up
>>> if we'd gone with the earlier proposals. These things end up being long
>>> term commitments so it's really worth being thoughtful.
>>> 
>>> For case (2) just use the body of the message. You don't need a
>>> globally agreed on definition of headers, just standardize on a header
>>> you want to include in the value in your company. Since this is just used
>>> by code in your company having a more standard header format doesn't
>>> really help you. In fact by using something like Avro you can define
>>> exactly the types you want, the required header fields, etc.
>>> 
>>> The only case that headers help is (3). This is a bit of a niche
>>> case and I think is easily solved just making the reading and writing of
>>> given required fields pluggable to work with the header you have.
>>> 
>>> A couple of specific problems with this proposal:
>>> 
>>>   1. A global registry of numeric keys is super super ugly. This
>>>   seems silly compared to the Avro (or whatever) header solution which
>>>   gives more compact encoding, rich types, etc.
>>>   2. Using byte arrays for header values means they aren't really
>>>   interoperable for case (3). E.g. I can't make a UI that displays
>>>   headers, or allow you to set them in config. To work with third party
>>>   headers, the only case I think this really helps, you need the union
>>>   of all serialization schemes people have used for any tool.
>>>   3. For case (2) and (3) your key numbers are going to collide like
>>>   crazy. I don't think a global registry of magic numbers maintained
>>>   either by word of mouth or checking in changes to kafka source is the
>>>   right thing to do.
>>>   4. We are introducing a new serialization primitive which makes
>>>   fields disappear conditional on the contents of other fields. This
>>>   breaks the whole serialization/schema system we have today.
>>>   5. We're adding a hashmap to each record
>>>   6. This proposes making the ProducerRecord and ConsumerRecord
>>>   mutable and adding setters and getters (which we try to avoid).
>>> 
>>> For context on LinkedIn: I set up the system there, but it may have
>>> changed since I left. The header is maintained with the record schemas
>>> in the avro schema registry and is required for all records. Essentially
>>> all messages must have a field named "header" of type EventHeader which
>>> is itself a record schema with a handful of fields (time, host, etc). The
>>> header follows the same compatibility rules as other avro fields, so it
>>> can be evolved in a compatible way gradually across apps. Avro is typed
>>> and doesn't require deserializing the full record to read the header. The
>>> header information (timestamp, host, etc) is important and needs to
>>> propagate into other systems like Hadoop which don't have a concept of
>>> headers for records, so I doubt it could move out of the value in any
>>> case. Not allowing teams to choose a data format other than avro was
>>> considered a feature, not a bug, since the whole point was to be able to
>>> share data, which doesn't work if every team chooses their own format.
>>> 
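The in-value header convention described above can be sketched in Python as
follows. Plain dicts stand in for decoded Avro records here; only the field
name "header", the type name EventHeader, and the fields time and host come
from the description, and everything else (the function name, the time unit)
is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class EventHeader:
    time: int  # event time; epoch milliseconds is an assumed unit
    host: str  # originating host


def read_header(message: Dict[str, Any]) -> EventHeader:
    """Return the required "header" field, rejecting messages without one."""
    raw = message.get("header")
    if not isinstance(raw, dict):
        raise ValueError('message is missing the required "header" field')
    return EventHeader(time=raw["time"], host=raw["host"])
```

Because the header lives inside the value, it travels with the record into
downstream systems that have no notion of record headers.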
>>> I agree with the critique of compaction not having a value. I think
>>> we should consider fixing that directly.
>>> 
>>> -Jay
>>> 
>>> On Thu, Sep 22, 2016 at 12:31 PM, Michael Pearce <
>> michael.pea...@ig.com>
>>> wrote:
>>> 
>>>> Hi All,
>>>> 
>>>> 
>>>> I would like to discuss the following KIP proposal:
>>>> 
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers
>>>> 
>>>> 
>>>> 
>>>> I have some initial drafts of roughly the changes that would be needed.
>>>> This is nowhere near finalized, and I look forward to the discussion,
>>>> especially as there are some bits I'm personally in two minds about.
>>>> 
>>>> https://github.com/michaelandrepearce/kafka/tree/kafka-headers-properties
>>>> 
>>>> 
>>>> 
>>>> Here is a link to an alternative option mentioned in the KIP, but one
>>>> I would personally discard (disadvantages mentioned in the KIP):
>>>> 
>>>> https://github.com/michaelandrepearce/kafka/tree/kafka-headers-full
>>>> 
>>>> 
>>>> Thanks
>>>> 
>>>> Mike
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 