+1 (non-binding)

Thanks,

Mayuresh


> On Nov 29, 2016, at 3:18 AM, Michael Pearce <michael.pea...@ig.com> wrote:
> 
> Hi All,
> 
> We have been discussing in the below thread and final changes have been made 
> to the KIP wiki based on these discussions.
> 
> We would now like to put to the vote the following KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-87+-+Add+Compaction+Tombstone+Flag
> 
> This kip is for having a distinct compaction attribute “tombstone” flag 
> instead of relying on null value, allowing non-null value delete messages.
> 
> Many thanks,
> Michael
> 
> 
> 
> On 22/11/2016, 15:52, "Michael Pearce" <michael.pea...@ig.com> wrote:
> 
>    Hi Mayuresh,
> 
>    LGTM. I've just made one small adjustment updating the wire protocol to 
> show the magic byte bump.
> 
>    Do we think we’re good to put to a vote? Are there any other bits needing 
> discussion?
> 
>    Cheers
>    Mike
> 
>    On 21/11/2016, 18:26, "Mayuresh Gharat" <gharatmayures...@gmail.com> wrote:
> 
>        Hi Michael,
> 
>        I have updated the migration section of the KIP. Can you please take a 
> look?
> 
>        Thanks,
> 
>        Mayuresh
> 
>        On Fri, Nov 18, 2016 at 9:07 AM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
> 
>> Hi Michael,
>> 
>> That, whilst sending a tombstone with a non-null value, the consumer can
>> expect to receive the non-null message only in step (3). Is this correct?
>> ---> I do agree with you here.
>> 
>> Becket, Ismael : can you guys review the migration plan listed above using
>> magic byte?
>> 
>> Thanks,
>> 
>> Mayuresh
>> 
>> On Fri, Nov 18, 2016 at 8:58 AM, Michael Pearce <michael.pea...@ig.com>
>> wrote:
>> 
>>> Many thanks for this Mayuresh. I don't have any objections.
>>> 
>>> I assume we should state:
>>> 
>>> That, whilst sending a tombstone with a non-null value, the consumer can
>>> expect to receive the non-null message only in step (3). Is this correct?
>>> 
>>> Cheers
>>> Mike
>>> 
>>> 
>>> 
>>> ________________________________________
>>> From: Mayuresh Gharat <gharatmayures...@gmail.com>
>>> Sent: Thursday, November 17, 2016 5:18:41 PM
>>> To: dev@kafka.apache.org
>>> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>>> 
>>> Hi Ismael,
>>> 
>>> Thanks for the explanation.
>>> I especially like the part where you mentioned that we can get rid of the
>>> older null value support for log compaction later on:
>>> "We can't change semantics of the message format without having a long
>>> transition period. And we can't rely on people reading documentation or
>>> acting on a warning for something so fundamental. As such, my take is that
>>> we need to bump the magic byte. The good news is that we don't have to
>>> support all versions forever. We have said that we will support direct
>>> upgrades for 2 years. That means that message format version n could, in
>>> theory, be removed 2 years after it's introduced."
>>> 
>>> Just a heads up: even without bumping the magic byte, we will *NOT* lose
>>> zero copy, because client(x+1) in my explanation above will internally
>>> convert a null value to have the tombstone bit set, and a set tombstone bit
>>> to have a null value, and by the time we move to version (x+2) the clients
>>> will have upgraded. Obviously, if we support a request from consumer(x), we
>>> will lose zero copy, but that is the same case with the magic byte.
>>> 
>>> But if magic byte bump makes life easier for transition for the above
>>> reasons that you explained, I am OK with it since we are going to meet the
>>> end goal down the road :)
>>> 
>>> On a side note can we update the doc here on magic byte to say that "*it
>>> should be bumped whenever the message format is changed or the
>>> interpretation of message format (usage of the reserved bits as well) is
>>> changed*".
>>> 
>>> 
>>> Hi Michael,
>>> 
>>> Here is the update plan that we discussed offline yesterday :
>>> 
>>> Currently the magic-byte which corresponds to the "message.format.version"
>>> is set to 1.
>>> 
>>> 1) On broker it will be set to 1 initially.
>>> 
>>> 2) When a producer client sends a message with magic-byte = 2, since the
>>> broker is on magic-byte = 1, we will down convert it, which means if the
>>> tombstone bit is set, the value will be set to null. A consumer
>>> understanding magic-byte = 1 will still work with this. A consumer working
>>> with magic-byte = 2 will also be able to understand this, since it
>>> understands the tombstone.
>>> Now there is still the question of supporting a non-tombstone and null
>>> value from a producer client with magic-byte = 2. *(I am not sure if we
>>> should support this. Ismael/Becket can comment here.)*
>>> 
>>> 3) When almost all the clients have upgraded, the message.format.version on
>>> the broker can be changed to 2, at which point the down conversion in the
>>> above step will no longer happen. If at this point we get a consumer request
>>> from an older consumer, we might have to down convert, in which case we lose
>>> zero copy, but these cases should be rare.
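
The down conversion described in step 2 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not Kafka's implementation: the `Message` class, its `tombstone` field, and the `down_convert` helper are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Message:
    """Hypothetical in-memory message; this is not Kafka's wire format."""
    magic: int
    key: bytes
    value: Optional[bytes]
    tombstone: bool = False  # only meaningful when magic == 2


def down_convert(msg: Message) -> Message:
    """Convert a magic-byte-2 message to magic-byte 1 (illustrative only)."""
    if msg.magic < 2:
        return msg  # already in the older format, nothing to do
    # Magic byte 1 has no tombstone bit, so a delete can only be expressed
    # as a null value: drop the value when the tombstone bit is set.
    value = None if msg.tombstone else msg.value
    return replace(msg, magic=1, value=value, tombstone=False)


delete_with_payload = Message(magic=2, key=b"k", value=b"audit-info", tombstone=True)
converted = down_convert(delete_with_payload)
print(converted.magic, converted.value)  # 1 None
```

In this sketch a non-tombstone message with a null value would look exactly like a delete after down conversion, which is the open question raised in step 2.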
>>> 
>>> Becket, can you review this plan and add more details if I have missed
>>> something or got something wrong, before we put it on the KIP?
>>> 
>>> Thanks,
>>> 
>>> Mayuresh
>>> 
>>> On Wed, Nov 16, 2016 at 11:07 PM, Michael Pearce <michael.pea...@ig.com>
>>> wrote:
>>> 
>>>> Thanks guys, for discussing this offline and getting some consensus.
>>>> 
>>>> So it's clear for myself and others what is proposed now (I think I
>>>> understand, but want to make sure),
>>>> could I ask you to either directly update the KIP to detail the migration
>>>> strategy, or (re-)state in this thread the migration strategy, based on a
>>>> magic byte, that you discussed and agreed offline?
>>>> 
>>>> 
>>>> The main original driver for the KIP was to support compaction where the
>>>> value isn't null, based off the discussions on the KIP-82 thread.
>>>> 
>>>> We should be able to support non-tombstone + null value by the completion
>>>> of the KIP; as we noted when discussing this KIP, having logic based on a
>>>> null value isn't very clean, and a distinct flag also separates the
>>>> concerns.
>>>> 
>>>> As discussed already though we can split this into KIP-87a and KIP-87b
>>>> 
>>>> Where we look to deliver KIP-87a on a compacted topic (to address the
>>>> immediate issues)
>>>> * tombstone + null value
>>>> * tombstone + non-null value
>>>> * non-tombstone + non-null value
>>>> 
>>>> Then we can discuss once KIP-87a is completed options later and how we
>>>> support the second part KIP-87b to deliver:
>>>> * non-tombstone + null value
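
The proposed split can be summarised as a small truth table. The sketch below is only an illustration of the four combinations listed above; the function name `supported_in_kip87a` is hypothetical.

```python
from itertools import product


def supported_in_kip87a(tombstone: bool, value_is_null: bool) -> bool:
    """True for the three combinations proposed for KIP-87a."""
    # Only "non-tombstone + null value" is deferred to KIP-87b.
    return tombstone or not value_is_null


for tombstone, value_is_null in product([True, False], repeat=2):
    phase = "KIP-87a" if supported_in_kip87a(tombstone, value_is_null) else "KIP-87b"
    flag = "tombstone" if tombstone else "non-tombstone"
    value = "null value" if value_is_null else "non-null value"
    print(f"{flag} + {value}: {phase}")
```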
>>>> 
>>>> Cheers
>>>> Mike
>>>> 
>>>> 
>>>> 
>>>> ________________________________________
>>>> From: Becket Qin <becket....@gmail.com>
>>>> Sent: Thursday, November 17, 2016 1:43 AM
>>>> To: dev@kafka.apache.org
>>>> Subject: Re: [DISCUSS] KIP-87 - Add Compaction Tombstone Flag
>>>> 
>>>> Renu, Mayuresh and I had an offline discussion, and following is a brief
>>>> summary.
>>>> 
>>>> 1. We agreed that not bumping up magic value may result in losing zero
>>>> copy during migration.
>>>> 2. Given that bumping up magic value is almost free and has the benefit of
>>>> avoiding a potential performance issue, it is probably worth doing.
>>>> 
>>>> One issue we still need to think about is whether we want to support a
>>>> non-tombstone message with null value.
>>>> Currently it is not supported by Kafka. If we allow a non-tombstone null
>>>> value message to exist after KIP-87, the problem is that such a message
>>>> will not be supported by the consumers prior to KIP-87, because a null
>>>> value will always be interpreted as a tombstone.
>>>> 
>>>> One option is that we keep the current way, i.e. do not support such a
>>>> message. It would be good to know if there is a concrete use case for such
>>>> a message. If there is not, we can probably just not support it.
>>>> 
>>>> Thanks,
>>>> 
>>>> Jiangjie (Becket) Qin
>>>> 
>>>> 
>>>> 
>>>> On Wed, Nov 16, 2016 at 1:28 PM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
>>>> 
>>>>> Hi Ismael,
>>>>> 
>>>>> This is something I can think of for a migration plan with up
>>>>> conversion:
>>>>> 
>>>>> 1) Currently, let's say we have the broker at version x.
>>>>> 2) Currently we have clients at version x.
>>>>> 3) a) We move the broker to version Broker(x+1), which supports both
>>>>> tombstone and null for log compaction.
>>>>>    b) We upgrade the client to version client(x+1): if in the producer
>>>>> client(x+1) the value is set to null, we will automatically set the
>>>>> tombstone bit internally. If the producer client(x+1) sets the tombstone
>>>>> itself, well and good. For producer client(x), the broker will up convert
>>>>> to have the tombstone bit. Broker(x+1) is supporting both. Consumer
>>>>> client(x+1) will be aware of this and should be able to handle this. For
>>>>> consumer client(x) we will down convert the message on the broker side.
>>>>>    c) At this point we will have to issue a warning or clearly specify
>>>>> in the docs that this behavior is about to be changed for log compaction.
>>>>> 4) a) In the next release, Broker(x+2), we say that only the tombstone is
>>>>> used for log compaction on the broker side. Clients(x+1) are still
>>>>> supported.
>>>>>    b) We upgrade the client to version client(x+2): if the value is set to
>>>>> null, the tombstone will not be set automatically. The client will have to
>>>>> call setTombstone() to actually set the tombstone.
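
The difference between the producer in client(x+1) and client(x+2) in the plan above can be sketched as below. This is a hedged Python illustration only: the function `tombstone_bit`, its parameters, and the integer encoding of the client release are assumptions made for this sketch, and the explicit flag merely stands in for the `setTombstone()` call mentioned in step 4b.

```python
from typing import Optional


def tombstone_bit(client_release: int,
                  value: Optional[bytes],
                  set_tombstone: bool = False) -> bool:
    """Return whether the produced message carries the tombstone bit.

    client_release: 1 stands for client(x+1), 2 stands for client(x+2).
    set_tombstone: stands in for an explicit setTombstone() call.
    """
    if client_release not in (1, 2):
        raise ValueError("only client(x+1) and client(x+2) are modelled here")
    if set_tombstone:
        return True  # an explicit tombstone is honoured by both releases
    if client_release == 1:
        # Transition release: a null value implies a tombstone.
        return value is None
    # client(x+2): a null value alone no longer implies a tombstone.
    return False


print(tombstone_bit(1, None))        # True
print(tombstone_bit(2, None))        # False
print(tombstone_bit(2, None, True))  # True
```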
>>>>> 
>>>>> We should compare this migration plan with the migration plan for the
>>>>> magic byte bump and do whatever looks good.
>>>>> I am just worried that if we go down the magic byte route, unless I am
>>>>> missing something, it sounds like Kafka will be stuck with supporting both
>>>>> null value and tombstone bit for log compaction forever, which does not
>>>>> look like a good end state.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Mayuresh
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Nov 16, 2016 at 9:32 AM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
>>>>> 
>>>>>> Hi Ismael,
>>>>>> 
>>>>>> That's a very good point which I might have not considered earlier.
>>>>>> 
>>>>>> Here is a plan that I can think of:
>>>>>> 
>>>>>> Stage 1) From now on, the broker up converts the message to have the
>>>>>> tombstone marker. The log compaction thread does log compaction based on
>>>>>> both null and the tombstone marker. This is our transition period.
>>>>>> Stage 2) In the next release we say that log compaction is based only on
>>>>>> the tombstone marker (open source Kafka makes this a policy). By this
>>>>>> time, the organization which is moving to this release will be sure that
>>>>>> they have gone through the entire transition period.
>>>>>> 
>>>>>> My only goal in doing this is that Kafka clearly specifies the end state
>>>>>> of what log compaction means (is it null value or a tombstone marker,
>>>>>> but not both).
>>>>>> 
>>>>>> What do you think?
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Mayuresh
>>>>>> 
>>>>>> On Wed, Nov 16, 2016 at 9:17 AM, Ismael Juma <ism...@juma.me.uk> wrote:
>>>>>> 
>>>>>>> One comment below.
>>>>>>> 
>>>>>>> On Wed, Nov 16, 2016 at 5:08 PM, Mayuresh Gharat <gharatmayures...@gmail.com> wrote:
>>>>>>> 
>>>>>>>>   - If we don't bump up the magic byte, on the broker side, the broker
>>>>>>>>   will always have to look at both the tombstone bit and the value when
>>>>>>>>   doing the compaction. Assuming we do not bump up the magic byte,
>>>>>>>>   imagine the broker sees a message which does not have a tombstone bit
>>>>>>>>   set. The broker does not know when the message was produced (i.e.
>>>>>>>>   whether the message has been up converted or not), so it has to take a
>>>>>>>>   further look at the value to see if it is null or not in order to
>>>>>>>>   determine if it is a tombstone. The same logic has to be put on the
>>>>>>>>   consumer as well, because the consumer does not know if the message
>>>>>>>>   has been up converted or not.
>>>>>>>>      - If we upconvert while appending, this is not the case, right?
>>>>>>> 
>>>>>>> 
>>>>>>> If I understand you correctly, this is not sufficient because the log may
>>>>>>> have messages appended before it was upgraded to include KIP-87.
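
The point being debated here can be illustrated by comparing the delete check with and without a magic byte bump. This is a rough Python sketch under assumptions: both function names are hypothetical, and the value 2 for the new magic byte is taken from the later migration plan in this thread rather than from Kafka's code.

```python
from typing import Optional


def is_tombstone_without_magic_bump(tombstone_bit: bool,
                                    value: Optional[bytes]) -> bool:
    """No version marker: old and new messages are indistinguishable."""
    # A message written before the upgrade never has the bit set, so the
    # broker (and the consumer) must always fall back to the null check.
    return tombstone_bit or value is None


def is_tombstone_with_magic_bump(magic: int,
                                 tombstone_bit: bool,
                                 value: Optional[bytes]) -> bool:
    """The magic byte tells the reader which rule applies."""
    if magic >= 2:
        return tombstone_bit  # new format: only the bit matters
    return value is None      # old format: null value means delete


# An old-format delete appended before the upgrade is still recognised.
print(is_tombstone_with_magic_bump(1, False, None))  # True
# A new-format delete may carry a non-null value.
print(is_tombstone_with_magic_bump(2, True, b"why"))  # True
```

Without the bump, a non-tombstone message with a null value can never be told apart from a legacy delete, which is why the first function has to treat it as one.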
>>>>>>> 
>>>>>>> Ismael
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> -Regards,
>>>>>> Mayuresh R. Gharat
>>>>>> (862) 250-7125
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> -Regards,
>>>>> Mayuresh R. Gharat
>>>>> (862) 250-7125
>>>> The information contained in this email is strictly confidential and for
>>>> the use of the addressee only, unless otherwise indicated. If you are
>>> not
>>>> the intended recipient, please do not read, copy, use or disclose to
>>> others
>>>> this message or any attachment. Please also notify the sender by
>>> replying
>>>> to this email or by telephone (+44(020 7896 0011) and then delete the
>>> email
>>>> and any copies of it. Opinions, conclusion (etc) that do not relate to
>>> the
>>>> official business of this company shall be understood as neither given
>>> nor
>>>> endorsed by it. IG is a trading name of IG Markets Limited (a company
>>>> registered in England and Wales, company number 04008957) and IG Index
>>>> Limited (a company registered in England and Wales, company number
>>>> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>>>> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>>>> Index Limited (register number 114059) are authorised and regulated by
>>> the
>>>> Financial Conduct Authority.
>>> 
>>> 
>>> 
>>> --
>>> -Regards,
>>> Mayuresh R. Gharat
>>> (862) 250-7125
>> 
>> 
>> 
>> --
>> -Regards,
>> Mayuresh R. Gharat
>> (862) 250-7125
> 
> 
> 
>        --
>        -Regards,
>        Mayuresh R. Gharat
>        (862) 250-7125
> 
> 
> 
> 
