Hi Guozhang,

Sure and I have made a note in the JIRA item to make sure the wiki is updated.

Thanks,
Senthil

-----Original Message-----
From: Guozhang Wang <wangg...@gmail.com> 
Sent: Monday, November 4, 2019 11:00 AM
To: dev <dev@kafka.apache.org>
Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction

Hello Senthilnathan,

Thanks for revamping on the KIP. I have only one comment about the wiki 
otherwise LGTM.

1. We should emphasize that the newly introduced config yields to the existing 
"log.cleanup.policy", i.e. if the latter's value is `delete` not `compact`, 
then the previous config would be ignored.


Guozhang

On Mon, Nov 4, 2019 at 9:52 AM Senthilnathan Muthusamy 
<senth...@microsoft.com.invalid> wrote:

> Hi all,
>
> I will start the vote thread shortly for this updated KIP. If there 
> are any more thoughts I would love to hear them.
>
> Thanks,
> Senthil
>
> -----Original Message-----
> From: Senthilnathan Muthusamy <senth...@microsoft.com.INVALID>
> Sent: Thursday, October 31, 2019 3:51 AM
> To: dev@kafka.apache.org
> Subject: RE: [DISCUSS] KIP-280: Enhanced log compaction
>
> Hi Matthias
>
> Thanks for the response.
>
> (1) Yes
>
> (2) Yes, and the config name will be the same (i.e.
> `log.cleaner.compaction.strategy` &
> `log.cleaner.compaction.strategy.header`) at broker level and topic 
> level (to override broker level default compact strategy). Please let 
> me know if we need to keep it in different naming convention. Note: 
> Broker level (which will be in the server.properties) configuration is 
> optional and default it to offset. Topic level configuration will be 
> default to broker level config...
>
> (3) By this new way, it avoids another config parameter and also in 
> feature if any new strategy like header need addition info, no 
> additional config required. As this got discussed already and agreed 
> to have separate config, I will revert it. KIP updated...
>
> (4) Done
>
> (5) Updated
>
> (6) Updated to pick the first header in the list
>
> Please let me know if you have any other questions.
>
> Thanks,
> Senthil
>
> -----Original Message-----
> From: Matthias J. Sax <matth...@confluent.io>
> Sent: Thursday, October 31, 2019 12:13 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction
>
> Thanks for picking up this KIP, Senthil.
>
> (1) As far as I remember, the main issue of the original proposal was 
> a missing topic level configuration for the compaction strategy. With 
> this being addressed, I am in favor of this KIP.
>
> (2) With regard to (1), it seems we would need a new topic level 
> config `compaction.strategy`, and `log.cleaner.compaction.strategy` 
> would be the default strategy (ie, broker level config) if a topic does not 
> overwrite it?
>
> (3) Why did you remove `log.cleaner.compaction.strategy.header`
> parameter and change the accepted values of 
> `log.cleaner.compaction.strategy` to "header.<key>" instead of keeping 
> "header"? The original approach seems to be cleaner, and I think this 
> was discussed on the original discuss thread already.
>
> (4) Nit: For the "timestamp" compaction strategy you changed the KIP 
> to
>
> -> `The record [create] timestamp`
>
> This is miss leading IMHO, because it depends on the broker/log 
> configuration `(log.)message.timestamp.type` that can either be 
> `CreateTime` or `LogAppendTime` what the actual record timestamp is. I 
> would just remove "create" to keep it unspecified.
>
> (5) Nit: the section "Public Interfaces" should list the newly 
> introduced configs -- configuration parameters are a public interface.
>
> (6) What do you mean by "first level header lookup"? The term "first 
> level" indicates some hierarchy, but headers don't have any hierarchy 
> -- it's just a list of key-value pairs? If you mean the _order_ of the 
> headers, ie, pick the first header in the list that matches the key, 
> please rephrase it to make it clearer.
>
>
>
> @Tom: I agree with all you are saying, however, I still think that 
> this KIP will improve the overall situation, because everything you 
> pointed out is actually true with offset based compaction, too.
>
> The KIP is not a silver bullet that solves all issue for interleaved 
> writes, but I personally believe, it's a good improvement.
>
>
>
> -Matthias
>
>
> On 10/30/19 9:45 AM, Senthilnathan Muthusamy wrote:
> > Hi,
> >
> > Please let me know if anyone has any questions on this updated KIP-280...
> >
> > Thanks,
> >
> > Senthil
> >
> > -----Original Message-----
> > From: Senthilnathan Muthusamy <senth...@microsoft.com.INVALID>
> > Sent: Monday, October 28, 2019 11:36 PM
> > To: dev@kafka.apache.org
> > Subject: RE: [DISCUSS] KIP-280: Enhanced log compaction
> >
> > Hi Tom,
> >
> > Sorry for the delayed response.
> >
> > Regarding the fall back to offset decision for both timestamp & 
> > header
> value is based on the previous author discuss
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Flist
> s.apache.org%2Fthread.html%2Ff44317eb6cd34f91966654c80509d4a457dbbccdd
> 02b86645782be67%40%253Cdev.kafka.apache.org%253E&amp;data=02%7C01%7Cse
> nthilm%40microsoft.com%7C8f6cae776082459c793b08d761595294%7C72f988bf86
> f141af91ab2d7cd011db47%7C1%7C0%7C637085022516423400&amp;sdata=WpEW5ylu
> %2FsLMyGS2ULWDZ7vA1OzQwFYWSuioLCbABhM%3D&amp;reserved=0
> and as per the discussion, it is really required to avoid duplicates.
> >
> > And the timestamp strategy is from the original KIP author and we 
> > are
> keeping it as is.
> >
> > Finally on the sequence order guarantee by the producer, it is not
> feasible on waiting for ack in async / multi-threads/processes 
> scenarios and hence the header sequence based compact strategy with 
> producer's responsibility to have a unique sequence generation for the 
> topic-partition-key level.
> >
> > Hoping this clarifies all your questions. Please let us know if you 
> > have
> any further questions.
> >
> > @Guozhang Wang / @Matthias J. Sax, I see you both had a detail
> discussion on the original KIP with previous author and it would great 
> to hear your inputs as well.
> >
> > Thanks,
> > Senthil
> >
> > -----Original Message-----
> > From: Tom Bentley <tbent...@redhat.com>
> > Sent: Tuesday, October 22, 2019 2:32 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction
> >
> > Hi Senthilnathan,
> >
> > In the motivation isn't it a little misleading to say "On the 
> > producer side, we clearly preserve an order for the two messages, 
> > <K1, V1> <K1,
> > V2>"? IMHO, the semantics of the producer are clear that having an 
> > V2>observed
> > order of sending records from different producers is not sufficient 
> > to
> guarantee ordering on the broker. You really need to send the 2nd 
> record only after the 1st record is acked. It's the difficultly of 
> achieving that in practice that's the true motivation for your KIP.
> >
> > I can see the attraction of using timestamps, but it would be 
> > helpful to
> explain how that really solves the problem. When the producers are in 
> different processes on different machines you're relying on their 
> clocks being synchronized, which is a whole subject in itself. Even if 
> they're synchronized the resolution of System.currentTimeMillis() is 
> typically many milliseconds. If your producers are in different 
> threads of the same process that could be a real problem because it makes 
> ties quite likely.
> > And you don't explain why it's OK to resolve ties using the offset. 
> > The
> basis of your argument is that the offset is giving you the wrong answer.
> > So it seems to me that using it as a tiebreaker is just narrowing 
> > the
> chances of getting the wrong answer. Maybe none of this matters for 
> your use case, but I think it should be spelled out in the KIP, 
> because it surely would matter for similar use cases.
> >
> > Using a sequence at least removes the problem of ties, but the
> interesting bit is now in how you deal with races between 
> threads/processes in getting a sequence number allocated (which is out 
> of scope of the KIP, I guess).
> > How is resolving that race any simpler that resolving the motivating
> race by waiting for the ack of the first record sent?
> >
> > Kind regards,
> >
> > Tom
> >
> > On Mon, Oct 21, 2019 at 9:06 PM Senthilnathan Muthusamy <
> senth...@microsoft.com.invalid> wrote:
> >
> >> Hi All,
> >>
> >> We are bring back the KIP-280 to live with small correct for the 
> >> discussion & voting. Thanks to previous author Luis Cabral on the
> >> KIP-280 initiation and we are taking over to complete and get it 
> >> into
> 2.4...
> >>
> >> Below is the correction that we made to the existing KIP-280:
> >>
> >>   *   Allowing the compact strategy configuration at the topic level as
> >> the log compaction is at the topic level and a broker can have 
> >> multiple topics. This allows the flexibility to have the strategy 
> >> at both broker level (i.e. for all topics within the broker) and 
> >> topic level (i.e. for a subset of topics within a broker) as well...
> >>
> >> KIP-280:
> >> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fc
> >> wi
> >> k
> >> i.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-280%253A%2BEnhanc
> >> ed
> >> %
> >> 2Blog%2Bcompaction&amp;data=02%7C01%7Csenthilm%40microsoft.com%7C68
> >> 6c
> >> 3
> >> 2fa4a554d61ae1408d756d409f6%7C72f988bf86f141af91ab2d7cd011db47%7C1%
> >> 7C
> >> 0
> >> %7C637073341017520406&amp;sdata=KrRem2KWCBscHX963Ah8wZ%2Fj9dkhCeAa7
> >> Gs
> >> 6
> >> XqJ%2F5SQ%3D&amp;reserved=0 PULL REQUEST:
> >> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fg
> >> it
> >> h
> >> ub.com%2Fapache%2Fkafka%2Fpull%2F7528&amp;data=02%7C01%7Csenthilm%4
> >> 0m
> >> i
> >> crosoft.com%7C686c32fa4a554d61ae1408d756d409f6%7C72f988bf86f141af91
> >> ab
> >> 2
> >> d7cd011db47%7C1%7C0%7C637073341017520406&amp;sdata=bt32PgDUjJjpXohE
> >> Wp
> >> t
> >> Fxv6mPERCwcRFlVROzinBtnk%3D&amp;reserved=0 (unit test coverage in
> >> progress)
> >>
> >> Previous Thread DISCUSS:
> >> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fl
> >> is
> >> t
> >> s.apache.org%2Fthread.html%2F79aa6e50d7c737ddf83455dd8063692a535a1a
> >> fa
> >> 5
> >> 58620fe1a1496d3%40%253Cdev.kafka.apache.org%253E&amp;data=02%7C01%7
> >> Cs
> >> e
> >> nthilm%40microsoft.com%7C686c32fa4a554d61ae1408d756d409f6%7C72f988b
> >> f8
> >> 6
> >> f141af91ab2d7cd011db47%7C1%7C0%7C637073341017520406&amp;sdata=XwcUW
> >> WY
> >> D
> >> PV1nA%2BbkDGLFNlXZ5bysVblWUTDQEzAaKxM%3D&amp;reserved=0
> >> Previous Thread VOTE:
> >> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fl
> >> is
> >> t
> >> s.apache.org%2Fthread.html%2Fb2ecd73ce849741f0c40b4f801c3f765058349
> >> 78
> >> 1
> >> 2713e240e1ac2b7%40%253Cdev.kafka.apache.org%253E&amp;data=02%7C01%7
> >> Cs
> >> e
> >> nthilm%40microsoft.com%7C686c32fa4a554d61ae1408d756d409f6%7C72f988b
> >> f8
> >> 6
> >> f141af91ab2d7cd011db47%7C1%7C0%7C637073341017520406&amp;sdata=8cKQc
> >> Am
> >> 2
> >> DDVGVLTKtciYKGMiI%2FgOADW6tam9nem4lsg%3D&amp;reserved=0
> >>
> >> Appreciate your timely action.
> >>
> >> PS: Initiating a separate thread as I was not able to reply to the 
> >> existing threads...
> >>
> >> Thanks,
> >> Senthil
> >>
>
>

--
-- Guozhang

Reply via email to