Hi Dongjin,

I was thinking of a simple test: Snappy with 1 KB block size vs 32 KB block
size. If the compression rate is similar for both, then it seems very
wasteful to use 32 KB. I suspect you will see a significant difference
though.
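
For concreteness, here is a minimal sketch of that comparison. It is only an illustration under a few assumptions: it uses Python's zlib as a stand-in because Snappy is not in the standard library, so the absolute numbers will differ from Snappy's; the sample data is synthetic; and the helper names (compressed_size, make_sample) are hypothetical. Each block is compressed independently, which is what a block size limit effectively does.

```python
import zlib

STATUSES = ("ok", "retry", "fail")


def make_sample(n_records: int = 20000) -> bytes:
    """Build synthetic, log-like records (assumption: not real Kafka data)."""
    lines = [
        '{"id": %d, "user": "user-%d", "status": "%s"}\n'
        % (i, i % 97, STATUSES[i % 3])
        for i in range(n_records)
    ]
    return "".join(lines).encode("utf-8")


def compressed_size(data: bytes, block_size: int, level: int = 6) -> int:
    """Compress data in independent blocks; return the total compressed bytes."""
    total = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        total += len(zlib.compress(block, level))
    return total


if __name__ == "__main__":
    sample = make_sample()
    for block_size in (1024, 32 * 1024):
        size = compressed_size(sample, block_size)
        # Ratio of compressed to original size; lower means better compression.
        print(f"block={block_size:>6} B  ratio={size / len(sample):.3f}")
```

If the two ratios come out close on realistic data, the larger block buys little; if they differ a lot, the larger default is justified.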

Ismael

On Tue, Jun 8, 2021 at 8:27 AM Dongjin Lee <dong...@apache.org> wrote:

> Hi Ismael,
>
> I added the linear write benchmark result to the proposal. As in the
> producer benchmark, the lowest compression level showed the best MB/sec in
> every case. I tested several configurations, but the result was almost the
> same.
>
> If you have any suggestions for the benchmark, don't hesitate to share
> them. I am new to running the linear write benchmark.
>
> Best,
> Dongjin
>
> On Sun, Jun 6, 2021 at 8:20 AM Dongjin Lee <dong...@apache.org> wrote:
>
> > Hi Ismael,
> >
> > Thanks for the reply.
> >
> > > So you're saying that reducing the buffer size didn't reduce the
> > > compression rate for codecs like lz4?
> >
> > Of course, there were some improvements in compressed size when I tried
> > the 'buffer.size' option, but the gain was not significant. I tried several
> > datasets, but the result was the same. It made me skeptical about adding
> > this option, which seemed only to make the configuration more complex.
> >
> > In contrast, 'compression.level' showed its effectiveness immediately.
> > That is why I decided to focus on 'compression.level' in this rework.
> >
> > As you can see in the updated KIP with the benchmark, IMHO, the true value
> > of supporting the compression option may not be the compressed size or
> > rate, but speed. Tweaking the compression level slightly produced a large
> > gain in produce performance.
> >
> > Thanks,
> > Dongjin
> >
> >
> > On Sun, Jun 6, 2021 at 6:48 AM Ismael Juma <ism...@juma.me.uk> wrote:
> >
> >> Thanks Dongjin. So you're saying that reducing the buffer size didn't
> >> reduce the compression rate for codecs like lz4? If so, that would suggest
> >> reducing the default value, but that seems odd.
> >>
> >> Ismael
> >>
> >> On Sat, Jun 5, 2021, 9:25 AM Dongjin Lee <dong...@apache.org> wrote:
> >>
> >> > Hello Kafka dev,
> >> >
> >> > I hope to reboot the discussion of KIP-390: Support Compression Level
> >> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level>.
> >> > It proposes to add a new option, 'compression.level', that controls the
> >> > compression level.
> >> >
> >> > This KIP was submitted more than a year ago but has been neglected
> >> > for a long time. Recently I reworked it from scratch with the following
> >> > differences:
> >> >
> >> > 1. Tested how it works with a real-world dataset. As you can see in the
> >> > updated KIP, *this feature can improve the producer's message/second rate
> >> > by more than 50%*, a significant enhancement.
> >> > 2. Dropped the 'compression.buffer.size' option that was in the initial
> >> > work. With the repeated benchmarks, I could not find any evidence that
> >> > this option results in meaningful differences, so I removed it.
> >> >
> >> > All feedback will be highly appreciated.
> >> >
> >> > Best,
> >> > Dongjin
> >> >
> >> >
> >> >
> >>
> >
> >
> >
>
>
> --
> *Dongjin Lee*
>
> *A hitchhiker in the mathematical world.*
>
>
>
> github: https://github.com/dongjinleekr
> keybase: https://keybase.io/dongjinleekr
> linkedin: https://kr.linkedin.com/in/dongjinleekr
> speakerdeck: https://speakerdeck.com/dongjin
>
