Hi Ismael,

I added the linear write benchmark results to the proposal. As in the
producer benchmark, the lowest compression level gave the best MB/sec in
every case. I tested several configurations, and the results were almost
the same.
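
For readers who want a feel for the level-versus-speed trade-off, here is a
minimal sketch. It is not the benchmark from the KIP: it uses Python's
standard zlib module (DEFLATE, the algorithm behind gzip) on a synthetic
payload, and the function name `measure_levels` and the sample data are
assumptions made up for illustration. Absolute numbers will differ from any
Kafka producer or linear write measurement.

```python
import time
import zlib


def measure_levels(payload: bytes, levels=(1, 6, 9), rounds: int = 5) -> dict:
    """Return {level: (mb_per_sec, compressed_size)} for zlib at each level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        for _ in range(rounds):
            compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        # Guard against a zero reading from a coarse timer.
        mb_per_sec = (len(payload) * rounds / 1e6) / max(elapsed, 1e-9)
        results[level] = (mb_per_sec, len(compressed))
    return results


if __name__ == "__main__":
    # Synthetic, repetitive payload; an assumption, not the KIP's dataset.
    sample = b'{"user": "alice", "action": "click", "ts": 1622937600}\n' * 20000
    for level, (speed, size) in measure_levels(sample).items():
        print(f"level={level}  {speed:8.1f} MB/s  {size} bytes")
```

Running the script prints one line per level, so the throughput and the
compressed size can be compared side by side on the local machine.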

If you have any suggestions for the benchmark, please don't hesitate to
share them. I am new to running the linear write benchmark.

Best,
Dongjin

On Sun, Jun 6, 2021 at 8:20 AM Dongjin Lee <dong...@apache.org> wrote:

> Hi Ismael,
>
> Thanks for the reply.
>
> > So you're saying that reducing the buffer size didn't reduce the
> > compression rate for codecs like lz4?
>
> Of course, there were some improvements in compressed size when I tried
> the 'buffer.size' option, but the gain was not significant. I tried several
> datasets, and the result was the same. That made me skeptical about adding
> this option, which seemed only to complicate the configuration.
>
> In contrast, 'compression.level' showed its effectiveness immediately.
> That is why I decided to focus on 'compression.level' in this rework.
>
> As you can see in the updated KIP with the benchmark, IMHO, the true value
> of supporting the compression option may not be the compressed size or
> rate, but speed. Tweaking the compression level slightly produced a large
> gain in produce performance.
>
> Thanks,
> Dongjin
>
>
> On Sun, Jun 6, 2021 at 6:48 AM Ismael Juma <ism...@juma.me.uk> wrote:
>
>> Thanks Dongjin. So you're saying that reducing the buffer size didn't
>> reduce the compression rate for codecs like lz4? If so, that would suggest
>> reducing the default value, but that seems odd.
>>
>> Ismael
>>
>> On Sat, Jun 5, 2021, 9:25 AM Dongjin Lee <dong...@apache.org> wrote:
>>
>> > Hello Kafka dev,
>> >
>> > I hope to reboot the discussion of KIP-390: Support Compression Level
>> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level>.
>> > It proposes to add a new option, 'compression.level', that controls the
>> > compression level.
>> >
>> > This KIP was submitted more than a year ago but had been neglected
>> > for a long time. Recently I reworked it from scratch with the following
>> > differences:
>> >
>> > 1. Tested how it works with a real-world dataset. As you can see in the
>> > updated KIP, *this feature can improve the producer's message/second rate
>> > by more than 50%*, a significant improvement.
>> > 2. Dropped the 'compression.buffer.size' option that was in the initial
>> > work. In repeated benchmarks, I could not find any evidence that this
>> > option makes a meaningful difference, so I removed it.
>> >
>> > All feedback will be highly appreciated.
>> >
>> > Best,
>> > Dongjin
>>
>
>


-- 
*Dongjin Lee*

*A hitchhiker in the mathematical world.*



github: https://github.com/dongjinleekr
keybase: https://keybase.io/dongjinleekr
linkedin: https://kr.linkedin.com/in/dongjinleekr
speakerdeck: https://speakerdeck.com/dongjin
