Btw, I am OK with doing compression level first, but I don't want to rule out the buffer size change without understanding it better.

Ismael

On Tue, Jun 8, 2021 at 8:33 AM Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Dongjin,
>
> I was thinking of a simple test: Snappy with 1 KB block size vs 32 KB
> block size. If the compression rate is similar for both, then it seems very
> wasteful to use 32 KB. I suspect you will see a significant difference
> though.
>
> Ismael
>
> On Tue, Jun 8, 2021 at 8:27 AM Dongjin Lee <dong...@apache.org> wrote:
>
>> Hi Ismael,
>>
>> I added the linear write benchmark result to the proposal. Like the
>> producer benchmark, the least compression level showed the best MB/sec for
>> any case. I tested several configurations, but the result was almost the
>> same.
>>
>> If you have any proposals for the benchmark, don't hesitate to give me a
>> suggestion. I am a newbie to run the linear write benchmark.
>>
>> Best,
>> Dongjin
>>
>> On Sun, Jun 6, 2021 at 8:20 AM Dongjin Lee <dong...@apache.org> wrote:
>>
>> > Hi Ismael,
>> >
>> > Thanks for the reply.
>> >
>> > > So you're saying that reducing the buffer size didn't reduce the
>> > > compression rate for codecs like lz4?
>> >
>> > Of course, there were some improvements in compressed size when I tried
>> > the 'buffer.size' option, but the gain was not significant. I tried
>> > several datasets, but the result was the same. It made me so skeptical
>> > about adding this option, which seemed to make the configuration option
>> > complex only.
>> >
>> > In contrast, 'compression.level' showed its effectiveness immediately.
>> > It is why I decided to focus on the 'compression.level' in this rework.
>> >
>> > As you can see in the update KIP with the benchmark, IMHO, the true
>> > value of supporting the compression option may not be the compressed
>> > size or rate, but speed. By tweaking the compression level slightly, it
>> > showed great produce performance gain.
>> >
>> > Thanks,
>> > Dongjin
>> >
>> > On Sun, Jun 6, 2021 at 6:48 AM Ismael Juma <ism...@juma.me.uk> wrote:
>> >
>> >> Thanks Dongjin. So you're saying that reducing the buffer size didn't
>> >> reduce the compression rate for codecs like lz4? If so, that would
>> >> suggest reducing the default value, but that seems odd.
>> >>
>> >> Ismael
>> >>
>> >> On Sat, Jun 5, 2021, 9:25 AM Dongjin Lee <dong...@apache.org> wrote:
>> >>
>> >> > Hello Kafka dev,
>> >> >
>> >> > I hope to reboot the discussion of KIP-390: Support Compression Level
>> >> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level>.
>> >> > It proposes to add a new option, 'compression.level', that controls
>> >> > the compression level.
>> >> >
>> >> > This KIP has been submitted more than one year ago, but had been
>> >> > neglected for a long time. Recently I reworked it from scratch with
>> >> > the following differences:
>> >> >
>> >> > 1. Tested how it works with a real-world dataset. As you can see in
>> >> > the updated KIP, *this feature can improve the producer's
>> >> > message/second rate by more than 50%*, such a significant enhancement.
>> >> > 2. Dropped 'compression.buffer.size' option that was in the initial
>> >> > work. With the repeated benchmarks, I could not find any evidence this
>> >> > option results in meaningful differences. So I removed it.
>> >> >
>> >> > All feedback will be highly appreciated.
>> >> >
>> >> > Best,
>> >> > Dongjin
>> >> >
>> >> > --
>> >> > *Dongjin Lee*
>> >> >
>> >> > *A hitchhiker in the mathematical world.*
>> >> >
>> >> > github: github.com/dongjinleekr <https://github.com/dongjinleekr>
>> >> > keybase: https://keybase.io/dongjinleekr
>> >> > linkedin: kr.linkedin.com/in/dongjinleekr <https://kr.linkedin.com/in/dongjinleekr>
>> >> > speakerdeck: speakerdeck.com/dongjin <https://speakerdeck.com/dongjin>
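
For reference, the block-size test suggested in this thread (1 KB vs 32 KB) could be sketched roughly as below. This is only an illustration under stated assumptions: it uses Python's standard-library zlib as a stand-in because Snappy is not in the standard library, the sample data is synthetic rather than a real-world dataset, and the helper names (blockwise_compressed_size, compression_ratio, make_sample) are hypothetical. It does not reproduce Kafka's actual buffering code, and the numbers it prints say nothing about Snappy or lz4 themselves.

```python
import zlib


def blockwise_compressed_size(data: bytes, block_size: int, level: int = 6) -> int:
    """Compress each block independently and return the total compressed size.

    Compressing blocks independently mimics a codec that resets its state at
    every block boundary, so smaller blocks see less context.
    """
    total = 0
    for start in range(0, len(data), block_size):
        total += len(zlib.compress(data[start:start + block_size], level))
    return total


def compression_ratio(data: bytes, block_size: int, level: int = 6) -> float:
    """Return uncompressed size divided by total block-wise compressed size."""
    return len(data) / blockwise_compressed_size(data, block_size, level)


def make_sample(n_records: int = 5000) -> bytes:
    """Build synthetic JSON-like records (an assumption, not a real dataset)."""
    records = [
        b'{"user_id": %d, "event": "page_view", "path": "/products/%d"}\n'
        % (i % 1000, i % 37)
        for i in range(n_records)
    ]
    return b"".join(records)


if __name__ == "__main__":
    sample = make_sample()
    for block_size in (1024, 32 * 1024):
        ratio = compression_ratio(sample, block_size)
        print(f"block size {block_size:>6} bytes: ratio {ratio:.2f}")
```

If the two ratios come out close on a representative dataset, the larger buffer buys little; if they differ a lot, the buffer size matters for that data. The level parameter is exposed so the same harness can be rerun at different compression levels.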