Thank you Jon, great article as usual!

One topic discussed in the article is the filesystem cache, which Cassandra 
traditionally relies on for data caching (row caching is disabled by 
default).

IIRC mmap() is used.
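
To make the mmap() point concrete, here is a minimal sketch of a memory-mapped read on the JVM. It is not Cassandra's actual code; the class name MmapReadSketch and the method readViaMmap are hypothetical and only illustrate the idea that the read is served through a mapping backed by the kernel page cache rather than a cache owned by the application.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not taken from the Cassandra code base.
public class MmapReadSketch {

    // Reads `length` bytes starting at `offset` through a read-only mapping.
    // The kernel decides which pages stay cached; the application keeps no cache.
    static byte[] readViaMmap(Path file, long offset, int length) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            byte[] out = new byte[length];
            mapped.get(out); // page faults here are satisfied from the page cache or disk
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-sketch", ".bin");
        try {
            Files.write(tmp, "hello page cache".getBytes(StandardCharsets.US_ASCII));
            byte[] bytes = readViaMmap(tmp, 6, 4);
            System.out.println(new String(bytes, StandardCharsets.US_ASCII));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With this style of access the application never chooses what to evict, which is the trade-off against a database-managed cache.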

Some RDBMSs and NoSQL databases instead use direct I/O plus async I/O and 
maintain their own DB cache rather than a kernel-managed one, which improves 
overall performance.

As Cassandra is designed to be a DB with low response times, the 
DIO/AIO/DB cache approach seems like it would be a really useful feature.

Just out of curiosity, are there reasons why this advanced I/O stack wasn't 
implemented, other than a lack of resources to do it?


Regards,

Kyrill

________________________________
From: Eric Plowe <eric.pl...@gmail.com>
Sent: Wednesday, August 8, 2018 9:39:44 PM
To: user@cassandra.apache.org
Subject: Re: Compression Tuning Tutorial

Great post, Jonathan! Thank you very much.

~Eric

On Wed, Aug 8, 2018 at 2:34 PM Jonathan Haddad <j...@jonhaddad.com> wrote:
Hey folks,

Over the years we've noticed that people usually create tables with the 
default compression parameters, and we have spent a lot of time helping teams 
figure out the right settings for their cluster based on their workload.  I 
finally managed to write some thoughts down, along with a high-level breakdown 
of how the internals function, which should help people pick better settings 
for their cluster.

This post focuses on a mixed 50:50 read:write workload, but the same 
conclusions hold for a read-heavy workload.  Hopefully this helps some 
folks get better performance / save some money on hardware!

http://thelastpickle.com/blog/2018/08/08/compression_performance.html


--
Jon Haddad
Principal Consultant, The Last Pickle
