Specifically regarding the behavior in different kernels, from `man
posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then this
was interpreted literally as "zero bytes", rather than as meaning "all bytes
through to the end of the file"."
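To make the kernel difference concrete, here is a minimal sketch of how a caller could choose the len argument so that "the whole file" means the same thing on both sides of 2.6.6. The class and method names (FadviseLength, treatsZeroLenLiterally, lengthArgument) are hypothetical and do not exist in Cassandra. The sketch assumes the kernel release string looks like "2.6.5" or "4.4.0-45-generic", and it conservatively treats an unparseable string as an old kernel.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: picks the len argument for a
// whole-file posix_fadvise call based on the kernel release string.
final class FadviseLength {
    // Matches the leading "major.minor[.patch]" of a kernel release string.
    private static final Pattern VERSION =
        Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    // True if the kernel is older than 2.6.6, where len == 0 meant "zero bytes".
    // Assumption: an unparseable version is treated as old, the safe choice.
    static boolean treatsZeroLenLiterally(String osVersion) {
        Matcher m = VERSION.matcher(osVersion == null ? "" : osVersion);
        if (!m.find()) {
            return true;
        }
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        if (major != 2) {
            return major < 2;
        }
        if (minor != 6) {
            return minor < 6;
        }
        return patch < 6;
    }

    // Returns the len to pass: the explicit file length on old kernels,
    // or 0 ("through to the end of the file") on 2.6.6 and later.
    static long lengthArgument(String osVersion, long fileLength) {
        return treatsZeroLenLiterally(osVersion) ? fileLength : 0L;
    }
}
```

A caller might pass System.getProperty("os.version") as the first argument, which on Linux normally reports the kernel release. A simpler alternative is to always pass the explicit file length and skip the version check entirely.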
On Oct 18, 2016, at 8:57 AM, Michael Kjellman
Right, so in SSTableReader#GlobalTidy$tidy it does:
// don't ideally want to dropPageCache for the file until all instances have
CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0);
CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0);
It seems to me every time the reference is released on a new sstable we would
immediately tidy() it and then call posix_fadvise with POSIX_FADV_DONTNEED with
an offset of 0 and a length of 0 (which I'm thinking is doing so with respect to
the API behavior in modern Linux kernel builds?). Am I reading things correctly
here? It's sorta hard to tell, as there are many different code paths through
which the reference could have tidy() called.
Why would we want to drop the segment we just wrote from the page cache --
wouldn't that most likely be the hottest data, and even if it turned out not
to be, wouldn't it be better in this case to have the kernel be smart at what it's
On Oct 18, 2016, at 8:50 AM, Jake Luciani
The main point is to avoid keeping things in the page cache that are no
longer needed, like compacted data that has been early opened elsewhere.
On Oct 18, 2016 11:29 AM, "Michael Kjellman"
We use posix_fadvise in a bunch of places, and in stereotypical Cassandra
fashion no comments were provided.
There is a check that the OS is Linux (okay, a start) but it turns out the
behavior of providing a length of 0 to posix_fadvise changed in some 2.6
kernels. We don't check the kernel version -- or even note it.
What is the *expected* outcome of our use of posix_fadvise -- not what
does it do or not do today -- but what problem was it added to solve and
what's the expected behavior regardless of kernel versions?