Yeah, it has been there for years. That said, most of the community is only
now catching up to 2.1 and 3.0, where the usage appears to have changed since
2.0, and I'm mostly trying to figure out what the intent was in the various
usages all over the codebase and make sure the code is actually doing that.
Maybe even add some comments about that intent. :)
In 2.1 I saw that in some cases we were getting the file descriptor like this
(which obviously returns the wrong file descriptor, so it most likely made
this even more of a potential no-op than it already was?):
public static int getfd(String path)
{
    RandomAccessFile file = null;
    try
    {
        file = new RandomAccessFile(path, "r");
        [...]
    }
    catch (Throwable t) { [...] }
    [...] if (file != null) [...]
    catch (Throwable t) { [...] }
}
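A minimal sketch of why a path-based helper like this is suspect, assuming (as the `if (file != null)` cleanup suggests, though those lines are not shown above) that the helper closes the file it just opened before returning. The class and method names here are hypothetical and not part of the codebase; the sketch only shows that a descriptor taken from a handle that has since been closed is no longer valid, while the handle the reader actually holds is a different one.

```java
import java.io.FileDescriptor;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

public class FdLifetimeSketch
{
    // Hypothetical stand-in for a path-based getfd(): opens its own handle,
    // grabs the descriptor, and closes the handle on the way out.
    public static FileDescriptor descriptorFromClosedFile(String path) throws IOException
    {
        try (RandomAccessFile file = new RandomAccessFile(path, "r"))
        {
            return file.getFD();
        }
    }

    public static void main(String[] args) throws IOException
    {
        Path tmp = Files.createTempFile("fd-sketch", ".db");
        // This handle plays the role of the one a reader really has open.
        try (RandomAccessFile reader = new RandomAccessFile(tmp.toFile(), "r"))
        {
            FileDescriptor fromHelper = descriptorFromClosedFile(tmp.toString());
            System.out.println("reader fd valid: " + reader.getFD().valid());
            System.out.println("helper fd valid: " + fromHelper.valid());
        }
        finally
        {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Any advice issued against the helper's descriptor would therefore target a handle that no longer exists, which is consistent with the call being a no-op (or worse, hitting an unrelated file if the descriptor number gets reused).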
On Oct 18, 2016, at 9:34 AM, Jake Luciani
Although given we have an in process page cache now this may not be
This is only for the data file though. I think it's been years since we
showed it helped, so perhaps someone should show whether this is still
working/helping in the real world.
On Tue, Oct 18, 2016 at 11:59 AM, Michael Kjellman <
Specifically regarding the behavior in different kernels, from `man
posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then
this was interpreted literally as "zero bytes", rather than as meaning "all
bytes through to the end of the file"."
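For illustration, a hedged sketch of what a kernel version check could look like in Java, based only on the 2.6.6 threshold quoted from the man page. The class and method names are hypothetical, and it assumes the `os.version` system property starts with a dotted numeric kernel version (e.g. `2.6.32-431.el6.x86_64`), which is typical on Linux but not guaranteed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FadviseKernelCheck
{
    // Leading "major.minor[.patch]" of a kernel version string.
    private static final Pattern VERSION = Pattern.compile("^(\\d+)\\.(\\d+)(?:\\.(\\d+))?");

    // Returns true if a Linux kernel with this version string is 2.6.6 or
    // newer, i.e. len == 0 should mean "through to the end of the file".
    // Unparseable strings yield false, the conservative answer.
    public static boolean zeroLenMeansToEof(String osVersion)
    {
        Matcher m = VERSION.matcher(osVersion);
        if (!m.find())
            return false;
        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));
        if (major != 2)
            return major > 2;
        if (minor != 6)
            return minor > 6;
        return patch >= 6;
    }

    public static void main(String[] args)
    {
        // Only meaningful when os.name reports Linux.
        boolean linux = System.getProperty("os.name").toLowerCase().contains("linux");
        String version = System.getProperty("os.version");
        System.out.println(linux && zeroLenMeansToEof(version));
    }
}
```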
On Oct 18, 2016, at 8:57 AM, Michael Kjellman <
Right, so in SSTableReader#GlobalTidy$tidy it does:
// don't ideally want to dropPageCache for the file until all instances
// have been released
CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0);
CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0);
It seems to me that every time the reference is released on a new sstable we
would immediately tidy() it and then call posix_fadvise with
POSIX_FADV_DONTNEED, an offset of 0, and a length of 0 (which I assume
relies on the API behavior in modern Linux kernel builds?). Am I reading
things correctly here? It's sorta hard to tell, as there are many different
code paths through which tidy() could be called on the reference.
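One way to sidestep the length-of-0 ambiguity entirely would be to pass an explicit byte count instead. This is a rough sketch under that assumption, not what the codebase does; `explicitLength` is a hypothetical helper, and it only computes the range that would then be handed to the native call.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitAdviceRange
{
    // Resolves a "len == 0 means to end of file" request into an explicit
    // byte count, so the result does not depend on how a given kernel
    // interprets a zero length. A result of 0 means there is nothing to
    // advise on and the native call can simply be skipped.
    public static long explicitLength(Path file, long offset, long len) throws IOException
    {
        if (len != 0)
            return len;
        return Math.max(0, Files.size(file) - offset);
    }

    public static void main(String[] args) throws IOException
    {
        Path tmp = Files.createTempFile("advice-range", ".db");
        try
        {
            Files.write(tmp, new byte[100]);
            System.out.println(explicitLength(tmp, 0, 0));
            System.out.println(explicitLength(tmp, 40, 0));
        }
        finally
        {
            Files.deleteIfExists(tmp);
        }
    }
}
```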
Why would we want to drop the segment we just wrote from the page cache --
wouldn't that most likely be the hottest data, and even if it turned out
not to be, wouldn't it be better in this case to let the kernel be smart at
what it's best at?
On Oct 18, 2016, at 8:50 AM, Jake Luciani
The main point is to avoid keeping things in the page cache that are no
longer needed, like compacted data that has been early opened elsewhere.
On Oct 18, 2016 11:29 AM, "Michael Kjellman"
We use posix_fadvise in a bunch of places, and in stereotypical Cassandra
fashion no comments were provided.
There is a check that the OS is Linux (okay, a start) but it turns out the
behavior of providing a length of 0 to posix_fadvise changed in some 2.6
kernels. We don't check the kernel version -- or even note it.
What is the *expected* outcome of our use of posix_fadvise? Not what it
does or doesn't do today, but what problem was it added to solve, and
what's the expected behavior regardless of kernel version?