This is what JIRA is for. It seems to date back to CASSANDRA-1470, where immediately evicting newly compacted files became the default.
This results in cold reads for *hot* data after compaction, so CASSANDRA-6916 permitted evicting the *old* data instead, while guaranteeing at least the same amount of eviction. Whether or not the original issue of cold compaction data was a pain point, I cannot attest, but I was assured (by whom, I do not recall) that it was. In its present form it is at least not harmful.

It was (and is) not a no-op:
http://riptano.github.io/cassandra_performance/graph_v3/graph.html?stats=stats.6916v3-preempive-open-compact.mixed.2.json&metric=op_rate&operation=mixed&smoothing=1&show_aggregates=true&xmin=0&xmax=545.6&ymin=0&ymax=114638.7

On 18 October 2016 at 17:42, Michael Kjellman <mkjell...@internalcircle.com> wrote:

> Yeah, it has been there for years -- that being said most of the community
> is just catching up to 2.1 and 3.0 now where the usage did appear to change
> over 2.0-- and I'm more trying to figure out what the intent was in the
> various usages all over the codebase and make sure it's actually doing
> that. Maybe even add some comments about that intent. :)
>
> In 2.1 I saw that we were doing this to get the file descriptor in some
> cases (which obviously will return the wrong file descriptor so most likely
> would have made this even more of a potential no-op than it already was?):
>
>     public static int getfd(String path)
>     {
>         RandomAccessFile file = null;
>         try
>         {
>             file = new RandomAccessFile(path, "r");
>             return getfd(file.getFD());
>         }
>         catch (Throwable t)
>         {
>             JVMStabilityInspector.inspectThrowable(t);
>             // ignore
>             return -1;
>         }
>         finally
>         {
>             try
>             {
>                 if (file != null)
>                     file.close();
>             }
>             catch (Throwable t)
>             {
>                 // ignore
>             }
>         }
>     }
>
>
> On Oct 18, 2016, at 9:34 AM, Jake Luciani <jak...@gmail.com> wrote:
>
> Although given we have an in process page cache[1] now this may not be
> needed anymore?
> This is only for the data file though. I think it's been years? since we
> showed it helped so perhaps someone should show if this is still
> working/helping in the real world.
>
> [1] https://issues.apache.org/jira/browse/CASSANDRA-5863
>
>
> On Tue, Oct 18, 2016 at 11:59 AM, Michael Kjellman
> <mkjell...@internalcircle.com> wrote:
>
> Specifically regarding the behavior in different kernels, from `man
> posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then
> this was interpreted literally as "zero bytes", rather than as meaning "all
> bytes through to the end of the file"."
>
> On Oct 18, 2016, at 8:57 AM, Michael Kjellman
> <mkjell...@internalcircle.com> wrote:
>
> Right, so in SSTableReader#GlobalTidy$tidy it does:
>
>     // don't ideally want to dropPageCache for the file until all instances have been released
>     CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0);
>     CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0);
>
> It seems to me every time the reference is released on a new sstable we
> would immediately tidy() it and then call posix_fadvise with
> POSIX_FADV_DONTNEED with an offset of 0 and a length of 0 (which I'm
> thinking is doing so in respect to the API behavior in modern Linux kernel
> builds?). Am I reading things correctly here? Sorta hard as there are many
> different code paths the reference could have tidy() called.
>
> Why would we want to drop the segment we just wrote from the page cache --
> wouldn't that most likely be the most hot data, and even if it turned out
> not to be wouldn't it be better in this case to have the kernel be smart at
> what it's best at?
>
> best,
> kjellman
>
> On Oct 18, 2016, at 8:50 AM, Jake Luciani <jak...@gmail.com> wrote:
>
> The main point is to avoid keeping things in the page cache that are no
> longer needed like compacted data that has been early opened elsewhere.
>
> On Oct 18, 2016 11:29 AM, "Michael Kjellman" <mkjell...@internalcircle.com>
> wrote:
>
> We use posix_fadvise in a bunch of places, and in stereotypical Cassandra
> fashion no comments were provided.
>
> There is a check the OS is Linux (okay, a start) but it turns out the
> behavior of providing a length of 0 to posix_fadvise changed in some 2.6
> kernels. We don't check the kernel version -- or even note it.
>
> What is the *expected* outcome of our use of posix_fadvise -- not what
> does it do or not do today -- but what problem was it added to solve and
> what's the expected behavior regardless of kernel versions.
>
> best,
> kjellman
>
> Sent from my iPhone
>
> --
> http://twitter.com/tjake
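
On the kernel version point raised in the quoted thread: a guard along the following lines would at least make the len == 0 assumption explicit. This is only a sketch under stated assumptions. The class and method names are hypothetical (nothing in this thread suggests such a check exists in CLibrary), the 2.6.6 threshold is taken from the man page text quoted above, and it assumes the JVM reports the kernel release through the os.version system property on Linux. Anything it cannot parse is treated conservatively as "do not rely on len == 0".

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: decides whether a length of 0
// passed to posix_fadvise can be relied on to mean "through to end of file".
final class FadviseKernelCheck
{
    // Leading "major.minor[.patch]" of a release string such as "3.10.0-514.el7.x86_64".
    private static final Pattern VERSION =
        Pattern.compile("^(\\d{1,5})\\.(\\d{1,5})(?:\\.(\\d{1,5}))?");

    // Returns true only for Linux kernels 2.6.6 or later, where (per the man page)
    // len == 0 covers all bytes to the end of the file. Non-Linux systems and
    // unparseable versions return false, since nothing is known about them here.
    static boolean lenZeroMeansToEndOfFile(String osName, String osVersion)
    {
        if (osName == null || osVersion == null)
            return false;
        if (!osName.toLowerCase(Locale.ROOT).contains("linux"))
            return false;

        Matcher m = VERSION.matcher(osVersion);
        if (!m.find())
            return false;

        int major = Integer.parseInt(m.group(1));
        int minor = Integer.parseInt(m.group(2));
        // A missing patch component (e.g. "4.4") is treated as patch level 0.
        int patch = m.group(3) == null ? 0 : Integer.parseInt(m.group(3));

        if (major != 2)
            return major > 2;
        if (minor != 6)
            return minor > 6;
        return patch >= 6;
    }

    public static void main(String[] args)
    {
        String name = System.getProperty("os.name");
        String version = System.getProperty("os.version");
        System.out.println(name + " " + version + ": len == 0 covers whole file = "
                           + lenZeroMeansToEndOfFile(name, version));
    }
}
```

A caller could consult this once at startup and then skip, or at least log, the trySkipCache(path, 0, 0) calls when it returns false, rather than silently issuing advice that covers zero bytes.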