Re: Use of posix_fadvise

2016-10-18 Thread Edward Capriolo
I want to point out something: https://issues.apache.org/jira/browse/CASSANDRA-6846 "I'm definitively -1 on putting any type of contract on the internals. They are called internals for a reason, and if rewriting it all entirely tomorrow is best for Cassandra, we should have the possibility to do

Re: Use of posix_fadvise

2016-10-18 Thread Nate McCall
Benedict mentioned it briefly above, but the earliest / most detailed conversation on this can be found in CASSANDRA-1470. You may also find some stuff from the dev list and other issues around the same time from specifically from Chris G. and Peter S. (names on that ticket) as, IIRC, they were bo

Re: Use of posix_fadvise

2016-10-18 Thread Benedict Elliott Smith
I'm not certain this is the best way to go about encouraging people to help you, or generally encourage participation in the project. You have seemed to lash out at the project (and in this case me specifically) in a fairly antagonistic manner a multitude of times in just a couple of hours. Your

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Sorry, No. Always document your assumptions. I shouldn't need to git blame a thousand commits and read thru a billion tickets to maybe understand why something was done. Clearly thru the conversations on this topic I've had on IRC and the responses so far on this email thread it's not/still not

Re: Use of posix_fadvise

2016-10-18 Thread Benedict Elliott Smith
This is what JIRA is for. It seems to date back to CASSANDRA-1470, where the default became immediately evicting newly compacted files. This results in cold reads for *hot* data after compaction, so CASSANDRA-6916 permitted evicting the *old* data instead, while guaranteeing >= the same amount of

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Yeah, it has been there for years -- that being said most of the community is just catching up to 2.1 and 3.0 now where the usage did appear to change over 2.0-- and I'm more trying to figure out what the intent was in the various usages all over the codebase and make sure it's actually doing th

Re: Use of posix_fadvise

2016-10-18 Thread Jake Luciani
Although given we have an in process page cache[1] now this may not be needed anymore? This is only for the data file though. I think its been years? since we showed it helped so perhaps someone should show if this is still working/helping in the real world. [1] https://issues.apache.org/jira/bro

Re: Use of posix_fadvise

2016-10-18 Thread Ariel Weisberg
Hi, Compaction can merge some very large files together with data that may be completely cold. So yeah caching the whole file just creates pressure to evict useful stuff. In some theories. In other theories the page cache is flush and scan resistant and should just eat this stuff up without inter

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Within a single SegmentedFile? On Oct 18, 2016, at 9:02 AM, Ariel Weisberg mailto:ariel.weisb...@datastax.com>> wrote: With compaction there can be hot and cold data mixed together.

Re: Use of posix_fadvise

2016-10-18 Thread Ariel Weisberg
Hi, With compaction there can be hot and cold data mixed together. So we want to drop the data and then warm it via early opening so only the hot data is in the cache. Some of those cases are for the old sstable that have been rewritten or discarded so the data is entirely defunct. The files migh

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Specifically regarding the behavior in different kernels, from `man posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then this was interpreted literally as "zero bytes", rather than as meaning "all bytes through to the end of the file"." On Oct 18, 2016, at 8:57 AM, Michael

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Right, so in SSTableReader#GlobalTidy$tidy it does: // don't ideally want to dropPageCache for the file until all instances have been released CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0); CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0); It seems to me ever

Re: Use of posix_fadvise

2016-10-18 Thread Jake Luciani
The main point is to avoid keeping things in the page cache that are no longer needed like compacted data that has been early opened elsewhere. On Oct 18, 2016 11:29 AM, "Michael Kjellman" wrote: > We use posix_fadvise in a bunch of places, and in stereotypical Cassandra > fashion no comments we

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Sure -- my bad, I aggregated them all of them up for you: https://github.com/apache/cassandra/search?utf8=✓&q=CLibrary.trySkipCache&type=Code https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f5

Re: Use of posix_fadvise

2016-10-18 Thread Benedict Elliott Smith
... and continuing in the fashion of behaviours one might like to disabuse people of, no code link is provided. On 18 October 2016 at 16:28, Michael Kjellman wrote: > We use posix_fadvise in a bunch of places, and in stereotypical Cassandra > fashion no comments were provided. > > There is a c