[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
T Jake Luciani updated CASSANDRA-1902:
--------------------------------------
Attachment: 1902_v1.txt
Attaching patch and benchmark test.
The approach I settled on is to track contiguous pages and record this
information per key.
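As a rough illustration of this bookkeeping (a sketch only, not the code in 1902_v1.txt), the offsets of cached pages can be collapsed into contiguous runs, which could then be stored per key. The name contiguous_runs and the 4 KiB page size are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed page size; a real implementation would ask the OS


def contiguous_runs(page_offsets, page_size=PAGE_SIZE):
    """Collapse page-aligned byte offsets into (start, length) runs."""
    runs = []
    for offset in sorted(set(page_offsets)):
        if runs and runs[-1][0] + runs[-1][1] == offset:
            # This page starts exactly where the previous run ends: extend it.
            start, length = runs[-1]
            runs[-1] = (start, length + page_size)
        else:
            # Gap before this page: start a new run.
            runs.append((offset, page_size))
    return runs
```

Storing runs rather than individual pages keeps the per-key record small when a wide row spans many adjacent pages.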
This works for reads: in the last step especially, we see a significant
improvement from migrating cached pages. However, it comes at the cost of write
performance, since we are increasing the number of TLB flushes. According to
the kernel mailing lists, posix_fadvise should be non-blocking, but it can
block sporadically when writing FS metadata.
I think the safest thing is to expose BRAF.MAX_BYTES_IN_PAGE_CACHE, since
it directly affects the read vs. write I/O tradeoff.
I can run some tests with different values so we can get an idea of what effect
this has.
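To make the tradeoff concrete, here is a minimal sketch of a writer that drops its own pages from the page cache after a configurable number of bytes. It assumes that MAX_BYTES_IN_PAGE_CACHE plays roughly this role in BRAF, which this message does not confirm. The function name write_with_cache_limit and its parameters are hypothetical.

```python
import os


def write_with_cache_limit(path, chunks, max_bytes_in_page_cache):
    """Write chunks to path; each time about max_bytes_in_page_cache bytes
    have accumulated, sync and advise the kernel to drop the file's pages.
    Returns the number of times that threshold was reached."""
    drops = 0
    since_drop = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            os.write(fd, chunk)
            since_drop += len(chunk)
            if since_drop >= max_bytes_in_page_cache:
                os.fsync(fd)  # dirty pages are not dropped until written back
                if hasattr(os, "posix_fadvise"):
                    try:
                        # offset 0, length 0 covers the whole file
                        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
                    except OSError:
                        pass  # the call is advisory; ignore failures here
                drops += 1
                since_drop = 0
    finally:
        os.close(fd)
    return drops
```

A smaller threshold means more frequent fsync and fadvise calls on the write path, leaving more of the page cache for reads. A larger one does the opposite.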
Could someone else run the benchmark below? The 0.7.1+ runs include this patch.
I'm not sure whether these results are specific to my testing environment.
{code}
#write 100k wide rows to CF1
python stress.py -S 3000 -n 100000 -k
0.7.0: 73
0.7.1: 92
0.7.1+: 98
#flush the keyspace wait for compaction to finish
#read from CF1 till in page cache
python stress.py -n 100000 -k -o read
0.7.0: 98, 16
0.7.1: 104, 14
0.7.1+: 69, 14
#write 100k wide rows to CF2
python stress.py -S 3000 -n 100000 -k -y super
0.7.0: 83
0.7.1: 90
0.7.1+: 98
#wait for compaction to finish
#read from CF1 till in page cache
python stress.py -n 100000 -k -o read
0.7.0: 262, 53, 23
0.7.1: 41, 14
0.7.1+: 119, 40, 14
#overwrite 100k wide rows to CF1
python stress.py -S 3000 -n 100000 -k
0.7.0: 92
0.7.1: 104
0.7.1+: 99
#perform major compaction on CF1
#read from CF1 till in page cache
python stress.py -n 100000 -k -o read
0.7.0: 173, 29
0.7.1: 186, 30
0.7.1+: 80, 15
{code}
> Migrate cached pages during compaction
> ---------------------------------------
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.7.1
> Reporter: T Jake Luciani
> Assignee: T Jake Luciani
> Fix For: 0.7.2
>
> Attachments: 1902_v1.txt
>
> Original Estimate: 32h
> Remaining Estimate: 32h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a
> pre-compacted CF during the compaction process.
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that
> uses the posix mincore() function to detect the offsets of pages for this
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS
> cache and make sure the subsequent pages in the new compacted SSTable are
> kept in the page cache for these keys. This will minimize the impact of
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here:
> http://insights.oetiker.ch/linux/fadvise/
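The key-detection step in the description can be sketched as follows. The sketch takes the cached page offsets as given, standing in for the result of pagesInPageCache(); the mincore() call itself is omitted. The function name active_keys and the key_ranges index (key mapped to an offset and length in the data file) are hypothetical. None of this is taken from the patch.

```python
def active_keys(key_ranges, cached_page_offsets, page_size=4096):
    """Return the keys whose on-disk byte range overlaps a cached page.

    key_ranges: dict of key -> (offset, length) in the data file, length > 0.
    cached_page_offsets: page-aligned byte offsets reported as resident.
    """
    cached = {offset // page_size for offset in cached_page_offsets}
    hot = []
    for key, (offset, length) in key_ranges.items():
        first = offset // page_size
        last = (offset + length - 1) // page_size
        # A key counts as active if any page it touches is resident.
        if any(page in cached for page in range(first, last + 1)):
            hot.append(key)
    return hot
```

For example, with key_ranges = {"a": (0, 100), "b": (5000, 4000)} and cached offsets [4096, 8192], only "b" is reported, since "a" lies entirely in the uncached first page.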