[jira] Updated: (CASSANDRA-1902) Migrate cached pages during compaction

T Jake Luciani (JIRA) Mon, 24 Jan 2011 19:32:32 -0800

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


T Jake Luciani updated CASSANDRA-1902:
--------------------------------------

    Attachment: 1902_v1.txt

Attaching patch and benchmark test.

The approach I decided on was to track runs of contiguous cached pages and 
record that information per key.
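
To make the per-key tracking concrete, here is a minimal sketch of the idea and not the code in the attached patch. It assumes we already have the indices of the pages resident in the page cache (for example from mincore()) and a byte range for each key; PAGE_SIZE, contiguous_runs and cached_runs_per_key are hypothetical names used only for illustration.

```python
PAGE_SIZE = 4096  # assumed page size; the real value comes from the OS


def contiguous_runs(pages):
    """Collapse page indices into sorted, inclusive (first, last) runs."""
    runs = []
    for page in sorted(set(pages)):
        if runs and page == runs[-1][1] + 1:
            # Page directly follows the current run, so extend it.
            runs[-1] = (runs[-1][0], page)
        else:
            runs.append((page, page))
    return runs


def cached_runs_per_key(key_ranges, cached_pages, page_size=PAGE_SIZE):
    """Map each key to the runs of cached pages overlapping its bytes.

    key_ranges: dict of key -> (start_offset, end_offset), end exclusive.
    cached_pages: iterable of page indices reported as resident.
    Keys with no cached pages are left out of the result.
    """
    cached = set(cached_pages)
    result = {}
    for key, (start, end) in key_ranges.items():
        if end <= start:
            continue  # empty range, nothing to track
        first = start // page_size
        last = (end - 1) // page_size
        hot = [p for p in range(first, last + 1) if p in cached]
        if hot:
            result[key] = contiguous_runs(hot)
    return result
```

Keeping runs rather than individual pages means each key needs only a few (first, last) pairs, which is the kind of information compaction would need in order to decide which parts of the new file to keep warm.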

This is working for reads: in the results below (especially the last step) we 
can see a significant improvement from migrating cached pages.  However, this 
comes at the cost of write performance, since we are increasing the number of 
TLB flushes.  From reading the kernel mailing lists, posix_fadvise should be 
non-blocking, but it can block sporadically when writing FS metadata.

I think the safest thing is to expose BRAF.MAX_BYTES_IN_PAGE_CACHE, since 
it directly affects the read vs. write I/O tradeoff.
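
To illustrate why such a threshold trades write cost against read benefit, here is a rough sketch under my own assumptions; it is not BRAF's actual code. ThresholdWriter and default_advise are hypothetical names. The only real API used for the hint is Python's os.posix_fadvise with POSIX_FADV_DONTNEED, which exists only on some Unix platforms.

```python
import os


def default_advise(fd, offset, length):
    """Ask the kernel to drop a byte range from the page cache."""
    # os.posix_fadvise is not available on every platform.
    if hasattr(os, "posix_fadvise"):
        # Dirty pages are generally not dropped, so sync them first.
        os.fsync(fd)
        os.posix_fadvise(fd, offset, length, os.POSIX_FADV_DONTNEED)


class ThresholdWriter:
    """Hypothetical writer that issues a hint every max_bytes written."""

    def __init__(self, fd, max_bytes_in_page_cache, advise=default_advise):
        self.fd = fd
        self.max_bytes = max_bytes_in_page_cache
        self.advise = advise
        self.offset = 0      # total bytes written so far
        self.flushed_to = 0  # end of the last range we advised on

    def write(self, data):
        written = 0
        while written < len(data):
            # os.write may write fewer bytes than requested.
            written += os.write(self.fd, data[written:])
        self.offset += len(data)
        pending = self.offset - self.flushed_to
        if pending >= self.max_bytes:
            self.advise(self.fd, self.flushed_to, pending)
            self.flushed_to = self.offset
```

A smaller threshold means more frequent sync and advise calls on the write path, while a larger one lets more freshly written data sit in the page cache at the expense of pages that readers may want.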

I can run some tests with different values so we can get an idea of what effect 
this has.

Could someone else run the benchmark below?  The 0.7.1+ runs include this 
patch.  I'm not sure whether these results are limited to my testing 
environment.

{code}
#write 100k wide rows to CF1
python stress.py -S 3000 -n 100000 -k
0.7.0: 73
0.7.1: 92
0.7.1+: 98

#flush the keyspace and wait for compaction to finish

#read from CF1 till in page cache
python stress.py -n 100000 -k -o read
0.7.0: 98, 16
0.7.1: 104, 14
0.7.1+: 69, 14 

#write 100k wide rows to CF2
python stress.py -S 3000 -n 100000 -k -y super
0.7.0: 83
0.7.1: 90
0.7.1+: 98

#wait for compaction to finish

#read from CF1 till in page cache
python stress.py -n 100000 -k -o read
0.7.0: 262, 53, 23
0.7.1: 41, 14
0.7.1+: 119, 40, 14


#overwrite 100k wide rows to CF1
python stress.py -S 3000 -n 100000 -k
0.7.0: 92
0.7.1: 104
0.7.1+: 99

#perform major compaction on CF1


#read from CF1 till in page cache
python stress.py -n 100000 -k -o read
0.7.0: 173, 29
0.7.1: 186, 30
0.7.1+: 80, 15
{code}

> Migrate cached pages during compaction 
> ---------------------------------------
>
>                 Key: CASSANDRA-1902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>             Fix For: 0.7.2
>
>         Attachments: 1902_v1.txt
>
>   Original Estimate: 32h
>  Remaining Estimate: 32h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS 
> cache and make sure the subsequent pages in the newly compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
