[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12995947#comment-12995947
 ] 

T Jake Luciani commented on CASSANDRA-1902:
-------------------------------------------

bq. Can you give an overview of what the patch does?

At a high level, this lessens the effect of sstable compaction and cleanup 
on reads. It does this by letting compaction figure out which rows in an 
SSTable are in the OS page cache and making sure that the compacted rows 
remain in the page cache. Remember that post CASSANDRA-1470 we are not putting 
anything in the page cache during compaction. 

bq. Where and how you "track contiguous pages per key," why that is the right 
solution

This is the right solution because the worst case is the same as the current 
code. It can really only help because it's just giving the OS hints; it's 
up to the OS to do with that info what it thinks is best.

The important piece is in CLibrary.getCachedPages(File file, int 
minContiguousPages)

This takes a file and mmaps it in 2G chunks, then uses the POSIX mincore() call 
to get a vector of which pages in the range are actually cached (for a totally 
unread file this is []). We use the starting offset + (page size * the index 
of each cached page) to return a vector of positions on disk. We use 
minContiguousPages to filter out the noise of cache fragments.
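
The offset arithmetic and the contiguity filter described above can be 
sketched in plain Java. This is only an illustration, not the code in the 
patch: the class and method names are hypothetical, the mmap and mincore() 
calls are left out, and it assumes the residency vector has one byte per page 
with the low bit set for a resident page (which is how mincore() reports it). 
Whether the patch returns every page of a run or only the start of each run 
is also an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not the patch's actual CLibrary code. */
final class CachedPageOffsets {
    /**
     * @param residency          one byte per page as filled in by mincore();
     *                           the low bit is set when the page is resident
     * @param chunkStart         file offset at which the mapped chunk begins
     * @param pageSize           OS page size in bytes (commonly 4096)
     * @param minContiguousPages resident runs shorter than this are ignored
     * @return file offsets of every page in a long-enough resident run
     */
    static long[] cachedOffsets(byte[] residency, long chunkStart,
                                int pageSize, int minContiguousPages) {
        List<Long> offsets = new ArrayList<>();
        int runStart = -1; // index of the first page of the current run, or -1
        // Iterate one past the end so a run touching the last page is closed.
        for (int i = 0; i <= residency.length; i++) {
            boolean resident = i < residency.length && (residency[i] & 1) != 0;
            if (resident) {
                if (runStart < 0) runStart = i;
            } else if (runStart >= 0) {
                if (i - runStart >= minContiguousPages) {
                    for (int p = runStart; p < i; p++) {
                        offsets.add(chunkStart + (long) p * pageSize);
                    }
                }
                runStart = -1;
            }
        }
        long[] result = new long[offsets.size()];
        for (int i = 0; i < result.length; i++) result[i] = offsets.get(i);
        return result;
    }
}
```

With minContiguousPages set to 1 nothing is filtered; larger values drop 
isolated cached pages that are unlikely to represent a hot row.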


Jump to SSTableScanner. Here we use the file positions from getCachedPages to 
figure out if a given row is considered "active". If it is, we set the 
isInPageCache flag on the SSTableIdentityIterator.
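
As a rough sketch of that check (again hypothetical names, not the scanner 
code in the patch), a row can be treated as active when any cached page 
overlaps its byte range. This assumes the cached positions are sorted and 
page-aligned, and that rowEnd is exclusive and greater than rowStart.

```java
import java.util.Arrays;

/** Hypothetical helper, not the patch's actual SSTableScanner logic. */
final class RowActivity {
    /**
     * @param cachedOffsets sorted, page-aligned file offsets of cached pages
     * @param pageSize      OS page size in bytes
     * @param rowStart      file offset of the first byte of the row
     * @param rowEnd        file offset one past the last byte of the row
     * @return true if at least one cached page overlaps [rowStart, rowEnd)
     */
    static boolean isActive(long[] cachedOffsets, int pageSize,
                            long rowStart, long rowEnd) {
        // The earliest page that can overlap the row is the one holding rowStart.
        long firstPage = (rowStart / pageSize) * pageSize;
        int idx = Arrays.binarySearch(cachedOffsets, firstPage);
        if (idx < 0) idx = -idx - 1; // insertion point: first offset > firstPage
        // Any cached page starting before rowEnd overlaps the row.
        return idx < cachedOffsets.length && cachedOffsets[idx] < rowEnd;
    }
}
```

The binary search keeps the per-row cost logarithmic in the number of cached 
pages, which matters when scanning a large SSTable.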


Jump to CompactionManager. If any part of a row has been flagged as active, 
then when we write the new SSTable we make sure this row's data is not forced 
out of the page cache (the default action from CASSANDRA-1470).

The two variables we probably should expose here are: 

1. BRAF.MAX_BYTES_IN_PAGE_CACHE - this says how many bytes I should let the 
page cache buffer before I force a flush of the OS cache for this file's 
working set (this is currently set to 128MB which, based on my testing, is a 
nice default)

2. SSTableScanner's call to getCachedPages uses a minContiguousPages setting of 
32.  Again this is a nice default I've found.


By increasing (1) you pollute your page cache more but slightly increase your 
write performance.
By increasing (2) you migrate fewer and fewer rows during compaction.
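
To make the trade-off in (1) concrete, here is a minimal sketch of a byte 
budget on the write path. It is not how BRAF is implemented: the class, its 
method, and the idea of simply not counting bytes for rows flagged as active 
are all assumptions for illustration, and the actual flush and cache-drop 
system calls are left to the caller.

```java
/** Hypothetical byte budget, loosely modelled on MAX_BYTES_IN_PAGE_CACHE. */
final class PageCacheBudget {
    private final long maxBytesInPageCache;
    private long pendingBytes; // bytes written since the last forced flush

    PageCacheBudget(long maxBytesInPageCache) {
        this.maxBytesInPageCache = maxBytesInPageCache;
    }

    /**
     * Records a write and reports whether the caller should now flush and
     * ask the OS to drop the pages written so far.
     *
     * @param bytes            number of bytes just written
     * @param rowIsInPageCache true if the source row was flagged as active
     */
    boolean recordWrite(long bytes, boolean rowIsInPageCache) {
        if (rowIsInPageCache) {
            return false; // assumption: hot rows stay cached and are not counted
        }
        pendingBytes += bytes;
        if (pendingBytes >= maxBytesInPageCache) {
            pendingBytes = 0;
            return true;
        }
        return false;
    }
}
```

A larger budget means fewer forced flushes (cheaper writes) at the cost of 
more cold compaction output sitting in the page cache between flushes.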





> Migrate cached pages during compaction 
> ---------------------------------------
>
>                 Key: CASSANDRA-1902
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.7.1
>            Reporter: T Jake Luciani
>            Assignee: T Jake Luciani
>             Fix For: 0.7.2
>
>         Attachments: 
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
>   Original Estimate: 32h
>          Time Spent: 24h
>  Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
> pre-compacted CF during the compaction process.  
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
> uses the posix mincore() function to detect the offsets of pages for this 
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS 
> cache and make sure the subsequent pages in the new compacted SSTable are 
> kept in the page cache for these keys. This will minimize the impact of 
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here: 
> http://insights.oetiker.ch/linux/fadvise/

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
