[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996195#comment-12996195 ]
T Jake Luciani commented on CASSANDRA-1902:
-------------------------------------------
bq. After the file is done being written, we call getCachedPages() across all
sstables used in compaction and compute which pages are hot AFTER compaction is
complete. This would allow us to then sweep through the new SSTable written
and mark pages that were hot. If we do the process while the file is being written
and we have a compaction that might take an hour, by the time it's done, the
cache could churn.
I like it in theory. The only concern is that it's less efficient, since you
need to iterate through all the old sstables twice: once for compaction and once
to match the cached pages to the rows. Then you'd need to iterate through
the new sstable to find the new rows' locations.
For something like compaction, where we are trying to minimize IO, it might
not be worth it?
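To make the extra pass over the new sstable concrete, here is a rough sketch of that last step only. It is not taken from the attached patch: the class name HotPageSweepSketch, the method pagesToWarm(), the {offset, length} index layout, and the 4096-byte page size are all assumptions made for illustration. Given the keys that were hot in the old sstables, it walks the new sstable's index once and collects the page-aligned offsets that cover those rows.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from the Cassandra code base.
class HotPageSweepSketch {
    // Assumed page size; real code would ask the OS instead of hard-coding it.
    static final long PAGE_SIZE = 4096;

    // newIndex maps each key to {offset, length} of its row in the new sstable.
    // Returns the sorted, page-aligned offsets that cover every hot row.
    static TreeSet<Long> pagesToWarm(Set<String> hotKeys, Map<String, long[]> newIndex) {
        TreeSet<Long> pages = new TreeSet<>();
        for (Map.Entry<String, long[]> entry : newIndex.entrySet()) {
            if (!hotKeys.contains(entry.getKey())) {
                continue; // row was not hot before compaction
            }
            long start = entry.getValue()[0];
            long length = entry.getValue()[1];
            if (length <= 0) {
                continue; // nothing to warm for an empty row
            }
            long firstPage = (start / PAGE_SIZE) * PAGE_SIZE;
            long lastPage = ((start + length - 1) / PAGE_SIZE) * PAGE_SIZE;
            for (long page = firstPage; page <= lastPage; page += PAGE_SIZE) {
                pages.add(page);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        Set<String> hotKeys = new HashSet<>(Arrays.asList("a", "c"));
        Map<String, long[]> newIndex = new LinkedHashMap<>();
        newIndex.put("a", new long[] {4000L, 200L});  // straddles pages 0 and 4096
        newIndex.put("b", new long[] {4200L, 100L});  // not hot, skipped
        newIndex.put("c", new long[] {12288L, 10L});  // fits in page 12288
        System.out.println(pagesToWarm(hotKeys, newIndex)); // prints [0, 4096, 12288]
    }
}
```

The sweep itself only touches the index of the new sstable, so the cost I'm worried about is mostly the second pass over the old sstables needed to build the hot key set in the first place.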
> Migrate cached pages during compaction
> ---------------------------------------
>
> Key: CASSANDRA-1902
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.7.1
> Reporter: T Jake Luciani
> Assignee: T Jake Luciani
> Fix For: 0.7.3
>
> Attachments:
> 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt
>
> Original Estimate: 32h
> Time Spent: 24h
> Remaining Estimate: 8h
>
> Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a
> pre-compacted CF during the compaction process.
> First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that
> uses the posix mincore() function to detect the offsets of pages for this
> file currently in page cache.
> Then add getActiveKeys() which uses underlying pagesInPageCache() to get the
> keys actually in the page cache.
> Use getActiveKeys() to detect which SSTables being compacted are in the OS
> cache and make sure the subsequent pages in the new compacted SSTable are
> kept in the page cache for these keys. This will minimize the impact of
> compacting a "hot" SSTable.
> A simpler yet similar approach is described here:
> http://insights.oetiker.ch/linux/fadvise/
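As a rough illustration of the getActiveKeys() step described above, the sketch below assumes that pagesInPageCache() has already returned the byte offsets of resident pages (the mincore() call itself is not shown, since it needs native access). The class name ActiveKeysSketch, the key-to-offset map, and the 4096-byte page size are assumptions for illustration, not the patch's actual code. A key counts as active when the page holding the start of its row is resident.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real getActiveKeys() may look quite different.
class ActiveKeysSketch {
    // Assumed page size; real code would ask the OS instead of hard-coding it.
    static final long PAGE_SIZE = 4096;

    // residentPageOffsets: page-aligned byte offsets reported as resident,
    // standing in for the result of a mincore()-based pagesInPageCache().
    // keyOffsets: key -> byte offset of the row start in the data file.
    static List<String> getActiveKeys(long[] residentPageOffsets, Map<String, Long> keyOffsets) {
        long[] sorted = residentPageOffsets.clone();
        Arrays.sort(sorted); // binarySearch requires sorted input
        List<String> active = new ArrayList<>();
        for (Map.Entry<String, Long> entry : keyOffsets.entrySet()) {
            long pageStart = (entry.getValue() / PAGE_SIZE) * PAGE_SIZE;
            if (Arrays.binarySearch(sorted, pageStart) >= 0) {
                active.add(entry.getKey());
            }
        }
        return active;
    }

    public static void main(String[] args) {
        Map<String, Long> index = new LinkedHashMap<>();
        index.put("a", 100L);   // page 0, resident
        index.put("b", 5000L);  // page 4096, not resident
        index.put("c", 9000L);  // page 8192, resident
        long[] resident = {0L, 8192L};
        System.out.println(getActiveKeys(resident, index)); // prints [a, c]
    }
}
```

Checking only the first page of each row keeps the lookup cheap; a stricter variant could require every page of the row to be resident.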