[
https://issues.apache.org/jira/browse/CASSANDRA-6916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963147#comment-13963147
]
Benedict edited comment on CASSANDRA-6916 at 4/8/14 4:34 PM:
-------------------------------------------------------------
bq. I know intuitively it feels like it would be a big win, not only for the
page cache, but also since we get to only read one sstable for the compacted
keys, but still would be nice to have some evidence.
It would also be interesting to test this patch out against a constrained
memory scenario, and see how well it behaves, as it should occupy much less
memory at any given moment during a compaction. However, I think for the moment
confirming we get the expected outcome of fixing CASSANDRA-6746 is enough in my
book. I'll ask Ryan to run that test once I've addressed your points below.
{quote}
# Problem with fully expired sstables in CompactionTask, we need to replace
"toCompact" instead of "actuallyCompact".
# KeyCacheTest fails (not had time to look deeper into it)
# A few comments around the replaceReader/replaceCompacted... and related in
CompactionTask would be helpful, a bit harder to grasp now that we start using
the compacted sstables earlier and then call DataTracker.replace(...) with an
empty list.
# In MMappedSegmentedFile, line 199, it looks like that ternary expression is
not really needed since we add raf.length() in boundaries above?
# Preheat comment left in CompactionTask
{quote}
Good points, thanks. I'll check the failing test.
bq. In SSTR.cloneWithNewStart(..), CLibrary.trySkipCache(dfile.path, 0, 0); is
called when newStart > last, is that on purpose? I guess we could drop the page
cache for the whole file in this case?
That's the idea, yes. A parameter of 0 indicates to drop the whole file.
was (Author: benedict):
bq. I know intuitively it feels like it would be a big win, not only for the
page cache, but also since we get to only read one sstable for the compacted
keys, but still would be nice to have some evidence.
It would also be interesting to test this patch out against a constrained
memory scenario, and see how well it behaves, as it should occupy much less
memory at any given moment during a compaction. However, I think for the moment
confirming we get the expected outcome of fixing CASSANDRA-6746 is enough in my
book. I'll ask Russ to run that test once I've addressed your points below.
{quote}
# Problem with fully expired sstables in CompactionTask, we need to replace
"toCompact" instead of "actuallyCompact".
# KeyCacheTest fails (not had time to look deeper into it)
# A few comments around the replaceReader/replaceCompacted... and related in
CompactionTask would be helpful, a bit harder to grasp now that we start using
the compacted sstables earlier and then call DataTracker.replace(...) with an
empty list.
# In MMappedSegmentedFile, line 199, it looks like that ternary expression is
not really needed since we add raf.length() in boundaries above?
# Preheat comment left in CompactionTask
{quote}
Good points, thanks. I'll check the failing test.
bq. In SSTR.cloneWithNewStart(..), CLibrary.trySkipCache(dfile.path, 0, 0); is
called when newStart > last, is that on purpose? I guess we could drop the page
cache for the whole file in this case?
That's the idea, yes. A parameter of 0 indicates to drop the whole file.
> Preemptive opening of compaction result
> ---------------------------------------
>
> Key: CASSANDRA-6916
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6916
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Priority: Minor
> Fix For: 2.1
>
>
> Related to CASSANDRA-6812, but a little simpler: when compacting, we mess
> quite badly with the page cache. One thing we can do to mitigate this problem
> is to use the sstable we're writing before we've finished writing it, and to
> drop the regions from the old sstables from the page cache as soon as the new
> sstables have them (even if they're only written to the page cache). This
> should minimise any page cache churn, as the old sstables must be larger than
> the new sstable, and since both will be in memory, dropping the old sstables
> is at least as good as dropping the new.
> The approach is quite straight-forward. Every X MB written:
> # grab flushed length of index file;
> # grab second to last index summary record, after excluding those that point
> to positions after the flushed length;
> # open index file, and check that our last record doesn't occur outside of
> the flushed length of the data file (pretty unlikely)
> # Open the sstable with the calculated upper bound
> Some complications:
> # must keep running copy of compression metadata for reopening with
> # we need to be able to replace an sstable with itself but a different lower
> bound
> # we need to drop the old page cache only when readers have finished
--
This message was sent by Atlassian JIRA
(v6.2#6252)