[
https://issues.apache.org/jira/browse/CASSANDRA-436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757147#action_12757147
]
Hudson commented on CASSANDRA-436:
----------------------------------
Integrated in Cassandra #201 (See
[http://hudson.zones.apache.org/hudson/job/Cassandra/201/])
Replace PriorityQueue mess with a CompactionIterator that efficiently
yields compacted Rows from a set of sstables by feeding CollationIterator into
a ReducingIterator transform. ("Efficiently" means we never deserialize data
until it is needed, so the number of sstables that can be compacted at once is
virtually unlimited, and if only one sstable contains a given key that row data
will be copied over without an intermediate de/serialize step.) This is a very
natural fit for the compaction algorithm and almost entirely gets rid of
duplicated code between doFileCompaction and doAntiCompaction.
patch by jbellis; reviewed by goffinet for
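
The idea in the commit message above can be illustrated with a minimal sketch. It is not the CompactionIterator from the patch: the names LazyCompactionSketch, RawRow, MergingIterator and the comma-joining "merge" are hypothetical, and each input is assumed to be sorted by key with no duplicate keys. The sketch buffers one still-serialized row per input, deserializes only when two or more inputs share a key, and passes a row through untouched when a single input holds the key.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class LazyCompactionSketch {
    /** Hypothetical stand-in for an sstable row: a key plus still-serialized bytes. */
    static final class RawRow {
        final String key;
        final byte[] serialized;
        RawRow(String key, byte[] serialized) { this.key = key; this.serialized = serialized; }
    }

    /** Counts deserializations so the laziness is observable. */
    static int deserializeCount = 0;

    static RawRow row(String key, String value) {
        return new RawRow(key, value.getBytes(StandardCharsets.UTF_8));
    }

    static String deserialize(byte[] bytes) {
        deserializeCount++;
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Merges key-sorted inputs and yields exactly one row per distinct key. */
    static final class MergingIterator implements Iterator<RawRow> {
        private final List<Iterator<RawRow>> sources;
        private final List<RawRow> heads = new ArrayList<>(); // one buffered row per source, null when exhausted

        MergingIterator(List<Iterator<RawRow>> sources) {
            this.sources = sources;
            for (Iterator<RawRow> s : sources) heads.add(s.hasNext() ? s.next() : null);
        }

        @Override public boolean hasNext() {
            for (RawRow h : heads) if (h != null) return true;
            return false;
        }

        @Override public RawRow next() {
            // Find the smallest key among the buffered heads (linear scan keeps the sketch short).
            String min = null;
            for (RawRow h : heads)
                if (h != null && (min == null || h.key.compareTo(min) < 0)) min = h.key;
            if (min == null) throw new NoSuchElementException();

            // Collect every head carrying that key and advance those sources.
            List<RawRow> group = new ArrayList<>();
            for (int i = 0; i < heads.size(); i++) {
                RawRow h = heads.get(i);
                if (h != null && h.key.equals(min)) {
                    group.add(h);
                    Iterator<RawRow> s = sources.get(i);
                    heads.set(i, s.hasNext() ? s.next() : null);
                }
            }
            return reduce(group);
        }
    }

    /** Reduces all versions of one key to a single row. */
    static RawRow reduce(List<RawRow> group) {
        if (group.size() == 1) return group.get(0); // only one input has the key: copy bytes as-is
        StringBuilder merged = new StringBuilder();
        for (RawRow r : group) {
            if (merged.length() > 0) merged.append(',');
            merged.append(deserialize(r.serialized)); // deserialize only when a real merge is needed
        }
        return row(group.get(0).key, merged.toString());
    }

    static List<RawRow> compact(List<Iterator<RawRow>> sources) {
        List<RawRow> out = new ArrayList<>();
        MergingIterator it = new MergingIterator(sources);
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        List<Iterator<RawRow>> sources = new ArrayList<>();
        sources.add(Arrays.asList(row("a", "1"), row("b", "2")).iterator());
        sources.add(Arrays.asList(row("b", "3"), row("c", "4")).iterator());
        for (RawRow r : compact(sources)) System.out.println(r.key);
        // Only the two "b" rows were deserialized; "a" and "c" were passed through.
        System.out.println("deserializations: " + deserializeCount);
    }
}
```

Memory use in this sketch grows with the number of inputs only by one buffered, still-serialized row each, which is the property the commit message describes as making the number of sstables "virtually unlimited".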
allow ReducingIterator to reduce from one type to a different one
patch by jbellis; reviewed by goffinet for
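
As a rough illustration of reducing "from one type to a different one", the sketch below uses two type parameters: In for the elements consumed and Out for the values produced. The class name ReducingSketch, its method names, and the run-length example are assumptions made for illustration and are not taken from the patch. The input is assumed to be ordered so that equal elements are adjacent.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Collapses each run of equal In elements into a single Out value. */
abstract class ReducingSketch<In, Out> implements Iterator<Out> {
    private final Iterator<In> source;
    private In pending;          // first element of the next run, if already read
    private boolean hasPending;

    ReducingSketch(Iterator<In> source) { this.source = source; }

    /** True if both elements belong to the same run. */
    protected abstract boolean isEqual(In a, In b);
    /** Folds one element into the accumulating result. */
    protected abstract void reduce(In current);
    /** Returns the result for the finished run and resets the accumulator. */
    protected abstract Out getReduced();

    @Override public boolean hasNext() { return hasPending || source.hasNext(); }

    @Override public Out next() {
        if (!hasNext()) throw new NoSuchElementException();
        In first = hasPending ? pending : source.next();
        hasPending = false;
        pending = null;
        reduce(first);
        while (source.hasNext()) {
            In candidate = source.next();
            if (!isEqual(first, candidate)) {
                // The candidate starts the next run; keep it for the following call.
                pending = candidate;
                hasPending = true;
                break;
            }
            reduce(candidate);
        }
        return getReduced();
    }
}

final class RunLengthDemo {
    /** Reduces Strings (input type) to Integers (output type): the length of each run. */
    static Iterator<Integer> runLengths(Iterator<String> sortedKeys) {
        return new ReducingSketch<String, Integer>(sortedKeys) {
            private int count;
            @Override protected boolean isEqual(String a, String b) { return a.equals(b); }
            @Override protected void reduce(String current) { count++; }
            @Override protected Integer getReduced() { int c = count; count = 0; return c; }
        };
    }

    public static void main(String[] args) {
        Iterator<Integer> it = runLengths(Arrays.asList("a", "a", "b", "c", "c", "c").iterator());
        while (it.hasNext()) System.out.println(it.next()); // prints 2, 1, 3 on separate lines
    }
}
```

Separating the input and output types is what would let a compaction-style reducer consume per-sstable row handles and emit merged rows of another type.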
copy FileStruct to SSTableScanner and remove cruft. Migrate getKeyRange to new
scanner class.
patch by jbellis; reviewed by goffinet for
minor fixes
patch by jbellis; reviewed by goffinet for
> OOM during major compaction on many (hundreds) of sstables
> ----------------------------------------------------------
>
> Key: CASSANDRA-436
> URL: https://issues.apache.org/jira/browse/CASSANDRA-436
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jonathan Ellis
> Assignee: Jonathan Ellis
> Fix For: 0.5
>
> Attachments: 0001-CASSANDRA-436-minor-fixes.txt,
> 0001-CASSANDRA-436-minor-fixes.txt,
> 0002-copy-FileStruct-to-SSTableScanner-and-remove-cruft.-M.txt,
> 0002-copy-FileStruct-to-SSTableScanner-and-remove-cruft.-M.txt,
> 0003-allow-ReducingIterator-to-reduce-from-one-type-to-a-di.txt,
> 0003-allow-ReducingIterator-to-reduce-from-one-type-to-a-di.txt,
> 0004-Replace-PriorityQueue-mess-with-a-CompactionIterator-t.txt,
> 0004-Replace-PriorityQueue-mess-with-a-CompactionIterator-t.txt
>
>
> Compaction deserializes rows before they are needed, one per sstable. If we
> deserialized only on demand, the current algorithm would be fine on nearly
> arbitrarily large numbers of sstables. (This is only important because it is
> useful to disable compactions during bulk load.)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.