[
https://issues.apache.org/jira/browse/CASSANDRA-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stu Hood updated CASSANDRA-2062:
--------------------------------
Attachment: 0005-Remove-temporary-instrumentation-and-CollatingIterator.txt
0004-Port-LazilyCompactedRow-ReducingKeyIterator-RangeSlice.txt
* 0001 - Adds MergeIterator, which uses a priority queue to perform a
{{O(2*log(M)*N)}} merge in the same space. Also guarantees to only call hasNext
once per item.
* 0002 - Adds some debug output for checking the number of comparisons
performed during compaction
* 0003 - Replaces CollatingIterator for Compaction
* 0004 - Replaces CollatingIterator elsewhere
* 0005 - Removes debug output and CollatingIterator from FBUtilities
The number of comparisons needed to compact {{M=sstables=100}},
{{N=rows=80000}} drops from ~8 million ({{M*N}}) to ~1 million
({{2*log(M)*N}}), as expected.
{noformat}# MergeIterator
1 for org.apache.cassandra.io.CompactionIterator@7dc21ece
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=1
colsper=200000: 2758 ms
399999 for org.apache.cassandra.io.CompactionIterator@20ca5bff
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=200000
colsper=1: 4504 ms
917546 for org.apache.cassandra.io.CompactionIterator@6360f5bf
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=100 rowsper=800
colsper=5: 1021 ms
# CollatingIterator
1 for org.apache.cassandra.io.CompactionIterator@5f873eb2
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=1
colsper=200000: 1732 ms
399999 for org.apache.cassandra.io.CompactionIterator@357c7988
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=200000
colsper=1: 4499 ms
7915050 for org.apache.cassandra.io.CompactionIterator@2ae0420b
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=100 rowsper=800
colsper=5: 1427 ms{noformat}
> Use more efficient merge algorithm
> ----------------------------------
>
> Key: CASSANDRA-2062
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2062
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Stu Hood
> Priority: Minor
> Fix For: 0.7.2
>
> Attachments: 0001-Improved-iterator-for-merging-sorted-iterators.txt,
> 0002-Quickie-instrumentation-for-comparisons.txt,
> 0003-Replace-Collating-with-Merge-in-CompactionIterator.txt,
> 0004-Port-LazilyCompactedRow-ReducingKeyIterator-RangeSlice.txt,
> 0005-Remove-temporary-instrumentation-and-CollatingIterator.txt
>
>
> For {{M}} iterators containing {{N}} total items,
> commons.collections.CollatingIterator performs a {{O(M*N)}} merge, and calls
> hasNext multiple times per returned value. We can do better.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.