[ https://issues.apache.org/jira/browse/CASSANDRA-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brandon Williams updated CASSANDRA-2062:
----------------------------------------
Fix Version/s: 0.7.3 (was: 0.7.2)
> Better control of iterator consumption
> --------------------------------------
>
> Key: CASSANDRA-2062
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2062
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Stu Hood
> Priority: Minor
> Fix For: 0.7.3
>
> Attachments: 0001-Improved-iterator-for-merging-sorted-iterators.txt,
> 0002-Port-all-collating-consumers-to-MergeIterator.txt,
> 0003-A-ManyToOne-merge-iterator-implementation.txt,
> 0004-Port-most-ReducingIterator-consumers-to-ManyToOne.txt
>
>
> The core reason for this ticket is to gain control over the consumption of
> the lazy nested iterators in the read path.
> {quote}We survive now because we write the size of the row at the front of
> the row (via some serious acrobatics at write time), which gives us hasNext()
> for rows for free. But it became apparent while working on the block-based
> format that hasNext() will not be cheap unless the current item has been
> consumed. "Consumption" of the row is easy, and blocks will be framed so that
> they can be very easily skipped, but you don't want to have to seek to the
> end of the row to answer hasNext, and then seek back to the beginning to
> consume the row, which is what CollatingIterator would have forced us to
> do.{quote}
> While we're at it, we can also improve efficiency: for {{M}} iterators
> containing {{N}} total items, commons.collections.CollatingIterator performs
> an {{O(M*N)}} merge, and calls hasNext multiple times per returned value. We
> can do better.
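
To make the two goals in the description concrete, here is a minimal sketch of a heap-based merge of sorted iterators. It is not the MergeIterator from the attached patches; the class name HeapMergeSketch, the nested Candidate type, and the mergeAll helper are hypothetical names chosen for illustration. The sketch assumes each source is already sorted and that items are Comparable. It keeps one candidate per source in a java.util.PriorityQueue, so merging N items from M sources costs roughly O(N log M) comparisons, and it asks a source for its next item only after the item previously taken from that source has been returned to the caller.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical sketch of a heap-based merge; not the patch's MergeIterator. */
final class HeapMergeSketch<T extends Comparable<T>> implements Iterator<T> {

    /** One sorted source plus the item most recently pulled from it. */
    private static final class Candidate<T extends Comparable<T>>
            implements Comparable<Candidate<T>> {
        final Iterator<T> source;
        T head;

        Candidate(Iterator<T> source) { this.source = source; }

        /** Pulls the next item; returns false once the source is exhausted. */
        boolean advance() {
            if (!source.hasNext()) return false; // one hasNext call per pulled item
            head = source.next();
            return true;
        }

        @Override
        public int compareTo(Candidate<T> other) { return head.compareTo(other.head); }
    }

    private final PriorityQueue<Candidate<T>> heap = new PriorityQueue<>();

    // Candidate whose head was handed to the caller but whose source
    // has not been advanced yet.
    private Candidate<T> lastReturned;

    HeapMergeSketch(List<? extends Iterator<T>> sources) {
        for (Iterator<T> s : sources) {
            Candidate<T> c = new Candidate<>(s);
            if (c.advance()) heap.add(c); // empty sources never enter the heap
        }
    }

    /** Advances the source of the previously returned item, if any. */
    private void advanceLast() {
        if (lastReturned != null) {
            if (lastReturned.advance()) heap.add(lastReturned);
            lastReturned = null;
        }
    }

    @Override
    public boolean hasNext() {
        advanceLast();
        return !heap.isEmpty();
    }

    @Override
    public T next() {
        advanceLast();
        Candidate<T> c = heap.poll(); // smallest head across all sources
        if (c == null) throw new NoSuchElementException();
        lastReturned = c;
        return c.head;
    }

    /** Convenience helper (hypothetical): drains a merge into a list. */
    static <E extends Comparable<E>> List<E> mergeAll(List<? extends Iterator<E>> sources) {
        List<E> out = new ArrayList<>();
        Iterator<E> merged = new HeapMergeSketch<>(sources);
        while (merged.hasNext()) out.add(merged.next());
        return out;
    }
}
```

For example, merging the sorted sources [1, 4, 7], [2, 5] and an empty source with mergeAll yields [1, 2, 4, 5, 7]. The deferred advance in advanceLast() is the part that relates to the consumption concern quoted above: a source is not touched again until the caller has received its previous item.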
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira