[ https://issues.apache.org/jira/browse/CASSANDRA-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499404#comment-13499404 ]
Jonathan Ellis commented on CASSANDRA-4803:
-------------------------------------------
bq. what about virtual nodes in 1.2? Do we insist that split may not span more
than one contiguous token range?
That's kind of orthogonal to wrapping ranges per se -- you'll still only have a
single [virtual] node whose range wraps. So vnodes won't make that worse.
Moreover, you're still going to need two scans at the disk level since a
wrapping range won't be contiguous there. (Currently wrapping ranges are split
by StorageProxy.getRestrictedRanges but this may change for CASSANDRA-4858.)
Doing an extra Thrift or CQL query is negligible overhead compared to the
actual scan.
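To illustrate why a wrapping range needs two scans, here is a minimal sketch.
It is not Cassandra's actual code: it assumes tokens are plain longs, ranges are
(left, right], and Long.MIN_VALUE is a sentinel that no key hashes to. The names
WrappingRanges and unwrap are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class WrappingRanges {
    // Assumption: MIN_TOKEN is a sentinel; no real key maps to it.
    static final long MIN_TOKEN = Long.MIN_VALUE;
    static final long MAX_TOKEN = Long.MAX_VALUE;

    // A range (left, right] wraps around the ring when left >= right.
    static boolean wraps(long left, long right) {
        return left >= right;
    }

    // Split (left, right] into at most two non-wrapping pieces,
    // each of which is contiguous in token order.
    static List<long[]> unwrap(long left, long right) {
        List<long[]> parts = new ArrayList<>();
        if (!wraps(left, right)) {
            parts.add(new long[] {left, right});
            return parts;
        }
        // Piece 1: from left up to the end of the ring (skipped if empty).
        if (left < MAX_TOKEN) {
            parts.add(new long[] {left, MAX_TOKEN});
        }
        // Piece 2: from the start of the ring up to right (skipped if empty).
        if (right > MIN_TOKEN) {
            parts.add(new long[] {MIN_TOKEN, right});
        }
        return parts;
    }
}
```

Under these assumptions, a wrapping range always yields at most two pieces, no
matter how many vnodes the ring has.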
Finally, getRestrictedRanges *will* split it up into one scan per vnode, which
I agree is something we should fix, but I don't think this patch does that.
Since it's an optimization, I don't think we should block 1.2.0 for it. Should
we split this into a separate ticket?
> CFRR wide row iterators improvements
> ------------------------------------
>
> Key: CASSANDRA-4803
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4803
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Affects Versions: 1.1.0
> Reporter: Piotr Kołaczkowski
> Assignee: Piotr Kołaczkowski
> Fix For: 1.1.7
>
> Attachments:
> 0004-Better-token-range-wrap-around-handling-in-CFIF-CFRR.patch,
> 0006-Code-cleanup-refactoring-in-CFRR.-Fixed-bug-with-mis.patch,
> 0007-Fallback-to-describe_splits-in-case-describe_splits_.patch
>
>
> {code}
> public float getProgress()
> {
>     // TODO this is totally broken for wide rows
>     // the progress is likely to be reported slightly off the actual but close enough
>     float progress = ((float) iter.rowsRead() / totalRowCount);
>     return progress > 1.0F ? 1.0F : progress;
> }
> {code}
> The problem is that iter.rowsRead() does not return the number of rows read
> from the wide row iterator; it returns the number of *columns* (every row is
> counted multiple times).
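One way to count rows rather than columns is sketched below. This is an
illustration only, not the attached patch: it assumes columns arrive grouped by
row key, and the class WideRowProgress and its onColumn method are hypothetical.

```java
// Hypothetical progress tracker, for illustration only.
final class WideRowProgress {
    private final long totalRowCount; // estimated number of rows in the split
    private String lastKey;           // key of the most recent column; null at start
    private long rowsRead;            // number of distinct row keys seen so far

    WideRowProgress(long totalRowCount) {
        this.totalRowCount = totalRowCount;
    }

    // Call once per column; a row is counted only when the key changes.
    // Assumption: all columns of one row arrive consecutively.
    void onColumn(String rowKey) {
        if (!rowKey.equals(lastKey)) {
            rowsRead++;
            lastKey = rowKey;
        }
    }

    float getProgress() {
        if (totalRowCount <= 0) {
            return 0.0F; // avoid dividing by zero when no estimate is available
        }
        float progress = (float) rowsRead / totalRowCount;
        return progress > 1.0F ? 1.0F : progress;
    }
}
```

With three columns for row "a" and one for row "b" out of four estimated rows,
this reports two rows read instead of four.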
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira