[jira] [Commented] (CASSANDRA-4803) CFRR wide row iterators improvements

Jonathan Ellis (JIRA) Sat, 17 Nov 2012 05:03:17 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499404#comment-13499404
 ]


Jonathan Ellis commented on CASSANDRA-4803:
-------------------------------------------

bq. what about virtual nodes in 1.2? Do we insist that a split may not span 
more than one contiguous token range?

That's kind of orthogonal to wrapping ranges per se -- you'll still only have a 
single [virtual] node whose range wraps.  So vnodes won't make that worse.  
Moreover, you're still going to need two scans at the disk level since a 
wrapping range won't be contiguous there.  (Currently wrapping ranges are split 
by StorageProxy.getRestrictedRanges but this may change for CASSANDRA-4858.)  
Doing an extra Thrift or CQL query is negligible overhead compared to the 
actual scan.
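
To illustrate why a wrapping range needs two scans, here is a minimal sketch of splitting a range (left, right] into non-wrapping pieces. It is not Cassandra's Range class or the code in StorageProxy.getRestrictedRanges. The names WrapRangeSketch, TokenRange, and MIN_TOKEN are hypothetical, tokens are assumed to be plain longs, and a right bound equal to MIN_TOKEN is assumed to mean "up to the end of the ring".

```java
import java.util.ArrayList;
import java.util.List;

class WrapRangeSketch {
    // Assumption: the smallest token on the ring. Used as a right bound,
    // it stands for "up to the end of the ring" in this sketch.
    public static final long MIN_TOKEN = Long.MIN_VALUE;

    // Hypothetical stand-in for a token range (left, right].
    public static final class TokenRange {
        public final long left;
        public final long right;

        public TokenRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return "(" + left + ", " + right + "]";
        }
    }

    // A range wraps when it passes the end of the ring and continues from the start.
    public static boolean wraps(TokenRange r) {
        return r.left >= r.right && r.right != MIN_TOKEN;
    }

    // Returns one range if r does not wrap, otherwise the two contiguous pieces.
    public static List<TokenRange> unwrap(TokenRange r) {
        List<TokenRange> out = new ArrayList<>();
        if (!wraps(r)) {
            out.add(r);
            return out;
        }
        out.add(new TokenRange(r.left, MIN_TOKEN));  // (left, end of ring]
        out.add(new TokenRange(MIN_TOKEN, r.right)); // (start of ring, right]
        return out;
    }
}
```

Each piece returned by unwrap is contiguous in token order, so each can be served by a single sequential scan, which is why the wrapping case costs two scans however the split is issued.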

Finally, getRestrictedRanges *will* split it up into one scan per vnode, which 
I agree is something we should fix, but I don't think this patch does that.  
Since it's an optimization, I don't think we should block 1.2.0 for it.  Should 
we split this into a separate ticket?
                
> CFRR wide row iterators improvements
> ------------------------------------
>
>                 Key: CASSANDRA-4803
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4803
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1.0
>            Reporter: Piotr Kołaczkowski
>            Assignee: Piotr Kołaczkowski
>             Fix For: 1.1.7
>
>         Attachments: 
> 0004-Better-token-range-wrap-around-handling-in-CFIF-CFRR.patch, 
> 0006-Code-cleanup-refactoring-in-CFRR.-Fixed-bug-with-mis.patch, 
> 0007-Fallback-to-describe_splits-in-case-describe_splits_.patch
>
>
> {code}
> public float getProgress()
> {
>     // TODO this is totally broken for wide rows
>     // the progress is likely to be reported slightly off the actual but close enough
>     float progress = ((float) iter.rowsRead() / totalRowCount);
>     return progress > 1.0F ? 1.0F : progress;
> }
> {code}
> The problem is that iter.rowsRead() does not return the number of rows read 
> from the wide row iterator; it returns the number of *columns*, so every row 
> is counted multiple times.

