[ 
https://issues.apache.org/jira/browse/CASSANDRA-9639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860082#comment-15860082
 ] 

Paulo Motta commented on CASSANDRA-9639:
----------------------------------------

LGTM, thanks! We should also take the chance here to avoid merging overlapping 
ranges (which was introduced in 
[CASSANDRA-9462|https://issues.apache.org/jira/browse/CASSANDRA-9462?focusedCommentId=14611854&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14611854]),
 since merging forces the client to manually split the ranges before it can 
run repair directly on ranges from the {{size_estimates}} table. For instance, 
a node with ranges (-2688160409776496397, -2506475074448728501) and 
(-2506475074448728501, 8473270337963525440) will currently be inserted into 
{{system.size_estimates}} as (-2688160409776496397, 8473270337963525440), due 
to the merging of neighboring ranges.
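
To illustrate the merging effect described above, here is a minimal Python sketch. It is not Cassandra's code: the {{merge_adjacent}} helper and the tuple representation of ranges are hypothetical, and it assumes non-wrapping (start, end) token pairs.

```python
# Hypothetical sketch of range normalization; not Cassandra's implementation.
# Each range is a (start, end) token pair and is assumed not to wrap around.
def merge_adjacent(ranges):
    """Merge ranges that touch or overlap into a single range."""
    merged = []
    for start, end in sorted(ranges):
        if merged and merged[-1][1] >= start:
            # The previous range ends at or after this start: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


local_ranges = [
    (-2688160409776496397, -2506475074448728501),
    (-2506475074448728501, 8473270337963525440),
]
merged = merge_adjacent(local_ranges)
print(merged)
```

With the two ranges from the example, the sketch yields a single range spanning both, so a client reading it can no longer recover the original boundary at -2506475074448728501 without splitting manually.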

[~cnlwsu] can you have a look at [this 
commit|https://github.com/pauloricardomg/cassandra/commit/e26e1513812c3f2df2ea375585a575730bbb949e]?
 It basically avoids normalizing ranges and instead just unwraps wrap-around 
ranges, which was the original idea behind CASSANDRA-9462.
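
As a rough illustration of unwrapping without normalizing, the Python sketch below splits only wrap-around ranges and leaves adjacent ranges untouched. It is an assumption-laden sketch, not the code in the linked commit: {{unwrap}}, {{unwrap_all}} and {{MIN_TOKEN}} are hypothetical names, and the minimum token is assumed to be -2^63 as with Murmur3Partitioner.

```python
# Hypothetical sketch; not the code from the linked commit.
# Assumes Murmur3-style tokens, whose minimum value is -2**63.
MIN_TOKEN = -2**63


def unwrap(rng):
    """Split a wrap-around (start, end) range at the minimum token."""
    start, end = rng
    # A range wraps when its start is not strictly below its end.
    # A range ending exactly at the minimum token is left as-is.
    if start < end or end == MIN_TOKEN:
        return [rng]
    return [(start, MIN_TOKEN), (MIN_TOKEN, end)]


def unwrap_all(ranges):
    """Unwrap every range, without merging neighbors."""
    result = []
    for rng in ranges:
        result.extend(unwrap(rng))
    return result


local_ranges = [
    (-2688160409776496397, -2506475074448728501),
    (-2506475074448728501, 8473270337963525440),
]
unwrapped = unwrap_all(local_ranges)
print(unwrapped)
```

Under these assumptions the two neighboring ranges come back unchanged, which is what lets a client run repair directly on the boundaries it reads.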

I also added a regression 
[dtest|https://github.com/riptano/cassandra-dtest/pull/1439] to make sure this 
is working as expected.

Updated patch and tests available below:
||3.0||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-9639]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:9639]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-9639-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-9639-dtest/lastCompletedBuild/testReport/]|


> size_estimates is inacurate in multi-dc clusters
> ------------------------------------------------
>
>                 Key: CASSANDRA-9639
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9639
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sebastian Estevez
>            Assignee: Chris Lohfink
>            Priority: Minor
>             Fix For: 3.0.x
>
>
> CASSANDRA-7688 introduced size_estimates to replace the thrift 
> describe_splits_ex command.
> Users have reported seeing estimates that are wildly off in multi-dc clusters.
> system.size_estimates shows the wrong range_start / range_end



