[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-13 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328698#comment-15328698
 ] 

Paulo Motta commented on CASSANDRA-11933:
-

Thanks for the update, [~mahdix]. The patch looks good. I fixed one minor nit 
in the 2.1 test, added CHANGES.txt entries, updated the commit message (and the 
author information, which was incorrect on 2.2 and 3.0), and resubmitted the 
tests (still running).

||2.1||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:11933-2.1]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:11933-2.2]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:11933-3.0]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:11933-trunk]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.1-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.2-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-3.0-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-trunk-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.1-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.2-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-3.0-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-trunk-dtest/lastCompletedBuild/testReport/]|

> Improve Repair performance
> --
>
> Key: CASSANDRA-11933
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11933
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Cyril Scetbon
> Assignee: Mahdi Mohammadi
>
> During a full repair on a ~60-node cluster, I've been able to see that 
> this stage can be significant (up to 60 percent of the whole time):
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/StorageService.java#L2983-L2997
> It's simply caused by the fact that 
> https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/service/ActiveRepairService.java#L189
>  calls {code}ss.getLocalRanges(keyspaceName){code} every time, and that call 
> takes more than 99% of the time. It takes 600ms when there is no load 
> on the cluster and more if there is. So for 10k ranges, you can imagine that 
> it takes at least 1.5 hours just to compute ranges. 
> Underneath it calls 
> [ReplicationStrategy.getAddressRanges|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L170],
>  which can get pretty inefficient ([~jbellis]'s 
> [words|https://github.com/apache/cassandra/blob/3dcbe90e02440e6ee534f643c7603d50ca08482b/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java#L165]).
> *ss.getLocalRanges(keyspaceName)* should be cached to avoid having to spend 
> hours on it.
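
The caching idea in the description above can be sketched roughly as follows. 
This is a minimal illustration, not the patch attached to this ticket: the 
Range class, getLocalRanges stand-in, lookup counter, and method names are all 
hypothetical simplifications, and it assumes the local ranges do not change 
while the repair is being prepared. It contrasts one lookup per range with a 
single lookup reused for every range, and reproduces the arithmetic from the 
description (10,000 ranges at 600ms each is 100 minutes).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class LocalRangesCachingSketch {

    /** Hypothetical, simplified token range; not Cassandra's Range<Token>. */
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        /** True if 'other' lies entirely inside this range. */
        boolean contains(Range other) {
            return left <= other.left && other.right <= right;
        }
    }

    /** Counts how many times the (assumed expensive) lookup runs. */
    static final AtomicInteger lookups = new AtomicInteger();

    /** Hypothetical stand-in for ss.getLocalRanges(keyspaceName). */
    static Collection<Range> getLocalRanges(String keyspaceName) {
        lookups.incrementAndGet();
        return Arrays.asList(new Range(0, 100), new Range(200, 300));
    }

    static boolean isLocal(Collection<Range> localRanges, Range candidate) {
        for (Range local : localRanges) {
            if (local.contains(candidate)) {
                return true;
            }
        }
        return false;
    }

    /** Pattern described in the ticket: one lookup per range to repair. */
    static int countLocalUncached(String keyspaceName, List<Range> toRepair) {
        int count = 0;
        for (Range r : toRepair) {
            if (isLocal(getLocalRanges(keyspaceName), r)) {
                count++;
            }
        }
        return count;
    }

    /** Suggested pattern: look the local ranges up once and reuse them. */
    static int countLocalCached(String keyspaceName, List<Range> toRepair) {
        Collection<Range> localRanges = getLocalRanges(keyspaceName);
        int count = 0;
        for (Range r : toRepair) {
            if (isLocal(localRanges, r)) {
                count++;
            }
        }
        return count;
    }

    /** Total lookup time in whole minutes when every range pays for a lookup. */
    static long lookupMinutes(long rangeCount, long millisPerLookup) {
        return rangeCount * millisPerLookup / 60_000L;
    }

    public static void main(String[] args) {
        List<Range> toRepair = Arrays.asList(
                new Range(10, 20), new Range(150, 160), new Range(210, 290));

        lookups.set(0);
        int uncached = countLocalUncached("ks", toRepair);
        int uncachedLookups = lookups.get();

        lookups.set(0);
        int cached = countLocalCached("ks", toRepair);
        int cachedLookups = lookups.get();

        System.out.println("local ranges found: " + uncached + " / " + cached);
        System.out.println("lookups: " + uncachedLookups + " vs " + cachedLookups);
        System.out.println("minutes for 10k ranges at 600ms: " + lookupMinutes(10_000, 600));
    }
}
```

Both variants return the same answer; only the number of lookups differs, 
which is where the time reported in the description goes.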



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-13 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328324#comment-15328324
 ] 

Paulo Motta commented on CASSANDRA-11933:
-

Sorry for the delay; I was away for a few days. I will set this up shortly and 
post back here.



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-13 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328318#comment-15328318
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

Can someone set up CI for this ticket?



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-08 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321471#comment-15321471
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

[~pauloricardomg] Would you please set up CI for my branches?



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-04 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315660#comment-15315660
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

[~pauloricardomg] Can you please set up CI for my branches again?



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-03 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15314971#comment-15314971
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

||2.1||2.2||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...mm-binary:11933-2.1?expand=1]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...mm-binary:11933-2.2?expand=0]|
|testall|testall|
|dtest|dtest|

I will add the remaining branches if they can't be auto-merged.



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-02 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313449#comment-15313449
 ] 

Paulo Motta commented on CASSANDRA-11933:
-

Don't worry, they seem to be unrelated (flaky tests addressed elsewhere).



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-02 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313430#comment-15313430
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

CI reports 5 test failures across dtest and testall combined. Does that mean 
something is wrong with my change?



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-02 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313426#comment-15313426
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

You are right. I will rename that and check the other versions too.



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-02 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313220#comment-15313220
 ] 

Paulo Motta commented on CASSANDRA-11933:
-

This looks good, thanks! Just a minor nitpick: can you rename the variable from 
{{keyspaceLocalRange}} to {{keyspaceLocalRange*s*}}, or maybe just 
{{localRanges}} (it's implicit that it is for a given keyspace)?

Also, could you check whether this patch merges to cassandra-2.2 and all the way 
up to trunk (via cassandra-3.7), and if not, provide patches for the conflicting 
versions?

Submitted CI unit and dtests for 2.1:
||2.1||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:11933-2.1]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.1-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-11933-2.1-dtest/lastCompletedBuild/testReport/]|





[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-02 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313183#comment-15313183
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

How can I run my branch in CI (for testall and dtests)?



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-02 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313181#comment-15313181
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

[Branch for 2.1|https://github.com/mm-binary/cassandra/tree/11933-2.1]



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-01 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310628#comment-15310628
 ] 

Joshua McKenzie commented on CASSANDRA-11933:
-

Go for it - assigned it to you.



[jira] [Commented] (CASSANDRA-11933) Improve Repair performance

2016-06-01 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15310024#comment-15310024
 ] 

Mahdi Mohammadi commented on CASSANDRA-11933:
-

Can I work on this ticket?
