[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-11-28 Thread Corentin Chary (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268362#comment-16268362
 ] 

Corentin Chary commented on CASSANDRA-13215:


[~krummas] I tried it on our test cluster and it seems to work great. startup 
time got divided by 3~4. I expect it to have an even greater impact on prod 
(mode nodes, more sstables).

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Fix For: 3.11.2, 4.0
>
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-11-23 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16264445#comment-16264445
 ] 

Paulo Motta commented on CASSANDRA-13215:
-

bq. both look pretty bad, but I don't think it is because of this patch

yeah, they seem to be a problem with the jolokia agent not working correctly, 
probably some configuration error on the CI server.
 
In any case, I ran some failing compaction tests locally and they passed which 
indicate it's indeed a problem with CI. 

Patch LGTM, though can you just add a {{toString}} to {{DiskBoundaries}} so the 
boundary changes are logged correctly? Marking this as ready to commit so feel 
free to fix this on commit.

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Fix For: 3.11.x, 4.x
>
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-11-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263983#comment-16263983
 ] 

Marcus Eriksson commented on CASSANDRA-13215:
-

https://github.com/krummas/cassandra/commits/marcuse/13215-version2
https://github.com/krummas/cassandra/commits/marcuse/13215-trunk

test results:
trunk: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/437/
3.11: 
https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/434/

both look pretty bad, but I don't think it is because of this patch

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Fix For: 3.11.x, 4.x
>
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-11-08 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245164#comment-16245164
 ] 

Paulo Motta commented on CASSANDRA-13215:
-

Good job, this is much nicer than having the StorageService manage the disk 
boundaries. Patch and tests LGTM.

Two minor nits:
 * could you make {{getDiskBoundaryValue}} and {{getDiskBoundaries}} static?
 * can you log the actual boundary changes to facilitate debugging?

Probably not a big deal but we will unnecessarily invalidate the disk 
boundaries whenever there is a keyspace change (table creation, drop, add view, 
etc) - rather then when replication settings or local ranges changes, do you 
think we should invalidate the boundaries only when replication settings/local 
range change or not really bother about this?

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Fix For: 3.11.x, 4.x
>
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-10-31 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16233674#comment-16233674
 ] 

Paulo Motta commented on CASSANDRA-13215:
-

bq. I might change it slightly during the day, but if you want to start 
reviewing it should be ok

Thanks for the heads up! I should only be able to review it early next week 
anyway, so please update if/when you have another version.

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
>Priority: Major
> Fix For: 3.11.x, 4.x
>
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-10-12 Thread Corentin Chary (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16201558#comment-16201558
 ] 

Corentin Chary commented on CASSANDRA-13215:


I'll try to test that in our test env in the next days :)

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-10-11 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16200205#comment-16200205
 ] 

Marcus Eriksson commented on CASSANDRA-13215:
-

so finally got around to finishing this up, I still need to write a bunch of 
tests, but if anyone has a safe place to try it out, please do

https://github.com/krummas/cassandra/commits/marcuse/13215

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-09-20 Thread Corentin Chary (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173029#comment-16173029
 ] 

Corentin Chary commented on CASSANDRA-13215:


Cool, will be happy to test it and report performance improvements (mostly 
during startup)

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-08-29 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16145464#comment-16145464
 ] 

Marcus Eriksson commented on CASSANDRA-13215:
-

[~iksaif] yeah I'm working on it, should have a patch ready soon

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-08-29 Thread Corentin Chary (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16145453#comment-16145453
 ] 

Corentin Chary commented on CASSANDRA-13215:


I can confirm that this is affecting us too (startup and repairs). [~krummas] 
did you end up doing something for this issue ? Else I might give it a shot.

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
>Assignee: Marcus Eriksson
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-06-21 Thread Viktor Kuzmin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057322#comment-16057322
 ] 

Viktor Kuzmin commented on CASSANDRA-13215:
---

Marcus Eriksson, It would be great if you'll be able to pick this up.

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-06-21 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16057183#comment-16057183
 ] 

Marcus Eriksson commented on CASSANDRA-13215:
-

[~kvaster] let me know if you want me to pick this up

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-06-09 Thread Viktor Kuzmin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16044231#comment-16044231
 ] 

Viktor Kuzmin commented on CASSANDRA-13215:
---

It seems that my knowledge of cassandra internals are not enough to complete 
this task fast...

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-03-02 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15891771#comment-15891771
 ] 

Marcus Eriksson commented on CASSANDRA-13215:
-

We also need to invalidate the cache on ring changes (ie, if a node 
joins/leaves the ring, we need to recalculate the boundaries) - ie, you 
probably need to implement {{IEndpointStateChangeSubscriber}} and register in 
{{Gossiper}}

Let me know if you have time to work on this, otherwise I can pick it up

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-02-15 Thread Viktor Kuzmin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868206#comment-15868206
 ] 

Viktor Kuzmin commented on CASSANDRA-13215:
---

It is really used on critical path. I think that this affects not only startup 
time, but repairs aswell... And maybe some other parts...

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CASSANDRA-13215) Cassandra nodes startup time 20x more after upgarding to 3.x

2017-02-14 Thread Romain Hardouin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865479#comment-15865479
 ] 

Romain Hardouin commented on CASSANDRA-13215:
-

It's related to CASSANDRA-6696 i.e. since 3.2.

Regarding {{AbstractReplicationStrategy.getAddressRanges}} it seems to be a 
known limitation. Maybe we can now consider that it's used on a critical path:
{code}
/*
 * NOTE: this is pretty inefficient. also the inverse (getRangeAddresses) 
below.
 * this is fine as long as we don't use this on any critical path.
 * (fixing this would probably require merging tokenmetadata into 
replicationstrategy,
 * so we could cache/invalidate cleanly.)
 */
public Multimap getAddressRanges(TokenMetadata 
metadata)
{code}

> Cassandra nodes startup time 20x more after upgarding to 3.x
> 
>
> Key: CASSANDRA-13215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>Reporter: Viktor Kuzmin
> Attachments: simple-cache.patch
>
>
> CompactionStrategyManage.getCompactionStrategyIndex is called on each sstable 
> at startup. And this function calls StorageService.getDiskBoundaries. And 
> getDiskBoundaries calls AbstractReplicationStrategy.getAddressRanges.
> It appears that last function can be really slow. In our environment we have 
> 1545 tokens and with NetworkTopologyStrategy it can make 1545*1545 
> computations in worst case (maybe I'm wrong, but it really takes lot's of 
> cpu).
> Also this function can affect runtime later, cause it is called not only 
> during startup.
> I've tried to implement simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25m, but I'm not sure if it's a 
> good solution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)