[ 
https://issues.apache.org/jira/browse/CASSANDRA-13215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245164#comment-16245164
 ] 

Paulo Motta commented on CASSANDRA-13215:
-----------------------------------------

Good job, this is much nicer than having the StorageService manage the disk 
boundaries. Patch and tests LGTM.

Two minor nits:
 * could you make {{getDiskBoundaryValue}} and {{getDiskBoundaries}} static?
 * can you log the actual boundary changes to facilitate debugging?
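
To illustrate the second nit, here is a rough sketch of what logging the actual boundary change could look like. Everything in it is hypothetical: the class name, the {{describeChange}} helper, and the use of plain {{long}} values in place of Cassandra's token/position types are assumptions for illustration and are not taken from the patch. The helper is static because it depends only on its arguments, which is the same reasoning as the first nit.

```java
import java.util.List;

// Hypothetical sketch: none of these names come from the actual patch.
final class BoundaryChangeLog
{
    private BoundaryChangeLog() {}

    // Builds the message a logger would emit when boundaries are recomputed.
    // Boundaries are modelled as plain longs here (an assumption).
    static String describeChange(String table, List<Long> oldBoundaries, List<Long> newBoundaries)
    {
        if (oldBoundaries.equals(newBoundaries))
            return String.format("Disk boundaries for %s unchanged: %s", table, newBoundaries);
        return String.format("Disk boundaries for %s changed from %s to %s",
                             table, oldBoundaries, newBoundaries);
    }

    public static void main(String[] args)
    {
        // In real code this string would go to the logger at debug level.
        System.out.println(describeChange("ks.tbl", List.of(100L, 200L), List.of(150L, 200L)));
    }
}
```

Having both the old and the new values in one line makes it easier to correlate a boundary change with the event that triggered it.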

Probably not a big deal, but we will unnecessarily invalidate the disk 
boundaries whenever there is a keyspace change (table creation, drop, add view, 
etc.) rather than only when the replication settings or local ranges change. Do 
you think we should invalidate the boundaries only in those two cases, or not 
really bother about this?
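
For reference, a minimal sketch of the narrower invalidation, assuming the inputs to the boundary computation can be summarised as a ring version counter plus the keyspace replication options. The class, its fields, and the {{get}} signature are all hypothetical and only meant to show the idea: recompute when one of those inputs differs from what was cached, and ignore every other schema change.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical sketch: names and types are assumptions, not the patch's code.
final class BoundaryCacheSketch
{
    private long cachedRingVersion = -1;
    private Map<String, String> cachedReplication = null;
    private List<Long> cachedBoundaries = null;
    int recomputations = 0; // exposed only so the sketch can be checked

    // Recomputes only when the ring version or replication options changed.
    synchronized List<Long> get(long ringVersion,
                                Map<String, String> replication,
                                Supplier<List<Long>> compute)
    {
        boolean stale = cachedBoundaries == null
                        || ringVersion != cachedRingVersion
                        || !Objects.equals(replication, cachedReplication);
        if (stale)
        {
            cachedBoundaries = compute.get();
            cachedRingVersion = ringVersion;
            cachedReplication = new HashMap<>(replication); // defensive copy
            recomputations++;
        }
        return cachedBoundaries;
    }

    public static void main(String[] args)
    {
        BoundaryCacheSketch cache = new BoundaryCacheSketch();
        Map<String, String> rf = Map.of("dc-main", "3");
        cache.get(1, rf, () -> List.of(10L, 20L)); // first call computes
        cache.get(1, rf, () -> List.of(10L, 20L)); // e.g. table creation: cached
        cache.get(2, rf, () -> List.of(15L, 20L)); // ring changed: recompute
        System.out.println(cache.recomputations);  // prints 2
    }
}
```

The trade-off is an extra comparison on every lookup versus a possibly expensive recomputation on every schema change, so whether it is worth it depends on how often keyspaces change in practice.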

> Cassandra nodes startup time 20x more after upgrading to 3.x
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-13215
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13215
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Cluster setup: two datacenters (dc-main, dc-backup).
> dc-main - 9 servers, no vnodes
> dc-backup - 6 servers, vnodes
>            Reporter: Viktor Kuzmin
>            Assignee: Marcus Eriksson
>             Fix For: 3.11.x, 4.x
>
>         Attachments: simple-cache.patch
>
>
> CompactionStrategyManager.getCompactionStrategyIndex is called on each sstable 
> at startup. This function calls StorageService.getDiskBoundaries, which in 
> turn calls AbstractReplicationStrategy.getAddressRanges.
> It appears that the last function can be really slow. In our environment we have 
> 1545 tokens, and with NetworkTopologyStrategy it can make 1545*1545 
> computations in the worst case (maybe I'm wrong, but it really takes lots of 
> CPU).
> This function can also affect runtime later, because it is called not only 
> during startup.
> I've tried implementing a simple cache for getDiskBoundaries results and now 
> startup time is about one minute instead of 25 minutes, but I'm not sure if 
> it's a good solution.



