[
https://issues.apache.org/jira/browse/CASSANDRA-20715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Capwell updated CASSANDRA-20715:
--------------------------------------
Status: Ready to Commit (was: Review In Progress)
+1 from Benedict in GH
> Accord: Topology serializer has a lot of repeated data, can dedup to shrink
> the cost
> ------------------------------------------------------------------------------------
>
> Key: CASSANDRA-20715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20715
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Accord
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 2h
> Remaining Estimate: 0h
>
> The Topology object represents all tables -> ranges -> nodes that Accord needs
> to care about, but it carries a lot of duplication:
> * Each TokenRange repeats its TableId
> * Tables with the same replication factor have the same ranges
> * Shard has views for fast path and joining nodes
> All these duplicate values add up, bloating the serialization format.
> In testing, these are the results I am seeing:
> {code}
> min: tables=2, ranges=927 By 43.47%, partitioner: Murmur3Partitioner
> max: tables=10, ranges=48 By 67.16%, partitioner: RandomPartitioner
> {code}
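To illustrate the first two kinds of duplication described above, here is a minimal sketch that writes each distinct list of ranges once and lets every table refer to it by index. It is not the serializer from this ticket: the class TopologyDedupSketch, the Range type, the integer table ids, and the byte layout are all hypothetical, and the Shard views for fast path and joining nodes are not modelled.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; not Accord's Topology serializer.
final class TopologyDedupSketch {
    /** Hypothetical stand-in for a token range; not Accord's TokenRange. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) { this.start = start; this.end = end; }

        @Override public boolean equals(Object o) {
            return o instanceof Range && ((Range) o).start == start && ((Range) o).end == end;
        }

        @Override public int hashCode() { return Objects.hash(start, end); }
    }

    /** Naive layout: every range repeats the id of the table it belongs to. */
    static byte[] naive(Map<Integer, List<Range>> rangesByTable) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (Map.Entry<Integer, List<Range>> entry : rangesByTable.entrySet()) {
                for (Range r : entry.getValue()) {
                    out.writeInt(entry.getKey());
                    out.writeLong(r.start);
                    out.writeLong(r.end);
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    /** Deduplicated layout: distinct range lists are written once, tables point at them. */
    static byte[] deduped(Map<Integer, List<Range>> rangesByTable) {
        // Assign each distinct range list an index, in first-seen order.
        Map<List<Range>, Integer> rangeSetIds = new LinkedHashMap<>();
        for (List<Range> ranges : rangesByTable.values()) {
            rangeSetIds.putIfAbsent(ranges, rangeSetIds.size());
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            // Section 1: the distinct range lists, without any table id.
            out.writeInt(rangeSetIds.size());
            for (List<Range> ranges : rangeSetIds.keySet()) {
                out.writeInt(ranges.size());
                for (Range r : ranges) {
                    out.writeLong(r.start);
                    out.writeLong(r.end);
                }
            }
            // Section 2: one (table id, range list index) pair per table.
            out.writeInt(rangesByTable.size());
            for (Map.Entry<Integer, List<Range>> entry : rangesByTable.entrySet()) {
                out.writeInt(entry.getKey());
                out.writeInt(rangeSetIds.get(entry.getValue()));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }
}
```

With this toy layout, the saving grows with the number of tables that share a range list, since each additional table costs two ints instead of a full copy of the ranges. The percentages reported above come from the actual patch and should not be expected to match this sketch.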
--
This message was sent by Atlassian Jira
(v8.20.10#820010)