[
https://issues.apache.org/jira/browse/CASSANDRA-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13480221#comment-13480221
]
Brandon Williams commented on CASSANDRA-4784:
---------------------------------------------
It's worth noting that vnodes in 1.2 will already solve the bootstrap
performance problem.
bq. Run a repair like we normally do after a bootstrap.
We don't do that; we begin forwarding writes to the new node as a first
step, which obviates the need for repair.
> Create separate sstables for each token range handled by a node
> ---------------------------------------------------------------
>
> Key: CASSANDRA-4784
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4784
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: sankalp kohli
> Priority: Minor
> Labels: perfomance
>
> Currently, each sstable holds data for all the ranges that a node handles. If
> we instead keep separate sstables for each range the node handles, several
> improvements become possible.
> Improvements:
> 1) Node rebuild will be very fast, as sstables can be copied directly to
> the bootstrapping node. This minimizes application-level logic. We can
> use native Linux methods to transfer sstables, using little CPU and
> putting less pressure on the serving node. I think in theory it will be the
> fastest way to transfer data.
> 2) Backups can transfer only the sstables that belong to a node's primary
> key range.
> 3) ETL processes can copy just one replica of the data and will be much faster.
> Changes:
> We can split writes into multiple memtables, one for each range the node
> handles. The sstables flushed from these memtables can record which
> range of data they cover.
> I think there will be no change for reads, as they work with interleaved
> data anyway. But maybe we can improve there as well?
> Complexities:
> The change does not look very complicated. I am not taking into account how
> it will work when ranges are being changed for nodes.
> Vnodes might make this work more complicated. We can also have a bit on each
> sstable which says whether it is primary data or not.
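
The memtable split proposed under "Changes" can be illustrated with a small sketch. This is not Cassandra code: the class RangePartitionedMemtables, its methods, and the dictionary used to stand in for an sstable are all hypothetical names chosen for illustration. It assumes each range is identified by its inclusive upper token bound, that tokens past the last bound wrap around to the first range, and that range ownership does not change while the node runs.

```python
from bisect import bisect_left
from collections import defaultdict


class RangePartitionedMemtables:
    """Hypothetical sketch: one memtable per token range held by a node."""

    def __init__(self, range_ends, primary_range_end):
        # range_ends: upper bounds (inclusive) of the token ranges this node holds.
        # Assumption: tokens above the last bound wrap around to the first range.
        self.range_ends = sorted(range_ends)
        self.primary_range_end = primary_range_end
        self.memtables = defaultdict(dict)

    def range_for(self, token):
        # Find the first range whose upper bound is >= token.
        i = bisect_left(self.range_ends, token)
        return self.range_ends[i % len(self.range_ends)]

    def write(self, token, key, value):
        # Route the write to the memtable of the range that owns the token.
        self.memtables[self.range_for(token)][key] = value

    def flush(self):
        # Emit one "sstable" per range, tagged with its range and a primary bit.
        sstables = []
        for range_end, rows in sorted(self.memtables.items()):
            sstables.append({
                "range_end": range_end,
                "primary": range_end == self.primary_range_end,
                "rows": dict(sorted(rows.items())),
            })
        self.memtables.clear()
        return sstables


node = RangePartitionedMemtables([100, 200, 300], primary_range_end=200)
node.write(50, "a", 1)
node.write(150, "b", 2)
node.write(350, "c", 3)  # wraps around to the first range
flushed = node.flush()
```

Because every flushed unit covers exactly one range, a backup or bootstrap process in this sketch could select sstables by the range tag or the primary flag without reading their rows.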