[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261376#comment-14261376 ]
Jeremy Hanna commented on CASSANDRA-5220:
-----------------------------------------
I think it's important to reiterate that the project devs recognize that these
inefficiencies are impacting many users. That said, a lot of parallel work is
being done on repair. As Yuki pointed out, with incremental repair
(CASSANDRA-5351) already in 2.1 and improved concurrency of the repair
process (CASSANDRA-6455) coming in 3.0, many of the problems seen in this
ticket will be resolved.
Until 2.1/3.0, sub-range repair (CASSANDRA-5280) helps parallelize repair and
make it more efficient with virtual nodes. See
http://www.datastax.com/dev/blog/advanced-repair-techniques for details about
efficiency gains with sub-range repair. The drawback is that it is more
tedious to track which sub-ranges have been repaired. Saving repair data to a
system table (CASSANDRA-5839) will help track that in Cassandra itself.
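
To make the sub-range idea concrete, here is a minimal sketch (not taken from
the ticket or the linked post) that splits one token range into equal pieces
and builds a `nodetool repair -st <start> -et <end>` command line for each. The
function names, the keyspace name `my_keyspace`, and the token values are
hypothetical; wrapping ranges and the bookkeeping of which pieces have
completed are left out.

```python
from typing import List, Tuple


def split_range(start: int, end: int, parts: int) -> List[Tuple[int, int]]:
    """Split the token range (start, end] into up to `parts` contiguous sub-ranges."""
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if end <= start:
        raise ValueError("wrapping ranges are not handled in this sketch")
    width = end - start
    # Integer boundaries, evenly spaced; the last one is always `end`.
    bounds = [start + (width * i) // parts for i in range(parts + 1)]
    # Drop empty pieces, which occur when the range is narrower than `parts`.
    return [(lo, hi) for lo, hi in zip(bounds, bounds[1:]) if lo < hi]


def repair_commands(keyspace: str, start: int, end: int, parts: int) -> List[str]:
    """Build one sub-range repair command line per piece (strings only, nothing is run)."""
    return [
        f"nodetool repair -st {lo} -et {hi} {keyspace}"
        for lo, hi in split_range(start, end, parts)
    ]


if __name__ == "__main__":
    for cmd in repair_commands("my_keyspace", 0, 100, 4):
        print(cmd)
```

The sketch only prints the command lines. Running them, possibly several at a
time, and recording which sub-ranges succeeded is the tedious part mentioned
above.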
> Repair improvements when using vnodes
> -------------------------------------
>
> Key: CASSANDRA-5220
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.2.0 beta 1
> Reporter: Brandon Williams
> Assignee: Yuki Morishita
> Labels: performance, repair
> Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2
>
>
> Currently when using vnodes, repair takes much longer to complete than
> without them. This appears to be at least in part because it uses a session
> per range and processes them sequentially. This generates a lot of log spam
> with vnodes, and while it is gentler and lighter on hard disk deployments,
> SSD-based deployments would often prefer that repair be as fast as possible.