[
https://issues.apache.org/jira/browse/CASSANDRA-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667349#comment-13667349
]
Christian Spriegel commented on CASSANDRA-4905:
-----------------------------------------------
[~mtheroux2]: 15 hours for 280GB sounds bad: that is effectively <2MB/s of
throughput (assuming -pr and RF=3), right? Ouch :-)
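(Back-of-the-envelope, assuming -pr covers roughly a third of the node's data
at RF=3: 280GB / 3 ≈ 93GB, and 93GB over 15 hours is about 1.8MB/s.)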
Your understanding is correct. Any tombstone can cause it, not just TTLed
columns.
My assumption was that time-series data would be the worst-case scenario,
because repair would always stream entire wide rows. Reading your message, I
realized that random deletes are probably worse, because they cause hash
differences for more keys, spread across the ring. The Merkle-tree inaccuracy
then makes repair stream the full range covered by a leaf for each of these
mismatches.
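To make that concrete, here is a toy simulation, not Cassandra code, with a
made-up leaf count and leaf size: N deletes clustered in one wide row dirty a
single leaf range, while N random deletes dirty up to N leaf ranges, and
repair streams each dirty range in full:

{code:java}
import java.util.Arrays;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Toy model, NOT Cassandra code: a repair Merkle tree with a fixed number of
// leaf ranges. Any leaf whose hash differs between replicas gets its whole
// token range streamed, no matter how small the difference inside it is.
public class MerkleOverstreamToy
{
    static final int LEAVES = 32768;                 // 2^15 leaf ranges (made up)
    static final long BYTES_PER_LEAF = 8L << 20;     // ~8MB of data per leaf (made up)

    // Number of distinct leaf ranges that a set of mutated tokens falls into.
    static int leavesTouched(long[] tokens)
    {
        Set<Integer> touched = new HashSet<>();
        for (long token : tokens)
            touched.add(Math.floorMod((int) token, LEAVES));
        return touched.size();
    }

    public static void main(String[] args)
    {
        int deletes = 1000;
        Random rnd = new Random(42);

        // Case 1: deletes concentrated in one wide row -> a single token.
        long[] clustered = new long[deletes];
        Arrays.fill(clustered, rnd.nextLong());

        // Case 2: deletes of random keys -> tokens spread across the ring.
        long[] scattered = new long[deletes];
        for (int i = 0; i < deletes; i++)
            scattered[i] = rnd.nextLong();

        int c = leavesTouched(clustered), s = leavesTouched(scattered);
        System.out.printf("clustered: %d leaf range(s) -> ~%dMB streamed%n", c, c * BYTES_PER_LEAF >> 20);
        System.out.printf("scattered: %d leaf ranges  -> ~%dMB streamed%n", s, s * BYTES_PER_LEAF >> 20);
    }
}
{code}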
In general, shouldn't leveled compaction behave better than size-tiered,
because it keeps compaction more up to date?
If you do deletes, did you also look at CASSANDRA-5398?
> Repair should exclude gcable tombstones from merkle-tree computation
> --------------------------------------------------------------------
>
> Key: CASSANDRA-4905
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4905
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Christian Spriegel
> Assignee: Sylvain Lebresne
> Fix For: 1.2.0 beta 3
>
> Attachments: 4905.txt
>
>
> Currently, gcable tombstones get repaired if some replicas have already
> compacted them away, but others have not.
> This could be avoided by ignoring all gcable tombstones during merkle-tree
> calculation.
> This was discussed with Sylvain on the mailing list:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/repair-compaction-and-tombstone-rows-td7583481.html
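To illustrate the idea in the description, a minimal sketch of hashing a row
while skipping gcable tombstones. The Cell record and all names below are
hypothetical stand-ins for the real internals, not the actual 4905.txt patch:

{code:java}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Sketch only: hash one row for the Merkle tree while skipping tombstones
// that are already past gc_grace, so a replica that compacted them away and
// one that has not still produce identical hashes.
public class GcableTombstoneHashSketch
{
    // Hypothetical cell: a live column, or a tombstone carrying the local
    // server time (in seconds) at which it was deleted.
    record Cell(String name, byte[] value, boolean isTombstone, int localDeletionTime) {}

    // gcBefore = now - gc_grace_seconds; anything deleted before it is gcable.
    static boolean isGcable(Cell c, int gcBefore)
    {
        return c.isTombstone() && c.localDeletionTime() < gcBefore;
    }

    static byte[] hashRow(List<Cell> cells, int gcBefore) throws NoSuchAlgorithmException
    {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        for (Cell c : cells)
        {
            if (isGcable(c, gcBefore))
                continue; // compacted and uncompacted replicas now agree
            digest.update(c.name().getBytes(StandardCharsets.UTF_8));
            digest.update(c.value());
        }
        return digest.digest();
    }
}
{code}

(The real fix presumably hooks into the validation compaction rather than a
standalone helper like this; the sketch only shows the hashing idea.)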