On 04/05/2011 03:49 PM, Jonathan Ellis wrote:
> Sounds like https://issues.apache.org/jira/browse/CASSANDRA-2324
Yes, that sounds like the issue I'm having. Any chance of a fix for this
being backported to 0.7.x?

Anyway, I guess I might as well share the test case I've used to
reproduce this problem:

============================================================
Cluster configuration: 6 nodes running 0.7.4 with RF=3

1. Create the keyspace and column families (see attached repair_test.py).

2. Insert 20 100MB keys into each of column families A, B and C:

   $ python repair_test.py

   This results in 2.4GB worth of sstables on node1:

   $ du -sh /data/cassandra/data/repair_test3/
   2.4G    /data/cassandra/data/repair_test3/

3. Run repair:

   $ time nodetool -h node1 repair repair_test3
   real    3m28.218s

   The repair logged streaming of 1 to 3 ranges for each column family,
   the sstable directory filled up with a bunch of "<column-family>-tmp-"
   files, and disk usage peaked at 10+GB. The repair completed
   successfully and disk usage settled at 6.4GB:

   $ du -sh /data/cassandra/data/repair_test3/
   6.4G    /data/cassandra/data/repair_test3/

4. Run repair again:

   $ time nodetool -h node1 repair repair_test3
   real    9m23.514s

   This time disk usage peaked at 25+GB and then settled at 4.7GB, and
   repair reported that even more ranges were out of sync.

So this issue seems to cause repair to take a very long time, to
unnecessarily send a lot of data over the network, and to leave a lot
of "air" in the resulting sstables that can only be reclaimed by
triggering major compactions.

(A GC was triggered before all disk usage measurements.)
============================================================

Regards,
Jonas
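For what it's worth, the 2.4GB in step 2 is in the right ballpark for what
the insert loop writes. A quick back-of-the-envelope sketch (assuming even
token distribution across the 6 nodes and ignoring sstable index/metadata
overhead, so the real number will differ somewhat):

```python
# Rough check of expected sstable volume per node for the test case above.
keys_per_cf = 20     # 20 row keys per column family
cols_per_key = 100   # 100 columns per key
col_size_mb = 1      # each column value is 'X' * 1024*1024, i.e. 1MB
cfs = 3              # column families A, B and C
rf = 3               # replication_factor=3
nodes = 6

raw_mb = keys_per_cf * cols_per_key * col_size_mb * cfs  # 6000 MB written
cluster_mb = raw_mb * rf                                 # 18000 MB incl. replicas
per_node_mb = cluster_mb // nodes                        # ~3000 MB per node
print(per_node_mb)
```

So roughly 3GB of raw data per node is expected; the observed 2.4GB is
consistent with a slightly uneven token distribution.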
import pycassa

# Schema created via the CLI before running this script:
"""
create keyspace repair_test3 with replication_factor=3;
use repair_test3;
create column family A with memtable_throughput=32;
create column family B with memtable_throughput=32;
create column family C with memtable_throughput=32;
"""

servers = ['node1:9160', 'node2:9160', 'node3:9160',
           'node4:9160', 'node5:9160', 'node6:9160']

def insert_data(cf_name):
    pool = pycassa.ConnectionPool('repair_test3', servers)
    # Write at consistency level ONE so each insert only has to reach
    # a single replica.
    cf = pycassa.ColumnFamily(pool, cf_name,
                              write_consistency_level=pycassa.ConsistencyLevel.ONE)
    data = 'X' * 1024 * 1024  # 1MB column value
    # 20 keys x 100 1MB columns = ~100MB per key, ~2GB per column family
    for x in range(20):
        for y in range(100):
            print cf_name, x, y
            cf.insert(str(x), {str(y): data})

insert_data('A')
insert_data('B')
insert_data('C')