Hi,

I have two questions about anti-entropy repair.

Q1:
According to the DataStax documentation, it's recommended to run a full
repair weekly or monthly. Is that still needed if a partitioner-range repair
("nodetool repair -pr", in C* v2.2+) is scheduled to run periodically on
every node in the cluster?

References:
- DataStax, "When to run anti-entropy repair",
http://docs.datastax.com/en/cassandra/2.2/cassandra/operations/opsRepairNodesWhen.html


Q2:
When I want to restore a node, is it good practice to repair it without
using non-repaired snapshots, given that the repair process is so slow?

I ran some simple tests of anti-entropy repair and found that the repair
process takes much longer than simply transferring the replica data from
the existing nodes to the node being restored.

My test setup is as follows:

- 3 node cluster (N1, N2, N3)
- 2 CPUs, 8GB memory, 500GB HDD for each node
- Replication Factor is 3
- C* version is 2.2.6
- Compaction strategy (CS) is LCS

I prepared the test data as follows:

- a snapshot (10GB, fully repaired) for N1, N2, N3
- 1GB of SSTables (taken with incremental backup) for N1, N2, N3
- another 1GB of SSTables for N1, N2

I've measured repair time for two cases.

- Case 1: repair N3 with the snapshot and 1GB SSTables
- Case 2: repair N3 with the snapshot only

In case 1, N3 had to repair 12GB (in practice only 1GB of data was updated,
because the snapshot was already repaired) and received 1GB of data from N1
or N2. In case 2, N3 also had to repair 12GB (in practice it only compared
Merkle trees for the 10GB snapshot) and received 2GB of data from N1 or N2.

The result showed that case 2 was faster than case 1 (case 1: 6889 sec,
case 2: 4535 sec). My guess is that the repair process is very slow, and
that it would be better to repair a node without non-repaired backup files
(snapshots or incremental backups) if the other replica nodes exist.
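
To put the two timings side by side, here is a minimal Python sketch of the
arithmetic. The function and key names are only for illustration; the two
durations are the ones I measured above, and nothing here talks to Cassandra.

```python
def repair_time_summary(case1_sec, case2_sec):
    """Compare two measured repair durations, both given in seconds."""
    saved_sec = case1_sec - case2_sec
    return {
        # Seconds saved by case 2 relative to case 1.
        "saved_sec": saved_sec,
        # How many times longer case 1 took than case 2.
        "slowdown_ratio": case1_sec / case2_sec,
        # Time saved by case 2, as a percentage of the case 1 duration.
        "saved_percent": 100.0 * saved_sec / case1_sec,
    }


# Measured values from my test: case 1 = 6889 sec, case 2 = 4535 sec.
summary = repair_time_summary(6889, 4535)

print("saved: %d sec" % summary["saved_sec"])
print("case 1 / case 2: %.2f" % summary["slowdown_ratio"])
print("time saved by case 2: %.1f%%" % summary["saved_percent"])
```

So case 1 took roughly one and a half times as long as case 2, even though
N3 received only half as much data from the other replicas.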

So... I guess if I just have non-repaired backups, what's the point of
using them? Looks like there's no merit... Am I missing something?

Regards,
Satoshi
