> In my experience running repair on some counter data, the size of
> streamed data is much bigger than the cluster could possibly have lost
> messages or would be due to snapshotting at different times.
>
> I know the data will eventually be in sync on every repair, but I'm
> more interested in whether Cassandra transfers excess data and how to
> minimize this.
>
> Does anybody have insights into this?
>
The problem is the granularity of the Merkle tree. Cassandra streams the
regions whose hash values differ, and a region can be much bigger than a
single row.
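
A toy sketch of the effect (not Cassandra's actual repair code, and the
leaf count and keyspace layout here are made up for illustration): with a
fixed number of Merkle leaves, each leaf hashes an entire contiguous key
range, so one differing row forces the whole range to be streamed.

```python
import hashlib

def leaf_hashes(rows, num_leaves, keyspace=1000):
    """Hash rows into num_leaves buckets, each covering a contiguous key range."""
    buckets = [hashlib.sha256() for _ in range(num_leaves)]
    for key in sorted(rows):
        leaf = key * num_leaves // keyspace   # which leaf range this key falls in
        buckets[leaf].update(repr((key, rows[key])).encode())
    return [b.hexdigest() for b in buckets]

# Two replicas holding 1000 rows that differ in exactly one row.
replica_a = {k: "v" for k in range(1000)}
replica_b = dict(replica_a)
replica_b[42] = "stale"

# Coarse tree: 4 leaves, so each leaf covers ~250 rows.
a4 = leaf_hashes(replica_a, 4)
b4 = leaf_hashes(replica_b, 4)
mismatched = [i for i in range(4) if a4[i] != b4[i]]
rows_streamed = sum(1 for k in replica_a if k * 4 // 1000 in mismatched)
print(mismatched, rows_streamed)  # [0] 250 — one bad row, 250 rows streamed
```

Deepening the tree (more leaves) shrinks the range behind each mismatched
hash, at the cost of building and comparing a larger tree.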

Andrey
