Hi,
We've inherited quite a large Amazon infrastructure from a company we've purchased. It has an ancient and obsolete implementation of services, the worst (and most expensive) of which is a 5-node Cassandra cluster (RF=3). I'm new to Cassandra, and yes, I'm working my way through the docs.

I was told that a few months ago Amazon asked them to reboot one of their servers (it had been running for so long that Amazon had to make some changes and needed it rebooted), so they had to add a new node to the cluster. If you query nodetool as of now, it shows:

$ nodetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address         DC           Rack   Status  State   Load       Owns    Token
                                                                       141784319550391026443072753096570088105
10.128.50.130   datacenter1  rack1  Up      Normal  263.06 GB  16.67%  0
10.128.50.237   datacenter1  rack1  Up      Normal  253.31 GB  16.67%  28356863910078205288614550619314017621
10.128.60.106   datacenter1  rack1  Up      Normal  262.12 GB  33.33%  85070591730234615865843651857942052863
10.128.70.41    datacenter1  rack1  Up      Normal  264.28 GB  16.67%  113427455640312821154458202477256070484
10.128.60.206   datacenter1  rack1  Up      Normal  65.15 GB   16.67%  141784319550391026443072753096570088105

What puzzles me is the last line. It belongs to the most recently added node, the new one I mentioned. While it shows the same ownership (16.67%) as three of the other nodes, its Load is about 4 times lower.

What does this mean? Is the difference data that has not been cleaned up on the older nodes, such as TTL-expired cells or tombstoned data?

Thanks, and excuse me if I'm asking something stupid.

Rubén.
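
P.S. For reference, here is a minimal sketch of how the Owns column seems to follow from the tokens in the ring output above. It assumes the cluster uses RandomPartitioner with a ring of 2**127 tokens, which the token values suggest but the output does not state. The function name `ownership` and the `TOKENS` dictionary are my own, for illustration only. Each node is taken to own the range from the previous token (exclusive) up to its own token (inclusive), wrapping around at the end of the ring.

```python
from fractions import Fraction

# Assumption: RandomPartitioner, whose token space has 2**127 positions.
RING_SIZE = 2**127

# Address -> token, copied from the nodetool ring output.
TOKENS = {
    "10.128.50.130": 0,
    "10.128.50.237": 28356863910078205288614550619314017621,
    "10.128.60.106": 85070591730234615865843651857942052863,
    "10.128.70.41": 113427455640312821154458202477256070484,
    "10.128.60.206": 141784319550391026443072753096570088105,
}


def ownership(tokens, ring_size=RING_SIZE):
    """Return address -> fraction of the ring owned, ignoring replication."""
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    shares = {}
    for i, (address, token) in enumerate(ordered):
        # For i == 0 the index -1 picks the last node, giving the wrap-around range.
        previous_token = ordered[i - 1][1]
        width = (token - previous_token) % ring_size
        shares[address] = Fraction(width, ring_size)
    return shares


if __name__ == "__main__":
    for address, share in ownership(TOKENS).items():
        print(f"{address:15s} {float(share) * 100:6.2f}%")
```

This only describes primary token ranges. It says nothing about how much data sits on disk, which is what the Load column reports, so equal ownership does not by itself imply equal Load.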