I've been doing a lot of reading on SSTable fragmentation caused by updates, and on the cost of reconstructing the final data from multiple SSTables that were created over time and have not yet been compacted. One question is stuck in my head: if you re-insert an entire row instead of updating a single column, will Cassandra end up flushing that entire row into one SSTable on disk, and then on reads quickly find a complete, non-fragmented row instead of potentially reconstructing it across multiple SSTables? Obviously this has implications for disk space as a trade-off.
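
To make the scenario concrete, here is a toy Python model of the two write patterns. It is not Cassandra code: the dict-based "SSTables", the `merge_row` helper, and the sample data are all made up for illustration, and the merge is a naive newest-timestamp-wins pass rather than whatever the real read path does.

```python
# Toy model only: each "SSTable" is a dict of row_key -> {column: (value, timestamp)}.
# Nothing here reflects Cassandra's actual on-disk format or read path.

def merge_row(sstables, key):
    """Rebuild a row from every fragment found; the newest timestamp wins per column."""
    merged = {}
    touched = 0  # how many SSTables hold a fragment of this row
    for table in sstables:
        fragment = table.get(key)
        if fragment is None:
            continue
        touched += 1
        for column, (value, ts) in fragment.items():
            if column not in merged or ts > merged[column][1]:
                merged[column] = (value, ts)
    return {c: v for c, (v, _) in merged.items()}, touched

# Pattern A: one initial insert, then single-column updates flushed separately.
column_updates = [
    {"user1": {"name": ("Ann", 1), "city": ("Oslo", 1), "age": (30, 1)}},
    {"user1": {"city": ("Rome", 2)}},
    {"user1": {"age": (31, 3)}},
]

# Pattern B: the whole row is re-inserted on every change.
full_rewrites = [
    {"user1": {"name": ("Ann", 1), "city": ("Oslo", 1), "age": (30, 1)}},
    {"user1": {"name": ("Ann", 2), "city": ("Rome", 2), "age": (30, 2)}},
    {"user1": {"name": ("Ann", 3), "city": ("Rome", 3), "age": (31, 3)}},
]

row_a, touched_a = merge_row(column_updates, "user1")
row_b, touched_b = merge_row(full_rewrites, "user1")

# In pattern B the newest SSTable alone already holds the complete row.
newest_only = {c: v for c, (v, _) in full_rewrites[-1]["user1"].items()}
```

In this model both patterns leave fragments in all three SSTables, and a naive merge touches every one of them. The difference is that with full re-inserts the newest SSTable is complete by itself, so my question is really whether Cassandra's read path can take advantage of that and stop early.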
Wayne