Hi All, I'm running a benchmark on Cassandra using a benchmark client I've written myself.
I'm running the following scenario: a single Cassandra node on the same machine as the client. The client writes a new key every second and deletes it 10 seconds later, so at any given time there should be only 10 live keys (I've pasted a simplified sketch of the client loop at the end of this mail). The value size per key is 2K.

When I ran this scenario I watched the data folder: Cassandra initially created 4 SSTable files of ~130K each and then compacted them into one file of 20K, which is exactly what I expected (10 keys * 2K = 20K). After that another 3 ~130K files appeared and were compacted together with the first 20K file into a new 20K file, and so on. This run behaved exactly as I expected.

Then I ran the same scenario, but this time with a value size of 2M. Initially Cassandra created 4 SSTable files of ~64M each and compacted them into one file of 20M, which again is what I expected (10 keys * 2M = 20M). But after it created another 3 64M files the problem started: instead of compacting them with the first 20M file, it created one more 64M file and compacted all 4 of them into a 260M file (!). After another 4 64M files it compacted those into another 260M file, and so on. It looks to me like in this scenario the compaction is, for some reason, not removing the deleted keys, and I have no idea why :-(

Additional info I should mention: in storage-conf.xml the following are not the defaults:

  GCGraceSeconds = 0
  MemtableFlushAfterMinutes = 1
  <ColumnFamily Name="Standard2" CompareWith="UTF8Type" KeysCached="99%"/>

Thanks a lot for your help,
Amir
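
P.S. In case it helps, here is a simplified Python sketch of the logic my client follows (not the actual client code; insert() and remove() below are just placeholders for the real Cassandra calls):

  import time
  from collections import deque

  VALUE_SIZE = 2 * 1024      # 2K in the first run, 2 * 1024 * 1024 (2M) in the second
  LIVE_WINDOW_SECONDS = 10   # each key is deleted 10 seconds after it was written

  def insert(key, value):
      # placeholder for the real insert into the Standard2 column family
      pass

  def remove(key):
      # placeholder for the real delete
      pass

  def run():
      value = b"x" * VALUE_SIZE
      live_keys = deque()    # (key, write_time) pairs, oldest first
      counter = 0
      while True:
          key = "key-%d" % counter
          insert(key, value)
          live_keys.append((key, time.time()))
          counter += 1

          # delete every key that has been live for more than 10 seconds,
          # so at any moment roughly 10 keys should exist
          while live_keys and time.time() - live_keys[0][1] >= LIVE_WINDOW_SECONDS:
              old_key, _ = live_keys.popleft()
              remove(old_key)

          time.sleep(1)      # one new key per second

  if __name__ == "__main__":
      run()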