I've been wondering about this too, but every column has both a timestamp and a TTL. Unless the timestamp is not preserved, there should be no need to adjust the TTL, assuming the expiration time is determined from these two variables.
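For what it's worth, a minimal sketch of that reasoning in Java (the names ExpiringCell, writeTimeSeconds, ttlSeconds are just illustrative, not Cassandra internals):

// Illustrative sketch only -- these names are made up and are not Cassandra's internal API.
final class ExpiringCell {
    final long writeTimeSeconds;  // original write timestamp, in seconds
    final int ttlSeconds;         // TTL supplied at write time

    ExpiringCell(long writeTimeSeconds, int ttlSeconds) {
        this.writeTimeSeconds = writeTimeSeconds;
        this.ttlSeconds = ttlSeconds;
    }

    // The absolute expiration point is fixed when the cell is written,
    // independent of which node holds the cell or how it got there.
    long expiresAtSeconds() {
        return writeTimeSeconds + ttlSeconds;
    }

    // As long as the write timestamp travels with the cell (e.g. during
    // streaming), expiry can be evaluated on any node without rewriting the TTL.
    boolean isExpired(long nowSeconds) {
        return nowSeconds >= expiresAtSeconds();
    }
}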
Does that make sense? My question is how often Cassandra checks for TTL expirations. Does it happen at compaction time? At some other time?

Caleb Rackliffe | Software Developer
M 949.981.0159 | ca...@steelhouse.com

From: i...@4friends.od.ua
Reply-To: user@cassandra.apache.org
Date: Mon, 19 Mar 2012 15:28:40 -0400
To: user@cassandra.apache.org
Subject: Re: repair broke TTL based expiration

Hello,

Data size should decrease during minor compactions. Check the logs for compaction results.

-----Original Message-----
From: Radim Kolar <h...@filez.com>
To: user@cassandra.apache.org
Sent: Mon, 19 Mar 2012 12:16
Subject: repair broke TTL based expiration

I suspect that running a cluster-wide repair interferes with TTL-based expiration. I am running repair every 7 days and using a TTL of 7 days as well. Data are never deleted. The data stored in Cassandra keep growing (I have been watching them for 3 months), but they should not. If I run a manual cleanup, some data are deleted, but only about 5%. Currently there are about 3-5 times more rows than I estimate there should be.

I suspect that running repair on data with a TTL can cause one of the following:

1. The expiration check is skipped for expired records, so they are streamed to another node and become live again, or
2. Streamed data are propagated with the full TTL. Say I have a TTL of 7 days and data have been stored for 5 days when repair runs; they should be sent to the other node with a TTL of 2 days, not 7.

Can someone test this case? I cannot experiment with the production cluster too much.
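To put numbers on the scenario Radim describes, here is a rough Java sketch of the two behaviours he distinguishes; the class and variable names are purely illustrative, not Cassandra code:

// Rough arithmetic for the 7-day-TTL / repair-after-5-days scenario above.
class TtlRepairSketch {
    public static void main(String[] args) {
        long DAY = 24L * 3600L;

        long writeTime  = 0L;       // cell written at day 0
        long ttl        = 7 * DAY;  // 7-day TTL
        long repairTime = 5 * DAY;  // streamed to another node at day 5

        // Expected behaviour: expiry stays anchored to the original write time,
        // so after the repair roughly 2 days of life remain.
        long correctExpiry = writeTime + ttl;            // day 7
        long remaining     = correctExpiry - repairTime; // 2 days

        // Behaviour suspected in case 2: the receiving node treats the streamed
        // cell as freshly written, restarting the TTL clock at day 5.
        long suspectedExpiry = repairTime + ttl;         // day 12

        System.out.println("remaining after repair (days): " + remaining / DAY);
        System.out.println("expiry if TTL restarts (day):  " + suspectedExpiry / DAY);

        // With repair every 7 days and a 7-day TTL, the suspected behaviour would
        // keep pushing expiry forward, and the data would effectively never expire.
    }
}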