We have a scheduler app here at SmartThings, where we track per-second tasks to be executed.
These are all TTL'd to expire once the second the event was registered for has passed. If the scheduling window were sufficiently small, say one day, we could probably use a time window compaction strategy for this. But per the contract, the window is one to two years' worth of ad hoc event registration. Because events are registered at different times and expire at different times, their data is intermingled, and the sstables are not written with data that TTLs in roughly the same time period. If they were, compaction would be relatively easy, since the entire sstable would tombstone at once.

We could approximate this with sharded tables, one per time period, rotating the shards into duty and truncating them as they are recycled. A more elegant approach would be a custom compaction strategy that "windows" the data into clustered sstables, which could then be compacted with other sstables from the same time bucket. This would require visibility into the row key when the memtable data is converted to sstables. Is that even possible with compaction schemes? We would make it a requirement that the time-based data be in the row key, as one component if the row key is composite.
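To make the bucketing idea concrete, here is a minimal Python sketch of the grouping I have in mind. It is not Cassandra code and does not touch any Cassandra API; the function names, the one-day window, and the assumption that the task's fire time can be read from the row key as epoch seconds are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 24 * 60 * 60  # hypothetical bucket width: one day


def expiry_bucket(fire_at_epoch, window=WINDOW_SECONDS):
    """Return the start (epoch seconds) of the window containing the fire time."""
    return fire_at_epoch - (fire_at_epoch % window)


def group_by_expiry(rows, window=WINDOW_SECONDS):
    """Group (row_key, fire_at_epoch) pairs by expiry window.

    Each group stands in for one sstable whose rows all expire
    within the same window.
    """
    buckets = defaultdict(list)
    for row_key, fire_at in rows:
        buckets[expiry_bucket(fire_at, window)].append(row_key)
    return dict(buckets)


def fully_expired(buckets, now_epoch, window=WINDOW_SECONDS):
    """Return the bucket starts whose entire window lies in the past."""
    return sorted(start for start in buckets if start + window <= now_epoch)


# Rows registered in arbitrary order, with fire times spread over several days.
rows = [("a", 10), ("d", 200000), ("b", 86399), ("c", 86400)]
buckets = group_by_expiry(rows)
expired = fully_expired(buckets, now_epoch=172800)
```

With grouping like this, every row in a bucket expires within the same window, so a bucket whose window has passed could be discarded as a unit instead of being merged row by row. Whether a compaction strategy can actually see the row key at flush time is exactly the open question above.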