Thanks for the explanation.
From: Marcus Eriksson [mailto:krum...@gmail.com]
Sent: Thursday, March 17, 2016 12:56 AM
To: user@cassandra.apache.org
Subject: Re: DTCS Question
On Wed, Mar 16, 2016 at 6:49 PM, Anubhav Kale
<anubhav.k...@microsoft.com> wrote:
I am using Cassandra 2.1.13 which has all the latest DTCS fixes (it does STCS
within the DTCS windows). It also introduced a field called MAX_WINDOW_SIZE
which defaults to one day.
So in my data folders, I may see SSTables that span beyond a day (produced from
old data via repairs or commit log replay), but whenever I see a message
in logs “Compacted Foo” (meaning the SSTable in question was definitely a
result of compaction), the “Foo” SSTable should never have data beyond a day.
Is this understanding accurate?
No - not until
https://issues.apache.org/jira/browse/CASSANDRA-10496
(read for explanation)
If we have issues with repairs pulling in old data, should MAX_WINDOW_SIZE
instead be set to a larger value so that we don’t run the risk of too many
SSTables lying around and never getting compacted?
No, with CASSANDRA-10280 that old data will get compacted if needed (assuming
you have default settings). If the remote node is correctly date tiered, the
streamed sstable will also be correctly date tiered. Then that streamed sstable
will be put in a time window and if there are enough sstables in that old
window, we do a compaction.
/Marcus
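
A rough sketch of the behaviour described above: sstables are placed in time windows, and a window is compacted once it holds enough sstables. This is not Cassandra's implementation. The names (SSTable, window_of, compaction_candidates, merged_span_seconds), the fixed one-day window, the threshold of 4, and keying the window on the oldest timestamp are all assumptions made for illustration; the real strategy uses windows that grow with age up to the configured maximum. The last helper shows why a compacted sstable can still cover more than one window when one of its inputs does.

```python
from collections import defaultdict
from dataclasses import dataclass

DAY_SECONDS = 24 * 60 * 60  # assumed window size: the one-day default mentioned in the thread
MIN_THRESHOLD = 4           # assumed number of sstables a window needs before it is compacted


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an sstable's timestamp metadata."""
    name: str
    min_ts: int  # oldest write timestamp in the sstable, in seconds
    max_ts: int  # newest write timestamp in the sstable, in seconds


def window_of(sstable, window_seconds=DAY_SECONDS):
    """Return the start of the fixed-size window containing the oldest timestamp."""
    return sstable.min_ts - (sstable.min_ts % window_seconds)


def compaction_candidates(sstables, window_seconds=DAY_SECONDS,
                          min_threshold=MIN_THRESHOLD):
    """Group sstables by window and keep only windows with enough members."""
    windows = defaultdict(list)
    for sstable in sstables:
        windows[window_of(sstable, window_seconds)].append(sstable)
    return {start: members for start, members in windows.items()
            if len(members) >= min_threshold}


def merged_span_seconds(members):
    """Time range the compacted output would cover if the inputs are simply merged."""
    return max(s.max_ts for s in members) - min(s.min_ts for s in members)
```

With three small old sstables plus one that starts in the same window but also holds much newer data, the old window reaches the threshold and becomes a candidate, and merged_span_seconds for that window exceeds DAY_SECONDS. That is the case where a "Compacted Foo" sstable still holds more than a day of data.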