I have tried deleting SSTables manually in the past, but noticed that bits
and pieces of that data still remained even though the SSTables for that
window had been deleted. So I have always wondered: is working directly on
the underlying filesystem a safe bet?
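
For context, here is a rough sketch of the kind of script-based purge quoted
further down in this thread ("dropping SSTables ... where create date is older
than a threshold"). It is only a dry run that lists candidates and deletes
nothing. Several things in it are assumptions on my part, not anything
Cassandra provides: the function name expired_sstable_sets, the use of file
modification time as a stand-in for "create date", and the file-name pattern
(component names like mc-12-big-Data.db, which vary by Cassandra version). It
also does not check for overlapping SSTables, which is exactly the safety
concern raised below.

```python
import os
import re
import time
from collections import defaultdict

# Assumed shape of an SSTable component file name, e.g. "mc-12-big-Data.db".
# Everything before the last "-<Component>.<ext>" is treated as the SSTable
# prefix. Naming differs between Cassandra versions, so adjust as needed.
COMPONENT_RE = re.compile(r"^(.+)-([A-Za-z]+)\.(?:db|txt|crc32)$")


def expired_sstable_sets(table_dir, max_age_seconds, now=None):
    """Return {sstable_prefix: [component paths]} for SSTables whose newest
    component file is older than max_age_seconds.

    Hypothetical helper: it uses file mtime as a proxy for creation time and
    performs no overlap or tombstone checks. It only reports candidates.
    """
    now = time.time() if now is None else now
    files = defaultdict(list)      # prefix -> component paths
    newest = defaultdict(float)    # prefix -> newest mtime among components

    for name in os.listdir(table_dir):
        match = COMPONENT_RE.match(name)
        path = os.path.join(table_dir, name)
        if not match or not os.path.isfile(path):
            continue
        prefix = match.group(1)
        files[prefix].append(path)
        newest[prefix] = max(newest[prefix], os.path.getmtime(path))

    # Keep an SSTable only if every one of its components is past the cutoff.
    return {
        prefix: sorted(paths)
        for prefix, paths in files.items()
        if now - newest[prefix] > max_age_seconds
    }
```

Calling expired_sstable_sets("/path/to/table/dir", 30 * 24 * 3600) would
return the component files of every SSTable not modified in the last 30 days
(the path and the 30-day threshold are placeholders). Grouping by prefix
matters because an SSTable is several files on disk, and removing only some
of them would leave the table directory inconsistent. Any actual removal
would presumably happen with the node stopped, as described below.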


On Mon, Feb 11, 2019 at 1:01 PM Jonathan Haddad <j...@jonhaddad.com> wrote:

> Deleting SSTables manually can be useful if you don't know your TTL up
> front.  For example, you have an ETL process that moves your raw Cassandra
> data into S3 as parquet files, and you want to be sure that process is
> completed before you delete the data.  You could also start out without
> setting a TTL and later realize you need one.  This is a remarkably common
> problem.
>
> On Mon, Feb 11, 2019 at 12:51 PM Nitan Kainth <nitankai...@gmail.com>
> wrote:
>
>> Jeff,
>>
>> Does that mean we have to delete SSTables manually?
>>
>>
>> Regards,
>>
>> Nitan
>>
>> Cell: 510 449 9629
>>
>> On Feb 11, 2019, at 2:40 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>> There's a bit of headache around overlapping sstables being strictly safe
>> to delete.  https://issues.apache.org/jira/browse/CASSANDRA-13418 was
>> added to allow the "I know it's not technically safe, but just delete it
>> anyway" use case. For a lot of people who started using TWCS before 13418,
>> "stop cassandra, remove stuff we know is expired, start cassandra" is a
>> not-uncommon pattern in very high-write, high-disk-space use cases.
>>
>>
>>
>> On Mon, Feb 11, 2019 at 12:34 PM Nitan Kainth <nitankai...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> Regarding the comment “Purging data is also straightforward, just
>>> dropping SSTables (by a script) where create date is older than a
>>> threshold, we don't even need to rely on TTL”:
>>>
>>> Don't the old SSTables get dropped on their own? Once the TTL and
>>> gc_grace_seconds have passed, the whole SSTable will contain only
>>> tombstones.
>>>
>>>
>>> Regards,
>>>
>>> Nitan
>>>
>>> Cell: 510 449 9629
>>>
>>> On Feb 11, 2019, at 2:23 PM, DuyHai Doan <doanduy...@gmail.com> wrote:
>>>
>>> Purging data is also straightforward, just dropping SSTables (by a
>>> script) where create date is older than a threshold, we don't even need to
>>> rely on TTL
>>>
>>>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> twitter: rustyrazorblade
>


-- 
Akash
