I think you're introducing a layer violation. GDPR is a business
requirement, and compaction is an implementation detail.

IMHO it's enough to delete the partition using regular CQL.
It's true that it won't be deleted immediately, but it will eventually be
deleted (welcome to eventual consistency ;).
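
To make "regular CQL" concrete, here is a minimal sketch that builds a partition-level DELETE with a bind marker for the key. The keyspace, table and column names (crm, client_data, client_id) are made up for illustration, and the resulting string would still have to be prepared and executed with whatever driver you use.

```python
import re

# Accept only plain, unquoted identifiers so that names cannot smuggle in CQL.
_IDENT = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def partition_delete_cql(keyspace, table, key_column):
    """Build a DELETE for one whole partition; the key is bound later via '?'."""
    for name in (keyspace, table, key_column):
        if not _IDENT.match(name):
            raise ValueError(f"not a plain CQL identifier: {name!r}")
    return f"DELETE FROM {keyspace}.{table} WHERE {key_column} = ?"


# Hypothetical schema: one partition per client, keyed by client_id.
statement = partition_delete_cql("crm", "client_data", "client_id")
print(statement)
```

With one partition per client, this single statement covers everything stored for that client in the table.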

Even with user-defined compaction, compaction may not run instantly,
repair will be required, and there are other nodes in the cluster,
possibly partitioned nodes that still hold the data. There is also data
in snapshots and backups.

The business idea is to delete the data within a time that is fast and
reasonable for humans: first make it unreachable, then later delete it
completely.
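
The "unreachable first, gone later" behaviour can be illustrated with a toy single-node model. This is only a sketch under simplifying assumptions: ToyReplica is a made-up class, it ignores write timestamps, replication and repair, and it is not how Cassandra stores data internally. The only real value in it is 864000, the default gc_grace_seconds.

```python
from dataclasses import dataclass, field

GC_GRACE_SECONDS = 864_000  # Cassandra's default gc_grace_seconds (10 days)


@dataclass
class ToyReplica:
    """Toy single-node model; not how Cassandra stores data internally."""
    rows: dict = field(default_factory=dict)        # partition key -> value
    tombstones: dict = field(default_factory=dict)  # partition key -> deletion time

    def delete(self, key, now):
        # A delete only writes a marker; the old value stays where it is.
        self.tombstones[key] = now

    def read(self, key):
        # Reads honour the tombstone, so the data is unreachable right away.
        if key in self.tombstones:
            return None
        return self.rows.get(key)

    def compact(self, now):
        # Compaction drops the shadowed data, and drops the tombstone itself
        # only once gc_grace_seconds has elapsed since the delete.
        for key, deleted_at in list(self.tombstones.items()):
            self.rows.pop(key, None)
            if now - deleted_at >= GC_GRACE_SECONDS:
                del self.tombstones[key]


node = ToyReplica(rows={"client-42": "personal data"})
node.delete("client-42", now=0)
assert node.read("client-42") is None   # unreachable immediately
assert "client-42" in node.rows         # but still physically present
node.compact(now=GC_GRACE_SECONDS)
assert "client-42" not in node.rows     # physically removed by compaction
assert not node.tombstones              # tombstone purged after the grace period
```

The model deliberately leaves out the other places the data can still live (other replicas, snapshots, backups), which need their own handling.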

On Fri, Feb 9, 2018 at 8:51 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> That might be fine for a one off but is totally impractical at scale or
> when using TWCS.
> On Fri, Feb 9, 2018 at 8:39 AM DuyHai Doan <doanduy...@gmail.com> wrote:
>> Or use the new user-defined compaction option recently introduced,
>> provided you can determine over which SSTables a partition is spread
>> On Fri, Feb 9, 2018 at 5:23 PM, Jon Haddad <j...@jonhaddad.com> wrote:
>>> Give this a read through:
>>> https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy
>>> Basically you write your own logic for how stuff gets forgotten, then
>>> you can recompact every sstable with upgradesstables -a.
>>> Jon
>>> On Feb 9, 2018, at 8:10 AM, Nicolas Guyomar <nicolas.guyo...@gmail.com>
>>> wrote:
>>> Hi everyone,
>>> Because of GDPR we really face the need to support “Right to Be
>>> Forgotten” requests => https://gdpr-info.eu/art-17-gdpr/  stating that *"the
>>> controller shall have the obligation to erase personal data without undue
>>> delay"*
>>> Because I usually meet customers that do not have that many clients,
>>> modeling one partition per client is almost always possible, easing
>>> deletion by partition key.
>>> Then, apart from triggering a manual compaction on impacted tables
>>> using STCS, I do not see how I can be GDPR compliant.
>>> I'm kind of surprised not to find any thread on that matter on the ML,
>>> do you guys have any modeling strategy that would make it easier to get rid
>>> of data?
>>> Thank you for any given advice
>>> Nicolas
