[
https://issues.apache.org/jira/browse/HBASE-25118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rushabh Shah updated HBASE-25118:
---------------------------------
Description:
We want to track the source of mutations (especially Deletes) issued via
Phoenix. Deletes come from several sources: customers deleting their own data,
internal processes such as GDPR compliance, and Phoenix TTL MR jobs. For every
mutation, we want to track the source of the operation that initiated it.
At my day job, we have a custom Backup/Restore tool.
For example: during a GDPR compliance cleanup (let's say at time t0), we
mistakenly deleted some customer data, and the customer may also have deleted
some data on their side (at time t1). To recover the mistakenly deleted data,
we restore from the backup taken at time (t0 - 1). By doing this, we also
recover the data that the customer intentionally deleted.
We need a way for the Restore tool to recover data selectively.
To explain with an example:
Let's say there are 2 different systems (call them accidental-delete and
customer-delete) deleting data from the same table at almost the same time.
As the names suggest, customer-delete issues intentional deletes and
accidental-delete issues deletes made by mistake. We have a restore tool that
restores all the data between a start time and an end time (start-ts and
end-ts). We want to restore the data deleted by the accidental-delete system
but not the data deleted by the customer-delete system. By adding cell tags to
Delete Markers, we can avoid restoring data deleted by the customer-delete
system.
In my proposal, I want to add cell tags to the Tombstone delete markers so
that the tags are present in the backups. In case we have to restore data, we
can restore specific rows depending on the tag present in the cell.
We want to leverage the Cell Tag feature for Delete mutations to store this
metadata. Currently the Delete object doesn't support Tags.
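The selective restore described above can be sketched as a small model. This is only an illustration of the idea, not HBase or Phoenix code: the `DeleteMarker` class, the `source_tag` field, the tag values, and the `selective_restore` function are all hypothetical names, and the sketch assumes the time range is inclusive on both ends and that the backup is a simple row-to-value snapshot taken before the deletes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical tag values naming the system that issued a delete.
ACCIDENTAL = "accidental-delete"
CUSTOMER = "customer-delete"


@dataclass(frozen=True)
class DeleteMarker:
    """Toy stand-in for a tombstone; not an HBase class."""
    row: str
    ts: int
    source_tag: Optional[str] = None  # None models an untagged delete marker


def markers_to_undo(markers: List[DeleteMarker], start_ts: int, end_ts: int,
                    undo_source: str) -> List[DeleteMarker]:
    """Pick markers inside [start_ts, end_ts] (assumed inclusive) whose tag matches."""
    return [m for m in markers
            if start_ts <= m.ts <= end_ts and m.source_tag == undo_source]


def selective_restore(table: Dict[str, str], backup: Dict[str, str],
                      markers: List[DeleteMarker], start_ts: int, end_ts: int,
                      undo_source: str) -> Dict[str, str]:
    """Return a copy of `table` with only the rows deleted by `undo_source` restored."""
    restored = dict(table)
    for marker in markers_to_undo(markers, start_ts, end_ts, undo_source):
        if marker.row in backup:
            restored[marker.row] = backup[marker.row]
    return restored


if __name__ == "__main__":
    backup = {"row1": "a", "row2": "b", "row3": "c"}  # snapshot at (t0 - 1)
    table = {"row3": "c"}                              # state after both deletes
    markers = [
        DeleteMarker("row1", 100, ACCIDENTAL),  # deleted by mistake
        DeleteMarker("row2", 101, CUSTOMER),    # deleted on purpose
    ]
    # Only row1 is brought back; row2 stays deleted.
    print(selective_restore(table, backup, markers, 90, 110, ACCIDENTAL))
```

Without a tag on the delete marker, both markers would look identical to the restore tool, which is why the time range alone cannot separate the two systems.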
was:
We want to track the source of mutations (especially Deletes) issued via
Phoenix. Deletes come from several sources: customers deleting their own data,
internal processes such as GDPR compliance, and Phoenix TTL MR jobs. For every
mutation, we want to track the source of the operation that initiated it.
At my day job, we have a custom Backup/Restore tool.
For example: during a GDPR compliance cleanup (let's say at time t0), we
mistakenly deleted some customer data, and the customer may also have deleted
some data on their side (at time t1). To recover the mistakenly deleted data,
we restore from the backup taken at time (t0 - 1). By doing this, we also
recover the data that the customer intentionally deleted.
We need a way for the Restore tool to recover data selectively.
We want to leverage the Cell Tag feature for Delete mutations to store this
metadata. Currently the Delete object doesn't support Tags.
> Extend Cell Tags to Delete object.
> ----------------------------------
>
> Key: HBASE-25118
> URL: https://issues.apache.org/jira/browse/HBASE-25118
> Project: HBase
> Issue Type: Improvement
> Reporter: Rushabh Shah
> Assignee: Rushabh Shah
> Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0