Thank you both for your valuable information. I will test and report back
soon.

Best regards

On Tue, Dec 12, 2023 at 2:39 PM Xiaoxiang Yu <x...@apache.org> wrote:

> I don't know GDPR very well. Here is my understanding.
>
> For Hive and HDFS, you can consider these techniques, which support
> ACID operations in Spark and Hive (I recommend the first one):
> 1) Delta Lake,
> https://docs.databricks.com/en/security/privacy/gdpr-delta.html
> 2) Hive ACID table, here is a link,
>
> https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/migrate-hive-workloads/topics/hive-acid-migration-regulations.html
>
> For Kylin, there are three places that may store data: index, snapshot,
> and dict. Refreshing the snapshot costs less time and fewer resources,
> while refreshing the index/dict costs much more. A snapshot refresh is
> triggered automatically when you build an index every day.
>
> I think you should consider centralizing user-sensitive columns (email,
> phone, address) in dimension tables, so that your fact table only has
> the foreign key (for example, uid) that refers to the primary key of the
> dimension tables.
> When you are modeling in Kylin, for the dim tables that contain
> user-sensitive columns, try the following:
>
> 1. Set the dim tables as snapshots by disabling precompute join relations,
> so these columns won't be built into indexes; refer to
>
> https://kylin.apache.org/5.0/docs/modeling/model_design/precompute_join_relations
> 2. Do not create a bitmap measure on these columns, so they won't be
> built into the dict.
>
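
To make the modeling suggestion above concrete, here is a minimal sketch of
the idea: sensitive columns live only in a dimension table, the fact table
carries only uid, and an erasure request becomes a single delete on the
dimension table. This is an assumption-laden illustration, not Kylin or Hive
code: sqlite3 stands in for the ACID-capable store (Delta Lake or a Hive ACID
table), and the names dim_user, fact_payment and erase_user are hypothetical.

```python
import sqlite3

# sqlite3 is only a stand-in for an ACID table store (Delta Lake / Hive ACID).
# All table and column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_user (
        uid INTEGER PRIMARY KEY, email TEXT, phone TEXT, address TEXT
    );
    CREATE TABLE fact_payment (
        txn_id INTEGER PRIMARY KEY, uid INTEGER, amount REAL
    );
    INSERT INTO dim_user VALUES
        (1, 'a@example.com', '111', 'Hanoi'),
        (2, 'b@example.com', '222', 'Hue');
    INSERT INTO fact_payment VALUES (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0);
""")


def erase_user(conn, uid):
    """Delete one user's sensitive columns; fact rows are left untouched."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM dim_user WHERE uid = ?", (uid,))


erase_user(conn, 1)

# The sensitive row is gone from the dimension table.
remaining_users = conn.execute(
    "SELECT uid FROM dim_user ORDER BY uid"
).fetchall()

# Aggregates over the fact table still work, since it only stores uid.
fact_rows_for_1 = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM fact_payment WHERE uid = 1"
).fetchone()

# A join no longer resolves any personal data for the erased user.
joined = conn.execute(
    "SELECT f.txn_id, d.email FROM fact_payment f "
    "LEFT JOIN dim_user d ON f.uid = d.uid ORDER BY f.txn_id"
).fetchall()
```

In a real deployment, the delete would run against the Hive or Delta table,
followed by a snapshot refresh in Kylin, which is the cheaper of the refresh
operations described above.
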
> ------------------------
> With warm regard
> Xiaoxiang Yu
>
>
>
> On Tue, Dec 12, 2023 at 12:11 PM Nam Đỗ Duy <na...@vnpay.vn.invalid>
> wrote:
>
> > Dear Xiaoxiang, Sirs/Madams
> >
> > I face an issue with deleting user data under a GDPR-like policy:
> > when a user sends a request to delete their personal data, we need
> > to delete it from all systems, which means deleting the data:
> >
> > 1- from Kylin index (cube)
> > 2- from Hive
> > 3- from HDFS
> >
> > Have you had the same use case before? Do you have any suggestions for
> > handling this scenario?
> >
> > Thank you very much and best regards
> >
>
