Thank you both for your valuable information. I will test and get back to you soon.
Best regards

On Tue, Dec 12, 2023 at 2:39 PM Xiaoxiang Yu <x...@apache.org> wrote:

> I don't know GDPR very well. Here is my understanding.
>
> For Hive and HDFS, you can consider using these techniques, which support
> ACID in Spark and Hive (I recommend the first one):
> 1) Delta Lake,
> https://docs.databricks.com/en/security/privacy/gdpr-delta.html
> 2) Hive ACID table, here is a link,
> https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/migrate-hive-workloads/topics/hive-acid-migration-regulations.html
>
> For Kylin, there are three places which may store data: index, snapshot,
> and dict. Refreshing the snapshot costs less time and resources, while
> refreshing the index/dict costs much more. Snapshot refresh will be
> triggered automatically when you build an index every day.
>
> I think you should consider centralizing user-sensitive columns (email,
> phone, address) in dimension tables, so that your fact table only has the
> foreign key (for example, uid), which refers to the primary key of the
> dimension tables.
> When you are modeling in Kylin, for these dim tables which contain
> user-sensitive columns, try to
>
> 1. set dim tables as snapshots by disabling precompute join relations, so
> these columns won't be built into indexes, refer to
> https://kylin.apache.org/5.0/docs/modeling/model_design/precompute_join_relations
> 2. not create a bitmap measure on these columns, so these columns won't be
> built into dict
>
> ------------------------
> With warm regard
> Xiaoxiang Yu
>
>
> On Tue, Dec 12, 2023 at 12:11 PM Nam Đỗ Duy <na...@vnpay.vn.invalid>
> wrote:
>
> > Dear Xiaoxiang, Sirs/Madams
> >
> > I face an issue with deleting user data according to a GDPR-like policy,
> > which means that when a user sends a request to delete their personal
> > data, we need to delete it from all systems, that is:
> >
> > 1- from Kylin index (cube)
> > 2- from Hive
> > 3- from HDFS
> >
> > Have you had the same use case before? Do you have any suggestions to
> > achieve this scenario?
> >
> > Thank you very much and best regards
> >
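
For reference, the modeling advice quoted above (keep email, phone, and address only in a dimension table, and let the fact table carry just the uid foreign key) can be sketched roughly as follows. This is only an illustration under assumptions: Python's built-in sqlite3 stands in for Hive, the table names dim_user and fact_payment and the helper functions are hypothetical, and nothing here is a Kylin or Hive API.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory star schema where PII lives only in dim_user."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_user (          -- hypothetical dimension table
            uid INTEGER PRIMARY KEY,
            email TEXT, phone TEXT, address TEXT
        );
        CREATE TABLE fact_payment (      -- hypothetical fact table, no PII
            payment_id INTEGER PRIMARY KEY,
            uid INTEGER,
            amount REAL
        );
    """)
    conn.executemany("INSERT INTO dim_user VALUES (?, ?, ?, ?)", [
        (1, "a@example.com", "111", "Street A"),
        (2, "b@example.com", "222", "Street B"),
    ])
    conn.executemany("INSERT INTO fact_payment VALUES (?, ?, ?)", [
        (10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0),
    ])
    conn.commit()
    return conn


def erase_user(conn, uid):
    """Delete one dimension row; fact rows are left untouched."""
    cur = conn.execute("DELETE FROM dim_user WHERE uid = ?", (uid,))
    conn.commit()
    return cur.rowcount


def payments_with_email(conn):
    """LEFT JOIN so fact rows survive even when the dimension row is gone."""
    return conn.execute("""
        SELECT f.payment_id, f.amount, d.email
        FROM fact_payment f LEFT JOIN dim_user d ON f.uid = d.uid
        ORDER BY f.payment_id
    """).fetchall()
```

With this layout, an erasure request touches a single small table, which matches the point above that refreshing a snapshot is cheaper than rebuilding index or dict. The uid itself stays in the fact table in this sketch; whether that is acceptable depends on your policy.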