@Mike: That said, what we have to consider here is also partly very frustrating. But also pretty fascinating.

Best regards
Kay-Uwe Moosheimer

> On 30.01.2020 at 16:23, Mike Thomsen <[email protected]> wrote:
>
> That's actually a pretty fascinating use case. Our experience on this
> side of the Atlantic is that few people really care about lineage.
>
>> On Thu, Jan 30, 2020 at 9:48 AM [email protected] <[email protected]> wrote:
>>
>> I think you have the wrong picture.
>>
>> Data lineage systems like Atlas and similar are pushed because GDPR
>> prescribes it!
>> Data lineage is by no means a pure "internal diagnostic" but has a
>> legal background.
>>
>> Thus GDPR defines a recording requirement.
>> It states, among other things, that the following must be recorded in
>> an audit-proof manner:
>> - a description of the categories of personal data
>> - a description of the categories of recipients of personal data,
>>   including recipients in third countries or international
>>   organisations
>> - transfers of personal data to a third country or an international
>>   organisation
>>
>> And if you do all this correctly, then you have to make sure that the
>> data is erasable again (right to be forgotten).
>>
>> By the way, this does not only apply to special data lineage systems
>> but also to all log files, backups etc. At least as long as no other
>> legal regulation prohibits this.
>> Data lineage is therefore not a nice feature for internal diagnostics
>> but a must.
>>
>> So far, too few companies have thought of this. But more and more are
>> recognizing the necessity.
>> This is also the reason why formerly Hortonworks and now Cloudera
>> work hard on Atlas.
>>
>>> On 30.01.2020 at 15:25, Mike Thomsen wrote:
>>>
>>> IANAL, but I would be surprised if NiFi provenance data even legally
>>> falls under the Right to Be Forgotten because it's internal
>>> diagnostic data that is highly ephemeral.
>>>
>>> On Thu, Jan 30, 2020 at 9:07 AM Emanuel Oliveira <[email protected]> wrote:
>>>
>>>> Hi, I don't think an API for atomic records makes sense:
>>>>
>>>> 1. One can configure the retention of data provenance (the default
>>>>    of 24h is "good enough"; GDPR doesn't need millisecond real-time
>>>>    deletion, right?)
>>>>    https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#persistent-provenance-repository-properties
>>>> 2. Even if there were an API to delete FFs with an attribute =
>>>>    <some id>, it would normally be useless as well, since inbound
>>>>    FFs normally hold hundreds or thousands of records that need to
>>>>    be split and aggregated in a complex flow. Implementing a
>>>>    cleanup at such an atomic level would be too hard, and the extra
>>>>    effort is not needed, since your target single record would
>>>>    surely be part of multiple FF UUIDs: some holding only your
>>>>    record, but most surely holding hundreds of other records with
>>>>    your record somewhere in the middle.
>>>>
>>>> In my opinion your answer to business/management gatekeepers is
>>>> that data will be stored in data provenance for 24h (default),
>>>> which can be configured, and that
>>>>
>>>> Best Regards,
>>>> *Emanuel Oliveira*
>>>>
>>>> On Thu, Jan 30, 2020 at 1:54 PM [email protected] <[email protected]> wrote:
>>>>
>>>>> Dear NiFi developer team,
>>>>>
>>>>> NiFi's Data Provenance and Data Lineage is perfectly adequate in
>>>>> the environment of NiFi, so there is often no need to use Atlas.
>>>>>
>>>>> When using NiFi with customer data a problem arises.
>>>>> The problem is the GDPR requirement that a user has the right to
>>>>> be forgotten. Unfortunately, I can't find any API call or
>>>>> information on how to delete individual user data from the NiFi
>>>>> Provenance Repository based on a user-defined attribute and its
>>>>> defined characteristics.
>>>>>
>>>>> A delete request like "delete all data and dependencies where the
>>>>> attribute XYZ has the value 123" is currently not possible to my
>>>>> knowledge.
>>>>>
>>>>> My questions are:
>>>>> Is this actually possible and how?
>>>>> And if not, is it planned?
>>>>>
>>>>> Thanks
>>>>> Uwe
