Hi Chris,

For this alternate identifier solution, when the data is processed by an external system, is there a way to generate a provenance event for the “processing in Spark” stage?

Chris Sampson <[email protected]> wrote on Mon, 11 Jan 2021 at 3:58 PM:

> Might be worth taking a look at the "alternate.identifier" (see
> https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#associating-disparate-data
> ).
>
> ---
> *Chris Sampson*
> IT Consultant
> [email protected]
> <https://www.naimuri.com/>
>
> On Mon, 11 Jan 2021 at 14:53, Yi Wang <[email protected]> wrote:
>
>> Hi NiFi team and experts,
>>
>> In such an example scenario:
>>
>> Data -> NiFi -> Kafka -> external system (Spark, Flink, etc.) -> Kafka ->
>> NiFi -> S3
>>
>> How do I fill in the 'gap' in data lineage?
>> Once the data leaves NiFi, its provenance life has ended (as far as I
>> know). Even when the same data is sent back to NiFi later, NiFi treats it
>> as different data. So how can I handle this in order to get a complete data
>> lineage graph?
>>
>> Any ideas or suggestions? Thanks in advance.
>>
>> Cheers
