Re: Cross system data lineage

Yi Wang Mon, 11 Jan 2021 11:56:17 -0800

Hi Chris

For this alternate identifier solution, when the data is processed by an
external system, is there a way to generate a provenance event for the
“processing in Spark” stage?


Chris Sampson <[email protected]> wrote on Mon, 11 Jan 2021 at 3:58 PM:

> Might be worth taking a look at the "alternate.identifier" (see
> https://nifi.apache.org/docs/nifi-docs/html/nifi-in-depth.html#associating-disparate-data
> ).
>
>
> ---
> *Chris Sampson*
> IT Consultant
> [email protected]
> <https://www.naimuri.com/>
>
>
> On Mon, 11 Jan 2021 at 14:53, Yi Wang <[email protected]> wrote:
>
>> Hi NiFi team and experts,
>>
>> Consider this example scenario:
>>
>> Data -> NiFi -> Kafka -> external system (Spark, Flink, etc.) -> Kafka ->
>> NiFi -> S3
>>
>> How do I fill in the ‘gap’ in the data lineage?
>> Once the data leaves NiFi, its provenance trail ends (as far as I
>> know). Even when the same data is sent back to NiFi later, NiFi treats it
>> as different data. How can I handle this to get a complete data
>> lineage graph?
>>
>> Any ideas or suggestions? Thanks in advance.
>>
>>
>> Cheers
>>
>>
