[ 
https://issues.apache.org/jira/browse/NIFI-13779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bob Paulin updated NIFI-13779:
------------------------------
    Attachment: NIFI-13779-DataProvenance.png

> [NiFi 2.x Python] Missing Some Data Provenance Events from Python Processors
> ----------------------------------------------------------------------------
>
>                 Key: NIFI-13779
>                 URL: https://issues.apache.org/jira/browse/NIFI-13779
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 2.0.0-M4
>         Environment: Mac ARM64
>            Reporter: Bob Paulin
>            Priority: Major
>         Attachments: NIFI-13779-DataProvenance.png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If I run the following Python processor (FlowFileTransform type), which 
> defines two relationships, coco and annotations, from the codebase
> [https://github.com/bobpaulin/nifi-ai-talk/tree/main/table-detection-processor]
> I get Data Provenance events from both relationships when I terminate the 
> relationships. I do NOT get Data Provenance events from either relationship 
> when that relationship is routed on to another processor.
> See the flow:
> [https://github.com/bobpaulin/nifi-ai-talk/blob/main/flow_defs/TestTable.json]
> I believe the issue is due to how we clone the flow file and then use the 
> clone as the "transformed" flow file. There is logic that drops 
> provenance events for the cloned flow file.
> See:
> [https://github.com/apache/nifi/blob/563d7ea6140c9cd847ddae56f3d3a1690abd6972/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/controller/repository/StandardProcessSession.java#L927]
> This happens because the data is changed, but the clone is seen as a "new" 
> flow file since its original will be null.
>  
> I suggest we send the cloned flow file to the original relationship and 
> use the incoming flow file in 
> [https://github.com/apache/nifi/blob/445d34f91e7581c4f4f92540bc6b055118e5966e/nifi-extension-bundles/nifi-py4j-extension-bundle/nifi-py4j-bridge/src/main/java/org/apache/nifi/python/processor/FlowFileTransformProxy.java]
> as the transformed flow file. A PR will be incoming.
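
The reasoning in the description can be illustrated with a small toy model. This is only a sketch under stated assumptions, not NiFi code: `RepoRecord` and `content_modified_events` are hypothetical names, and a `None` original stands in for "original will be null" on a flow file created in the current session. It assumes, as the report suggests, that a content-change event is only registered for flow files that are not new.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RepoRecord:
    """Hypothetical stand-in for a session's per-flow-file record."""
    flowfile_id: str
    original_content: Optional[str]  # None models a flow file created in this session
    current_content: str


def content_modified_events(records: List[RepoRecord]) -> List[str]:
    """Return the ids that would get a content-modified event in this toy model."""
    events = []
    for rec in records:
        is_new = rec.original_content is None
        content_changed = rec.original_content != rec.current_content
        # Assumed rule from the report: changed content on a "new" flow file
        # does not produce a content-modified event.
        if content_changed and not is_new:
            events.append(rec.flowfile_id)
    return events


# Behavior as described: the clone carries the transformed content.
clone_as_transformed = [
    RepoRecord("incoming", "raw", "raw"),
    RepoRecord("clone", None, "transformed"),
]

# Suggested change: the incoming flow file carries the transformed content,
# and the untouched clone goes to the original relationship.
incoming_as_transformed = [
    RepoRecord("incoming", "raw", "transformed"),
    RepoRecord("clone", None, "raw"),
]

print(content_modified_events(clone_as_transformed))     # []
print(content_modified_events(incoming_as_transformed))  # ['incoming']
```

In this simplified model, the event appears only once the transformed content is attached to the incoming flow file, which is the effect the suggested change aims for.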



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
