Bob Paulin created NIFI-13779:
---------------------------------

             Summary: [NiFi 2.x Python] Missing Some Data Provenance Events 
from Python Processors
                 Key: NIFI-13779
                 URL: https://issues.apache.org/jira/browse/NIFI-13779
             Project: Apache NiFi
          Issue Type: Bug
          Components: Core Framework
    Affects Versions: 2.0.0-M4
         Environment: Mac ARM64
            Reporter: Bob Paulin


If I run the following Python processor (FlowFileTransform type), which defines 
two relationships, coco and annotations, from the codebase

[https://github.com/bobpaulin/nifi-ai-talk/tree/main/table-detection-processor]

I get Data Provenance events from both relationships when I terminate the 
relationships.  I do NOT get Data Provenance events from either relationship 
when that relationship is connected to another processor.

See flow
[https://github.com/bobpaulin/nifi-ai-talk/blob/main/flow_defs/TestTable.json]

I believe the issue is due to how we clone the flow file and then use the 
clone as the "transformed" flow file.  There is logic to drop provenance 
events on the cloned flow file.

See 
[https://github.com/apache/nifi/blob/563d7ea6140c9cd847ddae56f3d3a1690abd6972/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/controller/repository/StandardProcessSession.java#L927]

The events are dropped because, although the data is changed, the clone is 
seen as a "new" flow file, since its original will be null.

 

I suggest we send the cloned flow file to the original relationship and use 
the incoming flow file as the transformed flow file in FlowFileTransformProxy:
[https://github.com/apache/nifi/blob/445d34f91e7581c4f4f92540bc6b055118e5966e/nifi-extension-bundles/nifi-py4j-extension-bundle/nifi-py4j-bridge/src/main/java/org/apache/nifi/python/processor/FlowFileTransformProxy.java]

A PR will be incoming.
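
To illustrate the difference between the current and the suggested behavior, here is a toy model in Python. It is not NiFi code: FlowFile, ToySession, and transform are hypothetical names, and the single "drop events when original is None" rule is a simplification of what this report describes, not the actual logic in StandardProcessSession.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FlowFile:
    """Hypothetical stand-in for a flow file tracked by a session."""
    name: str
    # None models a flow file the session treats as newly created (e.g. a clone).
    original: Optional[str]


@dataclass
class ToySession:
    """Hypothetical session that records events and filters them on commit."""
    events: List[Tuple[str, FlowFile]] = field(default_factory=list)

    def modify_content(self, flow_file: FlowFile) -> None:
        self.events.append(("CONTENT_MODIFIED", flow_file))

    def commit(self) -> List[Tuple[str, str]]:
        # Assumed rule from the report: events on flow files with no
        # original are dropped because they look like "new" flow files.
        return [(event_type, ff.name)
                for event_type, ff in self.events
                if ff.original is not None]


def transform(use_clone_as_transformed: bool) -> List[Tuple[str, str]]:
    """Return the events that survive commit for each strategy."""
    session = ToySession()
    incoming = FlowFile("incoming", original="incoming")
    clone = FlowFile("clone", original=None)
    # Current behavior: the clone carries the transformed content.
    # Suggested behavior: the incoming flow file carries it instead.
    transformed = clone if use_clone_as_transformed else incoming
    session.modify_content(transformed)
    return session.commit()


if __name__ == "__main__":
    print("clone as transformed:   ", transform(True))
    print("incoming as transformed:", transform(False))
```

In this model, writing the transformed content to the clone leaves no events after commit, while writing it to the incoming flow file keeps the content-modified event.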



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
