Hi,
While working with the Python Processors I ran into some questions about
how NiFi handles CLONE events. Currently, when a CLONE provenance event
is displayed for a processor, the attribute and content information
attached to the event represents the parent FlowFile (the one that was
cloned). This includes modifications applied to the parent after the
clone was made. My concern is that this doesn't give the end user a
clear idea of what was actually cloned from the original FlowFile. I see
a couple of ways this could be improved:
1) Treat the CLONE events like SEND events[1]. This would prevent the
FlowFile attributes from being updated prior to being written to the
Provenance repo.
OR
2) Use the FlowFile's clone (the child) to populate the clone event[2].
The cloned file has the proper representation of the data at the time of
cloning and will not contain any updates made to the original FlowFile
when being displayed.
The first suggestion is the more subtle change. It freezes the attribute
data, but the Output Content Claim displayed would still reflect changes
made to the parent FlowFile after the clone.
The second suggestion is a more significant, potentially breaking change
to how Provenance stores CLONE events. However, that more fundamental
change means the Provenance Event would show the correct attributes and
content as of the time of cloning.
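To make the difference concrete, here is a toy Python model of the
attribute side of the problem. It is only a sketch: ToyFlowFile, clone
and attributes_shown are made-up names for illustration, not NiFi
classes or APIs, and it assumes the event's attributes are resolved from
the subject FlowFile at commit time. Content claims are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFlowFile:
    """Hypothetical stand-in for a FlowFile; not a NiFi class."""
    uuid: str
    attributes: dict = field(default_factory=dict)


def clone(parent: ToyFlowFile, child_uuid: str) -> ToyFlowFile:
    # The child gets its own copy of the attributes as they were at clone time.
    return ToyFlowFile(child_uuid, dict(parent.attributes))


def attributes_shown(subject: ToyFlowFile) -> dict:
    # Models an event whose attributes are resolved late (at commit time),
    # so it reflects whatever state the subject FlowFile has at that point.
    return dict(subject.attributes)


parent = ToyFlowFile("parent", {"filename": "a.txt"})
child = clone(parent, "child")
frozen = dict(parent.attributes)         # suggestion 1: freeze at event creation
parent.attributes["filename"] = "b.txt"  # parent is modified after the clone

current_behaviour = attributes_shown(parent)  # later edit leaks into the event
option_1 = frozen                             # attributes as of the clone
option_2 = attributes_shown(child)            # suggestion 2: populate from child
```

In this model both suggestions report the attributes as they were when
the clone happened, while the current behaviour reports the parent's
later state.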
My preference is the second suggestion, but I understand there may be
reasons the design displays the parent FlowFile's information. My
interest is in improving the experience documented in [3]. I'm open to
suggestions if there are other ways of doing so.
Sincerely,
Bob Paulin
[1] https://github.com/apache/nifi/blob/aacbd514ce4af7e41f54fc2418394c563395c9bd/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/controller/repository/StandardProcessSession.java#L1010
[2] https://github.com/apache/nifi/blob/aacbd514ce4af7e41f54fc2418394c563395c9bd/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/controller/repository/StandardProvenanceReporter.java#L457
[3] https://issues.apache.org/jira/browse/NIFI-13808