Hi Mark,
I get what you're saying: option 2 changes the semantics, so it makes
sense to take that one off the table. I looked at option 1 and took it
a bit further, which I believe brings the events closer to my stated
goal of ensuring both attributes and content are kept as they were at
the time of clone/send etc. I've made changes in a branch [1] that do
the following:
1) Adds clone to the events that should have attribute updates
suppressed (Option 1)
2) I've broadened the updateAttribute to also control updates to the
original and current content with respect to the ProvenanceEventRecord.
This means that when attributes are not updated, neither are the
original or current content.
3) I've added an enrich event to the clone event. It appears that the
send and upload events already have an enrich in the ProvenanceReporter
so they should have the content recorded at the time the event is
registered.
I believe this should keep the same semantics while freezing the content
at the time of the event for send, upload and clone. Let me know if
this approach makes sense. I was able to test this locally and it does
keep the cloned content as it was at the time of cloning even when the
cloned content is modified as it is in the Python Processors. It also
continues to display correctly for sent and uploaded content (Tried with
PutFile).
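
To illustrate the idea, here is a minimal sketch in plain Java. It is a
simplified model and not the code in the branch: FrozenEventSketch,
FlowFileState, EventRecord, register, commit and enrich are all
hypothetical names, and a plain String stands in for a content claim.
It only shows the intended behavior, where clone, send and upload
events capture attributes and content when they are registered and are
left alone at commit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; none of these types are NiFi classes.
class FrozenEventSketch {
    enum EventType { CLONE, SEND, UPLOAD, CONTENT_MODIFIED }

    // Event types whose attributes and content are frozen at registration.
    static final Set<EventType> FROZEN =
            Set.of(EventType.CLONE, EventType.SEND, EventType.UPLOAD);

    // Mutable stand-in for a FlowFile held by a session.
    static final class FlowFileState {
        final Map<String, String> attributes = new HashMap<>();
        String contentClaim;
    }

    // Stand-in for a provenance event record.
    static final class EventRecord {
        final EventType type;
        Map<String, String> attributes;
        String contentClaim;

        EventRecord(EventType type) {
            this.type = type;
        }
    }

    // Registration: frozen event types are enriched immediately.
    static EventRecord register(EventType type, FlowFileState flowFile) {
        EventRecord event = new EventRecord(type);
        if (FROZEN.contains(type)) {
            enrich(event, flowFile);
        }
        return event;
    }

    // Commit: only non-frozen events pick up the latest FlowFile state.
    static void commit(EventRecord event, FlowFileState flowFile) {
        if (!FROZEN.contains(event.type)) {
            enrich(event, flowFile);
        }
    }

    // Copies the attributes and records the content claim as they are now.
    static void enrich(EventRecord event, FlowFileState flowFile) {
        event.attributes = Map.copyOf(flowFile.attributes);
        event.contentClaim = flowFile.contentClaim;
    }
}
```

In this model, a CLONE event registered before the FlowFile is modified
keeps the earlier attributes and content after commit, while a
CONTENT_MODIFIED event registered at the same point shows the later
state.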
Sincerely,
Bob Paulin
[1] https://github.com/bobpaulin/nifi/tree/NIFI-13808
On 10/29/2024 10:19 AM, Mark Payne wrote:
Hey Bob,
I think that Option 1 does make sense. We should show the attributes, etc. as
they are when the FlowFile is cloned. I do not think the semantics would be
accurate for Option #2. If we were to use the child/clone to populate the clone
event, that would imply that the child was cloned. It is the parent that is
being cloned, so the clone event should reflect the parent.
Thanks
-Mark
On Oct 29, 2024, at 8:40 AM, Bob Paulin <b...@bobpaulin.com> wrote:
Hi,
In working with the Python Processors I've got some questions about how NiFi
handles Clone events. Currently, when a clone event is displayed among a
processor's provenance events, the Attribute and Content information attached
to the event represents the originally cloned FlowFile. This includes data
modifications that are applied after the FlowFile is cloned. My concern is
that this doesn't give the end user a clear idea of what was actually cloned
from the original FlowFile. I see a couple of ways this could be improved:
1) Treat the CLONE events like SEND events[1]. This would prevent the FlowFile
attributes from being updated prior to being written to the Provenance repo.
OR
2) Use the FlowFile's clone (the child) to populate the clone event[2]. The
cloned file has the proper representation of the data at the time of cloning
and will not contain any updates made to the original FlowFile when being
displayed.
The first suggestion is more subtle. It freezes the attribute data, but the
Output Content Claim displayed would still reflect changes to the parent file
following the clone.
The second suggestion is a more significant/breaking change to how Provenance
stores clone events. However, the more fundamental change means that the
Provenance Event will show the correct attributes and content from the time of
cloning.
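
A minimal sketch in plain Java of what each suggestion would display
for a clone event. This is a simplified model under stated assumptions,
not NiFi code: CloneOptionsSketch, FlowFile, CloneView, optionOne and
optionTwo are hypothetical names, a String stands in for the content,
and the child is assumed to be left unmodified after cloning.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; these are not NiFi classes.
class CloneOptionsSketch {
    static final class FlowFile {
        final Map<String, String> attributes = new HashMap<>();
        String content;

        // Creates a child carrying the parent's current attributes and content.
        FlowFile copy() {
            FlowFile child = new FlowFile();
            child.attributes.putAll(attributes);
            child.content = content;
            return child;
        }
    }

    // What a CLONE event would display under a given suggestion.
    static final class CloneView {
        final Map<String, String> attributes;
        final String content;

        CloneView(Map<String, String> attributes, String content) {
            this.attributes = attributes;
            this.content = content;
        }
    }

    // Suggestion 1: attributes captured at clone time, content read from
    // the parent as it is at commit.
    static CloneView optionOne(Map<String, String> attributesAtClone,
                               FlowFile parentAtCommit) {
        return new CloneView(attributesAtClone, parentAtCommit.content);
    }

    // Suggestion 2: attributes and content both read from the child.
    static CloneView optionTwo(FlowFile child) {
        return new CloneView(Map.copyOf(child.attributes), child.content);
    }
}
```

If the parent is modified after copy() is called, optionOne shows the
attributes from clone time together with the modified content, while
optionTwo shows both values as they were at the time of cloning.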
My preference is the second suggestion but I also understand if there may be
reasons for the design displaying the parent FlowFile's information. My
interest is improving the experience documented in [3]. Open to suggestions if
there are other paths to doing so.
Sincerely,
Bob Paulin
[1]
https://github.com/apache/nifi/blob/aacbd514ce4af7e41f54fc2418394c563395c9bd/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/controller/repository/StandardProcessSession.java#L1010
[2]
https://github.com/apache/nifi/blob/aacbd514ce4af7e41f54fc2418394c563395c9bd/nifi-framework-bundle/nifi-framework/nifi-framework-components/src/main/java/org/apache/nifi/controller/repository/StandardProvenanceReporter.java#L457
[3] https://issues.apache.org/jira/browse/NIFI-13808