Of course, a custom processor can create any attribute, including an
"external id field." I don't think it can "lose" the original uuid
since, if it attempts to reset it, the action will be quietly ignored
(Mark).
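A minimal sketch of the "external id" idea, in plain Java with the
attributes modeled as a Map rather than through the NiFi API. The
attribute name external.id and both helper methods are hypothetical,
made up for illustration. The only behavior taken from this thread is
that a new flowfile gets a fresh uuid and that an attempt to overwrite
uuid is quietly ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Plain-Java model only; it does not call the NiFi API. */
final class ExternalIdSketch {
    static final String UUID_KEY = "uuid";               // attribute name discussed in this thread
    static final String EXTERNAL_ID_KEY = "external.id"; // hypothetical name, not a NiFi attribute

    /** Models the behavior described above: a write to "uuid" is silently dropped. */
    static Map<String, String> putAttribute(Map<String, String> attrs, String key, String value) {
        Map<String, String> copy = new HashMap<>(attrs);
        if (!UUID_KEY.equals(key)) {
            copy.put(key, value);
        }
        return copy;
    }

    /** Models a child flowfile: inherits the parent's attributes, gets a new uuid,
     *  and records the parent's uuid under the hypothetical external id attribute. */
    static Map<String, String> createChild(Map<String, String> parent) {
        Map<String, String> child = new HashMap<>(parent);
        child.put(UUID_KEY, UUID.randomUUID().toString());
        child.put(EXTERNAL_ID_KEY, parent.get(UUID_KEY));
        return child;
    }
}
```

In this model the original identifier travels with the child under a
separate name, so nothing depends on being able to set uuid itself.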
Note that uuid figures prominently in the display of provenance, which,
to my mind, is the crux of my question. [1]
My question was about the "sanctified" state (or not) of uuid, and Matt
and Mark gave succinct and useful answers that I will explore. I was
unaware of the suggested "best practice" of considering dropping any and
all previously established attributes before sending flowfiles on. I
have long done this explicitly for attributes I create, but will now
contemplate doing it for attributes I did not create and have therefore
respected "religiously."
Russ
[1] https://www.tutorialspoint.com/apache_nifi/apache_nifi_data_provenance.htm
On 7/18/23 14:07, Edward Armes wrote:
Hmm,
I've seen this come up a few times now. I wonder whether there is a need
to rename the uuid field and create an external id field?
Edward
On Tue, 18 Jul 2023, 20:53 Lucas Ottersbach <lucas.ottersb...@gmail.com>
wrote:
Hey Matt,
you wrote that both `Session.create` and `Session.clone` set a new FlowFile
UUID on the resulting FlowFile. That sounds as if there might be an
alternative way in which the UUID is not controlled by the framework itself?
I've got a different use case than Russell, but was wondering whether it is
even possible to control the FlowFile UUID as a Processor developer? I've
got a processor pair for inter-cluster transfer of FlowFiles (where
Site-to-Site is not applicable). As of now, the UUID on the receiving side
differs from the original on the origin cluster, because I'm using
`Session.create`.
Is there a way to control the UUID of new FlowFiles?
Best regards,
Lucas
Matt Burgess <mattyb...@apache.org> wrote on Tue, 18 Jul 2023, 20:23:
In general I recommend only sending on those attributes that will be
used at some point downstream (unless you have an "original"
relationship that should maintain the original state with respect to
provenance). If you don't know that ahead of time you'll probably need
to send all/most of the attributes just in case.
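One way to picture the "only send on what is used downstream" advice
is an allow-list filter. The sketch below is plain Java over a Map and
does not call the NiFi API; the class name, method name, and sample
attribute names are hypothetical. In a real processor the equivalent
step would go through the ProcessSession attribute methods instead.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Plain-Java model only; attributes are a Map, not a NiFi FlowFile. */
final class AttributePruningSketch {

    /** Returns a new map holding only the attributes named in the allow-list. */
    static Map<String, String> keepOnly(Map<String, String> attrs, Set<String> keep) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            if (keep.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

An allow-list only works when the downstream needs are known ahead of
time; otherwise the fallback stated above applies and most attributes
have to be kept.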
Are you using session.create() or session.clone()? They both set a new
"uuid" attribute on the created FlowFile, with at least the latter
setting some other attributes as well (see the Developer Guide [1] for
more details).
Regards,
Matt
[1] https://nifi.apache.org/docs/nifi-docs/html/developer-guide.html
On Tue, Jul 18, 2023 at 12:25 PM Russell Bateman <r...@windofkeltia.com>
wrote:
I have a custom processor, /SplitHl7v4Resources/, that splits out
individual FHIR resources (Patients, Observations, Encounters, etc.)
from great Bundle flowfiles. So, for a given flowfile, it's split into
hundreds of smaller ones.
When I do this, I leave the existing NiFi attributes as they were on
the original flowfile.
As I contemplate the uuid attribute, it occurs to me that I should find
out what its *significance is for provenance and other potential
debugging/tracing concerns*. I never really look at it, but, if there
were some kind of melt-down in a production environment, would I care
that it was multiplied across hundreds of flowfiles besides the
original one?
Also these two other NiFi attributes remain unchanged:
filename
path
I do garnish each flowfile with many pointed/significant new attributes
like resource.type that are my own. In my processing, I don't care
about NiFi's original attributes, but should I?
Thanks,
Russ