Thanks Joe, Juan,
Perhaps it would be useful to be able to generate provenance events for
a _sample_ of flowfiles? E.g., every Nth flowfile created by a "data
ingress" (GET* or LISTEN*) processor gets tracked? Or maybe better:
every flowfile gets tracked with a probability of N, to ensure that
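
To make the two sampling ideas above concrete, here is a minimal sketch in plain Python. It is not NiFi code and does not use any NiFi API; the class names `EveryNthSampler` and `ProbabilisticSampler` and the method `should_track` are hypothetical, made up purely for illustration. It only shows how the two selection rules differ.

```python
import random


class EveryNthSampler:
    """Deterministic rule: track every Nth flowfile seen (hypothetical helper)."""

    def __init__(self, n):
        if n < 1:
            raise ValueError("n must be >= 1")
        self.n = n
        self.count = 0

    def should_track(self):
        # Count each flowfile and select the Nth, 2Nth, 3Nth, ...
        self.count += 1
        return self.count % self.n == 0


class ProbabilisticSampler:
    """Random rule: track each flowfile independently with a fixed probability
    (hypothetical helper)."""

    def __init__(self, probability, seed=None):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.probability = probability
        # A private generator so a seed gives repeatable runs in tests.
        self.rng = random.Random(seed)

    def should_track(self):
        # random() returns a float in [0.0, 1.0), so probability 0 never
        # selects and probability 1 always selects.
        return self.rng.random() < self.probability


if __name__ == "__main__":
    nth = EveryNthSampler(3)
    print([nth.should_track() for _ in range(9)])

    prob = ProbabilisticSampler(0.25, seed=42)
    tracked = sum(prob.should_track() for _ in range(10_000))
    print(f"tracked {tracked} of 10000 flowfiles")
```

The counter-based rule needs shared state per ingress processor, while the probabilistic rule is stateless per flowfile, which is one reason the second variant may be easier to apply across concurrent tasks.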
You're right that the generation and indexing of provenance data
creates overhead. We've put considerable effort into minimizing that
overhead to a point where you should not have to think about it and
still get all the powerful user experience/auditing gains it provides.
However, when you're
Simon,
I feel that "a provenance event is emitted for each flowfile for each
processor" is accurate, with the understanding that "each processor"
means the unique processors the flowfile goes through.
The provenance database is a Lucene database and 1 million provenance
events is not unreasonable.
It would
Hi All,
In some parts of the NiFi documentation, it is stated that a provenance
event is emitted for each flowfile for each processor. However, elsewhere
it is stated that no provenance event is generated for a flowfile sent
to the “success” output of a processor - which is true?
And are