Simon,

I believe "a provenance event is emitted for each flowfile for each
processor" is accurate, with the understanding that "each processor" means
the distinct processors the flowfile actually passes through.

The provenance database is a Lucene database, and 1 million provenance
events per day is not unreasonable.
How well it copes depends on how you configure NiFi; a best practice is
to store your provenance repository on its own disk.

Many tweakable provenance settings are in nifi.properties [1].
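
As a rough sketch of what that can look like, here is a nifi.properties
fragment that moves the provenance repository to a dedicated disk and caps
how much is retained. The mount point /mnt/provenance is a hypothetical path,
and the values are only illustrative, so check the property names and
defaults against the guide for your NiFi version before copying anything.

```properties
# Put the provenance repository on its own disk
# (/mnt/provenance is a hypothetical mount point, adjust to your system)
nifi.provenance.repository.directory.default=/mnt/provenance/provenance_repository

# Retention limits: events are aged off once either limit is reached
nifi.provenance.repository.max.storage.time=24 hours
nifi.provenance.repository.max.storage.size=1 GB

# How often the event journal is rolled over and indexed
nifi.provenance.repository.rollover.time=30 secs
nifi.provenance.repository.rollover.size=100 MB

# Threads used for indexing and for answering provenance queries
nifi.provenance.repository.index.threads=1
nifi.provenance.repository.query.threads=2

# Indexing fewer fields reduces indexing work, at the cost of searchability
nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID
```

Shorter retention and fewer indexed fields should lower the disk and indexing
cost, but they do not change how many events the flow emits.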

[1] https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html

On Wed, Apr 19, 2017 at 6:50 AM <[email protected]> wrote:

> Hi All,
>
> In some parts of the NiFi documentation, it is stated that a provenance
> event is emitted for each flowfile for each processor. However elsewhere
> it is stated that no provenance-event is generated for a flowfile sent
> to the “success” output of a processor - which is true?
>
> And are there mechanisms for reducing the number of provenance events
> generated by a NiFi flow? When a dataflow is processing large numbers of
> events, it would seem to me that the generation of provenance events
> will be the limiting factor for performance. When processing 1 million
> records per day, generating 1 million provenance events (or worse) is
> not helpful..
>
> Thanks in advance,
>
> Simon
>
