Hi David,

I think this would be a great improvement for NiFi. I have considered a
similar approach in the past but I didn't have the time to pursue it
further, so I ended up using a reporting task instead. I do think that
making the provenance repository extensible would be the best approach
for anything production-grade.

I would be very interested in seeing this move forward and agree that this
would need to follow the NIP process.

Thanks,
Pierre


On Wed, Mar 11, 2026 at 18:25, David Young <[email protected]> wrote:

> Hello Team!
>
> I've been working with NiFi for a bit now and am seeing a usage pattern
> within my team that I think could be improved. We have thrown around the
> idea of creating an additional provenance repository implementation that
> would allow the storage and retrieval of `ProvenanceEventRecord` objects in an
> external database / service to support more cloud-centric deployments.
>
> Expanding where NiFi can store provenance would allow the instance/cluster
> itself to offload the storage and management of provenance events to an
> external tool, e.g. Elasticsearch / OpenSearch, Solr, etc.
>
> When targeting cloud-based deployments of NiFi, resource constraints are
> an important consideration. Externalizing some database-like features would
> allow more resources to be allocated to data processing tasks. Also, in the
> event that a container or VM needs to be replaced or scaled down, having
> provenance stored in an external service would still allow other nodes in
> the cluster to access those events.
>
> My goal is to refactor some of the existing implementations within the
> nifi-data-provenance-utils module to decouple them from being disk-centric.
> To go along with that, I'd like to create some new interfaces that external
> services could be built against.
>
> In my research and prototyping for this, I've run into several situations
> where, while trying to follow the existing patterns, sub-typing some of the
> existing types doesn't make sense for an external provider.
>
> I don't yet have any complete implementations due to the amount of work I
> think would be involved. So far my research has primarily been with using
> Elasticsearch as a backing store.
>
> I believe this would rise to the level of requiring a NIP and would like to
> see how the larger dev team feels about this.
> Thank you!
>
> --
> -David Y.
>
