Hi David,

I think this would be a great improvement for NiFi. I considered a similar approach in the past, but I didn't have the time to pursue it further, so I ended up using a reporting task instead. I do think that making the provenance repository extensible would be the best approach for anything production-grade.
I would be very interested in seeing this move forward and agree that this would need to follow the NIP process.

Thanks,
Pierre

On Wed, Mar 11, 2026 at 18:25, David Young <[email protected]> wrote:

> Hello Team!
>
> I've been working with NiFi for a bit now and am seeing a usage pattern
> within my team that I think could be improved. We have thrown around the
> idea of creating an additional provenance repository implementation that
> would allow the storage and retrieval of `ProvenanceEventRecord`s in an
> external database / service to support more cloud-centric deployments.
>
> Expanding where NiFi can store provenance would allow the instance/cluster
> itself to offload the storage and management of provenance events to an
> external tool, e.g. Elasticsearch / OpenSearch, Solr, etc.
>
> When targeting cloud-based deployments of NiFi, resource constraints are
> an important consideration. Externalizing some database-like features would
> allow more resources to be allocated to data processing tasks. Also, in the
> event that a container or VM needs to be replaced or scaled down, having
> provenance stored in an external service would still allow other nodes in
> the cluster to access those events.
>
> My goal is to refactor some of the existing implementations within the
> nifi-data-provenance-utils module to decouple them from being disk-centric.
> To go along with that, I'd like to create some new interfaces that external
> services could be built against.
>
> In my research and prototyping for this, I've run into several situations
> where, while trying to follow the existing patterns, sub-typing some of the
> existing things doesn't make sense for an external provider.
>
> I don't yet have any complete implementations due to the amount of work I
> think would be involved. So far my research has primarily been with using
> Elasticsearch as a backing store.
>
> I believe this would rise to the level of requiring a NIP and would like to
> see how the larger dev team feels about this.
> Thank you!
>
> --
> -David Y.
