Ali,

You certainly can, and at the rates you mention you should be able to keep it for a good while. Just set the properties you need for your system and measure the rate at which provenance storage fills.

Thanks

On Fri, Feb 15, 2019 at 10:29 PM Ali Nazemian <alinazem...@gmail.com> wrote:

> I didn't mean to use NiFi provenance search for an external provenance
> search. I meant to use it for internal provenance search, but keep the
> provenance for a longer time than usual. That is, instead of expecting it
> to keep provenance data for a few days, use it as an event store, since it
> also provides the search capability.
>
> Regards,
> Ali
>
> On Sat, Feb 16, 2019 at 5:29 AM Andrew Grande <apere...@gmail.com> wrote:
>
>> NiFi provenance searches are not a good integration pattern for external
>> systems. I.e., using them to periodically fetch history burdens the cluster
>> (those searches can be heavy) and disrupts normal processing SLAs.
>>
>> Pushing provenance events out to an external system (potentially even
>> filtered down to components of interest) is a much more predictable pattern
>> and provides lots of flexibility on how to interpret the events.
>>
>> Andrew
>>
>> On Thu, Feb 14, 2019, 11:26 PM Ali Nazemian <alinazem...@gmail.com>
>> wrote:
>>
>>> Can I expect the NiFi provenance search to do the job for me?
>>>
>>> On Fri, 15 Feb. 2019, 13:21 Mike Thomsen <mikerthom...@gmail.com> wrote:
>>>
>>>> Ali,
>>>>
>>>> There is a site-to-site publishing task for provenance that you can add
>>>> as a root controller service that would be great here. It'll just take all
>>>> of your provenance data periodically and ship it off to another NiFi server
>>>> or cluster that can process all of the provenance data as blocks of JSON
>>>> data. A common pattern there is to filter down to the events you want and
>>>> publish to ElasticSearch.
>>>>
>>>> On Thu, Feb 14, 2019 at 7:05 PM Ali Nazemian <alinazem...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I am investigating how NiFi provenance can be used as an event
>>>>> store for a long period of time.
>>>>> Our use case is very bursty: sometimes we may not receive any events
>>>>> for a period of time, and sometimes we may get burst traffic. On average,
>>>>> around 1000 eps is the expected throughput at this stage. NiFi has
>>>>> powerful provenance that also lets you index on selected attributes.
>>>>> I am investigating how reliable it is to use the NiFi provenance store
>>>>> for a long period of time with indexing enabled for a few extra
>>>>> attributes. Has anybody used NiFi provenance at this scale? Can a large
>>>>> number of Lucene indices create other issues within NiFi, given that
>>>>> provenance uses Lucene for indexing?
>>>>>
>>>>> P.S.: Our use case is pretty light for NiFi, as we are not going to have
>>>>> any ETL, and NiFi is being used mostly as an orchestrator of multiple
>>>>> microservices.
>>>>>
>>>>> Regards,
>>>>> Ali
>>>>>
>>>>
>
> --
> A.Nazemian
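
A rough sketch of the advice at the top of the thread (measure how fast provenance storage fills, then size retention to match) might look like the following. The function names and all figures are hypothetical and chosen only for illustration; the two sizes would come from checking the provenance repository directory at two points in time under representative load. The property names in the comments are as recalled from the NiFi System Administrator's Guide and should be checked against the version in use.

```python
# Hypothetical helper for estimating provenance retention from a measured
# fill rate. Nothing here calls NiFi; the inputs are sizes you measure
# yourself (for example with `du` on the provenance repository directory).

SECONDS_PER_DAY = 86_400
GIB = 1024 ** 3


def fill_rate_bytes_per_day(size_start, size_end, elapsed_seconds):
    """Extrapolate the growth between two size samples to bytes per day."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (size_end - size_start) * SECONDS_PER_DAY / elapsed_seconds


def retention_days(max_storage_bytes, rate_bytes_per_day):
    """Days of provenance that fit in the storage budget at the given rate."""
    if rate_bytes_per_day <= 0:
        raise ValueError("rate_bytes_per_day must be positive")
    return max_storage_bytes / rate_bytes_per_day


# Illustrative numbers only: the repository grew from 10 GiB to 12 GiB
# over one hour of representative traffic.
rate = fill_rate_bytes_per_day(10 * GIB, 12 * GIB, 3600)

# Assumed storage budget of 960 GiB, i.e. the value one would give to
# nifi.provenance.repository.max.storage.size, with
# nifi.provenance.repository.max.storage.time set at least as long as
# the retention you want.
days = retention_days(960 * GIB, rate)

print(f"fill rate: {rate / GIB:.1f} GiB/day, retention: {days:.1f} days")
```

Because the traffic described in the thread is bursty, a single one-hour sample can be misleading; sampling over several days that include both quiet periods and bursts should give a more trustworthy rate.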