Thanks. I've done something similar in the past using Elasticsearch as the data store. I think we might start with that and hope that we don't get more nuanced requirements. I guess we could look at naming the processors after the steps and hope that is enough to keep users happy.
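
A rough sketch of the "name the processors after the steps" idea, assuming the provenance events for one flowfile have already been pulled out of the data store as dicts. The field names `eventType`, `componentName`, and `timestampMillis` are assumptions about how the reporting task serializes events, and `flowfile_status` is a hypothetical helper, not anything NiFi ships:

```python
from typing import Iterable, Mapping


def flowfile_status(events: Iterable[Mapping]) -> str:
    """Derive a coarse, user-facing status from one flowfile's provenance events.

    Assumes each event is a dict with 'eventType', 'componentName' and
    'timestampMillis' keys (assumed field names, adjust to the real index).
    """
    ordered = sorted(events, key=lambda e: e["timestampMillis"])
    if not ordered:
        # Nothing has been indexed for this flowfile yet.
        return "Unknown"
    # A DROP event means NiFi is finished with the flowfile.
    if any(e["eventType"] == "DROP" for e in ordered):
        return "Done"
    # Otherwise show the name of the processor that emitted the latest event,
    # which works only if processors are named after user-facing steps.
    return ordered[-1]["componentName"]


# Made-up sample events for illustration only.
sample = [
    {"eventType": "ATTRIBUTES_MODIFIED", "componentName": "Processing", "timestampMillis": 2000},
    {"eventType": "CREATE", "componentName": "Started", "timestampMillis": 1000},
]
print(flowfile_status(sample))
```

This only distinguishes "finished" from "in progress"; telling success from failure would need extra information, such as which processor emitted the DROP.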

On Fri, Mar 27, 2020 at 8:22 AM Marc Parisi <[email protected]> wrote:

> Hey Mike,
>
> I recently did something similar for a personal project. I ingested
> Provenance data into a NoSQL store ( through a reporting task that also
> indexed the data ), primarily querying upon the ProvenanceEventType.
>
> I tracked some piece of information ( in my case the original file name
> with an identifier ) and queried for event types to get an idea of what
> occurred - for example I looked for ROUTE and ATTRIBUTES_MODIFIED to
> determine which path my data took.
>
> It was very easy to monitor the provenance event types for DROP and to
> check if data succeeded or failed. I didn't concern myself with diving into
> why data failed because I was worried that would be a bit more complex and
> require a bit more thought.
>
> I originally had an ingest processor perform this notification but moved
> to a provenance reporting task as it just worked so well ( at least for my
> purposes ).
>
> In my case the dashboard was a simple table that showed what file(s) I
> uploaded and their state, flashing red if data took more than a
> configurable period of time to complete ( fail or success ). The table
> linked to a separate query interface that would allow a deeper dive into
> the provenance records so that I can dive into a problem set further if
> failure or extreme latency occurred. It was super simple...
>
> Hope this helps,
> Marc
>
> On Fri, Mar 27, 2020 at 7:51 AM Mike Thomsen <[email protected]>
> wrote:
>
>> Has anyone ever created good dashboards on top of NiFi flows or
>> provenance data that will report the status of a flowfile back to the user?
>> Our client would like to give users the ability to feed NiFi data and then
>> get a basic view of where it is. It can be fairly simplistic, like
>> "Started..." "Processing..." "Done..." for now, but I was wondering if
>> anyone has any good patterns for this before I dive into it myself.
>>
>> My current thought here is to create a new processor bundle that would
>> add a new processor called "ProgressGateProcessor" that would allow users
>> in one step to signal to an external application or data store the status
>> of a flowfile, so you don't have to mix in process groups.
>>
>> Thanks,
>>
>> Mike
