I haven't looked into a good way to filter them out yet, but I suspect it could be done using the component ids of the components that sit after the Input Port that receives the events.
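
A rough sketch of that component-id idea, in plain Java with stand-in types rather than the real NiFi classes. The names Event, CircularEventFilter and dropCircular are made up for illustration, and it assumes you already know the ids of the Input Port and the components after it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a provenance event; only the fields used here.
final class Event {
    final long id;
    final String componentId;

    Event(long id, String componentId) {
        this.id = id;
        this.componentId = componentId;
    }
}

final class CircularEventFilter {
    // Keeps only the events whose component id is NOT in the excluded set,
    // i.e. drops events emitted by the receiving side of the flow.
    static List<Event> dropCircular(List<Event> events, Set<String> excludedComponentIds) {
        List<Event> kept = new ArrayList<>();
        for (Event e : events) {
            if (!excludedComponentIds.contains(e.componentId)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

In a real reporting task the same check would presumably be applied to each event's component id before the batch is sent, but I have not tried that.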
The reporting task has a configurable batch size which defaults to 1000. So assuming you are only doing a couple of things after receiving the batch, you would probably be producing 3-4 more provenance events per 1000.

On Wed, Jul 6, 2016 at 9:16 AM, Gresock, Joseph <[email protected]> wrote:

> That's awesome, I'll just wait for that site-to-site provenance reporting
> task, then.
>
> Have you guys figured out a good way to identify those circular provenance
> events? I will likely have to use the same cluster for the site-to-site
> endpoint.
>
> Joe Gresock
> Lockheed Martin Software Engineer Stf
> 443-294-2661
> [email protected]
>
> ________________________________________
> From: Bryan Bende [[email protected]]
> Sent: Wednesday, July 06, 2016 9:08 AM
> To: [email protected]
> Subject: EXTERNAL: Re: ReportingTask provenance question
>
> Joe,
>
> You will have to keep track of the last provenance event id that you
> queried in order to query for new events.
>
> In 0.7.0 we added a site-to-site provenance reporting task [1] which may
> take care of what you need, or at least be an example to base your custom
> reporting task from.
>
> The reason we went this route was rather than having a whole bunch of
> custom reporting tasks to send provenance data to different places, we may
> as well make use of NiFi's existing processors.
> So you can have a separate NiFi instance that just receives provenance
> events over site-to-site and uses processors to send them wherever, or even
> site-to-site back to a single instance but this produces a few more
> circular provenance events.
>
> -Bryan
>
> [1]
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-site-to-site-reporting-bundle/nifi-site-to-site-reporting-task/src/main/java/org/apache/nifi/reporting/SiteToSiteProvenanceReportingTask.java#L117
>
> On Wed, Jul 6, 2016 at 9:02 AM, Gresock, Joseph <[email protected]>
> wrote:
>
> > Hi folks,
> >
> > When developing a ReportingTask, I see that I can call
> > reportingContext.getEventAccess().getProvenanceRepository(). Will this
> > repository contain only provenance events created since the last time my
> > ReportingTask's onTrigger() was fired, or does it contain the entire
> > provenance repository to date?
> >
> > I'd like to develop a reporting task for metric purposes, and my hope is
> > that I can simply grab all the latest provenance events each time the
> > reporting task triggers.
> >
> > Thanks,
> >
> > Joe Gresock
> > Lockheed Martin Software Engineer Stf
> > 443-294-2661
> > [email protected]
> >
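
P.S. The earlier point about keeping track of the last provenance event id could be sketched roughly as follows. This is plain Java with made-up stand-ins (EventSource, IncrementalReader, poll), not the actual NiFi interfaces, and events are reduced to bare ids. It also keeps the position only in memory, so a real task would need to persist it somewhere to survive a restart:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a provenance repository that can be queried
// by starting event id; events are represented by their ids only.
interface EventSource {
    // Returns up to maxRecords event ids >= firstId, in ascending order.
    List<Long> getEvents(long firstId, int maxRecords);
}

final class IncrementalReader {
    private final EventSource source;
    private final int batchSize;
    private long nextId = 0; // first event id that has not been read yet

    IncrementalReader(EventSource source, int batchSize) {
        this.source = source;
        this.batchSize = batchSize;
    }

    // Intended to be called once per trigger: returns only events not seen
    // before and advances the remembered position past the last one returned.
    List<Long> poll() {
        List<Long> batch = new ArrayList<>(source.getEvents(nextId, batchSize));
        if (!batch.isEmpty()) {
            nextId = batch.get(batch.size() - 1) + 1;
        }
        return batch;
    }
}
```

Each call to poll() then picks up where the previous one stopped, which is the behaviour the original question was hoping to get from the repository itself.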
