I haven't looked into a good way to filter them out yet, but I suspect it
would involve the component ids of the components that come after the Input
Port that receives the events.
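
A rough sketch of that filtering idea, in plain Java rather than against the
NiFi API. ProvEvent and CircularEventFilter are hypothetical stand-ins (a real
task would presumably read the component id off each provenance event record),
and the set of excluded ids is assumed to be known up front:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a provenance event; only the fields the filter needs.
final class ProvEvent {
    final long id;
    final String componentId;

    ProvEvent(long id, String componentId) {
        this.id = id;
        this.componentId = componentId;
    }
}

// Hypothetical helper, not part of NiFi.
final class CircularEventFilter {
    // Keeps only the events whose component id is NOT in the excluded set,
    // i.e. drops events produced by the components that sit after the Input Port.
    static List<ProvEvent> dropCircular(List<ProvEvent> batch, Set<String> excludedComponentIds) {
        List<ProvEvent> kept = new ArrayList<>();
        for (ProvEvent e : batch) {
            if (!excludedComponentIds.contains(e.componentId)) {
                kept.add(e);
            }
        }
        return kept;
    }
}
```

The excluded set would hold the ids of the Input Port and whatever processors
follow it, so their own events never get reported again.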

The reporting task has a configurable batch size, which defaults to 1000. So
assuming you are only doing a couple of things after receiving the batch,
you would probably be producing 3-4 more provenance events per 1000.
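
For a custom task, batching and remembering the last event id fit together
roughly like this. It is only a sketch: BatchCursor is a hypothetical class,
and the in-memory list of ids (assumed to be in ascending order) stands in for
a query against the provenance repository:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cursor, not part of NiFi: hands out event ids in batches and
// remembers the last id it returned so the next call only sees newer events.
final class BatchCursor {
    private final int batchSize;
    private long lastSeenId = -1L; // nothing reported yet

    BatchCursor(int batchSize) {
        this.batchSize = batchSize;
    }

    // repositoryIds: all event ids currently available, in ascending order.
    List<Long> nextBatch(List<Long> repositoryIds) {
        List<Long> batch = new ArrayList<>();
        for (long id : repositoryIds) {
            if (id > lastSeenId && batch.size() < batchSize) {
                batch.add(id);
            }
        }
        if (!batch.isEmpty()) {
            lastSeenId = batch.get(batch.size() - 1);
        }
        return batch;
    }
}
```

Each trigger of the task would call nextBatch once, so with a batch size of
1000 at most 1000 new events go out per run. As written, the last id lives
only in memory and would be lost on a restart.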

On Wed, Jul 6, 2016 at 9:16 AM, Gresock, Joseph <[email protected]>
wrote:

> That's awesome, I'll just wait for that site-to-site provenance reporting
> task, then.
>
> Have you guys figured out a good way to identify those circular provenance
> events?  I will likely have to use the same cluster for the site-to-site
> endpoint.
>
> Joe Gresock
> Lockheed Martin Software Engineer Stf
> 443-294-2661
> [email protected]
>
> ________________________________________
> From: Bryan Bende [[email protected]]
> Sent: Wednesday, July 06, 2016 9:08 AM
> To: [email protected]
> Subject: EXTERNAL: Re: ReportingTask provenance question
>
> Joe,
>
> You will have to keep track of the last provenance event id that you
> queried in order to query for new events.
>
> In 0.7.0 we added a site-to-site provenance reporting task [1] which may
> take care of what you need, or at least be an example to base your custom
> reporting task on.
>
> We went this route because, rather than having a whole bunch of custom
> reporting tasks to send provenance data to different places, we may as
> well make use of NiFi's existing processors.
> So you can have a separate NiFi instance that just receives provenance
> events over site-to-site and uses processors to send them wherever, or even
> send them site-to-site back to the same instance, although this produces a
> few more circular provenance events.
>
> -Bryan
>
> [1]
>
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-site-to-site-reporting-bundle/nifi-site-to-site-reporting-task/src/main/java/org/apache/nifi/reporting/SiteToSiteProvenanceReportingTask.java#L117
>
> On Wed, Jul 6, 2016 at 9:02 AM, Gresock, Joseph <[email protected]>
> wrote:
>
> > Hi folks,
> >
> > When developing a ReportingTask, I see that I can call
> > reportingContext.getEventAccess().getProvenanceRepository().  Will this
> > repository contain only provenance events created since the last time my
> > ReportingTask's onTrigger() was fired, or does it contain the entire
> > provenance repository to date?
> >
> > I'd like to develop a reporting task for metric purposes, and my hope is
> > that I can simply grab all the latest provenance events each time the
> > reporting task triggers.
> >
> > Thanks,
> >
> > Joe Gresock
> > Lockheed Martin Software Engineer Stf
> > 443-294-2661
> > [email protected]
> >
>
