Daniel,

Thanks for the detailed explanation. However, I have already built a Python client, which we use internally to automate a few things as well. Coming back to getting the list of all processors: I use “/flow/process-groups/root/status?recursive=true” to get all components in one call and then parse the ProcessGroupStatusEntity recursively to get all processors and process groups from an instance. This approach, however, does not give me the ControllerServices that a processor references.
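A minimal sketch of the recursive parse described above, for readers following along. It assumes the status response nests processors under "processorStatusSnapshots" and child groups under "processGroupStatusSnapshots" inside "processGroupStatus" / "aggregateSnapshot"; those key names, the helper names, and the sample data are assumptions for illustration and should be checked against the actual JSON from your NiFi version.

```python
from typing import Any, Dict, Iterator, List


def iter_processor_snapshots(pg_snapshot: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield processor status snapshots from a group and all nested groups.

    Key names are assumed from the ProcessGroupStatusEntity JSON layout.
    """
    # Processors that sit directly in this process group.
    for entity in pg_snapshot.get("processorStatusSnapshots") or []:
        yield entity["processorStatusSnapshot"]
    # Descend into each child process group.
    for child in pg_snapshot.get("processGroupStatusSnapshots") or []:
        yield from iter_processor_snapshots(child["processGroupStatusSnapshot"])


def flatten_status(status_entity: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a flat list of processor snapshots from one status response."""
    root = status_entity["processGroupStatus"]["aggregateSnapshot"]
    return list(iter_processor_snapshots(root))


# Hand-built sample in the assumed shape (not real NiFi output).
sample_status = {
    "processGroupStatus": {
        "aggregateSnapshot": {
            "id": "root-id",
            "name": "NiFi Flow",
            "processorStatusSnapshots": [
                {"id": "p-1", "processorStatusSnapshot": {
                    "id": "p-1", "name": "GenerateFlowFile", "runStatus": "Running"}},
            ],
            "processGroupStatusSnapshots": [
                {"id": "pg-1", "processGroupStatusSnapshot": {
                    "id": "pg-1",
                    "name": "Child",
                    "processorStatusSnapshots": [
                        {"id": "p-2", "processorStatusSnapshot": {
                            "id": "p-2", "name": "PutSQL", "runStatus": "Stopped"}},
                    ],
                    "processGroupStatusSnapshots": [],
                }},
            ],
        }
    }
}

flat = flatten_status(sample_status)
names = [snapshot["name"] for snapshot in flat]
```

As noted in the message, the status snapshots carry runtime statistics only, so the controller service references still have to come from the processor configuration.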
One approach from here is to make one REST call per processor to get those details (which is expensive time-wise), or to filter the processor list down to the specific processor types you’re interested in and then make a REST call for each one in the filtered list. Either way, I was trying to see if this can be done with very few REST calls (possibly one), as I will run this every five minutes and collect stats about those processors.

-Karthik

From: Daniel Chaffelson [mailto:[email protected]]
Sent: Wednesday, August 15, 2018 12:40 AM
To: [email protected]
Subject: Re: [EXT] Re: Get all Processors

Hi Karthik,

I have already implemented this in NiPyApi, assuming a Python automation client is useful to you.

The nipyapi.canvas.recurse_flow command ( https://github.com/Chaffelson/nipyapi/blob/28d7f74478e5e71253ce2de53fd22f56f455c338/nipyapi/canvas.py#L36 ) provides the base functionality to step through the tree of ProcessGroups and retrieve the attributes from each. This is the one-call-per-PG method described above.

This is then leveraged in the list_all_process_groups and list_all_processors commands, which respectively produce a flat list of all process groups or processors anywhere on the canvas ( https://github.com/Chaffelson/nipyapi/blob/28d7f74478e5e71253ce2de53fd22f56f455c338/nipyapi/canvas.py#L144 ).

These are particularly useful for running Python list comprehensions against NiFi entities, such as single-line script commands to find all instances of a certain processor, or to purge the entire canvas in a test environment.

If anyone can think of a different way to implement this that would be useful, particularly in these larger deployments, I'm happy to take a run at it.
Thanks,
Dan

On Tue, Aug 14, 2018 at 4:38 PM Karthik Kothareddy (karthikk) [CONT - Type 2] <[email protected]> wrote:

Bryan,

Thanks, I was thinking the same: maybe get all the root-level processors and filter them instead of going through each process group (one call to each PG will be expensive).

-Karthik

-----Original Message-----
From: Bryan Bende [mailto:[email protected]]
Sent: Tuesday, August 14, 2018 9:17 AM
To: [email protected]
Subject: Re: [EXT] Re: Get all Processors

You could probably achieve the same thing by traversing the process groups and asking each one for its processors without the includeDescendantGroups=true. It would be more complex on the client side, but would avoid making one huge request.

On Tue, Aug 14, 2018 at 11:07 AM, Karthik Kothareddy (karthikk) [CONT - Type 2] <[email protected]> wrote:
> Hi Pierre,
>
> I tried this on both a standalone instance (1.7.1) and a clustered
> instance (1.6.0) where nifi.cluster.node.read.timeout is set to 60 secs.
>
> -Karthik
>
> From: Pierre Villard [mailto:[email protected]]
> Sent: Tuesday, August 14, 2018 2:20 AM
> To: [email protected]
> Subject: Re: [EXT] Re: Get all Processors
>
> Hi Karthik,
>
> Are you running a cluster or standalone NiFi? What's your default read
> timeout value in nifi.properties? I believe the default value is a bit
> low when you start reaching thousands of processors.
>
> nifi.cluster.node.read.timeout=5 sec
>
> Pierre
>
> 2018-08-14 0:12 GMT+02:00 Karthik Kothareddy (karthikk) [CONT - Type 2] <[email protected]>:
>
> Joe,
>
> I tried this call on both 1.7.1 and 1.6.0 and am still getting the
> timeout exception. I know that this is a very expensive call and
> requires a lot of caching on the server side. I was looking for a way to
> get all processors and the controller services they refer to (if any).
> Not sure how to get the information I need in one call.
>
> -Karthik
>
> -----Original Message-----
> From: Joe Witt [mailto:[email protected]]
> Sent: Monday, August 13, 2018 2:07 PM
> To: [email protected]
> Subject: [EXT] Re: Get all Processors
>
> Karthik
>
> I believe that call is/was very expensive on the server side. You
> might want to experiment with the latest release of NiFi against the
> same flow configuration. From conversations I have had I feel like
> this is an addressed issue, though admittedly I'm not sure which JIRA
> would address it if that is the case.
>
> Others might have better data offhand.
>
> Thanks
>
> On Mon, Aug 13, 2018 at 3:08 PM Karthik Kothareddy (karthikk) [CONT - Type 2] <[email protected]> wrote:
>>
>> All,
>>
>> I was trying to get all processors from the “root” process group with the
>> following REST call -
>> /nifi-api/process-groups/root/processors?includeDescendantGroups=true
>> and this call keeps timing out with the below exception
>>
>> javax.ws.rs.ProcessingException:
>> java.net.SocketTimeoutException:
>> Read timed out
>>
>> We have around 2000 processors on that instance, and if I change the
>> process group from root to a lower-level group with fewer
>> processors, the call will return the ProcessorsEntity JSON. Any idea
>> why this is timing out, whereas bulkier REST calls such as
>> /flow/process-groups/root/status?recursive=true return results
>> immediately?
>>
>> -Karthik
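Bryan's suggestion in the thread (walk the process groups and ask each one for its own processors, without includeDescendantGroups=true) might look roughly like the sketch below. It is not taken from NiPyApi or any poster's client. The `get_json` callable is a hypothetical stand-in for whatever HTTP helper the client already has, and the response keys ("processors", "processGroups", "component", "config", "properties", "descriptors", "identifiesControllerService") are assumed from the NiFi entity layout and should be verified against a real response.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: takes a path relative to /nifi-api, returns parsed JSON.
GetJson = Callable[[str], Dict[str, Any]]


def walk_processors(get_json: GetJson, root_id: str = "root") -> List[Dict[str, Any]]:
    """Collect processor entities by visiting one process group at a time."""
    pending = [root_id]  # explicit stack, so deep flows do not hit recursion limits
    found: List[Dict[str, Any]] = []
    while pending:
        pg_id = pending.pop()
        # Processors directly inside this group only (no descendants).
        listing = get_json(f"/process-groups/{pg_id}/processors")
        found.extend(listing.get("processors") or [])
        # Queue the child groups for a later visit.
        children = get_json(f"/process-groups/{pg_id}/process-groups")
        pending.extend(child["id"] for child in children.get("processGroups") or [])
    return found


def controller_service_refs(processor_entity: Dict[str, Any]) -> Dict[str, str]:
    """Map property name -> controller service id for one processor entity."""
    config = (processor_entity.get("component") or {}).get("config") or {}
    properties = config.get("properties") or {}
    descriptors = config.get("descriptors") or {}
    refs: Dict[str, str] = {}
    for name, value in properties.items():
        descriptor = descriptors.get(name) or {}
        # Assumed: this descriptor field is set only for service-typed properties.
        if value and descriptor.get("identifiesControllerService"):
            refs[name] = value
    return refs


# Canned responses standing in for a live NiFi instance (illustrative data only).
put_sql = {"id": "p-2", "component": {"name": "PutSQL", "config": {
    "properties": {"JDBC Connection Pool": "cs-1", "Batch Size": "100"},
    "descriptors": {
        "JDBC Connection Pool": {"identifiesControllerService": "org.apache.nifi.dbcp.DBCPService"},
        "Batch Size": {},
    }}}}
generate = {"id": "p-1", "component": {"name": "GenerateFlowFile", "config": {
    "properties": {"File Size": "0B"}, "descriptors": {"File Size": {}}}}}
canned = {
    "/process-groups/root/processors": {"processors": [generate]},
    "/process-groups/root/process-groups": {"processGroups": [{"id": "pg-1"}]},
    "/process-groups/pg-1/processors": {"processors": [put_sql]},
    "/process-groups/pg-1/process-groups": {"processGroups": []},
}

all_processors = walk_processors(canned.__getitem__)
refs_by_processor = {p["id"]: controller_service_refs(p) for p in all_processors}
```

This makes two small requests per process group instead of one very large request, which is the trade-off discussed above: more round trips and more client-side logic, but no single call that has to serialize the whole canvas. Whether it fits in a five-minute polling window depends on how many process groups the instance has.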
