Hey James,

Are you making sure that every route from HandleHttpRequest goes to a HandleHttpResponse? If not, the StandardHttpContextMap may be filling up with requests, which would probably delay processing.
Thanks,

Bryan

On Wed, Apr 5, 2017 at 2:07 PM, James McMahon <[email protected]> wrote:

> Thank you very much Matt. I have cranked my Concurrent Tasks config parm
> on my ExecuteScripts up to 20, and judging by the empty queue feeding that
> processor it is screaming through the flowfiles arriving at its doorstep.
>
> Can anyone comment on performance optimizations for HandleHttpRequest? In
> your experiences, is HandleHttpRequest a bottleneck? I do notice that I
> often have a count in the processor for "flowfile in process" within the
> processor. Anywhere from 1 to 10 when it does show such a count.
>
> -Jim
>
> On Wed, Apr 5, 2017 at 1:52 PM, Matt Burgess <[email protected]> wrote:
>
>> Jim,
>>
>> One quick thing you can try is to use GenerateFlowFile to send to your
>> ExecuteScript instead of HandleHttpRequest, you can configure it to
>> send whatever body with whatever attributes (such that you would get
>> from HandleHttpRequest) and send files at whatever rate the processor
>> is scheduled. This might take ExecuteScript out of the bottleneck
>> equation; if you are getting plenty of throughput without
>> HandleHttpRequest then that's probably your bottleneck.
>>
>> I'm not sure offhand about optimizations for HandleHttpRequest,
>> perhaps someone else will jump in :)
>>
>> Regards,
>> Matt
>>
>>
>> On Wed, Apr 5, 2017 at 1:48 PM, James McMahon <[email protected]>
>> wrote:
>> > I am receiving POSTs from a Pentaho process, delivering files to my NiFi
>> > 0.7.x workflow HandleHttpRequest processor. That processor hands the
>> > flowfile off to an ExecuteScript processor that runs a python script. This
>> > script is very, very simple: it takes an incoming JSON object and loads it
>> > into a Python dictionary, and verifies the presence of required fields using
>> > simple has_key checks on the dictionary. There are only eight fields in the
>> > incoming JSON object.
>> >
>> > The throughput for these two processes is not exceeding 100-150 files in
>> > five minutes. It seems very slow in light of the minimal processing going on
>> > in these two steps.
>> >
>> > I notice that there are configuration operations seemingly related to
>> > optimizing performance. "Concurrent tasks", for example, is only set by
>> > default to 1 for each processor.
>> >
>> > What performance optimizations at the processor level do users recommend? Is
>> > it advisable to crank up the concurrent tasks for a processor, and is there
>> > an optimal performance point beyond which you should not crank up that
>> > value? Are there trade-offs?
>> >
>> > I am particularly interested in optimizations for HandleHttpRequest and
>> > ExecuteScript processors.
>> >
>> > Thanks in advance for your thoughts.
>> >
>> > cheers,
>> >
>> > Jim
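
For reference, the check Jim describes in his original message (load the JSON body into a dictionary, then confirm the required fields are present) could be sketched roughly as below. This is only an illustration and not Jim's actual script: the eight field names and the `missing_fields` helper are made up, since the thread does not list them, and the NiFi session handling is left out entirely. It uses the `in` operator instead of `has_key`, because `has_key` no longer exists in Python 3.

```python
import json

# Hypothetical field names: the thread only says there are eight required fields.
REQUIRED_FIELDS = (
    "id", "source", "timestamp", "type",
    "owner", "size", "checksum", "payload",
)


def missing_fields(body, required=REQUIRED_FIELDS):
    """Return the names in `required` that are absent from the JSON object in `body`."""
    record = json.loads(body)
    if not isinstance(record, dict):
        # A JSON array or scalar cannot carry named fields.
        raise ValueError("expected a JSON object at the top level")
    return [name for name in required if name not in record]
```

An empty list from `missing_fields` means the flowfile content passed the check. Running a function like this in a loop outside NiFi would also give a rough idea of how much time the validation itself takes, in the spirit of Matt's suggestion to take ExecuteScript out of the bottleneck equation.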
