It looks like HandleHttpRequest should send back a 503 if its containerQueue fills up (the default capacity is 50 requests that have been accepted but not yet processed by an onTrigger()) [1]. Also, the default thread pool the Jetty server uses should be able to create up to 200 threads to accept connections, and the handler uses an async context, so the in-flight flow files shouldn't be holding up new requests.
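
To make the bounded-queue behavior concrete, here is a minimal Python sketch. It is not NiFi's code: BoundedAcceptor, accept() and on_trigger() are hypothetical names made up for illustration, and only the capacity of 50 and the 503 status come from the description above.

```python
import queue

SERVICE_UNAVAILABLE = 503


class BoundedAcceptor:
    """Hypothetical stand-in for the handler: holds accepted requests in a bounded queue."""

    def __init__(self, capacity=50):
        # Requests that have been accepted but not yet picked up for processing.
        self.pending = queue.Queue(maxsize=capacity)

    def accept(self, request):
        """Queue the request and return None, or return 503 if the queue is full."""
        try:
            self.pending.put_nowait(request)
        except queue.Full:
            return SERVICE_UNAVAILABLE
        return None

    def on_trigger(self):
        """Stand-in for one onTrigger() call: take one pending request, if any."""
        try:
            return self.pending.get_nowait()
        except queue.Empty:
            return None


# A burst of 60 requests arrives before any on_trigger() call drains the queue.
acceptor = BoundedAcceptor(capacity=50)
statuses = [acceptor.accept(i) for i in range(60)]
rejected = statuses.count(SERVICE_UNAVAILABLE)
print(rejected)
```

In this sketch a sender that waits for each response before posting again would never have more than one request pending, so it would never be rejected.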
If you're not seeing 503s, the cause might be on the sender side of the equation. Is the sender doing posts concurrently, or waiting on each to complete before sending another?

[1] https://github.com/apache/nifi/blob/rel/nifi-0.7.0/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/HandleHttpRequest.java#L395

On Wed, Apr 5, 2017 at 2:27 PM, Joe Witt <[email protected]> wrote:
> Much of this goodness can be found in the help->Users Guide.
> Adjusting run duration/scheduling factors:
> https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab
>
> These are the latest docs but I'm sure there is coverage in the older
> stuff.
>
> Thanks
>
> On Wed, Apr 5, 2017 at 2:23 PM, James McMahon <[email protected]>
> wrote:
> > Yes sir! Sure am. And I know, because I have committed that very silly
> > mistake before. We are indeed seeing # responses = # requests -Jim
> >
> > On Wed, Apr 5, 2017 at 2:13 PM, Bryan Rosander <[email protected]>
> > wrote:
> >>
> >> Hey James,
> >>
> >> Are you making sure that every route from HandleHttpRequest goes to a
> >> HandleHttpResponse? If not, the StandardHttpContextMap may be filling
> >> up with requests which would probably delay processing.
> >>
> >> Thanks,
> >> Bryan
> >>
> >> On Wed, Apr 5, 2017 at 2:07 PM, James McMahon <[email protected]>
> >> wrote:
> >>>
> >>> Thank you very much Matt. I have cranked my Concurrent Tasks config
> >>> parm on my ExecuteScripts up to 20, and judging by the empty queue
> >>> feeding that processor it is screaming through the flowfiles arriving
> >>> at its doorstep.
> >>>
> >>> Can anyone comment on performance optimizations for
> >>> HandleHttpRequest? In your experiences, is HandleHttpRequest a
> >>> bottleneck? I do notice that I often have a count in the processor
> >>> for "flowfile in process" within the processor. Anywhere from 1 to 10
> >>> when it does show such a count.
> >>>
> >>> -Jim
> >>>
> >>> On Wed, Apr 5, 2017 at 1:52 PM, Matt Burgess <[email protected]>
> >>> wrote:
> >>>>
> >>>> Jim,
> >>>>
> >>>> One quick thing you can try is to use GenerateFlowFile to send to your
> >>>> ExecuteScript instead of HandleHttpRequest, you can configure it to
> >>>> send whatever body with whatever attributes (such that you would get
> >>>> from HandleHttpRequest) and send files at whatever rate the processor
> >>>> is scheduled. This might take ExecuteScript out of the bottleneck
> >>>> equation; if you are getting plenty of throughput without
> >>>> HandleHttpRequest then that's probably your bottleneck.
> >>>>
> >>>> I'm not sure offhand about optimizations for HandleHttpRequest,
> >>>> perhaps someone else will jump in :)
> >>>>
> >>>> Regards,
> >>>> Matt
> >>>>
> >>>>
> >>>> On Wed, Apr 5, 2017 at 1:48 PM, James McMahon <[email protected]>
> >>>> wrote:
> >>>> > I am receiving POSTs from a Pentaho process, delivering files to my
> >>>> > NiFi 0.7.x workflow HandleHttpRequest processor. That processor
> >>>> > hands the flowfile off to an ExecuteScript processor that runs a
> >>>> > python script. This script is very, very simple: it takes an
> >>>> > incoming JSON object and loads it into a Python dictionary, and
> >>>> > verifies the presence of required fields using simple has_key
> >>>> > checks on the dictionary. There are only eight fields in the
> >>>> > incoming JSON object.
> >>>> >
> >>>> > The throughput for these two processes is not exceeding 100-150
> >>>> > files in five minutes. It seems very slow in light of the minimal
> >>>> > processing going on in these two steps.
> >>>> >
> >>>> > I notice that there are configuration operations seemingly related
> >>>> > to optimizing performance. "Concurrent tasks", for example, is only
> >>>> > set by default to 1 for each processor.
> >>>> >
> >>>> > What performance optimizations at the processor level do users
> >>>> > recommend? Is it advisable to crank up the concurrent tasks for a
> >>>> > processor, and is there an optimal performance point beyond which
> >>>> > you should not crank up that value? Are there trade-offs?
> >>>> >
> >>>> > I am particularly interested in optimizations for HandleHttpRequest
> >>>> > and ExecuteScript processors.
> >>>> >
> >>>> > Thanks in advance for your thoughts.
> >>>> >
> >>>> > cheers,
> >>>> >
> >>>> > Jim
> >>>
> >>>
> >>
> >
>
