Yes sir! Sure am. And I know, because I have committed that very silly
mistake before. We are indeed seeing # responses = # requests  -Jim

On Wed, Apr 5, 2017 at 2:13 PM, Bryan Rosander <[email protected]> wrote:

> Hey James,
>
> Are you making sure that every route from HandleHttpRequest goes to
> a HandleHttpResponse?  If not, the StandardHttpContextMap may be filling up
> with requests which would probably delay processing.
>
> Thanks,
> Bryan
>
> On Wed, Apr 5, 2017 at 2:07 PM, James McMahon <[email protected]>
> wrote:
>
>> Thank you very much Matt. I have cranked my Concurrent Tasks config parm
>> on my ExecuteScripts up to 20, and judging by the empty queue feeding that
>> processor it is screaming through the flowfiles arriving at its doorstep.
>>
>> Can anyone comment on performance optimizations for HandleHttpRequest? In
>> your experiences, is HandleHttpRequest a bottleneck? I do notice that I
>> often have a count in the processor for "flowfile in process" within the
>> processor. Anywhere from 1 to 10 when it does show such a count.
>>
>> -Jim
>>
>> On Wed, Apr 5, 2017 at 1:52 PM, Matt Burgess <[email protected]>
>> wrote:
>>
>>> Jim,
>>>
>>> One quick thing you can try is to use GenerateFlowFile to send to your
>>> ExecuteScript instead of HandleHttpRequest, you can configure it to
>>> send whatever body with whatever attributes (such that you would get
>>> from HandleHttpRequest) and send files at whatever rate the processor
>>> is scheduled. This might take ExecuteScript out of the bottleneck
>>> equation; if you are getting plenty of throughput without
>>> HandleHttpRequest then that's probably your bottleneck.
>>>
>>> I'm not sure offhand about optimizations for HandleHttpRequest,
>>> perhaps someone else will jump in :)
>>>
>>> Regards,
>>> Matt
>>>
>>>
>>> On Wed, Apr 5, 2017 at 1:48 PM, James McMahon <[email protected]>
>>> wrote:
>>> > I am receiving POSTs from a Pentaho process, delivering files to my
>>> NiFi
>>> > 0.7.x workflow HandleHttpRequest processor. That processor hands the
>>> > flowfile off to an ExecuteScript processor that runs a python script.
>>> This
>>> > script is very, very simple: it takes an incoming JSO object and loads
>>> it
>>> > into a Python dictionary, and verifies the presence of required fields
>>> using
>>> > simple has_key checks on the dictionary. There are only eight fields
>>> in the
>>> > incoming JSON object.
>>> >
>>> > The throughput for these two processes is not exceeding 100-150 files
>>> in
>>> > five minutes. It seems very slow in light of the minimal processing
>>> going on
>>> > in these two steps.
>>> >
>>> > I notice that there are configuration operations seemingly related to
>>> > optimizing performance. "Concurrent tasks", for example,  is only set
>>> by
>>> > default to 1 for each processor.
>>> >
>>> > What performance optimizations at the processor level do users
>>> recommend? Is
>>> > it advisable to crank up the concurrent tasks for a processor, and is
>>> there
>>> > an optimal performance point beyond which you should not crank up that
>>> > value? Are there trade-offs?
>>> >
>>> > I am particularly interested in optimizations for HandleHttpRequest and
>>> > ExecuteScript processors.
>>> >
>>> > Thanks in advance for your thoughts.
>>> >
>>> > cheers,
>>> >
>>> > Jim
>>>
>>
>>
>

Reply via email to