Hey James,

Are you making sure that every route from HandleHttpRequest ends at
a HandleHttpResponse? If not, the StandardHttpContextMap may be filling up
with unanswered requests, which would probably delay processing.

Thanks,
Bryan
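
A toy illustration of the concern above, in Python. This is only a sketch of a bounded map of pending requests, not NiFi's actual StandardHttpContextMap implementation; the class name, method names, and capacity are hypothetical. It shows why a request that never reaches a response step keeps occupying a slot until new requests are turned away.

```python
class ToyContextMap:
    """Hypothetical bounded registry of requests awaiting a response."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # assumed capacity limit
        self.pending = {}

    def register(self, request_id, context):
        """Called when a request arrives; returns False if the map is full."""
        if len(self.pending) >= self.max_outstanding:
            return False
        self.pending[request_id] = context
        return True

    def complete(self, request_id):
        """Called when a response is sent; frees the slot for that request."""
        return self.pending.pop(request_id, None)


if __name__ == "__main__":
    context_map = ToyContextMap(max_outstanding=2)
    context_map.register("req-1", {})
    context_map.register("req-2", {})
    # Neither request has been answered, so a third one is rejected.
    print(context_map.register("req-3", {}))
    # Answering one request frees a slot again.
    context_map.complete("req-1")
    print(context_map.register("req-3", {}))
```

In this model, any flow path that drops a request without calling complete() leaves its entry in place indefinitely.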

On Wed, Apr 5, 2017 at 2:07 PM, James McMahon <[email protected]> wrote:

> Thank you very much Matt. I have cranked my Concurrent Tasks config parm
> on my ExecuteScripts up to 20, and judging by the empty queue feeding that
> processor it is screaming through the flowfiles arriving at its doorstep.
>
> Can anyone comment on performance optimizations for HandleHttpRequest? In
> your experience, is HandleHttpRequest a bottleneck? I do notice that the
> processor often shows a count of flowfiles in process, anywhere from 1 to
> 10 when it does show such a count.
>
> -Jim
>
> On Wed, Apr 5, 2017 at 1:52 PM, Matt Burgess <[email protected]> wrote:
>
>> Jim,
>>
>> One quick thing you can try is to use GenerateFlowFile to send to your
>> ExecuteScript instead of HandleHttpRequest. You can configure it to
>> send whatever body with whatever attributes (such as you would get
>> from HandleHttpRequest) and send files at whatever rate the processor
>> is scheduled. This might take ExecuteScript out of the bottleneck
>> equation; if you are getting plenty of throughput without
>> HandleHttpRequest then that's probably your bottleneck.
>>
>> I'm not sure offhand about optimizations for HandleHttpRequest,
>> perhaps someone else will jump in :)
>>
>> Regards,
>> Matt
>>
>>
>> On Wed, Apr 5, 2017 at 1:48 PM, James McMahon <[email protected]> wrote:
>> > I am receiving POSTs from a Pentaho process, delivering files to my NiFi
>> > 0.7.x workflow HandleHttpRequest processor. That processor hands the
>> > flowfile off to an ExecuteScript processor that runs a Python script. This
>> > script is very, very simple: it takes an incoming JSON object, loads it
>> > into a Python dictionary, and verifies the presence of required fields using
>> > simple has_key checks on the dictionary. There are only eight fields in the
>> > incoming JSON object.
>> >
>> > The throughput for these two processes is not exceeding 100-150 files in
>> > five minutes. It seems very slow in light of the minimal processing going on
>> > in these two steps.
>> >
>> > I notice that there are configuration options seemingly related to
>> > optimizing performance. "Concurrent tasks", for example, is set to 1 by
>> > default for each processor.
>> >
>> > What performance optimizations at the processor level do users recommend? Is
>> > it advisable to crank up the concurrent tasks for a processor, and is there
>> > an optimal performance point beyond which you should not crank up that
>> > value? Are there trade-offs?
>> >
>> > I am particularly interested in optimizations for HandleHttpRequest and
>> > ExecuteScript processors.
>> >
>> > Thanks in advance for your thoughts.
>> >
>> > cheers,
>> >
>> > Jim
>>
>
>
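
For reference, the validation step described in the original question (load a JSON object into a dictionary and check that required fields are present) can be sketched roughly as follows. This is a standalone sketch, not Jim's actual script and not NiFi ExecuteScript code. The eight field names are hypothetical, since the thread does not list them, and the `in` operator is used in place of has_key because it works in both Python 2 and Python 3.

```python
import json

# Hypothetical field names: the thread only says there are eight fields.
REQUIRED_FIELDS = (
    "id", "source", "timestamp", "type",
    "owner", "status", "size", "checksum",
)


def missing_fields(raw_json, required=REQUIRED_FIELDS):
    """Return the required field names that are absent from the JSON object."""
    record = json.loads(raw_json)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object at the top level")
    # 'name in record' is the portable equivalent of record.has_key(name).
    return [name for name in required if name not in record]


if __name__ == "__main__":
    sample = '{"id": 1, "source": "pentaho"}'
    print(missing_fields(sample))
```

A check of this size should take well under a millisecond per object, which is consistent with the thread's suspicion that the script's own logic is not where the time goes.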
