Scott,

The slowness, if I recall correctly, is mostly related to Jython
initialization time.

There have been some discussions about this in the past:

https://lists.apache.org/thread.html/a4a56c91e43857df0e2e38797585d0f496f8723b550d6b866f2e28e4@<dev.nifi.apache.org>

Cheers


On 6 Apr 2017 06:26, "Scott Wagner" <[email protected]> wrote:

> In my experience with ExecuteScript and Python, having the script work on
> an individual FlowFile per run when you have multiple FlowFiles in the
> input queue is very inefficient, even when you set it to a timer of 0 sec.
>
> Instead, I have the following in all of my Python scripts:
>
> flowFiles = session.get(10)
> for flowFile in flowFiles:
>     if flowFile is None:
>         continue
>     # Do stuff here
>
> That seems to improve the throughput of the ExecuteScript processor
> dramatically.
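For reference, a minimal sketch of the batching pattern above with the transfer step filled in. It assumes the `session` and `REL_SUCCESS` variables that ExecuteScript binds for the script. The names `process_batch` and `batch_size` are illustrative only, and routing every FlowFile to success is an assumption.

```python
def process_batch(session, rel_success, batch_size=10):
    """Pull up to batch_size FlowFiles in one run and route each one.

    Returns the number of FlowFiles handled in this run.
    """
    handled = 0
    # session.get(n) returns a list of at most n FlowFiles (possibly empty).
    for flow_file in session.get(batch_size):
        if flow_file is None:
            continue
        # Do stuff here (hypothetical per-FlowFile work goes in this spot).
        session.transfer(flow_file, rel_success)
        handled += 1
    return handled

# Inside ExecuteScript the call would look like this (left commented out
# because `session` and `REL_SUCCESS` only exist inside NiFi):
# process_batch(session, REL_SUCCESS)
```

The point of the larger batch is that the per-run overhead of the script engine is paid once per batch instead of once per FlowFile. The best batch size likely depends on the flow.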
>
> YMMV
>
> - Scott
>
> James McMahon <[email protected]>
> Wednesday, April 5, 2017 12:48 PM
> I am receiving POSTs from a Pentaho process, delivering files to my NiFi
> 0.7.x workflow HandleHttpRequest processor. That processor hands the
> flowfile off to an ExecuteScript processor that runs a python script. This
> script is very, very simple: it takes an incoming JSON object and loads it
> into a Python dictionary, and verifies the presence of required fields
> using simple has_key checks on the dictionary. There are only eight fields
> in the incoming JSON object.
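A rough sketch of the kind of check described above, for readers following along. The field names are hypothetical since the message does not list the eight fields, and `missing_fields` is an illustrative helper, not the original script. It uses the `in` operator instead of `has_key`, which behaves the same on a dictionary in Jython 2.7 and also works on Python 3.

```python
import json

# Hypothetical field names; the original message does not say which
# eight fields are required.
REQUIRED_FIELDS = ("id", "timestamp", "source")

def missing_fields(raw_json, required=REQUIRED_FIELDS):
    """Return the required keys that are absent from the JSON object."""
    record = json.loads(raw_json)  # raises ValueError on malformed JSON
    return [name for name in required if name not in record]
```

An empty list means all required fields are present, so the script could route to success in that case and to failure otherwise.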
>
> The throughput for these two processes is not exceeding 100-150 files in
> five minutes. It seems very slow in light of the minimal processing going
> on in these two steps.
>
> I notice that there are configuration options seemingly related to
> optimizing performance. "Concurrent tasks", for example, is set to 1 by
> default for each processor.
>
> What performance optimizations at the processor level do users recommend?
> Is it advisable to crank up the concurrent tasks for a processor, and is
> there an optimal performance point beyond which you should not crank up
> that value? Are there trade-offs?
>
> I am particularly interested in optimizations for HandleHttpRequest and
> ExecuteScript processors.
>
> Thanks in advance for your thoughts.
>
> cheers,
>
> Jim
>
>
>
