Stephen,

It would be good to see screenshots of your flow, and get a little more
information about your NiFi installation to help you get better throughput
of your data.  Are you running NiFi as a single node?  At which processor
in your flow are you noticing the queues backing up?  In the global
settings menu, under "Controller Settings", what is "Maximum Timer Driven
Thread Count" set to?  On your "MyImportProcessor" config, on the
"Scheduling" tab, what is "Concurrent Tasks" set to?

In general terms, having more threads available to a processor means you'll
get greater throughput of the data, provided that your IO configuration
(disk read/write speed) can keep up.  NiFi hands threads from its configured
pool to a processor as flowfiles arrive on that processor's incoming queue,
up to the number of concurrent tasks for which the processor is configured.

In the UI, you can see how many tasks are currently being executed by each
processor, which will never be more than the "Maximum Timer Driven Thread
Count" (for processors configured to use timer-based scheduling).
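
To make the relationship between those two settings concrete, here is a
rough back-of-the-envelope sketch in Python.  It is only a simplified model,
not how NiFi schedules work internally: the function names and the numbers
are made up for illustration, and it assumes every flowfile takes the same
time and that no other processor is competing for the thread pool.

```python
import math


def effective_tasks(concurrent_tasks: int, max_timer_driven_threads: int) -> int:
    """Upper bound on tasks one processor can run at once (simplified model).

    A processor cannot run more tasks than its own "Concurrent Tasks" value,
    nor more than the size of the shared timer-driven thread pool.
    """
    return max(0, min(concurrent_tasks, max_timer_driven_threads))


def estimated_drain_seconds(queued_flowfiles: int,
                            seconds_per_flowfile: float,
                            concurrent_tasks: int,
                            max_timer_driven_threads: int) -> float:
    """Rough time to empty a queue, assuming uniform per-flowfile cost."""
    tasks = effective_tasks(concurrent_tasks, max_timer_driven_threads)
    if tasks == 0:
        raise ValueError("no threads available to the processor")
    # Flowfiles are handled in "rounds" of `tasks` at a time.
    rounds = math.ceil(queued_flowfiles / tasks)
    return rounds * seconds_per_flowfile


if __name__ == "__main__":
    # Hypothetical figures: 100,000 queued lines, 0.5 s per REST submission.
    for tasks in (1, 4, 20):
        seconds = estimated_drain_seconds(100_000, 0.5, tasks, 10)
        print(f"Concurrent Tasks={tasks}: about {seconds:.0f} s")
```

Note that in this model, asking for 20 concurrent tasks with a pool of 10
behaves the same as asking for 10, which is why both settings matter.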

If you are experiencing backpressure on the incoming queue for
"MyImportProcessor", try increasing the number of "Concurrent Tasks"
available to that processor, and you may also want to increase the
"Maximum Timer Driven Thread Count".
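
If you would rather script the "Concurrent Tasks" change than click through
the UI, the sketch below builds the kind of JSON body I would expect to send
in a PUT to /nifi-api/processors/{id}.  Treat the field names
("concurrentlySchedulableTaskCount", "revision") as my assumption and check
them against the REST API documentation for your NiFi version; the processor
id and revision number are placeholders, and the helper name is made up.  As
far as I know, the processor has to be stopped before its configuration can
be changed.

```python
import json


def concurrent_tasks_update(processor_id: str,
                            revision_version: int,
                            concurrent_tasks: int) -> dict:
    """Build a request body that changes only a processor's concurrent tasks.

    `revision_version` should be the current revision, which you would read
    from a GET on the same processor before sending the update.
    """
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return {
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            # Assumed field name; verify against your version's API docs.
            "config": {"concurrentlySchedulableTaskCount": concurrent_tasks},
        },
    }


if __name__ == "__main__":
    # Placeholder id and revision, for illustration only.
    body = concurrent_tasks_update("hypothetical-processor-id", 3, 4)
    print(json.dumps(body, indent=2))
```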

These are just some of the basics of getting more throughput in NiFi.

On Thu, Apr 6, 2017 at 4:25 PM Stephen-Talk <[email protected]>
wrote:

> Thanks for the quick reply.
>
> Yes, that is quite correct.
> The scenario is the following:
>
> The input flow is a "GetFile" process that collects csv files
> (>100,000 lines) which in turn queues the file and parses each line to a
> locally built processor (MyImportProcessor say) that submits them via
> the REST API to a Drupal website.
> The process works fine, but it is very slow, and I would like to speed it
> up by splitting the csv file into chunks so that it can then spawn
> "MyImportProcessor" as many times as required.
>
>
> On 06/04/2017 20:47, Jeff wrote:
> > Hello Stephen,
> >
> > It's possible to watch the status of NiFi, and upon observing a
> > particular status in which you're interested, you can use the REST API
> > to create new processor groups.  You'd also have to populate that
> > processor group with processors and other components.  Based on the
> > scenario you mentioned, though, it sounds like you are looking at being
> > able to scale up available processing (via more concurrent threads, or
> > more nodes in a cluster) once a certain amount of data is queued up and
> > waiting to be processed, rather than adding components to the existing
> > flow.  Is that correct?
> >
> > On Thu, Apr 6, 2017 at 3:30 PM Stephen-Talk
> > <[email protected] <mailto:[email protected]>>
> > wrote:
> >
> >     Hi, I am just a Nifi Inquisitor,
> >
> >     Is it, or could it be possible to dynamically spawn a
> >     "Processor Group" when the input flow reaches a certain threshold?
> >
> >     Thanking you in anticipation.
> >     Stephen
> >
>
