Stephen,

Glad it has been narrowed down for you!

One other thing to try is adjusting "Run Duration" under the "Scheduling"
tab in your processor, if it supports it (I believe @SupportsBatching
enables this).  Increasing this value should result in higher throughput
for your processor, but flowfiles may be delayed a bit before they reach
the downstream processors since a batch needs to be completed before the
flowfiles are available to those processors.
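
To make that trade-off concrete, here is a rough sketch in Python.  It is
a toy model and not how NiFi schedules work internally: the function name
batch_tradeoff and the per_item_ms and commit_ms figures are made-up
assumptions, chosen only to show why larger batches raise throughput while
also delaying the first flowfile in each batch.

```python
def batch_tradeoff(batch_size, per_item_ms=1.0, commit_ms=10.0):
    """Toy model of batching: returns (items per second, worst-case delay in ms).

    per_item_ms and commit_ms are hypothetical costs, not measured NiFi values.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # One batch pays the per-item cost for every item plus one fixed commit cost.
    batch_ms = batch_size * per_item_ms + commit_ms
    throughput = batch_size / batch_ms * 1000.0
    # The first item in a batch is not released until the whole batch commits.
    return throughput, batch_ms


if __name__ == "__main__":
    for size in (1, 10, 100):
        throughput, delay = batch_tradeoff(size)
        print(f"batch={size:>3}  ~{throughput:.1f} items/s  "
              f"first item waits up to {delay:.0f} ms")
```

In this model the fixed commit cost is spread over more items as the batch
grows, so throughput climbs, but every item in the batch waits for the
commit.  Measuring your own processor is the only way to find a sensible
"Run Duration" for your flow.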

On Sat, Apr 8, 2017 at 5:08 AM Stephen-Talk <[email protected]>
wrote:

> Jeff,
>
> You hit the nail on the head.
>
> My "Concurrent Tasks" is set to 1.
>
> I shall have a fiddle with the numbers for both the thread count, which
> is set to 10, and the concurrent tasks, and see if it helps.
>
> Thanks for your valuable assistance.
>
> Have a great weekend...
> Stephen
>
> On 07/04/2017 18:58, Jeff wrote:
> > Stephen,
> >
> > It would be good to see screenshots of your flow, and get a little more
> > information about your NiFi installation to help you get better
> > throughput of your data.  Are you running NiFi as a single node?  At
> > which processor in your flow are you noticing the queues backing up?  In
> > the global settings menu, under "Controller Settings", what is "Maximum
> > Timer Driven Thread Count" set to?  On your "MyImportProcessor" config,
> > on the "Scheduling" tab, what is "Concurrent Tasks" set to?
> >
> > In general terms, having more threads available to a processor means
> > you'll get greater throughput of the data, provided that your IO
> > configuration (disk read/write speed) can keep up.  NiFi hands threads
> > from its configured pool to a processor as flowfiles arrive on the
> > processor's incoming queue, up to the number of concurrent tasks for
> > which that processor is configured.
> >
> > In the UI, you can see how many tasks are currently being executed by
> > each processor, which will never be more than the "Maximum Timer Driven
> > Thread Count" (for processors configured to use timer-based scheduling).
> >
> > If you are experiencing backpressure on the incoming queue for
> > "MyImportProcessor", try increasing the number of "Concurrent Tasks"
> > available to that processor, and you may also want to increase the
> > "Maximum Timer Driven Thread Count".
> >
> > These are just some of the basics of getting more throughput in NiFi.
> >
> > On Thu, Apr 6, 2017 at 4:25 PM Stephen-Talk <[email protected]>
> > wrote:
> >
> >     Thanks for the quick reply.
> >
> >     Yes, that is quite correct.
> >     The scenario is the following:
> >
> >     The input flow is a "GetFile" process that collects csv files
> >     (>100,000 lines) which in turn queues the file and parses each line
> >     to a locally built processor (MyImportProcessor say) that submits
> >     them via the REST API to a Drupal website.
> >     The process works fine, but it is very slow, and I would like to
> >     speed it up by splitting the csv file into chunks so that it can
> >     then spawn "MyImportProcessor" as many times as required.
> >
> >
> >     On 06/04/2017 20:47, Jeff wrote:
> >     > Hello Stephen,
> >     >
> >     > It's possible to watch the status of NiFi, and upon observing a
> >     > particular status in which you're interested, you can use the REST
> >     > API to create new processor groups.  You'd also have to populate
> >     > that processor group with processors and other components.  Based
> >     > on the scenario you mentioned, though, it sounds like you are
> >     > looking at being able to scale up available processing (via more
> >     > concurrent threads, or more nodes in a cluster) once a certain
> >     > amount of data is queued up and waiting to be processed, rather
> >     > than adding components to the existing flow.  Is that correct?
> >     >
> >     > On Thu, Apr 6, 2017 at 3:30 PM Stephen-Talk <[email protected]>
> >     > wrote:
> >     >
> >     >     Hi, I am just a NiFi Inquisitor,
> >     >
> >     >     Is it, or could it be, possible to dynamically spawn a
> >     >     "Processor Group" when the input flow reaches a certain
> >     >     threshold?
> >     >
> >     >     Thanking you in anticipation.
> >     >     Stephen
> >     >
> >
>
