Jeff,

You hit the nail on the head.
My "Concurrent Tasks" is set to 1. I shall have a fiddle with the numbers for both the threads, which is set to 10 and the concurrent tasks, and see if it helps. Thanks for your valuable assitance. Have a great weekend... Stephen On 07/04/2017 18:58, Jeff wrote: > Stephen, > > It would be good to see screenshots of your flow, and get a little more > information about your NiFi installation to help you get better > throughput of your data. Are you running NiFi as a single node? At > which processor in your flow are you noticing the queues backing up? In > the global settings menu, under "Controller Settings", what is "Maximum > Timer Driven Thread Count" set to? On your "MyImportProcessor" config, > on the "Scheduling" tab, what is "Concurrent Tasks" set to? > > In general terms, having more threads available to a processor means > you'll get greater throughput of the data, provided that your IO > configuration (disk read/write speed) can keep up. The number of > threads that NiFi is configured to use are made available to processors > as flowfiles are presented to a processor via an incoming queue based on > the number of concurrent tasks for which a processor is configured. > > In the UI, you can see how many tasks are currently being executed by > each processor, which will never be more than the "Maximum Timer Driven > Thread Count" (for processors configured to use timer-based scheduling). > > If you are experience backpressure on the incoming queue for > "MyImportProcessor", try increasing the number of "Concurrent Tasks" > available to that processor, and you may also want to increase the > number of "Maximum Timer Driven Tread Count". > > These are just some of the basics of getting more throughput in NiFi. > > On Thu, Apr 6, 2017 at 4:25 PM Stephen-Talk > <[email protected] <mailto:[email protected]>> > wrote: > > Thanks for the quick reply. > > Yes, that is quite correct. > The scenario is the following: > > The input flow is a "GetFile" process that collects csv files > (>100,000 lines) which in turn queues the file and parses each line to a > locally built processor (MyImportProcessor say) that submits them via > the REST API to a Drupal website. > The process works fine, but it is very slow, and would like to speed it > up by splitting the csv file into chunks so that it can then spawn > "MyImportProcessor" as many times as required. > > > On 06/04/2017 20:47, Jeff wrote: > > Hello Stephen, > > > > It's possible to watch the status of NiFi, and upon observing a > > particular status in which you're interested, you can use the REST API > > to create new processor groups. You'd also have to populate that > > processor group with processors and other components. Based on the > > scenario you mentioned, though, it sounds like you are looking at > being > > able to scale up available processing (via more concurrent threads, or > > more nodes in a cluster) once a certain amount of data is queued > up and > > waiting to be processed, rather than adding components to the existing > > flow. Is that correct? > > > > On Thu, Apr 6, 2017 at 3:30 PM Stephen-Talk > > <[email protected] > <mailto:[email protected]> > <mailto:[email protected] > <mailto:[email protected]>>> > > wrote: > > > > Hi, I am just a Nifi Inquisitor, > > > > Is it, or could it be possible to Dynamically spawn a > "Processor Group" > > when the input flow reaches a certain threshold. > > > > Thanking you in aniticipation. > > Stephen > > >

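As for the original question of spawning a "Process Group" once a
threshold is crossed, a sketch of the watch-and-create loop Jeff
describes might look like the following. The same assumptions apply as
in the previous sketch, the threshold value and group name are made up
for illustration, and the new group would still have to be populated
with processors via further REST calls:

    # Minimal sketch of the watch-and-create idea: when total queued
    # flowfiles cross a threshold, create a new (empty) process group.
    # Same assumptions as above: NiFi 1.x, unsecured, localhost:8080.
    import requests

    NIFI_API = "http://localhost:8080/nifi-api"
    QUEUE_THRESHOLD = 50_000  # hypothetical threshold, tune to taste

    def total_queued(group_id="root"):
        resp = requests.get(f"{NIFI_API}/flow/process-groups/{group_id}/status")
        resp.raise_for_status()
        snap = resp.json()["processGroupStatus"]["aggregateSnapshot"]
        # queuedCount is reported as a formatted string, e.g. "1,234"
        return sum(
            int(c["connectionStatusSnapshot"]["queuedCount"].replace(",", ""))
            for c in snap.get("connectionStatusSnapshots", [])
        )

    def create_process_group(name, parent_id="root"):
        """POST a new, empty child process group; populating it with
        processors takes further REST calls, as Jeff notes above."""
        body = {
            "revision": {"version": 0},  # new components start at version 0
            "component": {"name": name, "position": {"x": 0.0, "y": 0.0}},
        }
        resp = requests.post(
            f"{NIFI_API}/process-groups/{parent_id}/process-groups", json=body
        )
        resp.raise_for_status()
        return resp.json()["id"]

    if total_queued() > QUEUE_THRESHOLD:
        group_id = create_process_group("overflow-import-workers")
        print(f"Spawned process group {group_id}")

That said, as the thread concludes, the simpler first step is raising
"Concurrent Tasks" on MyImportProcessor (and the timer-driven thread
pool if needed), and splitting the large CSV (for example with a
SplitText processor) so those concurrent tasks can work in parallel.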