[ 
https://issues.apache.org/jira/browse/NIFI-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon DeVries updated NIFI-6157:
----------------------------------
    Description: 
NIFI-6068 made modifications such that a funnel wouldn't hold on to a 
TimerDriven thread excessively.  However, now it isn't holding on to the thread 
long enough...

Since Funnels and Local Ports are scheduled with the timer driven thread pool, 
they're competing for threads with all of the other processors on the graph.  
In a large flow with a large number of processors, potentially with multiple 
assigned concurrent tasks, funnels and ports get to run less and less 
frequently, since they are hard-coded to 1 concurrent task.

I'm open to implementation options, but a few possibilities are:
 * Increase the transferred FlowFile cap from 10K to 100K.  The thread will 
still be released if fewer than the requested 1000 FlowFiles are moved in a 
loop, so it won't hold on inappropriately, but it will still have the 
opportunity to move the files that need to be moved.  Furthermore, if back 
pressure on the outgoing relationship is engaged, the thread will be 
released.  Effectively, the amount transferred per run would be capped at 
100K or the outgoing queue capacity, whichever is lower.
 * Like above, but add a property to specify the max number of FlowFiles 
transferred per run.  Removing hard-coded magic numbers is good... but 
cluttering nifi.properties is bad, so it's a trade-off.
 * Increase the number of concurrent threads for funnels / ports.  This 
should probably be a configurable property, as the value should likely be 
proportional to the "size" of your flow, whatever that means for the 
system in question.
 * Increase the "run duration"... but I don't think I like that.
 * If session.getQueueSize exceeds some threshold, spin off a new thread to 
transfer those files... but that could be dangerous.
 * Create a new thread pool for ports / funnels, so they aren't starved by 
processors.  Similar to above, but reuses resources.  Still would need to 
determine the correct size of the pool.  This could be the best answer in 
theory, but would also require the most code work.
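
To make the first two options concrete, here is a rough, self-contained 
simulation of the batched transfer loop.  It is NOT the actual StandardFunnel 
code: the class, method, and parameter names (FunnelTransferSketch, 
transferOnce, batchSize, maxPerRun, outgoingCapacity) are hypothetical, and 
plain queues stand in for NiFi connections.  It only illustrates the three 
conditions under which the thread would be released: the per-run cap is 
reached, a batch comes back short, or the outgoing queue is full.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class FunnelTransferSketch {

    /**
     * Moves items from incoming to outgoing in batches and returns the
     * number moved during this single "run" (one hold of the thread).
     * All parameters are illustrative assumptions, not NiFi settings.
     */
    static int transferOnce(Deque<String> incoming, Deque<String> outgoing,
                            int outgoingCapacity, int batchSize, int maxPerRun) {
        int total = 0;
        while (total < maxPerRun) {
            // Back pressure: release the thread once the outgoing queue is full.
            int room = outgoingCapacity - outgoing.size();
            if (room <= 0) {
                break;
            }
            // Never request more than the batch size, the free room, or the
            // remainder of the per-run cap.
            int want = Math.min(batchSize, Math.min(room, maxPerRun - total));
            int moved = 0;
            while (moved < want && !incoming.isEmpty()) {
                outgoing.addLast(incoming.pollFirst());
                moved++;
            }
            total += moved;
            // A short batch means there is little left to do: release the thread.
            if (moved < batchSize) {
                break;
            }
        }
        return total;
    }

    /** Builds a queue of n placeholder items standing in for FlowFiles. */
    static Deque<String> queueOf(int n) {
        Deque<String> q = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            q.addLast("ff-" + i);
        }
        return q;
    }

    public static void main(String[] args) {
        // 25,000 queued items, batch size 1000, effectively unbounded outgoing queue.
        int with10k = transferOnce(queueOf(25_000), new ArrayDeque<>(), 1_000_000, 1_000, 10_000);
        int with100k = transferOnce(queueOf(25_000), new ArrayDeque<>(), 1_000_000, 1_000, 100_000);
        // Same input, but the outgoing queue only has room for 5,000 items.
        int backPressured = transferOnce(queueOf(25_000), new ArrayDeque<>(), 5_000, 1_000, 100_000);
        System.out.println(with10k + " " + with100k + " " + backPressured);
    }
}
```

In this sketch a 10K cap leaves 15,000 of the 25,000 items behind for the next 
scheduling, a 100K cap drains the queue in one run, and a 5,000-item outgoing 
queue stops the run early regardless of the cap.  Passing maxPerRun in as a 
parameter corresponds to the "add a property" variant.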

[~markap14], thoughts?

  was:
https://issues.apache.org/jira/browse/NIFI-6068?filter=-2 made modifications 
such that a funnel wouldn't hold on to a TimerDriven thread excessively.  
However, now it isn't holding on to the thread long enough...

Since Funnels and Local Ports are scheduled with the timer driven thread pool, 
they're competing for threads with all of the other processors on the graph.  
In a large flow with a large number of processors, potentially with multiple 
assigned concurrent tasks, funnels and ports get to run less and less 
frequently, since they are hard-coded to 1 concurrent task.

I'm open to implementation options, but a few possibilities are:
 * Increase the transferred FlowFile cap from 10K to 100K.  The thread will 
still be released if fewer than the requested 1000 FlowFiles are moved in a 
loop, so it won't hold on inappropriately, but it will still have the 
opportunity to move the files that need to be moved.  Furthermore, if back 
pressure on the outgoing relationship is engaged, the thread will be 
released.  Effectively, the amount transferred per run would be capped at 
100K or the outgoing queue capacity, whichever is lower.
 * Like above, but add a property to specify the max number of FlowFiles 
transferred per run.  Removing hard-coded magic numbers is good... but 
cluttering nifi.properties is bad, so it's a trade-off.
 * Increase the number of concurrent threads for funnels / ports.  This 
should probably be a configurable property, as the value should likely be 
proportional to the "size" of your flow, whatever that means for the 
system in question.
 * Increase the "run duration"... but I don't think I like that.
 * If session.getQueueSize exceeds some threshold, spin off a new thread to 
transfer those files... but that could be dangerous.
 * Create a new thread pool for ports / funnels, so they aren't starved by 
processors.  Similar to above, but reuses resources.  Still would need to 
determine the correct size of the pool.  This could be the best answer in 
theory, but would also require the most code work.

[~markap14], thoughts?


> StandardFunnel transferring too slowly
> --------------------------------------
>
>                 Key: NIFI-6157
>                 URL: https://issues.apache.org/jira/browse/NIFI-6157
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Brandon DeVries
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
