RE: [EXT] Re: Funnel Queue Slowness

2017-10-16 Thread Peter Wicks (pwicks)
Pierre,

I agree with you all around. It would be nice if it was a little smarter.

--Peter



Re: [EXT] Re: Funnel Queue Slowness

2017-10-16 Thread Pierre Villard
Peter,

This behaviour is by design and it's the case for processors as well.

Back pressure is only checked each time a component is scheduled, to
determine whether the component can run. If it can, the component runs as
configured and processes as many flow files as it is supposed to
process. In the case of funnels, a funnel always acts on a
batch of 100 flow files (
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core-api/src/main/java/org/apache/nifi/controller/StandardFunnel.java#L372
).
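
To make the scheduling behaviour above concrete, here is a minimal sketch in
Python. It is a simplified model, not NiFi's actual Java code: the names
`Connection`, `run_funnel_once` and `BATCH_SIZE` are hypothetical, and only
the batch size of 100 and the "check once per scheduling" rule come from the
description above.

```python
from collections import deque
from dataclasses import dataclass, field

BATCH_SIZE = 100  # fixed funnel batch size described above


@dataclass
class Connection:
    """Hypothetical stand-in for a queue with an object-count threshold."""
    threshold: int
    queue: deque = field(default_factory=deque)

    def is_full(self) -> bool:
        # Back pressure applies once the queued count reaches the threshold.
        return len(self.queue) >= self.threshold


def run_funnel_once(upstream: Connection, downstream: Connection) -> int:
    """Model one scheduling attempt; return how many flow files were moved."""
    # Back pressure is checked only here, before the component runs.
    if downstream.is_full():
        return 0
    moved = 0
    # Once allowed to run, the funnel moves a whole batch, ignoring the threshold.
    while moved < BATCH_SIZE and upstream.queue:
        downstream.queue.append(upstream.queue.popleft())
        moved += 1
    return moved


upstream = Connection(threshold=10_000, queue=deque(range(250)))
downstream = Connection(threshold=10)

first = run_funnel_once(upstream, downstream)   # downstream is empty, so the funnel runs
second = run_funnel_once(upstream, downstream)  # downstream is over threshold, so it is skipped
print(first, second, len(downstream.queue))
```

In this model a downstream threshold of 10 still ends up with 100 queued flow
files after the first run, and nothing moves again until the queue drops
below the threshold.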

You would see the same with other components. Let's say you have a
SplitText creating 10k flow files for each incoming flow file. Even if
back pressure is configured at 1k flow files on the downstream connection,
as long as the threshold has not been reached when the processor is
scheduled, the processor will be triggered and will produce the expected
number of flow files (which is over the back pressure threshold).

I agree this hard-coded number of 100 for funnels could be improved
(something like min(100, backpressure threshold - number of queued flow
files)) but I'm not sure that's really an issue.
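
As a rough illustration of the min(...) idea above, a sketch in Python (not
a patch against StandardFunnel). The function name `funnel_batch_size` and
its parameters are hypothetical, and clamping at zero is an extra assumption
so that a queue already over its threshold yields no transfer.

```python
def funnel_batch_size(threshold: int, queued: int, max_batch: int = 100) -> int:
    """Return how many flow files to move without exceeding the threshold."""
    remaining = threshold - queued  # space left before back pressure applies
    # Clamp at zero: a queue at or over its threshold gets nothing more.
    return max(0, min(max_batch, remaining))


print(funnel_batch_size(threshold=10, queued=0))    # small threshold limits the batch
print(funnel_batch_size(threshold=10, queued=9))    # only one slot left
print(funnel_batch_size(threshold=1000, queued=0))  # plenty of room, capped at 100
print(funnel_batch_size(threshold=10, queued=12))   # already over threshold
```

The sketch assumes a positive object-count threshold and ignores any
data-size threshold.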

Pierre








RE: [EXT] Re: Funnel Queue Slowness

2017-10-15 Thread Peter Wicks (pwicks)
Joe,

It really is about just forgetting that penalization is a thing. Penalized 
files are fairly well marked when you do a List Queue.

I think funnels need an overall re-examination. I noticed another quirk the 
other day when moving queues around that already contained FlowFiles: funnels 
ignore back pressure settings if there is any space available in the 
downstream queue.

Prep the FlowFiles: https://photos.app.goo.gl/Fu3EBDtQZ5wurQNt2
Configure the Queue to only allow Back Pressure of 10 files: 
https://photos.app.goo.gl/17OlJSu2NXkxQ8lZ2
Funnel grabs 100 FlowFiles no matter what and shoves them through: 
https://photos.app.goo.gl/vEwoZYETH6iMImBJ3

If you let the downstream processor run until there is space for 1 FlowFile, 
the funnel then loads in another 100 FlowFiles: 
https://photos.app.goo.gl/R4P5mdXr3L5oJnSw2

I created a ticket: NIFI-4486.

Thanks,
  Peter



Re: [EXT] Re: Funnel Queue Slowness

2017-10-09 Thread Joe Witt
Peter,

I see your point that it feels unnatural or at least surprising.
There are two challenges I see with what you propose.  One is
user-oriented and the other is technical.

The user-oriented one is that penalized objects are penalized as a
function of the thing that last operated on them.  The further away we
let the data get, the harder it would be to reason about why they were
penalized in the first place.

The technical one is that once something is penalized and placed into
the queue, there is prioritization and polling logic that kicks in as a
factor.  I'm not sure how we'd tweak it for that to be ok in some
cases and not in others.  Perhaps we could just make funnels truly a
pass-through and, when calculating the queue we're storing on, figure
out the first non-funnel queue, provided there is no cloning/branching
we'd have to account for.  But even then it brings us back to the
previous point, which is the user challenge of knowing which thing
penalized the objects in the queue in the first place.

Alternatively, we should review whether it is obvious enough (or at
all) that items within a queue at a given moment in time are
penalized.  I've worked with NiFi for a very long time and I'll be
honest and state I've forgotten that penalization was a thing more
than a few times too.

What do you think?

Thanks



RE: [EXT] Re: Funnel Queue Slowness

2017-10-09 Thread Peter Wicks (pwicks)
Bryan,

Yes, it was the penalty causing the issue. This feels like weird behavior for 
funnels, and I’m not sure it makes sense for penalties to work this way.

Would it make more sense if penalties were generally kept as is but not 
applied at funnels, so that the penalty kicks back in at the first 
non-funnel queue?

Thanks,
  Peter




Re: Funnel Queue Slowness

2017-10-09 Thread Bryan Bende
Peter,

The images didn’t come across for me, but since you mentioned that a failure 
queue is involved, is it possible all the flow files going to failure are being 
penalized which would cause them to not be processed immediately?
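
A minimal sketch of the effect described above, in Python rather than
NiFi's own code. The `FlowFile` class, its `penalty_expires_at` field and the
`poll` function are hypothetical; the only assumption is that a penalized
flow file is skipped by polling until its penalty has expired. The 30-second
penalty below is just an illustrative value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FlowFile:
    name: str
    penalty_expires_at: float = 0.0  # hypothetical field; 0 means not penalized

    def is_penalized(self, now: float) -> bool:
        return self.penalty_expires_at > now


def poll(queue: List[FlowFile], now: float) -> Optional[FlowFile]:
    """Remove and return the first flow file whose penalty has expired, if any."""
    for i, flow_file in enumerate(queue):
        if not flow_file.is_penalized(now):
            return queue.pop(i)
    return None  # everything queued is still penalized


# Two flow files routed to failure at t=100 with an illustrative 30 s penalty.
failure_queue = [FlowFile("a", 130.0), FlowFile("b", 130.0)]

early = poll(failure_queue, now=100.0)  # both still penalized
later = poll(failure_queue, now=131.0)  # penalty has expired
print(early, later.name, len(failure_queue))
```

If every flow file on a failure relationship is penalized, a downstream
component polling this queue sees nothing to do until the penalties expire,
which would look like a slow trickle.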

-Bryan






Funnel Queue Slowness

2017-10-08 Thread Peter Wicks (pwicks)
I've been running into an issue on 1.4.0 where my funnel sometimes runs slowly. I 
haven't been able to create a nice reproducible test case to pass on.
What I'm seeing is that my failure queue on the right will start to fill up, 
even though there is plenty of room for the FlowFiles in the next queue. You can 
see that the Tasks/Time is fairly low, only 24 in the last 5 minutes (first image), 
so it's not that the FlowFiles are moving so fast that they just appear to be 
in the queue.

If I stop the downstream processor, the files slowly trickle through the funnel 
into the next queue. I had an Oldest FlowFile First prioritizer on the 
downstream queue. I tried removing it but there was no change in behavior.
I saw this behavior once in the past, when my NiFi instance was 
thread starved, but there are plenty of threads available on the instance and 
all other processors are running fine. I also don't understand why it trickles 
the FlowFiles in; from what I've seen in the code, Funnel grabs large batches 
at one time...

Thoughts?

(Sometimes my images don't make it, let me know if that happens.)
[cid:image002.png@01D340EC.543FE750] [cid:image004.png@01D340EC.543FE750]