[
https://issues.apache.org/jira/browse/NIFI-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14577767#comment-14577767
]
Aldrin Piri commented on NIFI-627:
----------------------------------
I have been experimenting with this a bit, and I think we need to discuss what
the focus of this processor is. To Mike's points, the current behavior is in
fact misleading, though it is perhaps more of a usability problem than anything
else. The solution may simply be clearer, more thorough documentation of the
processor and its usage.
The general contract of the processor as currently developed is akin to that of
the MergeContent processor. The intent is that the duration is a window of
allowance for the given threshold before it closes. With the example laid out
above, the maximum throughput of the processor in terms of number of files is
1 per second or, in general, 1/T when every file exceeds the threshold, where T
is the configured duration. Each window always allows one file through,
regardless of how much larger that file is than the configured size.
Otherwise, there could be zero transmission.
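To make the window behavior concrete, here is a minimal sketch of a throttle
that follows the contract described above. It is an illustrative model only,
not the ControlRate source; the function name, its parameters, and the rule
"admit files until the threshold is reached" are assumptions made for the
example.

```python
from collections import deque


def simulate_window_throttle(file_sizes_kb, threshold_kb, duration_s, total_s):
    """Return sizes of files released by a simplified windowed throttle.

    Illustrative model only; not the ControlRate source code.
    """
    queue = deque(file_sizes_kb)
    released = []
    for _ in range(int(total_s // duration_s)):
        used_kb = 0  # each window starts with an empty allowance
        # Admit files while the window's threshold has not yet been reached.
        # The first file always passes, however far it overshoots.
        while queue and used_kb < threshold_kb:
            size_kb = queue.popleft()
            used_kb += size_kb
            released.append(size_kb)
    return released


# Example from the issue: 5 KB per 1 sec, 300 KB files, 5-minute stats window.
large = simulate_window_throttle([300] * 1000, threshold_kb=5, duration_s=1, total_s=300)
print(len(large), sum(large))  # 300 files, 90000 KB

# Small files fit several to a window: 1 KB files, two 1-second windows.
small = simulate_window_throttle([1] * 100, threshold_kb=5, duration_s=1, total_s=2)
print(len(small))  # 10 files
```

In this model, oversized files degrade the size-based limit into a count-based
one of one file per duration, which is the behavior Mike observed.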
The configured values are not extrapolated to the displayed 5-minute stats
window. What one should expect is a maximum of 1 file per time duration,
regardless of the criteria, so that the processor does not deadlock on a
criterion that may be impossible to satisfy. In the example as laid out by
Mike, I would expect up to
1 file/sec * 300 KB/file * 60 sec/min * 5 min/window ~= 90 MB/window
(5 minutes), i.e. 300 files. Of course, other factors, such as how
penalization, prioritization, and the like are configured on the processor,
would bring this number down further.
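The upper bound above can be checked with a few lines of arithmetic. The
constant names below are made up for the example, and decimal megabytes
(1 MB = 1000 KB) are assumed.

```python
FILE_SIZE_KB = 300        # approximate flow file size from the issue
DURATION_S = 1            # configured Time Duration
STATS_WINDOW_S = 5 * 60   # the displayed 5-minute stats window

# Every file exceeds the 5 KB threshold, so at most one passes per duration.
max_files = STATS_WINDOW_S // DURATION_S
max_kb = max_files * FILE_SIZE_KB
max_mb = max_kb / 1000    # decimal megabytes, matching the ~90 MB estimate

print(max_files, max_kb, max_mb)  # 300 90000 90.0
```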
I am not sure what makes the most sense, but that is my current perspective on
the issue. What Mike wants would be accomplished by setting a duration of
5 minutes to align with the stats window and specifying a threshold of
5 KB/s * 300 s/window ~= 1.5 MB/window.
That configuration provides the expected output in my sample flow.
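As a rough sketch of how that threshold is derived, the helper below scales a
target per-second rate up to a window that matches the stats display. The
helper name is hypothetical and is not part of NiFi.

```python
def threshold_for_window(rate_kb_per_s, window_s):
    """Size threshold (KB) that yields the target average rate over one window."""
    return rate_kb_per_s * window_s


STATS_WINDOW_S = 5 * 60                                 # 5-minute duration
threshold_kb = threshold_for_window(5, STATS_WINDOW_S)  # target of 5 KB/s
files_per_window = threshold_kb // 300                  # ~300 KB flow files

print(threshold_kb, files_per_window)  # 1500 5
```

With a 1500 KB (about 1.5 MB) threshold per 5-minute window, roughly five
300 KB files fit, which matches the "about 5 files through per 5 minutes" that
the issue description expects.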
> ControlRate processor does not accurately control the rate
> ----------------------------------------------------------
>
> Key: NIFI-627
> URL: https://issues.apache.org/jira/browse/NIFI-627
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 0.1.0
> Reporter: Michael Moser
> Assignee: Aldrin Piri
> Priority: Minor
> Fix For: 0.2.0
>
>
> Set a ControlRate processor to something like 5 KB per 1 sec. Generate flow
> files that are about 300 KB in size and feed a bunch to this processor. This
> should allow about 5 files through per 5 minutes. But it allows a lot more
> data through than it should. The difference seems to get worse with really
> low Time Duration values. And people tend to think in number of bytes per
> second so the temptation to set Time Duration to 1 sec is great.
> Also, if ControlRate has multiple input queues, it seems to output even more
> data than it should.
> This seems to be caused by the code at the beginning of ControlRate
> onTrigger(). Under some conditions when the number of files that are allowed
> through per Time Duration is less than 1, the Throttle is being removed from
> the throttleMap while it actually still should be in use.