Adding to what Aldrin already said...
If you are using the pattern where a background thread puts data on a
queue, and onTrigger polls the queue, we found that polling with a small
wait time and then yielding when no data is available works well.
For example:
MyObject obj = queue.poll(100, TimeUnit.MILLISECONDS);
if (obj == null) {
    context.yield();
    return;
}
The yield duration is configurable in the UI, so the user decides how long
to yield for. The wait time on the poll is also important, because without
it the processor can yield unnecessarily: if several concurrent tasks are
calling onTrigger and one of them happens to run while the queue is empty,
but data lands in the queue 1 ms later, that task would have yielded for no
reason. Hope I'm not confusing things more, just sharing my learning
experience from working on ListenSyslog.
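
To make that concrete, here is a rough sketch of what the onTrigger side of
the pattern can look like. It is only illustrative and not the actual
ListenSyslog code: the class name, the success relationship, the 100 ms
wait, and the use of byte[] as the queued type (standing in for MyObject
above) are all just assumptions for the example.

import java.io.IOException;
import java.io.OutputStream;
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;
import org.apache.nifi.processor.io.OutputStreamCallback;

public class QueuePollingProcessor extends AbstractProcessor {

    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("Messages pulled off the internal queue")
            .build();

    // Filled by a background reader thread that is started and stopped with
    // the processor lifecycle (more on that further down in the thread).
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();

    @Override
    public Set<Relationship> getRelationships() {
        return Collections.singleton(REL_SUCCESS);
    }

    @Override
    public void onTrigger(final ProcessContext context, final ProcessSession session)
            throws ProcessException {
        final byte[] message;
        try {
            // Wait a little so a concurrent task doesn't yield just because
            // data happened to arrive a millisecond after it checked.
            message = queue.poll(100, TimeUnit.MILLISECONDS);
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }

        if (message == null) {
            // Nothing to do; back off for the user-configured yield duration.
            context.yield();
            return;
        }

        FlowFile flowFile = session.create();
        flowFile = session.write(flowFile, new OutputStreamCallback() {
            @Override
            public void process(final OutputStream out) throws IOException {
                out.write(message);
            }
        });
        session.transfer(flowFile, REL_SUCCESS);
    }
}

The important bits are the timed poll and the yield; the rest is just the
boilerplate to get the dequeued data into a FlowFile.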
On Thu, Dec 3, 2015 at 12:15 AM, Aldrin Piri <[email protected]> wrote:
> Ian,
>
> The onTrigger method is called on the processor when the Flow Controller
> allocates a thread to that processor to execute. This share of resources
> is driven by the number of concurrent tasks and the overall processing
> load of the system. In times of light load, perhaps polling for data and
> not doing much else in the flow, the FlowController will continuously
> allocate execution time to that processor instance, invoking its
> onTrigger. The way to reason about this is that, without yielding, the
> processor is continuously looking for actionable events from whatever
> system/source it is consuming.
>
> In terms of considerations and concerns, the key item is that when you
> spawn these threads off, they are outside the purview of the framework
> itself, so an appropriate amount of diligence is needed to maintain them
> and handle them in conjunction with the lifecycle of the processor itself;
> the controller is not managing their execution.
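
Chiming in on this specific point: one rough way to keep such a background
thread tied to the processor lifecycle is to start it in an @OnScheduled
method and shut it down in @OnStopped. The sketch below is made up for
illustration, assuming the same kind of processor as in my snippet at the
top; the reader loop, the flag, and the 5 second shutdown timeout are just
placeholders.

// Members of the same processor class that polls the queue in onTrigger.
// The annotations used are org.apache.nifi.annotation.lifecycle.OnScheduled
// and org.apache.nifi.annotation.lifecycle.OnStopped.
private volatile ExecutorService readerExecutor;
private volatile boolean stopped = false;

@OnScheduled
public void startReader(final ProcessContext context) {
    stopped = false;
    readerExecutor = Executors.newSingleThreadExecutor();
    readerExecutor.submit(new Runnable() {
        @Override
        public void run() {
            while (!stopped) {
                // read from the external system and queue.offer(...) here
            }
        }
    });
}

@OnStopped
public void stopReader() {
    stopped = true;
    if (readerExecutor != null) {
        readerExecutor.shutdown();
        try {
            // Give the reader a moment to finish, then force it down.
            if (!readerExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
                readerExecutor.shutdownNow();
            }
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            readerExecutor.shutdownNow();
        }
    }
}

The framework calls these methods when the processor is scheduled and
stopped, but as noted above the thread itself is still yours to watch.
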
>
> In terms of what affects run duration, this is a net view of the number of
> available timer threads in conjunction with the number of extensions and
> their relative number of concurrent tasks. The FlowController will
> effectively dole out these specified threads to processors and extensions
> and let them run until they have completed their onTrigger method. The
> effect that occurs depends on what your mechanism for the background
> threads is and what those threads are doing. There are a few different
> approaches to how this is handled depending on the specific application
> and context, but you can see various approaches to how the "source"
> processors are orchestrated. As mentioned previously by Mark, the
> GetTwitter processor and the assorted Listen processors handle this kind
> of background, continuously executing task in different ways. The key item
> is that what "consumes" FlowController threads are processor instances and
> their onTrigger methods.
>
> If you would like some additional guidance or insights, any details or
> specifics on what you are trying to tackle would certainly aid us in
> getting into the nitty gritty a bit more.
>
> --aldrin
>
>
> On Wed, Dec 2, 2015 at 10:00 PM, ianwork <[email protected]> wrote:
>
> > I have some questions related to this type of processor... When is the
> > onTrigger method called? I'm noticing a high and constantly increasing
> > number of Tasks in the tasks/time statistics for my processor.
> >
> > "For some source/sink type processors, which sounds like your processor,
> it
> > can be acceptable to create a controlled number of threads. " What are
> the
> > considerations/concerns with having multiple threads in this situation?
> > What affect would run duration have on reading incoming messages into
> queue
> > and in ontrigger?