ActiveMQ 6.0 Broker Core Prototype -- Flow Control / Memory Management

Colin MacNaughton Wed, 10 Jun 2009 09:31:00 -0700

Hi Everyone,

As a follow on to my e-mail last week introducing the core broker
prototype that Hiram and I have been working on, I wanted to spin up a
thread on the flow control model that we're using.


I'd be interested to hear in your thoughts on current shortcomings
associated with flow control / memory management in 5.3 so we can make
sure that the use cases are covered. Beyond that any additional input on
the design or implementation would be great ... are we on the right
track?

Cheers,
Colin


The text below is taken straight from the webgen in the project, my
apologies if it's a little verbose!

As a reminder the bits can be found at:
https://svn.apache.org/repos/asf/activemq/sandbox/activemq-flow

The activemq-flow package is meant to be a standalone module that deals
generically with Resources and Flow's of elements that flow through and
between them. The current implementation is designed with the following
goals in mind:

    * SIMPLE: Want a fairly simple and consistent model for controlling
flow of messages and other data in the system to control memory and disk
space. The module must be able to handle fan-in/fan-out as well as
simpler 1 to 1 cases.
    * PERFORMANT: The flow control mechanism must be performant and
should not introduce much overhead in cases where downstream resources
are able to keep up. 
    * MODULARIZED: The module should be independent generic and
reusable.
    * FAIRNESS: We should be able to provide better fairness. If I've
got several producers putting messages on a queue, the flow controller
should not prefer one source over the other (unless configured to do so)
    * VISIBILITY: With a unified model in place we can instrument it to
provide visibility in the product (e.g. a visual graph of flows in the
system). When a customer says that they are not using PERSISTENT
messages yet we see 1000msgs/sec flowing through the recovery log....
    * ADMINISTRATION: We can explore the possibility of administratively
limiting message flows. E.g. I've done my production stress testing and
can successfully handle my anticipated load of 4000 msgs/sec on topic1
... I'd prefer to avoid the case where publishers go berserk and
overload my backend with messages).
    * POLICIES: We should be able to better instrument general flow
control policies. E.g. I want to tune for latency or throughput. If a
subscriber gets behind, I'd like the policy for messages on topic1 to be
that I drop the oldest messages instead of initiating flow control.

The Basics:

Each resource creates a FlowController for each of it's Flows which is
assigned a corresponding FlowLimiter. As elements (e.g. messages) pass
from one resource to another they are passed through the downstream
resource's FlowController which updates its Limiter. If propagation of
an element from one resource to another causes the downstream limiter to
become throttled the associated FlowController will block the source of
the element. The flow module is used heavily by the rest of the core for
memory and disk management.

    * Memory Management: Memory is managed based on the resources in
play -- the usage is computed by summing of the space allocated to each
of the resources' limiters. This strategy intentionally avoids a
centralized memory limit which leads to complicated logic to track when
a centralized limiter needs to be decremented and avoids contention
between multiple resources/threads accessing the limiter and also
reduces the potential for memory limiter related deadlocks. However, it
should be noted that this approach doesn't preclude implementing
centralized limiters in the future.
    * Flow Control: As messages propagate from one resource A to another
B, then if A overflows B's limit, B will block A and A can't release
it's limiter space until B unblocks it. This allowance for overflow into
downstream resources is a key concept in flow control performance and
ease of use. Provided that the upstream resource has already accounted
for the message's memory it can freely overflow any downstream limiter
providing it reserves space from elements that caused overflow.
    * Threading Model: Note that as a message propagates from A to B,
that the general contract is that A won't release it's memory if B
blocks it during the course of dispatch. This means that it is not safe
to perform a thread handoff during dispatch between two resources since
the thread dispatching A relies on the message making it to B (so that B
can block it) prior to A completing dispatch.
    * Management/Visibility: Another intended use of the activemq-flow
module is to assist in visibility e.g. provide an underlying map of
resources that can be exposed via tooling to see the relationships
between sources and sinks of messages and to find bottlenecks ... this
aspect has been downplayed for now as we have been focusing more on the
queueing/memory management model in the prototype, but eventually the
flow package itself will provide a handy way of providing visibility in
the system particularly in terms of finding performance bottlenecks.

FlowResource (FlowSink and FlowSource): A container for FlowControllers
providing some lifecycle related logic. The base resource class handles
interaction/registration with the FlowManager (below).

FlowManager: Registry for Flow's and FlowResources. The manager will
provide some hooks into system visibility. As mentioned above this
aspect has been downplayed somewhat for the present time.

FlowController: Wraps a FlowLimiter and actually implements common basic
block/resume logic between FlowControllers.

FlowLimiter: Defines the limits enforced by a FlowController. Currently
the package has size based limiter implementations, but eventually
should also support other common limiter types such as rate based
limiters. The limiter's are also extended at other points in the broker
(for example implementing a protocol based WindowLimiter). It is also
likely that we would want to introduce CompositeLimiters to combine
various limiter types.

Flow: The concept of a flow is not used very heavily right now. But a
Flow defines the stream of elements that can be blocked. In general the
prototype creates a single flow per resource, but in the future a source
may break it's elements down into more granular flows on which
downstream sinks may block it. One case where this is anticipated as
being useful is in networks of brokers where-in it may be desirable to
partition messages into more granular flows (e.g based on producer or
destination) to avoid blocking the broker-broker connection
uncessarily).

ActiveMQ 6.0 Broker Core Prototype -- Flow Control / Memory Management

Reply via email to