Introducing container_engine: integrate proton with arbitrary IO and concurrency models.

Alan Conway Sat, 23 Jan 2016 23:23:31 -0800

The proton connection engine
============================

The next proton release will introduce `proton::connection_engine` in
the C++ binding (and possibly `Qpid::Proton::ConnectionEngine` in Ruby,
if time permits).


The `connection_engine` is an alternative to `proton::container`, which
is based on the C reactor.

I'll explain why I think we need an alternative and then what the
`connection_engine` is.  Check the code for more details: there is some
documentation now, and there will be more before release.

Why do we need the connection_engine?
-------------------------------------

The website clearly states the goal: "Qpid Proton is a high-
performance, lightweight messaging library. It can be used in the
widest range of messaging applications, including brokers, client
libraries, routers, bridges, proxies, and more. Proton makes it trivial
to integrate with the AMQP 1.0 ecosystem from any platform,
environment, or language."

I believe the original proton design met all these goals *except* ease
of use. The messaging-handler and reactor did a good job on ease of
use, but undermined the "widest range" goal. The connection_engine is
going for the lot.

The proton core is a pure AMQP protocol engine. It is highly portable,
makes few assumptions and imposes few requirements. There is no thread
synchronization and no IO code, there is no shared state between
connections. You can process data from any IO source, and you can
process separate connections concurrently. The only requirement is not
to process a single connection concurrently. This core can scale from
simple clients to high-performance, multi-threaded servers like qpidd,
qpid dispatch and JBoss A-MQ.

However, the core is not very easy to use. Correctly co-ordinating the
`pn_connection_t`, `pn_transport_t` and `pn_collector_t` with IO is a
challenge.

Enter the reactor: a framework that handles this co-ordination and
manages socket IO for multiple connections. It delivers simple events
(via the messaging-handler interface) to user-defined event handlers.
It solves the ease-of-use problem very nicely.

Unfortunately it introduces new assumptions and restrictions. The C
reactor uses sockets and `poll` to serialize events from multiple
connections *in a single thread*. (On Windows it uses IOCP but still in
a single thread.)

That works for some scenarios but high-scale, high-performance servers
often need multi-threading and other dispatch mechanisms such as epoll,
kqueue, solaris event ports, proactive iocp etc. Developers often want
to use portable frameworks like boost::asio, libuv, libevent etc. The C
reactor can't participate in any of this.

The C reactor also creates problems for some language bindings.

In python and ruby it creates problems with interpreter locks. The
python binding has workarounds (IMO risky ones), and the ruby reactor is
unusable in a program with any other ruby threads.

The Go language is concurrent, so serializing with `poll` makes no
sense.  The standard `net.Conn` interface doesn't expose a file
descriptor, so you can't include it in the reactor's poll even if you
want to.

Windows IOCP is designed to be used as a concurrent *proactor*, which
is not the same as a reactor. The proton reactor uses IOCP internally
on windows, but forces it to act like `poll` which eliminates many of
its benefits.

Developing an IO framework that is portable, scalable, performant and
usable is *hard*. It is not in the scope of the Qpid project; Qpid is
about AMQP. So we want something that is easy to use (like the reactor)
but that focuses on AMQP, is highly portable, and lets the user choose
the IO and concurrency they want (like the proton core).

The connection_engine
---------------------

The `connection_engine` is a wrapper around the proton core. Each
`connection_engine` manages a *single connection*. It hides the
interaction between IO and the `pn_transport_t`/`pn_collector_t` and
delivers events to the user's handler, like the reactor. Each engine is
independent with no state shared between them, so they can be used in
multiple threads like the proton core.
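
To make the threading rule concrete, here is a minimal sketch of one
thread per connection. It is an illustration only: `counting_engine` and
`run_engines` are hypothetical stand-ins, not proton classes, and the
"processing" is just a byte count. The point it shows is that when each
thread touches only its own engine, no locks are needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// HYPOTHETICAL stand-in for an engine: it owns all the state for exactly
// one connection and shares nothing with other instances.
class counting_engine {
  public:
    counting_engine() : bytes_(0) {}
    // Pretend to process incoming bytes for this connection.
    void process(const std::string& data) { bytes_ += data.size(); }
    std::size_t bytes() const { return bytes_; }
  private:
    std::size_t bytes_;
};

// Run one thread per engine. Each thread uses only its own engine, so the
// rule "never process a single connection concurrently" holds without locks.
std::vector<std::size_t> run_engines(std::size_t connections, std::size_t rounds) {
    std::vector<counting_engine> engines(connections);
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < connections; ++i) {
        counting_engine* e = &engines[i];
        threads.push_back(std::thread([e, rounds]() {
            for (std::size_t r = 0; r < rounds; ++r) e->process("frame");
        }));
    }
    // Wait for every connection thread before reading the results.
    for (std::size_t i = 0; i < threads.size(); ++i) threads[i].join();
    std::vector<std::size_t> totals;
    for (std::size_t i = 0; i < engines.size(); ++i) totals.push_back(engines[i].bytes());
    assert(totals.size() == connections);
    return totals;
}
```

A thread pool or an event loop would work just as well, provided the
dispatcher never hands the same engine to two threads at once.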

Applications for the `connection_engine` are like those for the
reactor, using the same handler and event classes. With a little care
you can write handlers that will work with either. The C++ broker
examples for reactor and connection_engine demonstrate this.

The connection_engine class itself does no IO. You subclass and
override simple methods `io_read`, `io_write` and `io_close` with your
IO code.

There is a `socket_engine` subclass provided which implements TCP
sockets, so you can write applications out of the box without writing
your own IO. The engine examples are copies of the reactor examples
with a `socket_engine` instead of a container; you can compare them to
see how little difference there is.

You can use `socket_engine` with multiple threads, with poll, epoll,
kqueue, boost::asio or any framework you choose. The broker example
illustrates.

You can implement your own engine classes. For example if you use
boost::asio, you can implement an engine that uses
boost::asio::read/write and get the portability/performance benefits of
that library. You can implement engines for unusual platforms or IO
libraries that are unrelated to TCP, sockets or Unix-like file
descriptors. For example the connection_engine test code uses an in-
memory connection using std::deque<byte> as a transport.
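
As a rough sketch of what such an in-memory engine could look like, the
code below overrides the three IO methods on top of `std::deque`. It is
not the proton test code: `engine_sketch` is a hypothetical stand-in for
the real base class, the method signatures are assumptions (check the
proton headers for the real ones), and `char` is used as the byte type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

typedef std::deque<char> byte_queue;

// HYPOTHETICAL stand-in for the engine base class. The real class also
// drives the AMQP protocol; only the IO hooks are shown here, and their
// signatures are assumed for illustration.
class engine_sketch {
  public:
    virtual ~engine_sketch() {}
    // Read up to max bytes into buf, return the number of bytes read.
    virtual std::size_t io_read(char* buf, std::size_t max) = 0;
    // Write size bytes from buf, return the number of bytes written.
    virtual std::size_t io_write(const char* buf, std::size_t size) = 0;
    virtual void io_close() = 0;
};

// An engine whose "network" is a pair of in-memory queues.
class mem_engine : public engine_sketch {
  public:
    mem_engine(byte_queue& in, byte_queue& out) : in_(in), out_(out), closed_(false) {}

    std::size_t io_read(char* buf, std::size_t max) {
        if (closed_) return 0;
        std::size_t n = std::min(max, in_.size());
        byte_queue::iterator end = in_.begin() + static_cast<byte_queue::difference_type>(n);
        std::copy(in_.begin(), end, buf);   // hand the bytes to the caller
        in_.erase(in_.begin(), end);        // and consume them from the queue
        return n;
    }

    std::size_t io_write(const char* buf, std::size_t size) {
        if (closed_) return 0;
        out_.insert(out_.end(), buf, buf + size);
        return size;
    }

    void io_close() { closed_ = true; }

  private:
    byte_queue& in_;
    byte_queue& out_;
    bool closed_;
};

// Join two engines back to back: whatever a writes, b reads.
std::string roundtrip(const std::string& msg) {
    byte_queue a_to_b, b_to_a;
    mem_engine a(b_to_a, a_to_b);
    mem_engine b(a_to_b, b_to_a);
    a.io_write(msg.data(), msg.size());
    std::string got(msg.size(), '\0');
    std::size_t n = b.io_read(&got[0], got.size());
    assert(n == msg.size());
    got.resize(n);
    return got;
}
```

The same three overrides are where socket, boost::asio or any other IO
code would go; nothing else in the class depends on the transport.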

Sub-classes of `connection_engine` are implemented in your chosen
language, *not* in C. For example the ruby `ConnectionEngine` uses the
native ruby `IO` class. This avoids locking problems, since native ruby
IO is designed to work with ruby's threading and locking rules. It also
makes it easy to integrate with frameworks in your chosen language, for
example cool.io or celluloid in ruby.
