Hi Oleksii,

Thanks for putting together the slides; they are well done and extremely useful!

I find this 0MQ driver redesign proposal a much-needed improvement over the 
current design.
However, it is worth debating the need to keep the proxy server, and I would 
be interested to hear from others as well whether they feel this is something 
we should pursue.
Also, do we know the level of interest in the community to contribute to, 
use, or support a production-grade 0MQ driver in the future?

Comments inline...


From: ozamiatin <ozamia...@mirantis.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Date: Wednesday, May 27, 2015 at 3:52 AM
To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Subject: Re: [openstack-dev] [oslo.messaging][zeromq] Next step

Hi,

I'll try to address the question about Proxy process.

AFAIK there is no way yet in zmq to bind more than one socket to a specific 
port (e.g. tcp://*:9501).

Apparently we can:

socket1.bind('tcp://node1:9501')
socket2.bind('tcp://node2:9501')

but we can not:

socket1.bind('tcp://*:9501')
socket2.bind('tcp://*:9501')

So if we want a well-known port assigned to the driver, we need to use a 
proxy which receives on a single socket and redirects to a number of 
sockets.

Right, you can only bind once on a given ip/port.
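
The underlying reason is that the zmq TCP transport performs a real TCP bind, 
and only one socket can own a given ip/port. A self-contained stdlib sketch 
of the same failure (a second zmq bind would raise zmq.ZMQError with the same 
EADDRINUSE errno; the port here is OS-assigned so the snippet doesn't collide 
with anything else running):

```python
import errno
import socket

# First socket grabs an OS-assigned free port (stands in for *:9501)
s1 = socket.socket()
s1.bind(("127.0.0.1", 0))
port = s1.getsockname()[1]
s1.listen()

# A second bind on the same ip/port fails, just as a second zmq bind would
s2 = socket.socket()
try:
    s2.bind(("127.0.0.1", port))
    result = "bound"
except OSError as e:
    result = "EADDRINUSE" if e.errno == errno.EADDRINUSE else f"errno {e.errno}"
finally:
    s1.close()
    s2.close()

print(result)
```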



It is normal practice in zmq to do so. There are even some helpers 
implemented in the library, the so-called 'devices'.
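
As a sketch of what such a device looks like with pyzmq (assuming pyzmq is 
installed; the port is OS-assigned and the endpoint names are illustrative): a 
ROUTER front socket on one well-known endpoint fans requests out to workers 
over a DEALER back socket, using the built-in zmq.proxy helper:

```python
import threading
import zmq

ctx = zmq.Context.instance()

# Front socket: the single well-known endpoint that all clients target
frontend = ctx.socket(zmq.ROUTER)
front_port = frontend.bind_to_random_port("tcp://127.0.0.1")

# Back socket: fans requests out to local workers
backend = ctx.socket(zmq.DEALER)
backend.bind("inproc://workers")

def run_proxy():
    try:
        zmq.proxy(frontend, backend)  # the built-in shared-queue device
    except zmq.ZMQError:
        pass  # raised when the context is destroyed at shutdown

threading.Thread(target=run_proxy, daemon=True).start()

def worker():
    sock = ctx.socket(zmq.REP)
    sock.connect("inproc://workers")
    sock.send(b"reply:" + sock.recv())

threading.Thread(target=worker, daemon=True).start()

# A client only needs to know the one front port
client = ctx.socket(zmq.REQ)
client.setsockopt(zmq.RCVTIMEO, 5000)  # fail fast instead of hanging
client.connect(f"tcp://127.0.0.1:{front_port}")
client.send(b"ping")
reply = client.recv()
print(reply)
ctx.destroy(linger=0)
```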

Here the performance question is relevant. According to the ZeroMQ 
documentation [1], the basic heuristic is to allocate one I/O thread in the 
context for every gigabit per second of data that will be sent and received 
(aggregated).

The other way is to 'bind_to_random_port', but then we need some mechanism to 
notify the client about the port we are listening on, so it is a more 
complicated solution.

It is all relative, and I actually find it simpler overall than using a proxy ;-)
Dynamic port binding has some benefits as well; it is widely used and a well 
known/understood pattern.
In the current implementation, messaging endpoints register their topic in 
Redis with the host address (and implicitly the well-known port 9501); it 
would have been possible to register the host+port instead if we were to 
consider bypassing the proxy.
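
A minimal sketch of that alternative, with a plain dict standing in for the 
Redis name server and a stdlib socket standing in for the zmq endpoint (the 
topic name and helper names are made up for illustration): each endpoint 
binds to an OS-assigned port and publishes host+port to the name server, so 
no well-known port (and no proxy) is needed:

```python
import socket

# Stand-in for the Redis name server the current driver uses
# (topic -> "host:port"); the dict and topic name are illustrative only.
registry = {}

def bind_to_random_port(host="127.0.0.1"):
    # zmq's Socket.bind_to_random_port() is the equivalent for a zmq socket
    s = socket.socket()
    s.bind((host, 0))  # port 0: the OS assigns a free port
    s.listen()
    return s, s.getsockname()[1]

def register(topic, host, port):
    registry[topic] = f"{host}:{port}"

server, port = bind_to_random_port()
register("nova-compute.node1", "127.0.0.1", port)
print(registry["nova-compute.node1"])
server.close()
```

A client would then look up the topic in the registry and connect directly to 
the returned host:port, peer to peer.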


Why run it in a separate process? For the zmq API it doesn't matter whether 
you communicate between threads (INPROC), between processes (IPC) or between 
nodes (TCP, PGM and others). Because we need to run the proxy only once per 
node, it's easier to do it in a separate process. How would we track whether 
the proxy is already running if we put it in a thread of some service?

There would not be any proxy at all. Each endpoint (nova compute, ovs agent, 
neutron server...) would simply listen on its unique IP+port (that's real 
peer to peer).




In spite of having a broker-like instance locally, we still stay brokerless 
because we have no central broker node with a queue we need to replicate and 
keep alive. Each node is actually a peer. The broker is not a standalone node, 
so we cannot say that it is a 'single point of failure'.

You are correct in that regard (i.e. at the "node" level), but there is also 
process-level HA to consider (if the proxy process goes down for whatever 
reason, all endpoints on that node become unreachable). One complication to 
keep in mind is that some production deployment schemes (those that use a 
container or a VM per openstack service) would have to accommodate that extra 
proxy process (in the same way that they do today with rabbitMQ clusters) by 
adding one extra service "container" just for the proxy, which is pretty 
heavy-handed since you can't bundle the proxy process with any other 
container. For example, a typical compute node has 2 "packages" (nova-compute 
and the ovs agent, say); with the proxy there would need to be a third.


We can consider the local broker a part of the server. It is worth noting 
that IPC communication is much more reliable than real network communication.
One more benefit is that the proxy is stateless, so we don't have to bother 
about managing state (syncing it or having enough memory to keep it).

I agree the proxy server is not very complex and likely solid, but there are 
downsides to it as well.
Regarding TCP sockets vs. IPC sockets (which are probably based on unix 
domain sockets): note that you won't be able to use IPC if the proxy server 
has to run inside a VM by itself (as would be the case for certain deployment 
schemes), and you'd have to use TCP sockets in that case (with the extra 
burden of connecting the endpoints to the proxy server while they can all 
potentially reside in different VMs; you cannot just use 127.0.0.1 to find 
the local proxy server).



I'll cite the zmq guide about broker/brokerless (4.14 'Brokerless 
Reliability', p. 221):

"It might seem ironic to focus so much on broker-based reliability, when we 
often explain ØMQ as "brokerless messaging". However, in messaging, as in real 
life, the middleman is both a burden and a benefit. In practice, most messaging 
architectures benefit from a mix of distributed and brokered messaging. "

Brokers and middlemen are beneficial in many situations, no question about 
it. In this particular situation there is actually already a "broker" of 
sorts: the redis server ;-) The redis server acts as a name server and allows 
dynamic discovery of services (topics and their associated addresses).
Brokers are interesting for 2 reasons:

  *   decouple the participants of a communication infra (e.g. you do not 
want to hardcode anything about peers, their count or their addresses); this 
can be done by a broker a la the 0MQ example, or by a name server
  *   do something special with the messages being brokered (e.g. 
persistence, replication, multicast, load balancing, etc.), things that you 
can't do with simple peer-to-peer connections and where RabbitMQ 
(presumably) excels

Given that the proxy server does not seem to do anything special with the 
messages (other than forwarding/unicasting), and given that Redis could 
provide full end-to-end addressing, the need for a proxy seems greatly 
diminished.
A list of drawbacks of having a proxy server in every node:

  *   potential deployment complications, as noted above
  *   the total connection count is higher than in a proxy-less design; this 
could be a problem if we ever get to the point of encrypting every connection 
(btw 0MQ has supported encryption since version 4.0, although I have not 
personally tried it)
  *   one more hop for every message in both directions
  *   it is not clear whether the proxy would propagate disconnection events 
from one side to the other
  *   the proxy's buffering and flow control need tending (one policy may not 
fit all needs, and will it still be stateless?)

Lastly, note that the Redis server itself can be clustered for HA (a recently 
added feature), and this may be something we have to look at as well because 
it is another point of failure (it would be awkward to put the redis server 
on 1 controller node when HA calls for 3 controller nodes, for example).

I'm still relatively new to oslo messaging and still have a lot of questions 
regarding a deployment based on 0MQ. I think it is important that we properly 
assess the forces in favor of this protocol and make sure it provides a 
better option than rabbitMQ at production scale, using measurable evidence.

Thanks

  Alec







Thanks,
Oleksii


1 - http://zeromq.org/area:faq#toc7


On 5/26/15 18:57, Davanum Srinivas wrote:

Alec,

Here are the slides:
http://www.slideshare.net/davanum/oslomessaging-new-0mq-driver-proposal

All the 0mq patches to date should be either already merged in trunk
or waiting for review on trunk.

Oleksii, Li Ma,
Can you please address the other questions?

thanks,
Dims

On Tue, May 26, 2015 at 11:43 AM, Alec Hothan (ahothan)
<ahot...@cisco.com> wrote:


Looking at the next step following the design summit meeting on
0MQ, as the etherpad does not provide much information.
A few questions:
- would it be possible to make the slides presented (showing the proposed
changes in the 0MQ driver design) available somewhere?
- is there a particular branch in the oslo messaging repo that contains
0MQ related patches? I'm particularly interested in James Page's
patch to pool the 0MQ connections, but there might be others
- question for Li Ma: are you deploying with the straight upstream 0MQ
driver or with some additional patches?

The per-node proxy process (which is itself some form of broker) needs to
be removed completely if the new solution is to be made truly
broker-less. This will also eliminate the only single point of failure in
the path and reduce the number of 0MQ sockets (and hops per message) by
half.

I think it was proposed that we go ahead with the first draft of the new
driver (which still keeps the proxy server but reduces the number of
sockets) before eventually tackling the removal of the proxy server?



Thanks

  Alec



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

