On 6/1/15, 5:03 PM, "Davanum Srinivas" <dava...@gmail.com> wrote:
>fyi, the spec for zeromq driver in oslo.messaging is here:
>https://review.openstack.org/#/c/187338/1/specs/liberty/zmq-patterns-usage.rst,unified
>
>-- dims

I was about to provide some email comments on the above review outside of gerrit, but figured it would be good to first take a quick look at the state of this general effort to push out a better zmq driver for oslo messaging.

So I started to look around the oslo/zeromq wiki and saw a few email threads that drew my interest.

In this email (Nov 2014), Ilya proposes getting rid of a central broker for zmq:
http://lists.openstack.org/pipermail/openstack-dev/2014-November/050701.html
It is not clear whether Ilya already had in mind to have a local proxy on every node instead (as proposed in the above spec).

In this email (Mar 2014), Yatin described the prospect of using zmq in a completely broker-less way (so not even a proxy per node), with the use of matchmaker rings to configure well-known ports:
http://lists.openstack.org/pipermail/openstack-dev/2014-March/030411.html
That is pretty close to what I think would be a better design (with the variant that I'd rather see a robust and highly available name server instead of fixed port assignments). I'd be interested to know what happened to that proposal and why we ended up with a proxy-per-node solution at this stage (I'll reply to the proxy-per-node design in a separate email to complement my gerrit comments).

I could not find a single document that summarizes the issues related to rabbitMQ deployments. All that appears is that many people are unhappy with it: some are willing to switch to zmq, many are hesitant and some are decidedly skeptical. On my side I know of a number of issues related to oslo messaging over rabbitMQ.
I think it is important for the community to understand that, of the many issues generally attributed to oslo messaging over rabbitMQ, not all are caused by the choice of rabbitMQ as a transport (and hence those will likely not be fixed if we just switch from rabbitMQ to ZMQ). Many are actually caused by the misuse of oslo messaging by the apps (Neutron, Nova...) and can only be fixed by modifying the app code.

I personally think that there is a strong case for a properly designed ZMQ driver, but we first need to make the expectations very clear.

One long-standing issue I can see is that the oslo messaging API documentation is sorely lacking details in critical areas such as API behavior under fault conditions, load conditions and scale conditions. As a result, app developers sometimes use the APIs indiscriminately, and that will have an impact on the overall quality of openstack in deployment conditions. I understand that a lot of the existing code was written in a hurry and is good enough to work properly on small setups, but some code will break really badly under load or when things start to go south in the cloud. That is, unless the community realizes that something needs to be done.

We are only starting to see things break under load today because we have more lab tests at scale, more deployments at scale, and we are only starting to see real system-level testing at scale with HA testing (the kind of test where you inject load and cause failures of all sorts). Today we know that openstack behaves terribly in these conditions, even in so-called HA deployments!

As a first step, would it be useful to have one single official document that characterizes all the issues we're trying to fix, and perhaps use that document as a basis for showing which of these issues will be fixed by the use of the zmq driver? I think that could help us focus better on the requirements we need from this new ZMQ driver.
Thanks,
Alec