Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
Hope this thread isn't dead. Mike - thanks for highlighting some really key issues at scale. On a related note, can someone from Ceilometer comment on the store-and-forward requirement? Currently scaling RabbitMQ is non-trivial. Though cells help make the problem smaller, as Paul Mathews points out in the video below, cells don't make the problems go away. Looking at the experience in the community, Qpid isn't an option either. Cheers, Subbu On Dec 9, 2013, at 4:36 PM, Mike Wilson geekinu...@gmail.com wrote: This is the first time I've heard of the dispatch router, I'm really excited now that I've looked at it a bit. Thx Gordon and Russell for bringing this up. I'm very familiar with the scaling issues associated with any kind of brokered messaging solution. We grew an OpenStack installation to about 7,000 nodes and started having significant scaling issues with the qpid broker. We've talked about our problems at a couple summits in a fair amount of detail[1][2]. I won't bother repeating the information in this thread. I really like the idea of separating the logic of routing away from the message emitter. Russell mentioned the 0mq matchmaker; we essentially ditched the qpid broker for direct communication via 0mq and its matchmaker. It still has a lot of problems which dispatch seems to address. For example, in ceilometer we have store-and-forward behavior as a requirement. This kind of communication requires a broker, but 0mq doesn't really officially support one, which means we would probably end up with some broker as part of OpenStack. Matchmaker is also a fairly basic implementation of what is essentially a directory. For any sort of serious production use case you end up sprinkling JSON files all over the place or maintaining a Redis backend. I feel like the matchmaker needs a bunch more work to make modifying the directory simpler for operations. 
I would rather put that work into a separate project like dispatch than have to maintain what is essentially a one-off in OpenStack's codebase. I wonder how this fits into messaging from a driver perspective in OpenStack, or even how this fits into oslo.messaging? Right now we have topics for binaries (compute, network, consoleauth, etc.), hostname.service_topic for nodes, a fanout queue per node (not sure if kombu also has this) and different exchanges per project. If we can abstract the routing from the emission of the message, all we really care about is emitter, endpoint, and messaging pattern (fanout, store and forward, etc.). Also not sure if there's a dispatch analogue in the rabbit world; if not we need to have some mapping of concepts etc. between impls. So many questions, but in general I'm really excited about this and eager to contribute. For sure I will start playing with this in Bluehost's environments that haven't been completely 0mqized. I also have some lingering concerns about qpid in general. Beyond scaling issues I've run into some other terrible bugs that motivated our move away from it. Again, these are mentioned in our presentations at summits and I'd be happy to talk more about them in a separate discussion. I've also been able to talk to some other qpid+openstack users who have seen the same bugs. Another large installation that comes to mind is Qihoo 360 in China. They run a few thousand nodes with qpid for messaging and are familiar with the snags we run into. Gordon, I would really appreciate if you could watch those two talks and comment. The bugs are probably separate from the dispatch router discussion, but it does dampen my enthusiasm a bit not knowing how to fix issues beyond scale :-(. 
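To make the directory idea above concrete, here is a toy sketch of a file-backed matchmaker of the kind Mike describes (purely illustrative: the class name, ring-file layout and lookup behaviour are invented for the example, not the actual oslo matchmaker code):

```python
import json
import os
import tempfile

# Toy, file-backed "matchmaker" directory: resolve a topic to the hosts
# that serve it. (Hypothetical sketch, not the real oslo implementation.)
class FileMatchMaker:
    def __init__(self, ring_path):
        with open(ring_path) as f:
            self.ring = json.load(f)

    def lookup(self, topic):
        # Direct topic -> hosts mapping; a real matchmaker also handles
        # bare topics, fanout expansion, dynamic registration, etc.
        return self.ring.get(topic, [])

# One of the "JSON files sprinkled all over the place".
ring = {"scheduler": ["sched-1", "sched-2"],
        "compute.node17": ["node17"]}
path = os.path.join(tempfile.mkdtemp(), "matchmaker_ring.json")
with open(path, "w") as f:
    json.dump(ring, f)

mm = FileMatchMaker(path)
print(mm.lookup("scheduler"))       # ['sched-1', 'sched-2']
print(mm.lookup("compute.node17"))  # ['node17']
```

Keeping such files consistent across a large deployment is exactly the operational burden described above; a Redis backend trades that for running yet another service.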
-Mike Wilson [1] http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment [2] http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/going-brokerless-the-transition-from-qpid-to-0mq On Mon, Dec 9, 2013 at 4:29 PM, Mark McLoughlin mar...@redhat.com wrote: On Mon, 2013-12-09 at 16:05 +0100, Flavio Percoco wrote: Greetings, As $subject mentions, I'd like to start discussing the support for AMQP 1.0[0] in oslo.messaging. We already have rabbit and qpid drivers for earlier (and different!) versions of AMQP, the proposal would be to add an additional driver for a _protocol_ not a particular broker. (Both RabbitMQ and Qpid support AMQP 1.0 now). By targeting a clear mapping on to a protocol, rather than a specific implementation, we would simplify the task in the future for anyone wishing to move to any other system that spoke AMQP 1.0. That would no longer require a new driver, merely different configuration and deployment. That would then allow openstack to more easily take advantage of any emerging innovations in this space. Sounds sane to me. To put it another way, assuming all AMQP 1.0
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 11/12/13 09:31 -0500, Andrew Laski wrote: On 12/10/13 at 11:09am, Flavio Percoco wrote: On 09/12/13 17:37 -0500, Russell Bryant wrote: On 12/09/2013 05:16 PM, Gordon Sim wrote: On 12/09/2013 07:15 PM, Russell Bryant wrote: [...] One other pattern that can benefit from intermediated message flow is in load balancing. If the processing entities are effectively 'pulling' messages, this can more naturally balance the load according to capacity than when the producer of the workload is trying to determine the best balance. Yes, that's another factor. Today, we rely on the message broker's behavior to equally distribute messages to a set of consumers. Sometimes you even _want_ message distribution to be 'unequal', if the load varies by message or the capacity by consumer. E.g. If one consumer is particularly slow (or is given a particularly arduous task), it may not be optimal for it to receive the same portion of subsequent messages as other less heavily loaded or more powerful consumers. Indeed. We haven't tried to do that anywhere, but it would be an improvement for some cases. Agreed, this is something worth experimenting with. [...] I'm very interested in diving deeper into how Dispatch would fit into the various ways OpenStack is using messaging today. I'd like to get a better handle on how the use of Dispatch as an intermediary would scale out for a deployment that consists of 10s of thousands of compute nodes, for example. Is it roughly just that you can have a network of N Dispatch routers that route messages from point A to point B, and for notifications we would use a traditional message broker (qpidd or rabbitmq)? For scaling, the basic idea is that not all connections are made to the same process and therefore not all messages need to travel through a single intermediary process. So for N different routers, each has a portion of the total number of publishers and consumers connected to them. 
Though clients can communicate even if they are not connected to the same router, each router only needs to handle the messages sent by the publishers directly attached, or sent to the consumers directly attached. It never needs to see messages between publishers and consumers that are not directly attached. To address your example, the 10s of thousands of compute nodes would be spread across N routers. Assuming these were all interconnected, a message from the scheduler would only travel through at most two of these N routers (the one the scheduler was connected to and the one the receiving compute node was connected to). No process needs to be able to handle 10s of thousands of connections itself (as contrasted with fully direct, non-intermediated communication, where the scheduler would need to manage connections to each of the compute nodes). This basic pattern is the same as networks of brokers, but the Dispatch router has been designed from the start to focus simply on that problem (and not deal with all the other broker-related features, such as transactions, durability, specialised queueing, etc.). Sounds awesome. :-) The other difference is that the Dispatch Router does not accept responsibility for messages, i.e. it does not offer any store-and-forward behaviour. Any acknowledgement is end-to-end. This avoids it having to replicate messages. On failure they can, if needed, be replayed by the original sender. I think the lack of store-and-forward is OK. Right now, all of the Nova code is written to assume that the messaging is unreliable and that any message could get lost. It may result in an operation failing, but it should fail gracefully. Doing end-to-end acknowledgement may actually be an improvement. This is interesting and a very important point. I wonder what the reliability expectations of other services w.r.t OpenStack messaging are. 
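The at-most-two-routers property can be sketched in a few lines (a toy model under the stated assumption that all N routers are directly interconnected; the attachment table and function name are hypothetical):

```python
# Each node attaches to exactly one router; a message crosses only the
# sender's router and the receiver's router (a single router if shared).
def route(attachment, sender, receiver):
    first, last = attachment[sender], attachment[receiver]
    return [first] if first == last else [first, last]

# 4 routers, 10,000 compute nodes spread across them, scheduler on router 0.
attachment = {"scheduler": 0}
for i in range(10000):
    attachment["compute-%d" % i] = i % 4

print(route(attachment, "scheduler", "compute-17"))  # [0, 1]
print(route(attachment, "scheduler", "compute-4"))   # [0]
# No path is ever longer than two routers:
print(max(len(route(attachment, "scheduler", n))
          for n in attachment if n != "scheduler"))  # 2
```

Each router here carries only its own quarter of the node connections, which is the point of the scaling argument above.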
I agree that p2p acknowledgement could be an improvement, but I'm also wondering how (if at all) this will affect projects in terms of requiring changes. One of the goals of this new driver is to not require any changes in the existing projects. Also, on a slightly different but related topic, are there cases where tasks are re-scheduled in nova? If so, what does nova do in this case? Are those tasks sent back to `nova-scheduler` for re-scheduling? Yes, there are certain build failures that can occur which will cause a re-schedule. That's currently accomplished by the compute node sending a message back to the scheduler so it can pick a new host. I'm trying to shift that a bit so we're messaging the conductor rather than the scheduler, but the basic structure of it is going to remain the same for now. If you mean in-progress operations being restarted after a service is restarted, then no. We're working towards making that possible but at the moment it doesn't exist. This is very valuable information. I wonder if the same applies for other projects. I'd expect cinder to behave pretty much the same way nova does in
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/10/13 at 11:09am, Flavio Percoco wrote: On 09/12/13 17:37 -0500, Russell Bryant wrote: On 12/09/2013 05:16 PM, Gordon Sim wrote: On 12/09/2013 07:15 PM, Russell Bryant wrote: [...] One other pattern that can benefit from intermediated message flow is in load balancing. If the processing entities are effectively 'pulling' messages, this can more naturally balance the load according to capacity than when the producer of the workload is trying to determine the best balance. Yes, that's another factor. Today, we rely on the message broker's behavior to equally distribute messages to a set of consumers. Sometimes you even _want_ message distribution to be 'unequal', if the load varies by message or the capacity by consumer. E.g. If one consumer is particularly slow (or is given a particularly arduous task), it may not be optimal for it to receive the same portion of subsequent messages as other less heavily loaded or more powerful consumers. Indeed. We haven't tried to do that anywhere, but it would be an improvement for some cases. Agreed, this is something worth experimenting with. [...] I'm very interested in diving deeper into how Dispatch would fit into the various ways OpenStack is using messaging today. I'd like to get a better handle on how the use of Dispatch as an intermediary would scale out for a deployment that consists of 10s of thousands of compute nodes, for example. Is it roughly just that you can have a network of N Dispatch routers that route messages from point A to point B, and for notifications we would use a traditional message broker (qpidd or rabbitmq)? For scaling, the basic idea is that not all connections are made to the same process and therefore not all messages need to travel through a single intermediary process. So for N different routers, each has a portion of the total number of publishers and consumers connected to them. 
Though clients can communicate even if they are not connected to the same router, each router only needs to handle the messages sent by the publishers directly attached, or sent to the consumers directly attached. It never needs to see messages between publishers and consumers that are not directly attached. To address your example, the 10s of thousands of compute nodes would be spread across N routers. Assuming these were all interconnected, a message from the scheduler would only travel through at most two of these N routers (the one the scheduler was connected to and the one the receiving compute node was connected to). No process needs to be able to handle 10s of thousands of connections itself (as contrasted with fully direct, non-intermediated communication, where the scheduler would need to manage connections to each of the compute nodes). This basic pattern is the same as networks of brokers, but the Dispatch router has been designed from the start to focus simply on that problem (and not deal with all the other broker-related features, such as transactions, durability, specialised queueing, etc.). Sounds awesome. :-) The other difference is that the Dispatch Router does not accept responsibility for messages, i.e. it does not offer any store-and-forward behaviour. Any acknowledgement is end-to-end. This avoids it having to replicate messages. On failure they can, if needed, be replayed by the original sender. I think the lack of store-and-forward is OK. Right now, all of the Nova code is written to assume that the messaging is unreliable and that any message could get lost. It may result in an operation failing, but it should fail gracefully. Doing end-to-end acknowledgement may actually be an improvement. This is interesting and a very important point. I wonder what the reliability expectations of other services w.r.t OpenStack messaging are. 
I agree that p2p acknowledgement could be an improvement, but I'm also wondering how (if at all) this will affect projects in terms of requiring changes. One of the goals of this new driver is to not require any changes in the existing projects. Also, on a slightly different but related topic, are there cases where tasks are re-scheduled in nova? If so, what does nova do in this case? Are those tasks sent back to `nova-scheduler` for re-scheduling? Yes, there are certain build failures that can occur which will cause a re-schedule. That's currently accomplished by the compute node sending a message back to the scheduler so it can pick a new host. I'm trying to shift that a bit so we're messaging the conductor rather than the scheduler, but the basic structure of it is going to remain the same for now. If you mean in-progress operations being restarted after a service is restarted, then no. We're working towards making that possible but at the moment it doesn't exist. Cheers, FF -- @flaper87 Flavio Percoco ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 10:37 PM, Russell Bryant wrote: On 12/09/2013 05:16 PM, Gordon Sim wrote: On 12/09/2013 07:15 PM, Russell Bryant wrote: Understood. The Dispatch Router was indeed created from an understanding of the limitations and drawbacks of the 'federation' feature of qpidd (which was the primary mechanism for scaling beyond one broker) as well as learning lessons around the difficulties of message replication and storage. Cool. To make the current situation worse, AFAIK, we've never been able to make Qpid federation work at all for OpenStack. That may be due to the way we use Qpid, though. The federation in qpidd requires path-specific configuration. I.e. for each topic or queue or whatever, you need to explicitly enable the flow of messages in each direction between each pair of brokers. The original use cases it was designed for were much simpler than OpenStack's needs and this wasn't insurmountable. As it was used in new areas, however, the limitations became apparent. The Dispatch Router instances on the other hand communicate with each other to automatically set up the internal routing necessary to ensure publishers and subscribers communicate regardless of the point at which they are attached. Another limitation of the original federation was the inability to handle redundant routes between brokers without duplicating messages. [...] The Dispatch Router can work for pub-sub patterns as well, though not store-and-forward directly. In theory, for flows where store-and-forward is needed, that can be supplied by an additional service, e.g. a more traditional broker, which would take over responsibility from the publisher so that subscribers could, if needed, have messages replayed even after the original publisher had exited. Any thoughts on what we would be recommending for notifications? For basic pub-sub (i.e. no durable/reliable subscriptions that retain messages across disconnects), dispatch router itself will work fine. 
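The contrast between per-path federation configuration and routers that propagate subscription state themselves can be sketched as follows (a toy model only; this is not the actual dispatch implementation, and all names are invented):

```python
# Routers advertise local subscriptions to their peers, so forwarding
# tables build themselves; with qpidd federation the equivalent links had
# to be configured by the operator per topic and per broker pair.
class RouterNetwork:
    def __init__(self, routers):
        self.routers = routers
        self.subscriptions = {}  # topic -> set of routers with subscribers

    def subscribe(self, router, topic):
        # A subscriber attaches somewhere; its router advertises that fact.
        self.subscriptions.setdefault(topic, set()).add(router)

    def forward_targets(self, topic):
        # Any router can now determine which peers need a copy.
        return sorted(self.subscriptions.get(topic, set()))

net = RouterNetwork(["r1", "r2", "r3"])
net.subscribe("r2", "notifications.info")
net.subscribe("r3", "notifications.info")
print(net.forward_targets("notifications.info"))  # ['r2', 'r3']
print(net.forward_targets("scheduler"))           # []
```

The point is that publishers and subscribers communicate regardless of where they attach, with no per-topic operator configuration.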
For those cases where you do need store-and-forward, i.e. where something other than the publisher needs to reliably store notifications until all interested subscribers have confirmed receipt, you could use a broker (rabbitmq, qpidd, activemq or similar). The plan for dispatch router is to allow such brokers to be hooked into the router network also. So you could have the broker(s) accept the messages from the publishers, and then have any subscribers needing reliability guarantees subscribe to the broker via the router network. The aim is to allow the load to be spread across multiple brokers, but have dispatch router control the flow of messages in and out of these, rather than relying on the brokers to link themselves up. This aspect, however, is not yet built (but is coming soon!). Durable/reliable subscriptions do of course bring with them the need to manage potential growth of backed-up, unconfirmed messages, e.g. when a subscriber is down or unresponsive for some time. Ensuring that there are limits in place that prevent this getting out of hand is essential.
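The backlog-limit point above can be illustrated with a toy bounded store for a durable subscription (hypothetical names throughout; real brokers offer several policies here, e.g. rejecting publishers or applying flow control rather than dropping):

```python
from collections import deque

# Toy broker-side store for a durable subscription, capping the backlog
# of unconfirmed messages while the subscriber is away.
class BoundedSubscription:
    def __init__(self, limit):
        self.backlog = deque()
        self.limit = limit
        self.dropped = 0

    def publish(self, msg):
        if len(self.backlog) >= self.limit:
            self.backlog.popleft()   # drop-oldest policy
            self.dropped += 1
        self.backlog.append(msg)

    def drain(self):
        # What the subscriber sees when it reconnects.
        msgs, self.backlog = list(self.backlog), deque()
        return msgs

sub = BoundedSubscription(limit=3)
for i in range(5):                   # subscriber down for 5 notifications
    sub.publish("notif-%d" % i)
msgs = sub.drain()
print(sub.dropped)                   # 2
print(msgs)                          # ['notif-2', 'notif-3', 'notif-4']
```

Without such a limit, a dead subscriber would grow the backlog without bound, which is exactly the "out of hand" case described above.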
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 11:29 PM, Mark McLoughlin wrote: On Mon, 2013-12-09 at 16:05 +0100, Flavio Percoco wrote: Greetings, As $subject mentions, I'd like to start discussing the support for AMQP 1.0[0] in oslo.messaging. We already have rabbit and qpid drivers for earlier (and different!) versions of AMQP, the proposal would be to add an additional driver for a _protocol_ not a particular broker. (Both RabbitMQ and Qpid support AMQP 1.0 now). By targeting a clear mapping on to a protocol, rather than a specific implementation, we would simplify the task in the future for anyone wishing to move to any other system that spoke AMQP 1.0. That would no longer require a new driver, merely different configuration and deployment. That would then allow openstack to more easily take advantage of any emerging innovations in this space. Sounds sane to me. To put it another way, assuming all AMQP 1.0 client libraries are equal, all the operator cares about is that we have a driver that connects into whatever AMQP 1.0 messaging topology they want to use. Of course, not all client libraries will be equal, so if we don't offer the choice of library/driver to the operator, then the onus is on us to pick the best client library for this driver. That is a fair point. One thing to point out about Qpid proton is that it is in fact two different things in the one library. On the one hand it is a fully fledged client library with its own IO and model of use. On the other hand it has a passive protocol engine that is agnostic as to the IO/threading approach used and merely encapsulates the encoding and protocol rules. This allows it to be integrated into different environments without imposing architectural restrictions. My suggestion would be to use the protocol engine, and design the IO and threading to work well with the rest of the oslo.messaging code (e.g. with eventlet or asyncio or whatever). 
In some ways this makes oslo.messaging a client library in its own right, with an RPC- and notification-based API, ensuring that other choices fit in well with the overall codebase.
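The passive protocol engine pattern described above can be sketched with a trivial newline-delimited toy protocol (illustrative only; this is not the proton API, just the shape of the idea: the engine does no IO itself, so eventlet, asyncio or plain sockets can all drive it):

```python
# The caller owns the socket and threading model; the engine just
# consumes bytes and produces events (and, in a real engine, bytes to be
# written back to the transport).
class LineProtocolEngine:
    def __init__(self):
        self._buf = b""
        self.events = []

    def feed(self, data):
        # Caller pushes whatever bytes the transport happened to produce.
        self._buf += data
        while b"\n" in self._buf:
            line, self._buf = self._buf.split(b"\n", 1)
            self.events.append(("message", line.decode()))

    def pending_output(self):
        # A real engine would also hand back protocol bytes to transmit.
        return b""

eng = LineProtocolEngine()
eng.feed(b"hel")             # partial frame: no event yet
print(eng.events)            # []
eng.feed(b"lo\nworld\nfoo")  # completes two frames, leaves one partial
print(eng.events)            # [('message', 'hello'), ('message', 'world')]
```

Because the engine never blocks or spawns threads, the surrounding code (an oslo.messaging executor, say) decides when to read, when to write, and on what thread, which is exactly the integration freedom being argued for.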
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/10/2013 12:36 AM, Mike Wilson wrote: This is the first time I've heard of the dispatch router, I'm really excited now that I've looked at it a bit. Thx Gordon and Russell for bringing this up. I'm very familiar with the scaling issues associated with any kind of brokered messaging solution. We grew an OpenStack installation to about 7,000 nodes and started having significant scaling issues with the qpid broker. We've talked about our problems at a couple summits in a fair amount of detail[1][2]. I won't bother repeating the information in this thread. I really like the idea of separating the logic of routing away from the message emitter. Russell mentioned the 0mq matchmaker; we essentially ditched the qpid broker for direct communication via 0mq and its matchmaker. It still has a lot of problems which dispatch seems to address. For example, in ceilometer we have store-and-forward behavior as a requirement. This kind of communication requires a broker, but 0mq doesn't really officially support one, which means we would probably end up with some broker as part of OpenStack. Matchmaker is also a fairly basic implementation of what is essentially a directory. For any sort of serious production use case you end up sprinkling JSON files all over the place or maintaining a Redis backend. I feel like the matchmaker needs a bunch more work to make modifying the directory simpler for operations. I would rather put that work into a separate project like dispatch than have to maintain what is essentially a one-off in OpenStack's codebase. I wonder how this fits into messaging from a driver perspective in OpenStack, or even how this fits into oslo.messaging? Right now we have topics for binaries (compute, network, consoleauth, etc.), hostname.service_topic for nodes, a fanout queue per node (not sure if kombu also has this) and different exchanges per project. 
If we can abstract the routing from the emission of the message, all we really care about is emitter, endpoint, and messaging pattern (fanout, store and forward, etc.). Also not sure if there's a dispatch analogue in the rabbit world; if not we need to have some mapping of concepts etc. between impls. So many questions, but in general I'm really excited about this and eager to contribute. For sure I will start playing with this in Bluehost's environments that haven't been completely 0mqized. I also have some lingering concerns about qpid in general. Beyond scaling issues I've run into some other terrible bugs that motivated our move away from it. Again, these are mentioned in our presentations at summits and I'd be happy to talk more about them in a separate discussion. I've also been able to talk to some other qpid+openstack users who have seen the same bugs. Another large installation that comes to mind is Qihoo 360 in China. They run a few thousand nodes with qpid for messaging and are familiar with the snags we run into. Gordon, I would really appreciate if you could watch those two talks and comment. The bugs are probably separate from the dispatch router discussion, but it does dampen my enthusiasm a bit not knowing how to fix issues beyond scale :-(. Mike (and others), First, as a Qpid developer, let me apologise for the frustrating experience you have had. The qpid components used here are not the most user friendly, it has to be said. They work well for the paths most usually taken, but there can be some unanticipated problems outside that. The main failing I think is that we in the Qpid community did not get involved in OpenStack to listen, understand the use cases and to help diagnose and address problems earlier. I joined this list specifically to try and rectify that failing. 
The specific issues I gleaned from the presentations were: (a) issues with eventlet and qpid.messaging integration: the qpid.messaging library does some perhaps quirky things that made the monkey-patched solution more awkward. The openstack rpc implementation over qpid was heavily driven by the kombu-based rabbitmq implementation, although the client libraries are quite different in design. The addressing syntax for the qpid.messaging library is not always the most intuitive. As suggested in another mail on this thread, for an AMQP 1.0 based driver I would pick an approach that allows oslo.messaging to retain control over threading choices etc. to avoid some of these sorts of integration issues, and program more directly to the protocol. (b) general scaling issues with a standalone qpidd instance: as you point out very clearly, a single broker is always going to be a bottleneck. Further, there are some aspects of the integration code that I think unnecessarily reduce performance. E.g. each call and cast is synchronous, only a single message of prefetch is enabled for any subscription (forcing more roundtrips), senders and receivers are created for every request and reply, etc. (c) message loss: the code I have studied doesn't enable
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 10/12/13 12:15, Gordon Sim wrote: On 12/09/2013 11:29 PM, Mark McLoughlin wrote: On Mon, 2013-12-09 at 16:05 +0100, Flavio Percoco wrote: Sounds sane to me. To put it another way, assuming all AMQP 1.0 client libraries are equal, all the operator cares about is that we have a driver that connects into whatever AMQP 1.0 messaging topology they want to use. Of course, not all client libraries will be equal, so if we don't offer the choice of library/driver to the operator, then the onus is on us to pick the best client library for this driver. That is a fair point. One thing to point out about Qpid proton is that it is in fact two different things in the one library. On the one hand it is a fully fledged client library with its own IO and model of use. On the other hand it has a passive protocol engine that is agnostic as to the IO/threading approach used and merely encapsulates the encoding and protocol rules. This allows it to be integrated into different environments without imposing architectural restrictions. My suggestion would be to use the protocol engine, and design the IO and threading to work well with the rest of the oslo.messaging code (e.g. with eventlet or asyncio or whatever). In some ways this makes oslo.messaging a client library in its own right, with an RPC- and notification-based API, ensuring that other choices fit in well with the overall codebase. This is very interesting and fits perfectly with oslo.messaging executors. Assuming we'll use proton, I imagine we'll have a simple implementation for a TCP transport that we could use with oslo.messaging executors to get and send messages. Cheers, FF -- @flaper87 Flavio Percoco
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 04:10 PM, Russell Bryant wrote: From looking, it appears that RabbitMQ's support is via an experimental plugin. I don't know any more about it. Has anyone looked at it in detail? I believe initial support was added in 3.1.0: http://www.rabbitmq.com/release-notes/README-3.1.0.txt I have certainly successfully tested basic interaction with RabbitMQ over 1.0. https://www.rabbitmq.com/specification.html As I understand it, Qpid supports it, in that it's a completely new implementation as a library (Proton) under the Qpid project umbrella. There's also a message router called Dispatch. This is not to be confused with the existing Qpid libraries, or the existing Qpid broker (qpidd). Yes, proton is a library that encapsulates the AMQP 1.0 encoding and protocol rules. It is used in the existing native broker (i.e. qpidd) to offer 1.0 support (as well as in the qpid::messaging C++ client library). In addition there is, as you mention, the 'Dispatch Router'. This is an alternative architecture for an intermediary to address some of the issues that can arise with qpidd or other brokers (distributed in nature to scale better, end-to-end reliability rather than store-and-forward, etc.). So the Qpid project offers both new components as well as 1.0 support and a smooth transition for existing components. (Disclosure: I'm a developer in the Qpid community also.) (There are of course other implementations also, e.g. ActiveMQ, ApolloMQ, HornetQ, Microsoft ServiceBus, SwiftMQ.) http://qpid.apache.org/proton/ http://qpid.apache.org/components/dispatch-router/ [...] All of this sounds fine to me. Surely a single driver for multiple systems is an improvement. What's not really mentioned though is why we should care about AMQP 1.0 beyond that. Why is it architecturally superior? It has been discussed on this list some before, but I figure it's worth re-visiting if some code is going to show up soon. 
Personally I think there is benefit to having a standardised, open wire protocol as the basis for communication in systems like OpenStack, rather than having the driver tied to a particular implementation throughout (and having more of the key details of the interaction as details of the implementation of the driver). The bytes over the wire are another level of interface, and having that tightly specified can be valuable. Having one driver that still offers choice with regard to the intermediaries used (I avoid the term broker in case it implies particular approaches) is, I think, an advantage. Hypothetically, for example, it would have been an advantage had the same driver been usable against both RabbitMQ and Qpid previously. The (bumpy!) evolution of AMQP meant that wasn't quite possible since they both spoke different versions of the early protocol. AMQP 1.0 might in the future avoid needing new drivers in such cases, however, making it easier to adopt alternative or emerging solutions. AMQP is not the only messaging protocol of course. However, its general-purpose nature and the fact that both RabbitMQ and Qpid really came about through AMQP makes it a reasonable choice. In the case of Nova (and others that followed Nova's messaging patterns), I firmly believe that for scaling reasons, we need to move toward it becoming the norm to use peer-to-peer messaging for most things. For example, the API and conductor services should be talking directly to compute nodes instead of through a broker. Is scale the only reason for preferring direct communication? I don't think an intermediary-based solution _necessarily_ scales less effectively (providing it is distributed in nature, which for example is one of the central aims of the dispatch router in Qpid). That's not to argue that peer-to-peer shouldn't be used, just trying to understand all the factors. One other pattern that can benefit from intermediated message flow is in load balancing. 
If the processing entities are effectively 'pulling' messages, this can more naturally balance the load according to capacity than when the producer of the workload is trying to determine the best balance. The exception to that is cases where we use a publish-subscribe model, and a broker serves that really well. Notifications and notification consumers (such as Ceilometer) are the prime example. The 'fanout' RPC cast would perhaps be another? In terms of existing messaging drivers, you could accomplish this with a combination of both RabbitMQ or Qpid for brokered messaging and ZeroMQ for the direct messaging cases. It would require only a small amount of additional code to allow you to select a separate driver for each case. Based on my understanding, AMQP 1.0 could be used for both of these patterns. It seems ideal long term to be able to use the same protocol for everything we need. That is certainly true. AMQP 1.0 is fully symmetric so it can be used directly peer-to-peer as well as
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 12:56 PM, Gordon Sim wrote: In the case of Nova (and others that followed Nova's messaging patterns), I firmly believe that for scaling reasons, we need to move toward it becoming the norm to use peer-to-peer messaging for most things. For example, the API and conductor services should be talking directly to compute nodes instead of through a broker. Is scale the only reason for preferring direct communication? I don't think an intermediary based solution _necessarily_ scales less effectively (providing it is distributed in nature, which for example is one of the central aims of the dispatch router in Qpid). That's not to argue that peer-to-peer shouldn't be used, just trying to understand all the factors. Scale is the primary one. If the intermediary based solution is easily distributed to handle our scaling needs, that would probably be fine, too. That just hasn't been our experience so far with both RabbitMQ and Qpid. One other pattern that can benefit from intermediated message flow is load balancing. If the processing entities are effectively 'pulling' messages, this can more naturally balance the load according to capacity than when the producer of the workload is trying to determine the best balance. Yes, that's another factor. Today, we rely on the message broker's behavior to equally distribute messages to a set of consumers. One example is how Nova components talk to the nova-scheduler service. All instances of the nova-scheduler service are reading off a single 'scheduler' queue, so messages hit them round-robin. In the case of the zeromq driver, this logic is embedded in the client. It has to know about all consumers and handle choosing where each message goes itself. See references to the 'matchmaker' code for this. Honestly, using a more lightweight, distributed router like Dispatch sounds *much* nicer. The exception to that is cases where we use a publish-subscribe model, and a broker serves that really well.
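The two distribution styles contrasted above can be sketched side by side: with a broker, all consumers share one queue and the broker round-robins; with the zeromq driver, a matchmaker-style directory lives in the client, which must pick a consumer itself. This is a minimal illustration of the idea, not the actual matchmaker code; the class and method names are invented.

```python
import itertools

# Minimal sketch of a client-side matchmaker: a topic -> hosts directory
# (backed in practice by JSON files or Redis) plus client-side
# round-robin selection. Illustrative names only, not the real oslo code.

class Matchmaker:
    def __init__(self, directory):
        self._directory = directory                           # topic -> [hosts]
        self._cycles = {t: itertools.cycle(h) for t, h in directory.items()}

    def lookup(self, topic):
        """Return all consumers registered for a topic."""
        return self._directory[topic]

    def pick(self, topic):
        """Client-side round-robin: the emitter chooses the consumer,
        which is exactly the logic a broker would otherwise perform."""
        return next(self._cycles[topic])

mm = Matchmaker({"scheduler": ["sched-1", "sched-2", "sched-3"]})
picks = [mm.pick("scheduler") for _ in range(4)]
# round-robins through sched-1, sched-2, sched-3, then wraps to sched-1
```

The operational pain described in this thread is keeping that directory up to date on every client, which is what a distributed router would take off the application's hands.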
Notifications and notification consumers (such as Ceilometer) are the prime example. The 'fanout' RPC cast would perhaps be another? Good point. In Nova we have been working to get rid of the usage of this pattern. In the latest code the only place it's used AFAIK is in some code we expect to mark deprecated (nova-network). In terms of existing messaging drivers, you could accomplish this with a combination of both RabbitMQ or Qpid for brokered messaging and ZeroMQ for the direct messaging cases. It would require only a small amount of additional code to allow you to select a separate driver for each case. Based on my understanding, AMQP 1.0 could be used for both of these patterns. It seems ideal long term to be able to use the same protocol for everything we need. That is certainly true. AMQP 1.0 is fully symmetric so it can be used directly peer-to-peer as well as between intermediaries. In fact, apart from the establishment of the connection in the first place, a process need not see any difference in the interaction either way. We could use only ZeroMQ, as well. It doesn't have the publish-subscribe stuff we need built in necessarily. Surely that has been done multiple times by others already, though. We could build it too, if we had to. Indeed. However the benefit of choosing a protocol is that you can use solutions developed outside OpenStack or any other single project. Can you (or someone) elaborate further on what will make this solution superior to our existing options? Superior is a very bold claim to make :-) I do personally think that an AMQP 1.0 based solution would be worthwhile for the reasons above. Given a hypothetical choice between say the current qpid driver and one that could talk to different back-ends, over a standard protocol for which e.g. semantic monitoring tools could be developed and which would make reasoning about partial upgrades or migrations easier, I know which I would lean to. 
Obviously that is not the choice here, since one already exists and the other is as yet hypothetical. However, as I say I think this could be a worthwhile journey and that would justify at least taking some initial steps. Thanks for sharing some additional insight. I was already quite optimistic, but you've helped solidify that. I'm very interested in diving deeper into how Dispatch would fit into the various ways OpenStack is using messaging today. I'd like to get a better handle on how the use of Dispatch as an intermediary would scale out for a deployment that consists of 10s of thousands of compute nodes, for example. Is it roughly just that you can have a network of N Dispatch routers that route messages from point A to point B, and for notifications we would use a traditional message broker (qpidd or rabbitmq) ? -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 05:16 PM, Gordon Sim wrote: On 12/09/2013 07:15 PM, Russell Bryant wrote: On 12/09/2013 12:56 PM, Gordon Sim wrote: In the case of Nova (and others that followed Nova's messaging patterns), I firmly believe that for scaling reasons, we need to move toward it becoming the norm to use peer-to-peer messaging for most things. For example, the API and conductor services should be talking directly to compute nodes instead of through a broker. Is scale the only reason for preferring direct communication? I don't think an intermediary based solution _necessarily_ scales less effectively (providing it is distributed in nature, which for example is one of the central aims of the dispatch router in Qpid). That's not to argue that peer-to-peer shouldn't be used, just trying to understand all the factors. Scale is the primary one. If the intermediary based solution is easily distributed to handle our scaling needs, that would probably be fine, too. That just hasn't been our experience so far with both RabbitMQ and Qpid. Understood. The Dispatch Router was indeed created from an understanding of the limitations and drawbacks of the 'federation' feature of qpidd (which was the primary mechanism for scaling beyond one broker) as well as learning lessons around the difficulties of message replication and storage. Cool. To make the current situation worse, AFAIK, we've never been able to make Qpid federation work at all for OpenStack. That may be due to the way we use Qpid, though. For RabbitMQ, I know people are at least using active-active clustering of the broker. One other pattern that can benefit from intermediated message flow is load balancing. If the processing entities are effectively 'pulling' messages, this can more naturally balance the load according to capacity than when the producer of the workload is trying to determine the best balance. Yes, that's another factor.
Today, we rely on the message broker's behavior to equally distribute messages to a set of consumers. Sometimes you even _want_ message distribution to be 'unequal', if the load varies by message or the capacity by consumer. E.g. if one consumer is particularly slow (or is given a particularly arduous task), it may not be optimal for it to receive the same portion of subsequent messages as other less heavily loaded or more powerful consumers. Indeed. We haven't tried to do that anywhere, but it would be an improvement for some cases. The exception to that is cases where we use a publish-subscribe model, and a broker serves that really well. Notifications and notification consumers (such as Ceilometer) are the prime example. The 'fanout' RPC cast would perhaps be another? Good point. In Nova we have been working to get rid of the usage of this pattern. In the latest code the only place it's used AFAIK is in some code we expect to mark deprecated (nova-network). Interesting. Is that because of problems in scaling the messaging solution or for other reasons? It's primarily a scaling concern. We're assuming that broadcasting messages is generally an anti-pattern for the massive scale we're aiming for. [...] I'm very interested in diving deeper into how Dispatch would fit into the various ways OpenStack is using messaging today. I'd like to get a better handle on how the use of Dispatch as an intermediary would scale out for a deployment that consists of 10s of thousands of compute nodes, for example. Is it roughly just that you can have a network of N Dispatch routers that route messages from point A to point B, and for notifications we would use a traditional message broker (qpidd or rabbitmq)? For scaling the basic idea is that not all connections are made to the same process and therefore not all messages need to travel through a single intermediary process.
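The 'unequal' distribution Gordon describes falls out naturally from consumer-driven credit: each consumer grants credit matching its spare capacity, so a slow consumer simply asks for less. The following is a toy simulation of that idea under assumed semantics; the function name and the greedy assignment policy are invented for illustration, not how any real flow-control implementation works internally.

```python
from collections import deque

# Toy simulation of capacity-based 'pull' distribution: messages are
# assigned to whichever consumer currently has the most outstanding
# credit, so a slow consumer (low credit) receives fewer messages.
# Illustrative sketch only; real AMQP credit is granted link-by-link.

def distribute(messages, credits):
    """Assign messages to consumers in proportion to outstanding credit."""
    queue = deque(messages)
    assigned = {c: [] for c in credits}
    while queue:
        # Pick the consumer with the most remaining credit (ties broken
        # by name order, just to keep the simulation deterministic).
        consumer = max(sorted(credits), key=lambda c: credits[c])
        if credits[consumer] == 0:
            break                      # nobody has capacity; messages wait
        assigned[consumer].append(queue.popleft())
        credits[consumer] -= 1
    return assigned

out = distribute(range(6), {"fast": 4, "slow": 2})
# 'fast' ends up with 4 of the 6 messages, 'slow' with 2
```

Contrast this with a producer-side round-robin, which would hand each consumer 3 messages regardless of capacity.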
So for N different routers, each has a portion of the total number of publishers and consumers connected to it. Though clients can communicate even if they are not connected to the same router, each router only needs to handle the messages sent by the publishers directly attached to it, or sent to the consumers directly attached to it. It never needs to see messages between publishers and consumers that are not directly attached. To address your example, the 10s of thousands of compute nodes would be spread across N routers. Assuming these were all interconnected, a message from the scheduler would only travel through at most two of these N routers (the one the scheduler was connected to and the one the receiving compute node was connected to). No process needs to be able to handle 10s of thousands of connections itself (as contrasted with fully direct, non-intermediated communication, where the scheduler would need to manage connections to each of the compute nodes). This basic pattern is the same as networks of brokers, but the Dispatch router has been designed from the start to
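The path-length property described above can be sketched as a two-line invariant: in a fully interconnected mesh, a message crosses the sender's router and the receiver's router, and only one router when both peers share it. The attachment map and function name below are illustrative; a real Dispatch deployment computes routes dynamically and need not be a full mesh.

```python
# Sketch of the at-most-two-routers property for a full mesh of routers.
# 'attachment' maps each node to the single router it is connected to.
# Illustrative only; real Dispatch topologies can be arbitrary graphs.

def routers_on_path(attachment, sender, receiver):
    src, dst = attachment[sender], attachment[receiver]
    # Full mesh: any two routers are directly connected, so a message
    # touches one router (shared) or two (sender's, then receiver's).
    return 1 if src == dst else 2

attach = {"scheduler": "r1", "compute-001": "r1", "compute-777": "r3"}
assert routers_on_path(attach, "scheduler", "compute-001") == 1
assert routers_on_path(attach, "scheduler", "compute-777") == 2
```

The key scaling consequence, as the message says, is that no single process (router or scheduler) has to terminate tens of thousands of connections.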
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On Mon, 2013-12-09 at 16:05 +0100, Flavio Percoco wrote: Greetings, As $subject mentions, I'd like to start discussing the support for AMQP 1.0[0] in oslo.messaging. We already have rabbit and qpid drivers for earlier (and different!) versions of AMQP, the proposal would be to add an additional driver for a _protocol_ not a particular broker. (Both RabbitMQ and Qpid support AMQP 1.0 now). By targeting a clear mapping on to a protocol, rather than a specific implementation, we would simplify the task in the future for anyone wishing to move to any other system that spoke AMQP 1.0. That would no longer require a new driver, merely different configuration and deployment. That would then allow openstack to more easily take advantage of any emerging innovations in this space. Sounds sane to me. To put it another way, assuming all AMQP 1.0 client libraries are equal, all the operator cares about is that we have a driver that connects to whatever AMQP 1.0 messaging topology they want to use. Of course, not all client libraries will be equal, so if we don't offer the choice of library/driver to the operator, then the onus is on us to pick the best client library for this driver. (Enjoying the rest of this thread too, thanks to Gordon for his insights) Mark.
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
This is the first time I've heard of the dispatch router, I'm really excited now that I've looked at it a bit. Thx Gordon and Russell for bringing this up. I'm very familiar with the scaling issues associated with any kind of brokered messaging solution. We grew an Openstack installation to about 7,000 nodes and started having significant scaling issues with the qpid broker. We've talked about our problems at a couple summits in a fair amount of detail[1][2]. I won't bother repeating the information in this thread. I really like the idea of separating the logic of routing away from the message emitter. Russell mentioned the 0mq matchmaker, we essentially ditched the qpid broker for direct communication via 0mq and its matchmaker. It still has a lot of problems which dispatch seems to address. For example, in ceilometer we have store-and-forward behavior as a requirement. This kind of communication requires a broker but 0mq doesn't really officially support one, which means we would probably end up with some broker as part of OpenStack. Matchmaker is also a fairly basic implementation of what is essentially a directory. For any sort of serious production use case you end up sprinkling JSON files all over the place or maintaining a Redis backend. I feel like the matchmaker needs a bunch more work to make modifying the directory simpler for operations. I would rather put that work into a separate project like dispatch than have to maintain essentially a one off in Openstack's codebase. I wonder how this fits into messaging from a driver perspective in Openstack or even how this fits into oslo.messaging? Right now we have topics for binaries (compute, network, consoleauth, etc), hostname.service_topic for nodes, fanout queue per node (not sure if kombu also has this) and different exchanges per project. If we can abstract the routing from the emission of the message all we really care about is emitter, endpoint, messaging pattern (fanout, store and forward, etc).
Also not sure if there's a dispatch analogue in the rabbit world, if not we need to have some mapping of concepts etc between impls. So many questions, but in general I'm really excited about this and eager to contribute. For sure I will start playing with this in Bluehost's environments that haven't been completely 0mqized. I also have some lingering concerns about qpid in general. Beyond scaling issues I've run into some other terrible bugs that motivated our move away from it. Again, these are mentioned in our presentations at summits and I'd be happy to talk more about them in a separate discussion. I've also been able to talk to some other qpid+openstack users who have seen the same bugs. Another large installation that comes to mind is Qihoo 360 in China. They run a few thousand nodes with qpid for messaging and are familiar with the snags we run into. Gordon, I would really appreciate if you could watch those two talks and comment. The bugs are probably separate from the dispatch router discussion, but it does dampen my enthusiasm a bit not knowing how to fix issues beyond scale :-(. -Mike Wilson [1] http://www.openstack.org/summit/portland-2013/session-videos/presentation/using-openstack-in-a-traditional-hosting-environment [2] http://www.openstack.org/summit/openstack-summit-hong-kong-2013/session-videos/presentation/going-brokerless-the-transition-from-qpid-to-0mq On Mon, Dec 9, 2013 at 4:29 PM, Mark McLoughlin mar...@redhat.com wrote: On Mon, 2013-12-09 at 16:05 +0100, Flavio Percoco wrote: Greetings, As $subject mentions, I'd like to start discussing the support for AMQP 1.0[0] in oslo.messaging. We already have rabbit and qpid drivers for earlier (and different!) versions of AMQP, the proposal would be to add an additional driver for a _protocol_ not a particular broker. (Both RabbitMQ and Qpid support AMQP 1.0 now). 
By targeting a clear mapping on to a protocol, rather than a specific implementation, we would simplify the task in the future for anyone wishing to move to any other system that spoke AMQP 1.0. That would no longer require a new driver, merely different configuration and deployment. That would then allow openstack to more easily take advantage of any emerging innovations in this space. Sounds sane to me. To put it another way, assuming all AMQP 1.0 client libraries are equal, all the operator cares about is that we have a driver that connects to whatever AMQP 1.0 messaging topology they want to use. Of course, not all client libraries will be equal, so if we don't offer the choice of library/driver to the operator, then the onus is on us to pick the best client library for this driver. (Enjoying the rest of this thread too, thanks to Gordon for his insights) Mark.
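Mike's point about abstracting routing away from emission can be sketched concretely: if all a sender states is (project, topic, pattern, optional host), the address each driver builds from that tuple becomes an internal detail. The address grammar below is invented purely for illustration; it is not an oslo.messaging or AMQP convention.

```python
# Sketch of the abstraction suggested above: derive an address purely
# from (project, topic, pattern[, host]), hiding per-driver details such
# as topics per binary, fanout queues per node, or exchanges per project.
# The "project/..." grammar here is invented for illustration.

def address_for(project, topic, pattern, host=None):
    if pattern == "topic":            # shared worker queue for a binary
        return "%s/%s" % (project, topic)
    if pattern == "direct":           # one specific node
        return "%s/%s.%s" % (project, topic, host)
    if pattern == "fanout":           # broadcast to all listeners
        return "%s/fanout/%s" % (project, topic)
    raise ValueError("unknown pattern: %s" % pattern)

assert address_for("nova", "scheduler", "topic") == "nova/scheduler"
assert address_for("nova", "compute", "direct", "node1") == "nova/compute.node1"
assert address_for("nova", "notifications", "fanout") == "nova/fanout/notifications"
```

With a scheme like this, the "mapping of concepts between impls" Mike asks about reduces to each driver interpreting the same small tuple in its own terms.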
[openstack-dev] [Oslo] First steps towards amqp 1.0
Greetings, As $subject mentions, I'd like to start discussing the support for AMQP 1.0[0] in oslo.messaging. We already have rabbit and qpid drivers for earlier (and different!) versions of AMQP, the proposal would be to add an additional driver for a _protocol_ not a particular broker. (Both RabbitMQ and Qpid support AMQP 1.0 now). By targeting a clear mapping on to a protocol, rather than a specific implementation, we would simplify the task in the future for anyone wishing to move to any other system that spoke AMQP 1.0. That would no longer require a new driver, merely different configuration and deployment. That would then allow openstack to more easily take advantage of any emerging innovations in this space. Since the proposal is not about replacing existing drivers but adding a new one for a specific protocol, there are some things that need to be taken into consideration: 1. The driver should be developed to work against various implementations of AMQP 1.0. 2. The driver shouldn't require support for 'backend-specific' features (Of course a backend specific driver may also be developed in the future if desired to exploit some non-standard feature etc). 3. AMQP 1.0 changed a lot of what we know about AMQP <= 0.10. It's more oriented to messaging rather than queues, brokers and exchanges. Some other benefits of unifying backend specific implementations under a single protocol based driver are: - It'll ease the incorporation of alternatives that emerge and have support for such protocol - It'll help maintain the code for that driver and it'll unify efforts throughout the community around that code. - It'll help developers to focus more on the benefits of the protocol itself rather than the benefits of that specific driver. - It fits perfectly as a non-opinionated feature that embraces existing and emerging technologies through an open standard. - A clear standard wire protocol will make reasoning about partial upgrades/migrations simpler That being said.
The benefits of having a *protocol* based driver do not apply only to AMQP but to any well-defined protocol with wide acceptance. However, AMQP 1.0 seems a reasonable fit right now and so a good protocol to begin with. Thoughts? Concerns? Ideas? [0] http://www.amqp.org/specification/1.0/amqp-org-download Cheers, FF -- @flaper87 Flavio Percoco
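The "merely different configuration and deployment" claim in Flavio's proposal is easy to sketch: oslo.messaging picks a driver from the scheme of the transport URL, so a protocol-based driver registered under one scheme would serve any AMQP 1.0 back-end. The mapping and function names below are an illustrative simplification, not the actual oslo driver registry.

```python
from urllib.parse import urlparse

# Sketch of scheme-based driver selection, in the spirit of how
# oslo.messaging resolves a driver from a transport URL. The DRIVERS
# mapping and driver_for() are illustrative, not the real registry.

DRIVERS = {
    "rabbit": "impl_rabbit",   # broker-specific driver
    "qpid": "impl_qpid",       # broker-specific driver
    "amqp": "impl_amqp10",     # hypothetical protocol-based driver
}

def driver_for(transport_url):
    scheme = urlparse(transport_url).scheme
    try:
        return DRIVERS[scheme]
    except KeyError:
        raise ValueError("no driver for scheme %r" % scheme)

# Switching AMQP 1.0 back-ends is then configuration, not code:
assert driver_for("amqp://rabbit-host:5672/") == "impl_amqp10"
assert driver_for("amqp://qpidd-host:5672/") == "impl_amqp10"
```

Under this scheme, adopting an emerging AMQP 1.0 implementation means changing the host in the URL, exactly the benefit the proposal lists.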
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 10:05 AM, Flavio Percoco wrote: Greetings, As $subject mentions, I'd like to start discussing the support for AMQP 1.0[0] in oslo.messaging. We already have rabbit and qpid drivers for earlier (and different!) versions of AMQP, the proposal would be to add an additional driver for a _protocol_ not a particular broker. (Both RabbitMQ and Qpid support AMQP 1.0 now). I didn't know the bit about RabbitMQ. Some clarification to make sure I understand the picture ... From looking it appears that RabbitMQ's support is via an experimental plugin. I don't know any more about it. Has anyone looked at it in detail? https://www.rabbitmq.com/specification.html As I understand it, Qpid supports it, in that it's a completely new implementation as a library (Proton) under the Qpid project umbrella. There's also a message router called Dispatch. This is not to be confused with the existing Qpid libraries, or the existing Qpid broker (qpidd). http://qpid.apache.org/proton/ http://qpid.apache.org/components/dispatch-router/ By targeting a clear mapping on to a protocol, rather than a specific implementation, we would simplify the task in the future for anyone wishing to move to any other system that spoke AMQP 1.0. That would no longer require a new driver, merely different configuration and deployment. That would then allow openstack to more easily take advantage of any emerging innovations in this space. Since the proposal is not about replacing existing drivers but adding a new one for a specific protocol, there are some things that need to be taken into consideration: 1. The driver should be developed to work against various implementations of AMQP 1.0. 2. The driver shouldn't require support for 'backend-specific' features (Of course a backend specific driver may also be developed in the future if desired to exploit some non-standard feature etc). 3. AMQP 1.0 changed a lot of what we know about AMQP <= 0.10.
It's more oriented to messaging rather than queues, brokers and exchanges. Some other benefits of unifying backend specific implementations under a single protocol based driver are: - It'll ease the incorporation of alternatives that emerge and have support for such protocol - It'll help maintain the code for that driver and it'll unify efforts throughout the community around that code. - It'll help developers to focus more on the benefits of the protocol itself rather than the benefits of that specific driver. - It fits perfectly as a non-opinionated feature that embraces existing and emerging technologies through an open standard. - A clear standard wire protocol will make reasoning about partial upgrades/migrations simpler That being said. The benefits of having a *protocol* based driver do not apply only to AMQP but to any well-defined protocol with wide acceptance. However, AMQP 1.0 seems a reasonable fit right now and so a good protocol to begin with. Thoughts? Concerns? Ideas? [0] http://www.amqp.org/specification/1.0/amqp-org-download All of this sounds fine to me. Surely a single driver for multiple systems is an improvement. What's not really mentioned though is why we should care about AMQP 1.0 beyond that. Why is it architecturally superior? It has been discussed on this list some before, but I figure it's worth re-visiting if some code is going to show up soon. In the case of Nova (and others that followed Nova's messaging patterns), I firmly believe that for scaling reasons, we need to move toward it becoming the norm to use peer-to-peer messaging for most things. For example, the API and conductor services should be talking directly to compute nodes instead of through a broker. The exception to that is cases where we use a publish-subscribe model, and a broker serves that really well. Notifications and notification consumers (such as Ceilometer) are the prime example.
In terms of existing messaging drivers, you could accomplish this with a combination of both RabbitMQ or Qpid for brokered messaging and ZeroMQ for the direct messaging cases. It would require only a small amount of additional code to allow you to select a separate driver for each case. Based on my understanding, AMQP 1.0 could be used for both of these patterns. It seems ideal long term to be able to use the same protocol for everything we need. We could use only ZeroMQ, as well. It doesn't have the publish-subscribe stuff we need built in necessarily. Surely that has been done multiple times by others already, though. We could build it too, if we had to. Can you (or someone) elaborate further on what will make this solution superior to our existing options? Thanks, -- Russell Bryant
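The hybrid Russell describes, a brokered driver for publish-subscribe and a direct driver for point-to-point RPC, amounts to a small per-pattern dispatch. The sketch below uses invented names to show how little code "select a separate driver for each case" would require; it is not a proposal for the actual oslo.messaging interface.

```python
# Sketch of per-pattern driver selection: publish-subscribe patterns
# (notifications, fanout) go through a broker, point-to-point RPC goes
# direct. All names here are invented for illustration.

BROKERED_PATTERNS = {"notify", "fanout"}

def pick_driver(pattern, brokered="rabbit_or_qpid", direct="zmq"):
    """Return which configured driver should carry a given pattern."""
    return brokered if pattern in BROKERED_PATTERNS else direct

assert pick_driver("notify") == "rabbit_or_qpid"   # pub-sub: via broker
assert pick_driver("fanout") == "rabbit_or_qpid"   # broadcast: via broker
assert pick_driver("call") == "zmq"                # RPC: direct
```

An AMQP 1.0 driver could collapse this split, since the same protocol handles both the intermediated and the peer-to-peer cases.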