On 12/09/2013 05:16 PM, Gordon Sim wrote:
> On 12/09/2013 07:15 PM, Russell Bryant wrote:
>> On 12/09/2013 12:56 PM, Gordon Sim wrote:
>>>> In the case of Nova (and others that followed Nova's messaging
>>>> patterns), I firmly believe that for scaling reasons, we need to move
>>>> toward it becoming the norm to use peer-to-peer messaging for most
>>>> things. For example, the API and conductor services should be talking
>>>> directly to compute nodes instead of through a broker.
>>>
>>> Is scale the only reason for preferring direct communication? I don't
>>> think an intermediary based solution _necessarily_ scales less
>>> effectively (providing it is distributed in nature, which for example is
>>> one of the central aims of the dispatch router in Qpid).
>>>
>>> That's not to argue that peer-to-peer shouldn't be used, just trying to
>>> understand all the factors.
>>
>> Scale is the primary one. If the intermediary based solution is easily
>> distributed to handle our scaling needs, that would probably be fine,
>> too. That just hasn't been our experience so far with both RabbitMQ and
>> Qpid.
>
> Understood. The Dispatch Router was indeed created from an understanding
> of the limitations and drawbacks of the 'federation' feature of qpidd
> (which was the primary mechanism for scaling beyond one broker) as well
> as learning lessons around the difficulties of message replication and
> storage.

Cool. To make the current situation worse, AFAIK, we've never been able
to make Qpid federation work at all for OpenStack. That may be due to
the way we use Qpid, though. For RabbitMQ, I know people are at least
using active-active clustering of the broker.

>>> One other pattern that can benefit from intermediated message flow is in
>>> load balancing. If the processing entities are effectively 'pulling'
>>> messages, this can more naturally balance the load according to capacity
>>> than when the producer of the workload is trying to determine the best
>>> balance.
>>
>> Yes, that's another factor. Today, we rely on the message broker's
>> behavior to equally distribute messages to a set of consumers.
>
> Sometimes you even _want_ message distribution to be 'unequal', if the
> load varies by message or the capacity by consumer. E.g. If one consumer
> is particularly slow (or is given a particularly arduous task), it may
> not be optimal for it to receive the same portion of subsequent messages
> as other less heavily loaded or more powerful consumers.

Indeed. We haven't tried to do that anywhere, but it would be an
improvement for some cases.

>>>> The exception
>>>> to that is cases where we use a publish-subscribe model, and a broker
>>>> serves that really well. Notifications and notification consumers
>>>> (such as Ceilometer) are the prime example.
>>>
>>> The 'fanout' RPC cast would perhaps be another?
>>
>> Good point.
>>
>> In Nova we have been working to get rid of the usage of this pattern.
>> In the latest code the only place it's used AFAIK is in some code we
>> expect to mark deprecated (nova-network).
>
> Interesting. Is that because of problems in scaling the messaging
> solution or for other reasons?

It's primarily a scaling concern. We're assuming that broadcasting
messages is generally an anti-pattern for the massive scale we're aiming
for.

> [...]
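
To make the 'pulling' point from the load-balancing exchange above a bit
more concrete, here is a rough sketch. It is only a toy simulation under
stated assumptions: the function names and the fixed per-consumer service
times are made up for illustration and are not taken from Nova, Qpid, or
RabbitMQ. It compares a producer that hands out messages in strict
rotation with consumers that each take the next message only when idle.

```python
import heapq


def push_round_robin(num_messages, service_times):
    """Producer assigns messages in strict rotation, ignoring capacity.

    service_times[i] is the (hypothetical) time consumer i needs per message.
    Returns (messages handled per consumer, time the last consumer finishes).
    """
    counts = [0] * len(service_times)
    for n in range(num_messages):
        counts[n % len(service_times)] += 1
    finish = max(c * t for c, t in zip(counts, service_times))
    return counts, finish


def pull_when_idle(num_messages, service_times):
    """Each consumer takes the next message only once it becomes idle."""
    counts = [0] * len(service_times)
    # Min-heap of (time at which the consumer becomes idle, consumer index).
    idle_at = [(0.0, i) for i in range(len(service_times))]
    heapq.heapify(idle_at)
    finish = 0.0
    for _ in range(num_messages):
        now, i = heapq.heappop(idle_at)  # earliest idle consumer pulls next
        counts[i] += 1
        done = now + service_times[i]
        finish = max(finish, done)
        heapq.heappush(idle_at, (done, i))
    return counts, finish


if __name__ == "__main__":
    times = [1.0, 1.0, 4.0]  # assumed: the third consumer is four times slower
    print("push:", push_round_robin(12, times))
    print("pull:", pull_when_idle(12, times))
```

In this toy model the slow consumer ends up with a smaller share of the
work under the pull scheme, which is the 'unequal' distribution Gordon
describes. Real brokers and routers have prefetch/credit settings that
affect how closely they approximate this.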
>> I'm very interested in diving deeper into how Dispatch would fit into
>> the various ways OpenStack is using messaging today. I'd like to get
>> a better handle on how the use of Dispatch as an intermediary would
>> scale out for a deployment that consists of 10s of thousands of
>> compute nodes, for example.
>>
>> Is it roughly just that you can have a network of N Dispatch routers
>> that route messages from point A to point B, and for notifications we
>> would use a traditional message broker (qpidd or RabbitMQ)?
>
> For scaling the basic idea is that not all connections are made to the
> same process and therefore not all messages need to travel through a
> single intermediary process.
>
> So for N different routers, each has a portion of the total number of
> publishers and consumers connected to it. Though clients can
> communicate even if they are not connected to the same router, each
> router only needs to handle the messages sent by the publishers directly
> attached, or sent to the consumers directly attached. It never needs to
> see messages between publishers and consumers that are not directly
> attached.
>
> To address your example, the 10s of thousands of compute nodes would be
> spread across N routers. Assuming these were all interconnected, a
> message from the scheduler would only travel through at most two of
> these N routers (the one the scheduler was connected to and the one the
> receiving compute node was connected to). No process needs to be able to
> handle 10s of thousands of connections itself (as contrasted with full
> direct, non-intermediated communication, where the scheduler would need
> to manage connections to each of the compute nodes).
>
> This basic pattern is the same as networks of brokers, but Dispatch
> router has been designed from the start to simply focus on that problem
> (and not deal with all other broker related features, such as
> transactions, durability, specialised queueing etc).

Sounds awesome.
:-)

> The other difference is that Dispatch Router does not accept
> responsibility for messages, i.e. it does not offer any
> store-and-forward behaviour. Any acknowledgement is end-to-end. This
> avoids it having to replicate messages. On failure they can if needed be
> replayed by the original sender.

I think the lack of store-and-forward is OK. Right now, all of the Nova
code is written to assume that the messaging is unreliable and that any
message could get lost. It may result in an operation failing, but it
should fail gracefully. Doing end-to-end acknowledgement may actually
be an improvement.

> The Dispatch Router can work for pub-sub patterns as well, though not
> store and forward directly. In theory, for flows where store-and-forward
> is needed, that can be supplied by an additional service e.g. a more
> traditional broker, which would take over responsibility for replaying
> from the publisher in order that subscribers could if needed have
> messages replayed even after the original publisher had exited.

Any thoughts on what we would be recommending for notifications?

--
Russell Bryant

_______________________________________________
OpenStack-dev mailing list
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
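
P.S. For anyone following along, here is a minimal sketch of what
"end-to-end acknowledgement with replay by the original sender" could
look like. All class and method names below (LossyRouter, Receiver,
Sender, replay_unacked) are hypothetical and exist only for this
illustration; this is not the Dispatch Router or oslo messaging API, and
the "router" is just an in-process stand-in that forwards without
storing anything.

```python
class Receiver:
    """Final consumer; its return value stands in for the end-to-end ack."""

    def __init__(self):
        self.seen = {}

    def deliver(self, seq, payload):
        # Replays may produce duplicates, so keep only the first copy.
        self.seen.setdefault(seq, payload)
        return seq  # acknowledgement, identified by sequence number


class LossyRouter:
    """Forwards messages without storing them (no store-and-forward).

    Sequence numbers listed in drop_once are lost the first time they are
    forwarded, to simulate a failure in transit.
    """

    def __init__(self, drop_once):
        self.drop_once = set(drop_once)

    def forward(self, seq, payload, receiver):
        if seq in self.drop_once:
            self.drop_once.discard(seq)
            return None  # message lost; the router kept no copy
        return receiver.deliver(seq, payload)


class Sender:
    """Keeps each message until the receiver's ack comes back."""

    def __init__(self, router, receiver):
        self.router = router
        self.receiver = receiver
        self.unacked = {}
        self.next_seq = 0

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        self._attempt(seq)
        return seq

    def _attempt(self, seq):
        ack = self.router.forward(seq, self.unacked[seq], self.receiver)
        if ack is not None:
            del self.unacked[ack]  # settled end-to-end, safe to forget

    def replay_unacked(self):
        # sorted() copies the keys, so deleting during the loop is safe.
        for seq in sorted(self.unacked):
            self._attempt(seq)


if __name__ == "__main__":
    receiver = Receiver()
    sender = Sender(LossyRouter(drop_once=[1]), receiver)
    for body in ("a", "b", "c"):
        sender.send(body)
    print("unacked before replay:", sender.unacked)
    sender.replay_unacked()
    print("unacked after replay:", sender.unacked)
```

The point of the sketch is only that responsibility for a message stays
with the sender until the final consumer acknowledges it, so the
intermediary never has to replicate or persist anything.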