I think we can agree that a data-plane API only makes sense if it is useful to a large number of web and mobile developers deploying their apps on OpenStack, and if it is cost-effective and scalable for operators who wish to deploy such a service.
Marconi was born of practical experience and direct interaction with prospective users. When Marconi was kicked off a few summits ago, the community was looking for a multi-tenant messaging service to round out the OpenStack portfolio. Users were asking operators for something easier to work with and more web-friendly than established options such as AMQP. To that end, we started drafting an HTTP-based API specification that would afford several different messaging patterns, in order to support the use cases that users were bringing to the table. We did this completely in the open, and received lots of input from prospective users familiar with a variety of message broker solutions, including more “cloudy” ones like SQS and Iron.io.

The resulting design was a hybrid that supported what you might call “claim-based” semantics a la SQS and feed-based semantics a la RSS. Application developers liked the idea of being able to use one or the other, or combine them to come up with new patterns according to their needs. For example:

1. A video app can use Marconi to feed a worker pool of transcoders. When a video is uploaded, it is stored in Swift and a job message is posted to Marconi. A worker then claims the job and begins work on it. If the worker crashes, the claim expires and the message becomes available to be claimed by a different worker. Once the worker is finished with the job, it deletes the message so that another worker will not process it, and claims another message. Note that workers never “list” messages in this use case; those endpoints in the API are simply ignored (there is a rough sketch of this pattern right after these examples).

2. A backup service can use Marconi to communicate with hundreds of thousands of backup agents running on customers' machines. Since Marconi queues are extremely lightweight, the service can create a different queue for each agent, and additional queues to broadcast messages to all the agents associated with a single customer. In the broadcast scenario, the service posts a message to a single queue, the agents simply list the messages on that queue, and everyone gets the same message. This messaging pattern is emergent, and requires no special routing to be set up in advance from one queue to another.

3. A metering service for an Internet application can use Marconi to aggregate usage data from a number of web heads. Each web head collects several minutes of data, then posts it to Marconi. A worker periodically claims the messages off the queue, performs the final aggregation and processing, and stores the results in a DB. So far, this messaging pattern is very much like example #1, above. However, since Marconi's API also affords the observer pattern via listing semantics, the metering service can run an auditor that logs the messages as they flow through the queue, providing extremely valuable data for diagnosing problems in the aggregated data.

Users are excited about what Marconi offers today, and we are continuing to evolve the API based on their feedback.
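To make the claim-based vs. feed-based distinction above more concrete, here is a minimal sketch of the worker pattern from example #1 and the listing (observer) pattern from examples #2 and #3, written against the v1 HTTP API. The endpoint URL, queue name, and auth token are placeholders, and the exact request/response fields are recalled from the v1 spec rather than copied from it, so treat this as illustrative, not authoritative:

    import uuid
    import requests

    BASE = "http://marconi.example.com:8888"    # placeholder endpoint
    API = BASE + "/v1"
    HEADERS = {
        "Client-ID": str(uuid.uuid4()),         # v1 expects a per-client UUID
        "X-Auth-Token": "<token>",              # placeholder Keystone token
    }

    # Producer: post a transcoding job (example #1). The video itself lives
    # in Swift; the message just points at it.
    requests.post(
        API + "/queues/transcode-jobs/messages",
        json=[{"ttl": 300, "body": {"video": "swift://videos/clip.mov"}}],
        headers=HEADERS,
    )

    # Worker: claim up to five jobs. If this process dies, the claim's TTL
    # expires and the messages become claimable by another worker.
    resp = requests.post(
        API + "/queues/transcode-jobs/claims",
        params={"limit": 5},
        json={"ttl": 300, "grace": 60},
        headers=HEADERS,
    )
    if resp.status_code == 201:                 # 204 means nothing to claim
        for msg in resp.json():
            print("transcoding", msg["body"])   # real work would go here
            # Delete the finished message so no other worker reprocesses it;
            # the returned href already carries the claim_id query parameter.
            requests.delete(BASE + msg["href"], headers=HEADERS)

    # Observer: list messages instead of claiming them (examples #2 and #3).
    # Every reader that lists the queue sees the same messages.
    feed = requests.get(
        API + "/queues/transcode-jobs/messages",
        params={"echo": "true", "limit": 10},
        headers=HEADERS,
    )
    if feed.status_code == 200:                 # 204 means the queue is empty
        for msg in feed.json()["messages"]:
            print("observed", msg["body"])

The point of the sketch is simply that the same queue supports both interaction styles: the worker in the first half never lists, and the observer in the second half never claims.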
Of course, app developers aren't the only audience Marconi needs to serve. Operators want something that is cost-effective, scales, and is customizable for the unique needs of their target market. While Marconi has plenty of room to improve (who doesn't?), here is where the project currently stands in these areas:

1. Customizable. Marconi transport and storage drivers can be swapped out, and messages can be manipulated in-flight with custom filter drivers. Currently we have MongoDB and SQLAlchemy drivers, and are exploring Redis and AMQP brokers. Now, the v1.0 API does impose some constraints on the backend in order to support the use cases mentioned earlier; for example, an AMQP backend would only be able to support a subset of the current API. Operators occasionally ask about AMQP broker support in particular, and we are exploring ways to evolve the API in order to support that.

2. Scalable. Operators can use Marconi's HTTP transport to leverage their existing infrastructure and expertise in scaling out web heads. When it comes to the backend, for small deployments with minimal throughput needs, we are providing a SQLAlchemy driver as a non-AGPL alternative to MongoDB. For large-scale production deployments, we currently provide the MongoDB driver and will likely add Redis as another option (there is already a POC driver). And, of course, operators can provide drivers for NewSQL databases, such as VelocityDB, that are very fast and scale extremely well. In Marconi, every queue can be associated with a different backend cluster. This allows operators to scale both up and out, according to what is most cost-effective for them. Marconi's app-level sharding is currently done using a lookup table, to give operators maximum control over placement, but I personally think it would be great to see this opened up so that we can swap in other types of drivers, such as one based on hash rings (TBD).

3. Cost-effective. The Marconi team has done a lot of work to (1) provide several dimensions for scaling deployments that can be used according to what is most cost-effective for a given use case, and (2) make the Marconi service as efficient as possible, including time spent optimizing the transport layer (using Falcon in lieu of Pecan, reducing the work that the request handlers do, etc.) and tuning the MongoDB storage driver (the SQLAlchemy driver is newer and we haven't had the chance to tune it yet, but we plan to do so during Juno). Turnaround on requests is in the low-millisecond range (including dealing with HTTP), not the microsecond range, but that works perfectly well for a large class of applications. We've been benchmarking with Tsung for quite a while now, and we are working on making the raw data more accessible to folks outside our team. I'll try to get some of the latest data up on the wiki this week.

Marconi was originally incubated because the community believed developers building their apps on top of OpenStack were looking for this kind of service, and it was a big missing gap in our portfolio. Since that time, the team has worked hard to fill that gap.

Kurt