Kurt already gave quite a detailed explanation of why Marconi exists, what you can do with it, and where it stands. I'll reply in-line:
On 19/03/14 10:17 +1300, Robert Collins wrote:
> So this came up briefly at the tripleo sprint, and since I can't seem to find a /why/ document (https://wiki.openstack.org/wiki/Marconi/Incubation#Raised_Questions_.2B_Answers and https://wiki.openstack.org/wiki/Marconi#Design don't supply this) we decided at the TC meeting that I should raise it here.
>
> Firstly, let me check my facts :) - Marconi is backed by a modular 'storage' layer which places some conceptual design constraints on the storage backends that are possible (e.g. I rather expect a 0mq implementation to be very tricky, at best (vs the RPC style front end https://wiki.openstack.org/wiki/Marconi/specs/zmq/api/v1 )), and has a hybrid control/data plane API implementation where one can call into it to make queues etc, and to consume them.
Those docs refer to a transport driver, not a storage driver. In Marconi, it's possible to have different protocols on top of the API. The current one is based on HTTP, but there will likely be others in the future. We've changed some things in the API to support amqp based storage drivers. We had a session during the HKG summit about this, and since then we've always kept amqp drivers in mind when making changes to the API. I'm not saying it's perfect, though.
> The API for the queues is very odd from a queueing perspective - https://wiki.openstack.org/wiki/Marconi/specs/api/v1#Get_a_Specific_Message - you don't subscribe to the queue, you enumerate and ask for a single message.
The current way to subscribe to queues is by polling. Subscribing is not just tied to the API but also to the transport itself. As mentioned above, we currently only have support for HTTP. Also, enumerating is not necessary; for instance, claiming with limit 1 will consume a single message. (Side note: at the incubation meeting, it was recommended not to put effort into writing new transports but to stabilize the API and work on a storage backend with a license != AGPL)
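To make that concrete, here's a rough sketch of claiming a single message through the v1 HTTP transport; the endpoint, queue name and header values below are just placeholders, so adjust for a real deployment:

    # Rough sketch: consume one message by creating a claim with limit=1
    # against the Marconi v1 HTTP API. Endpoint, queue name and header
    # values are placeholders, not from a real deployment.
    import requests

    SERVER = "http://localhost:8888"            # assumed Marconi endpoint
    HEADERS = {
        "Client-ID": "3381af92-2b9e-11e3-b191-71861300734c",  # any UUID
        "X-Project-Id": "example-project",
        "Content-Type": "application/json",
    }

    # Claim at most one message instead of enumerating the queue.
    resp = requests.post(
        SERVER + "/v1/queues/demo/claims",
        params={"limit": 1},
        json={"ttl": 300, "grace": 300},
        headers=HEADERS,
    )

    if resp.status_code == 201:
        message = resp.json()[0]
        print("got:", message["body"])
        # Ack by deleting the claimed message (href carries the claim_id).
        requests.delete(SERVER + message["href"], headers=HEADERS)
    else:
        # 204 No Content means there was nothing to claim.
        print("queue is empty")

Since there's no long-polling in the HTTP transport today, a real consumer would wrap something like this in a loop with some backoff.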
> And the implementations in tree are mongodb (which is at best contentious, due to the AGPL and many folks' reasonable concerns about it), and mysql.
Just to avoid misleading folks who are not familiar with Marconi, I just want to point out that the driver is based on sqlalchemy.
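For context, the storage backend is just configuration in marconi.conf; roughly something like the snippet below (the option names are from memory and may differ between releases, and the URI is a placeholder):

    [drivers]
    transport = wsgi
    # storage can point at any available driver, e.g. sqlalchemy or mongodb
    storage = sqlalchemy

    [drivers:storage:mongodb]
    uri = mongodb://localhost:27017/marconi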
> My desires around Marconi are:
> - to make sure the queue we have is suitable for use by OpenStack itself: we have a very strong culture around consolidating technology choices, and it would be extremely odd to have Marconi be something that isn't suitable to replace rabbitmq etc as the queue abstraction in the fullness of time.
Although this could be done in the future, I've heard from many folks in the community that replacing OpenStack's rabbitmq / qpid / etc layer with Marconi is a no-go. I don't recall the exact reasons now, but I think I can grab them from the logs or something (unless those folks are reading this email and want to chime in). FWIW, I'd be more than happy to *experiment* with this in the future. Marconi is definitely not ready as-is.
> - to make sure that deployers with scale / performance needs can have that met by Marconi
> - to make my life easy as a deployer ;)
This has been part of our daily reviews, work and designs. I'm sure there's room for improvement, though.
> So my questions are:
> - why isn't the API a queue friendly API (e.g. like
Define *queue friendly*
> https://github.com/twitter/kestrel - kestrel which uses the memcache API, puts put into the queue, gets get from the queue). The current
I don't know kestrel, but how is this different from what Marconi does?
> API looks like pretty much the worst case scenario there - CRUD rather than submit/retrieve with blocking requests (e.g. longpoll vs poll).
I agree there are some limitations from using HTTP for this job, hence the support for different transports. Just saying *the API is CRUD* is again misleading, and it doesn't highlight the value of having an HTTP-based transport. It's just wrong to think about Marconi as *just another queuing system* instead of considering the use-cases it's trying to solve.

There's rough support for websockets in an external project but:

1. It's not official... yet.
2. It was written as a proof of concept for the transport layer.
3. It likely needs to be updated.

https://github.com/FlaPer87/marconi-websocket
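And just to show the submit side isn't a CRUD dance either, posting a message over the v1 HTTP transport is a single request; same placeholder endpoint and headers as in the earlier sketch:

    # Rough sketch: submit one (or more) messages in a single POST.
    import requests

    SERVER = "http://localhost:8888"            # assumed Marconi endpoint
    HEADERS = {
        "Client-ID": "3381af92-2b9e-11e3-b191-71861300734c",
        "X-Project-Id": "example-project",
        "Content-Type": "application/json",
    }

    requests.post(
        SERVER + "/v1/queues/demo/messages",
        json=[{"ttl": 300, "body": {"event": "backup.finished"}}],
        headers=HEADERS,
    )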
> - wouldn't it be better to expose other existing implementations of HTTP message queues like nova does with hypervisors, rather than creating our own one? E.g. HTTPSQS, RestMQ, Kestrel, queues.io.
We've discussed having support for API extensions in order to allow some deployments to expose features from a queuing technology that we don't necessarily consider part of the core API.
> - or even do what Trove does and expose the actual implementation directly?
> - what's the plan to fix the API?
Fix the API? For starters, moving away from a data API to a provisioning API (or to just exposing the queuing technology's features) would not be fixing it, it'd be re-writing Marconi (or actually creating a brand new project).
> - is there a plan / desire to back onto actual queue services (e.g. AMQP, $anyof the http ones above, etc)
We've a plan to support an amqp storage driver. It was delayed to focus on the graduation requirements, but we've already made some changes in the API in order to improve the support for this type of storage. https://blueprints.launchpad.net/marconi/+spec/storage-amqp
> - what is the current performance - how many usecs does it take to put a message, and get one back, in real world use? How many concurrent clients can a single Marconi API server with one backing server deliver today?
I don't have the results of the last benchmark we ran, but I'm sure other folks can provide them. It's not as fast as qpid's or rabbit's. I don't think the HTTP API driver will ever be.
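If someone wants a rough number themselves, a loop like the one below is enough to get an order of magnitude; post_message() and claim_message() are placeholders for the HTTP calls sketched earlier, and the result will vary wildly with the storage backend, the deployment and the level of concurrency:

    import time

    def measure_roundtrip(post_message, claim_message, iterations=1000):
        """Return the mean post+claim latency in microseconds."""
        start = time.perf_counter()
        for _ in range(iterations):
            post_message({"ping": True})   # placeholder: POST /messages
            claim_message()                # placeholder: POST /claims?limit=1
        elapsed = time.perf_counter() - start
        return elapsed / iterations * 1e6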
> As background, 'implement a message queue in a SQL DB' is such a horrid antipattern it's been a standing joke in many organisations I've been in - and yet we're preparing to graduate *exactly that*, which is frankly perplexing.
TBH, I could say the exact same thing about some of the supported drivers that exist in some of the integrated projects and yet, they're integrated. This comment was not necessary and it's quite misleading for folks who are not familiar with Marconi. The concerns about the sqlalchemy *driver* could've been expressed differently.

FWIW, I think there's value in having an sqlalchemy driver. It's helpful for newcomers, it integrates perfectly with the gate, and I don't want to impose on other folks what they should or shouldn't use in production. Marconi may be providing a data API, but it's still non-opinionated and it wants to support other drivers - or at least provide a nice way to implement them. Working on sqlalchemy instead of amqp (or redis) was decided at the incubation meeting. But again, it's an optional driver that we're talking about here.

As of now, our recommended driver is mongodb's, and as I already mentioned in this email, we'll start working on an amqp one, which will likely become the recommended one. There's also support for redis; as already mentioned, we have plans to complete the redis driver and write an amqp-based one and let them both live in the code base. Having support for different storage drivers makes Marconi's sharding feature more valuable.

Side note: when I say "this was decided at the incubation meeting" I'm not blaming the meeting or the TC. What I mean is that it was considered, at that point, to be the best thing to have in the immediate future.

Cheers,
Flavio

--
@flaper87
Flavio Percoco