Re: [openstack-dev] [Marconi] Why is marconi a queue implementation vs a provisioning API?

Malini Kamalambal Thu, 20 Mar 2014 08:14:45 -0700

Let me start by saying that I want there to be a constructive discussion around 
all this. I've done my best to keep my tone as non-snarky as I could while 
still clearly stating my concerns. I've also spent a few hours reviewing the 
current code and docs. Hopefully this contribution will be beneficial in 
helping the discussion along.


For what it's worth, I don't have a clear understanding of why the Marconi 
developer community chose to create a new queue rather than an abstraction 
layer on top of existing queues. While my lack of understanding there isn't a 
technical objection to the project, I hope they can address this in the 
aforementioned FAQ.

The reference storage implementation is MongoDB. AFAIK, no integrated projects 
require an AGPL package to be installed, and from the discussions I've been 
part of, that would be a show-stopper if Marconi required MongoDB. As I 
understand it, this is why sqlalchemy support was required when Marconi was 
incubated. Saying "Marconi also supports SQLA" is disingenuous because SQLA is a 
second-class citizen: it has incomplete API support, is clearly not the 
recommended storage driver, and is going to be unusable at scale (I'll come 
back to this point in a bit).

Let me ask this. Which back-end is tested in Marconi's CI? That is the back-end 
that matters right now. If that's Mongo, I think there's a problem. If it's 
SQLA, then I think Marconi should declare any features which SQLA doesn't 
support to be optional extensions, make SQLA the default, and clearly document 
how to deploy Marconi at scale with a SQLA back-end.
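
One way to answer the "which back-end" question for a particular CI run is to 
read the storage driver out of that job's marconi.conf. The sketch below 
assumes the file uses a [drivers] section with a "storage" key and a 
per-driver [drivers:storage:<name>] section holding a "uri", as in the 
fragment quoted in this thread. SAMPLE_CONF and storage_backend() are 
illustrative names, not part of Marconi.

```python
import configparser

# Sample text mirroring the marconi.conf fragment quoted in this thread.
SAMPLE_CONF = """
[drivers]
storage = mongodb

[drivers:storage:mongodb]
uri = mongodb://localhost:27017/marconi
"""


def storage_backend(conf_text):
    """Return (driver_name, uri) for the configured storage driver.

    Hypothetical helper: it only parses the INI text and does not
    import or call Marconi itself.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    driver = parser.get("drivers", "storage")
    # The per-driver section name is derived from the driver name.
    section = "drivers:storage:{}".format(driver)
    uri = parser.get(section, "uri", fallback=None)
    return driver, uri


if __name__ == "__main__":
    print(storage_backend(SAMPLE_CONF))
```

Running the same check against the config captured in a job's logs shows 
which driver that job actually exercised.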


"[drivers]
storage = mongodb

[drivers:storage:mongodb]
uri = mongodb://localhost:27017/marconi



http://logs.openstack.org/94/81094/2/check/check-tempest-dsvm-marconi/c006285/logs/etc/marconi/marconi.conf.txt.gz



On a related note, I see that Marconi has no gating integration tests.
https://review.openstack.org/#/c/81094/2


But then again that is documented in 
https://wiki.openstack.org/wiki/Marconi/Incubation/Graduation#Legal_requirements
We have a devstack-gate job running and will be making it voting this week.


Of the non-gating integration test job, I only see one marconi test being run: 
tempest.api.queuing.test_queues.TestQueues.test_create_queue
 
http://logs.openstack.org/94/81094/2/check/check-tempest-dsvm-marconi/c006285/logs/testr_results.html.gz
"


I have started a separate thread on the graduation gating requirements w.r.t. 
Tempest.
The single test we have in Tempest was a result of the one-line requirement 
'Project must have a basic devstack-gate job set up'.
The subsequent discussion in the OpenStack QA meeting led me to believe that the 
'basic' job we have is good enough.
Please refer to the email 'Graduation Requirements + Scope of Tempest' for more 
details regarding this.

But that does not mean that the single Tempest test is all we have to verify 
Marconi's functionality.
We have had a robust test suite (unit and functional tests, with lots of 
positive and negative test scenarios) for a very long time in Marconi.
See 
http://logs.openstack.org/33/81033/2/check/gate-marconi-python27/35822df/testr_results.html.gz
These tests are run against a sqlite backend.
The gating tests have been using the sqlalchemy driver ever since we have had it.
Hope that clarifies!

- Malini





Then there's the db-as-a-queue antipattern, and the problems that I have seen 
result from this in the past... I'm not the only one in the OpenStack community 
with some experience scaling MySQL databases. Surely others have their own 
experiences and opinions on whether a database (whether MySQL or Mongo or 
Postgres or ...) can be used in such a way _at_scale_ and not fall over from 
resource contention. I would hope that those members of the community would 
chime into this discussion at some point. Perhaps they'll even disagree with me!

A quick look at the code around claim (which, it seems, will be the most 
commonly requested action) shows why this is an antipattern.

The MongoDB storage driver for claims requires _four_ queries just to get a 
message, with a serious race condition (but at least it's documented in the 
code) if multiple clients are claiming messages in the same queue at the same 
time. For reference:
  
https://github.com/openstack/marconi/blob/master/marconi/queues/storage/mongodb/claims.py#L119

The SQLAlchemy storage driver is no better. It's issuing _five_ queries just to 
claim a message (including a query to purge all expired claims every time a new 
claim is created). The performance of this transaction under high load is 
probably going to be bad...
  
https://github.com/openstack/marconi/blob/master/marconi/queues/storage/sqlalchemy/claims.py#L83
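
To make the cost concrete, here is a minimal sketch of a database-backed 
claim written against sqlite3. It is not the Marconi driver code: the schema, 
the make_db() and claim_messages() names, and the exact statements are 
assumptions chosen for illustration. It shows the general shape of the 
pattern, in which every claim runs several statements inside one transaction 
(purging expired claims, recording the new claim, marking messages, reading 
them back), so concurrent claimers on one queue contend for the same rows.

```python
import sqlite3
import time
import uuid

SCHEMA = """
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    queue TEXT NOT NULL,
    body TEXT NOT NULL,
    claim_id TEXT
);
CREATE TABLE claims (
    id TEXT PRIMARY KEY,
    expires REAL NOT NULL
);
"""


def make_db():
    """Create an in-memory database with the illustrative schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def claim_messages(conn, queue, limit, ttl, now=None):
    """Claim up to `limit` unclaimed messages; return (claim_id, rows)."""
    now = time.time() if now is None else now
    claim_id = uuid.uuid4().hex
    with conn:  # one transaction: commit on success, roll back on error
        # 1-2. Release messages held by expired claims, then purge them.
        conn.execute(
            "UPDATE messages SET claim_id = NULL WHERE claim_id IN "
            "(SELECT id FROM claims WHERE expires <= ?)", (now,))
        conn.execute("DELETE FROM claims WHERE expires <= ?", (now,))
        # 3. Record the new claim.
        conn.execute("INSERT INTO claims (id, expires) VALUES (?, ?)",
                     (claim_id, now + ttl))
        # 4. Mark the oldest unclaimed messages in the queue as ours.
        conn.execute(
            "UPDATE messages SET claim_id = ? WHERE id IN "
            "(SELECT id FROM messages WHERE queue = ? AND claim_id IS NULL "
            "ORDER BY id LIMIT ?)", (claim_id, queue, limit))
        # 5. Read the claimed messages back.
        rows = conn.execute(
            "SELECT id, body FROM messages WHERE claim_id = ? ORDER BY id",
            (claim_id,)).fetchall()
    return claim_id, rows
```

Wrapping the statements in one transaction avoids two clients claiming the 
same message, but it also serializes claimers on the affected rows, which is 
where the contention under load comes from.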

Lastly, it looks like the Marconi storage drivers assume the storage back-end 
to be infinitely scalable. AFAICT, the mongo storage driver supports mongo's 
native sharding -- which I'm happy to see -- but the SQLA driver does not 
appear to support anything equivalent for other back-ends, e.g. MySQL. This 
relegates any deployment using the SQLA backend to the scale of "only what one 
database instance can handle". It's unsuitable for any large-scale deployment. 
Folks who don't want to use Mongo are likely to use MySQL and will be promptly 
bitten by Marconi's lack of scalability with this back end.

While there is a lot of room to improve the messaging around what/how/why, and 
I think a FAQ will be very helpful, I don't think that Marconi should graduate 
this cycle because:
(1) support for a non-AGPL-backend is a legal requirement [*] for Marconi's 
graduation;
(2) deploying Marconi with sqla+mysql will result in an incomplete and 
unscalable service.

++


It's possible that I'm wrong about the scalability of Marconi with sqla + 
mysql. If anyone feels that this is going to perform blazingly fast on a single 
mysql db backend, please publish a benchmark and I'll be very happy to be 
proved wrong. To be meaningful, it must have a high concurrency of clients 
creating and claiming messages with (num queues) << (num clients) << (num 
messages), and all clients polling on a reasonably short interval, based on 
whatever the recommended client-rate-limit is. I'd like the test to be 
repeated with both Mongo and SQLA back-ends on the same hardware for comparison.
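
A rough shape for such a benchmark harness is sketched below. Everything in 
it is an assumption for illustration: InMemoryBackend is a stand-in with 
hypothetical post() and claim() methods, not a Marconi client, and would be 
replaced by calls against a real deployment backed by Mongo or SQLA. The 
harness only shows the workload structure: many client threads sharing a few 
queues, each posting and claiming on a polling interval, then draining what 
is left.

```python
import threading
import time
from collections import deque


class InMemoryBackend:
    """Hypothetical stand-in; swap in real client calls to benchmark."""

    def __init__(self, num_queues):
        self._queues = [deque() for _ in range(num_queues)]
        self._lock = threading.Lock()

    def post(self, queue, body):
        with self._lock:
            self._queues[queue].append(body)

    def claim(self, queue):
        """Return one message, or None if the queue is empty."""
        with self._lock:
            q = self._queues[queue]
            return q.popleft() if q else None


def run_benchmark(backend, num_queues, num_clients, num_messages,
                  poll_interval=0.0):
    """Run num_clients threads; return claimed count, elapsed time, rate."""
    per_client = num_messages // num_clients
    claimed = [0] * num_clients
    all_posted = threading.Barrier(num_clients)

    def client(idx):
        queue = idx % num_queues  # many clients share each queue
        for n in range(per_client):
            backend.post(queue, "{}-{}".format(idx, n))
            if backend.claim(queue) is not None:
                claimed[idx] += 1
            time.sleep(poll_interval)
        all_posted.wait()  # every message is posted past this point
        while backend.claim(queue) is not None:
            claimed[idx] += 1

    threads = [threading.Thread(target=client, args=(i,))
               for i in range(num_clients)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    total = sum(claimed)
    rate = total / elapsed if elapsed > 0 else float("inf")
    return {"claimed": total, "elapsed": elapsed, "claims_per_sec": rate}
```

For example, run_benchmark(InMemoryBackend(2), num_queues=2, num_clients=8, 
num_messages=400) exercises 8 clients over 2 queues; a meaningful run would 
scale these numbers up while keeping (num queues) << (num clients) << (num 
messages).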


Regards,
Devananda

[*] 
https://wiki.openstack.org/wiki/Marconi/Incubation/Graduation#Legal_requirements





_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
