Re: [openstack-dev] [nova] Distributed Database

Mike Bayer Thu, 28 Apr 2016 10:03:06 -0700


On 04/28/2016 08:44 AM, Edward Leafe wrote:

On Apr 24, 2016, at 3:28 PM, Robert Collins <[email protected]> wrote:

For instance, the things I think are essential for a distributed
database based datastore:
- good single-machine developer story. Must not need a physical
cluster to hack on OpenStack
- deal gracefully with single node/rack/site failures (when deployed
appropriately) - allow limiting failure domain impact
- straightforward programming model: wrong uses should be obvious to reviewers
- low latency performance with big datasets: e.g. nova list as an
admin should be able to get the Nth page as rapidly as the 2nd or 3rd.
- code to deliver that should be (approximately) no worse than the current code
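One common way to meet the paging requirement above is keyset (marker-based) pagination, where each page is fetched by seeking past the last key seen instead of skipping N rows with OFFSET. What follows is only a minimal sketch using the standard-library sqlite3 module; the `instances` table, its columns, and the helper names `make_db` and `page_after` are hypothetical and are not taken from Nova's schema or code.

```python
import sqlite3


def make_db(rows):
    """Build an in-memory database with a hypothetical 'instances' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO instances (id, name) VALUES (?, ?)",
        [(i, f"vm-{i}") for i in range(1, rows + 1)],
    )
    return conn


def page_after(conn, marker, limit):
    """Return up to `limit` rows whose id is greater than `marker`.

    The WHERE clause lets the database seek directly on the primary key,
    so the cost of a page does not grow with how deep the page is.
    """
    cur = conn.execute(
        "SELECT id, name FROM instances WHERE id > ? ORDER BY id LIMIT ?",
        (marker, limit),
    )
    return cur.fetchall()


conn = make_db(10)
first = page_after(conn, 0, 3)
# The last id of one page is the marker for the next page.
second = page_after(conn, first[-1][0], 3)
```

The trade-off is that a caller can only walk forward from a known marker; jumping straight to an arbitrary page number still needs some other mechanism.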


Agree on all of these points, as well as the rest of your post.

After several hallway track discussions, as well as yesterday’s Cells V2 
discussion, I’ve written a follow-up post:

http://blog.leafe.com/index.php/2016/04/28/fragmented-data/

Feedback, of course, is welcomed!

Regarding ROME [1], I've taken a look at its source code and while it is
certainly interesting, I wouldn't recommend lifting and moving all of
Nova's database infrastructure onto it as a dependency within the near
term, as the state of this code is very immature. SQLAlchemy itself was
once immature as well, so there is no sin here, but that was eleven
years ago.

The internals here are not only highly dependent on SQLAlchemy internals
(pinned at the 0.9 series which is obsolete), it is using these APIs in
a very brittle and non-performant way [2]. In this code example, the
internal elements of SQLAlchemy expression objects are repeatedly run
through str() which on each call runs a full string compilation step in
order to test for what their actual type is. It can't be overstated how
inappropriate this approach is and the author of the library would have
benefited from reaching out to me in order to get some guidance on the
correct way to introspect SQLAlchemy expression objects. Basic Python
idioms like type checking also seem to be misunderstood [3].
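To make the distinction concrete, here is a minimal sketch of the two styles of introspection. The classes below (`Expression`, `Column`, `Literal`, `BinaryOp`) are hypothetical stand-ins written for this illustration; they are not SQLAlchemy's classes and this is not ROME's code. The `compile_calls` counter only simulates the cost of a string compilation step.

```python
class Expression:
    """Hypothetical stand-in for an expression node (not SQLAlchemy's class)."""

    compile_calls = 0  # how many times any node has been stringified

    def __str__(self):
        Expression.compile_calls += 1
        return self._compile()


class Column(Expression):
    def __init__(self, name):
        self.name = name

    def _compile(self):
        return self.name


class Literal(Expression):
    def __init__(self, value):
        self.value = value

    def _compile(self):
        return repr(self.value)


class BinaryOp(Expression):
    def __init__(self, left, op, right):
        self.left, self.op, self.right = left, op, right

    def _compile(self):
        # Formatting the children stringifies them too, so the cost of
        # str() grows with the size of the expression tree.
        return f"{self.left} {self.op} {self.right}"


def classify_by_string(node):
    """Brittle: guesses the node type from its rendered text."""
    text = str(node)  # full compilation on every call
    if " = " in text:
        return "binary"
    if text.startswith("'"):
        return "literal"
    return "column"


def classify_by_type(node):
    """Direct: asks the object what it is, with no compilation at all."""
    if isinstance(node, BinaryOp):
        return "binary"
    if isinstance(node, Literal):
        return "literal"
    if isinstance(node, Column):
        return "column"
    raise TypeError(f"unexpected node type: {type(node).__name__}")


expr = BinaryOp(Column("name"), "=", Literal("a = b"))
```

In this sketch the string-based check misreads the literal `'a = b'` as a binary expression because its text happens to contain " = ", while the isinstance() check cannot be fooled by the data and never triggers a compilation.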

I don't think anyone denies that Nova can use any kind of database
backend but the point was raised that to start from scratch with an
entirely new database approach is an enormous job. If the first step
of that job is in fact "port SQLAlchemy and the relational model to
Redis", that makes the job extremely more involved and I'd disagree with
your post's assertion that "It's not too late" if this is the case.
If the admission of ROME for Nova is that the relational model is in
fact necessary for Nova, then that disqualifies NoSQL databases out of
the gate - it's one thing to lament that MySQL is not as "distributed"
out of the box as a NoSQL database, but it's another to lament that
non-relational databases are not in fact relational.


[1] https://github.com/BeyondTheClouds/rome

[2] https://github.com/BeyondTheClouds/rome/blob/master/lib/rome/core/expression/expression.py#L172

[3] https://github.com/BeyondTheClouds/rome/blob/master/lib/rome/core/expression/expression.py#L102



-- Ed Leafe






__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [email protected]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

