2011/11/29 Jay Pipes <jaypi...@gmail.com>:
> On Tue, Nov 29, 2011 at 2:58 PM, Soren Hansen <so...@linux2go.dk> wrote:
>> 2011/11/29 Jay Pipes <jaypi...@gmail.com>:
>>> There's a very good reason this hasn't happened so far: handling
>>> highly relational datasets with a non-relational data store is a bad
>>> idea. In fact, I seem to remember that is exactly how Nova's data
>>> store started out life (*cough* Redis *cough*)
>>
>> To be fair, we're only barely making use of this in our DB
>> implementation. I don't think we do any foreign key checking at all,
>> and deletes (because we don't actually delete anything, we just mark
>> it as deleted) don't cascade, so there are all sort of ways in which
>> our data store could be inconsistent.
>
> Because the database schema isn't properly protecting against
> referential integrity failures does not mean the relational database
> store is a failure itself.

I'm not suggesting it's a failure at all.

>> Besides, we don't really use transactions. I could easily read the
>> same data from two separate nodes, make different (irreconcilable)
>> changes on both nodes, and write them back, and the last one to write
>> simply wins.
>
> Sure, but using a KV store doesn't solve this problem...

I'm not suggesting it will. My point is simply that using a KV store
wouldn't lose us anything in that respect.

>> In short, it seems to me we're not really getting much out of having a
>> relational data store?
>
> We're getting out of it what we ask of it. We aren't using scoped
> sessions properly, aren't using transactions properly, and we aren't
> enforcing referential integrity. But those are choices we've made, not
> some native deficiency in relational data stores.

I didn't mean to suggest that that was the case at all. The point I'm
trying (but failing, clearly) to make is that with the way we're using
it, we're not reaping the usual benefits from it, and that we'd in
fact not lose anything by using a KV store.

> As soon as someone can demonstrate the performance, scalability, and
> robustness advantages of rewriting the data layer to use a
> non-relational data store, I'm all ears. Until that point, I remain
> unconvinced that the relational database is the source of major
> bottlenecks.

I understand that MySQL (and the other backends supported by
SQLAlchemy, too) scales very well. Vertically. I doubt they'll be
bottlenecks. Heck, they're even well-understood enough that people
have built very decent HA setups using them. I just don't think
they're a particularly good fit for a distributed system. You can have
a highly available datastore all you want, but I'd sleep better
knowing that our data is stored in a distributed system that is
designed to handle network partitions well.

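To make the "last one to write simply wins" point concrete, here is a
minimal sketch of that read-modify-write race, followed by one common
way to at least detect it (a version check on write). Everything in it
is hypothetical and purely illustrative: `LastWriteWinsStore`,
`VersionedStore`, `StaleWriteError` and the record fields are made-up
names, not Nova code and not the API of any real data store. The same
pattern applies whether the backend is relational or KV.

```python
class StaleWriteError(Exception):
    """Raised when a write is based on an outdated read (hypothetical)."""


class LastWriteWinsStore:
    """Hypothetical in-memory store: every put blindly overwrites."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Return a copy so callers modify their own snapshot.
        return dict(self._data[key])

    def put(self, key, value):
        self._data[key] = dict(value)


class VersionedStore:
    """Hypothetical store that rejects writes based on a stale version."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def get(self, key):
        version, value = self._data[key]
        return version, dict(value)

    def put(self, key, value, expected_version):
        # Version 0 means "key does not exist yet".
        current_version = self._data.get(key, (0, None))[0]
        if expected_version != current_version:
            raise StaleWriteError(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, dict(value))
        return current_version + 1


def demo_lost_update():
    """Two 'nodes' read the same record, change it, and write it back."""
    store = LastWriteWinsStore()
    store.put("instance-1", {"state": "running", "host": "node-a"})

    seen_by_a = store.get("instance-1")
    seen_by_b = store.get("instance-1")

    seen_by_a["state"] = "stopped"  # node A's change
    seen_by_b["host"] = "node-b"    # node B's change

    store.put("instance-1", seen_by_a)
    store.put("instance-1", seen_by_b)  # silently discards A's change
    return store.get("instance-1")


def demo_conflict_detected():
    """Same race, but the second writer is told its read was stale."""
    store = VersionedStore()
    store.put("instance-1", {"state": "running", "host": "node-a"}, 0)

    version_a, seen_by_a = store.get("instance-1")
    version_b, seen_by_b = store.get("instance-1")

    seen_by_a["state"] = "stopped"
    seen_by_b["host"] = "node-b"

    store.put("instance-1", seen_by_a, version_a)
    try:
        store.put("instance-1", seen_by_b, version_b)
    except StaleWriteError:
        return True  # node B must re-read and retry
    return False


if __name__ == "__main__":
    print(demo_lost_update())
    print(demo_conflict_detected())
```

In the first demo node A's change to `state` is gone after node B
writes. In the second, node B's write is refused, so it has to re-read
and decide what to do. Neither demo says anything about which kind of
store is the better fit; the check has to be asked for either way.
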
-- 
Soren Hansen        | http://linux2go.dk/
Ubuntu Developer    | http://www.ubuntu.com/
OpenStack Developer | http://www.openstack.org/