On Tue, May 8, 2012 at 6:44 PM, Julian Edwards <[email protected]> wrote:
> On Saturday 05 May 2012 12:21:24 Clint Byrum wrote:
>> Excerpts from Julian Edwards's message of Thu May 03 23:42:12 -0700 2012:
>> > On Friday 04 May 2012 18:12:44 Robert Collins wrote:
>> > > I would discourage, strongly discourage, any direct DB access from
>> > > pserv: our experience with LP with such access has been universally
>> > > bad. Let the appserver drive the DB exclusively, and offer appropriate
>> > > APIs for getting stuff from/to it. I think we glossed over this on
>> > > IRC; celery talking to postgresql might mean this needs some extra
>> > > glue for celery, or something.
>> >
>> > For read access only, can you elaborate why this is bad?
>>
>> I've not been involved with Launchpad, but I have done a few multi-tiered
>> architectures.
>>
>> There are a few reasons:
>>
>> * The database used is an implementation detail. Putting a lightweight
>> layer of indirection between the DB and the other pieces of the app means
>> being able to swap out the DB for the cases that matter. With the hyper
>> scale requirement, this is likely to happen as it becomes clear which
>> tables just cannot be served through a purely relational model. API's
>> map intentions rather than implementations.
>>
>> * API's can be used as layers of control. The postgres and mysql
>> protocols both make proxying a real chore, and so, its hard to control
>> the number of threads. pgbouncer seems pretty good, but it then requires
>> a dedicated proxy just for pgsql, which ties you further into pg. An API
>> call, however, can be extended to provide needed metrics, and then be
>> an intelligent choke point or pressure-release for a limited resource
>> like the database.
>>
>> * Intelligence in the pipeline. This makes it easier to cache
>> intelligently, easier to route/shard/etc. The layer of indirection used
>> to be just in code, but you really need it in the network separation
>> so that the pieces can be scaled individually and whole parts can be
>> refactored without touching every place that might access that place.
>
> Thanks Clint, that's well elaborated.
>
> For the record, I was playing Devil's Advocate to some extent since we'd be
> insulated through Django's ORM, but the points are well understood.
>
>> Put more succinctly, API changes are easier than schema changes.
>
> I'd argue the opposite if you're using lazr.restful :)

Hah :P

In addition to Clint's excellent points (all of which I agree with), I'd
also add two more points:

 * pserv, being built on Twisted, will have a hate-hate relationship
   with ORM state; just keeping it from doing silly things like keeping
   a transaction open for days will be an exercise in great care and
   diligence.

 * all the protections we (eventually) put in place around the DB (such
   as timeouts and worker concurrency limits) will have to be replicated
   for pserv, and as it has a different programming model, that means
   double work.

In LP we haven't done this yet, and we have had the failure modes (like
a script that goes nutty and keeps backups from running, or a concurrent
script causing unanticipated load) at one time or another. MAAS, being
deployed on customer sites, outside of our ops team's reach, has to
insulate itself from these sorts of things.

-Rob
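
To make the "choke point" and "worker concurrency limits" ideas from this thread a bit more concrete, here is a minimal sketch in plain Python. It is not MAAS or pserv code: DBGateway, Busy, slow_lookup and demo are hypothetical names invented for illustration, and the real thing would presumably sit behind an HTTP API rather than a method call. The assumption is simply that every caller has to go through one object, which caps concurrent DB work, sheds load when it is full, and counts what happened.

```python
import threading


class Busy(Exception):
    """Raised when the gateway sheds load instead of queueing forever."""


class DBGateway:
    """Hypothetical single entry point in front of the database."""

    def __init__(self, query_fn, max_concurrent=2, wait_seconds=0.05):
        self._query_fn = query_fn  # stand-in for the real DB access
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._wait_seconds = wait_seconds
        self._lock = threading.Lock()
        self.metrics = {"served": 0, "rejected": 0}

    def _count(self, key):
        with self._lock:
            self.metrics[key] += 1

    def call(self, *args):
        # Wait briefly for a free slot; give up rather than pile onto the DB.
        if not self._slots.acquire(timeout=self._wait_seconds):
            self._count("rejected")
            raise Busy("database gateway at capacity")
        try:
            result = self._query_fn(*args)
            self._count("served")
            return result
        finally:
            self._slots.release()


def demo():
    """Fill both slots with slow callers, then show a third being shed."""
    release = threading.Event()
    started = threading.Semaphore(0)

    def slow_lookup(node_id):
        started.release()        # signal that this caller holds a slot
        release.wait(timeout=5)  # pretend the query is slow
        return {"node": node_id}

    gateway = DBGateway(slow_lookup, max_concurrent=2, wait_seconds=0.05)
    workers = [
        threading.Thread(target=gateway.call, args=(i,)) for i in range(2)
    ]
    for worker in workers:
        worker.start()
    for _ in workers:
        started.acquire(timeout=5)  # wait until both slots are in use

    try:
        gateway.call(99)
        shed = False
    except Busy:
        shed = True

    release.set()
    for worker in workers:
        worker.join()
    return shed, gateway.metrics


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the limit and the counters live in one place, so a second client with a different programming model gets the same protections without reimplementing them.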

