+1. Documenting these findings would be nice too.
best,
Joe

On Fri, Mar 23, 2012 at 2:15 PM, Justin Santa Barbara <jus...@fathomdb.com> wrote:

> This is great: hard numbers are exactly what we need. I would love to see
> a statement-by-statement SQL log with timings from someone that has a
> performance issue. I'm happy to look into any DB problems that it
> demonstrates.
>
> The nova database is small enough that it should always be in memory (if
> you're running a million VMs, I don't think asking for one gigabyte of RAM
> on your DB is unreasonable!)
>
> If it isn't hitting disk, PostgreSQL or MySQL with InnoDB can serve 10k
> 'indexed' requests per second through SQL on a low-end (<$1000) box. With
> tuning you can get 10x that. Using one of the SQL bypass engines (e.g.
> MySQL HandlerSocket) can supposedly give you 10x again. Throwing money at
> the problem in the form of multi-processor boxes (or disks, if you're I/O
> bound) can probably get you 10x again.
>
> However, if you put a DB on a remote host, you'll have to wait for a
> network round trip per query. If your ORM is doing a 1+N query, the total
> read time will be slow. If your DB is doing a sync on every write, writes
> will be slow. If the DB isn't tuned with a sensible amount of cache (at
> least as big as the DB size), it will be slow(er). Each of these has a
> very simple fix for OpenStack.
>
> Relational databases have very efficient caching mechanisms built in. Any
> out-of-process cache will have a hard time beating them. Let's make sure
> the bottleneck is the DB, and not (for example) RabbitMQ, before we go off
> on a huge rearchitecture.
>
> Justin
>
> On Thu, Mar 22, 2012 at 7:53 PM, Mark Washenberger <mark.washenber...@rackspace.com> wrote:
>
>> Working on this independently, I created a branch with some simple
>> performance logging around the nova-api, and individually around
>> glance, nova.db, and nova.rpc calls. (Sorry, I only have a local
>> copy and it's on a different computer right now, and it probably
>> needs a rebase. I will rebase and publish it on GitHub tomorrow.)
>>
>> With this logging, I could get some simple profiling that I found
>> very useful. Here is a GitHub project with the analysis code as well
>> as some nova-api logs I was using as input:
>>
>> https://github.com/markwash/nova-perflog
>>
>> With these tools, you can get a wall-time profile for individual
>> requests. For example, looking at one server create request (you can
>> run this directly from the checkout, as the logs are saved there):
>>
>> markw@poledra:perflogs$ cat nova-api.vanilla.1.5.10.log | python profile-request.py req-3cc0fe84-e736-4441-a8d6-ef605558f37f
>> key                                        count  avg
>> nova.api.openstack.wsgi.POST                   1  0.657
>> nova.db.api.instance_update                    1  0.191
>> nova.image.show                                1  0.179
>> nova.db.api.instance_add_security_group        1  0.082
>> nova.rpc.cast                                  1  0.059
>> nova.db.api.instance_get_all_by_filters        1  0.034
>> nova.db.api.security_group_get_by_name         2  0.029
>> nova.db.api.instance_create                    1  0.011
>> nova.db.api.quota_get_all_by_project           3  0.003
>> nova.db.api.instance_data_get_for_project      1  0.003
>>
>> key                                        count  total
>> nova.api.openstack.wsgi                        1  0.657
>> nova.db.api                                   10  0.388
>> nova.image                                     1  0.179
>> nova.rpc                                       1  0.059
>>
>> All times are in seconds. The nova.rpc time is probably high since
>> this was the first call after a server restart, so the connection
>> handshake is probably included. This data is also probably about
>> 1.5 months stale.
>>
>> The conclusion I reached from this profiling is that we just plain
>> overuse the db (and we might do the same in glance).
>> For example, whenever we do updates, we actually re-retrieve the item
>> from the database, update its dictionary, and save it. This is double
>> the cost it needs to be. We also handle updates for data across tables
>> inefficiently, where they could be handled in a single database round
>> trip.
>>
>> In particular, in the case of server listings, extensions are just
>> rough on performance. Most extensions hit the database again at least
>> once. This isn't really so bad, but it clearly is an area where we
>> should improve, since these are the most frequent api queries.
>>
>> I just see a ton of specific performance problems that are easier to
>> address one by one, rather than diving into a general (albeit obvious)
>> solution such as caching.
>>
>> "Sandy Walsh" <sandy.wa...@rackspace.com> said:
>>
>> > We're doing tests to find out where the bottlenecks are; caching is
>> > the most obvious solution, but there may be others. Tools like
>> > memcache do a really good job of sharing memory across servers, so
>> > we don't have to reinvent the wheel or hit the db at all.
>> >
>> > In addition to looking into caching technologies/approaches, we're
>> > gluing together some tools for finding those bottlenecks. Our first
>> > step will be finding them, then squashing them ... however.
>> >
>> > -S
>> >
>> > On 03/22/2012 06:25 PM, Mark Washenberger wrote:
>> >> What problems are caching strategies supposed to solve?
>> >>
>> >> On the nova compute side, it seems like streamlining db access and
>> >> api-view tables would solve any performance problems caching would
>> >> address, while keeping the stale data management problem small.
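
To make Justin's 1+N point concrete: if the ORM lazy-loads a related collection for each row in a listing, a remote database turns one listing into one query plus one more per server. Below is a rough SQLAlchemy sketch of the pattern and the single-round-trip eager-load alternative. The Server/FixedIp models are hypothetical stand-ins, not nova's actual schema, and a recent SQLAlchemy (1.4+) is assumed.

# Hypothetical models, not nova's schema; illustrates 1+N vs. eager loading.
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()

class Server(Base):
    __tablename__ = 'servers'
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    fixed_ips = relationship('FixedIp', back_populates='server')

class FixedIp(Base):
    __tablename__ = 'fixed_ips'
    id = Column(Integer, primary_key=True)
    server_id = Column(Integer, ForeignKey('servers.id'))
    address = Column(String(39))
    server = relationship('Server', back_populates='fixed_ips')

engine = create_engine('sqlite://', echo=True)  # echo=True prints every SQL statement
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Server(name='vm-1', fixed_ips=[FixedIp(address='10.0.0.2')]))
    session.add(Server(name='vm-2', fixed_ips=[FixedIp(address='10.0.0.3')]))
    session.commit()

    # 1+N: one SELECT for the servers, then one more SELECT per server when
    # the lazy-loaded .fixed_ips collection is touched.
    for server in session.query(Server).all():
        print(server.name, [ip.address for ip in server.fixed_ips])

    # One round trip: join the related rows into the initial query instead.
    servers = session.query(Server).options(joinedload(Server.fixed_ips)).all()
    for server in servers:
        print(server.name, [ip.address for ip in server.fixed_ips])

Running it with echo=True makes the difference visible: the first loop issues one SELECT per server, the second issues a single joined SELECT.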
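The per-call wall-time logging Mark describes could look roughly like the decorator below. This is only a sketch under assumptions: the decorator name, profiling key format, and log line are invented here, not taken from his branch.

# Sketch of simple per-call wall-time logging; names and log format are made up.
import functools
import logging
import time

LOG = logging.getLogger(__name__)

def timed(key):
    """Log wall-clock seconds for each call under a profiling key."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                LOG.info('PERF %s %.3f', key, time.time() - start)
        return wrapper
    return decorator

# Example: wrap a db-layer call so every invocation shows up in the log.
@timed('nova.db.api.instance_update')
def instance_update(context, instance_uuid, values):
    pass  # stand-in for the real db call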
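Mark's "double the cost" update pattern, versus issuing a single UPDATE statement, again as a hedged SQLAlchemy sketch with a stand-in Instance model rather than nova's real one:

# Stand-in model; shows re-retrieve-then-save vs. a single UPDATE round trip.
from sqlalchemy import Column, Integer, String, create_engine, update
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Instance(Base):
    __tablename__ = 'instances'
    id = Column(Integer, primary_key=True)
    vm_state = Column(String(255))

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Instance(id=1, vm_state='building'))
    session.commit()

    # Two round trips: SELECT the row, mutate it in memory, then UPDATE.
    instance = session.get(Instance, 1)
    instance.vm_state = 'active'
    session.commit()

    # One round trip: issue the UPDATE directly, with no SELECT first.
    session.execute(
        update(Instance).where(Instance.id == 1).values(vm_state='active'))
    session.commit()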
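And for the memcache approach Sandy mentions, a minimal read-through cache sketch using the python-memcached client; the key format, TTL, and db_get hook are illustrative only, not an existing nova interface:

# Minimal read-through cache sketch; key format, TTL, and db_get are made up.
import json
import memcache

mc = memcache.Client(['127.0.0.1:11211'])

def instance_get_cached(context, instance_uuid, db_get):
    """Try memcached first, fall back to the database and repopulate."""
    key = 'instance-%s' % instance_uuid
    cached = mc.get(key)
    if cached is not None:
        return json.loads(cached)
    instance = db_get(context, instance_uuid)  # e.g. the existing db-layer call
    mc.set(key, json.dumps(instance), time=30)  # short TTL bounds staleness
    return instance

The short TTL is the trade-off Mark raises: the smaller it is, the smaller the stale-data window, but the less the cache actually saves.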