Alas, I let my patch get too stale to rebase properly. However, it is a fairly "dumb" approach I took that can be demonstrated just from the patch. And in any case, I think the approach you're taking, profiling based on Tach, is going to be better in the long run and more share-able in the community.
+ 1 gazillion to getting good metrics! "Sandy Walsh" <sandy.wa...@rackspace.com> said: > (resent to list as I realized I just did a Reply) > > Cool! This is great stuff. Look forward to seeing the branch. > > I started working on a similar tool that takes the data collected from > Tach and fetches the data from Graphite to look at the performance > issues (no changes to nova trunk requires since Tach is awesome). > > It's a shell of an idea yet, but the basics work: > https://github.com/ohthree/novaprof > > But if there is something already existing, I'm happy to kill it off. > > I don't doubt for a second the db is the culprit for many of our woes. > > The thing I like about internal caching using established tools is that > it works for db issues too without having to resort to custom tables. > SQL query optimization, I'm sure, will go equally far. > > Thanks again for the great feedback ... keep it comin'! > > -S > > > On 03/22/2012 11:53 PM, Mark Washenberger wrote: >> Working on this independently, I created a branch with some simple >> performance logging around the nova-api, and individually around >> glance, nova.db, and nova.rpc calls. (Sorry, I only have a local >> copy and its on a different computer right now, and probably needs >> a rebase. I will rebase and publish it on GitHub tomorrow.) >> >> With this logging, I could get some simple profiling that I found >> very useful. Here is a GH project with the analysis code as well >> as some nova-api logs I was using as input. >> >> https://github.com/markwash/nova-perflog >> >> With these tools, you can get a wall-time profile for individual >> requests. For example, looking at one server create request (and >> you can run this directly from the checkout as the logs are saved >> there): >> >> markw@poledra:perflogs$ cat nova-api.vanilla.1.5.10.log | python >> profile-request.py req-3cc0fe84-e736-4441-a8d6-ef605558f37f >> key count avg >> nova.api.openstack.wsgi.POST 1 0.657 >> nova.db.api.instance_update 1 0.191 >> nova.image.show 1 0.179 >> nova.db.api.instance_add_security_group 1 0.082 >> nova.rpc.cast 1 0.059 >> nova.db.api.instance_get_all_by_filters 1 0.034 >> nova.db.api.security_group_get_by_name 2 0.029 >> nova.db.api.instance_create 1 0.011 >> nova.db.api.quota_get_all_by_project 3 0.003 >> nova.db.api.instance_data_get_for_project 1 0.003 >> >> key count total >> nova.api.openstack.wsgi 1 0.657 >> nova.db.api 10 0.388 >> nova.image 1 0.179 >> nova.rpc 1 0.059 >> >> All times are in seconds. The nova.rpc time is probably high >> since this was the first call since server restart, so the >> connection handshake is probably included. This is also probably >> 1.5 months stale. >> >> The conclusion I reached from this profiling is that we just plain >> overuse the db (and we might do the same in glance). For example, >> whenever we do updates, we actually re-retrieve the item from the >> database, update its dictionary, and save it. This is double the >> cost it needs to be. We also handle updates for data across tables >> inefficiently, where they could be handled in single database round >> trip. >> >> In particular, in the case of server listings, extensions are just >> rough on performance. Most extensions hit the database again >> at least once. This isn't really so bad, but it clearly is an area >> where we should improve, since these are the most frequent api >> queries. >> >> I just see a ton of specific performance problems that are easier >> to address one by one, rather than diving into a general (albeit >> obvious) solution such as caching. >> >> >> "Sandy Walsh" <sandy.wa...@rackspace.com> said: >> >>> We're doing tests to find out where the bottlenecks are, caching is the >>> most obvious solution, but there may be others. Tools like memcache do a >>> really good job of sharing memory across servers so we don't have to >>> reinvent the wheel or hit the db at all. >>> >>> In addition to looking into caching technologies/approaches we're gluing >>> together some tools for finding those bottlenecks. Our first step will >>> be finding them, then squashing them ... however. >>> >>> -S >>> >>> On 03/22/2012 06:25 PM, Mark Washenberger wrote: >>>> What problems are caching strategies supposed to solve? >>>> >>>> On the nova compute side, it seems like streamlining db access and >>>> api-view tables would solve any performance problems caching would >>>> address, while keeping the stale data management problem small. >>>> >>> >>> _______________________________________________ >>> Mailing list: https://launchpad.net/~openstack >>> Post to : openstack@lists.launchpad.net >>> Unsubscribe : https://launchpad.net/~openstack >>> More help : https://help.launchpad.net/ListHelp >>> >> >> >> >> _______________________________________________ >> Mailing list: https://launchpad.net/~openstack >> Post to : openstack@lists.launchpad.net >> Unsubscribe : https://launchpad.net/~openstack >> More help : https://help.launchpad.net/ListHelp > > _______________________________________________ > Mailing list: https://launchpad.net/~openstack > Post to : openstack@lists.launchpad.net > Unsubscribe : https://launchpad.net/~openstack > More help : https://help.launchpad.net/ListHelp >
diff --git nova/api/openstack/wsgi.py nova/api/openstack/wsgi.py index af62f9b..70f4474 100644 --- nova/api/openstack/wsgi.py +++ nova/api/openstack/wsgi.py @@ -769,6 +769,9 @@ class Resource(wsgi.Application): LOG.info("%(method)s %(url)s" % {"method": request.method, "url": request.url}) + LOG.info("MJW %(method)s %(url)s" % {"method": request.method, + "url": request.url}, + context=request.environ['nova.context']) # Identify the action, its arguments, and the requested # content type @@ -782,8 +785,12 @@ class Resource(wsgi.Application): # function. If we try to audit __call__(), we can # run into troubles due to the @webob.dec.wsgify() # decorator. - return self._process_stack(request, action, action_args, - content_type, body, accept) + ret = self._process_stack(request, action, action_args, + content_type, body, accept) + LOG.info("MJW finish %(method)s %(url)s" % {"method": request.method, + "url": request.url}, + context=request.environ['nova.context']) + return ret def _process_stack(self, request, action, action_args, content_type, body, accept): diff --git nova/db/api.py nova/db/api.py index a886945..63bdf69 100644 --- nova/db/api.py +++ nova/db/api.py @@ -42,11 +42,13 @@ these objects be simple dictionaries. pool of available hardware (Default: True) """ +import functools from nova import exception from nova import flags from nova.openstack.common import cfg from nova import utils +from nova import log as logging db_opts = [ @@ -73,8 +75,34 @@ db_opts = [ FLAGS = flags.FLAGS FLAGS.add_options(db_opts) -IMPL = utils.LazyPluggable(FLAGS['db_backend'], - sqlalchemy='nova.db.sqlalchemy.api') + +LOG = logging.getLogger('nova.db.api') + + +class LogEverything(object): + + def __init__(self, wrapped): + self.wrapped = wrapped + + def __getattr__(self, name): + original = getattr(self.wrapped, name) + if callable(original): + return self._log_everything(original, name) + else: + return original + + def _log_everything(self, fun, name): + @functools.wraps(fun) + def logged_fun(context, *args, **kwargs): + LOG.info("MJW %s" % name, context=context) + ret = fun(context, *args, **kwargs) + LOG.info("MJW finish %s" % name, context=context) + return ret + return logged_fun + + +IMPL = LogEverything(utils.LazyPluggable(FLAGS['db_backend'], + sqlalchemy='nova.db.sqlalchemy.api')) class NoMoreNetworks(exception.Error): diff --git nova/image/__init__.py nova/image/__init__.py index c5ca6f3..ac047fc 100644 --- nova/image/__init__.py +++ nova/image/__init__.py @@ -14,19 +14,47 @@ # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the # License for the specific language governing permissions and limitations # under the License. - +import functools import nova from nova import utils from nova import flags from nova.image import glance +from nova import log as logging FLAGS = flags.FLAGS +LOG = logging.getLogger('nova.image') + + +class LogEverything(object): + + def __init__(self, *args, **kwargs): + if not hasattr(self, 'already_inited'): + self.wrapped = args[0] + self.already_inited = True + else: + self.wrapped.__init__(*args, **kwargs) + + def __getattr__(self, name): + original = getattr(self.wrapped, name) + if callable(original): + return self._log_everything(original, name) + else: + return original + + def _log_everything(self, fun, name): + @functools.wraps(fun) + def logged_fun(context, *args, **kwargs): + LOG.info("MJW %s" % name, context=context) + ret = fun(context, *args, **kwargs) + LOG.info("MJW finish %s" % name, context=context) + return ret + return logged_fun def get_default_image_service(): ImageService = utils.import_class(FLAGS.image_service) - return ImageService() + return LogEverything(ImageService()) def get_image_service(context, image_href): @@ -48,4 +76,4 @@ def get_image_service(context, image_href): (glance_client, image_id) = glance.get_glance_client(context, image_href) image_service = nova.image.glance.GlanceImageService(glance_client) - return (image_service, image_id) + return (LogEverything(image_service), image_id) diff --git nova/log.py nova/log.py index c0bc256..00a203f 100644 --- nova/log.py +++ nova/log.py @@ -30,6 +30,7 @@ It also allows setting of formatting information through flags. """ import cStringIO +import eventlet import inspect import json import logging @@ -49,6 +50,7 @@ from nova import version log_opts = [ cfg.StrOpt('logging_context_format_string', default='%(asctime)s %(levelname)s %(name)s [%(request_id)s ' + '%(eventlet_id)x ' '%(user_id)s %(project_id)s] %(instance)s' '%(message)s', help='format string to use for log messages with context'), @@ -177,6 +179,8 @@ class NovaLogger(logging.Logger): params['extra'] = {} extra = params['extra'] + extra['eventlet_id'] = id(eventlet.getcurrent()) + context = None if 'context' in params: context = params['context'] diff --git nova/rpc/__init__.py nova/rpc/__init__.py index 7c6ed29..80a7a7c 100644 --- nova/rpc/__init__.py +++ nova/rpc/__init__.py @@ -16,11 +16,13 @@ # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the # License for the specific language governing permissions and limitations # under the License. +import functools from nova import flags from nova.openstack.common import cfg from nova.rpc.common import RemoteError, LOG from nova.utils import import_object +from nova import log as logging rpc_backend_opt = \ @@ -30,6 +32,29 @@ rpc_backend_opt = \ FLAGS = flags.FLAGS FLAGS.add_option(rpc_backend_opt) +LOG = logging.getLogger('nova.rpc') + + +class LogEverything(object): + + def __init__(self, wrapped): + self.wrapped = wrapped + + def __getattr__(self, name): + original = getattr(self.wrapped, name) + if callable(original): + return self._log_everything(original, name) + else: + return original + + def _log_everything(self, fun, name): + @functools.wraps(fun) + def logged_fun(context, *args, **kwargs): + LOG.info("MJW %s" % name, context=context) + ret = fun(context, *args, **kwargs) + LOG.info("MJW finish %s" % name, context=context) + return ret + return logged_fun def create_connection(new=True): @@ -169,5 +194,5 @@ def _get_impl(): """Delay import of rpc_backend until FLAGS are loaded.""" global _RPCIMPL if _RPCIMPL is None: - _RPCIMPL = import_object(FLAGS.rpc_backend) + _RPCIMPL = LogEverything(import_object(FLAGS.rpc_backend)) return _RPCIMPL
_______________________________________________ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp