Hi All - 

Joe had me do some quick memory profiling on nova; just an FYI if anyone wants 
to play with this technique. I placed a little bit of memory profiling code 
using Guppy into nova/api/__init__.py, or anywhere in your favorite app that 
will definitely get imported when the thing first runs:

from guppy import hpy
import signal
import datetime

def handler(signum, frame):
    # on SIGUSR2, dump a guppy heap profile to a timestamped file in /tmp
    print "guppy memory dump"

    fname = "/tmp/memory_%s.txt" % datetime.datetime.now().strftime(
        "%Y%m%d_%H%M%S")
    prof = hpy().heap()
    with open(fname, 'w') as handle:
        prof.dump(handle)
    del prof

signal.signal(signal.SIGUSR2, handler)



Then run nova-api, make some API calls, and hit the nova-api process with a 
SIGUSR2 signal (e.g. kill -USR2 <pid>); it will dump a profile into /tmp/ like 
this:

http://paste.openstack.org/show/108536/
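
If you'd rather script sending the signal than hunt down the pid by hand, 
here's a minimal sketch, assuming a devstack-style box where pgrep -f can find 
the nova-api process:

    import os
    import signal
    import subprocess

    # find the oldest process whose command line matches nova-api and send
    # it SIGUSR2; equivalent to "kill -USR2 <pid>" from a shell
    pid = int(subprocess.check_output(["pgrep", "-of", "nova-api"]))
    os.kill(pid, signal.SIGUSR2)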

Now obviously everyone is like, "oh boy, memory, let's go beat up SQLAlchemy 
again"... which is fine, I can take it. In that particular profile there's a 
bunch of SQLAlchemy stuff, but it is all structural to the classes that are 
mapped in Nova API, e.g. 52 classes with a total of 656 attributes mapped. 
That stuff sets up once and doesn't change. If Nova used less ORM, e.g. didn't 
map everything, that would be less. But in that profile there's no "data" 
lying around.

But even if you don't have that many objects resident, your Python process 
might still be using up a ton of memory. The reason is that the CPython 
interpreter will grab all the memory it needs to do something (a 
time-consuming process, by the way), but then it really doesn't ever release 
it (see 
http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm
 for the "classic" answer on this; things may have improved/modernized in 2.7, 
but I think this is still the general idea).

So in terms of SQLAlchemy, a good way to suck up a ton of memory all at once 
that probably won't get released is to do both of the following at the same 
time (a quick sketch of the anti-pattern is below):

1. fetch a full ORM object with all of its data

2. fetch lots of them all at once
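
Here's that anti-pattern in its minimal form, assuming a hypothetical mapped 
class MyObject:

    # pull every column of every row into fully-mapped ORM objects, all in
    # one shot; CPython likely won't return this memory to the OS even
    # after the objects are garbage collected
    all_objects = session.query(MyObject).all()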


So to avoid doing that, the answer isn't necessarily that simple. The quick 
win over loading full objects is to ...not load the whole thing! E.g. assuming 
we can get OpenStack onto 0.9 in requirements.txt, we can start using 
load_only():

session.query(MyObject).options(load_only("id", "name", "ip"))

or with any version, just load those columns - we should be using this as much 
as possible for any query that is row/time intensive and doesn’t need full ORM 
behaviors (like relationships, persistence):

session.query(MyObject.id, MyObject.name, MyObject.ip)

Another quick win, if we *really* need an ORM object, not a row, and we have to 
fetch a ton of them in one big result, is to fetch them using yield_per():

    for obj in session.query(MyObject).yield_per(100):
        # work with obj and then make sure to lose all references to it

yield_per() will dish out objects drawn from batches of the size you give it. 
But it has two huge caveats: one is that it isn't compatible with most forms 
of eager loading, except for many-to-one joined loads. The other is that the 
DBAPI, e.g. the MySQL driver, does *not* stream the rows; virtually all DBAPIs 
by default load a result set fully before you ever see the first row. psycopg2 
is one of the only DBAPIs that even offers a special mode to work around this 
(server-side cursors).
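
For what it's worth, if you're on psycopg2 you can ask for that mode through 
SQLAlchemy's stream_results execution option; a hedged sketch (names are 
0.9-era, MyObject.__table__ assumes a declarative mapping, and handle_row is a 
hypothetical stand-in for real per-row work):

    # ask psycopg2 for a server-side cursor, so rows stream from the
    # server instead of being buffered client-side before the first fetch
    conn = session.connection().execution_options(stream_results=True)
    for row in conn.execute(MyObject.__table__.select()):
        handle_row(row)  # hypothetical per-row work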

All of which means it's even *better* to paginate result sets, so that you 
only ask the database for a chunk at a time, storing at most a subset of the 
objects in memory at once. Pagination itself is tricky, though: a naive 
LIMIT/OFFSET approach slows down as the OFFSET grows large. It's better to 
SELECT windows of data, where you can specify start and end criteria (against 
an indexed column) for each window, like a timestamp.
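
A minimal sketch of that windowing idea, assuming a hypothetical indexed 
created_at column with unique values (for a non-unique column you'd window on 
a composite like (created_at, id)) and a hypothetical process() function:

    window = 1000
    lower = None
    while True:
        q = session.query(MyObject).order_by(MyObject.created_at)
        if lower is not None:
            # resume just past the last row of the previous window
            q = q.filter(MyObject.created_at > lower)
        chunk = q.limit(window).all()
        if not chunk:
            break
        for obj in chunk:
            process(obj)
        lower = chunk[-1].created_at
        # detach the chunk from the Session so it can be garbage collected
        session.expunge_all()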

Then of course, using Core only is another level of speed/low memory. Though 
querying for individual columns with the ORM is not far off, and I've also 
made some major improvements to that in 1.0, so that query(*cols) is pretty 
competitive with straight Core (and Core is... well, I'd say becoming visible 
in raw DBAPI's rear-view mirror, at least...).
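
As an illustration, a Core-only version of the columns query above; a sketch 
that again assumes a declarative MyObject, so the Table is available as 
MyObject.__table__:

    from sqlalchemy import select

    t = MyObject.__table__
    stmt = select([t.c.id, t.c.name, t.c.ip])
    for id_, name, ip in session.execute(stmt):
        # plain tuples come back; no ORM object construction at all
        print id_, name, ip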

What I'd suggest here is that we start to be mindful of memory/performance 
patterns and work naive ORM use into more savvy patterns: being aware of what 
columns are needed, what rows, how many SQL queries we really need to emit, 
and what the "worst case" number of rows will be for sections that really need 
to scale. By far the hardest part is recognizing and reimplementing when 
something might have to deal with an arbitrarily large number of rows, which 
means organizing that code around a "streaming" pattern where you never have 
all the rows in memory at once. On other projects I've had tasks that would 
normally take about a day but, in order to organize them to "scale", took 
weeks - such as writing out a 1G XML file from a database (yes, an actual use 
case: not only do you have to stream your database data, you also have to 
stream out your DOM nodes, for which I had to write some fancy SAX 
extensions).
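
To make that "streaming" shape concrete, a rough sketch, where windowed_query 
is a hypothetical stand-in for the windowed fetch above:

    def stream_objects(session):
        # a generator that yields one object at a time, drawing on a
        # windowed fetch, so the full result set is never resident
        for obj in windowed_query(session):
            yield obj

    with open("/tmp/objects.xml", "w") as out:
        out.write("<objects>\n")
        for obj in stream_objects(session):
            # write each node as we go; nothing accumulates in memory
            out.write('  <object id="%s"/>\n' % obj.id)
        out.write("</objects>\n")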

I know that using the ORM makes SQL development "easy", and many anti-ORM 
articles insist that this lulls us all into not worrying about what is 
actually going on (as much as SQLAlchemy eschews that way of working)... but I 
remain optimistic that it *is* possible to use tools that save a vast amount 
of effort, code verbosity, and the inconsistency that results from doing 
everything "by hand", while at the same time not losing our ability to 
understand how we're talking to the database. It's a 
have-your-cake-and-eat-it-too situation, I know.

This is already what I'm here to contribute on; I've been working out some new 
SQLAlchemy patterns that hopefully will help, and in the coming weeks I may 
try to find time to spot more of these particular things within the current 
Nova code, without getting too much into a total rewrite as of yet.





On Sep 8, 2014, at 6:24 PM, Joe Gordon <joe.gord...@gmail.com> wrote:

> Hi All,
> 
> We have recently started seeing assorted memory issues in the gate, including 
> the oom-killer [0] and libvirt throwing memory errors [1]. Luckily we run ps 
> and dstat on every devstack run, so we have some insight into why we are 
> running out of memory. Based on the output from a job taken at random [2][3], 
> a typical run consists of:
> 
> * 68 openstack api processes alone
> * the following services are running 8 processes (number of CPUs on test 
> nodes)
>   * nova-api (we actually run 24 of these, 8 compute, 8 EC2, 8 metadata)
>   * nova-conductor
>   * cinder-api
>   * glance-api
>   * trove-api
>   * glance-registry
>   * trove-conductor
> * together nova-api, nova-conductor, and cinder-api alone take over 45 %MEM 
> (note: some of that memory usage is counted multiple times, as RSS includes 
> shared libraries)
> * based on dstat numbers, it looks like we don't use that much memory before 
> tempest runs, and after tempest runs we use a lot of memory.
> 
> Based on this information I have two categories of questions:
> 
> 1) Should we explicitly set the number of workers that services use in 
> devstack? Why have so many workers in a small all-in-one environment? What is 
> the right balance here?
> 
> 2) Should we be worried that some OpenStack services such as nova-api, 
> nova-conductor and cinder-api take up so much memory? Does their memory usage 
> keep growing over time? Does anyone have any numbers to answer this? Why do 
> these processes take up so much memory?
> 
> best,
> Joe
> 
> 
> [0] 
> http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwib29tLWtpbGxlclwiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiIxNzI4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxNDEwMjExMjA5NzY3fQ==
> [1] https://bugs.launchpad.net/nova/+bug/1366931
> [2] http://paste.openstack.org/show/108458/
> [3] 
> http://logs.openstack.org/83/119183/4/check/check-tempest-dsvm-full/ea576e7/logs/screen-dstat.txt.gz
