Yes.  Guppy seems to have some nicer string formatting for this dump as well, 
but I was unable to figure out how to get that string format written to a 
file; the tool seems very geared towards interactive console use.  We should 
pick a nice memory formatter we like (there are a bunch of them) and then add 
it to our standard toolset.
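
For what it's worth, one possible workaround (untested, just a sketch; the 
output filename here is made up) might be to write str() of the heap stats 
out to a file, since that appears to be the same table guppy prints on the 
console:

from guppy import hpy

# Sketch only: assumes str() of the heap stats renders the same formatted
# table that guppy shows in an interactive session.
stats = hpy().heap()
with open("/tmp/memory_formatted.txt", "w") as handle:
    handle.write(str(stats))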


On Sep 9, 2014, at 10:35 AM, Doug Hellmann <d...@doughellmann.com> wrote:

> 
> On Sep 8, 2014, at 8:12 PM, Mike Bayer <mba...@redhat.com> wrote:
> 
>> Hi All - 
>> 
>> Joe had me do some quick memory profiling on nova; just an FYI if anyone 
>> wants to play with this technique.  I placed a little bit of memory profiling 
>> code using Guppy into nova/api/__init__.py, or anywhere in your favorite app 
>> that will definitely get imported when the thing first runs:
>> 
>> from guppy import hpy
>> import signal
>> import datetime
>> 
>> def handler(signum, frame):
>>     print "guppy memory dump"
>> 
>>     fname = "/tmp/memory_%s.txt" % datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
>>     prof = hpy().heap()
>>     with open(fname, 'w') as handle:
>>         prof.dump(handle)
>>     del prof
>> 
>> signal.signal(signal.SIGUSR2, handler)
> 
> This looks like something we could build into our standard service startup 
> code. Maybe in 
> http://git.openstack.org/cgit/openstack/oslo-incubator/tree/openstack/common/service.py
>  for example?
> 
> Doug
> 
>> 
>> 
>> 
>> Then run nova-api, run some API calls, then hit the nova-api process with a 
>> SIGUSR2 signal, and it will dump a profile into /tmp/ like this:
>> 
>> http://paste.openstack.org/show/108536/
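>> 
>> As a convenience, here is a rough helper sketch for sending that signal from 
>> Python (the process name and the pgrep approach are just assumptions for 
>> illustration, not part of the original setup):
>> 
>> import os
>> import signal
>> import subprocess
>> 
>> # Find the nova-api worker PIDs and poke each one with SIGUSR2 so every
>> # worker writes its own heap dump.
>> pids = subprocess.check_output(["pgrep", "-f", "nova-api"]).split()
>> for pid in pids:
>>     os.kill(int(pid), signal.SIGUSR2)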
>> 
>> Now obviously everyone is like, "oh boy, memory, let's go beat up SQLAlchemy 
>> again"… which is fine, I can take it.  In that particular profile, there's a 
>> bunch of SQLAlchemy stuff, but that is all structural to the classes that 
>> are mapped in the Nova API, e.g. 52 classes with a total of 656 attributes 
>> mapped.   That stuff sets up once and doesn't change.   If Nova used less 
>> ORM, e.g. didn't map everything, it would be less.  But in that profile 
>> there's no "data" lying around.
>> 
>> But even if you don't have that many objects resident, your Python process 
>> might still be using up a ton of memory.  The reason for this is that the 
>> CPython interpreter has a model where it will grab all the memory it needs 
>> to do something, a time-consuming process by the way, but then it really 
>> doesn't ever release it (see 
>> http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm
>>  for the "classic" answer on this; things may have improved/modernized in 
>> 2.7 but I think this is still the general idea).
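>> 
>> A quick way to see this for yourself is to watch the process RSS before and 
>> after dropping a big pile of small objects; this is only a rough, Linux-only 
>> sketch and the exact numbers will vary, but typically the RSS does not come 
>> back down to the baseline:
>> 
>> def rss_kb():
>>     # current resident set size, from the kernel's view of this process
>>     with open("/proc/self/status") as f:
>>         for line in f:
>>             if line.startswith("VmRSS:"):
>>                 return int(line.split()[1])
>> 
>> print "baseline: %s kB" % rss_kb()
>> data = [str(i) for i in xrange(5 * 10 ** 6)]
>> print "after allocating: %s kB" % rss_kb()
>> del data
>> print "after del: %s kB" % rss_kb()  # usually stays far above the baseline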
>> 
>> So in terms of SQLAlchemy, a good way to suck up a ton of memory all at once 
>> that probably won’t get released is to do this:
>> 
>> 1. fetching a full ORM object with all of its data
>> 
>> 2. fetching lots of them all at once
>> 
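>> In code, the combination of the two is the classic "load everything" query 
>> (using the same MyObject stand-in as in the examples below):
>> 
>> # Anti-pattern: every column of every matching row, fully hydrated into
>> # ORM objects, all resident in memory at once.
>> all_objects = session.query(MyObject).all()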
>> 
>> So to avoid doing that, the answer isn't necessarily that simple.   The 
>> quick win when loading full objects is to… not load the whole thing!   E.g. 
>> assuming we can get OpenStack onto SQLAlchemy 0.9 in requirements.txt, we 
>> can start using load_only():
>> 
>> session.query(MyObject).options(load_only("id", "name", "ip"))
>> 
>> or with any version, just load those columns - we should be using this as 
>> much as possible for any query that is row/time intensive and doesn’t need 
>> full ORM behaviors (like relationships, persistence):
>> 
>> session.query(MyObject.id, MyObject.name, MyObject.ip)
>> 
>> Another quick win, if we *really* need an ORM object, not a row, and we have 
>> to fetch a ton of them in one big result, is to fetch them using yield_per():
>> 
>>     for obj in session.query(MyObject).yield_per(100):
>>         # work with obj and then make sure to lose all references to it
>> 
>> yield_per() will dish out objects drawing from batches of the number you 
>> give it.   But it has two huge caveats: one is that it isn't compatible with 
>> most forms of eager loading, except for many-to-one joined loads.  The other 
>> is that the DBAPI, e.g. the MySQL driver, does *not* stream the rows; 
>> virtually all DBAPIs by default load a result set fully before you ever see 
>> the first row.  psycopg2 is one of the only DBAPIs that even offers a 
>> special mode to work around this (server-side cursors).
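>> 
>> If memory serves, that psycopg2 mode is reachable through the stream_results 
>> execution option; a rough sketch (my_table and process() are made-up names 
>> for illustration) would look something like:
>> 
>> conn = session.connection().execution_options(stream_results=True)
>> result = conn.execute(my_table.select())
>> for row in result:
>>     process(row)  # rows arrive from a server-side cursor instead of
>>                   # one big result buffered entirely on the client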
>> 
>> Which means it's even *better* to paginate result sets, so that you only ask 
>> the database for a chunk at a time, storing at most a subset of objects in 
>> memory at once.  Pagination itself is tricky: if you are using a naive 
>> LIMIT/OFFSET approach, it takes a while when you are working with a large 
>> OFFSET.  It's better to SELECT into windows of data, where you can specify 
>> start and end criteria (against an indexed column) for each window, like a 
>> timestamp.
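>> 
>> A rough sketch of that windowing approach (the window size, the 
>> MyObject.created_at column, and handle() are illustrative, not from real 
>> Nova code, and this assumes created_at is effectively unique) might look 
>> like:
>> 
>> window_size = 1000
>> lower = None
>> while True:
>>     q = session.query(MyObject).order_by(MyObject.created_at)
>>     if lower is not None:
>>         q = q.filter(MyObject.created_at > lower)
>>     chunk = q.limit(window_size).all()
>>     if not chunk:
>>         break
>>     for obj in chunk:
>>         handle(obj)
>>     lower = chunk[-1].created_at
>>     # drop the identity-map references so this window can be collected
>>     session.expunge_all()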
>> 
>> Then of course, using Core only is another level of speed/low memory.  
>> Though querying for individual columns with the ORM is not far off, and I've 
>> also made some major improvements to that in 1.0 so that query(*cols) is 
>> pretty competitive with straight Core (and Core is… well, I'd say becoming 
>> visible in the raw DBAPI's rear-view mirror, at least…).
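>> 
>> For completeness, the Core-only version of the same column fetch would be 
>> roughly this (my_objects being the Table that MyObject is mapped to; the 
>> name is assumed):
>> 
>> from sqlalchemy import select
>> 
>> stmt = select([my_objects.c.id, my_objects.c.name, my_objects.c.ip])
>> for id_, name, ip in session.execute(stmt):
>>     pass  # plain tuples, no ORM identity map or unit-of-work overhead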
>> 
>> What I'd suggest here is that we start to be mindful of memory/performance 
>> patterns and start to rework naive ORM use into more savvy patterns: being 
>> aware of what columns are needed, what rows, how many SQL queries we really 
>> need to emit, and what the "worst case" number of rows will be for sections 
>> that really need to scale.  By far the hardest part is recognizing when 
>> something might have to deal with an arbitrarily large number of rows and 
>> reimplementing it around a "streaming" pattern where you never have all the 
>> rows in memory at once.  On other projects I've had tasks that would 
>> normally take about a day but took weeks in order to organize them to 
>> "scale", such as being able to write out a 1G XML file from a database 
>> (yes, actual use case: not only do you have to stream your database data, 
>> you also have to stream out your DOM nodes, for which I had to write some 
>> fancy SAX extensions).
>> 
>> I know that using the ORM makes SQL development "easy", and many anti-ORM 
>> articles insist that this lulls us all into not worrying about what is 
>> actually going on (as much as SQLAlchemy eschews that way of working)… but I 
>> remain optimistic that it *is* possible to use tools that save a vast amount 
>> of effort and avoid the code verbosity and inconsistency that result from 
>> doing everything "by hand", while at the same time not losing our ability to 
>> understand how we're talking to the database.   It's a 
>> have-your-cake-and-eat-it-too situation, I know.
>> 
>> This is already what I'm here to contribute on; I've been working out some 
>> new SQLAlchemy patterns that hopefully will help, and in the coming weeks I 
>> may try to find time to spot some more of these particular things within 
>> current Nova code without getting too deep into a total rewrite just yet.
>> 
>> 
>> 
>> 
>> 
>> On Sep 8, 2014, at 6:24 PM, Joe Gordon <joe.gord...@gmail.com> wrote:
>> 
>>> Hi All,
>>> 
>>> We have recently started seeing assorted memory issues in the gate 
>>> including the oom-killer [0] and libvirt throwing memory errors [1]. 
>>> Luckily we run ps and dstat on every devstack run so we have some insight 
>>> into why we are running out of memory. Based on the output from a job taken 
>>> at random [2][3], a typical run consists of:
>>> 
>>> * 68 OpenStack API processes alone
>>> * the following services are running 8 processes (number of CPUs on test 
>>> nodes)
>>>   * nova-api (we actually run 24 of these, 8 compute, 8 EC2, 8 metadata)
>>>   * nova-conductor
>>>   * cinder-api
>>>   * glance-api
>>>   * trove-api
>>>   * glance-registry
>>>   * trove-conductor
>>> * together nova-api, nova-conductor, and cinder-api alone take over 45 %MEM 
>>> (note: some of that memory usage is counted multiple times, as RSS 
>>> includes shared libraries)
>>> * based on dstat numbers, it looks like we don't use that much memory 
>>> before tempest runs, and after tempest runs we use a lot of memory.
>>> 
>>> Based on this information I have two categories of questions:
>>> 
>>> 1) Should we explicitly set the number of workers that services use in 
>>> devstack? Why have so many workers in a small all-in-one environment? What 
>>> is the right balance here?
>>> 
>>> 2) Should we be worried that some OpenStack services such as nova-api, 
>>> nova-conductor and cinder-api take up so much memory? Does their memory 
>>> usage keep growing over time? Does anyone have any numbers to answer this? 
>>> Why do these processes take up so much memory?
>>> 
>>> best,
>>> Joe
>>> 
>>> 
>>> [0] 
>>> http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwib29tLWtpbGxlclwiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiIxNzI4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxNDEwMjExMjA5NzY3fQ==
>>> [1] https://bugs.launchpad.net/nova/+bug/1366931
>>> [2] http://paste.openstack.org/show/108458/
>>> [3] 
>>> http://logs.openstack.org/83/119183/4/check/check-tempest-dsvm-full/ea576e7/logs/screen-dstat.txt.gz
