Felix Schwarz wrote:
>
> I have a question which I think is similar enough to be asked in the
> same thread: I have a set of quite simple migration scripts which use
> SQLAlchemy 0.4 and Elixir 0.4. I extract data from the old legacy
> (MySQL) database with SQLAlchemy and put this data into new Elixir
> objects.
>
> Currently, these scripts use up to 600 MB RAM. This is no real problem
> as we could probably devote a machine with 4 GB RAM solely to the
> automated migration. But it would be nice to use lower-powered
> machines for our migration tasks.
>
> What puzzles me is that I do not (knowingly) keep references to either
> the old data items or the new Elixir objects. Nevertheless, memory
> usage increases during the migration. Is there an easy way to debug
> this, to see why Python needs so much memory and which references
> prevent the objects from being garbage collected? Running the garbage
> collector manually did not help much (it saved only about 5 MB).
> 
> fs
>

Here is a snippet that I've used before when trying to track down
objects that aren't getting cleaned up properly. I don't think it'll
find leaks of built-in types, but it should help with instances of
user-defined classes. Just call 'report_objects' every now and then.

--------------------------------------------

import gc

# Per-module object counts from the previous call, used to report changes.
_previous = {}


def report_objects(threshold=500):
    """Print how many GC-tracked objects each module accounts for."""
    objects = gc.get_objects()
    print("Number of objects in memory: %d" % len(objects))
    modules = {}
    for obj in objects:
        module_name = getattr(obj, '__module__', None)
        # Skip objects without a __module__, or with a non-string one.
        if isinstance(module_name, str):
            # Group by at most the first three package levels.
            module = '.'.join(module_name.split('.')[:3])
            modules[module] = modules.get(module, 0) + 1

    print("Modules with > %d objects:" % threshold)
    dump_modules(modules, threshold)

    if _previous:
        changes = {}
        for module, value in modules.items():
            changes[module] = value - _previous.get(module, 0)

        print("Changes since last time:")
        dump_modules(changes, 10)

        _previous.clear()
    _previous.update(modules)
    print("")


def dump_modules(modules, threshold):
    """Print the entries whose count exceeds threshold, largest first."""
    rows = [(value, module) for module, value in modules.items()
            if value > threshold]
    if rows:
        # Right-align names to the longest module name seen.
        maxlen = max(len(m) for m in modules)
        rows.sort(reverse=True)
        for value, module in rows:
            print("%*s %5d" % (maxlen + 1, module, value))
    else:
        print("   <None>")

-------------------------------------------------

The first time you call report_objects, you should get something like
this:

Number of objects in memory: 100794
Modules with > 500 objects:
    sqlalchemy.ext.assignmapper  1935
                sqlalchemy.util  1362
               sqlalchemy.types  1250
              sqlalchemy.schema  1170
                 sqlalchemy.sql  1124
      sqlalchemy.orm.unitofwork  1003
      sqlalchemy.orm.strategies   956
      sqlalchemy.orm.properties   750
      sqlalchemy.orm.attributes   699
          sqlalchemy.orm.mapper   681
      testresults.define_schema   665


And then when you call it again some time later:

Number of objects in memory: 102349
Modules with > 500 objects:
    sqlalchemy.ext.assignmapper  1935
                sqlalchemy.util  1418
               sqlalchemy.types  1250
              sqlalchemy.schema  1204
                 sqlalchemy.sql  1177
      sqlalchemy.orm.unitofwork  1004
      sqlalchemy.orm.strategies   993
      sqlalchemy.orm.properties   750
      sqlalchemy.orm.attributes   708
          sqlalchemy.orm.mapper   681
      testresults.define_schema   665
Changes since last time:
                sqlalchemy.util    56
                 sqlalchemy.sql    53
     sqlalchemy.databases.mysql    49
                MySQLdb.cursors    45
      sqlalchemy.orm.strategies    37
              sqlalchemy.schema    34
            MySQLdb.connections    16
             MySQLdb.converters    11

Note that the module names are where the classes are defined, not where
they are used, but it may be enough to give you a clue.
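
Once a module stands out, a possible next step is to count instances
per class and then ask the garbage collector what is holding on to a
few of them. The following is only a rough sketch using gc.get_objects
and gc.get_referrers from the standard library; the helper names
count_instances and describe_referrers are made up for this example and
are not part of SQLAlchemy or Elixir.

```python
import gc
import types
from collections import Counter


def count_instances(module_prefix):
    """Count GC-tracked instances per class for classes whose defining
    module starts with module_prefix (hypothetical helper)."""
    counts = Counter()
    for obj in gc.get_objects():
        cls = type(obj)
        module = getattr(cls, '__module__', None)
        if isinstance(module, str) and module.startswith(module_prefix):
            counts[cls.__name__] += 1
    return counts


def describe_referrers(cls, limit=3):
    """Describe what refers to up to `limit` instances of cls
    (hypothetical helper)."""
    instances = [obj for obj in gc.get_objects() if type(obj) is cls][:limit]
    lines = []
    for inst in instances:
        for ref in gc.get_referrers(inst):
            # Ignore our own bookkeeping list and any stack frames,
            # which only refer to the instance while this function runs.
            if ref is instances or isinstance(ref, types.FrameType):
                continue
            lines.append("%s instance held by a %s"
                         % (cls.__name__, type(ref).__name__))
    return lines
```

For example, count_instances('testresults') would give per-class counts
for the module from the output above, and describe_referrers(SomeClass)
lists the types of the containers (a list, a dict, and so on) that
still refer to a few of its instances. SomeClass stands for whichever
class turns out to be growing.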

Hope that helps,

Simon
