|
Hello. I have a problem when processing relatively large number of rows. For example, when selecting 5000 main rows, each having a number of many-to-one relationships, memory usage shown by top skyrockets into 200+ MB range (RES), while heapy shows cca 20MB of Python heap. PostgreSQL backend via psycopg2. I've made a minimum example case based on the problem I'm noticing in my Pyramid app, so the session.commit() at line 130 is there to simulate commit done by Transaction used in Pyramid at the end of each request. If I'm understanding things correctly, committing would expire all objects involved in the session, and I even tried manual session.expunge(row), but there is no difference in memory usage. The following is source of an example case. Requires SQLAlchemy (tested with 0.7.5 and 0.7.7), guppy, psycopg2 (tested with 2.4.2 and 2.4.4). Happens both on Fedora 15 64-bit and CentOS 6.2 32-bit, though of course the 32-bit shows some 30% lower RES in top. http://pastebin.com/UFgduWVw Usage: setup a test database, update line 25 config. Prepopulate database with -p flag, then run again without any flags. I don't see where and how would any objects remain in memory, and heapy showing much lower memory use suggests something is retained in the involved C extensions?� I also tried with pympler, diff before and after selecting rows, shows nothing near reported by top. I guess there is no "leak" in traditional sense of the word because repeating the task does not yield growing memory consumption. It stabilizes at certain value and stays there. Heapy before selecting rows: Partition of a set of 102014 objects. Total size = 13160672 bytes. �Index� Count�� %���� Size�� % Cumulative� % Kind (class / dict of class) ���� 0� 45901� 45� 4395296� 33�� 4395296� 33 str ���� 1� 26041� 26� 2186184� 17�� 6581480� 50 tuple ���� 2�� 7039�� 7�� 900992�� 7�� 7482472� 57 types.CodeType ���� 3�� 6836�� 7�� 820320�� 6�� 8302792� 63 function ���� 4��� 235�� 0�� 761608�� 6�� 9064400� 69 dict of module ���� 5��� 608�� 1�� 689792�� 5�� 9754192� 74 dict (no owner) ���� 6��� 676�� 1�� 648544�� 5� 10402736� 79 dict of type ���� 7��� 676�� 1�� 608344�� 5� 11011080� 84 type ���� 8��� 199�� 0�� 206248�� 2� 11217328� 85 dict of class ���� 9��� 185�� 0�� 167320�� 1� 11384648� 87 sqlalchemy.sql.visitors.VisitableType <334 more rows. Type e.g. '_.more' to view.> Heapy after 5000 rows have been selected: Partition of a set of 102587 objects. Total size = 16455168 bytes. �Index� Count�� %���� Size�� % Cumulative� % Kind (class / dict of class) ���� 0� 45923� 45� 4397632� 27�� 4397632� 27 str ���� 1����� 1�� 0� 3146024� 19�� 7543656� 46 sqlalchemy.orm.identity.WeakInstanceDict ���� 2� 26090� 25� 2189480� 13�� 9733136� 59 tuple ���� 3�� 7039�� 7�� 900992�� 5� 10634128� 65 types.CodeType ���� 4�� 6859�� 7�� 823080�� 5� 11457208� 70 function ���� 5��� 235�� 0�� 761608�� 5� 12218816� 74 dict of module ���� 6��� 657�� 1�� 705048�� 4� 12923864� 79 dict (no owner) ���� 7��� 676�� 1�� 650464�� 4� 13574328� 82 dict of type ���� 8��� 676�� 1�� 608344�� 4� 14182672� 86 type ���� 9��� 199�� 0�� 206248�� 1� 14388920� 87 dict of class <372 more rows. Type e.g. '_.more' to view.> What am I doing wrong? I'm hoping something trivial and blatantly obvious that I'm oblivious to. :) Thanks. -- .oO V Oo. -- |
- [sqlalchemy] Understanding memory usage under SQLA Vlad K.
- Re: [sqlalchemy] Understanding memory usage under SQLA Michael Bayer
