zhuyifei1999 added a comment.
(Tests below are done with the patch above applied)
I forced garbage collection on each sleep with __import__('gc').collect(), but the memory usage kept increasing, so it is not an issue with garbage collection not running frequently enough.
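The reasoning above can be illustrated with a minimal sketch. The `Node` class and the `leak` list are hypothetical and are not taken from the bot. A forced collection reclaims unreachable reference cycles, but anything still reachable from a live container survives it, so memory that keeps growing after `gc.collect()` suggests the objects are still referenced somewhere.

```python
import gc
import weakref


class Node:
    """Hypothetical object used only for this illustration."""

    def __init__(self):
        self.me = self  # reference cycle: only the cyclic GC can free it


def make_garbage():
    """Create an unreachable cycle and return a weak reference to it."""
    node = Node()
    return weakref.ref(node)


def make_leak(store):
    """Create a cycle that stays reachable through `store`."""
    node = Node()
    store.append(node)
    return weakref.ref(node)


leak = []  # stands in for whatever keeps objects alive in the real script
garbage_ref = make_garbage()
leaked_ref = make_leak(leak)

gc.collect()  # equivalent to __import__('gc').collect()

garbage_freed = garbage_ref() is None     # the unreachable cycle is gone
leak_survived = leaked_ref() is not None  # the reachable object is untouched
```

If `leak_survived` stays true however often the collector runs, the fix has to remove the reference, not tune the collector.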
For simpler memory profiling, I captured a few mem_top snapshots:
- when the script had just started, with ~280M used:
refs:
119192 <class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119192 <class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266 <class 'set'> {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266 <class 'list'> ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
982 <class 'dict'> {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
758 <class 'dict'> {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
758 <class 'list'> [{'prefix': 'acronym', 'url': 'https://www.acronymfinder.com/$1.html'}, {'prefix': 'advisory', 'loca
474 <class 'dict'> {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
420 <class 'list'> ["Wrapper script to use Pywikibot in 'directory' mode.\n\nRun scripts using:\n\n python pwb.py <n
343 <class 'dict'> {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S

bytes:
6291552 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248 {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
24672 {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
24672 {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
24672 {'XATTR_SIZE_MAX': 65536, 'environ': {b'HOME': b'/home/zhuyifei1999', b'NVM_DIR': b'/home/zhuyifei19
24672 {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S
24672 {'INADDR_BROADCAST': 4294967295, 'HCI_TIME_STAMP': 3, 'SOCK_NONBLOCK': 2048, 'NETLINK_XFRM': 6, 'CAN

types:
241076 <class 'list'>
11074 <class 'function'>
7362 <class 'dict'>
4629 <class 'tuple'>
2281 <class 'weakref'>
1533 <class 'cell'>
1521 <class 'inspect.Parameter'>
1229 <class 'type'>
1190 <class 'wrapper_descriptor'>
1156 <class 'builtin_function_or_method'>
- when the memory went up to ~350M:
refs:
119192 <class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119191 <class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266 <class 'set'> {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266 <class 'list'> ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
1830 <class 'list'> ['# -*- coding: utf-8 -*-\n', '"""Miscellaneous helper functions (not wiki-dependent)."""\n', '#\n',
1352 <class 'list'> ['"""Thread module emulating a subset of Java\'s threading model."""\n', '\n', 'import sys as _sys\n
1338 <class 'list'> ['"""HTTP/1.1 client library\n', '\n', '<intro stuff goes here>\n', '<other stuff, too>\n', '\n', 'H
1051 <class 'list'> ['#!/usr/bin/python\n', '# -*- coding: utf-8 -*-\n', '"""\n', 'This bot is used for checking externa
982 <class 'dict'> {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
905 <class 'list'> ['from __future__ import absolute_import\n', 'import errno\n', 'import logging\n', 'import sys\n', '

bytes:
6291552 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248 {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
24672 {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
24672 {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
24672 {'XATTR_SIZE_MAX': 65536, 'environ': {b'HOME': b'/home/zhuyifei1999', b'NVM_DIR': b'/home/zhuyifei19
24672 {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S
24672 {'INADDR_BROADCAST': 4294967295, 'HCI_TIME_STAMP': 3, 'SOCK_NONBLOCK': 2048, 'NETLINK_XFRM': 6, 'CAN

types:
242404 <class 'list'>
37071 <class 'dict'>
12078 <class 'http.cookiejar.Cookie'>
11072 <class 'function'>
4597 <class 'tuple'>
2339 <class 'weakref'>
1534 <class 'cell'>
1521 <class 'inspect.Parameter'>
1421 <class 'frame'>
1283 <class 'builtin_function_or_method'>
- when the memory went up to ~480M, just before I terminated it:
refs:
119192 <class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119192 <class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266 <class 'set'> {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266 <class 'list'> ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
1830 <class 'list'> ['# -*- coding: utf-8 -*-\n', '"""Miscellaneous helper functions (not wiki-dependent)."""\n', '#\n',
1352 <class 'list'> ['"""Thread module emulating a subset of Java\'s threading model."""\n', '\n', 'import sys as _sys\n
1338 <class 'list'> ['"""HTTP/1.1 client library\n', '\n', '<intro stuff goes here>\n', '<other stuff, too>\n', '\n', 'H
1148 <class 'list'> ['# Wrapper module for _ssl, providing some additional facilities\n', '# implemented in Python. Wri
1051 <class 'list'> ['#!/usr/bin/python\n', '# -*- coding: utf-8 -*-\n', '"""\n', 'This bot is used for checking externa
982 <class 'dict'> {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0

bytes:
6291552 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248 {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
24672 {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
24672 {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
24672 {'XATTR_SIZE_MAX': 65536, 'environ': {b'HOME': b'/home/zhuyifei1999', b'NVM_DIR': b'/home/zhuyifei19
24672 {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S
24672 {'INADDR_BROADCAST': 4294967295, 'HCI_TIME_STAMP': 3, 'SOCK_NONBLOCK': 2048, 'NETLINK_XFRM': 6, 'CAN

types:
245105 <class 'list'>
165069 <class 'dict'>
66076 <class 'http.cookiejar.Cookie'>
11081 <class 'function'>
4794 <class 'tuple'>
2419 <class 'weakref'>
2110 <class 'frame'>
1551 <class 'cell'>
1521 <class 'inspect.Parameter'>
1497 <class 'builtin_function_or_method'>
The most striking change on my first read is the number of <class 'http.cookiejar.Cookie'> instances: it grew from fewer than 1156 (the type was not even in the top ten of the first snapshot) to 12078, and then to 66076.
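The comparison of the "types:" sections can also be done programmatically. The following is a minimal standard-library sketch, not how mem_top itself is implemented: it counts GC-tracked objects by type at two points in time and reports the types that grew. `FakeCookie`, `type_counts`, and `growth` are hypothetical names introduced for illustration; `FakeCookie` only stands in for http.cookiejar.Cookie.

```python
import gc
from collections import Counter


def type_counts():
    """Count live, GC-tracked objects by type."""
    gc.collect()  # drop collectable garbage first so it is not counted
    return Counter(type(obj) for obj in gc.get_objects())


def growth(before, after, limit=10):
    """Return up to `limit` (type, increase) pairs, largest increase first."""
    delta = Counter(
        {t: after[t] - before[t] for t in after if after[t] > before[t]}
    )
    return delta.most_common(limit)


class FakeCookie:
    """Hypothetical stand-in for http.cookiejar.Cookie."""

    def __init__(self, name):
        self.name = name


before = type_counts()
jar = [FakeCookie("c%d" % i) for i in range(5000)]  # simulated accumulation
after = type_counts()

for obj_type, increase in growth(before, after):
    print(increase, obj_type)
```

Taking such a snapshot on each sleep and diffing it against the previous one would show which types accumulate without reading three full dumps side by side.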
Will now look into its referrers with guppy.
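For readers without guppy, a rough first look at referrers is possible with the standard library alone. The sketch below does not use guppy and is only an approximation of what a heap profiler reports: it summarises the types of the objects that directly hold each instance of a class. `FakeCookie`, `FakeJar`, and `referrer_summary` are hypothetical names for illustration, not part of pywikibot or http.cookiejar.

```python
import gc
from collections import Counter


class FakeCookie:
    """Hypothetical stand-in for http.cookiejar.Cookie."""


class FakeJar:
    """Hypothetical container that accumulates cookies."""

    def __init__(self):
        self.cookies = []


def referrer_summary(cls):
    """Count, by type name, the direct referrers of all instances of `cls`."""
    instances = [obj for obj in gc.get_objects() if isinstance(obj, cls)]
    counts = Counter()
    for obj in instances:
        for ref in gc.get_referrers(obj):
            if ref is instances:
                continue  # ignore the list built by this function
            counts[type(ref).__name__] += 1
    return counts


fake_jar = FakeJar()
for _ in range(3):
    fake_jar.cookies.append(FakeCookie())

summary = referrer_summary(FakeCookie)
print(summary)
```

The exact summary depends on the interpreter version, since frames may also show up as referrers, but a container type that dominates the counts is a good candidate for the object that keeps the instances alive.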
To: zhuyifei1999
Cc: gerritbot, Dalba, Xqt, Zoranzoki21, zhuyifei1999, Aklapper, pywikibot-bugs-list, Dvorapa, Giuliamocci, Adrian1985, Cpaulf30, Baloch007, Darkminds3113, Lordiis, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, Magul, Tbscho, rafidaslam, MayS, Lewizho99, Mdupont, JJMC89, Maathavan, Avicennasis, jayvdb, Masti, Alchimista, Rxy