zhuyifei1999 added a comment.

(Tests below are done with the patch above applied)

I forced garbage collection on each sleep with __import__('gc').collect(), but the memory usage kept increasing, so it is not an issue with garbage collection not running frequently enough.
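A minimal sketch of the reasoning behind that check, using only the standard library. The `Node` class and the `leaked` list are hypothetical stand-ins, not code from the bot: cyclic garbage disappears after a forced `gc.collect()`, while objects that are still referenced from somewhere stay alive no matter how often the collector runs.

```python
import gc


class Node:
    """Hypothetical object that takes part in a reference cycle."""

    def __init__(self):
        self.me = self  # self-reference: only the cycle collector can free it


def collect_and_count():
    """Force a full collection; return (unreachable found, tracked objects)."""
    unreachable = gc.collect()
    return unreachable, len(gc.get_objects())


leaked = []  # stands in for a long-lived container that keeps references

gc.disable()  # make the forced collection below the only one that runs
_, baseline = collect_and_count()

for _ in range(1000):
    Node()                 # cyclic garbage: unreachable, but needs the GC
    leaked.append(Node())  # still reachable: the GC can never free this

freed, after = collect_and_count()
gc.enable()

print("unreachable objects found by the forced collection:", freed)
print("tracked objects still alive beyond the baseline:", after - baseline)
```

If memory keeps growing after a forced collection, as it did here, the objects are of the second kind, and the next step is finding what holds the references.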

For simpler memory profiling, I captured a few mem_top memory snapshots:

  • when the script had just started, with ~280M used:
refs:
119192	<class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119192	<class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266	<class 'set'> {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266	<class 'list'> ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
982	<class 'dict'> {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
758	<class 'dict'> {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
758	<class 'list'> [{'prefix': 'acronym', 'url': 'https://www.acronymfinder.com/$1.html'}, {'prefix': 'advisory', 'loca
474	<class 'dict'> {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
420	<class 'list'> ["Wrapper script to use Pywikibot in 'directory' mode.\n\nRun scripts using:\n\n    python pwb.py <n
343	<class 'dict'> {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S

bytes:
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248	 {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
24672	 {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
24672	 {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
24672	 {'XATTR_SIZE_MAX': 65536, 'environ': {b'HOME': b'/home/zhuyifei1999', b'NVM_DIR': b'/home/zhuyifei19
24672	 {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S
24672	 {'INADDR_BROADCAST': 4294967295, 'HCI_TIME_STAMP': 3, 'SOCK_NONBLOCK': 2048, 'NETLINK_XFRM': 6, 'CAN

types:
241076	 <class 'list'>
11074	 <class 'function'>
7362	 <class 'dict'>
4629	 <class 'tuple'>
2281	 <class 'weakref'>
1533	 <class 'cell'>
1521	 <class 'inspect.Parameter'>
1229	 <class 'type'>
1190	 <class 'wrapper_descriptor'>
1156	 <class 'builtin_function_or_method'>
  • when the memory went up to ~350M:
refs:
119192	<class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119191	<class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266	<class 'set'> {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266	<class 'list'> ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
1830	<class 'list'> ['# -*- coding: utf-8 -*-\n', '"""Miscellaneous helper functions (not wiki-dependent)."""\n', '#\n',
1352	<class 'list'> ['"""Thread module emulating a subset of Java\'s threading model."""\n', '\n', 'import sys as _sys\n
1338	<class 'list'> ['"""HTTP/1.1 client library\n', '\n', '<intro stuff goes here>\n', '<other stuff, too>\n', '\n', 'H
1051	<class 'list'> ['#!/usr/bin/python\n', '# -*- coding: utf-8 -*-\n', '"""\n', 'This bot is used for checking externa
982	<class 'dict'> {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
905	<class 'list'> ['from __future__ import absolute_import\n', 'import errno\n', 'import logging\n', 'import sys\n', '

bytes:
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248	 {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
24672	 {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
24672	 {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
24672	 {'XATTR_SIZE_MAX': 65536, 'environ': {b'HOME': b'/home/zhuyifei1999', b'NVM_DIR': b'/home/zhuyifei19
24672	 {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S
24672	 {'INADDR_BROADCAST': 4294967295, 'HCI_TIME_STAMP': 3, 'SOCK_NONBLOCK': 2048, 'NETLINK_XFRM': 6, 'CAN

types:
242404	 <class 'list'>
37071	 <class 'dict'>
12078	 <class 'http.cookiejar.Cookie'>
11072	 <class 'function'>
4597	 <class 'tuple'>
2339	 <class 'weakref'>
1534	 <class 'cell'>
1521	 <class 'inspect.Parameter'>
1421	 <class 'frame'>
1283	 <class 'builtin_function_or_method'>

  • when the memory went up to ~480M, just before I terminated it:

refs:
119192	<class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119192	<class 'dict'> {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266	<class 'set'> {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266	<class 'list'> ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
1830	<class 'list'> ['# -*- coding: utf-8 -*-\n', '"""Miscellaneous helper functions (not wiki-dependent)."""\n', '#\n',
1352	<class 'list'> ['"""Thread module emulating a subset of Java\'s threading model."""\n', '\n', 'import sys as _sys\n
1338	<class 'list'> ['"""HTTP/1.1 client library\n', '\n', '<intro stuff goes here>\n', '<other stuff, too>\n', '\n', 'H
1148	<class 'list'> ['# Wrapper module for _ssl, providing some additional facilities\n', '# implemented in Python.  Wri
1051	<class 'list'> ['#!/usr/bin/python\n', '# -*- coding: utf-8 -*-\n', '"""\n', 'This bot is used for checking externa
982	<class 'dict'> {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0

bytes:
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248	 {'wmmx': <pywikibot.site._IWEntry object at 0x7fe33e442208>, 'tl': <pywikibot.site._IWEntry object a
24672	 {10731520: <weakref at 0x7fe349b8a9a8; to 'type' at 0xa3c000 (dict_values)>, 10733056: <weakref at 0
24672	 {'pkg_resources.extern.six.moves': <module 'pkg_resources._vendor.six.moves' (<pkg_resources._vendor
24672	 {'XATTR_SIZE_MAX': 65536, 'environ': {b'HOME': b'/home/zhuyifei1999', b'NVM_DIR': b'/home/zhuyifei19
24672	 {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S
24672	 {'INADDR_BROADCAST': 4294967295, 'HCI_TIME_STAMP': 3, 'SOCK_NONBLOCK': 2048, 'NETLINK_XFRM': 6, 'CAN

types:
245105	 <class 'list'>
165069	 <class 'dict'>
66076	 <class 'http.cookiejar.Cookie'>
11081	 <class 'function'>
4794	 <class 'tuple'>
2419	 <class 'weakref'>
2110	 <class 'frame'>
1551	 <class 'cell'>
1521	 <class 'inspect.Parameter'>
1497	 <class 'builtin_function_or_method'>

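The most significant thing on my first read is that the number of <class 'http.cookiejar.Cookie'> instances explodes: from fewer than 1156 (it did not make the top 10 in the first snapshot), to 12078, to 66076. The <class 'dict'> count grows alongside it, from 7362 to 37071 to 165069.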
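For reference, the per-type comparison can be approximated with the standard library alone. This is a rough sketch, not how mem_top itself is implemented; `type_counts` and the `Leaky` class are hypothetical names used for illustration. It counts gc-tracked objects by type and subtracts two snapshots so that only the types that grew remain.

```python
import gc
from collections import Counter


def type_counts():
    """Count gc-tracked objects by type (a rough analogue of the 'types' list)."""
    return Counter(type(obj) for obj in gc.get_objects())


class Leaky:
    """Hypothetical class standing in for http.cookiejar.Cookie."""

    def __init__(self):
        self.payload = []


before = type_counts()
held = [Leaky() for _ in range(500)]  # simulate instances piling up
after = type_counts()

# Counter subtraction keeps only positive differences, i.e. types that grew.
growth = after - before
for cls, extra in growth.most_common(5):
    print(extra, cls)
```

Only objects tracked by the garbage collector are counted, so plain strings and integers will not show up in such a diff.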

Will now look into its referrers with guppy.
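As a rough standard-library stand-in for that step (this is not guppy's API), `gc.get_referrers` can show which kinds of objects hold references to instances of a given class. The names `referrer_types`, `Thing`, and `cache` below are hypothetical and only serve the illustration.

```python
import gc
import types
from collections import Counter


def referrer_types(cls, sample=5):
    """Tally the types of objects that refer to a few instances of cls."""
    instances = [obj for obj in gc.get_objects() if type(obj) is cls]
    counts = Counter()
    for obj in instances[:sample]:
        for ref in gc.get_referrers(obj):
            # Skip our own bookkeeping list and any stack frames.
            if ref is instances or isinstance(ref, types.FrameType):
                continue
            counts[type(ref).__name__] += 1
    return counts


class Thing:
    """Hypothetical class standing in for the leaking type."""


cache = {i: Thing() for i in range(10)}  # the container keeping them alive

print(referrer_types(Thing))
```

In a real session the interesting part is following the chain upward: once the referring container is known, the same call on that container shows what owns it.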


TASK DETAIL
https://phabricator.wikimedia.org/T185561
