[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2023-05-16 Thread gerritbot
gerritbot added a comment. Change 416369 **abandoned** by Xqt: [pywikibot/core@master] weblinkchecker: Use temporary session and stream the response Reason: Task is already solved I guess https://gerrit.wikimedia.org/r/416369

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2022-11-17 Thread gerritbot
gerritbot added a comment. Change 416609 **abandoned** by Xqt: [pywikibot/core@master] [DO NOT MERGE] weblinkchecker: add timeout for get_memento_info() Reason: solved with https://gerrit.wikimedia.org/r/803232 https://gerrit.wikimedia.org/r/416609

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2022-06-19 Thread gerritbot
gerritbot added a comment. Change 803232 **merged** by Xqt: [pywikibot/core@master] [fix] Add a memento_client fix to the framework https://gerrit.wikimedia.org/r/803232

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2022-06-06 Thread gerritbot
gerritbot added a comment. Change 803232 had a related patch set uploaded (by Xqt; author: Xqt): [pywikibot/core@master] [fix] Add a memento_client fix to the framework https://gerrit.wikimedia.org/r/803232

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2021-05-11 Thread gerritbot
gerritbot added a comment. Change 689013 **merged** by jenkins-bot: [pywikibot/core@master] [doc] bot owner can specify their own session object https://gerrit.wikimedia.org/r/689013

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2021-05-11 Thread gerritbot
gerritbot added a comment. Change 416368 **abandoned** by Zhuyifei1999: [pywikibot/core@master] comms.http: allow consumer to specify their own session Reason: https://gerrit.wikimedia.org/r/416368

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2021-05-11 Thread gerritbot
gerritbot added a comment. Change 689013 had a related patch set uploaded (by Xqt; author: Xqt): [pywikibot/core@master] [doc] bot owner can specify their own session object https://gerrit.wikimedia.org/r/689013

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2020-03-28 Thread Zoranzoki21
Zoranzoki21 added a comment. In T185561#6007682, @Dvorapa wrote: > Perhaps we could deprecate the whole script as #internetarchivebot is already doing the same job in a more

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2020-03-28 Thread Dvorapa
Dvorapa added a comment. Perhaps we could deprecate the whole script as #internetarchivebot is already doing the same job in a more proper way (editing articles directly, creating well structured listings, etc.)

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2020-03-01 Thread Zoranzoki21
Zoranzoki21 added a comment. In T185561#5930472, @Xqt wrote: > Update request: https://github.com/mementoweb/py-memento-client/issues/24 Looks like the repo is inactive.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2020-03-01 Thread Xqt
Xqt added a comment. Update request: https://github.com/mementoweb/py-memento-client/issues/24

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-11-08 Thread zhuyifei1999
zhuyifei1999 added a comment. (still waiting for upstream release)

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-06-10 Thread Framawiki
Framawiki added a comment. Issue was closed upstream: https://github.com/mementoweb/py-memento-client/issues/23

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-05 Thread gerritbot
gerritbot added a comment. Change 416609 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999): [pywikibot/core@master] [DO NOT MERGE] weblinkchecker: add timeout for get_memento_info() https://gerrit.wikimedia.org/r/416609

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread zhuyifei1999
zhuyifei1999 added a comment. @Dvorapa Could you see if the memory still leaks without bound after applying the patches?

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread zhuyifei1999
zhuyifei1999 added a comment. ^ I can squash these two if it is better that way.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread gerritbot
gerritbot added a comment. Change 416368 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999): [pywikibot/core@master] comms.http: allow consumer to specify their own session https://gerrit.wikimedia.org/r/416368

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread gerritbot
gerritbot added a comment. Change 416369 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999): [pywikibot/core@master] weblinkchecker: Use temporary session and stream the response https://gerrit.wikimedia.org/r/416369

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread zhuyifei1999
zhuyifei1999 added a comment. FWIW, those python ptr stuffs are defined at https://github.com/python/cpython/blob/5fe59f8e3a0a56a155c18f9d581205ec533764b6/Tools/gdb/libpython.py

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread zhuyifei1999
zhuyifei1999 added a comment. So this code seems to be able to extract the approximate size of cookies, if navigated to the right frame (thread 2 & a few py-up): python def objptr_getattr(obj, name): try: return dict_getitem(obj.get_attr_dict(), name) except KeyError:

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread zhuyifei1999
zhuyifei1999 added a comment. Trying to get the size of all the cookies... (gdb) py-bt Traceback (most recent call first): File "/home/pavel/.local/lib/python3.6/site-packages/requests/sessions.py", line 508, in request resp = self.send(prep, **send_kwargs) File

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread zhuyifei1999
zhuyifei1999 added a comment. zhuyifei1999@zhuyifei1999-ThinkPad-X260:~$ curl -I 'http://aplikace.mvcr.cz/sbirka-zakonu/ViewFile.aspx?type=z=26659' HTTP/1.1 200 OK Date: Sun, 04 Mar 2018 19:38:07 GMT Server: MVCR Cache-Control: private Content-Length: 1867373444 Content-Type:

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread zhuyifei1999
zhuyifei1999 added a comment. The 1.8G dump looks like normal processing, but looking at the 3.2G dump, there are only two threads: Thread 2 (Thread 0x7f75a3fff700 (LWP 22212)): Traceback (most recent call first): File "/home/pavel/.local/lib/python3.6/site-packages/requests/models.py", line

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread Dvorapa
Dvorapa added a comment. OK, I ran python3 pwb.py weblinkchecker -start:! -ns:0 on Ubuntu 17.10 (with Python 3.6). I created 3 core dumps. The script ran with session replaced by requests in http.py#L384 and logged off. The script ran more than 3x faster than on Arch, with being logged in and with

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-04 Thread Dvorapa
Dvorapa added a comment. @zhuyifei1999 Thank you very much for the analysis! Yeah, I should have mentioned at the beginning that I install requests and requirements.txt (where the memento client is contained) every time after cloning Pywikibot. I forgot to install memento client when installing Python

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. Reported: https://github.com/mementoweb/py-memento-client/issues/23 As for the memory leak I saw earlier umm, takes forever to produce anything useful. I'll see if I can keep it running and attach gdb onto it when the memory usage is really large. @Dvorapa And

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. zhuyifei1999@zhuyifei1999-ThinkPad-X260:~/T185561test$ strace python3 -c 'import requests; requests.head("https://www.icann.org/en/tlds/agreements/cat/cat-agreement-23sep05.htm")' execve("/home/zhuyifei1999/T185561test/.venv/bin/python3", ["python3", "-c", "import

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. ioctl(4, FIONBIO, [0]) basically sets the socket into blocking IO mode, comparing with the previous ioctl(3, FIONBIO, [1]).
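
For readers less familiar with blocking versus non-blocking sockets, here is a minimal Python sketch of the difference the two ioctl calls above describe. It uses a local socket pair rather than a real HTTPS connection, and `read_available` is a hypothetical helper, not code from requests or pywikibot.

```python
import socket


def read_available(sock, blocking):
    """Try to read up to 16 bytes from sock in the requested mode.

    Hypothetical helper for illustration only. In non-blocking mode a
    read with no data pending raises BlockingIOError at once; in
    blocking mode the same read would wait until data arrives.
    """
    sock.setblocking(blocking)
    try:
        return sock.recv(16)
    except BlockingIOError:
        # Non-blocking socket with nothing to read yet.
        return None


if __name__ == '__main__':
    left, right = socket.socketpair()
    print(read_available(left, blocking=False))  # nothing sent yet
    right.sendall(b'hello')
    print(read_available(left, blocking=True))   # data is already waiting
    left.close()
    right.close()
```

A blocking read with no timeout and a peer that never answers would wait indefinitely, which matches the stuck `read` reported elsewhere in this thread.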

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. Reproduction code (simulating weblinkchecker.py#L200): strace python3 -c 'import datetime, memento_client; mc = memento_client.MementoClient(); mc.timegate_uri = "http://web.archive.org/web/";

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. ^ uh yeah, the stuck process is doing read, not poll, I just realized.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. I tested this reproduction code: PYWIKIBOT2_NO_USER_CONFIG=1 strace python3 -c 'import pwb; from pywikibot.comms.http import fetch; fetch("https://www.icann.org/en/tlds/agreements/cat/cat-agreement-23sep05.htm", headers={"User-Agent":"python-requests/2.18.4",

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. In line 334 of the paste, the url is revealed to be https://www.icann.org/en/tlds/agreements/cat/cat-agreement-23sep05.htm

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. I got this reproduced with an Arch Linux image, strace shows the thread stuck at read(31,, and lsof says fd 31 is www.icann.org:443, but I'm not sure about the exact url. Where are the debug symbols...

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. (Now trying to run LXC...)

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. First test run... unable to reproduce. Then I realized in your traceback there is memento_client. Installed it, still no. Ran to 7. návštěvní expedice (ISS) without problems, after which I terminated it.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. Compiled: zhuyifei1999@zhuyifei1999-ThinkPad-X260:~/T185561test$ PATH=`pwd`/.venv/bin:$PATH zhuyifei1999@zhuyifei1999-ThinkPad-X260:~/T185561test$ wget 'https://www.python.org/ftp/python/3.6.4/Python-3.6.4.tar.xz' --2018-03-03 16:39:32--

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. The symbol tables we have are probably very different, and unless you can copy the symbol tables over as well, I guess I'd better try to compile that version myself. I run Ubuntu 16.04 and cannot easily switch to other linux distros / versions without messing around

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread Dvorapa
Dvorapa added a comment. ↑ (missing comment:) I reproduced the bug on Ubuntu 17.10. This was captured in freeze ([[.lu]] article currently processed). In addition I also tried (gdb) generate-core-file (found a suggestion somewhere in python forums) which gave me some type of binary file as an

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. Thread 2 is waiting on SSL socket at line 825 of your paste. Why doesn't it time out? Other threads are waiting on the semaphore.
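
As a generic illustration of the timeout question raised above (not the requests or memento_client code path), a socket read only gives up when a timeout has been set on the socket. The sketch below uses a local socket pair, and `recv_with_timeout` is a hypothetical helper.

```python
import socket


def recv_with_timeout(sock, seconds):
    """Read up to 16 bytes, giving up after `seconds` with no data.

    Hypothetical helper for illustration only. Without settimeout(),
    recv() on a silent peer would block for as long as the connection
    stays open.
    """
    sock.settimeout(seconds)
    try:
        return sock.recv(16)
    except socket.timeout:
        # No data arrived within the allowed time.
        return None


if __name__ == '__main__':
    left, right = socket.socketpair()
    print(recv_with_timeout(left, 0.2))  # peer is silent, so the read times out
    right.sendall(b'x')
    print(recv_with_timeout(left, 0.2))  # data is available this time
    left.close()
    right.close()
```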

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4020983, @Dvorapa wrote: I have a partition to test install of Linux distros or similar stuff, I could install Ubuntu 17.10 with Python 3.6 and pywikibot in there and try gdb py-bt ;) Thanks.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-03 Thread Dvorapa
Dvorapa added a comment. In T185561#4020362, @zhuyifei1999 wrote: Oh, if someone has the time to provide a core dump, along with gdb symbols, when the script memory usage is really high, I might, though it is unlikely, be able to figure out the root cause. Right now I really lack the time to babysit

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. Oh, if someone has the time to provide a core dump, along with gdb symbols, when the script memory usage is really high, I might, though it is unlikely, be able to figure out the root cause. Right now I really lack the time to babysit the script.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread gerritbot
gerritbot added a comment. Change 415928 abandoned by Zhuyifei1999: [WIP] weblinkchecker: use shelve to persist data in disk real-time https://gerrit.wikimedia.org/r/415928

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread Dvorapa
Dvorapa added a comment. I installed Python 3.5 and repeat worked as expected, so this is definitely a python 3.6 issue!

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4019574, @zhuyifei1999 wrote: Will check the contribution of each. AFAICT, both the cookie change (session => requests) and the http method (GET => HEAD) significantly affect the rate of resident memory bloating.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. That indeed looks like freezing, but without the gdb symbols it's hard to conclude what it's doing.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread Dvorapa
Dvorapa added a comment. Or I can use pdb if needed (but it needs to log from the beginning too) Strace output: $ sudo strace -p 1797 [sudo] heslo pro pavel: strace: Process 1797 attached select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=610594}) = 0 (Timeout) clock_gettime(CLOCK_MONOTONIC,

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4019573, @Dvorapa wrote: On Arch Linux there is no support for py-bt and also there are missing things like symbols completely. Arch's python package does not contain any debug things. I use gdb the first time in my life so I even don't know how to use it

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. In my attempt to isolate the cause of the memory leak, I applied this onto the above patch: diff --git a/pywikibot/comms/http.py b/pywikibot/comms/http.py index 76878ae..168b4a3 100644 --- a/pywikibot/comms/http.py +++ b/pywikibot/comms/http.py @@ -381,7 +381,7 @@

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread Dvorapa
Dvorapa added a comment. On Arch Linux there is no support for py-bt and also there are missing things like symbols completely. Arch's python package does not contain any debug things. I use gdb the first time in my life so I even don't know how to use it correctly. I can also run it from

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. I stripped the duplicated information. Threads 3 to 50 have the exact same stack traces, and 1 & 2: Thread 2 (Thread 0x7fd4147e8700 (LWP 6449)): #0 0x7fd46d12b988 in read () from /usr/lib/libpthread.so.0 #1 0x7fd46ac4c390 in ?? () from /usr/lib/libcrypto.so.1.1

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. Can you apply py-bt? If not, could you at least get the python symbols? See https://wiki.python.org/moin/DebuggingWithGdb. That is really hard to read :(

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. I captured two guppy heaps (same process, the second is captured later than the first): >>> hpy().heap() Partition of a set of 1071160 objects. Total size = 112056904 bytes. Index Count % Size % Cumulative % Kind (class / dict of class) 0 338785 32

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread Dvorapa
Dvorapa added a comment. That would confirm the original deadlocking problem fixed :) For collecting phase probably yes! :) But... I tested the -repeat with weblink_dead_days set to 0 on newly created data file (containing freshly collected links from ! to Adam Lambert) and got stuck on [[.lb]]

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. Since the original mem_top measurement blamed Cookies, I looked into where they are referenced with guppy: >>> h[11] Partition of a set of 704 objects. Total size = 737792 bytes. Index Count % Size % Cumulative % Kind (class / dict of class) 0704

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread Dvorapa
Dvorapa added a comment. Linux typically... Thank you! From my current measurement: $ python pwb.py weblinkchecker -start:! -ns:0 on freshly booted system Python memory usage after start 200 MB Worked quite smoothly for 3 hours only increasing memory usage Python memory usage stopped on 1.1 GB

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. ^ The patch uses shelve for persistence, but memory is still increasing non-stop.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread gerritbot
gerritbot added a comment. Change 415928 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999): [pywikibot/core@master] [WIP] weblinkchecker: use shelve to persist data in disk real-time https://gerrit.wikimedia.org/r/415928

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4018687, @Dvorapa wrote: It will when RAM consumption will be close to 100 %, but now on 90 % it works really fast. That definitely sounds like an OOM condition. Linux typically overcommits memory (see /proc/sys/vm/overcommit* in proc(5)), allowing more

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread Dvorapa
Dvorapa added a comment. I will, but I think this can be some error in the data file. Currently I am trying to collect new data using https://gerrit.wikimedia.org/r/415771 and it works more smoothly, the memory consumption still slowly grows (from initial cca 200 MB to 500 MB now), but it seems

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4018476, @Dvorapa wrote: With https://gerrit.wikimedia.org/r/415771 there is still no difference for -repeat. It still behaves as I described in T185561#4015457 Please give me the traceback of: The last printed traceback in the script's console/stderr.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread Dvorapa
Dvorapa added a comment. In T185561#4018322, @zhuyifei1999 wrote: In T185561#4016642, @Dvorapa wrote: I'll look into patches and test them tomorrow. Thanks both Dalba and zhuyifei1999 for analysis and fixes. Is there something I can help with/test? The script freeze was caused by a semaphore

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-02 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4016642, @Dvorapa wrote: The only option I know is to hard shutdown, neither switching to terminal interface, nor some magic key combinations work in this case. Fix your OOM killer :). Hard shutdown is for when the kernel itself crashes (and in my case I

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. In T185561#4015912, @zhuyifei1999 wrote: @Dvorapa: Does the freeze only happen after the script reaches near 100% RAM? What is the CPU usage during the freeze? What is the state of the processes during the freeze? I'm especially interested in the load (/proc/loadavg), #

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. I just realized there is the shelve module that wraps around dbm to make life easier, although in python 2 keys can’t be unicode. Good part is that it’s already used by interwiki.py so I might use it as a reference.
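
A minimal Python 3 sketch of the shelve idea mentioned above: a dict-like object backed by a dbm file, so the link history need not live entirely in memory. This is not the patch from this task; the function names, the tuple layout, and the file path are assumptions made for illustration.

```python
import os
import shelve
import tempfile


def record_dead_link(db_path, url, page_title, timestamp):
    """Append one (page_title, timestamp) entry for url to the on-disk store.

    Hypothetical helper; the real history format of weblinkchecker.py
    is not shown in this thread.
    """
    with shelve.open(db_path) as history:
        entries = history.get(url, [])
        entries.append((page_title, timestamp))
        # Reassign the key so the updated list is written back to disk.
        history[url] = entries


def load_entries(db_path, url):
    """Return the stored entries for url, or an empty list if unknown."""
    with shelve.open(db_path) as history:
        return history.get(url, [])


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'history')
        record_dead_link(path, 'http://example.org/a', 'Page A', 1519862400)
        print(load_entries(path, 'http://example.org/a'))
```

Each access goes through the dbm file, so lookups are slower than with an in-memory dict, which is the trade-off discussed in this thread.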

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a comment. Change 415771 merged by jenkins-bot: [pywikibot/core@master] weblinkchecker: use with-statement to acquire and release semaphore https://gerrit.wikimedia.org/r/415771
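
The general pattern named in the change title can be sketched as follows. This is an illustration of acquiring a semaphore with a with-statement, not the merged patch itself; `MAX_WORKERS`, `check_link`, and `free_slots` are hypothetical names.

```python
import threading

# Hypothetical limit; the real script's thread cap is not shown in this thread.
MAX_WORKERS = 2
slots = threading.BoundedSemaphore(MAX_WORKERS)


def check_link(url, fail=False):
    """Pretend to check one URL while holding a worker slot."""
    # The with-statement calls acquire() on entry and release() on exit,
    # including when the body raises, so a failing check cannot leak a slot.
    with slots:
        if fail:
            raise RuntimeError('simulated failure while checking ' + url)
        return url + ' ok'


def free_slots():
    """Count how many slots can be taken right now, then give them back."""
    taken = 0
    while slots.acquire(blocking=False):
        taken += 1
    for _ in range(taken):
        slots.release()
    return taken


if __name__ == '__main__':
    try:
        check_link('http://example.org', fail=True)
    except RuntimeError:
        pass
    print(free_slots())  # both slots are still free after the failure
```

With a manual acquire() and release() pair, an exception between the two calls would leave the slot taken forever, and once every slot is lost the remaining threads wait on the semaphore indefinitely.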

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4016283, @zhuyifei1999 wrote: Does the entire database have to persist in memory? If not I'll submit a patch to use dbm (py3 py2) instead. Yes it will be slower, but the gain is, the memory consumption can be much smaller and avoids OOM. Like a rabbit

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. ...which is not surprising: >>> len(bot.history.historyDict.keys()) 119193 >>> bot.history.historyDict.keys()[:10] [u'http://www.pref.akita.lg.jp/www/toppage/0/APM03000.html', u'http://www.worldwarships.com/class/samar-0',

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. Test code: diff --git a/scripts/weblinkchecker.py b/scripts/weblinkchecker.py index ce018a2..16f8f12 100755 --- a/scripts/weblinkchecker.py +++ b/scripts/weblinkchecker.py @@ -1009,6 +1009,18 @@ def main(*args): gen =

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4016169, @zhuyifei1999 wrote: Will now look into its referrers with guppy. Which sadly doesn't seem to support python3. (I think it's because C macro DL_EXPORT is gone). Gonna switch to python 2 to test this.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. (Tests below are done with the patch above applied) I forced garbage collection on each sleep with __import__('gc').collect(), but the memory usage kept increasing, so it is not an issue with garbage collection not running frequently enough. For simpler memory

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. I will now start doing memory profiling and see what is using so much memory.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a comment. Change 415771 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999): [pywikibot/core@master] weblinkchecker: use with-statement to acquire and release semaphore https://gerrit.wikimedia.org/r/415771
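As a rough illustration of what the patch title describes (not the content of the Gerrit change itself), a with-statement on a semaphore guarantees the slot is released even when the guarded code raises. The names `slots`, `check`, and `MAX_THREADS` below are hypothetical.

```python
import threading

MAX_THREADS = 2  # hypothetical limit on concurrent link checks
slots = threading.BoundedSemaphore(MAX_THREADS)
results = []


def check(url):
    # Entering the block acquires a slot; leaving it releases the slot,
    # including when the body raises an exception.
    with slots:
        if 'bad' in url:
            raise ValueError(url)
        results.append(url)


try:
    check('http://bad.example/')
except ValueError:
    pass
check('http://ok.example/')

# Both slots are free again, so two non-blocking acquires succeed.
free = [slots.acquire(blocking=False) for _ in range(MAX_THREADS)]
for _ in free:
    slots.release()
```

With a manual `acquire()` and `release()` pair, an exception between the two calls would leak the slot, and enough leaked slots would stall every later worker.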

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. The last traceback produced by the bot is curious: Exception in thread b'ALTAR Games - http://www.bistudio.com/index.php/czech/uvod/novinky/spolenost/190-bohemia-interactive-grows-in-strength': Traceback (most recent call last): File

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. Oh, and if you are using Linux, really don't do a hard shutdown. I crashed one of my SSDs once (I had to do some messy recovery, but it was successful) and learned it the hard way. The Linux kernel will process SysRq at #1 priority as long as the kernel is alive.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. I have not reproduced the problem of the entire OS freezing yet, but what I can confirm is that the memory consumption of the script increases non-stop. @Dvorapa: Does the freeze only happen after the script reaches nearly 100% RAM? What is the CPU usage during the

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment. In T185561#4015866, @zhuyifei1999 wrote: treat_page is sequential. No more than one extra page (not being processed by a thread) should be loaded at once. Right, thanks, never mind then.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4015808, @Dalba wrote: @Dvorapa, Could you try https://gerrit.wikimedia.org/r/#/c/415687/2/scripts/weblinkchecker.py and see if it helps? (It is not supposed to reduce the number of active threads, I'm hoping to control RAM usage by stopping page fetches

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment. @Dvorapa, Could you try https://gerrit.wikimedia.org/r/#/c/415687/2/scripts/weblinkchecker.py and see if it helps? (It is not supposed to reduce the number of active threads, I'm hoping to control RAM usage by stopping page fetches while threads are full)

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. @Dalba No, it was on the default value of 5 seconds, but -repeat should not make requests to the MW API at all, or am I wrong? UPDATE: I tried setting it to 0, but the result was the same.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment. @Dvorapa, have you set the retry_wait value to 0 in your user-config?

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. Here: http://mysharegadget.com/743636238 (stored for 33 days). Unzip into the deadlinks folder in the pwb root folder. PS 2: my user-config has default values: 50 links, 7 days, etc... PS 3: I tried different values of max_external_links; it always gave me a similar result.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. Could you post your deadlinks datafile somewhere? I'll try to reproduce it and attach a debugger to it.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. PS: Data were collected using a workaround to this issue: timeout --signal=SIGINT 20m python pwb.py weblinkchecker -lang:cs -ns:0 -start:"Last article from last run" (it took me more than 3 weeks to collect the data in these 20-minute blocks between start and

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. $ python pwb.py version Pywikibot: [https] r-pywikibot-core.git (6866469, g9129, 2018/03/01, 09:07:44, ok) Release version: 3.0-dev requests version: 2.18.4 cacerts: /etc/ssl/certs/ca-certificates.crt certificate test: ok Python: 3.6.4 (default, Jan 5 2018,

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. @Xqt Currently I'm AFK, I'll post the version result and the -repeat result (quite short, as you'll see) this evening.