[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of CPU usage

2018-03-01 Thread Dvorapa
Dvorapa added a comment.
Herald added a subscriber: Zoranzoki21.
This is even worse with -repeat, as weblinkchecker doesn't need to load page content from the wiki (which slows it down a bit, but in general helps to postpone link checking a little). So within a few seconds of running, the pool is exhausted really quickly (49 remaining threads) and the checking sometimes freezes at this point.

TASK DETAIL
https://phabricator.wikimedia.org/T185561

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Dvorapa
Cc: Zoranzoki21, zhuyifei1999, Aklapper, pywikibot-bugs-list, Dvorapa, Magul, Tbscho, rafidaslam, MayS, Mdupont, JJMC89, Avicennasis, jayvdb, Dalba, Masti, Alchimista, Rxy
___
pywikibot-bugs mailing list
pywikibot-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikibot-bugs


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment.
@Xqt Currently I'm AFK; I'll post the version result and the -repeat result (quite short, as you'll see) in the evening.


[Pywikipedia-bugs] [Maniphest] [Retitled] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa renamed this task from "weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of CPU usage" to "weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM".
Dvorapa updated the task description. (Show Details)
CHANGES TO TASK DESCRIPTION
...weblinkchecker.py, after some amount of time/pages read, begins to slow itself down (let's say 1 hour). After another amount of time (2 hours) it also slows the whole OS by increasing RAM usage up to 100% (4 hours). It finally stops at 100% RAM usage while processing some page (`>>> Some page <<<` is the last thing it outputs) and does nothing more (freezes). The whole OS (due to the RAM usage) gets slower and slower until it freezes too. If I interrupt it from the keyboard while RAM usage is below 100%, it usually says: `Waiting for remaining 49 threads to finish, please wait...`; at 100% RAM usage I cannot do anything other than hard-shutdown my PC (100% RAM usage makes the whole OS freeze).

Yes, I can set my OS not to reach 100% (it still slows the whole OS after a while). Yes, I can limit weblinkchecker (pywikibot) to use only some maximum amount of memory (the OS and RAM are then OK, but weblinkchecker.py still slows down and freezes itself). Yes, I can interrupt it after a while and restart it from the last article checked (currently the best solution, I think; I use `timeout --signal=SIGINT 20m python pwb.py weblinkchecker -ns:0 -start:"Last article processed in last run"` and hope the data will not be broken at the end). But I think there is some issue in weblinkchecker.py: maybe duplicate calling of threads or functions; or threads or functions not properly terminated after a success or failure; or maybe functions or threads exceed some limits they should not. And this issue makes weblinkchecker.py slowly grow until it reaches the cutoff and freezes.
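The "threads not properly terminated" hypothesis above can be checked from outside the script with the standard library; a minimal sketch (the helper names here are illustrative, not part of weblinkchecker) that watches the live thread count with threading.enumerate():

```python
import threading
import time

def spawn_worker(delay):
    """Start a daemon thread that just sleeps, standing in for one link-check thread."""
    t = threading.Thread(target=time.sleep, args=(delay,), daemon=True)
    t.start()
    return t

baseline = len(threading.enumerate())          # includes the main thread
workers = [spawn_worker(0.2) for _ in range(5)]
peak = len(threading.enumerate())              # all five workers are alive here
for t in workers:
    t.join()
time.sleep(0.05)
final = len(threading.enumerate())             # finished threads drop out of enumerate()

print(peak - baseline, final == baseline)
```

If the count returned to `baseline` keeps climbing run after run instead, threads are leaking.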


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of CPU usage

2018-03-01 Thread Xqt
Xqt added a comment.
Could you give the result of version, please?


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
Could you post your deadlinks datafile somewhere? I'll try to reproduce it and attach a debugger to it.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
Oh, and if you are using Linux, really don't hard-shutdown. I crashed one of my SSDs that way once (I had to do some messy recovery, but it was successful) and learned the lesson the hard way. The Linux kernel will process SysRq at top priority as long as the kernel is alive.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment.

In T185561#4015866, @zhuyifei1999 wrote:
treat_page is sequential. No more than one extra page (not being processed by a thread) should be loaded at once.


Right, thanks, never mind then.


[Pywikipedia-bugs] [Maniphest] [Claimed] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 claimed this task.
zhuyifei1999 added a comment.
The script has frozen (no new threads have been created in a few minutes, according to strace). Getting a Python traceback on all threads with gdb:
P6773 T185561 gdb-ing

Most threads are waiting for a semaphore.
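Besides gdb, an all-thread Python traceback can also be obtained from inside the process with the standard-library faulthandler module; a minimal self-contained sketch (the blocked worker here simulates a stuck link-check thread):

```python
import faulthandler
import tempfile
import threading
import time

# Start a worker that blocks indefinitely, like a stuck link-checking thread.
event = threading.Event()
worker = threading.Thread(target=event.wait, daemon=True)
worker.start()
time.sleep(0.1)  # let the worker reach event.wait()

# Dump the traceback of every live thread to a temporary file.
with tempfile.NamedTemporaryFile(mode="w+", suffix=".log") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    report = f.read()

event.set()  # unblock the worker
```

In a real session one would dump to stderr (the default) or register `faulthandler.register(signal.SIGUSR1)` so a stuck process can be inspected on demand.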


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment.
@Dvorapa, Could you try https://gerrit.wikimedia.org/r/#/c/415687/2/scripts/weblinkchecker.py and see if it helps?
(It is not supposed to reduce the number of active threads; I'm hoping to control RAM usage by stopping page fetches while the thread pool is full)
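The idea in that patch — pausing page fetches while all checker threads are busy — can be sketched with a semaphore sized to the pool. This is a simplified illustration, not weblinkchecker's actual code (names like check_link are made up):

```python
import threading
import time

MAX_THREADS = 3
slots = threading.BoundedSemaphore(MAX_THREADS)
results = []

def check_link(url):
    try:
        time.sleep(0.05)      # stand-in for the actual HTTP check
        results.append(url)
    finally:
        slots.release()       # always free the slot, even if the check raises

urls = ["http://example.org/%d" % i for i in range(10)]
threads = []
for url in urls:
    slots.acquire()           # the producer blocks here while the pool is full
    t = threading.Thread(target=check_link, args=(url,))
    t.start()
    threads.append(t)
for t in threads:
    t.join()

print(len(results))  # 10
```

Because the producer blocks in acquire(), at most MAX_THREADS pages' worth of work is in memory at once, which is exactly the RAM-bounding effect the patch aims for.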


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment.
@Dvorapa, have you set the retry_wait value to 0 in your user-config?


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment.
Here: http://mysharegadget.com/743636238 (stored for 33 days). Unzip it into the deadlinks folder in the pwb root folder.

PS 2: my user-config has the default values: 50 links, 7 days, etc.
PS 3: I tried different numbers for max_external_links; it always gave me a similar result.


[Pywikipedia-bugs] [Maniphest] [Updated] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a project: Patch-For-Review.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
I will now start memory profiling to see what is using so much memory.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment.
@Dalba No, it was at the default value of 5 seconds, but -repeat should not make requests to the MW API at all, or am I wrong?

UPDATE: I tried setting it to 0, but the result was the same.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
I have not reproduced the entire-OS freeze yet, but what I can confirm is that the memory consumption of the script is increasing non-stop.

@Dvorapa:


Does the freeze only happen after the script reaches near 100% RAM?
What is the CPU usage during the freeze? What is the state of the processes during the freeze?
I'm especially interested in the load (/proc/loadavg), the number of threads in D-state (uninterruptible sleep), and the I/O-wait CPU %.

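Those load figures can also be collected programmatically on Linux; a small sketch that parses the /proc/loadavg format (the sample string below is made up for illustration):

```python
def parse_loadavg(text):
    """Parse a /proc/loadavg line into (1min, 5min, 15min, running, total)."""
    one, five, fifteen, procs, _last_pid = text.split()
    running, total = (int(x) for x in procs.split("/"))
    return float(one), float(five), float(fifteen), running, total

# On a real system: text = open("/proc/loadavg").read()
sample = "3.52 2.10 1.07 2/845 12345"
one, five, fifteen, running, total = parse_loadavg(sample)
print(one, running, total)  # 3.52 2 845
```

A 1-minute load far above the CPU count, combined with high I/O-wait, would point at swap thrashing rather than CPU-bound work.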


[Pywikipedia-bugs] [Maniphest] [Edited] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa updated the task description. (Show Details)
CHANGES TO TASK DESCRIPTION
...Python 3.6.4, Pywikibot latest master commit, OS: Arch Linux; 80 Mbit/s connection; 1.9 GiB RAM (+ 4 GB of swap space)


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a comment.
Change 415771 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999):
[pywikibot/core@master] weblinkchecker: use with-statement to acquire and release semaphore

https://gerrit.wikimedia.org/r/415771


[Pywikipedia-bugs] [Maniphest] [Edited] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa updated the task description. (Show Details)
CHANGES TO TASK DESCRIPTION
...Python 3.6.4, Pywikibot latest master commit, OS: Arch Linux; 80 Mbit/s connection; 1.9 GiB RAM (+ 4 GiB of swap space)


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.

In T185561#4015808, @Dalba wrote:
@Dvorapa, Could you try https://gerrit.wikimedia.org/r/#/c/415687/2/scripts/weblinkchecker.py and see if it helps?
 (It is not supposed to reduce the number of active threads, I'm hoping to control RAM usage by stopping page fetches while threads are full)


treat_page is sequential. No more than one extra page (not being processed by a thread) should be loaded at once.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
The last traceback produced by the bot is curious:

Exception in thread b'ALTAR Games - http://www.bistudio.com/index.php/czech/uvod/novinky/spolenost/190-bohemia-interactive-grows-in-strength':
Traceback (most recent call last):
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
    response.begin()
  File "/usr/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/util/retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
    response.begin()
  File "/usr/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/home/zhuyifei1999/mw-dev/pywikibot-core/scripts/weblinkchecker.py", line 624, in run
    config.weblink_dead_days)
  File "/home/zhuyifei1999/mw-dev/pywikibot-core/scripts/weblinkchecker.py", line 726, in setLinkDead
    archiveURL = weblib.getWebCitationURL(url)
  File "/home/zhuyifei1999/mw-dev/pywikibot-core/pywikibot/tools/__init__.py", line 1399, in wrapper
    return obj(*args, **kwargs)
  File "/home/zhuyifei1999/mw-dev/pywikibot-core/pywikibot/weblib.py", line 88, in getWebCitationURL
    xmltext = http.fetch(uri).content
  File "/home/zhuyifei1999/mw-dev/pywikibot-core/pywikibot/comms/http.py", line 521, in fetch
    error_handling_callback(request)
  File "/home/zhuyifei1999/mw-dev/pywikibot-core/pywikibot/comms/http.py", line 408, in error_handling_callback
    raise request.data
  File "/home/zhuyifei1999/mw-dev/pywikibot-core/pywikibot/comms/http.py", line 387, in _http_process
    **http_request.kwargs)
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/home/zhuyifei1999/T185561test/.venv/lib/python3.5/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))

weblib.getWebCitationURL(url) received an error propagated from requests. The semaphore was not released.
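That is the classic semaphore-leak pattern: a thread acquires a slot, an exception escapes before release() is reached, and the slot is lost forever until the pool is exhausted. A with statement releases unconditionally. The sketch below illustrates the pattern in isolation (the job names are made up, not weblinkchecker's actual functions):

```python
import threading

slots = threading.BoundedSemaphore(2)

def leaky(job):
    slots.acquire()
    job()              # if this raises, release() below never runs: slot leaked
    slots.release()

def safe(job):
    with slots:        # released even when job() raises
        job()

def failing_job():
    raise RuntimeError("Remote end closed connection without response")

for _ in range(2):
    try:
        safe(failing_job)
    except RuntimeError:
        pass

# Both slots are still available: two non-blocking acquires succeed.
ok = slots.acquire(blocking=False) and slots.acquire(blocking=False)
print(ok)  # True
slots.release()
slots.release()
```

With `leaky` instead of `safe`, the same loop would leave zero free slots, matching the "49 remaining threads" symptom once 49 checks fail this way.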

[Pywikipedia-bugs] [Maniphest] [Commented On] T188512: Access denied for user 'nadwi'@'10.64.37.14'

2018-03-01 Thread Framawiki
Framawiki added a comment.
Are you on Toolforge servers? If yes, you can find information at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database, in particular that the credentials and user needed to connect to the database are available in the replica.my.cnf file.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment.
$ python pwb.py version
Pywikibot: [https] r-pywikibot-core.git (6866469, g9129, 2018/03/01, 09:07:44, ok)
Release version: 3.0-dev
requests version: 2.18.4
  cacerts: /etc/ssl/certs/ca-certificates.crt
certificate test: ok
Python: 3.6.4 (default, Jan  5 2018, 02:35:40) 
[GCC 7.2.1 20171224]
PYWIKIBOT2_DIR: Not set
PYWIKIBOT2_DIR_PWB: 
PYWIKIBOT2_NO_USER_CONFIG: Not set
Config base dir: /home/pavel/pywikibot
Usernames for family "wikipedia":
	cs: DvorapaBot (no sysop configured)

Script output (look for ### prefixed comments)

$ python pwb.py weblinkchecker -repeat
Retrieving 240 pages from wikipedia:cs.


>>> 'Ndrangheta <<<



### script works as expected usually until [[.lb]], [[.lt]] or [[.lu]]:



>>> .lt <<<
STDOUT:pywiki:

>>> {lightpurple}.lt{default} <<<
WARNING: Http response status 403
VERBOSE:pywiki:Working on '.lu'


>>> .lu <<<
WARNING:pywiki:Http response status 403
*[[.lt]] links to http://www.domreg.lt/en/index.html - 403.
STDOUT:pywiki:

>>> {lightpurple}.lu{default} <<<
INFO:pywiki:*[[.lt]] links to http://www.domreg.lt/en/index.html - 403.



### here it hangs/stops/freezes. I left it for 20 minutes, it does not continue
### Let's try Ctrl + C:



^C
KeyboardInterrupt during WeblinkCheckerRobot bot run...
INFO:pywiki:
KeyboardInterrupt during WeblinkCheckerRobot bot run...

55 pages read
0 pages written
INFO:pywiki:
55 pages read
0 pages written
Execution time: 532 seconds
INFO:pywiki:Execution time: 532 seconds
Read operation time: 9 seconds
INFO:pywiki:Read operation time: 9 seconds
Script terminated successfully.
INFO:pywiki:Script terminated successfully.
Waiting for remaining 49 threads to finish, please wait...
INFO:pywiki:Waiting for remaining 49 threads to finish, please wait...
Waiting for remaining 49 threads to finish, please wait...
INFO:pywiki:Waiting for remaining 49 threads to finish, please wait...
Waiting for remaining 49 threads to finish, please wait...
INFO:pywiki:Waiting for remaining 49 threads to finish, please wait...



### 30 x the same...



Waiting for remaining 49 threads to finish, please wait...
INFO:pywiki:Waiting for remaining 49 threads to finish, please wait...
Waiting for remaining 49 threads to finish, please wait...
INFO:pywiki:Waiting for remaining 49 threads to finish, please wait...
Waiting for remaining 49 threads to finish, please wait...
INFO:pywiki:Waiting for remaining 49 threads to finish, please wait...
Remaining 49 threads will be killed.
INFO:pywiki:Remaining 49 threads will be killed.
Saving history...
INFO:pywiki:Saving history...
VERBOSE:pywiki:Dropped throttle(s).
VERBOSE:pywiki:Closing network session.
VERBOSE:pywiki:Network session closed.
$


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment.
PS: The data were collected using a workaround for this issue: `timeout --signal=SIGINT 20m python pwb.py weblinkchecker -lang:cs -ns:0 -start:"Last article from last run"` (it took me more than 3 weeks to collect the data in these 20-minute blocks between start and the KeyboardInterrupt thrown by SIGINT). The same workaround should work for -repeat too, but this issue with 49 forever-open threads should be solved somehow. I can send the collected data if needed to identify the links which create never-ending threads.
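The timeout workaround relies on the script reacting to SIGINT by winding down instead of dying mid-write. The mechanism can be sketched in plain Python (the flag-based loop below is a simplified stand-in for pywikibot's bot run loop, not its real code):

```python
import signal

stop = False

def on_sigint(signum, frame):
    # Mimic pywikibot: set a flag so the run loop can finish the current page
    # and save its data cleanly instead of crashing mid-write.
    global stop
    stop = True

signal.signal(signal.SIGINT, on_sigint)

processed = []
for i, page in enumerate(["PageA", "PageB", "PageC"]):
    processed.append(page)
    if i == 0:
        # Deliver the same signal `timeout --signal=SIGINT` would send.
        signal.raise_signal(signal.SIGINT)
    if stop:
        break

print(processed)  # ['PageA']
```

`signal.raise_signal` needs Python 3.8+; with `os.kill(os.getpid(), signal.SIGINT)` the handler is equally invoked, just not necessarily before the very next statement.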


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
(Tests below are done with the patch above applied)

I forced garbage collection on each sleep with __import__('gc').collect(), but the memory usage kept increasing, so it is not an issue of garbage collection not running frequently enough.
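A quick way to distinguish "the collector just hasn't run yet" from a real leak is to compare per-type object counts across a forced collection; a small self-contained sketch (Node is a made-up stand-in for a suspect object):

```python
import gc

class Node:
    """Stand-in for a suspect object; the self-reference forms a cycle,
    so instances need the cyclic collector rather than refcounting."""
    def __init__(self):
        self.self_ref = self

def count_nodes():
    return sum(1 for o in gc.get_objects() if isinstance(o, Node))

for _ in range(100):
    Node()  # dropped immediately, but the cycles wait for gc

before = count_nodes()
gc.collect()
after = count_nodes()
print(before >= after, after)  # True 0  -- collect() reclaimed them: not a leak
```

If the count stayed high after collect(), something would still be holding references, which is the situation the profiling above suggests for weblinkchecker.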

For simpler memory profiling, I captured a few mem_top memory snapshots:


when the script just started, around ~280M used:


refs:
119192	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119192	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
982	 {10731520: , 10733056:  {'wmmx': , 'tl':  [{'prefix': 'acronym', 'url': 'https://www.acronymfinder.com/$1.html'}, {'prefix': 'advisory', 'loca
474	 {'pkg_resources.extern.six.moves':  ["Wrapper script to use Pywikibot in 'directory' mode.\n\nRun scripts using:\n\npython pwb.py  {'INADDR_BROADCAST': 4294967295, '__file__': '/usr/lib/python3.5/socket.py', 'HCI_TIME_STAMP': 3, 'S

bytes:
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248	 {'wmmx': , 'tl': , 10733056: 
11074	 
7362	 
4629	 
2281	 
1533	 
1521	 
1229	 
1190	 
1156	 


when the memory went up to ~350M:


refs:
119192	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119191	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
1830	 ['# -*- coding: utf-8 -*-\n', '"""Miscellaneous helper functions (not wiki-dependent)."""\n', '#\n',
1352	 ['"""Thread module emulating a subset of Java\'s threading model."""\n', '\n', 'import sys as _sys\n
1338	 ['"""HTTP/1.1 client library\n', '\n', '\n', '\n', '\n', 'H
1051	 ['#!/usr/bin/python\n', '# -*- coding: utf-8 -*-\n', '"""\n', 'This bot is used for checking externa
982	 {10731520: , 10733056:  ['from __future__ import absolute_import\n', 'import errno\n', 'import logging\n', 'import sys\n', '

bytes:
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
6291552	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
2097376	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
659504	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
49248	 {'wmmx': , 'tl': , 10733056: 
37071	 
12078	 
11072	 
4597	 
2339	 
1534	 
1521	 
1421	 
1283	 

When memory went up to ~480M, before I terminated it:

refs:
119192	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
119192	 {'http://www.gobiernoenlinea.ve/misc-view/index.pag': [('Venezuela', 1518455223.3850815, '404')], 'h
73266	 {'Diabolo', 'Storm Shadow', 'Kačka strakatá', 'Josef Tomsa', 'Fixace (populační genetika)', 'Theodor
73266	 ["'Ndrangheta", "'Patafyzika", '(15760) 1992 QB1', "(What's the Story) Morning Glory?", '+ (album)',
1830	 ['# -*- coding: utf-8 -*-\n', '"""Miscellaneous helper functions (not wiki-dependent)."""\n', '#\n',
1352	 ['"""Thread module emulating a subset of Java\'s threading model."""\n', '\n', 'import sys as _sys\n
1338	 ['"""HTTP/1.1 client library\n', '\n', '\n', '\n', '\n', 'H
1148	 ['# Wrapper module for _ssl, providing some additional facilities\n', '# implemented in Python.  Wri
1051	 ['#!/usr/bin/python\n', '# -*- coding: utf-8 -*-\n', '"""\n', 'This bot is used for checking externa
982	 {10731520: , 10733056: , 'tl': , 10733056: 
165069	 
66076	 
11081	 
4794	 
2419	 
2110	 
1551	 
1521	 
1497	 

The most significant thing from my first read is that one object type is growing insanely: from 1156 instances, to 12078 instances, to 66076 instances.

Will now look into its referrers with guppy.

TASK DETAIL: https://phabricator.wikimedia.org/T185561
EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/
To: zhuyifei1999
Cc: gerritbot, Dalba, Xqt, Zoranzoki21, zhuyifei1999, Aklapper, pywikibot-bugs-list, Dvorapa, Giuliamocci, Adrian1985, Cpaulf30, Baloch007, Darkminds3113, Lordiis, Adik2382, Th3d3v1ls, Ramalepe, Liugev6, Magul, Tbscho, rafidaslam, MayS,

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
Test code:

diff --git a/scripts/weblinkchecker.py b/scripts/weblinkchecker.py
index ce018a2..16f8f12 100755
--- a/scripts/weblinkchecker.py
+++ b/scripts/weblinkchecker.py
@@ -1009,6 +1009,18 @@ def main(*args):
         gen = pagegenerators.RedirectFilterPageGenerator(gen)
     bot = WeblinkCheckerRobot(gen, HTTPignore, config.weblink_dead_days)
     try:
+        import signal
+
+        def on_interactreq(signum, frame):
+            with bot.history.semaphore:
+                import code
+                from guppy import hpy
+                l = {'bot': bot, 'hpy': hpy}
+                l.update(globals())
+                l.update(locals())
+                code.interact(local=l)
+        signal.signal(signal.SIGUSR1, on_interactreq)
+
         bot.run()
     finally:
         waitTime = 0
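The hook in that patch can be exercised standalone. Below is a minimal, self-contained sketch of the same SIGUSR1-to-REPL pattern; the handler name follows the patch, but the namespace handling is simplified and illustrative:

```python
# Minimal sketch of the SIGUSR1 debugging hook from the patch above:
# running `kill -USR1 <pid>` drops the live process into an interactive console.
import code
import os
import signal

def on_interactreq(signum, frame):
    # Expose the interrupted frame's locals alongside the globals.
    ns = dict(globals())
    ns.update(frame.f_locals)
    code.interact(local=ns)

signal.signal(signal.SIGUSR1, on_interactreq)
print('inspect me with: kill -USR1 %d' % os.getpid())
```

Exiting the console (Ctrl+D) returns control to the interrupted program, which is what makes this usable on a long-running bot.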

>>> hpy().heap()
Partition of a set of 1792800 objects. Total size = 225903392 bytes.
 Index  Count   %      Size   %  Cumulative   %  Kind (class / dict of class)
     0 751854  42 123489224  55   123489224  55  unicode
     1 240731  13  33477528  15   156966752  69  list
     2 347177  19  28436720  13   185403472  82  tuple
     3   2943   0  15730536   7   201134008  89  dict (no owner)
     4 305817  17   7339608   3   208473616  92  float
     5  77525   4   6901360   3   215374976  95  str
     6  11373   1   1455744   1   216830720  96  types.CodeType
     7  11074   1   1328880   1   218159600  97  function
     8    401   0   1144472   1   219304072  97  dict of module
     9    995   0    897480   0   220201552  97  type
<417 more rows. Type e.g. '_.more' to view.>
>>> h = _
>>> _.more
 Index  Count   %    Size   %  Cumulative   %  Kind (class / dict of class)
    10    995   0  829640   0   221031192  98  dict of type
    11  24725   1  593400   0   221624592  98  int
    12    326   0  348560   0   221973152  98  dict of class
    13    281   0  294488   0   222267640  98  dict of cookielib.Cookie
    14     92   0  256160   0   222523800  99  dict of pkg_resources._vendor.pyparsing.Literal
    15    240   0  251520   0   222775320  99  dict of pywikibot.page.Link
    16    828   0  231840   0   223007160  99  dict of function
    17    758   0  212240   0   223219400  99  dict of pywikibot.site._IWEntry
    18    109   0  173368   0   223392768  99  dict of pkg_resources._vendor.pyparsing.And
    19   1707   0  150216   0   223542984  99  __builtin__.weakref
<407 more rows. Type e.g. '_.more' to view.>
>>> h[0].byid
Set of 751854 <unicode> objects. Total size = 123489224 bytes.
 Index     Size   %   Cumulative    %   Representation (limited)
     0   309552  0.3      309552  0.3   u'{{Souh...e 2017]]'
     1   158120  0.1      467672  0.4   u"==Vl\x...l|2011]]"
     2   142360  0.1      610032  0.5   u'{{Info...011bla]]'
     3    94040  0.1      704072  0.6   u"{{Kale...:Srpen]]"
     4    73872  0.1      777944  0.6   u'{{Info...vision]]'
     5    68184  0.1      846128  0.7   u'{{Souh...grafie]]'
     6    61456  0.0      907584  0.7   u'\nThis...tch.\n\n'
     7    59720  0.0      967304  0.8   u'GENERA...match.\n'
     8    54128  0.0     1021432  0.8   u'{{Info...u 2007]]'
     9    53520  0.0     1074952  0.9   u'[[Soub...fdroba]]'
<751844 more rows. Type e.g. '_.more' to view.>
>>> _.more
 Index     Size   %   Cumulative    %   Representation (limited)
    10    47736  0.0     1122688  0.9   u'{{Info...e 1989]]'
    11    47552  0.0     1170240  0.9   u'{{Souh...oprava]]'
    12    42344  0.0     1212584  1.0   u"{{Souh...oprava]]"
    13    40472  0.0     1253056  1.0   u'\'\'\'...eaving]]'
    14    39584  0.0     1292640  1.0   u'Jako \...egendy]]'
    15    35088  0.0     1327728  1.1   u'{{Info... filmy]]'
    16    32176  0.0     1359904  1.1   u"{{Souh...oprava]]"
    17    32088  0.0     1391992  1.1   u"'''32...summit]]"
    18    31648  0.0     1423640  1.2   u'{{Info...011bla]]'
    19    31472  0.0     1455112  1.2   u'{{Souh...grafie]]'
<751834 more rows. Type e.g. '_.more' to view.>
>>>

Guppy seems to guess that a majority of memory usage is wikitext.
pywikibot-bugs mailing list
pywikibot-bugs@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikibot-bugs


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.

In T185561#4016169, @zhuyifei1999 wrote:
Will now look into its referrers with guppy.


Which sadly doesn't seem to support Python 3 (I think it's because the C macro DL_EXPORT is gone). Gonna switch to Python 2 to test this.
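For anyone stuck on Python 3, the stdlib tracemalloc module gives comparable per-line allocation snapshots without guppy. A rough sketch; the list comprehension here is a stand-in allocation, not pywikibot code:

```python
# tracemalloc is stdlib since Python 3.4; it attributes live allocations
# to the source line that made them, similar in spirit to hpy().heap()
import tracemalloc

tracemalloc.start()
junk = ['x' * 100 for _ in range(1000)]  # stand-in for accumulating wikitext
snapshot = tracemalloc.take_snapshot()
top = snapshot.statistics('lineno')[:3]  # biggest allocation sites first
for stat in top:
    print(stat)
```

`snapshot.compare_to(earlier_snapshot, 'lineno')` is the usual way to spot a leak growing between two points in time.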


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
...which is not surprising:

>>> len(bot.history.historyDict.keys())
119193
>>> bot.history.historyDict.keys()[:10]
[u'http://www.pref.akita.lg.jp/www/toppage/0/APM03000.html', u'http://www.worldwarships.com/class/samar-0', u'http://www.fphil.uniba.sk/index.php?id=3294', u'http://www.hokej.cz/index.php?lng=CZ=clanek=52180', u'http://laser.zcu.cz/cz/opticke-vlastnosti/metody/snhrrt', u'http://www.3in.biz/pls/acm/load?p_name=T7010_2406446_941/Osobnosti_Stepan.html', u'http://www.lportala.net/players/137.html', u'http://www.johnbsebastian.com/bio.html', u'http://www.tanks-encyclopedia.com/OLD/ww2/soviet/soviet_T34-76.php', u'http://www.youngartistawards.org/noms23A.htm']
>>> bot.history.historyDict.values()[:10]
[[(u'Prefektura Akita', 1518027628.9147513, u'404'), (u'Prefektura Akita', 1518102371.6200852, u'404')], [(u'T\u0159\xedda Samar', 1518397748.7366626, u'508')], [(u'Alexander Avenarius', 1516462213.1042707, u'404'), (u'Alexander Avenarius', 1516717920.382955, u'404')], [(u'HC Ocel\xe1\u0159i T\u0159inec v \u010desk\xe9 hokejov\xe9 extralize 2010/2011', 1516698848.8886993, u'404'), (u'\u010cesk\xe1 hokejov\xe1 extraliga 2010/2011', 1518550830.2320974, u'404')], [(u'Emisivita', 1516623109.8608327, u'404'), (u'Absorpce z\xe1\u0159en\xed', 1516710675.2672064, u'404'), (u'Emisivita', 1517057700.7524137, u'404'), (u'Odrazivost', 1517932349.4918804, u'404')], [(u'Miroslav \u0160t\u011bp\xe1n', 1517744892.8110282, u'404')], [(u'Todor Jonov', 1518382319.6734998, u'404')], [(u'John Sebastian', 1517212814.478413, u'404')], [(u'BT (tank)', 1516501027.0819583, u'404'), (u'BT (tank)', 1516743253.580716, u'404'), (u'BT (tank)', 1516770024.0754175, u'404'), (u'T-34', 1518305440.8585296, u'404'), (u'T-34', 1518344610.1407645, u'404'), (u'T-34', 1518365360.8237803, u'404')], [(u'Tyler Posey', 1518394011.70306, u'404')]]

Does the entire database have to persist in memory? If not, I'll submit a patch to use dbm (py3, py2) instead. Yes, it will be slower, but the gain is that memory consumption can be much smaller, which avoids OOM.
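A minimal sketch of what a dbm-backed history could look like. The key/value shapes mirror historyDict above, but the JSON serialization and the temp path are assumptions for illustration, not the eventual patch:

```python
# Keep the link-failure history on disk in a dbm database instead of an
# in-memory dict; dbm stores only bytes, so the (page, timestamp, status)
# entries are serialized with JSON.
import dbm
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'deadlinks')

with dbm.open(path, 'c') as db:
    url = 'http://www.gobiernoenlinea.ve/misc-view/index.pag'
    entries = json.loads(db.get(url, b'[]'))        # read-modify-write one key
    entries.append(['Venezuela', 1518455223.3850815, '404'])
    db[url] = json.dumps(entries)

# Reopening shows the history survives without living in RAM
with dbm.open(path, 'r') as db:
    restored = json.loads(db[url])
```

Only the keys being touched are ever deserialized, which is exactly the trade mentioned: slower per-lookup, but the 119k-entry dict no longer has to sit in memory.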


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a comment.
Change 415771 merged by jenkins-bot:
[pywikibot/core@master] weblinkchecker: use with-statement to acquire and release semaphore

https://gerrit.wikimedia.org/r/415771


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.

In T185561#4016283, @zhuyifei1999 wrote:
Does the entire database have to persist in memory? If not, I'll submit a patch to use dbm (py3, py2) instead. Yes, it will be slower, but the gain is that memory consumption can be much smaller, which avoids OOM.


It's like a rabbit hole. I'll implement it, but the estimated time of code review ... forever.


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment.
I just realized there is the shelve module that wraps around dbm to make life easier, although in Python 2 keys can't be unicode. The good part is that it's already used by interwiki.py, so I might use it as a reference.
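A rough sketch of the shelve variant (Python 3 syntax; under Python 2 the keys would additionally need to be encoded byte strings, and shelve.open has no context-manager support there). The path and data are illustrative:

```python
# shelve pickles values transparently, so the (page, timestamp, status)
# tuples round-trip without manual serialization, unlike raw dbm.
import os
import shelve
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'history')

with shelve.open(path) as db:
    db['http://www.example.org/'] = [('Example', 1518455223.38, '404')]

with shelve.open(path) as db:
    restored = db['http://www.example.org/']
```

One caveat worth remembering: mutating a stored list in place is invisible to shelve unless it is opened with `writeback=True` or the value is reassigned to the key.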


[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment.

In T185561#4015912, @zhuyifei1999 wrote:
@Dvorapa:


Does the freeze only happen after the script reaches near 100% RAM?
What is the CPU usage during the freeze? What is the state of the processes during the freeze?
I'm especially interested in the load (/proc/loadavg), # of threads in D-state (uninterruptible sleep), and the IO-wait CPU %.




The script froze when it got to 49 threads open. The OS froze when the script had been running for more than a few hours. I don't know if this is usual behavior, but my PC freezes every time RAM reaches 100% (it did on Ubuntu quite a lot when I opened around 200 tabs in Chrome, and it does on Arch Linux now when I run weblinkchecker). The only option I know is a hard shutdown; neither switching to a terminal nor the magic key combinations work in this case. CPU usage when the script freezes is normal; when the OS freezes, I've never checked.
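For reference, the numbers asked about above can be grabbed with something like the following (assumes a procps-style `ps`; of course this only works while the box is still responsive):

```shell
# Load averages plus runnable/total task counts
cat /proc/loadavg
# Threads currently in D-state (uninterruptible sleep)
ps -eLo stat,pid,comm | awk '$1 ~ /^D/'
# Cumulative IO-wait time is the 6th field of the "cpu" line
awk '/^cpu /{print "iowait jiffies:", $6}' /proc/stat
```

Sampling `/proc/stat` twice and diffing the iowait field gives the IO-wait CPU % over that interval.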

I'll look into patches and test them tommorow. Thanks both Dalba and zhuyifei1999 for analysis and fixes. Is there something I can help with/test?


In T185561#4015920, @zhuyifei1999 wrote:
Oh, and if you are using Linux, really don't hard shutdown. I crashed one of my SSDs once (had to do some messy recovery, but it was successful) and learned it the hard way. The Linux kernel will process SysRq at #1 priority as long as the kernel is alive.
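Worth noting for the SysRq hint above: some distros ship with magic SysRq disabled or restricted, so it may need enabling first. A sketch using the standard sysctl paths (requires root):

```shell
# Enable all magic SysRq functions so Alt+SysRq+{R,E,I,S,U,B} works
# during a lockup (the classic emergency sync-unmount-reboot sequence)
echo 1 | sudo tee /proc/sys/kernel/sysrq
# Persist the setting across reboots
echo 'kernel.sysrq = 1' | sudo tee /etc/sysctl.d/99-sysrq.conf
```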

