[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of CPU usage

2018-03-01 Thread Dvorapa
Dvorapa added a comment. Herald added a subscriber: Zoranzoki21. This is even worse on -repeat, as weblinkchecker doesn't need to load page content from the wiki (which slows it down a bit, but in general helps to postpone link checking a little). So within a few seconds of running, the pool is

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. @Xqt I'm currently AFK; I'll post the version result and the -repeat result (quite short, as you'll see) this evening. TASK DETAIL https://phabricator.wikimedia.org/T185561 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Dvorapa Cc: Xqt,

[Pywikipedia-bugs] [Maniphest] [Retitled] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa renamed this task from "weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of CPU usage" to "weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM". Dvorapa updated the task description.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of CPU usage

2018-03-01 Thread Xqt
Xqt added a comment. Could you give the result of version please.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. Could you post your deadlinks datafile somewhere? I'll try to reproduce it and attach some debugger onto it.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. Oh, and if you are using Linux, really don't hard shutdown. I crashed one of my SSDs once (had to do some messy recovery, but it succeeded) and learned it the hard way. The Linux kernel will process SysRq at #1 priority as long as the kernel is alive.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment. In T185561#4015866, @zhuyifei1999 wrote: treat_page is sequential. No more than one extra page (not being processed by a thread) should be loaded at once. Right, thanks, nevermind then.

[Pywikipedia-bugs] [Maniphest] [Claimed] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 claimed this task. zhuyifei1999 added a comment. The script has frozen (no new threads have been created in a few minutes, according to strace). Getting a Python traceback on all threads with gdb: P6773 T185561 gdb-ing. Most threads are waiting for a semaphore.
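The comment above obtained the per-thread tracebacks by attaching gdb to the frozen process. As a rough in-process alternative (not what was used in the task), the following sketch collects the current stack of every live thread with `sys._current_frames()`. The helper name `format_all_thread_stacks` and the demo worker are hypothetical and only serve as illustration.

```python
import sys
import threading
import traceback


def format_all_thread_stacks():
    """Return a text report with the current stack of every live thread."""
    # Map thread identifiers to readable names (hypothetical helper).
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for ident, frame in sys._current_frames().items():
        lines.append('Thread %s (%s):' % (names.get(ident, '?'), ident))
        lines.extend(entry.rstrip('\n')
                     for entry in traceback.format_stack(frame))
    return '\n'.join(lines)


# Demo: one worker blocked on a semaphore that is never released in time.
sem = threading.Semaphore(0)
worker = threading.Thread(target=sem.acquire, name='blocked-worker')
worker.daemon = True
worker.start()

report = format_all_thread_stacks()
print(report)

# Let the worker finish so the demo exits cleanly.
sem.release()
worker.join()
```

Unlike gdb, this only works if the interpreter can still run Python code in some thread, so it may not help when the whole process is unresponsive.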

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment. @Dvorapa, could you try https://gerrit.wikimedia.org/r/#/c/415687/2/scripts/weblinkchecker.py and see if it helps? (It is not supposed to reduce the number of active threads; I'm hoping to control RAM usage by stopping page fetches while threads are full.)

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dalba
Dalba added a comment. @Dvorapa, have you set the retry_wait value to 0 in your user-config?

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. Here: http://mysharegadget.com/743636238 (stored for 33 days). Unzip into the deadlinks folder in the pwb root folder. PS 2: my user-config has default values: 50 links, 7 days, etc... PS 3: I tried different values of max_external_links; it always gave me a similar result.

[Pywikipedia-bugs] [Maniphest] [Updated] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a project: Patch-For-Review.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. I will now start doing memory profiling and see what is using so much memory.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. @Dalba No, it was at the default value of 5 seconds, but -repeat should not make requests to the MW API at all, or am I wrong? UPDATE: I tried setting it to 0, but the result was the same.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. I have not reproduced the problem of entire OS freeze yet, but what I can confirm is that the memory consumption of the script is increasing non-stop. @Dvorapa: Does the freeze only happen after the script reaches near 100% RAM? What is the CPU usage during the

[Pywikipedia-bugs] [Maniphest] [Edited] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION...Python 3.6.4, Pywikibot last master commit, OS: Arch Linux; 80 Mbit/s connection; 1,9 GiB RAM (+ 4 GB of swap space)

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a comment. Change 415771 had a related patch set uploaded (by Zhuyifei1999; owner: Zhuyifei1999): [pywikibot/core@master] weblinkchecker: use with-statement to acquire and release semaphore https://gerrit.wikimedia.org/r/415771
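The patch title above refers to acquiring and releasing a semaphore with a with-statement. The following is a minimal sketch of that general pattern, not the actual patch: the names `MAX_WORKERS`, `check_link` and `safe_check`, as well as the example URLs, are hypothetical. The point it illustrates is that the slot is released even when the worker raises, so a failing thread cannot leak a slot and leave other threads waiting forever.

```python
import threading

MAX_WORKERS = 3  # hypothetical limit on concurrent checks
slots = threading.BoundedSemaphore(MAX_WORKERS)
results = []
results_lock = threading.Lock()


def check_link(url):
    """Pretend to check a URL while holding one semaphore slot."""
    with slots:  # acquired here, released on exit even if an error is raised
        if 'bad' in url:
            raise ValueError(url)
        with results_lock:
            results.append(url)


def safe_check(url):
    """Thread target that swallows the simulated failure."""
    try:
        check_link(url)
    except ValueError:
        pass


urls = ['http://example.org/%d' % i for i in range(5)]
urls.append('http://example.org/bad')
threads = [threading.Thread(target=safe_check, args=(u,)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Count the free slots: all of them should be available again.
free_slots = 0
while slots.acquire(blocking=False):
    free_slots += 1
for _ in range(free_slots):
    slots.release()

print(free_slots)
```

With a manual `acquire()` and `release()` pair, an exception raised between the two calls would skip the release unless it is wrapped in `try`/`finally`; the with-statement does that wrapping implicitly.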

[Pywikipedia-bugs] [Maniphest] [Edited] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION...Python 3.6.4, Pywikibot last master commit, OS: Arch Linux; 80 Mbit/s connection; 1,9 GiB RAM (+ 4 GiB of swap space)

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4015808, @Dalba wrote: @Dvorapa, Could you try https://gerrit.wikimedia.org/r/#/c/415687/2/scripts/weblinkchecker.py and see if it helps? (It is not supposed to reduce the number of active threads, I'm hoping to control RAM usage by stopping page fetches

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. The last traceback produced by the bot is curious: Exception in thread b'ALTAR Games - http://www.bistudio.com/index.php/czech/uvod/novinky/spolenost/190-bohemia-interactive-grows-in-strength': Traceback (most recent call last): File

[Pywikipedia-bugs] [Maniphest] [Commented On] T188512: Access denied for user 'nadwi'@'10.64.37.14'

2018-03-01 Thread Framawiki
Framawiki added a comment. Are you on Toolforge servers? If yes, you can find information at https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database, in particular that the credentials and user needed to connect to the database are available in the replica.my.cnf file.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. $ python pwb.py version Pywikibot: [https] r-pywikibot-core.git (6866469, g9129, 2018/03/01, 09:07:44, ok) Release version: 3.0-dev requests version: 2.18.4 cacerts: /etc/ssl/certs/ca-certificates.crt certificate test: ok Python: 3.6.4 (default, Jan 5 2018,

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. PS: Data were collected using a workaround to this issue: timeout --signal=SIGINT 20m python pwb.py weblinkchecker -lang:cs -ns:0 -start:"Last article from last run" (it took me more than 3 weeks to collect the data in these 20-minute blocks between start and

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. (Tests below are done with the patch above applied) I forced garbage collection on each sleep with __import__('gc').collect(), but the memory usage kept increasing, so it is not an issue with garbage collection not running frequently enough. For simpler memory
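The reasoning above is that if memory still grows after a forced `gc.collect()`, the objects are still referenced somewhere rather than merely awaiting collection. A small sketch of that check follows. It uses `tracemalloc` as the measuring tool, which is an assumption made here and not necessarily what was used in the task; the `history` dict and `record` function are hypothetical stand-ins for a long-lived cache.

```python
import gc
import tracemalloc

history = {}  # hypothetical long-lived cache that keeps every entry


def record(n):
    """Add n new entries that stay referenced by the cache."""
    for _ in range(n):
        history['http://example.org/%d' % len(history)] = [0.0] * 10


tracemalloc.start()

record(1000)
gc.collect()  # force a full collection before measuring
first, _peak = tracemalloc.get_traced_memory()

record(1000)
gc.collect()
second, _peak = tracemalloc.get_traced_memory()

tracemalloc.stop()

# Memory grew although the collector ran: the data is still reachable.
print(first, second, second > first)
```

Growth that survives a forced collection points at a container that keeps accumulating entries, which is the kind of culprit a referrer analysis can then identify.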

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. Test code: diff --git a/scripts/weblinkchecker.py b/scripts/weblinkchecker.py index ce018a2..16f8f12 100755 --- a/scripts/weblinkchecker.py +++ b/scripts/weblinkchecker.py @@ -1009,6 +1009,18 @@ def main(*args): gen =

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4016169, @zhuyifei1999 wrote: Will now look into its referrers with guppy. Which sadly doesn't seem to support python3. (I think it's because C macro DL_EXPORT is gone). Gonna switch to python 2 to test this.

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. ...which is not surprising: >>> len(bot.history.historyDict.keys()) 119193 >>> bot.history.historyDict.keys()[:10] [u'http://www.pref.akita.lg.jp/www/toppage/0/APM03000.html', u'http://www.worldwarships.com/class/samar-0',

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread gerritbot
gerritbot added a comment. Change 415771 merged by jenkins-bot: [pywikibot/core@master] weblinkchecker: use with-statement to acquire and release semaphore https://gerrit.wikimedia.org/r/415771

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. In T185561#4016283, @zhuyifei1999 wrote: Does the entire database have to persist in memory? If not I'll submit a patch to use dbm (py3 py2) instead. Yes it will be slower, but the gain is, the memory consumption can be much smaller and avoids OOM. Like a rabbit

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread zhuyifei1999
zhuyifei1999 added a comment. I just realized there is the shelve module that wraps around dbm to make life easier, although in python 2 keys can’t be unicode. Good part is that it’s already used by interwiki.py so I might use it as a reference.
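To illustrate the idea of keeping the link history on disk instead of in one large in-memory dict, here is a minimal Python 3 sketch using `shelve`. It is not the eventual patch: the function `record_dead_link`, the file name `deadlinks`, and the stored `(page title, timestamp)` layout are assumptions chosen for the example.

```python
import os
import shelve
import tempfile


def record_dead_link(db, url, page_title, timestamp):
    """Append one (page title, timestamp) entry for a URL."""
    entries = db.get(url, [])
    entries.append((page_title, timestamp))
    db[url] = entries  # reassign so shelve writes the change to disk


# Hypothetical location of the on-disk history for this demo.
path = os.path.join(tempfile.mkdtemp(), 'deadlinks')

with shelve.open(path) as db:
    record_dead_link(db, 'http://example.org/a', 'Page A', 1519862400.0)
    record_dead_link(db, 'http://example.org/a', 'Page B', 1519948800.0)

# Reopen to show that the data persisted between sessions.
with shelve.open(path) as db:
    stored = db['http://example.org/a']
    count = len(db)

print(count, stored)
```

Only the entries being read or written are unpickled, so memory use stays small at the cost of slower lookups. The context-manager form of `shelve.open` requires Python 3.4 or later; as the comment notes, Python 2 would additionally need non-unicode keys.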

[Pywikipedia-bugs] [Maniphest] [Commented On] T185561: weblinkchecker.py slows down (itself, OS) to freeze after a while reaching 100% of RAM

2018-03-01 Thread Dvorapa
Dvorapa added a comment. In T185561#4015912, @zhuyifei1999 wrote: @Dvorapa: Does the freeze only happen after the script reaches near 100% RAM? What is the CPU usage during the freeze? What is the state of the processes during the freeze? I'm especially interested in the load (/proc/loadavg), #