Dvorapa created this task.
Dvorapa added projects: Pywikibot-core, Pywikibot-weblinkchecker.py, Pywikibot-network.
Herald added subscribers: pywikibot-bugs-list, Aklapper.

TASK DESCRIPTION

Steps to reproduce

  1. Run python pwb.py weblinkchecker -start:! -ns:0 on a large wiki (cswiki, with 400 000 articles, is large enough)

Expected behavior
weblinkchecker.py should run smoothly through all pages in the chosen namespace.

Current behavior
After some time (roughly 1 hour), weblinkchecker.py begins to slow down. After more time (roughly 2 hours), it also slows down the whole OS as CPU usage climbs, reaching 100% after roughly 4 hours. It finally stalls at 100% CPU usage while processing some page (>>> Some page <<< is the last thing it outputs) and does nothing more (it freezes). The whole OS keeps getting slower as CPU usage rises, until it freezes too. If I interrupt the script from the keyboard while CPU usage is below 100%, it usually says: Waiting for remaining 49 threads to finish, please wait...; at 100% CPU usage I cannot do anything other than hard-reset my PC (100% CPU usage makes the whole OS freeze).
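
One way to check whether threads really pile up during a long run would be to log the number of live threads from time to time. The following is only a minimal diagnostic sketch using the standard threading module; thread_snapshot is a hypothetical helper and is not part of Pywikibot or weblinkchecker.py.

```python
import threading
from collections import Counter


def thread_snapshot():
    """Return (number of live threads, Counter of live threads per class name).

    Hypothetical helper for diagnosis only; not part of Pywikibot.
    """
    threads = threading.enumerate()  # all threads that are currently alive
    return len(threads), Counter(type(t).__name__ for t in threads)


if __name__ == "__main__":
    total, by_type = thread_snapshot()
    print("live threads:", total, dict(by_type))
```

If the total kept rising over the course of the run instead of levelling off, that would point towards threads not being terminated.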

Yes, I can configure my OS so that it never reaches 100% CPU usage (the whole OS still slows down after a while). Yes, I can limit weblinkchecker (pywikibot) to at most 50% CPU usage (the OS and CPU are then fine, but weblinkchecker.py still slows down and freezes). Yes, I can interrupt it after a while and restart it from the last article checked (currently the best workaround, I think; I use timeout --signal=SIGINT 20m python pwb.py weblinkchecker -ns:0 -start:"Last article processed in last run" and hope the data is not corrupted at the end). But I think there is an issue in weblinkchecker.py itself: perhaps threads or functions are called in duplicate, or are not properly terminated after success or failure, or exceed some limit they should not. Such an issue would make weblinkchecker.py grow slowly until it reaches a cutoff and freezes.
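
The interrupt-and-restart workaround could be automated roughly as follows. This is a sketch under assumptions, not a tested tool: the helper names (last_processed_title, build_command, run_chunk) are hypothetical, and it assumes the page banner appears in the captured output exactly as >>> Some page <<< with no terminal colour codes around it.

```python
import re
import signal
import subprocess

# Matches page banners of the form ">>> Some page <<<", as quoted in this report.
# Assumption: the captured output contains no terminal colour codes.
TITLE_RE = re.compile(r"^>>> (.+?) <<<[ \t]*$", re.MULTILINE)


def last_processed_title(log_text):
    """Return the last page title announced in log_text, or None if there is none."""
    titles = TITLE_RE.findall(log_text)
    return titles[-1] if titles else None


def build_command(start_title):
    """Build the weblinkchecker command line used in this report."""
    return ["python", "pwb.py", "weblinkchecker", "-ns:0", "-start:" + start_title]


def run_chunk(start_title, seconds=20 * 60):
    """Run one time-limited chunk and return the title to resume from."""
    proc = subprocess.Popen(
        build_command(start_title),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # text mode; also works on Python 3.6
    )
    try:
        output, _ = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Same signal as `timeout --signal=SIGINT`, so the bot can shut down cleanly.
        proc.send_signal(signal.SIGINT)
        # Note: this waits without a limit while the remaining threads finish.
        output, _ = proc.communicate()
    return last_processed_title(output) or start_title
```

Calling run_chunk("!") and then feeding each returned title into the next call would repeat the manual procedure. The last page of one chunk is checked again at the start of the next, and a real loop would also need to stop once a chunk finishes before its time limit.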

Configuration
Python 3.6.4, Pywikibot at the latest master commit, OS: Arch Linux


TASK DETAIL
https://phabricator.wikimedia.org/T185561

To: Dvorapa
Cc: Aklapper, pywikibot-bugs-list, Dvorapa, Magul, Tbscho, rafidaslam, MayS, Mdupont, JJMC89, Avicennasis, jayvdb, Dalba, Masti, Alchimista, Rxy