[Pywikipedia-bugs] [Maniphest] [Commented On] T183016: Create a "Play with Pywikibot" set of tasks for GCI

2017-12-24 Thread eflyjason
eflyjason added a comment. In T183016#3847447, @jayvdb wrote: "I suggest using https://www.mediawiki.org/wiki/Manual:Pywikibot/PAWS for this. Less complicated." I always got either "500 : Internal Server Error. Failed to start your server. Please contact admin." or "504 Gateway Time-out nginx/1.13.6".

[Pywikipedia-bugs] [Maniphest] [Commented On] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread eflyjason
eflyjason added a comment. One implementation is to lock whichever file is being downloaded to, and exit with an error if the file about to be written is locked. Another is to avoid identical filenames entirely. Based on https://stackoverflow.com/q/489861/2603230, it seems that locking file

[Pywikipedia-bugs] [Maniphest] [Commented On] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread zhuyifei1999
zhuyifei1999 added a comment. Thanks for proofreading though :) TASK DETAIL: https://phabricator.wikimedia.org/T183675 EMAIL PREFERENCES: https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: zhuyifei1999 Cc: Framawiki, Aklapper, Xqt, jayvdb, siebrand, Zoranzoki21, eflyjason,

[Pywikipedia-bugs] [Maniphest] [Commented On] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread eflyjason
eflyjason added a comment. In T183675#3860177, @zhuyifei1999 wrote: (clarification: by iff I meant if and only if) Sorry

[Pywikipedia-bugs] [Maniphest] [Edited] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread eflyjason
eflyjason updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION ... 1. Make sure the target file is //always// a complete file; commit if and only if the file is completely downloaded (and verified with checksum), or discard if anything goes wrong. One implementation is to
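The commit-or-discard requirement above can be sketched as follows. This is a minimal illustration, not the code in `download_dump.py`: the function name `atomic_write` and its parameters are hypothetical, and it assumes Python 3 (`os.replace`) with the temporary file created in the same directory as the target so the final rename stays on one filesystem.

```python
import hashlib
import os
import tempfile


def atomic_write(target_path, chunks, expected_md5=None):
    """Write chunks to a temporary file, then rename it over target_path.

    The target is replaced only after every chunk has been written and,
    if expected_md5 is given, the MD5 digest matches. On any failure the
    temporary file is removed and the target is left untouched.
    (Hypothetical helper, for illustration only.)
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    # Unique temporary name in the target directory: two concurrent runs
    # never write to the same partial file.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix='.part')
    digest = hashlib.md5()
    try:
        with os.fdopen(fd, 'wb') as temp_file:
            for chunk in chunks:
                temp_file.write(chunk)
                digest.update(chunk)
        if expected_md5 is not None and digest.hexdigest() != expected_md5:
            raise ValueError('checksum mismatch')
        # Commit: the rename either happens completely or not at all.
        os.replace(temp_path, target_path)
    except BaseException:
        # Discard the partial download.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

Because each run uses its own temporary name, this variant also sidesteps the shared-filename problem without needing file locks.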

[Pywikipedia-bugs] [Maniphest] [Commented On] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread zhuyifei1999
zhuyifei1999 added a comment. (clarification: by iff I meant if and only if)

[Pywikipedia-bugs] [Maniphest] [Edited] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread eflyjason
eflyjason updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION ... 1. Make sure the target file is //always// a complete file; commit iff the file is completely downloaded (and verified with checksum), or discard if anything goes wrong. Consider a scenario where on a

[Pywikipedia-bugs] [Maniphest] [Edited] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread zhuyifei1999
zhuyifei1999 updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: The meaning of [[https://en.wiktionary.org/wiki/atomic#Adjective|atomic]] here: Pywikibot is a Python-based framework to write bots for MediaWiki ([more

[Pywikipedia-bugs] [Maniphest] [Created] T183675: download_dump.py: Make download process atomic

2017-12-24 Thread zhuyifei1999
zhuyifei1999 created this task. zhuyifei1999 triaged this task as "Normal" priority. zhuyifei1999 added projects: Google-Code-in-2017, Pywikibot-core, Pywikibot-Other-scripts. TASK DESCRIPTION: The meaning of atomic here: (computing) Of an operation: guaranteed to complete either fully or not at all

[Pywikipedia-bugs] [Maniphest] [Edited] T183666: download_dump.py: Use response.iter_content

2017-12-24 Thread eflyjason
eflyjason updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION ... Thanks to work in Google Code-in, Pywikibot now has a script called `download_dump.py`. It downloads a Wikimedia database dump from http://dumps.wikimedia.org/, and places the dump in a predictable directory for

[Pywikipedia-bugs] [Maniphest] [Commented On] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread eflyjason
eflyjason added a comment. Adding parent task T183666: download_dump.py: Use response.iter_content, as we cannot implement a progress bar without using iter_content.
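The dependency on chunked reading can be illustrated with a short sketch. The helper name `save_with_progress` and its parameters are hypothetical; it only assumes that the download is available as an iterable of byte chunks, which with the `requests` library would come from `response.iter_content(chunk_size=...)` on a response obtained with `stream=True`.

```python
import io
import sys


def save_with_progress(chunks, out_file, total_bytes=None, report=None):
    """Write an iterable of byte chunks to out_file, reporting progress.

    Returns the number of bytes written. (Hypothetical helper, for
    illustration only.)
    """
    if report is None:
        # Default: rewrite a single status line on stderr.
        def report(text):
            sys.stderr.write('\r' + text)
    written = 0
    for chunk in chunks:
        out_file.write(chunk)
        written += len(chunk)
        if total_bytes:
            report('{:.0%} ({} of {} bytes)'.format(
                written / total_bytes, written, total_bytes))
        else:
            # Total size unknown (e.g. no Content-Length header).
            report('{} bytes'.format(written))
    return written


# Possible use with requests (not executed here; url is a placeholder):
#   response = requests.get(url, stream=True)
#   total = int(response.headers.get('Content-Length', 0))
#   with open(path, 'wb') as f:
#       save_with_progress(response.iter_content(chunk_size=8192), f, total)
```

Without chunked iteration the whole body arrives in one piece, so there is no intermediate point at which progress could be reported.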

[Pywikipedia-bugs] [Maniphest] [Updated] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread eflyjason
eflyjason added a parent task: T183666: download_dump.py: Use response.iter_content.

[Pywikipedia-bugs] [Maniphest] [Updated] T183666: download_dump.py: Use response.iter_content

2017-12-24 Thread eflyjason
eflyjason added a subtask: T183664: download_dump.py: Add a progress bar.

[Pywikipedia-bugs] [Maniphest] [Commented On] T183670: download_dump.py: Verify the file using the checksum

2017-12-24 Thread eflyjason
eflyjason added a comment. It seems that a file like enwiki-latest-abstract.xml does not have an MD5 checksum in https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-md5sums.txt. How can we check those files?
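One way to make the missing-entry case explicit is to parse the checksum list into a mapping and treat an absent file name as "no checksum available". The sketch below assumes the list uses the usual `md5sum` layout of one `<digest> <filename>` pair per line; `parse_md5sums` is a hypothetical helper, and what to do with files that have no entry (skip verification, warn, or refuse) remains the open question raised in the comment.

```python
def parse_md5sums(text):
    """Map file names to MD5 hex digests from md5sum-style text.

    Lines that do not consist of exactly two whitespace-separated
    fields are ignored. (Hypothetical helper; the line format is an
    assumption.)
    """
    checksums = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            digest, filename = parts
            checksums[filename] = digest
    return checksums


# Made-up sample data, not real dump checksums.
sample = (
    'd41d8cd98f00b204e9800998ecf8427e  examplewiki-latest-pages.xml.bz2\n'
    '0cc175b9c0f1b6a831c399e269772661  examplewiki-latest-stubs.xml.gz\n'
)
known = parse_md5sums(sample)
# .get() returns None for a file that has no entry in the list.
missing = known.get('examplewiki-latest-abstract.xml')
```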

[Pywikipedia-bugs] [Maniphest] [Commented On] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread rafidaslam
rafidaslam added a comment. @eflyjason yup

[Pywikipedia-bugs] [Maniphest] [Commented On] T183667: download_dump.py: Handle cases when the dump file already exists

2017-12-24 Thread eflyjason
eflyjason added a comment. Would the timezone settings on the bot user's computer be a problem for this?

[Pywikipedia-bugs] [Maniphest] [Commented On] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread eflyjason
eflyjason added a comment. If we are to use an external library, does that mean we have to add this library to requirements.txt?

[Pywikipedia-bugs] [Maniphest] [Commented On] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread rafidaslam
rafidaslam added a comment. There's also a library for making progress bars and some other command-line related things in Python: https://github.com/kennethreitz/clint

[Pywikipedia-bugs] [Maniphest] [Updated] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread eflyjason
eflyjason added a comment. I would suggest something like https://stackoverflow.com/a/15645088/2603230. (See also: https://github.com/wikimedia/pywikibot/blob/master/pywikibot/page.py#L2686-L2693) It would also solve T183666: download_dump.py: Use response.iter_content

[Pywikipedia-bugs] [Maniphest] [Changed Project Column] T183671: download_dump.py: Create tests

2017-12-24 Thread Aklapper
Aklapper moved this task from Proposed tasks to Information needed on the Google-Code-in-2017 board. Aklapper added a comment. This task needs more information / links for contributors and a description of the extent that is expected.

[Pywikipedia-bugs] [Maniphest] [Changed Project Column] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread Aklapper
Aklapper moved this task from Proposed tasks to Information needed on the Google-Code-in-2017 board. Aklapper added a comment. Is there any existing implementation of a progress bar? Is some library recommended? I'd like to avoid reinventing the wheel in this task.

[Pywikipedia-bugs] [Maniphest] [Edited] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread Aklapper
Aklapper updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: Pywikibot is a Python-based framework to write bots for MediaWiki ([more information](https://www.mediawiki.org/wiki/Manual:Pywikibot)). Thanks to work in Google Code-in, Pywikibot now has a script called

[Pywikipedia-bugs] [Maniphest] [Retitled] T183667: download_dump.py: Handle cases when the dump file already exists

2017-12-24 Thread Aklapper
Aklapper renamed this task from "download_dump.py: If the file already exists" to "download_dump.py: Handle cases when the dump file already exists". Aklapper updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: Pywikibot is a Python-based framework to write bots for MediaWiki

[Pywikipedia-bugs] [Maniphest] [Edited] T183666: download_dump.py: Use response.iter_content

2017-12-24 Thread Aklapper
Aklapper updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: Per @zhuyifei1999 Pywikibot is a Python-based framework to write bots for MediaWiki ([more information](https://www.mediawiki.org/wiki/Manual:Pywikibot)). Thanks to work in Google Code-in, Pywikibot now has a script

[Pywikipedia-bugs] [Maniphest] [Edited] T183668: download_dump.py: Use symlink instead of a copy for toolforge users

2017-12-24 Thread Aklapper
Aklapper updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: Pywikibot is a Python-based framework to write bots for MediaWiki ([more information](https://www.mediawiki.org/wiki/Manual:Pywikibot)). Thanks to work in Google Code-in, Pywikibot now has a script called

[Pywikipedia-bugs] [Maniphest] [Edited] T183670: download_dump.py: Verify the file using the checksum

2017-12-24 Thread Aklapper
Aklapper updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: Pywikibot is a Python-based framework to write bots for MediaWiki ([more information](https://www.mediawiki.org/wiki/Manual:Pywikibot)). Thanks to work in Google Code-in, Pywikibot now has a script called

[Pywikipedia-bugs] [Maniphest] [Updated] T183671: download_dump.py: Create tests

2017-12-24 Thread Framawiki
Framawiki added a comment. See also T123884: Create test to download dump and simulate reflinks parsing all pages

[Pywikipedia-bugs] [Maniphest] [Created] T183671: download_dump.py: Create tests

2017-12-24 Thread Framawiki
Framawiki created this task. Framawiki triaged this task as "Normal" priority. Framawiki added projects: Google-Code-in-2017, Pywikibot-core, Pywikibot-Other-scripts, Pywikibot-tests.

[Pywikipedia-bugs] [Maniphest] [Unblock] T123884: Create test to download dump and simulate reflinks parsing all pages

2017-12-24 Thread Framawiki
Framawiki closed subtask T123885: Create a Python Pywikibot script to download Wikimedia database dump as "Resolved".

[Pywikipedia-bugs] [Maniphest] [Closed] T123885: Create a Python Pywikibot script to download Wikimedia database dump

2017-12-24 Thread Framawiki
Framawiki assigned this task to eflyjason. Framawiki closed this task as "Resolved".

[Pywikipedia-bugs] [Maniphest] [Edited] T183670: download_dump.py: Verify the file using the checksum

2017-12-24 Thread Framawiki
Framawiki updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: We should check that the file is not corrupted: compare the downloaded file's MD5 with the expected one, and if they differ, delete the corrupted file and retry, with a `maxretries` parameter
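The verify, delete, and retry loop described above might look like the following sketch. Only the `maxretries` parameter name comes from the task description; `md5_of_file`, `download_verified`, and the `fetch` callback are hypothetical names, and the callback is assumed to download the dump to the given path.

```python
import hashlib
import os


def md5_of_file(path, block_size=1 << 20):
    """Return the MD5 hex digest of a file, read in fixed-size blocks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(block_size), b''):
            digest.update(block)
    return digest.hexdigest()


def download_verified(fetch, path, expected_md5, maxretries=3):
    """Call fetch(path) until the file's MD5 matches expected_md5.

    Returns the number of attempts used. A corrupted file is deleted
    before the next attempt; IOError is raised once maxretries attempts
    have failed. (Hypothetical helper, for illustration only.)
    """
    for attempt in range(1, maxretries + 1):
        fetch(path)
        if md5_of_file(path) == expected_md5:
            return attempt
        os.remove(path)  # never leave a corrupted file behind
    raise IOError('checksum mismatch after {} attempts'.format(maxretries))
```

Reading in blocks keeps memory use flat, which matters for dump files that can be many gigabytes in size.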

[Pywikipedia-bugs] [Maniphest] [Created] T183670: download_dump.py: Verify the file using the checksum

2017-12-24 Thread Framawiki
Framawiki created this task. Framawiki triaged this task as "Normal" priority. Framawiki added projects: Google-Code-in-2017, Pywikibot-core, Pywikibot-Other-scripts. TASK DESCRIPTION: We should check that the file is not corrupted: compare the downloaded file's MD5 with the expected one, retry if it fails,

[Pywikipedia-bugs] [Maniphest] [Created] T183668: download_dump.py: Use symlink instead of a copy for toolforge users

2017-12-24 Thread Framawiki
Framawiki created this task. Framawiki triaged this task as "Normal" priority. Framawiki added projects: Google-Code-in-2017, Pywikibot-core, Pywikibot-Other-scripts. TASK DESCRIPTION: https://gerrit.wikimedia.org/r/#/c/399179/7/scripts/maintenance/download_dump.py@71 We shouldn't use copyfile for
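Going by the task title, the idea is to link to a dump that is already readable locally instead of duplicating it. A minimal sketch of that approach, assuming the source path exists on the local filesystem, is shown below; `link_or_copy` is a hypothetical name, and the fallback to `shutil.copyfile` for platforms or filesystems that do not permit symlinks is an assumption rather than something stated in the task.

```python
import os
import shutil


def link_or_copy(source, target):
    """Symlink target to source if possible, otherwise copy the file.

    Returns 'symlink' or 'copy' to tell the caller which was used.
    An existing file or link at target is removed first.
    (Hypothetical helper, for illustration only.)
    """
    if os.path.lexists(target):
        os.remove(target)
    try:
        os.symlink(source, target)
        return 'symlink'
    except (OSError, NotImplementedError, AttributeError):
        # Symlinks unavailable or not permitted: fall back to a copy.
        shutil.copyfile(source, target)
        return 'copy'
```

A symlink costs no extra disk space, whereas a copy stays usable if the original dump is later rotated away; which trade-off is preferable depends on the environment.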

[Pywikipedia-bugs] [Maniphest] [Updated] T183666: download_dump.py: Use response.iter_content

2017-12-24 Thread Framawiki
Framawiki added projects: Pywikibot-core, Pywikibot-Other-scripts. Framawiki added a subscriber: eflyjason. Herald added a subscriber: pywikibot-bugs-list.

[Pywikipedia-bugs] [Maniphest] [Updated] T183667: download_dump.py: If the file already exists

2017-12-24 Thread Framawiki
Framawiki added projects: Pywikibot-core, Pywikibot-Other-scripts. Framawiki added a subscriber: eflyjason. Herald added a subscriber: pywikibot-bugs-list.

[Pywikipedia-bugs] [Maniphest] [Updated] T183664: download_dump.py: Add a progress bar

2017-12-24 Thread Framawiki
Framawiki added projects: Pywikibot-core, Pywikibot-Other-scripts. Framawiki added a subscriber: eflyjason. Herald added a subscriber: pywikibot-bugs-list.

[Pywikipedia-bugs] [Maniphest] [Updated] T183663: Improve the maintenance script that download Wikimedia database dump

2017-12-24 Thread Framawiki
Framawiki added a subscriber: eflyjason. Framawiki added projects: Pywikibot-core, Pywikibot-Other-scripts. Herald added a subscriber: pywikibot-bugs-list.

[Pywikipedia-bugs] [Maniphest] [Commented On] T123885: Create a Python Pywikibot script to download Wikimedia database dump

2017-12-24 Thread gerritbot
gerritbot added a comment. Change 399179 merged by jenkins-bot: [pywikibot/core@master] Create a maintenance script to download Wikimedia database dump https://gerrit.wikimedia.org/r/399179

[Pywikipedia-bugs] [Maniphest] [Commented On] T183085: [RfC] Drop compat module

2017-12-24 Thread Multichill
Multichill added a comment. Gain: Some code clean-up. Pain: A lot of our long-time users will have broken scripts they have to update. If I look at it like that, it's not worth it. Please explain why this is needed now.

[Pywikipedia-bugs] [Maniphest] [Edited] T130523: Convert imagecopy.py to requests

2017-12-24 Thread eflyjason
eflyjason updated the task description. (Show Details) CHANGES TO TASK DESCRIPTION: The file `imagecopy.py` uses `urllib/urllib2.urlopen`. It should use [`requests`](https://www.mediawiki.org/wiki/Manual:Pywikibot/Installation#Install_dependencies). **Details: T68102**

[Pywikipedia-bugs] [Maniphest] [Updated] T130523: Convert imagecopy.py to requests

2017-12-24 Thread Dvorapa
Dvorapa added a comment. @Aklapper Generally this is just a subtask of T68102. Everything is in the description of T68102; I would only be duplicating its description here, which does not seem necessary to me.

[Pywikipedia-bugs] [Maniphest] [Commented On] T183085: [RfC] Drop compat module

2017-12-24 Thread Fae
Fae added a comment. Unless there is an immediate issue, please keep this minor backwards compatibility. Most of my stuff is now in Core, but I have a library of handy Compat things that, once broken, are unlikely to be revisited. Deliberately breaking them does not make much sense.

[Pywikipedia-bugs] [Maniphest] [Commented On] T183085: [RfC] Drop compat module

2017-12-24 Thread fantasticfears
fantasticfears added a comment. Would be good to start using SemVer.

[Pywikipedia-bugs] [Maniphest] [Commented On] T135339: Category add should categorize inside noinclude tags and /doc subpages of templates

2017-12-24 Thread eflyjason
eflyjason added a comment. I guess there should be a simpler solution (without having to modify replaceCategoryInPlace in textlib)? Also, it still cannot add a category to a template documentation page that did not originally have one.

[Pywikipedia-bugs] [Maniphest] [Updated] T135339: Category add should categorize inside noinclude tags and /doc subpages of templates

2017-12-24 Thread gerritbot
gerritbot added a project: Patch-For-Review.

[Pywikipedia-bugs] [Maniphest] [Commented On] T135339: Category add should categorize inside noinclude tags and /doc subpages of templates

2017-12-24 Thread gerritbot
gerritbot added a comment. Change 400090 had a related patch set uploaded (by Eflyjason; owner: Eflyjason): [pywikibot/core@master] Allow category-add script to categorize inside /doc subpages of templates https://gerrit.wikimedia.org/r/400090