[MediaWiki-commits] [Gerrit] interwiki.py: check for category before emptiness - change (pywikibot/core)
jenkins-bot has submitted this change and it was merged.

Change subject: interwiki.py: check for category before emptiness
......................................................................

interwiki.py: check for category before emptiness

Currently, page.isEmpty() requires parsing the whole page and doing some
removals (langlinks and categories) which takes up significant CPU time.
There are 2 checks which use page.isEmpty() as the starting condition
while having a much simpler second condition of just checking the page
namespace. In this patch, I reversed the checks order.

For categories, the time taken in batchLoaded() is reduced to about 30%
of the original time.

Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
(cherry picked from commit 330ebe54492c9cf7c787a3c272872d778ba34fab)
---
M scripts/interwiki.py
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  John Vandenberg: Looks good to me, approved
  Malafaya: Looks good to me, but someone else must approve
  jenkins-bot: Verified

diff --git a/scripts/interwiki.py b/scripts/interwiki.py
index fa26860..ca83ff4 100755
--- a/scripts/interwiki.py
+++ b/scripts/interwiki.py
@@ -1364,7 +1364,7 @@
 # must be behind the page.isRedirectPage() part
 # otherwise a redirect error would be raised
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty. Skipping." % page)
@@ -1449,7 +1449,7 @@
         pywikibot.output(u'File autonomous_problems.dat open or corrupted! Try again with -restore.')
         sys.exit()
     iw = ()
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty; ignoring it and its interwiki links"

--
To view, visit https://gerrit.wikimedia.org/r/243049
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: 2.0
Gerrit-Owner: John Vandenberg
Gerrit-Reviewer: John Vandenberg
Gerrit-Reviewer: Ladsgroup
Gerrit-Reviewer: Malafaya
Gerrit-Reviewer: jenkins-bot

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
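
The reordering described in the commit message relies on Python's short-circuit evaluation of `and`: when the left operand is false, the right operand is never evaluated. The following is a minimal sketch of that effect, not code from pywikibot. `FakePage`, `skip_old`, `skip_new`, and the `empty_checks` counter are hypothetical names introduced for illustration, and the `isEmpty()` body is only a stand-in for the real, more expensive check.

```python
class FakePage:
    """Hypothetical stand-in for a page object; not the pywikibot API."""

    def __init__(self, namespace, text):
        self.namespace = namespace
        self.text = text
        self.empty_checks = 0  # how many times the costly check has run

    def isCategory(self):
        # Cheap: one integer comparison (14 is MediaWiki's Category namespace).
        return self.namespace == 14

    def isEmpty(self):
        # Stand-in for the expensive parse-and-strip work; only counts calls.
        self.empty_checks += 1
        return len(self.text.strip()) < 4


def skip_old(page):
    # Original order: the costly check always runs first.
    return page.isEmpty() and not page.isCategory()


def skip_new(page):
    # Patched order: for categories, `and` short-circuits before isEmpty().
    return not page.isCategory() and page.isEmpty()


category = FakePage(namespace=14, text="")
skip_old(category)
old_calls = category.empty_checks

category.empty_checks = 0
skip_new(category)
new_calls = category.empty_checks

print("isEmpty() calls for a category page:", old_calls, "before,", new_calls, "after")
```

Both orderings return the same truth value for every page, so the patch changes only how often the expensive check runs, not which pages are skipped.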
[MediaWiki-commits] [Gerrit] interwiki.py: check for category before emptiness - change (pywikibot/core)
John Vandenberg has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/243049

Change subject: interwiki.py: check for category before emptiness
......................................................................

interwiki.py: check for category before emptiness

Currently, page.isEmpty() requires parsing the whole page and doing some
removals (langlinks and categories) which takes up significant CPU time.
There are 2 checks which use page.isEmpty() as the starting condition
while having a much simpler second condition of just checking the page
namespace. In this patch, I reversed the checks order.

For categories, the time taken in batchLoaded() is reduced to about 30%
of the original time.

Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
(cherry picked from commit 330ebe54492c9cf7c787a3c272872d778ba34fab)
---
M scripts/interwiki.py
1 file changed, 2 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/pywikibot/core refs/changes/49/243049/1

diff --git a/scripts/interwiki.py b/scripts/interwiki.py
index fa26860..ca83ff4 100755
--- a/scripts/interwiki.py
+++ b/scripts/interwiki.py
@@ -1364,7 +1364,7 @@
 # must be behind the page.isRedirectPage() part
 # otherwise a redirect error would be raised
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty. Skipping." % page)
@@ -1449,7 +1449,7 @@
         pywikibot.output(u'File autonomous_problems.dat open or corrupted! Try again with -restore.')
         sys.exit()
     iw = ()
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty; ignoring it and its interwiki links"

--
To view, visit https://gerrit.wikimedia.org/r/243049

Gerrit-MessageType: newchange
Gerrit-Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: 2.0
Gerrit-Owner: John Vandenberg
Gerrit-Reviewer: Malafaya
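
To illustrate why an emptiness check of the kind the commit message describes costs more than a namespace comparison, here is a rough sketch of stripping language links and category links from wikitext before measuring what remains. It is not the pywikibot implementation: the two regular expressions are deliberately simplified, and `looks_empty` and its 4-character threshold are assumptions made for this example.

```python
import re

# Simplified, assumed patterns; real wikitext handling covers many more cases.
LANGLINK = re.compile(r"\[\[[a-z]{2,3}(?:-[a-z]+)*:[^\]]*\]\]")
CATEGORY = re.compile(r"\[\[Category:[^\]]*\]\]", re.IGNORECASE)


def looks_empty(text, threshold=4):
    """Return True if almost nothing remains after removing link markup.

    Hypothetical helper: every call scans the full page text twice,
    which is far more work than comparing a namespace number.
    """
    without_langlinks = LANGLINK.sub("", text)
    without_categories = CATEGORY.sub("", without_langlinks)
    return len(without_categories.strip()) < threshold


only_links = "[[de:Beispiel]]\n[[Category:Stubs]]"
with_content = "Some article text. [[de:Beispiel]]"

print(looks_empty(only_links))
print(looks_empty(with_content))
```

Because the work grows with page length, it pays to run a check like this only after the cheaper condition has failed to decide the outcome.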
[MediaWiki-commits] [Gerrit] interwiki.py: check for category before emptiness - change (pywikibot/core)
Malafaya has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/237445

Change subject: interwiki.py: check for category before emptiness
......................................................................

interwiki.py: check for category before emptiness

Currently, page.isEmpty() requires parsing the whole page and doing some
removals (langlinks and categories) which takes up significant CPU time.
There are 2 checks which use page.isEmpty() as the starting condition
while having a much simpler second condition of just checking the page
namespace. In this patch, I reversed the checks order.

For categories, the time taken in batchLoaded() is reduced to about 30%
of the original time.

Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
---
M scripts/interwiki.py
1 file changed, 2 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/pywikibot/core refs/changes/45/237445/1

diff --git a/scripts/interwiki.py b/scripts/interwiki.py
index 950f8f1..1ace2bc 100755
--- a/scripts/interwiki.py
+++ b/scripts/interwiki.py
@@ -1362,7 +1362,7 @@
 # must be behind the page.isRedirectPage() part
 # otherwise a redirect error would be raised
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty. Skipping." % page)
@@ -1453,7 +1453,7 @@
                          'Try again with -restore.')
         sys.exit()
     iw = ()
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty; ignoring it and its interwiki links"

--
To view, visit https://gerrit.wikimedia.org/r/237445

Gerrit-MessageType: newchange
Gerrit-Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Malafaya
[MediaWiki-commits] [Gerrit] interwiki.py: check for category before emptiness - change (pywikibot/core)
jenkins-bot has submitted this change and it was merged.

Change subject: interwiki.py: check for category before emptiness
......................................................................

interwiki.py: check for category before emptiness

Currently, page.isEmpty() requires parsing the whole page and doing some
removals (langlinks and categories) which takes up significant CPU time.
There are 2 checks which use page.isEmpty() as the starting condition
while having a much simpler second condition of just checking the page
namespace. In this patch, I reversed the checks order.

For categories, the time taken in batchLoaded() is reduced to about 30%
of the original time.

Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
---
M scripts/interwiki.py
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  XZise: Looks good to me, approved
  jenkins-bot: Verified

diff --git a/scripts/interwiki.py b/scripts/interwiki.py
index 950f8f1..1ace2bc 100755
--- a/scripts/interwiki.py
+++ b/scripts/interwiki.py
@@ -1362,7 +1362,7 @@
 # must be behind the page.isRedirectPage() part
 # otherwise a redirect error would be raised
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty. Skipping." % page)
@@ -1453,7 +1453,7 @@
                          'Try again with -restore.')
         sys.exit()
     iw = ()
-elif page.isEmpty() and not page.isCategory():
+elif not page.isCategory() and page.isEmpty():
     globalvar.remove.append(unicode(page))
     if not globalvar.quiet:
         pywikibot.output(u"NOTE: %s is empty; ignoring it and its interwiki links"

--
To view, visit https://gerrit.wikimedia.org/r/237445

Gerrit-MessageType: merged
Gerrit-Change-Id: I00375411ca15658c22ae6bdb49588ec9f03b8c69
Gerrit-PatchSet: 1
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Malafaya
Gerrit-Reviewer: John Vandenberg
Gerrit-Reviewer: Ladsgroup
Gerrit-Reviewer: Merlijn van Deen
Gerrit-Reviewer: XZise
Gerrit-Reviewer: Xqt
Gerrit-Reviewer: jenkins-bot