[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-21 Thread gerritbot
gerritbot added a comment. Change 782196 **merged** by jenkins-bot: [pywikibot/core@master] [IMPR] handle ParserError within xmlreader.XmlDump.parse() https://gerrit.wikimedia.org/r/782196 TASK DETAIL https://phabricator.wikimedia.org/T306134 EMAIL PREFERENCES

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-21 Thread gerritbot
gerritbot added a comment. Change 780854 **merged** by jenkins-bot: [pywikibot/core@master] [IMPR] Deprecate XMLDumpOldPageGenerator in favour of a 'content' parameter https://gerrit.wikimedia.org/r/780854

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-20 Thread gerritbot
gerritbot added a comment. Change 780808 **merged** by jenkins-bot: [pywikibot/core@master] [IMPR] add -quiet option to omit message when no change was made https://gerrit.wikimedia.org/r/780808

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-15 Thread gerritbot
gerritbot added a comment. Change 782196 had a related patch set uploaded (by Xqt; author: Xqt): [pywikibot/core@master] [IMPR] handle ParserError within xmlreader.XmlDump.parse() https://gerrit.wikimedia.org/r/782196

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-15 Thread gerritbot
gerritbot added a comment. Change 781565 **merged** by jenkins-bot: [pywikibot/core@stable] [7.1.1] Fix regression of XmlDumpPageGenerator https://gerrit.wikimedia.org/r/781565

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-15 Thread gerritbot
gerritbot added a comment. Change 781565 had a related patch set uploaded (by Xqt; author: Xqt): [pywikibot/core@stable] [7.1.1] Fix regression of XmlDumpPageGenerator https://gerrit.wikimedia.org/r/781565

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Xqt
Xqt added a comment. In T306134#7856117, @Mpaa wrote: > @Xqt, I think this might be the cause. > https://gerrit.wikimedia.org/r/c/pywikibot/core/+/769728 Yes indeed.

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Mpaa
Mpaa added a comment. @Xqt, I think this might be the cause. https://gerrit.wikimedia.org/r/c/pywikibot/core/+/769728

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread gerritbot
gerritbot added a comment. Change 780890 **merged** by jenkins-bot: [pywikibot/core@master] Revert "[IMPR] use pg.XMLDumpPageGenerator in replace.py" https://gerrit.wikimedia.org/r/780890

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Xqt
Xqt added a comment. @Basilicofresco: I've reverted the use of the pagegenerators XMLDumpPageGenerator to speed up replace.py. I think my dump is corrupt, but that ParseError should be fixed anyway.

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread gerritbot
gerritbot added a comment. Change 780890 had a related patch set uploaded (by Xqt; author: Xqt): [pywikibot/core@master] Revert "[IMPR] use pg.XMLDumpPageGenerator in replace.py" https://gerrit.wikimedia.org/r/780890

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Basilicofresco
Basilicofresco added a comment. Ok, thanks. And keep in mind that speed matters when you have to check ns:0 monthly with hundreds of regexes. Many active bots on Wikipedia, I believe a good part of them, are actually using the dumps. So the efficiency should be as good as possible... we

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Xqt
Xqt added a comment. In T306134#7855306, @Basilicofresco wrote: > Well, probably I did not express myself well. > The whole point of using the dump with replace.py is to rapidly filter the xml by replacements in order to speed up

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread gerritbot
gerritbot added a comment. Change 780854 had a related patch set uploaded (by Xqt; author: Xqt): [pywikibot/core@master] [IMPR] Speed up XMLDumpPageGenerator https://gerrit.wikimedia.org/r/780854

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Basilicofresco
Basilicofresco added a comment. Well, probably I did not express myself well. The whole point of using the dump with replace.py is to rapidly filter the xml by replacements in order to speed up the process of replacing something with something else on the whole ns:0. Replace.py used to

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread gerritbot
gerritbot added a comment. Change 780808 had a related patch set uploaded (by Xqt; author: Xqt): [pywikibot/core@master] [IMPR] Remove message when no change was made https://gerrit.wikimedia.org/r/780808

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Xqt
Xqt added a comment. In T306134#7855028, @Basilicofresco wrote: > The problem is that it is not filtering the xml by replacements at all, it is just listing one by one every single page present in the dump. Why is this a problem?

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Basilicofresco
Basilicofresco added a comment. These articles never contained the word "meteorite". Moreover "Organo a pompa" is the very first article written in the current itwiki-20220401-pages-articles.xml dump,

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Xqt
Xqt added a comment. In T306134#7854401, @Basilicofresco wrote: > None of the skipped pages has that word. The point of running replace.py on a dump should be to load only the pages with that word and not just any page. The xml

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-14 Thread Basilicofresco
Basilicofresco added a comment. None of the skipped pages has that word. The point of running replace.py on a dump should be to load only the pages with that word and not just any page.
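
For readers following the thread, the filtering behaviour requested here can be sketched with the Python standard library alone. This is not Pywikibot's implementation: the names `local_name` and `titles_matching` are hypothetical, and the sketch assumes the usual MediaWiki export layout in which each `<page>` element holds a `<title>` and a revision `<text>`. It streams the dump, yields only the titles of pages whose text matches a regex, and stops quietly on a `ParseError` raised by a truncated or corrupt dump.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix: '{uri}page' -> 'page'."""
    return tag.rsplit('}', 1)[-1]


def titles_matching(source, pattern):
    """Yield titles of dump pages whose text matches ``pattern``.

    ``source`` is a filename or a binary file object (assumption: an
    uncompressed MediaWiki XML export). Pages parsed before a
    ParseError are still yielded; the error itself ends the iteration.
    """
    regex = re.compile(pattern)
    try:
        # Stream the file element by element instead of loading it whole.
        for _event, elem in ET.iterparse(source, events=('end',)):
            if local_name(elem.tag) != 'page':
                continue
            title, text = None, ''
            for child in elem.iter():
                name = local_name(child.tag)
                if name == 'title':
                    title = child.text
                elif name == 'text':
                    text = child.text or ''
            if title is not None and regex.search(text):
                yield title
            elem.clear()  # drop the page's children to limit memory use
    except ET.ParseError:
        # Truncated or corrupt dump: stop instead of propagating the error.
        return
```

Only the yielded titles would then need to be loaded and edited, which is the speed-up discussed in this task; pages without a match are never fetched.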

[Pywikipedia-bugs] [Maniphest] [Commented On] T306134: XMLDumpPageGenerator is still not working

2022-04-13 Thread Xqt
Xqt added a comment. @Basilicofresco: are you sure any of the skipped pages has the word "meteorite" in its content?