Xqt triaged this task as "High" priority.
Xqt added a comment.
I see it is important to filter before processing the page:
| **XmlDumpReplacePageGenerator** | **old XMLDumpPageGenerator** | **new XMLDumpPageGenerator** | **-start option used** |
| filter is made for each dump entry | no filtering is made before processing | no filtering is made before processing but entry.text is not assigned | no filtering is made before processing |
| 55697 entries processed, 12 pages found to process | 271 pages processed until first edit | 271 pages processed until first edit | 3625 pages processed until first edit |
| 1 second | 194 seconds | 57 seconds | 60 seconds |
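The behaviour described in the first table row (filtering each dump entry before any further processing) can be sketched as a streaming generator. This is a standalone illustration, not the pywikibot implementation: the name `filtered_pages` is hypothetical, and for brevity it assumes namespace-free tags (a real MediaWiki dump would need the `{*}` namespace wildcard in the paths):

```python
import re
from xml.etree import ElementTree


def filtered_pages(dump_path, pattern):
    """Yield (title, text) pairs for dump pages whose text matches pattern.

    Filtering while streaming skips non-matching pages before any
    expensive processing happens and keeps memory use flat, which is
    the point of doing the filter per dump entry.
    """
    search = re.compile(pattern).search
    # iterparse() yields 'end' events by default, i.e. each <page>
    # element is complete when we see it here.
    for _event, elem in ElementTree.iterparse(dump_path):
        if elem.tag == 'page':
            title = elem.findtext('title')
            text = elem.findtext('revision/text') or ''
            if search(text):
                yield title, text
            elem.clear()  # release the subtree we just consumed
```

Only matching pages ever reach the caller, which mirrors the "12 pages found to process" out of 55697 entries above.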
Unfortunately, the old XmlDumpReplacePageGenerator implementation raises an exception while parsing, whereas the pagegenerators implementation does not (maybe because the script was halted after the first 271 pages):
ERROR: ParseError: no element found: line 328465, column 355
Traceback (most recent call last):
  File "C:\pwb\GIT\core\pwb.py", line 496, in <module>
    main()
  File "C:\pwb\GIT\core\pwb.py", line 480, in main
    if not execute():
  File "C:\pwb\GIT\core\pwb.py", line 463, in execute
    run_python_file(filename, script_args, module)
  File "C:\pwb\GIT\core\pwb.py", line 143, in run_python_file
    exec(compile(source, filename, 'exec', dont_inherit=True),
  File ".\scripts\replace.py", line 1107, in <module>
    main()
  File ".\scripts\replace.py", line 1103, in main
    bot.run()
  File "C:\pwb\GIT\core\pywikibot\bot.py", line 1555, in run
    for item in self.generator:
  File "C:\pwb\GIT\core\pywikibot\pagegenerators.py", line 2240, in PreloadingGenerator
    for page in generator:
  File "C:\pwb\GIT\core\pywikibot\pagegenerators.py", line 1761, in <genexpr>
    return (page for page in generator if page.namespace() in namespaces)
  File ".\scripts\replace.py", line 435, in __iter__
    for entry in self.parser:
  File "C:\pwb\GIT\core\pywikibot\xmlreader.py", line 119, in parse
    for event, elem in context:
  File "C:\Python310\lib\xml\etree\ElementTree.py", line 1260, in iterator
    root = pullparser._close_and_return_root()
  File "C:\Python310\lib\xml\etree\ElementTree.py", line 1307, in _close_and_return_root
    root = self._parser.close()
xml.etree.ElementTree.ParseError: no element found: line 328465, column 355
CRITICAL: Exiting due to uncaught exception <class 'xml.etree.ElementTree.ParseError'>
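The "no element found" ParseError above is what xml.etree raises when the document ends mid-stream, e.g. a truncated dump file. A minimal sketch of how an entry generator could tolerate that instead of aborting the whole bot run; `parse_entries` is a hypothetical helper, not the xmlreader API:

```python
from xml.etree import ElementTree


def parse_entries(source):
    """Yield page titles from a dump, tolerating a truncated stream.

    iterparse() raises ParseError from its final close() when the XML
    ends unexpectedly; catching it here lets the already-parsed
    entries through instead of crashing the consuming bot.
    """
    context = ElementTree.iterparse(source)
    while True:
        try:
            _event, elem = next(context)
        except StopIteration:
            return  # clean end of a well-formed dump
        except ElementTree.ParseError as err:
            # Truncated or malformed tail: stop gracefully.
            print(f'WARNING: dump ended early: {err}')
            return
        if elem.tag == 'title':
            yield elem.text
```

Whether the generator should swallow the error or re-raise it after logging is a design choice; silently stopping hides a corrupt dump, so a warning (as here) seems the minimum.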
TASK DETAIL
https://phabricator.wikimedia.org/T306134
To: Xqt
Cc: #pywikibot-replace.py, Aklapper, #pywikibot-pagegenerators.py,
pywikibot-bugs-list, Basilicofresco, Fernandobacasegua34, 786, Suran38,
Biggs657, Lalamarie69, Jyoo1011, JohnsonLee01, Juan90264, SHEKH, Dijkstra,
Alter-paule, Beast1978, Un1tY, Khutuck, Zkhalido, Hook696, Kent7301,
joker88john, Viztor, CucyNoiD, Wenyi, Gaboe420, Giuliamocci, Cpaulf30, Af420,
Bsandipan, Tbscho, MayS, Lewizho99, Mdupont, JJMC89, Maathavan, TerraCodes,
Dvorapa, Altostratus, Neuronton, Avicennasis, mys_721tx, jayvdb, Masti,
Alchimista