whym created this task.
whym added a project: Pywikibot-textlib.py.
Restricted Application added subscribers: pywikibot-bugs-list, Aklapper.
Restricted Application added a project: Pywikibot.
TASK DESCRIPTION
It looks like even a call as small as
extract_sections('<!-- -->', site)
hangs in what appears to be an infinite loop. When I interrupt the program,
the traceback looks like this:
  File "/data/project/archiving/pkgsrc/core/scripts/archivebot.py", line 451, in load_page
    header, threads, footer = extract_sections(text, self.site)
  File "/mnt/nfs/labstore-secondary-tools-project/archiving/pkgsrc/core/pywikibot/textlib.py", line 917, in extract_sections
    last_section_content).group().lstrip()
  File "/mnt/nfs/labstore-secondary-tools-project/archiving/venv/lib/python3.5/re.py", line 173, in search
    return _compile(pattern, flags).search(string)
pointing to this code segment:
    footer = re.search(
        r'(%s)*\Z' % r'|'.join((langlink_pattern, cat_regex.pattern, r'\s+')),
        last_section_content).group().lstrip()
The regex effectively contains '(\s+)*$', a quantified group nested inside
another quantifier, which can trigger catastrophic backtracking:
https://www.regular-expressions.info/catastrophic.html.
Originally found in
https://commons.wikimedia.org/w/index.php?title=Commons:Bar&oldid=347447603 .
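The nested-quantifier shape noted above can be illustrated in isolation. The following is only a minimal sketch, not the actual textlib.py code or a proposed patch: NESTED, FLAT and trailing_ws are hypothetical names, and the real langlink_pattern and cat_regex.pattern alternatives are deliberately left out, so only the whitespace branch is modelled. It compares '(\s+)*\Z' with the flat form '\s*\Z', which should match the same trailing whitespace while leaving the engine only one way to split a run of spaces.

```python
import re
import time

# Shape described in the report: a quantified group inside another quantifier.
# (Stand-in only; the real pattern also alternates with langlink/category regexes.)
NESTED = re.compile(r'(\s+)*\Z')

# Flat form: the same trailing whitespace, but each run can be split one way only.
FLAT = re.compile(r'\s*\Z')


def trailing_ws(pattern, text):
    """Return the trailing whitespace matched by pattern (possibly empty)."""
    return pattern.search(text).group()


# Both forms agree on small inputs, including the string from the report.
for sample in ('<!-- -->', 'abc  \n', ''):
    assert trailing_ws(NESTED, sample) == trailing_ws(FLAT, sample)

# Hard case for the nested form: whitespace followed by a non-space character,
# so every way of splitting the run has to be tried before the match fails.
# The lengths are kept small on purpose so the script finishes quickly.
for n in (10, 14, 18):
    text = ' ' * n + 'x'
    start = time.perf_counter()
    trailing_ws(NESTED, text)
    nested_time = time.perf_counter() - start
    start = time.perf_counter()
    trailing_ws(FLAT, text)
    flat_time = time.perf_counter() - start
    print(n, 'nested: %.6fs' % nested_time, 'flat: %.6fs' % flat_time)
```

If the timings for the nested form grow sharply with n while the flat form stays small, that would be consistent with the backtracking explanation. Whether flattening alone is enough for the full alternation in extract_sections depends on the other two sub-patterns, which are not examined here.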
TASK DETAIL
https://phabricator.wikimedia.org/T222671
To: whym
Cc: Aklapper, whym, pywikibot-bugs-list, Viztor, DannyS712, Wenyi, Tbscho,
MayS, Mdupont, JJMC89, Avicennasis, mys_721tx, jayvdb, Dalba, Masti,
Alchimista, Rxy