jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/329761 )

Change subject: Fix and improve default regexes
......................................................................


Fix and improve default regexes

- Remove superfluous flags.
- Clean up 'header' using multiline.
- Expand 'pre' and 'table' to support HTML attributes
  (mostly 'style').
- Update 'property' to support parameters (currently,
  it supports "|from=" but it might support more in
  the future).
- Localize 'property' and 'invoke' using magic words.
- Add singleline to 'invoke'.

Change-Id: Ib805bf70cb1cc99711138d7d6c7e40971f31b602
---
M pywikibot/textlib.py
1 file changed, 10 insertions(+), 8 deletions(-)

Approvals:
  jenkins-bot: Verified
  Xqt: Looks good to me, approved



diff --git a/pywikibot/textlib.py b/pywikibot/textlib.py
index 9f7782e..dce6608 100644
--- a/pywikibot/textlib.py
+++ b/pywikibot/textlib.py
@@ -221,13 +221,13 @@
     _regex_cache.update({
         'comment': re.compile(r'(?s)<!--.*?-->'),
         # section headers
-        'header': re.compile(r'\r?\n=+.+=+ *\r?\n'),
+        'header': re.compile(r'(?m)^=+.+=+ *$'),
         # preformatted text
-        'pre': re.compile(r'(?ism)<pre>.*?</pre>'),
+        'pre': re.compile(r'(?is)<pre[ >].*?</pre>'),
         'source': re.compile(r'(?is)<source .*?</source>'),
-        'score': re.compile(r'(?ism)<score[ >].*?</score>'),
+        'score': re.compile(r'(?is)<score[ >].*?</score>'),
         # inline references
-        'ref': re.compile(r'(?ism)<ref[ >].*?</ref>'),
+        'ref': re.compile(r'(?is)<ref[ >].*?</ref>'),
         'template': NESTED_TEMPLATE_REGEX,
         # lines that start with a space are shown in a monospace font and
         # have whitespace preserved.
@@ -235,7 +235,7 @@
         # tables often have whitespace that is used to improve wiki
         # source code readability.
         # TODO: handle nested tables.
-        'table': re.compile(r'(?ims)^{\|.*?^\|}|<table>.*?</table>'),
+        'table': re.compile(r'(?ims)^{\|.*?^\|}|<table[ >].*?</table>'),
         'hyperlink': compileLinkR(),
         'gallery': re.compile(r'(?is)<gallery.*?>.*?</gallery>'),
         # this matches internal wikilinks, but also interwiki, categories, and
@@ -247,11 +247,13 @@
                  site.validLanguageLinks() + list(site.family.obsolete.keys()))),
         # Wikibase property inclusions
-        'property': re.compile(r'(?i)\{\{\s*#property:\s*p\d+\s*\}\}'),
+        'property': (r'(?i)\{\{\s*\#(?:%s):\s*p\d+.*?\}\}',
+                     lambda site: '|'.join(site.getmagicwords('property'))),
         # Module invocations (currently only Lua)
-        'invoke': re.compile(r'(?i)\{\{\s*#invoke:.*?}\}'),
+        'invoke': (r'(?is)\{\{\s*\#(?:%s):.*?\}\}',
+                   lambda site: '|'.join(site.getmagicwords('invoke'))),
         # categories
-        'category': ('\[\[ *(?:%s)\s*:.*?\]\]',
+        'category': (r'\[\[ *(?:%s)\s*:.*?\]\]',
                      lambda site: '|'.join(site.namespaces[14])),
         # files
         'file': (FILE_LINK_REGEX,

-- 
To view, visit https://gerrit.wikimedia.org/r/329761
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib805bf70cb1cc99711138d7d6c7e40971f31b602
Gerrit-PatchSet: 5
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Matěj Suchánek <matejsuchane...@gmail.com>
Gerrit-Reviewer: Dalba <dalba.w...@gmail.com>
Gerrit-Reviewer: Ladsgroup <ladsgr...@gmail.com>
Gerrit-Reviewer: Magul <tomasz.magul...@gmail.com>
Gerrit-Reviewer: Matěj Suchánek <matejsuchane...@gmail.com>
Gerrit-Reviewer: Xqt <i...@gno.de>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
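
The behavioural effect of the pattern changes in this diff can be sketched with the standard `re` module alone. The patterns below are copied from the diff; the sample wikitext and the hard-coded magic word lists (`['property']`, `['invoke']`) are assumptions made for illustration. In pywikibot the alternation is built from `site.getmagicwords(...)`, as the diff shows, so localized names would also match.

```python
import re

# Old and new patterns, copied from the diff.
OLD_HEADER = re.compile(r'\r?\n=+.+=+ *\r?\n')
NEW_HEADER = re.compile(r'(?m)^=+.+=+ *$')
OLD_PRE = re.compile(r'(?ism)<pre>.*?</pre>')
NEW_PRE = re.compile(r'(?is)<pre[ >].*?</pre>')
OLD_PROPERTY = re.compile(r'(?i)\{\{\s*#property:\s*p\d+\s*\}\}')
OLD_INVOKE = re.compile(r'(?i)\{\{\s*#invoke:.*?}\}')

# The new 'property' and 'invoke' entries are templates with a %s slot.
PROPERTY_TEMPLATE = r'(?i)\{\{\s*\#(?:%s):\s*p\d+.*?\}\}'
INVOKE_TEMPLATE = r'(?is)\{\{\s*\#(?:%s):.*?\}\}'

# Assumption: stand-ins for site.getmagicwords('property') / ('invoke').
NEW_PROPERTY = re.compile(PROPERTY_TEMPLATE % '|'.join(['property']))
NEW_INVOKE = re.compile(INVOKE_TEMPLATE % '|'.join(['invoke']))

# Hypothetical sample wikitext.
text = (
    '== First ==\n'
    'intro\n'
    '== Second ==\n'
    '<pre style="color: red">x</pre>\n'
    '{{#property:P31|from=Q42}}\n'
)
multiline_invoke = '{{#invoke:Foo|bar\n|baz}}'

# The old 'header' needs a preceding newline, so it misses a header on the
# first line and also consumes the surrounding line breaks.
old_headers = [m.group().strip() for m in OLD_HEADER.finditer(text)]
new_headers = [m.group() for m in NEW_HEADER.finditer(text)]

print(old_headers)
print(new_headers)
print(OLD_PRE.search(text), NEW_PRE.search(text).group())
print(OLD_PROPERTY.search(text), NEW_PROPERTY.search(text).group())
print(OLD_INVOKE.search(multiline_invoke),
      NEW_INVOKE.search(multiline_invoke).group())
```

With this sample, the old patterns miss the first-line header, the `<pre>` tag carrying a `style` attribute, the `|from=` parameter, and the invocation that spans two lines, while the new patterns match all of them.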