jenkins-bot has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/323184 )

Change subject: Remove non-breaking spaces when tidying up a link
......................................................................


Remove non-breaking spaces when tidying up a link

The relevant code comes from:
https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/title/MediaWikiTitleCodec.php;1fdc4c6f53d4fce86ade2986bea43b729d62fee3$290
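
As a rough illustration of the behaviour this change aims for, here is a
minimal standalone sketch of the whitespace cleanup, using the same character
class as the patch to pywikibot/page.py. The helper name tidy_title is
hypothetical and exists only for this example; the real logic lives inside
Link parsing in pywikibot/page.py and does more than this (error checks,
removal of direction markers, and so on). The sketch assumes Python 3 string
semantics.

```python
import re

# Same character class as the patch: underscore, ASCII space, no-break
# space and a set of other Unicode space-like characters.
WHITESPACE_RUN = re.compile(
    '[_ \xa0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]+')


def tidy_title(title):
    """Hypothetical helper: collapse each run to one space, then strip."""
    return WHITESPACE_RUN.sub(' ', title).strip()


print(tidy_title('A \xa0 B'))  # no-break space between two ASCII spaces
print(tidy_title('_A__B_'))    # underscores are treated like spaces
```

Doing the replacement with a single regex means underscores, plain spaces and
the other listed characters are collapsed in one pass, rather than replacing
underscores first and collapsing repeated spaces afterwards.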

Bug: T130818
Change-Id: I45d843824eae4fa68ab4001b68dd7bf05c2e6439
---
M pywikibot/page.py
M tests/link_tests.py
2 files changed, 6 insertions(+), 4 deletions(-)

Approvals:
  Merlijn van Deen: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/pywikibot/page.py b/pywikibot/page.py
index 78c2f79..49f282e 100644
--- a/pywikibot/page.py
+++ b/pywikibot/page.py
@@ -5292,10 +5292,10 @@
             raise pywikibot.Error(
                 "Title contains illegal char (\\uFFFD 'REPLACEMENT CHARACTER')")
 
-        # Replace underscores by spaces
-        t = t.replace(u"_", u" ")
-        # replace multiple spaces with a single space
-        t = re.sub(' {2,}', ' ', t)
+        # Cleanup whitespace
+        t = re.sub(
+            '[_ \xa0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]+',
+            ' ', t)
         # Strip spaces at both ends
         t = t.strip()
         # Remove left-to-right and right-to-left markers.
diff --git a/tests/link_tests.py b/tests/link_tests.py
index 9df58b0..8bacdc1 100644
--- a/tests/link_tests.py
+++ b/tests/link_tests.py
@@ -90,6 +90,8 @@
         self.assertEqual(Link('A &eacute; B', self.get_site()).title, u'A é B')
         self.assertEqual(Link('A &#233; B', self.get_site()).title, u'A é B')
         self.assertEqual(Link('A &#x00E9; B', self.get_site()).title, u'A é B')
+        self.assertEqual(Link('A &nbsp; B', self.get_site()).title, 'A B')
+        self.assertEqual(Link('A &#160; B', self.get_site()).title, 'A B')
 
         l = Link('A | B', self.get_site())
         self.assertEqual(l.title, 'A')

-- 
To view, visit https://gerrit.wikimedia.org/r/323184
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I45d843824eae4fa68ab4001b68dd7bf05c2e6439
Gerrit-PatchSet: 7
Gerrit-Project: pywikibot/core
Gerrit-Branch: master
Gerrit-Owner: Matěj Suchánek <[email protected]>
Gerrit-Reviewer: Dalba <[email protected]>
Gerrit-Reviewer: John Vandenberg <[email protected]>
Gerrit-Reviewer: Magul <[email protected]>
Gerrit-Reviewer: Matěj Suchánek <[email protected]>
Gerrit-Reviewer: Merlijn van Deen <[email protected]>
Gerrit-Reviewer: XXN <[email protected]>
Gerrit-Reviewer: Xqt <[email protected]>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
Pywikibot-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikibot-commits
