XZise added a subscriber: XZise.
XZise added a comment.

Are you able to determine on which site this happened? I'm having trouble 
understanding how this issue can occur: if the title is `bytes`, it 
shouldn't be just `0xFB`, because that is not a valid sequence in UTF-8, 
which is the expected encoding. And if it's `unicode`, it shouldn't be able 
to decode it, because Python first tries to encode it using ASCII:

  >>> u'û'.decode('utf8')
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/home/xzise/.pyenv/versions/2.7.8/lib/python2.7/encodings/utf_8.py", line 16, in decode
      return codecs.utf_8_decode(input, errors, True)
  UnicodeEncodeError: 'ascii' codec can't encode character u'\xfb' in position 0: ordinal not in range(128)
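
Both cases can also be reproduced outside of pywikibot. The following is only 
a rough sketch that should run on Python 2.7 and 3; the helper names 
`try_decode_bytes` and `try_encode_ascii` are made up for illustration and are 
not part of pywikibot. Python 3 strings have no `decode` method, so the 
implicit ASCII step from the traceback is written out explicitly here:

```python
def try_decode_bytes(raw, encoding='utf-8'):
    """Return the decoded text, or the exception raised while decoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as error:
        return error


def try_encode_ascii(text):
    """Return the ASCII-encoded bytes, or the exception raised while encoding."""
    try:
        return text.encode('ascii')
    except UnicodeEncodeError as error:
        return error


lone_byte = b'\xfb'

# Case 1: the title is bytes. A lone 0xFB is not valid UTF-8, so this
# yields a UnicodeDecodeError instead of text.
utf8_result = try_decode_bytes(lone_byte)

# The same byte is valid in Latin-1, where it maps to U+00FB.
latin1_result = try_decode_bytes(lone_byte, 'latin-1')

# Case 2: the title is unicode. Python 2 encodes it with ASCII before
# decoding, and that encoding step fails with a UnicodeEncodeError.
ascii_result = try_encode_ascii(u'\xfb')

print(repr(utf8_result))
print(repr(latin1_result))
print(repr(ascii_result))
```

If the title really arrives as a single `0xFB` byte, that would suggest it was 
encoded with something like Latin-1 rather than UTF-8, but that is only a guess 
until the actual type and value are known.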

If you are able to, you could help by adding `print(type(title)); 
print(repr(title))` above the for-loop in `url2unicode` (which for me is at 
line `5272`).
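
If a single line of output is easier to paste into the task, the two prints 
could be combined along these lines. `describe_title` is a hypothetical helper, 
not an existing pywikibot function, and the sample value is only the byte 
discussed above:

```python
def describe_title(title):
    """Return the type name and repr of title as a single line."""
    return '{0}: {1!r}'.format(type(title).__name__, title)


# Example with the lone byte discussed above; the type name differs
# between Python 2 (str) and Python 3 (bytes).
print(describe_title(b'\xfb'))
```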


TASK DETAIL
  https://phabricator.wikimedia.org/T111116

To: XZise
Cc: XZise, pywikibot-bugs-list, Malafaya, Aklapper, jayvdb, Malyacko


