XZise claimed this task.
XZise added a comment.

Well, Python greets us with a present: `string.Formatter` isn't actually
the formatter Python uses. For example, when you call `format` on a `unicode`
it returns a `unicode`. But `string.Formatter` doesn't care and returns
`bytes`. I'm not sure yet how exactly this is causing your failure, but it
seems that it tries to decode a `bytes` instance.
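
The difference can be checked on any interpreter with a small script. This is only a sketch: `Title` and `result_types` are made-up names, and `Title` is a stand-in for a pywikibot `Page` so that no site access is needed. It assumes the object encodes itself as UTF-8 on Python 2, which may not match what `Page` really does.

```python
import sys
from string import Formatter


class Title(object):
    """Hypothetical stand-in for a Page; not part of pywikibot."""

    def __init__(self, text):
        self.text = text

    def __unicode__(self):
        # Only consulted by Python 2.
        return self.text

    def __str__(self):
        # Assumption: UTF-8 encoded bytes on Python 2, text on Python 3.
        if sys.version_info[0] < 3:
            return self.text.encode('utf-8')
        return self.text


def result_types(template, value):
    """Return the type names produced by both formatting paths."""
    builtin = template.format(value)
    manual = Formatter().format(template, value)
    return type(builtin).__name__, type(manual).__name__


print(result_types(u'{0}', Title(u'\xdc')))
```

On Python 3 both names should be `str`, which fits the suggestion to try Python 3. If the two names differ on Python 2.7, that is the mismatch described above.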

Just for reference,
https://phabricator.wikimedia.org/rPWBCafe2555d7a6379ef1ae11d3ffe26bf6d7f391dcd
applied `color_format` in a lot of places, so using a version before that will
help in most cases. My original patch is
https://phabricator.wikimedia.org/rPWBC25980447cf4507ae58d7beb22f2e13b1084b8838
which only used it in one place that isn't used by many scripts (at least
not interwiki.py). Alternatively, you can try using Python 3.

Here are a few commands to test stuff out:

  >>> from pywikibot.tools.formatter import _ColorFormatter as C, color_format
  >>> from string import Formatter as F
  >>> import pywikibot as py
  >>> s = py.Site()
  >>> p = py.Page(s, u'ü')
  >>> u'%s' % p
  WARNING: /home/xzise/.pyenv/versions/2.7/lib/python2.7/site-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
    InsecurePlatformWarning
  
  u'[[en:\xdc]]'
  >>> u'{0}'.format(p)
  u'[[en:\xdc]]'
  >>> color_format(u'{0}', p)
  '[[en:\xc3\x9c]]'
  >>> color_format(u'{red}{0}', p)
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "pywikibot/tools/formatter.py", line 124, in color_format
      return _ColorFormatter().format(text, *args, **kwargs)
    File "/home/xzise/.pyenv/versions/2.7/lib/python2.7/string.py", line 545, in format
      return self.vformat(format_string, args, kwargs)
    File "pywikibot/tools/formatter.py", line 114, in vformat
      kwargs)
    File "/home/xzise/.pyenv/versions/2.7/lib/python2.7/string.py", line 549, in vformat
      result = self._vformat(format_string, args, kwargs, used_args, 2)
    File "/home/xzise/.pyenv/versions/2.7/lib/python2.7/string.py", line 584, in _vformat
      return ''.join(result)
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 5: ordinal not in range(128)

As you can see, the %-notation (used before
https://phabricator.wikimedia.org/rPWBCafe2555d7a6379ef1ae11d3ffe26bf6d7f391dcd)
returns a `unicode` containing the character U+00DC (Ü). The same happens when
you use `unicode.format`. `color_format` without a color “works” too, but it
returns a `bytes` instance, which shouldn't happen. And when you add a color,
it crashes. To see what it actually returns when a color field is used, I
tried an ASCII title, and that works:

  >>> color_format(u'{red}{0}', py.Page(s, u'u'))
  u'\x03{red}[[en:U]]'
  >>> color_format(u'{0}', py.Page(s, u'u'))
  '[[en:U]]'
  >>> u'{0}'.format(py.Page(s, u'u'))
  u'[[en:U]]'
  >>> F().format(u'{0}', py.Page(s, u'u'))
  '[[en:U]]'

Now the fun part: with a color field it actually returns a `unicode`, but as
soon as the color field is removed it's back to `bytes`, while
`unicode.format` still works as expected. The last command verifies that it's
not `_ColorFormatter` but `Formatter` itself which returns `bytes`.
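
One possible direction for a workaround, sketched here under assumptions and not tested against pywikibot: a `Formatter` subclass that decodes any `bytes` coming back from a field before the pieces are joined. `TextFormatter` is a hypothetical name, and the UTF-8 choice is only inferred from the `'\xc3\x9c'` output in the session above.

```python
from string import Formatter


class TextFormatter(Formatter):
    """Hypothetical subclass; not the pywikibot implementation."""

    def format_field(self, value, format_spec):
        result = super(TextFormatter, self).format_field(value, format_spec)
        # Assumption: bytes returned for a field are UTF-8 encoded.
        if isinstance(result, bytes):
            result = result.decode('utf-8')
        return result


print(repr(TextFormatter().format(u'{0} {1}', u'\xdc', 42)))
```

On Python 3 the `bytes` branch should never be taken, as formatted fields are already text there. Whether this is sufficient on Python 2.7 is unverified; it only touches the individual fields, not the final `''.join(result)` seen in the traceback.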


TASK DETAIL
  https://phabricator.wikimedia.org/T113411

To: XZise
Cc: XZise, Aklapper, Malafaya, pywikibot-bugs-list


