XZise added a comment.

Okay, I seem to have found the culprit. The `Formatter` allows specifications 
nested two levels deep, for something like `'{0:0{1}}'` where the second 
argument is actually the width of the first one. To format the string it splits 
it up into chunks, buffers the “filled” chunks in a list and concatenates that 
list at the end using `''.join(result)`. If one of the elements in `result` is 
a `unicode`, the joined result is a `unicode` too. But if the list is empty it 
returns `bytes`, which is the case for the second round if there are no 
cascading specifications. This converts a `unicode` specification into a 
`bytes` specification. That specification is then used to format the field 
using the builtin function `format`, which returns `unicode` if the 
specification is a `unicode` and `bytes` otherwise (as long as the value is not 
already a `unicode`, afaik).
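
To illustrate the nested specification and the empty-join behaviour, here is a 
small sketch. It should run on both Python 2 and 3, but the type difference 
only shows up on Python 2, where a plain `''` literal is `bytes`. The variable 
names are only for illustration.

```python
from __future__ import print_function

from string import Formatter

# Two-level specification: argument 1 supplies the width for argument 0.
padded = Formatter().format(u'{0:0{1}}', 5, 3)
print(padded)  # 005

# Joining an empty list gives back the type of the separator literal.
# On Python 2 that literal is bytes, so a unicode specification would
# come back as bytes here; on Python 3 it is str either way.
empty_spec = ''.join([])
print(type(empty_spec))

# With at least one unicode element the joined result is unicode
# (Python 2) or str (Python 3).
mixed = ''.join([u'0', '3'])
print(type(mixed))
```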

So as an example, take `Formatter().format(u'{0}{1}', u'a', 'ä')`: it splits 
that string up into two parts and then each part into the name and 
specification (e.g. `u'0'` and `u''`). Then the specification is parsed again, 
which is similar to `Formatter().format(u'')` and returns a `bytes` instance, so 
that the specification is now `b''` (Python 2 won't show that prefix, I just 
add it here for clarity). Now it uses the value associated with that name 
(`u'a'` for the first entry) and basically does `format(u'a', '')`, which 
returns `u'a'`. For the second entry it's `format('ä', '')` and that returns 
`'ä'`, so the result is then `[u'a', 'ä']` and it crashes on the concatenation 
because the non-ASCII `bytes` can't be implicitly decoded as ASCII.
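
A minimal sketch to reproduce this, assuming a UTF-8 encoded source file. The 
helper `try_format` is hypothetical and only there so the script doesn't stop 
on the error. On Python 3 both arguments are `str`, so the call should simply 
succeed there; the failure is specific to Python 2.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

from string import Formatter


def try_format(fmt, *args):
    """Return the formatted text, or the error raised while formatting."""
    try:
        return Formatter().format(fmt, *args)
    except UnicodeError as error:
        return error


# The individual steps from the walkthrough: the type of each formatted
# field follows the value when the specification is a plain '' literal.
print(type(format(u'a', '')))
print(type(format('ä', '')))

# On Python 2 the second argument is a non-ASCII bytes string, so joining
# [u'a', 'ä'] needs an implicit ASCII decode and fails.
outcome = try_format(u'{0}{1}', u'a', 'ä')
print(type(outcome).__name__)
```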

At the moment I have two approaches to fix it. Either change `format_field` to 
return `unicode` if the format string is one (independently of the 
specification). Alternatively I could override `_vformat` and return a 
`unicode` similarly to `format_field`, which would prevent the specification 
from changing type. While the latter is closer to fixing the actual bug (as it 
would prevent `format` from returning `bytes` for an empty string), it would 
change a “private” method which isn't part of the official API. Anyway, a fix 
is probably near, and I first want to design tests which fail now and will be 
fixed by the patch, to be sure that I got it.
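
A rough sketch of what the first approach could look like, not the actual 
patch. The class name `TextFormatter`, the attribute `_want_text` and the 
choice to decode `bytes` as UTF-8 are all assumptions made for illustration.

```python
from string import Formatter

try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3


class TextFormatter(Formatter):
    """Hypothetical formatter: text format strings yield text fields."""

    def vformat(self, format_string, args, kwargs):
        # Remember the type of the outer format string, because the
        # specification passed to format_field may have lost it.
        self._want_text = isinstance(format_string, text_type)
        return super(TextFormatter, self).vformat(format_string, args, kwargs)

    def format_field(self, value, format_spec):
        result = super(TextFormatter, self).format_field(value, format_spec)
        if self._want_text and not isinstance(result, text_type):
            # Assumption: bytes values are UTF-8 encoded.
            result = result.decode('utf-8')
        return result


print(TextFormatter().format(u'{0:0{1}}', 5, 3))  # 005
```

Storing the flag on the instance keeps the sketch short but means one 
formatter instance shouldn't be shared between threads.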


TASK DETAIL
  https://phabricator.wikimedia.org/T113411

To: XZise
Cc: XZise, Aklapper, Malafaya, pywikibot-bugs-list


