I just tripped over this one and it took some time to figure out what in
blazes was going on. You may want to watch for it when porting CPython code.
I was cleaning up an input argument using
s = unicode(S.strip().upper())
where S is the argument supplying the value I need to convert.
When I handed the function a genuine unicode string, such as in:
assert Roman(u'\u217b') == 12  # Unicode Roman numeral 'xii' as a single character
IronPython complains with:
UnicodeEncodeError: ('unknown', '\x00', 0, 1, '')
The Python manual says:
> If no optional parameters are given, unicode() will mimic the behaviour of
> str() except that it returns Unicode strings instead of 8-bit strings.
> More precisely, if *object* is a Unicode string or subclass it will return
> that Unicode string without any additional decoding applied.
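For anyone who wants to check the documented behaviour on their own interpreter, here is a minimal sketch. The try/except alias at the top is my own addition so the snippet also runs where unicode() does not exist (Python 3); it is not part of the original code.

```python
# Assumption: alias unicode to str where the builtin is missing (Python 3).
try:
    unicode
except NameError:
    unicode = str

original = u'\u217b'  # SMALL ROMAN NUMERAL TWELVE, a single character
converted = unicode(original)

# Per the manual, a Unicode string should come back with no decoding applied.
assert isinstance(converted, unicode)
assert converted == original
```

On CPython both assertions pass; the UnicodeEncodeError above is what IronPython raised for me on the unicode() call.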
It turns out that this was already reported on codeplex as:
http://ironpython.codeplex.com/WorkItem/View.aspx?WorkItemId=15372
but the reporting party did not catch the fact that he had located an
incompatibility with documented behavior.
It has been sitting on the back burner for some time.
Others may want to join me in voting this up. Meanwhile I will add an
unneeded exception handler to my own code.
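The handler I have in mind looks roughly like the sketch below. The function name clean_argument is made up for illustration, and the alias at the top is only there so the snippet also runs on interpreters without unicode(); treat it as one possible workaround, not a fix for the underlying issue.

```python
# Assumption: alias unicode to str where the builtin is missing (Python 3).
try:
    unicode
except NameError:
    unicode = str

def clean_argument(S):
    """Strip and upper-case S, returning a Unicode string (hypothetical helper)."""
    s = S.strip().upper()
    try:
        return unicode(s)
    except UnicodeEncodeError:
        # Raised on IronPython when s is already Unicode and non-ASCII;
        # in that case s needs no conversion, so return it unchanged.
        return s
```

On CPython the except branch should never run, which is why I call the handler unneeded.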
--
Vernon Cole