[IronPython] x = unicode(someExtendedUnicodeString) fails.

Vernon Cole Thu, 17 Dec 2009 11:06:01 -0800

I just tripped over this one and it took some time to figure out what in
blazes was going on. You may want to watch for it when porting CPython code.


I was cleaning up an input argument using
     s = unicode(S.strip().upper())
where S is the argument supplying the value I need to convert.

When I handed the function a genuine unicode string, such as in:
     assert Roman(u'\u217b') == 12  # Unicode Roman numeral 'xii' as a single character
IronPython complains with:
    UnicodeEncodeError: ('unknown', '\x00', 0, 1, '')

The Python manual says:

> If no optional parameters are given, unicode() will mimic the behaviour of
> str() except that it returns Unicode strings instead of 8-bit strings.
> More precisely, if *object* is a Unicode string or subclass it will return
> that Unicode string without any additional decoding applied.


It turns out that this was already reported on codeplex as:
http://ironpython.codeplex.com/WorkItem/View.aspx?WorkItemId=15372
but the reporting party did not catch the fact that he had located an
incompatibility with documented behavior.
It has been sitting on a back burner for some time.

Others may want to join me in voting this up.  Meanwhile I will add an
unneeded exception handler to my own code.
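
A workaround along these lines could be sketched as follows. This is only an illustration, not the actual code from the module in question: the names clean_argument and unicode_type are hypothetical, and the except clause assumes the failure always surfaces as UnicodeEncodeError, as in the traceback above.

```python
# unicode_type is a hypothetical alias so the sketch also runs where the
# unicode builtin does not exist (Python 3).
try:
    unicode_type = unicode  # CPython 2.x / IronPython 2.x
except NameError:
    unicode_type = str      # Python 3

def clean_argument(value):
    """Strip and upper-case value, returning a Unicode string."""
    cleaned = value.strip().upper()
    try:
        return unicode_type(cleaned)
    except UnicodeEncodeError:
        # Assumption: we only get here on an interpreter with the bug
        # described above, where the input was already a Unicode string.
        # Per the documented behaviour it should be returned unchanged.
        return cleaned
```

On an interpreter that follows the documentation the except branch is never taken, which is why the handler is "unneeded" there.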
--
Vernon Cole
