On Fri, Aug 22, 2014 at 2:10 PM, Albert-Jan Roskam
<fo...@yahoo.com.dmarc.invalid> wrote:
> Hi,
>
> I have data that is either floats or byte strings in utf-8. I need to cast
> both to unicode strings.

Just to be sure, I'm parsing the problem statement above as:

    data :== float | utf-8-encoded-byte-string

because the alternative way to parse the statement in English:

    data :== float-in-utf-8 | byte-string-in-utf-8

doesn't make any technical sense. :P


> I am probably missing something simple, but.. in the code below, under
> "float", why does [B] throw an error but [A] does not?
>
> # float: cannot explicitly give encoding, even if it's the default
> >>> value = 1.0
> >>> unicode(value)  # [A]
> u'1.0'
> >>> unicode(value, sys.getdefaultencoding())  # [B]
>
> Traceback (most recent call last):
>   File "<pyshell#22>", line 1, in <module>
>     unicode(value, sys.getdefaultencoding())
> TypeError: coercing to Unicode: need string or buffer, float found

Yeah. Unfortunately, you're right: this doesn't make too much sense.

What's happening is that the standard library overloads two
_different_ behaviors onto the same function, unicode(). Which one you
get depends on whether you pass in a single value or two. I would not
try to reconcile the two uses into a single behavior: treat them as
two distinct behaviors.

Reference: https://docs.python.org/2/library/functions.html#unicode

Specifically, the two-arg case is meant for when you've got an
uninterpreted source of bytes that should be decoded to Unicode using
the provided encoding.

So for your problem statement, the function should look something like:

###############################
def convert(data):
    if isinstance(data, float):
        return unicode(data)
    if isinstance(data, bytes):
        return unicode(data, "utf-8")
    raise ValueError("Unexpected data", data)
###############################

where you must use unicode with either the 1-arg or 2-arg variant
based on your input data.
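
For anyone who wants to try this on both Python 2 and Python 3, here is a rough sketch of the same idea. The name `text_type` is my own placeholder, not something from the standard library: it stands for `unicode` on Python 2 and `str` on Python 3, where `str()` has the same split between a 1-arg form (convert an object) and a 2-arg form (decode bytes). Treat it as an illustration under those assumptions rather than the one right way to do it.

```python
# Hypothetical alias: unicode on Python 2, str on Python 3.
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3


def convert(data):
    """Return a unicode string for a float or a UTF-8 encoded byte string."""
    if isinstance(data, float):
        # 1-arg form: plain conversion, no encoding involved.
        return text_type(data)
    if isinstance(data, bytes):
        # 2-arg form: decode raw bytes using the given encoding.
        return text_type(data, "utf-8")
    raise ValueError("Unexpected data", data)


def two_arg_form_rejects_float():
    """Show that the 2-arg form only accepts bytes-like input."""
    try:
        text_type(1.0, "utf-8")
    except TypeError:
        return True
    return False
```

Calling `convert(1.0)` goes through the 1-arg branch and `convert(b"caf\xc3\xa9")` goes through the 2-arg branch; `two_arg_form_rejects_float()` reproduces the TypeError from [B] without crashing the session.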