On 06.04.2006 at 10:27, Christian Boos wrote:
> Of course, that would not be applied to every text used in the system,
> only to file content. The above `to_unicode` is also used for that,
> so I think I'll rename it `data_to_unicode` (preserve content),
> to contrast it with `text_to_unicode` (which might be "lossy").
I still fail to see how any text decoding can *not* be lossy if you
don't know the encoding. Decoding using ISO-8859-15 is only going to
be non-lossy if that *happens to be* the encoding of the text.
And why would we ever want to decode non-textual data to unicode?
Such attempts should be considered a bug.
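
To illustrate the point, here is a minimal sketch in present-day Python. The sample string and variable names are made up for this example and have nothing to do with the Trac code base. Decoding with ISO-8859-15 never raises an error, but it only gives back the right characters when the bytes really were ISO-8859-15:

```python
# Hypothetical sample text, encoded in two different ways.
original = "Größe"
latin9_bytes = original.encode("iso-8859-15")
utf8_bytes = original.encode("utf-8")

# Correct guess: the original text comes back unchanged.
decoded_right = latin9_bytes.decode("iso-8859-15")

# Wrong guess: decoding succeeds silently, but the umlaut and the
# sharp s each turn into two unrelated characters.
decoded_wrong = utf8_bytes.decode("iso-8859-15")

print(decoded_right == original)   # the guess happened to be correct
print(decoded_wrong == original)   # the guess was wrong, text is garbled
```

Nothing in the second case tells the caller that the result is wrong, which is why guessing an encoding cannot be considered safe for text.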
> An alternative to having two `*_to_unicode` versions would be to add
> a third optional argument: `to_unicode(text, charset=None,
> lossy=False)`.
That would be better, but as explained above, I fail to see the point
of `lossy=False`.
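
For reference, one possible reading of that signature, as a rough sketch only. This is not the actual Trac implementation: the UTF-8 default and the mapping of `lossy` onto the codec error handlers are assumptions of mine, and it is written for present-day Python:

```python
def to_unicode(text, charset=None, lossy=False):
    """Hypothetical sketch of the proposed signature, not Trac's code."""
    if isinstance(text, str):
        # Already decoded, nothing to do.
        return text
    # Assumption: fall back to UTF-8 when no charset is given.
    encoding = charset or "utf-8"
    # Assumption: lossy=True substitutes U+FFFD for undecodable bytes,
    # lossy=False raises UnicodeDecodeError instead.
    errors = "replace" if lossy else "strict"
    return text.decode(encoding, errors)
```

Under this reading, `lossy=False` only helps when the charset is actually known. With a guessed single-byte charset such as ISO-8859-15 every byte decodes, so the strict mode never raises and cannot detect a wrong guess.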
> PS: Hm, I just realized I've begun to wiki-format my e-mails... Damn :)
And I thought you've been doing that since, like, forever :-P
Cheers,
Chris
--
Christopher Lenz
cmlenz at gmx.de
http://www.cmlenz.net/
_______________________________________________
Trac-dev mailing list
http://lists.edgewall.com/mailman/listinfo/trac-dev