The "string" isn't necessarily text, so selecting latin-1 doesn't help (in
fact, what happens is that the current default encoding is used, which in his
case was ascii).  What if it is image data?  What if you are using a dict to
implement a singleton set for arbitrary objects?

The point is that if the comparison operator raises an exception, the two
objects are likely dissimilar.  We could even define that behaviour.
Propagating the exception means that you can't use objects as dictionary keys
if comparing them can raise an exception.  This goes well beyond any
unicode-vs.-string question.
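The problem above can be demonstrated in any Python version, independent of the
str/unicode case, with a key whose comparison raises.  A minimal sketch (the
class name Spiky is invented for illustration; a constant hash is used purely
to force the lookup to fall back on equality):

```python
class Spiky:
    """Hypothetical key type whose equality check always raises."""
    def __hash__(self):
        return 42            # constant hash forces __eq__ to run on lookup
    def __eq__(self, other):
        raise ValueError("comparison failed")

d = {Spiky(): "value"}       # inserting is fine: nothing to compare against yet

try:
    d[Spiky()]               # same hash, different identity -> __eq__ runs
    outcome = "found"
except ValueError:
    outcome = "raised"       # the exception propagates out of the dict lookup
```

So the dict machinery itself cannot answer "is this key present?" once a
comparison raises; the exception escapes to the caller.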

If propagating the exception was a conscious change for debugging purposes,
why not make it optional somehow?  A flag on the dict object?  Or special
lookup methods for that?
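One way the "special lookup methods" idea could look, sketched as a dict
subclass (LenientDict and lenient_get are invented names, and the linear scan
is only for illustration, not how a real implementation would work):

```python
class LenientDict(dict):
    """Hypothetical sketch: a key comparison that raises counts as 'not equal'."""
    def lenient_get(self, key, default=None):
        # O(n) scan for the sketch; a real version would hook the hash lookup.
        for k, v in self.items():
            try:
                if k is key or k == key:
                    return v
            except Exception:
                continue          # failing comparison -> treat as a mismatch
        return default

class Raising:
    """Key whose reflected comparison raises, to exercise the lenient path."""
    def __eq__(self, other):
        raise TypeError("no comparison")
    __hash__ = object.__hash__

d = LenientDict()
d["a"] = 1
missing = d.lenient_get(Raising())   # no TypeError; falls back to the default
present = d.lenient_get("a")
```

With ordinary dict lookup the Raising key would propagate the TypeError; here
it simply behaves as "not found".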

Cheers,
Kristján

> -----Original Message-----
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] 
> On Behalf Of Josiah Carlson
> Sent: 4 August 2006 04:34
> To: Bob Ippolito; python-dev@python.org
> Subject: Re: [Python-Dev] unicode hell/mixing str and unicode 
> as dictionarykeys
> 
> 
> Bob Ippolito <[EMAIL PROTECTED]> wrote:
> > On Aug 3, 2006, at 6:51 PM, Greg Ewing wrote:
> > 
> > > M.-A. Lemburg wrote:
> > >
> > >> Perhaps we ought to add an exception to the dict lookup 
> mechanism 
> > >> and continue to silence UnicodeErrors ?!
> > >
> > > Seems to be that comparison of unicode and non-unicode 
> strings for 
> > > equality shouldn't raise exceptions in the first place.
> > 
> > Seems like a slightly better idea than having dictionaries suppress 
> > exceptions. Still not ideal though because sticking 
> non-ASCII strings 
> > that are supposed to be text and unicode in the same data 
> structures 
> > is *probably* still an error.
> 
> If/when 'python -U -c "import test.testall"' runs without 
> unexpected error (I doubt it will happen prior to the "all 
> strings are unicode"
> conversion), then I think that we can say that there aren't 
> any use-cases for strings and unicode being in the same dictionary.
> 
> As an alternate idea, rather than attempting to 
> .decode('ascii') when strings and unicode compare, why not 
> .decode('latin-1')?  We lose the unicode decoding error, but 
> "the right thing" happens (in my opinion) when u'\xa1' and 
> '\xa1' compare.
> 
>  - Josiah
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com
> 