On Saturday, October 19, 2013 12:16:02 PM UTC-4, Steven D'Aprano wrote:

> Another reasonable use for accent-stripping is searches. If I'm searching 
> for music by the Blue Öyster Cult, it would be good to see results for 
> Blue Oyster Cult as well.

Tell me about it (I work at Songza; music search is what we do).  Accents are 
easy (Beyoncé, for example).  What about NIN (where one of the N's is supposed 
to be backwards, but I can't figure out how to type that)?  And Ke$ha.  And 
"The artist previously known as a glyph which doesn't even exist in Unicode 6.3".
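
For the easy case, here is a minimal sketch of accent-insensitive matching.
The helper names strip_accents and search_key are made up for illustration
(this is not how Songza does it); the approach is just NFKD decomposition
followed by dropping combining marks, using the stdlib unicodedata module.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical helper: drop combining marks after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns 0 for non-combining characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_key(text):
    """Hypothetical helper: accent-stripped, case-folded key for comparison."""
    return strip_accents(text).casefold()


if __name__ == "__main__":
    print(search_key("Blue Öyster Cult") == search_key("Blue Oyster Cult"))
    print(strip_accents("Beyoncé"))
    # Substitutions like '$' for 's' are not accents, so this stays distinct.
    print(search_key("Ke$ha") == search_key("Kesha"))
```

This only covers characters that decompose into a base letter plus a
combining mark.  Ke$ha, a backwards N, and similar stylings would need
something else, such as a hand-maintained alias table.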

> On the other hand, if you name your band ▼□■□■□■, you deserve to wallow 
> in obscurity :-)

Indeed.

So, yesterday, I tracked down an uncaught exception stack trace in our logs to a
user whose username included the Unicode character 'SMILING FACE WITH SUNGLASSES'
(U+1F60E).  It turns out that's perfectly fine as a username, except that in
one obscure error code path, we try to str() it during some error processing.
If you named your band something which included that character, would you
expect it to match a search for the same name but with 'WHITE SMILING FACE'
(U+263A) instead?
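
A rough sketch of both halves of that, written for Python 3.  I'm assuming
the failing code path ran under Python 2, where str() on a unicode object
encodes with the ASCII codec and raises UnicodeEncodeError; the strict
encode below imitates that.  The names safe_ascii and same_after_nfkd are
invented for this example.

```python
import unicodedata

SUNGLASSES = "\U0001F60E"   # SMILING FACE WITH SUNGLASSES
WHITE_SMILE = "\u263A"      # WHITE SMILING FACE


def safe_ascii(text):
    """Hypothetical helper: render text as ASCII for logs without raising."""
    # Non-ASCII characters become backslash escapes instead of errors.
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


def same_after_nfkd(a, b):
    """True if two strings are equal once both are NFKD-normalized."""
    return unicodedata.normalize("NFKD", a) == unicodedata.normalize("NFKD", b)


if __name__ == "__main__":
    username = "user" + SUNGLASSES
    try:
        username.encode("ascii")  # stands in for the Python 2 str() call
    except UnicodeEncodeError as exc:
        print("strict ASCII encoding fails:", exc.reason)
    print(safe_ascii(username))
    print(unicodedata.name(SUNGLASSES), "vs", unicodedata.name(WHITE_SMILE))
    print(same_after_nfkd(SUNGLASSES, WHITE_SMILE))
```

Normalization treats the two faces as unrelated characters, so matching
one against the other would have to be a deliberate product decision
rather than something Unicode gives you for free.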


-- 
https://mail.python.org/mailman/listinfo/python-list
