On 17 Jul 2006, at 8:25, tsuyuki makoto wrote:

> We Japanese know that we can't translate Japanese to ASCII.
> So I want to do it as follows at least.
> A letter does not disappear and is restored.
> #FileField and ImageField have same letters disappear problem.
>
> def slug_ja(word) :
>     try :
>         unicode(word, 'ASCII')
>         import re
>         slug = re.sub('[^\w\s-]', '', word).strip().lower()
>         slug = re.sub('[-\s]+', '-', slug)
>         return slug
>     except UnicodeDecodeError :
>         from encodings import idna
>         painful_slug = word.strip().lower().decode('utf-8').encode('IDNA')
>         return painful_slug
I’m not convinced by this approach, but I would suggest using the “punycode” encoder instead of the “idna” one anyway. Its results don’t include the initial “xn--” prefix, which is only useful in a domain name, not in a URI path. Also, the “from encodings […]” line appears to be unnecessary on my Python 2.3.5 and 2.4.1 on OS X.

[[[
>>> p = u"perché"
>>> from encodings import idna
>>> p.encode('idna')
'xn--perch-fsa'
>>> p.encode('punycode')
'perch-fsa'
>>> puny = 'perch-fsa'
>>> puny.decode('punycode')
u'perch\xe9'
>>> print puny.decode('punycode')
perché
>>> pu = puny.decode('punycode') # it's reversible
>>> print pu
perché
]]]

More on Punycode: http://en.wikipedia.org/wiki/Punycode

Cheers.

-- 
Antonio
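
P.S. For anyone who wants to try the idea, here is a minimal sketch of the quoted function with “punycode” swapped in for “idna”. It is written for Python 3 (the session above is Python 2), and the names slug_puny and unslug_puny are made up for illustration; nothing like them exists in Django. Like the quoted slug_ja, it does not clean up whitespace or punctuation in the non-ASCII branch, so treat it as a starting point only.

```python
import re


def slug_puny(word):
    """Hypothetical variant of the quoted slug_ja, using the 'punycode' codec."""
    text = word.strip().lower()
    try:
        # Raises UnicodeEncodeError if any character is outside ASCII.
        text.encode('ascii')
    except UnicodeEncodeError:
        # Non-ASCII input: Punycode keeps every letter and can be decoded again.
        # Assumption: whitespace and punctuation are left untouched here.
        return text.encode('punycode').decode('ascii')
    # Pure ASCII input: same clean-up as the quoted function.
    slug = re.sub(r'[^\w\s-]', '', text)
    return re.sub(r'[-\s]+', '-', slug)


def unslug_puny(slug):
    """Hypothetical inverse for slugs produced by the non-ASCII branch."""
    return slug.encode('ascii').decode('punycode')


if __name__ == '__main__':
    print(slug_puny('perch\u00e9'))
    print(unslug_puny('perch-fsa'))
    print(slug_puny('Hello, World'))
```

Only call unslug_puny on slugs that came out of the non-ASCII branch; an ASCII slug such as “hello-world” was never Punycode-encoded, so decoding it would not give back the original text.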