Re: unicode.. reject?

gabor Tue, 30 May 2006 13:07:00 -0700

Ivan Sagalaev wrote:
> gabor wrote:
> 
>> i'm also willing to help with this task... but..do i understand this 
>> correctly, that it's agreed that django is going to switch to unicode?
>>  
>>
> Being one of the proponents of the all-unicode way back when it was 
> proposed I should say that the more I think of it the more I'm afraid 
> that it can create just as many problems as it will solve.


hi,

a short summary at the beginning:
(to borrow the words of someone from sci.lang.japan):

This is a very hard subject to discuss without creating a personal blog,
by the way.

:))

i understand your point, but i still think it pays off to switch to 
unicode. but it needs a confirmation from the django-BDFL's side that 
they are willing to take this step. if not, there's no point in working 
on it.

generally, i think you must understand the various aspects of unicode. 
whether you use unicode strings or byte strings, you have to 
understand them.

below are, more or less, various rants on the topic :) (if it sounds 
too aggressive, then sorry, that wasn't my intent)

> Today there 
> is a convention that Django works everywhere with byte strings except 
> some rare cases where it is convenient to use Python's built-in unicode 
> functions (these cases include counting length in validation code and 
> various string conversions in filters like in the unapplied patch in 
> ticket http://code.djangoproject.com/ticket/924). Using this convention 
> one can write international apps without worries since all messy things 
> are made inside the framework.

well, i would replace 'without worries' with 'good enough' :)
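
to show what i mean by 'good enough', here is a small sketch of the length-counting case. it is written in python 3 syntax (where bytes is the byte string and str is the unicode string), and the sample name is only an assumption for illustration, not something from the django code:

```python
# a name with one non-ASCII character, as it might arrive from a form
name = "Gábor"

# as a unicode string, len() counts characters
assert len(name) == 5

# as a UTF-8 byte string, len() counts bytes, so a max-length
# validator sees one more "character" than the user typed
raw = name.encode("utf-8")
assert len(raw) == 6

# slicing the byte string can also cut a character in half
try:
    raw[:2].decode("utf-8")
except UnicodeDecodeError:
    truncated_ok = False
else:
    truncated_ok = True
```

so the byte-string convention only works if every such place remembers to decode first.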

> 
> If we change this convention to using Python's unicode everywhere we 
> will hardly win anything except some feeling of a more "pure" approach. 
> At least _I_ can't see any gain :-).

some people would say (maybe including me) that it's already a gain :)

> However there will be some 
> disadvantages that I think are serious:
> 
> - I heard that many 3rd party libraries don't work with unicode so a 
> user will be forced to do some conversions manually

this happens both ways. some libraries work with unicode strings, so you 
have to convert back and forth there :)

for example, i was building a very simple web-file-manager. you can ask 
python to return the os.listdir, etc., data in unicode, which i think is 
the only sensible way, because otherwise you have to watch out for the 
filesystem encoding. but because the querystring and the httpresponse are 
byte strings, i have .encode('utf8') and .decode('utf8') all over the place.
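
a rough sketch of that back and forth, in python 3 syntax. the helper names (list_names, to_query_value, from_query_value) are made up for this mail; they are not part of django or of my file manager:

```python
import os
from urllib.parse import quote, unquote

def list_names(directory):
    # a str path makes os.listdir return str (unicode) names,
    # so the filesystem encoding is handled by python, not by us
    return sorted(os.listdir(directory))

def to_query_value(name):
    # unicode -> UTF-8 bytes -> percent-escaped text for the querystring
    return quote(name.encode("utf-8"))

def from_query_value(value):
    # percent-escaped text -> UTF-8 bytes -> unicode
    return unquote(value, encoding="utf-8")

escaped = to_query_value("ü.txt")
restored = from_query_value(escaped)
```

every file name that goes into a link or comes back from a request has to pass through such a pair of calls.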

> - byte strings are the default string type in Python and are more widely 
> used than unicode, I'm afraid many people in ASCII-world will be 
> resistant to switching all their coding to unicode because they can't be 
> sure that it works right because they can't test it easily themselves

hmm... is it that hard to get some japanese text and maybe some german 
names? :)

anyway, i think it actually might help them at the end, because 
converting your app to unicode means to think about all the places where 
data enters the system and where data leaves the system.

so this is more explicit, and, as 'import this' says: explicit is 
better than implicit :))
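
a minimal sketch of what 'explicit' could look like, again in python 3 syntax. the three function names and the latin-1 input are assumptions for the example only:

```python
def read_input(raw, encoding="utf-8"):
    # entry point: bytes from the outside world are decoded exactly once
    return raw.decode(encoding)

def shout(text):
    # inside the app we only ever handle unicode strings
    return text.upper()

def write_output(text, encoding="utf-8"):
    # exit point: unicode is encoded exactly once, on the way out
    return text.encode(encoding)

# a german name arriving as latin-1 bytes, leaving as UTF-8 bytes
request_body = "müller".encode("latin-1")
response = write_output(shout(read_input(request_body, "latin-1")))
```

the encodings show up only at the two edges, so it is easy to see (and test) where they matter.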

> 
> So maybe Django should just wait for Python 3000 when unicode strings 
> will become the default ones and developers would be more widely aware of 
> this change.

'aware of this change' is an understatement :) there will not be any 
byte-strings in py3000 :)


gabor
