Ivan Sagalaev wrote:
> gabor wrote:
>
>> i'm also willing to help with this task... but..do i understand this
>> correctly, that it's agreed that django is going to switch to unicode?
>
> Being one of the proponents of the all-unicode way back when it was
> proposed I should say that the more I think of it the more I'm afraid
> that it can create just as many problems as it will solve.

hi,

a short summary at the beginning (to borrow the words of someone from
sci.lang.japan): This is a very hard subject to discuss without
creating a personal blog, by the way. :))

i understand your point, but i still think it pays off to switch to
unicode. but it needs a confirmation from the django-BDFL's side that
they are willing to make this step. if not, there's no point in
working on it.

generally i think that you must understand the various aspects of
unicode. whether you use unicode strings or byte strings, you have to
understand them.

below are more or less various rants regarding the topic :)
(if it sounds too aggressive, then sorry, that wasn't my intent)

> Today there
> is a convention that Django works everywhere with byte strings except
> some rare cases where it is convenient to use Python's built-in unicode
> functions (these cases include counting length in validation code and
> various string conversions in filters like in the unapplied patch in
> ticket http://code.djangoproject.com/ticket/924). Using this convention
> one can write international apps without worries since all messy things
> are made inside the framework.

well, i would replace 'without worries' with 'good enough' :)

> If we change this convention to using Python's unicode everywhere we
> will hardly win anything except some feeling of a more "pure" approach.
> At least _I_ can't see any gain :-).

some people would say (maybe including me) that it's already a gain :)

> However there will be some
> disadvantages that I think are serious:
>
> - I heard that many 3rd party libraries don't work with unicode so a
> user will be forced to do some conversions manually

this happens both ways. some libraries work only with unicode strings,
so you have to convert back and forth there too :)

for example, i was building a very simple web-file-manager.
you can ask python to give you the os.listdir, etc. data in unicode,
which i think is the only sensible way, because otherwise you have to
watch out for the filesystem encoding. but because the querystring and
the httpresponse are byte strings, i have .encode('utf8') and
.decode('utf8') all the way.

> - byte strings is a default string type in Python and is more widely
> used than unicode, I'm afraid many people in ASCII-world will be
> resistant to switching all their coding to unicode because they can't be
> sure that it works right because they can't test it easily themselves

hmm... is it that hard to get some japanese text and maybe some german
names? :)

anyway, i think it actually might help them in the end, because
converting your app to unicode means thinking about all the places
where data enters the system and where data leaves the system. so this
is more explicit, and, as 'import this' says: explicit is better than
implicit :))

> So maybe Django should just wait for Python 3000 when unicode strings
> will become the default ones and developers would be more widely aware
> of this change.

'aware of this change' is an understatement :) the default string type
will not be a byte string any more in py3000 :)

gabor
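
a minimal sketch of the "think about where data enters and where data
leaves" idea from above. it is not the file manager's actual code: the
function names (read_name, render_listing) are made up for
illustration, and it assumes utf-8 on both boundaries. it is also
spelled in Python 3 terms, where text is `str` and raw data is `bytes`;
in the Python this thread is about, the same two types are called
`unicode` and `str`.

```python
from urllib.parse import unquote_to_bytes


def read_name(raw_query_value):
    """Input boundary: percent-encoded bytes from a query string -> text.

    Hypothetical helper; assumes the client sent utf-8.
    """
    return unquote_to_bytes(raw_query_value).decode("utf-8")


def render_listing(names):
    """Output boundary: a list of text file names -> response body bytes.

    Hypothetical helper; everything between the two boundaries is text.
    """
    body = "\n".join(sorted(names))
    return body.encode("utf-8")


if __name__ == "__main__":
    name = read_name(b"M%C3%BCller")
    # Length is counted in characters once the value is text ...
    print(len(name))
    # ... and in bytes again once it has been encoded for output.
    print(len(name.encode("utf-8")))
    print(render_listing(["日本語.txt", name + ".txt"]))
```

the point of the design is only that the decode and the encode each
happen in exactly one place, so the code in between never has to guess
which encoding a string is in.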