Hi again Dave,

I completely agree with you. Unicode strings seem to be the way to go, since they are the most general string type, and many Python APIs seem to use them.
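To make the idea concrete, here is a minimal sketch of "unicode internally, encode only on export". The helper name export_text and its external_encoding parameter are made up for illustration; they are not part of generateDS, and the real exportLiteral code may well need a different shape.

```python
# -*- coding: utf-8 -*-
import io


def export_text(outfile, text, external_encoding="utf-8"):
    """Hypothetical helper (not generateDS API): write a unicode string
    to a binary stream, encoding it only at the export boundary."""
    outfile.write(text.encode(external_encoding))


# All processing is done on the unicode value...
name = u"\xe4\xe4\xe4"

# ...and bytes only appear when the value leaves the program.
buf = io.BytesIO()
export_text(buf, name)
exported = buf.getvalue()
```

With this split, changing the output encoding would only touch the export call, not the strings held in the object tree.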
I did a quick experiment to verify that a statically declared unicode string is correctly encoded to utf-8, provided that it is declared in a Python file whose characters outside the 128-range (non-ASCII) are read using the utf-8 encoding:

test.py:

    # -*- coding: utf-8 -*-
    a = u'äää'
    print (type(a), a)
    b = a.encode('utf-8')
    print (type(b), b)

% python test.py
(<type 'unicode'>, u'\xe4\xe4\xe4')
(<type 'str'>, '\xc3\xa4\xc3\xa4\xc3\xa4')

Thanks again for a great piece of software. With generateDS I don't need to care about XML anymore :)

/Kristoffer

On Wed, Dec 8, 2010 at 12:52 AM, Dave Kuhlman <dkuhl...@rexx.com> wrote:
> >From: Kristoffer Kobosko
> >Sent: Thu, December 2, 2010 5:52:40 AM
> >
> > Hi Dave!
> >
> > I am currently using and enjoying generateds very much!
> >
>
> Kristoffer -
>
> Super. Glad it's of help. Thanks for letting me know.
>
> > One thing I have noticed though, is that an xml object structure
> > fails to export properly to python code using a generated
> > parser/generator if it contains unicode characters outside of the
> > ascii range.
>
> Sigh. Yes, I've left the generation of Python code behind a bit
> during recent changes. I'll get back to it soon I hope.
>
> >
> > XML code parsing / exporting is handled properly thanks to the
> > --external-encoding command line argument to generateDS, which does
> > .encode(external_encoding) properly, but this is not the case for
> > python exports.
>
> Thanks for pointing this out. I'll put the encoding capability on
> my list of things to fix in the exportLiteral stuff.
>
> >
> > Is there a reason behind this, or should it actually be considered
> > as a bug?
>
> Seems like a bug. I'll look more closely.
>
> >
> > I understand that python does support code in utf-8 format.
>
> Yes. I believe you are right. In fact, the Python style guide
> (http://www.python.org/dev/peps/pep-0008/) even suggests that utf-8
> is preferred, especially for Python 3.0 and beyond.
>
> But, here is a question -- Should literal strings in the generated
> Python code be unicode which is then encoded as utf-8 during
> export? Or, should those strings be encoded as utf-8 strings? My
> guess is the first, so that we can do our processing in Python
> unicode strings. What do you think?
>
> Thanks again for the nudge. Sometimes I can use a little
> motivation. And, thanks for the suggestions. I hope to get to
> them before too long.
>
> - Dave
>
> --
>
> Dave Kuhlman
> http://www.rexx.com/~dkuhlman
>
------------------------------------------------------------------------------
_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users