Hi again Dave,

I completely agree with you. Unicode strings seem to be the way to go, since
they are the most generic type of string, and many Python APIs seem to use
them.

I did a quick experiment to verify that a statically declared unicode string
is correctly encoded to utf-8, given that it is declared in a Python file
whose non-ASCII characters (those outside the 0-127 range) are read as utf-8
because of the coding declaration:

test.py:

# -*- coding: utf-8 -*-
a=u'äää'
print (type(a),a)
b= a.encode('utf-8')
print (type(b),b)

% python test.py

(<type 'unicode'>, u'\xe4\xe4\xe4')
(<type 'str'>, '\xc3\xa4\xc3\xa4\xc3\xa4')
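
The same round trip can be sketched so that it also runs under Python 3
(3.3 or later, where the u'' prefix is accepted). This is only an
illustration: the helper name export_text is hypothetical and not part of
generateDS. It just mimics the .encode(external_encoding) step that the XML
export already performs.

```python
# -*- coding: utf-8 -*-
# Sketch: keep text as unicode internally and encode only when exporting.


def export_text(value, external_encoding='utf-8'):
    """Hypothetical helper (not generateDS API): encode unicode for output."""
    return value.encode(external_encoding)


# Same three characters as in test.py, written as escapes so the result
# does not depend on how this file itself is saved.
a = u'\xe4\xe4\xe4'

b = export_text(a)                 # utf-8: two bytes per character
latin = export_text(a, 'latin-1')  # latin-1: one byte per character

print(repr(b))
print(repr(latin))
```

Decoding b with utf-8 gives back the original unicode string, so no
information is lost as long as the encoding is applied only at the export
boundary.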

Thanks again for a great piece of software. With generateDS I don't need to
care about XML anymore :)

/Kristoffer


On Wed, Dec 8, 2010 at 12:52 AM, Dave Kuhlman <dkuhl...@rexx.com> wrote:

> >From: Kristoffer Kobosko
> >Sent: Thu, December 2, 2010 5:52:40 AM
> >
> > Hi Dave!
> >
> > I am currently using and enjoying generateds very much!
> >
>
> Kristoffer -
>
> Super.  Glad it's of help.  Thanks for letting me know.
>
> > One thing I have noticed though, is that an xml object structure
> > fails to export properly to python code using a generated
> > parser/generator if it contains unicode characters outside of the
> > ascii range.
>
> Sigh.  Yes, I've left the generation of Python code behind a bit
> during recent changes.  I'll get back to it soon I hope.
>
> >
> > XML code parsing / exporting is handled properly thanks to the
> > --external-encoding command line argument to generateDS, which does
> > .encode(external_encoding) properly, but this is not the case for
> > python exports.
>
> Thanks for pointing this out.  I'll put the encoding capability on
> my list of things to fix in the exportLiteral stuff.
>
> >
> > Is there a reason behind this, or should it actually be considered
> > as a bug?
>
> Seems like a bug.  I'll look more closely.
>
> >
> > I understand that python does support code in utf-8 format.
>
> Yes.  I believe you are right.  In fact, the Python style guide
> (http://www.python.org/dev/peps/pep-0008/) even suggests that utf-8
> is preferred, especially for Python 3.0 and beyond.
>
> But, here is a question -- Should literal strings in the generated
> Python code be unicode which is then encoded as utf-8 during
> export?  Or, should those strings be encoded as utf-8 strings?  My
> guess is the first, so that we can do our processing in Python
> unicode strings.  What do you think?
>
> Thanks again for the nudge.  Sometimes I can use a little
> motivation.  And, thanks for the suggestions.  I hope to get to
> them before too long.
>
> - Dave
>
>
>  --
>
>
> Dave Kuhlman
> http://www.rexx.com/~dkuhlman
>
