Re: [Generateds-users] Unicode problem

Al Niessner Wed, 03 Dec 2008 14:09:35 -0800

Sorry it took so long, but I got sidetracked by another problem. I am
back to this now and have read a little about Python's handling of
Unicode and its decode/encode methodology. I think I found my solution.
When I hit the error I had something like:


out.write ("...")
message.export (out, 0, name="communication")
out.write ("\n")

where message is an object from generateDS and out is a stream like
stdout or an open file. Unicode handling is not as intrinsic in Python
as it is in Java, so there is no obvious way to define a default codec
for a stream. However, Python does make it easy to wrap a stream in a
codec, so I went with this:

import codecs

# Wrap the output stream so unicode text is encoded as Latin-1 on write.
c = codecs.lookup ("latin-1")
eout = c.streamwriter (out)
eout.write ("...")
message.export (eout, 0, name="communication")
eout.write ("\n")

All works as expected now. It also lets me pick whichever codec I want.
I think this is the right way to handle these cases because it leaves
the choice of encoding entirely to the user with very little effort.
Thanks for the help.
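
For anyone finding this in the archive, here is a minimal self-contained
sketch of the same idea. It is only an illustration under a few
assumptions: fake_export is a hypothetical stand-in for the generated
message.export method, export_with_codec is a helper name I made up, and
io.BytesIO stands in for stdout or an open file. codecs.getwriter is the
standard library shortcut for looking up a codec's stream writer.

```python
import codecs
import io


def export_with_codec(write_body, encoding="latin-1"):
    """Run write_body against an encoding stream writer; return the raw bytes."""
    raw = io.BytesIO()                      # stands in for stdout or an open file
    eout = codecs.getwriter(encoding)(raw)  # StreamWriter: encodes text on write
    write_body(eout)
    return raw.getvalue()


def fake_export(out):
    # Hypothetical stand-in for message.export(out, 0, name="communication").
    # The text contains the micro sign, which plain ASCII cannot encode.
    out.write(u"<request>5 \xb5m</request>\n")


latin1_bytes = export_with_codec(fake_export, "latin-1")
utf8_bytes = export_with_codec(fake_export, "utf-8")
```

The same export call produces different bytes depending only on the codec
handed to the wrapper, and asking for "ascii" raises the same
UnicodeEncodeError as in the original traceback.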

On Mon, 2008-12-01 at 20:18 -0800, Dave Kuhlman wrote:
> > From: Al Niessner <[EMAIL PROTECTED]>
> > To: Dave Kuhlman <[EMAIL PROTECTED]>
> > Cc: "generateds-users@lists.sourceforge.net" 
> > <generateds-users@lists.sourceforge.net>
> > Sent: Monday, December 1, 2008 1:41:12 PM
> > Subject: Re: [Generateds-users] Unicode problem
> >
> >
> > Here is the stack output. I will trace it out tomorrow and see what I
> > can do with it. It does not look positive because it seems to be a deep
> > rooted Python thing. I will get back to you though.
> >
> 
> [snip]
> 
> >   File "/tmp/Test71/SciPyScripts/communicationsInterface.py", line 939,
> > in exportChildren
> >     outfile.write('<%srequest>%s</%srequest>\n' % (namespace_,
> > quote_xml(self.get_request()), namespace_))
> > UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in
> > position 18: ordinal not in range(128)
> 
> So, we have a unicode character.  Before writing it out, we need to
> encode it in some external character set.  Consider the following:
> 
>     >>> a = u'\xb5'
>     >>>
>     >>> print a
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in 
> position 0: ordinal not in range(128)
>     >>>
>     >>> print a.encode('utf-8')
>     µ
>     >>>
> 
> (Not sure whether the mu/micro symbol will show up in email.)
> 
> Suggestion -- You might try using the -s and the --super flags to
> generateDS.py.  That will generate a sub-class module that will
> import your (main) generated file.  Then in the sub-class where the
> problem occurs, copy and modify the export method from the
> super-class.  Add something like the ".encode('utf-8')" to that
> method.  Now, process your XML with the sub-class module/file.
> 
> Once you get that working, let's talk about whether generateDS.py
> should be modified so that it generates code that handles this
> situation automatically and without modifications to a generated
> file.
> 
> A bit of excuse making -- I had zero understanding of unicode when
> I initially started work on generateDS.py.  Now, at least, I have a
> smidgen.
> 
> And, here is a Web page that I found very helpful:
> 
>     http://farmdev.com/talks/unicode/
> 
> Excuse me if I'm insulting you here.  I don't know where you are on
> the unicode learning curve.
> 
> - Dave
> 
> --
> 
> Dave Kuhlman
> http://www.rexx.com/~dkuhlman
-- 
Al Niessner
818.354.0859

All opinions stated above are mine and do not necessarily reflect those
of JPL or NASA.

--------
|  dS  | >= 0
--------



_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users
