Sorry it took so long, but I got sidetracked with another problem. I am
back to this now and have read a little on Python's handling of unicode
and its decode/encode methodology. I think I found my solution. When I
hit the error I had something like:
out.write ("...")
message.export (out, 0, name="communication")
out.write ("\n")
where message is an object from generateDS and out is a stream like
stdout or an open file. Unicode handling is not as intrinsic as it is in
Java so there is no way to define a default codec. However, Python is a
little better about handling weird little things so I went with this:
import codecs
c = codecs.lookup ("latin-1")
eout = c.streamwriter(out)
eout.write ("...")
message.export (eout, 0, name="communication")
eout.write ("\n")
All works as expected now. It also lets me assign an arbitrary codec if
I want. I think this is the right way to handle these cases because it
leaves the choice of encoding entirely to the user with no real effort.
Thanks for the help.
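
For anyone finding this in the archives, here is a minimal self-contained
sketch of the same idea. It is only an illustration under some assumptions:
wrap_with_codec is a hypothetical helper name, io.BytesIO stands in for
stdout or an open file, and the string written stands in for what
message.export would produce. It uses codecs.getwriter, which should be
equivalent to taking streamwriter from the result of codecs.lookup.

```python
import codecs
import io


def wrap_with_codec(stream, encoding="latin-1"):
    """Return a writer that encodes unicode text before handing it to stream.

    Hypothetical helper for illustration; not part of generateDS.
    """
    # codecs.getwriter returns the StreamWriter class for the named codec;
    # calling it with a byte stream gives an encoding wrapper around it.
    return codecs.getwriter(encoding)(stream)


# Stand-in for stdout or an open binary file (an assumption of this sketch).
raw = io.BytesIO()
eout = wrap_with_codec(raw, "latin-1")

# Stand-in for message.export(eout, 0, name="communication").
eout.write(u"<request>5 \xb5m</request>\n")
latin1_bytes = raw.getvalue()

# Swapping the codec only changes the name passed to the helper.
raw_utf8 = io.BytesIO()
wrap_with_codec(raw_utf8, "utf-8").write(u"\xb5")
utf8_bytes = raw_utf8.getvalue()
```

The generated export code stays untouched; only the stream handed to it
changes, so the encoding remains the caller's choice.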
On Mon, 2008-12-01 at 20:18 -0800, Dave Kuhlman wrote:
> > From: Al Niessner <[EMAIL PROTECTED]>
> > To: Dave Kuhlman <[EMAIL PROTECTED]>
> > Cc: "[email protected]"
> > <[email protected]>
> > Sent: Monday, December 1, 2008 1:41:12 PM
> > Subject: Re: [Generateds-users] Unicode problem
> >
> >
> > Here is the stack output. I will trace it out tomorrow and see what I
> > can do with it. It does not look positive because it seems to be a deep
> > rooted Python thing. I will get back to you though.
> >
>
> [snip]
>
> > File "/tmp/Test71/SciPyScripts/communicationsInterface.py", line 939,
> > in exportChildren
> > outfile.write('<%srequest>%s</%srequest>\n' % (namespace_,
> > quote_xml(self.get_request()), namespace_))
> > UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in
> > position 18: ordinal not in range(128)
>
> So, we have a unicode character. Before writing it out, we need to
> encode it in some external character set. Consider the following:
>
> >>> a = u'\xb5'
> >>>
> >>> print a
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in
> position 0: ordinal not in range(128)
> >>>
> >>> print a.encode('utf-8')
> µ
> >>>
>
> (Not sure whether the mu/micro symbol will show up in email.)
>
> Suggestion -- You might try using the -s and the --super flags to
> generateDS.py. That will generate a sub-class module that will
> import your (main) generated file. Then in the sub-class where the
> problem occurs, copy and modify the export method from the
> super-class. Add something like the ".encode('utf-8')" to that
> method. Now, process your XML with the sub-class module/file.
>
> Once you get that working, let's talk about whether generateDS.py
> should be modified so that it generates code that handles this
> situation automatically and without modifications to a generated
> file.
>
> A bit of excuse making -- I had zero understanding of unicode when
> I initially started work on generateDS.py. Now, at least, I have a
> smidgen.
>
> And, here is a Web page that I found very helpful:
>
> http://farmdev.com/talks/unicode/
>
> Excuse me if I'm insulting you here. I don't know where you are on
> the unicode learning curve.
>
> - Dave
>
> --
>
> Dave Kuhlman
> http://www.rexx.com/~dkuhlman
--
Al Niessner
818.354.0859
All opinions stated above are mine and do not necessarily reflect those
of JPL or NASA.
--------
| dS | >= 0
--------