On Apr 21, 2:25 pm, Peter Otten <__pete...@web.de> wrote:
> Are you sure that your script has
>
> str = u"..."
>
> like in your post and not just
>
> str = "..."
No :-)
str=u""
doc=xml.dom.minidom.parseString( str.encode("utf-8") )
xml=doc.toxml( encoding="utf-8")
file=codecs.open( "foo.xml", "w
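A minimal sketch of the encode-first approach above, written so it runs on a current Python rather than the 2.5.2 used in the thread. The sample document and the helper name `to_utf8_xml` are hypothetical; the thread's own input string is not shown. It relies on `toxml(encoding=...)` returning an already-encoded byte string.

```python
import xml.dom.minidom


def to_utf8_xml(text):
    """Parse a unicode XML string and serialize it back as UTF-8 bytes."""
    # Hand the parser an encoded byte string, as suggested in the thread.
    doc = xml.dom.minidom.parseString(text.encode("utf-8"))
    # With an explicit encoding, toxml() returns an encoded byte string.
    return doc.toxml(encoding="utf-8")


# Hypothetical sample input containing one non-ASCII character.
sample = u"<root>caf\u00e9</root>"
data = to_utf8_xml(sample)
```

Since `data` is already encoded, it would presumably be written with a plain binary-mode file rather than pushed through a second encoding layer; that is an assumption about the intended outcome, not something the thread states.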
On Apr 21, 1:58 pm, Peter Otten <__pete...@web.de> wrote:
> C. Benson Manica wrote:
>> (snip)
>
> It seems that parseString() doesn't like unicode
Yes, I noticed that, and I already tried...
> -- let's try a byte string
> then:
>
> >>>
I have the following simple script running on 2.5.2 on a machine where
the default character encoding is "ascii":
#!/usr/bin/env python
#coding: utf-8
import xml.dom.minidom
import codecs
str=u""
doc=xml.dom.minidom.parseString( str )
xml=doc.toxml( encoding="utf-8" )
file=codecs.open( "foo.xml"
On Mar 9, 12:24 pm, "Richard Brodie" wrote:
> "C. Benson Manica" wrote in message
> news:98375575-1071-46af-8ebc-f3c817b47...@q23g2000yqd.googlegroups.com...
>
> >The strings come from the same place, i.e. they're exclusively
> > normal ASCII charac
On Mar 9, 12:07 pm, Tim Golden wrote:
> You can't. You can apply one or more heuristics, depending on exactly
> what your requirement is. But any valid ASCII text is also valid
> UTF8-encoded text since UTF-8 isn't "two bytes per char" but a variable
> number of bytes per char.
Hm, well that's v
Hours of Googling have not helped me resolve a seemingly simple
question - Given a string s, how can I tell whether it's ascii (and
thus 1 byte per character) or UTF-8 (and two bytes per character)?
This is python 2.4.3, so I don't have getsizeof available to me.
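A rough sketch of one heuristic for the question above, assuming the data arrives as a byte string. The helper names `is_ascii` and `utf8_char_widths` are made up for illustration. It only reports whether every byte is plain ASCII; as noted in the reply quoted earlier in the thread, such text is also valid UTF-8, and UTF-8 characters take a variable number of bytes rather than a fixed two.

```python
def is_ascii(data):
    """Return True if the byte string decodes cleanly as ASCII."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


def utf8_char_widths(text):
    """Return the number of UTF-8 bytes used by each character of text."""
    return [len(ch.encode("utf-8")) for ch in text]


plain = b"hello"
accented = u"caf\u00e9".encode("utf-8")   # 'e' with acute accent

ascii_only = is_ascii(plain)              # every byte is below 0x80
has_non_ascii = not is_ascii(accented)    # contains bytes >= 0x80

# 'a', e-acute (U+00E9) and the euro sign (U+20AC): 1, 2 and 3 bytes.
widths = utf8_char_widths(u"a\u00e9\u20ac")
```

A byte string that fails `is_ascii` is not thereby proven to be UTF-8; it could be Latin-1 or anything else, so this stays a heuristic.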