Re: a simple unicode question

Mark Tolonen Mon, 19 Oct 2009 21:53:56 -0700

"George Trojan" <[email protected]> wrote in message news:[email protected]...

A trivial one, this is the first time I have to deal with Unicode. I am trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is "iso-8859-1". To get the degrees I did
 >>> encoding='iso-8859-1'
 >>> q=s.decode(encoding)
 >>> q.split()
[u'48\xc2\xb0', u"13'", u'16.80"', u'N']
 >>> r=q.split()[0]
 >>> int(r[:r.find(unichr(ord('\xc2')))])
48
Is there a better way of getting the degrees?

It seems your string is UTF-8. \xc2\xb0 is UTF-8 for DEGREE SIGN. If you type non-ASCII characters in source code, make sure to declare the encoding the file is *actually* saved in:


# coding: utf-8

s = '''48° 13' 16.80" N'''
q = s.decode('utf-8')

# next line equivalent to previous two
q = u'''48° 13' 16.80" N'''

# a couple of ways to find the degrees
import re
print int(q[:q.find(u'°')])               # slice up to the degree sign
print re.search(ur'(\d+)°', q).group(1)   # regex capture (a string, not an int)
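
For anyone reading this on Python 3, where the u'' / ur'' literals and str.decode() used above no longer apply, here is a rough sketch of the same idea. It is not part of the original reply. The names raw, wrong, right and DEGREE are made up for illustration, and the bytes are assumed to arrive as UTF-8, as the \xc2\xb0 in the output suggests.

```python
import re

# Assumption: the input arrives as UTF-8 bytes; C2 B0 is DEGREE SIGN (U+00B0).
raw = b"48\xc2\xb0 13' 16.80\" N"
DEGREE = "\u00b0"

# Decoding with the wrong charset maps each byte to its own character,
# so the degree sign shows up as two characters (U+00C2 and U+00B0).
wrong = raw.decode("iso-8859-1")

# Decoding as UTF-8 collapses the two bytes into one DEGREE SIGN.
right = raw.decode("utf-8")

print(len(wrong.split()[0]))  # first token: '4', '8', U+00C2, U+00B0
print(len(right.split()[0]))  # first token: '4', '8', U+00B0

# The same two approaches as above, applied to the correctly decoded text.
degrees_by_find = int(right[:right.find(DEGREE)])
degrees_by_regex = int(re.search(r"(\d+)" + DEGREE, right).group(1))
print(degrees_by_find, degrees_by_regex)
```

Comparing the two token lengths is a quick way to check whether a string was decoded with the wrong charset: a stray U+00C2 in front of the degree sign usually means UTF-8 bytes were read as ISO-8859-1.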

-Mark


--
http://mail.python.org/mailman/listinfo/python-list
