Re: [pygtk] Problem in fetching Unicode from URL and displaying it in PyGTK widget

Bertrand Kintanar Sat, 18 Jul 2009 03:09:12 -0700

On 7/17/09 11:19 PM, Walter Leibbrandt wrote:

Bertrand Kintanar wrote:

On 7/17/09 9:19 PM, saeed wrote:
s1 = 'Guz&#xE1;n'
s2 = ''
n = len(s1)
i = 0
while i<n:
   if i<=n-6:  # <= so an entity at the very end of the string is decoded too
     if s1[i:i+3]=='&#x' and s1[i+5]==';':
       s2 += unichr(int(s1[i+3:i+5], 16)).encode('utf-8')
       i += 6
       continue
   s2 += s1[i]
   i += 1
print s2
Now this fixes it all. Thanks a lot. I hope there is some sexier way to do this, though, but this will work. Thanks again.

import re
htmluni = re.compile(r'&#x([\dA-Fa-f]+);')
data = 'Guz&#xE1;n   Guz&#xE1;n'


match = htmluni.search(data)
while match:
    data = data[:match.start()] + unichr(int(match.group(1), 16)) + data[match.end():]
    match = htmluni.search(data)

Thanks for this, Walter. I'm also using regex for my search, but never thought to use it the way you have here.
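
For what it's worth, the search-and-splice loop above can probably be collapsed into a single re.sub() call with a callback. This is only a sketch, not something posted in the thread: the names HTMLUNI and decode_hex_entities are made up for illustration, the pattern is the same hex-only one used above, and the try/except is just there so the snippet also runs where unichr does not exist (Python 3).

```python
# -*- coding: utf-8 -*-
import re

try:
    unichr  # Python 2 builtin
except NameError:
    unichr = chr  # Python 3: chr() already returns a Unicode character

# Same pattern as in the loop above: hexadecimal references such as &#xE1; only.
# HTMLUNI and decode_hex_entities are hypothetical names chosen for this sketch.
HTMLUNI = re.compile(r'&#x([\dA-Fa-f]+);')


def decode_hex_entities(text):
    """Replace every &#x...; reference in text with the character it names."""
    # re.sub calls the lambda once per match and splices in its return value,
    # so no manual slicing or repeated searching is needed.
    return HTMLUNI.sub(lambda m: unichr(int(m.group(1), 16)), text)


decoded = decode_hex_entities(u'Guz&#xE1;n   Guz&#xE1;n')
```

The result is a Unicode string; call .encode('utf-8') on it if a UTF-8 byte string is wanted, as the first snippet does. A reference with a value outside the Unicode range would raise ValueError, so untrusted input may need a try/except around the conversion.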

_______________________________________________
pygtk mailing list   [email protected]
http://www.daa.com.au/mailman/listinfo/pygtk
Read the PyGTK FAQ: http://faq.pygtk.org/
