Re: How to display Chinese in a list retrieved from database via python

Mark Tolonen Mon, 29 Dec 2008 01:07:17 -0800

"zxo102" <[email protected]> wrote in message news:2560a6e0-c103-46d2-aa5a-8604de4d1...@b38g2000prf.googlegroups.com...

I have a list in a dictionary and want to insert it into the html
file. I test it with following scripts of CASE 1, CASE 2 and CASE 3. I
can see "中文" in CASE 1 but that is not what I want. CASE 2 does not
show me correct things.
So, in CASE 3, I hacked the script of CASE 2 with a function:
conv_list2str() to 'convert' the list into a string. CASE 3 can show
me "中文". I don't know what is wrong with CASE 2 and what is right with
CASE 3.

Without knowing why, I have just hard coded my python application
following CASE 3 for displaying Chinese characters from a list in a
dictionary in my web application.

Any ideas?


See below each case...

新年快乐！

Happy a New Year: 2009

ouyang



CASE 1:
########################################################
f=open('test.html','wt')
f.write('''<html><head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=gb2312">
<title>test</title>
<script language=javascript>
var test = ['\xd6\xd0\xce\xc4', '\xd6\xd0\xce\xc4', '\xd6\xd0\xce\xc4']
alert(test[0])
alert(test[1])
alert(test[2])
</script>
</head>
<body></body></html>''')
f.close()

In CASE 1, the *4 bytes* D6 D0 CE C4 are written to the file, which is the correct gb2312 encoding for 中文.

CASE 2:
#######################################################
mydict = {}
mydict['JUNK'] = ['\xd6\xd0\xce\xc4','\xd6\xd0\xce\xc4','\xd6\xd0\xce\xc4']
f_str = '''<html><head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=gb2312">
<title>test</title>
<script language=javascript>
var test = %(JUNK)s
alert(test[0])
alert(test[1])
alert(test[2])
</script>
</head>
<body></body></html>'''

f_str = f_str%mydict
f=open('test02.html','wt')
f.write(f_str)
f.close()

In CASE 2, the *16 characters* "\xd6\xd0\xce\xc4" are written to the file, which is NOT the correct gb2312 encoding for 中文, and will be interpreted however JavaScript pleases. This is because the str() representation of mydict['JUNK'] in Python 2.x is the characters "['\xd6\xd0\xce\xc4', '\xd6\xd0\xce\xc4', '\xd6\xd0\xce\xc4']".
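
The difference between the 4 bytes and the 16 characters can be checked directly. The following is only a rough sketch, and it assumes Python 3, where a bytes object plays the role of the Python 2.x byte string used in the cases above; the names items, as_text and escaped are made up for the illustration.

```python
# Python 3 sketch: bytes objects stand in for Python 2.x byte strings.
items = [b'\xd6\xd0\xce\xc4'] * 3   # the gb2312 bytes for 中文, three times

as_text = str(items)      # str() of a list uses repr() of each element
first = repr(items[0])    # the escaped text, wrapped in b'...'
escaped = first[2:-1]     # drop the leading b' and the closing quote

print(as_text)
print(len(items[0]), "bytes in the original string")
print(len(escaped), "characters in its escaped representation")
```

Formatting the list into the template writes the escaped text, not the original bytes, which is what ends up in test02.html.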

CASE 3:
###################################################
mydict = {}
mydict['JUNK'] = ['\xd6\xd0\xce\xc4','\xd6\xd0\xce\xc4','\xd6\xd0\xce\xc4']

f_str = '''<html><head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=gb2312">
<title>test</title>
<script language=javascript>
var test = %(JUNK)s
alert(test[0])
alert(test[1])
alert(test[2])
</script>
</head>
<body></body></html>'''

import string

def conv_list2str(value):
  list_len = len(value)
  list_str = "["
  for ii in range(list_len):
      list_str += '"'+string.strip(str(value[ii])) + '"'
      if ii != list_len-1:
       list_str += ","
  list_str += "]"
  return list_str

mydict['JUNK'] = conv_list2str(mydict['JUNK'])

f_str = f_str%mydict
f=open('test03.html','wt')
f.write(f_str)
f.close()

CASE 3 works because you build your own, correct, gb2312 representation of mydict['JUNK'] (value[ii] above is the correct 4-byte sequence for 中文).

That said, learn to use Unicode strings by trying the following program, but set the first line to the encoding *your editor* saves files in. You can use the actual Chinese characters instead of escape codes this way. The encoding used for the source code and the encoding used for the html file don't have to match, but the charset declared in the file and the encoding used to write the file *do* have to match.
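
To see why the declared charset and the encoding used for writing have to agree, here is a small sketch. It assumes Python 3 (where every string literal is already Unicode) and a source file saved as UTF-8; the variable names are only for illustration.

```python
text = '中文'

gb = text.encode('gb2312')     # bytes a gb2312 page should contain
utf8 = text.encode('utf-8')    # bytes a utf-8 page would contain

# A reader that trusts charset=gb2312 decodes whatever bytes it finds as gb2312.
matched = gb.decode('gb2312')
mismatched = utf8.decode('gb2312', errors='replace')

print(gb)                  # the same four bytes as in CASE 1
print(matched == text)     # declared charset matches the bytes written
print(mismatched == text)  # utf-8 bytes read as gb2312 do not round-trip
```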


# coding: utf8

import codecs

mydict = {}
mydict['JUNK'] = [u'中文',u'中文',u'中文']

def conv_list2str(value):
   return u'["' + u'","'.join(s for s in value) + u'"]'

f_str = u'''<html><head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=gb2312">
<title>test</title>
<script language=javascript>
var test = %s
alert(test[0])
alert(test[1])
alert(test[2])
</script>
</head>
<body></body></html>'''

s = conv_list2str(mydict['JUNK'])
f=codecs.open('test04.html','wt',encoding='gb2312')
f.write(f_str % s)
f.close()


-Mark

P.S. Python 3.0 makes this easier for what you want to do, because the representation of a dictionary changes. You'll be able to skip the conv_list2str() function and all strings are Unicode by default.
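
A rough idea of what that could look like, as an untested-on-your-setup sketch that assumes Python 3 and a UTF-8 source file. The file name test05.html and the temporary directory are made up for the example, and the template is shortened to a single alert.

```python
import os
import tempfile

items = ['中文', '中文', '中文']

template = '''<html><head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=gb2312">
<title>test</title>
<script language=javascript>
var test = %s
alert(test[0])
</script>
</head>
<body></body></html>'''

# In Python 3, str() of a list of strings keeps the Chinese characters.
page = template % str(items)

# Hypothetical output location, chosen only for this example.
path = os.path.join(tempfile.mkdtemp(), 'test05.html')
with open(path, 'w', encoding='gb2312') as f:   # must match the declared charset
    f.write(page)

# Read the raw bytes back to confirm what was written.
with open(path, 'rb') as f:
    raw = f.read()
print(raw.count(b'\xd6\xd0\xce\xc4'))
```

This leans on the list representation happening to be a valid javascript array literal, which will not hold for every string (one containing quotes or backslashes, for instance). json.dumps(items, ensure_ascii=False) from the standard json module is probably the more robust way to build the literal.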



--
http://mail.python.org/mailman/listinfo/python-list
