Hi

I am trying to read in non-ASCII data from a file using Unicode, with this test app:

vocab=[("abends","in the evening"),
("aber","but"),
("die abflughalle","departure lounge"),
("abhauen","to beat it/leave"),
("abholen","to collect/pick up"),
("das Abitur","A-levels"),
("abmachen","to take off"),
("abnehem","to lose weight"),
("die Auff\xFCrung","performance (of a play)"),
("der Au\xDFenhandel","foreign trade")
]

print "data from list"
for (word1, word2) in vocab:
    print "   ", word1, unicode(word1,"latin1")

print "\ndata from file"
in_file = open("eng_ger.txt","r")
for line in in_file:
    words = line.split(',')
    print "   ",words[0],unicode(words[0],"latin1")
in_file.close()

The data in the file "eng_ger.txt" is listed below. When I parse the data from the list, the correct text is displayed, but when I read it from the file, the conversion to Unicode does not occur. I would be really grateful if someone could explain why the string -> unicode conversion works with lists but not with files!

Thanks in advance

Alun Griffiths

Contents of "eng_ger.txt"

abends,in the evening
aber,but
die abflughalle,departure lounge
abhauen,to beat it/leave
abholen,to collect/pick up
das Abitur,A-levels
abmachen,to take off
abnehem,to lose weight
die Auff\xFCrung,performance (of a play)
der Au\xDFenhandel,foreign trade
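
One possible explanation, sketched here under the assumption that eng_ger.txt really holds the four characters backslash, x, F, C (as shown in the listing above) rather than a single 0xFC byte: Python only interprets \xFC as an escape inside a string literal in source code, so the list entry and the file entry are different byte strings. The names from_source, from_file and decode_literal_escapes are made up for illustration, and the "unicode_escape" codec is just one way to interpret such text, not necessarily the best fix for this file.

```python
# In Python source, "\xFC" inside a string literal is an escape: ONE byte, 0xFC.
from_source = b"die Auff\xFCrung"

# Assumption: the file stores the escape as plain text (backslash, x, F, C).
# A doubled backslash in source produces those same four literal characters.
from_file = b"die Auff\\xFCrung"


def decode_literal_escapes(raw):
    """Interpret backslash escapes stored as plain text (hypothetical helper)."""
    # "unicode_escape" maps \xNN to code point U+00NN, which agrees with Latin-1.
    return raw.decode("unicode_escape")


source_text = from_source.decode("latin-1")      # u-umlaut decoded from one byte
file_text_plain = from_file.decode("latin-1")    # backslash sequence left untouched
file_text_fixed = decode_literal_escapes(from_file)

print(len(from_source))                 # byte count of the literal from source
print(len(from_file))                   # byte count of the text as stored in the file
print(file_text_plain == source_text)
print(file_text_fixed == source_text)
```

If the file instead contained real Latin-1 bytes (for example, saved from an editor set to ISO-8859-1), the plain "latin1" decode in the test app would presumably already be enough.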

