Øyvind wrote:
> I tried the error='replace' as you suggested and the program made it thru
> the list. However, here are some results:
>
> the gjenoppl�et gjenoppl�
> from
> the gjenoppløst det gjenoppløste
>
> kan v� konsentrert
> from
> kan være konsentrert

It seems pretty clear that you are using the wrong encoding somewhere.

> I did check the site http://www.columbia.edu/kermit/utf8.html and the
> letters that is the problem here are a part of the utf-8.

That doesn't mean anything. Pretty much every letter used in every natural language of the world is part of Unicode; that's the point of it. utf-8 is just a way to encode Unicode, so it includes all Unicode characters. The important question is: what is the actual encoding of your source data?

> Is there anything else I could try?

Understand why the above question is important, then answer it. Until you do, you are just thrashing around in the dark. Do you know what a character encoding is? Do you understand the difference between utf-8 and latin-1?

Kent

--
http://www.kentsjohnson.com
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
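
To make the utf-8 versus latin-1 question concrete, here is a minimal sketch in Python 3. It assumes, purely for illustration, that the source data was written as latin-1; that is a guess, not something established in this thread, and the variable names are made up for the example. It shows how the same text becomes different bytes under the two encodings, and why decoding with the wrong one plus errors='replace' leaves replacement characters where the Norwegian letters should be.

```python
# Sample text taken from the results quoted above.
text = "kan være konsentrert"

# The same characters become different bytes under different encodings.
latin1_bytes = text.encode("latin-1")  # 'æ' is the single byte 0xE6
utf8_bytes = text.encode("utf-8")      # 'æ' is the two bytes 0xC3 0xA6

# ASSUMPTION: the file on disk is latin-1. Decoding it strictly as utf-8
# fails, because 0xE6 followed by 'r' is not a valid utf-8 sequence.
try:
    latin1_bytes.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" hides that failure by substituting U+FFFD for the bad
# bytes, so the program runs to the end but the letter is lost.
wrong = latin1_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the data was actually written in is lossless.
right = latin1_bytes.decode("latin-1")

print(repr(wrong))
print(repr(right))
```

If the strict decode raises UnicodeDecodeError on your real data, that is the useful signal: the bytes are not utf-8, and the fix is to find out what they are, not to silence the error.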