Re: [Tutor] Unicode trouble

Kent Johnson Thu, 01 Dec 2005 02:50:33 -0800

Øyvind wrote:
> I tried the error='replace' as you suggested and the program made it thru
> the list. However, here are some results:
> 
> the gjenopplï¿½et gjenopplï¿½
> from
> the gjenoppløst       det gjenoppløste
> 
> kan vï¿½ konsentrert
> from
> kan være konsentrert


It seems pretty clear that you are using the wrong encoding somewhere.
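
To make that concrete, here is a minimal sketch of one way output like yours
can arise. It is written in current Python 3 syntax, and it assumes (I can't
know this from here) that your source bytes are Latin-1 and are being decoded
as UTF-8. The variable names are made up for the example.

```python
# Assumption: the file really holds Latin-1 bytes, but is decoded as UTF-8.
original = "kan være konsentrert"
raw = original.encode("latin-1")          # what would be stored on disk

# 0xE6 (Latin-1 for 'æ') is not valid UTF-8 here, so errors="replace"
# substitutes U+FFFD, the Unicode replacement character.
wrong = raw.decode("utf-8", errors="replace")
print(ascii(wrong))

# Writing U+FFFD out as UTF-8 and viewing those bytes as Latin-1 gives
# the three characters 'ï', '¿', '½'.
seen_as_latin1 = "\ufffd".encode("utf-8").decode("latin-1")
print(ascii(seen_as_latin1))

# Decoding with the encoding the bytes were written in recovers the text.
right = raw.decode("latin-1")
assert right == original
```

Exactly how much text gets swallowed around each bad byte can differ between
Python versions, so don't expect a character-for-character match with your
output.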
> 
> I did check the site http://www.columbia.edu/kermit/utf8.html and the
> letters that are the problem here are part of UTF-8.

That doesn't tell you anything. Pretty much every letter used in every natural 
language of the world is part of Unicode; that's the point of it. UTF-8 is just 
one way to encode Unicode as bytes, so it can represent every Unicode character.
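
A small sketch of the distinction, in current Python 3 syntax (the variable
names are only for illustration): a character is a Unicode code point, and an
encoding decides which bytes stand for it.

```python
ch = "ø"                             # one Unicode character
code_point = ord(ch)                 # 0xF8, i.e. U+00F8

as_utf8 = ch.encode("utf-8")         # two bytes: b'\xc3\xb8'
as_latin1 = ch.encode("latin-1")     # one byte:  b'\xf8'
print(hex(code_point), as_utf8, as_latin1)

# UTF-8 can encode any Unicode character; Latin-1 only covers U+0000..U+00FF.
euro = "\u20ac"                      # the euro sign
euro_utf8 = euro.encode("utf-8")     # b'\xe2\x82\xac'
try:
    euro.encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False
print(euro_utf8, latin1_has_euro)
```

So "the letter is in UTF-8" is true of essentially every letter. What matters
is which of these byte sequences is actually sitting in your file.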

The important question is, what is the actual encoding of your source data?
> 
> Is there anything else I could try?

Understand why the above question is important, then answer it. Until you do 
you are just thrashing around in the dark.
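
If nobody can tell you the encoding, one rough way to gather evidence is to
decode a sample strictly (no errors='replace') with a few candidates and see
which ones fail. The sketch below is in current Python 3 syntax;
first_strict_decoding is a hypothetical helper, and the candidate list is only
an assumption about what your data might be.

```python
def first_strict_decoding(raw, candidates=("ascii", "utf-8", "latin-1")):
    """Return (encoding_name, text) for the first candidate that decodes
    raw without errors, or (None, None) if none of them does."""
    for name in candidates:
        try:
            return name, raw.decode(name)   # strict: raises on bad bytes
        except UnicodeDecodeError:
            continue
    return None, None


# Two samples of the same word, stored in two different encodings.
latin1_bytes = "være".encode("latin-1")     # b'v\xe6re'
utf8_bytes = "være".encode("utf-8")         # b'v\xc3\xa6re'

guess_a = first_strict_decoding(latin1_bytes)
guess_b = first_strict_decoding(utf8_bytes)
print(guess_a[0], guess_b[0])
```

Keep the order in mind: Latin-1 assigns a character to every possible byte, so
it never fails and proves nothing on its own. That is why it goes last. A clean
strict UTF-8 decode of text with non-ASCII letters is much stronger evidence.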

Do you know what a character encoding is? Do you understand the difference 
between utf-8 and latin-1?

Kent
-- 
http://www.kentsjohnson.com

_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
