On Tuesday, April 28, 2015 at 9:23:40 PM UTC-7, [email protected] wrote:
>
> (Much more important is handling of len(), looping over a string and so 
> on. But they are another story.) 
>
That's purely an encoding issue. In Python 2, a "str" is a sequence of bytes, 
and here it seems to hold the UTF-8 encoding of the text, so ä takes up two 
bytes even though it prints as one character. len() counts those bytes, and 
iterating over a "str" iterates over them. A "unicode" object is a sequence 
of Unicode code points, so ä is a single unit:

sage: len(u"Direct translation of 'Mäntysalo' is 'Pine forest'")
50
sage: len("Direct translation of 'Mäntysalo' is 'Pine forest'")
51
sage: print u"Direct translation of 'Mäntysalo' is 'Pine forest'"[24:26]
än
sage: print "Direct translation of 'Mäntysalo' is 'Pine forest'"[24:26]
ä

If you're going to use Unicode (i.e., characters that don't fit in ASCII), 
use "unicode" objects. In Python 3 every string ("str") is Unicode, so this 
is the default behaviour there.
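
For comparison, here is a rough sketch of the same experiment in plain 
Python 3 (not a Sage session). There "str" plays the role of Python 2's 
"unicode" and "bytes" the role of Python 2's "str". The variable names s and 
b are just mine, and I am assuming UTF-8 as the encoding:

```python
# Python 3: str holds code points, bytes holds encoded data.
# ä is written as the escape \u00e4 so the file's own encoding does not matter.
s = "Direct translation of 'M\u00e4ntysalo' is 'Pine forest'"
b = s.encode("utf-8")  # assumption: UTF-8, as in the session above

print(len(s))  # 50 code points
print(len(b))  # 51 bytes, because ä is two bytes in UTF-8

print(s[24:26])                  # slices code points: än
print(b[24:26].decode("utf-8"))  # slices bytes: both bytes of ä, so ä

# Iterating over str yields one-character strings,
# iterating over bytes yields integers.
print(list("\u00e4"))
print([hex(x) for x in "\u00e4".encode("utf-8")])  # ['0xc3', '0xa4']
```

The byte slice only decodes cleanly because [24:26] happens to cover both 
bytes of ä. A slice that cuts through the middle of a multi-byte character 
would raise UnicodeDecodeError on decode.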
