I read older threads about parsing Japanese with comparse and took some ideas from there, but am still stuck:

(import comparse utf8 utf8-srfi-14)

(define s "Gänsesäger 2,1")
(define s1 "Rotkehlchen 1,0")

(define (utf8-in cs)
  (satisfies (lambda (c) (char-set-contains? cs c))))

(define letter (utf8-in char-set:letter))
(define letters (as-string (repeated letter 1 20)))

This is what I have. The leading word of s1 is parsed completely and correctly by the 'letters' parser:

#;1> (parse letters (string->list s1))
"Rotkehlchen"
#<parser-input lazy-seq #\space #\1 #\, #\0>
; 2 values

For 's', though, I get this:

#;2> (parse letters (string->list s))
"G"
#<parser-input lazy-seq #\ #\n #\s #\e #\s #\ #\g #\e #\r #\space ...>
; 2 values

meaning that the ä isn't recognized as a letter in 'char-set:letter'.

(The UTF-8 handling of character width does work, on the other hand: in the remaining input, the ä is represented by only one #\. If I didn't import 'utf8' to get the UTF-8-aware string procedures, there would be two.)

Any hint for me?

/Christoph

-- 
Christoph Lange
Lotsarnas Väg 8
430 83 Vrångö
