Hi Sven & Everyone,

I need to convert a UTF-8 encoded decomposed stream (Mac OS file
names) into a composed string, e.g.:

string: 'test-äöü-äöü'
code points: #(116 101 115 116 45 228 246 252 45 97 776 111 776 117 776)
utf8 encoding: #[116 101 115 116 45 195 164 195 182 195 188 45 97 204
136 111 204 136 117 204 136]

In the above string, the first group of 3 accented characters is the
same as the second group of 3, but they are encoded differently:
precomposed code points (228 246 252) vs. a base letter followed by
the combining diaeresis, code point 776 (97 776 111 776 117 776).

Reading the utf8 encoded stream should result in:

string: 'test-äöü-äöü'
code points: #(116 101 115 116 45 228 246 252 45 228 246 252)
utf8 encoding: #[116 101 115 116 45 195 164 195 182 195 188 45 195 164
195 182 195 188]
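
To double-check the expected values above, here is a minimal sketch in
Python (not Pharo, and not part of Zinc) that decodes the decomposed
bytes and applies Unicode canonical composition (NFC) using the
standard unicodedata module. The function name compose_utf8 is made up
for this illustration.

```python
import unicodedata

# The decomposed UTF-8 bytes from the example above.
decomposed_bytes = bytes([116, 101, 115, 116, 45, 195, 164, 195, 182,
                          195, 188, 45, 97, 204, 136, 111, 204, 136,
                          117, 204, 136])


def compose_utf8(data: bytes) -> str:
    """Decode UTF-8 and apply canonical composition (NFC)."""
    return unicodedata.normalize("NFC", data.decode("utf-8"))


composed = compose_utf8(decomposed_bytes)
composed_code_points = [ord(ch) for ch in composed]
composed_bytes = list(composed.encode("utf-8"))

print(composed_code_points)
print(composed_bytes)
```

This works on the whole string at once, so it only confirms the target
output; it does not address reading from a stream.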

My current thought is to write a ZnUnicodeComposingReadStream which
would wrap a ZnCharacterReadStream and return the composed characters.
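
To make the idea concrete, here is a rough sketch of such a wrapper,
again in Python rather than Pharo. The class ComposingReader and its
next method are hypothetical names, not an existing API. It assumes
that it is enough to buffer one starter character plus any combining
marks that follow it, and to compose that cluster once the next
starter (or end of stream) is seen. That assumption does not hold for
everything (Hangul jamo, for example, compose from characters that are
all starters), so a real implementation would need more care.

```python
import io
import unicodedata


class ComposingReader:
    """Hypothetical wrapper returning NFC-composed characters one at a time."""

    def __init__(self, stream):
        self._stream = stream      # any object with read(1) returning str
        self._lookahead = None     # one character read ahead, not yet used
        self._ready = []           # composed characters waiting to be returned

    def _read_char(self):
        if self._lookahead is not None:
            ch, self._lookahead = self._lookahead, None
            return ch
        return self._stream.read(1)

    def next(self):
        """Return the next composed character, or None at end of stream."""
        if not self._ready:
            first = self._read_char()
            if first == "":
                return None
            cluster = [first]
            while True:
                ch = self._read_char()
                if ch == "":
                    break
                if unicodedata.combining(ch) == 0:
                    # A new starter: keep it for the next call.
                    self._lookahead = ch
                    break
                cluster.append(ch)
            self._ready = list(unicodedata.normalize("NFC", "".join(cluster)))
        return self._ready.pop(0)


# Usage: the second group is written as base letter + U+0308.
source = io.StringIO("test-\u00e4\u00f6\u00fc-a\u0308o\u0308u\u0308")
reader = ComposingReader(source)
chars = []
while (ch := reader.next()) is not None:
    chars.append(ch)
stream_composed = "".join(chars)
print([ord(c) for c in stream_composed])
```

The one-character lookahead is the main design point: a composed
character cannot be returned until the reader knows that no further
combining marks follow the base character.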

What do you think?

Thanks!
Alistair
