Chun Sungjin wrote:
Hi,
I've tried GNU Smalltalk and it seems good to me. But I have a
problem: the current implementation does not support Unicode. It seems
to support only single-byte characters. I've also tried
Squeak, which seems slower than GNU Smalltalk (I'm not sure about
this, so it might not be correct) but has a Unicode-compatible string
implementation, and I think this kind of approach is good. Is there any
chance of having a Unicode-compatible string implementation in the next
version of GNU Smalltalk?

What do you need exactly? The main missing thing is support for
Character objects with values above 256. However, if you are content
with multibyte character sets like UTF-8, or with Unicode character
codes, that's fine.
For character set translation, if you load the I18N package, GNU
Smalltalk gets an iconv wrapper. The main method you need is
EncodedStream>>#on:from:to: (e.g. on: 'abc' from: 'UTF-8' to: 'UCS-4').
To extract Unicode character codes from a UCS-4LE-encoded string, you
can use (ByteStream on: x asByteArray) and send it nextLong. For
big-endian, there is no class, but I was thinking of adding a #bigEndian
method to ByteStream for the next version.
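
The UCS-4 step can be sketched in a language-neutral way. The example below is Python rather than Smalltalk, so it does not depend on any GNU Smalltalk API; it only illustrates what reading one 32-bit integer per character means, for both byte orders. The function name code_points_from_ucs4 is hypothetical, and Python's 'utf-32-le' codec stands in for the iconv conversion to UCS-4LE.

```python
import struct

def code_points_from_ucs4(data, big_endian=False):
    """Return the Unicode code points stored in UCS-4 (UTF-32) bytes.

    Each character occupies exactly 4 bytes, so this is the same as
    reading 32-bit integers from a byte stream until it is exhausted.
    """
    if len(data) % 4 != 0:
        raise ValueError("UCS-4 data must be a multiple of 4 bytes long")
    count = len(data) // 4
    # '<' selects little-endian (UCS-4LE), '>' big-endian (UCS-4BE);
    # 'I' is an unsigned 32-bit integer.
    fmt = (">" if big_endian else "<") + str(count) + "I"
    return list(struct.unpack(fmt, data))

# Stand-in for the character set conversion: UTF-8 bytes -> UCS-4LE bytes.
utf8_bytes = "a\u00e9\u20ac".encode("utf-8")
ucs4le = utf8_bytes.decode("utf-8").encode("utf-32-le")
print(code_points_from_ucs4(ucs4le))  # [97, 233, 8364]
```

The big_endian flag plays the role that a #bigEndian switch on the stream would play: the bytes are the same width, only their order within each 32-bit value changes.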
Things that could be useful are:
Integer>>#asUTF8String
String class>>#utf8FromCodepoint: (same as above)
String>>#utf8Stream
UTF8Stream (returns Unicode character codes)
... (tell me what you need) ...
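
What the helpers listed above would compute can be sketched as follows. This is again Python, not Smalltalk, and it is only an illustration under stated assumptions: utf8_from_codepoint and utf8_codepoints are hypothetical names mirroring the proposed Integer>>#asUTF8String and UTF8Stream, not existing GNU Smalltalk methods. The decoder assumes reasonably well-formed input and does not reject overlong encodings.

```python
def utf8_from_codepoint(cp):
    """Encode one Unicode code point as UTF-8 bytes (1 to 4 bytes)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %r" % (cp,))
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

def utf8_codepoints(data):
    """Yield Unicode code points from UTF-8 bytes, one per character."""
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte tells how many continuation bytes follow.
        if lead < 0x80:
            cp, extra = lead, 0
        elif (lead & 0xE0) == 0xC0:
            cp, extra = lead & 0x1F, 1
        elif (lead & 0xF0) == 0xE0:
            cp, extra = lead & 0x0F, 2
        elif (lead & 0xF8) == 0xF0:
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + extra >= len(data) and extra > 0:
            raise ValueError("truncated sequence at offset %d" % i)
        for j in range(1, extra + 1):
            cont = data[i + j]
            if (cont & 0xC0) != 0x80:
                raise ValueError("invalid continuation byte at offset %d" % (i + j))
            # Each continuation byte contributes 6 more bits.
            cp = (cp << 6) | (cont & 0x3F)
        yield cp
        i += extra + 1

print(utf8_from_codepoint(0x20AC))              # b'\xe2\x82\xac'
print(list(utf8_codepoints(b"a\xe2\x82\xac")))  # [97, 8364]
```

With helpers of this shape, strings can stay as single-byte UTF-8 data while code that cares about characters works on integer code points.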
Paolo