Chun Sungjin wrote:
> Hi,
>
> The main problem is that, for example, if I create an instance of String like this:
>
> a := 'Some MultiByte Encoded String'.
>
> then
>
> a size
>
> does not answer the correct length of the string.

Well, strlen does not do that in C, either. You need something like mbrlen for that, and #size is more like strlen than mbrlen.

Also, the result heavily depends on the chosen character set. If we want to have #utf8Size, that's fine. But #size should be the number of *bytes*, not of characters.
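
To make the bytes-versus-characters point concrete, here is a minimal sketch. It is written in Python rather than Smalltalk or C, purely for illustration; the sample string, the two encodings, and the helper name char_count are assumptions of mine, not anything that exists in GNU Smalltalk.

```python
# Illustration only: Python stands in for Smalltalk/C here.
text = "한글"  # two Hangul characters

# The same two characters occupy a different number of bytes
# depending on the character set used to encode them.
utf8_bytes = text.encode("utf-8")    # 3 bytes per character in UTF-8
euckr_bytes = text.encode("euc_kr")  # 2 bytes per character in EUC-KR


def char_count(data: bytes, encoding: str) -> int:
    """Hypothetical helper: decode the bytes first, then count characters."""
    return len(data.decode(encoding))


print(len(utf8_bytes))                     # byte count, strlen-like
print(len(euckr_bytes))                    # byte count in another charset
print(char_count(utf8_bytes, "utf-8"))     # character count
print(char_count(euckr_bytes, "euc_kr"))   # same characters, same count
```

The byte counts differ between the two encodings while the character count stays the same, which is why a character count only makes sense once the encoding is known.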

I'm looking now at whether I can add an EncodedStream method that extracts Unicode characters. Then what you wanted would be something like

   (EncodedStream wordsOn: 'some string') contents size

for which, of course, we can add a utility method.

Paolo


_______________________________________________
help-smalltalk mailing list
http://lists.gnu.org/mailman/listinfo/help-smalltalk
