How can we attribute the performance difference between these XML
parsers to encoding? Where are the benchmarks?

Memory usage of strings probably isn't as important as you think. For
large strings, you are probably more interested in using a stream
decoder than a great big in-memory string, and if that doesn't suit
your use case, you probably want to implement your own string type,
whether that be ropes or an array in UTF-8 or whatever.

For typical in-memory string manipulation, UCS-2 has served us well,
and people usually work under the assumption that indexing or slicing
a string by index-of-codepoint is O(1) (even if the strings resulting
from the slice may not be valid). I think it is a useful assumption,
and that programmers will continue to want cheap slices based on a
vague if sometimes incorrect count of characters for the time being.
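
A rough illustration of that trade-off, again sketched in Python.
Strictly speaking, what a 16-bit representation indexes in O(1) is
the code unit, which matches the code point only inside the Basic
Multilingual Plane. The helpers utf16_units and is_valid_utf16 are
hypothetical names for this example, and the list of integers merely
stands in for a UCS-2 style array.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    raw = s.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little")
            for i in range(0, len(raw), 2)]

def is_valid_utf16(units):
    """Report whether a sequence of code units is well-formed UTF-16."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    try:
        raw.decode("utf-16-le")  # strict: rejects unpaired surrogates
        return True
    except UnicodeDecodeError:
        return False

# 'a', U+1D11E (MUSICAL SYMBOL G CLEF, outside the BMP), 'b'
text = "a\U0001D11Eb"
units = utf16_units(text)

code_points = len(text)    # 3 code points
code_units = len(units)    # 4 code units: the clef is a surrogate pair

# Slicing by unit index is cheap, but this slice ends between the two
# halves of the surrogate pair, so the result is not a valid string.
cheap_slice = units[0:2]
```

Here is_valid_utf16(units) holds but is_valid_utf16(cheap_slice) does
not, which is the "may not be valid" case from the paragraph above.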

As for immutability, I don't see what that has to do with indexing or
slicing or encoding. Immutable strings are non-optional in any sane
modern language.

-- 
William Leslie
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
