I have written a description of my prototype implementation of adaptive 
ASCII/UTF-16 strings in Mono:


http://www.mono-project.com/docs/advanced/runtime/docs/ascii-strings/


Introduction:


> For historical reasons, System.String uses the UCS-2 character encoding, that 
> is, UTF-16 without surrogate pairs.


> However, most strings in typical .NET applications consist solely of ASCII 
> characters, leading to wasted space: half of the bytes in a string are likely 
> to be null bytes!


> Since strings are immutable, we can scan the character data when the string 
> is constructed, then dynamically select an encoding, thereby saving 50% of 
> string memory in most cases.


I would like to solicit feedback on this proposal from runtime developers and 
users alike. In particular:


- Specific objections regarding performance characteristics, compatibility 
issues, &c.

- Questions about unclear or underspecified parts of the proposal

- Real-world use cases that would benefit from this optimization

- Suggestions for suitable real-world benchmarks


Thank you!

_______________________________________________
Mono-devel-list mailing list
[email protected]
http://lists.dot.net/mailman/listinfo/mono-devel-list
