Just curious, but why the different representations? Is it because you don't need to index into a symbol, so UTF-8's (usually) more compact representation is a win, whereas for strings, where you do need to index, UTF-32 is the right choice because locating a character is a simple computation (and avoids searching)?
Robby

On Thu, Jun 18, 2009 at 2:35 AM, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> At Wed, 17 Jun 2009 20:28:10 -0400, Carl Eastlund wrote:
>> Why do symbol->string and keyword->string produce mutable strings? In
>> so doing, they have to allocate a new string every time. Is there any
>> way to get at an immutable string that is not allocated more than
>> once? I would prefer that this be the default behavior; R6RS already
>> specifies that symbol->string produces an immutable string, for
>> instance.
>
> Symbols and keywords are represented internally in UTF-8, while strings
> are represented internally as UTF-32. So, there's not an obvious way to
> have `symbol->string' avoid allocation, except by either caching a
> string reference in the symbol (probably not worth the extra space,
> since most symbols are never converted) or keeping a symbol-to-string
> mapping in a hash table (which any programmer can do externally).
>
> I think it would be a good idea to switch to an immutable-string
> result, but considering potential incompatibility, it has never seemed
> worthwhile in the short run.
>
> _________________________________________________
>   For list-related administrative tasks:
>   http://list.cs.brown.edu/mailman/listinfo/plt-dev
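
For what it's worth, the external symbol-to-string hash table mentioned in the quoted message might look roughly like the sketch below. This is only an illustration under assumptions: the names `symbol-strings` and `cached-symbol->string` are made up here and are not part of PLT Scheme, and the choice of a weak `eq?`-based table is one possible design, not something the thread prescribes.

```racket
#lang racket/base

;; Hypothetical cache, not a PLT Scheme API: maps each symbol to one
;; shared immutable string. Keys are held weakly, so an entry does not
;; by itself keep an otherwise unreferenced symbol alive.
(define symbol-strings (make-weak-hasheq))

;; Hypothetical helper: like symbol->string, but the conversion (and the
;; allocation it implies) happens only on the first request for a symbol.
(define (cached-symbol->string sym)
  (hash-ref! symbol-strings
             sym
             (lambda ()
               ;; symbol->string allocates a fresh mutable string;
               ;; store an immutable version so sharing it is safe.
               (string->immutable-string (symbol->string sym)))))
```

Repeated calls with the same symbol return the same string object, so `(eq? (cached-symbol->string 'plt) (cached-symbol->string 'plt))` holds, at the cost of one table lookup per call and one table entry per converted symbol.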