On Thu, Jun 18, 2009 at 3:35 AM, Matthew Flatt <mfl...@cs.utah.edu> wrote:
> At Wed, 17 Jun 2009 20:28:10 -0400, Carl Eastlund wrote:
>> Why do symbol->string and keyword->string produce mutable strings? In
>> so doing, they have to allocate a new string every time. Is there any
>> way to get at an immutable string that is not allocated more than
>> once? I would prefer that this be the default behavior; R6RS already
>> specifies that symbol->string produces an immutable string, for
>> instance.
>
> Symbols and keywords are represented internally in UTF-8, while strings
> are represented internally as UTF-32. So, there's not an obvious way to
> have `symbol->string' avoid allocation, except by either caching a
> string reference in the symbol (probably not worth the extra space,
> since most symbols are never converted) or keeping a symbol-to-string
> mapping in a hash table (which any programmer can do externally).
>
> I think it would be a good idea to switch to an immutable-string
> result, but considering potential incompatibility, it has never seemed
> worthwhile in the short run.
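
For reference, the external symbol-to-string mapping described above might look roughly like the following. This is only a sketch under assumptions: `name-cache` and `symbol->immutable-string` are hypothetical names of my own, not part of PLT Scheme, and the module is written against `scheme/base`.

```scheme
#lang scheme/base

;; Hypothetical helper, not a PLT Scheme primitive: caches one immutable
;; string per symbol in an eq?-based table whose keys are held weakly.
(define name-cache (make-weak-hasheq))

(define (symbol->immutable-string sym)
  (or (hash-ref name-cache sym #f)
      ;; First request for this symbol: convert once, freeze, remember.
      (let ([str (string->immutable-string (symbol->string sym))])
        (hash-set! name-cache sym str)
        str)))

;; Repeated calls return the same string object while the entry is live.
(eq? (symbol->immutable-string 'apple)
     (symbol->immutable-string 'apple))
(immutable? (symbol->immutable-string 'apple))
```

The table is keyed on the symbol itself, so the lookup stays an eq? comparison; the cost is one extra table entry per symbol that is ever converted.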

I see. I have contracts set up to accept only symbols and keywords whose names are ASCII strings; I was planning to use a weak, eq?-based hash of their names to shortcut the test. Apparently, though, I cannot get eq?-unique names for symbols and keywords. If I hash the symbols and keywords themselves, I believe the weak table can never reclaim the space (since interned symbols and keywords are forgeable); if I use an equal? hash, it defeats the purpose.

In the end, this is probably premature optimization; symbol and keyword names are usually short, so I can just use an unhashed check.

However, while I'm musing out loud... would it be possible to have symbol->bytes and keyword->bytes that produce the UTF-8 representation (presumably with guarantees of uniqueness, immutability, and proper UTF-8 encoding)?

--Carl
_________________________________________________
  For list-related administrative tasks:
  http://list.cs.brown.edu/mailman/listinfo/plt-dev