Re: Wide strings status

2009-04-22 Thread Ludovic Courtès
Hello!
Mike Gran writes:
> On Tue, 2009-04-21 at 23:37 +0200, Ludovic Courtès wrote:
>> You seem to imply that `scm_getc ()' will now return a Unicode
>> codepoint, is that right? What about `scm_c_{read,write} ()', and
>> `scm_{get,put}s ()'?
>
> I vacillate on this, but, I think the most

Re: Wide strings status

2009-04-21 Thread Mike Gran
On Tue, 2009-04-21 at 23:37 +0200, Ludovic Courtès wrote:
> > This is all going to be slower than before because of the string
> > conversion operations, but, I didn't want to do any premature
> > optimization. First, I wanted to get it working, but, there is plenty
> > of room for optimization l

Re: Wide strings status

2009-04-21 Thread Ludovic Courtès
Hello!
Mike Gran writes:
> Strings are internally encoded either as "narrow" 8-bit ISO-8859-1
> strings or as "wide" UTF-32 strings. Strings are usually created as
> narrow strings. Narrow strings get automatically widened to wide
> strings if non-8-bit characters are set! or appended to them.
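
For readers following the thread, the narrow/wide scheme quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code on the string-abstraction branch: the demo_string type and the demo_string_* helpers are hypothetical names made up for this sketch, and it uses plain uint32_t where Guile would use scm_t_uint32. It shows a string that starts out narrow (one ISO-8859-1 byte per character) and is widened to UTF-32 the first time a code point above 0xFF is stored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration; not Guile's real representation. */
typedef struct {
    size_t len;
    int is_wide;              /* 0: 8-bit ISO-8859-1, 1: UTF-32 */
    union {
        uint8_t  *narrow;
        uint32_t *wide;
    } buf;
} demo_string;

/* Create a narrow string from len bytes of ISO-8859-1 data. */
static demo_string *demo_string_new(const char *latin1, size_t len)
{
    demo_string *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->len = len;
    s->is_wide = 0;
    s->buf.narrow = malloc(len ? len : 1);
    if (s->buf.narrow == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->buf.narrow, latin1, len);
    return s;
}

/* Convert a narrow string to wide in place.  Returns 0 on success. */
static int demo_string_widen(demo_string *s)
{
    if (s->is_wide)
        return 0;
    uint32_t *w = malloc((s->len ? s->len : 1) * sizeof *w);
    if (w == NULL)
        return -1;
    for (size_t i = 0; i < s->len; i++)
        w[i] = s->buf.narrow[i];  /* a Latin-1 byte equals its code point */
    free(s->buf.narrow);
    s->buf.wide = w;
    s->is_wide = 1;
    return 0;
}

/* Store code point cp at index i, widening first if cp needs > 8 bits. */
static int demo_string_set(demo_string *s, size_t i, uint32_t cp)
{
    assert(i < s->len);
    if (!s->is_wide && cp > 0xFF && demo_string_widen(s) != 0)
        return -1;
    if (s->is_wide)
        s->buf.wide[i] = cp;
    else
        s->buf.narrow[i] = (uint8_t) cp;
    return 0;
}

/* Read the code point at index i, whatever the current width. */
static uint32_t demo_string_ref(const demo_string *s, size_t i)
{
    assert(i < s->len);
    return s->is_wide ? s->buf.wide[i] : s->buf.narrow[i];
}

static void demo_string_free(demo_string *s)
{
    if (s == NULL)
        return;
    if (s->is_wide)
        free(s->buf.wide);
    else
        free(s->buf.narrow);
    free(s);
}
```

In this sketch, storing 0xE9 (é) leaves the string narrow, while storing 0x3BB (λ) triggers the one-time copy into a UTF-32 buffer; callers only ever see code points through demo_string_ref. That copy is the kind of conversion cost the thread mentions when it says the new code will be slower than before.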

Wide strings status

2009-04-20 Thread Mike Gran
Hi, OK. I've uploaded a "string-abstraction" branch so that you can see what I've been doing over the last couple of months. Currently, I do have a version of Guile that uses Unicode codepoints for characters. The C representation of chars was changed to scm_t_uint32 throughout the code. Strin