Re: Unicode, ports and encoding

2009-02-18 Thread Ludovic Courtès
Hi Mike,

Mike Gran writes:

> I thought I could start there, but, it isn't easy. There is a lot that could
> be broken by modifying string processing. So I tried writing some tests
> first so I can check my work as I go along. But the tests have to be
> non-ASCII, so they need to be converted

Re: Unicode, ports and encoding

2009-02-17 Thread Mike Gran
> From: Ludovic Courtès
>> Mike Gran writes:
> > This implies that a source code file should have syntax to
> > indicate its own encoding, if it is not ASCII. Something akin to
> > the line in HTML files.
>
> One could imagine special treatment of, say, the first 10 lines of a
> f

Re: Unicode, ports and encoding

2009-02-17 Thread Ludovic Courtès
Hello!

Mike Gran writes:

> 1. To move to a Unicode-enabled guile, text information needs to be
>    converted to an internal representation when read and converted
>    back to the locale when written. Most reading and writing for
>    ports passes through scm_getc (input) and scm_lfwrite (

Unicode, ports and encoding

2009-02-16 Thread Mike Gran
More observations about wide strings and Guile.

First, here are the abridged call trees for low-level reading and writing.

read <-+- scm_getc <-+- [the parser] <--- scm_read <--- scm_primitive_load | | | +- scm_read_char | | +- scm_c_read