> From: Ludovic Courtès <l...@gnu.org>
>
> > Mike Gran writes:
> > This implies that a source code file should have syntax to
> > indicate its own encoding, if it is not ASCII.  Something akin to
> > the line in HTML files.
>
> One could imagine special treatment of, say, the first 10 lines of a
> file, with the ability to recognize Emacs file variables like
> "-*- coding: utf-8 -*-" and to change the current port transcoder
> accordingly, something like that.

Yeah.  Something like that.

> IIRC, the first step you suggested was the implementation of wide
> string/char types.  Did you also work on this?

Sort of.  I thought I could start there, but it isn't easy.  There is
a lot that could be broken by modifying string processing.  So I tried
writing some tests first so that I could check my work as I went along.

But the tests have to be non-ASCII, so they need to be converted when
they are read in.  It gets a little bit circular to use
scm_from_locale_string to convert non-ASCII strings in the test source,
and then have the test check the behavior of scm_from_locale_string.

So now I think a better route is to make some type of simplified
transcoding system available to ports, so that non-ASCII tests are read
in correctly.  From there, one can work up toward wide strings and
chars while checking the work along the way.

Thanks,

Mike Gran
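
For illustration only, here is a rough sketch of the "first 10 lines"
idea quoted above.  It is written in Python rather than against Guile's
port code, and the names detect_coding and read_source are hypothetical,
not an existing API.  It assumes the file's encoding is ASCII-compatible,
so that the cookie itself can be found before the encoding is known, and
it falls back to ASCII when no usable cookie is present.

```python
import codecs
import re

# Emacs-style file variable, e.g. "-*- coding: utf-8 -*-".
# Matched on raw bytes because the encoding is not yet known.
_COOKIE = re.compile(rb"-\*-.*?coding:\s*([-\w.]+).*?-\*-")


def detect_coding(data: bytes, max_lines: int = 10, default: str = "ascii") -> str:
    """Return the encoding named in the first max_lines lines, else default."""
    for line in data.splitlines()[:max_lines]:
        match = _COOKIE.search(line)
        if match:
            name = match.group(1).decode("ascii")
            try:
                # Normalize the name and reject unknown codecs.
                return codecs.lookup(name).name
            except LookupError:
                return default
    return default


def read_source(data: bytes) -> str:
    """Decode a whole source file using the encoding its cookie declares."""
    return data.decode(detect_coding(data))
```

A real implementation would presumably set the transcoder on the port
as soon as the cookie is seen, rather than decoding the whole file at
once as this sketch does.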