> From: Ludovic Courtès <l...@gnu.org>
>
> Hello!
>
> Mike Gran writes:
>
> > Anyway, I backtracked a bit. Today I pushed a new tree
> > (string_abstraction2) which should be the same as master except with all
> > unnecessary calls to scm_i_string_chars, scm_i_symbol_chars, and
> > scm_i_string_writable_chars removed. I largely avoided other
> > unnecessary modifications. It is still an 8-bit string build, but, now
> > the string internals have been confined to strings.c and strports.c,
> > with very few exceptions.
>
> How does it differ from the approach you took a while back (e.g.,
> http://thread.gmane.org/gmane.lisp.guile.devel/8436)?
>

It is exactly the same idea as before, but more militantly applied.

In my private build of my first attempt, I'm basically done. All the
pieces are there. But it is buggy as hell. In the first pass, I tried
to do too many things at once: ports, locales, R6RS escapes. I started
getting some errors that were tough to debug (double-frees, SIGSEGV).
Since valgrind and mcheck are useless here, I got frustrated.

So, what I'm doing now, basically, is just adding the patches I made
before in a more logical order and testing along the way. So, same
patches, different order, better testing. Just try to git 'er done.

The locale conversion will still happen at scm_lfwrite and scm_getc.
(Did I ever mention this backtrace tree pic?
http://www.lonelycactus.com/uploaded_images/test[1]-765536.PNG
It shows that for all the scripts in the test suite, all of the calls
to low-level read and write pass through those two functions.)

I have changed my opinion on one issue. I don't believe that Guile
ports should have a specific encoding: they should just use the locale.
This is just pragmatism. Guile ports and the default reader are
annoying things to hack. I am loath to touch them more than is
necessary. The R6RS ports have the nice transcoder idea. It might be
more fun to push port-specific encodings to that library.

The one weird side-effect of doing the locale conversion at scm_getc
and scm_lfwrite is that the ports' buffers, including those of string
ports, contain locale-encoded data.

Since all the pieces have been coded once already, it should come
together quickly.

-Mike
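
For illustration only, here is a minimal sketch of what "locale conversion at the write chokepoint" could look like, and why the port buffer ends up holding locale-encoded bytes. It is not Guile's code: `sketch_port` and `sketch_lfwrite` are hypothetical names, the fixed-size buffer stands in for a real port buffer, and the assumption that the internal representation is `wchar_t` is mine. Only standard C calls (`wcrtomb`, `memcpy`, `memset`) are used.

```c
#include <assert.h>
#include <limits.h>   /* MB_LEN_MAX */
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for a port's write buffer; not Guile's real struct. */
typedef struct {
  char   buf[256];
  size_t len;
} sketch_port;

/* Hypothetical analogue of the write chokepoint: takes wide characters
   and appends their locale-encoded bytes to the port buffer.
   Returns the number of wide characters consumed. */
static size_t
sketch_lfwrite (sketch_port *port, const wchar_t *str, size_t n)
{
  mbstate_t state;
  size_t i;

  assert (port != NULL && str != NULL);
  memset (&state, 0, sizeof state);   /* start in the initial shift state */

  for (i = 0; i < n; i++)
    {
      char tmp[MB_LEN_MAX];
      size_t k = wcrtomb (tmp, str[i], &state);

      if (k == (size_t) -1)
        {
          /* Not representable in the current locale: substitute '?'
             and reset the conversion state. */
          tmp[0] = '?';
          k = 1;
          memset (&state, 0, sizeof state);
        }
      if (port->len + k > sizeof port->buf)
        break;   /* buffer full; a real port would flush here */

      /* The buffer now holds locale-encoded bytes, not characters. */
      memcpy (port->buf + port->len, tmp, k);
      port->len += k;
    }
  return i;
}
```

Because the conversion happens before anything reaches the buffer, every consumer of the buffer (including a string port reading it back) sees bytes in the locale's encoding. The read side would be the mirror image, decoding with something like `mbrtowc` at the `scm_getc` equivalent.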