Greetings!

Raymond Toy <[email protected]> writes:

> I, unfortunately, don't have great hope of seeing gcl with unicode any
> time soon because the plan for supporting unicode is really
> complicated. [1][2]
>
> --
> Ray
>
> [1] UTF-8 strings with 21-bit Lisp character. I don't know how that's
>     going to work reliably when you can index at random points in the
>     string and also insert random characters into a utf-8 code
>     sequence.
> [2] I suggested a really simple utf-16 with 16-bit chars to simplify
>     the implementation and still cover 99-44/100% of the use cases.
>     This is way easier to do with very minimal code changes.
>

Perhaps I should weigh in here. I do have a branch starting utf8
unicode character support, but it will have to wait until post 2.6.13.
Emacs takes this strategy, so I know it's doable. I'm guessing the
performance is a net win, since the gc overhead of larger fixed-width
strings should outweigh the cost of slower string access.

We also had a discussion on gcl-devel concluding that the current
approach, which defines a character to be a byte and relies on
terminals etc. to do the translation, is legal, although not desirable
as a permanent solution.

I can outline the algorithm if there is interest. Essentially, a simple
one-entry cache covers the vast majority of cases, which are sequential
access (utf8 can do this backwards as well). For random access, a
log(N) special character count from the beginning, the cache, or the
end (making use of parallelism in long integers) appears quite
serviceable. This is not that complicated, and it can be source
inlined, escaping out the most common case of no special bytes, which
can be indicated by a flag in the header.

(BTW, I've also put in open-stream-p for you in 2.6.13pre.)

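For illustration only, the one-entry cache and header flag described
above might be sketched as follows. This is not code from the gcl
branch. It is written in Python for brevity, and every name in it
(Utf8String, char_at, count_chars_in_word) is hypothetical. The scan
here is a plain byte-by-byte walk from whichever of the beginning, the
cache, or the end is nearest. The word-at-a-time helper is one guess at
what "parallelism in long integers" could mean, and it is not wired
into the lookup.

```python
def _is_continuation(byte):
    """UTF-8 continuation bytes have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def count_chars_in_word(word):
    """Count the bytes of a 64-bit word that start a character.

    Assumption: this is one possible reading of "parallelism in long
    integers". All eight bytes are tested with a few bitwise operations.
    """
    # Bit 0 of each byte lane is set when that byte has bit 7 set and bit 6 clear.
    cont = (word >> 7) & (~word >> 6) & 0x0101010101010101
    return 8 - bin(cont).count("1")


class Utf8String:
    """Hypothetical UTF-8 backed string with a one-entry index cache."""

    def __init__(self, text):
        self.data = text.encode("utf-8")
        self.length = len(text)  # length in characters (code points)
        # "Header flag": no special bytes, so character index == byte offset.
        self.ascii_only = all(b < 0x80 for b in self.data)
        # One-entry cache: (character index, byte offset) of the last access.
        self.cache = (0, 0)

    def _offset(self, index):
        """Return the byte offset at which character `index` starts."""
        if self.ascii_only:
            return index  # fast path, no scanning at all
        n = len(self.data)
        # Start from the beginning, the cache, or the end, whichever is nearest.
        starts = [(0, 0), self.cache, (self.length, n)]
        ci, bo = min(starts, key=lambda s: abs(s[0] - index))
        while ci < index:  # walk forward one character at a time
            bo += 1
            while bo < n and _is_continuation(self.data[bo]):
                bo += 1
            ci += 1
        while ci > index:  # walk backward; lead bytes are self-identifying
            bo -= 1
            while _is_continuation(self.data[bo]):
                bo -= 1
            ci -= 1
        self.cache = (index, bo)
        return bo

    def char_at(self, index):
        """Return the character at `index` as a one-character str."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        start = self._offset(index)
        end = start + 1
        while end < len(self.data) and _is_continuation(self.data[end]):
            end += 1
        return self.data[start:end].decode("utf-8")
```

With this sketch, a loop calling char_at(0), char_at(1), ... moves the
cache forward by one character per call, and iterating in reverse works
the same way. Only an access far from the beginning, the cache, and the
end pays for a longer scan.
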
Take care,

> ------------------------------------------------------------------------------
> _______________________________________________
> Maxima-discuss mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/maxima-discuss

-- 
Camm Maguire                                        [email protected]
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah

_______________________________________________
Gcl-devel mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/gcl-devel
