Hi -- I think you and Camm know more about this than I do, but to answer your question, below is what I get in GCL 2.6.12. Except, I don't know how mailers handle high characters of the sort GCL printed in the output from (string (code-char 232)) below, so although that string was printed using a single character, here I show it as four characters (that visually appear just like the one-character version).
>(code-char 232)

#\\350

>(string (code-char 232))

"\350"

>

Interestingly, your (count nil (loop ...)) form also evaluates to 63 in CCL, CLISP, and SBCL, but it evaluates to 32 in Allegro CL and 66 in LispWorks. It seems to me that the HyperSpec documentation allows for these differences.

I've pasted in an sbcl log below in case it's illuminating somehow.

sloth:~% sbcl
This is SBCL 1.2.2, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
* sb-impl::*default-external-format*

:UTF-8
* (code-char 232)

#\LATIN_SMALL_LETTER_E_WITH_GRAVE
* (string (code-char 232))

"è"
* (length *)

1
* (setq sb-impl::*default-external-format* :iso-8859-1)

:ISO-8859-1
* (code-char 232)

#\LATIN_SMALL_LETTER_E_WITH_GRAVE
* (string (code-char 232))

"è"
* (length *)

1
*

-- Matt

From: Raymond Toy <[email protected]>
Date: Sat, 01 Nov 2014 09:45:47 -0700

>>>>> "Matt" == Matt Kaufmann <[email protected]> writes:

    Matt> I saw your question and was curious, so I looked into it a bit:

    >>> To your knowledge, is there any objection to defining alpha-char-p as
    >>> including code-char's >= 128?

    Matt> I see that SBCL 1.2.2 is OK with that, for example:

    Matt> * (code-char 232)
    Matt> #\LATIN_SMALL_LETTER_E_WITH_GRAVE
    Matt> * (alpha-char-p (code-char 232))
    Matt> T
    Matt> *

    Matt> In fact, that alpha-char-p call also returns T in (versions of)
    Matt> Allegro CL, CCL, CLISP, CMU CL, LispWorks, and SBCL.

Try (code-char #xa0). This is the unicode character no-break-space. This has no case and would presumably not be alpha-char-p.
I think there are quite a few characters that would not be (from cmucl):

(count nil (loop for k from 128 upto 255
                 collect (alpha-char-p (code-char k))))
63

I think there is some confusion here, at least for me. If gcl uses 8-bit code-units and utf-8 strings, what exactly is (code-char 232)? You can store that into a utf-8 string but it won't be #\latin_small_letter_e_with_grave because that would be encoded as two octets in a utf-8 string: 195 168.

I think it's perfectly legal for gcl to say everything above 128 is alpha-char-p. I think, however, that people will just get confused that no such characters can be stored into a string and processed correctly as utf-8 without a bit of work. But perhaps this is simply how 8-bit chars and utf-8 strings have to work.

I think 16-bit chars with utf-16 or 32-bit chars with utf-32 are far easier to explain. K.I.S.S?

-- Ray

_______________________________________________
Gcl-devel mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/gcl-devel
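The octet arithmetic in Ray's message can be checked outside of any Lisp. Below is a minimal sketch in Python (not GCL, and not part of the original thread), assuming only the standard library. The helper names utf8_octets and count_non_alpha are made up for illustration, and Python's str.isalpha follows Unicode general categories, which need not match how any particular Lisp defines alpha-char-p.

```python
import unicodedata


def utf8_octets(code):
    """Return the UTF-8 octets of the character with the given code point."""
    return list(chr(code).encode("utf-8"))


def count_non_alpha(lo=128, hi=255):
    """Count code points in [lo, hi] that Python does not treat as alphabetic.

    Assumption: str.isalpha (Unicode letter categories) stands in for
    alpha-char-p here; Lisp implementations are free to differ.
    """
    return sum(1 for k in range(lo, hi + 1) if not chr(k).isalpha())


e_grave = chr(232)
print(unicodedata.name(e_grave))            # LATIN SMALL LETTER E WITH GRAVE
print(utf8_octets(232))                     # [195, 168]
print(list(e_grave.encode("iso-8859-1")))   # [232]

# Reading the two UTF-8 octets back one 8-bit unit at a time yields two
# characters, which is the confusion with 8-bit chars described above.
misread = bytes(utf8_octets(232)).decode("iso-8859-1")
print(len(e_grave), len(misread))           # 1 2

# No-break space (#xa0) is not alphabetic under Unicode categories.
print(chr(0xA0).isalpha())                  # False
print(count_non_alpha())
```

The last line prints Python's own count of non-alphabetic code points in 128..255, which can be compared against the per-implementation numbers quoted in the thread.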
