On 07/20/2012 04:05 AM, Дмитрий wrote:
Hello.
Does IrRegex support Unicode character classes? E.g., will IrRegex consider accented letters (á) or Cyrillic
letters (я) as "alpha"? Will IrRegex consider the Chinese wide space ( ) as "space"? Will
IrRegex consider Chinese brackets (「」【】) as "punct"? If it doesn't, the regexp is going to be
EXTREMELY messy [in fact, I believe it may be better to build such a regexp automatically then].
I’m on Windows, so I can’t check it (when I use a UTF-8 console via chcp 65001, for some
reason Chicken seems to fail on every operation with a non-ASCII string, even on a
simple (display "Привет")).
--
Yours sincerely,
Dmitry Kushnariov
As I said, I'm a neophyte. My "character classes" were based around
[a-zA-Z] etc. So you can readily see why the pattern would have
quickly become unreasonably complex. I didn't find any definition of
other character classes (well, not one that meant anything) and given
the discussion, I think that they wouldn't have worked if I'd gotten to
the point of testing them.
I was planning on using Chicken to learn Scheme, since R7RS is supposed
to be based more on R5RS than on R6RS, but maybe it's better to learn
using Racket. I *trust* the conversion won't be too difficult. (I *do*
need to use utf-8 in lots of places, and an incomplete implementation
while I was learning would be ... unpleasant. Particularly if the user
documentation presumed that it *was* complete.)
--
Charles Hixson
_______________________________________________
Chicken-users mailing list
https://lists.nongnu.org/mailman/listinfo/chicken-users