Would it be possible (and would it be a good idea) for character regexps to have a mode option, "strict" or "not-strict", where strict mode throws an error if the input contains bytes that are not valid UTF-8?
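To make the request concrete, here is a rough sketch of how a strict check might be approximated today with a user-level wrapper. The name strict-match-positions* is hypothetical and not part of Racket; the sketch assumes that bytes-utf-8-length returning #f is an adequate test for "not valid UTF-8".

```racket
#lang racket/base

;; Hypothetical helper (not a Racket built-in): behaves like
;; regexp-match-positions*, except that applying a character regexp
;; to a byte string that is not valid UTF-8 raises an error.
(define (strict-match-positions* rx input)
  (when (and (regexp? rx)                        ; character regexp, not #rx#"..."
             (bytes? input)                      ; ...applied to a byte string
             (not (bytes-utf-8-length input)))   ; #f means invalid UTF-8
    (raise-argument-error 'strict-match-positions*
                          "bytes that are valid UTF-8 (for a character regexp)"
                          input))
  (regexp-match-positions* rx input))

;; A byte regexp on the same input passes through unchanged.
;; \377 is the octal escape for byte 255.
(strict-match-positions* #rx#"[^\377]+" #"abc\377abc")
```

With this wrapper, (strict-match-positions* #rx".+" #"abc\377abc") would raise a contract error instead of quietly returning match positions.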
One possible use is this. It's really easy to accidentally apply a character regexp to a byte string (when you meant to apply a byte regexp to the byte string), run test cases, and think it's working OK. I.e. to write:

    (regexp-match-positions* #rx"[^ÿ]+" #"...input byte string...")

when you meant:

    (regexp-match-positions* #rx#"[^ÿ]+" #"...input byte string...")

For example, this appears to work:

    > (integer->char 255)
    #\ÿ
    > (regexp-match-positions* #rx"[^ÿ]+" #"abcÿabc")
    '((0 . 3) (4 . 7))

BUT these give exactly the same result, so the first match was not really excluding ÿ:

    > (regexp-match-positions* #rx"[^k]+" #"abcÿabc")
    '((0 . 3) (4 . 7))
    > (regexp-match-positions* #rx".+" #"abcÿabc")
    '((0 . 3) (4 . 7))
    > (regexp-match-positions* #rx"[^k]+" #"abcÿabc")
    '((0 . 3) (4 . 7))

Having a "strict" mode would show up this error.

Thanks,
Harry Spier
____________________
Racket Users list:
http://lists.racket-lang.org/users

