Hi Guys,
A quick question.

I'm trying to interpret Unicode code-point ranges from the CSS 3 spec.

The rule in question is

nonascii ::= #x80-#xD7FF #xE000-#xFFFD #x10000-#x10FFFF

Where (I think) these are unicode code-point ranges.
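
To spell out how I'm reading the rule, here is a minimal sketch of it as a
plain numeric test on code points. It assumes the three ranges are inclusive,
and is-nonascii is just a name I made up for illustration. It is not the
regex/grammar form I'm after, only a statement of the intended semantics.

```perl6
# Hypothetical helper (name is mine, not from the CSS spec): true when $cp
# lies in one of the three nonascii ranges, read as inclusive code-point ranges.
# The gap 0xD800..0xDFFF is the UTF-16 surrogate block.
sub is-nonascii(Int $cp) {
    0x80 <= $cp <= 0xD7FF
        || 0xE000 <= $cp <= 0xFFFD
        || 0x10000 <= $cp <= 0x10FFFF;
}

say is-nonascii(0xE9);     # True:  U+00E9 is in the first range
say is-nonascii('a'.ord);  # False: plain ASCII
say is-nonascii(0xD800);   # False: surrogate, excluded by the rule
```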

The latest rakudo build is fine with:

% perl6 -e '/<[\c[0x80]..\c[0xD7FF]]>/'

...but doesn't like the second (or third) range:

% perl6 -e '/<[\c[0xE000]..\c[0xFFFD]]>/'
Invalid character for UTF-8 encoding

...the individual code points are ok:

% perl6 -e '/<[\c[0xE000]]>/'
% perl6 -e '/<[\c[0xFFFD]]>/'

I think I'm getting the above error because not all Unicode code points
are defined in the range xE000 to xFFFD - see
http://www.utf8-chartable.de/unicode-utf8-table.pl.

I'm just having a problem implementing a concise regex/grammar rule for the
above. Looking for advice.

David Warring
