On Thu, Aug 03, 2000 at 02:49:11AM -0400, Owen Taylor wrote:
> 
> The output of -Dr makes it pretty clear what is going on:
> 
>   Compiling REx `^\C\C(c)'
>   size 10 first at 2
>   rarest char c at 0
>      1: BOL(2)
>      2: SANY(3)
>      3: SANY(4)
>      4: OPEN1(6)
>      6:   EXACT <c>(8)
>      8: CLOSE1(10)
>     10: END(0)
>   anchored `c' at 2 (checking anchored) anchored(BOL) minlen 3 
>              
>   [...]
> 
>   Guessing start of match, REx `^\C\C(c)' against `École'...
>   String not equal...
>   Match rejected by optimizer
> 
> For regexes compiled with 'use utf8' the anchor position
> is in chars, not bytes, and the re optimizer (study_chunk)
> thinks that \C counts as one char.
> 
> Fixing this looks decidedly unfun.

I have now submitted a perlbug report on this so that the bug
(which unfortunately still seems to be present) won't be forgotten.
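
For the record, here is a minimal sketch of the offset mismatch as I
read it from the -Dr output above. It only illustrates the
character-versus-byte arithmetic for `École'; it does not exercise
the regex engine or \C itself, and the variable names are mine.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "École" written with an escape so the script does not depend on source encoding.
my $chars = "\x{C9}cole";               # 5 characters
my $bytes = encode('UTF-8', $chars);    # 6 bytes, since É takes two in UTF-8

my $char_off = index($chars, 'c');      # offset of 'c' counted in characters
my $byte_off = index($bytes, 'c');      # offset of 'c' counted in bytes

# What a check at "anchored `c' at 2" sees if the offset is taken in characters.
my $at_two = substr($chars, 2, 1);

printf "chars: %d, bytes: %d\n", length($chars), length($bytes);
printf "'c' at char offset %d, byte offset %d\n", $char_off, $byte_off;
printf "char offset 2 holds 'c'? %s\n", $at_two eq 'c' ? 'yes' : 'no';
```

The two \C's consume the two bytes of É, so `c' sits at byte offset 2
but at character offset 1. An anchored check at offset 2 counted in
characters lands on `o', which would explain the "String not equal"
rejection.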

> Regards,
>                                         Owen

-- 
$jhi++; # http://www.iki.fi/jhi/
        # There is this special biologist word we use for 'stable'.
        # It is 'dead'. -- Jack Cohen
