Hi;

On Fri, 5 Oct 2012 16:58:54 +0100 (BST),
Philip Hazel <[email protected]> wrote:
> > Are any of the other ones dangerous? Afaict not. So limiting
> > this new compile or runtime option's effect to (*UTF8) would be
> > enough.
> 
> Of course, any application that is worried about this can itself
> check for the text (*UTF8) at the start of any user pattern that it
> passes on to PCRE. It could even use PCRE to do the check! To do it
> properly, quite a complicated pattern is needed because other
> settings such as (*CR) can precede (*UTF8) at the start of a pattern.
> Something like this should be quite efficient:
> 
>   ^(?:\(\*\w+\))*?\(\*UTF\d+\)
>   
> It only needs to be used if the pattern begins with '(*', so for many 
> patterns the extra check will be insignificant.
> 
> There are only two bits left in the PCRE options definitions (out of 
> 32), and I am rather reluctant to use one of them just for this check.

I agree that those 2 bits are too valuable to waste for this :-)

I think it'd be enough to just document that the application needs to
call pcre_fullinfo() with PCRE_INFO_OPTIONS after compiling the pattern,
and check whether the UTF flag is set in the returned options.
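
For applications that would rather reject such patterns before compiling
them, the pre-check quoted above could look roughly like the sketch
below. It is only an illustration: it uses Python's re module rather
than PCRE itself (the pattern happens to be valid in both), and the
function name requests_utf is made up for this example.

```python
import re

# The pattern from the quoted message, unchanged: any number of leading
# (*WORD) settings such as (*CR), followed by (*UTF<digits>).
UTF_PREFIX = re.compile(r'^(?:\(\*\w+\))*?\(\*UTF\d+\)')


def requests_utf(pattern: str) -> bool:
    """Return True if a user pattern starts with a (*UTF...) setting.

    Hypothetical helper, not part of PCRE or of any binding.
    """
    # Fast path: the regex is only needed when the pattern begins
    # with '(*', so most patterns never reach the regex match.
    if not pattern.startswith('(*'):
        return False
    return UTF_PREFIX.match(pattern) is not None


if __name__ == '__main__':
    for candidate in ['(*UTF8)abc', '(*CR)(*UTF8)abc', '(*CR)abc', 'abc']:
        print(candidate, requests_utf(candidate))
```

This only inspects the pattern text; checking the options of the
compiled pattern with pcre_fullinfo() has the advantage that the
application does not need to track the start-of-pattern syntax itself.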

Regards,
        Christian

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 
