Re: [pcre-dev] Start optimization issue

2012-07-10 Thread Philip Hazel
On Mon, 9 Jul 2012, ND wrote:
> And you are right: not only atomic groups but also some backtracking verbs give the same result. Maybe some code correction is needed in those two places.
Yes. It seems that the .* optimization must be disabled if the .* is inside an atomic group, or if *PRUNE or *SKIP is

Re: [pcre-dev] Start optimization issue

2012-07-10 Thread Philip Hazel
On Tue, 10 Jul 2012, I wrote:
> Yes. It seems that the .* optimization must be disabled if the .* is inside an atomic group, or if *PRUNE or *SKIP is present in the pattern. I will look at the code to see how best to do this.
I have committed a patch (r994) that I believe fixes this related

Re: [pcre-dev] Start optimization issue

2012-07-09 Thread ND
On 2012-07-09 05:20, Zoltán Herczeg wrote:
> I have investigated this issue, and in the optimized case, PCRE_STARTLINE is set, so it searches for the first newline.
If DOTALL is set the result is the same. It seems that is_anchored() is in action. I think PCRE must not assume such patterns as

Re: [pcre-dev] Start optimization issue

2012-07-09 Thread Zoltán Herczeg
> If DOTALL is set the result is the same. It seems that is_anchored() is in action. I think PCRE must not assume such patterns as anchored.
No.
/* .* means start at start or after \n if it isn't in brackets that may be referenced. */
else if (op == OP_TYPESTAR || op == OP_TYPEMINSTAR

Re: [pcre-dev] Start optimization issue

2012-07-09 Thread ND
On 2012-07-09 10:51, Zoltán Herczeg wrote:
>> If DOTALL is set the result is the same. It seems that is_anchored() is in action. I think PCRE must not assume such patterns as anchored.
> No.
Yes. Consider the following pattern:
  PCRE version 8.31 2012-07-06
  /(?>.*?a)b/sI
  Capturing subpattern count = 0

[pcre-dev] Start optimization issue

2012-07-08 Thread ND
Good day! Here is a pcretest.exe listing:
  PCRE version 8.31 2012-07-06
  /(?>.*?a)(?<=ba)/
  aba
  No match
MATCH was inspected. Further investigation shows that this is a start optimization issue.
  PCRE version 8.31 2012-07-06
  /(*NO_START_OPT)(?>.*?a)(?<=ba)/
  aba
  0: ba
What kind of start optimization doing

Re: [pcre-dev] Start optimization issue

2012-07-08 Thread Zoltán Herczeg
Hi, I have investigated this issue, and in the optimized case, PCRE_STARTLINE is set, so it searches for the first newline. There is a comment for this before is_startline(...):
/* This is called to find out if every branch starts with ^ or .* so that first char processing can be done to speed
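
The effect described in this thread can be illustrated with a small hand-written model. This is only a sketch, not PCRE's code: it assumes the reported pattern is the atomic-group form `(?>.*?a)(?<=ba)`, and the function names (`match_at`, `all_starts`, `startline_starts`) are hypothetical helpers invented for the illustration. It compares trying every start offset with trying only offset 0 and offsets just after a newline, which is roughly what treating the pattern as "start of line" amounts to.

```python
def atomic_lazy_dot_then_a(subject, start):
    """Model of (?>.*?a) without DOTALL: stop at the first 'a' at or after
    start, provided no newline is crossed. Returns the end offset or None."""
    for i in range(start, len(subject)):
        if subject[i] == "a":
            return i + 1
        if subject[i] == "\n":
            return None
    return None


def match_at(subject, start):
    """Model of (?>.*?a)(?<=ba) attempted at one fixed start offset."""
    end = atomic_lazy_dot_then_a(subject, start)
    if end is None:
        return None
    # (?<=ba): the two characters before the current position must be "ba".
    if end >= 2 and subject[end - 2:end] == "ba":
        return (start, end)
    # The group is atomic, so there is no retry with a later 'a'.
    return None


def all_starts(subject):
    """Every possible start offset (no start optimization)."""
    return range(len(subject) + 1)


def startline_starts(subject):
    """Only offset 0 and offsets just after a newline (assumed model of the
    start-of-line optimization)."""
    return [0] + [i + 1 for i, ch in enumerate(subject) if ch == "\n"]


def search(subject, starts):
    """Return the first (start, end) match among the given start offsets."""
    for start in starts:
        found = match_at(subject, start)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    print(search("aba", all_starts("aba")))
    print(search("aba", startline_starts("aba")))
```

In this model the full scan finds `ba` at offsets 1 to 3, while the start-of-line scan finds nothing, which mirrors the `0: ba` and `No match` results in the original report. The reason is that the atomic group prevents the attempt at offset 0 from backtracking to a longer `.*?`, so skipping the later start offsets is not safe for such patterns.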