https://bugs.exim.org/show_bug.cgi?id=2761
--- Comment #3 from Philip Hazel <philip.ha...@gmail.com> --- Oh, you are right in the sense that an algorithm for "doing it properly" in principle could be created. However, that would need a lot of thought and re-design. The problem is that when a partial match happens, the code does NOT remember the partially matched string. To restart "properly", it would need to do so, which might involve remembering an arbitrarily long substring. The example in the doc illustrates this. Consider the pattern /1234|3789/. If the first segment is "ABC123" there is a partial match for "1234". There is no partial match for the second alternative, because such a match does not start at the same point in the subject string. Attempting to continue with the string "7890" does not yield a match because only those alternatives that match at one point in the subject are remembered. To make this work as if it was all one string, you need to remember "123", then join on the next segment, and try the whole match again against "1237890". The way the DFA matcher works is that it has "in hand" all potential matches that start at the same point in the subject (breadth-first search of the tree). If they all end in failure, all is discarded and the process begins again, starting at the next character. That is why it would need to remember part of the subject to make a restarted match behave in the same way as a non-split match would. Clearly, I didn't understand all this at the time I was writing the DFA code... -- You are receiving this mail because: You are on the CC list for the bug. -- ## List details at https://lists.exim.org/mailman/listinfo/pcre-dev