http://bugs.exim.org/show_bug.cgi?id=1099

--- Comment #1 from Philip Hazel <[email protected]> 2011-03-25 11:39:33 ---

On Wed, 23 Mar 2011, Pavel Kostromitinov wrote:

> In Perl, one can use $var in regexp and usually the variable reference
> will be expanded.

The variable is interpolated *before* the string is interpreted as a
regex. Consider this example:

    $x = "\\d+";
    print (("abc" =~ /$x/)? "abc: yes\n" : "abc: no\n");
    print (("123" =~ /$x/)? "123: yes\n" : "123: no\n");

The output is:

    abc: no
    123: yes

I am not familiar with the code of Perl, but I would be surprised if it
was not implemented exactly as one might expect: first the variables are
interpolated into the string and then the string is interpreted as a
regex.

> It would be very helpful in some situations to allow PCRE to reference
> such external variables too. The way I see it, without (seemingly)
> breaking backward-compatibility, is to make some way to set values of
> 'named subpatterns' before pcre_exec(), so they can be referenced
> using existing \k{name} syntax. Surely encountering subpattern with
> same name inside of pattern would take precedence.

It would have to be pcre_compile() not pcre_exec(), for a start, because
the pattern is interpreted when it is compiled, not when it is matched.
The obvious implementation is to do what I think Perl does: first
interpolate the variables and then call pcre_compile().

However, I do not have any plans to add this kind of functionality. The
main objection I have to doing anything like this is that PCRE is not a
string-manipulating library. It does not change strings. It provides
just a regex-matching facility. Your suggestion is just one of many
"add-on" features that people might like. Another, which has been
suggested before, is a "replace" function. And no doubt there are
others. In my opinion, if these kinds of function are generally wanted,
somebody should design and implement a general-purpose string
manipulation library, which of course could make use of PCRE for pattern
matching.

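The interpolate-then-compile approach can be sketched in C as a small
wrapper that expands the variables before PCRE ever sees the string. This
is only an illustration under stated assumptions: `pattern_var`,
`lookup()` and `interpolate()` are hypothetical names and not part of
PCRE, variable values are zero-terminated, and they are inserted as
pattern text (as in Perl) rather than as literals.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; the value is zero-terminated pattern text. */
typedef struct {
    const char *name;
    const char *value;
} pattern_var;

/* Find a variable by name (given as pointer + length); NULL if unknown. */
static const char *lookup(const pattern_var *vars, size_t nvars,
                          const char *name, size_t len)
{
    for (size_t i = 0; i < nvars; i++) {
        if (strlen(vars[i].name) == len &&
            memcmp(vars[i].name, name, len) == 0)
            return vars[i].value;
    }
    return NULL;
}

/*
 * Copy `pattern` into `out`, replacing each $name with its value.
 * Backslash escapes (such as \$) are copied through untouched, and a '$'
 * that is not followed by a known name is kept, so a trailing '$' anchor
 * still works. Returns 0 on success, -1 if `out` is too small.
 */
static int interpolate(const char *pattern,
                       const pattern_var *vars, size_t nvars,
                       char *out, size_t outsz)
{
    size_t o = 0;
    const char *p = pattern;

    if (outsz == 0)
        return -1;

    while (*p != '\0') {
        const char *text = p;       /* default: copy one pattern byte */
        size_t len = 1, skip = 1;

        if (*p == '\\' && p[1] != '\0') {
            len = skip = 2;         /* keep the escape pair intact */
        } else if (*p == '$') {
            size_t n = 0;
            while (isalnum((unsigned char)p[1 + n]) || p[1 + n] == '_')
                n++;
            const char *val = (n > 0) ? lookup(vars, nvars, p + 1, n) : NULL;
            if (val != NULL) {
                text = val;         /* copy the value instead of "$name" */
                len = strlen(val);
                skip = 1 + n;
            }
        }

        if (o + len >= outsz)       /* leave room for the terminating zero */
            return -1;
        memcpy(out + o, text, len);
        o += len;
        p += skip;
    }
    out[o] = '\0';
    return 0;
}
```

With x set to \d+, the string ^$x$ expands to ^\d+$, and it is that
expanded string that would then be passed to pcre_compile().
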
Issues that immediately spring to mind are:

How are the variable contents coded? Zero-terminated or by length? If
the latter, is a binary zero value allowed as part of the string? (If
yes, for PCRE it has to be turned into \0.)

Should variables be interpolated as in Perl, or as literals, or should
there be an option?

How to deal with UTF-8 or not UTF-8?

How to handle feedback data for compiling errors? The pcre_compile()
function gives an offset in the string it is compiling, but the caller
of a wrapper function that interpolated variables would need an offset
into the original string.

My feeling is that a substantial design effort is needed to come up with
an API that is sufficiently general as to be widely useful. I am not
myself planning on doing anything about this.

Philip
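
The error-offset issue can be made concrete with a short C sketch. It
assumes that a wrapper keeps one record per substitution it made; the
`subst` type and `map_offset()` are hypothetical names for illustration
and are not part of PCRE. Given an offset into the expanded string (the
kind of value pcre_compile() reports), it returns the corresponding
offset into the string the caller originally supplied.

```c
#include <stddef.h>

/* Hypothetical record of one substitution made while expanding a pattern. */
typedef struct {
    size_t orig_start;  /* offset of "$name" in the original string */
    size_t orig_len;    /* length of "$name" in the original string */
    size_t exp_start;   /* offset of the inserted value in the expanded string */
    size_t exp_len;     /* length of the inserted value */
} subst;

/*
 * Translate an offset in the expanded string back to the original string.
 * `subs` must be sorted by exp_start. An offset that falls inside an
 * inserted value is mapped to the start of the "$name" that produced it,
 * since no finer position exists in the original string.
 */
static size_t map_offset(const subst *subs, size_t nsubs, size_t exp_off)
{
    size_t orig_base = 0;   /* original offset just after the last */
    size_t exp_base = 0;    /* substitution that ends before exp_off */

    for (size_t i = 0; i < nsubs; i++) {
        if (exp_off < subs[i].exp_start)
            break;                          /* before this substitution */
        if (exp_off < subs[i].exp_start + subs[i].exp_len)
            return subs[i].orig_start;      /* inside the inserted value */
        orig_base = subs[i].orig_start + subs[i].orig_len;
        exp_base = subs[i].exp_start + subs[i].exp_len;
    }
    return orig_base + (exp_off - exp_base);
}
```

For example, if a$x(b is expanded to a\d+(b, the single record is
{1, 2, 1, 3}. Offset 4 in the expanded string (the opening parenthesis)
maps back to offset 3 in the original, and offset 2 (inside the inserted
\d+) maps back to offset 1, the position of $x.
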
