------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=891

--- Comment #2 from Alan Lehotsky <[email protected]> 2009-09-23 18:34:34 ---
I had never heard of the syntax either (and agree that it's not really needed
for completeness). But one of my users ran across this. If I get some free
time, I'll try to implement it and contribute the code.

I did find a citation (below) to a prior implementation.

Regards, Al Lehotsky

From http://arglist.com/regex/regex7.html, purporting to be the man pages for
Spencer's BSD 4.4 regex:

There are two special cases of bracket expressions: the bracket expressions
`[[:<:]]' and `[[:>:]]' match the null string at the beginning and end of a
word respectively. A word is defined as a sequence of word characters which is
neither preceded nor followed by word characters. A word character is an alnum
character (as defined by ctype(3)) or an underscore. This is an extension,
compatible with but not specified by POSIX 1003.2, and should be used with
caution in software intended to be portable to other systems.

Philip Hazel <[email protected]>
Sent by: [email protected]
09/23/2009 08:45 AM
Please respond to [email protected]
To [email protected]
cc
Subject [Bug 891] Support [[:<:]] and [[:>:]] patterns

------- You are receiving this mail because: -------
You reported the bug.

http://bugs.exim.org/show_bug.cgi?id=891

--- Comment #1 from Philip Hazel <[email protected]> 2009-09-23 13:45:47 ---
On Tue, 22 Sep 2009, Alan Lehotsky wrote:

> Apparently one or more implementations (including possibly Henry Spencer's UCB
> regex code) support this as synonyms for the beginning of a word and the end
> of a word respectively.
>
> It would be handy for compatibility to recognize these two also in PCRE.

Are you sure about that? The patterns [[:<:]] and [[:>:]] look like a
modification of the POSIX character class syntax - and a character class
always matches a character. What would be the meaning of [abc[:<:]def] for
example?

I did a Google search to try to find any documentation about this, and I
couldn't. What I did find was that several engines use \< and \> for beginning
and end of word. This is incompatible with Perl, and so could not be added to
PCRE. (In Perl, and PCRE, backslash followed by a non-alphanumeric character
always matches a literal character. That is a nice, clean rule, and I would
not want to violate it, even with a special option.)

If you can point me at some documentation that specifies what [[:<:]] and
[[:>:]] actually mean in some other regex engine, I will think about it. But
they are heckish long sequences, though in Perl and PCRE to do the same thing
takes one or two more characters:

  \b(?=\w)    start of word
  \b(?<=\w)   end of word

Regards,
Philip

--
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email

--
## List details at http://lists.exim.org/mailman/listinfo/pcre-dev
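
For anyone who wants to try the two Perl/PCRE equivalents quoted above,
\b(?=\w) and \b(?<=\w), here is a minimal sketch. It uses Python's re module
rather than PCRE itself, on the assumption that the two engines treat \b, \w
and lookarounds the same way for this purpose. The names START_OF_WORD,
END_OF_WORD and word_edges are made up for illustration, and re.ASCII is my
own choice to approximate the man page's "alnum or underscore" definition of a
word character.

```python
import re

# Equivalents from the thread; re.ASCII limits \w to [A-Za-z0-9_] (an assumption
# made here to stay close to the BSD man page's definition of a word character).
START_OF_WORD = re.compile(r"\b(?=\w)", re.ASCII)   # boundary followed by a word char
END_OF_WORD = re.compile(r"\b(?<=\w)", re.ASCII)    # boundary preceded by a word char


def word_edges(text):
    """Return (start_offsets, end_offsets) of the zero-width matches in text."""
    starts = [m.start() for m in START_OF_WORD.finditer(text)]
    ends = [m.start() for m in END_OF_WORD.finditer(text)]
    return starts, ends


if __name__ == "__main__":
    # The underscore counts as a word character, so "foo_bar" is one word.
    print(word_edges("foo_bar, baz!"))  # ([0, 9], [7, 12])
```

Both patterns match the empty string, so the offsets are positions between
characters rather than characters consumed, which is how the man page
describes [[:<:]] and [[:>:]].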
