This adds support for the SVR4/glibc word delimiters \< and \> in regcomp, as an extension to what POSIX requires.
We already have [[:<:]] and [[:>:]] as extensions, apparently from 'Henry Spencer's Alpha 3.0 regex release' back in 1993. But now Solaris/Linux/FreeBSD all have the other syntax and sadly lots of uses of grep and sed in what are supposed to be portable projects use it.

This diff is from Garrett D'Amore in Illumos via FreeBSD.

https://www.illumos.org/issues/516

Index: re_format.7
===================================================================
RCS file: /cvs/src/lib/libc/regex/re_format.7,v
retrieving revision 1.16
diff -u -p -r1.16 re_format.7
--- re_format.7	5 Jun 2013 22:05:29 -0000	1.16
+++ re_format.7	1 Sep 2014 03:51:27 -0000
@@ -304,6 +304,12 @@
 This is an extension,
 compatible with but not specified by POSIX, and should be used with
 caution in software intended to be portable to other systems.
+The additional word delimiters
+.Ql \e<
+and
+.Ql \e>
+are provided to ease compatibility with traditional SVR4
+systems but are not portable and should be avoided.
 .Pp
 In the event that an RE could match more than one substring of a given
 string,
Index: regcomp.c
===================================================================
RCS file: /cvs/src/lib/libc/regex/regcomp.c,v
retrieving revision 1.24
diff -u -p -r1.24 regcomp.c
--- regcomp.c	6 May 2014 15:48:38 -0000	1.24
+++ regcomp.c	1 Sep 2014 03:25:44 -0000
@@ -349,7 +349,17 @@ p_ere_exp(struct parse *p)
 	case '\\':
 		REQUIRE(MORE(), REG_EESCAPE);
 		c = GETNEXT();
-		ordinary(p, c);
+		switch (c) {
+		case '<':
+			EMIT(OBOW, 0);
+			break;
+		case '>':
+			EMIT(OEOW, 0);
+			break;
+		default:
+			ordinary(p, c);
+			break;
+		}
 		break;
 	case '{':		/* okay as ordinary except if digit follows */
 		REQUIRE(!MORE() || !isdigit((uch)PEEK()), REG_BADRPT);
@@ -500,6 +510,12 @@ p_simp_re(struct parse *p,
 		break;
 	case '[':
 		p_bracket(p);
+		break;
+	case BACKSL|'<':
+		EMIT(OBOW, 0);
+		break;
+	case BACKSL|'>':
+		EMIT(OEOW, 0);
 		break;
 	case BACKSL|'{':
 		SETERROR(REG_BADRPT);
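
For anyone who wants to poke at this, a small helper along the following lines may be handy. It is only a sketch and not part of the diff: word_match is a made-up name, and it relies on nothing beyond the standard regcomp/regexec/regfree interface.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical test helper, not part of the diff.
 * Returns 1 if pattern matches somewhere in text, 0 if regexec()
 * reports anything other than a match, and -1 if regcomp() rejects
 * the pattern.  Pass REG_EXTENDED in cflags for an ERE, 0 for a BRE.
 */
int
word_match(const char *pattern, const char *text, int cflags)
{
	regex_t re;
	int rc;

	assert(pattern != NULL && text != NULL);

	/* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return -1;

	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);

	return rc == 0 ? 1 : 0;
}
```

With the diff applied, word_match("\\<cat\\>", "the cat sat", 0) should return 1 and word_match("\\<cat\\>", "concatenate", 0) should return 0. Passing REG_EXTENDED instead of 0 goes through p_ere_exp rather than p_simp_re, so it is worth trying both.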