Re: add support for \< and \> word delimiters in regcomp
On Mon, Sep 01, 2014 at 12:41:37AM -0400, Ted Unangst wrote:
> On Mon, Sep 01, 2014 at 14:03, Jonathan Gray wrote:
> > This adds support for using the SVR4/glibc word delimiters in
> > regcomp as an extension to what POSIX requires.
> >
> > We already have [[:<:]] and [[:>:]] as extensions, apparently from
> > 'Henry Spencer's Alpha 3.0 regex release' back in 1993.
> >
> > But now Solaris/Linux/FreeBSD all have the other syntax and sadly
> > lots of uses of grep and sed in what are supposed to be portable
> > projects use it.
> >
> > This diff is from Garrett D'Amore in Illumos via FreeBSD.
> > https://www.illumos.org/issues/516
>
> I have a slight preference for my diff (I think it's clearer than
> deeper nested switches), but no matter.
>
> http://marc.info/?l=openbsd-tech&m=131094975127745&w=2

I'd be fine with that one going in as well. Are there any reasons
not to add it? I don't see a portable alternative here, as brought
up by Mark in that thread, and the "only if it's supported on the
majority of UNIX-like operating systems" comment seems to be true?
Re: add support for \< and \> word delimiters in regcomp
On Mon, 08 Sep 2014 02:28:42 +1000, Jonathan Gray wrote:
> I'd be fine with that one going in as well. Are there any reasons
> not to add it? I don't see a portable alternative here, as brought
> up by Mark in that thread, and the "only if it's supported on the
> majority of UNIX-like operating systems" comment seems to be true?

My preference is for Ted's patch. I was against this initially, but
now that it is supported by most modern systems I don't have a
problem with it.

 - todd
add support for \< and \> word delimiters in regcomp
This adds support for using the SVR4/glibc word delimiters in
regcomp as an extension to what POSIX requires.

We already have [[:<:]] and [[:>:]] as extensions, apparently from
'Henry Spencer's Alpha 3.0 regex release' back in 1993.

But now Solaris/Linux/FreeBSD all have the other syntax and sadly
lots of uses of grep and sed in what are supposed to be portable
projects use it.

This diff is from Garrett D'Amore in Illumos via FreeBSD.
https://www.illumos.org/issues/516

Index: re_format.7
===
RCS file: /cvs/src/lib/libc/regex/re_format.7,v
retrieving revision 1.16
diff -u -p -r1.16 re_format.7
--- re_format.7	5 Jun 2013 22:05:29 -	1.16
+++ re_format.7	1 Sep 2014 03:51:27 -
@@ -304,6 +304,12 @@
 This is an extension,
 compatible with but not specified by POSIX,
 and should be used with caution in software intended to be portable to other systems.
+The additional word delimiters
+.Ql \e<
+and
+.Ql \e>
+are provided to ease compatibility with traditional SVR4
+systems but are not portable and should be avoided.
 .Pp
 In the event that an RE could match more than one substring of a given
 string,
Index: regcomp.c
===
RCS file: /cvs/src/lib/libc/regex/regcomp.c,v
retrieving revision 1.24
diff -u -p -r1.24 regcomp.c
--- regcomp.c	6 May 2014 15:48:38 -	1.24
+++ regcomp.c	1 Sep 2014 03:25:44 -
@@ -349,7 +349,17 @@ p_ere_exp(struct parse *p)
 	case '\\':
 		REQUIRE(MORE(), REG_EESCAPE);
 		c = GETNEXT();
-		ordinary(p, c);
+		switch (c) {
+		case '<':
+			EMIT(OBOW, 0);
+			break;
+		case '>':
+			EMIT(OEOW, 0);
+			break;
+		default:
+			ordinary(p, c);
+			break;
+		}
 		break;
 	case '{':		/* okay as ordinary except if digit follows */
 		REQUIRE(!MORE() || !isdigit((uch)PEEK()), REG_BADRPT);
@@ -500,6 +510,12 @@ p_simp_re(struct parse *p,
 		break;
 	case '[':
 		p_bracket(p);
+		break;
+	case BACKSL|'<':
+		EMIT(OBOW, 0);
+		break;
+	case BACKSL|'>':
+		EMIT(OEOW, 0);
 		break;
 	case BACKSL|'{':
 		SETERROR(REG_BADRPT);
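
For anyone wanting to check how a given libc treats these delimiters,
a minimal sketch follows. It is not part of the diff above: the helper
name re_matches() is hypothetical, and the result depends on whether
the installed regcomp() implements \< and \> as an extension. On a
regex library without the extension, the ERE parser would presumably
treat \< as an ordinary '<', so the first call would report no match
rather than an error.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if `pattern` matches somewhere in `text`, 0 if it does
 * not, and -1 if regcomp() rejects the pattern on this system.
 * `cflags` is passed through to regcomp(): 0 for basic REs,
 * REG_EXTENDED for extended REs.
 */
int
re_matches(const char *pattern, const char *text, int cflags)
{
	regex_t re;
	int rc;

	/* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0 ? 1 : 0;
}
```

With the extension present, re_matches("\\<cat\\>", "the cat sat",
REG_EXTENDED) should find the whole word, while the same pattern
against "concatenate" should not match, since "cat" there is not
bounded by word edges. Passing 0 instead of REG_EXTENDED exercises the
basic RE path that the p_simp_re() hunk covers.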
Re: add support for \< and \> word delimiters in regcomp
On Mon, Sep 01, 2014 at 14:03, Jonathan Gray wrote:
> This adds support for using the SVR4/glibc word delimiters in
> regcomp as an extension to what POSIX requires.
>
> We already have [[:<:]] and [[:>:]] as extensions, apparently from
> 'Henry Spencer's Alpha 3.0 regex release' back in 1993.
>
> But now Solaris/Linux/FreeBSD all have the other syntax and sadly
> lots of uses of grep and sed in what are supposed to be portable
> projects use it.
>
> This diff is from Garrett D'Amore in Illumos via FreeBSD.
> https://www.illumos.org/issues/516

I have a slight preference for my diff (I think it's clearer than
deeper nested switches), but no matter.

http://marc.info/?l=openbsd-tech&m=131094975127745&w=2