On 10/30/2016 12:01 PM, Pádraig Brady wrote:
> * doc/autoconf.texi (Limitations of usual tools): Display a
> table showing where the various syntaxes for word boundaries
> are supported.
> ---
>  doc/autoconf.texi | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/doc/autoconf.texi b/doc/autoconf.texi
> index 4be1f70..2e4b7ba 100644
> --- a/doc/autoconf.texi
> +++ b/doc/autoconf.texi
> @@ -19666,6 +19666,18 @@ $ @kbd{echo abc | busybox sed '/a\(b\)c/ s/a\(b\)c/\1/'}
>  b
>  @end example
>  
> +Portable scripts should be aware of the inconsistencies and
> +options for handling word boundaries.
> +
> +@example
> +                \<      \b      [[:<:]]
> +Solaris 10      yes     no      no
> +Solaris XPG4    yes     no      error
> +NetBSD 5.1      no      no      yes
> +FreeBSD 9.1     no      no      yes
> +GNU             yes     yes     error
> +busybox         yes     yes     error
> +@end example

It might be nice to add Cygwin to the list, although I don't know if one
row is sufficient.  Cygwin bases its libc regex engine on BSD code but
adds an extension for \< and \>, so the behavior you get depends on
whether a program uses the libc regex or its own.  Cygwin grep supports
\< and \b but not [[:<:]], because it uses gnulib and bypasses the native
regex (that is, you get GNU behavior); a native application supports
[[:<:]] and \< but not \b, because of the BSD heritage plus the Cygwin
extension.
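
To see which row a given sed falls into, something along these lines
could be used.  This is only a sketch: probe_word_boundary is a made-up
helper name, not anything in Autoconf, and it assumes the tool takes a
sed-style script as its last argument and reads standard input.

```shell
# probe_word_boundary TOOL [ARGS...]
# Hypothetical helper (not part of Autoconf): report how TOOL treats each
# word-boundary syntax, printing "yes", "no", or "error" per syntax.
probe_word_boundary ()
{
  for probe_re in '\<' '\b' '[[:<:]]'; do
    # In "ac c" only the second "c" starts a word, so a working
    # start-of-word anchor must yield "ac X".  Any other output means the
    # syntax compiled but did not behave as a word boundary.
    if probe_out=`echo 'ac c' | "$@" "s/${probe_re}c/X/" 2>/dev/null`; then
      case $probe_out in
        'ac X') probe_res=yes ;;
        *)      probe_res=no ;;
      esac
    else
      # Nonzero exit status: the tool rejected the expression.
      probe_res=error
    fi
    printf '%s\t%s\n' "$probe_re" "$probe_res"
  done
}
```

Running "probe_word_boundary sed" or "probe_word_boundary busybox sed"
prints one line per syntax.  It only exercises the start-of-word forms;
\> and [[:>:]] would need a second pass.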

It may be worth pointing out that POSIX does not require ANY support for
word boundaries in regex.
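
Given that, a script that cannot rely on any of the three syntaxes can
spell the boundary out with ordinary ERE constructs.  A minimal sketch,
assuming the word contains no regex metacharacters and that "word
character" means alphanumeric or underscore; match_word is a
hypothetical name used only for illustration:

```shell
# match_word WORD
# Hypothetical helper: succeed if WORD occurs as a whole word on some
# line of standard input.  It uses only ERE alternation, anchors, and
# bracket expressions rather than \<, \b, or [[:<:]].
# Assumes WORD contains no regex metacharacters.
match_word ()
{
  grep -E "(^|[^[:alnum:]_])$1([^[:alnum:]_]|\$)" >/dev/null
}
```

Unlike a true zero-width boundary, this consumes the neighboring
character, so it suits matching a line but needs more care inside a sed
substitution.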

ACK.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
