On 01/11/16 13:46, Eric Blake wrote:
> On 10/30/2016 12:01 PM, Pádraig Brady wrote:
>> * doc/autoconf.texi (Limitations of usual tools): Display a
>> table showing where the various syntaxes for word boundaries
>> are supported.
>> ---
>> doc/autoconf.texi | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/doc/autoconf.texi b/doc/autoconf.texi
>> index 4be1f70..2e4b7ba 100644
>> --- a/doc/autoconf.texi
>> +++ b/doc/autoconf.texi
>> @@ -19666,6 +19666,18 @@ $ @kbd{echo abc | busybox sed '/a\(b\)c/
>> s/a\(b\)c/\1/'}
>> b
>> @end example
>>
>> +Portable scripts should be aware of the inconsistencies and
>> +options for handling word boundaries.
>> +
>> +@example
>> +               \<    \b    [[:<:]]
>> +Solaris 10     yes   no    no
>> +Solaris XPG4   yes   no    error
>> +NetBSD 5.1     no    no    yes
>> +FreeBSD 9.1    no    no    yes
>> +GNU            yes   yes   error
>> +busybox        yes   yes   error
>> +@end example
>
> It might be nice to add Cygwin to the list, although I don't know if one
> row is sufficient. It bases its regex engine on BSD code but adds an
> extension for \< and \>; but depending on whether a program uses the
> libc regex or its own, you can get GNU behavior (that is, Cygwin grep
> supports \< and \b but not [[:<:]] because it uses gnulib and bypasses
> native regex; while a native application supports [[:<:]] and \< but not
> \b because of the BSD heritage plus cygwin extension).
Interesting re Cygwin, though here we're only considering sed,
which I think would fall under the GNU row?
I've updated the attached with the note about POSIX.
(I don't have commit access to this repo)
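
A sed that is not in the table could be checked with a probe along
these lines. This is only a sketch: probe_boundary is a hypothetical
helper, not part of autoconf, and it has not been run on the systems
listed in the table. It reports "no" both when the syntax is silently
unsupported and when sed rejects it with an error, so it does not
distinguish the "no" and "error" cells.

```shell
# Hypothetical helper: print "yes" if the sed named in $1 treats the
# regex in $2 as a word-boundary anchor, "no" otherwise.
probe_boundary () {
  # In "ab b" only the second "b" starts a word, so a working anchor
  # gives "ab X".  No match, a wrong match, or a sed error (empty
  # output) all count as "no".
  out=`printf '%s\n' 'ab b' | "$1" "s/${2}b/X/" 2>/dev/null`
  if test "x$out" = "xab X"; then
    echo yes
  else
    echo no
  fi
}

# Probe the three syntaxes from the table against the sed found in PATH.
for re in '\<' '\b' '[[:<:]]'; do
  res=`probe_boundary sed "$re"`
  printf '%s\t%s\n' "$re" "$res"
done
```

The input is chosen so that a sed which takes \b or \< as a literal
character simply fails to match, rather than producing a false "yes".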
thanks,
Pádraig
From a5333e422d20f0e43407919ae1d5b46ebefc9c4e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?P=C3=A1draig=20Brady?= <[email protected]>
Date: Sun, 30 Oct 2016 16:58:09 +0000
Subject: [PATCH] doc: detail inconsistencies in sed word boundary handling
* doc/autoconf.texi (Limitations of usual tools): Display a
table showing where the various syntaxes for word boundaries
are supported.
---
doc/autoconf.texi | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index 4be1f70..3e3a894 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -19666,6 +19666,18 @@ $ @kbd{echo abc | busybox sed '/a\(b\)c/ s/a\(b\)c/\1/'}
b
@end example
+Portable scripts should take care when matching word boundaries: POSIX
+specifies no syntax for them, and implementations differ in what they support.
+
+@example
+               \<    \b    [[:<:]]
+Solaris 10     yes   no    no
+Solaris XPG4   yes   no    error
+NetBSD 5.1     no    no    yes
+FreeBSD 9.1    no    no    yes
+GNU            yes   yes   error
+busybox        yes   yes   error
+@end example
@item @command{sed} (@samp{t})
@c ---------------------------
--
2.5.5