Re: sed '\n\nnd'

2020-03-26 Thread Robert Elz
Date: Thu, 26 Mar 2020 09:11:16 +
From: Geoff Clare
Message-ID:  <20200326091116.GB22632@lt2.masqnet>

  | Given that implementations differ, we should probably make the
  | behaviour explicitly unspecified.

I'd agree with that - it isn't as though it really matters except to
people looking for obscure things to test - the \d form of "anything
can be the delimiter" in practice is used with delims like ! or | or
; (etc) - not with alphas (though \x I have seen on occasion) and
certainly not with the chars where \x in other places means something
different from x (or an escaped x).

Making it unspecified seems like a reasonable thing to do, even if the
GNU people do decide that their \n behaviour is a bug.

kre
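
To illustrate the common usage described above, here is a minimal sketch of a context address with a punctuation delimiter. The sample paths are made up for the example; only the \cBREc form itself comes from the thread.

```shell
# Select lines containing /usr/bin, using | as the delimiter so that
# the slashes inside the BRE need no escaping.
with_bar=$(printf '%s\n' /usr/bin/sed /opt/sed | sed -n '\|/usr/bin|p')

# The same address written with the default / delimiter.
with_slash=$(printf '%s\n' /usr/bin/sed /opt/sed | sed -n '/\/usr\/bin/p')

printf '%s\n' "$with_bar" "$with_slash"
```

Neither form contains a backslash followed by a letter that doubles as the delimiter, so the ambiguity discussed in this thread does not arise.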




Re: sed '\n\nnd'

2020-03-26 Thread shwaresyst

Sorry, I was confusing the 'd' command, thinking 'display', with the 'l' 
command. Since stdout is supposed to echo after processing, I did have it 
reversed; if the GNU version echoes the 'n' it probably wasn't deleted, and 
for the others it likely was.

On Wednesday, March 25, 2020 Harald van Dijk  wrote:
On 25/03/2020 23:30, shwaresyst wrote:
> yes, without them the argument would be "nnnd", after quote removal by 
> the shell. The reasoning in first reply was meant to show that the 
> non-GNU versions might be erroneously treating the second '\' as "do 
> control alias processing always", ignoring that its use as delimiter 
> overrides that interpretation, to get the results observed.

Again, it's the BSD version that treats the second \n as a literal n, treating 
the backslash in there as just escaping the delimiter character. You 
have it backwards. The GNU version is the one that treats the second \n 
as <newline>.

> 
> On Wednesday, March 25, 2020 Harald van Dijk  wrote:
> 
> On 25/03/2020 21:09, shwaresyst wrote:
>  > If it wasn't in single quotes, then that might be plausible, but I don't
>  > see it as the intent since no other aliases are excluded as
>  > possibilities for after the '/'. The initial "\n" makes 'n' the
>  > delimiter, the 2nd overrides it as being the BRE terminator, and the
>  > following 'n' is the terminator, before the 'd' command. Should there be
>  > something explicit about aliases not being usable when repurposed as
>  > delimiter, maybe.
> 
> This reply makes no sense to me, sorry. The single quotes are processed
> at the shell level. Without single quotes, there would be no backslash
> for sed to process.
> 
> Regardless, the only thing I wrote was that you simultaneously
> considered the GNU version more correct and explained it in a way that
> led me to believe you actually consider the BSD version more correct. I
> wrote absolutely nothing about what the standard says or intends to say.
> 
>  > 
>  > On Wednesday, March 25, 2020 Harald van Dijk wrote:
>  >
>  > On 25/03/2020 19:44, shwaresyst wrote:
>  >  > The GNU version is more correct, in my opinion, in that the use of n as
>  >  > a delimiter should take precedence over its use as control character
>  >  > alias with the wording as is. The other versions appear to consider the
>  >  > BRE as <newline> so does not match 'n'.
>  >
>  > You have that backwards, don't you? The GNU version lets the use of \n
>  > as a control character take precedence over its use as a delimiter.
>  > That's why n gets printed: \n\nn is treated as /\n/, which can never
>  > match any single-line string, so nothing gets deleted.
>  >
>  > Likewise,
>  >
>  >    echo n | sed '\n[^\n]nd'
>  >
>  > prints nothing with GNU sed, but prints n with FreeBSD sed for the same
>  > reason: 'n' does contain a character that is not <newline>, but does not
>  > contain any character that is not 'n'.
>  >
>  >
>  >  >
>  >  > On Wednesday, March 25, 2020 Oğuz wrote:
>  >  >
>  >  >      echo n | sed '\n\nnd'
>  >  >
>  >  > Above command returns 'n' with GNU sed, and nothing with BSD seds and
>  >  > OmniOS sed. [...]


Re: sed '\n\nnd'

2020-03-26 Thread Oğuz
On Thursday, 26 March 2020, Joerg Schilling <
joerg.schill...@fokus.fraunhofer.de> wrote:

> Oğuz wrote:
>
> > > Given that implementations differ, we should probably make the
> > > behaviour explicitly unspecified.
> >
> > But this might be a bug in GNU's implementation.
>
> Why should a behavior that is aligned with the classical UNIX behavior
> (Solaris) be buggy?
>
>
Unlike you guys I don't have access to all UNIX operating systems. I just
thought it was a bug based on my observations on GNU and BSD, and learned
that it's not.


> Jörg
>
> --
>  EMail:jo...@schily.net(home) Jörg Schilling D-13353
> Berlin
> joerg.schill...@fokus.fraunhofer.de (work) Blog:
> http://schily.blogspot.com/
>  URL: http://cdrecord.org/private/ http://sf.net/projects/
> schilytools/files/'
>


-- 
Oğuz


Re: sed '\n\nnd'

2020-03-26 Thread Joerg Schilling
Oğuz wrote:

> > Given that implementations differ, we should probably make the
> > behaviour explicitly unspecified.
>
> But this might be a bug in GNU's implementation.

Why should a behavior that is aligned with the classical UNIX behavior 
(Solaris) be buggy?

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: sed '\n\nnd'

2020-03-26 Thread Joerg Schilling
Geoff Clare  wrote:

> Solaris and HP-UX output "n" the same as GNU sed.

So you verified what I assumed from the fact that Solaris, AIX and HP-UX are 
based on a common source from the i18n project.

Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



Re: sed '\n\nnd'

2020-03-26 Thread Geoff Clare
Oğuz  wrote, on 26 Mar 2020:
>
> > Given that implementations differ, we should probably make the
> > behaviour explicitly unspecified.
> 
> But this might be a bug in GNU's implementation.
> 
[...]
> 
> Personally, I think BSD sed's behavior should be standardized and GNU
> would not object to that.

Solaris and HP-UX output "n" the same as GNU sed.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England



Re: sed '\n\nnd'

2020-03-26 Thread Oğuz
> Given that implementations differ, we should probably make the
> behaviour explicitly unspecified.

But this might be a bug in GNU's implementation.

$ echo t | sed 'st\ttt' | xxd
00000000: 0a   .
$
$ echo n | sed 'sn\nnn' | xxd
00000000: 6e0a n.

Though both '\n' and '\t' are listed in the manual as supported
control character escape sequences, they are treated differently, as
seen above, and that makes no sense. I filed a bug report about this and
am waiting for a reply.
Personally, I think BSD sed's behavior should be standardized and GNU
would not object to that.

-- 
Oğuz
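
The comparison above can be repeated without xxd. The following is a minimal sketch, assuming only a POSIX shell and whatever sed is first in PATH; the variable names are illustrative and not from the original message.

```shell
# Delimiter t, with \t inside the BRE. The thread reports that GNU sed
# removes the t here, leaving an empty line.
t_result=$(echo t | sed 'st\ttt')

# Delimiter n, with \n inside the BRE. The result depends on whether the
# implementation reads \n as <newline> or as a literal n.
n_result=$(echo n | sed 'sn\nnn')

# Brackets make an empty result visible.
printf 't delimiter: [%s]\n' "$t_result"
printf 'n delimiter: [%s]\n' "$n_result"
```

An empty pair of brackets means the character was substituted away; a bracketed letter means the BRE did not match it.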



Re: sed '\n\nnd'

2020-03-26 Thread Geoff Clare
Oğuz  wrote, on 25 Mar 2020:
>
> echo n | sed '\n\nnd'
> 
> Above command returns 'n' with GNU sed, and nothing with BSD seds and
> OmniOS sed. The standard says
> 
> 
>   - In a context address, the construction "\cBREc", where *c* is any
>     character other than <backslash> or <newline>, shall be identical to
>     "/BRE/". If the character designated by *c* appears following a
>     <backslash>, then it shall be considered to be that literal character,
>     which shall not terminate the BRE. For example, in the context address
>     "\xabc\xdefx", the second *x* stands for itself, so that the BRE is
>     "abcxdef".
> 
>   - The escape sequence '\n' shall match a <newline> embedded in the pattern
>     space. A literal <newline> shall not be used in the BRE of a context
>     address or in the substitute function.
> 
> 
> but this is not clear at all. Which is the correct behavior here?

Neither is more correct than the other because, as you said yourself,
the standard is unclear. A formal interpretation would say "The standard
is unclear on this issue, and no conformance distinction can be made
between alternative implementations based on this."

Given that implementations differ, we should probably make the
behaviour explicitly unspecified.

-- 
Geoff Clare 
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
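
The uncontroversial part of the quoted wording can be checked with the standard's own example. This is a minimal sketch; the two input lines are invented for the test, and the variable names are illustrative.

```shell
# \xabc\xdefx uses x as the delimiter; per the quoted text the escaped x
# inside stands for itself, so the BRE is abcxdef.
custom=$(printf '%s\n' abcxdef abcdef | sed -n '\xabc\xdefxp')

# The same address written with the default delimiter.
default=$(printf '%s\n' abcxdef abcdef | sed -n '/abcxdef/p')

printf '%s\n' "$custom" "$default"
```

The disagreement in this thread only appears when the delimiter is a letter such as n, whose escaped form also has a meaning of its own inside a BRE.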



Re: sed '\n\nnd'

2020-03-25 Thread Robert Elz
Date: Wed, 25 Mar 2020 21:09:38 + (UTC)
From: shwaresyst
Message-ID:  <1031615939.2118006.1585170578...@mail.yahoo.com>

  | If it wasn't in single quotes, then that might be plausible,

As has been said elsewhere, that's nonsense.  The quotes are just
making clear what (exactly) is seen by sed.

  | but I don't see it as the intent since no other aliases are excluded
  | as possibilities for after the '/'. The initial "\n" makes 'n'
  | the delimiter, the 2nd overrides it as being the BRE terminator,
  | and the following 'n' is the terminator,

Yes, that is exactly what the BSD implementation does, and the GNU one
does not.

So, as Harald said, you have it backwards, your interpretation (with which
I agree, probably unsurprisingly) matches the BSD interpretation (the
expression deletes lines containing an 'n' character) and not the GNU one
(where it doesn't).

kre
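
The two readings can be written unambiguously with the default delimiter. This is a minimal sketch, with invented input lines and illustrative variable names, of what each interpretation amounts to.

```shell
# BSD-style reading of '\n\nnd': the BRE is a literal n, so every line
# containing an n is deleted.
literal_n=$(printf '%s\n' n x | sed '/n/d')

# GNU-style reading: the BRE is <newline>. A pattern space holding a
# single input line never contains one, so nothing is deleted.
newline_bre=$(printf '%s\n' n x | sed '/\n/d')

printf 'literal n: %s\n' "$literal_n"
printf 'newline:   %s\n' "$newline_bre"
```

Scripts that need one behaviour or the other can use these spellings and avoid n as a delimiter altogether.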



Re: sed '\n\nnd'

2020-03-25 Thread shwaresyst

yes, without them the argument would be "nnnd", after quote removal by the 
shell. The reasoning in first reply was meant to show that the non-GNU versions 
might be erroneously treating the second '\' as "do control alias processing 
always", ignoring that its use as delimiter overrides that interpretation, to 
get the results observed.

On Wednesday, March 25, 2020 Harald van Dijk  wrote:
On 25/03/2020 21:09, shwaresyst wrote:
> If it wasn't in single quotes, then that might be plausible, but I don't 
> see it as the intent since no other aliases are excluded as 
> possibilities for after the '/'. The initial "\n" makes 'n' the 
> delimiter, the 2nd overrides it as being the BRE terminator, and the 
> following 'n' is the terminator, before the 'd' command. Should there be 
> something explicit about aliases not being usable when repurposed as 
> delimiter, maybe.

This reply makes no sense to me, sorry. The single quotes are processed 
at the shell level. Without single quotes, there would be no backslash 
for sed to process.

Regardless, the only thing I wrote was that you simultaneously 
considered the GNU version more correct and explained it in a way that 
led me to believe you actually consider the BSD version more correct. I 
wrote absolutely nothing about what the standard says or intends to say.

> 
> On Wednesday, March 25, 2020 Harald van Dijk  wrote:
> 
> On 25/03/2020 19:44, shwaresyst wrote:
>  > The GNU version is more correct, in my opinion, in that the use of n as
>  > a delimiter should take precedence over its use as control character
>  > alias with the wording as is. The other versions appear to consider the
>  > BRE as <newline> so does not match 'n'.
> 
> You have that backwards, don't you? The GNU version lets the use of \n
> as a control character take precedence over its use as a delimiter.
> That's why n gets printed: \n\nn is treated as /\n/, which can never
> match any single-line string, so nothing gets deleted.
> 
> Likewise,
> 
>    echo n | sed '\n[^\n]nd'
> 
> prints nothing with GNU sed, but prints n with FreeBSD sed for the same
> reason: 'n' does contain a character that is not <newline>, but does not
> contain any character that is not 'n'.
> 
> 
>  > 
>  > On Wednesday, March 25, 2020 Oğuz wrote:
>  >
>  >      echo n | sed '\n\nnd'
>  >
>  > Above command returns 'n' with GNU sed, and nothing with BSD seds and
>  > OmniOS sed. [...]



Re: sed '\n\nnd'

2020-03-25 Thread Harald van Dijk

On 25/03/2020 23:30, shwaresyst wrote:
yes, without them the argument would be "nnnd", after quote removal by 
the shell. The reasoning in first reply was meant to show that the 
non-GNU versions might be erroneously treating the second '\' as "do 
control alias processing always", ignoring that its use as delimiter 
overrides that interpretation, to get the results observed.


Again, it's the BSD version that treats the second \n as a literal n, treating 
the backslash in there as just escaping the delimiter character. You 
have it backwards. The GNU version is the one that treats the second \n 
as <newline>.




On Wednesday, March 25, 2020 Harald van Dijk  wrote:

On 25/03/2020 21:09, shwaresyst wrote:
 > If it wasn't in single quotes, then that might be plausible, but I don't
 > see it as the intent since no other aliases are excluded as
 > possibilities for after the '/'. The initial "\n" makes 'n' the
 > delimiter, the 2nd overrides it as being the BRE terminator, and the
 > following 'n' is the terminator, before the 'd' command. Should there be
 > something explicit about aliases not being usable when repurposed as
 > delimiter, maybe.

This reply makes no sense to me, sorry. The single quotes are processed
at the shell level. Without single quotes, there would be no backslash
for sed to process.

Regardless, the only thing I wrote was that you simultaneously
considered the GNU version more correct and explained it in a way that
led me to believe you actually consider the BSD version more correct. I
wrote absolutely nothing about what the standard says or intends to say.

 > 
 > On Wednesday, March 25, 2020 Harald van Dijk wrote:
 >
 > On 25/03/2020 19:44, shwaresyst wrote:
 >  > The GNU version is more correct, in my opinion, in that the use of n as
 >  > a delimiter should take precedence over its use as control character
 >  > alias with the wording as is. The other versions appear to consider the
 >  > BRE as <newline> so does not match 'n'.
 >
 > You have that backwards, don't you? The GNU version lets the use of \n
 > as a control character take precedence over its use as a delimiter.
 > That's why n gets printed: \n\nn is treated as /\n/, which can never
 > match any single-line string, so nothing gets deleted.
 >
 > Likewise,
 >
 >    echo n | sed '\n[^\n]nd'
 >
 > prints nothing with GNU sed, but prints n with FreeBSD sed for the same
 > reason: 'n' does contain a character that is not <newline>, but does not
 > contain any character that is not 'n'.
 >
 >
 >  >
 >  > On Wednesday, March 25, 2020 Oğuz wrote:
 >  >
 >  >      echo n | sed '\n\nnd'
 >  >
 >  > Above command returns 'n' with GNU sed, and nothing with BSD seds and
 >  > OmniOS sed. [...]




Re: sed '\n\nnd'

2020-03-25 Thread Harald van Dijk

On 25/03/2020 21:09, shwaresyst wrote:
If it wasn't in single quotes, then that might be plausible, but I don't 
see it as the intent since no other aliases are excluded as 
possibilities for after the '/'. The initial "\n" makes 'n' the 
delimiter, the 2nd overrides it as being the BRE terminator, and the 
following 'n' is the terminator, before the 'd' command. Should there be 
something explicit about aliases not being usable when repurposed as 
delimiter, maybe.


This reply makes no sense to me, sorry. The single quotes are processed 
at the shell level. Without single quotes, there would be no backslash 
for sed to process.


Regardless, the only thing I wrote was that you simultaneously 
considered the GNU version more correct and explained it in a way that 
led me to believe you actually consider the BSD version more correct. I 
wrote absolutely nothing about what the standard says or intends to say.




On Wednesday, March 25, 2020 Harald van Dijk  wrote:

On 25/03/2020 19:44, shwaresyst wrote:
 > The GNU version is more correct, in my opinion, in that the use of n as
 > a delimiter should take precedence over its use as control character
 > alias with the wording as is. The other versions appear to consider the
 > BRE as <newline> so does not match 'n'.

You have that backwards, don't you? The GNU version lets the use of \n
as a control character take precedence over its use as a delimiter.
That's why n gets printed: \n\nn is treated as /\n/, which can never
match any single-line string, so nothing gets deleted.

Likewise,

   echo n | sed '\n[^\n]nd'

prints nothing with GNU sed, but prints n with FreeBSD sed for the same
reason: 'n' does contain a character that is not <newline>, but does not
contain any character that is not 'n'.


 > 
 > On Wednesday, March 25, 2020 Oğuz wrote:
 >
 >      echo n | sed '\n\nnd'
 >
 > Above command returns 'n' with GNU sed, and nothing with BSD seds and
 > OmniOS sed. [...]




Re: sed '\n\nnd'

2020-03-25 Thread shwaresyst

If it wasn't in single quotes, then that might be plausible, but I don't see it 
as the intent since no other aliases are excluded as possibilities for after 
the '/'. The initial "\n" makes 'n' the delimiter, the 2nd overrides it as 
being the BRE terminator, and the following 'n' is the terminator, before the 
'd' command. Should there be something explicit about aliases not being usable 
when repurposed as delimiter, maybe.
On Wednesday, March 25, 2020 Harald van Dijk  wrote:
On 25/03/2020 19:44, shwaresyst wrote:
> The GNU version is more correct, in my opinion, in that the use of n as 
> a delimiter should take precedence over its use as control character 
> alias with the wording as is. The other versions appear to consider the 
> BRE as <newline> so does not match 'n'.

You have that backwards, don't you? The GNU version lets the use of \n 
as a control character take precedence over its use as a delimiter. 
That's why n gets printed: \n\nn is treated as /\n/, which can never 
match any single-line string, so nothing gets deleted.

Likewise,

  echo n | sed '\n[^\n]nd'

prints nothing with GNU sed, but prints n with FreeBSD sed for the same 
reason: 'n' does contain a character that is not <newline>, but does not 
contain any character that is not 'n'.

> 
> On Wednesday, March 25, 2020 Oğuz  wrote:
> 
>      echo n | sed '\n\nnd'
> 
> Above command returns 'n' with GNU sed, and nothing with BSD seds and 
> OmniOS sed. [...]


Re: sed '\n\nnd'

2020-03-25 Thread Harald van Dijk

On 25/03/2020 19:44, shwaresyst wrote:
The GNU version is more correct, in my opinion, in that the use of n as 
a delimiter should take precedence over its use as control character 
alias with the wording as is. The other versions appear to consider the 
BRE as <newline> so does not match 'n'.


You have that backwards, don't you? The GNU version lets the use of \n 
as a control character take precedence over its use as a delimiter. 
That's why n gets printed: \n\nn is treated as /\n/, which can never 
match any single-line string, so nothing gets deleted.


Likewise,

  echo n | sed '\n[^\n]nd'

prints nothing with GNU sed, but prints n with FreeBSD sed for the same 
reason: 'n' does contain a character that is not <newline>, but does not 
contain any character that is not 'n'.
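
The explanation above suggests a simple way to find out which reading a given sed uses. The following is a minimal sketch; classify_sed is a hypothetical helper name, and the labels it prints are made up for this example.

```shell
# Hypothetical helper: report how the sed found in PATH reads \n when n
# is also the delimiter of a context address.
classify_sed() {
    out=$(echo n | sed '\n\nnd')
    if [ "$out" = n ]; then
        # The line survived, so the BRE did not match it:
        # \n was read as <newline> (GNU, per the thread).
        echo newline
    else
        # The line was deleted, so the BRE matched the n:
        # \n was read as a literal n (BSD, per the thread).
        echo literal
    fi
}

classify_sed
```

A probe like this only describes the sed that was tested; since the standard is unclear here, portable scripts are better off avoiding the construct.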




On Wednesday, March 25, 2020 Oğuz  wrote:

     echo n | sed '\n\nnd'

Above command returns 'n' with GNU sed, and nothing with BSD seds and 
OmniOS sed. [...]




Re: sed '\n\nnd'

2020-03-25 Thread Joerg Schilling
shwaresyst  wrote:

> The GNU version is more correct, in my opinion, in that the use of n as a 
> delimiter should take precedence over its use as control character alias with 
> the wording as is. The other versions appear to consider the BRE as <newline> 
> so does not match 'n'.

The Solaris sed behaves like the GNU sed in this case.

Illumos uses the BSD sed, since the sed from Solaris, AIX and HP-UX is common 
closed source and links only against the closed-source i18n code from 
Solaris. As Illumos uses the i18n code in libc from FreeBSD, the closed-source 
Solaris sed cannot be used anymore.



Jörg

-- 
 EMail:jo...@schily.net(home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'



RE: sed '\n\nnd'

2020-03-25 Thread shwaresyst

The GNU version is more correct, in my opinion, in that the use of n as a 
delimiter should take precedence over its use as control character alias with 
the wording as is. The other versions appear to consider the BRE as <newline> 
so does not match 'n'.
On Wednesday, March 25, 2020 Oğuz  wrote:
    echo n | sed '\n\nnd'
Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS 
sed. The standard says 
   
   - In a context address, the construction "\cBREc", where c is any character
     other than <backslash> or <newline>, shall be identical to "/BRE/". If the
     character designated by c appears following a <backslash>, then it shall
     be considered to be that literal character, which shall not terminate the
     BRE. For example, in the context address "\xabc\xdefx", the second x
     stands for itself, so that the BRE is "abcxdef".

   - The escape sequence '\n' shall match a <newline> embedded in the pattern
     space. A literal <newline> shall not be used in the BRE of a context
     address or in the substitute function.


but this is not clear at all. Which is the correct behavior here?


-- 
Oğuz