Re: sed '\n\nnd'
Date: Thu, 26 Mar 2020 09:11:16 +
From: Geoff Clare
Message-ID: <20200326091116.GB22632@lt2.masqnet>

  | Given that implementations differ, we should probably make the
  | behaviour explicitly unspecified.

I'd agree with that - it isn't as though it really matters except to people looking for obscure things to test - the \d form of "anything can be the delimiter" is in practice used with delimiters like ! or | or ; (etc) - not with alphas (though \x I have seen on occasion) and certainly not with the chars where \x in other places means something different from x (or an escaped x).

Making it unspecified seems like a reasonable thing to do, even if the GNU people do decide that their \n behaviour is a bug.

kre
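A minimal sketch of the everyday usage kre describes, assuming a POSIX shell and any sed: a punctuation delimiter such as | has no meaning as an escape sequence, so the context address is read the same way by every implementation. The function name `drop_usr_paths` is hypothetical and used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: delete every line that starts with /usr.
# '|' is the context-address delimiter, so the slash in the BRE
# does not need to be escaped.
drop_usr_paths() {
    sed '\|^/usr|d'
}

printf '/usr/bin\n/home\n' | drop_usr_paths   # prints /home
```

Choosing a delimiter that is neither a letter nor a character with its own escaped meaning sidesteps the question discussed in this thread.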
Re: sed '\n\nnd'
Sorry, I was confusing the 'd' command, thinking 'display', for the 'l' command. Since stdout is supposed to echo after processing, I did have it reversed; if the GNU version echoes the 'n' it probably wasn't 'delete'd, and for the others it likely was.

On Wednesday, March 25, 2020 Harald van Dijk wrote:

On 25/03/2020 23:30, shwaresyst wrote:
> yes, without them the argument would be "nnnd", after quote removal by the shell. The reasoning in first reply was meant to show that the non-GNU versions might be erroneously treating the second '\' as "do control alias processing always", ignoring that its use as delimiter overrides that interpretation, to get the results observed.

Again, it's the BSD version that treats the second \n as <n>, treating the backslash in there as just escaping the delimiter character. You have it backwards. The GNU version is the one that treats the second \n as <newline>.

> On Wednesday, March 25, 2020 Harald van Dijk wrote:
>
> On 25/03/2020 21:09, shwaresyst wrote:
> > If it wasn't in single quotes, then that might be plausible, but I don't see it as the intent since no other aliases are excluded as possibilities for after the '/'. The initial "\n" makes 'n' the delimiter, the 2nd overrides it as being the BRE terminator, and the following 'n' is the terminator, before the 'd' command. Should there be something explicit about aliases not being usable when repurposed as delimiter, maybe.
>
> This reply makes no sense to me, sorry. The single quotes are processed at the shell level. Without single quotes, there would be no backslash for sed to process.
>
> Regardless, the only thing I wrote was that you simultaneously considered the GNU version more correct and explained it in a way that led me to believe you actually consider the BSD version more correct. I wrote absolutely nothing about what the standard says or intends to say.
>
> > On Wednesday, March 25, 2020 Harald van Dijk wrote:
> >
> > On 25/03/2020 19:44, shwaresyst wrote:
> > > The GNU version is more correct, in my opinion, in that the use of n as a delimiter should take precedence over its use as control character alias with the wording as is. The other versions appear to consider the BRE as <newline> so does not match 'n'.
> >
> > You have that backwards, don't you? The GNU version lets the use of \n as a control character take precedence over its use as a delimiter. That's why n gets printed: \n\nn is treated as /\n/, which can never match any single-line string, so nothing gets deleted.
> >
> > Likewise,
> >
> >   echo n | sed '\n[^\n]nd'
> >
> > prints nothing with GNU sed, but prints n with FreeBSD sed for the same reason: 'n' does contain a character that is not <newline>, but does not contain any character that is not <n>.
> >
> > > On Wednesday, March 25, 2020 Oğuz wrote:
> > >
> > > echo n | sed '\n\nnd'
> > >
> > > Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. [...]
Re: sed '\n\nnd'
On Thursday, 26 March 2020, Joerg Schilling <joerg.schill...@fokus.fraunhofer.de> wrote:

> Oğuz wrote:
>
> > > Given that implementations differ, we should probably make the
> > > behaviour explicitly unspecified.
> >
> > But this might be a bug in GNU's implementation.
>
> Why should a behavior that is aligned with the classical UNIX behavior
> (Solaris) be buggy?

Unlike you guys I don't have access to all UNIX operating systems. I just thought it was a bug based on my observations on GNU and BSD, and learned that it's not.

> Jörg
>
> --
> EMail:jo...@schily.net (home) Jörg Schilling D-13353 Berlin
> joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
> URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'

--
Oğuz
Re: sed '\n\nnd'
Oğuz wrote:

> > Given that implementations differ, we should probably make the
> > behaviour explicitly unspecified.
>
> But this might be a bug in GNU's implementation.

Why should a behavior that is aligned with the classical UNIX behavior (Solaris) be buggy?

Jörg

--
EMail:jo...@schily.net (home) Jörg Schilling D-13353 Berlin
joerg.schill...@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.org/private/ http://sf.net/projects/schilytools/files/'
Re: sed '\n\nnd'
Geoff Clare wrote:

> Solaris and HP-UX output "n" the same as GNU sed.

So you verified what I assumed from the fact that Solaris, AIX and HP-UX are based on a common source from the i18n project.

Jörg
Re: sed '\n\nnd'
Oğuz wrote, on 26 Mar 2020:
>
> > Given that implementations differ, we should probably make the
> > behaviour explicitly unspecified.
>
> But this might be a bug in GNU's implementation.
> [...]
>
> Personally, I think BSD sed's behavior should be standardized and GNU
> would not object to that.

Solaris and HP-UX output "n" the same as GNU sed.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
Re: sed '\n\nnd'
> Given that implementations differ, we should probably make the
> behaviour explicitly unspecified.

But this might be a bug in GNU's implementation.

    $ echo t | sed 'st\ttt' | xxd
    : 0a .
    $
    $ echo n | sed 'sn\nnn' | xxd
    : 6e0a n.

Though both '\n' and '\t' are listed in the manual as supported control character escape sequences, they are treated differently, as seen above, and that makes no sense. I filed a bug report about this and am waiting for a reply.

Personally, I think BSD sed's behavior should be standardized and GNU would not object to that.

--
Oğuz
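The comparison above can be repeated on any system with a sketch like the following. It assumes only a POSIX shell, sed and od (used here in place of xxd, which is not always installed). `dump_subst` is a hypothetical helper name; the bytes printed depend on the sed implementation, so no expected output is given.

```shell
#!/bin/sh
# Hypothetical helper: feed a one-character line through a sed script
# and dump the resulting bytes.
#   $1 = input character, $2 = sed script
dump_subst() {
    echo "$1" | sed "$2" | od -c
}

# Delimiter 't': is \t the escaped delimiter or a tab?
dump_subst t 'st\ttt'
# Delimiter 'n': is \n the escaped delimiter or a newline?
dump_subst n 'sn\nnn'
```

If the two dumps differ in whether the input character survives, the local sed treats the two escapes differently, which is the inconsistency reported here.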
Re: sed '\n\nnd'
Oğuz wrote, on 25 Mar 2020:
>
> echo n | sed '\n\nnd'
>
> Above command returns 'n' with GNU sed, and nothing with BSD seds and
> OmniOS sed. The standard says
>
> - In a context address, the construction "\cBREc", where *c* is any
>   character other than <backslash> or <newline>, shall be identical to
>   "/BRE/". If the character designated by *c* appears following a
>   <backslash>, then it shall be considered to be that literal character,
>   which shall not terminate the BRE. For example, in the context address
>   "\xabc\xdefx", the second *x* stands for itself, so that the BRE is
>   "abcxdef".
> - The escape sequence '\n' shall match a <newline> embedded in the pattern
>   space. A literal <newline> shall not be used in the BRE of a context
>   address or in the substitute function.
>
> but this is not clear at all. Which is the correct behavior here?

Neither is more correct than the other because, as you said yourself, the standard is unclear. A formal interpretation would say "The standard is unclear on this issue, and no conformance distinction can be made between alternative implementations based on this."

Given that implementations differ, we should probably make the behaviour explicitly unspecified.

--
Geoff Clare
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England
Re: sed '\n\nnd'
Date: Wed, 25 Mar 2020 21:09:38 + (UTC)
From: shwaresyst
Message-ID: <1031615939.2118006.1585170578...@mail.yahoo.com>

  | If it wasn't in single quotes, then that might be plausible,

As has been said elsewhere, that's nonsense. The quotes are just making clear what (exactly) is seen by sed.

  | but I don't see it as the intent since no other aliases are excluded
  | as possibilities for after the '/'. The initial "\n" makes 'n'
  | the delimiter, the 2nd overrides it as being the BRE terminator,
  | and the following 'n' is the terminator,

Yes, that is exactly what the BSD implementation does, and the GNU one does not.

So, as Harald said, you have it backwards: your interpretation (with which I agree, probably unsurprisingly) matches the BSD interpretation (the expression deletes lines containing an 'n' character) and not the GNU one (where it doesn't).

kre
Re: sed '\n\nnd'
yes, without them the argument would be "nnnd", after quote removal by the shell. The reasoning in first reply was meant to show that the non-GNU versions might be erroneously treating the second '\' as "do control alias processing always", ignoring that its use as delimiter overrides that interpretation, to get the results observed.

On Wednesday, March 25, 2020 Harald van Dijk wrote:

On 25/03/2020 21:09, shwaresyst wrote:
> If it wasn't in single quotes, then that might be plausible, but I don't see it as the intent since no other aliases are excluded as possibilities for after the '/'. The initial "\n" makes 'n' the delimiter, the 2nd overrides it as being the BRE terminator, and the following 'n' is the terminator, before the 'd' command. Should there be something explicit about aliases not being usable when repurposed as delimiter, maybe.

This reply makes no sense to me, sorry. The single quotes are processed at the shell level. Without single quotes, there would be no backslash for sed to process.

Regardless, the only thing I wrote was that you simultaneously considered the GNU version more correct and explained it in a way that led me to believe you actually consider the BSD version more correct. I wrote absolutely nothing about what the standard says or intends to say.

> On Wednesday, March 25, 2020 Harald van Dijk wrote:
>
> On 25/03/2020 19:44, shwaresyst wrote:
> > The GNU version is more correct, in my opinion, in that the use of n as a delimiter should take precedence over its use as control character alias with the wording as is. The other versions appear to consider the BRE as <newline> so does not match 'n'.
>
> You have that backwards, don't you? The GNU version lets the use of \n as a control character take precedence over its use as a delimiter. That's why n gets printed: \n\nn is treated as /\n/, which can never match any single-line string, so nothing gets deleted.
>
> Likewise,
>
>   echo n | sed '\n[^\n]nd'
>
> prints nothing with GNU sed, but prints n with FreeBSD sed for the same reason: 'n' does contain a character that is not <newline>, but does not contain any character that is not <n>.
>
> > On Wednesday, March 25, 2020 Oğuz wrote:
> >
> > echo n | sed '\n\nnd'
> >
> > Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. [...]
Re: sed '\n\nnd'
On 25/03/2020 23:30, shwaresyst wrote:
> yes, without them the argument would be "nnnd", after quote removal by the shell. The reasoning in first reply was meant to show that the non-GNU versions might be erroneously treating the second '\' as "do control alias processing always", ignoring that its use as delimiter overrides that interpretation, to get the results observed.

Again, it's the BSD version that treats the second \n as <n>, treating the backslash in there as just escaping the delimiter character. You have it backwards. The GNU version is the one that treats the second \n as <newline>.

> On Wednesday, March 25, 2020 Harald van Dijk wrote:
>
> On 25/03/2020 21:09, shwaresyst wrote:
> > If it wasn't in single quotes, then that might be plausible, but I don't see it as the intent since no other aliases are excluded as possibilities for after the '/'. The initial "\n" makes 'n' the delimiter, the 2nd overrides it as being the BRE terminator, and the following 'n' is the terminator, before the 'd' command. Should there be something explicit about aliases not being usable when repurposed as delimiter, maybe.
>
> This reply makes no sense to me, sorry. The single quotes are processed at the shell level. Without single quotes, there would be no backslash for sed to process.
>
> Regardless, the only thing I wrote was that you simultaneously considered the GNU version more correct and explained it in a way that led me to believe you actually consider the BSD version more correct. I wrote absolutely nothing about what the standard says or intends to say.
>
> > On Wednesday, March 25, 2020 Harald van Dijk wrote:
> >
> > On 25/03/2020 19:44, shwaresyst wrote:
> > > The GNU version is more correct, in my opinion, in that the use of n as a delimiter should take precedence over its use as control character alias with the wording as is. The other versions appear to consider the BRE as <newline> so does not match 'n'.
> >
> > You have that backwards, don't you? The GNU version lets the use of \n as a control character take precedence over its use as a delimiter. That's why n gets printed: \n\nn is treated as /\n/, which can never match any single-line string, so nothing gets deleted.
> >
> > Likewise,
> >
> >   echo n | sed '\n[^\n]nd'
> >
> > prints nothing with GNU sed, but prints n with FreeBSD sed for the same reason: 'n' does contain a character that is not <newline>, but does not contain any character that is not <n>.
> >
> > > On Wednesday, March 25, 2020 Oğuz wrote:
> > >
> > > echo n | sed '\n\nnd'
> > >
> > > Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. [...]
Re: sed '\n\nnd'
On 25/03/2020 21:09, shwaresyst wrote:
> If it wasn't in single quotes, then that might be plausible, but I don't see it as the intent since no other aliases are excluded as possibilities for after the '/'. The initial "\n" makes 'n' the delimiter, the 2nd overrides it as being the BRE terminator, and the following 'n' is the terminator, before the 'd' command. Should there be something explicit about aliases not being usable when repurposed as delimiter, maybe.

This reply makes no sense to me, sorry. The single quotes are processed at the shell level. Without single quotes, there would be no backslash for sed to process.

Regardless, the only thing I wrote was that you simultaneously considered the GNU version more correct and explained it in a way that led me to believe you actually consider the BSD version more correct. I wrote absolutely nothing about what the standard says or intends to say.

> On Wednesday, March 25, 2020 Harald van Dijk wrote:
>
> On 25/03/2020 19:44, shwaresyst wrote:
> > The GNU version is more correct, in my opinion, in that the use of n as a delimiter should take precedence over its use as control character alias with the wording as is. The other versions appear to consider the BRE as <newline> so does not match 'n'.
>
> You have that backwards, don't you? The GNU version lets the use of \n as a control character take precedence over its use as a delimiter. That's why n gets printed: \n\nn is treated as /\n/, which can never match any single-line string, so nothing gets deleted.
>
> Likewise,
>
>   echo n | sed '\n[^\n]nd'
>
> prints nothing with GNU sed, but prints n with FreeBSD sed for the same reason: 'n' does contain a character that is not <newline>, but does not contain any character that is not <n>.
>
> > On Wednesday, March 25, 2020 Oğuz wrote:
> >
> > echo n | sed '\n\nnd'
> >
> > Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. [...]
Re: sed '\n\nnd'
If it wasn't in single quotes, then that might be plausible, but I don't see it as the intent since no other aliases are excluded as possibilities for after the '/'. The initial "\n" makes 'n' the delimiter, the 2nd overrides it as being the BRE terminator, and the following 'n' is the terminator, before the 'd' command. Should there be something explicit about aliases not being usable when repurposed as delimiter, maybe.

On Wednesday, March 25, 2020 Harald van Dijk wrote:

On 25/03/2020 19:44, shwaresyst wrote:
> The GNU version is more correct, in my opinion, in that the use of n as a delimiter should take precedence over its use as control character alias with the wording as is. The other versions appear to consider the BRE as <newline> so does not match 'n'.

You have that backwards, don't you? The GNU version lets the use of \n as a control character take precedence over its use as a delimiter. That's why n gets printed: \n\nn is treated as /\n/, which can never match any single-line string, so nothing gets deleted.

Likewise,

  echo n | sed '\n[^\n]nd'

prints nothing with GNU sed, but prints n with FreeBSD sed for the same reason: 'n' does contain a character that is not <newline>, but does not contain any character that is not <n>.

> On Wednesday, March 25, 2020 Oğuz wrote:
>
> echo n | sed '\n\nnd'
>
> Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. [...]
Re: sed '\n\nnd'
On 25/03/2020 19:44, shwaresyst wrote:
> The GNU version is more correct, in my opinion, in that the use of n as a delimiter should take precedence over its use as control character alias with the wording as is. The other versions appear to consider the BRE as <newline> so does not match 'n'.

You have that backwards, don't you? The GNU version lets the use of \n as a control character take precedence over its use as a delimiter. That's why n gets printed: \n\nn is treated as /\n/, which can never match any single-line string, so nothing gets deleted.

Likewise,

  echo n | sed '\n[^\n]nd'

prints nothing with GNU sed, but prints n with FreeBSD sed for the same reason: 'n' does contain a character that is not <newline>, but does not contain any character that is not <n>.

> On Wednesday, March 25, 2020 Oğuz wrote:
>
> echo n | sed '\n\nnd'
>
> Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. [...]
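The two readings of \n\nn described in this message can be written with the ordinary / delimiter, where no ambiguity arises. This is only an illustrative sketch assuming a POSIX shell and sed; the function names are hypothetical.

```shell
#!/bin/sh
# Reading 1 (what BSD sed does with \n\nn): the escaped delimiter is a
# literal 'n', so the address is equivalent to /n/.
reading_literal_n() {
    sed '/n/d'
}

# Reading 2 (what GNU sed does with \n\nn): \n keeps its meaning as
# <newline>, so the address is equivalent to /\n/.
reading_newline() {
    sed '/\n/d'
}

echo n | reading_literal_n   # the line matches and is deleted: no output
echo n | reading_newline     # no embedded newline in the pattern space: prints n
```

Written this way, both commands should behave the same on every implementation, which is why the disagreement only shows up when 'n' itself is chosen as the delimiter.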
Re: sed '\n\nnd'
shwaresyst wrote:

> The GNU version is more correct, in my opinion, in that the use of n as a
> delimiter should take precedence over its use as control character alias with
> the wording as is. The other versions appear to consider the BRE as <newline>
> so does not match 'n'.

The Solaris sed behaves like the GNU sed in this case.

Illumos uses the BSD sed since the sed from Solaris, AIX, HP-UX is common closed source and it links only against the closed source i18n code from Solaris. As Illumos uses i18n in libc from FreeBSD, the closed source Solaris sed cannot be used anymore.

Jörg
RE: sed '\n\nnd'
The GNU version is more correct, in my opinion, in that the use of n as a delimiter should take precedence over its use as control character alias with the wording as is. The other versions appear to consider the BRE as <newline> so does not match 'n'.

On Wednesday, March 25, 2020 Oğuz wrote:

> echo n | sed '\n\nnd'
>
> Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. The standard says
>
> - In a context address, the construction "\cBREc", where c is any character other than <backslash> or <newline>, shall be identical to "/BRE/". If the character designated by c appears following a <backslash>, then it shall be considered to be that literal character, which shall not terminate the BRE. For example, in the context address "\xabc\xdefx", the second x stands for itself, so that the BRE is "abcxdef".
> - The escape sequence '\n' shall match a <newline> embedded in the pattern space. A literal <newline> shall not be used in the BRE of a context address or in the substitute function.
>
> but this is not clear at all. Which is the correct behavior here?
>
> --
> Oğuz
sed '\n\nnd'
echo n | sed '\n\nnd'

Above command returns 'n' with GNU sed, and nothing with BSD seds and OmniOS sed. The standard says

- In a context address, the construction "\cBREc", where *c* is any character other than <backslash> or <newline>, shall be identical to "/BRE/". If the character designated by *c* appears following a <backslash>, then it shall be considered to be that literal character, which shall not terminate the BRE. For example, in the context address "\xabc\xdefx", the second *x* stands for itself, so that the BRE is "abcxdef".
- The escape sequence '\n' shall match a <newline> embedded in the pattern space. A literal <newline> shall not be used in the BRE of a context address or in the substitute function.

but this is not clear at all. Which is the correct behavior here?

--
Oğuz
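To see which of the two behaviours a given sed has, the command can be wrapped in a small probe. This is a sketch under the assumption of a POSIX shell; `probe_context_address` and the labels it prints are hypothetical names chosen for illustration, not terms from the standard.

```shell
#!/bin/sh
# Hypothetical probe: report how the local sed parses the context
# address \n\nn when 'n' is the delimiter.
probe_context_address() {
    if ! out=$(echo n | sed '\n\nnd'); then
        # sed rejected the script outright.
        echo "sed-error"
        return 1
    fi
    if [ -z "$out" ]; then
        # The line was deleted: the BRE was the literal character 'n'.
        echo "escaped-delimiter"
    else
        # The line survived: the BRE was a <newline>, which cannot match
        # a single input line.
        echo "newline-escape"
    fi
}

probe_context_address
```

According to the report above, GNU sed would print the second label and the BSD and OmniOS seds the first.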