OT: Re: Strange sed substitution removes text
howdee,

i am not quoting any text, because this note is OffTopic-ish...

i was looking at the comments from kshe regarding a full rewrite of the sed-utility... in particular, that there were obscure corner cases of tests that seemed to fail due to NULL or EOL or whatnot...

apparently, sed is a Turing-complete language - and hence, given enough time/space/memory, will surely not give a single TRUE/FALSE answer to some questions...

to be honest, i do not understand all of the details or theory, that are involved in the statements i _just_ made - but... since sed is a STREAM editor, and since sed-scripts are usually finite-length, then maybe there should be some way to enforce a limit (like was done for string-buffers) on the inputs...

again, this is all just my pie-in-the-sky OT-commentary...

sincerely,
harold.
Re: Strange sed substitution removes text
On Mon, 25 Sep 2017 18:16:15 +, Martijn van Duren wrote:
> Let's just wait until Ingo has time to look into it. He's still on
> holiday in Paris, so it might be a few days.

Hi,

I already reported this issue three months ago, along with other related and unrelated bugs; see my second message in this thread:

https://marc.info/?t=14969951951

As one can infer from reading any of these messages, this is far from being the only unresolved problem with OpenBSD's sed, so I doubt it is worth asking anyone to look particularly into this one more deeply; in fact, the whole substitute() function is flawed in multiple other aspects and, together with the rest of the program, which is likewise either plain wrong or embarrassingly suboptimal in almost every possible way, deserves no more than to be thrown away and rewritten.

Because of my personal need for a correct and elegant implementation, this is exactly what I did in my local tree. Nevertheless, I have not been too impatient to share my code here as no one seemed to care when I mentioned it to tech@, but perhaps this freshly posted report could make someone interested after all. (If so, however, I would still need to take the time to write a fully fledged supporting justification before submitting it, because, as I reckon, one does not simply reimplement sed without explaining in depth why and how this had to be done.)

Regards,

kshe
Re: Strange sed substitution removes text
On 09/25/17 09:15, Andreas Kusalananda Kähäri wrote:
> Yes, this seems to fix this particular issue for me nicely,
> but the "int i = 0;" is probably not needed.

You're right, that part was from some debugging printfs.
I also wasn't asking for OKs, but merely pointing out the root of the problem.
Let's just wait until Ingo has time to look into it. He's still on
holiday in Paris, so it might be a few days.

> On Sun, Sep 24, 2017 at 11:59:49PM +0200, Martijn van Duren wrote:
>> and now with 100% more patch...
>>
>> Index: process.c
>> ===
>> RCS file: /cvs/src/usr.bin/sed/process.c,v
>> retrieving revision 1.32
>> diff -u -p -r1.32 process.c
>> --- process.c	22 Feb 2017 14:09:09 -	1.32
>> +++ process.c	24 Sep 2017 21:58:14 -
>> @@ -336,6 +336,7 @@ substitute(struct s_command *cp)
>>  	int n, lastempty;
>>  	size_t le = 0;
>>  	char *s;
>> +	int i = 0;
>>
>>  	s = ps;
>>  	re = cp->u.s->re;
>> @@ -386,7 +387,7 @@ substitute(struct s_command *cp)
>>  		 * and at the end of the line, terminate.
>>  		 */
>>  		if (match[0].rm_so == match[0].rm_eo) {
>> -			if (*s == '\0' || *s == '\n')
>> +			if (*s == '\0')
>>  				slen = -1;
>>  			else
>>  				slen--;
>>
>>
>> On 09/24/17 23:57, Martijn van Duren wrote:
>>> This fixes the issue for me, but I'm not sure about the motivation
>>> behind the check.
>>> Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
>>> the particular line.
>>>
>>> martijn@
>>>
>>> On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
>>>> Hi,
>>>>
>>>> Given the input file of three lines:
>>>>
>>>> line 1
>>>> line 2
>>>> line 3
>>>>
>>>> and the sed script
>>>>
>>>> s/\</\
>>>> /g
>>>> s/^/hello/
>>>>
>>>> which inserts a newline in front of every word and then prepends the
>>>> word "hello" to the beginning of the pattern space.
>>>>
>>>> The following happens:
>>>>
>>>> $ sed -f script.sed input.txt
>>>> hello
>>>>
>>>> hello
>>>>
>>>> hello
>>>>
>>>>
>>>> I was expecting to get
>>>>
>>>> hello
>>>> line
>>>> 1
>>>> hello
>>>> line
>>>> 2
>>>> hello
>>>> line
>>>> 3
>>>>
>>>> This is a bit surprising since running only the first sed expression
>>>> gives (as expected)
>>>>
>>>>
>>>> line
>>>> 1
>>>>
>>>> line
>>>> 2
>>>>
>>>> line
>>>> 3
>>>>
>>>>
>>>> The question is, why does the "line N" data disappear when inserting a
>>>> word at the start of the pattern space here?
>>>>
>>>> I'm also noticing that this does not happen if a space (for instance)
>>>> precedes the escaped newline in the first expression:
>>>>
>>>> s/\</ \
>>>> /g
>>>> s/^/hello/
>>>>
>>>>
>>>> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
>>>>
>>>> Cheers,
>>>>
Re: Strange sed substitution removes text
Yes, this seems to fix this particular issue for me nicely,
but the "int i = 0;" is probably not needed.


On Sun, Sep 24, 2017 at 11:59:49PM +0200, Martijn van Duren wrote:
> and now with 100% more patch...
>
> Index: process.c
> ===
> RCS file: /cvs/src/usr.bin/sed/process.c,v
> retrieving revision 1.32
> diff -u -p -r1.32 process.c
> --- process.c	22 Feb 2017 14:09:09 -	1.32
> +++ process.c	24 Sep 2017 21:58:14 -
> @@ -336,6 +336,7 @@ substitute(struct s_command *cp)
>  	int n, lastempty;
>  	size_t le = 0;
>  	char *s;
> +	int i = 0;
>
>  	s = ps;
>  	re = cp->u.s->re;
> @@ -386,7 +387,7 @@ substitute(struct s_command *cp)
>  		 * and at the end of the line, terminate.
>  		 */
>  		if (match[0].rm_so == match[0].rm_eo) {
> -			if (*s == '\0' || *s == '\n')
> +			if (*s == '\0')
>  				slen = -1;
>  			else
>  				slen--;
>
>
> On 09/24/17 23:57, Martijn van Duren wrote:
> > This fixes the issue for me, but I'm not sure about the motivation
> > behind the check.
> > Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
> > the particular line.
> >
> > martijn@
> >
> > On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
> >> Hi,
> >>
> >> Given the input file of three lines:
> >>
> >> line 1
> >> line 2
> >> line 3
> >>
> >> and the sed script
> >>
> >> s/\</\
> >> /g
> >> s/^/hello/
> >>
> >> which inserts a newline in front of every word and then prepends the
> >> word "hello" to the beginning of the pattern space.
> >>
> >> The following happens:
> >>
> >> $ sed -f script.sed input.txt
> >> hello
> >>
> >> hello
> >>
> >> hello
> >>
> >>
> >> I was expecting to get
> >>
> >> hello
> >> line
> >> 1
> >> hello
> >> line
> >> 2
> >> hello
> >> line
> >> 3
> >>
> >> This is a bit surprising since running only the first sed expression
> >> gives (as expected)
> >>
> >>
> >> line
> >> 1
> >>
> >> line
> >> 2
> >>
> >> line
> >> 3
> >>
> >>
> >> The question is, why does the "line N" data disappear when inserting a
> >> word at the start of the pattern space here?
> >>
> >> I'm also noticing that this does not happen if a space (for instance)
> >> precedes the escaped newline in the first expression:
> >>
> >> s/\</ \
> >> /g
> >> s/^/hello/
> >>
> >>
> >> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
> >>
> >> Cheers,
> >>
> >

-- 
Andreas Kusalananda Kähäri,
National Bioinformatics Infrastructure Sweden (NBIS),
Uppsala University, Sweden.
Re: Strange sed substitution removes text
and now with 100% more patch...

Index: process.c
===
RCS file: /cvs/src/usr.bin/sed/process.c,v
retrieving revision 1.32
diff -u -p -r1.32 process.c
--- process.c	22 Feb 2017 14:09:09 -	1.32
+++ process.c	24 Sep 2017 21:58:14 -
@@ -336,6 +336,7 @@ substitute(struct s_command *cp)
 	int n, lastempty;
 	size_t le = 0;
 	char *s;
+	int i = 0;

 	s = ps;
 	re = cp->u.s->re;
@@ -386,7 +387,7 @@ substitute(struct s_command *cp)
 		 * and at the end of the line, terminate.
 		 */
 		if (match[0].rm_so == match[0].rm_eo) {
-			if (*s == '\0' || *s == '\n')
+			if (*s == '\0')
 				slen = -1;
 			else
 				slen--;


On 09/24/17 23:57, Martijn van Duren wrote:
> This fixes the issue for me, but I'm not sure about the motivation
> behind the check.
> Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
> the particular line.
>
> martijn@
>
> On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
>> Hi,
>>
>> Given the input file of three lines:
>>
>> line 1
>> line 2
>> line 3
>>
>> and the sed script
>>
>> s/\</\
>> /g
>> s/^/hello/
>>
>> which inserts a newline in front of every word and then prepends the
>> word "hello" to the beginning of the pattern space.
>>
>> The following happens:
>>
>> $ sed -f script.sed input.txt
>> hello
>>
>> hello
>>
>> hello
>>
>>
>> I was expecting to get
>>
>> hello
>> line
>> 1
>> hello
>> line
>> 2
>> hello
>> line
>> 3
>>
>> This is a bit surprising since running only the first sed expression
>> gives (as expected)
>>
>>
>> line
>> 1
>>
>> line
>> 2
>>
>> line
>> 3
>>
>>
>> The question is, why does the "line N" data disappear when inserting a
>> word at the start of the pattern space here?
>>
>> I'm also noticing that this does not happen if a space (for instance)
>> precedes the escaped newline in the first expression:
>>
>> s/\</ \
>> /g
>> s/^/hello/
>>
>>
>> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
>>
>> Cheers,
>>
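[Editor's illustration] To see why the one-line change in the patch matters for s/^/hello/, here is a minimal sketch in C. It is a simplified model, not the code from process.c: the function name model_prepend and its stop_at_newline parameter are hypothetical, and it assumes that the parts of substitute() not shown in the diff copy the byte that was skipped after an empty match and append the rest of the pattern space only while slen is still positive.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical, simplified model of a single non-global s/^/REPL/ applied
 * to the pattern space ps: the regex matches the empty string at offset 0.
 *
 * stop_at_newline != 0 models the old check  (*s == '\0' || *s == '\n').
 * stop_at_newline == 0 models the patched check (*s == '\0').
 *
 * Assumption: the skipped byte is always copied, and the trailing text is
 * appended only when slen stays positive.
 */
void
model_prepend(const char *ps, const char *repl, int stop_at_newline,
    char *out, size_t outsz)
{
	const char *s = ps;
	long slen = (long)strlen(ps);
	size_t n;

	/* The caller must supply room for repl + ps + terminating NUL. */
	assert(outsz > strlen(ps) + strlen(repl));
	out[0] = '\0';

	/* Append the replacement for the empty match at offset 0. */
	strcat(out, repl);

	/* After a zero-length match: advance one byte, or terminate. */
	if (*s == '\0' || (stop_at_newline && *s == '\n'))
		slen = -1;
	else
		slen--;

	/* Copy the byte that was stepped over, if there is one. */
	if (*s != '\0') {
		n = strlen(out);
		out[n] = *s++;
		out[n + 1] = '\0';
	}

	/* The trailing retained string is copied only if slen > 0. */
	if (slen > 0)
		strncat(out, s, (size_t)slen);
}
```

Under these assumptions, calling model_prepend("\nline \n1", "hello", 1, buf, sizeof(buf)) leaves only "hello" plus a newline in buf, which matches the reported output, while passing 0 for stop_at_newline keeps the rest of the pattern space. A pattern space that starts with a space instead of a newline is unaffected by either check, which would be consistent with the observation that putting a space before the escaped newline hides the problem.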
Re: Strange sed substitution removes text
This fixes the issue for me, but I'm not sure about the motivation
behind the check.
Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
the particular line.

martijn@

On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
> Hi,
>
> Given the input file of three lines:
>
> line 1
> line 2
> line 3
>
> and the sed script
>
> s/\</\
> /g
> s/^/hello/
>
> which inserts a newline in front of every word and then prepends the
> word "hello" to the beginning of the pattern space.
>
> The following happens:
>
> $ sed -f script.sed input.txt
> hello
>
> hello
>
> hello
>
>
> I was expecting to get
>
> hello
> line
> 1
> hello
> line
> 2
> hello
> line
> 3
>
> This is a bit surprising since running only the first sed expression
> gives (as expected)
>
>
> line
> 1
>
> line
> 2
>
> line
> 3
>
>
> The question is, why does the "line N" data disappear when inserting a
> word at the start of the pattern space here?
>
> I'm also noticing that this does not happen if a space (for instance)
> precedes the escaped newline in the first expression:
>
> s/\</ \
> /g
> s/^/hello/
>
>
> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
>
> Cheers,
>
Strange sed substitution removes text
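Hi,

Given the input file of three lines:

line 1
line 2
line 3

and the sed script

s/\</\
/g
s/^/hello/

which inserts a newline in front of every word and then prepends the
word "hello" to the beginning of the pattern space.

The following happens:

$ sed -f script.sed input.txt
hello

hello

hello


I was expecting to get

hello
line
1
hello
line
2
hello
line
3

This is a bit surprising since running only the first sed expression
gives (as expected)


line
1

line
2

line
3


The question is, why does the "line N" data disappear when inserting a
word at the start of the pattern space here?

I'm also noticing that this does not happen if a space (for instance)
precedes the escaped newline in the first expression:

s/\</ \
/g
s/^/hello/


This is using sed in the base system on OpenBSD 6.1-stable (amd64).

Cheers,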