On Sun, Aug 15, 2021 at 11:34:22PM +0200, Martijn van Duren wrote: > Andreas Kähäri gave a nice example on misc@ on how our sed addressing > implemenation differs from gsed[0][1]. While writing my reply I noticed > that POSIX doesn't state how "next cycle" should be interpreted when it > comes to address ranges. So I can't state that our implementation is > wrong per se. However, I do think that gsed's interpretation is more > intuitive, since a numeric address is not dependent on the context of > the pattern space and thus should register as "in range".
But note that this comes out of a discussion on how to do '0,/re/' addressing with OpenBSD sed. Your changes appears to remove one way of actually handling a match of '/re/' on the first line without giving us another. It would be better to have a clean way of doing the equivalent of '0,/re/' than to remove a way to do this. Interestingly (?), the sed in plan9port works the same as our native sed. Andreas > > Diff below changes program parsing to more closely match gsed in this > regard: > $ printf 'test1\nbla1\ntest2\nbla2\n' | sed -e '1 { /^test/d; }' -e > '1,/^test/d' > bla1 > test2 > bla2 > $ printf 'test1\nbla1\ntest2\nbla2\n' | ./obj/sed -e '1 { /^test/d; }' -e > '1,/^test/d' > bla2 > $ printf 'bla0\ntest1\nbla1\ntest2\nbla2\n' | ./obj/sed -e '1 { /^test/d; }' > -e '1,/^test/d' > bla1 > test2 > bla2 > $ printf 'test1\nbla1\ntest2\nbla2\n' | gsed -e '1 { /^test/d; }' -e > '1,/^test/d' > bla2 > $ printf 'bla0\ntest1\nbla1\ntest2\nbla2\n' | gsed -e '1 { /^test/d; }' -e > '1,/^test/d' > bla1 > test2 > bla2 > > The diff passes regress, but hasn't had a lot of scrutiny. Just checking > for general interest in changing this functionality. As soon as I > know that it's something we might want I'll spend more braincycles on > it. > > martijn@ > > [0] https://marc.info/?l=openbsd-misc&m=162896537001890&w=2 > [1] https://marc.info/?l=openbsd-misc&m=162905748428954&w=2 > > Index: process.c > =================================================================== > RCS file: /cvs/src/usr.bin/sed/process.c,v > retrieving revision 1.34 > diff -u -p -r1.34 process.c > --- process.c 14 Nov 2018 10:59:33 -0000 1.34 > +++ process.c 15 Aug 2021 21:30:22 -0000 > @@ -89,14 +89,16 @@ process(void) > SPACE tspace; > size_t len, oldpsl; > char *p; > + int nextcycle; > > for (linenum = 0; mf_fgets(&PS, REPLACE);) { > pd = 0; > + nextcycle = 0; > top: > cp = prog; > redirect: > while (cp != NULL) { > - if (!applies(cp)) { > + if (!applies(cp) || nextcycle) { > cp = cp->next; > continue; > } > @@ -127,14 +129,16 @@ redirect: > break; > case 'd': > pd = 1; > - goto new; > + nextcycle = 1; > + break; > case 'D': > if (pd) > goto new; > if (psl == 0 || > (p = memchr(ps, '\n', psl)) == NULL) { > pd = 1; > - goto new; > + nextcycle = 1; > + break; > } else { > psl -= (p + 1) - ps; > memmove(ps, p + 1, psl); > @@ -267,8 +271,9 @@ new: if (!nflag && !pd) > * (lastline, linenumber, ps). > */ > #define MATCH(a) \ > - (a)->type == AT_RE ? regexec_e((a)->u.r, ps, 0, 1, 0, psl) : \ > - (a)->type == AT_LINE ? linenum == (a)->u.l : lastline() > + (a)->type == AT_LINE ? linenum == (a)->u.l : \ > + (a)->type == AT_LAST ? lastline() : \ > + pd ? 0 : regexec_e((a)->u.r, ps, 0, 1, 0, psl) > > /* > * Return TRUE if the command applies to the current line. Sets the inrange > -- Andreas (Kusalananda) Kähäri SciLifeLab, NBIS, ICM Uppsala University, Sweden .