On 06/15/11 08:58, Alexander Hall wrote:
> On 06/15/11 08:35, Otto Moerbeek wrote:
>> On Wed, Jun 15, 2011 at 07:44:20AM +0200, Otto Moerbeek wrote:
>>
>>> On Tue, Jun 14, 2011 at 11:56:27PM +0200, sven falempin wrote:
>>>
>>>> Hello,
>>>>
>>>> Indeed there is a small problem:
>>>>
>>>> # echo 'abbbbbbbbbbbbbfffff' | sed -E 's/[a$]/x/g'
>>>> xbbbbbbbbbbbbbfffff
>>>
>>> That is expected. $ is only special when it occurs as the last char of
>>> a re.
>>>
>>>> # echo 'abbbbbbbbbbbbbfffff' | sed -E 's/a|$/x/g'
>>>> x
>>>
>>> This is likely to be a real bug.
>>>
>>>>
>>>> String modification is done inside the 'case 0:'
>>>> substitute(struct s_command *cp) in src/usr.bin/process.c
>>>>
>>>> But the problem may come from regexec_e.
>>>>
>>>> Maybe openbsd devs should test another regexp code version ?
>>>
>>> Why? If we should change libs on every bug encountered, nothing will
>>> be left.
>>>
>>> Anyway, thanks for the report.
>>>
>>> -Otto
>>>
>>>>
>>>> Hope it helps,
>>>> Who still use sed anyway :)
>>>>
>>>> Regards.
>>>>
>>>> 2011/6/12 Ingo Schwarze <[email protected]>
>>>>
>>>>> Hi Nils,
>>>>>
>>>>> Nils Anspach wrote on Sun, Jun 12, 2011 at 12:49:42PM +0200:
>>>>>
>>>>>> I have an issue with sed. Why does
>>>>>>
>>>>>> echo 'ab' | sed -E 's/a|$/x/g'
>>>>>>
>>>>>> give 'x' whereas
>>>>>
>>>>> I sense a bug here.
>>>>> Tracing a bit around process(),
>>>>> it looks like the first application of the s command
>>>>> yields dst = "x" continue_to_process = "b\n",
>>>>> and then the second application
>>>>> appends "\n" to dst (should rather append "b", i think).
>>>>> Maybe something is wrong here with character/pointer counting,
>>>>> but i'm somewhat out of time now for tracing.
>>>>>
>>>>> This is worth more investigation.
>>>>>
>>>>> Yours,
>>>>> Ingo
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> ---------------------------------------------------------------------------------------------------------------------
>>>> () ascii ribbon campaign - against html e-mail
>>>> /\
>>
>> This diff fixes your problem here. Big question is of course: does it
>> break other cases?
>>
>> -Otto
>>
>> Index: process.c
>> ===================================================================
>> RCS file: /cvs/src/usr.bin/sed/process.c,v
>> retrieving revision 1.15
>> diff -u -p -r1.15 process.c
>> --- process.c	27 Oct 2009 23:59:43 -0000	1.15
>> +++ process.c	15 Jun 2011 06:31:08 -0000
>> @@ -336,7 +336,9 @@ substitute(struct s_command *cp)
>>  	switch (n) {
>>  	case 0:					/* Global */
>>  		do {
>> -			if (lastempty || match[0].rm_so != match[0].rm_eo) {
>> +			if (lastempty || match[0].rm_so != match[0].rm_eo ||
>> +			    (match[0].rm_so == match[0].rm_eo &&
>> +			    match[0].rm_so > 0)) {
>>  				/* Locate start of replaced string. */
>>  				re_off = match[0].rm_so;
>>  				/* Copy leading retained string. */
>>
>
> Looks ok to me (I believe the problem was that prior to the introduction
> of -E, any matching '$' would always be in the first match).
>
> The diff doesn't break any of the regression tests (not that there are
> a lot of them).

While at it, here's another one! :-)

Hmmm, looking closer I guess this should rather be a part of the sedtest target...
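
For what it's worth, a regress-style check for the cases in this thread could look roughly like the sketch below. check_sed is a made-up helper, not something in the tree, and the expected strings for the 's/a|$/x/g' cases are my assumption about the fixed behaviour (one replacement for the 'a' and one for the empty match at end of line); they are not taken from the existing regression tests.

```shell
#!/bin/sh
# Hypothetical helper, not part of the OpenBSD regress suite.
# check_sed SCRIPT INPUT EXPECTED: run sed -E on INPUT and compare the output.
check_sed() {
	got=$(printf '%s\n' "$2" | sed -E "$1")
	if [ "$got" = "$3" ]; then
		echo "ok: $1"
	else
		echo "FAIL: $1: got '$got', expected '$3'"
		return 1
	fi
}

fail=0

# From the thread: inside a bracket expression, $ is an ordinary character.
check_sed 's/[a$]/x/g' 'abbbbbbbbbbbbbfffff' 'xbbbbbbbbbbbbbfffff' || fail=1

# Assumed results once the empty match at end of line is handled.
check_sed 's/a|$/x/g' 'ab' 'xbx' || fail=1
check_sed 's/a|$/x/g' 'abbbbbbbbbbbbbfffff' 'xbbbbbbbbbbbbbfffffx' || fail=1

echo "failures: $fail"
```

With an unpatched sed I would expect the two 's/a|$/x/g' lines to report FAIL with got 'x', matching the reports above.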

