On 06/15/11 08:35, Otto Moerbeek wrote:
> On Wed, Jun 15, 2011 at 07:44:20AM +0200, Otto Moerbeek wrote:
>
>> On Tue, Jun 14, 2011 at 11:56:27PM +0200, sven falempin wrote:
>>
>>> Hello,
>>>
>>> Indeed there is a small problem:
>>>
>>> # echo 'abbbbbbbbbbbbbfffff' | sed -E 's/[a$]/x/g'
>>> xbbbbbbbbbbbbbfffff
>>
>> That is expected. $ is only special when it occurs as the last char of
>> a re.
>>
>>> # echo 'abbbbbbbbbbbbbfffff' | sed -E 's/a|$/x/g'
>>> x
>>
>> This is likely to be a real bug.
>>
>>>
>>> String modification is done inside the 'case 0:'
>>> substitute(struct s_command *cp) in src/usr.bin/sed/process.c
>>>
>>> But the problem may come from regexec_e.
>>>
>>> Maybe the OpenBSD devs should test another version of the regexp code?
>>
>> Why? If we should change libs on every bug encountered, nothing will
>> be left.
>>
>> Anyway, thanks for the report.
>>
>> -Otto
>>
>>>
>>> Hope it helps,
>>> Who still uses sed anyway :)
>>>
>>> Regards.
>>>
>>> 2011/6/12 Ingo Schwarze <[email protected]>
>>>
>>>> Hi Nils,
>>>>
>>>> Nils Anspach wrote on Sun, Jun 12, 2011 at 12:49:42PM +0200:
>>>>
>>>>> I have an issue with sed. Why does
>>>>>
>>>>> echo 'ab' | sed -E 's/a|$/x/g'
>>>>>
>>>>> give 'x' whereas
>>>>
>>>> I sense a bug here.
>>>> Tracing a bit around process(),
>>>> it looks like the first application of the s command
>>>> yields dst = "x" continue_to_process = "b\n",
>>>> and then the second application
>>>> appends "\n" to dst (should rather append "b", i think).
>>>> Maybe something is wrong here with character/pointer counting,
>>>> but i'm somewhat out of time now for tracing.
>>>>
>>>> This is worth more investigation.
>>>>
>>>> Yours,
>>>> Ingo
>>>>
>>>>
>>>
>>>
>>> --
>>> ---------------------------------------------------------------------------------------------------------------------
>>> () ascii ribbon campaign - against html e-mail
>>> /\
>
> This diff fixes your problem here. Big question is of course: does it
> break other cases?
>
> -Otto
>
> Index: process.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/sed/process.c,v
> retrieving revision 1.15
> diff -u -p -r1.15 process.c
> --- process.c 27 Oct 2009 23:59:43 -0000 1.15
> +++ process.c 15 Jun 2011 06:31:08 -0000
> @@ -336,7 +336,9 @@ substitute(struct s_command *cp)
> switch (n) {
> case 0: /* Global */
> do {
> - if (lastempty || match[0].rm_so != match[0].rm_eo) {
> + if (lastempty || match[0].rm_so != match[0].rm_eo ||
> + (match[0].rm_so == match[0].rm_eo &&
> + match[0].rm_so > 0)) {
> /* Locate start of replaced string. */
> re_off = match[0].rm_so;
> /* Copy leading retained string. */
>
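To make the change easier to review, the old and new conditions from the diff can be written as plain predicates. This is only a sketch: the function names replace_old and replace_new are made up here, and the sample values assume that lastempty is 0 after a preceding non-empty match and that the next match is an empty one at a nonzero offset of the remaining text.

```c
#include <regex.h>

/* Condition in revision 1.15, before the patch. */
int
replace_old(int lastempty, regoff_t so, regoff_t eo)
{
	return lastempty || so != eo;
}

/* Condition after the patch: also accept an empty match at so > 0. */
int
replace_new(int lastempty, regoff_t so, regoff_t eo)
{
	return lastempty || so != eo || (so == eo && so > 0);
}
```

With lastempty == 0 and so == eo == 1, replace_old() is false and replace_new() is true. An empty match at offset 0 with lastempty == 0 is still rejected by both.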
Looks ok to me (I believe the problem was that prior to the introduction
of -E, any matching '$' would always be in the first match).
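To illustrate that point, here is a minimal sketch that lists the matches of an extended RE by calling regexec(3) repeatedly on the rest of the string. It is not taken from sed; collect_matches is a hypothetical helper, and the way it steps past an empty match is just one simple choice.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper, not part of sed: store up to max matches of the
 * extended RE pat in s as absolute offsets.  Returns the number of
 * matches found, or -1 if the RE does not compile.
 */
int
collect_matches(const char *pat, const char *s, regmatch_t *out, int max)
{
	regex_t re;
	regmatch_t m;
	size_t off = 0, len = strlen(s);
	int n = 0;

	if (regcomp(&re, pat, REG_EXTENDED) != 0)
		return -1;
	/* After the first call we are no longer at the beginning of the line. */
	while (n < max && off <= len &&
	    regexec(&re, s + off, 1, &m, off ? REG_NOTBOL : 0) == 0) {
		out[n].rm_so = (regoff_t)off + m.rm_so;
		out[n].rm_eo = (regoff_t)off + m.rm_eo;
		n++;
		/* Move past the match; after an empty match skip one char. */
		if (m.rm_eo > m.rm_so)
			off += (size_t)m.rm_eo;
		else
			off += (size_t)m.rm_eo + 1;
	}
	regfree(&re);
	return n;
}
```

For the pattern a|$ and the string "ab" this reports two matches: 'a' at offsets 0 to 1 from the first call, and the empty '$' match at offset 2 from the second call. So with alternation the '$' match no longer has to come from the first regexec call.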
The diff doesn't break any of the regression tests (not that there are
a lot of them). While at it, here's another one! :-)
/Alexander
Index: Makefile
===================================================================
RCS file: /cvs/src/regress/usr.bin/sed/Makefile,v
retrieving revision 1.2
diff -u -p -r1.2 Makefile
--- Makefile 13 Oct 2008 13:27:33 -0000 1.2
+++ Makefile 15 Jun 2011 06:55:26 -0000
@@ -3,7 +3,7 @@
SED= /usr/bin/sed
-REGRESS_TARGETS= sedtest hanoi math sierpinski
+REGRESS_TARGETS= sedtest hanoi math sierpinski eol
sedtest:
sh ${.CURDIR}/[email protected] ${SED} [email protected]
@@ -19,6 +19,10 @@ math:
sierpinski:
${SED} -nf ${.CURDIR}/$@.sed ${.CURDIR}/$@.in > $@.out
+ diff ${.CURDIR}/$@.expected $@.out
+
+eol:
+ ${SED} -Ef ${.CURDIR}/$@.sed ${.CURDIR}/$@.in > $@.out
diff ${.CURDIR}/$@.expected $@.out
CLEANFILES+=*.out lines* script*
Index: eol.expected
===================================================================
RCS file: eol.expected
diff -N eol.expected
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ eol.expected 15 Jun 2011 06:55:26 -0000
@@ -0,0 +1 @@
+xxxxbbbbccccx
Index: eol.in
===================================================================
RCS file: eol.in
diff -N eol.in
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ eol.in 15 Jun 2011 06:55:26 -0000
@@ -0,0 +1 @@
+aaaabbbbcccc
Index: eol.sed
===================================================================
RCS file: eol.sed
diff -N eol.sed
--- /dev/null 1 Jan 1970 00:00:00 -0000
+++ eol.sed 15 Jun 2011 06:55:26 -0000
@@ -0,0 +1 @@
+s/a|$/x/g
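
The expected output can also be cross-checked without sed. The following is a minimal sketch of a global substitution on top of regexec(3). It is not the code from process.c: gsub_sketch and put are hypothetical names, the replacement is treated as a literal string (no & or \1 handling), and an empty match is only skipped when it sits directly at the end of the previous non-empty match.

```c
#include <regex.h>
#include <string.h>

/* Append n bytes of src to out at *o; returns -1 if out is too small. */
static int
put(char *out, size_t outsz, size_t *o, const char *src, size_t n)
{
	if (*o + n >= outsz)
		return -1;
	memcpy(out + *o, src, n);
	*o += n;
	out[*o] = '\0';
	return 0;
}

/* Replace every match of the extended RE pat in in with repl. */
int
gsub_sketch(const char *pat, const char *repl, const char *in,
    char *out, size_t outsz)
{
	regex_t re;
	regmatch_t m;
	size_t len = strlen(in), off = 0, o = 0, so, eo;
	size_t prev_end = (size_t)-1;	/* end of last non-empty match */
	int rc = 0;

	if (outsz == 0 || regcomp(&re, pat, REG_EXTENDED) != 0)
		return -1;
	out[0] = '\0';
	while (rc == 0 && off <= len &&
	    regexec(&re, in + off, 1, &m, off ? REG_NOTBOL : 0) == 0) {
		so = off + (size_t)m.rm_so;
		eo = off + (size_t)m.rm_eo;
		/* Copy the retained text in front of the match. */
		rc = put(out, outsz, &o, in + off, so - off);
		/* Replace, unless an empty match touches the previous match. */
		if (rc == 0 && (so != eo || so != prev_end))
			rc = put(out, outsz, &o, repl, strlen(repl));
		if (so == eo) {
			/* Empty match: pass one input char through, move on. */
			if (rc == 0 && so < len)
				rc = put(out, outsz, &o, in + so, 1);
			off = so + 1;
		} else {
			prev_end = eo;
			off = eo;
		}
	}
	/* Copy whatever follows the last match. */
	if (rc == 0 && off < len)
		rc = put(out, outsz, &o, in + off, len - off);
	regfree(&re);
	return rc;
}
```

Called as gsub_sketch("a|$", "x", "aaaabbbbcccc", buf, sizeof(buf)), this leaves xxxxbbbbccccx in buf, which agrees with eol.expected. For the 'ab' input from the original report it gives xbx.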