On 06/15/11 08:58, Alexander Hall wrote:
> On 06/15/11 08:35, Otto Moerbeek wrote:
>> On Wed, Jun 15, 2011 at 07:44:20AM +0200, Otto Moerbeek wrote:
>>
>>> On Tue, Jun 14, 2011 at 11:56:27PM +0200, sven falempin wrote:
>>>
>>>> Hello,
>>>>
>>>> Indeed there is a small problem:
>>>>
>>>> # echo 'abbbbbbbbbbbbbfffff' | sed -E 's/[a$]/x/g'
>>>> xbbbbbbbbbbbbbfffff
>>>
>>> That is expected. $ is only special when it occurs as the last char of
>>> a RE.
>>>
>>>> # echo 'abbbbbbbbbbbbbfffff' | sed -E 's/a|$/x/g'
>>>> x
>>>
>>> This is likely to be a real bug.
>>>
>>>>
>>>> String modification is done inside the 'case 0:' of
>>>> substitute(struct s_command *cp) in src/usr.bin/sed/process.c
>>>>
>>>> But the problem may come from regexec_e.
>>>>
>>>> Maybe the OpenBSD devs should test another version of the regexp code?
>>>
>>> Why? If we changed libs on every bug encountered, nothing would
>>> be left.
>>>
>>> Anyway, thanks for the report.
>>>
>>>     -Otto
>>>
>>>>
>>>> Hope it helps,
>>>> Who still uses sed anyway? :)
>>>>
>>>> Regards.
>>>>
>>>> 2011/6/12 Ingo Schwarze <[email protected]>
>>>>
>>>>> Hi Nils,
>>>>>
>>>>> Nils Anspach wrote on Sun, Jun 12, 2011 at 12:49:42PM +0200:
>>>>>
>>>>>> I have an issue with sed. Why does
>>>>>>
>>>>>>       echo 'ab' | sed -E 's/a|$/x/g'
>>>>>>
>>>>>> give 'x' whereas
>>>>>
>>>>> I sense a bug here.
>>>>> Tracing a bit around process(),
>>>>> it looks like the first application of the s command
>>>>> yields dst = "x" and continue_to_process = "b\n",
>>>>> and then the second application
>>>>> appends "\n" to dst (it should rather append "b", I think).
>>>>> Maybe something is wrong here with character/pointer counting,
>>>>> but I'm somewhat out of time now for tracing.
>>>>>
>>>>> This is worth more investigation.
>>>>>
>>>>> Yours,
>>>>>   Ingo
>>>>>
>>>>>
>>>>
>>>>
>>
>> This diff fixes your problem here. The big question is of course: does it
>> break other cases?
>>
>>      -Otto
>>
>> Index: process.c
>> ===================================================================
>> RCS file: /cvs/src/usr.bin/sed/process.c,v
>> retrieving revision 1.15
>> diff -u -p -r1.15 process.c
>> --- process.c        27 Oct 2009 23:59:43 -0000      1.15
>> +++ process.c        15 Jun 2011 06:31:08 -0000
>> @@ -336,7 +336,9 @@ substitute(struct s_command *cp)
>>      switch (n) {
>>      case 0:                                 /* Global */
>>              do {
>> -                    if (lastempty || match[0].rm_so != match[0].rm_eo) {
>> +                    if (lastempty || match[0].rm_so != match[0].rm_eo ||
>> +                        (match[0].rm_so == match[0].rm_eo &&
>> +                        match[0].rm_so > 0)) {
>>                              /* Locate start of replaced string. */
>>                              re_off = match[0].rm_so;
>>                              /* Copy leading retained string. */
>>
> 
> Looks ok to me (I believe the problem was that prior to the introduction
> of -E, any matching '$' would always be in the first match).
> 
> The diff doesn't break any of the regression tests (not that there are
> a lot of them). While at it, here's another one! :-)

Hmmm, looking closer I guess this should rather be a part of the sedtest
target...
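
For anyone following along, here is a rough sketch in C of what a global
substitution has to do about empty matches such as the '$' in 'a|$'. It is
not the code in process.c and not a proposed patch: global_sub() and
append() are hypothetical names, the replacement is taken literally (no
'&' or backreferences), and it only uses POSIX regcomp()/regexec(). The
one rule it assumes is that an empty match is replaced unless it sits
directly after the previous match.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append n bytes of src to out, keeping it NUL-terminated. */
static int
append(char *out, size_t outsz, size_t *used, const char *src, size_t n)
{
	if (*used + n + 1 > outsz)
		return -1;		/* would overflow the output buffer */
	memcpy(out + *used, src, n);
	*used += n;
	out[*used] = '\0';
	return 0;
}

/*
 * Hypothetical sketch, not the code in process.c: replace every match of
 * the extended RE `pattern` in `input` with the literal string `repl`.
 * Returns 0 on success, -1 on a bad pattern or a too-small buffer.
 */
int
global_sub(const char *pattern, const char *input, const char *repl,
    char *out, size_t outsz)
{
	regex_t re;
	regmatch_t m[1];
	size_t used = 0;
	long len = (long)strlen(input);
	long pos = 0, prev_end = -1, so, eo;
	int rv = -1;

	if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
		return -1;
	out[0] = '\0';

	for (;;) {
		/* After the first search we are no longer at beginning of line. */
		if (regexec(&re, input + pos, 1, m, pos > 0 ? REG_NOTBOL : 0) != 0) {
			/* No more matches: keep the rest of the line. */
			if (append(out, outsz, &used, input + pos,
			    (size_t)(len - pos)) == 0)
				rv = 0;
			break;
		}
		so = pos + (long)m[0].rm_so;	/* offsets relative to input */
		eo = pos + (long)m[0].rm_eo;

		/* Copy the retained text in front of the match. */
		if (append(out, outsz, &used, input + pos, (size_t)(so - pos)) != 0)
			break;

		/* Replace, unless this is an empty match right after the last match. */
		if ((so != eo || so != prev_end) &&
		    append(out, outsz, &used, repl, strlen(repl)) != 0)
			break;
		prev_end = eo;

		if (so == eo) {
			/* Empty match: stop at end of line, else step over one char. */
			if (so >= len) {
				rv = 0;
				break;
			}
			if (append(out, outsz, &used, input + so, 1) != 0)
				break;
			pos = so + 1;
		} else
			pos = eo;
	}
	regfree(&re);
	return rv;
}
```

With a conforming regexec(), global_sub("a|$", "ab", "x", buf, sizeof(buf))
should leave "xbx" in buf: the 'b' is kept and the empty match at the end
of the line is still replaced.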
