OT: Re: Strange sed substitution removes text

2017-09-29 Thread harold felton
howdee,

i am not quoting any text, because this note is OffTopic-ish...

i was looking at the comments from kshe regarding a full rewrite
of the sed-utility...  in particular, that there were obscure corner
cases of tests that seemed to fail due to NULL or EOL or whatnot...

apparently, sed is a Turing-complete language - and hence,
even given enough time/space/memory, some questions about what
a sed-script will do (e.g. whether it ever finishes) have no
general TRUE/FALSE answer...

to be honest, i do not understand all of the details or theory,
that are involved in the statements i _just_ made - but...

since sed is a STREAM editor, and since sed-scripts are usually
finite-length, then maybe there should be some way to enforce
a limit (like was done for string-buffers) on the inputs...

again, this is all just my pie-in-the-sky OT-commentary...

sincerely, harold.


Re: Strange sed substitution removes text

2017-09-28 Thread kshe
On Mon, 25 Sep 2017 18:16:15 +, Martijn van Duren wrote:
> Let's just wait until Ingo has time to look into it. He's still on
> holiday in Paris, so it might be a few days.

Hi,

I already reported this issue three months ago, along with other related
and unrelated bugs; see my second message in this thread:

https://marc.info/?t=14969951951

As one can infer from reading any of these messages, this is far from
being the only unresolved problem with OpenBSD's sed, so I doubt it is
worth asking anyone to look particularly into this one more deeply; in
fact, the whole substitute() function is flawed in multiple other
aspects and, together with the rest of the program, which is likewise
either plain wrong or embarrassingly suboptimal in almost every possible
way, deserves no more than to be thrown away and rewritten.  Because of
my personal need for a correct and elegant implementation, this is
exactly what I did in my local tree.  Nevertheless, I have not been too
impatient to share my code here as no one seemed to care when I
mentioned it to tech@, but perhaps this freshly posted report could make
someone interested after all.  (If so, however, I would still need to
take the time to write a fully fledged supporting justification before
submitting it, because, as I reckon, one does not simply reimplement sed
without explaining in depth why and how this had to be done.)

Regards,

kshe



Re: Strange sed substitution removes text

2017-09-25 Thread Martijn van Duren
On 09/25/17 09:15, Andreas Kusalananda Kähäri wrote:
> Yes, this seems to fix this particular issue for me nicely,
> but the "int i = 0;" is probably not needed.

You're right, that part was from some debugging printfs.
I also wasn't asking for OKs, but merely pointing out the root of the
problem.

Let's just wait until Ingo has time to look into it. He's still on
holiday in Paris, so it might be a few days.
> 
> 
> On Sun, Sep 24, 2017 at 11:59:49PM +0200, Martijn van Duren wrote:
>> and now with 100% more patch...
>>
>> Index: process.c
>> ===
>> RCS file: /cvs/src/usr.bin/sed/process.c,v
>> retrieving revision 1.32
>> diff -u -p -r1.32 process.c
>> --- process.c22 Feb 2017 14:09:09 -  1.32
>> +++ process.c24 Sep 2017 21:58:14 -
>> @@ -336,6 +336,7 @@ substitute(struct s_command *cp)
>>  	int n, lastempty;
>>  	size_t le = 0;
>>  	char *s;
>> +int i = 0;
>>  
>>  	s = ps;
>>  	re = cp->u.s->re;
>> @@ -386,7 +387,7 @@ substitute(struct s_command *cp)
>>  		 * and at the end of the line, terminate.
>>  		 */
>>  		if (match[0].rm_so == match[0].rm_eo) {
>> -			if (*s == '\0' || *s == '\n')
>> +			if (*s == '\0')
>>  				slen = -1;
>>  			else
>>  				slen--;
>>
>>
>> On 09/24/17 23:57, Martijn van Duren wrote:
>>> This fixes the issue for me, but I'm not sure about the motivation
>>> behind the check.
>>> Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
>>> the particular line.
>>>
>>> martijn@
>>>
>>> On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
>>>> Hi,
>>>>
>>>> Given the input file of three lines:
>>>>
>>>> line 1
>>>> line 2
>>>> line 3
>>>>
>>>> and the sed script
>>>>
>>>> s/\</\
>>>> /g
>>>> s/^/hello/
>>>>
>>>> which inserts a newline in front of every word and then prepends the
>>>> word "hello" to the beginning of the pattern space.
>>>>
>>>> The following happens:
>>>>
>>>> $ sed -f script.sed input.txt
>>>> hello
>>>>
>>>> hello
>>>>
>>>> hello
>>>>
>>>>
>>>> I was expecting to get
>>>>
>>>> hello
>>>> line
>>>> 1
>>>> hello
>>>> line
>>>> 2
>>>> hello
>>>> line
>>>> 3
>>>>
>>>> This is a bit surprising since running only the first sed expression
>>>> gives (as expected)
>>>>
>>>>
>>>> line
>>>> 1
>>>>
>>>> line
>>>> 2
>>>>
>>>> line
>>>> 3
>>>>
>>>>
>>>> The question is, why does the "line N" data disappear when inserting a
>>>> word at the start of the pattern space here?
>>>>
>>>> I'm also noticing that this does not happen if a space (for instance)
>>>> precedes the escaped newline in the first expression:
>>>>
>>>> s/\</ \
>>>> /g
>>>> s/^/hello/
>>>>
>>>>
>>>> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
>>>>
>>>> Cheers,
>>>>
>>>
> 



Re: Strange sed substitution removes text

2017-09-25 Thread Andreas Kusalananda Kähäri
Yes, this seems to fix this particular issue for me nicely,
but the "int i = 0;" is probably not needed.


On Sun, Sep 24, 2017 at 11:59:49PM +0200, Martijn van Duren wrote:
> and now with 100% more patch...
> 
> Index: process.c
> ===
> RCS file: /cvs/src/usr.bin/sed/process.c,v
> retrieving revision 1.32
> diff -u -p -r1.32 process.c
> --- process.c 22 Feb 2017 14:09:09 -  1.32
> +++ process.c 24 Sep 2017 21:58:14 -
> @@ -336,6 +336,7 @@ substitute(struct s_command *cp)
>  	int n, lastempty;
>  	size_t le = 0;
>  	char *s;
> +int i = 0;
>  
>  	s = ps;
>  	re = cp->u.s->re;
> @@ -386,7 +387,7 @@ substitute(struct s_command *cp)
>  		 * and at the end of the line, terminate.
>  		 */
>  		if (match[0].rm_so == match[0].rm_eo) {
> -			if (*s == '\0' || *s == '\n')
> +			if (*s == '\0')
>  				slen = -1;
>  			else
>  				slen--;
> 
> 
> On 09/24/17 23:57, Martijn van Duren wrote:
> > This fixes the issue for me, but I'm not sure about the motivation
> > behind the check.
> > Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
> > the particular line.
> > 
> > martijn@
> > 
> > On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
> >> Hi,
> >>
> >> Given the input file of three lines:
> >>
> >> line 1
> >> line 2
> >> line 3
> >>
> >> and the sed script
> >>
> >> s/\</\
> >> /g
> >> s/^/hello/
> >>
> >> which inserts a newline in front of every word and then prepends the
> >> word "hello" to the beginning of the pattern space.
> >>
> >> The following happens:
> >>
> >> $ sed -f script.sed input.txt
> >> hello
> >>
> >> hello
> >>
> >> hello
> >>
> >>
> >> I was expecting to get
> >>
> >> hello
> >> line
> >> 1
> >> hello
> >> line
> >> 2
> >> hello
> >> line
> >> 3
> >>
> >> This is a bit surprising since running only the first sed expression
> >> gives (as expected)
> >>
> >>
> >> line
> >> 1
> >>
> >> line
> >> 2
> >>
> >> line
> >> 3
> >>
> >>
> >> The question is, why does the "line N" data disappear when inserting a
> >> word at the start of the pattern space here?
> >>
> >> I'm also noticing that this does not happen if a space (for instance)
> >> precedes the escaped newline in the first expression:
> >>
> >> s/\</ \
> >> /g
> >> s/^/hello/
> >>
> >>
> >> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
> >>
> >> Cheers,
> >>
> > 

-- 
Andreas Kusalananda Kähäri,
National Bioinformatics Infrastructure Sweden (NBIS),
Uppsala University, Sweden.



Re: Strange sed substitution removes text

2017-09-24 Thread Martijn van Duren
and now with 100% more patch...

Index: process.c
===
RCS file: /cvs/src/usr.bin/sed/process.c,v
retrieving revision 1.32
diff -u -p -r1.32 process.c
--- process.c   22 Feb 2017 14:09:09 -  1.32
+++ process.c   24 Sep 2017 21:58:14 -
@@ -336,6 +336,7 @@ substitute(struct s_command *cp)
 	int n, lastempty;
 	size_t le = 0;
 	char *s;
+int i = 0;
 
 	s = ps;
 	re = cp->u.s->re;
@@ -386,7 +387,7 @@ substitute(struct s_command *cp)
 		 * and at the end of the line, terminate.
 		 */
 		if (match[0].rm_so == match[0].rm_eo) {
-			if (*s == '\0' || *s == '\n')
+			if (*s == '\0')
 				slen = -1;
 			else
 				slen--;


On 09/24/17 23:57, Martijn van Duren wrote:
> This fixes the issue for me, but I'm not sure about the motivation
> behind the check.
> Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
> the particular line.
> 
> martijn@
> 
> On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
>> Hi,
>>
>> Given the input file of three lines:
>>
>> line 1
>> line 2
>> line 3
>>
>> and the sed script
>>
>> s/\</\
>> /g
>> s/^/hello/
>>
>> which inserts a newline in front of every word and then prepends the
>> word "hello" to the beginning of the pattern space.
>>
>> The following happens:
>>
>> $ sed -f script.sed input.txt
>> hello
>>
>> hello
>>
>> hello
>>
>>
>> I was expecting to get
>>
>> hello
>> line
>> 1
>> hello
>> line
>> 2
>> hello
>> line
>> 3
>>
>> This is a bit surprising since running only the first sed expression
>> gives (as expected)
>>
>>
>> line
>> 1
>>
>> line
>> 2
>>
>> line
>> 3
>>
>>
>> The question is, why does the "line N" data disappear when inserting a
>> word at the start of the pattern space here?
>>
>> I'm also noticing that this does not happen if a space (for instance)
>> precedes the escaped newline in the first expression:
>>
>> s/\</ \
>> /g
>> s/^/hello/
>>
>>
>> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
>>
>> Cheers,
>>
> 



Re: Strange sed substitution removes text

2017-09-24 Thread Martijn van Duren
This fixes the issue for me, but I'm not sure about the motivation
behind the check.
Maybe schwarze@ can shed some light on it, since he's to (cvs) blame for
the particular line.

martijn@

On 09/24/17 15:42, Andreas Kusalananda Kähäri wrote:
> Hi,
> 
> Given the input file of three lines:
> 
> line 1
> line 2
> line 3
> 
> and the sed script
> 
> s/\</\
> /g
> s/^/hello/
> 
> which inserts a newline in front of every word and then prepends the
> word "hello" to the beginning of the pattern space.
> 
> The following happens:
> 
> $ sed -f script.sed input.txt
> hello
> 
> hello
> 
> hello
> 
> 
> I was expecting to get
> 
> hello
> line
> 1
> hello
> line
> 2
> hello
> line
> 3
> 
> This is a bit surprising since running only the first sed expression
> gives (as expected)
> 
> 
> line
> 1
> 
> line
> 2
> 
> line
> 3
> 
> 
> The question is, why does the "line N" data disappear when inserting a
> word at the start of the pattern space here?
> 
> I'm also noticing that this does not happen if a space (for instance)
> precedes the escaped newline in the first expression:
> 
> s/\</ \
> /g
> s/^/hello/
> 
> 
> This is using sed in the base system on OpenBSD 6.1-stable (amd64).
> 
> Cheers,
> 



Strange sed substitution removes text

2017-09-24 Thread Andreas Kusalananda Kähäri
Hi,

Given the input file of three lines:

line 1
line 2
line 3

and the sed script

s/\