Re: ksh: skip long history lines

2017-10-19 Thread Jeremie Courreges-Anglas
On Wed, Oct 18 2017, Ori Bernstein  wrote:
> On Thu, 19 Oct 2017 00:22:51 +0200
> Jeremie Courreges-Anglas  wrote:
>
>> Those variables are static so that error messages are only printed once
>> in the shell lifetime.  history reload happens each time the shell
>> detects that the histfile has been modified, which can happen often if
>> like me you have multiple interactives shells running at the same time.
>  
> Makes sense, although I had been thinking that it could lead to lost
> warnings when changing histfile; I'd hope that a corrupt histfile isn't
> a persistent condition.

That's a fair point, I think I understand your previous proposal better.

Warning only once per histfile would need something like a file-global
variable - that we should reset in hist_init.  This sounds like extra
quirkiness that would make the code more complicated for a limited
benefit.  In the end, it might eat some of the precious 24 lines of your
terminal, but you can also just fix the offending lines.

> In any case, this looks pretty good to me.

Thanks for the feedback.

>> > Maybe hoist this outside of the loop.
>> 
>> I don't see a reason for this.

(Now I see.)

[...]

Updated diff.  The logic has changed, hopefully giving more accurate
warning messages.

Feedback / oks welcome.


Index: history.c
===
RCS file: /d/cvs/src/bin/ksh/history.c,v
retrieving revision 1.72
diff -u -p -p -u -r1.72 history.c
--- history.c   18 Oct 2017 15:41:25 -  1.72
+++ history.c   19 Oct 2017 07:56:05 -
@@ -740,23 +740,38 @@ static void
 history_load(Source *s)
 {
char*p, encoded[LINE + 1], line[LINE + 1];
+   int  toolongseen = 0;
 
rewind(histfh);
+   line_co = 1;
 
/* just read it all; will auto resize history upon next command */
-   for (line_co = 1; ; line_co++) {
-   p = fgets(encoded, sizeof(encoded), histfh);
-   if (p == NULL || feof(histfh) || ferror(histfh))
-   break;
+   while (fgets(encoded, sizeof(encoded), histfh)) {
if ((p = strchr(encoded, '\n')) == NULL) {
-   bi_errorf("history file is corrupt");
-   return;
+   /* discard overlong line */
+   do {
+   /* maybe a missing trailing newline? */
+   if (strlen(encoded) != sizeof(encoded) - 1) {
+   bi_errorf("history file is corrupt");
+   return;
+   }
+   } while (fgets(encoded, sizeof(encoded), histfh)
+   && strchr(encoded, '\n') == NULL);
+
+   if (!toolongseen) {
+   toolongseen = 1;
+   bi_errorf("ignored history line(s) longer than"
+   " %d bytes", LINE);
+   }
+
+   continue;
}
*p = '\0';
s->line = line_co;
s->cmd_offset = line_co;
strunvis(line, encoded);
histsave(line_co, line, 0);
+   line_co++;
}
 
history_write();

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: ksh: skip long history lines

2017-10-19 Thread Sebastian Benoit
Jeremie Courreges-Anglas(j...@wxcvbn.org) on 2017.10.19 00:22:51 +0200:
> 
> Hi Ori,
> 
> thanks for your feedback.  Reply and updated diff below,
> 
> On Wed, Oct 18 2017, Ori Bernstein  wrote:
> > On Wed, 18 Oct 2017 22:33:51 +0200
> > Jeremie Courreges-Anglas  wrote:
> >
> >> 
> >> It would be nice to support arbitrarily long lines, but a first step
> >> would be to skip them gracefuly.
> >> 
> >> The code modifies the loop condition: no tests against ferror(3)/feof(3)
> >> are performed any more (I don't see their point).
> >> 
> >> We could print a warning on ferror(), though, but that would be another
> >> diff.
> >> 
> >> ok?

o

> >> 
> >> 
> >> Index: history.c
> >> ===
> >> RCS file: /d/cvs/src/bin/ksh/history.c,v
> >> retrieving revision 1.72
> >> diff -u -p -p -u -r1.72 history.c
> >> --- history.c  18 Oct 2017 15:41:25 -  1.72
> >> +++ history.c  18 Oct 2017 20:18:24 -
> >> @@ -739,24 +739,45 @@ history_close(void)
> >>  static void
> >>  history_load(Source *s)
> >>  {
> >> -  char*p, encoded[LINE + 1], line[LINE + 1];
> >> +  char*nl, *p, encoded[LINE + 1], line[LINE + 1];
> >>  
> >>rewind(histfh);
> >> +  line_co = 1;
> >>  
> >>/* just read it all; will auto resize history upon next command */
> >> -  for (line_co = 1; ; line_co++) {
> >> -  p = fgets(encoded, sizeof(encoded), histfh);
> >> -  if (p == NULL || feof(histfh) || ferror(histfh))
> >> -  break;
> >> -  if ((p = strchr(encoded, '\n')) == NULL) {
> >> -  bi_errorf("history file is corrupt");
> >> -  return;
> >> +  for (p = fgets(encoded, sizeof(encoded), histfh); p != NULL;
> >> +   p = fgets(encoded, sizeof(encoded), histfh)) {
> >
> > The redundant calls prevent this code from scanning as well as it
> > could for me. Perhaps:
> >
> > for(;;) {
> > if ((p = fgets(...) == NULL)
> > break;
> >
> > Just a minor cosmetic nitpick, it's not a big deal either way.
> 
> I must admit that the code is needlessly redundant, but I prefer to have
> the loop condition inside... the loop condition.
> 
> >> +  if ((nl = strchr(encoded, '\n')) == NULL) {
> >> +  static int corrupt, toolong;
> >
> > I'm not sure why these need to be static. I'm not an expert on
> > the code, but it looks like histsave can be called multiple
> > times as the shell runs, which triggers a history reload.
> > Keeping this state seems like it would be wrong.
> 
> Those variables are static so that error messages are only printed once
> in the shell lifetime.  history reload happens each time the shell
> detects that the histfile has been modified, which can happen often if
> like me you have multiple interactives shells running at the same time.
> 
> > Maybe hoist this outside of the loop.
> 
> I don't see a reason for this.
> 
> >> +  /* no trailing newline? */
> >> +  if (strlen(p) != sizeof(encoded) - 1) {
> >> +  if (!corrupt) {
> >> +  corrupt = 1;
> >> +  bi_errorf("history file is corrupt");
> >> +  }
> >> +  return;
> >
> > Why does `corrupt` need to be recorded at all? There's a
> > return right after it.
> 
> Also to limit the amount of crap printed on screen, though maybe we
> should warn every time in this case.  Here's an updated diff that does
> so and hopefully makes the intents clearer.  It also adds a comment
> suggested by anton@
> 
> If your histfile ends with an overlong line which is also not terminated
> by a newline, the do...while loop would silently ignore this last line.
> I think that's acceptable, but YMMV.
> 
> 
> Index: history.c
> ===
> RCS file: /d/cvs/src/bin/ksh/history.c,v
> retrieving revision 1.72
> diff -u -p -p -u -r1.72 history.c
> --- history.c 18 Oct 2017 15:41:25 -  1.72
> +++ history.c 18 Oct 2017 22:16:43 -
> @@ -739,24 +739,42 @@ history_close(void)
>  static void
>  history_load(Source *s)
>  {
> - char*p, encoded[LINE + 1], line[LINE + 1];
> + char*nl, *p, encoded[LINE + 1], line[LINE + 1];
>  
>   rewind(histfh);
> + line_co = 1;
>  
>   /* just read it all; will auto resize history upon next command */
> - for (line_co = 1; ; line_co++) {
> - p = fgets(encoded, sizeof(encoded), histfh);
> - if (p == NULL || feof(histfh) || ferror(histfh))
> - break;
> - if ((p = strchr(encoded, '\n')) == NULL) {
> - bi_errorf("history file is corrupt");
> - return;
> + while ((p = fgets(encoded, sizeof(encoded), histfh)) != NULL) {
> + if ((nl = 

Re: ksh: skip long history lines

2017-10-18 Thread Ori Bernstein
On Thu, 19 Oct 2017 00:22:51 +0200
Jeremie Courreges-Anglas  wrote:

> Those variables are static so that error messages are only printed once
> in the shell lifetime.  history reload happens each time the shell
> detects that the histfile has been modified, which can happen often if
> like me you have multiple interactives shells running at the same time.
 
Makes sense, although I had been thinking that it could lead to lost
warnings when changing histfile; I'd hope that a corrupt histfile isn't
a persistent condition.

In any case, this looks pretty good to me.

> > Maybe hoist this outside of the loop.
> 
> I don't see a reason for this.
> 
> >> +  /* no trailing newline? */
> >> +  if (strlen(p) != sizeof(encoded) - 1) {
> >> +  if (!corrupt) {
> >> +  corrupt = 1;
> >> +  bi_errorf("history file is corrupt");
> >> +  }
> >> +  return;
> >
> > Why does `corrupt` need to be recorded at all? There's a
> > return right after it.
> 
> Also to limit the amount of crap printed on screen, though maybe we
> should warn every time in this case.  Here's an updated diff that does
> so and hopefully makes the intents clearer.  It also adds a comment
> suggested by anton@
> 
> If your histfile ends with an overlong line which is also not terminated
> by a newline, the do...while loop would silently ignore this last line.
> I think that's acceptable, but YMMV.
> 
> 
> Index: history.c
> ===
> RCS file: /d/cvs/src/bin/ksh/history.c,v
> retrieving revision 1.72
> diff -u -p -p -u -r1.72 history.c
> --- history.c 18 Oct 2017 15:41:25 -  1.72
> +++ history.c 18 Oct 2017 22:16:43 -
> @@ -739,24 +739,42 @@ history_close(void)
>  static void
>  history_load(Source *s)
>  {
> - char*p, encoded[LINE + 1], line[LINE + 1];
> + char*nl, *p, encoded[LINE + 1], line[LINE + 1];
>  
>   rewind(histfh);
> + line_co = 1;
>  
>   /* just read it all; will auto resize history upon next command */
> - for (line_co = 1; ; line_co++) {
> - p = fgets(encoded, sizeof(encoded), histfh);
> - if (p == NULL || feof(histfh) || ferror(histfh))
> - break;
> - if ((p = strchr(encoded, '\n')) == NULL) {
> - bi_errorf("history file is corrupt");
> - return;
> + while ((p = fgets(encoded, sizeof(encoded), histfh)) != NULL) {
> + if ((nl = strchr(encoded, '\n')) == NULL) {
> + static int warned;
> +
> + /* no trailing newline? */
> + if (strlen(p) != sizeof(encoded) - 1) {
> + bi_errorf("history file is corrupt");
> + return;
> + }
> +
> + if (!warned) {
> + warned = 1;
> + bi_errorf(
> + "ignoring history line(s) longer than %d"
> + " bytes", LINE);
> + }
> +
> + /* discard overlong line */
> + do {
> + p = fgets(encoded, sizeof(encoded), histfh);
> + } while (p != NULL && strchr(p, '\n') == NULL);
> +
> + continue;
>   }
> - *p = '\0';
> + *nl = '\0';
>   s->line = line_co;
>   s->cmd_offset = line_co;
>   strunvis(line, encoded);
>   histsave(line_co, line, 0);
> + line_co++;
>   }
>  
>   history_write();
> 
> -- 
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
> 


-- 
Ori Bernstein 



Re: ksh: skip long history lines

2017-10-18 Thread Jeremie Courreges-Anglas

Hi Ori,

thanks for your feedback.  Reply and updated diff below,

On Wed, Oct 18 2017, Ori Bernstein  wrote:
> On Wed, 18 Oct 2017 22:33:51 +0200
> Jeremie Courreges-Anglas  wrote:
>
>> 
>> It would be nice to support arbitrarily long lines, but a first step
>> would be to skip them gracefuly.
>> 
>> The code modifies the loop condition: no tests against ferror(3)/feof(3)
>> are performed any more (I don't see their point).
>> 
>> We could print a warning on ferror(), though, but that would be another
>> diff.
>> 
>> ok?
>> 
>> 
>> Index: history.c
>> ===
>> RCS file: /d/cvs/src/bin/ksh/history.c,v
>> retrieving revision 1.72
>> diff -u -p -p -u -r1.72 history.c
>> --- history.c18 Oct 2017 15:41:25 -  1.72
>> +++ history.c18 Oct 2017 20:18:24 -
>> @@ -739,24 +739,45 @@ history_close(void)
>>  static void
>>  history_load(Source *s)
>>  {
>> -char*p, encoded[LINE + 1], line[LINE + 1];
>> +char*nl, *p, encoded[LINE + 1], line[LINE + 1];
>>  
>>  rewind(histfh);
>> +line_co = 1;
>>  
>>  /* just read it all; will auto resize history upon next command */
>> -for (line_co = 1; ; line_co++) {
>> -p = fgets(encoded, sizeof(encoded), histfh);
>> -if (p == NULL || feof(histfh) || ferror(histfh))
>> -break;
>> -if ((p = strchr(encoded, '\n')) == NULL) {
>> -bi_errorf("history file is corrupt");
>> -return;
>> +for (p = fgets(encoded, sizeof(encoded), histfh); p != NULL;
>> + p = fgets(encoded, sizeof(encoded), histfh)) {
>
> The redundant calls prevent this code from scanning as well as it
> could for me. Perhaps:
>
>   for(;;) {
>   if ((p = fgets(...) == NULL)
>   break;
>
> Just a minor cosmetic nitpick, it's not a big deal either way.

I must admit that the code is needlessly redundant, but I prefer to have
the loop condition inside... the loop condition.

>> +if ((nl = strchr(encoded, '\n')) == NULL) {
>> +static int corrupt, toolong;
>
> I'm not sure why these need to be static. I'm not an expert on
> the code, but it looks like histsave can be called multiple
> times as the shell runs, which triggers a history reload.
> Keeping this state seems like it would be wrong.

Those variables are static so that error messages are only printed once
in the shell lifetime.  history reload happens each time the shell
detects that the histfile has been modified, which can happen often if
like me you have multiple interactives shells running at the same time.

> Maybe hoist this outside of the loop.

I don't see a reason for this.

>> +/* no trailing newline? */
>> +if (strlen(p) != sizeof(encoded) - 1) {
>> +if (!corrupt) {
>> +corrupt = 1;
>> +bi_errorf("history file is corrupt");
>> +}
>> +return;
>
> Why does `corrupt` need to be recorded at all? There's a
> return right after it.

Also to limit the amount of crap printed on screen, though maybe we
should warn every time in this case.  Here's an updated diff that does
so and hopefully makes the intents clearer.  It also adds a comment
suggested by anton@

If your histfile ends with an overlong line which is also not terminated
by a newline, the do...while loop would silently ignore this last line.
I think that's acceptable, but YMMV.


Index: history.c
===
RCS file: /d/cvs/src/bin/ksh/history.c,v
retrieving revision 1.72
diff -u -p -p -u -r1.72 history.c
--- history.c   18 Oct 2017 15:41:25 -  1.72
+++ history.c   18 Oct 2017 22:16:43 -
@@ -739,24 +739,42 @@ history_close(void)
 static void
 history_load(Source *s)
 {
-   char*p, encoded[LINE + 1], line[LINE + 1];
+   char*nl, *p, encoded[LINE + 1], line[LINE + 1];
 
rewind(histfh);
+   line_co = 1;
 
/* just read it all; will auto resize history upon next command */
-   for (line_co = 1; ; line_co++) {
-   p = fgets(encoded, sizeof(encoded), histfh);
-   if (p == NULL || feof(histfh) || ferror(histfh))
-   break;
-   if ((p = strchr(encoded, '\n')) == NULL) {
-   bi_errorf("history file is corrupt");
-   return;
+   while ((p = fgets(encoded, sizeof(encoded), histfh)) != NULL) {
+   if ((nl = strchr(encoded, '\n')) == NULL) {
+   static int warned;
+
+   /* no trailing newline? */
+   if (strlen(p) != sizeof(encoded) - 1) {
+   bi_errorf("history 

Re: ksh: skip long history lines

2017-10-18 Thread Ori Bernstein
On Wed, 18 Oct 2017 22:33:51 +0200
Jeremie Courreges-Anglas  wrote:

> 
> It would be nice to support arbitrarily long lines, but a first step
> would be to skip them gracefuly.
> 
> The code modifies the loop condition: no tests against ferror(3)/feof(3)
> are performed any more (I don't see their point).
> 
> We could print a warning on ferror(), though, but that would be another
> diff.
> 
> ok?
> 
> 
> Index: history.c
> ===
> RCS file: /d/cvs/src/bin/ksh/history.c,v
> retrieving revision 1.72
> diff -u -p -p -u -r1.72 history.c
> --- history.c 18 Oct 2017 15:41:25 -  1.72
> +++ history.c 18 Oct 2017 20:18:24 -
> @@ -739,24 +739,45 @@ history_close(void)
>  static void
>  history_load(Source *s)
>  {
> - char*p, encoded[LINE + 1], line[LINE + 1];
> + char*nl, *p, encoded[LINE + 1], line[LINE + 1];
>  
>   rewind(histfh);
> + line_co = 1;
>  
>   /* just read it all; will auto resize history upon next command */
> - for (line_co = 1; ; line_co++) {
> - p = fgets(encoded, sizeof(encoded), histfh);
> - if (p == NULL || feof(histfh) || ferror(histfh))
> - break;
> - if ((p = strchr(encoded, '\n')) == NULL) {
> - bi_errorf("history file is corrupt");
> - return;
> + for (p = fgets(encoded, sizeof(encoded), histfh); p != NULL;
> +  p = fgets(encoded, sizeof(encoded), histfh)) {

The redundant calls prevent this code from scanning as well as it
could for me. Perhaps:

for(;;) {
if ((p = fgets(...) == NULL)
break;

Just a minor cosmetic nitpick, it's not a big deal either way.

> + if ((nl = strchr(encoded, '\n')) == NULL) {
> + static int corrupt, toolong;

I'm not sure why these need to be static. I'm not an expert on
the code, but it looks like histsave can be called multiple
times as the shell runs, which triggers a history reload.
Keeping this state seems like it would be wrong.

Maybe hoist this outside of the loop.

> + /* no trailing newline? */
> + if (strlen(p) != sizeof(encoded) - 1) {
> + if (!corrupt) {
> + corrupt = 1;
> + bi_errorf("history file is corrupt");
> + }
> + return;

Why does `corrupt` need to be recorded at all? There's a
return right after it.

> + }
> +
> + if (!toolong) {
> + toolong = 1;
> + bi_errorf(
> + "ignoring history line(s) longer than %d"
> + " bytes", LINE);
> + }
> +
> + do {
> + p = fgets(encoded, sizeof(encoded), histfh);
> + } while (p != NULL && strchr(p, '\n') == NULL);
> +
> + continue;
>   }
> - *p = '\0';
> + *nl = '\0';
>   s->line = line_co;
>   s->cmd_offset = line_co;
>   strunvis(line, encoded);
>   histsave(line_co, line, 0);
> + line_co++;
>   }
>  
>   history_write();
> 
> -- 
> jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE
> 


-- 
Ori Bernstein 



ksh: skip long history lines

2017-10-18 Thread Jeremie Courreges-Anglas

It would be nice to support arbitrarily long lines, but a first step
would be to skip them gracefuly.

The code modifies the loop condition: no tests against ferror(3)/feof(3)
are performed any more (I don't see their point).

We could print a warning on ferror(), though, but that would be another
diff.

ok?


Index: history.c
===
RCS file: /d/cvs/src/bin/ksh/history.c,v
retrieving revision 1.72
diff -u -p -p -u -r1.72 history.c
--- history.c   18 Oct 2017 15:41:25 -  1.72
+++ history.c   18 Oct 2017 20:18:24 -
@@ -739,24 +739,45 @@ history_close(void)
 static void
 history_load(Source *s)
 {
-   char*p, encoded[LINE + 1], line[LINE + 1];
+   char*nl, *p, encoded[LINE + 1], line[LINE + 1];
 
rewind(histfh);
+   line_co = 1;
 
/* just read it all; will auto resize history upon next command */
-   for (line_co = 1; ; line_co++) {
-   p = fgets(encoded, sizeof(encoded), histfh);
-   if (p == NULL || feof(histfh) || ferror(histfh))
-   break;
-   if ((p = strchr(encoded, '\n')) == NULL) {
-   bi_errorf("history file is corrupt");
-   return;
+   for (p = fgets(encoded, sizeof(encoded), histfh); p != NULL;
+p = fgets(encoded, sizeof(encoded), histfh)) {
+   if ((nl = strchr(encoded, '\n')) == NULL) {
+   static int corrupt, toolong;
+
+   /* no trailing newline? */
+   if (strlen(p) != sizeof(encoded) - 1) {
+   if (!corrupt) {
+   corrupt = 1;
+   bi_errorf("history file is corrupt");
+   }
+   return;
+   }
+
+   if (!toolong) {
+   toolong = 1;
+   bi_errorf(
+   "ignoring history line(s) longer than %d"
+   " bytes", LINE);
+   }
+
+   do {
+   p = fgets(encoded, sizeof(encoded), histfh);
+   } while (p != NULL && strchr(p, '\n') == NULL);
+
+   continue;
}
-   *p = '\0';
+   *nl = '\0';
s->line = line_co;
s->cmd_offset = line_co;
strunvis(line, encoded);
histsave(line_co, line, 0);
+   line_co++;
}
 
history_write();

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE