Hi Tobias,

Tobias Stoeckmann wrote on Fri, Mar 30, 2018 at 05:39:11PM +0200:

> Our expr implementation supports 64 bit integers, but does not check
> for overflows during parsing and arithmetical operations.

Even though - as discussed previously for test(1) - behaviour is
undefined by POSIX outside the range 0 to 2147483647, we are allowed
to handle a larger range, and i agree that handling the range
-9223372036854775808 to 9223372036854775807 (INT64_MIN to INT64_MAX)
seems desirable.

> This patch fixes the problems based on FreeBSD's implementation, which
> is a bit more advanced than NetBSD, which it is based upon.

But i have two concerns regarding your implementation, see inline below.

> Index: bin/expr/expr.c
> ===================================================================
> RCS file: /cvs/src/bin/expr/expr.c,v
> retrieving revision 1.26
> diff -u -p -u -p -r1.26 expr.c
> --- bin/expr/expr.c   19 Oct 2016 18:20:25 -0000      1.26
> +++ bin/expr/expr.c   30 Mar 2018 15:33:50 -0000
> @@ -99,6 +99,7 @@ is_integer(struct val *vp, int64_t *r)
>       char           *s;
>       int             neg;
>       int64_t         i;
> +     char            c;
>  
>       if (vp->type == integer) {
>               *r = vp->u.i;
> @@ -120,8 +121,12 @@ is_integer(struct val *vp, int64_t *r)
>               if (!isdigit((unsigned char)*s))
>                       return 0;
>  
> +             c = *s - '0';
> +             if (c < 0 || INT64_MAX / 10 < i || INT64_MAX - c < i * 10)
> +                     errx(3, "out of range");

Here is the first concern.

First look at our old implementation:

   $ /obin/expr -9223372036854775807 + 0
  -9223372036854775807

   $ /obin/expr -9223372036854775807 - 1        
  -9223372036854775808
   $ /obin/expr -9223372036854775808 + 0 
  -9223372036854775808

   $ /obin/expr -9223372036854775807 - 2        
  9223372036854775807
   $ /obin/expr -9223372036854775808 - 1 
  9223372036854775807
   $ /obin/expr -9223372036854775809 + 0 
  9223372036854775807

The first three are expected; the overflow in the last three is not.

Now look at what your new implementation does:

   $ /obin/expr.tobias -9223372036854775807 + 0
  -9223372036854775807

   $ /obin/expr.tobias -9223372036854775807 - 1 
  -9223372036854775808
   $ /obin/expr.tobias -9223372036854775808 + 0 
  expr.tobias: out of range

   $ /obin/expr.tobias -9223372036854775807 - 2 
  expr.tobias: overflow
   $ /obin/expr.tobias -9223372036854775808 - 1 
  expr.tobias: out of range
   $ /obin/expr.tobias -9223372036854775809 + 0 
  expr.tobias: out of range

The first two are expected, and so is the overflow in number 4
and the out of range in number 6.  But the out of range in
numbers 3 and 5 is unexpected.  I would expect number 3 to
succeed and number 5 to fail with overflow rather than out of range.

The reason is that the hand-rolled integer parser fails for INT64_MIN
because -INT64_MIN is larger than INT64_MAX.

I'm not saying this is an absolute show-stopper, but i consider
it somewhat ugly.  What do you think about using strtonum(3)
instead in is_integer?  long long is guaranteed to be at least
64 bits long, and that would solve the problem with INT64_MIN.

> +
>               i *= 10;
> -             i += *s - '0';
> +             i += c;
>  
>               s++;
>       }
> @@ -303,8 +308,9 @@ eval5(void)
>  struct val *
>  eval4(void)
>  {
> -     struct val     *l, *r;
> -     enum token      op;
> +     struct val      *l, *r;
> +     enum token       op;
> +     volatile int64_t res;
>  
>       l = eval5();
>       while ((op = token) == MUL || op == DIV || op == MOD) {
> @@ -316,7 +322,10 @@ eval4(void)
>               }
>  
>               if (op == MUL) {
> -                     l->u.i *= r->u.i;
> +                     res = l->u.i * r->u.i;
> +                     if (r->u.i != 0 && l->u.i != res / r->u.i)
> +                             errx(3, "overflow");
> +                     l->u.i = res;
>               } else {
>                       if (r->u.i == 0) {
>                               errx(2, "division by zero");
> @@ -324,6 +333,8 @@ eval4(void)
>                       if (op == DIV) {
>                               if (l->u.i != INT64_MIN || r->u.i != -1)
>                                       l->u.i /= r->u.i;
> +                             else
> +                                     errx(3, "overflow");
>                       } else {
>                               if (l->u.i != INT64_MIN || r->u.i != -1)
>                                       l->u.i %= r->u.i;
> @@ -342,8 +353,9 @@ eval4(void)
>  struct val *
>  eval3(void)
>  {
> -     struct val     *l, *r;
> -     enum token      op;
> +     struct val      *l, *r;
> +     enum token       op;
> +     volatile int64_t res;
>  
>       l = eval4();
>       while ((op = token) == ADD || op == SUB) {
> @@ -355,9 +367,18 @@ eval3(void)
>               }
>  
>               if (op == ADD) {
> -                     l->u.i += r->u.i;
> +                     res = l->u.i + r->u.i;
> +                     if ((l->u.i > 0 && r->u.i > 0 && res <= 0) ||
> +                         (l->u.i < 0 && r->u.i < 0 && res >= 0))
> +                             errx(3, "overflow");
> +                     l->u.i = res;
>               } else {
> -                     l->u.i -= r->u.i;
> +                     res = l->u.i - r->u.i;
> +                     if ((r->u.i == INT64_MIN && l->u.i < 0) ||

Here is the second concern.
I think this line is outright wrong.
Consider this:

   $ /obin/expr -36854775808 - \( -9223372036854775807 - 1 \)
  9223372000000000000

I'm writing INT64_MIN in a slightly weird way to avoid the input
problem discussed above.  The result of our old implementation
is expected:  (negative number) - INT64_MIN is perfectly
valid.

Now look at your implementation:

   $ /obin/expr.tobias -9223372036854775807 - 1                      
  -9223372036854775808
   $ /obin/expr.tobias -36854775808 - \( -9223372036854775807 - 1 \)
  expr.tobias: overflow

Constructing INT64_MIN works, but then the substraction fails for
no good reason.

The minimal fix to your code would be

        if ((r->u.i == INT64_MIN && l->u.i == 0) ||
            (l->u.i > 0 && r->u.i < 0 && res <= 0) ||

That is complete:  The cases of mismatching signs are handled by the
two following lines.  If both numbers are >= 0, there is no problem
with subtraction.  If both numbers are <= 0, there is exactly one
problematic case: 0 - INT64_MIN.

But that can be simplified by including it into the next line as
follows:

        if ((l->u.i >= 0 && r->u.i < 0 && res <= 0) ||

Do you want to check and re-send your patch?

Yours,
  Ingo


> +                         (l->u.i > 0 && r->u.i < 0 && res <= 0) ||
> +                         (l->u.i < 0 && r->u.i > 0 && res >= 0))
> +                             errx(3, "overflow");
> +                     l->u.i = res;
>               }
>  
>               free_value(r);

Reply via email to