Hi Tobias, Tobias Stoeckmann wrote on Fri, Mar 30, 2018 at 05:39:11PM +0200:
> Our expr implementation supports 64 bit integers, but does not check > for overflows during parsing and arithmetical operations. Even though - as discussed previously for test(1) - behaviour is undefined by POSIX outside the range 0 to 2147483647, we are allowed to handle a larger range, and i agree that handling the range -9223372036854775808 to 9223372036854775807 (INT64_MIN to INT64_MAX) seems desirable. > This patch fixes the problems based on FreeBSD's implementation, which > is a bit more advanced than NetBSD, which it is based upon. But i have two concerns regarding your implementation, see inline below. > Index: bin/expr/expr.c > =================================================================== > RCS file: /cvs/src/bin/expr/expr.c,v > retrieving revision 1.26 > diff -u -p -u -p -r1.26 expr.c > --- bin/expr/expr.c 19 Oct 2016 18:20:25 -0000 1.26 > +++ bin/expr/expr.c 30 Mar 2018 15:33:50 -0000 > @@ -99,6 +99,7 @@ is_integer(struct val *vp, int64_t *r) > char *s; > int neg; > int64_t i; > + char c; > > if (vp->type == integer) { > *r = vp->u.i; > @@ -120,8 +121,12 @@ is_integer(struct val *vp, int64_t *r) > if (!isdigit((unsigned char)*s)) > return 0; > > + c = *s - '0'; > + if (c < 0 || INT64_MAX / 10 < i || INT64_MAX - c < i * 10) > + errx(3, "out of range"); Here is the first concern. First look at our old implementation: $ /obin/expr -9223372036854775807 + 0 -9223372036854775807 $ /obin/expr -9223372036854775807 - 1 -9223372036854775808 $ /obin/expr -9223372036854775808 + 0 -9223372036854775808 $ /obin/expr -9223372036854775807 - 2 9223372036854775807 $ /obin/expr -9223372036854775808 - 1 9223372036854775807 $ /obin/expr -9223372036854775809 + 0 9223372036854775807 The first three are expected; the overflow in the last three is not. Now look at what your new implementation does: $ /obin/expr.tobias -9223372036854775807 + 0 -9223372036854775807 $ /obin/expr.tobias -9223372036854775807 - 1 -9223372036854775808 $ /obin/expr.tobias -9223372036854775808 + 0 expr.tobias: out of range $ /obin/expr.tobias -9223372036854775807 - 2 expr.tobias: overflow $ /obin/expr.tobias -9223372036854775808 - 1 expr.tobias: out of range $ /obin/expr.tobias -9223372036854775809 + 0 expr.tobias: out of range The first two are expected, and so is the overflow in number 4 and the out of range in number 6. But the out of range in numbers 3 and 5 is unexpected. I would expect number 3 to succeed and number 5 to fail with overflow rather than out of range. The reason is that the hand-rolled integer parser fails for INT64_MIN because -INT64_MIN is larger than INT64_MAX. I'm not saying this is an absolute show-stopper, but i consider it somewhat ugly. What do you think about using strtonum(3) instead in is_integer? long long is guaranteed to be at least 64 bits long, and that would solve the problem with INT64_MIN. > + > i *= 10; > - i += *s - '0'; > + i += c; > > s++; > } > @@ -303,8 +308,9 @@ eval5(void) > struct val * > eval4(void) > { > - struct val *l, *r; > - enum token op; > + struct val *l, *r; > + enum token op; > + volatile int64_t res; > > l = eval5(); > while ((op = token) == MUL || op == DIV || op == MOD) { > @@ -316,7 +322,10 @@ eval4(void) > } > > if (op == MUL) { > - l->u.i *= r->u.i; > + res = l->u.i * r->u.i; > + if (r->u.i != 0 && l->u.i != res / r->u.i) > + errx(3, "overflow"); > + l->u.i = res; > } else { > if (r->u.i == 0) { > errx(2, "division by zero"); > @@ -324,6 +333,8 @@ eval4(void) > if (op == DIV) { > if (l->u.i != INT64_MIN || r->u.i != -1) > l->u.i /= r->u.i; > + else > + errx(3, "overflow"); > } else { > if (l->u.i != INT64_MIN || r->u.i != -1) > l->u.i %= r->u.i; > @@ -342,8 +353,9 @@ eval4(void) > struct val * > eval3(void) > { > - struct val *l, *r; > - enum token op; > + struct val *l, *r; > + enum token op; > + volatile int64_t res; > > l = eval4(); > while ((op = token) == ADD || op == SUB) { > @@ -355,9 +367,18 @@ eval3(void) > } > > if (op == ADD) { > - l->u.i += r->u.i; > + res = l->u.i + r->u.i; > + if ((l->u.i > 0 && r->u.i > 0 && res <= 0) || > + (l->u.i < 0 && r->u.i < 0 && res >= 0)) > + errx(3, "overflow"); > + l->u.i = res; > } else { > - l->u.i -= r->u.i; > + res = l->u.i - r->u.i; > + if ((r->u.i == INT64_MIN && l->u.i < 0) || Here is the second concern. I think this line is outright wrong. Consider this: $ /obin/expr -36854775808 - \( -9223372036854775807 - 1 \) 9223372000000000000 I'm writing INT64_MIN in a slightly weird way to avoid the input problem discussed above. The result of our old implementation is expected: (negative number) - INT64_MIN is perfectly valid. Now look at your implementation: $ /obin/expr.tobias -9223372036854775807 - 1 -9223372036854775808 $ /obin/expr.tobias -36854775808 - \( -9223372036854775807 - 1 \) expr.tobias: overflow Constructing INT64_MIN works, but then the substraction fails for no good reason. The minimal fix to your code would be if ((r->u.i == INT64_MIN && l->u.i == 0) || (l->u.i > 0 && r->u.i < 0 && res <= 0) || That is complete: The cases of mismatching signs are handled by the two following lines. If both numbers are >= 0, there is no problem with subtraction. If both numbers are <= 0, there is exactly one problematic case: 0 - INT64_MIN. But that can be simplified by including it into the next line as follows: if ((l->u.i >= 0 && r->u.i < 0 && res <= 0) || Do you want to check and re-send your patch? Yours, Ingo > + (l->u.i > 0 && r->u.i < 0 && res <= 0) || > + (l->u.i < 0 && r->u.i > 0 && res >= 0)) > + errx(3, "overflow"); > + l->u.i = res; > } > > free_value(r);