support for comparison of unlimited-length integers in expr and test

Paul Eggert Fri, 27 May 2005 13:56:53 -0700

The recent bug fix to expr got me thinking: why doesn't expr simply
compare integers of unlimited length, the way "sort" does?  That can
be done cheaply, because the digit strings can be compared directly
without converting them to machine integers.  "test" has a similar
problem.  I installed the following patch to implement this idea.
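
To illustrate the idea, here is a minimal sketch of comparing two
decimal integer strings without converting them.  It is not the code
in the patch (the real comparison lives in lib/strnumcmp-in.h and also
handles decimal points and thousands separators).  The names
skip_zeros, cmp_magnitude and int_string_cmp are made up for this
example, and both arguments are assumed to match -?[0-9]+ already.

```c
#include <string.h>

/* Return a pointer past any leading zeros in the digit string S.  */
static const char *
skip_zeros (const char *s)
{
  while (*s == '0')
    s++;
  return s;
}

/* Compare the magnitudes of digit strings A and B.
   Return -1, 0 or 1.  */
static int
cmp_magnitude (const char *a, const char *b)
{
  a = skip_zeros (a);
  b = skip_zeros (b);
  size_t la = strlen (a);
  size_t lb = strlen (b);

  /* With leading zeros gone, the longer string is the larger number.  */
  if (la != lb)
    return la < lb ? -1 : 1;

  /* Equal lengths: lexicographic order equals numeric order.  */
  int c = strcmp (a, b);
  return (c > 0) - (c < 0);
}

/* Compare integer strings A and B, each of the form -?[0-9]+.
   Return -1, 0 or 1.  No machine integer is ever formed, so there
   is nothing to overflow.  */
static int
int_string_cmp (const char *a, const char *b)
{
  int neg_a = (*a == '-');
  int neg_b = (*b == '-');
  a += neg_a;
  b += neg_b;

  int mag = cmp_magnitude (a, b);
  if (neg_a && neg_b)
    return -mag;
  if (!neg_a && !neg_b)
    return mag;

  /* Signs differ: the values are equal only if both are zero,
     e.g. "-0" versus "0".  */
  if (*skip_zeros (a) == '\0' && *skip_zeros (b) == '\0')
    return 0;
  return neg_a ? -1 : 1;
}
```

The cost is linear in the length of the strings, and typically the
comparison is decided by the lengths alone.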


This fixes all the integer-overflow problems I know of with "test",
but "expr" still has quite a few problems, e.g., "expr
9223372036854775807 + 1" still prints "-9223372036854775808".
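
For reference, the wraparound in "+" could in principle be detected
before it happens.  The following is only a sketch of one common
technique and is not part of this patch; add_checked is a hypothetical
helper name, not a function in expr.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* If A + B fits in intmax_t, store the sum in *SUM and return true.
   Otherwise return false and leave *SUM untouched.  The range test
   is done before adding, so no signed overflow ever occurs.  */
static bool
add_checked (intmax_t a, intmax_t b, intmax_t *sum)
{
  if (b > 0 ? a > INTMAX_MAX - b : a < INTMAX_MIN - b)
    return false;
  *sum = a + b;
  return true;
}
```

On a host where intmax_t is 64 bits wide, add_checked
(9223372036854775807, 1, &sum) returns false instead of wrapping
around.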

How about modifying "expr" to use GMP <http://swox.com/gmp/>
instead, so that expr doesn't overflow unless it runs out of memory?
The disadvantage is a reliance on the GMP library, but the advantage
is that expr will "just work".  We can fall back to the current
approach if GMP is not available.

2005-05-27  Paul Eggert  <[EMAIL PROTECTED]>

        * NEWS: expr and test now correctly compare integers of unlimited size.
        (Also, correct a comment that claimed that expr detects integer
        overflow; it does so only when converting from strings.)
        * src/expr.c: Include strnumcmp.h, xstrtol.h.
        (looks_like_integer): New function.
        (toarith): Use it.  Also, use xstrtoimax rather than rolling our
        own diagnostics.
        (eval2): Don't look for trouble if !evaluate; this simplifies things.
        Compare numbers using string comparison, so that overflow is
        not possible.
        * src/sort.c: Refactor so that others can use large-integer
        comparison functions.
        Include "strnumcmp.h".
        (NEGATION_SIGN, NUMERIC_ZERO, fraccompare):
        Remove; moved to strnumcmp.
        (decimal_point): Now int, to simplify conversion overhead with
        new API.  All uses changed.
        (thousands_sep): Now -1 if there isn't one, as per new API.
        All uses changed.
        (numcompare): Move contents to strnumcmp module, except for
        skipping blanks.
        * src/test.c: Include inttostr.h, strnumcmp.h.
        (whitespace, digit, digit_value, integer_expected_error): Remove.
        (is_int): Remove; replaced by...
        (find_int): New function.
        (binary_operator): Don't let integers overflow in comparisons;
        return the correct answer instead.  Simplify the code.
        (unary_operator): Convert the integer ourself, since find_int
        no longer does so.
        * tests/expr/basic (bigcmp): New test.
        * tests/test/Test.pm (eq-6, gt-5, lt-5): New tests.
        * lib/strnumcmp.c, strnumcmp.h, strnumcmp-in.h, strintcmp.c:
        New files.
        * m4/prereq.m4 (gl_PREREQ): Require gl_STRINTCMP, gl_STRNUMCMP.
        * m4/strnumcmp.m4: New file.
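
The simplification in test.c's binary_operator folds the six integer
operators into one expression over a three-way comparison result.
Here is a standalone sketch of that decoding step.  The function name
apply_int_op and its calling convention are made up for illustration;
in the patch the same logic is written inline on argv[op].

```c
#include <stdbool.h>

/* OP is a two-letter operator name ("lt", "le", "gt", "ge", "eq" or
   "ne") and CMP is a three-way comparison result (negative, zero or
   positive).  Return the truth value of the operator.  */
static bool
apply_int_op (const char *op, int cmp)
{
  /* True for "le", "ge" and "ne"; false for "lt", "gt" and "eq".  */
  bool xe_operator = (op[1] == 'e');

  return (op[0] == 'l' ? cmp < xe_operator       /* cmp < 0 or cmp < 1 */
          : op[0] == 'g' ? cmp > -xe_operator    /* cmp > 0 or cmp > -1 */
          : (cmp != 0) == xe_operator);          /* "eq" or "ne" */
}
```

Because CMP is an int, "cmp < 1" is the same as "cmp <= 0" and
"cmp > -1" is the same as "cmp >= 0", which is what makes the trick
work.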

Index: NEWS
===================================================================
RCS file: /fetish/cu/NEWS,v
retrieving revision 1.290
diff -p -u -r1.290 NEWS
--- NEWS        26 May 2005 19:27:50 -0000      1.290
+++ NEWS        27 May 2005 20:30:58 -0000
@@ -107,7 +107,9 @@ GNU coreutils NEWS                      
   time-of-day is changed while dd is running.  Also, it avoids
   using unsafe code in signal handlers; this fixes some core dumps.
 
-  expr now detects integer overflow when evaluating large integers,
+  expr and test now correctly compare integers of unlimited magnitude.
+
+  expr now detects integer overflow when converting strings to integers,
   rather than silently wrapping around.
 
   ls now refuses to generate time stamps containing more than 1000 bytes, to
@@ -118,9 +120,6 @@ GNU coreutils NEWS                      
 
   "pr -D FORMAT" now accepts the same formats that "date +FORMAT" does.
 
-  test now detects integer overflow when evaluating large integers,
-  rather than silently wrapping around.
-
 ** Improved portability
 
   nice now works on Darwin 7.7.0 in spite of its invalid definition of NZERO.
Index: tests/expr/basic
===================================================================
RCS file: /fetish/cu/tests/expr/basic,v
retrieving revision 1.12
diff -p -u -r1.12 basic
--- tests/expr/basic    26 May 2005 16:09:29 -0000      1.12
+++ tests/expr/basic    27 May 2005 20:30:58 -0000
@@ -59,6 +59,10 @@ my @Tests =
      ['fail-a', '3 + -', {ERR => "$prog: non-numeric argument\n"},
       {EXIT => 3}],
 
+     # This erroneously succeeded before 5.3.1.
+     ['bigcmp', '-- -2417851639229258349412352 \< 2417851639229258349412352',
+      {OUT => '1'}, {EXIT => 0}],
+
      ['fail-b', '9 9', {ERR => "$prog: syntax error\n"},
       {EXIT => 2}],
      ['fail-c', {ERR => "$prog: missing operand\n"
Index: tests/test/Test.pm
===================================================================
RCS file: /fetish/cu/tests/test/Test.pm,v
retrieving revision 1.5
diff -p -u -r1.5 Test.pm
--- tests/test/Test.pm  26 Jul 2003 12:23:27 -0000      1.5
+++ tests/test/Test.pm  27 May 2005 20:30:58 -0000
@@ -49,16 +49,19 @@ sub test_vector
      ['eq-3', '0 -eq 00', {}, '', 0],
      ['eq-4', '8 -eq 9', {}, '', 1],
      ['eq-5', '1 -eq 0', {}, '', 1],
+     ['eq-6', '340282366920938463463374607431768211456 -eq 0', {}, '', 1],
 
      ['gt-1', '5 -gt 5', {}, '', 1],
      ['gt-2', '5 -gt 4', {}, '', 0],
      ['gt-3', '4 -gt 5', {}, '', 1],
      ['gt-4', '-1 -gt -2', {}, '', 0],
+     ['gt-5', '18446744073709551616 -gt -18446744073709551616', {}, '', 0],
 
      ['lt-1', '5 -lt 5', {}, '', 1],
      ['lt-2', '5 -lt 4', {}, '', 1],
      ['lt-3', '4 -lt 5', {}, '', 0],
      ['lt-4', '-1 -lt -2', {}, '', 1],
+     ['lt-5', '-18446744073709551616 -lt 18446744073709551616', {}, '', 0],
 
      # This evokes `test: 0x0: integer expression expected'.
      ['inv-1', '0x0 -eq 00', {}, '', 2],
Index: src/expr.c
===================================================================
RCS file: /fetish/cu/src/expr.c,v
retrieving revision 1.103
diff -p -u -r1.103 expr.c
--- src/expr.c  26 May 2005 16:09:38 -0000      1.103
+++ src/expr.c  27 May 2005 20:30:58 -0000
@@ -38,6 +38,8 @@
 #include "error.h"
 #include "inttostr.h"
 #include "quotearg.h"
+#include "strnumcmp.h"
+#include "xstrtol.h"
 
 /* The official name of this program (e.g., no `g' prefix).  */
 #define PROGRAM_NAME "expr"
@@ -297,6 +299,21 @@ null (VALUE *v)
     }
 }
 
+/* Return true if CP takes the form of an integer.  */
+
+static bool
+looks_like_integer (char const *cp)
+{
+  cp += (*cp == '-');
+
+  do
+    if (! ISDIGIT (*cp))
+      return false;
+  while (*++cp);
+
+  return true;
+}
+
 /* Coerce V to a string value (can't fail).  */
 
 static void
@@ -328,33 +345,12 @@ toarith (VALUE *v)
       return true;
     case string:
       {
-       intmax_t value = 0;
-       char *cp = v->u.s;
-       int sign = (*cp == '-' ? -1 : 1);
-
-       if (sign < 0)
-         cp++;
-
-       do
-         {
-           if (ISDIGIT (*cp))
-             {
-               intmax_t new_v = 10 * value + sign * (*cp - '0');
-               if (0 < sign
-                   ? (INTMAX_MAX / 10 < value || new_v < 0)
-                   : (value < INTMAX_MIN / 10 || 0 < new_v))
-                 error (EXPR_FAILURE, 0,
-                        (0 < sign
-                         ? _("integer is too large: %s")
-                         : _("integer is too small: %s")),
-                        quotearg_colon (v->u.s));
-               value = new_v;
-             }
-           else
-             return false;
-         }
-       while (*++cp);
+       intmax_t value;
 
+       if (! looks_like_integer (v->u.s))
+         return false;
+       if (xstrtoimax (v->u.s, NULL, 10, &value, NULL) != LONGINT_OK)
+         error (EXPR_FAILURE, ERANGE, "%s", v->u.s);
        free (v->u.s);
        v->u.i = value;
        v->type = integer;
@@ -693,16 +689,6 @@ static VALUE *
 eval2 (bool evaluate)
 {
   VALUE *l;
-  VALUE *r;
-  enum
-  {
-    less_than, less_equal, equal, not_equal, greater_equal, greater_than
-  } fxn;
-  bool val;
-  intmax_t lval;
-  intmax_t rval;
-  int collation_errno;
-  char *collation_arg1;
 
 #ifdef EVAL_TRACE
   trace ("eval2");
@@ -710,6 +696,13 @@ eval2 (bool evaluate)
   l = eval3 (evaluate);
   while (1)
     {
+      VALUE *r;
+      enum
+       {
+         less_than, less_equal, equal, not_equal, greater_equal, greater_than
+       } fxn;
+      bool val = false;
+
       if (nextarg ("<"))
        fxn = less_than;
       else if (nextarg ("<="))
@@ -725,46 +718,45 @@ eval2 (bool evaluate)
       else
        return l;
       r = eval3 (evaluate);
-      tostring (l);
-      tostring (r);
 
-      /* Save the first arg to strcoll, in case we need its value for
-        a diagnostic later.  This is needed because 'toarith' might
-        free the first arg.  */
-      collation_arg1 = xstrdup (l->u.s);
-
-      errno = 0;
-      lval = strcoll (collation_arg1, r->u.s);
-      collation_errno = errno;
-      rval = 0;
-      if (toarith (l) && toarith (r))
-       {
-         lval = l->u.i;
-         rval = r->u.i;
-       }
-      else if (collation_errno && evaluate)
+      if (evaluate)
        {
-         error (0, collation_errno, _("string comparison failed"));
-         error (0, 0, _("Set LC_ALL='C' to work around the problem."));
-         error (EXPR_FAILURE, 0,
-                _("The strings compared were %s and %s."),
-                quotearg_n_style (0, locale_quoting_style, collation_arg1),
-                quotearg_n_style (1, locale_quoting_style, r->u.s));
-       }
+         int cmp;
+         tostring (l);
+         tostring (r);
 
-      switch (fxn)
-       {
-       case less_than:     val = (lval <  rval); break;
-       case less_equal:    val = (lval <= rval); break;
-       case equal:         val = (lval == rval); break;
-       case not_equal:     val = (lval != rval); break;
-       case greater_equal: val = (lval >= rval); break;
-       case greater_than:  val = (lval >  rval); break;
-       default: abort ();
+         if (looks_like_integer (l->u.s) && looks_like_integer (r->u.s))
+           cmp = strintcmp (l->u.s, r->u.s);
+         else
+           {
+             errno = 0;
+             cmp = strcoll (l->u.s, r->u.s);
+
+             if (errno)
+               {
+                 error (0, errno, _("string comparison failed"));
+                 error (0, 0, _("Set LC_ALL='C' to work around the problem."));
+                 error (EXPR_FAILURE, 0,
+                        _("The strings compared were %s and %s."),
+                        quotearg_n_style (0, locale_quoting_style, l->u.s),
+                        quotearg_n_style (1, locale_quoting_style, r->u.s));
+               }
+           }
+
+         switch (fxn)
+           {
+           case less_than:     val = (cmp <  0); break;
+           case less_equal:    val = (cmp <= 0); break;
+           case equal:         val = (cmp == 0); break;
+           case not_equal:     val = (cmp != 0); break;
+           case greater_equal: val = (cmp >= 0); break;
+           case greater_than:  val = (cmp >  0); break;
+           default: abort ();
+           }
        }
+
       freev (l);
       freev (r);
-      free (collation_arg1);
       l = int_value (val);
     }
 }
Index: src/sort.c
===================================================================
RCS file: /fetish/cu/src/sort.c,v
retrieving revision 1.311
diff -p -u -r1.311 sort.c
--- src/sort.c  14 May 2005 07:58:37 -0000      1.311
+++ src/sort.c  27 May 2005 20:30:59 -0000
@@ -35,6 +35,7 @@
 #include "posixver.h"
 #include "quote.h"
 #include "stdio-safer.h"
+#include "strnumcmp.h"
 #include "unistd-safer.h"
 #include "xmemcoll.h"
 #include "xstrtol.h"
@@ -89,13 +90,10 @@ enum
     SORT_FAILURE = 2
   };
 
-#define NEGATION_SIGN   '-'
-#define NUMERIC_ZERO    '0'
-
 /* The representation of the decimal point in the current locale.  */
-static char decimal_point;
+static int decimal_point;
 
-/* Thousands separator; if CHAR_MAX + 1, then there isn't one.  */
+/* Thousands separator; if -1, then there isn't one.  */
 static int thousands_sep;
 
 /* Nonzero if the corresponding locales are hard.  */
@@ -1063,71 +1061,6 @@ fillbuf (struct buffer *buf, FILE *fp, c
     }
 }
 
-/* Compare strings A and B containing decimal fractions < 1.  Each string
-   should begin with a decimal point followed immediately by the digits
-   of the fraction.  Strings not of this form are considered to be zero. */
-
-/* The goal here, is to take two numbers a and b... compare these
-   in parallel.  Instead of converting each, and then comparing the
-   outcome.  Most likely stopping the comparison before the conversion
-   is complete.  The algorithm used, in the old sort:
-
-   Algorithm: fraccompare
-   Action   : compare two decimal fractions
-   accepts  : char *a, char *b
-   returns  : -1 if a<b, 0 if a=b, 1 if a>b.
-   implement:
-
-   if *a == decimal_point AND *b == decimal_point
-     find first character different in a and b.
-     if both are digits, return the difference *a - *b.
-     if *a is a digit
-       skip past zeros
-       if digit return 1, else 0
-     if *b is a digit
-       skip past zeros
-       if digit return -1, else 0
-   if *a is a decimal_point
-     skip past decimal_point and zeros
-     if digit return 1, else 0
-   if *b is a decimal_point
-     skip past decimal_point and zeros
-     if digit return -1, else 0
-   return 0 */
-
-static int
-fraccompare (const char *a, const char *b)
-{
-  if (*a == decimal_point && *b == decimal_point)
-    {
-      while (*++a == *++b)
-       if (! ISDIGIT (*a))
-         return 0;
-      if (ISDIGIT (*a) && ISDIGIT (*b))
-       return *a - *b;
-      if (ISDIGIT (*a))
-       goto a_trailing_nonzero;
-      if (ISDIGIT (*b))
-       goto b_trailing_nonzero;
-      return 0;
-    }
-  else if (*a++ == decimal_point)
-    {
-    a_trailing_nonzero:
-      while (*a == NUMERIC_ZERO)
-       a++;
-      return ISDIGIT (*a);
-    }
-  else if (*b++ == decimal_point)
-    {
-    b_trailing_nonzero:
-      while (*b == NUMERIC_ZERO)
-       b++;
-      return - ISDIGIT (*b);
-    }
-  return 0;
-}
-
 /* Compare strings A and B as numbers without explicitly converting them to
    machine numbers.  Comparatively slow for short strings, but asymptotically
    hideously fast. */
@@ -1135,136 +1068,12 @@ fraccompare (const char *a, const char *
 static int
 numcompare (const char *a, const char *b)
 {
-  char tmpa;
-  char tmpb;
-  int tmp;
-  size_t log_a;
-  size_t log_b;
-
-  while (blanks[to_uchar (tmpa = *a)])
+  while (blanks[to_uchar (*a)])
     a++;
-  while (blanks[to_uchar (tmpb = *b)])
+  while (blanks[to_uchar (*b)])
     b++;
 
-  if (tmpa == NEGATION_SIGN)
-    {
-      do
-       tmpa = *++a;
-      while (tmpa == NUMERIC_ZERO || tmpa == thousands_sep);
-      if (tmpb != NEGATION_SIGN)
-       {
-         if (tmpa == decimal_point)
-           do
-             tmpa = *++a;
-           while (tmpa == NUMERIC_ZERO);
-         if (ISDIGIT (tmpa))
-           return -1;
-         while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep)
-           tmpb = *++b;
-         if (tmpb == decimal_point)
-           do
-             tmpb = *++b;
-           while (tmpb == NUMERIC_ZERO);
-         return - ISDIGIT (tmpb);
-       }
-      do
-       tmpb = *++b;
-      while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep);
-
-      while (tmpa == tmpb && ISDIGIT (tmpa))
-       {
-         do
-           tmpa = *++a;
-         while (tmpa == thousands_sep);
-         do
-           tmpb = *++b;
-         while (tmpb == thousands_sep);
-       }
-
-      if ((tmpa == decimal_point && !ISDIGIT (tmpb))
-         || (tmpb == decimal_point && !ISDIGIT (tmpa)))
-       return fraccompare (b, a);
-
-      tmp = tmpb - tmpa;
-
-      for (log_a = 0; ISDIGIT (tmpa); ++log_a)
-       do
-         tmpa = *++a;
-       while (tmpa == thousands_sep);
-
-      for (log_b = 0; ISDIGIT (tmpb); ++log_b)
-       do
-         tmpb = *++b;
-       while (tmpb == thousands_sep);
-
-      if (log_a != log_b)
-       return log_a < log_b ? 1 : -1;
-
-      if (!log_a)
-       return 0;
-
-      return tmp;
-    }
-  else if (tmpb == NEGATION_SIGN)
-    {
-      do
-       tmpb = *++b;
-      while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep);
-      if (tmpb == decimal_point)
-       do
-         tmpb = *++b;
-       while (tmpb == NUMERIC_ZERO);
-      if (ISDIGIT (tmpb))
-       return 1;
-      while (tmpa == NUMERIC_ZERO || tmpa == thousands_sep)
-       tmpa = *++a;
-      if (tmpa == decimal_point)
-       do
-         tmpa = *++a;
-       while (tmpa == NUMERIC_ZERO);
-      return ISDIGIT (tmpa);
-    }
-  else
-    {
-      while (tmpa == NUMERIC_ZERO || tmpa == thousands_sep)
-       tmpa = *++a;
-      while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep)
-       tmpb = *++b;
-
-      while (tmpa == tmpb && ISDIGIT (tmpa))
-       {
-         do
-           tmpa = *++a;
-         while (tmpa == thousands_sep);
-         do
-           tmpb = *++b;
-         while (tmpb == thousands_sep);
-       }
-
-      if ((tmpa == decimal_point && !ISDIGIT (tmpb))
-         || (tmpb == decimal_point && !ISDIGIT (tmpa)))
-       return fraccompare (a, b);
-
-      tmp = tmpa - tmpb;
-
-      for (log_a = 0; ISDIGIT (tmpa); ++log_a)
-       do
-         tmpa = *++a;
-       while (tmpa == thousands_sep);
-
-      for (log_b = 0; ISDIGIT (tmpb); ++log_b)
-       do
-         tmpb = *++b;
-       while (tmpb == thousands_sep);
-
-      if (log_a != log_b)
-       return log_a < log_b ? -1 : 1;
-
-      if (!log_a)
-       return 0;
-
-      return tmp;
-    }
+  return strnumcmp (a, b, decimal_point, thousands_sep);
 }
 
 static int
@@ -2325,14 +2134,14 @@ main (int argc, char **argv)
     /* If the locale doesn't define a decimal point, or if the decimal
        point is multibyte, use the C locale's decimal point.  FIXME:
        add support for multibyte decimal points.  */
-    decimal_point = locale->decimal_point[0];
+    decimal_point = to_uchar (locale->decimal_point[0]);
     if (! decimal_point || locale->decimal_point[1])
       decimal_point = '.';
 
     /* FIXME: add support for multibyte thousands separators.  */
-    thousands_sep = *locale->thousands_sep;
+    thousands_sep = to_uchar (*locale->thousands_sep);
     if (! thousands_sep || locale->thousands_sep[1])
-      thousands_sep = CHAR_MAX + 1;
+      thousands_sep = -1;
   }
 
   have_read_stdin = false;
Index: src/test.c
===================================================================
RCS file: /fetish/cu/src/test.c,v
retrieving revision 1.118
diff -p -u -r1.118 test.c
--- src/test.c  14 May 2005 07:58:37 -0000      1.118
+++ src/test.c  27 May 2005 20:30:59 -0000
@@ -43,14 +43,13 @@
 #include "system.h"
 #include "error.h"
 #include "euidaccess.h"
+#include "inttostr.h"
 #include "quote.h"
+#include "strnumcmp.h"
 
 #ifndef _POSIX_VERSION
 # include <sys/param.h>
 #endif /* _POSIX_VERSION */
-#define whitespace(c) (((c) == ' ') || ((c) == '\t'))
-#define digit(c)  ((c) >= '0' && (c) <= '9')
-#define digit_value(c) ((c) - '0')
 
 char *program_name;
 
@@ -131,81 +130,40 @@ beyond (void)
   test_syntax_error (_("missing argument after %s"), quote (argv[argc - 1]));
 }
 
-/* Syntax error for when an integer argument was expected, but
-   something else was found. */
-static void
-integer_expected_error (char const *pch)
-{
-  test_syntax_error (_("%s: integer expression expected\n"), pch);
-}
-
-/* Return true if the characters pointed to by STRING constitute a
-   valid number.  Stuff the converted number into RESULT if RESULT is
-   not null.  */
-static bool
-is_int (char const *string, intmax_t *result)
-{
-  int sign;
-  intmax_t value;
-  char const *orig_string;
-
-  sign = 1;
-  value = 0;
+/* If the characters pointed to by STRING constitute a valid number,
+   return a pointer to the start of the number, skipping any blanks or
+   leading '+'.  Otherwise, report an error and exit.  */
+static char const *
+find_int (char const *string)
+{
+  char const *p;
+  char const *number_start;
 
-  if (result)
-    *result = 0;
+  for (p = string; ISBLANK (to_uchar (*p)); p++)
+    continue;
 
-  /* Skip leading whitespace characters. */
-  while (whitespace (*string))
-    string++;
-
-  if (!*string)
-    return false;
-
-  /* Save a pointer to the start, for diagnostics.  */
-  orig_string = string;
-
-  /* We allow leading `-' or `+'. */
-  if (*string == '-' || *string == '+')
+  if (*p == '+')
     {
-      if (!digit (string[1]))
-       return false;
-
-      if (*string == '-')
-       sign = -1;
-
-      string++;
+      p++;
+      number_start = p;
     }
-
-  while (digit (*string))
+  else
     {
-      if (result)
-       {
-         intmax_t new_v = 10 * value + sign * (*string - '0');
-         if (0 < sign
-             ? (INTMAX_MAX / 10 < value || new_v < 0)
-             : (value < INTMAX_MIN / 10 || 0 < new_v))
-           test_syntax_error ((0 < sign
-                               ? _("integer is too large: %s\n")
-                               : _("integer is too small: %s\n")),
-                              orig_string);
-         value = new_v;
-       }
-      string++;
+      number_start = p;
+      p += (*p == '-');
     }
 
-  /* Skip trailing whitespace, if any. */
-  while (whitespace (*string))
-    string++;
-
-  /* Error if not at end of string. */
-  if (*string)
-    return false;
-
-  if (result)
-    *result = value;
+  if (ISDIGIT (*p++))
+    {
+      while (ISDIGIT (*p))
+       p++;
+      while (ISBLANK (to_uchar (*p)))
+       p++;
+      if (!*p)
+       return number_start;
+    }
 
-  return true;
+  test_syntax_error (_("invalid integer %s\n"), quote (string));
 }
 
 /* Find the modification time of FILE, and stuff it into *AGE.
@@ -317,7 +275,6 @@ binary_operator (bool l_is_l)
 {
   int op;
   struct stat stat_buf, stat_spare;
-  intmax_t l, r;
   /* Is the right integer expression of the form '-l string'? */
   bool r_is_l;
 
@@ -336,100 +293,33 @@ binary_operator (bool l_is_l)
   if (argv[op][0] == '-')
     {
       /* check for eq, nt, and stuff */
+      if ((((argv[op][1] == 'l' || argv[op][1] == 'g')
+           && (argv[op][2] == 'e' || argv[op][2] == 't'))
+          || (argv[op][1] == 'e' && argv[op][2] == 'q')
+          || (argv[op][1] == 'n' && argv[op][2] == 'e'))
+         && !argv[op][3])
+       {
+         char lbuf[INT_BUFSIZE_BOUND (uintmax_t)];
+         char rbuf[INT_BUFSIZE_BOUND (uintmax_t)];
+         char const *l = (l_is_l
+                          ? umaxtostr (strlen (argv[op - 1]), lbuf)
+                          : find_int (argv[op - 1]));
+         char const *r = (r_is_l
+                          ? umaxtostr (strlen (argv[op + 2]), rbuf)
+                          : find_int (argv[op + 1]));
+         int cmp = strintcmp (l, r);
+         bool xe_operator = (argv[op][2] == 'e');
+         pos += 3;
+         return (argv[op][1] == 'l' ? cmp < xe_operator
+                 : argv[op][1] == 'g' ? cmp > - xe_operator
+                 : (cmp != 0) == xe_operator);
+       }
+
       switch (argv[op][1])
        {
        default:
          break;
 
-       case 'l':
-         if (argv[op][2] == 't' && !argv[op][3])
-           {
-             /* lt */
-             if (l_is_l)
-               l = strlen (argv[op - 1]);
-             else
-               {
-                 if (!is_int (argv[op - 1], &l))
-                   integer_expected_error (_("before -lt"));
-               }
-
-             if (r_is_l)
-               r = strlen (argv[op + 2]);
-             else
-               {
-                 if (!is_int (argv[op + 1], &r))
-                   integer_expected_error (_("after -lt"));
-               }
-             pos += 3;
-             return l < r;
-           }
-
-         if (argv[op][2] == 'e' && !argv[op][3])
-           {
-             /* le */
-             if (l_is_l)
-               l = strlen (argv[op - 1]);
-             else
-               {
-                 if (!is_int (argv[op - 1], &l))
-                   integer_expected_error (_("before -le"));
-               }
-             if (r_is_l)
-               r = strlen (argv[op + 2]);
-             else
-               {
-                 if (!is_int (argv[op + 1], &r))
-                   integer_expected_error (_("after -le"));
-               }
-             pos += 3;
-             return l <= r;
-           }
-         break;
-
-       case 'g':
-         if (argv[op][2] == 't' && !argv[op][3])
-           {
-             /* gt integer greater than */
-             if (l_is_l)
-               l = strlen (argv[op - 1]);
-             else
-               {
-                 if (!is_int (argv[op - 1], &l))
-                   integer_expected_error (_("before -gt"));
-               }
-             if (r_is_l)
-               r = strlen (argv[op + 2]);
-             else
-               {
-                 if (!is_int (argv[op + 1], &r))
-                   integer_expected_error (_("after -gt"));
-               }
-             pos += 3;
-             return l > r;
-           }
-
-         if (argv[op][2] == 'e' && !argv[op][3])
-           {
-             /* ge - integer greater than or equal to */
-             if (l_is_l)
-               l = strlen (argv[op - 1]);
-             else
-               {
-                 if (!is_int (argv[op - 1], &l))
-                   integer_expected_error (_("before -ge"));
-               }
-             if (r_is_l)
-               r = strlen (argv[op + 2]);
-             else
-               {
-                 if (!is_int (argv[op + 1], &r))
-                   integer_expected_error (_("after -ge"));
-               }
-             pos += 3;
-             return l >= r;
-           }
-         break;
-
        case 'n':
          if (argv[op][2] == 't' && !argv[op][3])
            {
@@ -444,51 +334,9 @@ binary_operator (bool l_is_l)
              re = age_of (argv[op + 1], &rt);
              return le > re || (le && lt > rt);
            }
-
-         if (argv[op][2] == 'e' && !argv[op][3])
-           {
-             /* ne - integer not equal */
-             if (l_is_l)
-               l = strlen (argv[op - 1]);
-             else
-               {
-                 if (!is_int (argv[op - 1], &l))
-                   integer_expected_error (_("before -ne"));
-               }
-             if (r_is_l)
-               r = strlen (argv[op + 2]);
-             else
-               {
-                 if (!is_int (argv[op + 1], &r))
-                   integer_expected_error (_("after -ne"));
-               }
-             pos += 3;
-             return l != r;
-           }
          break;
 
        case 'e':
-         if (argv[op][2] == 'q' && !argv[op][3])
-           {
-             /* eq - integer equal */
-             if (l_is_l)
-               l = strlen (argv[op - 1]);
-             else
-               {
-                 if (!is_int (argv[op - 1], &l))
-                   integer_expected_error (_("before -eq"));
-               }
-             if (r_is_l)
-               r = strlen (argv[op + 2]);
-             else
-               {
-                 if (!is_int (argv[op + 1], &r))
-                   integer_expected_error (_("after -eq"));
-               }
-             pos += 3;
-             return l == r;
-           }
-
          if (argv[op][2] == 'f' && !argv[op][3])
            {
              /* ef - hard link? */
@@ -645,11 +493,13 @@ unary_operator (void)
 
     case 't':                  /* File (fd) is a terminal? */
       {
-       intmax_t fd;
+       long int fd;
+       char const *arg;
        unary_advance ();
-       if (!is_int (argv[pos - 1], &fd))
-         integer_expected_error (_("after -t"));
-       return INT_MIN <= fd && fd <= INT_MAX && isatty (fd);
+       arg = find_int (argv[pos - 1]);
+       errno = 0;
+       fd = strtol (arg, NULL, 10);
+       return (errno != ERANGE && 0 <= fd && fd <= INT_MAX && isatty (fd));
       }
 
     case 'n':                  /* True if arg has some length. */
Index: m4/prereq.m4
===================================================================
RCS file: /fetish/cu/m4/prereq.m4,v
retrieving revision 1.112
diff -p -u -r1.112 prereq.m4
--- m4/prereq.m4        18 May 2005 19:31:47 -0000      1.112
+++ m4/prereq.m4        27 May 2005 20:30:59 -0000
@@ -1,4 +1,4 @@
-#serial 55
+#serial 56
 
 dnl We use gl_ for non Autoconf macros.
 m4_pattern_forbid([^gl_[ABCDEFGHIJKLMNOPQRSTUVXYZ]])dnl
@@ -131,6 +131,8 @@ AC_DEFUN([gl_PREREQ],
   AC_REQUIRE([gl_STAT_MACROS])
   AC_REQUIRE([gl_STDIO_SAFER])
   AC_REQUIRE([gl_STRCASE])
+  AC_REQUIRE([gl_STRINTCMP])
+  AC_REQUIRE([gl_STRNUMCMP])
   AC_REQUIRE([gl_STRIPSLASH])
   AC_REQUIRE([gl_TIMESPEC])
   AC_REQUIRE([gl_UNICODEIO])
--- /dev/null   2003-03-18 13:55:57 -0800
+++ lib/strnumcmp-in.h  2005-05-27 10:54:03 -0700
@@ -0,0 +1,241 @@
+/* Compare numeric strings.  This is an internal include file.
+
+   Copyright (C) 1988, 1991, 1992, 1993, 1995, 1996, 1998, 1999, 2000,
+   2003, 2004, 2005 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+/* Written by Mike Haertel.  */
+
+#include "strnumcmp.h"
+
+#include <stddef.h>
+
+#define NEGATION_SIGN   '-'
+#define NUMERIC_ZERO    '0'
+
+/* ISDIGIT differs from isdigit, as follows:
+   - Its arg may be any int or unsigned int; it need not be an unsigned char.
+   - It's guaranteed to evaluate its argument exactly once.
+   - It's typically faster.
+   POSIX says that only '0' through '9' are digits.  Prefer ISDIGIT to
+   ISDIGIT_LOCALE unless it's important to use the locale's definition
+   of `digit' even when the host does not conform to POSIX.  */
+#define ISDIGIT(c) ((unsigned int) (c) - '0' <= 9)
+
+
+/* Compare strings A and B containing decimal fractions < 1.
+   DECIMAL_POINT is the decimal point.  Each string
+   should begin with a decimal point followed immediately by the digits
+   of the fraction.  Strings not of this form are treated as zero.  */
+
+/* The goal here, is to take two numbers a and b... compare these
+   in parallel.  Instead of converting each, and then comparing the
+   outcome.  Most likely stopping the comparison before the conversion
+   is complete.  The algorithm used, in the old "sort" utility:
+
+   Algorithm: fraccompare
+   Action   : compare two decimal fractions
+   accepts  : char *a, char *b
+   returns  : -1 if a<b, 0 if a=b, 1 if a>b.
+   implement:
+
+   if *a == decimal_point AND *b == decimal_point
+     find first character different in a and b.
+     if both are digits, return the difference *a - *b.
+     if *a is a digit
+       skip past zeros
+       if digit return 1, else 0
+     if *b is a digit
+       skip past zeros
+       if digit return -1, else 0
+   if *a is a decimal_point
+     skip past decimal_point and zeros
+     if digit return 1, else 0
+   if *b is a decimal_point
+     skip past decimal_point and zeros
+     if digit return -1, else 0
+   return 0 */
+
+static inline int
+fraccompare (char const *a, char const *b, char decimal_point)
+{
+  if (*a == decimal_point && *b == decimal_point)
+    {
+      while (*++a == *++b)
+       if (! ISDIGIT (*a))
+         return 0;
+      if (ISDIGIT (*a) && ISDIGIT (*b))
+       return *a - *b;
+      if (ISDIGIT (*a))
+       goto a_trailing_nonzero;
+      if (ISDIGIT (*b))
+       goto b_trailing_nonzero;
+      return 0;
+    }
+  else if (*a++ == decimal_point)
+    {
+    a_trailing_nonzero:
+      while (*a == NUMERIC_ZERO)
+       a++;
+      return ISDIGIT (*a);
+    }
+  else if (*b++ == decimal_point)
+    {
+    b_trailing_nonzero:
+      while (*b == NUMERIC_ZERO)
+       b++;
+      return - ISDIGIT (*b);
+    }
+  return 0;
+}
+
+/* Compare strings A and B as numbers without explicitly converting
+   them to machine numbers, to avoid overflow problems and perhaps
+   improve performance.  DECIMAL_POINT is the decimal point and
+   THOUSANDS_SEP the thousands separator.  A DECIMAL_POINT of -1
+   causes comparisons to act as if there is no decimal point
+   character, and likewise for THOUSANDS_SEP.  */
+
+static inline int
+numcompare (char const *a, char const *b,
+           int decimal_point, int thousands_sep)
+{
+  unsigned char tmpa = *a;
+  unsigned char tmpb = *b;
+  int tmp;
+  size_t log_a;
+  size_t log_b;
+
+  if (tmpa == NEGATION_SIGN)
+    {
+      do
+       tmpa = *++a;
+      while (tmpa == NUMERIC_ZERO || tmpa == thousands_sep);
+      if (tmpb != NEGATION_SIGN)
+       {
+         if (tmpa == decimal_point)
+           do
+             tmpa = *++a;
+           while (tmpa == NUMERIC_ZERO);
+         if (ISDIGIT (tmpa))
+           return -1;
+         while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep)
+           tmpb = *++b;
+         if (tmpb == decimal_point)
+           do
+             tmpb = *++b;
+           while (tmpb == NUMERIC_ZERO);
+         return - ISDIGIT (tmpb);
+       }
+      do
+       tmpb = *++b;
+      while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep);
+
+      while (tmpa == tmpb && ISDIGIT (tmpa))
+       {
+         do
+           tmpa = *++a;
+         while (tmpa == thousands_sep);
+         do
+           tmpb = *++b;
+         while (tmpb == thousands_sep);
+       }
+
+      if ((tmpa == decimal_point && !ISDIGIT (tmpb))
+         || (tmpb == decimal_point && !ISDIGIT (tmpa)))
+       return fraccompare (b, a, decimal_point);
+
+      tmp = tmpb - tmpa;
+
+      for (log_a = 0; ISDIGIT (tmpa); ++log_a)
+       do
+         tmpa = *++a;
+       while (tmpa == thousands_sep);
+
+      for (log_b = 0; ISDIGIT (tmpb); ++log_b)
+       do
+         tmpb = *++b;
+       while (tmpb == thousands_sep);
+
+      if (log_a != log_b)
+       return log_a < log_b ? 1 : -1;
+
+      if (!log_a)
+       return 0;
+
+      return tmp;
+    }
+  else if (tmpb == NEGATION_SIGN)
+    {
+      do
+       tmpb = *++b;
+      while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep);
+      if (tmpb == decimal_point)
+       do
+         tmpb = *++b;
+       while (tmpb == NUMERIC_ZERO);
+      if (ISDIGIT (tmpb))
+       return 1;
+      while (tmpa == NUMERIC_ZERO || tmpa == thousands_sep)
+       tmpa = *++a;
+      if (tmpa == decimal_point)
+       do
+         tmpa = *++a;
+       while (tmpa == NUMERIC_ZERO);
+      return ISDIGIT (tmpa);
+    }
+  else
+    {
+      while (tmpa == NUMERIC_ZERO || tmpa == thousands_sep)
+       tmpa = *++a;
+      while (tmpb == NUMERIC_ZERO || tmpb == thousands_sep)
+       tmpb = *++b;
+
+      while (tmpa == tmpb && ISDIGIT (tmpa))
+       {
+         do
+           tmpa = *++a;
+         while (tmpa == thousands_sep);
+         do
+           tmpb = *++b;
+         while (tmpb == thousands_sep);
+       }
+
+      if ((tmpa == decimal_point && !ISDIGIT (tmpb))
+         || (tmpb == decimal_point && !ISDIGIT (tmpa)))
+       return fraccompare (a, b, decimal_point);
+
+      tmp = tmpa - tmpb;
+
+      for (log_a = 0; ISDIGIT (tmpa); ++log_a)
+       do
+         tmpa = *++a;
+       while (tmpa == thousands_sep);
+
+      for (log_b = 0; ISDIGIT (tmpb); ++log_b)
+       do
+         tmpb = *++b;
+       while (tmpb == thousands_sep);
+
+      if (log_a != log_b)
+       return log_a < log_b ? -1 : 1;
+
+      if (!log_a)
+       return 0;
+
+      return tmp;
+    }
+}
--- /dev/null   2003-03-18 13:55:57 -0800
+++ lib/strnumcmp.c     2005-05-27 10:51:23 -0700
@@ -0,0 +1,30 @@
+/* Compare numeric strings.
+
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+/* Written by Paul Eggert.  */
+
+#include "strnumcmp-in.h"
+
+/* Externally-visible name for numcompare.  */
+
+int
+strnumcmp (char const *a, char const *b,
+          int decimal_point, int thousands_sep)
+{
+  return numcompare (a, b, decimal_point, thousands_sep);
+}
--- /dev/null   2003-03-18 13:55:57 -0800
+++ lib/strnumcmp.h     2005-05-27 10:44:39 -0700
@@ -0,0 +1,2 @@
+int strintcmp (char const *, char const *);
+int strnumcmp (char const *, char const *, int, int);
--- /dev/null   2003-03-18 13:55:57 -0800
+++ lib/strintcmp.c     2005-05-27 10:53:44 -0700
@@ -0,0 +1,31 @@
+/* Compare integer strings.
+
+   Copyright (C) 2005 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software Foundation,
+   Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  */
+
+/* Written by Paul Eggert.  */
+
+#include "strnumcmp-in.h"
+
+/* Compare strings A and B as integers without explicitly converting
+   them to machine numbers, to avoid overflow problems and perhaps
+   improve performance.  */
+
+int
+strintcmp (char const *a, char const *b)
+{
+  return numcompare (a, b, -1, -1);
+}
--- /dev/null   2003-03-18 13:55:57 -0800
+++ m4/strnumcmp.m4     2005-05-27 12:09:35 -0700
@@ -0,0 +1,27 @@
+# Compare numeric strings.
+
+dnl Copyright (C) 2005 Free Software Foundation, Inc.
+
+dnl This file is free software; the Free Software Foundation
+dnl gives unlimited permission to copy and/or distribute it,
+dnl with or without modifications, as long as this notice is preserved.
+
+dnl Written by Paul Eggert.
+
+AC_DEFUN([gl_STRINTCMP],
+[
+  AC_LIBSOURCES([strintcmp.c, strnumcmp.h, strnumcmp-in.h])
+  AC_LIBOBJ([strintcmp])
+
+  dnl Prerequisites of lib/strintcmp.c.
+  AC_REQUIRE([AC_INLINE])
+])
+
+AC_DEFUN([gl_STRNUMCMP],
+[
+  AC_LIBSOURCES([strnumcmp.c, strnumcmp.h, strnumcmp-in.h])
+  AC_LIBOBJ([strnumcmp])
+
+  dnl Prerequisites of lib/strnumcmp.c.
+  AC_REQUIRE([AC_INLINE])
+])
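
For readers skimming the patch, the idea behind strintcmp can be sketched
outside the diff.  This is a simplified illustration under stated
assumptions, not the code in strnumcmp-in.h: it accepts only an optional
leading '-' followed by decimal digits, it ignores thousands separators
and fractions, and the names sketch_intcmp, magnitude_cmp, skip_zeros,
digit_count and sketch_selfcheck are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: skip leading '0' characters.  */
static char const *
skip_zeros (char const *s)
{
  while (*s == '0')
    s++;
  return s;
}

/* Hypothetical helper: count the decimal digits at the start of S.  */
static size_t
digit_count (char const *s)
{
  size_t n = 0;
  while ('0' <= s[n] && s[n] <= '9')
    n++;
  return n;
}

/* Compare the magnitudes of two unsigned digit strings.  After leading
   zeros are dropped, the longer digit run is the larger number; runs of
   equal length compare like ordinary strings.  */
static int
magnitude_cmp (char const *a, char const *b)
{
  a = skip_zeros (a);
  b = skip_zeros (b);
  size_t la = digit_count (a);
  size_t lb = digit_count (b);
  if (la != lb)
    return la < lb ? -1 : 1;
  int c = strncmp (a, b, la);
  return (c > 0) - (c < 0);
}

/* Compare A and B as integers of any length without converting them.
   Return -1, 0 or 1.  A simplified stand-in for strintcmp.  */
int
sketch_intcmp (char const *a, char const *b)
{
  int a_neg = *a == '-';
  int b_neg = *b == '-';
  a += a_neg;
  b += b_neg;

  /* Treat "-0" as zero, so that it compares equal to "0".  */
  if (a_neg && digit_count (skip_zeros (a)) == 0)
    a_neg = 0;
  if (b_neg && digit_count (skip_zeros (b)) == 0)
    b_neg = 0;

  if (a_neg != b_neg)
    return a_neg ? -1 : 1;

  int m = magnitude_cmp (a, b);
  return a_neg ? -m : m;
}

/* A few cases, including one beyond the range of a 64-bit integer.  */
void
sketch_selfcheck (void)
{
  assert (sketch_intcmp ("9223372036854775807", "9223372036854775808") < 0);
  assert (sketch_intcmp ("007", "7") == 0);
  assert (sketch_intcmp ("-10", "-9") < 0);
  assert (sketch_intcmp ("-5", "3") < 0);
}
```

Because only digit counts and characters are compared, no value is ever
stored in a machine integer, so operands such as 9223372036854775808
cannot overflow.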


