Re: Optimize three-valued comparison between integers

2020-07-24 Thread Bruno Haible
Florian Weimer wrote:
> On s390x, all three variants use conditional
> branches, but the first one is the shortest.

Indeed. Surprising.

> There is also this:
> 
> int sign4 (long n1, long n2)
> {
>   return (char) ((n1 > n2) - (n1 < n2));
> }

The case should be towards (signed char). Otherwise, on arm-linux-gnueabi
and powerpc-linux-gnu the 3 values are 0, 1, 255, not 0, 1, -1.

> It's one instruction shorter on x86-64 than sign3, but it's worse on
> other architectures.  (One of the x86-64 quirks is that conditional sets
> do not affect the full register, only a byte.)

Indeed. But IMO, the conditional jumps matter more than an immediate
instruction.

Bruno




doc: update for more recent versions of Cygwin and Mac OS X

2020-07-24 Thread Bruno Haible
The Gnulib documentation regarding Cygwin and Mac OS X is out-of-date:
Cygwin 1.7.x and Mac OS X 10.5 are hardly in use any more.

With these patches, I'm updating it to Cygwin 2.9.0 and Mac OS X 10.13.
Still not the newest (they are from 2017), but better than before.


2020-07-24  Bruno Haible  

doc: Update for Mac OS X 10.13.
* doc/*/*.texi: Update.
* m4/expm1l.m4: Update comments.
* m4/getgroups.m4: Likewise.
* m4/getlogin_r.m4: Likewise.
* m4/linkat.m4: Likewise.
* m4/printf.m4: Likewise.

2020-07-24  Bruno Haible  

doc: Update for Cygwin 2.9.0.
* doc/*/*.texi: Update.



0001-doc-Update-for-Cygwin-2.9.0.patch.gz
Description: application/gzip


0002-doc-Update-for-Mac-OS-X-10.13.patch.gz
Description: application/gzip


[PATCH] parse-datetime: modernize doc

2020-07-24 Thread Paul Eggert
* doc/parse-datetime.texi: Use more-current examples.
Don’t lead with 32-bit time_t, as it’s on its way out.
Capitalize “Epoch” to be consistent with POSIX.
---
 ChangeLog   |  5 +++
 doc/parse-datetime.texi | 97 -
 2 files changed, 53 insertions(+), 49 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 1f46cca66..f56249b17 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2020-07-24  Paul Eggert  
 
+   parse-datetime: modernize doc
+   * doc/parse-datetime.texi: Use more-current examples.
+   Don’t lead with 32-bit time_t, as it’s on its way out.
+   Capitalize “Epoch” to be consistent with POSIX.
+
timespec: remove dependence on ‘verify’
* lib/timespec.h: Do not include verify.h; no longer needed.
* modules/timespec (Depends-on): Remove ‘verify’.
diff --git a/doc/parse-datetime.texi b/doc/parse-datetime.texi
index c435ea4c2..f7e368514 100644
--- a/doc/parse-datetime.texi
+++ b/doc/parse-datetime.texi
@@ -46,17 +46,17 @@ arguments to the various programs.  The C interface (via the
 @code{parse_datetime} function) is not described here.
 
 @menu
-* General date syntax::Common rules.
-* Calendar date items::19 Dec 1994.
-* Time of day items::  9:20pm.
-* Time zone items::EST, PDT, UTC, @dots{}
-* Combined date and time of day items:: 1972-09-24T20:02:00,00-0500.
-* Day of week items::  Monday and others.
-* Relative items in date strings:: next tuesday, 2 years ago.
-* Pure numbers in date strings::   19931219, 1440.
-* Seconds since the Epoch::@@1078100502.
-* Specifying time zone rules:: TZ="America/New_York", TZ="UTC0".
-* Authors of parse_datetime::  Bellovin, Eggert, Salz, Berets, et al.
+* General date syntax::  Common rules
+* Calendar date items::  21 Jul 2020
+* Time of day items::9:20pm
+* Time zone items::  UTC, -0700, +0900, @dots{}
+* Combined date and time of day items:: 2020-07-21T20:02:00,00-0400
+* Day of week items::Monday and others
+* Relative items in date strings:: next tuesday, 2 years ago
+* Pure numbers in date strings:: 20200721, 1440
+* Seconds since the Epoch::  @@1595289600
+* Specifying time zone rules::   TZ="America/New_York", TZ="UTC0"
+* Authors of parse_datetime::Bellovin, Eggert, Salz, Berets, et al.
 @end menu
 
 
@@ -124,17 +124,17 @@ ways to do this:
 
 @example
 $ LC_ALL=C TZ=UTC0 date
-Mon Mar  1 00:21:42 UTC 2004
+Tue Jul 21 23:00:37 UTC 2020
 $ TZ=UTC0 date +'%Y-%m-%d %H:%M:%SZ'
-2004-03-01 00:21:42Z
+2020-07-21 23:00:37Z
 $ date --rfc-3339=ns  # --rfc-3339 is a GNU extension.
-2004-02-29 16:21:42.692722128-08:00
+2020-07-21 19:00:37.692722128-04:00
 $ date --rfc-2822  # a GNU extension
-Sun, 29 Feb 2004 16:21:42 -0800
+Tue, 21 Jul 2020 19:00:37 -0400
 $ date +'%Y-%m-%d %H:%M:%S %z'  # %z is a GNU extension.
-2004-02-29 16:21:42 -0800
+2020-07-21 19:00:37 -0400
 $ date +'@@%s.%N'  # %s and %N are GNU extensions.
-@@1078100502.692722128
+@@1595372437.692722128
 @end example
 
 @cindex case, ignored in dates
@@ -145,7 +145,7 @@ nested.  Hyphens not followed by a digit are currently 
ignored.  Leading
 zeros on numbers are ignored.
 
 @cindex leap seconds
-Invalid dates like @samp{2005-02-29} or times like @samp{24:00} are
+Invalid dates like @samp{2019-02-29} or times like @samp{24:00} are
 rejected.  In the typical case of a host that does not support leap
 seconds, a time like @samp{23:59:60} is rejected even if it
 corresponds to a valid leap second.
@@ -161,25 +161,23 @@ specified differently, depending on whether the month is 
specified
 numerically or literally.  All these strings specify the same calendar date:
 
 @example
-1972-09-24 # ISO 8601.
-72-9-24# Assume 19xx for 69 through 99,
-   # 20xx for 00 through 68.
-72-09-24   # Leading zeros are ignored.
-9/24/72# Common U.S. writing.
-24 September 1972
-24 Sept 72 # September has a special abbreviation.
-24 Sep 72  # Three-letter abbreviations always allowed.
-Sep 24, 1972
-24-sep-72
-24sep72
+2020-07-20 # ISO 8601.
+20-7-20# Assume 19xx for 69 through 99,
+   # 20xx for 00 through 68 (not recommended).
+7/20/2020  # Common U.S. writing.
+20 July 2020
+20 Jul 2020# Three-letter abbreviations always allowed.
+Jul 20, 2020
+20-jul-2020
+20jul2020
 @end example
 
 The year can also be omitted.  In this case, the last specified year is
 used, or the current year if none.  For example:
 
 @example
-9/24
-sep 24
+7/20
+jul 20
 @end example
 
 Here are the rules.
@@ -432,14 +430,14 @@ where the clocks were adjusted, typically for daylight 
saving time,
 the resulting date and time are adjusted accordingly.
 
 The fuzz in units can cause problems with relative items.  For
-example, @samp{2003-07-31 -1 month} might evaluate to 2003-07-01,
-because 2003-06-31 is an invalid date.  To determine 

[PATCH] timespec: remove dependence on ‘verify’

2020-07-24 Thread Paul Eggert
* lib/timespec.h: Do not include verify.h; no longer needed.
* modules/timespec (Depends-on): Remove ‘verify’.
---
 ChangeLog| 4 
 lib/timespec.h   | 1 -
 modules/timespec | 1 -
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b922dca63..1f46cca66 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
 2020-07-24  Paul Eggert  
 
+   timespec: remove dependence on ‘verify’
+   * lib/timespec.h: Do not include verify.h; no longer needed.
+   * modules/timespec (Depends-on): Remove ‘verify’.
+
Optimize a few more three-valued comparisons
* lib/timespec.h (timespec_cmp, timespec_sign):
* lib/utimecmp.c (utimecmpat):
diff --git a/lib/timespec.h b/lib/timespec.h
index f88f1e47a..dc999f944 100644
--- a/lib/timespec.h
+++ b/lib/timespec.h
@@ -34,7 +34,6 @@ extern "C" {
 #endif
 
 #include "arg-nonnull.h"
-#include "verify.h"
 
 /* Inverse resolution of timespec timestamps (in units per second),
and log base 10 of the inverse resolution.  */
diff --git a/modules/timespec b/modules/timespec
index 956cfea22..6fac24c02 100644
--- a/modules/timespec
+++ b/modules/timespec
@@ -10,7 +10,6 @@ Depends-on:
 extern-inline
 snippet/arg-nonnull
 time
-verify
 
 configure.ac:
 gl_TIMESPEC
-- 
2.25.4




[PATCH 2/2] Optimize a few more three-valued comparisons

2020-07-24 Thread Paul Eggert
* lib/timespec.h (timespec_cmp, timespec_sign):
* lib/utimecmp.c (utimecmpat):
Avoid conditional branches by using _GL_CMP.
---
 ChangeLog  |  5 +
 lib/timespec.h | 40 +++-
 lib/utimecmp.c |  6 ++
 3 files changed, 10 insertions(+), 41 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 01577c9bb..b922dca63 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,10 @@
 2020-07-24  Paul Eggert  
 
+   Optimize a few more three-valued comparisons
+   * lib/timespec.h (timespec_cmp, timespec_sign):
+   * lib/utimecmp.c (utimecmpat):
+   Avoid conditional branches by using _GL_CMP.
+
Fix _GL_CMP parenthesization typo
* m4/gnulib-common.m4 (_GL_CMP): Properly parenthesize.
 
diff --git a/lib/timespec.h b/lib/timespec.h
index 02684ce6e..f88f1e47a 100644
--- a/lib/timespec.h
+++ b/lib/timespec.h
@@ -59,46 +59,12 @@ make_timespec (time_t s, long int ns)
   return r;
 }
 
-/* Return negative, zero, positive if A < B, A == B, A > B, respectively.
-
-   For each timestamp T, this code assumes that either:
-
- * T.tv_nsec is in the range 0..9; or
- * T.tv_sec corresponds to a valid leap second on a host that supports
-   leap seconds, and T.tv_nsec is in the range 10..19; or
- * T.tv_sec is the minimum time_t value and T.tv_nsec is -1; or
-   T.tv_sec is the maximum time_t value and T.tv_nsec is 20.
-   This allows for special struct timespec values that are less or
-   greater than all possible valid timestamps.
-
-   In all these cases, it is safe to subtract two tv_nsec values and
-   convert the result to integer without worrying about overflow on
-   any platform of interest to the GNU project, since all such
-   platforms have 32-bit int or wider.
-
-   Replacing "a.tv_nsec - b.tv_nsec" with something like
-   "a.tv_nsec < b.tv_nsec ? -1 : a.tv_nsec > b.tv_nsec" would cause
-   this function to work in some cases where the above assumption is
-   violated, but not in all cases (e.g., a.tv_sec==1, a.tv_nsec==-2,
-   b.tv_sec==0, b.tv_nsec==9) and is arguably not worth the
-   extra instructions.  Using a subtraction has the advantage of
-   detecting some invalid cases on platforms that detect integer
-   overflow.  */
+/* Return negative, zero, positive if A < B, A == B, A > B, respectively.  */
 
 _GL_TIMESPEC_INLINE int _GL_ATTRIBUTE_PURE
 timespec_cmp (struct timespec a, struct timespec b)
 {
-  if (a.tv_sec < b.tv_sec)
-return -1;
-  if (a.tv_sec > b.tv_sec)
-return 1;
-
-  /* Pacify gcc -Wstrict-overflow (bleeding-edge circa 2017-10-02).  See:
- https://lists.gnu.org/r/bug-gnulib/2017-10/msg6.html  */
-  assume (-1 <= a.tv_nsec && a.tv_nsec <= 2 * TIMESPEC_HZ);
-  assume (-1 <= b.tv_nsec && b.tv_nsec <= 2 * TIMESPEC_HZ);
-
-  return a.tv_nsec - b.tv_nsec;
+  return 2 * _GL_CMP (a.tv_sec, b.tv_sec) + _GL_CMP (a.tv_nsec, b.tv_nsec);
 }
 
 /* Return -1, 0, 1, depending on the sign of A.  A.tv_nsec must be
@@ -106,7 +72,7 @@ timespec_cmp (struct timespec a, struct timespec b)
 _GL_TIMESPEC_INLINE int _GL_ATTRIBUTE_PURE
 timespec_sign (struct timespec a)
 {
-  return a.tv_sec < 0 ? -1 : a.tv_sec || a.tv_nsec;
+  return _GL_CMP (a.tv_sec, 0) + (!a.tv_sec & !!a.tv_nsec);
 }
 
 struct timespec timespec_add (struct timespec, struct timespec)
diff --git a/lib/utimecmp.c b/lib/utimecmp.c
index 600d76d90..341c68508 100644
--- a/lib/utimecmp.c
+++ b/lib/utimecmp.c
@@ -396,8 +396,6 @@ utimecmpat (int dfd, char const *dst_name,
 }
 
   /* Compare the timestamps and return -1, 0, 1 accordingly.  */
-  return (dst_s < src_s ? -1
-  : dst_s > src_s ? 1
-  : dst_ns < src_ns ? -1
-  : dst_ns > src_ns);
+  return (_GL_CMP (dst_s, src_s)
+  + ((dst_s == src_s ? ~0 : 0) & _GL_CMP (dst_ns, src_ns)));
 }
-- 
2.25.4




[PATCH 1/2] Fix _GL_CMP parenthesization typo

2020-07-24 Thread Paul Eggert
* m4/gnulib-common.m4 (_GL_CMP): Properly parenthesize.
---
 ChangeLog   | 5 +
 m4/gnulib-common.m4 | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 23e809982..01577c9bb 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2020-07-24  Paul Eggert  
+
+   Fix _GL_CMP parenthesization typo
+   * m4/gnulib-common.m4 (_GL_CMP): Properly parenthesize.
+
 2020-07-24  Bruno Haible  
 
dfa: Revert breaking gawk.
diff --git a/m4/gnulib-common.m4 b/m4/gnulib-common.m4
index ba4239188..dfd4a258e 100644
--- a/m4/gnulib-common.m4
+++ b/m4/gnulib-common.m4
@@ -305,7 +305,7 @@ AC_DEFUN([gl_COMMON_BODY], [
GCC versions up to GCC 9.
The better code  (n1 > n2) - (n1 < n2)  from Hacker's Delight § 2-9
avoids conditional jumps in all GCC versions >= 3.4.  */
-#define _GL_CMP(n1, n2) ((n1) > (n2)) - ((n1) < (n2))
+#define _GL_CMP(n1, n2) (((n1) > (n2)) - ((n1) < (n2)))
 ])
   dnl Hint which direction to take regarding cross-compilation guesses:
   dnl When a user installs a program on a platform they are not intimately
-- 
2.25.4




Re: dfa.c change - please revert

2020-07-24 Thread arnold
Thanks.

A few times a week I 'git pull' to see if anything has changed
that affects gawk.

Arnold

Bruno Haible  wrote:

> Hi Arnold,
>
> > Please revert this, as it breaks compilation in gawk.
>
> This patch should do it (keeping the optimized variant of the 3-way 
> comparison).
>
> Btw, how did you notice the breakage so rapidly? Are you scanning the commits
> or the mails, or do you have a continuous integration?
>
>
> 2020-07-24  Bruno Haible  
>
>   dfa: Revert breaking gawk.
>   Reported by Arnold Robbins .
>   * lib/dfa.c (compare): Don't reference the _GL_CMP macro.
>
> diff --git a/lib/dfa.c b/lib/dfa.c
> index 1d2d404..e79d882 100644
> --- a/lib/dfa.c
> +++ b/lib/dfa.c
> @@ -2466,7 +2466,7 @@ static int
>  compare (const void *a, const void *b)
>  {
>position const *p = a, *q = b;
> -  return _GL_CMP (p->index, q->index);
> +  return (p->index > q->index) - (p->index < q->index);
>  }
>  
>  static void



Re: dfa.c change - please revert

2020-07-24 Thread Bruno Haible
Hi Arnold,

> Please revert this, as it breaks compilation in gawk.

This patch should do it (keeping the optimized variant of the 3-way comparison).

Btw, how did you notice the breakage so rapidly? Are you scanning the commits
or the mails, or do you have a continuous integration?


2020-07-24  Bruno Haible  

dfa: Revert breaking gawk.
Reported by Arnold Robbins .
* lib/dfa.c (compare): Don't reference the _GL_CMP macro.

diff --git a/lib/dfa.c b/lib/dfa.c
index 1d2d404..e79d882 100644
--- a/lib/dfa.c
+++ b/lib/dfa.c
@@ -2466,7 +2466,7 @@ static int
 compare (const void *a, const void *b)
 {
   position const *p = a, *q = b;
-  return _GL_CMP (p->index, q->index);
+  return (p->index > q->index) - (p->index < q->index);
 }
 
 static void




dfa.c change - please revert

2020-07-24 Thread Arnold Robbins
Hi.

| diff --git a/lib/dfa.c b/lib/dfa.c
| index dee7be861..1d2d40457 100644
| --- a/lib/dfa.c
| +++ b/lib/dfa.c
| @@ -2466,7 +2466,7 @@ static int
|  compare (const void *a, const void *b)
|  {
|position const *p = a, *q = b;
| -  return p->index < q->index ? -1 : p->index > q->index;
| +  return _GL_CMP (p->index, q->index);
|  }

Please revert this, as it breaks compilation in gawk.

Thanks,

Arnold



Re: Module with preprocessor utilities

2020-07-24 Thread Marc Nieper-Wißkirchen
Am Fr., 24. Juli 2020 um 10:56 Uhr schrieb Florian Weimer :

> This is the case that is unclear:
>
> double x[2];
> double *p = [1];
>
> The standard explicitly says “first element of an array”.

That's fine as well, I think. "x + 1" just points to an array of
length 1 in memory.



Re: Module with preprocessor utilities

2020-07-24 Thread Florian Weimer
* Marc Nieper-Wißkirchen:

> Am Fr., 24. Juli 2020 um 10:05 Uhr schrieb Florian Weimer 
> :
>
>> It's still a candidate for an RFE.  Martin Sebor has been working on
>> such warnings.  I'm going to talk to him.
>
> It may also be useful for optimizations because the compiler may make
> assumptions.
>
>> >> It's also undefined when you pass the address of something that is not
>> >> the first element of an array of type double to foo1.  (Undefined in the
>> >> sense that the standard does not say what happens in that case.)
>> >
>> > Yes, but a pointer to (an existing) double fulfills this requirement.
>>
>> I don't think that's how the standard defines “first element of an
>> array”.  I suspect the intent was that it the pointer has to point to an
>> array of at least the indicated number of elements, but that's not what
>> the standard says.
>
> According to ISO C, "an array type describes a contiguously allocated
> nonempty set of objects with a particular member object type, called
> the element type". Thus, given
>
> double x;
> double *p = 
>
> the pointer p points to the first element of an array (of length 1).

This is the case that is unclear:

double x[2];
double *p = [1];

The standard explicitly says “first element of an array”. 

Thanks,
Florian




Re: Module with preprocessor utilities

2020-07-24 Thread Marc Nieper-Wißkirchen
Am Fr., 24. Juli 2020 um 10:05 Uhr schrieb Florian Weimer :

> It's still a candidate for an RFE.  Martin Sebor has been working on
> such warnings.  I'm going to talk to him.

It may also be useful for optimizations because the compiler may make
assumptions.

> >> It's also undefined when you pass the address of something that is not
> >> the first element of an array of type double to foo1.  (Undefined in the
> >> sense that the standard does not say what happens in that case.)
> >
> > Yes, but a pointer to (an existing) double fulfills this requirement.
>
> I don't think that's how the standard defines “first element of an
> array”.  I suspect the intent was that it the pointer has to point to an
> array of at least the indicated number of elements, but that's not what
> the standard says.

According to ISO C, "an array type describes a contiguously allocated
nonempty set of objects with a particular member object type, called
the element type". Thus, given

double x;
double *p = 

the pointer p points to the first element of an array (of length 1).

Maybe I have misunderstood you.

Thanks,

Marc



Re: Module with preprocessor utilities

2020-07-24 Thread Florian Weimer
* Marc Nieper-Wißkirchen:

> Am Fr., 24. Juli 2020 um 08:53 Uhr schrieb Florian Weimer 
> :
>
>> * Bruno Haible:
>
>> > (This is with gcc 10.1.0.)
>> >
>> > clang, OTOH, produces warnings for both foo1 and foo2.
>> >
>> > But I won't spend time to report a GCC bug on this, because - as you said -
>> > without the ability to declare a pointer to an incomplete type or a 'void 
>> > *'
>> > as non-null, this language feature is worthless.
>
> I don't think that it is a bug in the technical sense. A C99 compiler
> does not have to warn as, in general, it is undecidable whether a
> particular argument is NULL or not.

It's still a candidate for an RFE.  Martin Sebor has been working on
such warnings.  I'm going to talk to him.

>> It's also undefined when you pass the address of something that is not
>> the first element of an array of type double to foo1.  (Undefined in the
>> sense that the standard does not say what happens in that case.)
>
> Yes, but a pointer to (an existing) double fulfills this requirement.

I don't think that's how the standard defines “first element of an
array”.  I suspect the intent was that it the pointer has to point to an
array of at least the indicated number of elements, but that's not what
the standard says.

Thanks,
Florian




Re: Module with preprocessor utilities

2020-07-24 Thread Marc Nieper-Wißkirchen
Am Fr., 24. Juli 2020 um 08:53 Uhr schrieb Florian Weimer :

> * Bruno Haible:

> > (This is with gcc 10.1.0.)
> >
> > clang, OTOH, produces warnings for both foo1 and foo2.
> >
> > But I won't spend time to report a GCC bug on this, because - as you said -
> > without the ability to declare a pointer to an incomplete type or a 'void *'
> > as non-null, this language feature is worthless.

I don't think that it is a bug in the technical sense. A C99 compiler
does not have to warn as, in general, it is undecidable whether a
particular argument is NULL or not.

For reference, this language feature is described in 6.7.5.3 (7) in
the ISO C99 document.

It's good for documenting interfaces as is the restrict qualifier, or,
rather, it would be good if ISO C allowed arrays of incomplete types
where pointers of incomplete types are allowed.

So, instead of filing a bug for GCC, it may a better use of one's time
to file a request with WG 14 to relax the language in this regard.

> It's also undefined when you pass the address of something that is not
> the first element of an array of type double to foo1.  (Undefined in the
> sense that the standard does not say what happens in that case.)

Yes, but a pointer to (an existing) double fulfills this requirement.

Marc



Re: Module with preprocessor utilities

2020-07-24 Thread Florian Weimer
* Bruno Haible:

> Also, if that's the intent of the syntax, it has been overlooked by the
> GCC developers. See:
>
>  decl.c 
> #include 
>
> extern void foo1 (double x [static 1]);
> extern void foo2 (double x []) __attribute__ ((__nonnull__(1)));
>
> int bar ()
> {
>   foo1 (NULL);
>   foo2 (NULL);
>   return 1;
> }
> =
> $ gcc -Wall -S decl.c
> decl.c: In function 'bar':
> decl.c:9:3: warning: null argument where non-null required (argument 1) 
> [-Wnonnull]
> 9 |   foo2 (NULL);
>   |   ^~~~
>
> (This is with gcc 10.1.0.)
>
> clang, OTOH, produces warnings for both foo1 and foo2.
>
> But I won't spend time to report a GCC bug on this, because - as you said -
> without the ability to declare a pointer to an incomplete type or a 'void *'
> as non-null, this language feature is worthless.

It's also undefined when you pass the address of something that is not
the first element of an array of type double to foo1.  (Undefined in the
sense that the standard does not say what happens in that case.)

Thanks,
Florian




Re: Optimize three-valued comparison between integers

2020-07-24 Thread Florian Weimer
* Bruno Haible:

> A comment in Luca Saiu 'jitter' project [1] brought my attention to 3-valued
> comparison and the book "Hacker's Delight".
>
> ===
> int sign1 (long n1, long n2)
> {
>   return (n1 > n2 ? 1 : n1 < n2 ? -1 : 0);
> }
>
> int sign2 (long n1, long n2)
> {
>   return (n1 < n2 ? -1 : n1 > n2);
> }
>
> int sign3 (long n1, long n2)
> {
>   return (n1 > n2) - (n1 < n2);
> }
> ===
>
> Which of these, do you think, generates better code? The important fact,
> nowadays, is that conditional jumps cause the CPU to stall when it has
> mispredicted the jump. So, 1 or 2 more or less ALU instructions are
> irrelevant, but conditional jumps are not.
>
> You find conditional jumps in code (x86) produced by GCC versions
>
> sign1: 4.1.2 4.2.4 4.3.6 4.4.7 4.5.4 4.6.4 4.7.3 4.8.5 4.9.4 5.5.0 7.5.0 
> 8.4.0 9.3.0 10.1.0
> sign2: 4.1.2 4.2.4 4.3.6 4.4.7 7.5.0 8.4.0 9.3.0
> sign3: 
>
> So, the third variant comes out best.

On aarch64, sign1 and sign2 are one instruction shorter, and do not
contain branches, either.  On s390x, all three variants use conditional
branches, but the first one is the shortest.

There is also this:

int sign4 (long n1, long n2)
{
  return (char) ((n1 > n2) - (n1 < n2));
}

It's one instruction shorter on x86-64 than sign3, but it's worse on
other architectures.  (One of the x86-64 quirks is that conditional sets
do not affect the full register, only a byte.)

Thanks,
Florian