Re: [PATCH] count-leading-zeros: new module

2012-08-13 Thread Bruno Haible
Hi Eric,

 I needed gcc's clz to determine the most significant bit of a
 number

This new module is redundant w.r.t. the 'integer_length',
'integer_length_l', 'integer_length_ll' modules that are in gnulib
already since last year:
For x != 0:
  count_leading_zeros (x) = sizeof x * CHAR_BIT - integer_length (x).

Personally I prefer integer_length over count_leading_zeros, because
the value of integer_length does not depend on the type size:
  integer_length_ll ((unsigned long long) x) == integer_length (x)
whereas
  count_leading_zeros_ll ((unsigned long long) x) == 32 + count_leading_zeros 
(x),
count_leading_zeros is mostly a helper routine for people who implement
multiprecision division or square root algorithms.

Ondřej Bílka wrote:
 Currently fastest way is convert to float and extract exponent. This is
 always log2(n) or log2(n)+1 which can be easily repaired.

Yes, and lib/integer_length.c already implements this trick.

Eric Blake wrote:
 Unfortunately, your implementation is not portable.  Gnulib can't afford
 to make assumptions about the layout of a floating point number, even if
 it holds true for most platforms with IEEE doubles.

The code in lib/integer_length.c uses the layout information of floating-
point numbers, after having verified that these assumptions hold
(see m4/exponentd.m4).

 +#define TEST_COUNT_LEADING_ZEROS(FUNC, TYPE, BITS, MAX, ONE)\
 +  ASSERT (FUNC (0) == BITS);\
 +  for (i = 0; i  BITS; i++)\
 +{   \
 +  ASSERT (FUNC (ONE  i) == BITS - i - 1); \
 +  for (j = 0; j  i; j++)   \
 +ASSERT (FUNC ((ONE  i) | (ONE  j)) == BITS - i - 1);\
 +}   \
 +  for (i = 0; i  1000; i++)\
 +{   \
 +  TYPE value = rand () ^ (rand ()  31  1);  \
 +  int count = 0;\
 +  for (j = 0; j  BITS; j++)\
 +if (value  (ONE  j)) \
 +  count = BITS - j - 1; \
 +  ASSERT (count == FUNC (value));   \
 +}   \
 +  ASSERT (FUNC (MAX) == 0);

I try to avoid such multi-line macros, because they are hard to
single-step under gdb. Better move this code to a separate .h file
that can be #included 3 times.

Bruno




Re: [PATCH] count-leading-zeros: new module

2012-08-11 Thread Jim Meyering
Eric Blake wrote:
 I needed gcc's clz to determine the most significant bit of a
 number (useful for things like truncating to a power of 2),
 and was surprised it is not a standardized function (the
 opposite direction of finding the least significant bit is
...
 +/* Expand the code which computes the number of leading zeros of the local
 +   variable 'x' of type TYPE (an unsigned integer type) and returns it
 +   from the current function.  */
 +#if 0  __GNUC__  3 || (__GNUC__ == 3  __GNUC_MINOR__ = 4)
 +# define COUNT_LEADING_ZEROS(BUILTIN, TYPE) \
 +  return x ? BUILTIN (x) : CHAR_BIT * sizeof x;
 +#else
 +# define COUNT_LEADING_ZEROS(BUILTIN, TYPE) \
 +  /* This condition is written so as to avoid shifting by more than \
 + 31 bits at once, and also avoids a random HP-UX cc bug.  */\
 +  verify (((TYPE) -1  31  31  2) == 0); /* TYPE has at most 64 bits */ 
 \
 +  int count = 0;\
 +  if (1  (TYPE) -1  31) { /* TYPE has more than 32 bits? */  \
 +count = count_leading_zeros_32 (x  31  1);  \
 +if (count  32) \
 +  return count; \
 +  } \
 +  return count + count_leading_zeros_32 (x);
 +
 +/* Compute and return the number of leading zeros in the least
 +   significant 32 bits of X. */
 +static inline int
 +count_leading_zeros_32 (unsigned int x)
 +{
 +  int count = 0;
 +  if (x  0xU)
 +x = 16;
 +  else
 +count += 16;
 +  if (x  0xff00)
 +x = 8;
 +  else
 +count += 8;
 +  if (x  0xf0)
 +x = 4;
 +  else
 +count += 4;
 +  if (x  0xc)
 +x = 2;
 +  else
 +count += 2;
 +  if (x  2)
 +x = 1;
 +  else
 +count++;
 +  if (!(x  1))
 +count++;
 +  return count;
 +}
 +#endif

Hi Eric,

Did you consider using a variant of the following?
It looks like it would be more efficient.

[From http://graphics.stanford.edu/~seander/bithacks.html]

  Find the log base 2 of an N-bit integer in O(lg(N)) operations
  with multiply and lookup

  uint32_t v; // find the log base 2 of 32-bit v
  int r;  // result goes here

  static const int MultiplyDeBruijnBitPosition[32] =
  {
0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
  };

  v |= v  1; // first round down[sic*] to one less than a power of 2
  v |= v  2;
  v |= v  4;
  v |= v  8;
  v |= v  16;

  r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU)  27];

  The code above computes the log base 2 of a 32-bit integer with a
  small table lookup and multiply. It requires only 13 operations,
  compared to (up to) 20 for the previous method. The purely
  table-based method requires the fewest operations, but this offers
  a reasonable compromise between table size and speed.

[*] I've just reported the error in that comment: s/down/up/



Re: [PATCH] count-leading-zeros: new module

2012-08-11 Thread Ondřej Bílka
On Sat, Aug 11, 2012 at 09:45:16AM +0200, Jim Meyering wrote:
 Eric Blake wrote:
  I needed gcc's clz to determine the most significant bit of a
  number (useful for things like truncating to a power of 2),
  and was surprised it is not a standardized function (the
  opposite direction of finding the least significant bit is
 ...
  +/* Expand the code which computes the number of leading zeros of the local
  +   variable 'x' of type TYPE (an unsigned integer type) and returns it
  +   from the current function.  */
  +#if 0  __GNUC__  3 || (__GNUC__ == 3  __GNUC_MINOR__ = 4)
  +# define COUNT_LEADING_ZEROS(BUILTIN, TYPE) \
  +  return x ? BUILTIN (x) : CHAR_BIT * sizeof x;
  +#else
  +# define COUNT_LEADING_ZEROS(BUILTIN, TYPE) \
  +  /* This condition is written so as to avoid shifting by more than \
  + 31 bits at once, and also avoids a random HP-UX cc bug.  */\
  +  verify (((TYPE) -1  31  31  2) == 0); /* TYPE has at most 64 bits 
  */ \
  +  int count = 0;\
  +  if (1  (TYPE) -1  31) { /* TYPE has more than 32 bits? */  \
  +count = count_leading_zeros_32 (x  31  1);  \
  +if (count  32) \
  +  return count; \
  +  } \
  +  return count + count_leading_zeros_32 (x);
  +
  +/* Compute and return the number of leading zeros in the least
  +   significant 32 bits of X. */
  +static inline int
  +count_leading_zeros_32 (unsigned int x)
  +{
  +  int count = 0;
  +  if (x  0xU)
  +x = 16;
  +  else
  +count += 16;
  +  if (x  0xff00)
  +x = 8;
  +  else
  +count += 8;
  +  if (x  0xf0)
  +x = 4;
  +  else
  +count += 4;
  +  if (x  0xc)
  +x = 2;
  +  else
  +count += 2;
  +  if (x  2)
  +x = 1;
  +  else
  +count++;
  +  if (!(x  1))
  +count++;
  +  return count;
  +}
  +#endif
 
 Hi Eric,
 
 Did you consider using a variant of the following?
 It looks like it would be more efficient.
 
 [From http://graphics.stanford.edu/~seander/bithacks.html]
 
   Find the log base 2 of an N-bit integer in O(lg(N)) operations
   with multiply and lookup
 
   uint32_t v; // find the log base 2 of 32-bit v
   int r;  // result goes here
 
   static const int MultiplyDeBruijnBitPosition[32] =
   {
 0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
 8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
   };
 
   v |= v  1; // first round down[sic*] to one less than a power of 2
   v |= v  2;
   v |= v  4;
   v |= v  8;
   v |= v  16;
 
   r = MultiplyDeBruijnBitPosition[(uint32_t)(v * 0x07C4ACDDU)  27];
 
   The code above computes the log base 2 of a 32-bit integer with a
   small table lookup and multiply. It requires only 13 operations,
   compared to (up to) 20 for the previous method. The purely
   table-based method requires the fewest operations, but this offers
   a reasonable compromise between table size and speed.
 
 [*] I've just reported the error in that comment: s/down/up/

Hi,
Currently fastest way is convert to float and extract exponent. This is
always log2(n) or log2(n)+1 which can be easily repaired.

Here is implementation:

#include stdint.h

inline int clz64(int64_t x){
  if (x0) return 0;
  union convert {uint64_t i;double f;} un;
  un.f=(double) x;
  int64_t ret= (un.i  (64-12)) - 1023;
  return 63-(ret - 1 + (xret));
}

inline int clz32(int32_t x){
  if (x0) return 0;
  union convert {uint32_t i;float f;} un;
  un.f=(float) x;
  int ret=(un.i  (32-9)) - 127;
  return 31-(ret - 1 + (xret));
}

inline int clz32_alt(uint32_t x){
  union convert {uint64_t i;double f;} un;
  un.f=(double) x;
  int ret =(un.i  (64-12)) - 1023;
  return 31 - ret;
}


 gcc messes this code a bit, it generates following code:
  
cvtsi2sdq %rdi, %xmm0
movsd %xmm0, -8(%rsp)
movq  -8(%rsp), %rcx



-- 

the printer thinks its a router.



Re: [PATCH] count-leading-zeros: new module

2012-08-11 Thread Eric Blake
On 08/11/2012 04:23 AM, Ondřej Bílka wrote:

 Did you consider using a variant of the following?
 It looks like it would be more efficient.

The fallback code is only for non-gcc compilation in the first place,
but yes, the de Bruijn table lookup does seem nicer; I'll switch over to
using it.

 Currently fastest way is convert to float and extract exponent. This is
 always log2(n) or log2(n)+1 which can be easily repaired.

Unfortunately, your implementation is not portable.  Gnulib can't afford
to make assumptions about the layout of a floating point number, even if
it holds true for most platforms with IEEE doubles.

-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] count-leading-zeros: new module

2012-08-11 Thread Eric Blake
On 08/11/2012 01:45 AM, Jim Meyering wrote:
 Eric Blake wrote:
 I needed gcc's clz to determine the most significant bit of a
 number (useful for things like truncating to a power of 2),
 and was surprised it is not a standardized function

 Did you consider using a variant of the following?
 It looks like it would be more efficient.
 
 [From http://graphics.stanford.edu/~seander/bithacks.html]
 
   Find the log base 2 of an N-bit integer in O(lg(N)) operations
   with multiply and lookup

Pushed as follows:

From d22f151db0e5217913ec1667a3b4e354dd88a973 Mon Sep 17 00:00:00 2001
From: Eric Blake ebl...@redhat.com
Date: Sat, 11 Aug 2012 07:34:00 -0600
Subject: [PATCH] count-leading-zeros: use a lookup table on non-gcc
compilers

While this only affects non-gcc compilers, we can assume that
lookups are faster than conditionals, even if it results in
a slightly larger executable size.

* lib/count-leading-zeros.h (count_leading_zeros_32): Use an
alternate implementation, suggested by Jim Meyering.

Signed-off-by: Eric Blake ebl...@redhat.com
---
 ChangeLog |  6 ++
 lib/count-leading-zeros.h | 41 -
 2 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 80d1b61..2b7090b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2012-08-11  Eric Blake  ebl...@redhat.com
+
+   count-leading-zeros: use a lookup table on non-gcc compilers
+   * lib/count-leading-zeros.h (count_leading_zeros_32): Use an
+   alternate implementation, suggested by Jim Meyering.
+
 2012-08-10  Eric Blake  ebl...@redhat.com

count-leading-zeros: new module
diff --git a/lib/count-leading-zeros.h b/lib/count-leading-zeros.h
index 4d7c291..cbb3264 100644
--- a/lib/count-leading-zeros.h
+++ b/lib/count-leading-zeros.h
@@ -26,7 +26,7 @@
 /* Expand the code which computes the number of leading zeros of the local
variable 'x' of type TYPE (an unsigned integer type) and returns it
from the current function.  */
-#if 0  __GNUC__  3 || (__GNUC__ == 3  __GNUC_MINOR__ = 4)
+#if __GNUC__  3 || (__GNUC__ == 3  __GNUC_MINOR__ = 4)
 # define COUNT_LEADING_ZEROS(BUILTIN, TYPE) \
   return x ? BUILTIN (x) : CHAR_BIT * sizeof x;
 #else
@@ -47,30 +47,21 @@
 static inline int
 count_leading_zeros_32 (unsigned int x)
 {
-  int count = 0;
-  if (x  0xU)
-x = 16;
-  else
-count += 16;
-  if (x  0xff00)
-x = 8;
-  else
-count += 8;
-  if (x  0xf0)
-x = 4;
-  else
-count += 4;
-  if (x  0xc)
-x = 2;
-  else
-count += 2;
-  if (x  2)
-x = 1;
-  else
-count++;
-  if (!(x  1))
-count++;
-  return count;
+  /* http://graphics.stanford.edu/~seander/bithacks.html */
+  static const char deBruijnLookup[32] = {
+0, 9, 1, 10, 13, 21, 2, 29, 11, 14, 16, 18, 22, 25, 3, 30,
+8, 12, 20, 28, 15, 17, 24, 7, 19, 27, 23, 6, 26, 5, 4, 31
+  };
+
+  x = 0xU;
+  if (!x)
+return 32;
+  x |= x  1;
+  x |= x  2;
+  x |= x  4;
+  x |= x  8;
+  x |= x  16;
+  return 31 - deBruijnLookup[(x * 0x07c4acddU)  27];
 }
 #endif

-- 
1.7.11.2


-- 
Eric Blake   ebl...@redhat.com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] count-leading-zeros: new module

2012-08-10 Thread Ben Pfaff
Eric Blake ebl...@redhat.com writes:

 +/* Expand the code which computes the number of leading zeros of the local
 +   variable 'x' of type TYPE (an unsigned integer type) and returns it
 +   from the current function.  */
 +#if 0  __GNUC__  3 || (__GNUC__ == 3  __GNUC_MINOR__ = 4)

Do you really want 0  there?