Re: A Fast Alternative to Modulo Reduction

2016-12-16 Thread Alexander Nasonov
Mouse wrote:
> Abhinav Upadhyay wrote:
> >> Both operations are not _equivalent_ but he proves that the latter
> >> is also a fair mapping of x in the range [0, N) for 32 bit integers.
> 
> Only under certain assumptions about x - loosely put, that the entropy
> in x is distributed equally across all of x's bits.  But, for example,
> if you use the common string hash function that works like h=0 then
> loop{h=(h*37)+*cp++}, short strings will have all 0s in the high bits.

I can confirm that for several common hash functions (murmur, xxh,
jenkins), a distribution of bits isn't good enough to generate an
acyclic graph for chm (cdb uses the chm algorithm) for many inputs
while fast_divide32 reduction works fine most of the time.

Code is here: https://github.com:/alnsn/rgph

Alex


Re: A Fast Alternative to Modulo Reduction

2016-12-16 Thread Mouse
>> I came across an interesting article [...which...] says divisions
>> are more expensive as compared to multiplications,

This is true on some machines, less true on others.  (I don't know of
any where division is faster than multiplication.  But, since we're
talking about 32x32->64 multiplies and 32/32->32 divides, such a
machine might actually exist.)

>> so instead of doing x % N, we can use (x * N) >> 32.

Only if X has the full 32-bit range.  rand() goes only to RAND_MAX,
which is 0x7fff on the handiest machine I have to check, and
random() is specifically documented as having only 31 bits of range.
For those, you'd want to do (x*N)>>31.

>> Both operations are not _equivalent_ but he proves that the latter
>> is also a fair mapping of x in the range [0, N) for 32 bit integers.

Only under certain assumptions about x - loosely put, that the entropy
in x is distributed equally across all of x's bits.  But, for example,
if you use the common string hash function that works like h=0 then
loop{h=(h*37)+*cp++}, short strings will have all 0s in the high bits.

However, as I read the manpage, fast_divide32 doesn't do the (x*N)>>32
thing.  x/N can, I think, always be converted into (x*M1)>>M2 for
suitably chosen M1 and M2, where x/N/M1/M2 are 32-bit unsigned, the
multiply is 32x32->64, and the result is identical to x/N.  (For
example, for N=10, M1=0xcccd and M2=35.)  The fast_divide32
interface looks to me as though it's designed to support such things.
(It is, however, a good example, of hardwiring hardware assumptions
into an API - in this case, that exactly-32-bits is a useful data size.
Using fast_divide32 for _anything_ (supposedly-)machine-independent is
accumulating technical debt which will need to be paid back down the
road upon encountering a machine where 32 bits is not a convenient data
size.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: A Fast Alternative to Modulo Reduction

2016-12-16 Thread Taylor R Campbell
   Date: Fri, 16 Dec 2016 14:39:05 +0530
   From: Abhinav Upadhyay 

   I came across an interesting article [1] which talks about a faster
   alternative to the mod operation (x % N) for mapping a number to a
   fixed range, such as [0, N), which is common when we want to generate
   random numbers within a given range, or map a hash value to a range of
   indices in an array.

   It says divisions are more expensive as compared to multiplications,
   so instead of doing x % N, we can use (x * N) >> 32. Both operations
   are not _equivalent_ but he proves that the latter is also a fair
   mapping of x in the range [0, N) for 32 bit integers.

   Is this result useful? For example I saw that rand(3) uses the %
   operation to map the value in the range of RAND_MAX.

It is useful -- so useful someone implemented it in NetBSD a few years
ago!  See the fast_divide32(3) routines.  We use it in cdb, the
constant databases, and in ld.elf_so hash table lookups.  I'm sure
there are other places it could be used too.


Re: A Fast Alternative to Modulo Reduction

2016-12-16 Thread Roy Marples

On 2016-12-16 09:09, Abhinav Upadhyay wrote:

I came across an interesting article [1] which talks about a faster
alternative to the mod operation (x % N) for mapping a number to a
fixed range, such as [0, N), which is common when we want to generate
random numbers within a given range, or map a hash value to a range of
indices in an array.


You shouldn't use mod with rand(3).
Use something like arc4random_uniform(3) instead (I note that the man 
page for it is missig on my netbsd-7, see arcrandon(3)).


Roy


Re: A Fast Alternative to Modulo Reduction

2016-12-16 Thread Joerg Sonnenberger
On Fri, Dec 16, 2016 at 02:39:05PM +0530, Abhinav Upadhyay wrote:
> I came across an interesting article [1] which talks about a faster
> alternative to the mod operation (x % N) for mapping a number to a
> fixed range, such as [0, N), which is common when we want to generate
> random numbers within a given range, or map a hash value to a range of
> indices in an array.

Old hat. See fast_divide32(3).

Joerg


A Fast Alternative to Modulo Reduction

2016-12-16 Thread Abhinav Upadhyay
Hi,

I came across an interesting article [1] which talks about a faster
alternative to the mod operation (x % N) for mapping a number to a
fixed range, such as [0, N), which is common when we want to generate
random numbers within a given range, or map a hash value to a range of
indices in an array.

It says divisions are more expensive as compared to multiplications,
so instead of doing x % N, we can use (x * N) >> 32. Both operations
are not _equivalent_ but he proves that the latter is also a fair
mapping of x in the range [0, N) for 32 bit integers.

Is this result useful? For example I saw that rand(3) uses the %
operation to map the value in the range of RAND_MAX.

[1]: 
http://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/

-
Abhinav