[Bug middle-end/82853] Optimize x % 3 == 0 without modulo

dvoreader at gmail dot com Tue, 26 Feb 2019 14:37:14 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82853


Orr Shalom Dvory <dvoreader at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dvoreader at gmail dot com

--- Comment #33 from Orr Shalom Dvory <dvoreader at gmail dot com> ---
(In reply to Jakub Jelinek from comment #25)
> Created attachment 44657 [details]
> gcc9-pr82853.patch
> 
> Full untested patch.  For x % C1 == C2 it handles all unsigned cases where
> C1 is odd, if C1 is even, just cases where C2 <= -1U % C1, if signed modulo,
> just x % C1 == 0 cases (where C1 is not INT_MIN).

Hi, I will try to help. I've got a more accurate formula. look at my comment
over here https://reviews.llvm.org/D50222
I'm copying it here for the sake of all. 
Hi guys, I found the magical formula for unsigned integers that works also with
even numbers without the need to check for overflows with any remainder:
from divisor d and reminder r, I calculate 4 constants.
any d!=0 should fit.

void calculate_constants64(uint64_t d, uint64_t r, uint64_t &k,uint64_t &mmi,
uint64_t &s,uint64_t& u)
{
        k=__builtin_ctzll(d);/* the power of 2 */
        uint64_t d_odd=d>>k;
        mmi=find_mod_mul_inverse(d_odd,64);
        /* 64 is word size*/
        s=(r*mmi);
        u=(ULLONG_MAX-r)/d;/* note that I divide by d, not d_odd */
}
A little bit background: the constant (u +1) is the count of the possible
values in the range of 64 bit number that will yield the correct modulo.
The constant s should zero the first k bits if the given value have modulo of
r. it will also zero the modulo of d_odd.

then the compiler should generate the following code with the given constants:

int checkrem64(uint64_t k,uint64_t mmi, uint64_t s,uint64_t u,uint64_t x)
{
    uint64_t o=((x*mmi)-s);
    o= (o>>k)|(o<<(64-k));/*ROTR64(o,k)*/
    return o<=u;
}
this replace the following:

/* d is the divisor, r is the remainder */
int checkrem64(uint64_t x)
{
  return x%d==r;
}
this is the code to find modular inverse..

uint64_t find_mod_mul_inverse(uint64_t x, uint64_t bits)
{
      if (bits > 64 || ((x&1)==0))
              return 0;// invalid parameters
      uint64_t mask;
      if (bits == 64)
              mask = -1;
      else
      {                
              mask = 1;
              mask<<=bits;
              mask--;
      }
      x&=mask;
      uint64_t result=1, state=x, ctz=0;
      while(state!=1ULL)
      {
              ctz=__builtin_ctzll(state^1);
              result|=1ULL<<ctz;
              state+=x<<ctz;
              state&=mask;
      }
      return result;
}
good luck!
I tested this on all the cases of 10bit word size, and it passed.

*Edit:* I looked for something that will work for signed integers. I came up
with something that would work with negative numbers if the following
assumption was correct:

(-21)%10==9
but this assumption is not correct because (-21)%10 equals to -1.
anyway, the idea is like that, you shift the range and change s and
accordingly:

void calculate_constants64(uint64_t d, uint64_t r, uint64_t &k,uint64_t &mmi,
uint64_t &s,uint64_t& u)
{
        k=__builtin_ctzll(d);/* the power of 2 */
        uint64_t d_odd=d>>k;
        mmi=find_mod_mul_inverse(d_odd,64);
        /* 64 is word size*/
     //this is the added line to make it work with signed integers
      r+=0x8000 0000 0000 0000% d;
        s=(r*mmi);
        u=(ULLONG_MAX-r)/d;
}

int checkrem64(uint64_t k,uint64_t mmi, uint64_t s,uint64_t u,uint64_t x)
{
     //this is the added line to make it work with signed integers
//x came as signed number but was casted to unsigned
    x^=0x8000 0000 0000 0000;// this is addition simplified to xor, spaces for
clarity.
    uint64_t o=((x*mmi)-s);
    o= (o>>k)|(o<<(64-k));/*ROTR64(o,k)*/
    return o<=u;
}
There is a way to tweak u and s to make it work on negative only numbers or
positive only numbers, when r is negative or positive... but for r=0, this
should work. please tell me the exact requirements, and I will do the math.

[Bug middle-end/82853] Optimize x % 3 == 0 without modulo

Reply via email to