Hi,
I was trying to understand the algorithm inside the main divide routine. I 
struggled with this bit of code:

      if ((prem[r_order] <= py[y_order]) && (r_order > 0))
      {
         double_limb_type a = (static_cast<double_limb_type>(prem[r_order]) << 
CppInt1::limb_bits) | prem[r_order - 1];
         double_limb_type b = py[y_order];
         double_limb_type v = a / b;
         if (v <= CppInt1::max_limb_value)
         {
            guess = static_cast<limb_type>(v);
            --r_order;
         }
      }
I was trying to understand why we essential set guess to 1 in the case that the 
division is too big.
I finally realized that the only way we can have v > CppInt1::max_limb_value is 
if we have prem[r_order] == py[y_order]. Some simple inequalities with floor 
can convince you of this.
So in the case of equality we do an expensive divide and test when we could 
move the test outside like this:

      limb_type guess = 1;
      if (r_order == 0) { <-------- do this first to simplify
          guess = prem[0] / py[y_order];
      } else if (prem[r_order] == py[y_order]) { <----------- handle it here to 
avoid the divide
          guess = 1;
...
This leads to another question. Why is this line:
guess = prem[0] / py[y_order];
not this line:
guess = prem[0] / py[0];

Logically for division it must be the second line. In fact this loop has the 
invariant that r_order >= y_order. That makes sense for division. So the 
divisor must be py[0]. That suggests a divisor with one element and we handle 
that case via a special divide. So that code is not needed I believe.  We end 
up with this:

      BOOST_ASSERT(r_order >= y_order && y_order >= 1);
      if (prem[r_order] == py[y_order]) {
          guess = 1;
      } else if (prem[r_order] < py[y_order]) {
         double_limb_type a = (static_cast<double_limb_type>(prem[r_order]) << 
CppInt1::limb_bits) | prem[r_order - 1];
         double_limb_type b = py[y_order];
         double_limb_type v = a / b;
         BOOST_ASSERT(v <= CppInt1::max_limb_value);
         guess = static_cast<limb_type>(v);
         --r_order;
      } else {
         double_limb_type a = (static_cast<double_limb_type>(prem[r_order]) << 
CppInt1::limb_bits) | prem[r_order - 1];
         double_limb_type b = (y_order > 0) ? 
(static_cast<double_limb_type>(py[y_order]) << CppInt1::limb_bits) | py[y_order 
- 1] : (static_cast<double_limb_type>(py[y_order]) << CppInt1::limb_bits);
         BOOST_ASSERT(b);
         double_limb_type v = a / b;
         guess              = static_cast<limb_type>(v);
      }
      BOOST_ASSERT(guess); // If the guess ever gets to zero we go on 
forever....


Some simple random divide tests seem to show a little over 2% improvement from 
doing this. Of course this could just be code layout changes.
Just thinking that the 64 bit values are random would make me think this case 
is rare but I guess we subtract out multiples of the divisor and cause this to 
happen more for reasons that are hard to understand (for me anyway).
I see the path hit a lot.
Neill.
_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
https://lists.boost.org/mailman/listinfo.cgi/boost-users

Reply via email to