On Fri, Dec 01, 2017 at 09:58:55AM +0100, Otto Moerbeek wrote: > On Thu, Nov 30, 2017 at 01:10:33PM +0000, kshe wrote: > > > On Thu, 30 Nov 2017 07:22:42 +0000, Otto Moerbeek wrote: > > > On Sun, Nov 26, 2017 at 07:40:03PM +0000, kshe wrote: > > > > Hi, > > > > > > > > The `Z' command can be a handy shortcut for computing logarithms; as > > > > such, for example, it is the basis of the implementation of bc(1)'s `l' > > > > function. However, the algorithm currently used in count_digits() is > > > > too naive to be really useful, becoming rapidly much slower than what > > > > would be expected from a native command. > > > > > > > > To see how this computation could easily be made exponentially faster, > > > > one may start by noticing that, if next() is the function defined for > > > > any real r as > > > > > > > > next(r) := floor(r) + 1, > > > > > > > > then clearly, for any strictly positive integer x, > > > > > > > > floor(log_2(x)) <= log_2(x) < next(log_2(x)) > > > > > > > > and therefore > > > > > > > > log_10(2) * floor(log_2(x)) <= log_10(x) < k, > > > > > > > > where > > > > > > > > k := log_10(2) * next(log_2(x)). > > > > > > > > Since log_10(2) < 1, it follows that > > > > > > > > floor(k) <= next(k - log_10(2)) <= next(log_10(x)) <= next(k), > > > > > > > > which proves that next(log_10(x)) is either floor(k) or next(k). > > > > > > > > If next(log_10(x)) = floor(k), then > > > > > > > > 10^floor(k) = 10^next(log_10(x)) > 10^log_10(x) = x. > > > > > > > > If next(log_10(x)) = next(k), then > > > > > > > > 10^floor(k) = 10^floor(log_10(x)) <= 10^log_10(x) = x. > > > > > > > > Therefore, if x >= 10^floor(k), then next(log_10(x)) cannot be floor(k), > > > > hence it must be next(k); likewise, if x < 10^floor(k), then > > > > next(log_10(x)) cannot be next(k), hence it must be floor(k). Using the > > > > conventional integer value of a boolean expression, this result can be > > > > summarised as > > > > > > > > next(log_10(x)) = floor(k) + (x >= 10^floor(k)). > > > > > > > > As an additional refinement, one may further notice that if > > > > > > > > floor(k) = floor(log_10(2) * floor(log_2(x))) > > > > > > > > then > > > > > > > > 10^floor(k) = 10^floor(log_10(2) * floor(log_2(x))) > > > > <= 10^(log_10(2) * floor(log_2(x))) > > > > <= 2^floor(log_2(x)) > > > > <= x > > > > > > > > so that it can readily be concluded that > > > > > > > > next(log_10(x)) = next(k) > > > > > > > > without having to compute 10^floor(k). > > > > > > > > The BN library permits computation of k in O(1) and 10^floor(k) in > > > > O(log(k)) which is O(log(log(x))). Therefore, one can compute > > > > next(log_10(x)) in O(1) most of the time (at least on average, and with > > > > a certain definition of such average, the full analysis of which is, I > > > > presume, outside the scope of this message), with a worst case of > > > > O(log(log(x))). In contrast, the current code is exponentially worse > > > > than what its worst case should be, computing this value in O(log(x)). > > > > > > > > $ jot -b 9 -s '' 65536 >script > > > > $ echo Z >>script > > > > > > > > $ time dc script > > > > 0m03.57s real 0m03.56s user 0m00.01s system > > > > $ time ./dc script > > > > 0m00.12s real 0m00.12s user 0m00.00s system > > > > > > > > The diff below implements this optimisation. It also fixes a small > > > > logic error in split_number(), which is used by count_digits(). > > > > > > General comment: it would be a good thing to add (part of) the proof > > > or a reference to some published work to the code in a comment. > > > Especially the derivation of c and the computation of d seem a bit > > > like dropping out of thin air. > > > > All the above proof says is that a logarithm in one base is easily > > convertible to the corresponding one in another base. Since this has > > been known for a few centuries already, I see no reason to provide more > > details here than for the algorithms (and associated constants) featured > > in /usr/share/misc/bc.library, where no explanations of any kind are to > > be found either. > > > > > From a style point of view I do not like the assignment in > > > conditionals very much. > > > > Please feel free to apply your prefered style to this patch. > > > > > There's also the problem of bits * c overflowing, though that's likely > > > theoretical. > > > > To provoke such overflow, one would first need a platform where ints are > > larger than 32 bits, and the involved numbers would have to exceed > > 2^2147483648. This is indeed unlikely to happen. > > > > Regards, > > > > kshe > > So this is what I plan to commit. Changes wrt your version: make d > uint, and do not use assignment in conditional test.
This is a very nice optimization and your version patch makes it easier to follow. ok > > I also added a regress test for Z (and found out there's a difference > between gnu dc for 0Z, but that is a different issue). > > -Otto > > Index: bcode.c > =================================================================== > RCS file: /cvs/src/usr.bin/dc/bcode.c,v > retrieving revision 1.56 > diff -u -p -r1.56 bcode.c > --- bcode.c 29 Nov 2017 19:13:31 -0000 1.56 > +++ bcode.c 1 Dec 2017 08:54:12 -0000 > @@ -390,9 +390,10 @@ split_number(const struct number *n, BIG > > bn_checkp(BN_copy(i, n->number)); > > - if (n->scale == 0 && f != NULL) > - bn_check(BN_set_word(f, 0)); > - else if (n->scale < sizeof(factors)/sizeof(factors[0])) { > + if (n->scale == 0) { > + if (f != NULL) > + bn_check(BN_set_word(f, 0)); > + } else if (n->scale < sizeof(factors)/sizeof(factors[0])) { > rem = BN_div_word(i, factors[n->scale]); > if (f != NULL) > bn_check(BN_set_word(f, rem)); > @@ -697,25 +698,56 @@ push_scale(void) > static u_int > count_digits(const struct number *n) > { > - struct number *int_part, *fract_part; > - u_int i; > + BIGNUM *int_part, *a, *p; > + BN_CTX *ctx; > + uint d; > + const uint64_t c = 1292913986; /* floor(2^32 * log_10(2)) */ > + int bits; > > if (BN_is_zero(n->number)) > return n->scale ? n->scale : 1; > > - int_part = new_number(); > - fract_part = new_number(); > - fract_part->scale = n->scale; > - split_number(n, int_part->number, fract_part->number); > - > - i = 0; > - while (!BN_is_zero(int_part->number)) { > - (void)BN_div_word(int_part->number, 10); > - i++; > + int_part = BN_new(); > + bn_checkp(int_part); > + > + split_number(n, int_part, NULL); > + bits = BN_num_bits(int_part); > + > + if (bits == 0) > + d = 0; > + else { > + /* > + * Estimate number of decimal digits based on number of bits. > + * Divide 2^32 factor out by shifting > + */ > + d = (c * bits) >> 32; > + > + /* If close to a possible rounding error fix if needed */ > + if (d != (c * (bits - 1)) >> 32) { > + ctx = BN_CTX_new(); > + bn_checkp(ctx); > + a = BN_new(); > + bn_checkp(a); > + p = BN_new(); > + bn_checkp(p); > + > + bn_check(BN_set_word(a, 10)); > + bn_check(BN_set_word(p, d)); > + bn_check(BN_exp(a, a, p, ctx)); > + > + if (BN_ucmp(int_part, a) >= 0) > + d++; > + > + BN_CTX_free(ctx); > + BN_free(a); > + BN_free(p); > + } else > + d++; > } > - free_number(int_part); > - free_number(fract_part); > - return i + n->scale; > + > + BN_free(int_part); > + > + return d + n->scale; > } > > static void > >