Am Mittwoch 30 Dezember 2009 11:57:28 schrieb Artyom Kazak:
> Why fact2 is quicker than fact?!
>
> fact2 :: Integer -> Integer
> fact2 x = f x y
> where
> f n e | n < 2 = 1
>
> | e == 0 = n * (n - 1)
> | e > 0 = (f n (e `div` 2)) * (f (n - (e * 2)) (e `div` 2))
>
> y = 2 ^ (truncate (log (fromInteger x) / log 2))
>
> fact :: Integer -> Integer
> fact 1 = 1
> fact n = n * fact (n - 1)
>
> I tried to write tail-recursive fact, fact as "product [1..n]" - fact2 is
> quicker!
>
>
> fact2 1000000 == fact 1000000 - I tested.

If you follow the evaluation of fact2, it is basically the same as

fact3 n = binaryMult [1 .. n]
      where
        binaryMult [p] = p
        binaryMult xs = binaryMult (pairMult xs)
        pairMult (x:y:zs) = x*y : pairMult zs
        pairMult xs = xs

, just without the list construction, but with a few more ones [aside: You 
should subtract 
one from the exponent of y. As it is, in the first call to f, the second factor 
is always 
1 because x < 2*y. Doesn't make much of a difference regarding performance, but 
it seems 
cleaner.]. Perhaps fact3 is a little easier to follow.

Looking at fact3 (2^k), we see that in the first iteration of binaryMult, 
2^(k-1) products 
of small integers (<= k+1 bits) are carried out. These multiplications are fast.
In the second iteration, we have 2^(k-2) products of still small integers (<= 
2*k bits). 
These multiplications are a tad slower, but still fast.
In the third iteration, we have 2^(k-3) products of integers of (<= k*2^2 bits) 
and so on.

We see that the overwhelming majority of the 2^k-1 multiplications carried out 
don't 
involve huge numbers and thus are relatively fast (for k = 32, no 
multiplication in the 
first five iterations involves a number with more than 1000 bits, so no more 
than 3% of 
the multiplications involve large numbers; for k = 20, the first product with 
more than 
100 bits is produced in the sixth iteration, so less than 20000 multiplications 
involve a 
number with more than 1000 bits, less than 1000 multiplications have a factor 
with more 
than 10000 bits, less than 128 have a factor with more than 100000 bits).

If the factorial is computed sequentially, like

fact0 n = foldl' (*) 1 [2 .. n]
-- or product [2 .. n], just don't build huge thunks like in fact above

, you have many multiplications involving one huge number (and one small), 
since fact k 
has of the order of k*log k bits. 1000! has about 8500 bits, 10000! has about 
120000 bits 
and 100000! has about 1.5 million bits.
So that way, computing the factorial of 2^20 needs more than 990000 
multiplications where 
one factor has more than 100000 bits and over 900000 multiplications where one 
factor has 
more than one million bits.

Since multiplications where one factor is huge take long, even if the other 
factor is 
small, we see why a sequential computation of a factorial is so much slower 
than a tree-
like computation.
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to