On 10/08/2012 01:34 PM, Jim Meyering wrote:
Torbjorn Granlund wrote:
> Please do, and let me and Niels know if it takes more than 45s. Your
> test case takes 28s on my 3.3 GHz Sandy bridge system with our current
> code. I'm a little disappointed the code doesn't beat the old code more
> for small factorisations.
So on my 2.1GHz i3-2310M, running over the range 452,930,477 to 472,882,027.
old broken code = 14m
old fixed code = 18m
new code = 63s
OK, this is about 60% slower than I would have expected. Our code at
http://gmplib.org:8000/factoring/ should run at about 39s on your
system. (I am using gcc 4.7.1.)
I haven't looked at the final version that went into codeutils, so I
have no idea why it runs slower. A wild guess is that its actual input
or tokenisation has been slowed down. For smallish numbers, such things
can dominate over actually factoring the numbers.
I think the current coreutils factor performance for smallish numbers
might be sufficient. (Larger numbers than 2^100 need a bit too much
time, but we are working on a fix.)
When I compare the factor built from http://gmplib.org:8000/factoring/, and
the one in coreutils.git I see these single-trial times: (on a 3.2GHz i7-970)
$ seq 452930477 472882027|env time ./factor > k
29.28user 0.44system 0:30.85elapsed 96%CPU (0avgtext+0avgdata
548maxresident)k
0inputs+1011088outputs (0major+162minor)pagefaults 0swaps
...
$ seq 452930477 472882027|env time /cu/src/factor > k
26.47user 0.48system 0:28.07elapsed 96%CPU (0avgtext+0avgdata
692maxresident)k
0inputs+1011088outputs (0major+205minor)pagefaults 0swaps
I.e., a 9% improvement. Compiled with bleeding edge gcc 4.8.0 20121007.
Thanks for checking that.
Torbjorn you seem to be interploating from your
3.3 GHz Sandy bridge to my 2.1GHz i3-2310M Sandy Bridge,
but it might not be linear due to cache, turbo boost,
mem bandwidth, ...
cheers,
Pádraig.