I have uploaded the new root code , and it should be there in an hour or so. I have changed the syntax of the undocumented internal function mpn_rootrem to suit the new code better. The return value of this function is now only defined in the case when you calculate the remainder. I will adjust the two mpz functions that use this and introduce a new mpz function to take advantage of this feature.
Jason On Aug 26, 11:44 pm, Jason Moxham <[email protected]> wrote: > Hi > > I have implemented a faster integer root extraction based on > > "Detecting Perfect Powers in Essentially Linear Time" > Daniel J Bernstein http://cr.yp.to/papers.html > > The current root extraction always returns a remainder as well , but with the > new algorithms it is faster not to calculate it for example > > ./speed -c -s 1-1000000 -f 1.6 mpn_root_new.3 mpn_rootrem_new.3 -r1 > overhead 6.67 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq > 2664.77 MHz > mpn_root_new.3 mpn_rootrem_new.3 > 1 #3694.89 1.1008 > 2 #4284.75 1.0678 > 3 #4430.08 1.1121 > 4 #4662.45 1.1732 > 6 #5726.39 1.1322 > 9 #6017.92 1.1127 > 14 #6978.72 1.1228 > 22 #8152.16 1.1638 > 35 #10558.69 1.2500 > 56 #15334.99 1.3346 > 89 #25025.73 1.3784 > 142 #46958.37 1.4195 > 227 #86848.08 1.5167 > 363 #174570.33 1.4936 > 580 #360345.67 1.5261 > 928 #731964.50 1.4979 > 1484 #1460732.00 1.4964 > 2374 #2918845.00 1.4811 > 3798 #5860142.00 1.4667 > 6076 #11373722.00 1.4723 > 9721 #21570835.00 1.3570 > 15553 #41697982.00 1.3366 > 24884 #75745224.00 1.2987 > 39814 #130672237.00 1.2807 > 63702 #234201455.00 1.2879 > 101923 #452620533.00 1.2303 > 163076 #735142933.00 1.2382 > 260921 #1259932350.00 1.2596 > 417473 #2466162074.00 1.1973 > 667956 #4040044473.15 1.2111 > > this is comparing the new root against the new root with remainder , and you > can see that for cube roots we save 20% by not calculating the remainder > > ./speed -c -s 1-1000000 -f 1.6 mpn_root_new.11 mpn_rootrem_new.11 -r1 > overhead 6.67 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq > 2664.77 MHz > mpn_root_new.11 mpn_rootrem_new.11 > 1 #3471.90 1.0631 > 2 #5024.17 1.0491 > 3 #4921.21 1.0453 > 4 #4866.52 1.0658 > 6 #5505.43 1.1002 > 9 #4988.69 1.1495 > 14 #5568.22 1.1735 > 22 #7983.31 1.1400 > 35 #7774.96 1.3388 > 56 #9551.65 1.4950 > 89 #12238.06 1.8159 > 142 #15696.17 2.3840 > 227 #23076.17 3.1835 > 363 #39873.64 3.7148 > 580 #72337.19 4.1317 > 928 #143198.29 4.3477 > 1484 #288703.75 4.2155 > 2374 #588909.50 4.2622 > 3798 #1207692.00 4.1712 > 6076 #2393227.00 3.9869 > 9721 #4860592.00 3.7902 > 15553 #9635550.00 3.5641 > 24884 #18805253.00 3.4424 > 39814 #35785425.00 2.6833 > 63702 #65366705.00 2.6164 > 101923 #114183384.00 2.4796 > 163076 #202749025.00 2.5899 > 260921 #348500080.00 2.5154 > 417473 #637235393.00 2.3188 > 667956 #1015214092.00 2.5943 > > and for 11-th roots it is over twice as fast to not calculate the remainder , > and it only gets better (ie 1001-th root is 100x speedup) > > How does this compare with the existing > ./speed -c -s 1-1000000 -f 1.6 mpn_rootrem.3 mpn_rootrem_new.3 -r2 > overhead 6.67 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq > 2664.77 MHz > mpn_rootrem.3 mpn_rootrem_new.3 > 1 #0.5070 3864.95 > 2 #0.5938 4411.65 > 3 #0.2542 10696.65 > 4 #0.4086 11228.18 > 6 #0.4216 11841.79 > 9 #0.6229 12861.69 > 14 1.0253 #13613.28 > 22 1.4994 #15225.00 > 35 2.3086 #19700.35 > 56 2.8291 #33357.22 > 89 3.9932 #48141.96 > 142 5.1187 #86683.00 > 227 6.3146 #165246.14 > 363 6.6329 #328079.25 > 580 7.4431 #680612.00 > 928 8.1024 #1376246.00 > 1484 8.2181 #2731967.00 > 2374 8.8078 #5345203.00 > 3798 9.3735 #8587304.00 > 6076 9.3773 #16715719.00 > 9721 10.4492 #29254880.00 > 15553 11.2773 #55684068.00 > 24884 12.0694 #98372702.00 > 39814 12.3675 #167308412.00 > 63702 13.0026 #301692586.00 > 101923 13.9266 #558760850.00 > 163076 14.4037 #912504587.00 > 260921 14.9693 #1586573976.00 > 417473 15.6151 #2963410773.90 > 667956 16.4052 #4892824168.55 > > so for cube roots the new algorithm is up to 16x speedup > > ./speed -c -s 1-1000000 -f 1.6 mpn_rootrem.6 mpn_rootrem_new.6 -r2 > overhead 6.67 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq > 2664.77 MHz > mpn_rootrem.6 mpn_rootrem_new.6 > 1 #0.4230 4015.07 > 2 #0.4826 5001.39 > 3 #0.4489 6002.74 > 4 #0.5826 5972.63 > 6 #0.5538 6597.57 > 9 #0.9411 6649.87 > 14 1.2604 #8589.01 > 22 1.7959 #9131.34 > 35 2.8297 #11595.51 > 56 4.9101 #15742.77 > 89 6.7737 #24455.11 > 142 8.9538 #43350.88 > 227 11.0893 #80233.38 > 363 12.0200 #165172.43 > 580 13.9284 #325329.50 > 928 14.8314 #662709.50 > 1484 14.7727 #1355535.00 > 2374 15.3892 #2732746.00 > 3798 16.4676 #5320714.00 > 6076 16.5480 #10111893.00 > 9721 18.4109 #19234270.00 > 15553 19.6854 #37076658.00 > 24884 16.5990 #64584267.00 > 39814 16.0566 #109209274.00 > 63702 16.0017 #203845446.00 > 101923 16.8694 #363341873.00 > 163076 17.8948 #571863523.00 > 260921 18.1657 #1043338924.00 > 417473 18.3265 #1885156335.00 > 667956 19.3345 #3123304968.21 > > and for 6-th roots we are 20x speedup > > ./speed -c -s 1-1000000 -f 1.6 mpn_rootrem.1532 mpn_root_new.1532 -r2 > overhead 6.67 cycles, precision 1000000 units of 3.75e-10 secs, CPU freq > 2664.77 MHz > mpn_rootrem.1532 mpn_root_new.1532 > 1 #0.0374 3262.44 > 2 #0.0170 3410.33 > 3 #0.0188 3269.75 > 4 #0.0200 3155.01 > 6 #0.0204 3214.03 > 9 #0.0225 3264.04 > 14 #0.0266 3329.49 > 22 #0.0345 3226.34 > 35 1.0942 #3566.89 > 56 3.7086 #3876.63 > 89 8.4034 #4131.03 > 142 15.7969 #6270.49 > 227 40.9515 #8352.82 > 363 94.7907 #10958.85 > 580 171.0272 #12884.93 > 928 307.8009 #15113.47 > 1484 711.6705 #14059.60 > 2374 1277.3062 #16899.00 > 3798 2546.8512 #16829.41 > 6076 4526.9584 #17086.22 > 9721 5806.8554 #25465.00 > 15553 9123.4035 #31556.31 > 24884 10879.8868 #47096.42 > 39814 10780.5283 #80652.07 > 63702 12266.8911 #145376.57 > 101923 10941.7488 #287397.00 > 163076 9076.3009 #591927.50 > 260921 8849.2853 #1269636.00 > 417473 8200.9631 #2512545.00 > 667956 7340.9630 #5057633.00 > > and just to show off , the 1532th root of a 4 million bit integer is 12 > thousand faster than existing if we dont need the remainder. > > Jason --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "mpir-devel" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en -~----------~----~----~----~------~----~------~--~---
