This is the first MPIR Digest. It gives an update on what has happened in MPIR development since the last stable release was issued. Future MPIR Digests will be issued about once every three weeks.
* Added David Harvey's mulmid code (not actually used by anything yet except test code - division code is also sitting there but is not used yet even by test code, as we do not have Hensel basecase division with the same prototype as exists in Niels Moller's division patches) * Set up try tests for mpn_midmul, mpn_midmul_basecase and mpn_midmul_n * Set up David Harvey's test code for mpn_midmul_basecase, mpn_midmul and mpn_midmul_toom42 * Implemented unbalanced Toom 4 algorithm * Implemented Toom53 algorithm * Added extremely fast mpn_gcdinv_1 and mpn_gcdext_1 (about 33% faster than GMP) * Switched on the use of lgcd (Lehmer type GCD based on ngcd), originally provided in Niels Mohler's patches, for computation of GCD's for small operands * Implemented and extended GCD version of Lehmer ngcd * Added Strassen multiplication to slightly speed up multiplication of 2x2 matrices used by ngcd and ngcdext * Many minor improvements to extended GCD code * Increased the extended GCD test sizes * Switched on new MPN_ZERO and mpn_store assembly code in FFT * Wrote an mpn version of Robert Gerbicz's root test code (Robert also contributed fast binomial coefficient and factorial code) * Improvement to speed profiling program to accept lines of data from a text file * A batch script to build MPIR with a configure and make like syntax, but using Microsoft Visual C++ * Converted some primality testing code of T. R. Nicely to use as a benchmark case, with Jeff Gilchrist * Rewrote the benchmark code from scratch in C and greatly extended it * Implemented some OpenMP parallel algorithms for fast multiplication on two and three cores, with Marc Glisse * Fixed numerous issues for mingw * Started writing developer documentation * Started writing a FAQ The main people who contributed the code above are Jason Moxham who was supported in part by a grant from Microsoft Research and by the Sage Windows Project, Brian Gladman, Robert Gerbicz, Jeff Gilchrist, Marc Glisse and myself. David Harvey and T. R. Nicely have made code available on their websites under suitable licenses, which we have used. Most Wanted List -------------------------- Here is a list of the code contributions we desperately need. If you have time to contribute, these would be great projects to volunteer on: * Assembly code for David Harvey's mulmid_basecase, based on our mul_basecase * Hensel basecase division code which returns the "overflow" * Toom 43 - note this uses a 6 point evaluation, and at this point, not even an evaluation sequence has been published, let alone implemented * Optimise the mpn_nhgcd2 function in the mpn/nhgcd2.c file. This is a self contained function, more or less, and could easily be made a *lot* faster - speeding it up would automatically speed up our gcd and gcdext code for less than about 600 limbs * Asymptotically fast division code * Move the implementations of the functions used by Robert Gerbicz's fast root test code down to the mpn level * Implement Peter Montgomery's mod_1 algorithm * Implement an extended GCD version of nhgcd5 * Clean up gcd code : all matrix stuff should be put in ngcd_matrix.c, all ngcd stuff should be put in ngcd.c, all gcd functions (including mpn_gcd_1 should be put in gcd.c, all extgcd functions (including mpn_gcdext_1 and mpn_gcdinv_1) should be put in gcdext.c. The implementations of bgcd, sgcd, rgcd and hgcd should be removed and all references to them in gmp-impl.h and mpz/gcd.c should be removed (an alternative would be to profile them all and see if there are situations where they are faster than what we have, e.g. for very small operands, or highly unbalanced operands). * Rename tc4_blah functions implemented in mpn/toom4_mul_n.c to mpn_signed_blah, moving them into separate files. Remove duplicates of all these functions from the mpn/toom7_mul_n.c file (named tc7_blah there) and in the strassen multiplication mpn/ngcd_matrix.c (named strassen_blah there). * Write tuning code for NGCDEXT_THRESHOLD in extended GCD in mpn/generic/gcdext.c, tuning the cutoff between lgcd and the full extended ngcd * Write tuning code for LGCD_CUTOFF in lgcd in mpn/generic/gcd.c * Tuning code for MPN_NGCD_MATRIX_CUTOFF in strassen multiplication in mpn/generic/ngcd_matrix.c * Tuning code for MULMID_TOOM42_THRESHOLD used by David Harvey's mulmid code Here for interest sakes are the latest benchmarks on a 2.8 GHz K8. IMPORTANT DISCLAIMER: these benchmarks compare the latest development version of MPIR from trunk in svn against the latest stable version of GMP (version 4.3.1). This is therefore an unfair comparison as the latest stable version of MPIR (version 1.2.1) is slower than the svn version and doubtlessly the latest development version of GMP is faster than their stable version. The benchmarks are provided to inspire development effort only and should not be used for any other purpose! Also note that the benchmarks are incomplete in that they do not yet include the new prime benchmark we have added, however they were created with the benchmarking code created by Brian Gladman. The benchmarks were not compiled in a single session. Each was run multiple times and any obviously errant timings were eliminated (the machine was somewhat loaded at the time, so timings were somewhat erratic). K8: Squaring: GMP 4.3.0 MPIR 1.2.0 ========== ======== ========= 128 x 128 : 41491611 51018148 512 x 512 : 10849085 12546872 8192 x 8192 : 165214 168034 131072 x 131072 : 2562 2767 2097152 x 2097152 : 81.0 96.3 Multiplication: ========== 128 x 128 : 40513168 40574306 512 x 512 : 10207146 12428363 8192 x 8192 : 114962 118476 131072 x 131072 : 1754 2067 2097152 x 2097152 : 52.3 63.3 Unbalanced: ========== 15000 x 10000 : 56798 59908 20000 x 10000 : 43622 47322 30000 x 10000 : 24894 27565 16777216 x 512 : 345 332 16777216 x 262144 : 9.34 11.3 Division : ========= 8192 / 32 : 1507178 1319736 8192 / 64 : 1530848 1319605 8192 / 128 : 931519 719816 8192 / 4096 : 189753 190493 8192 / 8064 : 2434448 2333862 131072 / 65536 : 2170 2229 8388608 / 4194304 : 5.27 6.01 16777216 / 262144 : 4.12 4.46 GCD : ==== 128 x 128 : 1452407 1364651 512 x 512 : 227624 196581 8192 x 8192 : 7880 6274 131072 x 131072 : 140 138 1048576 x 1048576 : 6.04 6.87 XGCD : ===== 128 x 128 : 910501 757502 512 x 512 : 173108 151024 8192 x 8192 : 5400 4988 131072 x 131072 : 84.8 87.8 1048576 x 1048576 : 3.88 4.46 RSA : ==== 512 : 20944 22708 1024 : 4222 5078 2048 : 788 882 Pi : === 10000 : 640 651 100000 : 28.7 33.2 1000000 : 1.42 1.71 Overall : 2165 2302 --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "mpir-devel" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/mpir-devel?hl=en -~----------~----~----~----~------~----~------~--~---
