[mpir-devel] MPIR Digest 1

Bill Hart Fri, 24 Jul 2009 02:13:25 -0700

This is the first MPIR Digest. It gives an update on what has happened
in MPIR development since the last stable release was issued. Future
MPIR Digests will be issued about once every three weeks.


* Added David Harvey's mulmid code (not actually used by anything yet
except test code - division code is also sitting there but is not used
yet even by test code, as we do not have Hensel basecase division with
the same prototype as exists in Niels Moller's division patches)
* Set up try tests for mpn_midmul, mpn_midmul_basecase and mpn_midmul_n
* Set up David Harvey's test code for mpn_midmul_basecase, mpn_midmul
and mpn_midmul_toom42
* Implemented unbalanced Toom 4 algorithm
* Implemented Toom53 algorithm
* Added extremely fast mpn_gcdinv_1 and mpn_gcdext_1 (about 33% faster than GMP)
* Switched on the use of lgcd (Lehmer type GCD based on ngcd),
originally provided in Niels Mohler's patches, for computation of
GCD's for small operands
* Implemented and extended GCD version of Lehmer ngcd
* Added Strassen multiplication to slightly speed up multiplication of
2x2 matrices used by ngcd and ngcdext
* Many minor improvements to extended GCD code
* Increased the extended GCD test sizes
* Switched on new MPN_ZERO and mpn_store assembly code in FFT
* Wrote an mpn version of Robert Gerbicz's root test code (Robert also
contributed fast binomial coefficient and factorial code)
* Improvement to speed profiling program to accept lines of data from
a text file
* A batch script to build MPIR with a configure and make like syntax,
but using Microsoft Visual C++
* Converted some primality testing code of T. R. Nicely to use as a
benchmark case, with Jeff Gilchrist
* Rewrote the benchmark code from scratch in C and greatly extended it
* Implemented some OpenMP parallel algorithms for fast multiplication
on two and three cores, with Marc Glisse
* Fixed numerous issues for mingw
* Started writing developer documentation
* Started writing a FAQ

The main people who contributed the code above are Jason Moxham who
was supported in part by a grant from Microsoft Research and by the
Sage Windows Project, Brian Gladman, Robert Gerbicz, Jeff Gilchrist,
Marc Glisse and myself. David Harvey and T. R. Nicely have made code
available on their websites under suitable licenses, which we have
used.

Most Wanted List
--------------------------

Here is a list of the code contributions we desperately need. If you
have time to contribute, these would be great projects to volunteer
on:

* Assembly code for David Harvey's mulmid_basecase, based on our mul_basecase
* Hensel basecase division code which returns the "overflow"
* Toom 43 - note this uses a 6 point evaluation, and at this point,
not even an evaluation sequence has been published, let alone
implemented
* Optimise the mpn_nhgcd2 function in the mpn/nhgcd2.c file. This is a
self contained function, more or less, and could easily be made a
*lot* faster - speeding it up would automatically speed up our gcd and
gcdext code for less than about 600 limbs
 * Asymptotically fast division code
* Move the implementations of the functions used by Robert Gerbicz's
fast root test code down to the mpn level
* Implement Peter Montgomery's mod_1 algorithm
* Implement an extended GCD version of nhgcd5
* Clean up gcd code : all matrix stuff should be put in ngcd_matrix.c,
all ngcd stuff should be put in ngcd.c, all gcd functions (including
mpn_gcd_1 should be put in gcd.c, all extgcd functions (including
mpn_gcdext_1 and mpn_gcdinv_1) should be put in gcdext.c. The
implementations of bgcd, sgcd, rgcd and hgcd should be removed and all
references to them in gmp-impl.h and mpz/gcd.c should be removed (an
alternative would be to profile them all and see if there are
situations where they are faster than what we have, e.g. for very
small operands, or highly unbalanced operands).
* Rename tc4_blah functions implemented in mpn/toom4_mul_n.c to
mpn_signed_blah, moving them into separate files. Remove duplicates of
all these functions from the mpn/toom7_mul_n.c file (named tc7_blah
there) and in the strassen multiplication mpn/ngcd_matrix.c (named
strassen_blah there).
* Write tuning code for NGCDEXT_THRESHOLD in extended GCD in
mpn/generic/gcdext.c, tuning the cutoff between lgcd and the full
extended ngcd
* Write tuning code for LGCD_CUTOFF in lgcd in mpn/generic/gcd.c
* Tuning code for MPN_NGCD_MATRIX_CUTOFF in strassen multiplication in
mpn/generic/ngcd_matrix.c
* Tuning code for MULMID_TOOM42_THRESHOLD used by David Harvey's mulmid code

Here for interest sakes are the latest benchmarks on a 2.8 GHz K8.

IMPORTANT DISCLAIMER: these benchmarks compare the latest development
version of MPIR from trunk in svn against the latest stable version of
GMP (version 4.3.1). This is therefore an unfair comparison as the
latest stable version of MPIR (version 1.2.1) is slower than the svn
version and doubtlessly the latest development version of GMP is
faster than their stable version. The benchmarks are provided to
inspire development effort only and should not be used for any other
purpose!

Also note that the benchmarks are incomplete in that they do not yet
include the new prime benchmark we have added, however they were
created with the benchmarking code created by Brian Gladman. The
benchmarks were not compiled in a single session. Each was run
multiple times and any obviously errant timings were eliminated (the
machine was somewhat loaded at the time, so timings were somewhat
erratic).

K8:

Squaring:         GMP 4.3.0 MPIR 1.2.0
==========         ========  =========
128 x 128 :        41491611   51018148
512 x 512 :        10849085   12546872
8192 x 8192 :        165214     168034
131072 x 131072 :      2562       2767
2097152 x 2097152 :    81.0       96.3

Multiplication:
==========
128 x 128 :        40513168   40574306
512 x 512 :        10207146   12428363
8192 x 8192 :        114962     118476
131072 x 131072 :      1754       2067
2097152 x 2097152 :    52.3       63.3

Unbalanced:
==========
15000 x 10000 :       56798      59908
20000 x 10000 :       43622      47322
30000 x 10000 :       24894      27565
16777216 x 512 :        345        332
16777216 x 262144 :    9.34       11.3

Division :
=========
8192 / 32 :         1507178    1319736
8192 / 64 :         1530848    1319605
8192 / 128 :         931519     719816
8192 / 4096 :        189753     190493
8192 / 8064 :       2434448    2333862
131072 / 65536 :       2170       2229
8388608 / 4194304 :    5.27       6.01
16777216 / 262144 :    4.12       4.46

GCD :
====
128 x 128 :         1452407    1364651
512 x 512 :          227624     196581
8192 x 8192 :          7880       6274
131072 x 131072 :       140        138
1048576 x 1048576 :    6.04       6.87

XGCD :
=====
128 x 128 :          910501     757502
512 x 512 :          173108     151024
8192 x 8192 :          5400       4988
131072 x 131072 :      84.8       87.8
1048576 x 1048576 :    3.88       4.46

RSA :
====
512 :                 20944      22708
1024 :                 4222       5078
2048 :                  788        882

Pi :
===
10000 :                 640        651
100000 :               28.7       33.2
1000000 :              1.42       1.71

Overall :              2165       2302

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en
-~----------~----~----~----~------~----~------~--~---

[mpir-devel] MPIR Digest 1

Reply via email to