It didn't took me so much time as I feared to understand why the use
of bench_two on GMP4.3 and MPIR1.3 (on my 32-bit CPU) gave so strange
results...
GMP4.3 was (slightly) faster than MPIR1.3 for all tests, expect two
where it was terribly slower: fermat and mersenne. The overall score
says:

GMP4.3 => 136, 97.2
MPIR1.3 => 145, 104

I.e. the bench_two test I downloaded from mpir.org says that yes, for
many application GMP is faster, but there are some (two) where it is
by far slower... so, globally, MPIR is 6% better than GMP.

It sounds strange, doesn't it?

Well, go and look into the code, the tarball is available from the
main page of MPIR, you can download it, unpack it and... before you
use it, please READ THE CODE!

The two very interesting test files are: fermat_prime_p.c,
mersenne_prime_p.c .

Let's start from the first one: fermat_prime_p.c

At the beginning you can find:
#ifndef __MPIR_VERSION
// we are gmp
#define NEED_MULMOD
#elif __MPIR_VERSION < 1 || (__MPIR_VERSION == 1 &&
__MPIR_VERSION_MINOR < 3)
#define NEED_MULMOD
#endif

...you will see, this means: if someone is testing GMP or a version of
MPIR before 1.3, be _as_slow_as_possible_. The reason? This way MPIR
will look like being fast :-D

The "application" is very simple, it performs a "Pepin's Test for k"
i.e. test if "3^((F_k-1)/2) == -1 mod F_k", where "F_k = 2^(2^k)+1".

How would you write such an application? You would probably think you
can use the documented function mpz_powm...
The test doesn't do this, because this could be fast on libraries
different from MPIR-1.3, and the goal is to be _slow_... so it will
use a loop and the _undocumented_ function mpn_mulmod_2expp1. This is
a test to see how the library perform with a typical application, and
uses a function that NO application will use, for the simple fact that
_it_is_NOT_documented!
You can try:
mpir-1.3.0$ grep -ri mulmod doc/mpir.*

Nothing, no answer, it is not documented at all...And if you are not
using MPIR-1.3? will the test use something different? NO! It will
perform the computation using an _as_slow_as_possible_ substitute for
that function.

NO APPLICATION WILL EVER BE SO CRAZY, THIS IS NOT AN APPLICATION, IT'S
A FAKE!!!

I'll not analyse the ridicule "substitute", I'll do for the next
"application", because it is absurd exactly in the same way!

Next application: mersenne_prime_p.c
Here the "application" uses the Lucas-Lehmer test on a Mersenne
number, now the loop make sense, because it is not a simple
exponentiation, but a sequence of squaring-subtract, to be performed
modulo 2^p-1.
How would you implement it? With some clever reduction using mpn_add_n
or initialising the modulo once and then using it again and again...

But here, again, the main goal of the person who wrote this code was
to show that his mulmod function was giving a tremendous speed up, so,
again, the fake-application uses an undocumented function. Let us look
at the line where it is used:
mpn_mulmod_2expm1 (rp, xp, xp, k, tp); // mpn_sqrmod_2expm1 would be
faster
Note the comment, using sqr can be faster! Then read the fake,
as_slow_as_possible, implementation that is used if you are measuring
speed of something different wrt MPIR-1.3:

void    mpn_mulmod_2expm1 (mp_ptr xp,mp_ptr yp,mp_ptr zp,mp_size_t
k2,mp_ptr tp)
{mpz_t x,y,z,m;mp_size_t n,tn;
n=BITS_TO_LIMBS(k2);
mpz_init2(y,k2);mpz_init2(z,k2);mpz_init2(m,k2);mpz_init2(x,2*k2);
mpz_set_ui(m,1);mpz_mul_2exp(m,m,k2);mpz_sub_ui(m,m,1);
MPN_COPY(y->_mp_d,yp,n);tn=n;MPN_NORMALIZE(y->_mp_d,tn);y-
>_mp_size=tn;
MPN_COPY(z->_mp_d,zp,n);tn=n;MPN_NORMALIZE(z->_mp_d,tn);z-
>_mp_size=tn;
mpz_mul(x,y,z);
mpz_mod(x,x,m);tn=x->_mp_size;if(tn>n)tn=n;
MPN_COPY(xp,x->_mp_d,tn);if(tn<n)MPN_ZERO(xp+tn,n-tn);
mpz_clear(x);mpz_clear(y);mpz_clear(z);mpz_clear(m);
return;}

The guy who wrote this fake application decided to implement the
needed sqrmod with the slowest possible strategy. Directly using mpn?
no, there is the risk to be fast:
- let's allocate four mpz on the fly (this means for every iteration!)
- let's recompute the modulus in mpz on the fly (it is constant for
the full run and it is recomputed every iteration!!!)
we should exploit the fact that this function will always be called
with yp==zp, but again we run the risk to be efficient, and the author
did NON want that, because this function is used for other libraries,
to be compared with MPIR... and they must be slowed down! So, you
perfectly know (read the comment above) that yp==zp, but
- copy the memory _twice_ in _two_different_ new locations...
This way mpz_mul will see two different pointers and will _NOT_ use
sqr! Clever way to avoid any possibly faster primitive!!!
- copy back the result (the third copy, to be repeated for any cycle!)
- free the memory...(for the same variables that will be used
[recreated] again in the next step).

It is quite obvious, if you read the code of this two functions that
it was written with one goal in mind, show that any library without
those two functions was slow... or, to be more exact, that any other
library (i.e. not MPIR-1.3) was slow.

BUT THIS IS A FAKE!

As a conclusion, on my laptop, MPIR is able to be faster than GMP
!!!!!!!!ONLY CHEATING!!!!!!!

You guys are very funny!!!! :-D
Because the cheating is so evident that when your library is slower on
all operation, the fake application is 3-4 times faster with your
funny-library... and your benchmark is so.... ingenuous .... to
conclude that overall the funny-fake-library is faster!!!!
RIDICULOUS!!!!!

But now be serious, and confess... MPIR is not a library, it's a
joke! :-D

AH AH AH AH!!!!
Adios!

Gian.
-- 
You received this message because you are subscribed to the Google Groups 
"mpir-devel" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/mpir-devel?hl=en.


Reply via email to