Mersenne Digest        Sunday, December 3 2000        Volume 01 : Number 797




----------------------------------------------------------------------

Date: Fri, 01 Dec 2000 00:13:25 EST
From: [EMAIL PROTECTED]
Subject: Re: Mersenne: P4

Jason Papadopoulos wrote:

>> The alpha was already at least 5x faster than a PIII for multiprecision
>> arithmetic at the same clock speed; with the P4 it will only get worse.

Brian Beesley replied:

> Are you sure about this? I think, with Alpha, you have to execute the 
> instruction twice to get a double-precision multiplication - you can 
> store either the low half or the high half of a product in one 
> instruction, but not both.

True, you need two instructions to get both halves of a 64x64 ==> 128-bit
product on the Alpha. It always seemed wasteful to me to have an instruction
(e.g. Alpha umulh) to get the upper half a product, which simply discards
the lower half - by way of contrast, MIPS dmult calculates the full 128-bit
product in one shot, returning the result in *two* 64-bit integer registers -
but apparently the Alpha designers wanted all instructions to be in 2-input,
one-output-register format, no exceptions. You might say they felt it would
be less than RISCy to do otherwise.

Now, on the Alpha 21064, integer muls are unpipelined and slow, so getting
a 128-bit product takes around 30 cycles. On the 21164 imuls are partially
pipelined - one can start at most every 8 cycles - so a double-width product
needs around 16 cycles, very close to what the MIPS needs using its unpipelined
dmultu instruction. Now on the Alpha 21264, imuls still have fairly long
latency (I believe around 6-8 cycles) but are *fully* pipelined, so even
with the need to use 2 instructions to do it, one can effectively get a
128-bit product every 2 cycles, which really screams.

The IA-64 will (from what I've read) actually be very much like the 21264
in this respect: 2 instructions for the separate halves of a double-wide
integer product (the IA-64 analog of Alpha umulh is called xma.hu), but
here the muls will use the x86-style floating multiplier, whose 64-bit
mantissa is of convenient length for 64-bit integer multiply. Thus, imuls
will benefit from the fact that the fmul unit is already designed to have
fairly low latency and, more importantly, is highly pipelined.

How is any of this relevant to Mersenne testing? Well, fast 64-bit integer
ops should make a well-designed all-integer convolution algorithm competitive
with a floating-point-FFT-based one. However, neither of these two is truly
satisfying since each basically uses just half the functionality (integer
or floating-point) of the processor. We really need an algorithm that does
floating and integer convolutions in parallel and uses some method (e.g.
Chinese remainder theorem or some other form of error correction) to
recorrelate the two result vectors at the end of each convolution step.
Such methods are feasible, but highly nontrivial to implement, and even
more challenging to implement efficiently. Stay tuned...

- -ernst

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 01 Dec 2000 01:08:46 EST
From: "Chris Nash" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Factoring

Hi folks,

Off-topic I know, but...

>We could let prime95 decide the next election <grin>.
>Give everybody a different prime number. Multiply the primes for 
candidate A together, likewise for B.

And like in this election, unless we can completely factor the results, 
we wouldn't know who won, either.

:)
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 1 Dec 2000 07:22:24 +0100 (MET)
From: [EMAIL PROTECTED]
Subject: Re: Mersenne: Factoring

    Chris Nash proposed:

> >We could let prime95 decide the next election <grin>.
> >Give everybody a different prime number. Multiply the primes for 
> candidate A together, likewise for B.
 
> And like in this election, unless we can completely factor the results, 
> we wouldn't know who won, either.
 
    No need to factor during primery elections.

    


_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 1 Dec 2000 02:54:35 -0800 
From: Paul Leyland <[EMAIL PROTECTED]>
Subject: RE: Mersenne: Factoring

> >We could let prime95 decide the next election <grin>.
> >Give everybody a different prime number. Multiply the primes for 
> candidate A together, likewise for B.
> 
> And like in this election, unless we can completely factor 
> the results, 
> we wouldn't know who won, either.
> 
> :)

I note the smiley, but I also assume that we know the original primes as
well, as it's built into the double voting detection mechanism.  If after
dividing through all allocated primes the residue is > 1, we can conclude
that at least one voter registered a protest by spoiling their ballot paper.
Note that we also know who did *not* vote, if their prime doesn't appear in
either product.

A major problem with this protocol as I see it is that a third party can
steal your vote by stealing your prime and using it first.
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 01 Dec 2000 14:22:21 +0100
From: Guillermo Ballester Valor <[EMAIL PROTECTED]>
Subject: Mersenne: Glucas v.2.0 released

Hi:

After some delays, here is Glucas 2.0. Now I think it  is stable enough
to try complete Lucas-Lehmer test from PRIMENET (using manual forms, of
course). You can download the package from E.W.MAYER server (thanks
Ernst):
   
        ftp://209.133.33.182/pub/valor/Glucas-2.0.tar.gz

You can read more about Glucas in
        
        ftp://209.133.33.182/pub/valor/README.Glucas.htm
        
The performance is near Mlucas when it is well tuned. It can be a good
chance to extent GIMPS and Lucas-Lehmer test to platforms with good
C-compilers but no expensive f90 ones. 

The actual release is, at the moment, for the UNIX/Linux world and all
its variants. For x86 users, of course, you should use mprime (Glucas is
about 65% of performance with respect mprime), but Glucas can be used
for Double-Check proposes. 

Some remarkable features of this release are:
        -It uses the Interchangeable Mersenne Residue File Format to save
files. We can use the save files in most of the systems (and they are
very compacted). Nevertheless, it has backward compatibility for Will
Edgington rw() routines used in MacLucasUNIX. 
        -There is no problem with accuracy. Glucas adjust the FFT runlength
size at run time whether the roundoff error are too high. 
        -It is coded using intensively C-macro facilities. It is relatively
easy to write small assembler macros to speed it up (as made for x86's
GNU/gcc compiler ).

There are still no precompiled binaries. I think soon will be binaries
for Alpha-Osf (ev56, ev6). Thus, you have to make the binary :(. We need
some Unix's GURUs to make the binaries as fast as possible. You can read
in the documentation how to test and timing Glucas.  

Any help, feedback or suggestion is welcome.

Regards


Guillermo.
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 01 Dec 2000 12:35:24 -0500
From: Nathan Russell <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Factoring

Paul Leyland wrote:
> 
> > >We could let prime95 decide the next election <grin>.
> > >Give everybody a different prime number. Multiply the primes for
> > candidate A together, likewise for B.
> >
> > And like in this election, unless we can completely factor
> > the results,
> > we wouldn't know who won, either.
> >
> > :)
> 
> I note the smiley, but I also assume that we know the original primes as
> well, as it's built into the double voting detection mechanism.  If after
> dividing through all allocated primes the residue is > 1, we can conclude
> that at least one voter registered a protest by spoiling their ballot paper.
> Note that we also know who did *not* vote, if their prime doesn't appear in
> either product.
> 
> A major problem with this protocol as I see it is that a third party can
> steal your vote by stealing your prime and using it first.


These problems could be solved by storing the primes on a single,
secured machine and then sending them to voters, eg, on a floppy through
certified mail.  

Correct me if I'm wrong, but by using primes with, say, 150 decimal
digits generated from strong random numbers, we'd be quite safe.  

The problem is for the government to factor the huge composite number
for each candidate afterwards.  Can someone give me a rough estimate of
how long it would take a good supercomputer to check an arbitrary
100,000,000 digit number for several factors that are each a known 150
digit arbitrary prime?  

Nathan
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 1 Dec 2000 09:44:35 -0800 
From: Paul Leyland <[EMAIL PROTECTED]>
Subject: RE: Mersenne: Factoring

> The problem is for the government to factor the huge composite number
> for each candidate afterwards.  Can someone give me a rough 
> estimate of
> how long it would take a good supercomputer to check an arbitrary
> 100,000,000 digit number for several factors that are each a known 150
> digit arbitrary prime?  


Not long at all, and you don't need a supercomputer.  It's perfectly
parallelizable.  Use a room full of PCs and give them disjoint ranges of
primes to test.

A major problem, which no-one has yet commented on, is that the protocol as
stated doesn't allow a secret vote.  Only if no-one other than the voter
knows who received which number, and the voter knows only his/her own, can
it be secret.  Patching up this hole is relatively straightforward and is
left as an exercise for the reader ;-)



Paul
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 1 Dec 2000 19:06:44 -0000
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: P4

On 1 Dec 00, at 0:13, [EMAIL PROTECTED] wrote:

[... snip ...]
> How is any of this relevant to Mersenne testing? Well, fast 64-bit integer
> ops should make a well-designed all-integer convolution algorithm
> competitive with a floating-point-FFT-based one. However, neither of these
> two is truly satisfying since each basically uses just half the
> functionality (integer or floating-point) of the processor.

Half the execution units ... but maybe there may already be a 
bottleneck e.g. in the memory bus throughput, or in the instruction 
decoder.

If we were free to design a processor optimized for mega-precision 
arithmetic, we'd have registers _much_ longer than 64 bits. We would 
probably also have multi-operand instructions, allowing e.g. elements 
of a radix-N FFT to be computed in a single instruction. At any rate 
we _definitely_ want to be able to evaultate a * b + c as fast as 
possible - and it is possible to calculate the result in the time 
taken to do just the multiplication.

But we'd still need to move the operands between CPU registers and 
main memory. This is likely to continue to represent a bottleneck, 
which could only be got around by using very wide data busses and/or 
very fast (static) RAM - strategies which would result in expensive 
hardware systems.


Regards
Brian Beesley
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 1 Dec 2000 15:54:25 -0500 (EST)
From: Jason Stratos Papadopoulos <[EMAIL PROTECTED]>
Subject: Re: Mersenne: P4

On Fri, 1 Dec 2000, Brian J. Beesley wrote:

> On 1 Dec 00, at 0:13, [EMAIL PROTECTED] wrote:
> 
> [... snip ...]
> > How is any of this relevant to Mersenne testing? Well, fast 64-bit integer
> > ops should make a well-designed all-integer convolution algorithm
> > competitive with a floating-point-FFT-based one. However, neither of these
> > two is truly satisfying since each basically uses just half the
> > functionality (integer or floating-point) of the processor.
> 
> Half the execution units ... but maybe there may already be a 
> bottleneck e.g. in the memory bus throughput, or in the instruction 
> decoder.

This is definitely an issue with the Alpha 21264, which can execute
6 instructions per clock (four integer, 2 floating) but can only
decode 4 instructions per clock.

> If we were free to design a processor optimized for mega-precision 
> arithmetic, we'd have registers _much_ longer than 64 bits. We would 
> probably also have multi-operand instructions, allowing e.g. elements 
> of a radix-N FFT to be computed in a single instruction. At any rate 
> we _definitely_ want to be able to evaultate a * b + c as fast as 
> possible - and it is possible to calculate the result in the time 
> taken to do just the multiplication.

> But we'd still need to move the operands between CPU registers and 
> main memory. This is likely to continue to represent a bottleneck, 
> which could only be got around by using very wide data busses and/or 
> very fast (static) RAM - strategies which would result in expensive 
> hardware systems.

Or a huge pipeline with banked memory. This would actually not be
extremely difficult: built a vector NTT butterfly ASIC with its
own DMA engines, and connect it to multibank DRAM arrays. What you'd
get is a vector processor which could execute convolution primitives.

Sometimes I wish I knew more about hardware...

jasonp

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 01 Dec 2000 22:31:09 +0100
From: Guillermo Ballester Valor <[EMAIL PROTECTED]>
Subject: Re: Mersenne: Glucas v.2.0 released

Guillermo Ballester Valor wrote:
> There are still no precompiled binaries. I think soon will be binaries
> for Alpha-Osf (ev56, ev6). Thus, you have to make the binary :(. 

Now there are two binaries in the directory

        ftp://209.133.33.182/pub/valor/bin

one is for Alpha ev5-OSF and the other is for pentium GNU/Linux glib2.0

You can play with it.

Regards

Guillermo.
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 01 Dec 2000 21:03:52 -0500
From: George Woltman <[EMAIL PROTECTED]>
Subject: Mersenne: P4 arrives!

Hi all,

        Dell came through and shipped ahead of schedule.  Unfortunately I
was not able to upgrade the PC600 memory to PC800.  The first benchmarks 
are in.
You can find them at http://www.mersenne.org/bench.htm  The 1.4 GHz P4 is 
about as
fast as an 850 MHz Athlon or a 1200 MHz P-III.

        I'll keep everyone posted on improvements as development proceeds.

Regards,
George

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Fri, 01 Dec 2000 23:49:43 -0500
From: Jud McCranie <[EMAIL PROTECTED]>
Subject: Re: Mersenne: P4 arrives!

At 09:03 PM 12/1/2000 -0500, George Woltman wrote:
         Dell came through and shipped ahead of schedule.  Unfortunately I
>was not able to upgrade the PC600 memory to PC800.  The first benchmarks 
>are in.
>You can find them at http://www.mersenne.org/bench.htm  The 1.4 GHz P4 is 
>about as
>fast as an 850 MHz Athlon or a 1200 MHz P-III.

Not very impressive for the P4.  I'm planning to get a new computer next 
year, and this strongly influences my choice.


+---------------------------------------------------------+
|     Jud McCranie                                        |
|                                                         |
| Programming Achieved with Structure, Clarity, And Logic |
+---------------------------------------------------------+


_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Sat, 2 Dec 2000 10:29:02 -0800
From: "xqrpa" <[EMAIL PROTECTED]>
Subject: Re: Mersenne: P4 arrives!

Very discouraging at first blush.  Could you give us a brief summary of the
the recoding possibilities that might take adavntage of the new features?

Seems a shame that the much improved P4 memory bandwidth serves for
naught.  The chip and the mobos listed seem outlandishly expensive, too.
Double ratbite...

With Pipelined Puzzlement,
Stefanovic

- ----- Original Message -----
From: George Woltman <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, December 01, 2000 6:03 PM
Subject: Mersenne: P4 arrives!


> Hi all,
>
> Dell came through and shipped ahead of schedule.  Unfortunately I
> was not able to upgrade the PC600 memory to PC800.  The first benchmarks
> are in.
> You can find them at http://www.mersenne.org/bench.htm  The 1.4 GHz P4 is
> about as
> fast as an 850 MHz Athlon or a 1200 MHz P-III.
>
> I'll keep everyone posted on improvements as development proceeds.
>
> Regards,
> George
>
> _________________________________________________________________________
> Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
> Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers
>

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Sat, 02 Dec 2000 14:16:37 -0500
From: George Woltman <[EMAIL PROTECTED]>
Subject: Re: Mersenne: P4 arrives!

Hi,

At 10:29 AM 12/2/2000 -0800, xqrpa wrote:
>Very discouraging at first blush.  Could you give us a brief summary of the
>the recoding possibilities that might take adavntage of the new features?
>
>Seems a shame that the much improved P4 memory bandwidth serves for
>naught.

I was rather pleased with the results.  As noted earlier, the latencies are 
high
for the regular FPU.  The SSE2 instructions should result in more floating
point operations per clock cycle.

The memory bandwidth is for naught!  Prime95 was designed with a 32 byte
cache line in mind.  The L2 cache of the P4 has 128 byte cache lines.  Thus,
prime95 is often using only 25% of the bytes the P4 has fetched into
the L2 cache.

While speculating at this point is dangerous.  I suspect an optimized prime95
will be more than twice as fast as an Athlon 700.

>  The chip and the mobos listed seem outlandishly expensive, too.
>Double ratbite.

I can't help you there!  The best bang for your buck today is an Athlon system.
I would wait until next summer or fall before purchasing a P4.  The clock 
rate will be
higher and support for DDR SDRAM may be available.  Also, the shrink to a 0.13
micron process may be complete by then.    If upgradability is important to 
you, be aware
that todays P4 motherboards are a "dead end".  Future P4 chips will not 
work in them.

Regards,
George

_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

Date: Sun, 3 Dec 2000 12:57:41 +0100 
From: "Hoogendoorn, Sander" <[EMAIL PROTECTED]>
Subject: Mersenne: Unofficial milestone!

Hi,

Today we reached 100.000 cleared exponents since last synchronization on
primenet.
When will the next synchronization be?
The list with cleard exponents is getting a bit to long to download that
often.
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers

------------------------------

End of Mersenne Digest V1 #797
******************************

Reply via email to