Re: Mersenne: Re: and, from WAY out in left field...

1999-09-25 Thread Brian J. Beesley

On 25 Sep 99, at 1:23, [EMAIL PROTECTED] wrote:

 As with all distributed efforts, we all contribute what we can. It's not
 a contest, except perhaps amongst us programmers, in terms of pushing each
 other to continually improve our codes.

There are other ways of contributing as well as simply contributing 
CPU cycles - valuable though that contribution is!

 Sorry, I am bitter and can't afford a 550 Pentium.
 
 Hopefully soon - here prices have dropped below 1000 $US for 500MHz systems -
 you can get a 300-400MHz PII virtually for free if you're willing to pay
 $20/month for 36 months of unlimited Internet access. You should try buying
 online, e.g. at www.valueamerica.com or some such store.

UK "shop window" retail prices _look_ awful - but they insist on 
selling you a huge monitor, "high-end" multimedia hardware (possibly 
including an awful scanner and a worse digital camera), a 56K modem 
(built-in WinModem, useless unless you're running Windoze) and a huge 
raft of "free" software (mostly obsolete and/or rubbish they can't 
offload any other way, except as landfill) as well as the bits you 
actually need. You can buy basic systems at quite reasonable prices 
here, if you look in the right place e.g. £440 (ex VAT) for a 
complete PII-400 system. PIII-500 systems are a couple of hundred 
pounds more expensive, but that's Intel pricing policy. In terms of 
"bangs per buck", the systems with the fastest processors are 
relatively poor value for money (especially as the performance of 
"real" programs is not linearly related to the CPU clock speed!)

There is an _enormous_ improvement in speed between a Cyrix 333 and a 
PII-400, _much_ less between a PII-400 and a PIII-600 (or whatever 
the fastest chip Intel do at the moment is - I find it hard even to 
keep track of where the "bleeding edge" is, so fast is the turnover 
of products!)

Bearing in mind the transport costs, the complications with customs 
duty and the possible risks involved in dealing in foreign exchange 
(you don't know the exchange rate until the credit card company 
processes the transaction, and they can, and do, manipulate this to 
your disadvantage), I'm not sure that there's any real benefit in 
dealing direct with US suppliers for single unit purchases. Bulk 
purchasing may be an entirely different matter.

Here in the UK, component prices seem to be reasonably competitive, 
especially when purchased through the many outlets which include free 
delivery. This is a major saving on a small order.

 How do I know that the exponent I am testing has not already been run
 to completion?

If someone else has run your exponent to completion (unlikely) you 
will find out when you check in your completion dates. The PrimeNet 
server will mutter at you.


Regards
Brian Beesley
_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Re: FFTW for GIMPS?

1999-09-25 Thread Jason Stratos Papadopoulos

On Sat, 25 Sep 1999, Guillermo Ballester Valor wrote:

 Yes, I was certainly able to adapt lucdwt and McLucasUNIX in four days.
 On the other hand, my goal was only to find out whether working with FFTW
 is a good idea, and the timings obtained make me think it could be.


For really big FFTs you can get major gains by using FFTW as a building
block in a bigger program, rather than have it do your entire FFT with
a single function call. As Ernst mentioned, the building block approach 
lets you fold some of the forward and inverse FFT computations together,
and this saves a great deal of time by avoiding cache misses. On the UltraSPARC,
using FFTW for the pieces of your computation rather than the whole thing
is somewhere between 2 and 10 times faster than FFTW alone.
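
(For concreteness, a minimal sketch of the "single function call" baseline
being contrasted here, assuming the FFTW 2.x interface of the time
(fftw_create_plan / fftw_one) and not anyone's actual code: a cyclic
squaring done as forward FFT, pointwise complex squaring, inverse FFT.
The building-block approach instead hands FFTW only the row/column
sub-transforms and folds the outer passes into the squaring step.)

    #include <fftw.h>      /* FFTW 2.x header */
    #include <stdlib.h>

    /* Cyclic squaring of a length-n complex signal: forward FFT,
       pointwise squaring, inverse FFT, then scale by 1/n (FFTW's
       inverse transform is unnormalized). */
    void cyclic_square(fftw_complex *x, int n)
    {
        fftw_complex *y = (fftw_complex *) malloc(n * sizeof(fftw_complex));
        fftw_plan fwd = fftw_create_plan(n, FFTW_FORWARD,  FFTW_ESTIMATE);
        fftw_plan inv = fftw_create_plan(n, FFTW_BACKWARD, FFTW_ESTIMATE);
        int i;

        fftw_one(fwd, x, y);          /* forward transform: x -> y */
        for (i = 0; i < n; i++) {     /* (a+bi)^2 = a^2-b^2 + 2abi */
            double re = y[i].re, im = y[i].im;
            y[i].re = re * re - im * im;
            y[i].im = 2.0 * re * im;
        }
        fftw_one(inv, y, x);          /* inverse transform: y -> x */
        for (i = 0; i < n; i++) {     /* remove the factor of n */
            x[i].re /= n;
            x[i].im /= n;
        }
        fftw_destroy_plan(fwd);
        fftw_destroy_plan(inv);
        free(y);
    }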

 If your comparison carried over to Intel machines (which is probably a
 wrong assumption), your code would run nearly as fast as mprime!! You say
 your code is twice as fast as FFTW, and sure it is, *BUT* on my Pentium-166
 the short code I wrote does an iteration of my current exponent, 3975659,
 in 0.901 seconds while mprime takes only 0.359. That makes an RPI of 40%.
 Then your code would reach nearly 90%, and without lots of assembler code!

On the Pentium, assembly-coded small FFTs run more than twice as fast
as FFTW. Even from C, you can do better on the Pentium (do a web search
for djbfft, a free Pentium-optimized FFT package). For a recursive
split-radix, you need about 200 lines of assembly; surely this is worth
twice the speed!

jasonp


_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Front-end design

1999-09-25 Thread Jukka Santala

Pierre Abbat wrote:

 I suggest a couple of named pipes for control (the front end writes to one and
 reads from the other, and mprime vice versa). Since writing to a pipe whose
 reader is stuck can get you blocked when the pipe is full, and writing to a
 pipe with nothing at the other end results in a SIGPIPE, mprime should not
 output anything to the pipe unless told to do so, and should trap SIGPIPE and
 turn off output.

The current .ini structure is, I feel, sufficient for controlling the
operation of the client, and preferable. However, some signal should
exist to force the client to re-read these settings after a change; at
least on *nixes this should be easy (just handle HUP). As for data out
of the client, I like to think that the existing save-state files would
be enough, or could be made to be enough. I'm not sure how many cycles
generating them frequently would burn, though, which could be a
problem. Also, the timing data is not easily accessible this way, but I
like the approach used in the recently posted *nix GUI. And I concur
with the idea that the "extended GUI" should, if possible, be separate
from the main client, so as not to run the risk of messing up the
results.
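
(A minimal sketch of the HUP idea in C: the handler only sets a flag,
and the client's main loop polls it. reread_ini and reload_settings()
are made-up names for illustration, not anything in the actual mprime
source.)

    #include <signal.h>
    #include <string.h>

    static volatile sig_atomic_t reread_ini = 0;

    static void on_hup(int sig)
    {
        (void)sig;
        reread_ini = 1;          /* just set a flag; do the real work later */
    }

    int main(void)
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_hup;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGHUP, &sa, NULL);   /* "kill -HUP <pid>" requests a re-read */

        for (;;) {
            if (reread_ini) {
                reread_ini = 0;
                /* reload_settings();  hypothetical: re-parse the .ini files */
            }
            /* ... do one LL iteration, periodically write save files ... */
        }
    }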

 -Donwulff
_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Decamega Tests (was: RE: Mersenne: GIMPS client output)

1999-09-25 Thread Herb Savage

John R Pierce wrote:


 There's another trick to this, primarily useful in assembler not C
 programming...

You can do it in C also.  It's called Duff's Device.  It's ugly but
legal C code.

Here's a URL with info on it:

   http://bs.cs.tu-berlin.de/~jutta/c/duffs-device.html
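
(A minimal sketch of what Duff's Device looks like, for anyone who
hasn't seen it: the switch jumps into the middle of an 8-way unrolled
copy loop so that the n % 8 leftover elements are handled on the first
pass. The name copy8 and the unroll factor are just for illustration;
prime95's real loops are unrolled far more heavily, in assembler.)

    #include <stddef.h>

    /* Copy n doubles with the body unrolled 8 times (Duff's Device). */
    void copy8(double *dst, const double *src, size_t n)
    {
        size_t passes = (n + 7) / 8;   /* trips around the unrolled loop */

        if (n == 0)
            return;

        switch (n % 8) {
        case 0: do { *dst++ = *src++;
        case 7:      *dst++ = *src++;
        case 6:      *dst++ = *src++;
        case 5:      *dst++ = *src++;
        case 4:      *dst++ = *src++;
        case 3:      *dst++ = *src++;
        case 2:      *dst++ = *src++;
        case 1:      *dst++ = *src++;
                } while (--passes > 0);
        }
    }

Ugly, as advertised, but perfectly legal C: every element is copied by
straight-line code, with only one branch taken per eight elements.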

Regards,

Herb Savage


 Say you unroll something 16 times.  Take the actual iteration count
 modulo 16, and JMP into the loop at that offset, with the repeat
 counter set to the count/16.  i.e. if you need to do, say, 60 iterations of
 the inner code, that's 48 + 12, so you set the loop count to 3 and jump
 past the first 16 - 12 = 4 blocks, so that the remaining 12 blocks execute
 before the counter is first checked.

 Anyways, prime95 is HEAVILY unrolled, using assembler macros to generate the
 inline linear inner 'loops'.

 -jrp
_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne: Re: Mlucas 2.7 for x86?

1999-09-25 Thread EWMAYER

Guillermo Ballester Valor ([EMAIL PROTECTED]) writes:

 You say your code is twice as fast as FFTW, and sure it is, *BUT* on my
Pentium-166 the short code I wrote does an iteration of my current
exponent, 3975659, in 0.901 seconds while mprime takes only 0.359. That
makes an RPI of 40%. Then your code would reach nearly 90%, and without
lots of assembler code!

Hi, Guillermo:

Alas, I don't think it works that way, for several reasons. On RISC
microprocessors, instruction scheduling (while nontrivial) is much
easier than on CISCs, meaning that a good compiler can usually generate
code that runs nearly as fast as hand-tuned assembly code (ASM) written
by a skilled programmer. By "nearly" I mean 50-90%, depending on the
CPU, the compiler and the ASM programmer. Non-RISCs like the Intel x86
are notoriously difficult to generate good code for, either automatically
or by hand - I believe George spent 2 years getting the ASM for the early
versions of Prime95 put together, and he (George - correct me if I'm wrong)
was an x86 ASM expert to begin with.

I'm not familiar enough with the details of FFTW to say for sure (Steven
Johnson or Jason Papadopoulos could answer this), but I'm pretty sure
Frigo/Johnson have done some machine-specific ASM of critical portions
of the code, at least for some popular architectures - if so, the x86
would be one of the first candidates, for the above reason and the fact
that it's so numerous. That may explain why FFTW performs so well on
the x86. Whenever I've tried my code (not including v2.7 - that should
run at least 2x faster on x86 than 2.6, in my estimation) on x86, it's
been 5-10x slower than Prime95. (That's partly due to the fact that my
f90 compiler for x86 is awful, but only partly.)

Is there any Linux or Windows Mlucas 2.7 executable for Intel machines?

No, but anyone with an f90 compiler for x86 is free to download and
compile the source. By far the best compiler I know of for x86 is the
Digital/Compaq/Microsoft Visual f90 for Windows - I don't have that 
one to play with, unfortunately, but I'd be interested to hear what 
timings anyone who does have it gets.

Regards,
Ernst

_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne: Re: Mlucas 2.7 for x86?

1999-09-25 Thread Guillermo Ballester Valor

[EMAIL PROTECTED] wrote:
 

 I'm not familiar enough with the details of FFTW to say for sure (Steven
 Johnson or Jason Papadopoulos could answer this), but I'm pretty sure
 Frigo/Johnson have done some machine-specific ASM of critical portions
 of the code, at least for some popular architectures - if so, the x86
 would be one of the first candidates, for the above reason and the fact
 that it's so numerous. That may explain why FFTW performs so well on
 the x86. 

I've skimmed through the code used and haven't seen a single line of
assembler. Perhaps the C code has been tuned with the x86 in mind.

 
 
 Is there any Linux or Windows Mlucas 2.7 executable for Intel machines?
 
 No, but anyone with an f90 compiler for x86 is free to download and
 compile the source. By far the best compiler I know of for x86 is the
 Digital/Compaq/Microsoft Visual f90 for Windows - I don't have that
 one to play with, unfortunately, but I'd be interested to hear what
 timings anyone who does have it gets.
 

I've found a Microsoft f90 compiler. I tried to compile the code this
afternoon, but it gives me some errors which I think will be easy to
solve. The compiler requires explicit interfaces for most of the FFT
routines because the first parameter 'a' has the TARGET attribute !? :-(

I don't have much time, but I'll keep working on it.

Yours sincerely,

| Guillermo Ballester Valor   |  
| [EMAIL PROTECTED]  |  
| c/ cordoba, 19  |
| 18151-Ogijares (Spain)  |
| (Linux registered user 1171811) |
_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Re: FFTW for GIMPS?

1999-09-25 Thread Guillermo Ballester Valor

Jason Stratos Papadopoulos wrote:
 For really big FFTs you can get major gains by using FFTW as a building
 block in a bigger program, rather than have it do your entire FFT with
 a single function call. As Ernst mentioned, the building block approach
 lets you fold some of the forward and inverse FFT computations together,
 and this saves a great deal of time by avoiding cache misses. On the UltraSPARC,
 using FFTW for the pieces of your computation rather than the whole thing
 is somewhere between 2 and 10 times faster than FFTW alone.
 
It could be terrific! I'll look into it.

 On the Pentium, assembly-coded small FFTs run more than twice as fast
 as FFTW. Even from C, you can do better on the Pentium (do a web search
 for djbfft, a free Pentium-optimized FFT package). For a recursive
 split-radix, you need about 200 lines of assembly; surely this is worth
 twice the speed!
 
I would like to write some C code for general purposes. For tuned
assembler we have George Woltman's fantastic prime95/mprime code.

Thank you very much for your comments; they will help me a lot.

Have a nice weekend.


| Guillermo Ballester Valor   |  
| [EMAIL PROTECTED]  |  
| c/ cordoba, 19  |
| 18151-Ogijares (Spain)  |
| (Linux registered user 1171811) |
_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Mersenne: Graphical visualisation of computations

1999-09-25 Thread Robert van der Peijl


Jukka Santala [EMAIL PROTECTED] wrote on Friday 24 September 1999 at 5:25
PM:

 Though you ask this, I find the topic rather appropriate for the list,
 especially given the angle of "HOW can we visualize the process of
 mathematical operations?"

That brings me to the following observation:
I think few of us fully realize the _enormous_ amount of (highly
optimized) processing required each iteration to achieve something that
looks so simple: a squaring, minus 2, mod (2^p-1).
Remember, you see nothing happening! Prime just tells you: I used
so-and-so-many BILLION clock cycles on the previous iteration. When a
computer is ray-tracing (preferably on some fancy high-end workstation)
you are immediately dazzled by the picture-perfect pixels appearing on the
screen in brilliant colours.
But your result is not for eternity. In GIMPS it is. M7xx tested.
Composite. Or: M10yy trial-factored to 2^65. No factors below 2^65.
And you know that YOU discovered that mathematical fact.
That little 64-bit residue or checksum is precious. If you had a grain
of rice for every clock cycle needed to produce just one of those
bits, you could feed the entire (current) world population for a whole
year! Is that about right?
Whatever the correct figure, the program doesn't show you what it's doing.
Actually, it might be interesting to be able to see, say, just the last
64 bits (in hex) of the current number L[n] in the Lucas sequence. The
program displays those bits for L[p-1], right? So why not during the
sequence?
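
(For the curious, the iteration itself is tiny when written out for toy
exponents. A minimal C sketch, valid only for p <= 31 so that the
squaring fits in 64-bit integers (real clients do the same arithmetic on
multi-million-bit numbers via FFT multiplication), printing the kind of
per-iteration residue suggested above:)

    #include <stdio.h>

    /* Bare Lucas-Lehmer test of M(p) = 2^p - 1 for a small odd prime p.
       L[1] = 4, L[n+1] = L[n]^2 - 2 mod M(p); M(p) is prime iff L[p-1] = 0. */
    static int lucas_lehmer(unsigned p)
    {
        unsigned long long M = (1ULL << p) - 1;
        unsigned long long s = 4;
        unsigned n;

        for (n = 2; n <= p - 1; n++) {
            s = (s * s + M - 2) % M;            /* square, minus 2, mod 2^p-1 */
            printf("L[%u] = %016llX\n", n, s);  /* low 64 bits of the residue */
        }
        return s == 0;
    }

    int main(void)
    {
        printf("M13 is %s\n", lucas_lehmer(13) ? "prime" : "composite");
        return 0;
    }
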
The program should IMHO at least -optionally- display a progress bar
showing graphically what it now only shows in digits:
progress on current exponent [95.2% completed]. The bar would probably
grow one pixel longer every couple of hours.
Any comments on the issue of graphical visualisation?

Cheers,
Robert van der Peijl

_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Graphical visualisation of computations

1999-09-25 Thread Chris Nash

 I think few of us fully realize the _enormous_ amount of (highly
 optimized) processing required each iteration to achieve something that
 looks so simple: a squaring, minus 2, mod (2^p-1).
 Whatever the correct figure, the program doesn't show you what it's doing.

This is an interesting point - not so much whether a GUI would be nice or
not, but what would it show? A progress bar, maybe... anything else? There
isn't really anything else to show. Intermediate results of the LL test
don't themselves have a lot of meaning (even the final result, if non-zero,
is devoid of much interpretation). There's not a lot you could plot - a
graph of the iteration time would only serve to show when you opened
Microsoft Word or something...

I must admit, I dallied (albeit very briefly) with the bovine rc5 client.
Were it not for their peculiar statistics ('if keys were drops of water, we
could flood the world in a week' etc) I'd have got bored of the effort a lot
sooner than I did. The SETI client looks pretty (more of a screen burner
than a screen saver though) but the display is at best meaningless, and
a waste of cycles besides.

I think we ought to count ourselves lucky that what we've got is bare-bones
and low-impact.

Chris Nash
Lexington KY
UNITED STATES



_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers



Re: Mersenne: Graphical visualisation of computations

1999-09-25 Thread Lucas Wiman

  I must admit, I dallied (albeit very briefly) with the bovine rc5 client.
 
 Gack!  About the only manufactured challenge comes from ID software.

Should read "About the only more manufactured challenge comes from ID
software."  :)

-Lucas
_
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ  -- http://www.tasam.com/~lrwiman/FAQ-mers