> >Speaking of calculating,
> >
> >
> >What do you think is the best approach for this problem?
> >
> >Calculate the n-th (any number) decimal digit of the expansion of a
> >regular rational 1/n (not necessarily the same n), with n a prime number,
> >without calculating the preceding d
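The question above has a neat answer for decimal digits: the n-th digit of 1/p is floor(10^n / p) mod 10, which modular exponentiation gives without touching the earlier digits. A minimal sketch (function name is mine, not from the thread):

```python
def nth_digit_of_reciprocal(n, p):
    """Return the n-th decimal digit (1-indexed) of 1/p without
    computing the preceding digits.

    The n-th digit is floor(10^n / p) mod 10, which equals
    floor(10 * (10^(n-1) mod p) / p); three-argument pow() does the
    modular exponentiation in O(log n) multiplications.
    """
    return (10 * pow(10, n - 1, p)) // p

# 1/7 = 0.142857142857..., so the 4th digit is 8
print(nth_digit_of_reciprocal(4, 7))  # 8
```

The cost is one modular exponentiation, so even very large n are cheap as long as p fits in memory.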
On Sun, 9 Sep 2001, George Woltman wrote:
> c) At these large FFT sizes, we are now putting pressure on the
> TLB caches. The TLB maps a virtual address into a physical address.
> Intel chips keep track of 64 TLB entries, each entry maps to a 4KB
> page.
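The pressure George describes is easy to quantify: 64 entries at 4 KB each cover only 256 KB of address space, far less than a large FFT's working set. A back-of-the-envelope sketch (TLB numbers from the post; the FFT size is an assumed example, not Prime95's):

```python
TLB_ENTRIES = 64
PAGE_SIZE = 4 * 1024                   # 4 KB pages, per the post
tlb_reach = TLB_ENTRIES * PAGE_SIZE    # total bytes mapped at once

# A hypothetical 1M-point double-precision FFT touches far more memory:
fft_bytes = 1024 * 1024 * 8            # 8 MB of data
pages_touched = fft_bytes // PAGE_SIZE

print(tlb_reach // 1024, pages_touched)  # 256 (KB of reach), 2048 pages
```

With 2048 pages in play and only 64 TLB slots, strided FFT passes all but guarantee TLB misses on every cache line of a new page.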
>
Hey everybody. I've realized after finishing an integer convolution
library for the Alpha (using number-theoretic FFTs) that even a nice
64-bit processor like the Alpha couldn't give me the speed I wanted for
integer-only Mersenne-mod squaring. I've also wanted for a while to build
integer-only c
On Mon, 4 Jun 2001, Russel Brooks wrote:
> After reading about Steve Gibson's recent problems at:
>
> > http://grc.com/dos/grcdos.htm
>
> I decided to install Zone Alarm on my home pc. Does anyone have
> comments on any interactions with Prime95 or any other comments?
Zone Alarm is quite nic
On Tue, 15 May 2001, Gareth Randall wrote:
> Also, any code would be very hardware specific, and may only work if
> the display was not displaying, say, a desktop.
>
> However, if someone could implement it, it could provide the *ultimate*
> in Mersenne related screen savers! What you'd see on the
Hey everybody. I recently replaced my regular GIMPS machine with a 1GHz K7
system carrying 256MB of virtual channel SDRAM. The machine is crunching
through the last of the double checking assignments its predecessor was
issued, and I noticed that the per-iteration time of .075 seconds is a
little
On Mon, 12 Mar 2001 [EMAIL PROTECTED] wrote:
> version, where the near-term payoff was much surer. Now
> that I've squeezed out about as much as I think can
> reasonably be done from the floating-point version, I've
> been thinking about the modular stuff again. Your
> experience with all-mod
On Mon, 12 Mar 2001 [EMAIL PROTECTED] wrote:
> Assuming the prime p is fixed at compile time, you can specify
> a primitive root g (of order p-1) in the binary. You can try g = 3, 5, 7, ...
> until you succeed. You will need the prime factorization of p-1
> when you test whether g is r
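The test being alluded to is the standard one: g is a primitive root mod p iff g^((p-1)/q) != 1 (mod p) for every distinct prime factor q of p-1. A sketch assuming that factorization is supplied, as the post suggests it would be at build time:

```python
def is_primitive_root(g, p, factors_of_p_minus_1):
    """True iff g has full order p-1 mod p.

    `factors_of_p_minus_1` is the set of distinct prime factors of
    p-1. If g^((p-1)/q) != 1 mod p for every prime factor q, no
    proper divisor of p-1 can be g's order, so g generates the
    whole multiplicative group.
    """
    return all(pow(g, (p - 1) // q, p) != 1 for q in factors_of_p_minus_1)

# p = 13, p-1 = 12 = 2^2 * 3; 2 is a primitive root mod 13, 3 is not
print(is_primitive_root(2, 13, {2, 3}), is_primitive_root(3, 13, {2, 3}))
```

Trying g = 3, 5, 7, ... as the post describes just means looping this check over small candidates until it returns true.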
Hello again. After working out how to do integer FFTs on the Alpha, I'm
considering additional code that uses Fast Galois Transforms (FGTs),
e.g. for complex integers in GF(p^2) for p a prime congruent to 3 mod
4. Bill Daly mentioned this idea before on the list (end of 1998) but
unfortunately di
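For p ≡ 3 (mod 4), -1 is a quadratic non-residue mod p, so GF(p^2) can be represented exactly as "complex integers" a + bi with i^2 = -1 and both parts reduced mod p, which is what makes the FGT idea work. A minimal sketch of the field multiply (representation and name are mine):

```python
def gf_p2_mul(x, y, p):
    """Multiply (a + b*i) * (c + d*i) in GF(p^2), with elements
    stored as pairs (a, b) mod p and i^2 = -1.

    Requires p ≡ 3 (mod 4) so that x^2 + 1 is irreducible mod p
    and this arithmetic really forms a field.
    """
    a, b = x
    c, d = y
    return ((a * c - b * d) % p, (a * d + b * c) % p)

# p = 7 (7 ≡ 3 mod 4): (1 + 2i)(3 + 4i) = -5 + 10i ≡ (2, 3) mod 7
print(gf_p2_mul((1, 2), (3, 4), 7))  # (2, 3)
```

The attraction for transforms is that one multiply in GF(p^2) carries two residues' worth of data, much as a complex FFT packs two real sequences.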
On Mon, 1 Jan 2001, George Woltman wrote:
> Now the bad news, when I use the prime95 memory layout where the 8 input
> values come from 8 different cache lines and the modified values are
> output to the same cache lines (an in-place FFT), the P4 code now takes 112
> clocks.
>
> The cause is t
Hello again. Version 1.1 of ICL is a service release; there are minor
speedups, a single major bug fix for Mersenne code and improved test
programs. The build process is now much more generic so that you don't
need the absolute latest GNU tools to compile it. You will still need an
Alpha, however
Hello. For any of you out there lucky enough to have access to an
Alpha 21264, I've developed a library for large integer convolutions
using number theoretic transforms and optimized for this processor. The
library distribution includes a sample program that performs a
Lucas-Lehmer test.
I desig
Hello. I'm putting the finishing touches on a large-integer convolution
library that's optimized for the Alpha ev6, and I want to build support
for Mersenne-mod convolution right into the library. However, the
library is integer-only and works with integers modulo a 62-bit prime
(eventually sever
On Mon, 11 Dec 2000, Richard B. Woods wrote:
> Jason Stratos Papadopoulos has alerted me to my not having sufficiently
> delimited the scope of my previous P4 future cautionary note in its
> topic paragraph rather than its final paragraph. That, alas, was a
> particular
On Sun, 10 Dec 2000, Richard B. Woods wrote:
> I urge those of you contemplating a Pentium 4 purchase to read a
> cautionary note about its future written by Thomas Pabst, author of
> "Tom's Hardware Guide" at http://www.sysdoc.pair.com
Anyone contemplating a P4 purchase and in need of second,
On Fri, 1 Dec 2000, Brian J. Beesley wrote:
> On 1 Dec 00, at 0:13, [EMAIL PROTECTED] wrote:
>
> [... snip ...]
> > How is any of this relevant to Mersenne testing? Well, fast 64-bit integer
> > ops should make a well-designed all-integer convolution algorithm
> > competitive with a floating-p
On Wed, 29 Nov 2000, Guillermo Ballester Valor wrote:
> Today, I've read in the manuals that a simple integer add with carry
> (addc) has 8 clocks of latency and 3 clocks of throughput for a P4.
> Hmm, that is a lot of slowdown for single IA-32 instructions; the
> Intel engineers must know the reasons.
Th
On Sat, 17 Jun 2000 [EMAIL PROTECTED] wrote:
> Will Prime95 be rewritten to run on the Itanium, when it comes out? Seems to
> me like 64-bit operation will speed it up significantly, as will the insane
> amount of registers and floating point units and all the other
> microprocessor
> w
On Mon, 24 Apr 2000, Henk Stokhorst wrote:
> Hi,
>
> The new IA-32 processor under development, codenamed 'Willamette', has
> a 64 bit FPU and an ALU running at twice the clock frequency. However,
> the latencies are different from the existing Pentiums.
>
> Did anybody have a look at the proces
> > The problem is that a 20 MHz 386 is loosely comparable to a 3 MHz
> > P-II.
It's much, much worse than that. Even with a coprocessor, a floating point
add or multiply on a 386 takes 28-57 clocks; on the PII it takes one,
if scheduled carefully.
When the 386 and 486 were state of the art, some
On Thu, 10 Feb 2000 [EMAIL PROTECTED] wrote:
> My only question would be, "Can we not figure out a way to stay in the
> frequency domain and correct the errors internally?"
If you can figure out a way to propagate carries while in the transform
domain, then you're home free (there are FFTs base
On Wed, 9 Feb 2000 [EMAIL PROTECTED] wrote:
> As I've found the available Athlon documentation (the technical brief
> and the code optimization guide from the AMD website) to be frustratingly
> vague about things like the register set architecture and the functional
> units, can anyone answer the
On Sun, 6 Feb 2000, Lars Lindley wrote:
> > However, the FFT itself is very amenable to parallel processing
> > techniques - on a processor with N independent compute pathways, you
> > can compute N elements in the same time that a single element would
> > take to compute just one.
> >
> How much
On Mon, 8 Nov 1999, Bill Rea wrote:
> I'm a bit red-faced on this one. I just tried it again and it doesn't.
> This is still a mystery to me. It would seem to me that for
> this type of code that having full access to the 64-bit instruction
> set of the UltraSPARC CPUS and running it on a 64-bit
On Thu, 28 Oct 1999 [EMAIL PROTECTED] wrote:
> A day seems somewhat of an overestimate (though maybe it takes that
> long on a Pentium - I don't know.) On a decently fast Alpha, say a 500MHz
> 21164, I can do a million-digit (and note I mean base-10 digits, not bits)
> GCD in under 5 minutes usin
On Sun, 26 Sep 1999 [EMAIL PROTECTED] wrote:
> I'm not too surprised at this. Since my code appears to be faster than
> FFTW on most high-end CPUs, that tells me that FFTW is probably optimized
> more for the x86 (very few FP registers) than mine, which is geared toward
> hardware with at least 3
On Sat, 25 Sep 1999, Olivier Langlois wrote:
> I've played a little bit FFTW and I've remarked that its performance can
> vary a lot depending on how good your compiler is at optimization.
Absolutely. Compile FFTW with gcc on a Sun and you'll get half the
speed of FFTW using Sun cc.
> For insta
On Sat, 25 Sep 1999, Guillermo Ballester Valor wrote:
> Yes, certainly I've been able to adapt lucdwt and McLucasUNIX in four
> days. On the other hand, my goal was only to find out whether working
> with FFTW is a good idea, and the timings obtained make me think it
> could be.
For really big FFTs you can get
On Sun, 19 Sep 1999, Conrad Curry wrote:
> There are several programs that can convert between intel and gas, but
> usually require some help in converting. One that can convert between
> NASM or MASM or Gas is at http://hermes.terminal.at/intel2gas/
Note that this program was designed to con
On Sun, 22 Aug 1999 [EMAIL PROTECTED] wrote:
> For example, a typical complex radix-16 FFT pass in my Mlucas code
> takes 16 complex data (32 8-byte floats), and including multiplies by
> "twiddle" factors (FFT sincos data) does 168 FADDs and 88 FMULs on them-
> that's nearly twice as many adds a
Sorry; it's come to my attention that I made a slight omission in
the source for my Fermat code.
If compiling in gcc or egcs, be sure to append -fno-inline-functions
to the compile line in f24main.c
jasonp
Hey everybody. Now that our Fermat testing is starting to wind down
I've decided to make available the source I've written for it.
www.glue.umd.edu/~jasonp/f24v131.zip
The code there is heavily optimized for the UltraSPARC processor,
and includes gobs and gobs of sparc assembly language. Squari
On Fri, 6 Aug 1999, Blosser, Jeremy wrote:
> Now, of course, you have the intergraph cards and SGI boxes (boxen?) which
> have super cool 3D accelerators in 'em which support geometry acceleration,
> so I suppose it would be feasible to code something for these that'd just
> plain rock... A good
On Fri, 23 Apr 1999, Steinar H. Gunderson wrote:
> On Thu, Apr 22, 1999 at 04:16:54PM -0700, Mersenne Digest wrote:
> >Well, my assumption is that GCC doesn't do 64-bit... I wish I were an Ultra
> >guru like the one that did the DES port for Distributed.net... that thing
> >flies! I was getting >
On Mon, 22 Mar 1999, Aaron Blosser wrote:
> The reason I thought of this was that if this is possible, would it not also
> be possible to use multiple computers on a network to work on the same
> number? With Windows OS', you could use DCOM or even just RPC to get many
> machines working on the
On Tue, 16 Mar 1999, Luke Welsh wrote:
> Hi All--
>
> As most of you know, Majordomo has always been configured to
> bounce posts from people who are not subscribed to the list.
> In the past, this has caught all the spam (and I have saved it
> all, anybody want copies?) Well, one spam did get
On Tue, 16 Mar 1999, Roger Vives Miret wrote:
> > >To be removed, simply call 1-800-600-0343 ext. 1746
>
> And what if the benefit consists of calling this phone number?
> Or what if I don't live in the States?
The original spam came from a uunet IP address, and I complained to
[EMAIL PROTECTED] the day
On Sun, 7 Mar 1999 [EMAIL PROTECTED] wrote:
>
> If one were to build a microprocessor SPECIFICALLY suited to LL testing,
> what would the assembly instruction set look like? Approximately what
> would the architecture look like? Speed shouldn't be an issue because
> there's never enough anyway a
On Wed, 3 Mar 1999, Brian J Beesley wrote:
> > That is a remarkably bold claim. It's akin to saying that the only way of
> > finding primes is by brute force testing of all candidates.
>
> How about a clever method of computing just the last few bits of
> the residual - if we could work mod 2^
On Sun, 28 Feb 1999, Paul Derbyshire wrote:
> At 12:06 AM 2/28/99 -0900, you wrote:
> > Eh? All the original Celeron chips were sans cache, only after
> >the customers started screaming about a 300 performing like a 200mhz
> >pentium did the 300A and later come out with a reduced size cache
On Fri, 26 Feb 1999, Hoogendoorn, Sander wrote:
> The Pentium III, also known under the code name 'Katmai', does not come with
> a feature that would show an immediate performance increase as in case of
> the K6-3. Its basic core as well as the L2-cache architecture is identical
> to the Pentium
On Thu, 7 Jan 1999, Bill Daly wrote:
[ snip very nice explanation ]
> This is only a part of Crandall's method. He also takes advantage of a
> method of packing the number to be multiplied into complex coefficients,
> which reduces the order of the FFT by a factor of 2. Also, he uses
> something
> Aside from the fact the we need to do a single FFT instead
> of two, why isn't squaring inherently faster than multiplying?
>
a * b = .25 * [ (a + b)^2 - (a - b)^2 ]
So if you figured out a fundamentally faster way to square, a multiply
is automatically as fast (within a constant factor,
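The identity above means a fast squaring routine buys you a multiply at roughly twice the cost; a quick check, integer-only so the division by 4 is exact (since (a+b)^2 - (a-b)^2 = 4ab):

```python
def mul_via_squares(a, b):
    """Compute a*b using only squarings, additions and a shift:
    a*b = ((a + b)^2 - (a - b)^2) / 4.
    """
    return ((a + b) ** 2 - (a - b) ** 2) // 4

print(mul_via_squares(12345, 6789) == 12345 * 6789)  # True
```

This is why FFT-based bignum libraries only ever need to optimize squaring: the general product follows for free, within a constant factor.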