On Wednesday 04 December 2002 21:46, Daran wrote:
> [... snip ...]
> > ...though I think there needs to be a
> > careful analysis as to what the extra computation time for actual E
> > values might be...
>
> I agree.  My tests have been limited to exponents in the 8.1M range, for no
> particular reason other than that those are the ones I am doing.

Well, you seem to have more experimental evidence than anyone else.

As for the theory - whilst there is adequate theoretical modelling of the 
expected distributions of the largest and second-largest factors of arbitrary 
numbers, I couldn't find much in the literature which would help predict how 
many extra factors you would expect to find with different values of E.
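
For the stage 1 side the standard model is Dickman's rho function: the 
probability that a random integer near K has no prime factor above K^(1/u) 
is roughly rho(u). Here is a small Python sketch which tabulates rho 
numerically - entirely my own illustration, nothing to do with the Prime95 
source, and the K and B1 plugged in at the end are just example values:

from math import log

def dickman_rho_table(u_max=6.0, h=0.001):
    # Tabulate Dickman's rho on [0, u_max] from u * rho'(u) = -rho(u - 1).
    m = int(round(1.0 / h))          # grid points per unit of u
    n = int(round(u_max / h))
    rho = [1.0] * (n + 1)            # rho(u) = 1 for 0 <= u <= 1
    for i in range(m + 1, n + 1):
        # rho'(u) = -rho(u - 1) / u; the right-hand side only involves
        # values already computed, so a trapezoidal step is sufficient.
        f_prev = -rho[i - 1 - m] / ((i - 1) * h)
        f_curr = -rho[i - m] / (i * h)
        rho[i] = rho[i - 1] + 0.5 * h * (f_prev + f_curr)
    return rho, h

rho, h = dickman_rho_table()
K, B1 = 2.0 ** 41, 1.0e6             # k of a candidate factor, and a stage 1 limit
u = log(K) / log(B1)                 # ~2.06 for these values
print(rho[int(round(u / h))])        # ~0.28: rough chance such a k is B1-smooth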

There is obviously a tradeoff here: you can increase B2 while keeping E 
simple, or increase E and compensate for the increased run time by lowering 
B2. What does seem clear, however, is that increasing E always has to be paid 
for in increased memory requirements.

For exponents around 8M, this is not a particular issue. However there is a 
real, practical constraint so far as Prime95/mprime is concerned - with 
32-bit addressing the entire _virtual_ address space is limited to 4 GBytes, 
and the OS kernel claims some (usually half) of this, so the total memory 
usable by a single process is limited to 2 GBytes. (There is a "big memory" 
variant of the Linux kernel which expands this to 3 GBytes, but the point 
still stands.)

Since, in my practical experience, a 17M exponent will quite happily use ~ 
800 MBytes in P-1 stage 2, the 32-bit address space may well become a 
limiting factor within the exponent range covered by current versions of 
Prime95/mprime.
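
As a very rough back-of-envelope check on that figure - the FFT length and 
per-temporary size below are my own ballpark guesses, not numbers taken from 
the Prime95 source:

# Rough stage 2 memory arithmetic for an exponent near 17M.
fftlen   = 1 << 20                  # ~1M doubles is in the right range for p ~ 17M
per_temp = 8 * fftlen               # one full residue per stage 2 temporary, 8 bytes/double
mem      = 800 * 2 ** 20            # the ~800 MBytes observed above

print(per_temp // 2 ** 20, "MB per temporary ->", mem // per_temp, "temporaries")
# -> 8 MB per temporary -> 100 temporaries; scale the exponent up a few times
#    and the 2-3 GByte per-process ceiling starts to bite.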

George - is there a "sanity check" on the memory constraints?
>
> > If we _don't_ have to worry about memory, at some point it becomes
> > cost-effective to run ECM with small limits instead of P-1 with much
> > larger limits. And ECM can "easily" dig out some factors which are more
> > or less inaccessible with P-1.
>
> I was under the impression that ECM was only practical for small exponents
> well below the current DC range.

ECM stage 2 quickly becomes impractical with larger exponents because of the 
memory requirement. ECM stage 1 is not particularly heavy on memory. Running 
stage 1 only with small limits on DC-sized exponents is feasible ... it's 
just a question of whether the extra computation cost would be justified by 
the discovery of factors which are inaccessible to trial factoring or P-1.
>
> > [... snip ... I don't disagree but the basic argument is the same as
> > above]
> >
> > > In 2 out of the 29 stage 2 factors I have found so far using E=4, k has
> > > not been smooth to B2.  This suggests that increasing E from 4 to 12
> > > could yield about 20% more factors.  I've done a few tests with a
> > > modified and recompiled client, which suggests that it would be worth it
> > > even if E=12 yielded as few as 10% more factors, though I need to
> > > investigate this further.
> >
> > That's a very small sample.
>
> It's the only sample I have.  I'm trying to increase it by doing some P-1s
> on exponents in the 1.2M range which have only been tested to B1=B2=1000.
>
> How many of these were found during stage 2?  (If half your factors were
> found during P-1 stage 2, and half of those used E=4 or greater, then your
> single 'surprising' factor would not be out of line with my two.)

Well, actually I was doing the test in several steps, with gradually 
increasing B1 then gradually increasing B2 - the cost of the GCDs with small 
exponents is very small so it's worth checking fairly frequently to see if a 
factor is "available".

I don't have the full data to hand but I do have some of it. The distribution 
of 22 factors found at various limits was as follows:

stage 1 B1 = 500000                      1
stage 1 B1 = 1000000                     1
stage 2 B1 = 1000000 B2 = 4000000        4
stage 2 B1 = 1000000 B2 = 10000000       5
stage 2 B1 = 1000000 B2 = 25000000      11

Some "easier" factors were in all probability "missed" because someone had 
found them by running P-1 with smaller limits before I started.
>
> I have a total of 57 factors, including one found earlier today.  A few
> were found by TF, 30 in P-1 stage 2 (including today's) and the rest in stage 1.

OK. Actually for about the last three weeks I've been running P-1 with 
"standard limits" on some exponents in the range 2M-6M (those exponents where 
all the entries in lucas_v have the same user ID, with the exception of a 
very few where P-1 was already completed to reasonable limits).

The system I'm using is configured with mem=224 MBytes (about as much as I 
dare on a 512 MByte dual-processor system). I'm getting E=4 logged fairly 
consistently.

The results so far are:

No factor found, 130
Factor found in stage 1, 2
Factor found in stage 2, 6 - all "smooth" to the B1/B2 limits used.

One of the factors found in stage 1 is _very_ interesting:

6807510023694431 is a factor of M(5993719) (smooth to B1=353).
The factoring depth database showed this exponent as trial factored through 
2^62, yet the factor found (roughly 2^52.6) is _much_ smaller than that. 
Clearly something went wrong somewhere.
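
The arithmetic is easy to check - a few lines of Python, with sympy only 
needed to factor k:

from sympy import factorint

p, f = 5993719, 6807510023694431
print(pow(2, p, f) == 1)             # does f divide 2^p - 1?
k = (f - 1) // (2 * p)               # any factor of M(p) has the form 2kp + 1
print(k, factorint(k))               # k = 5 * 31 * 97 * 107 * 353, smooth to 353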

> [... snip ...]
> > Would there be any means of retrieving actual factors found using P-1 and
> > the E values used from the server logs? The problem otherwise is that, so
> > far as the database is concerned, once a factor is found, nobody cares
> > much how!
>
> If the server log is the same as the client's prime.log, then the
> information isn't there.
>
> What I was thinking of doing was writing a program to read in the factor
> database from, say, 1M up (to avoid any factors found by ECM), discard any
> below the TF limit, then try to find the smallest B2 which would yield the
> factor given E=4, 6, 8, 10, or 12.  This won't tell us the absolute
> frequency of extended factors, but the relative frequencies would test, and
> perhaps quantify, my conjecture about the relative effectiveness of these
> values.

Why not simply use a random sample of numbers of suitable size? Say 
around 2^41: if you are looking for factors from 2^65 upwards with exponents 
around 2^23, then k is around 2^65 / (2 * 2^23) = 2^41. P-1 is really about 
the factors of k in f = 2kp + 1, since the +1 is implicit and the 2p comes 
out "in the wash".

(Does the size of the numbers in the sample actually matter from 
the theoretical point of view?)
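
Something along these lines, for instance - a rough Python/sympy sketch of 
the sampling idea. The limits and sample size are arbitrary choices of mine, 
and it only classifies k by smoothness; it does not model which of the 
"missed" cases a particular E would actually recover:

import random
from sympy import factorint

B1, B2, SAMPLES = 10 ** 6, 25 * 10 ** 6, 1000
counts = {"stage 1": 0, "stage 2": 0, "missed": 0}

for _ in range(SAMPLES):
    k = random.randrange(2 ** 40, 2 ** 42)            # stand-in for the k of a factor
    fac = factorint(k)                                # {prime: exponent}
    powers = sorted(q ** e for q, e in fac.items())   # prime-power parts of k
    top = powers[-1]
    rest = powers[-2] if len(powers) > 1 else 1
    if top <= B1:                                     # k divides the stage 1 exponent
        counts["stage 1"] += 1
    elif top in fac and top <= B2 and rest <= B1:     # one extra prime in (B1, B2]
        counts["stage 2"] += 1
    else:
        counts["missed"] += 1                         # needs bigger limits (or, sometimes, the extension)

print(counts)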

Regards
Brian Beesley