On Wednesday 04 December 2002 21:46, Daran wrote:
> [... snip ...]
> > ...though I think there needs to be a careful analysis as to what
> > the extra computation time for actual E values might be...
>
> I agree. My tests have been limited to exponents in the 8.1M range,
> for no other reason than those are the ones I am doing.
Well, you seem to have more experimental evidence than anyone else. As
for the theory - whilst there is adequate theoretical modelling of the
expected distributions of the largest and second-largest factors of
arbitrary numbers, I couldn't find much in the literature which would
help predict how many extra factors you would expect to find with
different values of E.

There is obviously a tradeoff here: you can raise B2 while keeping E
small, or raise E and compensate for the increased run time by lowering
B2. However it does seem clear that increasing E always has to be paid
for in increased memory requirements.

For exponents around 8M this is not a particular issue. However there
is a real, practical constraint so far as Prime95/mprime is concerned -
the entire _virtual_ address space is limited to 4 GBytes by 32-bit
addressing, and the OS kernel claims some (usually half) of this, so
that the total memory usable by a single process is limited to
2 GBytes. (There is a "big memory" variant of the linux kernel which
expands this to 3 GBytes, but the point still stands.) Since, in my
practical experience, a 17M exponent will quite happily use ~ 800
MBytes in P-1 stage 2 (each residue is around 2 MBytes at that size, so
that is on the order of 400 residue-sized buffers), the 32-bit address
space may well be a limiting factor within the exponent range covered
by current versions of Prime95/mprime.

George - is there a "sanity check" on the memory constraints?

> > If we _don't_ have to worry about memory, at some point it becomes
> > cost-effective to run ECM with small limits instead of P-1 with
> > much larger limits. And ECM can "easily" dig out some factors which
> > are more or less inaccessible with P-1.
>
> I was under the impression that ECM was only practical for small
> exponents well below the current DC range.

ECM stage 2 quickly becomes impractical with larger exponents because
of the memory requirement. ECM stage 1 is not particularly heavy on
memory. Running stage 1 only with small limits on DC-sized exponents is
feasible ... it's just a question of whether the extra computation
costs would be justified by the discovery of factors which were
inaccessible to trial factoring or P-1.

> > [... snip ... I don't disagree but the basic argument is the same
> > as above]
>
> > > In 2 out of the 29 stage 2 factors I have found so far using E=4,
> > > k has not been smooth to B2. This suggests that increasing E from
> > > 4 to 12 could yield about 20% more factors. I've done a few tests
> > > with a modified and recompiled client, which suggest that it
> > > would be worth it even if E=12 yielded as few as 10% more
> > > factors, though I need to investigate this further.
> >
> > That's a very small sample.
>
> It's the only sample I have. I'm trying to increase it by doing some
> P-1s on exponents in the 1.2M range which have only been tested to
> B1=B2=1000.
>
> How many of these were found during stage 2? (If half your factors
> were found during P-1 stage 2, and half of those used E=4 or greater,
> then your single 'surprising' factor would not be out of line with my
> two.)

Well, actually I was doing the test in several steps, with gradually
increasing B1 and then gradually increasing B2 - the cost of the GCDs
with small exponents is very small, so it's worth checking fairly
frequently to see whether a factor is "available".
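
Incidentally, for anyone who wants to play with this, here is a minimal
sketch in Python of the stepwise procedure. It is a toy model, not what
the real clients do - base 3, no FFT arithmetic, and it doesn't top up
the powers of small primes when B1 rises - and the checkpoint values in
the example are illustrative only.

from math import gcd

def primes_up_to(n):
    # simple sieve of Eratosthenes
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def stage1_stepwise(p, checkpoints):
    # P-1 stage 1 on M(p) = 2^p - 1, raising B1 through `checkpoints`
    # (which must be increasing) and taking a GCD at each one, so that
    # a factor is reported as soon as it becomes "available".
    M = (1 << p) - 1
    x = pow(3, 2 * p, M)        # fold in the 2p from f = 2kp + 1
    done = 0
    for B1 in checkpoints:
        for q in primes_up_to(B1):
            if q <= done:       # already included at a lower B1
                continue        # (real code would also top up q^n)
            e = q
            while e * q <= B1:  # largest prime power q^n <= B1
                e *= q
            x = pow(x, e, M)
        done = B1
        g = gcd(x - 1, M)       # the GCD is cheap at these sizes
        if 1 < g < M:
            return g, B1
    return None, None

# M(67): the factor 193707721 has k = 2^2 * 3^3 * 5 * 2677, so it
# should drop out at the third checkpoint:
print(stage1_stepwise(67, [500, 1000, 3000]))   # (193707721, 3000)

Obviously this is orders of magnitude slower than mprime - it is only
meant to make the bookkeeping concrete.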
I don't have the full data to hand, but I do have some of it. The
distribution of the 22 factors found at the various limits was as
follows:

  stage 1  B1 =  500000                  1
  stage 1  B1 = 1000000                  1
  stage 2  B1 = 1000000  B2 =  4000000   4
  stage 2  B1 = 1000000  B2 = 10000000   5
  stage 2  B1 = 1000000  B2 = 25000000  11

Some "easier" factors were in all probability "missed" because someone
had found them by running P-1 with smaller limits before I started.

> I have a total of 57 factors, including one found earlier today. A
> few were found by TF, 30 in P-1 stage 2 (including today's) and the
> rest in stage 1.

OK. Actually, for about the last three weeks I've been running P-1 with
"standard limits" on some exponents in the range 2M-6M (those exponents
where all the entries in lucas_v have the same user ID, with the
exception of a very few where P-1 was already completed to reasonable
limits). The system I'm using is configured with mem=224 MBytes (about
as much as I dare on a 512 MBytes dual-processor system). I'm getting
E=4 logged fairly consistently. The results so far are:

  No factor found            130
  Factor found in stage 1      2
  Factor found in stage 2      6  (all "smooth" to the B limits used)

One of the factors found in stage 1 is _very_ interesting:

  6807510023694431 is a factor of M(5993719) (smooth to B1=353)

The factoring depth database had this trial factored through 2^62, yet
the factor found is _much_ smaller than 2^62 (it is only 53 bits).
Clearly something went wrong somewhere.

> [... snip ...]
> > Would there be any means of retrieving actual factors found using
> > P-1 and the E values used from the server logs? The problem
> > otherwise is that, so far as the database is concerned, once a
> > factor is found, nobody cares much how!
>
> If the server log is the same as the client's prime.log, then the
> information isn't there.
>
> What I was thinking of doing was writing a program to read in the
> factor database from, say, 1M up (to avoid any factors found by ECM),
> discard any below the TF limit, then try to find the smallest B2
> which would yield the factor given E=4, 6, 8, 10, or 12. This won't
> tell us the absolute frequency of extended factors, but the relative
> frequencies would let us test, and perhaps quantify, my conjecture
> about the relative effectiveness of these values.

Why not simply use a random sample of numbers of suitable size - say
around 2^41? (If you are looking for factors from 2^65 upwards with
exponents around 2^23, then k = (f-1)/2p is around 2^65/2^24 = 2^41.)
P-1 is really about the factors of k in f = 2kp + 1, since the +1 is
implicit and the 2p comes out "in the wash". (Does the size of the
numbers in the sample actually matter from the theoretical point of
view?)

Regards
Brian Beesley
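
P.S. To make the "factors of k" bookkeeping concrete, here is a rough
sketch (Python again, trial division only, which is adequate for k
around 2^41) of the per-factor analysis such a program would start
from. The function names are just mine, and it deliberately ignores
prime powers and the E > 2 extension, so it only reports the B1/B2 that
plain P-1 would have needed; the interesting part - testing which
factors become reachable at E = 4, 6, 8, 10 or 12 - would be built on
top of this.

def prime_factors(n):
    # trial division, smallest prime first
    qs, d = [], 2
    while d * d <= n:
        while n % d == 0:
            qs.append(d)
            n //= d
        d += 1
    if n > 1:
        qs.append(n)
    return qs

def analyse(p, f):
    # given a factor f of M(p), recover k = (f-1)/(2p) and report
    # the limits a plain (unextended) P-1 run would have needed
    assert pow(2, p, f) == 1        # f really divides 2^p - 1
    assert (f - 1) % (2 * p) == 0   # f = 2kp + 1
    k = (f - 1) // (2 * p)
    qs = prime_factors(k)
    b2 = qs[-1]                     # the largest prime of k
    b1 = max(qs[:-1], default=1)    # the rest must all be <= B1
    print("f = %d (%d bits), k = %s" %
          (f, f.bit_length(), " * ".join(map(str, qs))))
    print("plain P-1 needs B1 >= %d, B2 >= %d" % (b1, b2))

# the anomalous stage 1 factor reported above:
analyse(5993719, 6807510023694431)
# f = 6807510023694431 (53 bits), k = 5 * 31 * 97 * 107 * 353
# plain P-1 needs B1 >= 107, B2 >= 353

(Since that k is 353-smooth, any B1 >= 353 finds the factor in stage 1,
which is exactly where it turned up.)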