----- Original Message -----
From: "Brian J. Beesley" <[EMAIL PROTECTED]>
To: "Daran" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Wednesday, December 04, 2002 2:55 PM
Subject: Re: Mersenne: P-1 and non k-smooth factors
> Let's see if I get this right.
>
> Overwhelmingly, the factors produced by P-1 factoring come out because
> they are smooth to the limits selected. The fraction that comes out
> because of the extension is << 10%. To double that fraction (i.e. to
> increase the total number of factors found by < 10%) we have to double
> the stage 2 run time?

That's not what I meant.

> Doesn't sound that great to me, even without worrying about memory
> considerations.
>
> If we're talking about the _extra_ computation time in stage 2 then
> obviously the suggestion makes a lot more sense -

Yes. If it takes time t to run a stage 2 with E=2, and t+d to run it with E=4, then we should be willing to spend t+2d to run it with E=6, and t+5d to run it with E=12, which, as far as I can see, is approximately how the cost increases anyway.

I've only had time to run a few tests, but on an 8.1M exponent, with B1=45000, B2=731250 and varying values for D and E, I get the following results:

  D    E   Stage 2 transforms
 420    2    93946
 420    4    98618  (the default for my memory settings)
 420   12   121520
 210    4   104272
 150    4   111962
 120    4   116464

I did one with D=420, E=6, but I don't seem to have made a note of it. Nor did I record running times, because I was using the computer to do other things; transform counts ought to be reasonably proportionate. The first three tests are consistent with the cost increasing linearly with E, while the last three show the penalty of lowering the memory. (Note that an unmodified client would drop to E=2 below D=210, but I wanted to see just the effect of lowering D.)

With D=420, E=12 is over 23% dearer than the E=4 which is currently used. However, this does *not* mean that it needs to find 23% more factors to be worthwhile. If guess_pminus1_bounds is altered to be aware of D and E, then it can compensate for the extra cost by reducing B2, trading 'normal' factors for extended ones.
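The linear model above can be put in a few lines. This is just my own sketch (the function name and framing are mine, not the client's code), using the D=420 transform counts from the table:

```python
# Measured stage 2 transform counts at D=420 (from the table above).
T_E2 = 93946   # E=2
T_E4 = 98618   # E=4
d = T_E4 - T_E2  # incremental cost of one extension step: 4672 transforms

def predicted_transforms(E):
    """Predicted stage 2 transform count under the linear cost model:
    E=2 -> t, E=4 -> t+d, E=6 -> t+2d, ..., E=12 -> t+5d."""
    return T_E2 + (E - 2) // 2 * d
```

By this model E=6 should cost about 103290 transforms and E=12 about 117306; the measured E=12 figure of 121520 is a few percent above that, so "approximately linear" is all that can be claimed.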
My (limited) tests suggest, as I said, that 2% more factors with E=6 or 10% more factors with E=12 would be a worthwhile trade. (This assumes that my formula for estimating the number of squarings is accurate, something I have not checked.)

> ...though I think there needs to be a
> careful analysis as to what the extra computation time for actual E values
> might be...

I agree. My tests have been limited to exponents in the 8.1M range, for no reason other than that those are the ones I am doing.

> (as opposed to a rather simplistic linear model, which fails to take
> into account that some of the "temporaries" needed for small E probably
> drop out pretty well "for free").

I don't understand.

> If we _don't_ have to worry about memory, at some point it becomes
> cost-effective to run ECM with small limits instead of P-1 with much
> larger limits. And ECM can "easily" dig out some factors which are more
> or less inaccessible with P-1.

I was under the impression that ECM was only practical for small exponents, well below the current DC range.

> [... snip ... I don't disagree but the basic argument is the same as
> above]
>
> > In 2 out of the 29 stage 2 factors I have found so far using E=4, k has
> > not been smooth to B2. This suggests that increasing E from 4 to 12
> > could yield about 20% more factors. I've done a few tests with a
> > modified and recompiled client, which suggest that it would be worth it
> > even if E=12 yielded as few as 10% more factors, though I need to
> > investigate this further.
>
> That's a very small sample.

It's the only sample I have. I'm trying to increase it by doing some P-1s on exponents in the 1.2M range which have only been tested to B1=B2=1000.

> Some time ago I found a considerable number of first factors for exponents
> in the range 100,000-150,000 using P-1 with limits up to B1=10^6,
> B2=25x10^6.
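For what it's worth, the test I apply to decide whether a stage 2 factor needed the extension can be sketched as follows. This is a hypothetical helper of my own, not the client's code; it uses plain trial division, so it is only sensible for modest k:

```python
def largest_prime_factor(n):
    """Largest prime factor of n, by trial division (n >= 2 assumed small
    enough for this to be practical)."""
    last = 1
    f = 2
    while f * f <= n:
        while n % f == 0:
            last = f
            n //= f
        f += 1
    return n if n > 1 else last

def found_by_extension(q, p, B2):
    """A P-1 factor q of M_p satisfies q = 2kp + 1.  If k has a prime
    factor above B2, the standard stage 2 could not have found q, so it
    must have come out via the Brent-Suyama extension."""
    k = (q - 1) // (2 * p)
    return largest_prime_factor(k) > B2
```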
> The results.txt file doesn't record the E value used; though
> I did have tons of memory available (in relation to the exponent size) and
> seem to remember something about wondering what E=12 meant in the
> console output. Or maybe I'm confusing this with recollections about
> running ECM?

That's possible, as E is also used in the ECM code. But E=12 output from a P-1 means what it says. If the value of E is not output, then it means E=2 (or perhaps E=1, but I've never tried using that little memory). E (>2) is recorded in the results.txt file by recent clients, so either the version of the client you were using then didn't record it, or your memory setting didn't allow E>2.

> My records show 67 factors found; I mailed George on one occasion because
> P-1 found a factor which surprised me, but I don't think it happened
> twice.

How many of these were found during stage 2? (If half your factors were found during P-1 stage 2, and half of those used E=4 or greater, then your single 'surprising' factor would not be out of line with my two.) I have a total of 57 factors, including one found earlier today. A few were by TF, 30 in P-1 stage 2 (including today's), and the rest in stage 1.

> Incidentally I found a factor only yesterday using P-1 on a production
> system:
>
> [Tue Dec 3 07:54:38 2002]
> P-1 found a factor in stage #2, B1=220000, B2=5665000.
> UID: beejaybee/Procyon, M17359099 has a factor: 312980494172935109497751
>
> Again no mention of E. If it helps, this system was set up to use 384
> MBytes memory...

Each temporary takes FFT size * 8 bytes, so at the 896K FFT size you have 8*896K = 7168KB per temporary, or 54 temps in 384MB (less a *lot* of overhead). You need 56 temps to get to D=210, E=4. Note also that there is a huge margin of error compiled in, so that the client never comes near using the amount allowed; you'd probably have to increase the setting to over 400MB before it would get up to 56. The next level down is 54 temps for D=180, E=2, which rarely happens.
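The arithmetic there is simple enough to sketch (function name mine; this ignores the client's compiled-in safety margin, so it is an upper bound on the temps actually chosen):

```python
def max_temporaries(memory_mb, fft_size_k):
    """Upper bound on the number of stage 2 temporaries that fit in the
    allowed memory: each temporary is one FFT-length array of 8-byte
    doubles.  Real clients reserve extra overhead on top of this."""
    bytes_per_temp = fft_size_k * 1024 * 8
    return (memory_mb * 1024 * 1024) // bytes_per_temp
```

With 384MB and an 896K FFT this gives 54 temporaries, just short of the 56 needed for the D=210, E=4 plan.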
(That it can happen at all is a bug, because D=210, E=2 would use the same number of temps to find the same factors, and be faster, but the client never chooses this plan.) Below that is D=150, E=2 at 46 temps, then D=120, E=2 with 38 temps. My guess is that your factor was found using one of these.

> In any case this should have come out without extensions; B1=65267
> B2=3077953 is sufficient to find the factor with the "standard" stage 2
> algorithm.

E=2 implements prime pairing, but yields no extended factors.

> Would there be any means of retrieving actual factors found using P-1 and
> the E values used from the server logs? The problem otherwise is that, so
> far as the database is concerned, once a factor is found, nobody cares
> much how!

If the server log is the same as the client's prime.log, then the information isn't there.

What I was thinking of doing was writing a program to read in the factor database from, say, 1M up (to avoid any factors found by ECM), discard any below the TF limit, then try to find the smallest B2 which would yield each factor given E=4, 6, 8, 10, or 12. This won't tell us the absolute frequency of extended factors, but the relative frequencies would test, and perhaps quantify, my conjecture about the relative effectiveness of these values.

> Regards
> Brian Beesley

Daran

_________________________________________________________________________
Unsubscribe & list info -- http://www.ndatech.com/mersenne/signup.htm
Mersenne Prime FAQ -- http://www.tasam.com/~lrwiman/FAQ-mers
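The core of that program, for the standard-algorithm part at least, might look like the sketch below. It rests on the usual P-1 condition that q = 2kp+1 comes out when every maximal prime power dividing k is <= B1, except possibly one first-power prime in (B1, B2]; the extended-E test would need the Suyama power logic on top of this, and all names here are mine:

```python
def prime_power_factors(n):
    """Maximal prime powers dividing n, e.g. 360 -> [8, 9, 5]."""
    out = []
    f = 2
    while f * f <= n:
        if n % f == 0:
            pw = 1
            while n % f == 0:
                pw *= f
                n //= f
            out.append(pw)
        f += 1
    if n > 1:
        out.append(n)
    return out

def min_bounds_standard(q, p):
    """Smallest (B1, B2) with which standard P-1 (E=2, no extension) finds
    the factor q = 2kp+1 of M_p.  Assumes the largest prime factor of k
    divides it exactly once (the typical case): B2 must reach that prime,
    B1 must cover every other maximal prime power."""
    k = (q - 1) // (2 * p)
    pows = sorted(prime_power_factors(k))
    b2 = pows[-1]
    b1 = max(pows[:-1]) if len(pows) > 1 else b2
    return b1, b2
```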