Mersenne: Re: Factoring benefit/cost ratio
On Wed, Dec 05, 2001 at 07:10:24PM -0600, [EMAIL PROTECTED] wrote:
> Am I correct in interpreting this to mean that you think that using 64-bit residuals is more reliable than using 16-bit residuals? If so, then surely you'll grant that 256-bit residuals would be even more reliable yet, meaning that there's still room for error in our practice of using 64-bit residuals.

Given that the chance of a wrongly matched 64-bit residual (assuming a random error rather than a program error) is about 0.0542 * 10^-16 %, i.e. 2^-64, I think you'll agree that a 64-bit residual would be sufficient. :-)

/* Steinar */
--
Homepage: http://www.sesse.net/
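As a rough illustration of the arithmetic behind the residue-width point (my own sketch, not part of the original post), assuming a hardware or transmission error turns the reported residue into an effectively random value of the given width:

    # Probability that a wrong LL result nevertheless matches a correct one,
    # assuming an error yields an effectively random residue of the given width.
    for bits in (16, 64, 256):
        print(f"{bits:3d}-bit residue: false-match chance = 2^-{bits} ~ {2.0 ** -bits:.2e}")

    # Combined with an observed per-test error rate of roughly 1.5%, the chance
    # that a double-check yields two matching but wrong 64-bit residues is about:
    print(f"{0.015 ** 2 * 2.0 ** -64:.2e}")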
Re: Mersenne: Re: Factoring benefit/cost ratio
Brian Beesley wrote:
> On 3 Dec 2001, at 20:38, [EMAIL PROTECTED] wrote:
> [... snip ...]
> > I think our record shows that a verified factor is still slightly (by a minute but nonzero margin) more reliable an indicator of compositeness than two matching nonzero LL residues.
>
> AFAIK our record does _not_ show any such thing.

Oh? It doesn't?

If our record does _not_ show that a verified factor is more reliable an indicator than DC LLs, then what does it show as the (higher, according to this hypothesis) error rate for verified factors? Is the observed error rate in reported factors as great as the ~1-1.5% error rate in reported LL results? How often does double-checking a reported factor find that the reported factor was in error?

> In _theory_ there is a very small chance that we _may_ have accepted an incorrect residual even after double-checking. There is a small chance in such a case that the residual should have been zero, i.e. we missed a Mersenne prime.

... and my contention is that those small chances, for L-L results, exceed the corresponding probabilities in the case of a reported factor. There is a small chance that we may accept an incorrect factor even after double-checking it, but that chance is even smaller than the small chance that we may accept an incorrect double-checked L-L residual.

> I've triple-checked thousands of small exponents - some of the ones where the accepted residual was recorded to only 16 bits or less, which makes the chance of an undetected error _much_ greater (though still quite small) - so far no substantive errors in the database have come to light. A very few (think fingers of one hand) instances of incorrectly matched residuals have come to light -

How does that compare to the observed rate of incorrect factors discovered after triple-checking _them_?

> > Some error sources would be more likely to affect the longer LL test than the shorter factor verification.
>
> Indeed - if errors occur at random intervals, results should get less reliable as runs take progressively longer.

... and L-L verification runs take a lot longer than factor verification runs, so we can reasonably expect that this class of error will affect L-L verification runs (and thus lower such runs' reliability) a lot more than it does factor verification runs, all else being equal (such as use of the same hardware systems).

> However there does seem to be a wide variation in the reliability of systems. Some seem to have a lot of problems, some very few (if any).

So a factor verification run on a system that has a lot of problems may well be less reliable than an L-L verification run on a system that has very few or no problems. But if we compare L-L verification and factor verification runs on the same system, or on systems with an equal probability of problems, then errors whose rates are proportional to run time will affect the L-L verification runs more than the factor verification runs, simply because of the large difference in average run times between the two categories. That makes the L-L verification runs less reliable than the factor verification runs. Agreed?
> I've detected four problems on my own systems so far; two of these were caused by a failed cooling fan on a P100 system (unlike current systems, the cooling was _almost_ sufficient even without forced ventilation); one was caused by a configuration error - running Glucas on an Alpha system I inadvertently allowed fast arithmetic and turned off error checking, which was a really stupid thing to do on a system with a 53-bit mantissa FPU - as this was a QA run on a 33 million range exponent, I was lucky to lose only 3 months work. The last one I never got to the bottom of, but I strongly suspect that Prime95's workspace memory was clobbered during a software installation on a Windows 98 system.

How many of those problems caused errors during L-L verifications, and how many caused errors during factor verifications?

However, you may not have spent anywhere near as much time doing factor verifications as you have doing L-L verifications, so it may not be valid to draw any conclusion about comparative error rates on your system.

Richard B. Woods
Re: Mersenne: Re: Factoring benefit/cost ratio
On 5 Dec 2001, at 6:09, [EMAIL PROTECTED] wrote:

> Brian Beesley wrote:
> > On 3 Dec 2001, at 20:38, [EMAIL PROTECTED] wrote:
> [... snip ...]
> > > I think our record shows that a verified factor is still
> > > slightly (by a minute but nonzero margin) more reliable an
> > > indicator of compositeness than two matching nonzero LL
> > > residues.
> >
> > AFAIK our record does _not_ show any such thing.
>
> Oh? It doesn't?

There is no evidence of any verified residuals being incorrect. Neither is there any evidence that any verified factors are incorrect. Whatever theory states, the experimental evidence is that verified factors are no more (or less) reliable than verified LL tests.

Suppose a taxi firm runs 10 Fords and 10 Hondas for a year. None of them break down. On that basis alone, there is no evidence whatsoever that one make is more reliable than the other. Naturally, other companies' experimental evidence may vary.

> [ big snip ]
> There is a small chance that we may accept an incorrect factor even
> after double-checking it, but that chance is even smaller than the
> small chance that we may accept an incorrect double-checked L-L
> residual.

I doubt very much that we would accept an incorrect factor. The double-checking is done with completely different code. Besides which, checking a factor takes a few microseconds, whereas checking a LL test likely takes a few hundred hours.

If anything goes wrong during a factoring run, we would be far more likely to miss a factor which we should have found rather than vice versa. This is relatively unimportant from the point of view of finding Mersenne primes; the effect is a small loss of efficiency.

> How does that compare to the observed rate of incorrect factors
> discovered after triple-checking _them_?

AFAIK no-one bothers to triple-check factors. Nowadays factors are verified on the server at the time they're reported. I'm not privy to the server logs, so I simply don't know how many get rejected (except for the server _database_ problem reported recently - not related to the actual factor checking code - which causes problems with genuine factors > 32 digits in length). However, I can think of at least one way of pushing up the rejection rate.

> How many of those problems caused errors during L-L verifications,
> and how many caused errors during factor verifications?

All during LL first tests, or QA runs (which are LL DC done in parallel with intermediate crosscheck points).

> However, you may not have spent anywhere near as much time doing
> factor verifications as you have doing L-L verifications, so it may
> not be valid to draw any conclusion about comparative error rates on
> your system.

I've spent no time at all verifying factors - it would take less than a minute to verify everything in the factors database. The total factoring effort I've put in (ignoring ECM and P-1 on small exponents) is only about 3% of my total contribution, so I would expect not to have had any factoring errors. Besides which, trial factoring _should_ have a lower error rate than LL testing, due to the lower load on the FPU (which is usually the CPU element most sensitive to excess heat) and the smaller memory footprint (less chance of data getting clobbered by rogue software or random bit-flips).

Regards
Brian Beesley
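To illustrate why checking a reported factor takes only microseconds: a claimed factor f of 2^p - 1 can be confirmed with a single modular exponentiation. A minimal sketch (my own illustration, not the server's actual checking code):

    def is_mersenne_factor(p: int, f: int) -> bool:
        """Check whether f divides 2**p - 1, i.e. whether 2**p == 1 (mod f)."""
        return f > 1 and pow(2, p, f) == 1

    # Example: 223 divides 2**37 - 1, so M37 is composite.
    print(is_mersenne_factor(37, 223))   # True
    print(is_mersenne_factor(37, 89))    # False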
Re: Mersenne: Re: Factoring benefit/cost ratio
Brian,

I'm wondering whether we may be misunderstanding each other's contentions here. I thought you objected to at least some of what I claimed, but now it seems that you're presenting arguments and evidence that support what I'm claiming. Since my previous postings may have had careless wording or otherwise obscured my intentions, and I did not earlier realize the importance of certain details to the discussion, let me restate what I've meant to claim:

1. It is more valuable to know a specific factor of a Mnumber than to know that that Mnumber is composite without knowing any specific factor. (There's little dispute about #1.)

2. Claim #1 is true not only from the viewpoint of mathematics in general, but also from the narrower viewpoint of the GIMPS search for Mersenne primes.

3. One (but not the only) justification for claim #2 is that, _in current practice_, a composite status derived by GIMPS from finding a specific factor is (slightly) more reliable than a composite status derived by GIMPS from matching nonzero residues from Lucas-Lehmer tests. That is, although in theory, or ideally, those two methods of determining compositeness are equally reliable, there currently exists a slight difference in reliability, in favor of the factor, from a practical standpoint.

4. Our experience (the record), as documented in the Mersenne mailing list or GIMPS history, supports claim #3.

- - - - -

Brian Beesley wrote:
> > > AFAIK our record does _not_ show any such thing.
> >
> > Oh? It doesn't?
>
> There is no evidence of any verified residuals being incorrect.

Wait a second -- just yesterday you wrote that you had triple-checked thousands of small exponents (which means they had already been double-checked) and that

> A very few (think fingers of one hand) instances of incorrectly matched residuals have come to light - completing the double-check in these cases proved that one of the recorded residuals was correct.

So it seems that the meaning you're assigning to "verified" is something like "retested and retested until two residuals match". Is that a correct interpretation? If not, what is?

My claim #3 means that in practice, factors require fewer verification runs to produce matching results than do L-L residues, on average. Do you disagree with that? If not, then don't we agree about claim #3?

Furthermore, my claim #4 means that the demonstration that factors require fewer verification runs to produce matching results than do L-L residues, on average, rests on the observed history _including the paragraph you wrote from which I just quoted above!_ Do you disagree?

Also, in that same paragraph you wrote,

> ... - some of the ones where the accepted residual was recorded to only 16 bits or less, which makes the chance of an undetected error _much_ greater (though still quite small) ...

Am I correct in interpreting this to mean that you think that using 64-bit residuals is more reliable than using 16-bit residuals? If so, then surely you'll grant that 256-bit residuals would be even more reliable yet, meaning that there's still room for error in our practice of using 64-bit residuals. But a specific factor is a _complete value_, not some truncation, and so its reliability is not damaged by the incompleteness which you admit keeps the L-L residues from being totally reliable - right?
Then you wrote "so far no substantive errors in the database have come to light", but seemingly contradicted that in the very next sentence,

> A very few (think fingers of one hand) instances of incorrectly matched residuals have come to light - completing the double-check in these cases proved that one of the recorded residuals was correct. ...

And thus _the other_ recorded residual was _incorrect_.

> Neither is there any evidence that any verified factors are incorrect.

Depends on the meaning of "verified", of course. :-) Will Edgington (I think) has reported finding errors in his factor data base ... even though he verifies factors before adding them. Mistakes happen. But I think the error rate for factors has been significantly lower than for L-L residuals.

> Whatever theory states, the experimental evidence is that verified factors are no more (or less) reliable than verified LL tests.

Then why don't we triple-check factors as often as we triple-check L-L results? Oh, wait ... depends on the meaning of "verified", again.

> Suppose a taxi firm runs 10 Fords and 10 Hondas for a year. [ snip ]

Let's have an example in which the number and nature of the units is closer to the gigabytes of data items we're slinging around.

> Besides which, checking a factor takes a few microseconds, whereas checking a LL test likely takes a few hundred hours.

... which tends to support my claim #3.

> If anything goes wrong during a factoring run, we would be far more likely to miss a factor which we should have found rather than vice versa.

... which is in agreement with claim #3.

How does that compare to the observed rate of incorrect
Re: Mersenne: Re: Factoring benefit/cost ratio
Richard,

Your first interpretation of verified residues is correct, they are retested until two residues match. Any time a double-check reports in a residue which is different from the first LL test, the exponent is returned to the database to be tested again. This means that at least one of the residues is incorrect, and happens (relatively) often, I believe about two percent of the time. However, as has been pointed out before, the odds of two LL tests on different machines returning the _same_ incorrect residues are astronomical (although, of course, still non-zero).

Steve

-Original Message-
From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED] [EMAIL PROTECTED]
Date: Wednesday, December 05, 2001 8:34 PM
Subject: Re: Mersenne: Re: Factoring benefit/cost ratio
RE: Mersenne: Re: Factoring benefit/cost ratio
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> (Aside - we're rather more likely to find a 75-bit factor than a 75-digit factor. In fact, finding a 75-digit prime factor of a "no factors known" Mersenne number with an exponent under 80 million would be a significant achievement in itself.)

Finding a 75-digit prime and non-algebraic factor of any integer with more than, say, 200 digits would be a significant achievement. The record today is 55 digits; it reached 53 digits 3 years ago and only a handful of factors with 50 or more digits have been found ever. I have precisely one of them to my credit 8-)

> At the moment, having found one factor, we quit. That's sufficient effort for the purpose of trying to find Mersenne primes. A little more work might break the cofactor down further.

Actually, some of us don't quit. But we're a small bunch of weirdos, and we only work on tiny exponents anyway.

Paul
RE: Mersenne: Re: Factoring benefit/cost ratio
On 4 Dec 2001, at 1:19, Paul Leyland wrote:
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> > (Aside - we're rather more likely to find a 75-bit factor than a 75-digit factor. In fact, finding a 75-digit prime factor of a "no factors known" Mersenne number with an exponent under 80 million would be a significant achievement in itself.)
>
> Finding a 75-digit prime and non-algebraic factor of any integer with more than, say, 200 digits would be a significant achievement. The record today is 55 digits; it reached 53 digits 3 years ago and only a handful of factors with 50 or more digits have been found ever. I have precisely one of them to my credit 8-)

Sure. Not quite the same since there appears to be no certificate of primality, but on 30 Aug 2001 there was a message on this list to the effect that M727 (c219) = prp98.prp128. So much ECM work was done on M727 (before the NFS people started work) that it is highly unlikely that there are any factors < 10^50, which means that at least the 98-digit probable prime is almost certainly a genuine prime. (Maybe that's been proved by now. ECPP on general numbers of around 100 digits isn't very expensive.)

I think the 55 digit record applies to ECM. A number of much larger factors (not counting cofactors) have been found using number field sieve techniques.

> > At the moment, having found one factor, we quit. That's sufficient effort for the purpose of trying to find Mersenne primes. A little more work might break the cofactor down further.
>
> Actually, some of us don't quit. But we're a small bunch of weirdos, and we only work on tiny exponents anyway.

I was speaking for the project rather than myself. I'm also one of the small bunch of weirdos.

Regards
Brian Beesley
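For readers unfamiliar with the "prp" notation: a probable prime is a number that passes a quick test such as a Fermat test, without a full primality proof. A minimal sketch of such a test (my own illustration; this is not the ECPP/Primo certificate discussed here):

    def is_fermat_prp(n: int, base: int = 3) -> bool:
        """Fermat probable-prime test: a prime n always satisfies
        base**(n-1) == 1 (mod n); a composite n very rarely does."""
        return n > 2 and pow(base, n - 1, n) == 1

    print(is_fermat_prp(2 ** 61 - 1))   # True:  M61 is prime
    print(is_fermat_prp(2 ** 11 - 1))   # False: 2047 = 23 * 89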
RE: Mersenne: Re: Factoring benefit/cost ratio
At 06:11 PM 12/4/2001 +, [EMAIL PROTECTED] wrote:
> Sure. Not quite the same since there appears to be no certificate of primality, but on 30 Aug 2001 there was a message on this list to the effect that M727 (c219) = prp98.prp128.

Does anyone know how much CPU time was spent?

> So much ECM work was done on M727 (before the NFS people started work) that it is highly unlikely that there are any factors < 10^50, which means that at least the 98-digit probable prime is almost certainly a genuine prime. (Maybe that's been proved by now. ECPP on general numbers of around 100 digits isn't very expensive.)

Especially when one uses Primo, which makes numbers of even a thousand digits take perhaps an afternoon on my machine. I doubt very strongly I'm the first, but just for reference:

http://www.mail-archive.com/mersenne@base.com/msg06304.html
http://www.acsu.buffalo.edu/~nrussell/M727.zip

Nathan
Re: Mersenne: Re: Factoring benefit/cost ratio
- Original Message -
From: [EMAIL PROTECTED]
To: Daran [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, December 03, 2001 11:22 PM
Subject: Re: Mersenne: Re: Factoring benefit/cost ratio

> On 3 Dec 2001, at 20:45, Daran wrote:
> > Shouldn't that be 1.015 for double-checking assignments?
>
> I think 1.03...

Yes, of course.

> ...However you do have a point. P-1 limits do depend on the trial factoring depth,...

Now I'm puzzled. Obviously P-1 and trial factoring limits both depend upon the exponent, so will correlate, but I don't see a direct dependency between them.

> and are much smaller for DC assignments than for first tests, so there is already something built in.

Right. I'd noticed that my P-1 probabilities for DC assignments were about half those for LLs, but I'd attributed that to the differences between the magnitudes of the exponents. This makes more sense.

> > Also does the cost part of the calculation recognise the increased cost of trial-factorisation after 2^64?
>
> Yes. The trial factoring depth is constant at 64 bits from ~8.5 million to ~13 million. Don't forget that the number of candidates which need to be checked is _inversely_ proportional to the exponent.

Because any factor must be of form 2kp+1.

[...]

> > Finally if P-1 factorisation were to be spun off into a separate work unit, then the optimal arrangement would be to trial factor while cost_of_trial_factoring * chance_of_P-1_factoring is less than cost_of_P-1_factoring * chance_of_trial_factoring. Then P-1 factorise. Then complete trial factorisation according to the above formula.
>
> Interesting - but I think the effect would be small.

Perhaps, but the effort to implement (without spinning off P-1 into a separate work type) would also be small. The existing formula (cost_of_trial_factoring < chance_of_trial_factoring * cost_of_LL_test * 2.03) ignores the effect of P-1. A better formula would be:

cost_of_trial_factoring < chance_of_trial_factoring * (cost_of_P-1_factoring + (1 - chance_of_P-1_factoring) * cost_of_LL_test * 2.03)

This change on its own would surely be trivial to implement. But then, if the P-1 fails, you might as well carry on and do a little more TF, which brings us back to the simpler formula I gave earlier. You might also want to lower your P-1 bounds a shade to take into account the fact that a successful factorisation may only save a little extra TF effort. The end result is a *tiny* reduction in the probability of finding a factor (because of the slight reduction in P-1 bounds). But you'll do less work, and if you do find one, you'll probably find it earlier.

How much earlier? I've just completed P-1 on M656677, which took 8787 seconds - about 1.5% of the DC, estimated to take 6 days 23 hours 43 minutes, i.e. 603780 seconds - compared with a 2.6% chance of success. My guess is that this would put it optimally just before the last round of TF, which is (or should be) close to the break-even point. I'd need an estimate for the time of the last round of TF to be any more precise. The calculation would also need to be repeated for a first-time LL assignment.

My final remark on the subject of early P-1 is the observation that TF and P-1 search overlapping factor spaces. Are there any gains to be had in TF from sieving out some of the smooth k's that an early P-1 failed to find? Could you structure the factor candidate generation code so that the unsieved list doesn't contain any smooth k's in the first place?
Even if it's not possible or cost effective to do either of these, the fact that there are smooth k's among the candidates should lower the estimate for the probability of success.

Another reason to spin off P-1 into a separate work type is to allow machines with heaps of memory to concentrate on this kind of work. A little experimentation shows that the probability of a successful P-1 with 400MB is roughly double that of one with a minimal configuration, and some machines will be doing LLs/DCs without the ability to P-1 at all. What I envisage is that a virgin exponent would first go out to P-1 (which would also do the first few rounds of TF, according to my earlier formula), then as a factoring assignment (to completion), then finally to LL and DC.

> What about factoring to a fractional depth? With a roughly logarithmic distribution of factors, surely about half the factors between 2^n and 2^(n+1) would be smaller than 2^(n+0.5), whilst searching to 2^(n+0.5) would take only about 41% of the time taken to search the whole interval.

This had occurred to me. One could calculate the exact break-even point and TF thus far and no further. However the gains would be minimal for many exponents in the middle of the depth 'bands', which are already being near-optimally factored. Even those at the edges of the depth bands can't be very far from the break-even point. Also the server would need to record more information to enable TF to be taken further at a later
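To make the smooth-k idea concrete, here is a rough sketch (my own illustration, not Prime95's actual sieve; the bounds and names are made up) of trial-factor candidate generation using the 2kp+1 form and the q = +/-1 (mod 8) restriction, optionally skipping k values that a fruitless P-1 run to bound B1 has already ruled out (ignoring the stage-2 refinement for simplicity):

    def is_smooth(k: int, bound: int) -> bool:
        """True if every prime factor of k is <= bound."""
        d = 2
        while d <= bound and d * d <= k:
            while k % d == 0:
                k //= d
            d += 1
        return k <= bound

    def tf_candidates(p: int, k_max: int, p1_bound: int = 0):
        """Yield trial-factor candidates q = 2*k*p + 1 for M_p = 2**p - 1.
        Candidates must satisfy q = +/-1 (mod 8); real code also sieves out
        q divisible by small primes.  If p1_bound > 0, skip k values that an
        unsuccessful P-1 run to that bound has already excluded, since such a
        run would have found any factor whose k is p1_bound-smooth."""
        for k in range(1, k_max + 1):
            q = 2 * k * p + 1
            if q % 8 not in (1, 7):
                continue
            if p1_bound and is_smooth(k, p1_bound):
                continue
            yield q

    # e.g. how many candidates below k = 10000 does a prior P-1 to B1 = 1000 remove?
    p = 656677
    plain = sum(1 for _ in tf_candidates(p, 10000))
    sieved = sum(1 for _ in tf_candidates(p, 10000, p1_bound=1000))
    print(plain, sieved)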
Re: Mersenne: Re: Factoring benefit/cost ratio
- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tuesday, December 04, 2001 2:38 AM
Subject: Re: Mersenne: Re: Factoring benefit/cost ratio

> > and one that is decades (at least) behind GIMPS. The only reason we do any factoring at all is to reduce the time spent on LL testing.
>
> But if factoring is not really part of GIMPS's purpose (and I agree it isn't), how can a separate factoring effort be behind GIMPS at all? Aren't they measuring their progress in a different, non-comparable, dimension?

Say rather that there are various criteria by which the two projects can be compared. It's probably true that it would take them decades or more to completely factor a set of prime-exponent Mersenne numbers comparable to those which GIMPS has verified composite (and that's ignoring the effort to factorise the cofactors of Mersennes with composite exponents). They're probably not that far behind GIMPS in terms of total computing done. How far will depend upon how good they are at mobilising support.

[...]

> In reply to a statement of mine about the extra benefit of finding a specific factor, Daran G. wrote:
> > I can see no way of objectively quantifying this benefit.
>
> Well -- if there's no objective quantification of the extra benefit of finding a specific factor, then it seems to me that there's no objectively quantifiable justification for saying that it's not valuable to do a certain amount of extra P-1 factoring on a once-LL'd Mnumber. :)

Can't argue with that.

[...]

> Daran G. wrote:
> > It seems to be implicitly acknowledged in the way the trial factoring depths are determined.
>
> But don't forget that this refers to a depth determined by Prime95 _in the context of calculating the factoring tradeoff point for maximizing GIMPS throughput for the Test= worktype_, NOT the context of a Factor= or Pminus1= worktype where the user has explicitly specified factoring limits, possibly for reasons beyond the ken of Prime95.

That seemed to be the question you were asking.

[...]

> I'm not saying that Prime95 should incorporate into its calculations a nonzero value for the benefit of finding a specific factor. I'm saying that it is rational for someone to decide to factor past the Prime95-calculated tradeoff points, and that it is unjustified to criticize extra factoring on the grounds that going past the Prime95-calculated tradeoff points is wasted effort.

I agree. But you can look at this in terms of an even greater project than GIMPS - The Great Distributed Computing Project that encompasses all such efforts, and aims to use spare computing cycles to increase the knowledge base of humankind generally. What contribution one chooses to make depends upon one's own personal preferences.

> Richard B. Woods

Daran G.
Re: Mersenne: Re: Factoring benefit/cost ratio
- Original Message -
From: George Woltman [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, December 04, 2001 3:41 AM
Subject: Re: Mersenne: Re: Factoring benefit/cost ratio

> I think more of the discussion has centered around stats and the formula for picking how far to trial factor, rather than whether factoring is of some mathematical value...

That's true, at least for my own recent contributions to this discussion. However, what I've been trying to do is float a few ideas for increasing the effectiveness of the factorising effort. If P-1 was done early, as I suggested, with bounds unchanged, then exactly the same Mersennes will get factored sooner, on average, and with less work. Ditto with the smooth k sieving idea. And if the cost of factoring is reduced, the optimal depth increases. The benefits of such an exercise are even greater if factors are given an intrinsic value of their own.

Of course, the real scarce resource is not computer cycles, nor even ideas for improvement, but programming effort to implement. Calculating that particular trade-off I leave to others.

Daran G.
Re: Mersenne: Re: Factoring benefit/cost ratio
- Original Message -
From: George Woltman [EMAIL PROTECTED]
To: Gerry Snyder [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Saturday, December 01, 2001 10:39 PM
Subject: Re: Mersenne: Re: Factoring benefit/cost ratio

> I prefer a factor to a double-check. But it is hard to quantify "prefer" in a mathematical formula for computing trial factoring limits. Prime95 uses the formula: cost_of_factoring must be less than chance_of_finding_a_factor times 2.03 * the cost_of_an_LL_test.

Shouldn't that be 1.015 for double-checking assignments?

Also, does the cost part of the calculation recognise the increased cost of trial-factorisation after 2^64? I've noticed on occasion that I've had to do an extra round of trial factoring before proceeding with a double-check. Does this indicate that previous factorisation has been non-optimal, or have the estimates for the relative costs of factoring vs. LL testing changed with the introduction of new hardware?

Finally, if P-1 factorisation were to be spun off into a separate work unit, then the optimal arrangement would be to trial factor while cost_of_trial_factoring * chance_of_P-1_factoring is less than cost_of_P-1_factoring * chance_of_trial_factoring. Then P-1 factorise. Then complete trial factorisation according to the above formula.

Daran G.
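A sketch of the ordering rule described above (my own illustration; the costs and probabilities are made-up placeholders, loosely based on the M656677 figures quoted elsewhere in this thread, not Prime95's real estimates):

    # Trial factor one bit level at a time while it is better value than P-1,
    # then run P-1, then finish trial factoring by the usual 2.03 rule.
    def plan_work(tf_levels, p1_cost, p1_chance, ll_cost, ll_factor=2.03):
        """tf_levels: list of (cost, chance_of_finding_a_factor) per bit level."""
        plan, remaining = [], list(tf_levels)
        # Phase 1: cheap TF first, while cost * chance_of_P-1 < cost_of_P-1 * chance.
        while remaining and remaining[0][0] * p1_chance < p1_cost * remaining[0][1]:
            plan.append(("TF", remaining.pop(0)))
        plan.append(("P-1", (p1_cost, p1_chance)))
        # Phase 2: finish TF while a level costs less than the LL time it is
        # expected to save (the P-1 cost is sunk by this point).
        for cost, chance in remaining:
            if cost < chance * ll_cost * ll_factor:
                plan.append(("TF", (cost, chance)))
        return plan

    # Illustrative numbers: four remaining bit levels (seconds, factor chance),
    # P-1 at ~8800 s with a ~3% chance, a double-check at ~600000 s.
    levels = [(600, 0.015), (1200, 0.014), (2400, 0.013), (4800, 0.012)]
    print(plan_work(levels, p1_cost=8800, p1_chance=0.03, ll_cost=600000))

With these placeholder numbers the P-1 run lands just before the last round of TF, which matches the guess made later in the thread.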
Re: Mersenne: Re: Factoring benefit/cost ratio
On 3 Dec 2001, at 20:45, Daran wrote:
> Shouldn't that be 1.015 for double-checking assignments?

I think 1.03. However you do have a point. P-1 limits do depend on the trial factoring depth, and are much smaller for DC assignments than for first tests, so there is already something built in.

> Also does the cost part of the calculation recognise the increased cost of trial-factorisation after 2^64?

Yes. The trial factoring depth is constant at 64 bits from ~8.5 million to ~13 million. Don't forget that the number of candidates which need to be checked is _inversely_ proportional to the exponent.

> I've noticed on occasion that I've had to do an extra round of trial factoring before proceeding with a double-check. Does this indicate that previous factorisation has been non-optimal, or have the estimates for the relative costs of factoring vs. LL testing changed with the introduction of new hardware?

I think the latter has something to do with it - PPro is about twice as efficient at factoring compared with LL testing as a plain Pentium. This is because of much improved pipeline organization, including the provision of spare registers enabling speculative execution, which greatly increased the throughput of the floating point unit in particular. Another factor is that early versions of the program were unable to factor as deep as current versions.

> Finally if P-1 factorisation were to be spun off into a separate work unit, then the optimal arrangement would be to trial factor while cost_of_trial_factoring * chance_of_P-1_factoring is less than cost_of_P-1_factoring * chance_of_trial_factoring. Then P-1 factorise. Then complete trial factorisation according to the above formula.

Interesting - but I think the effect would be small.

What about factoring to a fractional depth? With a roughly logarithmic distribution of factors, surely about half the factors between 2^n and 2^(n+1) would be smaller than 2^(n+0.5), whilst searching to 2^(n+0.5) would take only about 41% of the time taken to search the whole interval.

Regards
Brian Beesley
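A quick check of the fractional-depth numbers (my own arithmetic, not from the post), using the standard heuristic that the chance of a factor in [x, y) is roughly proportional to log(y/x), while trial-factoring cost is proportional to the number of candidates, i.e. to the length of the interval:

    import math

    # Fraction of the factors in [2^n, 2^(n+1)) expected below 2^(n+0.5):
    factor_fraction = math.log(2 ** 0.5) / math.log(2)    # = 0.5
    # Fraction of the full interval's trial-factoring work needed to get there:
    time_fraction = (2 ** 0.5 - 1) / (2 - 1)               # ~ 0.414
    print(factor_fraction, time_fraction)                  # 0.5  0.414...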
Re: Mersenne: Re: Factoring benefit/cost ratio
Responses to a couple of postings --

Steve Harris wrote:
> Actually, Richard's statement that a 'Factored' status is better for GIMPS than a 'Two LL' status is not quite true. It's better for the mathematical community as a whole,

"Better for the mathematical community as a whole" is actually almost exactly what I had in mind when I started composing that paragraph ... honest! But then I started revising, and lost some stuff in the process.

Nevertheless, I stand by my assertion that a 'Factored' status is better for GIMPS than a 'Two LL' status ... insofar as our LL testing remains less reliable than our factor verification. Double-checking certainly improves LL result reliability, but I think our record shows that a verified factor is still slightly (by a minute but nonzero margin) more reliable an indicator of compositeness than two matching nonzero LL residues. (See earlier discussions of potential reasons for incorrect LL results. Some error sources would be more likely to affect the longer LL test than the shorter factor verification.)

> It matters to those who are attempting to fully factor Mersenne numbers, but that's a different project altogether,

Some of us like to participate in both. :-)

> and one that is decades (at least) behind GIMPS. The only reason we do any factoring at all is to reduce the time spent on LL testing.

But if factoring is not really part of GIMPS's purpose (and I agree it isn't), how can a separate factoring effort be behind GIMPS at all? Aren't they measuring their progress in a different, non-comparable, dimension?

> Besides, if you do manage to find a 75-digit factor of a 2-million-digit Mersenne number, that still leaves a remainder of nearly two million digits. Really not much help :-)

A journey of a thousand miles begins with but a single step. Two million three-foot steps total more than one thousand miles.

- - - - -

In reply to a statement of mine about the extra benefit of finding a specific factor, Daran G. wrote:
> I can see no way of objectively quantifying this benefit.

Well -- if there's no objective quantification of the extra benefit of finding a specific factor, then it seems to me that there's no objectively quantifiable justification for saying that it's not valuable to do a certain amount of extra P-1 factoring on a once-LL'd Mnumber. :)

In response to my writing:
> This reflects the view (with which I agree) that it is more valuable to know a specific factor of a Mnumber than to know that a Mnumber is composite but not to know any specific factor of that Mnumber. So a Factored status is better for GIMPS than a Two LL status, but calculations of factoring benefit that consider only the savings of L-L test elimination are neglecting the difference between those two statuses. If one consciously wants to neglect that difference ... well, okay ... but I prefer to see that explicitly acknowledged.

Daran G. wrote:
> It seems to be implicitly acknowledged in the way the trial factoring depths are determined.

But don't forget that this refers to a depth determined by Prime95 _in the context of calculating the factoring tradeoff point for maximizing GIMPS throughput for the Test= worktype_, NOT the context of a Factor= or Pminus1= worktype where the user has explicitly specified factoring limits, possibly for reasons beyond the ken of Prime95.

Daran continues:
> If one places a non-zero value on a known factor, then the utility of extra factoring work on untested, once tested, and verified composites would be increased.

Yes.
But that doesn't have to affect Prime95's calculation of tradeoff points _for maximizing throughput of the Test= worktype_.

I'm not saying that Prime95 should incorporate into its calculations a nonzero value for the benefit of finding a specific factor. I'm saying that it is rational for someone to decide to factor past the Prime95-calculated tradeoff points, and that it is unjustified to criticize extra factoring on the grounds that going past the Prime95-calculated tradeoff points is wasted effort.

Richard B. Woods
Re: Mersenne: Re: Factoring benefit/cost ratio
At 08:38 PM 12/3/2001 -0600, [EMAIL PROTECTED] wrote:
> Nevertheless, I stand by my assertion that a 'Factored' status is better for GIMPS than a 'Two LL' status

I think everyone is in agreement on this.

> But if factoring is not really part of GIMPS's purpose (and I agree it isn't),

I think of GIMPS as two projects. The larger project is "let's find the next Mersenne prime". The much smaller project is "let's maximize our knowledge of all Mersenne numbers".

> I'm saying that it is rational for someone to decide to factor past the Prime95-calculated tradeoff points,

I agree. I've done several P90 CPU years of that myself.

I think more of the discussion has centered around stats and the formula for picking how far to trial factor, rather than whether factoring is of some mathematical value. I'm not inclined to change the trial factoring formula. If I recall correctly, when I last computed the formula I erred on the side of more factoring rather than less.

As to the stats, maybe, just maybe, we can upgrade our stats system in 2002. I know from other distributed projects that many users are fanatical about any changes in stats. We will have to tread very carefully!

-- George
Re: Mersenne: Re: Factoring benefit/cost ratio
On 1 Dec 2001, at 17:39, George Woltman wrote:
> This is because my rather limited reporting software only adds up the LL results in the verified and one-LL-tests databases. Once an exponent is factored it is removed from those databases.

The other problem here is that the known factors database does not include the discoverer.

> I prefer a factor to a double-check. But it is hard to quantify "prefer" in a mathematical formula for computing trial factoring limits. Prime95 uses the formula: cost_of_factoring must be less than chance_of_finding_a_factor times 2.03 * the cost_of_an_LL_test. This should maximize GIMPS throughput. The 2.03 is because we must run two (or more) LL tests to do a double-check.

Again there is a complication, since the ratio of time to do factoring assignment X to time to do LL/DC assignment Y varies according to the processor. e.g. PIIs are relatively quick at factoring, whereas P4s are much more efficient LL testers.

> P.S. I'll comment on the M#39 news later. For now let's celebrate our grand accomplishment rather than worry about non-optimal press coverage.

Hear hear. Congratulations to the discoverer (whoever he/she is), to George on his program finding a fifth Mersenne Prime, and to everyone involved in the project, without whom we wouldn't have reached this milestone.

Regards
Brian Beesley
Re: Mersenne: Re: Factoring benefit/cost ratio
Steve Harris wrote:
> Actually, Richard's statement that a 'Factored' status is better for GIMPS than a 'Two LL' status is not quite true. It's better for the mathematical community as a whole, but not for GIMPS. GIMPS is looking for primes, not factors, and without skipping over any.

Hmmm, I must be having a senior moment. I would swear George said that one way a person could lose credit for a correct LL test is if later factoring finds a factor. Is my feeble brain making this up, or is finding a factor more important than stated above?

Gerry

--
mailto:[EMAIL PROTECTED]
Gerry Snyder, AIS Director, Symposium Chair, Region 15 RVP
Member San Fernando Valley, Southern California Iris Societies
in warm, winterless Los Angeles--USDA 9b-ish, Sunset 18-19
my work: helping generate data for: http://galileo.jpl.nasa.gov/
Re: Mersenne: Re: Factoring benefit/cost ratio
Hi Gerry,

At 11:03 AM 12/1/2001 -0800, Gerry Snyder wrote:
> I must be having a senior moment. I would swear George said that one way a person could lose credit for a correct LL test is if later factoring finds a factor.

This is because my rather limited reporting software only adds up the LL results in the verified and one-LL-tests databases. Once an exponent is factored it is removed from those databases.

> Is my feeble brain making this up, or is finding a factor more important than stated above?

I prefer a factor to a double-check. But it is hard to quantify "prefer" in a mathematical formula for computing trial factoring limits. Prime95 uses the formula: cost_of_factoring must be less than chance_of_finding_a_factor times 2.03 * the cost_of_an_LL_test. This should maximize GIMPS throughput. The 2.03 is because we must run two (or more) LL tests to do a double-check.

-- George

P.S. I'll comment on the M#39 news later. For now let's celebrate our grand accomplishment rather than worry about non-optimal press coverage.
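A minimal sketch of this throughput rule (my own illustration; the numbers below are made up, and Prime95's real estimates depend on the CPU and the exponent):

    def worth_factoring_deeper(cost_of_factoring, chance_of_factor, cost_of_ll,
                               ll_tests_saved=2.03):
        """True if the next round of trial factoring is expected to save more
        LL-testing time than it costs; 2.03 reflects the first test plus the
        double-check (and the occasional extra test)."""
        return cost_of_factoring < chance_of_factor * ll_tests_saved * cost_of_ll

    # e.g. one more bit level costing ~2 hours, a ~1/77 chance of a factor there,
    # against an LL test of ~80 hours:
    print(worth_factoring_deeper(2.0, 1 / 77, 80.0))   # True (2 < ~2.1)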
Re: Mersenne: Re: Factoring benefit/cost ratio
George did say that, and I was aware of his statement, but that still has no effect on the point I was making. George's GIMPS stats also give no credit at all for finding factors, but that doesn't mean he considers finding factors worthless.

Steve

-Original Message-
From: Gerry Snyder [EMAIL PROTECTED]
To: mer [EMAIL PROTECTED]
Date: Saturday, December 01, 2001 12:19 PM
Subject: Re: Mersenne: Re: Factoring benefit/cost ratio
Mersenne: Re: Factoring benefit/cost ratio
Actually, Richard's statement that a 'Factored' status is better for GIMPS than a 'Two LL' status is not quite true. It's better for the mathematical community as a whole, but not for GIMPS. GIMPS is looking for primes, not factors, and without skipping over any. This means all candidates must be tested and the non-primes eliminated, and it doesn't matter whether they are eliminated by 'factored' or by 'two matching nonzero LL residues'. It matters to those who are attempting to fully factor Mersenne numbers, but that's a different project altogether, and one that is decades (at least) behind GIMPS. The only reason we do any factoring at all is to reduce the time spent on LL testing. Besides, if you do manage to find a 75-digit factor of a 2-million-digit Mersenne number, that still leaves a remainder of nearly two million digits. Really not much help :-)

Regards,
Steve Harris

-Original Message-
From: Daran [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
Date: Friday, November 30, 2001 2:00 AM
Subject: Re: Factoring benefit/cost ratio (was: Mersenne: Fw: The Mersenne Newsletter, issue #18)

- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Saturday, November 24, 2001 6:49 PM
Subject: Factoring benefit/cost ratio (was: Mersenne: Fw: The Mersenne Newsletter, issue #18)

> But one's factoring benefit calculation might ["should" would be in line with the popular theme of prescribing what's best for other GIMPS participants :)] include not only the time savings of eliminating the need for one or two L-L tests, but also the extra benefit of finding a specific factor.

I can see no way of objectively quantifying this benefit.

> In the GIMPS Search Status table at www.mersenne.org/status.htm the march of progress is from Status Unknown to Composite - One LL to Composite - Two LL to ... Composite - Factored. More desirable - whether or not recorded on that page - would be 'Composite - Least (or greatest) factor known'. Most desirable (other than Prime) would be 'Composite - Completely factored'. This reflects the view (with which I agree) that it is more valuable to know a specific factor of a Mnumber than to know that a Mnumber is composite but not to know any specific factor of that Mnumber. So a Factored status is better for GIMPS than a Two LL status, but calculations of factoring benefit that consider only the savings of L-L test elimination are neglecting the difference between those two statuses. If one consciously wants to neglect that difference ... well, okay ... but I prefer to see that explicitly acknowledged.

It seems to be implicitly acknowledged in the way the trial factoring depths are determined. If one places a non-zero value on a known factor, then the utility of extra factoring work on untested, once tested, and verified composites would be increased. It would have to be set very high indeed to make it worthwhile returning to verified composite Mersennes.

Richard Woods

Daran G.