For the benefit of our wider BOINC readership, could we exercise a little care and precision in our terminology and logic, please?
As you say, these overflow results may be either valid or invalid, and that will be determined solely by the server, some days or weeks after BOINC reports them - too late to be of any use.

Every one of these is initially reported as a "success", with exit status 0. If any language translation uses the same word for 'valid' and 'success', it should be corrected - provided two suitably distinct words can be found in that language, of course.

These '-9' overflow codes that we introverted SETI-zens are so fond of bandying about are not BOINC exit codes. They are buried deep in the <![CDATA[ XML structure carried in stderr_out. BOINC has no business poking its nose in there - that would be akin to the postal service steaming open your mail to see if you'd enclosed a winning lottery ticket.

Clearly the -9 outcome is analysed and recorded, because the recent rate data is reported on SETI's Science status page. I don't know whether the recorded rate is derived from the text in stderr_out, or (more likely) the data count in the uploaded science result file. In any event, it's pretty clear that it is taken from the canonical result (validated tasks only), so it doesn't help us with this BOINC problem.

> IMO it's not about credits (big or zero) at all, it's about to prevent host
> to download excessive number of tasks that will be most probably just
> trashed.
> SETI's overflow -9 is hard case, cause "invalidness" of task will be
> detected much later. Task returned as "valid" one.
> Watching for execution time will not distinguish between "true" overflows
> and broken GPU results actually, sometimes broken GPU will return invalid
> overflow after ~20 seconds of work, not immediately.
> But in any case, overflowed tasks return "-9" instead of zero, so can be
> easily distinguished from all others.
> IMO SETI could treat these tasks as "computational errors" in sense of
> quota limits and not await their validation.
> Surely it will negatively affect all hosts in case of very noisy data tape,
> but probability of getting let say 100 "true" overflows in row for any
> particular host could be evaluated by SETI's staff to see if it should be
> taken into account or broken GPU much more probably source of overflows.
>
> ----- Original Message -----
> From: "Jorden van der Elst" <[email protected]>
> To: "Josef W. Segur" <[email protected]>
> Cc: "BOINC Developers Mailing List" <[email protected]>
> Sent: Wednesday, May 26, 2010 2:55 PM
> Subject: Re: [boinc_dev] host punishment mechanism revisited
>
>
>> On Wed, May 26, 2010 at 12:29 PM, Josef W. Segur <[email protected]>
>> wrote:
>>> As has been mentioned a few times, there's a fundamental difficulty
>>> in applying host punishment quickly because it can depend on when
>>> a result is validated.
>>
>> Not if there are strict(er) rules to what is "a valid result". Is a
>> valid result just those that are not returned immediately with an
>> error, or is it all work after validation? Is it strictly about the
>> work being done, or is it to safe-guard against "wrong-doings with
>> credit"?
>>
>> At this moment a valid result is work returned before the deadline and
>> without errors, but prior to validation.
>>
>> E.g. at this moment - taking Seti again as an example - there are
>> Linux computers out there that return valid results that show zero CPU
>> time, thus claim zero credit. If paired against another box claiming
>> zero credits...
>>
>> There are also all_platform computers out there that still run BOINC
>> 4.xx, which categorically claim zero credit. And usually everyone in
>> that group gets those zero credits, as the work is validated as being
>> correct.
>>
>> If it's strictly about work being returned immediately with an error,
>> then the present restrictions may be adequate.
>> If it's to safe-guard against zero CPU time claimers who apparently
>> (after validation) return good work, does there need to be something
>> else done? What?
>> If it's to safe-guard against zero credit claimers/zero credit
>> granters? The best solution is to get them to update to a newer
>> version, but how?
>>
>>
>> --
>> -- Jord.
>> _______________________________________________
>> boinc_dev mailing list
>> [email protected]
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
