Re: [boinc_dev] host punishment mechanism revisited

Richard Haselgrove Wed, 26 May 2010 06:31:37 -0700

For the benefit of our wider BOINC readership, could we exercise a little 
care and precision in our terminology and logic, please?


As you say, these overflow results may be either valid or invalid, and that 
will be determined solely by the server, some days or weeks after BOINC 
reports - too late to be of any use.

Every one of these is initially reported as a "success", with exit status 0. 
If any language translation uses the same word for 'valid' and 'success', it 
should be corrected - provided two suitably distinct words can be found in 
that language, of course.

These '-9' overflow codes that we introverted SETI-zens are so fond of 
bandying about are not BOINC exit codes. They are buried deep in the 
<![CDATA[ xml structure carried in stderr_out. BOINC has no business poking 
its nose in there - that would be akin to the postal service steaming open 
your mail to see if you'd enclosed a winning lottery ticket.
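
To make the distinction concrete, here is a minimal sketch in Python. It is 
not BOINC code: the ReportedResult record, its field names, the 
ValidateState values and the stderr text are all hypothetical, chosen only 
to mirror the terms used above. It assumes that the exit status is the only 
thing BOINC looks at when a task is reported, and that validity is filled 
in by the server at some later time.

```python
from dataclasses import dataclass
from enum import Enum


class ValidateState(Enum):
    # Hypothetical states, for illustration only.
    PENDING = "pending"    # server has not yet compared this result
    VALID = "valid"
    INVALID = "invalid"


@dataclass
class ReportedResult:
    exit_status: int       # what BOINC sees when the task is reported
    stderr_out: str        # opaque application text; BOINC merely carries it
    validate_state: ValidateState = ValidateState.PENDING


def is_success(result: ReportedResult) -> bool:
    """'Success' at report time: the application exited with status 0."""
    return result.exit_status == 0


def is_valid(result: ReportedResult) -> bool:
    """'Valid' is only known once the server has validated the result."""
    return result.validate_state is ValidateState.VALID


# Illustrative stderr text only, not copied from a real task.
overflow = ReportedResult(0, "<![CDATA[\n... -9 result_overflow ...\n]]>")
normal = ReportedResult(0, "<![CDATA[\n... finished normally ...\n]]>")

# Both are a "success" when reported, and neither is yet "valid".
print(is_success(overflow), is_success(normal))   # True True
print(is_valid(overflow), is_valid(normal))       # False False
```

Under these assumptions the two results look identical at report time; the 
only difference is inside the application's own text.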

Clearly the -9 outcome is analysed and recorded, because the recent rate 
data is reported on SETI's Science status page. I don't know whether the 
recorded rate is derived from the text in stderr_out, or (more likely) the 
data count in the uploaded science result file. In any event, it's pretty 
clear that it is taken from the canonical result (validated tasks only), so 
it doesn't help us with this BOINC problem.


> IMO it's not about credits (big or zero) at all; it's about preventing a
> host from downloading an excessive number of tasks that will most probably
> just be trashed.
> SETI's overflow -9 is a hard case, because the "invalidness" of a task will
> be detected much later. The task is returned as a "valid" one.
> Watching the execution time will not actually distinguish between "true"
> overflows and broken GPU results; sometimes a broken GPU will return an
> invalid overflow after ~20 seconds of work, not immediately.
> But in any case, overflowed tasks return "-9" instead of zero, so they can
> be easily distinguished from all others.
> IMO SETI could treat these tasks as "computational errors" for the purposes
> of quota limits and not await their validation. Surely this will negatively
> affect all hosts in the case of a very noisy data tape, but the probability
> of getting, let's say, 100 "true" overflows in a row for any particular host
> could be evaluated by SETI's staff to see whether it should be taken into
> account, or whether a broken GPU is the much more probable source of the
> overflows.
>
> ----- Original Message ----- 
> From: "Jorden van der Elst" <[email protected]>
> To: "Josef W. Segur" <[email protected]>
> Cc: "BOINC Developers Mailing List" <[email protected]>
> Sent: Wednesday, May 26, 2010 2:55 PM
> Subject: Re: [boinc_dev] host punishment mechanism revisited
>
>
>> On Wed, May 26, 2010 at 12:29 PM, Josef W. Segur <[email protected]>
>> wrote:
>>> As has been mentioned a few times, there's a fundamental difficulty
>>> in applying host punishment quickly because it can depend on when
>>> a result is validated.
>>
>> Not if there are strict(er) rules to what is "a valid result". Is a
>> valid result just those that are not returned immediately with an
>> error, or is it all work after validation? Is it strictly about the
>> work being done, or is it to safe-guard against "wrong-doings with
>> credit"?
>>
>> At this moment a valid result is work returned before the deadline and
>> without errors, but prior to validation.
>>
>> E.g. at this moment - taking Seti again as an example - there are
>> Linux computers out there that return valid results that show zero CPU
>> time, thus claim zero credit. If paired against another box claiming
>> zero credits...
>>
>> There are also all_platform computers out there that still run BOINC
>> 4.xx, which categorically claim zero credit. And usually everyone in
>> that group gets those zero credits, as the work is validated as being
>> correct.
>>
>> If it's strictly about work being returned immediately with an error,
>> then the present restrictions may be adequate.
>> If it's to safe-guard against zero CPU time claimers who apparently
>> (after validation) return good work, does there need to be something
>> else done? What?
>> If it's to safe-guard against zero credit claimers/zero credit
>> granters? The best solution is to get them to update to a newer
>> version, but how?
>>
>>
>> -- 
>> -- Jord.
>> _______________________________________________
>> boinc_dev mailing list
>> [email protected]
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
>>


