Slawomir:
I checked in a change to validator.cpp which (I hope) will fix this problem.
Please test it when convenient and let me know whether it works.
Note to other projects: if you have validators that mark
isolated jobs as invalid (i.e. init_result() can return an error)
then this change may fix incorrect behavior like
generating too many job instances.
-- David
On 12-Mar-2013 4:55 AM, Slawomir Rzeznicki wrote:
Hello,
Recently I've run into a problem at Enigma@Home.
The project uses quorum = 1; the other result settings are:
minimum quorum 1
initial replication 1
max # of error/total/success tasks 3, 10, 6
Everything is fine until the server receives a success result which is then
marked as invalid by the validator.
The validator leaves the outcome at RESULT_OUTCOME_SUCCESS and sets
validate_state to VALIDATE_STATE_INVALID.
The result is then marked as 'Completed, marked as invalid' just like I want.
The problem is that the validator also bumps target_nresults by 1, which
causes two new results to be spawned instead of one.
In theory this is not a problem: here at Enigma@Home each result runs on
its own, and it doesn't really matter whether I use 1 workunit -> 1 result or
1 workunit -> many results (I use 1:1 mainly because it makes validation easier).
However, due to random host turnaround times, some of the 'bonus work' will be
cancelled by the server (if one of the hosts is much faster and the other one
then contacts the server) or, worse, it will be wasted if the other host does
not contact the server for a long period of time.
The source of the problem is in validator.cpp line 586:
    // if #success results >= target_nresults,
    // we need more results, so bump target_nresults
    // NOTE: nsuccess_results should never be > target_nresults,
    // but accommodate that if it should happen
    //
    if (nsuccess_results >= wu.target_nresults) {
        wu.target_nresults = nsuccess_results+1;
        transition_time = IMMEDIATE;
    }
As a test, I commented out these lines and the problem went away.
Does this count as a bug, at least when the project uses quorum = 1?
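To make the arithmetic concrete, here is a minimal sketch of the effect described above. It is a toy model, not BOINC code: WorkUnitModel, results_to_create() and new_results_after_one_invalid() are hypothetical names, and the rule that new results are created until the usable (non-invalid) ones reach target_nresults is an assumption about how the rest of the server behaves. Only the comparison in bump_target_if_needed() is taken from the lines quoted above.

```cpp
#include <algorithm>

// Simplified stand-in for the workunit field involved
// (hypothetical; not the real BOINC WORKUNIT struct).
struct WorkUnitModel {
    int target_nresults;
};

// Mirrors the check quoted from validator.cpp:
// if #success results >= target_nresults, bump target_nresults.
void bump_target_if_needed(WorkUnitModel& wu, int nsuccess_results) {
    if (nsuccess_results >= wu.target_nresults) {
        wu.target_nresults = nsuccess_results + 1;
    }
}

// ASSUMPTION: the server creates new results until the number of usable
// results (those not marked invalid) reaches target_nresults.
int results_to_create(const WorkUnitModel& wu, int usable_results) {
    return std::max(0, wu.target_nresults - usable_results);
}

// Scenario from this message: target_nresults = 1, and the single returned
// result keeps outcome SUCCESS but is marked invalid by the validator.
int new_results_after_one_invalid(bool apply_bump) {
    WorkUnitModel wu{1};
    const int nsuccess_results = 1;  // counted by outcome, so still 1
    const int usable_results = 0;    // the only result was marked invalid
    if (apply_bump) {
        bump_target_if_needed(wu, nsuccess_results);
    }
    return results_to_create(wu, usable_results);
}
```

Under these assumptions, new_results_after_one_invalid(true) yields 2 replacement results, while new_results_after_one_invalid(false) yields the single replacement I would expect with quorum = 1.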
Regards,
Slawomir Rzeznicki
http://www.enigmaathome.net
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.