Looks like you should also ask WHEN the retry occurs if you enable it.

If the retry is immediate, expect something very much like an endless 
loop, until the missing output file finally arrives.

Robert Miles

-------

Date: Sat, 12 May 2012 15:05:21 -0500
From: Travis Desell<[email protected]>

So my validator is running into a weird issue where output files are frequently 
missing from results when the validator attempts to validate them.  Sometimes 
they seem to show up later, but when I set the retry flag to true in the 
validator, it seems to enter an infinite loop where it repeatedly tries to 
validate the results.

Any reason why a not-insignificant number of results would reach the validator 
without an output file? Checking the database, these are results that 
completed successfully (everything looks fine in the stderr output, which is 
written at the same time as the output files).


Basically, I'm reading the result file into a string:

string get_file_as_string(string file_path) {   // throws int if the file cannot be opened
     //read the entire contents of the file into a string
     ifstream sites_file(file_path.c_str());

     if (!sites_file.is_open()) {
         throw 1;
     }

     std::string fc;

     sites_file.seekg(0, std::ios::end);
     fc.reserve(sites_file.tellg());
     sites_file.seekg(0, std::ios::beg);

     fc.assign((std::istreambuf_iterator<char>(sites_file)),
               std::istreambuf_iterator<char>());

     return fc;
}
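
For comparison, here is a minimal sketch of the same read that reports a missing file through its return value instead of an exception. The name read_file_to_string is hypothetical and not part of the code above or of BOINC; it only illustrates one way a caller could distinguish "file not there yet" from other failures without a try/catch.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not in the original validator): reads the whole file
// into `out`. Returns false and leaves `out` untouched if the file cannot be
// opened, so the caller can decide whether to retry later.
bool read_file_to_string(const std::string& file_path, std::string& out) {
    std::ifstream in(file_path.c_str(), std::ios::binary);
    if (!in.is_open()) {
        return false;
    }

    // Copy every byte from the stream into the string.
    out.assign(std::istreambuf_iterator<char>(in),
               std::istreambuf_iterator<char>());
    return true;
}
```

A caller would check the boolean and only then parse the contents, which keeps the "file missing" case separate from parse errors.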

Which throws an exception if it can't open the file.  If this happens the file 
parsing function returns the error:

     string fc;
     try {
         fc = get_file_as_string(file_path);
     } catch (int err) {
         log_messages.printf(MSG_CRITICAL,
             "[RESULT#%d %s] get_data_from_result: could not open file for result\n",
             result.id, result.name);
         log_messages.printf(MSG_CRITICAL, "     file path: %s\n",
             file_path.c_str());
         //retry this result?
         return err;
     }

And then if that happens I set the retry flag to true and return an error 
within check_set:

             retval = get_data_from_result(uint32_max, checksum, failed_sets,
                                           results[i]);
             if (retval) {
                 log_messages.printf(MSG_CRITICAL,
                     "result[%2d] - id: %10d, error getting data from result: %d, retrying.\n",
                     i, results[i].id, retval);
                 retry = true;
                 return retval;
             }

(All the validation code is here:
https://github.com/travisdesell/Subset-Sum/blob/master/server/sss_validation_policy.cpp )

So any reason this would enter an infinite loop with the validator 
repeatedly retrying these results? Am I returning the wrong error value, or is 
there something in the results I need to set so that it will back off on these 
results for a while before retrying them?  It seems like in some of the cases 
(maybe all of them?) the file does eventually show up.
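
To make "back off for a while" concrete, here is a minimal sketch of per-result retry bookkeeping with a capped exponential delay. This is not BOINC's mechanism and uses no BOINC API; the class RetryBackoff, its parameters, and the idea of keying on the result id are all assumptions made for illustration only.

```cpp
#include <algorithm>
#include <map>

// Hypothetical bookkeeping (not part of BOINC): counts failed attempts per
// result id and derives how long to wait before trying that result again.
class RetryBackoff {
public:
    RetryBackoff(int base_delay_s, int max_delay_s, int max_attempts)
        : base_delay_s_(base_delay_s),
          max_delay_s_(max_delay_s),
          max_attempts_(max_attempts) {}

    // Records one more failure for this result. Returns the delay in seconds
    // before the next attempt, or -1 once max_attempts failures have been
    // recorded (the caller should then give up on the result).
    int record_failure(int result_id) {
        int attempts = ++attempts_[result_id];
        if (attempts >= max_attempts_) {
            return -1;
        }
        long delay = base_delay_s_;
        // Double the delay for each failure after the first, up to the cap.
        for (int i = 1; i < attempts; ++i) {
            delay = std::min<long>(delay * 2, max_delay_s_);
        }
        return static_cast<int>(std::min<long>(delay, max_delay_s_));
    }

    // Forgets a result once it has validated (or been abandoned).
    void clear(int result_id) { attempts_.erase(result_id); }

private:
    int base_delay_s_;
    int max_delay_s_;
    int max_attempts_;
    std::map<int, int> attempts_;  // result id -> failures seen so far
};
```

With a base of 60 seconds and a cap of 600, the waits for one result would be 60, 120, 240, 480, and 600 seconds, after which the sketch reports that the result should be abandoned rather than retried forever.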


Thanks,
--Travis

---------------------------------------------------------------------------
Travis Desell,  Assistant Professor

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
