Looks like you should also ask WHEN the retry occurs if you enable it. If the retry is immediate, expect something very much like an endless loop, until the missing output file finally arrives.
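If the retry does turn out to be immediate, one way to avoid a tight loop is to bound the retries and space them out yourself. What follows is only a minimal sketch, assuming the validator process stays alive between passes so that in-memory state survives. MissingFileBackoff and every name in it are hypothetical and not part of BOINC. It counts missing-file failures per result id and doubles the wait after each one, up to a cap:

```cpp
#include <algorithm>
#include <cassert>
#include <ctime>
#include <map>

// Hypothetical helper (not a BOINC API): per-result bookkeeping for
// results whose output file was missing when validation was attempted.
struct RetryState {
    int attempts;          // how many times the file was missing so far
    std::time_t next_try;  // earliest time the next attempt is allowed
};

class MissingFileBackoff {
public:
    MissingFileBackoff(int max_attempts, int base_delay_s, int max_delay_s)
        : max_attempts_(max_attempts),
          base_delay_s_(base_delay_s),
          max_delay_s_(max_delay_s) {
        assert(max_attempts > 0 && base_delay_s > 0 && max_delay_s >= base_delay_s);
    }

    // True if this result has no recorded failure, or its wait is over.
    bool ready(int result_id, std::time_t now) const {
        std::map<int, RetryState>::const_iterator it = state_.find(result_id);
        return it == state_.end() || now >= it->second.next_try;
    }

    // Record one more missing-file failure.
    // Returns true if the caller should retry later, false if it should
    // give up and treat the result as a permanent error.
    bool record_failure(int result_id, std::time_t now) {
        RetryState& s = state_[result_id];  // new entries start at {0, 0}
        ++s.attempts;
        if (s.attempts >= max_attempts_) {
            state_.erase(result_id);
            return false;
        }
        // Delay doubles with each failure: base, 2*base, 4*base, ... capped.
        long delay = base_delay_s_;
        for (int i = 1; i < s.attempts && delay < max_delay_s_; ++i) {
            delay *= 2;
        }
        s.next_try = now + std::min<long>(delay, max_delay_s_);
        return true;
    }

    // Forget a result once it has been validated successfully.
    void clear(int result_id) { state_.erase(result_id); }

private:
    int max_attempts_;
    int base_delay_s_;
    int max_delay_s_;
    std::map<int, RetryState> state_;
};
```

The idea would be to call ready() before trying to open the file, and to set the retry flag only when record_failure() returns true. Whether the validator itself already waits between retries is exactly the thing to find out first; if it does, none of this is needed.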
Robert Miles

-------
Date: Sat, 12 May 2012 15:05:21 -0500
From: Travis Desell<[email protected]>

So my validator is running into a weird issue where output files are frequently missing from results when it attempts to validate them. Sometimes they seem to show up later, but when I set the retry flag to true in the validator, it seems to enter an infinite loop where it repeatedly tries to validate the results.

Any reason why a not-insignificant number of results would reach the validator without an output file? Checking in the database, these are results that completed successfully (everything looks fine in the stderr output, which is written at the same time the output files are written).

Basically, I'm reading the result file into a string:

    string get_file_as_string(string file_path) throw (int) {
        //read the entire contents of the file into a string
        ifstream sites_file(file_path.c_str());

        if (!sites_file.is_open()) {
            throw 1;
        }

        std::string fc;

        sites_file.seekg(0, std::ios::end);
        fc.reserve(sites_file.tellg());
        sites_file.seekg(0, std::ios::beg);

        fc.assign((std::istreambuf_iterator<char>(sites_file)), std::istreambuf_iterator<char>());

        return fc;
    }

This throws an exception if it can't open the file. If that happens, the file parsing function returns the error:

    string fc;
    try {
        fc = get_file_as_string(file_path);
    } catch (int err) {
        log_messages.printf(MSG_CRITICAL, "[RESULT#%d %s] get_data_from_result: could not open file for result\n", result.id, result.name);
        log_messages.printf(MSG_CRITICAL, " file path: %s\n", file_path.c_str());
        //retry this result?
        return err;
    }

And then if that happens I set the retry flag to true and return an error within checkset:

    retval = get_data_from_result(uint32_max, checksum, failed_sets, results[i]);
    if (retval) {
        log_messages.printf(MSG_CRITICAL, "result[%2d] - id: %10d, error getting data from result: %d, retrying.\n", i, results[i].id, retval);
        retry = true;
        return retval;
    }

(All the validation code is here: https://github.com/travisdesell/Subset-Sum/blob/master/server/sss_validation_policy.cpp )

So is there any reason this would enter an infinite loop, with the validator repeatedly retrying these results? Am I returning a wrong error value, or is there something in the results I need to set so it will back off these results for a while before retrying them? It seems like in some of the cases (maybe all of them?) the file does eventually show up.

Thanks,
--Travis

---------------------------------------------------------------------------
Travis Desell, Assistant Professor

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
