Re: [boinc_dev] Validator problem
No. While the validator is working well, everything is OK. Once it starts marking results as inconclusive, the log shows this:

2013-09-11 21:01:28.6961 [WU#6111094 ps_130910_20263_277] handle_wu(): No canonical result yet
2013-09-11 21:01:28.7004 [debug] [WU#6111094 ps_130910_20263_277] Found 4 viable results
2013-09-11 21:01:28.7005 [debug] [WU#6111094 ps_130910_20263_277] Enough for quorum, checking set.
2013-09-11 21:01:28.7006 [CRITICAL] check_set: init_result([RESULT#14692440 ps_130910_20263_277_0]) transient failure
2013-09-11 21:01:28.7007 [CRITICAL] check_set: init_result([RESULT#14692441 ps_130910_20263_277_1]) transient failure
2013-09-11 21:01:28.7008 [CRITICAL] check_set: init_result([RESULT#15596859 ps_130910_20263_277_2]) transient failure
2013-09-11 21:01:28.7009 [CRITICAL] check_set: init_result([RESULT#15616637 ps_130910_20263_277_3]) transient failure
2013-09-11 21:01:28.7009[HOST#29989 AV#43] [outlier=0] Updating HAV in db. pfc.n=335.00-335.00
2013-09-11 21:01:28.7010[HOST#33276 AV#43] [outlier=0] Updating HAV in db. pfc.n=183.00-183.00
2013-09-11 21:01:28.7011[HOST#20844 AV#43] [outlier=0] Updating HAV in db. pfc.n=0.00-0.00
2013-09-11 21:01:28.7011[RESULT#15616637 ps_130910_20263_277_3] Inconclusive [HOST#13946]
2013-09-11 21:01:28.7012[HOST#13946 AV#43] [outlier=0] Updating HAV in db. pfc.n=12.00-12.00
2013-09-11 21:01:28.8081 [WU#6118169 ps_130910_20288_242] handle_wu(): No canonical result yet
2013-09-11 21:01:28.9177 [debug] [WU#6118169 ps_130910_20288_242] Found 3 viable results
2013-09-11 21:01:28.9179 [debug] [WU#6118169 ps_130910_20288_242] Enough for quorum, checking set.
2013-09-11 21:01:28.9180 [CRITICAL] check_set: init_result([RESULT#14706758 ps_130910_20288_242_0]) transient failure
2013-09-11 21:01:28.9181 [CRITICAL] check_set: init_result([RESULT#14706759 ps_130910_20288_242_1]) transient failure
2013-09-11 21:01:28.9182 [CRITICAL] check_set: init_result([RESULT#15605864 ps_130910_20288_242_2]) transient failure
2013-09-11 21:01:28.9182[HOST#14290 AV#43] [outlier=0] Updating HAV in db. pfc.n=46.00-46.00
2013-09-11 21:01:28.9183[HOST#13719 AV#39] [outlier=0] Updating HAV in db. pfc.n=9.00-9.00
2013-09-11 21:01:28.9183[RESULT#15605864 ps_130910_20288_242_2] Inconclusive [HOST#13946]
2013-09-11 21:01:28.9184[HOST#13946 AV#39] [outlier=0] Updating HAV in db. pfc.n=1.00-1.00
2013-09-11 21:01:28.9215 [WU#6124461 ps_130910_20312_62] handle_wu(): No canonical result yet
2013-09-11 21:01:28.9247 [debug] [WU#6124461 ps_130910_20312_62] Found 2 viable results
2013-09-11 21:01:28.9249 [debug] [WU#6124461 ps_130910_20312_62] Enough for quorum, checking set.
2013-09-11 21:01:28.9251 [CRITICAL] check_set: init_result([RESULT#14719522 ps_130910_20312_62_0]) transient failure
2013-09-11 21:01:28.9252 [CRITICAL] check_set: init_result([RESULT#14719523 ps_130910_20312_62_1]) transient failure
2013-09-11 21:01:28.9253[RESULT#14719522 ps_130910_20312_62_0] Inconclusive [HOST#45463]
2013-09-11 21:01:28.9253[HOST#45463 AV#43] [outlier=0] Updating HAV in db. pfc.n=131.00-131.00
2013-09-11 21:01:28.9263[RESULT#14719523 ps_130910_20312_62_1] Inconclusive [HOST#9190]
2013-09-11 21:01:28.9265[HOST#9190 AV#43] [outlier=0] Updating HAV in db. pfc.n=502.00-502.00

It looks like a memory leak or something similar, but I haven't figured it out yet.

On 12.9.2013 07:46, David Anderson wrote:
Are there any error messages in the validator log file?
-- David

On 11-Sep-2013 1:19 PM, Radim Vančo wrote:
I am still trying to solve a problem with my validator. It works well at first, but after a few days it starts marking all results as inconclusive. If I restart the validator, it validates correctly again for a few days. I am attaching the source code of the validator in case anyone can see where the problem is. Thanks

___
boinc_dev mailing list
boinc_dev@ssl.berkeley.edu
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
Re: [boinc_dev] Validator problem
"transient failure" means that the validator couldn't open a directory containing an output file (I'll change the message to make this clear). This can happen if you're using NFS and the mount failed.
-- David

On 12-Sep-2013 2:24 AM, Radim Vančo wrote:
No, if validator works well, everything is ok, if it starts to marks them as inconclusive, there is this:
2013-09-11 21:01:28.7006 [CRITICAL] check_set: init_result([RESULT#14692440 ps_130910_20263_277_0]) transient failure
[...]
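To illustrate the distinction David describes, here is a minimal standalone sketch of an open routine that reports an unreadable directory separately from a missing file. It is not BOINC's actual code: the names `OpenStatus` and `open_or_classify` are hypothetical, and the real error codes live in BOINC's error_numbers.h. Note that, besides a failed NFS mount, the directory probe can also fail when the process has no free file descriptors left.

```cpp
#include <cstdio>
#include <string>
#include <dirent.h>

// Hypothetical status codes for this sketch only.
enum OpenStatus { OPEN_OK, FILE_MISSING, DIR_UNREADABLE };

// Try to open `path` for reading. If that fails, probe the parent directory:
// an unreadable directory points to a condition that may clear up later
// (failed mount, descriptor exhaustion), while a readable directory means
// the file itself is absent.
OpenStatus open_or_classify(const std::string& path, FILE*& f) {
    f = std::fopen(path.c_str(), "r");
    if (f) return OPEN_OK;

    // Derive the directory part of the path ("." if there is none).
    std::string::size_type slash = path.rfind('/');
    std::string dir = (slash == std::string::npos) ? "." : path.substr(0, slash);
    if (dir.empty()) dir = "/";

    DIR* d = opendir(dir.c_str());
    if (!d) return DIR_UNREADABLE;
    closedir(d);
    return FILE_MISSING;
}
```

A caller would treat `DIR_UNREADABLE` as "retry later" and `FILE_MISSING` as a permanent error for that result.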
[boinc_dev] Validator problem
I am still trying to solve a problem with my validator. It works well at first, but after a few days it starts marking all results as inconclusive. If I restart the validator, it validates correctly again for a few days. I am attaching the source code of the validator in case anyone can see where the problem is. Thanks

#include <string>
#include <vector>
#include <math.h>
#include "error_numbers.h"
#include "boinc_db.h"
#include "sched_util.h"
#include "validate_util.h"
#include "validate_util2.h"
#include "validator.h"

using std::string;
using std::vector;

struct DATA {
    int nlines;
    double per[10];
    double rms[10];
    double chisq[10];
};

int init_result(RESULT& result, void*& data) {
    FILE* f;
    OUTPUT_FILE_INFO fi;
    int n, retval, nlines;
    double per[10], rms[10], chisq[10], dark, lambda, beta;

    retval = get_output_file_path(result, fi.path);
    if (retval) return retval;
    retval = try_fopen(fi.path.c_str(), f, "r");
    if (retval) return retval;
    DATA* dp = new DATA;
    nlines = 0;
    while (feof(f) == 0) {
        n = fscanf(f, "%lf %lf %lf %lf %lf %lf", &per[nlines], &rms[nlines], &chisq[nlines], &dark, &lambda, &beta);
        if (n != 6 && n != -1) return ERR_XML_PARSE;
        dp->per[nlines] = per[nlines];
        dp->rms[nlines] = rms[nlines];
        dp->chisq[nlines] = chisq[nlines];
        nlines++;
    }
    dp->nlines = nlines;
    fclose(f);
    data = (void*) dp;
    return 0;
}

int compare_results(RESULT& r1, void* _data1, RESULT const& r2, void* _data2, bool& match) {
    int i;
    double tol_per = 0.1, tol_rms = 0.1, tol_chisq = 0.5;
    DATA* data1 = (DATA*)_data1;
    DATA* data2 = (DATA*)_data2;
    match = true;
    for (i = 0; i < data1->nlines; i++) {
        if (fabs((data1->per[i] - data2->per[i]) / (data1->per[i] + data2->per[i])) / 2 > tol_per) match = false;
        if (fabs((data1->rms[i] - data2->rms[i]) / (data1->rms[i] + data2->rms[i])) / 2 > tol_rms) match = false;
        if (fabs((data1->chisq[i] - data2->chisq[i]) / (data1->chisq[i] + data2->chisq[i])) / 2 > tol_chisq) match = false;
    }
    return 0;
}

int cleanup_result(RESULT const& r, void* data) {
    if (data) delete (DATA*) data;
    return 0;
}
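One thing that stands out in the posted init_result(): the early return on a parse error skips fclose(f) and never frees dp, so every malformed output file leaves a file descriptor open. A long-running process that exhausts its descriptors can no longer open files or directories, which would fit the pattern of failures appearing after a few days and disappearing on restart. Whether that is the actual cause here is an assumption, not something the thread confirms. The loop also does not stop at 10 lines, and it stores one extra entry when fscanf reports end of input. The sketch below shows the same parsing with a single cleanup path. It is standalone on purpose: `Data`, `MAX_LINES`, `read_result_file`, and the return codes are hypothetical names, not BOINC API.

```cpp
#include <cstdio>

const int MAX_LINES = 10;  // same capacity as the arrays in the posted DATA struct

struct Data {
    int nlines;
    double per[MAX_LINES];
    double rms[MAX_LINES];
    double chisq[MAX_LINES];
};

// Hypothetical helper for illustration.
// Returns 0 on success, -1 if the file cannot be opened, -2 on bad format.
int read_result_file(const char* path, Data& out) {
    FILE* f = std::fopen(path, "r");
    if (!f) return -1;

    int status = 0;
    out.nlines = 0;
    for (;;) {
        double per, rms, chisq, dark, lambda, beta;
        int n = std::fscanf(f, "%lf %lf %lf %lf %lf %lf",
                            &per, &rms, &chisq, &dark, &lambda, &beta);
        if (n == EOF) break;  // clean end of input, nothing to store
        if (n != 6 || out.nlines == MAX_LINES) {
            status = -2;      // malformed line, or more lines than the arrays hold
            break;
        }
        out.per[out.nlines] = per;
        out.rms[out.nlines] = rms;
        out.chisq[out.nlines] = chisq;
        out.nlines++;
    }
    std::fclose(f);  // reached on every path once the file is open
    return status;
}
```

In a real validator the same shape would apply inside init_result(): allocate the DATA object only after parsing succeeds, or delete it before returning an error, and close the file before any return.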
Re: [boinc_dev] Validator problem
Different processors can come up with slightly different results for each step in a long calculation. If you allow processors of different types to crunch the same work unit, you will have to write some fuzziness into your validator.

-----Original Message-----
From: boinc_dev [mailto:boinc_dev-boun...@ssl.berkeley.edu] On Behalf Of radim.vanco
Sent: Tuesday, June 04, 2013 7:26 AM
To: boinc_dev@ssl.berkeley.edu
Subject: [boinc_dev] Validator problem

Hi, I have a problem with this custom validator. It is based on the custom validator on the wiki; it compares three decimal numbers and checks the file structure first. It works fine, but after some time (two to four days) it starts to mark all results as inconclusive. If I test it on only a few WUs it works exactly as I want, but when there are many results it starts, after a few days, to mark everything as invalid. Does anyone know what could cause the problem? I attached the source code of the validator. Radim
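The "fuzziness" suggested in the reply above can be sketched as a relative-tolerance comparison. This is one common formulation, not the poster's code and not a BOINC function: `nearly_equal` is a hypothetical helper, and the default `abs_tol` value is an assumption that would need tuning for real data. Scaling by the larger magnitude, rather than dividing by the sum of the two values, avoids a division by zero when the values cancel or are both zero.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true when a and b agree to within rel_tol relative to
// the larger magnitude, or to within abs_tol when both are close to zero.
bool nearly_equal(double a, double b, double rel_tol, double abs_tol = 1e-12) {
    // NaN never counts as a match; a plain ">" comparison would let it pass silently.
    if (std::isnan(a) || std::isnan(b)) return false;
    double diff = std::fabs(a - b);
    double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}
```

Inside compare_results(), each of the three per-line checks would then become something like `if (!nearly_equal(x1, x2, tol)) match = false;`, with one tolerance per column.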
[boinc_dev] Validator problem
Hi, I have a problem with this custom validator. It is based on the custom validator on the wiki; it compares three decimal numbers and checks the file structure first. It works fine, but after some time (two to four days) it starts to mark all results as inconclusive. If I test it on only a few WUs it works exactly as I want, but when there are many results it starts, after a few days, to mark everything as invalid. Does anyone know what could cause the problem? I attached the source code of the validator. Radim

#include <string>
#include <vector>
#include <math.h>
#include "error_numbers.h"
#include "boinc_db.h"
#include "sched_util.h"
#include "validate_util.h"
#include "validate_util2.h"
#include "validator.h"

using std::string;
using std::vector;

struct DATA {
    int nlines;
    double per[10];
    double rms[10];
    double chisq[10];
};

//extern int init_result(RESULT const& result, void*& data) {
int init_result(RESULT& result, void*& data) {
    FILE* f;
    OUTPUT_FILE_INFO fi;
    int n, retval, nlines;
    double per[10], rms[10], chisq[10], dark, lambda, beta;

    retval = get_output_file_path(result, fi.path);
    if (retval) return retval;
    retval = try_fopen(fi.path.c_str(), f, "r");
    if (retval) return retval;
    DATA* dp = new DATA;
    nlines = 0;
    while (feof(f) == 0) {
        n = fscanf(f, "%lf %lf %lf %lf %lf %lf", &per[nlines], &rms[nlines], &chisq[nlines], &dark, &lambda, &beta);
        if (n != 6 && n != -1) return ERR_XML_PARSE;
        dp->per[nlines] = per[nlines];
        dp->rms[nlines] = rms[nlines];
        dp->chisq[nlines] = chisq[nlines];
        nlines++;
        //printf ("Output1: %lf %lf %lf %lf %lf %lf\n", per[nlines], rms[nlines], chisq[nlines], dark, lambda, beta);
        //printf ("Number of lines: %d\n", n);
        //printf ("Current line: %d\n", nlines);
    }
    dp->nlines = nlines;
    fclose(f);
    data = (void*) dp;
    return 0;
}

int compare_results(RESULT& r1, void* _data1, RESULT const& r2, void* _data2, bool& match) {
    int i;
    double tol_per = 0.1, tol_rms = 0.1, tol_chisq = 0.5;
    DATA* data1 = (DATA*)_data1;
    DATA* data2 = (DATA*)_data2;
    match = true;
    for (i = 0; i < data1->nlines; i++) {
        // if (fabs(data1->per[i] - data2->per[i]) > tol_per) match = false;
        //if (fabs(data1->rms[i] - data2->rms[i]) > tol_rms) match = false;
        //if (fabs(data1->chisq[i] - data2->chisq[i]) > tol_chisq) match = false;
        if (fabs((data1->per[i] - data2->per[i]) / (data1->per[i] + data2->per[i])) / 2 > tol_per) match = false;
        if (fabs((data1->rms[i] - data2->rms[i]) / (data1->rms[i] + data2->rms[i])) / 2 > tol_rms) match = false;
        if (fabs((data1->chisq[i] - data2->chisq[i]) / (data1->chisq[i] + data2->chisq[i])) / 2 > tol_chisq) match = false;
        //printf ("Output: %lf %lf %lf \n", data1->per[i], data1->rms[i], data1->chisq[i]);
    }
    return 0;
}

int cleanup_result(RESULT const& r, void* data) {
    if (data) delete (DATA*) data;
    return 0;
}