David, I noticed you checked in a fix for the Windows CPU time problem a few hours ago. I compiled it and confirmed it is working. See the result below for a Windows 7 machine:
http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1931802

Thanks for patching the main trunk with those fixes. I'm guessing the
pre-compiled binaries at http://boinc.berkeley.edu/dl/ are due for a
refresh soon as well?

-- Daniel

On Mon, Jan 28, 2013 at 2:18 PM, Daniel Carrion <[email protected]> wrote:

> Confirmed working on Darwin.
>
> Windows is still broken, i.e. 0 second CPU time:
> http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1899118.
> Seems like CPU time is getting reset every time a new task starts in the
> wrapper? Run time is fine though.
>
> Note: I use MinGW to compile on Windows. I had to rip the zipping code
> out of the newest sample wrapper as I couldn't get it to compile properly.
> This probably comes down to lack of motivation to track down exactly what
> is needed to compile the new boinc_zip build using MinGW :).
>
> --- Daniel
>
> ---------- Forwarded message ----------
> From: Daniel Carrion <[email protected]>
> Date: Sat, Jan 26, 2013 at 10:29 PM
> Subject: Re: [boinc_dev] Wrapper CPU time woes
> To: BOINC Developers Mailing List <[email protected]>
>
>
> Confirmed working on Linux. Just need to test across rest of platforms now.
>
> -- Daniel
>
> On Sat, Jan 26, 2013 at 5:42 PM, David Anderson <[email protected]> wrote:
>
>> I checked in a fix (at least, I tested it and it seemed to work).
>> -- David
>>
>> On 25-Jan-2013 5:32 PM, Daniel Carrion wrote:
>> > Just wondering if any of the boinc devs have considered this issue any
>> > further? We usually use the latest wrapper at boinc/sample as it seems to
>> > be receiving new features, however, if this CPU time calc problem isn't
>> > going to be considered as a real issue/bug we may have to fork...
>> >
>> > Can someone from BOINC dev team indicate either way so I know what path to
>> > go down with this?
>> >
>> > To summarise the issue again: CPU time is calculated incorrectly as wrapper
>> > checkpoints and moves onto next tasks. It affects UNIX machines, i.e.
>> > Linux, Darwin, Android, etc... Debug output showing incorrect
>> > checkpoint_cpu_time calculation as tasks switch.
>> >
>> > =========================================================================================
>> > $tail -f stderr.txt
>> > wrapper: starting
>> > 17:52:25 (9875): wrapper: running fit_sed (1 filters.dat observations.dat)
>> > checkpoint_cpu_time = starting_cpu (0.000000) + final_cpu_time (447.131944)
>> > 17:59:53 (9875): wrapper: running fit_sed (2 filters.dat observations.dat)
>> > checkpoint_cpu_time = starting_cpu (447.131944) + final_cpu_time (897.368082)
>> > 18:07:25 (9875): wrapper: running fit_sed (3 filters.dat observations.dat)
>> > checkpoint_cpu_time = starting_cpu (1344.500026) + final_cpu_time (1350.548404)
>> > 18:14:59 (9875): wrapper: running fit_sed (4 filters.dat observations.dat)
>> > =========================================================================================
>> >
>> > --- Daniel
>> >
>> > On Thu, Jan 10, 2013 at 10:06 AM, Daniel Carrion <[email protected]> wrote:
>> >
>> >> On my Linux machine:
>> >>
>> >> Cloned the main git repo. Compiled BOINC followed by sample wrapper.
>> >> Copied wrapper over to project dir in place of existing/old wrapper -
>> >> Fairly significant size difference. I'm guessing it's that zipping
>> >> functionality.
>> >>
>> >> Unfortunately...Same problem seems to be happening.
>> >> I.e.:
>> >>
>> >> ----------------------
>> >>
>> >> daniel@snm-boi01:/var/lib/boinc/slots/0# tail -f wrapper_checkpoint.txt 2>/dev/null
>> >> 1 448.900054
>> >> 2 1351.808482 <-- should be 904
>> >> 3 2710.013364
>> >> daniel@snm-boi01:/var/lib/boinc/slots/0# cat stderr.txt
>> >> wrapper: starting
>> >> 17:31:17 (30673): wrapper: running
>> >> ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (1 filters.dat observations.dat)
>> >> 17:38:52 (30673): wrapper: running
>> >> ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (2 filters.dat observations.dat)
>> >> 17:46:27 (30673): wrapper: running
>> >> ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (3 filters.dat observations.dat)
>> >> 17:54:04 (30673): wrapper: running
>> >> ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (4 filters.dat observations.dat)
>> >>
>> >> ------------------------
>> >>
>> >> Notice the checkpoint times are way off the mark. E.g. 17:54:04 - 17:31:17
>> >> != 2710 seconds. They're adding CPU time incorrectly as sub-tasks are
>> >> finishing, check-pointing and moving onto next.
>> >>
>> >> I don't have immediate access to Windows build environment for BOINC, so I
>> >> can't test if that "0 second" report time problem is still occurring with
>> >> the latest wrapper. However, I'm more concerned about that incorrect CPU
>> >> checkpoint time at the moment.
>> >>
>> >> I just want to re-emphasise that this issue does not occur with
>> >> server_stable branch wrapper release.
>> >>
>> >> Here's some actual live runs to show you the difference in CPU time
>> >> between versions:
>> >>
>> >> Wrong CPU time (most recent version):
>> >> http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1492571
>> >> Right CPU time (old version and with fix):
>> >> http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1487356
>> >>
>> >>
>> >> On Mon, Jan 7, 2013 at 4:07 PM, David Anderson <[email protected]> wrote:
>> >>
>> >>> That looks like an old version of wrapper.cpp.
>> >>> Try the one in trunk.
>> >>> -- David
>> >>>
>> >>> On 06-Jan-2013 7:23 PM, Daniel Carrion wrote:
>> >>>> This concerns wrapper.cpp provided under boinc/samples/wrapper/wrapper.cpp.
>> >>>> Seems like we're getting wrong CPU times calculating under Linux, and I
>> >>>> believe same goes for Mac.
>> >>>>
>> >>>> Section of code this concerns (as subtasks finish in main()):
>> >>>>
>> >>>>     checkpoint_cpu_time = task.starting_cpu + task.final_cpu_time;
>> >>>>
>> >>>>     fprintf(stderr, "checkpoint_cpu_time = starting_cpu (%f) + final_cpu_time (%f)\n",
>> >>>>         task.starting_cpu, task.final_cpu_time);
>> >>>>
>> >>>>     write_checkpoint(i+1, checkpoint_cpu_time);
>> >>>>
>> >>>> Note: I added the above fprintf line for debugging.
>> >>>>
>> >>>> We see this in stderr.txt file as subtasks run (and checkpointed as they
>> >>>> finish)
>> >>>>
>> >>>> $tail -f stderr.txt
>> >>>> wrapper: starting
>> >>>> 17:52:25 (9875): wrapper: running fit_sed (1 filters.dat observations.dat)
>> >>>> checkpoint_cpu_time = starting_cpu (0.000000) + final_cpu_time (447.131944)
>> >>>> 17:59:53 (9875): wrapper: running fit_sed (2 filters.dat observations.dat)
>> >>>> checkpoint_cpu_time = starting_cpu (447.131944) + final_cpu_time (897.368082)
>> >>>> 18:07:25 (9875): wrapper: running fit_sed (3 filters.dat observations.dat)
>> >>>> checkpoint_cpu_time = starting_cpu (1344.500026) + final_cpu_time (1350.548404)
>> >>>> 18:14:59 (9875): wrapper: running fit_sed (4 filters.dat observations.dat)
>> >>>>
>> >>>> See how the final_cpu_time is causing the checkpoint_cpu_time to be
>> >>>> incorrect and therefore the starting_cpu_time in the next task since it
>> >>>> uses this value. If I change the checkpoint_cpu_time to be final_cpu_time
>> >>>> only, the problem goes away.
>> >>>>
>> >>>> Something else that we noticed is that the CPU time reported on Windows
>> >>>> machines is nearly always 0.0 seconds. Not sure if this is related as I
>> >>>> haven't looked into it further.
>> >>>>
>> >>>> One more thing to note, I don't see this issue on Linux with the wrapper
>> >>>> provided at server_stable branch on old SVN repo.
>> >>>>
>> >>>> I'm hoping that David A. picks this up. Tried to keep it as short as
>> >>>> possible - let me know if more details required.
>> >>>> _______________________________________________
>> >>>> boinc_dev mailing list
>> >>>> [email protected]
>> >>>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> >>>> To unsubscribe, visit the above URL and
>> >>>> (near bottom of page) enter your email address.
