David,

I noticed you checked in a fix for the Windows CPU time problem a few hours
back. I compiled it and confirmed it is working. See the result below for a
Windows 7 machine:

http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1931802

Thanks for patching the main trunk with those fixes.

I'm guessing the pre-compiled binaries at http://boinc.berkeley.edu/dl/ are
due for a refresh soon as well?

-- Daniel

On Mon, Jan 28, 2013 at 2:18 PM, Daniel Carrion <[email protected]> wrote:

> Confirmed working on Darwin.
>
> Windows is still broken, i.e. 0 second CPU time:
> http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1899118.
> Seems like CPU time is getting reset every time a new task starts in the
> wrapper? Run time is fine though.
>
> Note:  I use MinGW to compile on Windows. I had to rip the zipping code
> out of the newest sample wrapper as I couldn't get it to compile properly.
> This probably comes down to lack of motivation to track down exactly what
> is needed to compile the new boinc_zip build using MinGW :).
>
> --- Daniel
>
> ---------- Forwarded message ----------
> From: Daniel Carrion <[email protected]>
> Date: Sat, Jan 26, 2013 at 10:29 PM
> Subject: Re: [boinc_dev] Wrapper CPU time woes
> To: BOINC Developers Mailing List <[email protected]>
>
>
> Confirmed working on Linux. Just need to test across the rest of the
> platforms now.
>
> -- Daniel
>
> On Sat, Jan 26, 2013 at 5:42 PM, David Anderson <[email protected]> wrote:
>
>> I checked in a fix (at least, I tested it and it seemed to work).
>> -- David
>>
>> On 25-Jan-2013 5:32 PM, Daniel Carrion wrote:
>> > Just wondering if any of the BOINC devs have considered this issue any
>> > further? We usually use the latest wrapper at boinc/sample as it seems to
>> > be receiving new features. However, if this CPU time calculation problem
>> > isn't going to be treated as a real issue/bug, we may have to fork...
>> >
>> > Can someone from the BOINC dev team indicate either way so I know what
>> > path to go down with this?
>> >
>> > To summarise the issue again: CPU time is calculated incorrectly as the
>> > wrapper checkpoints and moves on to the next task. It affects UNIX
>> > machines, i.e. Linux, Darwin, Android, etc. The debug output below shows
>> > the incorrect checkpoint_cpu_time calculation as tasks switch.
>> >
>> >
>> > =========================================================================================
>> > $tail -f stderr.txt
>> > wrapper: starting
>> > 17:52:25 (9875): wrapper: running fit_sed (1 filters.dat observations.dat)
>> > checkpoint_cpu_time = starting_cpu (0.000000) + final_cpu_time (447.131944)
>> > 17:59:53 (9875): wrapper: running fit_sed (2 filters.dat observations.dat)
>> > checkpoint_cpu_time = starting_cpu (447.131944) + final_cpu_time (897.368082)
>> > 18:07:25 (9875): wrapper: running fit_sed (3 filters.dat observations.dat)
>> > checkpoint_cpu_time = starting_cpu (1344.500026) + final_cpu_time (1350.548404)
>> > 18:14:59 (9875): wrapper: running fit_sed (4 filters.dat observations.dat)
>> > ==========================================================================================
>> >
>> > --- Daniel
>> >
>> > On Thu, Jan 10, 2013 at 10:06 AM, Daniel Carrion <[email protected]> wrote:
>> >
>> >> On my Linux machine:
>> >>
>> >> Cloned the main git repo. Compiled BOINC followed by sample wrapper.
>> >> Copied wrapper over to project dir in place of existing/old wrapper -
>> >> Fairly significant size difference. I'm guessing it's that zipping
>> >> functionality.
>> >>
>> >> Unfortunately...Same problem seems to be happening. I.e.:
>> >>
>> >> ----------------------
>> >>
>> >>
>> >> daniel@snm-boi01:/var/lib/boinc/slots/0# tail -f wrapper_checkpoint.txt 2>/dev/null
>> >> 1 448.900054
>> >> 2 1351.808482 <-- should be 904
>> >> 3 2710.013364
>> >> daniel@snm-boi01:/var/lib/boinc/slots/0# cat stderr.txt
>> >> wrapper: starting
>> >> 17:31:17 (30673): wrapper: running ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (1 filters.dat observations.dat)
>> >> 17:38:52 (30673): wrapper: running ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (2 filters.dat observations.dat)
>> >> 17:46:27 (30673): wrapper: running ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (3 filters.dat observations.dat)
>> >> 17:54:04 (30673): wrapper: running ../../projects/ec2-23-23-126-96.compute-1.amazonaws.com_pogs/fit_sed (4 filters.dat observations.dat)
>> >>
>> >> ------------------------
>> >>
>> >> Notice the checkpoint times are way off the mark. E.g. 17:54:04 - 17:31:17
>> >> is 22 min 47 s (1367 seconds), not 2710 seconds. CPU time is being added
>> >> incorrectly as sub-tasks finish, checkpoint and move on to the next one.
>> >>
>> >> I don't have immediate access to a Windows build environment for BOINC,
>> >> so I can't test whether that "0 second" reported time problem still
>> >> occurs with the latest wrapper. However, I'm more concerned about the
>> >> incorrect CPU checkpoint time at the moment.
>> >>
>> >> I just want to re-emphasise that this issue does not occur with the
>> >> server_stable branch wrapper release.
>> >>
>> >> Here are some actual live runs showing the difference in CPU time
>> >> between versions:
>> >>
>> >> Wrong CPU time (most recent version):
>> >> http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1492571
>> >> Right CPU time (old version and with fix):
>> >> http://ec2-23-23-126-96.compute-1.amazonaws.com/pogs/result.php?resultid=1487356
>> >>
>> >>
>> >> On Mon, Jan 7, 2013 at 4:07 PM, David Anderson <[email protected]> wrote:
>> >>
>> >>> That looks like an old version of wrapper.cpp.
>> >>> Try the one in trunk.
>> >>> -- David
>> >>>
>> >>> On 06-Jan-2013 7:23 PM, Daniel Carrion wrote:
>> >>>> This concerns wrapper.cpp provided under boinc/samples/wrapper/wrapper.cpp.
>> >>>> Seems like wrong CPU times are being calculated under Linux, and I
>> >>>> believe the same goes for Mac.
>> >>>>
>> >>>> Section of code this concerns (as subtasks finish in main()):
>> >>>>
>> >>>> checkpoint_cpu_time = task.starting_cpu + task.final_cpu_time;
>> >>>>
>> >>>> fprintf(stderr, "checkpoint_cpu_time = starting_cpu (%f) + final_cpu_time (%f)\n",
>> >>>>     task.starting_cpu, task.final_cpu_time);
>> >>>>
>> >>>> write_checkpoint(i+1, checkpoint_cpu_time);
>> >>>>
>> >>>> Note: I added the above fprintf line for debugging.
>> >>>>
>> >>>> We see this in the stderr.txt file as subtasks run (and are checkpointed
>> >>>> as they finish):
>> >>>>
>> >>>> $tail -f stderr.txt
>> >>>> wrapper: starting
>> >>>> 17:52:25 (9875): wrapper: running fit_sed (1 filters.dat observations.dat)
>> >>>> checkpoint_cpu_time = starting_cpu (0.000000) + final_cpu_time (447.131944)
>> >>>> 17:59:53 (9875): wrapper: running fit_sed (2 filters.dat observations.dat)
>> >>>> checkpoint_cpu_time = starting_cpu (447.131944) + final_cpu_time (897.368082)
>> >>>> 18:07:25 (9875): wrapper: running fit_sed (3 filters.dat observations.dat)
>> >>>> checkpoint_cpu_time = starting_cpu (1344.500026) + final_cpu_time (1350.548404)
>> >>>> 18:14:59 (9875): wrapper: running fit_sed (4 filters.dat observations.dat)
>> >>>>
>> >>>> See how adding final_cpu_time makes checkpoint_cpu_time incorrect, and
>> >>>> therefore also the starting_cpu of the next task, since it uses this
>> >>>> value. If I change checkpoint_cpu_time to be final_cpu_time only, the
>> >>>> problem goes away.
>> >>>>
>> >>>> Something else that we noticed is that the CPU time reported on Windows
>> >>>> machines is nearly always 0.0 seconds. Not sure if this is related, as I
>> >>>> haven't looked into it further.
>> >>>>
>> >>>> One more thing to note: I don't see this issue on Linux with the wrapper
>> >>>> provided at the server_stable branch on the old SVN repo.
>> >>>>
>> >>>> I'm hoping that David A. picks this up. Tried to keep it as short as
>> >>>> possible - let me know if more details are required.
>> >>>> _______________________________________________
>> >>>> boinc_dev mailing list
>> >>>> [email protected]
>> >>>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> >>>> To unsubscribe, visit the above URL and
>> >>>> (near bottom of page) enter your email address.
>> >>>>
>> >>>
>> >>
>> >>
>> >
>>
>
>
>