Coincidentally, the topic of tasks which make 0% progress (BOINC thinks they're 
running and elapsed time continues to accrue, but I guess CPU time doesn't 
rise) cropped up at LHC recently as well.

http://lhcathomeclassic.cern.ch/sixtrack/forum_thread.php?id=3843&postid=26583


I sometimes see it on my own hosts with GPU tasks: suspending the task (which 
in the GPU case removes it from memory) and allowing it to restart from 
checkpoint resets the displayed elapsed time to the time at last checkpoint, 
and it usually runs to completion and reports a normal runtime. My CPU apps are 
set for 'keep applications in memory when suspended', and I haven't seen any 
similar behaviour there. But my CPUs are all Intel, and as I said in that post, 
"running on the spot" seems to be reported more often from AMD hosts.



>________________________________
> From: Juha <[email protected]>
>To: Richard Haselgrove <[email protected]> 
>Cc: BOINC Dev MailingList <[email protected]> 
>Sent: Monday, 9 June 2014, 21:52
>Subject: Re: [boinc_dev] EXIT_TIME_LIMIT_EXCEEDED (sorry, yes me again, but 
>please read)
> 
>
>I wonder if the app crashes without BOINC noticing, or the task is suspended
>for whatever reason but BOINC forgets about the suspension. Samsung's
>Power Sleep app appears to be using BOINC 7.3.0. The Android 7.3.x versions
>aren't mentioned in the Release notes or Version history pages, so it's pretty
>hard to tell what fixes later versions have or if 7.3.0 was ever even
>released.
>
>Now I'm not saying that BOINC isn't capable of making bad decisions.
>Consider the following:
>
>Normally tasks for a project run for 1 hour on a host. The project has
>chosen to use a max runtime of 3x the estimated runtime. Runtimes for the
>tasks for this project are easy to predict and always correct.
>Now let's say the owner of the host decides to play some game. The game uses
>about 70% of CPU time, leaving the rest for BOINC.
>Since BOINC now gets about 30% of CPU cycles, a task for the project needs a
>bit over three hours of wall clock time before it's done. But once three
>hours have passed, BOINC decides that the now almost complete task has been
>running too long and kills it.
>
>This wasn't a hypothetical example :(
>
>Maybe elapsed time is the best you have for GPU apps, but for CPU apps it
>doesn't work. There is no reason why the app would always get 100% CPU time
>and very often, if not most of the time, it doesn't. I mean, the reason why
>we have BOINC is that users can run their computers the way they like and
>BOINC gets whatever CPU time is left over from the user's programs.
>
>I don't think fixing this by increasing the runtime multiplier is the right
>solution. Instead of letting project people work on their apps and decide
>how much runtime their app needs, and with how much variation, you now force
>them to consider how the volunteers use their computers.
>
>(Ok, basing the decision purely on used CPU time wouldn't work in every
>case either. For example, the app could get stuck waiting for an event that
>never comes.)
>
>-Juha
>_______________________________________________
>boinc_dev mailing list
>[email protected]
>http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>To unsubscribe, visit the above URL and
>(near bottom of page) enter your email address.
>
>
>
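The arithmetic in Juha's example above can be checked with a small sketch. This
is only a simplified model of the two possible limits (elapsed time versus CPU
time), not BOINC's actual client logic; the function names and parameters are
hypothetical and chosen for illustration.

```python
# Simplified model, NOT BOINC's real code: all names here are hypothetical.

def wall_clock_hours(cpu_hours_needed, cpu_share):
    """Wall clock time needed when the task only gets cpu_share of one CPU."""
    return cpu_hours_needed / cpu_share

def exceeds_elapsed_limit(cpu_hours_needed, cpu_share, estimate_hours, multiplier):
    """True if a limit based on elapsed (wall clock) time would abort the task."""
    limit = estimate_hours * multiplier
    return wall_clock_hours(cpu_hours_needed, cpu_share) > limit

def exceeds_cpu_limit(cpu_hours_needed, estimate_hours, multiplier):
    """True if a limit based on consumed CPU time would abort the task."""
    limit = estimate_hours * multiplier
    return cpu_hours_needed > limit

if __name__ == "__main__":
    # Numbers from the example: 1 hour task, 3x multiplier, game takes ~70% CPU.
    needed = wall_clock_hours(1.0, 0.30)
    print(f"wall clock needed: {needed:.2f} h, limit: {1.0 * 3.0:.2f} h")
    print("aborted on elapsed time:", exceeds_elapsed_limit(1.0, 0.30, 1.0, 3.0))
    print("aborted on CPU time:    ", exceeds_cpu_limit(1.0, 1.0, 3.0))
```

With a 30% share the task needs about 3.33 hours of wall clock time, so the
elapsed-time limit of 3 hours trips even though only 1 hour of CPU time is ever
used. As Juha notes, the CPU-time check has the opposite weakness: a task that
is stuck and accrues no CPU time would never reach it.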
