Does this problem occur on SETI@home?
-- David

On 07-Jun-2014 2:51 AM, Richard Haselgrove wrote:

2) Android runtime estimates

The example here is from SIMAP. During a recent pause between batches, I noticed
that some of my 'pending validation' tasks were being slow to clear:
http://boincsimap.org/boincsimap/results.php?hostid=349248

The clearest example is the third of those three workunits:
http://boincsimap.org/boincsimap/workunit.php?wuid=57169928

Four of the seven replications have failed with 'Error while computing', and
every one of those four is an EXIT_TIME_LIMIT_EXCEEDED on an Android device.

Three of the four hosts have never returned a valid result (total credit zero),
so they have never had a chance to establish an APR for use in runtime
estimation: runtime estimates and bounds must have been generated by the server.

It seems - from these results, and others I've found pending on other machines -
that SIMAP tasks on Android are aborted with EXIT_TIME_LIMIT_EXCEEDED after ~6
hours elapsed. For the new batch released today, SIMAP are using a 3x bound
(which may be a bit low under the circumstances):

<rsc_fpops_est>13500000000000.000000</rsc_fpops_est>
<rsc_fpops_bound>40500000000000.000000</rsc_fpops_bound>

so I deduce that the tasks when first issued had a runtime estimate of ~2 hours.
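
The deduction can be checked with a few lines of arithmetic. This is only a sketch. It assumes the runtime estimate is rsc_fpops_est divided by the projected speed, and that the task is aborted once elapsed time passes rsc_fpops_bound divided by that same speed. The helper names are made up for illustration, and the 6-hour figure is the approximate abort time observed above.

```python
# Values copied from the workunit XML quoted above.
RSC_FPOPS_EST = 13_500_000_000_000.0
RSC_FPOPS_BOUND = 40_500_000_000_000.0

# Approximate elapsed time at which the Android tasks were aborted.
OBSERVED_LIMIT_SECONDS = 6 * 3600


def implied_flops(fpops_bound, limit_seconds):
    """Speed (FLOPS) that would make the bound expire after limit_seconds.

    Assumes: time limit = fpops_bound / projected speed.
    """
    return fpops_bound / limit_seconds


def runtime_seconds(fpops, flops):
    """Runtime for a task of `fpops` operations at `flops` per second."""
    return fpops / flops


speed = implied_flops(RSC_FPOPS_BOUND, OBSERVED_LIMIT_SECONDS)
estimate_hours = runtime_seconds(RSC_FPOPS_EST, speed) / 3600

print(f"implied projected speed: {speed / 1e9:.3f} GFLOPS")
print(f"implied initial estimate: {estimate_hours:.1f} hours")
```

Under those assumptions, a 6-hour limit with a 3x bound corresponds to a projected speed a little under 2 GFLOPS and an initial estimate of about 2 hours.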

My own tasks, on a fast Intel i5 'Haswell' CPU (APR 7.34 GFLOPS), take over half
an hour to complete: two hours for an ARM device sounds suspiciously low. The
only one of my Android wingmates to have registered an APR
(http://boincsimap.org/boincsimap/host_app_versions.php?hostid=771033) is showing
1.69 GFLOPS, but I have no way of knowing whether that APR was established before
or after the task in question errored out.
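
For comparison, the two APR figures quoted above can be put through the same calculation. This is a rough sketch that assumes an established APR is used directly as the projected speed, so that the estimate is rsc_fpops_est / APR. The function name is hypothetical.

```python
# rsc_fpops_est from the workunit XML quoted earlier.
RSC_FPOPS_EST = 13.5e12


def estimated_minutes(apr_gflops):
    """Estimated runtime in minutes, assuming estimate = fpops_est / APR."""
    return RSC_FPOPS_EST / (apr_gflops * 1e9) / 60


i5_minutes = estimated_minutes(7.34)       # the i5 'Haswell' host
android_minutes = estimated_minutes(1.69)  # the one Android host with an APR

print(f"i5 at 7.34 GFLOPS:      {i5_minutes:.1f} minutes")
print(f"Android at 1.69 GFLOPS: {android_minutes:.1f} minutes")
```

The i5 figure comes out at just over half an hour, which agrees with the observed runtimes on that host.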

From experience - borne out by current tests at Albert@Home, where server logs
are helpfully exposed to the public - initial server estimates can be hopelessly
over-optimistic. These two are for the same machine:

2014-06-04 20:28:09.8459 [PID=26529] [version] [AV#716] (BRP4G-cuda32-nv301)
adjusting projected flops based on PFC avg: 2124.60G

2014-06-07 09:30:56.1506 [PID=10808] [version] [AV#716] (BRP4G-cuda32-nv301)
setting projected flops based on host elapsed time avg: 23.71G
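
To put a number on the gap between those two log entries, the sketch below takes their ratio and compares it with a bound multiplier. It assumes the elapsed-time figure is close to the host's real speed. The 3x multiplier is the SIMAP value from the XML above, used here only for illustration, since the bound used at Albert@Home is not given. The function name is hypothetical.

```python
# Projected speeds from the two Albert@Home log lines, in FLOPS.
PFC_BASED_FLOPS = 2124.60e9      # "based on PFC avg"
ELAPSED_BASED_FLOPS = 23.71e9    # "based on host elapsed time avg"


def exceeds_time_limit(projected_flops, actual_flops, bound_multiplier):
    """True if a task would run past its limit.

    Assumes limit = bound_multiplier * (fpops_est / projected_flops) and
    actual runtime = fpops_est / actual_flops, so fpops_est cancels out.
    """
    return projected_flops / actual_flops > bound_multiplier


overestimate = PFC_BASED_FLOPS / ELAPSED_BASED_FLOPS
print(f"initial estimate is optimistic by a factor of {overestimate:.1f}")
print("would exceed a 3x bound:",
      exceeds_time_limit(PFC_BASED_FLOPS, ELAPSED_BASED_FLOPS, 3.0))
```

With a speed overestimate of this size, any bound multiplier smaller than the overestimate factor would be exceeded before the task could finish.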

Since SIMAP have recently announced that they are leaving the BOINC platform at
the end of the year (despite being an Android launch partner with Samsung), I
doubt they'll want to put much effort into researching this issue.

But if other projects experimenting with Android applications are experiencing a
high task failure rate, they might like to check whether EXIT_TIME_LIMIT_EXCEEDED
is a significant factor in those failures, and if so, consider the other
remediation approaches (apart from outliers, which isn't relevant in this case)
that I suggested to Eric Mcintosh at LHC.
_______________________________________________ boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev To unsubscribe, visit
the above URL and (near bottom of page) enter your email address.
