I'm happy to regard this as a full-blown development issue, not just debug ;-)
The events at AQUA have drawn our attention to three quite separate issues:

1) The sensitivity of the averaging algorithm to outliers and finger-fumble inputs.

2) The removal of support for the older flop-counting and oldest benchmark*time credit schemas. It might be argued that retention of some form of fixed credit was appropriate for projects that have to run their Beta test runs on their main project servers: almost by definition, test app behaviour cannot be determined a priori, and some sort of damage-limitation code (as was common in earlier versions of BOINC) would save us from the biggest surprises.

3) The non-deterministic nature of the credit awarded for individual tasks under CreditNew, even when the bulk averages are behaving as expected. This is not limited to AQUA - it has been apparent from the very first Beta test of CreditNew at SETI Beta in May 2010 (see http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1765&nowrap=true#39533) - but it has required the disabling of the fixed credit mechanism, and the appearance of CreditNew on a quorum 1 project like AQUA, to reveal such a stark example as the one I used to start this thread.

----- Original Message -----
From: "Kamran Karimi" <[email protected]>
To: "David Anderson" <[email protected]>; "Eric J Korpela" <[email protected]>
Cc: "BOINC Developers Mailing List" <[email protected]>
Sent: Monday, July 11, 2011 7:44 PM
Subject: Re: [boinc_dev] [boinc_alpha] CreditNew: Is it really just a random number generator?

> One thing that AQUA apps do is to set the fpops_cumulative (to a value
> that determines how much work was done) and intops_cumulative (set to
> -1, to indicate to the server that fpops_cumulative must be used to
> calculate credit). I wonder if this is causing any problems, though as
> Richard mentioned, tasks with similar fpops_cumulative values are
> getting very different results.
>
> -Kamran
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of David Anderson
> Sent: July-11-11 11:15 AM
> To: Eric J Korpela
> Cc: BOINC Developers Mailing List
> Subject: Re: [boinc_dev] [boinc_alpha] CreditNew: Is it really just a random number generator?
>
> (I'm moving this to boinc_dev; not appropriate for boinc_alpha)
>
> Re Eric's idea: we do maintain std dev.
> However, this wouldn't work if there is an initial sequence of
> consistent, small values, and large values thereafter.
>
> Richard, rest assured that CreditNew is not a RNG.
> Something anomalous is happening at AQUA, and Kamran and I are going to
> track it down shortly.
>
> -- David
>
> On 11-Jul-2011 7:05 AM, Eric J Korpela wrote:
>> One problem, I think, that might lead to this is the use of moving averages.
>> (I haven't verified this numerically) They are vulnerable to high
>> value outliers. If there were also a measure approximating standard
>> deviation it might be possible to exclude numbers that are more than
>> two standard deviations off the mean, which would lead to more stable
>> moving averages.
>> Maybe.
>>
>> Eric
>>
>> On Fri, Jul 8, 2011 at 4:43 PM, Richard Haselgrove
>> <[email protected]> wrote:
>>> Sorry for the provocative title, but that's the word on the street
>>> (project message boards, for those who don't read them).
>>>
>>> Here's some evidence to back up the theory, from AQUA today.
>>>
>>> AQUA has been a project which awards deterministic credit, controlled
>>> by a bespoke usage of <fpops_cumulative> in the <result> record. The
>>> project also uses close-to-current BOINC server code (currently at
>>> svn 23790), with minimal modification.
>>>
>>> Some recent server code change, as yet unidentified, has broken the
>>> deterministic credit awards. I can only surmise that the change
>>> relates to the promotion of CreditNew to be the default credit schema.
>>>
>>> There have been wild fluctuations (several hundred million credits
>>> per WU) in credit granted for a recent low-volume test application.
>>> We can dismiss those as outliers (although the evidence will remain
>>> on the face of the stats sites for months or years). But my
>>> observation relates to the long-established, production status
>>> Fokker-Planck application.
>>>
>>> My host
>>> http://aqua.dwavesys.com/results.php?hostid=17302&offset=0&show_names=0&state=0&appid=3
>>> was allocated two consecutive tasks in a single scheduler contact at
>>> 1:43:20 UTC today, and returned both results in a single scheduler
>>> contact at 16:08:46 UTC.
>>>
>>> The two result records returned to the server show a minor difference
>>> in CPU and elapsed timings, but are otherwise effectively identical.
>>> In particular, both results have the same <fpops_cumulative>, and AQUA
>>> would like to stipulate that they receive identical credit.
>>>
>>> <result>
>>>     <name>fp_5jul11_bm_16_005_500_000-1_998_0</name>
>>>     <final_cpu_time>72198.170000</final_cpu_time>
>>>     <final_elapsed_time>18889.015625</final_elapsed_time>
>>>     <exit_status>0</exit_status>
>>>     <state>5</state>
>>>     <platform>windows_intelx86</platform>
>>>     <version_num>210</version_num>
>>>     <plan_class>fpmt</plan_class>
>>>     <fpops_cumulative>7407970000000.000000</fpops_cumulative>
>>>     <intops_cumulative>-1.000000</intops_cumulative>
>>>     <app_version_num>210</app_version_num>
>>>     ...
>>> </result>
>>> <result>
>>>     <name>fp_5jul11_bm_16_005_500_000-1_997_0</name>
>>>     <final_cpu_time>70629.910000</final_cpu_time>
>>>     <final_elapsed_time>18252.515625</final_elapsed_time>
>>>     <exit_status>0</exit_status>
>>>     <state>5</state>
>>>     <platform>windows_intelx86</platform>
>>>     <version_num>210</version_num>
>>>     <plan_class>fpmt</plan_class>
>>>     <fpops_cumulative>7407970000000.000000</fpops_cumulative>
>>>     <intops_cumulative>-1.000000</intops_cumulative>
>>>     <app_version_num>210</app_version_num>
>>>     ...
>>> </result>
>>>
>>> Yet the credit award is very different:
>>>
>>> 12053709 10755813 8 Jul 2011 | 1:43:20 UTC 8 Jul 2011 | 16:08:46 UTC Completed and validated 18,889.02 72,198.17 1,762.44 D-Wave's Fokker-Planck Simulation : Multi-Threaded Anonymous platform (CPU)
>>> 12053708 10755812 8 Jul 2011 | 1:43:20 UTC 8 Jul 2011 | 16:08:46 UTC Completed and validated 18,252.52 70,629.91 14,198.05 D-Wave's Fokker-Planck Simulation : Multi-Threaded Anonymous platform (CPU)
>>>
>>> That's 1,762.44 credits for one member of the matched pair, 14,198.05
>>> credits for the other.
>>>
>>> AQUA is a quorum=1 project, so there's no effect from validation
>>> partners ('wingmen'). Since both time of issue and time of report are
>>> identical, there should be no scope for variation in either project
>>> averages or host averages between the two events.
>>>
>>> Where, then, does the eight-fold variation in credit come from? And
>>> what impression does it give in relation to the mathematical accuracy
>>> of the BOINC platform?
>>> _______________________________________________
>>> boinc_alpha mailing list
>>> [email protected]
>>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_alpha
>>> To unsubscribe, visit the above URL and
>>> (near bottom of page) enter your email address.
>>>
>> _______________________________________________
>> boinc_alpha mailing list
>> [email protected]
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_alpha
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
> _______________________________________________
> boinc_dev mailing list
> [email protected]
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
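The outlier discussion in this thread (Eric's suggestion of excluding values more than two standard deviations off the mean, and David's objection about a run of consistent small values followed by large ones) can be illustrated with a minimal Python sketch. This is not BOINC or CreditNew code. The exponentially weighted average, the names RunningStats and smooth, and the parameters alpha, reject_sigma and min_samples are all assumptions made purely for illustration.

```python
import math


class RunningStats:
    """Exponentially weighted mean and variance (illustrative, not BOINC code)."""

    def __init__(self, alpha=0.1):
        self.alpha = alpha  # weight given to each new sample (assumed value)
        self.n = 0          # number of accepted samples
        self.mean = 0.0
        self.var = 0.0

    def update(self, x, reject_sigma=None, min_samples=5):
        """Fold x into the average; return False if x was rejected as an outlier."""
        # Optional filter: once enough samples are in, drop anything further
        # than reject_sigma standard deviations from the current mean.
        if reject_sigma is not None and self.n >= min_samples:
            if abs(x - self.mean) > reject_sigma * math.sqrt(self.var):
                return False
        if self.n == 0:
            self.mean = float(x)
        else:
            delta = x - self.mean
            self.mean += self.alpha * delta
            self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)
        self.n += 1
        return True


def smooth(samples, reject_sigma=None):
    """Return (final mean, number of rejected samples) for a sequence."""
    stats = RunningStats()
    rejected = 0
    for x in samples:
        if not stats.update(x, reject_sigma=reject_sigma):
            rejected += 1
    return stats.mean, rejected


if __name__ == "__main__":
    # A steady stream with one finger-fumble value in the middle.
    steady = [100, 102, 98, 101, 99, 100, 1_000_000, 100, 101]
    print("no filter:     ", smooth(steady))
    print("2-sigma filter:", smooth(steady, reject_sigma=2.0))

    # Consistent small values first, large values thereafter.
    shift = [10.0] * 10 + [1000.0] * 10
    print("level shift:   ", smooth(shift, reject_sigma=2.0))
```

With the steady input, the unfiltered average is dragged far above 100 by the single large value, while the filtered one rejects it and stays close to 100. With the level-shift input, the filter rejects every one of the later large values and the average never moves off 10, which is the failure mode described in the reply to Eric's idea.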
