The host info is only updated when the host contacts the scheduler.

-- David

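
To illustrate the point above, here is a minimal sketch of a per-day counter that is reset lazily, only when the host makes a scheduler request. The record layout, the function names and the use of a UTC day index are assumptions made for the example, not the actual BOINC server code; the figures are the ones Richard quotes below for host 12316.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class HostAppVersion:
    """Hypothetical per-host, per-app-version record (field names are assumptions)."""
    max_jobs_per_day: int
    n_jobs_today: int
    day_index: int  # server day on which n_jobs_today was last written


def server_day(now: float) -> int:
    """Day number for a Unix timestamp (assumes UTC day boundaries)."""
    return int(now // SECONDS_PER_DAY)


def on_scheduler_request(rec: HostAppVersion, now: float) -> int:
    """Runs only when the host contacts the scheduler.

    Resets a stale daily counter, then returns how many jobs may still be sent.
    """
    today = server_day(now)
    if rec.day_index != today:
        rec.n_jobs_today = 0
        rec.day_index = today
    return max(0, rec.max_jobs_per_day - rec.n_jobs_today)


def web_page_view(rec: HostAppVersion) -> str:
    """Read-only display: shows the stored values and never resets anything."""
    return (f"Max tasks per day {rec.max_jobs_per_day}, "
            f"Number of tasks today {rec.n_jobs_today}")
```

Under these assumptions, a host that stops contacting the scheduler keeps showing its last "Number of tasks today" on the web page for days, and the count only drops back to zero on its next request.
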
Richard Haselgrove wrote:
> There's definitely something wrong with the (daily) quota resetting
> mechanism - whether that's the fault of the new code, or SETI's Beta
> server, I'll leave to you.
>
> Host 12316 last downloaded a SETI Beta task at 4 Jun 2010 20:01:03 UTC
>
> Yet as I type this (7 June 2010 15:00 UTC), the application info still says
>
> Number of tasks completed 1183
> Max tasks per day 218
> Number of tasks today 273
> Consecutive valid tasks 118
> Average turnaround time 0.45 days
>
>
> ----- Original Message -----
> From: "Richard Haselgrove" <[email protected]>
> To: "David Anderson" <[email protected]>
> Cc: <[email protected]>
> Sent: Friday, June 04, 2010 10:04 AM
> Subject: Re: [boinc_dev] host punishment mechanism revisited
>
>
> Morning report. The validations trickled in slowly overnight:
>
> 04/06/2010 03:51:18 s...@home Beta Test Message from server: (reached daily quota of 205 tasks)
> 04/06/2010 04:24:17 s...@home Beta Test Message from server: (reached daily quota of 206 tasks)
> 04/06/2010 06:19:43 s...@home Beta Test Scheduler request completed: got 34 new tasks
> 04/06/2010 06:19:59 s...@home Beta Test Message from server: (reached daily quota of 209 tasks)
>
> So that's a significant overshoot.
>
> Also, "today" seems to be lasting an awfully long time: surely this
> should have reset before 09:00 UTC?
>
> 0.60 days
> Number of tasks completed 786
> Max tasks per day 213
> Number of tasks today 241
> Consecutive valid tasks 113
> Average turnaround time 0.60 days
>
> If I happen to get another of those 'erroneous triplets' (which are a
> project error, not a host failure), the "punishment" from the thread
> title is going to be massive.
>
> --- On Fri, 4/6/10, Richard Haselgrove <[email protected]> wrote:
>
>
> From: Richard Haselgrove <[email protected]>
> Subject: Re: [boinc_dev] host punishment mechanism revisited
> To: "David Anderson" <[email protected]>
> Cc: [email protected]
> Date: Friday, 4 June, 2010, 1:44
>
>
> No, it wasn't to be.
>
> Crept up slowly to
>
> Number of tasks completed 778
> Max tasks per day 205
> Number of tasks today 207
> Consecutive valid tasks 105
> Average turnaround time 0.62 days
>
> but I ran out of jobs just two short - last 18 with no wingmates at all.
> It can chew GPUGrid for a while and I'll try for quota overshoot again
> in the morning.
>
>
> --- On Thu, 3/6/10, Richard Haselgrove <[email protected]> wrote:
>
>
> From: Richard Haselgrove <[email protected]>
> Subject: Re: [boinc_dev] host punishment mechanism revisited
> To: "David Anderson" <[email protected]>
> Cc: [email protected]
> Date: Thursday, 3 June, 2010, 22:54
>
>
> Yes, that'll be useful for debugging and troubleshooting - thanks.
>
> I see I'm currently still seven tasks over quota: let's hope I get some
> cooperative wingmates before bedtime, so I get the chance to do one more
> work fetch under controlled conditions.
>
>
> --- On Thu, 3/6/10, David Anderson <[email protected]> wrote:
>
>
> From: David Anderson <[email protected]>
> Subject: Re: [boinc_dev] host punishment mechanism revisited
> To: "Richard Haselgrove" <[email protected]>
> Cc: [email protected]
> Date: Thursday, 3 June, 2010, 21:31
>
>
> I added a new web page showing app-version-level scheduling info:
> http://setiweb.ssl.berkeley.edu/beta/host_app_versions.php?hostid=12316
>
> (linked to from "Application details" on the host page).
>
> This will make it somewhat easier to follow what's going on.
>
> In principle there should be no overshoot of the quota.
> There may be bugs, however. Please send the info before/after.
>
> -- David
>
> Richard Haselgrove wrote:
>> Some movement on this one off-list, too.
>> Validations now produce a quota 'reward', as designed. For the moment,
>> I'm still having to update manually, because the backoff until after
>> midnight is still happening (Changeset 21686 not active yet), but
>> we're getting the idea.
>> Two questions:
>> 1) Is it right that an individual work request is allowed to
>> 'overshoot' quota? Especially during error recovery, when quota is
>> down to one per day, I would expect that to be strictly enforced at
>> least until a 'success' result can be reported. But looking at the
>> running total I've added to this list, the server sometimes gets way
>> ahead of itself:
>> 03/06/2010 08:28:32 s...@home Beta Test Reporting 71 completed tasks, requesting new tasks for GPU
>> 03/06/2010 08:28:39 s...@home Beta Test Scheduler request completed: got 46 new tasks // 46
>> 03/06/2010 08:28:55 s...@home Beta Test Scheduler request completed: got 36 new tasks // 82
>> 03/06/2010 08:29:09 s...@home Beta Test Scheduler request completed: got 20 new tasks // 102
>> 03/06/2010 08:29:25 s...@home Beta Test Scheduler request completed: got 11 new tasks // 113
>> 03/06/2010 08:29:40 s...@home Beta Test Scheduler request completed: got 6 new tasks // 119
>> 03/06/2010 08:29:54 s...@home Beta Test Scheduler request completed: got 3 new tasks // 122
>> 03/06/2010 08:30:08 s...@home Beta Test Scheduler request completed: got 3 new tasks // 125
>> 03/06/2010 08:30:23 s...@home Beta Test Scheduler request completed: got 2 new tasks // 127
>> 03/06/2010 08:30:36 s...@home Beta Test Scheduler request completed: got 1 new tasks // 128
>> 03/06/2010 08:31:55 s...@home Beta Test Scheduler request completed: got 6 new tasks // 135
>> 03/06/2010 08:32:09 s...@home Beta Test Message from server: (reached daily quota of 131 tasks)
>>
>> <request_delay>84750.000000</request_delay>
>> <message priority="high">No work sent</message>
>> <message priority="high">(reached daily quota of 131 tasks)
>> 03-Jun-2010 09:31:24 [s...@home Beta Test] Sending scheduler request: Requested by user.
>> 03/06/2010 09:31:24 s...@home Beta Test Reporting 19 completed tasks, requesting new tasks for GPU
>> 03/06/2010 09:31:28 s...@home Beta Test Scheduler request completed: got 0 new tasks
>> 03/06/2010 09:31:28 s...@home Beta Test Message from server: No work sent
>> 03/06/2010 09:31:28 s...@home Beta Test Message from server: (reached daily quota of 132 tasks)
>> 03-Jun-2010 09:32:39 [s...@home Beta Test] Sending scheduler request: Requested by user.
>> 03/06/2010 09:32:43 s...@home Beta Test Scheduler request completed: got 37 new tasks // 172
>> 03/06/2010 09:36:13 s...@home Beta Test Reporting 1 completed tasks, requesting new tasks for GPU
>> 03/06/2010 09:36:16 s...@home Beta Test Message from server: (reached daily quota of 140 tasks)
>> 03/06/2010 11:53:48 s...@home Beta Test Reporting 44 completed tasks, requesting new tasks for GPU
>> 03/06/2010 11:54:02 s...@home Beta Test Scheduler request completed: got 0 new tasks
>> 03/06/2010 11:54:02 s...@home Beta Test Message from server: No work sent
>> 03/06/2010 11:54:02 s...@home Beta Test Message from server: (reached daily quota of 141 tasks)
>> 2) How are we going to handle this on the website host details? As I
>> type, with a quota of 141,
>> http://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=12316
>> is still saying "Maximum daily WU quota per CPU 100/day"
>> Yet looking at a wingmate, Pappa's
>> http://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=45842
>> (hi, Al) is showing "Maximum daily WU quota per CPU 0/day" - yet
>> returning valid work. That's not just the difference between logged-in
>> and third-party reporting - other hosts I've checked are showing
>> 100/day to third parties.
>> A web display so far divorced from the new reality is clearly
>> misleading, and shouldn't be shown.
>> But it would be a shame to lose it
>> completely: often a volunteer's first question on a help-desk is "Why
>> aren't I getting any work for Project X?", and seeing a crippled quota
>> is a lead-in to advising on what to do about repeated computation errors.
>>
>> And while I'm reporting - SETI is aware that they're a download server
>> short, aren't they?
>> 03-Jun-2010 09:41:21 [---] [http_debug] [ID#1439] Info: About to connect() to boinc2.ssl.berkeley.edu port 80 (#0)
>> 03-Jun-2010 09:41:21 [---] [http_debug] [ID#1439] Info: Trying 208.68.240.18...
>> 03-Jun-2010 09:41:23 [---] [http_debug] [ID#1439] Info: Connection refused
>> 03-Jun-2010 09:41:23 [---] [http_debug] [ID#1439] Info: Failed connect to boinc2.ssl.berkeley.edu:80; No error
>> 03-Jun-2010 09:41:23 [---] [http_debug] [ID#1439] Info: Expire cleared
>> 03-Jun-2010 09:41:23 [---] [http_debug] [ID#1439] Info: Closing connection #0
>> 03-Jun-2010 09:41:23 [---] [http_debug] HTTP error: Couldn't connect to server
>>
>> --- On Wed, 2/6/10, Richard Haselgrove <[email protected]> wrote:
>>
>>
>> From: Richard Haselgrove <[email protected]>
>> Subject: Re: [boinc_dev] host punishment mechanism revisited
>> To: [email protected]
>> Date: Wednesday, 2 June, 2010, 9:12
>>
>>
>> I see that David has implemented the 'Reward for Validation' component
>> of this discussion (http://boinc.berkeley.edu/trac/changeset/21675).
>>
>> However, don't we need to do something about backoffs?
>>
>> At the moment, if you ever reach the daily quota, you get a message
>> saying typically "no work sent / reached daily quota of xxx tasks",
>> and all scheduler RPCs are inhibited until 'server midnight + rnd(1
>> hour)'. I assume that's a server backoff instruction, and not coded
>> into the client (which wouldn't know the server's local time).
>>
>> But the daily quota is no longer a fixed value.
>> Indeed, if you both
>> reported and requested work in the same RPC, your quota might be
>> increased in the next few seconds, as the work you've just reported
>> starts to validate. The backoff should be no more than the existing
>> project RPC backoff and client 'no work sent' exponential backoff.
>>
>> Unfortunately, at the moment I can't test any of this: we only have
>> one test project with this code, and it says
>>
>> s...@home Beta Test 02/06/2010 08:28:40 Reporting 26 completed tasks, not requesting new tasks
>> s...@home Beta Test 02/06/2010 08:28:45 Scheduler request failed: HTTP internal server error
>> _______________________________________________
>> boinc_dev mailing list
>> [email protected]
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
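
As a rough illustration of the 'server midnight + rnd(1 hour)' backoff described in the thread, the sketch below computes a delay in seconds from the current time to the server's next local midnight, plus up to an hour of random jitter. The function name, the fixed UTC offset parameter and the uniform jitter are assumptions made for the example; this is not the scheduler's actual code.

```python
import random
from datetime import datetime, timedelta, timezone


def delay_until_server_midnight(now_utc: datetime,
                                server_utc_offset_hours: float,
                                rng: random.Random) -> float:
    """Seconds until the server's next local midnight, plus 0 to 3600 s of jitter.

    Hypothetical helper: the server's time zone is modelled as a fixed UTC offset.
    """
    server_tz = timezone(timedelta(hours=server_utc_offset_hours))
    local = now_utc.astimezone(server_tz)
    # Start of the next calendar day in the server's local time.
    next_midnight = (local + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0)
    base = (next_midnight - local).total_seconds()
    return base + rng.uniform(0, 3600)
```

A value computed this way would be sent to the client as the request delay. Because it depends only on the clock and not on the host's current quota, it cannot shorten when a validation raises the quota a few seconds later, which is the concern raised in the thread.
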
