I want to get 6.12 out the door as soon as we're all happy w/ notices.

The scheduler changes will be in a later 6.12 release, or 6.14.
I'm optimistic that it won't take too long, because
a) the design is better and way simpler
b) we have the client simulator to test it with

-- David

On 01-Nov-2010 7:18 AM, Richard Haselgrove wrote:
> I agree with John - this is a major change, and will need extensive testing.
>
> David, may I ask what your current expectation is for the timeline for the
> new scheduler? Specifically, are you going to attempt to incorporate it into
> v6.12, or would it be better to get all the 'notices' angst out of the way
> via a public release (and debug if necessary after BETA testing by the
> public at large) first, and then we can concentrate all resources on
> functionality?
>
> I'm concerned that there seem to be a couple of recently-reported issues
> which might slip through the cracks.
>
> 1) In v6.12.4, the thrashing of GPU tasks into and out of GPU memory,
> because there seems to be no 'Task Switch Interval' inhibition on the new
> GPU scheduling by debt.
>
> 2) In v6.10.58, the idle CPUs apparently caused by the scheduler incorrectly
> handling the triple mixture of High Priority CPU / Multithread / ordinary
> priority single CPU tasks.
> (from http://boinc.berkeley.edu/dev/forum_thread.php?id=6138)
>
> If the new scheduler is going to be put into v6.12 (which will inevitably
> delay that release a bit), could those fixes (when ready) be backported into
> v6.10, please?
>
> Or if the new scheduler has to wait for v6.14, perhaps we should concentrate
> on getting v6.12 finished first?
>
>
>
>> This needs to be tested thoroughly with some very long term simulations
>> involving several years of simulation time.  Anything that involves the
>> concept of recent for work fetch will break resource share over the long
>> term when used in conjunction with CPDN.
>>
>> jm7
>>
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of boinc_cvs digest..."
>>
>>
>> Today's Topics:
>>
>>    1. r22608 - in trunk/boinc: . api client lib sched
>>       (boinc...@ssl.berkeley.edu)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 29 Oct 2010 16:41:35 -0700
>> From: boinc...@ssl.berkeley.edu
>> Subject: [boinc_cvs] r22608 - in trunk/boinc: . api client lib sched
>> To: boinc_...@ssl.berkeley.edu
>> Message-ID:<201010292341.o9tnfzuh013...@mail.ssl.berkeley.edu>
>> Content-Type: text/plain; charset=UTF-8
>>
>> Author: davea
>> Date: 2010-10-29 16:41:34 -0700 (Fri, 29 Oct 2010)
>> New Revision: 22608
>>
>> Modified:
>>    trunk/boinc/api/boinc_api.cpp
>>    trunk/boinc/checkin_notes
>>    trunk/boinc/client/client_types.cpp
>>    trunk/boinc/client/cpu_sched.cpp
>>    trunk/boinc/client/net_stats.cpp
>>    trunk/boinc/client/work_fetch.h
>>    trunk/boinc/lib/util.cpp
>>    trunk/boinc/lib/util.h
>>    trunk/boinc/sched/credit.cpp
>>    trunk/boinc/sched/update_stats.cpp
>> Log:
>> - client: small initial checkin for new scheduling system.
>>     Keep track of per-project recent estimated credit
>>
>>
>> Modified: trunk/boinc/api/boinc_api.cpp
>> ===================================================================
>> --- trunk/boinc/api/boinc_api.cpp          2010-10-29 18:58:26 UTC (rev
>> 22607)
>> +++ trunk/boinc/api/boinc_api.cpp          2010-10-29 23:41:34 UTC (rev
>> 22608)
>> @@ -835,9 +835,9 @@
>> #else
>>          strcpy(abspath, path);
>> #endif
>> -        argv[0] = GRAPHICS_APP_FILENAME;
>> +        argv[0] = (char*)GRAPHICS_APP_FILENAME;
>>          if (fullscreen) {
>> -            argv[1] = "--fullscreen";
>> +            argv[1] = (char*)"--fullscreen";
>>              argv[2] = 0;
>>              argc = 2;
>>          } else {
>>
>> Modified: trunk/boinc/checkin_notes
>> ===================================================================
>> --- trunk/boinc/checkin_notes        2010-10-29 18:58:26 UTC (rev 22607)
>> +++ trunk/boinc/checkin_notes        2010-10-29 23:41:34 UTC (rev 22608)
>> @@ -7660,3 +7660,20 @@
>>                          client_msgs.cpp
>>              clientgui/
>>                          NoticeListCtrl.cpp
>> +
>> +David  29 Oct 2010
>> +    - client: small initial checkin for new scheduling system.
>> +        Keep track of per-project recent estimated credit
>> +
>> +    api/
>> +        boinc_api.cpp
>> +    client/
>> +        client_types.cpp
>> +        cpu_sched.cpp
>> +        net_stats.cpp
>> +        work_fetch.h
>> +    lib/
>> +        util.cpp,h
>> +    sched/
>> +        credit.cpp
>> +        update_stats.cpp
>>
>> Modified: trunk/boinc/client/client_types.cpp
>> ===================================================================
>> --- trunk/boinc/client/client_types.cpp          2010-10-29 18:58:26 UTC
>> (rev 22607)
>> +++ trunk/boinc/client/client_types.cpp          2010-10-29 23:41:34 UTC
>> (rev 22608)
>> @@ -202,6 +202,8 @@
>>          if (parse_bool(buf, "dont_request_more_work",
>> dont_request_more_work)) continue;
>>          if (parse_bool(buf, "detach_when_done", detach_when_done))
>> continue;
>>          if (parse_bool(buf, "ended", ended)) continue;
>> +        if (parse_double(buf, "<rec>", pwf.rec)) continue;
>> +        if (parse_double(buf, "<rec_time>", pwf.rec_time)) continue;
>>          if (parse_double(buf, "<short_term_debt>",
>> cpu_pwf.short_term_debt)) continue;
>>          if (parse_double(buf, "<long_term_debt>", cpu_pwf.long_term_debt))
>> continue;
>>          if (parse_double(buf, "<cpu_backoff_interval>",
>> cpu_pwf.backoff_interval)) continue;
>> @@ -275,6 +277,8 @@
>>          "<master_fetch_failures>%d</master_fetch_failures>\n"
>>          "<min_rpc_time>%f</min_rpc_time>\n"
>>          "<next_rpc_time>%f</next_rpc_time>\n"
>> +        "<rec>%f</rec>\n"
>> +        "<rec_time>%f</rec_time>\n"
>>          "<short_term_debt>%f</short_term_debt>\n"
>>          "<long_term_debt>%f</long_term_debt>\n"
>>          "<cpu_backoff_interval>%f</cpu_backoff_interval>\n"
>> @@ -314,6 +318,8 @@
>>          master_fetch_failures,
>>          min_rpc_time,
>>          next_rpc_time,
>> +        pwf.rec,
>> +        pwf.rec_time,
>>          cpu_pwf.short_term_debt,
>>          cpu_pwf.long_term_debt, cpu_pwf.backoff_interval,
>> cpu_pwf.backoff_time,
>>          cuda_pwf.short_term_debt, cuda_pwf.long_term_debt,
>>
>> Modified: trunk/boinc/client/cpu_sched.cpp
>> ===================================================================
>> --- trunk/boinc/client/cpu_sched.cpp             2010-10-29 18:58:26 UTC
>> (rev 22607)
>> +++ trunk/boinc/client/cpu_sched.cpp             2010-10-29 23:41:34 UTC
>> (rev 22608)
>> @@ -514,6 +514,33 @@
>>      debt_interval_start = now;
>> }
>>
>> +#define REC_HALF_LIFE (30*86400)
>> +
>> +// update REC (recent estimated credit)
>> +//
>> +static void update_rec() {
>> +    double f = gstate.host_info.p_fpops;
>> +
>> +    for (unsigned int i=0; i<gstate.projects.size(); i++) {
>> +        PROJECT* p = gstate.projects[i];
>> +        double x = p->cpu_pwf.secs_this_debt_interval * f;
>> +        if (gstate.host_info.have_cuda()) {
>> +            x += p->cuda_pwf.secs_this_debt_interval * f *
>> cuda_work_fetch.relative_speed;
>> +        }
>> +        if (gstate.host_info.have_ati()) {
>> +            x += p->ati_pwf.secs_this_debt_interval * f *
>> ati_work_fetch.relative_speed;
>> +        }
>> +        update_average(
>> +            gstate.now,
>> +            gstate.debt_interval_start,
>> +            x,
>> +            REC_HALF_LIFE,
>> +            p->pwf.rec,
>> +            p->pwf.rec_time
>> +        );
>> +    }
>> +}
>> +
>> // adjust project debts (short, long-term)
>> //
>> void CLIENT_STATE::adjust_debts() {
>> @@ -551,6 +578,8 @@
>>          work_fetch.accumulate_inst_sec(atp, elapsed_time);
>>      }
>>
>> +    update_rec();
>> +
>>      cpu_work_fetch.update_long_term_debts();
>>      cpu_work_fetch.update_short_term_debts();
>>      if (host_info.have_cuda()) {
>>
>> Modified: trunk/boinc/client/net_stats.cpp
>> ===================================================================
>> --- trunk/boinc/client/net_stats.cpp             2010-10-29 18:58:26 UTC
>> (rev 22607)
>> +++ trunk/boinc/client/net_stats.cpp             2010-10-29 23:41:34 UTC
>> (rev 22608)
>> @@ -71,6 +71,7 @@
>>      }
>>      double start_time = gstate.now - dt;
>>      update_average(
>> +        gstate.now,
>>          start_time,
>>          nbytes,
>>          NET_RATE_HALF_LIFE,
>>
>> Modified: trunk/boinc/client/work_fetch.h
>> ===================================================================
>> --- trunk/boinc/client/work_fetch.h        2010-10-29 18:58:26 UTC (rev
>> 22607)
>> +++ trunk/boinc/client/work_fetch.h        2010-10-29 23:41:34 UTC (rev
>> 22608)
>> @@ -237,6 +237,10 @@
>>      bool can_fetch_work;
>>      bool compute_can_fetch_work(PROJECT*);
>>      bool has_runnable_jobs;
>> +    double rec;
>> +        // recent estimated credit
>> +    double rec_time;
>> +        // when it was last updated
>>      PROJECT_WORK_FETCH() {
>>          memset(this, 0, sizeof(*this));
>>      }
>>
>> Modified: trunk/boinc/lib/util.cpp
>> ===================================================================
>> --- trunk/boinc/lib/util.cpp         2010-10-29 18:58:26 UTC (rev 22607)
>> +++ trunk/boinc/lib/util.cpp         2010-10-29 23:41:34 UTC (rev 22608)
>> @@ -234,6 +234,7 @@
>> // html/inc/credit.inc
>> //
>> void update_average(
>> +    double now,
>>      double work_start_time,       // when new work was started
>>                                      // (or zero if no new work)
>>      double work,                    // amount of new work
>> @@ -241,8 +242,6 @@
>>      double&  avg,                    // average work per day (in and out)
>>      double&  avg_time                // when average was last computed
>> ) {
>> -    double now = dtime();
>> -
>>      if (avg_time) {
>>          // If an average R already exists, imagine that the new work was
>> done
>>          // entirely between avg_time and now.
>>
>> Modified: trunk/boinc/lib/util.h
>> ===================================================================
>> --- trunk/boinc/lib/util.h           2010-10-29 18:58:26 UTC (rev 22607)
>> +++ trunk/boinc/lib/util.h           2010-10-29 23:41:34 UTC (rev 22608)
>> @@ -61,7 +61,7 @@
>> extern double linux_cpu_time(int pid);
>> #endif
>>
>> -extern void update_average(double, double, double, double&, double&);
>> +extern void update_average(double, double, double, double, double&,
>> double&);
>>
>> extern int boinc_calling_thread_cpu_time(double&);
>>
>>
>> Modified: trunk/boinc/sched/credit.cpp
>> ===================================================================
>> --- trunk/boinc/sched/credit.cpp           2010-10-29 18:58:26 UTC (rev
>> 22607)
>> +++ trunk/boinc/sched/credit.cpp           2010-10-29 23:41:34 UTC (rev
>> 22608)
>> @@ -55,10 +55,12 @@
>>      DB_TEAM team;
>>      int retval;
>>      char buf[256];
>> +    double now = dtime();
>>
>>      // first, process the host
>>
>>      update_average(
>> +        now,
>>          start_time, credit, CREDIT_HALF_LIFE,
>>          host.expavg_credit, host.expavg_time
>>      );
>> @@ -76,6 +78,7 @@
>>      }
>>
>>      update_average(
>> +        now,
>>          start_time, credit, CREDIT_HALF_LIFE,
>>          user.expavg_credit, user.expavg_time
>>      );
>> @@ -103,6 +106,7 @@
>>              return retval;
>>          }
>>          update_average(
>> +            now,
>>              start_time, credit, CREDIT_HALF_LIFE,
>>              team.expavg_credit, team.expavg_time
>>          );
>> @@ -799,6 +803,7 @@
>> int write_modified_app_versions(vector<DB_APP_VERSION>&  app_versions) {
>>      unsigned int i, j;
>>      int retval = 0;
>> +    double now = dtime();
>>
>>      if (config.debug_credit&&  app_versions.size()) {
>>          log_messages.printf(MSG_NORMAL,
>> @@ -827,6 +832,7 @@
>>              }
>>              for (j=0; j<av.credit_samples.size(); j++) {
>>                  update_average(
>> +                    now,
>>                      av.credit_times[j], av.credit_samples[j],
>> CREDIT_HALF_LIFE,
>>                      av.expavg_credit, av.expavg_time
>>                  );
>>
>> Modified: trunk/boinc/sched/update_stats.cpp
>> ===================================================================
>> --- trunk/boinc/sched/update_stats.cpp           2010-10-29 18:58:26 UTC
>> (rev 22607)
>> +++ trunk/boinc/sched/update_stats.cpp           2010-10-29 23:41:34 UTC
>> (rev 22608)
>> @@ -54,6 +54,7 @@
>>      DB_USER user;
>>      int retval;
>>      char buf[256];
>> +    double now = dtime();
>>
>>      while (1) {
>>          retval = user.enumerate("where expavg_credit>0.1");
>> @@ -66,7 +67,9 @@
>>          }
>>
>>          if (user.expavg_time>  update_time_cutoff) continue;
>> -        update_average(0, 0, CREDIT_HALF_LIFE, user.expavg_credit,
>> user.expavg_time);
>> +        update_average(
>> +            now, 0, 0, CREDIT_HALF_LIFE, user.expavg_credit,
>> user.expavg_time
>> +        );
>>          sprintf( buf, "expavg_credit=%f, expavg_time=%f",
>>              user.expavg_credit, user.expavg_time
>>          );
>> @@ -84,6 +87,7 @@
>>      DB_HOST host;
>>      int retval;
>>      char buf[256];
>> +    double now = dtime();
>>
>>      while (1) {
>>          retval = host.enumerate("where expavg_credit>0.1");
>> @@ -96,7 +100,9 @@
>>          }
>>
>>          if (host.expavg_time>  update_time_cutoff) continue;
>> -        update_average(0, 0, CREDIT_HALF_LIFE, host.expavg_credit,
>> host.expavg_time);
>> +        update_average(
>> +            now, 0, 0, CREDIT_HALF_LIFE, host.expavg_credit,
>> host.expavg_time
>> +        );
>>          sprintf(
>>              buf,"expavg_credit=%f, expavg_time=%f",
>>              host.expavg_credit, host.expavg_time
>> @@ -142,6 +148,7 @@
>>      DB_TEAM team;
>>      int retval;
>>      char buf[256];
>> +    double now = dtime();
>>
>>      while (1) {
>>          retval = team.enumerate("where expavg_credit>0.1");
>> @@ -163,7 +170,10 @@
>>              continue;
>>          }
>>          if (team.expavg_time<  update_time_cutoff) {
>> -            update_average(0, 0, CREDIT_HALF_LIFE, team.expavg_credit,
>> team.expavg_time);
>> +            update_average(
>> +                now, 0, 0, CREDIT_HALF_LIFE, team.expavg_credit,
>> +                team.expavg_time
>> +            );
>>          }
>>          sprintf(
>>              buf, "expavg_credit=%f, expavg_time=%f, nusers=%d",
>>
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> boinc_cvs mailing list
>> boinc_...@ssl.berkeley.edu
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_cvs
>>
>>
>> End of boinc_cvs Digest, Vol 71, Issue 48
>> *****************************************
>>
>>
>>
>> _______________________________________________
>> boinc_dev mailing list
>> boinc_dev@ssl.berkeley.edu
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
>>
>
>
_______________________________________________
boinc_dev mailing list
boinc_dev@ssl.berkeley.edu
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
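
For readers following the REC discussion: the quoted commit decays each project's recent estimated credit with a 30-day half-life (REC_HALF_LIFE) through update_average(). The sketch below only illustrates that kind of half-life average. It is not the BOINC source: decay_average and kSecondsPerDay are hypothetical names, and the handling of the first sample is an assumption.

```cpp
#include <cmath>

// Hypothetical constant (not taken from the BOINC tree).
const double kSecondsPerDay = 86400.0;

// Illustrative half-life average. All times are in seconds; avg is in
// work units per day. avg_time == 0 means no average exists yet.
void decay_average(
    double now, double work_start_time, double work, double half_life,
    double& avg, double& avg_time
) {
    if (avg_time > 0) {
        double dt = now - avg_time;
        if (dt < 0) dt = 0;
        // The old average loses half its weight every half_life seconds.
        double weight = std::exp(-dt * std::log(2.0) / half_life);
        avg *= weight;
        if (dt > 0) {
            // Treat the new work as done evenly between avg_time and now.
            avg += (1.0 - weight) * (work / (dt / kSecondsPerDay));
        }
    } else if (work > 0 && now > work_start_time) {
        // Assumption: seed the first sample with the plain daily rate.
        avg = work / ((now - work_start_time) / kSecondsPerDay);
    }
    avg_time = now;
}
```

With half_life set to 30*86400, a project that does no work for 30 days sees its average halve. Anything older than a few half-lives contributes almost nothing, which is presumably the "recent" behavior that the long-term simulation concern is about.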
