Dear All,

Very good to look at this again.
The new proposal is a good quick stab at trying a change. However, I'm not clear whether it will be a magic-bullet fix. It does have two big advantages: (a) it looks simpler, and (b) it uses what the users *see* and 'measure' for feedback. I'm rather concerned that the feedback loop for balancing resource share will be far, far too slow to avoid instability, or that you'll have to damp the mechanism so heavily that it doesn't respond, or that the RAF 'adjust/limit' parameter will simply always dominate over the RAC. Hence: don't include "RAC" at all! As explained below.

For the rushed: for the summary, jump down to "***". For the conclusion, jump down to "******"!

This all reminds me of previous discussions stretching back some time. E.g.:

On 18/02/09 17:48, Martin wrote:
> Eric J Korpela wrote:
> [...]
>> tries to allocate CPU time according to resource share. What it should do is try to allocate credit according to resource share. In that way a project is encouraged to equalize its credits. If they are overpaying, they will get fewer resources. If they are underpaying they will lose participants.
>
> That is actually what I look at when looking to see if the resource shares are working as intended!
>
> Perhaps using the credits in that way would give them a good reason for the [credits] existence and a very good reason to get them right (and better reflect real-world performance).
>
> Regards,
> Martin

As ever, good points from Paul, even if rather voluminous on the detail. There are various views on this, as ever.

I'm still of the opinion that the scheduler has an impossible task in trying to schedule by a presumed hardware resource share when the measurement used (RAC) is only vaguely associated with hardware resources and varies wildly depending on the mix of hardware and tasks. Worse still, the final defining RAC for a task, as awarded by the servers, can be different or even zero, and can arrive a very long time after the event. Hence there are always instabilities in the scheduling, because the measurement controlling the scheduling is itself unstable. That is, the scheduler's "loop gain and response delay" vary with each scheduled task. If you try scheduling by RAC, you'll get unexpected results in present scheduling from events that happened long, long ago, whenever a server finally reports the RAC for old tasks.

Unless you add visual dials to the BOINC view to show users the project debts (or the "credits bank", so as to be positive?), the scheduler is always going to do 'unexpected' things. Is this why we're still considering this and trying another new idea? Visibility? Simplicity?

***

For a full fix?... Well, no one wants to program the server-side stuff for NIST-like credits calibration. (It would still be good to really measure real-world performance... ;-) Could compare against real-world supercomputer systems ;-) ;-) )

The "RAC" as reported back from the project servers is far too delayed to use directly for scheduling - very painfully slow feedback (a rough sketch of just how slow is given below).

Also, do we schedule for the 'effort' put in, or for the 'results seen'? If for 'results seen', then a project gets an unfair extra share of host time rerunning WUs for any that failed validation...

This all comes back to: the cobblestone is too vague and unrepresentative a measure of resource, and cannot *directly* measure all the resources used. Also, there is simply no such thing as FLOPS*IOPS for measuring disk storage, network usage, or (non-compute) monitoring time.
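To put a rough number on that 'painfully slow feedback' point, here is an illustrative back-of-envelope sketch only - not the actual client code - assuming RAC behaves like the usual exponentially smoothed average with a half-life of the order of a week, before you even count the server-side validation delay:

# Illustrative sketch only (Python, not BOINC code): RAC-style
# exponential smoothing, assuming a half-life of roughly one week.
import math

HALF_LIFE_DAYS = 7.0   # assumed order of magnitude, for illustration

def update_rac(rac, credit_per_day, dt_days):
    """Decay the old average, then blend in the newly granted rate."""
    decay = math.exp(-dt_days * math.log(2.0) / HALF_LIFE_DAYS)
    return rac * decay + credit_per_day * (1.0 - decay)

# A project that jumps from 0 to a steady 100 credits/day:
rac = 0.0
for day in range(1, 29):
    rac = update_rac(rac, 100.0, 1.0)
    if day in (1, 7, 14, 28):
        print("day %2d: RAC = %5.1f" % (day, rac))
# day  1: RAC =   9.4
# day  7: RAC =  50.0
# day 14: RAC =  75.0
# day 28: RAC =  93.8

A controller driven by that signal takes a week just to see half of a step change - hence my worry that it either goes unstable or has to be damped into unresponsiveness.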
The very old days of pre-WindowsXP should now be no more... Hence we should now be able to assume that all modern OSes support reporting resource usage and wallclock *time* accurately.

******

For one idea: completely abandon trying to balance RAC in the host scheduler. Let the scheduler schedule by the well tried and tested method of allocating resource share by time, or probabilistically by a proportionate priority, and let the credits and RAC be controlled/adjusted /outside/ of the scheduler feedback loop by the projects (for whatever they want).

Eric's present method (median of host performance seen) of rationalising the credits for s...@h may well work well enough that no one will notice anything untoward. ... Until, that is, the ever-increasing proportion of GPGPU/GPU results begins to impinge on the present CPU-results majority and in effect changes the dominant architecture being 'measured' for the credits...

Good time for a scheduler change? ;-) (OK, sorry, couldn't resist the time pun! :-) )
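For what "resource share by time, or probabilistically by a proportionate priority" might look like, a very rough sketch only - hypothetical names, nothing like the real client code - just to show that the feedback variable is locally measured time rather than server-granted credit:

# Very rough sketch (Python, hypothetical): pick the next project either
# by "furthest below its fair share of time" or in proportion to share.
import random

projects = {                       # resource share, CPU-seconds so far
    "ProjectA": {"share": 60.0, "cpu_time": 0.0},
    "ProjectB": {"share": 30.0, "cpu_time": 0.0},
    "ProjectC": {"share": 10.0, "cpu_time": 0.0},
}

def pick_by_time_deficit(projects):
    """Run whichever project is furthest below its fair share of time."""
    total_share = sum(p["share"] for p in projects.values())
    total_time = sum(p["cpu_time"] for p in projects.values()) or 1.0
    def deficit(name):
        p = projects[name]
        return p["share"] / total_share - p["cpu_time"] / total_time
    return max(projects, key=deficit)

def pick_probabilistically(projects):
    """Run a project with probability proportional to its resource share."""
    names = list(projects)
    weights = [projects[n]["share"] for n in names]
    return random.choices(names, weights=weights, k=1)[0]

# Either way the loop closes on wallclock/CPU time the client measures
# itself, immediately, with no waiting on servers to grant credit.
for _ in range(1000):
    chosen = pick_by_time_deficit(projects)
    projects[chosen]["cpu_time"] += 60.0    # pretend it ran for a minute

The time actually given converges on the 60/30/10 shares, and the credits then become purely a reporting matter for the projects, outside the loop.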
Regards,
Martin


Supporting posts trimmed for main points, all very good:


On 27/10/10 10:46, Paul D. Buck wrote:
> I am not sure that some other points are well considered as well...
>
> RAC is unstable, always has been, partly because of delays in validation, but partly for other reasons (never clearly identified, variety of projects than SaH, but, BOINC is not just SaH, though it is often treated as such.
>
> Secondly, if you are going to use RAF (not sure if this is new-new, or just recent, but this is the first I have heard of this tracking parameter), what is the point of also using RAC? ... they are
>
> The next point is that for some projects, especially NCIs like WUProp RS of 100 can never be reached because of workload limiting... and other projects have similar, though usually harder to reach daily limits as well...
>
> Next I would raise the point that our method of rating the GPUs is self limited in the sense that there are three different classes of work being run on GPUs ... SP, DP, and integer ... and the cards are rated only for one of the three classes. So, on my Nvidia cards I ... and I have never been that comfortable with either the CPU benchmarks or the current rating system for GPUs with regard for actual project performance.
>
> the order of (in several of my systems) a factor of 50 to 60 different (or more). which begs the question, will this new system not also cause "pegging" to limits because of these efficiency differences as well...
>
> because for the most part I don't run CPU/GPU projects without turning off one side or the other (usually CPU side, if you can run on the GPU why on earth would you waste CPU time? But that is just me.).
>
> Last consideration is work availability's ... the simulator I know does not reflect well the variety of work availabilities from projects like LHC where work is almost never available and when it is is only there in small batches, to MW which almost always has work
>
> But fundamentally, the biggest concern is how do we measure the effectiveness of the fix when we have such inadequate tools to measure what BOINC is actually doing. Productivity wise... I used to use BOINC View to log and then analyze the logs for things like how much work was really done .. but that tool is no longer supported ... to allow us to figure out if the new system is working as expected.
>
> On Oct 26, 2010, at 3:27 PM, Richard Haselgrove wrote:
>
>> My first reaction, on reading this, was that resource share should be based on work done, measured directly, rather than by using credit as a surrogate. But I've read the design document, and you've covered that point - and also taken into account the immediate consequence, which is that projects which over-inflate their credits get automatically penalised with a reduced effective resource share, and hence lower work rate. I'm happy with that.
>>
>> But I'm worried by any proposal which uses credit as an input to drive any functional aspect of BOINC. As a trivial example, just an hour ago someone on a message board celebrated that, according to http://aqua.dwavesys.com/server_status.php, AQUA has now "surpassed 200,000 GFlops of compute power". BOINCstats is currently displaying 169,077.8 GigaFLOPS. I suspect the difference may be due to AQUA's current transitioner troubles, which results in a 'lumpy' allocation of credit, and hence RAC. I suspect that the poster's conclusion, that "This would place a...@h at Rank #22 in the Top500 list" http://www.top500.org/list/2010/06/100, would be treated with a certain amount of scepticism by the developers of LINPACK.
>>
>> If credit is to be used as an input, it will, and should, be subject to greater scrutiny with regards to derivation, methodology, and, dare I say, accuracy.
>>
>> ... fatally flawed by the flop-count discrepancy between CPU and CUDA applications processing identical work for the project.
>>
>> But, with care, one can still obtain the 'claimed' credit (flop-count based) for pre-validation SETI tasks, and compare them with the granted credit for the same tasks post-validation.
>>
>> I performed this analysis over the weekend of 27/28 August 2010, and plotted the results for 980 tasks processed - exclusively by CUDA applications - on
>>
>> I am worried by the non-deterministic value of the individual credit awards, even though the average figures seem reasonable. I think we should double-check that all is going to plan with Credit New, before we start building greater edifices on foundations, possibly, of sand.
>>
>> ----- Original Message -----
>> From: "David Anderson" <[email protected]>
>> To: "BOINC Developers Mailing List" <[email protected]>
>> Sent: Tuesday, October 26, 2010 10:13 PM
>> Subject: [boinc_dev] proposed scheduling policy changes
>>
>>> Experiments with the client simulator using Richard's scenario made it clear that the current scheduling framework (based on STD and LTD for separate processor types) is fatally flawed: it may divide resources among projects in a way that makes no sense and doesn't respect resource shares.
>>>
>>> In particular, resource shares, as some have already pointed out, should apply to total work (as measured by credit) rather than to individual processor types. If two projects have equal resource shares, they should ideally have equal RAC, even if that means that one of them gets 100% of a particular processor type.
>>>
>>> I think it's possible to do this, although there are difficulties due to delayed credit granting. I wrote up a design for this: http://boinc.berkeley.edu/trac/wiki/ClientSchedOctTen Comments are welcome.
>>>
>>> BTW, the new mechanisms would be significantly simpler than the old ones. This is always a good sign.
>>>
>>> -- David

--
--------------------
Martin Lomas
m_boincdev ml1 co uk.ddSPAM.dd
--------------------

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
