Re: Mersenne: problems with Solaris nice

Brian J. Beesley Sun, 01 Apr 2001 03:15:27 -0700
On 31 Mar 2001, at 22:34, Nathan Russell wrote:

> Under Linux, it is only slightly better.  (note, however, that the
> 'other' job is not a particularly kind one in CPU terms!)

Just checked. Kernel 2.4 behaves just like Solaris. Kernel 2.2 
doesn't: there, a "nice -n20"'d CPU-bound process gets hardly any CPU 
time whilst there are CPU-bound processes running at normal priority.

Telling people not to upgrade to kernel 2.4 simply isn't on. We'll 
have to find a solution to this.

The ideal solution would be to incorporate mprime as the null 
process, but for various reasons this isn't practical.

The upside of kernel 2.4 is that mprime seems to run a little faster 
on an otherwise idle system - around 1% or 2% - of course, anything 
helps!
> 
> I wish the admins here would allow distributed computing - they say
> that whenever the machine is slow they were getting flames about the
> distributed client 'raising the load average'.  

The flames are fed by three things: ignorance, stupidity and nothing 
else. Of course the load average rises! The point is that you should 
be able to run a huge number of low-priority CPU soak programs 
without affecting the apparent response of the system to interactive 
users.

Anyone else wanting to run a background CPU soak program is naturally 
going to be affected.
> 
> (big snip)
> 
> > I think the load average limit for starting the process should be
> > about one half less than the number of processors - say 0.5 on a
> > uniprocessor system - and the load average limit for stopping it
> > should be a little less than one more than the number of processors
> > - say 1.8 on a uniprocessor system.
> 
> I don't know much about such things, but I would note that I was
> working quite comfortably during the above test with load averages in
> the 2.3 range.  However, the typical 'uniprocessor system' is not a P3
> with a single user typing email!  

The load average is simply the average, over the interval in 
question, of the number of computable process threads on the system - 
irrespective of priority. If you do "cat /proc/loadavg" (on a linux 
system; Solaris seems not to have this capability) the first three 
numbers are the load average over the last one minute, five minutes 
and fifteen minutes respectively.
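
For anyone who wants those figures in a script, here is a minimal 
sketch in Python. The function names are my own invention for 
illustration. It assumes the Linux layout of /proc/loadavg (three 
load averages first, then other fields) and falls back to Python's 
os.getloadavg() on systems where that file does not exist.

```python
import os

def parse_loadavg(text):
    """Return the first three fields of a /proc/loadavg line as floats."""
    fields = text.split()
    return tuple(float(f) for f in fields[:3])

def read_loadavg(path="/proc/loadavg"):
    """Read the three load averages, preferring /proc where it exists."""
    try:
        with open(path) as fh:
            return parse_loadavg(fh.read())
    except OSError:
        # No /proc/loadavg here: ask the C library via the standard module.
        return os.getloadavg()
```

Calling read_loadavg() gives a tuple of three floats; only the first 
(the one minute figure) matters for the start/stop decision.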

Processes that are currently at a lower priority (higher priority 
number) still count towards the load average, even though they hardly 
get in the way of higher-priority work. Therefore the load average on 
its own is a poor indicator of the system's capability to do work, 
especially when the actual work is interactive or I/O bound rather 
than compute bound.

If the one minute load average is less than the number of processors, 
you have wasted some cycles during the last minute; you could 
consider starting an extra CPU soak process. If you have one too many 
CPU soak processes running, the load average will be around one more 
than the number of processors, even if there is nothing else active. 
Possibly a bit less than N+1, since worthwhile CPU soak processes are 
rarely perfect - they do tend to do at least some I/O. 

The paragraph above shows the reasoning behind my suggestion; the 
values I give are on the safe side. Note that it's important that the 
difference between the "start" and "stop" thresholds is greater than 
1, otherwise thrashing is likely.
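
To make the start/stop rule concrete, here is a rough sketch in 
Python. The names thresholds() and should_run() are hypothetical, and 
the offsets (0.5 below the processor count to start, 0.8 above it to 
stop) are just the example values quoted above, not tuned figures. 
The gap between the two limits is 1.3, so the soak process's own 
contribution of about 1 to the load average cannot by itself push the 
load from the start limit to the stop limit.

```python
def thresholds(ncpus):
    """Return (start, stop) one-minute load limits for ncpus processors."""
    start = ncpus - 0.5   # assumed: "one half less than the number of processors"
    stop = ncpus + 0.8    # assumed: "a little less than one more"
    return start, stop

def should_run(running, load1, ncpus):
    """Decide whether the soak process should be running from now on.

    running: whether the soak process is running at the moment
    load1:   the one-minute load average
    """
    start, stop = thresholds(ncpus)
    if running:
        # Keep going until the load reaches the stop limit.
        return load1 < stop
    # Only start once the load has dropped below the start limit.
    return load1 < start
```

On a uniprocessor system this starts the process at a load below 0.5 
and leaves it alone until the load reaches 1.8.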


Regards
Brian Beesley
_________________________________________________________________________
Unsubscribe & list info -- http://www.scruz.net/~luke/signup.htm
Mersenne Prime FAQ      -- http://www.tasam.com/~lrwiman/FAQ-mers