On Mon, Apr 20, 2009 at 01:57:55PM -0400, RD Thrush wrote:
> I've recently noticed reduced performance when building ports for
> amd64 and i386 platforms on multiprocessor boxes.  I found the problem
> was associated with running a 'nice'd dnetc [1] process on each
> processor.  Without the 'nice'd processes, performance improves
> dramatically.
> 
> In a test case, elapsed time increased 25X (from ~24 seconds to more
> than 650 seconds) in one case and 12X (from ~30 seconds to more than
> 350 seconds) in another case.
> 
> Since I received the 4.5 CD on Saturday (thanks for another very cool
> release!), I used the bsd.mp kernels from that release and found the
> problem with reduced performance has occurred since the 4.5 release.
> 
> The problem can be reproduced by busying each core w/ a 'nice'd
> process.  Then, 'make clean;time make fake' in $PORTSDIR/devel/libtool
> illustrates the problem.

Timing a 'make' run is not useful here: the most important numbers
(user and sys) are only recorded for the initial 'make' process, not for
the programs it starts.

You may want to redo the test with a program that doesn't spawn other
processes, like gzipping a large file.
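
A minimal sketch of such a test, assuming a scratch file in /tmp filled
from /dev/urandom (the file name and the 16 MB size are arbitrary
placeholders, not taken from the original report). Run it once with the
niced busy processes active and once without, then compare the user and
sys figures:

```shell
#!/bin/sh
# Time one CPU-bound process that spawns no children.
# TESTFILE and the 16 MB size are illustrative assumptions.
TESTFILE=$(mktemp /tmp/gziptest.XXXXXX) || exit 1

# Fill the scratch file with 16 MB of data (16 blocks of 1 MB).
dd if=/dev/urandom of="$TESTFILE" bs=1048576 count=16 2>/dev/null

# Compress to stdout and discard the result; only the timing matters.
time gzip -c "$TESTFILE" > /dev/null

# Remove the scratch file.
rm -f "$TESTFILE"
```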

> FWIW, I have a soekris 5501 that does *not* have the problem which may
> indicate the issue is not in the uniprocessor environment.
> 
> Is this new 'nice' behavior expected?  Or, the side-effect of some
> other updates to the multiprocessor environment?  Hopefully,
> performance can be restored to that of the 4.5 release.

A lot of CPU affinity changes have gone into the kernel since the 4.5
release.

Please note that, while a niced process may only eat left-over processor
time (with a lower bound), it still takes responsiveness away from the
system: once scheduled, it runs until the end of its timeslice, and the
context switches may be expensive too. Reduced responsiveness also means
a longer delay between data arriving from disk and the waiting process
getting the CPU to handle it.

> I've appended a list of the 'time make fake' results for the release
> and snapshot bsd.mp kernels for both an amd quad-core (amd64 platform)
> and an amd dual-core (i386 platform).  The busy source and associated
> dmesgs are at the end.



> ######################################################
> a8v:Projects/busy 4025>cat Do_busy
> #!/bin/sh
> 
> if [ "X$1" != "X" ]; then
>   NICE=nice
> fi
> CPUS=$(sysctl -n hw.ncpu)
> TIME=/usr/bin/time
> BUSY=./busy
> CNT=0
> echo Busying $CPUS processors
> while :
> do
>   CNT=$((CNT + 1))
>   $NICE $TIME -l $BUSY &

Note that nice(1) does not apply the maximum nice level by default; it
adds a default increment that makes the process only somewhat nicer.
Most likely, your dnetc process switches itself to nice level 20 (the
maximum), while this invocation probably runs at 10 (the /usr/bin/nice
default).

>   if [ $CNT = $CPUS ]; then
>     exit
>   fi
> done
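
To compare against the nice level dnetc probably runs at, the increment
can be passed explicitly instead of relying on the nice(1) default. A
sketch, assuming a hypothetical helper nice_prefix (not part of the
original Do_busy) that turns an optional argument into the command
prefix:

```shell
#!/bin/sh
# nice_prefix is a hypothetical helper, not part of the original script.
# With no argument it prints nothing (run un-niced); with a number N it
# prints "nice -n N", making the increment explicit.
nice_prefix() {
  if [ "X$1" != "X" ]; then
    echo "nice -n $1"
  fi
}

# Example: 20 requests the weakest priority; the system clamps the
# value if it exceeds what it allows.
NICE=$(nice_prefix 20)
echo "prefix: $NICE"
```

In Do_busy, NICE would then be set from nice_prefix "$1" and the launch
line `$NICE $TIME -l $BUSY &` stays as it is.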

-- 
Ariane
