On Mon, Apr 20, 2009 at 01:57:55PM -0400, RD Thrush wrote:
> I've recently noticed reduced performance when building ports for
> amd64 and i386 platforms on multiprocessor boxes. I found the problem
> was associated with running a 'nice'd dnetc [1] process on each
> processor. Without the 'nice'd processes, performance improves
> dramatically.
>
> In a test case, elapsed time increased 25X (from ~24 seconds to more
> than 650 seconds) in one case and 12X (from ~30 seconds to more than
> 350 seconds) in another case.
>
> Since I received the 4.5 CD on Saturday (thanks for another very cool
> release!), I used the bsd.mp kernels from that release and found the
> problem with reduced performance has occurred since the 4.5 release.
>
> The problem can be reproduced by busying each core w/ a 'nice'd
> process. Then, 'make clean;time make fake' in $PORTSDIR/devel/libtool
> illustrates the problem.

Timing a make run is not useful here: the most important numbers
(user and sys) are only recorded for the initial 'make' process, not
for the programs it starts.
You may want to redo the test with a program that doesn't spawn other
processes, like gzipping a large file.

> FWIW, I have a soekris 5501 that does *not* have the problem which may
> indicate the issue is not in the uniprocessor environment.
>
> Is this new 'nice' behavior expected? Or, the side-effect of some
> other updates to the multiprocessor environment? Hopefully,
> performance can be restored to that of the 4.5 release.

A lot of CPU affinity changes have gone into the kernel.

Please note that, while a niced process may only eat left-over
processor time (with a lower bound), it still takes responsiveness
away from the system: it will finish its timeslice, and the context
switches may be expensive too.
Reduced responsiveness also means more delay between data arriving
from disk and the waiting process getting to run and process it.

> I've appended a list of the 'time make fake' results for the release
> and snapshot bsd.mp kernels for both an amd quad-core (amd64 platform)
> and an amd dual-core (i386 platform). The busy source and associated
> dmesgs are at the end.

> ######################################################
> a8v:Projects/busy 4025>cat Do_busy
> #!/bin/sh
>
> if [ "X$1" != "X" ]; then
>   NICE=nice
> fi
> CPUS=$(sysctl -n hw.ncpu)
> TIME=/usr/bin/time
> BUSY=./busy
> CNT=0
> echo Busying $CPUS processors
> while :
> do
>   CNT=$((CNT + 1))
>   $NICE $TIME -l $BUSY &

Note that nice, by default, does not apply the maximum nice level but
a default increment which makes the process a bit nicer. Most likely,
your dnetc process switches to nice level 20 (the maximum) while this
invocation probably uses 10 (as per /usr/bin/nice default).

>   if [ $CNT = $CPUS ]; then
>     exit
>   fi
> done

-- 
Ariane
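
P.S. A minimal sketch of the retest suggested above, assuming gzip as
the single-process workload. SIZE_MB, NICE_LEVEL, the scratch directory
and the function names are illustrative assumptions and not part of the
original report; the busy loop is a plain shell loop standing in for
./busy or dnetc.

```shell
#!/bin/sh
# Sketch only: SIZE_MB, NICE_LEVEL and the scratch paths are assumptions.
SIZE_MB=${SIZE_MB:-16}        # size of the test file, in megabytes
NICE_LEVEL=${NICE_LEVEL:-20}  # nice increment for the busy loops
CPUS=$(sysctl -n hw.ncpu 2>/dev/null)
CPUS=${CPUS:-1}               # fall back to 1 where hw.ncpu is unavailable
WORK=$(mktemp -d "${TMPDIR:-/tmp}/nicetest.XXXXXX") || exit 1
PIDS=""

start_busy() {
    # One niced busy loop per processor, like Do_busy with an argument.
    n=0
    while [ "$n" -lt "$CPUS" ]; do
        nice -n "$NICE_LEVEL" sh -c 'while :; do :; done' &
        PIDS="$PIDS $!"
        n=$((n + 1))
    done
}

stop_busy() {
    if [ -n "$PIDS" ]; then
        kill $PIDS 2>/dev/null
        wait $PIDS 2>/dev/null
    fi
    PIDS=""
}

timed_gzip() {
    # gzip is a single process, so real/user/sys describe the work itself.
    { time -p gzip -c "$WORK/data" > /dev/null; } 2> "$WORK/$1.time"
    echo "== $1 =="
    cat "$WORK/$1.time"
}

# Stop the busy loops and remove the scratch directory on exit.
trap 'stop_busy; rm -rf "$WORK"' EXIT

# Random data keeps gzip CPU-bound for the whole file.
dd if=/dev/urandom of="$WORK/data" bs=1048576 count="$SIZE_MB" 2>/dev/null

timed_gzip idle     # baseline, no competing processes
start_busy
timed_gzip niced    # same work with one niced loop per processor
stop_busy
```

Running this once under the 4.5 bsd.mp kernel and once under the
snapshot kernel, and comparing the real, user and sys lines of the two
runs, should show whether the niced loops cost CPU time or only
wall-clock time. Setting NICE_LEVEL=10 matches the /usr/bin/nice
default mentioned above.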