Re: SCHED_ULE should not be the default
On Mon, 19 Dec 2011 23:22:40 +0200 Andriy Gapon a...@freebsd.org wrote:
> on 19/12/2011 17:50 Nathan Whitehorn said the following:
> > The thing I've seen is that ULE is substantially more enthusiastic about migrating processes between cores than 4BSD.
> Hmm, this seems to be contrary to my theoretical expectations. I thought that with 4BSD all threads that were not in one of the following categories:
> - temporarily pinned
> - bound to a cpu in the kernel via sched_bind
> - belonging to a cpu set which is a strict subset of the total set
> were placed onto a common queue that was shared by all cpus, and as such I expected them to get picked up by the cpus semi-randomly. In other words, I thought that it was ULE that took into account cpu/cache affinities while 4BSD was deliberately entirely ignorant of those details.

I have a 6-core AMD CPU running FreeBSD 10.0 and SCHED_4BSD. I've noticed with large ports builds which are not MAKE_JOBS_SAFE that the compile load migrates between the cores pretty quickly, but I haven't compared it to ULE.

--
Gary Jennejohn
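[For anyone who wants to watch that migration directly rather than infer it from build behaviour, a minimal sketch follows. It assumes an SMP kernel, where top(1) prints a "C" column showing the CPU each thread last ran on; the process name cc1 and the exact flags are illustrative, so check top(1) on your release.]

#!/bin/sh
# Sample the last-CPU column for compiler threads once a second.
# Watching the "C" column flip between cores shows migration.
while true; do
    top -b -SH 40 | grep cc1    # "cc1" is just an example process name
    sleep 1
done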
Re: SCHED_ULE should not be the default
On 12/18/11 04:34, Adrian Chadd wrote:
> The trouble is that there's lots of anecdotal evidence, but no one's really gone digging deep into _their_ example of why it's broken. The developers who know this stuff don't see anything wrong. That hints to me it may be something a little more creepy - as an example, the interplay between netisr/swi/taskqueue/callbacks and such. It may be that something is being starved that isn't obviously obvious. It's just a stab in the dark, but it sounds somewhat plausible based on what I've seen ULE do in my network throughput hacking.
> I applaud reppie for trying to make it as easy as possible for people to use KTR to provide scheduler traces for him to go digging with, so please, if you have these issues and you can absolutely reproduce them, please follow his instructions and work with him to get him what he needs.

The thing I've seen is that ULE is substantially more enthusiastic about migrating processes between cores than 4BSD. Often this is a good thing, but it can increase the rate of cache misses, hurting performance for cache-bound processes (I see this particularly in HPC-type scientific workloads). It might be interesting to add some kind of tunable here.

Another more interesting and slightly longer-term possibility, if someone wants a project, would be to integrate scheduling decisions with hwpmc counters: accumulate statistics on cache hits at each context switch and preferentially keep processes with a high hit/miss ratio on the same thread/cache domain relative to processes with a low one.
-Nathan

P.S. The other thing that could be very interesting from a research and scheduling standpoint would be to integrate heterogeneous SMP support into the operating system, with a FreeBSD-4 Application Processor syscall model. We seem to be going down the road where GPGPU computing has MMUs, timer interrupts, IPIs, etc. (the next AMD Fusions, IBM Cell), as well as potential systems with both x86 and ARM cores. This is something that no operating system currently supports well, and would be a place for BSD to shine. If anyone has a free graduate student...
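[As a data point for the hwpmc idea: per-process cache behaviour can already be measured from userland with pmcstat(8), without touching the scheduler. A minimal sketch, assuming hwpmc(4) is loaded and the CPU exposes a last-level-cache miss event; the event name below is a placeholder, since real names are CPU-specific.]

#!/bin/sh
# Count cache misses for one run of a (hypothetical) cache-bound program.
kldload hwpmc 2>/dev/null            # no-op if already loaded
# "LLC_MISSES" is a placeholder event name; substitute one from
# the output of: pmccontrol -L
pmcstat -p LLC_MISSES ./my-hpc-app   # ./my-hpc-app is hypothetical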
Re: SCHED_ULE should not be the default
on 19/12/2011 17:50 Nathan Whitehorn said the following:
> The thing I've seen is that ULE is substantially more enthusiastic about migrating processes between cores than 4BSD.

Hmm, this seems to be contrary to my theoretical expectations. I thought that with 4BSD all threads that were not in one of the following categories:
- temporarily pinned
- bound to a cpu in the kernel via sched_bind
- belonging to a cpu set which is a strict subset of the total set
were placed onto a common queue that was shared by all cpus, and as such I expected them to get picked up by the cpus semi-randomly. In other words, I thought that it was ULE that took into account cpu/cache affinities while 4BSD was deliberately entirely ignorant of those details.

--
Andriy Gapon
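[For reference, the third category above is the one users can create from the command line: cpuset(1) restricts a process to a strict subset of CPUs, which takes it off the shared queue's run-anywhere path even under 4BSD. A small illustration; the pinned command and the pid are arbitrary.]

# Restrict a build to CPUs 0 and 1 only, a strict subset of the machine,
# so the scheduler may not run it anywhere else.
cpuset -l 0-1 make -j2 buildworld

# Inspect which CPUs an existing process is allowed to use.
cpuset -g -p 1234    # 1234 is a placeholder pid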
Re: SCHED_ULE should not be the default
On Mon Dec 19 11, Nathan Whitehorn wrote:
> [...]
> The thing I've seen is that ULE is substantially more enthusiastic about migrating processes between cores than 4BSD. Often, this is a good thing, but can increase the rate of cache misses, hurting performance for cache-bound processes (I see this particularly in HPC-type scientific workloads). It might be interesting to add some kind of tunable here.

does r228718 have any impact regarding this behaviour?

cheers.
alex
Re: SCHED_ULE should not be the default
On Sun Dec 18 11, Andrey Chernov wrote:
> On Sun, Dec 18, 2011 at 05:51:47PM +1100, Ian Smith wrote:
> > [...] As a reproducible data point, running 'dd if=/dev/random of=/dev/null' in one konsole, specifically to heat the CPU while testing my manual fan control script, hogs it up pretty much [...] Switching back to the first konsole (on another desktop) to kill the dd can also take a couple/few seconds.
> This issue is not about a slow machine under load: the same slow machine under exactly the same load, but with SCHED_4BSD, is very fast to respond interactively. I think we should not confuse interactivity with speed. I see no big speed (i.e. compilation time) differences when switching schedulers, but I do see a big _interactivity_ difference. ULE in general tends to underestimate interactive processes in favour of background ones. That perhaps helps compilation, but it feels like a slowpoke OS from the interactive user experience.

+1

i've also experienced issues with ULE and performed several tests to compare it to the historical 4BSD scheduler. the difference between the two does *not* seem to be speed (at least not a huge difference), but interactivity. one of the tests i performed was the following:

ttyv0: untar a *huge* (+10G) archive
ttyv1: after ~30 seconds of untarring, do 'ls -la $directory', where $directory contains a lot of files. i used $directory = /var/db/portsnap, because that directory contains 23117 files on my machine.

measuring 'ls -la $directory' via time(1) revealed that SCHED_ULE takes 15 seconds, whereas SCHED_4BSD only takes ~3-5 seconds.

i think the issue is io. io operations usually get a high priority, because statistics have shown that - unlike computational tasks - io-intensive tasks only run for a small fraction of time and then exit: read data - change data - write back data. so SCHED_ULE might take these statistics too literally and give tasks like bsdtar(1) (in my case) too many resources, so other tasks which require io are struggling to get some resources assigned to them (ls(1) in my case).

of course SCHED_4BSD isn't perfect, too. try using it and run the stress2 testsuite. your whole system will grind to a halt. mouse input drops below 1 Hz. even after killing all the stress2 tests, it takes a few minutes before the system becomes snappy again.

cheers.
alex
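[alex's test is easy to script, so others can reproduce it under both schedulers. A minimal sketch, assuming a large archive at ~/big.tar and using /var/db/portsnap/files as the file-heavy directory, per the correction in the next message.]

#!/bin/sh
# Interactivity probe: start a huge untar, let the io load build up,
# then time a directory listing that has to compete with it.
mkdir -p /tmp/untar-test
tar xf ~/big.tar -C /tmp/untar-test &    # ~/big.tar is a placeholder
sleep 30
time ls -la /var/db/portsnap/files > /dev/null
wait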
Re: SCHED_ULE should not be the default
On Sun Dec 18 11, Alexander Best wrote:
> [...] i used $directory = /var/db/portsnap, because

s/portsnap/portsnap\/files/

> that directory contains 23117 files on my machine. [...]
Re: SCHED_ULE should not be the default
On 18/12/2011 10:34, Adrian Chadd wrote:
> I applaud reppie for trying to make it as easy as possible for people to use KTR to provide scheduler traces for him to go digging with, so please, if you have these issues and you can absolutely reproduce them, please follow his instructions and work with him to get him what he needs.

Who's 'reppie'?

--
Bruce Cran
Re: SCHED_ULE should not be the default
On 12/18/11 03:37, Bruce Cran wrote:
> On 13/12/2011 09:00, Andrey Chernov wrote:
> > I observe ULE interactivity slowness even on a single core machine (Pentium 4) in very visible places, like 'ps ax' output getting stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all slowness is gone.
> I'm also seeing problems with ULE on a dual-socket quad-core Xeon machine with 16 logical CPUs. If I run tar xf somefile.tar and make -j16 buildworld then logging into another console can take several seconds. Sometimes even the Password: prompt can take a couple of seconds to appear after typing my username.

I reported several problems ages ago using SCHED_ULE on FreeBSD 8/9 when doing heavy I/O, either disk or network bound (at that time I noticed the problem on servers doing heavy disk or net I/O). It was suspected that X could be the problem, but we also have a Dell PowerEdge 1950III running FreeBSD 8.2-STABLE (by next week 9.0-RC[2/3]/STABLE) without X - the same problems, just not as prominent as with X. The box has 8 cores, 4 per socket, 16 GB RAM, a SAS 6/iR controller and two PCI-X attached Broadcom NetXtreme NICs, so the hardware shouldn't be any kind of trouble. But at that time (over the past two years now), the problem was considered a personal problem. Bah!

By the beginning of next year my working group expects new hardware. Since we use Linux for scientific work (due to OpenCL and CUDA on TESLA cards), I can't use the Blade system. One of the boxes I expect is a Dell Precision T7500, 96 GB RAM, two sockets with a Westmere XEON in each, for a total of 12 cores/24 threads. I'll start with a dual-OS installation of FreeBSD 10 and the most recent Suse (since the development for the C2075 TESLA board is mostly done by my colleagues on Suse, I need Suse Linux). I will then be capable of performing some benchmarks with both OSes on the very same hardware.

The other box will be my desk's box, a brand new Sandy Bridge-E CPU (i7-3960X) with 32 GB RAM. I'm also inclined to install a dual-boot box (I rejected this up to now since I do not like installing GRUB2 for multiboot when using GPT on FreeBSD). The box will run FreeBSD 9 and either Ubuntu or Gentoo Linux; I'm unsure on the question of which Linux, but I tend towards Gentoo, to compile everything myself. On this box I can also perform benchmarks with several setups.

I look forward to getting some help and/or tips to verify the issues we discussed here.

Oliver
Re: SCHED_ULE should not be the default
Hi,

What Attilio and others need are KTR traces of the most stripped-down example of interactivity-busting workload you can find.

Eg: if you're doing 32 concurrent buildworlds and trying to test interactivity - fine, but that's going to result in a lot of KTR stuff. If you can reproduce it using a dd via /dev/null and /dev/random (like another poster did) with nothing else running, then even better. If you can do it without X running, even better.

I honestly suggest ignoring benchmarks for now and concentrating on interactivity.

Adrian
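[For anyone unsure how to produce such a trace, the usual recipe is sketched below; it assumes a custom kernel and the stock schedgraph.py from the source tree, and the entry count is only a suggestion.]

# 1. Build and boot a kernel with scheduler tracing compiled in:
#      options KTR
#      options KTR_ENTRIES=262144
#      options KTR_COMPILE=(KTR_SCHED)
#      options KTR_MASK=(KTR_SCHED)
# 2. Reproduce the stall, then freeze and dump the trace buffer:
sysctl debug.ktr.mask=0
ktrdump -e /boot/kernel/kernel -m /dev/mem -ct > ktr.out
# 3. Visualize the trace:
python /usr/src/tools/sched/schedgraph.py ktr.out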
Re: SCHED_ULE should not be the default
On Wed, 14 Dec 2011, Ivan Klymenko wrote:
> On Wed, 14 Dec 2011 00:04:42 +0100, Jilles Tjoelker jil...@stack.nl wrote:
> > On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote:
> > > If the algorithm ULE does not contain problems - it means the problem is with the Core2Duo, or in a piece of code that uses the ULE scheduler. I already wrote in a mailing list that specifically in my case (Core2Duo) the following patch partially helps:
> > > --- sched_ule.c.orig    2011-11-24 18:11:48.000000000 +0200
> > > +++ sched_ule.c 2011-12-10 22:47:08.000000000 +0200
> > > ...
> > > @@ -2118,13 +2119,21 @@
> > >  	struct td_sched *ts;
> > >  	THREAD_LOCK_ASSERT(td, MA_OWNED);
> > > +	if (td->td_pri_class & PRI_FIFO_BIT)
> > > +		return;
> > > +
> > >  	ts = td->td_sched;
> > > +	/*
> > > +	 * We used up one time slice.
> > > +	 */
> > > +	if (--ts->ts_slice > 0)
> > > +		return;
> > This skips most of the periodic functionality (long term load balancer, saving switch count (?), insert index (?), interactivity score update for long running thread) if the thread is not going to be rescheduled right now. It looks wrong, but it is a data point if it helps your workload.
> Yes, I did it to delay for as long as possible the execution of the code in this section:

I don't understand what you are doing here, but I recently noticed that the timeslicing in SCHED_4BSD is completely broken. This bug may be a feature.

SCHED_4BSD doesn't have its own timeslice counter like ts_slice above. It uses `switchticks' instead. But switchticks hasn't been usable for this purpose since long before SCHED_4BSD started using it for this purpose. switchticks is reset on every context switch, so it is useless for almost all purposes -- any interrupt activity on a non-fast interrupt clobbers it.

Removing the check of ts_slice in the above and always returning might give a similar bug to the SCHED_4BSD one. I noticed this while looking for bugs in realtime scheduling. In the above, returning early for PRI_FIFO_BIT also skips most of the periodic functionality. In SCHED_4BSD, returning early is the usual case, so the PRI_FIFO_BIT might as well not be checked, and it is the unusual fifo scheduling case (which is supposed to only apply to realtime priority threads) which has a chance of working as intended, while the usual roundrobin case degenerates to an impure form of fifo scheduling (it is impure since priority decay still works, so it is only fifo among threads of the same priority).

> > > ...
> > > @@ -2144,9 +2153,6 @@
> > >  		if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
> > >  			tdq->tdq_ridx = tdq->tdq_idx;
> > >  	}
> > > -	ts = td->td_sched;
> > > -	if (td->td_pri_class & PRI_FIFO_BIT)
> > > -		return;
> > >  	if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
> > >  		/*
> > >  		 * We used a tick; charge it to the thread so
> > > @@ -2157,11 +2163,6 @@
> > >  		sched_priority(td);
> > >  	}
> > >  	/*
> > > -	 * We used up one time slice.
> > > -	 */
> > > -	if (--ts->ts_slice > 0)
> > > -		return;
> > > -	/*
> > >  	 * We're out of time, force a requeue at userret().
> > >  	 */
> > >  	ts->ts_slice = sched_slice;

With the ts_slice check here before you moved it, removing it might give buggy behaviour closer to SCHED_4BSD.

> and refusal to use options FULL_PREEMPTION

4-5 years ago, I found that any form of PREEMPTION was a pessimization for at least makeworld (since it caused too many context switches). PREEMPTION was needed for the !SMP case, at least partly because of the broken switchticks (switchticks, when it works, gives voluntary yielding by some CPU hogs in the kernel; PREEMPTION, if it works, should do this better). So I used PREEMPTION in the !SMP case and not for the SMP case. I didn't worry about the CPU hogs in the SMP case, since it is rare to have more than 1 of them and 1 will use at most 1/2 of a multi-CPU system.

> But no one has replied to my letter to say whether my patch helps or not in the case of Core2Duo... There is a suspicion that the problems stem from the sections of code associated with SMP... Maybe I'm wrong about something, but I want to help in solving this problem ...

The main point of SCHED_ULE is to give better affinity for multi-CPU systems. But the `multi' apparently needs to be strictly more than 2 for it to break even.

Bruce
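[For readers who want to look at the slice machinery Bruce is describing without patching anything, ULE exports its knobs as sysctls; the names below are as of 8.x/9.x and may vary between versions.]

# List every scheduler tunable the running kernel exposes.
sysctl kern.sched
# The ones most relevant to this thread:
sysctl kern.sched.slice           # length of a time slice
sysctl kern.sched.preempt_thresh  # priority threshold for preemption
sysctl kern.sched.steal_thresh    # run-queue depth before idle CPUs steal work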
Re: SCHED_ULE should not be the default
On Thu, Dec 15, 2011 at 05:26:27PM +0100, Attilio Rao wrote:
> 2011/12/13 Jeremy Chadwick free...@jdc.parodius.com:
> > [...]
> Hi Jeremy,
> thanks for the time you spent on this. However, I wanted to ask/let you note 3 things:
> 1) Did you use 2 different code bases for the test? (one updated on December 1 and another one on December 12)

No; src-all (/usr/src on this system) was not updated between December 1st and December 12th PST. I do believe I updated it today (15th PST). I can/will obviously hold off so that we have a consistent code base for comparing numbers between schedulers during buildworld and/or buildkernel.

> 2) Please note that you should have repeated this test several times (basically until you get a standard deviation which is acceptable with ministat) and report the ministat output

This is the first time I have heard of ministat(1). I'm pretty sure I see what it's for and how it applies to this situation, but boy, that man page could use some clarification (I have 3 people looking at this thing right now trying to figure out what means what in the graph :-) ). Anyway, graph or not, I see the point.

Regarding multiple tests: yup, you're absolutely right; the only way to do it would be to run a sequence of tests repeatedly (probably 10 per scheduler). Reboots and rm -fr /usr/obj/* would be required after each test too, to guarantee empty kernel caches (of all types) consistently every time. What I posted was supposed to give people just a general idea whether there was any gigantic difference between the two, and there really isn't. But, as others have stated (and you below), buildworld may not be an effective way to benchmark what we're trying to test. Hence me wondering exactly what would make for a good test. Example:

1. Run + background some program that beats on things (I really don't know what; creation/deletion of threads? CPU benchmark? bonnie++?), with output going to /dev/null.
2. Run + background "time make -j2 buildworld" with output going to /dev/null.
3. Record/save output from time.
4. rm -fr /usr/obj && shutdown -r now
5. Repeat all steps ~10 times.
6. Adjust kernel configuration file to use the other scheduler.
7. Repeat steps 1-5.

What I'm trying to figure out is what #1 and #2 should be in the above example.

> 3) The difference is less than 2%, which I suspect is really statistically unuseful/the same
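[For anyone else meeting ministat(1) for the first time, here is a minimal sketch of how it would be used for this comparison; the file names and timings are made up.]

# One wall-clock time (seconds) per line, one file per scheduler.
cat > ule.txt <<EOF
1126.20
1131.48
1128.90
EOF
cat > 4bsd.txt <<EOF
1032.02
1029.77
1035.11
EOF
# ministat plots both datasets and reports whether the difference is
# statistically significant at the requested confidence level.
ministat -c 95 ule.txt 4bsd.txt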
Re: SCHED_ULE should not be the default
On Sun, 18 Dec 2011 02:37:52 +0000, Bruce Cran wrote:
> On 13/12/2011 09:00, Andrey Chernov wrote:
> > I observe ULE interactivity slowness even on a single core machine (Pentium 4) in very visible places, like 'ps ax' output getting stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all slowness is gone.
> I'm also seeing problems with ULE on a dual-socket quad-core Xeon machine with 16 logical CPUs. If I run tar xf somefile.tar and make -j16 buildworld then logging into another console can take several seconds. Sometimes even the Password: prompt can take a couple of seconds to appear after typing my username.

I'd resigned myself to expecting this sort of behaviour as 'normal' on my single core 1133MHz PIII-M. As a reproducible data point, running 'dd if=/dev/random of=/dev/null' in one konsole, specifically to heat the CPU while testing my manual fan control script, hogs it up pretty much, while regularly running the script below in another konsole to check values - which often gets stuck half way, occasionally pausing _twice_ before finishing. Switching back to the first konsole (on another desktop) to kill the dd can also take a couple/few seconds.

t23# cat /root/bin/t23stat
#!/bin/sh
echo -n `date`
sysctl dev.cpu.0.freq dev.cpu.0.cx_usage
sysctl dev.acpi_ibm | egrep 'fan_|thermal'
sysctl hw.acpi.thermal.tz0.temperature
acpiconf -i0 | egrep 'State|Remain|Present|Volt'

Sure it's a slow machine, but it normally runs pretty smoothly. Anything with a bit of disk i/o, like buildworld, runs smooth as. This is on 8.2-R GENERIC, HZ=1000, 768MB with lots free, no swap in use. I'll definitely be trying SCHED_4BSD after updating to 8-stable unless a 'miracle cure' appears beforehand.

cheers, Ian
Re: SCHED_ULE should not be the default
The trouble is that there's lots of anecdotal evidence, but no one's really gone digging deep into _their_ example of why it's broken. The developers who know this stuff don't see anything wrong. That hints to me it may be something a little more creepy - as an example, the interplay between netisr/swi/taskqueue/callbacks and such. It may be that something is being starved that isn't obviously obvious.

It's just a stab in the dark, but it sounds somewhat plausible based on what I've seen ULE do in my network throughput hacking.

I applaud reppie for trying to make it as easy as possible for people to use KTR to provide scheduler traces for him to go digging with, so please, if you have these issues and you can absolutely reproduce them, please follow his instructions and work with him to get him what he needs.

Adrian
(wow, lots of personal pronouns packed into one sentence. It must be sleep time.)
Re: SCHED_ULE should not be the default
On 13/12/2011 09:00, Andrey Chernov wrote:
> I observe ULE interactivity slowness even on a single core machine (Pentium 4) in very visible places, like 'ps ax' output getting stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all slowness is gone.

I'm also seeing problems with ULE on a dual-socket quad-core Xeon machine with 16 logical CPUs. If I run tar xf somefile.tar and make -j16 buildworld then logging into another console can take several seconds. Sometimes even the Password: prompt can take a couple of seconds to appear after typing my username.

--
Bruce Cran
Re: SCHED_ULE should not be the default
On Sun, Dec 18, 2011 at 05:51:47PM +1100, Ian Smith wrote:
> On Sun, 18 Dec 2011 02:37:52 +0000, Bruce Cran wrote:
> > On 13/12/2011 09:00, Andrey Chernov wrote:
> > > I observe ULE interactivity slowness even on a single core machine (Pentium 4) in very visible places, like 'ps ax' output getting stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all slowness is gone.
> > I'm also seeing problems with ULE on a dual-socket quad-core Xeon machine with 16 logical CPUs. If I run tar xf somefile.tar and make -j16 buildworld then logging into another console can take several seconds. Sometimes even the Password: prompt can take a couple of seconds to appear after typing my username.
> I'd resigned myself to expecting this sort of behaviour as 'normal' on my single core 1133MHz PIII-M. As a reproducible data point, running 'dd if=/dev/random of=/dev/null' in one konsole, specifically to heat the CPU while testing my manual fan control script, hogs it up pretty much, while regularly running the script below in another konsole to check values - which often gets stuck half way, occasionally pausing _twice_ before finishing. Switching back to the first konsole (on another desktop) to kill the dd can also take a couple/few seconds.

This issue is not about a slow machine under load: the same slow machine under exactly the same load, but with SCHED_4BSD, is very fast to respond interactively. I think we should not confuse interactivity with speed. I see no big speed (i.e. compilation time) differences when switching schedulers, but I do see a big _interactivity_ difference. ULE in general tends to underestimate interactive processes in favour of background ones. That perhaps helps compilation, but it feels like a slowpoke OS from the interactive user experience.

--
http://ache.vniz.net/
Re: SCHED_ULE should not be the default
On Thu, Dec 15, 2011 at 9:58 PM, Mike Tancsa m...@sentex.net wrote:
> On 12/15/2011 11:56 AM, Attilio Rao wrote:
> > So, as very first thing, can you try the following: [...]
> Results and data at http://www.tancsa.com/ule-bsd.html
> ---Mike

I took the liberty of re-plotting this as one boxplot per test type, in the hope of getting a better overview. R script included. Beware the y-ranges. (To re-plot with a specific y range, add e.g. ylim=c(0,35) to the boxplot() calls.)

http://nebdal.net/sched/plot.html

--
Daniel Nebdal
Dep. of genetics, Oslo University Hospital
Re: SCHED_ULE should not be the default
2011/12/14 Mike Tancsa m...@sentex.net:
> On 12/13/2011 7:01 PM, m...@freebsd.org wrote:
> > Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ? [...]
> FWIW, this does impact the performance of pbzip2 on an i7. Using a 1.1G file, pbzip2 -v -c big > /dev/null with burnP6 running in the background, sysctl kern.sched.steal_thresh=1 vs sysctl kern.sched.steal_thresh=3:
> [...]
> a value of 1 is *slightly* faster.

Hi Mike,
was that just the same codebase with the switch SCHED_4BSD/SCHED_ULE?

Also, the results here should be in the 3% interval for the avg case, which is not yet at the 'alarm level' but could still be an indication. I still suspect I/O plays a big role here, however, thus it could be determined by other factors.

Could you retry the bench checking CPU usage and possible thread migration around for both cases?

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein
Re: SCHED_ULE should not be the default
2011/12/13 Jeremy Chadwick free...@jdc.parodius.com:
> On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:
> > > Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...]
> > Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intend also to look at performance with both schedulers available.
> This is in no way shape or form the same kind of benchmark as what you're planning to do, but I thought I'd throw it out there for folks to take in as they see fit. I know folks were focused mainly on buildworld. I personally would find it interesting if someone with a higher-end system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) was to do the same test (changing -jX to -j{numofcores} of course).
>
> sched_ule
> =========
> - time make -j2 buildworld
>   1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w
> - time make -j2 buildkernel
>   640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w
>
> sched_4bsd
> ==========
> - time make -j2 buildworld
>   1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w
> - time make -j2 buildkernel
>   638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w
>
> software
> ========
> * sched_ule test:  FreeBSD 8.2-STABLE, Thu Dec  1 04:37:29 PST 2011
> * sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011

Hi Jeremy,
thanks for the time you spent on this. However, I wanted to ask/let you note 3 things:
1) Did you use 2 different code bases for the test? (one updated on December 1 and another one on December 12)
2) Please note that you should have repeated this test several times (basically until you get a standard deviation which is acceptable with ministat) and report the ministat output
3) The difference is less than 2%, which I suspect is really statistically unuseful/the same

I'm not really even surprised ULE is not faster than 4BSD in this case, because buildworld/buildkernel tests are usually driven for the vast majority by I/O overhead rather than scheduler capacity. It would be more interesting to analyze how buildworld does while another type of workload is going on.

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein
Re: SCHED_ULE should not be the default
On 12/15/2011 11:26 AM, Attilio Rao wrote:
> Hi Mike,
> was that just the same codebase with the switch SCHED_4BSD/SCHED_ULE?

Hi Attilio,
It was the same codebase.

> Could you retry the bench checking CPU usage and possible thread migration around for both cases?

I can, but how do I do that?

	---Mike

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
Re: SCHED_ULE should not be the default
2011/12/15 Mike Tancsa m...@sentex.net:
> On 12/15/2011 11:26 AM, Attilio Rao wrote:
> > Hi Mike,
> > was that just the same codebase with the switch SCHED_4BSD/SCHED_ULE?
> Hi Attilio,
> It was the same codebase.
> > Could you retry the bench checking CPU usage and possible thread migration around for both cases?
> I can, but how do I do that?

I'm thinking now of a better test case for this: can you try that on a tmpfs volume? Also, what filesystem were you using? How many CPUs were in place? Did you reboot before moving the steal_thresh value?

Attilio

--
Peace can only be achieved by understanding - A. Einstein
Re: SCHED_ULE should not be the default
On 12/15/2011 11:42 AM, Attilio Rao wrote:
> I'm thinking now of a better test case for this: can you try that on a tmpfs volume?

There is enough RAM in the box so that it should not touch the disk, and I was sending the output to /dev/null, so it was not writing to the disk.

> Also, what filesystem were you using?

UFS

> How many CPUs were in place?

4

> Did you reboot before moving the steal_thresh value?

No.

	---Mike

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
Re: SCHED_ULE should not be the default
2011/12/15 Mike Tancsa m...@sentex.net:
> On 12/15/2011 11:42 AM, Attilio Rao wrote:
> > I'm thinking now of a better test case for this: can you try that on a tmpfs volume?
> There is enough RAM in the box so that it should not touch the disk, and I was sending the output to /dev/null, so it was not writing to the disk.
> > Also, what filesystem were you using?
> UFS
> > How many CPUs were in place?
> 4
> > Did you reboot before moving the steal_thresh value?
> No.

So, as very first thing, can you try the following:
- Same codebase, etc. etc.
- Make the test 4 times, discard the first and ministat the other 3
- Reboot
- Change the steal_thresh value
- Make the test 4 times, discard the first and ministat the other 3

Then report the discarded values and the ministated ones, and we will have more information, I guess (also, I don't think devfs contention should play a role here, thus nevermind about it for now).

Thanks,
Attilio

--
Peace can only be achieved by understanding - A. Einstein
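[A sketch of one phase of that protocol as a script, assuming the benchmark is the pbzip2 run from earlier in the thread; "big" is a placeholder input file, and the reboot plus the steal_thresh change between phases still happen by hand.]

#!/bin/sh
# Four runs at the current steal_thresh value; discard run 1 and
# feed the times from runs 2-4 to ministat(1).
sysctl kern.sched.steal_thresh
for run in 1 2 3 4; do
    /usr/bin/time -o run-$run.txt pbzip2 -c big > /dev/null
done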
Re: SCHED_ULE should not be the default
On 12/15/11 15:20, Steven Hartland wrote:
> With all the discussion I thought I'd give a buildworld benchmark a go here on a spare 24 core machine. ULE tested fine, but with 4BSD it won't even boot, panicking with the following:
> http://screensnapr.com/v/hwysGV.png
> This is on a clean 8.2-RELEASE-p4. Upgrading to RELENG_9 fixed this, but it's a bit concerning that just changing the scheduler would cause the machine to panic on boot.
> It's only a single run so variance could be high, but here's the result of a buildworld on this machine running the two different schedulers:
> 4BSD: 24m54.10s real 2h43m12.42s user 56m20.07s sys
> ULE:  23m54.68s real 2h34m59.04s user 50m59.91s sys
> What really sticks out is that this is over double that of an 8.2 buildworld on the same machine with the same kernel:
> ULE:  11m12.76s real 1h27m59.39s user 28m59.57s sys
> This was run on a 9.0-PRERELEASE kernel due to 4BSD panicking on boot under 8.2. So for this use ULE vs 4BSD is neither here nor there, but 9.0 buildworld being very slow (x2 slower) compared with 8.2 is the bigger question in my mind.
> Regards
> Steve

All of our 8.2-STABLE boxes with ncpu >= 4 compile the OS in half the time a compilation of FreeBSD 9/10 needs. I guess this is due to the huge LLVM contribution which is now part of the source tree. Even if you allow for building the whole LLVM suite (and not just the pieces of it that FreeBSD builds by default for CLANG purposes), it takes another 10 to 20 minutes, depending on the architecture of the underlying host. Taking the time of a kernel or world build and then presenting the inverse of that number isn't a good benchmark, in my opinion. I therefore prefer artificial benchmarks: have a set of programs that can be compiled, and take the time, if compilation time is what matters.

Well, your one-shot test shows that there is indeed a marginal advantage for SCHED_ULE, if the number of cores is big enough (said to be n > 2 in this thread). But I'm a bit disappointed by the very small advantage on that 24 core hog.

Oliver
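[Oliver's LLVM hypothesis is testable: the 9.x source tree grew in-tree clang/LLVM, and src.conf(5) has a knob to exclude it. A hedged sketch; verify the knob name against src.conf(5) for your branch.]

# Time a world build without the in-tree clang/LLVM bits to see how
# much of the 8.2 -> 9.0 buildworld slowdown they account for.
echo 'WITHOUT_CLANG=yes' >> /etc/src.conf
cd /usr/src && rm -rf /usr/obj/* && time make -j24 buildworld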
Re: SCHED_ULE should not be the default
On 12/15/2011 11:56 AM, Attilio Rao wrote:
> So, as very first thing, can you try the following:
> - Same codebase, etc. etc.
> - Make the test 4 times, discard the first and ministat the other 3
> - Reboot
> - Change the steal_thresh value
> - Make the test 4 times, discard the first and ministat the other 3
> Then report the discarded values and the ministated ones, and we will have more information, I guess (also, I don't think devfs contention should play a role here, thus nevermind about it for now).

Results and data at http://www.tancsa.com/ule-bsd.html

	---Mike

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
Re: SCHED_ULE should not be the default
2011/12/15 Mike Tancsa m...@sentex.net:
> On 12/15/2011 11:56 AM, Attilio Rao wrote:
> > So, as very first thing, can you try the following: [...]
> Results and data at http://www.tancsa.com/ule-bsd.html

I'm not totally sure: what does burnP6 do? Is it a CPU-bound workload? Also, how many threads are spawned in your case for parallel bzip2?

Also, it would be very good if you could arrange these tests against newer -CURRENT (with userland and kerneland debugging off).

Thanks a lot for your hard work,
Attilio

--
Peace can only be achieved by understanding - A. Einstein
Re: SCHED_ULE should not be the default
On 12/13/2011 7:01 PM, m...@freebsd.org wrote:
> Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ?
> I don't remember what our specific problem at $WORK was, perhaps it was just interrupt threads not getting serviced fast enough, but we've hard-coded this to 1 and removed the code that sets it in sched_initticks(). The same effect should be had by setting the sysctl after a box is up.

FWIW, this does impact the performance of pbzip2 on an i7. Using a 1.1G file, pbzip2 -v -c big > /dev/null with burnP6 running in the background, sysctl kern.sched.steal_thresh=1 vs sysctl kern.sched.steal_thresh=3:

    N           Min           Max        Median           Avg        Stddev
x  10     38.005022      38.42238     38.194648     38.165052    0.15546188
+   9     38.695417     40.595544     39.392127     39.435384    0.59814114
Difference at 95.0% confidence
        1.27033 +/- 0.412636
        3.32852% +/- 1.08119%
        (Student's t, pooled s = 0.425627)

a value of 1 is *slightly* faster.

--
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada http://www.tancsa.com/
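[A sketch of Mike's comparison for anyone wanting to repeat it; burnP6 comes from the sysutils/cpuburn port, and "big" stands in for any ~1 GB file.]

#!/bin/sh
# Compare pbzip2 wall time under two steal_thresh values, with a
# CPU hog running in the background.
burnP6 &
for thresh in 1 3; do
    sysctl kern.sched.steal_thresh=$thresh
    for run in 1 2 3 4 5 6 7 8 9 10; do
        /usr/bin/time -a -o thresh-$thresh.txt pbzip2 -c big > /dev/null
    done
done
kill %1
# Extract the real-time column from each output file and compare the
# two datasets with ministat(1).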
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 02:22:48AM -0800, Adrian Chadd wrote: On 13 December 2011 01:00, Andrey Chernov a...@freebsd.org wrote: If the ULE algorithm does not contain problems, then the problem is in the Core2Duo, or in a piece of code that uses the ULE scheduler. I observe ULE interactivity slowness even on a single-core machine (Pentium 4) in very visible places; for example, 'ps ax' output gets stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all the slowness is gone. Are you able to provide KTR traces of the scheduler results? Something that can be fed to schedgraph? Sorry, this machine is not mine anymore. I tried SCHED_ULE on a Core 2 Duo instead and don't notice this effect, but it is overall pretty fast compared to that Pentium 4. -- http://ache.vniz.net/
Re: SCHED_ULE should not be the default
On Wed, 14 Dec 2011 21:34:35 +0400 Andrey Chernov a...@freebsd.org wrote: On Tue, Dec 13, 2011 at 02:22:48AM -0800, Adrian Chadd wrote: On 13 December 2011 01:00, Andrey Chernov a...@freebsd.org wrote: If the ULE algorithm does not contain problems, then the problem is in the Core2Duo, or in a piece of code that uses the ULE scheduler. I observe ULE interactivity slowness even on a single-core machine (Pentium 4) in very visible places; for example, 'ps ax' output gets stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all the slowness is gone. Are you able to provide KTR traces of the scheduler results? Something that can be fed to schedgraph? Sorry, this machine is not mine anymore. I tried SCHED_ULE on a Core 2 Duo instead and don't notice this effect, but it is overall pretty fast compared to that Pentium 4. Please give me detailed instructions on how to do it and I'll do it ... It would be a shame if this topic once again ends in nothing but discussion ... :(
Re: SCHED_ULE should not be the default
On 12/12/2011 05:47, O. Hartmann wrote: Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found. I switched to 4BSD, problem gone. This is on 2 separate systems with Core 2 Duos. hth, Doug If the ULE algorithm does not contain problems, then the problem is in the Core2Duo, or in a piece of code that uses the ULE scheduler. I already wrote to the mailing list that, specifically in my case (Core2Duo), the following patch partially helps:

--- sched_ule.c.orig	2011-11-24 18:11:48.000000000 +0200
+++ sched_ule.c	2011-12-10 22:47:08.000000000 +0200
@@ -794,7 +794,8 @@
 	 * 1.5 * balance_interval.
 	 */
 	balance_ticks = max(balance_interval / 2, 1);
-	balance_ticks += random() % balance_interval;
+//	balance_ticks += random() % balance_interval;
+	balance_ticks += ((int)random()) % balance_interval;
 	if (smp_started == 0 || rebalance == 0)
 		return;
 	tdq = TDQ_SELF();
@@ -2118,13 +2119,21 @@
 	struct td_sched *ts;
 
 	THREAD_LOCK_ASSERT(td, MA_OWNED);
+	if (td->td_pri_class & PRI_FIFO_BIT)
+		return;
+	ts = td->td_sched;
+	/*
+	 * We used up one time slice.
+	 */
+	if (--ts->ts_slice > 0)
+		return;
 	tdq = TDQ_SELF();
 #ifdef SMP
 	/*
 	 * We run the long term load balancer infrequently on the first cpu.
 	 */
-	if (balance_tdq == tdq) {
-		if (balance_ticks && --balance_ticks == 0)
+	if (balance_ticks && --balance_ticks == 0) {
+		if (balance_tdq == tdq)
 			sched_balance();
 	}
 #endif
@@ -2144,9 +2153,6 @@
 		if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
 			tdq->tdq_ridx = tdq->tdq_idx;
 	}
-	ts = td->td_sched;
-	if (td->td_pri_class & PRI_FIFO_BIT)
-		return;
 	if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
 		/*
 		 * We used a tick; charge it to the thread so
@@ -2157,11 +2163,6 @@
 		sched_priority(td);
 	}
 	/*
-	 * We used up one time slice.
-	 */
-	if (--ts->ts_slice > 0)
-		return;
-	/*
 	 * We're out of time, force a requeue at userret().
 	 */
 	ts->ts_slice = sched_slice;

and refusing to use options FULL_PREEMPTION. But no one has replied to my letter saying whether my patch helps or not in the case of a Core2Duo... There is a suspicion that the problems stem from the sections of code associated with SMP... Maybe I'm wrong about something, but I want to help in solving this problem ...
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote: On 12/12/2011 05:47, O. Hartmann wrote: Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found. I switched to 4BSD, problem gone. This is on 2 separate systems with Core 2 Duos. hth, Doug If the ULE algorithm does not contain problems, then the problem is in the Core2Duo, or in a piece of code that uses the ULE scheduler. I observe ULE interactivity slowness even on a single-core machine (Pentium 4) in very visible places; for example, 'ps ax' output gets stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all the slowness is gone. -- http://ache.vniz.net/
Re: SCHED_ULE should not be the default
On 13 December 2011 01:00, Andrey Chernov a...@freebsd.org wrote: If the ULE algorithm does not contain problems, then the problem is in the Core2Duo, or in a piece of code that uses the ULE scheduler. I observe ULE interactivity slowness even on a single-core machine (Pentium 4) in very visible places; for example, 'ps ax' output gets stuck in the middle for ~1 second. When I switch back to SCHED_4BSD, all the slowness is gone. Are you able to provide KTR traces of the scheduler results? Something that can be fed to schedgraph? Adrian
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intend also to look at performance with both of the different schedulers available. This is in no way, shape, or form the same kind of benchmark as what you're planning to do, but I thought I'd throw it out there for folks to take in as they see fit. I know folks were focused mainly on buildworld. I personally would find it interesting if someone with a higher-end system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) was to do the same test (changing -jX to -j{numofcores} of course). -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB |

sched_ule
===
- time make -j2 buildworld
  1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w
- time make -j2 buildkernel
  640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w

sched_4bsd
===
- time make -j2 buildworld
  1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w
- time make -j2 buildkernel
  638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w

software
==
* sched_ule test: FreeBSD 8.2-STABLE, Thu Dec 1 04:37:29 PST 2011
* sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011

hardware
==
* Intel Core 2 Duo E8400, 3GHz
* Supermicro X7SBA
* 8GB ECC RAM (4x2GB), DDR2-800
* Intel 320-series SSD, 80GB: /, swap, /var, /tmp, /usr

tuning adjustments / etc.
===
* Before each scheduler test, system was rebooted to ensure I/O cache and other whatnots were empty
* All filesystems stock UFS2 + SU (root is non-SU)
* All filesystems had tunefs -t enable applied to them
* powerd(8) in use, with two rc.conf variables (per CPU spec): performance_cx_lowest=C2 economy_cx_lowest=C2
* loader.conf: kern.maxdsiz=2560M kern.dfldsiz=2560M kern.maxssiz=256M ahci_load=yes hint.p4tcc.0.disabled=1 hint.acpi_throttle.0.disabled=1 vfs.zfs.arc_max=5120M
* make.conf: CPUTYPE?=core2
* src.conf: WITHOUT_INET6=true WITHOUT_IPFILTER=true WITHOUT_LIB32=true WITHOUT_KERBEROS=true WITHOUT_PAM_SUPPORT=true WITHOUT_PROFILE=true WITHOUT_SENDMAIL=true
* kernel configuration - note: between kernel builds, config was changed to either use SCHED_4BSD or SCHED_ULE respectively.
cpu		HAMMER
ident		GENERIC
makeoptions	DEBUG=-g		# Build kernel with gdb(1) debug symbols
options 	SCHED_4BSD		# Classic BSD scheduler
#options 	SCHED_ULE		# ULE scheduler
options 	PREEMPTION		# Enable kernel thread preemption
options 	INET			# InterNETworking
options 	FFS			# Berkeley Fast Filesystem
options 	SOFTUPDATES		# Enable FFS soft updates support
options 	UFS_ACL			# Support for access control lists
options 	UFS_DIRHASH		# Improve performance on big directories
options 	UFS_GJOURNAL		# Enable gjournal-based UFS journaling
options 	MD_ROOT			# MD is a potential root device
options 	NFSCLIENT		# Network Filesystem Client
options 	NFSSERVER		# Network Filesystem Server
options 	NFSLOCKD		# Network Lock Manager
options 	NFS_ROOT		# NFS usable as /, requires NFSCLIENT
options 	MSDOSFS			# MSDOS Filesystem
options 	CD9660			# ISO 9660 Filesystem
options 	PROCFS			# Process filesystem (requires PSEUDOFS)
options 	PSEUDOFS		# Pseudo-filesystem framework
options 	GEOM_PART_GPT		# GUID Partition Tables.
options
Re: SCHED_ULE should not be the default
On 12/12/11 16:13, Vincent Hoffman wrote: On 12/12/2011 13:47, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. It's all a little old now but some of the stuff in http://people.freebsd.org/~kris/scaling/ covers improvements that were seen. http://jeffr-tech.livejournal.com/5705.html shows a little too; reading through Jeff's blog is worth it as it has some interesting stuff on SCHED_ULE. I thought there were some more benchmarks floating around but can't find any with a quick Google. Vince Interesting, there seems to have been a much more performant scheduler in 7.0, called SCHED_SMP. I have some faint recollection of that ... where has this beast gone? Oliver
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 12:13:42PM +0100, O. Hartmann wrote: On 12/12/11 16:13, Vincent Hoffman wrote: On 12/12/2011 13:47, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. It's all a little old now but some of the stuff in http://people.freebsd.org/~kris/scaling/ covers improvements that were seen. http://jeffr-tech.livejournal.com/5705.html shows a little too; reading through Jeff's blog is worth it as it has some interesting stuff on SCHED_ULE. I thought there were some more benchmarks floating around but can't find any with a quick Google. Vince Interesting, there seems to have been a much more performant scheduler in 7.0, called SCHED_SMP. I have some faint recollection of that ... where has this beast gone? Boy, I sure hope I remember this right. I strongly urge others to correct me where I'm wrong; thanks in advance! The classic scheduler, SCHED_4BSD, was implemented back before there was oxygen. sched_4bsd(4) mentions this. No need to discuss it. Jeff Roberson began working on the first-generation ULE scheduler during the days of FreeBSD 5.x (I believe 5.1), and a paper on it was presented at USENIX circa 2003: http://www.usenix.org/event/bsdcon03/tech/full_papers/roberson/roberson.pdf Over the following years, Jeff (and others I assume -- maybe folks like George Neville-Neil and/or Kirk McKusick?) adjusted and tinkered with some of the semantics and models/methods. If I remember right, some of these quirks/fixes were committed. All of this was happening under the scheduler that was then called SCHED_ULE, but it was ULE 1.0, for lack of better terminology. This scheduler did not perform well, if I remember right, and Jeff was quite honest about that. From this point forward, Jeff began idealising and working on a scheduler which he called SCHED_SMP -- think of it as ULE 2.0, again for lack of better terminology. It was different from the existing SCHED_ULE scheduler, hence a different name. Jeff blogged about this in early 2007, using exactly that term (ULE 2.0): http://jeffr-tech.livejournal.com/3729.html In mid-2007, prior to FreeBSD 7.0-RELEASE, Jeff announced that he effectively wanted to make SCHED_ULE do what SCHED_SMP did, and provided a patch to SCHED_ULE to accomplish just that: http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-07/msg00755.html Full thread is here (beware -- many replies): http://unix.derkeiler.com/Mailing-Lists/FreeBSD/current/2007-07/threads.html#00755 The patch mentioned above was merged into HEAD on 2007/07/19. http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_ule.c#rev1.202 So in effect, as of 2007/07/19, SCHED_ULE became SCHED_SMP. FreeBSD 7.0-RELEASE was released on 2008/02/27, and the above commit/changes were available at that time as well (meaning: RELENG_7 and RELENG_7_0 at that moment in time should have included the patch from the above paragraph).
The document released by Kris Kennaway hinted at those changes and performance improvements: http://people.freebsd.org/~kris/scaling/7.0%20Preview.pdf Keep in mind, however, that at that time kernel configuration files (GENERIC, etc.) still defaulted to SCHED_4BSD. The default scheduler in kernel config files (GENERIC, etc.) for i386 and amd64 (not sure about others) was changed on 2007/10/19: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/conf/GENERIC#rev1.475 http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/amd64/conf/GENERIC#rev1.485 This was done *prior* to FreeBSD 7.1-RELEASE. So, it first became available as the default scheduler for the masses when 7.1-RELEASE came out on 2009/01/05. All of the answers, in a roundabout and non-user-friendly way, are available by examining the commit history for src/sys/kern/sched_ule.c. It's hard to follow, especially given that you have to consider all the releases/branchpoints that took place over time, but: http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/sched_ule.c Are we having fun yet? :-) -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB |
Re: SCHED_ULE should not be the default
On 12/12/11 16:51, Steve Kargl wrote: On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intend also to look at performance with both of the different schedulers available. This comes up every 9 months or so, and must be approaching FAQ status. In an HPC environment, I recommend 4BSD. Depending on the workload, ULE can cause a severe increase in turnaround time when doing already long computations. If you have an MPI application, simply launching greater than ncpu+1 jobs can show the problem. Well, those recommendations should be based on WHY. As the mostly negative experiences with SCHED_ULE in highly computative workloads always get contradicted by ...but there are workloads that show the opposite ..., this should be shown by more recent benchmarks and explanations, not legacy benchmarks from years ago. And, indeed, I would highly recommend having a FAQ or a short note in tuning(7) or the Handbook in which it is mentioned to use SCHED_4BSD in HPC environments and SCHED_ULE for other workloads (which would have to be made more specific). It is not an easy task to set up an OS for a specific purpose when the tuning has to be done by crawling the mailing lists. Some notes and hints in the documentation are always valuable and highly appreciated by folks not deep into development. And by the way, I have the deep impression that most of these discussions about the poor performance of SCHED_ULE tend to always end up in a covering-up of that flaw and a consequent waste of development effort. But this is only my personal impression.
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 02:23:46PM +0100, O. Hartmann wrote: On 12/12/11 16:51, Steve Kargl wrote: [...] This comes up every 9 months or so, and must be approaching FAQ status. In an HPC environment, I recommend 4BSD. Depending on the workload, ULE can cause a severe increase in turnaround time when doing already long computations. If you have an MPI application, simply launching greater than ncpu+1 jobs can show the problem. Well, those recommendations should be based on WHY. As the mostly negative experiences with SCHED_ULE in highly computative workloads always get contradicted by ...but there are workloads that show the opposite ..., this should be shown by more recent benchmarks and explanations, not legacy benchmarks from years ago. I have given the WHY in previous discussions of ULE, based on what you call legacy benchmarks. I have not seen any commit to sched_ule.c that would lead me to believe that the performance issues with ULE and cpu-bound numerical codes have been addressed. Repeating the benchmark would be a waste of time. -- Steve
Re: SCHED_ULE should not be the default
On 12/13/2011 10:54 AM, Steve Kargl wrote: I have given the WHY in previous discussions of ULE, based on what you call legacy benchmarks. I have not seen any commit to sched_ule.c that would lead me to believe that the performance issues with ULE and cpu-bound numerical codes have been addressed. Repeating the benchmark would be a waste of time. Trying a simple pbzip2 on a large file, the results are pretty consistent through iterations. pbzip2 with 4BSD is barely faster on a file that's 322MB in size. After a reboot, I did a strings bigfile > /dev/null, then ran pbzip2 -v xaa -c > /dev/null 7 times. If I run a burnP6 in the background (from sysutils/cpuburn), they perform about the same. e.g. pbzip2 -v xaa -c > /dev/null

Parallel BZIP2 v1.1.6 - by: Jeff Gilchrist [http://compression.ca] [Oct. 30, 2011] (uses libbzip2 by Julian Seward)
Major contributions: Yavor Nikolov nikolov.javor+pbz...@gmail.com
# CPUs: 4
BWT Block Size: 900 KB
File Block Size: 900 KB
Maximum Memory: 100 MB
---
File #: 1 of 1
Input Name: xaa
Output Name: stdout
Input Size: 352404831 bytes
Compressing data...
Output Size: 50630745 bytes
---
Wall Clock: 18.139342 seconds

ULE: 18.113204 18.116896 18.123400 18.105894 18.163332 18.139342 18.082888
ULE with burnP6: 23.076085 22.003666 21.162987 21.682445 21.935568 23.595781 21.601277
4BSD: 17.983395 17.986218 18.009254 18.004312 18.001494 17.997032
4BSD with burnP6: 22.215508 21.886459 21.595179 21.361830 21.325351 21.244793

# ministat uleP6 bsdP6
x uleP6
+ bsdP6
(ministat distribution graph omitted)
    N           Min           Max        Median           Avg        Stddev
x   6     21.162987     23.595781     22.003666     22.242755    0.91175566
+   6     21.244793     22.215508     21.595179     21.604853     0.3792413
No difference proven at 95.0% confidence

x ule
+ bsd
(ministat distribution graph omitted)
    N           Min           Max        Median           Avg        Stddev
x   7     18.082888     18.163332     18.116896     18.120708   0.025468695
+   6     17.983395     18.009254     18.001494     17.996951   0.010248473
Difference at 95.0% confidence
        -0.123757 +/- 0.024538
        -0.68296% +/- 0.135414%
        (Student's t, pooled s = 0.0200388)

Hardware is an X3450 with 8G of memory, RELENG8. ---Mike -- --- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, m...@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/
Re: SCHED_ULE should not be the default
On 12/13/2011 13:31, Malin Randstrom wrote: stop sending me spam mail ... you never stop despite me having unsubscribed several times. stop this! If you had actually unsubscribed, the mail would have stopped. :) You can see the instructions you need to follow below. ___ freebsd-sta...@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org -- [^L] Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote: If the ULE algorithm does not contain problems, then the problem is in the Core2Duo, or in a piece of code that uses the ULE scheduler. I already wrote to the mailing list that, specifically in my case (Core2Duo), the following patch partially helps:

--- sched_ule.c.orig	2011-11-24 18:11:48.000000000 +0200
+++ sched_ule.c	2011-12-10 22:47:08.000000000 +0200
@@ -794,7 +794,8 @@
 	 * 1.5 * balance_interval.
 	 */
 	balance_ticks = max(balance_interval / 2, 1);
-	balance_ticks += random() % balance_interval;
+//	balance_ticks += random() % balance_interval;
+	balance_ticks += ((int)random()) % balance_interval;
 	if (smp_started == 0 || rebalance == 0)
 		return;
 	tdq = TDQ_SELF();

This avoids a 64-bit division on 64-bit platforms but seems to have no effect otherwise. Because this function is not called very often, the change seems unlikely to help.

@@ -2118,13 +2119,21 @@
 	struct td_sched *ts;
 
 	THREAD_LOCK_ASSERT(td, MA_OWNED);
+	if (td->td_pri_class & PRI_FIFO_BIT)
+		return;
+	ts = td->td_sched;
+	/*
+	 * We used up one time slice.
+	 */
+	if (--ts->ts_slice > 0)
+		return;

This skips most of the periodic functionality (long term load balancer, saving switch count (?), insert index (?), interactivity score update for long running thread) if the thread is not going to be rescheduled right now. It looks wrong but it is a data point if it helps your workload.

 	tdq = TDQ_SELF();
 #ifdef SMP
 	/*
 	 * We run the long term load balancer infrequently on the first cpu.
 	 */
-	if (balance_tdq == tdq) {
-		if (balance_ticks && --balance_ticks == 0)
+	if (balance_ticks && --balance_ticks == 0) {
+		if (balance_tdq == tdq)
 			sched_balance();
 	}
 #endif

The main effect of this appears to be to disable the long term load balancer completely after some time. At some point, a CPU other than the first CPU (which uses balance_tdq) will set balance_ticks = 0, and sched_balance() will never be called again. It also introduces a hypothetical race condition because the access to balance_ticks is no longer restricted to one CPU under a spinlock. If the long term load balancer may be causing trouble, try setting kern.sched.balance_interval to a higher value with unpatched code.

@@ -2144,9 +2153,6 @@
 		if (TAILQ_EMPTY(&tdq->tdq_timeshare.rq_queues[tdq->tdq_ridx]))
 			tdq->tdq_ridx = tdq->tdq_idx;
 	}
-	ts = td->td_sched;
-	if (td->td_pri_class & PRI_FIFO_BIT)
-		return;
 	if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE) {
 		/*
 		 * We used a tick; charge it to the thread so
@@ -2157,11 +2163,6 @@
 		sched_priority(td);
 	}
 	/*
-	 * We used up one time slice.
-	 */
-	if (--ts->ts_slice > 0)
-		return;
-	/*
 	 * We're out of time, force a requeue at userret().
 	 */
 	ts->ts_slice = sched_slice;

and refusing to use options FULL_PREEMPTION. But no one has replied to my letter saying whether my patch helps or not in the case of a Core2Duo... There is a suspicion that the problems stem from the sections of code associated with SMP... Maybe I'm wrong about something, but I want to help in solving this problem ... -- Jilles Tjoelker
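As a concrete form of that last suggestion, something like the following should do it on an unpatched kernel; the value 1024 is purely illustrative, not a recommendation:

    # Inspect the current long-term balancer interval, then raise it.
    sysctl kern.sched.balance_interval
    sysctl kern.sched.balance_interval=1024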
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 04:29:14PM -0800, Doug Barton wrote: On 12/12/2011 05:47, O. Hartmann wrote: Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found. The issues that I've seen with ULE on the desktop seem to be caused by X taking up a steady amount of CPU, and being demoted from being an interactive process. X then becomes the bottleneck for other processes that would otherwise be interactive. Try 'renice -20 pid_of_X' and see if that makes your problems go away. Marcus
Re: SCHED_ULE should not be the default
On Wed, 14 Dec 2011 00:04:42 +0100 Jilles Tjoelker jil...@stack.nl wrote: On Tue, Dec 13, 2011 at 10:40:48AM +0200, Ivan Klymenko wrote: [...] This avoids a 64-bit division on 64-bit platforms but seems to have no effect otherwise. Because this function is not called very often, the change seems unlikely to help.
Yes, this section does not apply to this problem :) I just posted the latest patch, which I am using now...
[...] This skips most of the periodic functionality (long term load balancer, saving switch count (?), insert index (?), interactivity score update for long running thread) if the thread is not going to be rescheduled right now. It looks wrong but it is a data point if it helps your workload.
Yes, I did that to delay for as long as possible the execution of the code in this section:

...
#ifdef SMP
	/*
	 * We run the long term load balancer infrequently on the first cpu.
	 */
	if (balance_tdq == tdq) {
		if (balance_ticks && --balance_ticks == 0)
			sched_balance();
	}
#endif
...

[...] The main effect of this appears to be to disable the long term load balancer completely after some time. At some point, a CPU other than the first CPU (which uses balance_tdq) will set balance_ticks = 0, and sched_balance() will never be called again.
That is, for the same reason as above in the text...
It also introduces a hypothetical race condition because the access to balance_ticks is no longer restricted to one CPU under a spinlock. If the long term load balancer may be causing trouble, try setting kern.sched.balance_interval to a higher value with unpatched code.
I checked that first of all, but it did not help fix the situation... I get the impression that the rebalancing malfunctions: it seems a thread gets handed back to the same core that is already loaded, and so on... Perhaps this is a consequence of incorrect detection of the CPU topology?
[...] and refusing to use options FULL_PREEMPTION. But no one has replied to my letter saying whether my patch helps or not in the case of a Core2Duo... There is a suspicion that the problems stem from the sections of code associated with SMP... Maybe I'm wrong about something, but I want to help in solving this problem ...
Re: SCHED_ULE should not be the default
On Tue, 13 Dec 2011 23:02:15 + Marcus Reid mar...@blazingdot.com wrote: On Mon, Dec 12, 2011 at 04:29:14PM -0800, Doug Barton wrote: [...] The issues that I've seen with ULE on the desktop seem to be caused by X taking up a steady amount of CPU, and being demoted from being an interactive process. X then becomes the bottleneck for other processes that would otherwise be interactive. Try 'renice -20 pid_of_X' and see if that makes your problems go away. Why, then, is X not a bottleneck when using 4BSD? Marcus
Re: SCHED_ULE should not be the default
On Tue, Dec 13, 2011 at 3:39 PM, Ivan Klymenko fi...@ukr.net wrote: [...] Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ? I don't remember what our specific problem at $WORK was, perhaps it was just interrupt threads not getting serviced fast enough, but we've hard-coded this to 1 and removed the code that sets it in sched_initticks(). The same effect should be had by setting the sysctl after a box is up. Thanks, matthew
Re: SCHED_ULE should not be the default
On Tue, 13 Dec 2011 16:01:56 -0800 m...@freebsd.org wrote: On Tue, Dec 13, 2011 at 3:39 PM, Ivan Klymenko fi...@ukr.net wrote: [...] Has anyone experiencing problems tried to set sysctl kern.sched.steal_thresh=1 ? In my case the kern.sched.steal_thresh variable already has the value 1. I don't remember what our specific problem at $WORK was, perhaps it was just interrupt threads not getting serviced fast enough, but we've hard-coded this to 1 and removed the code that sets it in sched_initticks(). The same effect should be had by setting the
Re: SCHED_ULE should not be the default
Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intend also to look at performance with both of the different schedulers available. O.
Re: SCHED_ULE should not be the default
On 12/12/2011 13:47, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. It's all a little old now but some of the stuff in http://people.freebsd.org/~kris/scaling/ covers improvements that were seen. http://jeffr-tech.livejournal.com/5705.html shows a little too; reading through Jeff's blog is worth it as it has some interesting stuff on SCHED_ULE. I thought there were some more benchmarks floating around but can't find any with a quick Google. Vince Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intend also to look at performance with both of the different schedulers available. O.
Re: SCHED_ULE should not be the default
On Mon, 12 Dec 2011 15:13:00 + Vincent Hoffman vi...@unsane.co.uk wrote: On 12/12/2011 13:47, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. It's all a little old now but some of the stuff in http://people.freebsd.org/~kris/scaling/ covers improvements that were seen. http://jeffr-tech.livejournal.com/5705.html shows a little too; reading through Jeff's blog is worth it as it has some interesting stuff on SCHED_ULE. I thought there were some more benchmarks floating around but can't find any with a quick Google. Vince Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intend also to look at performance with both of the different schedulers available. These observations are not scientific, but I have a CPU from AMD with 6 cores (AMD Phenom(tm) II X6 1090T Processor). My simple test was ``make buildkernel'' while watching the core usage with gkrellm. With SCHED_4BSD all 6 cores are loaded to 97% during the build phase. I've never seen any value above 97% with gkrellm. With SCHED_ULE I never saw all 6 cores loaded this heavily. Usually 2 or more cores were at or below 90%. Not really that significant, but still a noticeable difference in apparent scheduling behavior. Whether the observed difference is due to some change in data from the kernel to gkrellm is beyond me. -- Gary Jennejohn
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote: Not fully right, boinc defaults to run on idprio 31 so this isn't an issue. And yes, there are cases where SCHED_ULE shows much better performance than SCHED_4BSD. [...] Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? Whenever the subject comes up, it is mentioned that SCHED_ULE has better performance on boxes with ncpu > 2. But in the end I see contradictory statements here. People complain about poor performance (especially in scientific environments), and others counter that this is not the case. Within our department, we developed a highly scalable code for planetary science purposes on imagery. It utilizes present GPUs via OpenCL if present. Otherwise it grabs as many cores as it can. By the end of this year I'll get a new desktop box based on Intel's new Sandy Bridge-E architecture with plenty of memory. If the colleague who developed the code is willing to perform some benchmarks on the same hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most recent Suse. For FreeBSD I intend also to look at performance with both of the different schedulers available. This comes up every 9 months or so, and must be approaching FAQ status. In an HPC environment, I recommend 4BSD. Depending on the workload, ULE can cause a severe increase in turnaround time when doing already long computations. If you have an MPI application, simply launching greater than ncpu+1 jobs can show the problem. PS: search the list archives for kargl and ULE. -- Steve
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 7:32 AM, Gary Jennejohn gljennj...@googlemail.com wrote: [...] These observations are not scientific, but I have a CPU from AMD with 6 cores (AMD Phenom(tm) II X6 1090T Processor). My simple test was ``make buildkernel'' while watching the core usage with gkrellm. With SCHED_4BSD all 6 cores are loaded to 97% during the build phase. I've never seen any value above 97% with gkrellm. With SCHED_ULE I never saw all 6 cores loaded this heavily. Usually 2 or more cores were at or below 90%. Not really that significant, but still a noticeable difference in apparent scheduling behavior. Whether the observed difference is due to some change in data from the kernel to gkrellm is beyond me. SCHED_ULE is much sloppier about calculating which thread used a timeslice -- unless the timeslice went 100% to a thread, the fraction it used may get attributed elsewhere. So top's reporting of thread usage is not a useful metric. Total buildworld time is, potentially. Thanks, matthew
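A sketch of that comparison; -j6 simply matches the 6-core machine described above:

    # Run once under each scheduler and compare the "real" figures.
    cd /usr/src && /usr/bin/time -p make -j6 buildworld > /dev/null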
Re: SCHED_ULE should not be the default
Did you use -jX to build the world?
From: Gary Jennejohn gljennj...@googlemail.com
Sent: Mon Dec 12 16:32:21 CET 2011
To: Vincent Hoffman vi...@unsane.co.uk
CC: O. Hartmann ohart...@mail.zedat.fu-berlin.de, Current FreeBSD freebsd-current@freebsd.org, freebsd-sta...@freebsd.org, freebsd-performa...@freebsd.org
Subject: Re: SCHED_ULE should not be the default
[...]
Re: SCHED_ULE should not be the default
Would it be possible to implement a mechanism that lets one change the scheduler on the fly? AFAIK Solaris can do that.

From: Steve Kargl s...@troutmask.apl.washington.edu
Sent: Mon Dec 12 16:51:59 MEZ 2011
To: O. Hartmann ohart...@mail.zedat.fu-berlin.de
CC: freebsd-performa...@freebsd.org, Current FreeBSD freebsd-current@freebsd.org, freebsd-sta...@freebsd.org
Subject: Re: SCHED_ULE should not be the default

On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:

[...]

This comes up every 9 months or so, and must be approaching FAQ status. In an HPC environment, I recommend 4BSD. Depending on the workload, ULE can cause a severe increase in turnaround time when doing already long computations. If you have an MPI application, simply launching greater than ncpu+1 jobs can show the problem.

PS: search the list archives for kargl and ULE.

-- Steve
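The oversubscription effect Steve describes is easy to approximate without MPI. Below is a minimal sketch, not his OpenMPI code: the worker count, loop bound, and busy-work are illustrative assumptions. It starts ncpu+1 CPU-bound threads and prints per-worker wall-clock times. Under even time-sharing all workers should finish at roughly the same moment; a scheduler that statically doubles two workers up on one core leaves those two finishing about twice as late.

/*
 * Oversubscription sketch: ncpu+1 CPU-bound workers, per-worker timing.
 * Illustrative only; the loop bound is arbitrary busy-work, not a benchmark.
 * Build: cc -O0 -o oversub oversub.c -lpthread
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

static double now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

static void *worker(void *arg)
{
    int id = (int)(intptr_t)arg;
    double start = now();
    volatile double x = 0.0;
    for (long i = 0; i < 400000000L; i++)
        x += 1.0 / (i + 1.0);       /* pure CPU burn, no I/O, no locks */
    printf("worker %d finished in %.2f s\n", id, now() - start);
    return NULL;
}

int main(void)
{
    int ncpu = (int)sysconf(_SC_NPROCESSORS_ONLN);
    int n = ncpu + 1;               /* deliberately one more job than cores */
    pthread_t *t = malloc(sizeof(pthread_t) * n);

    for (int i = 0; i < n; i++)
        pthread_create(&t[i], NULL, worker, (void *)(intptr_t)i);
    for (int i = 0; i < n; i++)
        pthread_join(t[i], NULL);
    free(t);
    return 0;
}

A bimodal spread of finish times (ncpu-1 fast, 2 slow) is the ping-pong pattern discussed later in this thread; a flat spread is what a shared-queue scheduler tends to produce.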
Re: SCHED_ULE should not be the default
On 12/12/2011 15:51, Steve Kargl wrote:

This comes up every 9 months or so, and must be approaching FAQ status. [...]

Isn't this something that can be fixed by tuning ULE? For example, for desktop applications kern.sched.preempt_thresh should be set to 224 from its default. I'm wondering if the installer should ask people what the typical use will be, and tune the scheduler appropriately.

-- Bruce Cran
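For reference, the tunable Bruce mentions can be changed at runtime with sysctl(8) (sysctl kern.sched.preempt_thresh=224), or programmatically via sysctlbyname(3). A minimal sketch follows; error handling is trimmed, and note the OID only exists on kernels built with SCHED_ULE:

/* Read and optionally set kern.sched.preempt_thresh (SCHED_ULE only). */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int main(void)
{
    int val;
    size_t len = sizeof(val);

    if (sysctlbyname("kern.sched.preempt_thresh", &val, &len, NULL, 0) == -1) {
        perror("sysctlbyname");   /* e.g. kernel built with SCHED_4BSD */
        return 1;
    }
    printf("kern.sched.preempt_thresh = %d\n", val);

    int newval = 224;             /* the desktop value suggested above */
    if (sysctlbyname("kern.sched.preempt_thresh", NULL, NULL,
                     &newval, sizeof(newval)) == -1)
        perror("setting requires root");
    return 0;
}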
Re: SCHED_ULE should not be the default
On Mon, 12 Dec 2011 16:18:35 +0000, Bruce Cran br...@cran.org.uk wrote:

Isn't this something that can be fixed by tuning ULE? For example, for desktop applications kern.sched.preempt_thresh should be set to 224 from its default. [...]

This by and large does not help in certain situations ...
Re: SCHED_ULE should not be the default
On Monday 12 December 2011 14:47:57 O. Hartmann wrote:

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD? [...]

In my spare time I do some stuff which can be considered HPC. If I recall correctly, the loudest supporters of the notion that SCHED_4BSD is faster than SCHED_ULE are using more threads than there are cores, causing CPU core contention and, more importantly, unevenly distributed runtimes among threads, resulting in suboptimal execution times for their programs. Since I've never actually seen the code in question, it's hard to say whether this unfair distribution actually results in lower throughput, or whether it simply violates an assumption in the code that each thread takes about as long as the others to finish its task.

Although I haven't actually benchmarked the two schedulers directly, I have no reason to suspect SCHED_ULE of suboptimal performance because: 1) a program model where N threads on N cores take work items from a shared queue until it is empty has almost perfect scaling on SCHED_ULE (I get 398% CPU usage on a quadcore); 2) the same program on Linux (dual boot), compiled with exactly the same compiler and flags, runs slightly slower. I think this has to do with VM differences.

What I'm trying to say is that until someone actually shows some code which has demonstrably lower performance on SCHED_ULE, and this is not caused by (IMHO improper) timing dependencies between threads, I'd say there is no cause for concern here. I actually expect performance differences between the two schedulers to show in problems which cause a lot more contention on the CPU cores and use lots of locks internally, so that threads are frequently waiting on each other -- for instance the MySQL benchmarks done a couple of years ago by Kris Kennaway. Aside from algorithmic limitations (SCHED_4BSD doesn't really scale all that well), there will always exist some problems for which SCHED_4BSD is faster because it by chance has a better execution order for them...

The good thing is people have a choice :-). I'm looking forward to the results of your benchmark.

-- Pieter de Goeje
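The shared-queue model Pieter describes can be sketched in a few lines. This is an illustration of the pattern, not his program: the "queue" is reduced to an atomic counter over identical work items, and the per-item busy-work is an arbitrary stand-in.

/*
 * N threads on N cores draining a shared work queue until it is empty.
 * The queue is simplified to an atomic counter over identical items.
 * Build: cc -o drain drain.c -lpthread
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define NITEMS 100000

static atomic_int next_item;

static void *drain(void *arg)
{
    (void)arg;
    for (;;) {
        int i = atomic_fetch_add(&next_item, 1);
        if (i >= NITEMS)
            break;                  /* queue empty: thread exits */
        volatile double x = 0.0;    /* stand-in for real per-item work */
        for (int k = 0; k < 100000; k++)
            x += k * 0.5;
    }
    return NULL;
}

int main(void)
{
    int n = (int)sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1)  n = 1;
    if (n > 64) n = 64;             /* static array below holds 64 threads */
    pthread_t t[64];

    for (int i = 0; i < n; i++)
        pthread_create(&t[i], NULL, drain, NULL);
    for (int i = 0; i < n; i++)
        pthread_join(t[i], NULL);
    puts("queue drained");
    return 0;
}

Because no thread ever waits on another and work is pulled rather than pre-assigned, any reasonable scheduler keeps all cores busy until the queue drains, which is why this model scales near-perfectly regardless of migration policy.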
Re: SCHED_ULE should not be the default
On Mon, 12 Dec 2011 17:10:46 +0100 Lars Engels lars.eng...@0x20.net wrote:

Did you use -jX to build the world?

I'm top-posting since Lars did. It was buildkernel, not buildworld. Yes, -j6.

[...]

-- Gary Jennejohn
Re: SCHED_ULE should not be the default
On Mon, 12 Dec 2011 08:04:37 -0800 m...@freebsd.org wrote:

[...] SCHED_ULE is much sloppier about calculating which thread used a timeslice -- unless the timeslice went 100% to a thread, the fraction it used may get attributed elsewhere. So top's reporting of thread usage is not a useful metric. Total buildworld time is, potentially.

I suspect you're right, since the buildworld time, a much better test, was pretty much the same with 4BSD and ULE.

-- Gary Jennejohn
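Total wall-clock time of a build is what time(1) reports; for completeness, a hedged C equivalent is sketched below (the command and arguments are placeholders, and exit-status handling is trimmed).

/* Time a child command by wall clock, the metric suggested above. */
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    char *argv[] = { "make", "buildkernel", NULL };  /* placeholder */
    struct timespec a, b;

    clock_gettime(CLOCK_MONOTONIC, &a);
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* exec failed */
    }
    waitpid(pid, NULL, 0);
    clock_gettime(CLOCK_MONOTONIC, &b);
    printf("elapsed: %.1f s\n",
           (double)(b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9);
    return 0;
}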
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 04:18:35PM +0000, Bruce Cran wrote:

Isn't this something that can be fixed by tuning ULE? For example, for desktop applications kern.sched.preempt_thresh should be set to 224 from its default. [...]

Tuning kern.sched.preempt_thresh did not seem to help for my workload. My code is a classic master-slave OpenMPI application where the master runs on one node and all cpu-bound slaves are sent to a second node. If I send ncpu+1 jobs to the 2nd node with ncpu cpus, then ncpu-1 jobs are assigned to the first ncpu-1 cpus. The last two jobs are assigned to the ncpu'th cpu, and these ping-pong on this cpu. AFAICT, it is a cpu affinity issue, where ULE is trying to keep each job associated with its initially assigned cpu.

While one might suggest that starting ncpu+1 jobs is not prudent, my example is just that: an example showing that ULE has performance issues. So, I can now either start only ncpu jobs on each node in the cluster and send emails to all other users asking them not to use those nodes, or use 4BSD and not worry about loading issues.

-- Steve
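Since the complaint is about affinity decisions, one workaround is to place threads explicitly with FreeBSD's cpuset API and take the decision away from the scheduler. A minimal sketch using cpuset_setaffinity(2) follows; the choice of pinning to CPU 0 (or worker i to CPU i % ncpu) is an illustrative policy, not what OpenMPI does.

/* Pin the calling thread to one CPU via the FreeBSD cpuset API. */
#include <sys/param.h>
#include <sys/cpuset.h>
#include <stdio.h>

static int pin_to_cpu(int cpu)
{
    cpuset_t mask;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    /* id == -1 means "the current thread" at the CPU_WHICH_TID level. */
    if (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
                           sizeof(mask), &mask) == -1) {
        perror("cpuset_setaffinity");
        return -1;
    }
    return 0;
}

int main(void)
{
    if (pin_to_cpu(0) == 0)         /* worker i might use i % ncpu */
        puts("pinned to cpu 0");
    return 0;
}

With explicit pinning the ncpu+1 case is still oversubscribed, of course; pinning only makes the placement deterministic instead of leaving two jobs to ping-pong wherever ULE first put them.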
Re: SCHED_ULE should not be the default
On Monday, December 12, 2011 12:06:04 pm Steve Kargl wrote:

[...] AFAICT, it is a cpu affinity issue, where ULE is trying to keep each job associated with its initially assigned cpu. [...]

This is a case where 4BSD's naive algorithm will spread out the load more evenly, because all the threads are on a single, shared queue and each CPU just grabs the head of the queue when it finishes a timeslice. ULE always assigns threads to a single CPU (even if they aren't pinned to a single CPU using cpuset, etc.) and then tries to balance the load across cores later, but I believe in this case its rebalancer won't have anything to really do, as no matter what it does with the N+1 job it's going to be sharing a CPU with another job.

-- John Baldwin
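John's point can be put in numbers with a toy model (plain arithmetic, not the actual kernel logic): ncpu cores and ncpu+1 equal jobs of W CPU-seconds each. A shared queue time-shares every job at rate ncpu/(ncpu+1); static assignment runs ncpu-1 jobs alone and doubles two up on one core.

/* Toy model: finish times for ncpu+1 equal jobs of length W under
 * (a) one shared run queue and (b) static per-CPU assignment.
 * Pure arithmetic illustration, not scheduler code. */
#include <stdio.h>

int main(void)
{
    const double W = 100.0;         /* CPU-seconds each job needs */
    const int ncpu = 4;
    const int njobs = ncpu + 1;

    /* (a) Shared queue: every job progresses at rate ncpu/njobs. */
    printf("shared queue: all %d jobs finish at %.0f s\n",
           njobs, W * njobs / ncpu);

    /* (b) Static: one core runs two jobs, the others run one each. */
    printf("static: %d jobs finish at %.0f s, 2 jobs at %.0f s\n",
           ncpu - 1, W, 2.0 * W);
    return 0;
}

With W = 100 and 4 cores this prints 125 s for everyone under the shared queue versus 100 s for three jobs and 200 s for the doubled-up pair under static assignment: same total throughput, but the worst-case turnaround -- the thing an MPI job waits on -- goes from 125 s to 200 s, which matches the severe turnaround increase Steve reports.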
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 09:06:04AM -0800, Steve Kargl wrote:

Tuning kern.sched.preempt_thresh did not seem to help for my workload. [...] So, I can now either start only ncpu jobs on each node in the cluster and send emails to all other users asking them not to use those nodes, or use 4BSD and not worry about loading issues.

Does it meet your expectations if you start j jobs on a node, where (j modulo ncpu) == 0?

-- Scott Lambert  KC5MLE  Unix SysAdmin  lamb...@lambertfam.org
Re: SCHED_ULE should not be the default
On Mon, Dec 12, 2011 at 01:03:30PM -0600, Scott Lambert wrote:

Does it meet your expectations if you start j jobs on a node, where (j modulo ncpu) == 0?

I've never tried to launch more than ncpu + 1 (or + 2) jobs. I suppose that at the time I was investigating the issue, it was determined that 4BSD allowed me to get my work done in a more timely manner, so I took the path of least resistance.

-- Steve
Re: SCHED_ULE should not be the default
On 12/12/11 18:06, Steve Kargl wrote:

On Mon, Dec 12, 2011 at 04:18:35PM +0000, Bruce Cran wrote:

Isn't this something that can be fixed by tuning ULE? For example, for desktop applications kern.sched.preempt_thresh should be set to 224 from its default. I'm wondering if the installer should ask people what the typical use will be, and tune the scheduler appropriately.

Is the tuning of kern.sched.preempt_thresh, and a proper method of estimating its correct value for the intended workload, documented in the manpages, maybe tuning(7)? I find it hard to crawl through the many pros and cons on the mailing lists to evaluate a correct value for this seemingly important tunable.

Tuning kern.sched.preempt_thresh did not seem to help for my workload. [...]
Re: SCHED_ULE should not be the default
On 12/12/2011 23:48, O. Hartmann wrote:

Is the tuning of kern.sched.preempt_thresh, and a proper method of estimating its correct value for the intended workload, documented in the manpages, maybe tuning(7)? [...]

Note that I said "for example" :) I was suggesting that there may be sysctls that can be tweaked to improve performance.

-- Bruce Cran
Re: SCHED_ULE should not be the default
On 12/12/2011 05:47, O. Hartmann wrote:

Do we have any proof at hand for such cases where SCHED_ULE performs much better than SCHED_4BSD?

I complained about poor interactive performance of ULE in a desktop environment for years. I had numerous people try to help, including Jeff, with various tunables, dtrace'ing, etc. The cause of the problem was never found. I switched to 4BSD; problem gone. This is on 2 separate systems with Core 2 Duos.

hth,

Doug

-- Breadth of IT experience, and depth of knowledge in the DNS. Yours for the right price. :) http://SupersetSolutions.com/