On Thu, 17 Apr 2008, I wrote:
                                                 There is only one
  central tunable (you have to switch on CONFIG_SCHED_DEBUG):

        /proc/sys/kernel/sched_granularity_ns

  which can be used to tune the scheduler from 'desktop' (low
  latencies) to 'server' (good batching) workloads. It defaults to a
  setting suitable for desktop workloads. SCHED_BATCH is handled by the
  CFS scheduler module too.

So it'd be worth compiling a kernel with CONFIG_SCHED_DEBUG switched on, then increasing that value to see whether that fixes the problem. Alternatively, use sched_setscheduler() to set SCHED_BATCH, which should increase the timeslice (a Linux-only option).
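For anyone who wants to try either knob without writing C, here is a minimal Python sketch. It assumes Linux and Python 3.3 or later; the helper names read_granularity_ns and switch_to_batch are mine, not part of any API. The tunable only exists on kernels that expose it (built with CONFIG_SCHED_DEBUG), so the reader returns None when the file is missing.

```python
import os
from pathlib import Path

# Path quoted in the CFS documentation above; absent on kernels that do not expose it.
TUNABLE = Path("/proc/sys/kernel/sched_granularity_ns")


def read_granularity_ns():
    """Return the current granularity in nanoseconds, or None if unavailable."""
    try:
        return int(TUNABLE.read_text())
    except (OSError, ValueError):
        return None


def switch_to_batch(pid=0):
    """Try to move `pid` (0 means the calling process) to SCHED_BATCH.

    Returns the policy reported afterwards, or None if the platform does not
    support the call or the kernel refuses it.
    """
    if not hasattr(os, "sched_setscheduler") or not hasattr(os, "SCHED_BATCH"):
        return None
    try:
        # SCHED_BATCH requires a static priority of 0.
        os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    except OSError:
        return None
    return os.sched_getscheduler(pid)


print("sched_granularity_ns:", read_granularity_ns())
policy = switch_to_batch()
print("SCHED_BATCH active:", policy is not None and policy == os.SCHED_BATCH)
```

Changing the tunable itself means writing a new number to the same file as root. The scheduling policy is inherited across fork(), so in principle a small wrapper could set it and then exec the postmaster, but I have not tried that.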

Looking at the problem a bit closer, it's obvious to me that larger timeslices would not have fixed this problem, so ignore my suggestion.

It appears that the problem is caused by inter-process communication blocking and causing processes to be put right to the back of the run queue. That results in a very fine-grained round-robin of the runnable processes, which thrashes the CPU caches. You may also be seeing processes forced to switch between CPUs, which hurts the caches even more. So what happens if you run pgbench on a separate machine from the server? Does the problem still exist in that case?
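To get a feel for how much switching blocking IPC causes, a rough sketch along these lines may help. It is an illustration only, not a model of what PostgreSQL and pgbench actually do: two processes bounce a single byte over a pair of pipes, and getrusage() reports how many voluntary (blocked) and involuntary (preempted) context switches the parent accumulated. The function names and the round count are my own choices, and it assumes a Unix system with fork().

```python
import os
import resource


def context_switches(work):
    """Run work() and return (voluntary, involuntary) context switches it cost this process."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    work()
    after = resource.getrusage(resource.RUSAGE_SELF)
    return (after.ru_nvcsw - before.ru_nvcsw,
            after.ru_nivcsw - before.ru_nivcsw)


def pipe_ping_pong(rounds=1000):
    """Exchange one byte per round with a child process; every read may block."""
    p2c_r, p2c_w = os.pipe()   # parent -> child
    c2p_r, c2p_w = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: keep only the ends it uses, echo each byte back, then exit.
        os.close(p2c_w)
        os.close(c2p_r)
        try:
            for _ in range(rounds):
                if not os.read(p2c_r, 1):
                    break
                os.write(c2p_w, b"x")
        finally:
            os._exit(0)
    # Parent: close the child's ends so a dead child shows up as EOF, not a hang.
    os.close(p2c_r)
    os.close(c2p_w)
    try:
        for _ in range(rounds):
            os.write(p2c_w, b"x")
            if not os.read(c2p_r, 1):
                break
    finally:
        os.close(p2c_w)
        os.close(c2p_r)
        os.waitpid(pid, 0)


voluntary, involuntary = context_switches(lambda: pipe_ping_pong(1000))
print("voluntary:", voluntary, "involuntary:", involuntary)
```

If the voluntary count comes out close to the number of rounds, the parent is going to sleep on nearly every message, which is the pattern I am describing above.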

Matthew

--
X's book explains this very well, but, poor bloke, he did the Cambridge Maths Tripos... -- Computer Science Lecturer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
