Hi Tom, I have now ruled out ntp as the root cause of the CPU cycles being wasted in user space.
I installed perf and monitored two servers (with different PostgreSQL versions and hardware specifications) that are "hanging", and I now have some output. Since I am no expert at interpreting the output of perf top, what would be the next step? Would it be a good idea to a) read the perf manual and/or b) post the output of perf top as a first step to see what is going on?

What I think I see is a lot of spin_lock_irq and scheduler activity.

Any guidance is much appreciated.

Best regards,
Dennis Brouwer
M4N

On Mon, Sep 24, 2012 at 6:30 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Dennis Brouwer <dennis.brou...@m4n.nl> writes:
> > Last week I was repeatedly able to run all these tests on the database
> > without any issue but recently, all of a sudden at random, some of the
> > queries performed a factor 100 less. It may take hours to complete the
> > transaction. At the same moment we see a dramatic decrease in IO and the
> > CPU is nearly 100% busy in user space.
>
> > After days of testing I may have found the cause: the ntp client. If I stop
> > the ntp client the problem vanishes.
>
> > I have started reading on spinlocks and other related material but this all
> > is rather complicated stuff and kindly ask in what direction I should
> > search. The issue can be reproduced for both postgresql-9.1 and
> > postgresql-9.2 and perhaps can be rephrased as: Very high CPU load in user
> > space (at random) with ntp enabled and (long?) running transactions.
>
> That's really bizarre. What "ntp client" are you using exactly? Is it
> configured to adjust the system clock by slewing, or by stepping? Can
> you identify what part of the code is eating CPU (try perf or oprofile)?
>
> regards, tom lane