+1 for atop. Be sure to adjust the sampling interval so it suits your needs. It'll tell you what caused the spike.
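To make the interval advice a bit more concrete, here is a rough sketch of how the recording and replay commands could be put together. It only builds and prints the command lines and does not run atop. The file path and the helper names are made up for illustration, and the `-w` / `-r` usage is as I remember it from the atop man page, so check `man atop` on your system before relying on it.

```python
import shlex

SECONDS_PER_DAY = 86400


def atop_record_cmd(raw_file, interval_s=60):
    """Build the command that logs raw atop samples every interval_s seconds.

    Hypothetical helper: it only returns the argument list.
    """
    if interval_s <= 0:
        raise ValueError("interval must be a positive number of seconds")
    # atop -w <rawfile> <interval>: write samples to a raw file
    return ["atop", "-w", raw_file, str(interval_s)]


def atop_replay_cmd(raw_file):
    """Build the command that reads a raw file back for inspection."""
    # atop -r <rawfile>: replay; inside atop, 't' steps forward and 'T' back
    return ["atop", "-r", raw_file]


def samples_per_day(interval_s):
    """How many snapshots one day produces at the given interval."""
    return SECONDS_PER_DAY // interval_s


# Example path is an assumption, not a default used by atop.
raw = "/tmp/atop_spike.raw"
print(shlex.join(atop_record_cmd(raw, 60)))
print(shlex.join(atop_replay_cmd(raw)))
# A shorter interval catches brief spikes but writes more data:
print(samples_per_day(600), "snapshots/day at 10 min vs", samples_per_day(60), "at 1 min")
```

The trade-off is the usual one: a short interval makes it more likely that a brief spike lands in a snapshot, at the cost of larger raw files.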
Alternatively, you could probably use sysdig, but I expect that would cause a fair performance hit if your system is already struggling.

Micky

On 14 February 2018 at 08:15, Gunnar "Nick" Bluth <gunnar.bl...@pro-open.de> wrote:
> On 06.02.2018 at 15:31, Thomas Güttler wrote:
> >
> > On 05.02.2018 at 14:26, Andreas Kretschmer wrote:
> >>
> >> On 05.02.2018 at 14:14, Thomas Güttler wrote:
> >>> What do you suggest to get some reliable figures?
> >>
> >> sar is often recommended, see
> >> https://blog.2ndquadrant.com/in-the-defense-of-sar/.
> >>
> >> Can you exclude other reasons like vacuum / vacuum freeze?
> >
> > In the current case it was a problem in the hypervisor.
> >
> > But I want to be prepared for the next time.
> >
> > The tool sar looks good. This way I can generate a chart where I can see
> > peaks. Nice.
> >
> > .... But one thing is still unclear. Imagine I see a peak in the chart.
> > The peak was some hours ago. AFAIK sar has only the aggregated numbers.
> >
> > But I need to know details if I want to answer the question "Why?". The
> > peak has gone and ps/top/iotop don't help me anymore.
> >
> > Any idea?
>
> I love atop (atoptool.nl) for exactly that kind of situation. It will
> save a snapshot every 10 minutes by default, which you can then simply
> "scroll" back to. Helped me pinpointing nightly issues countless times.
>
> Only really available for Linux though (in case you're on *BSD).
>
> Best regards,
> --
> Gunnar "Nick" Bluth
> RHCE/SCLA
>
> Mobile +49 172 8853339
> Email: gunnar.bl...@pro-open.de
> _____________________________________________________________
> In 1984 mainstream users were choosing VMS over UNIX.
> Ten years later they are choosing Windows over UNIX.
> What part of that message aren't you getting? - Tom Payne