Re: Details after Load Peak was: OT: Performance of VM

2018-02-13 Thread Micky Gough
+1 for atop. Be sure to adjust the sampling interval so it suits your
needs. It'll tell you what caused the spike.

Alternatively you could probably use sysdig, but I expect that'd result in
a fair performance hit if your system is already struggling.

Micky

On 14 February 2018 at 08:15, Gunnar "Nick" Bluth wrote:

> On 06.02.2018 at 15:31, Thomas Güttler wrote:
> >
> >
> > On 05.02.2018 at 14:26, Andreas Kretschmer wrote:
> >>
> >>
> >> On 05.02.2018 at 14:14, Thomas Güttler wrote:
> >>> What do you suggest to get some reliable figures?
> >>
> >> sar is often recommended, see
> >> https://blog.2ndquadrant.com/in-the-defense-of-sar/.
> >>
> >> Can you exclude other reasons like vacuum / vacuum freeze?
> >
> > In the current case it was a problem in the hypervisor.
> >
> > But I want to be prepared for the next time.
> >
> > The tool sar looks good. This way I can generate a chart where I can see
> > peaks. Nice.
> >
> > But one thing is still unclear. Imagine I see a peak in the chart. The
> > peak was some hours ago. AFAIK sar has only the aggregated numbers.
> >
> > But I need to know details if I want to answer the question "Why?". The
> > peak has gone and ps/top/iotop don't help me anymore.
> >
> > Any idea?
>
> I love atop (atoptool.nl) for exactly that kind of situation. It will
> save a snapshot every 10 minutes by default, which you can then simply
> "scroll" back to. It has helped me pinpoint nightly issues countless times.
>
> Only really available for Linux though (in case you're on *BSD).
>
> Best regards,
> --
> Gunnar "Nick" Bluth
> RHCE/SCLA
>
> Mobil +49 172 8853339
> Email: gunnar.bl...@pro-open.de
> _
> In 1984 mainstream users were choosing VMS over UNIX.
> Ten years later they are choosing Windows over UNIX.
> What part of that message aren't you getting? - Tom Payne
>
>
>


Re: Details after Load Peak was: OT: Performance of VM

2018-02-13 Thread Gunnar "Nick" Bluth
On 06.02.2018 at 15:31, Thomas Güttler wrote:
>
>
> On 05.02.2018 at 14:26, Andreas Kretschmer wrote:
>>
>>
>> On 05.02.2018 at 14:14, Thomas Güttler wrote:
>>> What do you suggest to get some reliable figures? 
>>
>> sar is often recommended, see
>> https://blog.2ndquadrant.com/in-the-defense-of-sar/.
>>
>> Can you exclude other reasons like vacuum / vacuum freeze?
> 
> In the current case it was a problem in the hypervisor.
> 
> But I want to be prepared for the next time.
> 
> The tool sar looks good. This way I can generate a chart where I can see
> peaks. Nice.
> 
> But one thing is still unclear. Imagine I see a peak in the chart. The
> peak was some hours ago. AFAIK sar has only the aggregated numbers.
>
> But I need to know details if I want to answer the question "Why?". The
> peak has gone and ps/top/iotop don't help me anymore.
> 
> Any idea?

I love atop (atoptool.nl) for exactly that kind of situation. It will
save a snapshot every 10 minutes by default, which you can then simply
"scroll" back to. It has helped me pinpoint nightly issues countless times.

Only really available for Linux though (in case you're on *BSD).
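
For illustration only, here is a minimal Python sketch of how one might
build the atop command line for replaying a past time window. The log
directory and the atop_YYYYMMDD file naming are assumptions based on common
packaging defaults, and atop_replay_command is a hypothetical helper name.
The options -r (read a raw file), -b (begin time) and -e (end time) are
atop's own.

```python
import datetime

# Assumption: daily raw logs are stored as /var/log/atop/atop_YYYYMMDD.
# This is a common packaging default, but it may differ on your system.
LOG_DIR = "/var/log/atop"


def atop_replay_command(day, begin, end, log_dir=LOG_DIR):
    """Build the atop argv that replays one day's raw log in a time window."""
    raw_file = f"{log_dir}/atop_{day:%Y%m%d}"
    # -r reads a raw file; -b and -e limit the replay to begin/end (hh:mm).
    return ["atop", "-r", raw_file, "-b", begin, "-e", end]


if __name__ == "__main__":
    # Example: look at the half hour around a nightly spike.
    cmd = atop_replay_command(datetime.date(2018, 2, 13), "02:00", "02:30")
    print(" ".join(cmd))
```

Inside the replay, pressing t moves forward one sample and T moves back one.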

Best regards,
-- 
Gunnar "Nick" Bluth
RHCE/SCLA

Mobil +49 172 8853339
Email: gunnar.bl...@pro-open.de
_
In 1984 mainstream users were choosing VMS over UNIX.
Ten years later they are choosing Windows over UNIX.
What part of that message aren't you getting? - Tom Payne





Re: Details after Load Peak was: OT: Performance of VM

2018-02-06 Thread Alan Hodgson
On Tue, 2018-02-06 at 15:31 +0100, Thomas Güttler wrote:
> 
> But one thing is still unclear. Imagine I see a peak in the chart. The peak
> was some hours ago. AFAIK sar has only the aggregated numbers.
> 
> But I need to know details if I want to answer the question "Why?". The peak
> has gone and ps/top/iotop don't help me anymore.
> 

The typical solution is to store stats on everything you can think of
with munin, cacti, ganglia, or similar systems.

I know with ganglia at least, in addition to all the many details it
already tracks on a system and the many plugins already available for
it, you can write your own plugins or simple agents, so you can keep
stats on anything you can code around.

Munin's probably the easiest to try out, though.
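
To give an idea of how small a custom plugin can be, here is a hedged
sketch of a Munin plugin in Python. It relies on the usual Munin plugin
convention: called with the argument "config" the plugin prints the graph
description, and called without arguments it prints "<field>.value <number>"
lines. The field name "load", the graph title and the helper function names
are illustrative choices.

```python
#!/usr/bin/env python3
import os
import sys


def config_lines():
    """Graph description that Munin requests by calling the plugin with 'config'."""
    return [
        "graph_title Load average (1 min)",
        "graph_category system",
        "graph_vlabel load",
        "load.label load",
    ]


def value_lines(load1):
    """One '<field>.value <number>' line per data source."""
    return [f"load.value {load1:.2f}"]


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "config":
        print("\n".join(config_lines()))
    else:
        # os.getloadavg() returns the 1, 5 and 15 minute load averages.
        print("\n".join(value_lines(os.getloadavg()[0])))
```

To try it, the script would be made executable and symlinked into the node's
plugin directory (commonly /etc/munin/plugins), followed by a munin-node
restart.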

Details after Load Peak was: OT: Performance of VM

2018-02-06 Thread Thomas Güttler



On 05.02.2018 at 14:26, Andreas Kretschmer wrote:
>
> On 05.02.2018 at 14:14, Thomas Güttler wrote:
>> What do you suggest to get some reliable figures?
>
> sar is often recommended, see
> https://blog.2ndquadrant.com/in-the-defense-of-sar/.
>
> Can you exclude other reasons like vacuum / vacuum freeze?


In the current case it was a problem in the hypervisor.

But I want to be prepared for the next time.

The tool sar looks good. This way I can generate a chart where I can see peaks. 
Nice.
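
A rough sketch of spotting such peaks without a chart, assuming the data
has been exported with something like "LC_ALL=C sadf -d -- -u", that is,
semicolon-separated text whose header line starts with "# " and includes a
"timestamp" column. The function name busiest_samples and the default
column are illustrative choices, not part of sysstat.

```python
import csv
import io


def busiest_samples(sadf_text, column="%iowait", limit=3):
    """Return (timestamp, value) pairs with the highest values in `column`."""
    lines = sadf_text.strip().splitlines()
    # Assumption: the first line is the header, prefixed with "# ".
    header = lines[0].lstrip("# ").split(";")
    reader = csv.DictReader(
        io.StringIO("\n".join(lines[1:])), fieldnames=header, delimiter=";"
    )
    # float() expects a decimal point, hence LC_ALL=C for the export.
    samples = [(row["timestamp"], float(row[column])) for row in reader]
    return sorted(samples, key=lambda s: s[1], reverse=True)[:limit]


if __name__ == "__main__":
    import sys

    # Usage: python3 peaks.py < exported.csv
    for timestamp, value in busiest_samples(sys.stdin.read()):
        print(timestamp, value)
```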

But one thing is still unclear. Imagine I see a peak in the chart. The peak
was some hours ago. AFAIK sar has only the aggregated numbers.

But I need to know details if I want to answer the question "Why?". The peak
has gone and ps/top/iotop don't help me anymore.

Any idea?

Regards,
  Thomas Güttler





--
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines