Hello,

I can confirm my strace looks the same.
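
For anyone who wants to compare traces beyond eyeballing them, here is a
rough sketch (my own helper, not part of heartbeat) that tallies the calls in
a saved strace log. It assumes one syscall per line, as strace writes them.
Note that strace prints to stderr, so use "strace -o FILE -p PID" to save a
trace; a plain "> foo" redirect captures nothing. The function name
summarize is made up for this example.

```python
import re
from collections import Counter

# Matches lines such as:
#   recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource ...)
#   poll([{fd=6, events=0}], 1, 0)          = 0
# Group 1 is the syscall name, group 2 the return value,
# group 3 the errno name when the call failed.
CALL_RE = re.compile(r"^(\w+)\(.*\)\s+=\s+(-?\d+)(?:\s+([A-Z]+))?")


def summarize(lines):
    """Count (syscall, outcome) pairs; outcome is an errno name or 'ok'."""
    counts = Counter()
    for line in lines:
        match = CALL_RE.match(line.strip())
        if match is None:
            # Skip lines that are not syscalls, e.g. "Process 10040 detached".
            continue
        name, _retval, errno_name = match.groups()
        counts[(name, errno_name or "ok")] += 1
    return counts
```

To use it, pass an open file, for example summarize(open("lrmd.trace"))
where lrmd.trace is a hypothetical file saved with -o, and look at
most_common() on the result. In a trace like the one quoted below I would
expect times() and recvfrom() returning EAGAIN to dominate.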

On 1/16/08, Peter Mueller <[EMAIL PROTECTED]> wrote:
> > I am seeing that my CPU usage is ever increasing. Restarting the
> > various HA services drops it to near 0 again, but then it climbs
> > back up over time.
> >
> > Graph of CPU usage:
> > http://193.201.200.132/~rip/linuxha-cpu.png
> >
> > Investigating this I found that the offending process is /usr/lib/heartbeat/lrmd
> >
> > My setup:
> >
> > CentOS 5.1
> > Heartbeat 2.1 from centos extras
> >
> > Has anyone seen this behavior before and can perhaps shed some light?
>
> I am experiencing the same behavior on one cluster:
> http://world.anarchy.com/~peter/ha/cpu_increase.png
> CentOS release 4.5 (Final)
> Linux oakdb04 2.6.9-55.ELlargesmp
> heartbeat-stonith-2.1.2-3.el4.centos
> heartbeat-pils-2.1.2-3.el4.centos
> heartbeat-2.1.2-3.el4.centos
>
> top - 17:25:29 up 81 days,  4:34,  1 user,  load average: 0.24, 0.22, 0.18
> Tasks:  97 total,   2 running,  95 sleeping,   0 stopped,   0 zombie
> Cpu(s):  3.9% us,  0.4% sy,  0.0% ni, 94.4% id,  1.2% wa,  0.0% hi,  0.0% si
> Mem:   8163852k total,  8142292k used,    21560k free,    88432k buffers
> Swap:  8193140k total,      208k used,  8192932k free,  6467864k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 10040 root      16   0  244m 216m 1300 R   26  2.7   5696:05 lrmd
> 10362 mysql     15   0 7321m 1.0g 5340 S    6 13.3  12773:48 mysqld
>
> A few seconds of strace on lrmd:
> [EMAIL PROTECTED] ~]# strace -p 10040 > foo
> Process 10040 attached - interrupt to quit
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6, events=0}], 1, 0)          = 0
> recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6, events=0}], 1, 0)          = 0
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> recvfrom(7, 0x522603, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=7, events=0}], 1, 0)          = 0
> recvfrom(7, 0x522603, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=7, events=0}], 1, 0)          = 0
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> recvfrom(8, 0x524a09, 3343, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=8, events=0}], 1, 0)          = 0
> recvfrom(8, 0x524a09, 3343, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=8, events=0}], 1, 0)          = 0
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> recvfrom(9, 0x527144, 3972, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=9, events=0}], 1, 0)          = 0
> recvfrom(9, 0x527144, 3972, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=9, events=0}], 1, 0)          = 0
> times({tms_utime=33660093, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874110
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6, events=0}], 1, 0)          = 0
> recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6, events=0}], 1, 0)          = 0
> recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6, events=0}], 1, 0)          = 0
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> recvfrom(7, 0x522603, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=7, events=0}], 1, 0)          = 0
> recvfrom(7, 0x522603, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=7, events=0}], 1, 0)          = 0
> recvfrom(7, 0x522603, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=7, events=0}], 1, 0)          = 0
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> recvfrom(8, 0x524a09, 3343, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=8, events=0}], 1, 0)          = 0
> recvfrom(8, 0x524a09, 3343, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=8, events=0}], 1, 0)          = 0
> recvfrom(8, 0x524a09, 3343, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=8, events=0}], 1, 0)          = 0
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> recvfrom(9, 0x527144, 3972, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=9, events=0}], 1, 0)          = 0
> recvfrom(9, 0x527144, 3972, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=9, events=0}], 1, 0)          = 0
> recvfrom(9, 0x527144, 3972, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=9, events=0}], 1, 0)          = 0
> times({tms_utime=33660100, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874116
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874123
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874123
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874123
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874123
> poll([{fd=4, events=POLLIN|POLLPRI}, {fd=5, events=POLLIN|POLLPRI}, {fd=6, events=POLLIN|POLLPRI}, {fd=7, events=POLLIN|POLLPRI}, {fd=9, events=POLLIN|POLLPRI}, {fd=8, events=POLLIN|POLLPRI}], 6, 1000) = 0
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6, events=0}], 1, 0)          = 0
> recvfrom(6, 0x51f533, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=6, events=0}], 1, 0)          = 0
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> recvfrom(7, 0x522603, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=7, events=0}], 1, 0)          = 0
> recvfrom(7, 0x522603, 3973, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=7, events=0}], 1, 0)          = 0
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> recvfrom(8, 0x524a09, 3343, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=8, events=0}], 1, 0)          = 0
> recvfrom(8, 0x524a09, 3343, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=8, events=0}], 1, 0)          = 0
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> recvfrom(9, 0x527144, 3972, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=9, events=0}], 1, 0)          = 0
> recvfrom(9, 0x527144, 3972, 64, 0, 0)   = -1 EAGAIN (Resource temporarily unavailable)
> poll([{fd=9, events=0}], 1, 0)          = 0
> times({tms_utime=33660106, tms_stime=517128, tms_cutime=1049402, tms_cstime=1264036}) = 1130874223
> Process 10040 detached
>
> Regards,
> P
>


-- 
R.I.Pienaar