>>> Jan Pokorný <[email protected]> wrote on 22.05.2018 at 19:09 in message <[email protected]>:
> On 18/05/18 20:04 +0000, Shobe, Casey wrote:
>> On a couple clusters that have been running for a little while
>> (without fencing), I'm seeing runaway server.rb processes using 100%
>> of a single CPU core each.
>>
>> When I look at ps, I can see that these have something to do with
>> pcsd:
>>
>> USER   PID %CPU %MEM     VSZ   RSS TTY STAT START     TIME COMMAND
>> root  6103  0.0  0.3 1076744 59200 ?   Ssl  Apr06    59:09 /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
>> root 17548 99.3  0.2  873648 46308 ?   Rl   Apr18 43356:57  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
>> root 16688 98.9  0.3  941160 49472 ?   Rl   May01 24300:52  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
>> root  6009 98.8  0.3  942188 49688 ?   R    May02 22607:08  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
>> root 15556 98.8  0.3 1076344 51836 ?   R    May03 21410:12  \_ /usr/bin/ruby -C/var/lib/pcsd -I/usr/share/pcsd -- /usr/share/pcsd/ssl.rb & > /dev/null &
>>
>> Running strace on one of the processes shows that they are looping
>> on sched_yield().
You should probably have inspected all of the processes, as one of them is
expected to be doing something reasonable. Occasionally ltrace can be
helpful as well...

>
> Can you share some HW specs with us, at least the architecture
> to start with -- x86_64=amd64, arm (gen/mode?), something else?
>
> The suspicion here is that just the first one may be sufficiently
> free from code porting glitches, I mean at the Ruby interpreter
> level or lower.
>
> --
> Jan (Poki)
