I hadn't seen this until you mentioned it, but I can see the close calls
in my local env. They appear to happen in a new process, after a clone()
syscall, roughly every couple of seconds. So they are likely part of the
script that does the health check:
script "</dev/tcp/${ip}/${watch_port}"
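For anyone who wants to reproduce that check by hand, here is a minimal
bash sketch. The helper name check_port and the 2-second timeout are my
own assumptions for illustration and are not taken from the keepalived
config; it only relies on bash's /dev/tcp redirection and coreutils timeout.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and timeout are assumptions, not from keepalived).
# Returns 0 if a TCP connection to ip:port can be opened, non-zero otherwise.
check_port() {
  local ip="$1" watch_port="$2"
  # A bare redirection from /dev/tcp/HOST/PORT makes bash attempt a TCP
  # connect. Running it via "bash -c" spawns a child process, so the
  # socket is opened and closed in that child, not in the caller.
  timeout 2 bash -c "</dev/tcp/${ip}/${watch_port}" 2>/dev/null
}

# Example usage (address and port are placeholders):
#   if check_port 10.0.0.5 80; then echo "up"; else echo "down"; fi
```

Each run forks a short-lived child, which would be consistent with seeing
the activity right after a clone(), though this sketch says nothing about
where the large number of close calls comes from.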
But I don't see a slowdown on the CPU side on my instance. It has been
running at about 1% for the last 30 minutes or so, so I suspect the hang
might have to do with the agent/sysdig in your case.
Filing a bug would be good. I spent some time on it just now but couldn't
figure out what's causing it, or whether it's a "feature".
Thanks,
Ram//
On Tue, Apr 5, 2016 at 2:04 PM, Chuck Sochin <[email protected]>
wrote:
> Using OSEv3.1.1
>
> I'm looking to set up sysdig in our native HA OpenShift environment, but
> I'm having issues getting the agent to run on our infra nodes hosting
> keepalived and ha-proxy. The agent runs without issue on all the other
> nodes in our env.
>
> After the agent has been running about an hour or two, the node hangs and
> our hypervisor reports 100% cpu utilization. A power reset is the only
> option to bring the node back to life. The problem may be with keepalived
> doing an extremely large number (around 17 million in a minute) of "close"
> syscall operations, and it looks like those close operations are on any
> available fd. Is this expected behavior of keepalived running in an
> OSEv3.1.1 HA environment?
>
> Thanks!
>
> _______________________________________________
> users mailing list
> [email protected]
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
--
Ram//
main(O,s){s=--O;10<putchar(3^O?97-(15&7183>>4*s)*(O++?-1:1):10)&&\
main(++O,s++);}