[ovs-discuss] ovsdb-server unkillable, need some help

Jeff Bachtel Wed, 19 Feb 2014 21:55:25 -0800

I'm running Open vSwitch 1.11 from the RDO Havana repository. In addition, I'm running OpenStack Havana, Neutron, and Ceph Emperor, all on some CentOS 6.5 machines.

After installing Bacula on the previous OpenStack version (Grizzly), I noticed the networking had become somewhat load sensitive. ovsdb-server was freezing: it stopped responding to queries on its unix socket and became unkillable in process state R<. Believing that this was probably due to running an old OVS version, I pushed ahead with an upgrade, only to find my stability problems became much worse. Every 20-30 minutes I can count on an ovsdb-server process freezing.
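
For anyone who wants to reproduce the "not responding on its unix socket" check, here is a minimal sketch of a probe. It assumes the server speaks the OVSDB JSON-RPC protocol and answers its "echo" method. The function name probe_ovsdb and the socket path in the usage note are illustrative assumptions, not something taken from my setup.

```python
import json
import socket


def probe_ovsdb(sock_path, timeout=5.0):
    """Return True if the server answers a JSON-RPC "echo" request.

    Hypothetical helper: sock_path is whatever unix socket the
    ovsdb-server under test listens on. The timeout applies to each
    socket operation, not to the probe as a whole.
    """
    request = json.dumps(
        {"method": "echo", "params": ["ping"], "id": 0}
    ).encode()
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(sock_path)
        s.sendall(request)
        buf = b""
        while True:
            chunk = s.recv(4096)
            if not chunk:
                # Peer closed the connection without a complete reply.
                return False
            buf += chunk
            try:
                reply = json.loads(buf.decode())
            except ValueError:
                # Reply is still incomplete; keep reading.
                continue
            return (
                isinstance(reply, dict)
                and reply.get("id") == 0
                and reply.get("error") is None
            )
    except OSError:
        # Covers a missing socket, a refused connection, and timeouts.
        return False
    finally:
        s.close()
```

Calling probe_ovsdb("/var/run/openvswitch/db.sock") from cron or a loop (the path is an assumption; adjust it to the local install) would give a timestamp for when the server stops answering, which can then be lined up against dmesg and /var/log/messages.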

At https://drive.google.com/folderview?id=0B-wx2_T_hW-_OXZJWGJNc0l0MzQ&usp=sharing please find a folder with shared copies of diagnostic files from a machine with a hung ovsdb-server. There is a process list (.ps; apologies, I forgot about PostScript until the upload was done), strace, dmesg, and /var/log/messages.

The strace didn't reveal anything suspicious to me. To mitigate, I tried lowering log verbosity, completely recreating conf.db, frequent compacting (every minute), and putting the db on a ramdisk. None of these solved the problem.

The ovsdb-server processes most likely to succumb to locking run on ceph hosts running osd, meaning they can see a lot of network traffic as well as disk i/o.

I don't understand what a simple database RPC server could be doing that would cause it to become unkillable, especially with the attempt at minimizing disk i/o by putting the db file on a ramdisk.
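
One thing that might narrow this down is sampling /proc/<pid>/stat for the stuck process a few seconds apart. The sketch below parses the state, CPU time, and nice fields as laid out in proc(5). The helper names parse_proc_stat and read_proc_stat are hypothetical. If stime (kernel-mode ticks) keeps climbing while utime (user-mode ticks) stays flat, that would be consistent with the process spinning inside a system call, though I have not confirmed that is what happens here.

```python
def parse_proc_stat(line):
    """Parse selected fields from one /proc/<pid>/stat line.

    The comm field may itself contain spaces or parentheses, so the
    line is split at the last ')' rather than on whitespace alone.
    """
    head, _, tail = line.rpartition(")")
    pid_str, _, comm = head.partition("(")
    fields = tail.split()
    # fields[0] is field 3 (state) in proc(5), so field n is fields[n - 3].
    return {
        "pid": int(pid_str),
        "comm": comm,
        "state": fields[0],        # e.g. R, S, D, Z
        "utime": int(fields[11]),  # user-mode CPU time, in clock ticks
        "stime": int(fields[12]),  # kernel-mode CPU time, in clock ticks
        "nice": int(fields[16]),   # negative values show as "<" in ps
    }


def read_proc_stat(pid):
    """Read and parse /proc/<pid>/stat for a live process (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        return parse_proc_stat(f.read())
```

Two calls to read_proc_stat a few seconds apart, compared field by field, would show whether the frozen process is burning CPU in the kernel or in user space.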

I hope someone has some ideas of what I might do to test or mitigate the situation. Not running ceph osd on the hosts is, unfortunately, not a solution I can use.


Thanks,
Jeff
_______________________________________________
discuss mailing list
[email protected]
http://openvswitch.org/mailman/listinfo/discuss
