Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

Jan Pekař - Imatic Tue, 07 Nov 2017 02:18:08 -0800

I'm calling kill -STOP to simulate the behavior that occurred when one Ceph node ran out of memory. The processes were not killed, but were somehow suspended/unresponsive (they couldn't create new threads etc.), and that caused all virtuals (on other nodes) to hang.

I decided to simulate it with kill -STOP MONPID OSDPID and I succeeded.
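For anyone who wants to repeat the test, here is a minimal sketch of the same freeze in Python. It only wraps what kill -STOP and kill -CONT do; the helper names frozen and freeze_for are made up for illustration, and the PIDs still have to be looked up by hand. The one thing it adds is that the daemons are always resumed afterwards, even if the test is interrupted.

```python
import os
import signal
import time
from contextlib import contextmanager


@contextmanager
def frozen(pids):
    """Send SIGSTOP to every PID and always send SIGCONT on exit.

    Hypothetical helper, not part of Ceph. It needs permission to signal
    the target processes (typically root for ceph-mon / ceph-osd).
    """
    stopped = []
    try:
        for pid in pids:
            os.kill(pid, signal.SIGSTOP)  # same effect as `kill -STOP pid`
            stopped.append(pid)
        yield stopped
    finally:
        for pid in stopped:
            try:
                os.kill(pid, signal.SIGCONT)  # same effect as `kill -CONT pid`
            except ProcessLookupError:
                pass  # the process went away while it was frozen


def freeze_for(pids, seconds):
    """Keep the given processes suspended for `seconds`, then resume them."""
    with frozen(pids):
        time.sleep(seconds)
```

Usage would be something like freeze_for([MONPID, OSDPID], 120), with the two PIDs taken from the node under test.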

After I stop the MON and the OSD, it takes a few seconds to get the OSD unresponsive messages, and exactly when I get the final

libceph: osd6 down
all my virtuals stop responding (they stop answering pings, VNC is unusable etc.)

Tried with a librbd disk definition or an rbd map device attached inside QEMU/KVM virtuals.


JP


On 7.11.2017 10:57, Piotr Dałek wrote:

On 17-11-07 12:02 AM, Jan Pekař - Imatic wrote:
Hi,
I'm using Debian stretch with ceph 12.2.1-1~bpo80+1 and qemu 1:2.8+dfsg-6+deb9u3
I'm running 3 nodes with 3 monitors and 8 OSDs on my nodes, all on IPv6.

When I tested the cluster, I noticed a strange and severe problem.
On the first node I'm running qemu hosts with a librados disk connection to the cluster and all 3 monitors mentioned in the connection.
On the second node I stopped the mon and osd with the command

kill -STOP MONPID OSDPID
Within one minute all my qemu hosts on the first node freeze, so they don't even respond to ping. [..]

Why would you want to *stop* (as in, freeze) a process instead of killing it?
Anyway, with the processes still there, it may take a few minutes before the cluster realizes that the daemons are stopped and kicks them out of the cluster, restoring normal behavior (assuming correctly set crush rules).
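To put a number on "a few minutes", one could time how long the cluster takes to mark the frozen OSD down. The sketch below is only an assumption-laden illustration: it polls ceph osd dump --format json and assumes the output has an "osds" list whose entries carry "osd" and "up" fields, which should be checked against the Ceph version in use. The function names are hypothetical, and the ceph CLI must still be able to reach a monitor quorum while one mon is frozen.

```python
import json
import subprocess
import time


def down_osds(osd_dump_json):
    """Return the ids of OSDs that the OSD map currently marks as down.

    Assumes the JSON layout {"osds": [{"osd": <id>, "up": 0 or 1, ...}]}.
    """
    dump = json.loads(osd_dump_json)
    return sorted(o["osd"] for o in dump.get("osds", []) if not o.get("up"))


def seconds_until_down(osd_id, timeout=900.0, interval=1.0):
    """Poll the cluster until `osd_id` is marked down.

    Returns the elapsed seconds, or None if `timeout` expires first.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        out = subprocess.run(
            ["ceph", "osd", "dump", "--format", "json"],
            check=True, capture_output=True, text=True,
        ).stdout
        if osd_id in down_osds(out):
            return time.monotonic() - start
        time.sleep(interval)
    return None
```

Started right after the kill -STOP, seconds_until_down(6) would report how long osd6 stayed "up" in the map. Settings such as osd heartbeat grace and mon osd down out interval are the timers I would look at first if that number seems too long.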


--
============
Ing. Jan Pekař
jan.pe...@imatic.cz | +420603811737
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz
============
--
