Hi

My cluster consists of two nodes that serve a bunch of KVM
virtualized guests via the VirtualDomain resource agent.

I have an intermittent problem with stopping VirtualDomain resources. The
same domain can stop without errors 10 times, and then the next stop
generates an error, which leads to STONITH.

The error is always the same:
cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer

Nevertheless, the domain does get stopped.

I have seen the same problem on Scientific Linux 6.0 and Debian Squeeze hosts.

Disabling the balloon device in the domain's libvirt XML file solves it, but
that is just a workaround.
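
For reference, a minimal sketch of what the workaround can look like, assuming
the balloon device is disabled by setting the memballoon model to "none" in
the domain XML (here /etc/libvirt/qemu/debian1.xml). All other elements of the
domain definition are elided and stay unchanged:

```xml
<domain type='kvm'>
  <!-- name, memory, os, etc. unchanged and omitted here -->
  <devices>
    <!-- assumption: model='none' stops libvirt from adding a balloon device -->
    <memballoon model='none'/>
  </devices>
</domain>
```

The domain has to be redefined and restarted for the change to take effect.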

Logs:

Jul 01 12:27:45 bolek lrmd: [1882]: info: cancel_op: operation monitor[114] on ocf::VirtualDomain::vr_debian1 for client 1885, its parameters: CRM_meta_interval=[60000] CRM_meta_depth=[0] config=[/etc/libvirt/qemu/debian1.xml] depth=[0] crm_feature_set=[3.0.2] CRM_meta_name=[monitor] CRM_meta_start_delay=[10000] CRM_meta_timeout=[60000] migration_transport=[ssh]  cancelled
Jul 01 12:27:55 bolek lrmd: [1882]: info: rsc:vr_debian1:136: stop
Jul 01 12:27:55 bolek lrmd: [1882]: info: RA output: (vr_debian1:stop:stdout) Domain debian1 is being shutdown
Jul 01 12:27:58 bolek lrmd: [1882]: info: RA output: (vr_debian1:stop:stderr) error: cannot send monitor command '{"execute":"query-balloon"}': Connection reset by peer
Jul 01 12:27:58 bolek lrmd: [1882]: info: RA output: (vr_debian1:stop:stderr) error: Failed to destroy domain debian1


Best regards
-- 
Pawel Warowny


