On Saturday, December 25, 2010, DuDu <[email protected]> wrote:
> Hi,
>
> I know my issue sounds weird, and I'm not sure it is OpenNebula's fault. But
> the problem is really annoying, so can anyone shed some light?
> I have an OpenNebula cluster deployed and running, with local disk. When a new
> VM gets provisioned, the disk template is copied from NFS to the host's
> local disk. I have two VMs running on two hosts. These VMs have a heartbeat
> connection between them, for HA. However, when a third VM is provisioned on one
> host (during the disk image copy process), the heartbeat connection times
> out (the socket returns "Broken Pipe"). So the failover is
> triggered (obviously, that is NOT correct).
>
> I checked CPU usage during the copy, and it was around 17%, which is not high.
> Pinging the host didn't show significant lag. I don't really understand why the
> host's disk I/O triggers the VM's network problem, do you?
>

It sounds plausible anyway - with NFS you involve the network too, and
copying big files can play hell with scheduling latencies...

Which hypervisor do you use? If you ping the VMs themselves during
provisioning, do you see latency? What about SSH responsiveness on
the host and VMs?
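
If ICMP ping isn't convenient, something like the following could be left
running while the third VM is provisioned. It is only a rough sketch: it
times TCP connects instead of ICMP echo, and it assumes the VMs accept
connections on port 22 (sshd). The function names are made up for this
example and are not part of OpenNebula or any heartbeat tool.

```python
import socket
import time


def tcp_connect_latency(host, port=22, timeout=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return None


def probe(host, port=22, count=10, interval=1.0):
    """Collect `count` latency samples, sleeping `interval` seconds after each."""
    samples = []
    for _ in range(count):
        samples.append(tcp_connect_latency(host, port))
        time.sleep(interval)
    return samples


def summarize(samples):
    """Return (failed_probes, worst_latency_in_seconds) for a list of samples."""
    ok = [s for s in samples if s is not None]
    failed = len(samples) - len(ok)
    worst = max(ok) if ok else None
    return failed, worst
```

Running `summarize(probe(vm_address, count=300))` from the other host, once
while idle and once during the image copy, should show whether the VM itself
becomes slow to answer (`vm_address` being whatever address the heartbeat
peer uses).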

In parallel, I'd start by raising the heartbeat timeout to a large value
(i.e. timeout > time to copy a VM), just to confirm what's happening.
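
With the timeout raised, it may also help to log when each heartbeat
arrives on the receiving side and look at the longest gap during the copy.
A minimal sketch, assuming you can record arrival times as seconds in a
list; both helper names are hypothetical and not tied to any particular
heartbeat implementation:

```python
def largest_gap(arrival_times):
    """Return the longest interval, in seconds, between consecutive heartbeats."""
    if len(arrival_times) < 2:
        return 0.0
    return max(b - a for a, b in zip(arrival_times, arrival_times[1:]))


def would_fail_over(arrival_times, timeout):
    """Return True if any gap between heartbeats exceeds `timeout` seconds."""
    return largest_gap(arrival_times) > timeout
```

If the largest gap only grows while the image is being copied, that points
at the copy starving the heartbeat rather than at a real peer failure.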


> BR
>
>
>

-- 
*Stefan Praszalowicz*
_______________________________________________
Users mailing list
[email protected]
http://lists.opennebula.org/listinfo.cgi/users-opennebula.org
