Curtis,

Do you have enough disk space to run tcpdump (filtering out port 22) on both 
hosts and on the small VM you created previously, and then start the migration?
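
A rough sketch of what I mean is below. The interface name em1 is taken from the lldptool command quoted further down; the output path and the 128-byte snap length are only assumptions to keep the capture file small, so adjust them to your setup:

```shell
#!/bin/sh
# Sketch only: build a tcpdump command that captures everything except SSH.
# IFACE and OUT are assumed defaults; use the NIC that carries migration
# traffic and a path on a filesystem with enough free space.
IFACE="${IFACE:-em1}"
OUT="${OUT:-/var/tmp/migration.pcap}"

# Print the capture command for a given interface and output file.
#   -n      do not resolve addresses to names
#   -s 128  keep only the first 128 bytes of each packet to save space
#   -w      write raw packets to a file instead of printing them
# The filter 'not port 22' leaves your own SSH session out of the capture.
capture_cmd() {
    printf 'tcpdump -n -i %s -s 128 -w %s not port 22\n' "$1" "$2"
}

capture_cmd "$IFACE" "$OUT"
```

Check the free space first with df -h /var/tmp, then run the printed command as root on each host and on the VM before starting the migration, and stop it with Ctrl-C once the host becomes reachable again.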

Best Regards,
Strahil Nikolov

On Aug 24, 2019 22:15, "Curtis E. Combs Jr." 
<[email protected]> wrote:
>
> I applied a 90 Mbps QoS rate limit, with the shares set to 10, to both 
> interfaces of 2 of the hosts. My hosts' names are swm-01 and swm-02. 
>
> Creating a small VM from a Cinder template and running it gave me a test VM. 
>
> When I migrated from swm-01 to swm-02, swm-01 immediately became 
> unresponsive to pings, SSH, and the oVirt interface, which marked 
> it as "NonResponsive" soon after the VM finished. The VM did finish 
> migrating; however, I'm unsure if that's a good migration or not. 
>
> Thank you, Strahil. 
>
> On Sat, Aug 24, 2019 at 12:39 PM Strahil <[email protected]> wrote: 
> > 
> > What is your bandwidth threshold for the network used for VM migration? 
> > Can you set a 90 Mbit/s threshold (yes, less than 100 Mbit/s) and try to 
> > migrate a small (1 GB RAM) VM? 
> > 
> > Do you see disconnects? 
> > 
> > If not, raise the threshold a little and check again. 
> > 
> > Best Regards, 
> > Strahil Nikolov 
> > 
> > On Aug 23, 2019 23:19, "Curtis E. Combs Jr." 
> > <[email protected]> wrote: 
> > > 
> > > It took a while for my servers to come back on the network this time. 
> > > I think it's due to ovirt continuing to try to migrate the VMs around 
> > > like I requested. The 3 servers' names are "swm-01, swm-02 and 
> > > swm-03". Eventually (about 2-3 minutes ago) they all came back online. 
> > > 
> > > So I disabled and stopped the lldpad service. 
> > > 
> > > Nope. Started some more migrations and swm-02 and swm-03 disappeared 
> > > again. No ping, SSH hung, same as before - almost as soon as the 
> > > migration started. 
> > > 
> > > If you all have any ideas what switch-level setting might be enabled, 
> > > let me know, cause I'm stumped. I can add it to the ticket that's 
> > > requesting the port configurations. I've already added the port 
> > > numbers and switch name that I got from CDP. 
> > > 
> > > Thanks again, I really appreciate the help! 
> > > cecjr 
> > > 
> > > 
> > > 
> > > On Fri, Aug 23, 2019 at 3:28 PM Dominik Holler <[email protected]> 
> > > wrote: 
> > > > 
> > > > 
> > > > 
> > > > On Fri, Aug 23, 2019 at 9:19 PM Dominik Holler <[email protected]> 
> > > > wrote: 
> > > >> 
> > > >> 
> > > >> 
> > > >> On Fri, Aug 23, 2019 at 8:03 PM Curtis E. Combs Jr. 
> > > >> <[email protected]> wrote: 
> > > >>> 
> > > >>> This little cluster isn't in production or anything like that yet. 
> > > >>> 
> > > >>> So, I went ahead and used your ethtool commands to disable pause 
> > > >>> frames on both interfaces of each server. I then chose a few VMs to 
> > > >>> migrate around at random. 
> > > >>> 
> > > >>> swm-02 and swm-03 both went out again. Unreachable. Can't ping, can't 
> > > >>> ssh, and the SSH session that I had open was unresponsive. 
> > > >>> 
> > > >>> Any other ideas? 
> > > >>> 
> > > >> 
> > > >> Sorry, no. It looks like two different NICs with different drivers and 
> > > >> firmware go down together. 
> > > >> This is a strong indication that the root cause is related to the 
> > > >> switch. 
> > > >> Maybe you can get some information about the switch config by 
> > > >> 'lldptool get-tlv -n -i em1' 
> > > >> 
> > > > 
> > > > Another guess: 
> > > > After the optional 'lldptool get-tlv -n -i em1', run 
> > > > 'systemctl stop lldpad' 
> > > > and then try to migrate again. 
> > > > 
> > > > 
> > > >> 
> > > >> 
> > > >>> 
> > > >>> On Fri, Aug 23, 2019 at 1:50 PM Dominik Holler <[email protected]> 
> > > >>> wrote: 
> > > >>> > 
> > > >>> > 
> > > >>> > 
> > > >>> > On Fri, Aug 23, 2019 at 6:45 PM Curtis E. Combs Jr. 
> > > >>> > <[email protected]> wrote: 
> > > >>> >> 
> > > >>> >> Unfortunately, I can't check on the switch. Trust me, I've tried. 
> > > >>> >> These servers are in a Co-Lo and I've put 5 tickets in asking about 
> > > >>> >> the port configuration. They just get ignored - but that's par for 
> > > >>> >> the course for IT here. Only about 2 out of 10 of our tickets get any 
> > > >>> >> response and usually the response doesn't help. Then
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/2WAWTUU3WQTU63TOBGYRMVJAWCIHEBWX/
