Hi,

I've done a lot more testing today.  I've narrowed the issues down to
two specific VMs.  When I'm running either of these two VMs I get
horrific network performance.  When both of them are stopped, my
network is just fine (like 99% of the time).

I've been spending the day gathering packet dumps.  I'm running
Wireshark on my host, listening to the ovirtmgmt bridge (which is my
only network).  So, that SHOULD be capturing everything, right?  I have
not noticed anything out of the ordinary except for one odd thing --
correlated with my network wonkiness, Wireshark reports a bunch of
duplicate or out-of-order TCP packets!

I'll just note that correlation does not imply causation, but I'm not
seeing anything else out of the ordinary.  I certainly don't see
anything that would imply I've been hacked.

Is there something with CentOS/ovirt-host/vdsm networking that could
cause this?  Or could it be a router issue?  Specifically, my host and
my hosted-engine are on separate logical networks (different /24s) but
both networks are on the same physical wire; my router, an ERPro8, uses
a single interface with both /24s assigned and routes between them.
But some of the duplicate/out-of-order packets were for the periodic
host <-> engine health checks.

Still, I'm not sure why it's these two specific VMs that are causing my
issues, other than that they have the most network traffic coming and
going.

If it IS a router problem (the router is relatively new, and also
updated with the latest firmware), I'm honestly not sure how to
properly test that.

Any more ideas where I can look, or what I can/should be looking for?
I'm extremely comfortable with internet technologies (25+ years
experience) but this has got me stumped!

Thanks,

-derek

Jason Keltz <j...@cse.yorku.ca> writes:

> Derek,
> Have you used tcpdump to check what network traffic is coming out of
> your box?  Is it possible that it is some kind of DoS attack from
> outside in or that your VM was compromised and is attacking other
> external hosts?
>
> Hope you get to the bottom of it!
> Jason.
>
> On October 2, 2017 4:56:54 PM Derek Atkins <de...@ihtfp.com> wrote:
>
>> Hi,
>>
>> I'm at my wits' end so I'm tossing this here in the hopes that SOMEONE
>> will be able to help me.
>>
>> tl;dr: Ovirt is doing something on my network that is causing my fiber
>> modem to go from 3-5ms to 300-1000+ms round trip times.  I know it's
>> ovirt because when I unplug ovirt from my network the issue goes away;
>> when I plug it back in, the issue recurs.
>>
>> Long version:
>>
>> I've been running Ovirt 4.0.6 happily on CentOS 7.3 for several months
>> on a single host machine.  Indeed, the host had an uptime of 200+ days
>> and was working great until approximately midnight, September 21/22
>> (just over a week ago).  I was on an airplane halfway across the
>> Atlantic at that time, so it wasn't anything I did.
>>
>> My network is configured as:
>>
>>   fiber modem <-> edgerouter <-> switch <-> everything else
>>
>> ovirt is living in the "everything else" area.
>>
>> When I sit with a laptop connected to either the everything else range
>> or even directly connected to the fiber modem, I run 'mtr' and see
>> network times (starting at the fiber modem) that bounce all over the
>> place.  When I unplug ovirt I see consistent 3-5ms times.  Plug it back
>> in, voom, back up to badness.
>>
>> I've spent several hours plugging and unplugging different devices
>> trying to isolate the issue.  The only "device" that has any effect is
>> my ovirt box.
>>
>> I have tried to debug this in several ways, but really the only thing
>> that seems to have helped at all is shutting down all the VMs and the
>> hosted engine.  Once nothing else is running (but the host itself), only
>> then does the network seem to return to normal.
>>
>> I'm really at my wits' end on this; I have no idea what is causing this
>> or what might have changed to cause the issue right at that time.
>> I also can't imagine what ovirt is doing over the network that could
>> cause the modem, two physical hops away, to lose its mind in this way.
>> But my experimentation is definitely showing a direct correlation.
>>
>> Help!!
>>
>> -derek

-- 
Derek Atkins 617-623-3745
de...@ihtfp.com www.ihtfp.com
Computer and Internet Security Consultant
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users