On 03.01.2018 at 04:57, Wei Xu wrote:
> On Tue, Jan 02, 2018 at 10:17:25PM +0100, Stefan Priebe - Profihost AG wrote:
>>
>> On 02.01.2018 at 18:04, Wei Xu wrote:
>>> On Tue, Jan 02, 2018 at 04:24:33PM +0100, Stefan Priebe - Profihost AG
>>> wrote:
>>>> Hi,
>>>> On 02.01.2018 at 15:20, Wei Xu wrote:
>>>>> On Tue, Jan 02, 2018 at 12:17:29PM +0100, Stefan Priebe - Profihost AG
>>>>> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> currently i'm trying to fix a problem where we have "random" missing
>>>>>> packets.
>>>>>>
>>>>>> We're doing an ssh connect from machine a to machine b every 5 minutes
>>>>>> via rsync and ssh.
>>>>>>
>>>>>> Sometimes it happens that we get this cron message:
>>>>>> "Connection to 192.168.0.2 closed by remote host.
>>>>>> rsync: connection unexpectedly closed (0 bytes received so far) [sender]
>>>>>> rsync error: unexplained error (code 255) at io.c(226) [sender=3.1.2]
>>>>>> ssh: connect to host 192.168.0.2 port 22: Connection refused"
>>>>>
>>>>> Hi Stefan,
>>>>> What kind of virtio-net backend are you using? Can you paste your qemu
>>>>> command line here?
>>>>
>>>> Sure netdev part:
>>>> -netdev
>>>> type=tap,id=net0,ifname=tap317i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on
>>>> -device
>>>> virtio-net-pci,mac=EA:37:42:5C:F3:33,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300
>>>> -netdev
>>>> type=tap,id=net1,ifname=tap317i1,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on,queues=4
>>>> -device
>>>> virtio-net-pci,mac=6A:8E:74:45:1A:0B,netdev=net1,bus=pci.0,addr=0x13,id=net1,vectors=10,mq=on,bootindex=301
>>>
>>> According to what you have mentioned, the traffic is not heavy for the
>>> guests, the dropping shouldn't happen for regular case.
>>
>> The avg traffic is around 300kb/s.
>>
>>> What is your hardware platform?
>>
>> Dual Intel Xeon E5-2680 v4
>>
>>> and Which versions are you using for both
>>> guest/host kernel
>> Kernel v4.4.103
>>
>>> and qemu?
>> 2.9.1
>>
>>> Are there other VMs on the same host?
>> Yes.
>
> What about the CPU load?

Host:
80-90% Idle
LoadAvg: 6-7

VM:
97%-99% Idle

>>>>> 'Connection refused' usually means that the client gets a TCP Reset rather
>>>>> than losing packets, so this might not be a relevant issue.
>>>>
>>>> Mhm so you mean these might be two separate ones?
>>>
>>> Yes.
>>>
>>>>
>>>>> Also you can do a tcpdump on both guests and see what happened to SSH
>>>>> packets
>>>>> (tcpdump -i tapXXX port 22).
>>>>
>>>> Sadly not as there's too much traffic on that part as rsync is syncing
>>>> every 5 minutes through ssh.
>>>
>>> You can do a tcpdump for the entire traffic from the guest and host and
>>> compare what kind of packets are dropped if the traffic is not overloaded.
>>
>> Are you sure? I don't get why the same amount and same kind of packets
>> should be received by both taps which are connected to different bridges
>> to different HW and physical interfaces.
>
> Exactly, possibly this would be a host or guest kernel bug rather than a
> qemu issue, cos you are using vhost kernel as the backend and the two stats
> are independent, you might have to check out what is happening inside the
> traffic.

What do you mean by inside the traffic?

Stefan
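
Since the per-interface stats came up above: one way to watch them over a cron interval might be to snapshot the kernel counters for a tap device before and after an rsync run. The following is only a rough sketch, not something from this thread. It assumes the usual Linux sysfs layout under /sys/class/net/<iface>/statistics/, and the helper names (read_counters, counter_delta) are made up for illustration.

```python
from pathlib import Path

# Counter files the Linux kernel exposes per interface under
# /sys/class/net/<iface>/statistics/ (assumed sysfs layout).
COUNTERS = ("rx_packets", "tx_packets", "rx_dropped", "tx_dropped")


def read_counters(iface, sysfs_root="/sys/class/net"):
    """Return the current counters of one interface as a dict of ints.

    Hypothetical helper: sysfs_root is a parameter only so the function
    can be pointed at a test directory.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    return {name: int((stats_dir / name).read_text().strip())
            for name in COUNTERS}


def counter_delta(before, after):
    """Per-counter increase between two snapshots of the same interface."""
    return {name: after[name] - before[name] for name in COUNTERS}
```

Usage would be something like taking read_counters("tap317i0") before and after one 5 minute interval and printing counter_delta(before, after). A growing rx_dropped or tx_dropped on one tap but not the other would at least narrow down where to capture.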