I managed to build the VBox v6.0 additions under 8.99.28; now, when using
the vioif interface, I get reasonable results:

...
PS C:\bin\iperf-3.1.3-win64> .\iperf3.exe -c marge
Connecting to host marge, port 5201
[  4] local 192.168.0.35 port 10152 connected to 192.168.0.6 port 5201
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-1.00   sec  68.2 MBytes   572 Mbits/sec
[  4]   1.00-2.00   sec  71.9 MBytes   603 Mbits/sec
[  4]   2.00-3.00   sec  69.8 MBytes   585 Mbits/sec
[  4]   3.00-4.00   sec  71.9 MBytes   603 Mbits/sec
[  4]   4.00-5.00   sec  68.9 MBytes   578 Mbits/sec
[  4]   5.00-6.00   sec  69.4 MBytes   581 Mbits/sec
[  4]   6.00-7.00   sec  70.2 MBytes   590 Mbits/sec
[  4]   7.00-8.00   sec  75.6 MBytes   634 Mbits/sec
[  4]   8.00-9.00   sec  70.1 MBytes   589 Mbits/sec
[  4]   9.00-10.00  sec  73.8 MBytes   619 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec   710 MBytes   595 Mbits/sec                  sender
[  4]   0.00-10.00  sec   710 MBytes   595 Mbits/sec                  receiver

iperf Done.
.....

It is also interesting that when NetBSD is run under XenServer (XCP-NG,
actually) in PV mode and benchmarked against the same 8.99.28 version
running on a physical machine, with everything on a 1 Gb interface and
switch, I get a fully saturated line (~933 Mb/s). When the iperf3 server is
on the same XCP-NG guest and the client is a CentOS guest, the figures
approach 2.3 Gb/s.

On Wed, 19 Dec 2018 at 12:36, Chavdar Ivanov <[email protected]> wrote:
>
> The workaround is fine. In the meantime I upgraded my VirtualBox
> installation to 6.0 (released yesterday) and will check again.
>
> While here I did some, admittedly not very scientific, benchmarks on
> network performance under VirtualBox. I started a single guest of a
> different type, had iperf3 installed and running as server on the
> guest and tested the iperf3 client connection from the host. All
> guests were configured to use bridged adapter to the active (WiFi, in
> my case Intel AC-7265, but it shouldn't matter), using the first
> (desktop) Intel emulation (82540EM). The results varied wildly between
> different guests, the best being the latest Linux guests (OpenSUSE
> Tumbleweed and Fedora 29), the worst happened to be NetBSD-current. I
> also tested on a few systems the difference in speed between the above
> chosen adapter type and the virtio one; this again showed differences
> - NetBSD was better, on some tests by a factor of two, when using
> virtio, whereas OpenBSD was the other way round - the Intel emulation
> was twice as fast. I've attached the log file of some of these
> attempts for reference. I didn't have Guest additions running on any
> of the BSD guests, which perhaps is relevant; the other systems had it
> configured. I also switched the emulation on the NetBSD host from KVM
> to default, as you suggested.
>
> As I said, we shouldn't be reading too much from this, but it is
> still a point.
>
>
> On Wed, 19 Dec 2018 at 02:35, Masanobu SAITOH <[email protected]> wrote:
> >
> > On 2018/12/18 20:13, Masanobu SAITOH wrote:
> > > Hi!
> > >
> > > On 2018/12/17 19:38, Chavdar Ivanov wrote:
> > >> I went through a series of tests. It is indeed that point the panic
> > >> takes place, the two parts of the screendump are in
> > >>
> > >> http://ci4ic4.tx0.org/nb-panic-wm-03.png and
> > >> http://ci4ic4.tx0.org/nb-panic-wm-04.png .
> > >
> > > Thanks. This is the workaround code for broken lapic timer
> > > counter which was added in:
> > >
> > > http://mail-index.netbsd.org/source-changes/2017/11/23/msg089946.html
> > >
> > > http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/x86/x86/lapic.c.diff?r1=1.63&r2=1.64&f=h
> > >
> > > Your VM is configured act as KVM
> > > (See system->acceleration(L) tab or see .box file's "Paravirt provider=")
> > >
> > > I set up my vm to KVM and
> > >
> > >> VirtualBox gives three Intel NIC options:
> > >>
> > >> Intel PRO/1000 MT Desktop (82540EM)
> > >> Intel PRO/1000 T Server (82543GC)
> > >> Intel PRO/1000 MT Server (82545EM)
> > >>
> > >> I was able to get a panic with the same kernel from 13/12/2018 only
> > >> when I select the second option:
> > >
> > > I changed my VM's setting to use 82543GC. I tried hibernation
> > > three times but I couldn't reproduce the problem. I couldn't reproduce
> > > the same problem, but this problem must be exist because you had the
> > > problem.
> > >
> > > The possibilities are:
> > > a) VirtualBox's lapic is not good.
> > > b) Our workaround code is not perfect or somewhere is not good.
> > > c) any others
> > >
> > > I suspect this problem is not from if_wm.c. but from
> > >> There was a VirtualBox upgrade a few weeks ago, perhaps the problem is
> > >> there.
> > >
> > >
> > > I read vbox/src/VBox/Devices/Network/DevE1000.cpp. One of the
> > > difference between 82543GC emulation and other two is that
> > > it generates interrupt when chip reset occurred.
> > > If other network
> > > device emulation works well, I suspect that the reset timing in vbox
> > > is not good and it makes no update of lapic timer.
> > >
> > > Workarounds are:
> > > a) Don't use KVM mode and use "Default" or other.
> > >    On my Windows7's virtual box, "Default" makes
> > >    CPUID2_RAZ bit not set. It makes NetBSD recognize
> > >    it's not on KVM.
> >
> > If the problem which lapic timer stops also exists on the "Default" mode,
> > that workaround isn't used and delay() won't work. If so, b) is the best
> > to avoid the problem.
> >
> > > b) Use Other than 82543GC.
> > > c) any others
> > >
> > > BTW, when I use 82543GC emulation, I got the following bug:
> > >> makphy0 at wm0 phy 0: Marvell 88E1000 Gigabit PHY, rev. 0
> > >> makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > >> makphy1 at wm0 phy 1: Marvell 88E1000 Gigabit PHY, rev. 0
> > >> makphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > > (snip)
> > >> makphy31 at wm0 phy 31: Marvell 88E1000 Gigabit PHY, rev. 0
> > >> makphy31: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > >> ifmedia_match: multiple match for 0x20/0xfbff9ff, selected instance 0
> > >
> > > This _IS_ a bug of VirtualBox's 82543GC emulation.
> > > DevE1000Phy.cpp line 568 says:
> > >
> > > /* Note: A single PHY is supported, ignore PHYADR */
> > >
> > > So I recommend all users not to use 82543GC emulation until this PHY
> > > bug is fixed.
> > >
> > >> ......
> > >> -rw------- 1 root wheel 2199810 Dec 17 09:24 netbsd.9
> > >> -rw------- 1 root wheel 147348504 Dec 17 09:24 netbsd.9.core
> > >> /var/crash # gdb netbsd.9
> > >> GNU gdb (GDB) 8.0.1
> > >> Copyright (C) 2017 Free Software Foundation, Inc.
> > >> License GPLv3+: GNU GPL version 3 or later
> > >> <http://gnu.org/licenses/gpl.html>
> > >> This is free software: you are free to change and redistribute it.
> > >> There is NO WARRANTY, to the extent permitted by law. Type "show
> > >> copying"
> > >> and "show warranty" for details.
> > >> This GDB was configured as "x86_64--netbsd".
> > >> Type "show configuration" for configuration details.
> > >> For bug reporting instructions, please see:
> > >> <http://www.gnu.org/software/gdb/bugs/>.
> > >> Find the GDB manual and other documentation resources online at:
> > >> <http://www.gnu.org/software/gdb/documentation/>.
> > >> For help, type "help".
> > >> Type "apropos word" to search for commands related to "word"...
> > >> Reading symbols from netbsd.9...(no debugging symbols found)...done.
> > >> (gdb) target kvm netbsd.9.core
> > >> 0xffffffff80222d75 in cpu_reboot ()
> > >> (gdb) bt
> > >> #0  0xffffffff80222d75 in cpu_reboot ()
> > >> #1  0xffffffff8076e6f7 in db_reboot_cmd ()
> > >> #2  0xffffffff8076ee92 in db_command ()
> > >> #3  0xffffffff8076f20c in db_command_loop ()
> > >> #4  0xffffffff80772b80 in db_trap ()
> > >> #5  0xffffffff8021f5c2 in kdb_trap ()
> > >> #6  0xffffffff802244b1 in trap ()
> > >> #7  0xffffffff8021d568 in alltraps ()
> > >> #8  0xffffffff8021de45 in breakpoint ()
> > >> #9  0xffffffff809d54b0 in vpanic ()
> > >> #10 0xffffffff809d5550 in panic ()
> > >> #11 0xffffffff802514f0 in lapic_delay ()
> > >> #12 0xffffffff80353270 in wm_gmii_i82543_readreg ()
> > >> #13 0xffffffff807b1aa5 in makphy_status ()
> > >> #14 0xffffffff807b1cf7 in makphy_service ()
> > >> #15 0xffffffff807a826c in mii_tick ()
> > >> #16 0xffffffff80360926 in wm_tick ()
> > >> #17 0xffffffff809b6b96 in callout_softclock ()
> > >> #18 0xffffffff809aaa55 in softint_dispatch ()
> > >> #19 0xffffffff8021d21f in Xsoftintr ()
> > >>
> > >>
> > >> I rebuilt the kernel (on a different physical host, but there may
> > >> have been an update on the 14th there) and tried to get a panic with
> > >> the .gdb kernel, but it never happened.
> > >>
> > >> Obviously it is not a problem for me or anyone running NetBSD as a
> > >> VirtualBox guest, as using vioif / virtio is almost as twice as fast,
> > >> but I reported the panic thinking it may be relevant in other use
> > >> cases.
> > >
> > > Thank you for your report!
> > >
> > >
> > >
> > >> On Mon, 17 Dec 2018 at 07:49, Masanobu SAITOH <[email protected]> wrote:
> > >>>
> > >>> On 2018/12/17 1:09, Chavdar Ivanov wrote:
> > >>>> I have no idea. As I said, it is running under VirtualBox on a Windows
> > >>>> 10 host; I put the host in hibernation whilst the NetBSD guest is
> > >>>> running.
> > >>>
> > >>> I tested today's -current on VirtualBox 5.2.22 on Windows 7 64bit
> > >>> (on Core i7-2600). I tried hibernate (shutdown -> hibernate(H)) a few
> > >>> times
> > >>> but I couldn't reproduce the problem yet.
> > >>>
> > >>>>>>>         while (deltat > 0) {
> > >>>>>>>                 xtick = lapic_gettick();
> > >>>>>>>                 if (lapic_broken_periodic && xtick == 0 && otick == 0) {
> > >>>>>>>                         lapic_initclocks();
> > >>>>>>>                         xtick = lapic_gettick();
> > >>>>>>>                         if (xtick == 0)
> > >>>>>>>                                 panic("lapic timer stopped ticking"); <=========== here!
> > >>>>>>>                 }
> > >>>
> > >>> If that panic is from this, lapic_broken_periodic must be true, but
> > >>> it's set only
> > >>> when the VM is KVM:
> > >>>>         /*
> > >>>>          * Apply workaround for broken periodic timer under KVM
> > >>>>          */
> > >>>>         if (vm_guest == VM_GUEST_KVM) {
> > >>>>                 lapic_broken_periodic = true;
> > >>>>                 lapic_timecounter.tc_quality = -100;
> > >>>>                 aprint_debug_dev(ci->ci_dev,
> > >>>>                     "applying KVM timer workaround\n");
> > >>>>         }
> > >>>
> > >>> Could you try to reproduce the problem and see the panic message?
> > >>> ci4ic4-panic-01.png has backtrace and it wiped out the panic message.
> > >>>
> > >>> Regards.
> > >>>
> > >>>> Previously it survived this, using the Intel Desktop NIC
> > >>>> emulation within VirtualBox, even my ssh connections (from the host to
> > >>>> the guest) remained active. I switched the NIC emulation for the
> > >>>> NetBSD guest to virtio-net, now it behaves as before, surviving a
> > >>>> hibernation.
> > >>>>
> > >>>> There was a VirtualBox upgrade a few weeks ago, perhaps the problem is
> > >>>> there.
> > >>>> On Sun, 16 Dec 2018 at 15:55, SAITOH Masanobu <[email protected]>
> > >>>> wrote:
> > >>>>>
> > >>>>> Hi.
> > >>>>>
> > >>>>> On 2018/12/16 18:09, Chavdar Ivanov wrote:
> > >>>>>> Repeated this morning. Happens when the host hibernates when the
> > >>>>>> machine is running. The initial trace is slightly different, but the
> > >>>>>> lines with wm_gmii are the same, so for now I will switch to a
> > >>>>>> different NIC emulator.
> > >>>>>>
> > >>>>>
> > >>>>> In your .png:
> > >>>>>> vpanic()
> > >>>>>> lapic_delay()
> > >>>>>> wm_gmii_mdic_readreg()
> > >>>>>> .
> > >>>>>> .
> > >>>>>> .
> > >>>>>
> > >>>>> There is no panic message itself, but I suspect it's:
> > >>>>>> static void
> > >>>>>> lapic_delay(unsigned int usec)
> > >>>>>> {
> > >>>>>>         int32_t xtick, otick;
> > >>>>>>         int64_t deltat; /* XXX may want to be 64bit */
> > >>>>>>
> > >>>>>>         otick = lapic_gettick();
> > >>>>>>
> > >>>>>>         if (usec <= 0)
> > >>>>>>                 return;
> > >>>>>>         if (usec <= 25)
> > >>>>>>                 deltat = lapic_delaytab[usec];
> > >>>>>>         else
> > >>>>>>                 deltat = (lapic_frac_cycle_per_usec * usec) >> 32;
> > >>>>>>
> > >>>>>>         while (deltat > 0) {
> > >>>>>>                 xtick = lapic_gettick();
> > >>>>>>                 if (lapic_broken_periodic && xtick == 0 && otick == 0) {
> > >>>>>>                         lapic_initclocks();
> > >>>>>>                         xtick = lapic_gettick();
> > >>>>>>                         if (xtick == 0)
> > >>>>>>                                 panic("lapic timer stopped ticking"); <=========== here!
> > >>>>>>                 }
> > >>>>>>                 if (xtick > otick)
> > >>>>>>                         deltat -= lapic_tval - (xtick - otick);
> > >>>>>>                 else
> > >>>>>>                         deltat -= otick - xtick;
> > >>>>>>                 otick = xtick;
> > >>>>>>
> > >>>>>>                 x86_pause();
> > >>>>>>         }
> > >>>>>> }
> > >>>>>
> > >>>>> Why does it cause?
> > >>>>>
> > >>>>>
> > >>>>>> And yes, it used to survive many hibernations of the hosts before. I
> > >>>>>> only had to adjust the time after waking the host up.
> > >>>>>> On Sat, 15 Dec 2018 at 10:59, Chavdar Ivanov <[email protected]>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>> Hi,
> > >>>>>>>
> > >>>>>>> On 8.99.27 AMD64 running under VirtualBox I got this morning the
> > >>>>>>> panic
> > >>>>>>> in http://ci4ic4.tx0.org/ci4ic4-panic-01.png
> > >>>>>>>
> > >>>>>>> I have the coredump, if it is of interest. I thought it might be
> > >>>>>>> useful, as it is apparently in the wm driver.
> > >>>>>>>
> > >>>>>>> Chavdar
> > >>>>>>> --
> > >>>>>>> ----
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> -----------------------------------------------
> > >>>>> SAITOH Masanobu ([email protected]
> > >>>>> [email protected])
> > >>>>
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> -----------------------------------------------
> > >>> SAITOH Masanobu ([email protected]
> > >>> [email protected])
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > --
> > -----------------------------------------------
> > SAITOH Masanobu ([email protected]
> > [email protected])
>
>
>
> --
> ----
--
----
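
For reference, the tick accounting in the quoted lapic_delay() loop can be tried out in userland. What follows is only a minimal sketch, not the kernel code: fake_timer, fake_gettick, elapsed_ticks and sim_delay are hypothetical names made up for illustration, and the model assumes a counter that counts down to zero and then reloads from tval, which is how I understand the periodic LAPIC timer to behave. Instead of panicking, the sketch returns false when the counter reads zero twice and is still zero on a re-read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the LAPIC current-count register. */
struct fake_timer {
	uint32_t tval;   /* reload value, i.e. ticks per period (assumed small) */
	uint32_t count;  /* current count, always in [0, tval) */
	uint32_t step;   /* ticks consumed per read; 0 models a stopped timer */
};

/* Return the current count, then advance the fake counter downwards. */
uint32_t fake_gettick(struct fake_timer *t)
{
	uint32_t now = t->count;

	if (t->step != 0)
		t->count = (t->count + t->tval - (t->step % t->tval)) % t->tval;
	return now;
}

/* Ticks elapsed between two reads, same arithmetic as the quoted loop. */
int64_t elapsed_ticks(int32_t otick, int32_t xtick, int32_t tval)
{
	if (xtick > otick)	/* the counter reloaded between the reads */
		return (int64_t)tval - (xtick - otick);
	return (int64_t)otick - xtick;
}

/*
 * Burn deltat ticks. Returns false where the kernel would panic.
 * Like the quoted loop, this only notices a counter stuck at zero;
 * a counter stuck at any other value would spin forever.
 */
bool sim_delay(struct fake_timer *t, int64_t deltat)
{
	int32_t otick = (int32_t)fake_gettick(t);

	while (deltat > 0) {
		int32_t xtick = (int32_t)fake_gettick(t);

		if (xtick == 0 && otick == 0) {
			/* The kernel calls lapic_initclocks() here; the fake
			 * has nothing to re-arm, so it just reads again. */
			xtick = (int32_t)fake_gettick(t);
			if (xtick == 0)
				return false;
		}
		deltat -= elapsed_ticks(otick, xtick, (int32_t)t->tval);
		otick = xtick;
	}
	return true;
}
```

With a running fake timer (non-zero step) sim_delay() returns true once enough ticks have been accounted for, including across a reload; with step set to 0 and count at 0 it returns false, which corresponds to the "lapic timer stopped ticking" case in the backtrace.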
