Re: AX88179 USB-to-Ethernet is slow and silently corrupts data
On Thu, May 03, 2018 at 02:11:24PM -0700, Dieter BSD wrote:
> > 10.3-RELEASE

[...]

> pyunyh> Which phy driver is used for axge(4)?
> pyunyh> You can see the phy driver name below axge(4) attachment in dmesg
> pyunyh> output.
>
> axge0: on usbus2
> axge1: on usbus0
> miibus4: on axge0
> miibus5: on axge1
>

That is not the phy driver name.  The phy driver should show up right
after the miibus(4) line.  It is probably rgephy(4).

> - Do you use manual media configuration instead of auto-negotiation?
>
> They auto configure at 1000.  I usually set them to 100 which seems to
> eliminate the silent data corruption.
>

Good data point.

> pyunyh> Does the issue happen at which media speed(10Mbps, 100Mbps or
> 1000Mbs)?
>
> The silent data corruption happens at 1000.  100 seems to eliminate
> the data corruption but 100 isn't always fast enough.  I haven't tried
> setting the AX88179 to 10 Mbps mode, although I tested it by sending
> data to it from another machine which was running at 10 Mbps, with
> a Netgear switch converting the 10 Mbps to 1000 Mbps.  Using usb2 instead
> of usb3 also seems to eliminate the data corruption.  The AX88179
> doesn't seem to care about what Ethernet speed it is running at, or
> what usb speed it is running at.  The silent data corruption happens
> if it receives too many packets per second from the Ethernet.  Reducing
> Ethernet speed or usb speed are simple ways to reduce how many packets
> per second it handles.
>

This is another data point.  So if you use ehci(4) (i.e. USB2), the
issue does not happen even on a 1000baseT link, right?

> pyunyh> Which direction of packet flow is broken(TX or RX or both)?
>
> The silent data corruption happens if it receives too many packets
> per second from the Ethernet.
> I have not observed any data corruption when the AX88179 transmits
> data to the Ethernet.  Tested with rcp(1).
>

Ok, let's focus on the RX side.
> It seems interesting that it is the receive direction that gets data
> corruption and the receive direction that fails completely when the
> rxcsum is turned off.  Perhaps related?
>

If S/W checksum is used, you would not receive corrupted packets: the
stack drops them, so the transfer is aborted midway and you already
know that the operation failed.  Silent data corruption means you think
your transfer was successful but the actual content was corrupted, such
that you can only find it after verifying md5 or sha256 checksums of
the content.  Are you seeing silent data corruption with TCP transfers?
(You should not use nc(1) with UDP to verify this.)

> ue0 is now connected to chipset (AMD 990FX SB950) usb controller usb2
>
> ifconfig ue0 -rxcsum
>
> ue0: flags=8843metric 0 mtu 1500
>      options=8000a
>      media: Ethernet 100baseTX
>
> Sent ue0 a bunch of udp packets from another FreeBSD box.
> dd if=/dev/zero bs=1k count=50 | nc -4u 10.0.210.66 5
>
>             input            ue0           output
>    packets  errs idrops      bytes    packets  errs      bytes colls drops
>          0     0      0          0          0     0          0     0     0
>          0     0      0          0          0     0          0     0     0
>         50     0      0      53000          0     0          0     0     0
>          0     0      0          0          0     0          0     0     0
>
> Receiving process got none of them.
> nc -4nul 5 > /var/tmp/file_via_udp_ue0
> (file is zero bytes)
>
> netstat -s -p udp
> udp:
>     0 datagrams received
>     0 with incomplete header
>     0 with bad data length field
>     0 with bad checksum
>     0 with no checksum
>     0 dropped due to no socket
>     0 broadcast/multicast datagrams undelivered
>     0 dropped due to full socket buffers
>     0 not for hashed pcb
>     0 delivered
>     0 datagrams output
>     0 times multicast source filter matched
>
> So netstat sees packets coming in, but does not see any datagrams.
> Where is the data disappearing?  In the hardware?  In the device driver?

As I said in the other mail, netstat(1)'s raw packet counters are
maintained in the driver.  The driver may have submitted the packets to
the upper layer, but it seems they were discarded for other reasons.

> Is "ifconfig -rxcsum" really doing the correct thing to the chip?
>

If it was correctly implemented, yes.

> Is there some way to have RXCSUM,TXCSUM turned on, but also have the cpu
> verify the checksum?  I realize the whole point of RXCSUM,TXCSUM

You have to choose one or the other (either hardware checksum
offloading or software checksum), so you can't have both.

> is to reduce the load on the cpu, but data corruption sucks.
>
> To see if a different usb controller made any difference, I ran the same
> test using ue1 = Tek Republic TUN-300 which has the same AX88179 as the Siig,
> connected to onboard VIA VL805 USB 3.0 controller, and it
Re: AX88179 USB-to-Ethernet is slow and silently corrupts data
On Tue, Apr 10, 2018 at 03:54:58PM -0700, Dieter BSD wrote:
> 10.3-RELEASE
> amd64 with ECC memory
> VIA VL805 USB 3.0 controller
> ue0 is Siig USB-to-Ethernet Chipset: AX88179
>
> ugen0.7: at usbus0, cfg=0 md=HOST
>     spd=SUPER (5.0Gbps) pwr=ON (124mA)
>
> ue0: flags=8c43metric 0 mtu 1500
>     options=8000b
>     inet 10.0.210.66 netmask 0xff00 broadcast 10.0.210.255
>     nd6 options=29
>     media: Ethernet autoselect (1000baseT )
>     status: active
>
> If media is set to "1000baseT " it "works", but slowly, and
> received data is silently corrupted. :-(  Transmitted data is not
> corrupted (tested with > 30 GB).
>
> ifconfig ue0 -txcsum
>     "works", but still gives silent data corruption
>
> ifconfig ue0 -rxcsum   (acts the same with or without txcsum)
>     ping out
>         netstat sees packets both directions, but ping doesn't see the response:
>         8 packets transmitted, 0 packets received, 100.0% packet loss
>     ping in
>         netstat sees packets in, but no responses going out
>
> I can see that some Ethernet controllers would not support checksum
> offloading, but it seems to me that turning the checksum offloading off
> should always work?  (at the expense of more cpu load)
>
> Previously (2016 May):
> # ifconfig ue0 media 100baseTX-FDX
> fixed the input error problem and the data corruption problem,
> at the expense of making it even slower.
>
> Sent data from machine A with 10Mbps Ethernet.  (Netgear Ethernet switch
> converts 10Mbps to 1000Mbps)  Netstat did not report any input errors on
> ue0 and there was no data corruption.  So ue0 can handle gigabit data rate,
> but gets input errors if packets arrive too frequently.
>
> I tried moving it to a USB-2 port.  No data corruption, but USB-2 is slow.
>
> The chip performs a lot better for tweaktown:
>
> http://www.tweaktown.com/reviews/7243/vantec-cb-u300gna-usb-3-gigabit-network-adapter-review/index.html
> (Vantec CB-U300GNA with the same Asix AX88179 chip)
> "full duplex gigabit with 952 Mbps consistently across the chart"
>
> http://www.vantecusa.com/products_detail.php?p_id=143_name=USB+3.0+Gigabit+Ethernet+Adapter_id=21_name=Network_id=5_name=Accessories
>
> Asix AX88179 chip:
> http://www.asix.com.tw/products.php?op=pItemdetail=131;71;112
> "Supports Jumbo frame up to 4KB"
>
> But ifconfig rejects any value > 1500:
> ifconfig ue0 mtu 1501
> ifconfig: ioctl SIOCSIFMTU (set mtu): Invalid argument
>
> I tried mtu of 100, 500, 1000, 1400 but they all give
> rcp: lost connection
>
> USB disks are fast, so the USB controller seems to work ok.
>
> I also tried a Tek Republic TUN-300 which has the same AX88179,
> and it acts the same as the Siig.
>
> So, transmit works, but is slow.  Receive works if the amount of traffic
> is low enough (limit rate of data sent, limit Ethernet speed, or
> use USB-2).  But if data is received too fast it gets silently corrupted.
> Setting -rxcsum does not work, and cannot set mtu other than 1500.
>

[Removed freebsd-usb@, freebsd-hackers@ and freebsd-drivers@ from the CC list]

> Questions:
> Why does -rxcsum not work?

The driver implements RX checksum offloading, but it seems it has some
issue in RX data handling.

> Why does attempting to set a larger mtu fail?

Jumbo frame support is not implemented in axge(4).

> Why does setting a smaller mtu make rcp fail?
> Why is the chip acting slow?
> How do I get it to work properly?  (fast and without data corruption)

To narrow down the issue, it would be helpful to try or answer the
following:
 - Disable all H/W checksum offloading features.
 - Which phy driver is used for axge(4)?
   You can see the phy driver name below the axge(4) attachment in
   dmesg output.
 - Do you use manual media configuration instead of auto-negotiation?
 - At which media speed does the issue happen (10Mbps, 100Mbps or
   1000Mbps)?
 - With which USB driver does the issue happen (ehci(4) or xhci(4))?
 - Which direction of packet flow is broken (TX, RX or both)?
   You need two boxes (one with ue(4) and one known-good system).
   If the TX flow of ue(4) is broken, the known-good system will report
   a number of bad input packets.  If the RX flow of ue(4) is broken,
   you may receive corrupted data without errors.  Use TCP to test
   the TX/RX flow.
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
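As a rough illustration of what disabling H/W checksum offloading hands
back to the host: with offloading off, the stack has to compute the
Internet checksum itself and drop anything that fails it, instead of
trusting the adapter.  The sketch below is a minimal user-space version
of the RFC 1071 algorithm.  The names inet_cksum() and cksum_ok() are
made up for this example and are not the kernel's in_cksum() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum over a byte buffer (illustrative sketch).
 * The data is summed as big-endian 16-bit words with end-around carry,
 * and the one's complement of that sum is returned in host order.
 * The 32-bit accumulator is only safe for packet-sized inputs (<= 64 KB).
 */
uint16_t
inet_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	assert(buf != NULL || len == 0);
	while (len > 1) {
		sum += (uint32_t)buf[0] << 8 | buf[1];
		buf += 2;
		len -= 2;
	}
	/* An odd trailing byte is padded with a zero byte on the right. */
	if (len == 1)
		sum += (uint32_t)buf[0] << 8;
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical helper: returns 1 if a buffer that already contains its
 * checksum field verifies (the checksum over everything is zero), else 0.
 */
int
cksum_ok(const uint8_t *buf, size_t len)
{
	return inet_cksum(buf, len) == 0;
}
```

A 16-bit sum cannot catch every possible corruption, which is why
comparing md5 or sha256 digests of the transferred file remains the
final check.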
Re: mbuf_jumbo_9k & iSCSI failing
On Mon, Jun 26, 2017 at 03:44:58PM +0200, Julien Cigar wrote:
> On Mon, Jun 26, 2017 at 04:13:33PM +0300, Andrey V. Elsukov wrote:
> > On 25.06.2017 18:32, Ryan Stone wrote:
> > > Having looked at the original email more closely, I see that you
> > > showed an mlxen interface with a 9020 MTU.  Seeing allocation
> > > failures of 9k mbuf clusters increase while you are far below the
> > > zone's limit means that you're definitely running into the bug I'm
> > > describing, and this bug could plausibly cause the iSCSI errors
> > > that you describe.
> > >
> > > The issue is that the newer version of the driver tries to allocate a
> > > single buffer to accommodate an MTU-sized packet.  Over time, however,
> > > memory will become fragmented and eventually it can become impossible
> > > to allocate a 9k physically contiguous buffer.  When this happens the
> > > driver is unable to allocate buffers to receive packets and is forced
> > > to drop them.  Presumably, if iSCSI suffers too many packet drops it
> > > will terminate the connection.  The older version of the driver
> > > limited itself to page-sized buffers, so it was immune to issues with
> > > memory fragmentation.
> >
> > I think it is not an mlxen-specific problem, we have the same symptoms
> > with the ixgbe(4) driver too.  To avoid the problem we have patches that
> > disable the use of 9k mbufs, and instead only use 4k mbufs.
>
> I had the same issue on a lightly loaded HP DL20 machine (BCM5720
> chipsets), 8GB of RAM, running 10.3.  Problem usually happens
> within 30 days with 9k jumbo clusters allocation failure.
>

This looks strange to me.  If I recall correctly, bge(4) does not
request physically contiguous 9k jumbo buffers for the BCM5720, so it
shouldn't suffer from memory fragmentation.  (It uses m_cljget() and
takes advantage of extended RX BDs to handle up to 4 DMA segments.)
If your controller is a BCM5714/BCM5715 or BCM5780, though, it does
require physically contiguous 9k jumbo buffers to handle jumbo frames.
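One way to see why four DMA segments are enough for a 9k buffer that is
not physically contiguous: assuming 4 KB pages and a 9216-byte (9 * 1024)
cluster, the buffer can touch at most four pages, whatever its starting
offset within the first page.  The sketch below just does that
arithmetic.  pages_spanned() is a hypothetical helper written for this
note, not a bus_dma(9) or bge(4) function, and the page and cluster
sizes are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of pages touched by a virtually contiguous buffer of len bytes
 * that starts offset bytes into its first page.  Each touched page may
 * become its own DMA segment when the pages are not physically adjacent.
 */
size_t
pages_spanned(size_t offset, size_t len, size_t page_size)
{
	assert(len > 0 && page_size > 0);
	return (offset + len - 1) / page_size - offset / page_size + 1;
}
```

With these assumptions a page-aligned 9216-byte buffer spans three pages
and the worst-case alignment (starting on the last byte of a page) spans
four, which matches the "up to 4 DMA segments" mentioned above.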
Re: Vlan offloaded checksums
On Mon, Sep 12, 2016 at 06:09:02PM +0200, Emeric POUPON wrote:
> Hello,
>
> I have a network driver that supports hardware checksums.
> Thanks to offset parameters, it also supports VLAN checksums.
> However, it does not handle hardware tagging (not sure the underlying
> network adapter can actually do it)
>
> Unfortunately, the VLAN hardware checksums seem to be done only if
> IFCAP_VLAN_HWTAGGING is set [1]
> I do not understand this assertion: if I force the propagation of the
> hardware checksumming only based on the IFCAP_VLAN_HWCSUM, it works fine
> with my driver.
>
> What do you think?
>

As you said, some NICs do not rely on VLAN H/W tagging to make VLAN
checksum offloading work.  But the current vlan(4) assumes that VLAN
H/W tagging is a prerequisite for VLAN checksum offloading.  The same
is true for TSO support over VLAN.
I don't know which NIC you're referring to, but it's rare to see
controllers that don't support VLAN H/W tagging on PCs/servers.
Lacking it might be common for NICs found on SoCs, though.
If the H/W requires offset parameters for checksum offloading, you
probably already have to parse mbufs in the driver to extract that
information, which adds overhead.  If you really want to enable VLAN
H/W checksum offloading in your driver, you may be able to add VLAN tag
handling to the parser and announce the VLAN H/W tagging capability to
the network stack.  You may not notice a performance difference with
VLAN H/W checksum offloading, though.
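To illustrate the kind of parsing mentioned above: hardware that takes
"offset parameters" needs to know where the L3 header starts, and an
in-band 802.1Q tag moves that offset from 14 to 18 bytes.  The sketch
below works on a raw frame buffer in user space.  l3_offset() and the
macro names are invented for this example; a real driver would walk the
mbuf chain and use the kernel's own Ethernet definitions instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN	14u	/* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN	4u	/* 802.1Q TPID + TCI */
#define ETHERTYPE_8021Q	0x8100u

/*
 * Return the offset of the L3 header in a raw Ethernet frame, or -1 if
 * the frame is too short.  On success *ethertype receives the protocol
 * type of the payload (the encapsulated type for tagged frames).
 */
int
l3_offset(const uint8_t *frame, size_t len, uint16_t *ethertype)
{
	uint16_t type;

	assert(frame != NULL && ethertype != NULL);
	if (len < ETH_HDR_LEN)
		return -1;
	type = (uint16_t)(frame[12] << 8 | frame[13]);
	if (type != ETHERTYPE_8021Q) {
		/* Untagged frame: L3 starts right after the Ethernet header. */
		*ethertype = type;
		return (int)ETH_HDR_LEN;
	}
	if (len < ETH_HDR_LEN + VLAN_TAG_LEN)
		return -1;
	/* Tagged frame: the real EtherType follows the 4-byte VLAN tag. */
	*ethertype = (uint16_t)(frame[16] << 8 | frame[17]);
	return (int)(ETH_HDR_LEN + VLAN_TAG_LEN);
}
```

The checksum start offsets handed to the hardware would then be derived
from the returned value rather than from a fixed 14-byte header length.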
Re: Marvell Yukon (msk) network card causes the "sticky mouse" problem: mouse stops for extended periods of time
On Fri, May 13, 2016 at 02:17:09AM -0700, Yuri wrote:
> On 05/11/2016 22:58, YongHyeon PYUN wrote:
> >hw.msk.msi_disable is a loader tunable so you can't check it with
> >sysctl(8).  Add the tunable to /boot/loader.conf for it to take effect.
> >See loader.conf(5) for more information.
>
>
> Adding hw.msk.msi_disable="1" reduced the mouse problem, but Marvell

Do you see the msk(4) interrupt handler consuming lots of CPU cycles?
I can't explain why the non-MSI case mitigates the mouse issue.

> Yukon card still didn't function properly.  Speed test showed much slower
> result.  And mouse still doesn't move as freely as usual.  Something isn't

Check the H/W MAC counters with sysctl(8) and see whether there are
errors (sysctl dev.msk.0.stats).  If there are no errors, I'm not sure
what's going on there.  Could you try another OS (e.g. a Linux USB
stick) and see whether it works as expected?

> right with it.
>
> Yuri
>
>
Re: Marvell Yukon (msk) network card causes the "sticky mouse" problem: mouse stops for extended periods of time
On Wed, May 11, 2016 at 10:46:01PM -0700, Yuri wrote:
> On 05/08/2016 02:33, YongHyeon PYUN wrote:
> >msk(4) will try to use MSI unless configured not to, so the IRQ
> >wouldn't be shared with other devices.  If msk(4) is using MSI you
> >should see a high irq number greater than or equal to 256 in vmstat
> >output.  Given that you're seeing issues with MSI, try disabling
> >MSI for msk(4).  Add the following tunable to /boot/loader.conf and
> >reboot.
> >
> >hw.msk.msi_disable="1"
>
> For some reason hw.msk.msi_disable isn't found:
> # sysctl hw.msk.msi_disable=1
> sysctl: unknown oid 'hw.msk.msi_disable'
> even though it is defined in sys/dev/msk/if_msk.c and if_msk module is
> loaded.

hw.msk.msi_disable is a loader tunable, so you can't check it with
sysctl(8).  Add the tunable to /boot/loader.conf for it to take effect.
See loader.conf(5) for more information.
Re: Marvell Yukon (msk) network card causes the "sticky mouse" problem: mouse stops for extended periods of time
On Sat, May 07, 2016 at 09:33:39AM -0700, Yuri wrote:
> On 05/07/2016 08:34, Eugene Grosbein wrote:
> >
> >Verify if your mouse (USB one?) is using IRQ 18 too.
> >"vmstat -ai" command would be helpful.
> Yes, I have a USB mouse, and my USB uses IRQ 18:
>
> irq18: ehci0 uhci5503154 9
> stray irq180 0
>

msk(4) will try to use MSI unless configured not to, so the IRQ
wouldn't be shared with other devices.  If msk(4) is using MSI you
should see a high irq number, greater than or equal to 256, in vmstat
output.  Given that you're seeing issues with MSI, try disabling
MSI for msk(4).  Add the following tunable to /boot/loader.conf and
reboot.

hw.msk.msi_disable="1"

Let me know whether that makes any difference for you.
Re: Support for Killer E2400 Ethernet
On Fri, Feb 19, 2016 at 08:23:28PM +0100, Tino Engel wrote:
> Thanks very much for the quick reply.
>
> So let me shed some words on your input:
>
> First: Limiting the memory size did not help at all, nothing changed.
> Unfortunately I cannot post the whole results of the sysctl, since I cannot
> get this box into the net, and it is quite too much to type it by hand.
> Is there any special value you are interested in?
>

I'm not sure, I just wanted to see these counters to guess what could
be causing the issue.  The error related counters would probably be
helpful (sysctl -d dev.alc.0.stats shows a description of each
counter).

> Then I applied your patch.
> The requested output is:
> alc0: DMA CFG : 0x0c347c54
>

Could you boot your kernel verbosely and show me the alc(4) related
output?  It will show the read request/TLP payload size as well as
PCI/chip revision information.

> The bad thing: The error still persists. :(
> It always writes "DMA write error" now followed by "DMA CFG : ..."
>
> One more thing:
> The ping -s command results in the same error as trying to fetch something
> from the internet.
>

'ping -s 1472' was one of the known ways to reliably trigger the issue
on the E2200 (and now the E2400).

> Do you have any further ideas?
>

Not yet.

Thanks.
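A possible reason why 1472 is a convenient size for this test (the
thread itself does not spell it out): assuming the default Ethernet MTU
of 1500 and an IPv4 header without options, 1472 bytes of ICMP payload
produce the largest echo request that still fits in a single frame.
The helper names below are invented for the example.

```c
#include <assert.h>

#define IPV4_HDR_LEN	20u	/* IPv4 header without options */
#define ICMP_HDR_LEN	8u	/* ICMP echo header */

/* Total IPv4 packet length for an ICMP echo with the given payload size. */
unsigned
icmp_packet_len(unsigned payload)
{
	return IPV4_HDR_LEN + ICMP_HDR_LEN + payload;
}

/* Nonzero if the echo request would exceed the MTU and be fragmented. */
int
needs_fragmentation(unsigned payload, unsigned mtu)
{
	assert(mtu > 0);
	return icmp_packet_len(payload) > mtu;
}
```

So `ping -s 1472` exercises full-sized 1500-byte packets in both
directions, while `ping -s 1473` would already be split into fragments.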
Re: Support for Killer E2400 Ethernet
On Thu, Feb 18, 2016 at 04:17:30PM +0100, Tino Engel wrote:
>
> Hello all,
> I am trying to establish support for the Killer E2400 ethernet adapter.
> I am following the approach that works for the linux driver, which is
> basically:
> - Add the E2400 device ID
> - Copy all device related stuff from the E2200
> What works:
> - DHCP
> - Ping any host in the internet
> What does not work:
> - Downloading stuff using "fetch"
> - Setup pkg
> - Therefore cannot browse since I even cannot install pkg in order to
>   obtain a browser
> The error message I continuously receive is "alc0: DMA write error".
> I have so far played with all the loader tunables and sysctls supported by
> the alc driver, but no improvement of the situation.
> Any ideas how to proceed?

Due to lack of access to a Killer E2200 controller, alc(4) was not
fully tested with that controller.  One other user also reported that
alc(4) shows DMA errors like the ones you saw.

To narrow down the issue, if your system has more than 4GB of memory,
could you please add the following to /boot/loader.conf and test
again?

hw.physmem="3G"

The tunable above limits system memory to 3GB.  Also show me the
output of "sysctl dev.alc.0.stats" before and after running the
"ping -s 1472 remote_ip_addr" command.
(Note: the ping command with the -s option requires root privilege,
and you have to reboot for the loader.conf change to take effect.)

If limiting system memory has no effect, could you try the attached
patch and let me know whether it makes any difference?  The patch will
print a line starting with "alc0: DMA CFG : 0x".  Let me know the value
it prints.

Thanks.
Index: sys/dev/alc/if_alc.c
===
--- sys/dev/alc/if_alc.c	(revision 295117)
+++ sys/dev/alc/if_alc.c	(working copy)
@@ -4184,16 +4184,22 @@ alc_init_locked(struct alc_softc *sc)
 	reg = (RXQ_CFG_RD_BURST_DEFAULT << RXQ_CFG_RD_BURST_SHIFT) &
 	    RXQ_CFG_RD_BURST_MASK;
 	reg |= RXQ_CFG_RSS_MODE_DIS;
-	if ((sc->alc_flags & ALC_FLAG_AR816X_FAMILY) != 0)
+	if ((sc->alc_flags & ALC_FLAG_AR816X_FAMILY) != 0) {
 		reg |= (RXQ_CFG_816X_IDT_TBL_SIZE_DEFAULT <<
 		    RXQ_CFG_816X_IDT_TBL_SIZE_SHIFT) &
 		    RXQ_CFG_816X_IDT_TBL_SIZE_MASK;
-	if ((sc->alc_flags & ALC_FLAG_FASTETHER) == 0 &&
-	    sc->alc_ident->deviceid != DEVICEID_ATHEROS_AR8151_V2)
-		reg |= RXQ_CFG_ASPM_THROUGHPUT_LIMIT_1M;
+		if ((sc->alc_flags & ALC_FLAG_FASTETHER) == 0)
+			reg |= RXQ_CFG_ASPM_THROUGHPUT_LIMIT_100M;
+	} else {
+		if ((sc->alc_flags & ALC_FLAG_FASTETHER) == 0 &&
+		    sc->alc_ident->deviceid != DEVICEID_ATHEROS_AR8151_V2)
+			reg |= RXQ_CFG_ASPM_THROUGHPUT_LIMIT_100M;
+	}
 	CSR_WRITE_4(sc, ALC_RXQ_CFG, reg);
 
 	/* Configure DMA parameters. */
+	reg = CSR_READ_4(sc, ALC_DMA_CFG);
+	device_printf(sc->alc_dev, "DMA CFG : 0x%08x\n", reg);
 	reg = DMA_CFG_OUT_ORDER | DMA_CFG_RD_REQ_PRI;
 	reg |= sc->alc_rcb;
 	if ((sc->alc_flags & ALC_FLAG_CMB_BUG) == 0)
@@ -4200,8 +4206,10 @@ alc_init_locked(struct alc_softc *sc)
 		reg |= DMA_CFG_CMB_ENB;
 	if ((sc->alc_flags & ALC_FLAG_SMB_BUG) == 0)
 		reg |= DMA_CFG_SMB_ENB;
-	else
-		reg |= DMA_CFG_SMB_DIS;
+	else {
+		if ((sc->alc_flags & ALC_FLAG_AR816X_FAMILY) == 0)
+			reg |= DMA_CFG_SMB_DIS;
+	}
 	reg |= (sc->alc_dma_rd_burst & DMA_CFG_RD_BURST_MASK) <<
 	    DMA_CFG_RD_BURST_SHIFT;
 	reg |= (sc->alc_dma_wr_burst & DMA_CFG_WR_BURST_MASK) <<
@@ -4293,16 +4301,16 @@ alc_stop(struct alc_softc *sc)
 	/* Disable interrupts. */
 	CSR_WRITE_4(sc, ALC_INTR_MASK, 0);
 	CSR_WRITE_4(sc, ALC_INTR_STATUS, 0x);
-	/* Disable DMA. */
-	reg = CSR_READ_4(sc, ALC_DMA_CFG);
-	reg &= ~(DMA_CFG_CMB_ENB | DMA_CFG_SMB_ENB);
-	reg |= DMA_CFG_SMB_DIS;
-	CSR_WRITE_4(sc, ALC_DMA_CFG, reg);
-	DELAY(1000);
+	if ((sc->alc_flags & (ALC_FLAG_CMB_BUG | ALC_FLAG_SMB_BUG)) == 0) {
+		/* Disable DMA. */
+		reg = CSR_READ_4(sc, ALC_DMA_CFG);
+		reg &= ~(DMA_CFG_CMB_ENB | DMA_CFG_SMB_ENB);
+		reg |= DMA_CFG_SMB_DIS;
+		CSR_WRITE_4(sc, ALC_DMA_CFG, reg);
+		DELAY(1000);
+	}
 	/* Stop Rx/Tx MACs. */
 	alc_stop_mac(sc);
-	/* Disable interrupts which might be touched in taskq handler. */
-	CSR_WRITE_4(sc, ALC_INTR_STATUS, 0x);
 	/* Disable L0s/L1s */
 	alc_aspm(sc, 0, IFM_UNKNOWN);
 	/* Reclaim Rx buffers that have been processed. */
Re: Hi, can u help me with Atheros card?
On Mon, Sep 21, 2015 at 10:26:58AM +, wrote:
> Hi. How can i activate network card Atheros AR8151 ?
>
>
> [root@ASTERSUSHI /usr/home/nord]# uname -a
> FreeBSD ASTERSUSHI 9.1-RELEASE FreeBSD 9.1-RELEASE #0: Mon Aug 17
> 02:11:08 MSK 2015 root@ASTERSUSHI:/usr/obj/usr/src/sys/KERNEL amd64
>
> [root@ASTERSUSHI /usr/home/nord]# pciconf -l -v | grep -B3 network
> em0@pci0:2:0:0: class=0x02 card=0xa01f8086 chip=0x10d38086 rev=0x00
> hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>     class = network
> --
> em1@pci0:3:0:0: class=0x02 card=0xa01f8086 chip=0x10d38086 rev=0x00
> hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82574L Gigabit Network Connection'
>     class = network
> --
>     subclass = PCI-PCI
> none2@pci0:7:0:0: class=0x02 card=0xe0001458 chip=0x10911969
> rev=0x10 hdr=0x00
>     vendor = 'Atheros Communications'
>     class = network
>

It looks like an AR8161 controller.  I believe it should be
supported by alc(4) on 10.2-RELEASE.  Alternatively, you can get
support by updating to the latest stable/9.
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 09:00:35AM -0400, Rick Macklem wrote: Hans Petter Selasky wrote: On 08/19/15 09:42, Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote: On 08/18/15 23:54, Rick Macklem wrote: Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part. I think all drivers in tree don't subtract 1 for if_hw_tsomaxsegcount. Probably touching Mellanox driver would be simpler than fixing all other drivers in tree. Maybe it can be controlled by some kind of flag, if all the three TSO limits should include the TCP/IP/ethernet headers too. I'm pretty sure we want both versions. Hmm, I'm afraid it's already complex. Drivers have to tell almost the same information to both bus_dma(9) and network stack. Don't forget that not all drivers in the tree set the TSO limits before if_attach(), so possibly the subtraction of one TSO fragment needs to go into ip_output() Ok, I realized that some drivers may not know the answers before ether_ifattach(), due to the way they are configured/written (I saw the use of if_hw_tsomax_update() in the patch). 
I was not able to find an interface that configures TSO parameters
after the if_t conversion.  I'm under the impression that
if_hw_tsomax_update() is not designed to be used this way.  Probably we
need a better one?  (CCed to Gleb.)

> If it is subtracted as a part of the assignment to if_hw_tsomaxsegcount
> in tcp_output() at line#791 in tcp_output() like the following, I don't
> think it should matter if the values are set before ether_ifattach()?
>
>	/*
>	 * Subtract 1 for the tcp/ip header mbuf that
>	 * will be prepended to the mbuf chain in this
>	 * function in the code below this block.
>	 */
>	if_hw_tsomaxsegcount = tp->t_tsomaxsegcount - 1;
>
> I don't have a good solution for the case where a driver doesn't plan on
> using the tcp/ip header provided by tcp_output() except to say the driver
> can add one to the setting to compensate for that (and if they fail to do
> so, it still works, although somewhat suboptimally).  When I now read the
> comment in sys/net/if_var.h it is clear what it means, but for some
> reason I didn't read it that way before?  (I think it was the part that
> said the driver didn't have to subtract for the headers that confused
> me?)
> In any case, we need to try and come up with a clear definition of what
> they need to be set to.
>
> I can now think of two ways to deal with this:
> 1 - Leave tcp_output() as is, but provide a macro for the device driver
>     authors to use that sets if_hw_tsomaxsegcount with a flag for
>     "driver uses tcp/ip header mbuf", documenting that this flag should
>     normally be true.
> OR
> 2 - Change tcp_output() as above, noting that this is a workaround for
>     confusion w.r.t. whether or not if_hw_tsomaxsegcount should include
>     the tcp/ip header mbuf and update the comment in if_var.h to reflect
>     this.  Then drivers that don't use the tcp/ip header mbuf can
>     increase their value for if_hw_tsomaxsegcount by 1.
>     (The comment should also mention that a value of 35 or greater is
>     much preferred to 32 if the hardware will support that.)
>

Both work for me.  My preference is 2, simply because using the tcp/ip
header mbuf is by far the common case among drivers.
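The off-by-one being discussed can be sketched as follows.  This is a
simplified user-space model, not the tcp_output() code: the function
names are invented, and hdr_in_chain stands for the "driver uses the
tcp/ip header mbuf" case from option 1 above.

```c
#include <assert.h>

/*
 * Payload mbufs the stack may count for one TSO request, given the
 * segment limit advertised by the driver.  If the prepended tcp/ip
 * header mbuf consumes one of the driver's segments, one slot has to
 * be reserved for it (option 2 in the thread).
 */
unsigned
tso_payload_segs(unsigned drv_maxsegcount, int hdr_in_chain)
{
	assert(drv_maxsegcount > 1);
	return hdr_in_chain ? drv_maxsegcount - 1 : drv_maxsegcount;
}

/* Segments the driver actually sees for a chain of payload_mbufs mbufs. */
unsigned
tso_total_segs(unsigned payload_mbufs, int hdr_in_chain)
{
	return payload_mbufs + (hdr_in_chain ? 1u : 0u);
}
```

With a driver limit of 32 and no subtraction, a chain of 32 payload
mbufs plus the header mbuf reaches the driver as 33 segments, which is
the case that forces an m_defrag(); with the subtraction the stack
stops at 31 payload mbufs and the driver sees exactly 32.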
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Tue, Aug 18, 2015 at 06:04:25PM -0400, Rick Macklem wrote:
> Hans Petter Selasky wrote:
> > On 08/18/15 14:53, Rick Macklem wrote:
> > > If this is just a test machine, maybe you could test with these lines
> > > (at about #880) in sys/netinet/tcp_output.c commented out?  (It looks
> > > to me like this will disable TSO for almost all the NFS writes.)
> > > - around line #880 in sys/netinet/tcp_output.c:
> > >	/*
> > >	 * In case there are too many small fragments
> > >	 * don't use TSO:
> > >	 */
> > >	if (len <= max_len) {
> > >		len = max_len;
> > >		sendalot = 1;
> > >		tso = 0;
> > >	}
> > >
> > > This was added along with the other stuff that did the
> > > if_hw_tsomaxsegcount, etc and I never noticed it until now (not my
> > > patch).
> >
> > FYI:
> >
> > These lines are needed by other hardware, like the mlxen driver.  If you
> > remove them mlxen will start doing m_defrag().  I believe if you set the
> > correct parameters in the "struct ifnet" for the TSO size/count limits
> > this problem will go away.  If you print the "len" and "max_len" and also
> > the cases where TSO limits are reached, you'll see what parameter is
> > triggering it and needs to be increased.
> >
> Well, if the driver isn't setting if_hw_tsomaxsegcount correctly, then it
> is the driver that needs to be fixed.
> Having the above code block disable TSO for all of the NFS writes,
> including the ones that set if_hw_tsomaxsegcount correctly doesn't make
> sense to me.
> If the driver authors don't set these, the drivers do lots of m_defrag()
> calls.  I have posted more than once to freebsd-net@ asking the driver
> authors to set these and some now have.  (I can't do it, because I don't
> have the hardware to test it with.)
>

Thanks for the reminder.  I have generated a diff against HEAD.
https://people.freebsd.org/~yongari/tso.param.diff
The diff restores the optimal TSO parameters, which were lost in
r271946, for drivers that relied on sane default values.
I'll commit it after some testing.

> I do think that most/all of them don't subtract 1 for the tcp/ip header
> and I don't think they should be expected to, since the driver isn't
> supposed to worry about the protocol at that level.

I agree.

> --> I think tcp_output() should subtract one from the
>     if_hw_tsomaxsegcount provided by the driver to handle this, since it
>     chooses to count mbufs (the while() loop at around line #825 in
>     sys/netinet/tcp_output.c.) before it prepends the tcp/ip header mbuf.
>
> rick
>
> > --HPS
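To see why a chain of many small fragments runs into the segment count
limit long before the size limit, here is a simplified model of the
counting described in the thread.  It is not the code in tcp_output():
tso_fit_len() and its parameters are made up for this sketch, and the
chain is represented as a plain array of fragment lengths.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes from the front of a chain of fragments that fit into at most
 * max_segs hardware segments of at most max_seg_size bytes each.
 */
size_t
tso_fit_len(const size_t *frag_len, size_t nfrags, size_t max_segs,
    size_t max_seg_size)
{
	size_t fit = 0, i, need;

	assert(max_seg_size > 0);
	for (i = 0; i < nfrags && max_segs > 0; i++) {
		/* Segments this fragment needs, rounded up. */
		need = (frag_len[i] + max_seg_size - 1) / max_seg_size;
		/* A zero-length fragment still occupies one segment. */
		if (need == 0)
			need = 1;
		if (need > max_segs) {
			/* Only the leading part of this fragment fits. */
			fit += max_segs * max_seg_size;
			break;
		}
		fit += frag_len[i];
		max_segs -= need;
	}
	return fit;
}
```

In this model, forty 100-byte fragments against a limit of 32 segments
yield only 3200 bytes per request, while four 16 KB fragments fill a
full 64 KB with segments to spare, which is the situation the quoted
"too many small fragments" check is reacting to.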
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 09:51:44AM +0200, Hans Petter Selasky wrote: On 08/19/15 09:42, Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote: On 08/18/15 23:54, Rick Macklem wrote: Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part. I think all drivers in tree don't subtract 1 for if_hw_tsomaxsegcount. Probably touching Mellanox driver would be simpler than fixing all other drivers in tree. Hi, If you change the behaviour don't forget to update and/or add comments describing it. Maybe the amount of subtraction could be defined by some macro? Then drivers which inline the headers can subtract it? I'm also ok with your suggestion. Your suggestion is fine by me. The initial TSO limits were tried to be preserved, and I believe that TSO limits never accounted for IP/TCP/ETHERNET/VLAN headers! I guess FreeBSD used to follow MS LSOv1 specification with minor exception in pseudo checksum computation. If I recall correctly the specification says upper stack can generate up to IP_MAXPACKET sized packet. 
Other L2 headers like ethernet/vlan header size is not included in the packet and it's drivers responsibility to allocate additional DMA buffers/segments for L2 headers. Maybe it can be controlled by some kind of flag, if all the three TSO limits should include the TCP/IP/ethernet headers too. I'm pretty sure we want both versions. Hmm, I'm afraid it's already complex. Drivers have to tell almost the same information to both bus_dma(9) and network stack. You're right it's complicated. Not sure if bus_dma can provide an API for this though. --HPS
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote:
On 08/18/15 23:54, Rick Macklem wrote:

Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing than expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tcp/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.)

Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part.

I think no driver in the tree subtracts 1 for if_hw_tsomaxsegcount. Touching the Mellanox driver would probably be simpler than fixing all the other drivers in the tree.

Maybe it can be controlled by some kind of flag, if all the three TSO limits should include the TCP/IP/ethernet headers too. I'm pretty sure we want both versions.

Hmm, I'm afraid it's already complex. Drivers have to tell almost the same information to both bus_dma(9) and the network stack.
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 08:13:59AM -0400, Rick Macklem wrote: Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:51:44AM +0200, Hans Petter Selasky wrote: On 08/19/15 09:42, Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote: On 08/18/15 23:54, Rick Macklem wrote: Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part. I think all drivers in tree don't subtract 1 for if_hw_tsomaxsegcount. Probably touching Mellanox driver would be simpler than fixing all other drivers in tree. Hi, If you change the behaviour don't forget to update and/or add comments describing it. Maybe the amount of subtraction could be defined by some macro? Then drivers which inline the headers can subtract it? I'm also ok with your suggestion. Your suggestion is fine by me. The initial TSO limits were tried to be preserved, and I believe that TSO limits never accounted for IP/TCP/ETHERNET/VLAN headers! I guess FreeBSD used to follow MS LSOv1 specification with minor exception in pseudo checksum computation. If I recall correctly the specification says upper stack can generate up to IP_MAXPACKET sized packet. 
The sizes of other L2 headers, such as the ethernet/vlan header, are not included in the packet, and it is the driver's responsibility to allocate additional DMA buffers/segments for the L2 headers.

Yep. The default for if_hw_tsomax was reduced from IP_MAXPACKET to 32 * MCLBYTES - max_ethernet_header_size as a workaround/hack so that devices limited to 32 transmit segments would work (i.e. the entire packet, including the MAC header, would fit in 32 MCLBYTES clusters). This implied that many drivers did end up using m_defrag() to copy the mbuf list to one made up of 32 MCLBYTES clusters. If a driver sets if_hw_tsomaxsegcount correctly, then it can set if_hw_tsomax to the largest TSO packet (without MAC header) the hardware can handle. If it can handle IP_MAXPACKET, then it can set it to that.

I thought the upper limit was still IP_MAXPACKET. If a driver increases it (i.e. > IP_MAXPACKET), the length field in the IP header would overflow, which in turn may break firewalls and other packet handling in the IPv4/IPv6 code path. If the limit no longer applies to the network stack, that's great. Some controllers can handle up to 256KB TCP/UDP segmentation, and supporting that feature wouldn't be hard.
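The arithmetic behind the reduced default and the "provides - 1" suggestion can be sketched as follows. This is only an illustration under stated assumptions: MCLBYTES is taken to be 2048, IP_MAXPACKET 65535, and the largest Ethernet header 14 bytes plus a 4-byte VLAN tag. The EX_ constants and ex_ functions are hypothetical names defined locally, not kernel symbols.

```c
#include <stdint.h>

/* Assumed values, redefined locally so the sketch builds outside the kernel. */
#define EX_MCLBYTES        2048u   /* assumed mbuf cluster size */
#define EX_IP_MAXPACKET    65535u  /* largest value of the 16-bit IP length field */
#define EX_ETHER_HDR_LEN   14u     /* Ethernet header */
#define EX_VLAN_ENCAP_LEN  4u      /* 802.1Q tag */

/*
 * Largest TSO packet (without MAC header) that still fits, together
 * with the MAC header, in nsegs clusters of EX_MCLBYTES each.
 */
static uint32_t
ex_tsomax_for_segments(uint32_t nsegs)
{
	uint32_t limit;

	limit = nsegs * EX_MCLBYTES - (EX_ETHER_HDR_LEN + EX_VLAN_ENCAP_LEN);
	/* Never exceed what the IP length field can express. */
	return (limit < EX_IP_MAXPACKET ? limit : EX_IP_MAXPACKET);
}

/*
 * Segments left for payload once one mbuf is set aside for the
 * TCP/IP header, as in the "whatever the driver provides - 1" idea.
 */
static uint32_t
ex_payload_segments(uint32_t hw_maxsegs)
{
	return (hw_maxsegs > 0 ? hw_maxsegs - 1 : 0);
}
```

With 32 segments the first function stays just below IP_MAXPACKET, which is why the reduced default could be used for devices limited to 32 transmit segments; with more segments the IP length field becomes the limiting factor.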
Re: RE not working on 10.2-RELEASE #0 r286731M
On Fri, Aug 14, 2015 at 06:29:08PM -0400, Kim Culhan wrote: [...] On 08/14/15 13:34, Kim Culhan wrote: RE on 10.2-RELEASE #0 r286731M appears to pass only arp traffic. Replaced if_re.c with version from 273757, appears to work normally. The diff: 34c34 __FBSDID($FreeBSD: stable/10/sys/dev/re/if_re.c 273757 2014-10-28 00:43:00Z yongari $); --- __FBSDID($FreeBSD: releng/10.2/sys/dev/re/if_re.c 285177 2015-07-05 20:16:38Z marius $); 3198,3202d3197 * Enable transmit and receive. */CSR_WRITE_1(sc, RL_COMMAND, RL_CMD_TX_ENB|RL_CMD_RX_ENB); /* 3227a3223,3227 /* * Enable transmit and receive. */ CSR_WRITE_1(sc, RL_COMMAND, RL_CMD_TX_ENB | RL_CMD_RX_ENB); 3251,3254d3250 #ifdef notdef/* Enable receiver and transmitter. */CSR_WRITE_1(sc, RL_COMMAND, RL_CMD_TX_ENB|RL_CMD_RX_ENB); #endif Let me know what additional info I can provide. [...] I'm running -current with all changes in place, I'm not seeing the issues noted here with my hardware. Can you post your hardware from pciconf -lv? re0@pci0:3:0:0: class=0x02 card=0x84321043 chip=0x816810ec rev=0x06 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller' class = network subclass = ethernet re1@pci0:4:5:0: class=0x02 card=0x43021186 chip=0x43021186 rev=0x10 hdr=0x00 vendor = 'D-Link System Inc' device = 'DGE-530T Gigabit Ethernet Adapter (rev.C1) [Realtek RTL8169]' class = network subclass = ethernet sean pciconf -lv re0@pci0:2:0:0: class=0x02 card=0x83671043 chip=0x816810ec rev=0x02 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller' class = network subclass = ethernet re1@pci0:6:0:0: class=0x02 card=0x816910ec chip=0x816910ec rev=0x10 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' 
    device = 'RTL8169 PCI Gigabit Ethernet Controller'
    class = network
    subclass = ethernet
re2@pci0:6:1:0: class=0x02 card=0x4c001186 chip=0x43001186 rev=0x10 hdr=0x00
    vendor = 'D-Link System Inc'
    device = 'DGE-528T Gigabit Ethernet Adapter'
    class = network
    subclass = ethernet

The problem was noted on re2; re0 and re1 appeared to be working normally.

Hmm, it seems your PCI controller does not work. I can't explain why Sean's re1 still works though. Would you try the attached patch? BTW, it would be better to see the re(4) related dmesg output. The driver shows the Chip/MAC revision, and that is the only way to identify each MAC revision.

Index: sys/dev/re/if_re.c
===
--- sys/dev/re/if_re.c (revision 286823)
+++ sys/dev/re/if_re.c (working copy)
@@ -3197,6 +3197,12 @@ re_init_locked(struct rl_softc *sc)
 		    ~0x0008);
 
 	/*
+	 * Enable transmit and receive for non-PCIe controllers.
+	 * RX/TX MACs should be enabled before RX/TX configuration.
+	 */
+	if ((sc->rl_flags & RL_FLAG_PCIE) == 0)
+		CSR_WRITE_1(sc, RL_COMMAND, RL_CMD_TX_ENB | RL_CMD_RX_ENB);
+	/*
 	 * Set the initial TX configuration.
 	 */
 	if (sc->rl_testmode) {
@@ -3223,9 +3229,11 @@ re_init_locked(struct rl_softc *sc)
 	}
 
 	/*
-	 * Enable transmit and receive.
+	 * Enable transmit and receive for PCIe controllers.
+	 * RX/TX MACs should be enabled after RX/TX configuration.
 	 */
-	CSR_WRITE_1(sc, RL_COMMAND, RL_CMD_TX_ENB | RL_CMD_RX_ENB);
+	if ((sc->rl_flags & RL_FLAG_PCIE) != 0)
+		CSR_WRITE_1(sc, RL_COMMAND, RL_CMD_TX_ENB | RL_CMD_RX_ENB);
 
 #ifdef DEVICE_POLLING
 	/*
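The ordering rule the patch above introduces can be restated outside the driver as a small decision function. This is a sketch only: EX_FLAG_PCIE, the ex_step enum and ex_init_order() are hypothetical stand-ins for illustration, not identifiers from if_re.c, and the flag value is an arbitrary assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_FLAG_PCIE 0x0001u	/* hypothetical flag bit, value assumed */

enum ex_step {
	EX_ENABLE_MAC,		/* write TX_ENB | RX_ENB to the command register */
	EX_CONFIGURE_TXRX	/* program the TX/RX configuration registers */
};

/*
 * Fill order[0..1] with the init sequence described in the patch:
 * non-PCIe controllers enable the MACs before TX/RX configuration,
 * PCIe controllers enable them afterwards.
 */
static void
ex_init_order(uint32_t flags, enum ex_step order[2])
{
	bool pcie = (flags & EX_FLAG_PCIE) != 0;

	order[0] = pcie ? EX_CONFIGURE_TXRX : EX_ENABLE_MAC;
	order[1] = pcie ? EX_ENABLE_MAC : EX_CONFIGURE_TXRX;
}
```

Keeping the decision in one place like this makes it easy to see that exactly one of the two enable writes in the patch runs for any given controller.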
Re: cpsw/atphy network drivers
On Thu, Mar 12, 2015 at 01:17:12PM +, Matt Dooner wrote: Hello, Thank you for your reply. Confirming the link configuration was a good exercise, but I think I've been able to rule it out as the issue. 100baseTX half-duplex appears to be the configuration request by the switch the board was plugged into. I have connected the 335x board directly to two other systems (windows and freebsd) and the correct configuration is negotiated when either or both are set to auto (If I change the configuration on one machine the other updates its configuration accordingly). I have also tested setting the link manually on both systems. I have also confirmed that my two other systems can connect with each other and the switch. I've connected the 335x board directly to another FreeBSD 10.1 (desktop) system. The desktop uses the fxp-miibus-inphy driver combo. I ifconfig 192.168.0.1 255.255.255.0 and ifconfig 192.168.0.2 255.255.255.0 each system respectively. I also setup default routes between them. When I create traffic (ping) on either machine I see the following incremented on the 335x: dev.cpsw.0.stats.GoodTxFrames: 64 dev.cpsw.0.stats.BroadcastTxFrames: 64 dev.cpsw.0.stats.RxTx65to127OctetFrames: 64 and on the desktop: dev.fxp.0.status.tx.good_frames: 3 All other stats on both the 335x and desktop are zero. Good to know you've solved the issue. I am able to follow similar steps to build a working network between the desktop and a windows laptop. Do you know if atphy(4) has been previously tested to work on the AR8033 or even the AR8031? miidevs only has an entry for AR8021. I've No I'm not aware of that. only found limited information about the PHY being used, but its from OpenBSD and the wrong cpu type. These is also a note in the change logs about this hardware Added atphy(4) to armv7, for the Atheros AR8031 phys in the AM335x starter kit. 
(http://www.openbsd.org/plus57.html)

openbsd-current sys\arch\armv7\imx\imxenet.c:

	case BOARD_ID_IMX6_WANDBOARD:	/* AR8031 */
		/* disable SmartEEE */
		imxenet_miibus_writereg(dev, phy, 0x0d, 0x0003);
		...
		imxenet_miibus_writereg(dev, phy, 0x0e, reg & ~0x0100);

		/* enable 125MHz clk output for AR8031 */
		imxenet_miibus_writereg(dev, phy, 0x0d, 0x0007);
		imxenet_miibus_writereg(dev, phy, 0x0e, 0x8016);

important configure pin mux and work mode to RGMII mode.

It seems that some additional driver development will likely be required.

I don't have a datasheet for the AR8031/AR8033 PHYs, so I'm not sure whether it's feasible to apply the PHY config magic above to atphy(4). I'm under the impression that AR8031/AR8035 may have some other special registers that report the resolved speed/duplex, which would require a new PHY driver. Linux seems to have slightly better comments for those PHYs though.
Re: cpsw/atphy network drivers
On Fri, Mar 06, 2015 at 12:45:22PM +, Matt Dooner wrote: Hello, I am having some trouble configurating the network driver on a TI T335x-based CoM system (http://www.compulab.co.il/products/computer-on-modules/cm-t335/). It uses the the AM335x integrated Ethernet MAC coupled with the AR8033 RGMII Ethernet PHY from Atheros. U-Boot is able to find the device as expected: CM-T335w # mii device MII devices: 'cpsw' Current device: 'cpsw' CM-T335w # mdio list cpsw: 0 - AR8031/AR8033 -- cpsw CM-T335w # dhcp link up on port 0, speed 100, half duplex BOOTP broadcast 1 DHCP client bound to address 10.1.192.67 CM-T335w # ping 8.8.8.8 link up on port 0, speed 100, half duplex Using cpsw device host 8.8.8.8 is alive [...] root@beaglebone:~ # ifconfig cpsw0: flags=8847UP,BROADCAST,DEBUG,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8000bRXCSUM,TXCSUM,VLAN_MTU,LINKSTATE ether 1c:ba:8c:ed:40:99 inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255 09:58:57 cpsw_ifmedia_sts 09:58:57 cpsw_ifmedia_sts 09:58:57 cpsw_ifmedia_sts media: Ethernet autoselect (100baseTX half-duplex) Given that you get a half-duplex link, I guess there is speed or duplex mismatch with link partner. Does link partner also agree on the established link speed/duplex? If this is not the case, you may be able to see alignment errors or CRC errors via H/W MAC statistics counters. Quick reading the code indicates that cpsw(4) exports sysctl stat nodes(dev.cpsw.%d.stats). If link partner also supports H/W MAC statistics counters you can consult the info on the link partner. If the issue is really speed/duplex mismatch issue, probably you can try one of the following. - If you have manual media configuration for cpsw(4), use auto. - If link partner uses fixed speed/duplex instead of auto, use the same media configuration on cpsw(4). - If neither helps, try unplugging the UTP cable and wait a couple of seconds then plug it again. 
It seems there is no miibus_statchg handler in cpsw(4), so I guess cpsw(4) may not be able to program some MAC parameters, including duplex, when the established link is not the one cpsw(4) assumes. So it would be best to manually set link parameters on both the link partner and cpsw(4) to use the same link configuration (100Mbps, full-duplex).

	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=63<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
	inet 127.0.0.1 netmask 0xff00
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>

When connected to another computer running Wireshark no frames are recorded as having been transmitted over the interface. The cpsw driver never reports receiving any packets, even when I use a tool like Ostinato to craft frames addressed to the MAC of the NIC on the board. The network interface works perfectly in Debian Linux:

It seems that both the link state change handling and the promiscuous mode handling need more work.
Re: Very bad Realtek problems
On Wed, Oct 29, 2014 at 01:01:26PM -0400, Mason Loring Bliss wrote:
On Wed, Oct 29, 2014 at 10:46:30AM +0900, Yonghyeon PYUN wrote:

Given that you can reliably reproduce the issue, let's check simple ones first.

Just as a quick update, I couldn't tolerate the network outages any more as they were impacting my work, so I bought an Intel NIC. That said, this will actually free me up to do more debugging of the Realtek as soon as I get a chance to finish setting up a small test network - I'll be able to look at both sides of interactions instead of depending on the flaky interface.

Ok, if you happen to find spare time for testing, let me know your findings.

If you think the issue intermittently happens regardless of network load, try the attached patch. I'm not sure whether the patch makes any difference for you since many PCIe NICs don't implement the CLKREQ feature. It's just a wild guess.

This is an onboard NIC, for what it's worth, on my:

Yes, it's very common to see LOM versions these days.

Base Board Information
Manufacturer: ASUSTeK Computer INC.
Product Name: M4A88T-M
Version: Rev X.0x

I'm not sure if that changes it as compared with a plug-in PCIe device. Just mentioning it for completeness.

There is not much difference for the driver between a LOM and a standalone NIC. The LOM version may have some modifications compared to the engineering samples I have. And motherboard vendors are free to program the EEPROM/FLASH of the NIC to meet their needs. I don't think the motherboard vendor heavily changed the NIC configuration though.
Re: Very bad Realtek problems
On Mon, Oct 27, 2014 at 11:44:45PM -0400, Mason Loring Bliss wrote:
On Tue, Oct 28, 2014 at 10:50:20AM +0900, Yonghyeon PYUN wrote:

Currently re(4) relies heavily on power-on default settings since detailed register configuration information is not available. Some register configurations made in Windows can survive a warm boot.

Alright, a cold boot doesn't help. I froze up on an rsync, and observed these stats:

re0 statistics:
Tx frames : 209681
Rx frames : 27559
Tx errors : 0
Rx errors : 0
Rx missed frames : 0
Rx frame alignment errs : 0
Tx single collisions : 0
Tx multiple collisions : 0
Rx unicast frames : 27548
Rx broadcast frames : 9
Rx multicast frames : 2
Tx aborts : 0
Tx underruns : 0

I rebooted with MSI and MSI-X disabled, and it broke again on an rsync. I observed:

re0 statistics:
Tx frames : 416065
Rx frames : 47783
Tx errors : 0
Rx errors : 0
Rx missed frames : 0
Rx frame alignment errs : 0
Tx single collisions : 0
Tx multiple collisions : 0
Rx unicast frames : 47757
Rx broadcast frames : 24
Rx multicast frames : 2
Tx aborts : 0
Tx underruns : 0

The multicast frames seem to coincide with interface lock-ups. This time to correct I said ifconfig re0 down; ifconfig re0 up. It came back but the rsync died.

I guess you don't see 'watchdog timeout' errors, so the driver's watchdog handler didn't help. Given that you can reliably reproduce the issue, let's check simple ones first.

- Disable all H/W offloading features (TX/RX checksum offloading, TSO, VLAN H/W tag insertion/stripping) and see whether that makes any difference.
- If that makes no difference, identify which part of the MAC is stuck. Before taking the interface down/up again after the rsync breakage, run tcpdump on your box and see whether you can still see RX packets. If you can see RX packets, it indicates the RX MAC still works.
- After that, run ping(8) to another host and see whether you can see the ICMP echo request packets sent from your host. If you can see the ICMP echo request packets, it indicates the TX MAC works.
I'll be happy to provide further debugging information once I know how to collect it.

If you think the issue intermittently happens regardless of network load, try the attached patch. I'm not sure whether the patch makes any difference for you since many PCIe NICs don't implement the CLKREQ feature. It's just a wild guess.

Thanks.

Index: sys/dev/re/if_re.c
===
--- sys/dev/re/if_re.c (revision 273756)
+++ sys/dev/re/if_re.c (working copy)
@@ -1365,6 +1365,7 @@ re_attach(device_t dev)
 		    PCIER_LINK_CTL, 2);
 		if ((ctl & PCIEM_LINK_CTL_ASPMC) != 0) {
 			ctl &= ~PCIEM_LINK_CTL_ASPMC;
+			ctl &= ~PCIEM_LINK_CTL_ECPM;
 			pci_write_config(dev, sc->rl_expcap +
 			    PCIER_LINK_CTL, ctl, 2);
 			device_printf(dev, "ASPM disabled\n");
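What the one-line patch above does to the PCIe Link Control register can be tried in isolation. The following is a minimal sketch, assuming the usual register layout (ASPM control in bits 1:0, Enable Clock Power Management in bit 8); the EX_ macros and ex_disable_link_pm() are local, hypothetical names, and the real driver reads and writes the register through pci_read_config()/pci_write_config() instead of taking it as an argument.

```c
#include <stdint.h>

/* Assumed bit layout of the PCIe Link Control register. */
#define EX_LINK_CTL_ASPMC 0x0003u	/* ASPM control, bits 1:0 */
#define EX_LINK_CTL_ECPM  0x0100u	/* Enable Clock Power Management, bit 8 */

/*
 * Mirror the patched logic: only when ASPM is enabled, clear both
 * the ASPM control bits and the clock power management bit, and
 * leave every other bit of the register untouched.
 */
static uint16_t
ex_disable_link_pm(uint16_t ctl)
{
	if ((ctl & EX_LINK_CTL_ASPMC) != 0)
		ctl &= (uint16_t)~(EX_LINK_CTL_ASPMC | EX_LINK_CTL_ECPM);
	return (ctl);
}
```

Note that, as in the patch, the clock power management bit is left alone when ASPM was not enabled to begin with.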
Re: Very bad Realtek problems
On Mon, Oct 27, 2014 at 03:51:24PM -0400, Mason Loring Bliss wrote: Hi, all. I've been having sporadic and serious problems with the Realtek gigabit interface built into my motherboard. Periodically, it just freezes up. I've tried several things to no avail: turning on DEVICE_POLLING, frobbing bootloader options and sysctl settings, etc. [...] It's not clear what's happening. I have been capturing stats periodically with 'sysctl dev.re.0.stats=1', but that doesn't always show a problem. For instance, during one of the lock-ups last night, after a reboot, I got this: re0 statistics: Tx frames : 171306 Rx frames : 20271 Tx errors : 0 Rx errors : 0 Rx missed frames : 0 Rx frame alignment errs : 0 Tx single collisions : 0 Tx multiple collisions : 0 Rx unicast frames : 20271 Rx broadcast frames : 0 Rx multicast frames : 0 Tx aborts : 0 Tx underruns : 0 After running overnight, with sporadic automated transfers: re0 statistics: Tx frames : 4658945 Rx frames : 1258514 Tx errors : 0 Rx errors : 33 Rx missed frames : 0 Rx frame alignment errs : 3591 Tx single collisions : 0 Tx multiple collisions : 0 Rx unicast frames : 1255880 Rx broadcast frames : 2411 Rx multicast frames : 223 Tx aborts : 0 Tx underruns : 0 I was seeing the Rx multicast frames creep up each time I saw a freeze last night, which was confusing in that I'm not sure why there'd be any multicast traffic. RealTek controllers have small number of H/W MAC counters so it's somewhat hard to guess what's happening there. But the RX frame alignment error normally indicates cabling issue or speed/duplex mismatches with link partner. It's normal to see multicast frames in local LAN. Here's the card from dmesg, with MSI/X turned off: re0: RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet port 0xe800-0xe8ff mem 0xfbfff000-0xfbff,0xfbff8000-0xfbffbfff irq 18 at device 0.0 on pci2 re0: Chip rev. 0x2c00 re0: MAC rev. 0x0020 It seems your controller is RTL8168E. [...] 
In general I've been saying ifconfig re0 down ; ifconfig re0 up to kick the interface, but last night a friendly person from IRC mentioned that I could work around this by running a steady ping and frobbing mediatype when I see the pings fail. So, I've got this running:

    while true
    do
        ping -c 1 -t 1 firewall > /dev/null 2>&1
        if [ $? -ne 0 ]; then
            date
            echo toggling re0
            echo
            ifconfig re0 media 1000baseT mediaopt full-duplex,flowcontrol,master
            ifconfig re0 media autoselect mediaopt flowcontrol
            sleep 3
        fi
        sleep 1
    done

Please don't manually set media types for 1000baseT. It will result in speed/duplex mismatches and other issues. This is probably the main reason why you see RX alignment errors. You should always stick to auto-negotiation with 1000baseT (flow control can be set though). Manual media configuration is meant to work around buggy link partners.

This has been noting failures sporadically throughout the day, but it's allowing traffic to continue moving, albeit with the occasional hiccough.

This hardware has been running Debian for a couple years, and it's never had so much as a short hiccough, so I have confidence that the hardware is fine. It suggests that there's something the Linux driver is doing to handle this hardware that FreeBSD isn't doing. For a while I was dual-booting and I'd see errors with FreeBSD running that weren't there under Debian.

I'd started diving into the source, both Linux and FreeBSD, but I lack sufficient exposure to ethernet driver code to be able to get a high-level picture of what they're doing, and as such I haven't yet noticed any special-case or hardware glitch handling that we're missing, although I might find something eventually.

The data sheet for the RealTek controller is not publicly available. Linux uses firmware for every RealTek controller. I vaguely guess it may be PHY DSP fixups, but I don't have any detailed information about the firmware.

I'm struggling with finding a way to see what's actually happening with this.
I've toggled MSI and MSI-X handling, I've turned down interrupt handling delays, I've tried both I/O and memory register transfers, although I'm not actually clear what's happening differently there. I've had polling variously enabled and disabled.

One thing to note is that last night's horror while I was trying to move some back-up data was after rebooting from Windows. (Installed on a partition for gaming...) It made me wonder if we're not fully setting up some state on the card. I'd had what felt like a solid, glitchless week before that.

The vendor's Windows driver may access/program a large set of registers unknown to re(4). Currently re(4) relies heavily on power-on default settings since detailed register configuration information is not available. Some register configurations made in Windows can survive a warm boot. Does cold-boot from Windows make any difference for
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Wed, Oct 01, 2014 at 11:14:59AM +0200, Nils Beyer wrote:

Hi,

Yonghyeon PYUN wrote:

The default interrupt moderation policy is targeted to reduce latency, so it will generate up to 10k interrupts/sec under high network load. If you want to reduce the number of interrupts/sec, tune the interrupt moderation sysctl variables mentioned in alc(4).

Tried several values here:

dev.alc.0.int_rx_mod={1000,1,10}
dev.alc.0.int_tx_mod={1000,1,10}

but didn't notice any changes in either CPU usage or throughput during the iperf test; kernel{alc0 taskq} stays at 70-75%. I've downed/upped the interface alc0 after every change.

You may see a difference when the H/W handles tiny frames (i.e. 64-byte UDP frames). For bulk TCP/UDP transfers, alc(4) can easily saturate the link.

A simple iSCSI test using the native CTL interface works really well. A fio test results in 100MB/s read and write. Double-checking using netstat -I confirms gigabit-line speeds at around 120MB/s. CPU usage at kernel{alc0 taskq} is as high as in the iperf test. So I think that's a limitation of the AR8161 chip.

Updated the diff to address link establishment issue.
http://people.freebsd.org/~yongari/alc/pci.quirk.diff
http://people.freebsd.org/~yongari/alc/alc.diff.20141001

Confirmed; with the anti-hibernation patch, link establishment is now working flawlessly. Thank you very much for your work...

Thanks for your testing. Patch updated again to fix a wrong lock assertion.
http://people.freebsd.org/~yongari/alc/alc.diff.20141002
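The "up to 10k interrupts/sec" figure quoted above follows from simple arithmetic if the moderation timer is expressed in microseconds and defaults to 100. Both of those are assumptions here and should be checked against alc(4); ex_max_intr_per_sec() is a hypothetical helper, not a driver function.

```c
#include <stdint.h>

/*
 * Upper bound on interrupts per second when at most one interrupt
 * is raised per moderation interval of mod_usecs microseconds.
 */
static uint32_t
ex_max_intr_per_sec(uint32_t mod_usecs)
{
	/* 0 is treated here as "no moderation": there is no bound to report. */
	if (mod_usecs == 0)
		return (0);
	return (1000000u / mod_usecs);
}
```

Under these assumptions a timer of 100 gives the 10k/sec ceiling from the mail, and raising it to 1000 would cap the rate at 1k/sec. A bulk transfer that already coalesces many frames per interrupt may show little change either way, which would be consistent with the test results reported above.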
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Tue, Sep 30, 2014 at 10:20:31AM +0200, Nils Beyer wrote:

Hi Yonghyeon,

Yonghyeon PYUN wrote:

I've added support for QAC AR816x/AR817x ethernet controllers. It passed my limited testing and I need more testers. You can find patches from the following URLs. [...]

My NIC (System: Dell Inc. Vostro 3460):
---
none2@pci0:2:0:0: class=0x02 card=0x05621028 chip=0x10911969 rev=0x10 hdr=0x00
    vendor = 'Atheros Communications Inc.'
    device = 'AR8161 Gigabit Ethernet'
    class = network
    subclass = ethernet
---

I've successfully applied both of your patches to 10.1-BETA3 (r272295) sources and rebuilt the kernel.

Thanks for testing and the detailed report.

Then I've connected a network cable and rebooted. I've got a link and performed an iperf test. The results are really good: around 930 Mbit/s TX and 840 Mbit/s RX. CPU load during that test: 70.75% kernel{alc0 taskq}.

Hmm, the RX performance number looks bad to me. You should see more than 920Mbps. Could you show me the output of pciconf -lcbv?

Then I've dis- and reconnected the network cable. Unfortunately, I cannot get a link anymore; it stays at: status: no carrier - tried ifconfig down/up, reconnecting the network cable several times, but it stays down. After another reboot the link can be established again. That doesn't always happen. Sometimes I easily get a link again, sometimes not and I need to reboot.

I thought I had verified the link-lost condition before requesting testing. After reading your mail, I was able to reproduce it with an engineering sample board. It seems that when the link stays lost for long enough, alc(4) fails to re-establish a link. I have no idea how to address that at the moment, but I'll let you know if I manage to find a clue.

If you need any further information or debugging, please let me know. For now, I'm using if_alc as an unloadable module - in case the NIC stalls for some reason; and am quite happy about being able to use that NIC now. Thanks for having added support for these NICs.
Because I'm not using CURRENT, I've replied to the net mailing list. I hope, it is okay for you... That's ok. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Tue, Sep 30, 2014 at 11:35:03AM +0200, Nils Beyer wrote:

Hi,

Yonghyeon PYUN wrote:

Then I've connected a network cable and rebooted. I've got a link and performed an iperf test. The results are really good: around 930 Mbit/s TX and 840 Mbit/s RX. CPU load during that test: 70.75% kernel{alc0 taskq}.

Hmm, the RX performance number looks bad to me. You should see more than 920Mbps.

You're right; my fault - sorry for that. The iperf partner seems to have a bad/weak NIC because it also only gets 840 Mbit/s sending to another computer. So I've exchanged the iperf partner with another computer and am getting now 935 Mbit/s in both directions.

Ok, thanks for letting me know that. If you use jumbo frames you would get better performance numbers.

I should always measure measuring equipment before measuring.

The default interrupt moderation policy is targeted to reduce latency, so it will generate up to 10k interrupts/sec under high network load. If you want to reduce the number of interrupts/sec, tune the interrupt moderation sysctl variables mentioned in alc(4).

Could you show me the output of pciconf -lcbv?

Probably not necessary anymore, but here you are (with additional -e option):
---
#pciconf -lcbve | tail -20
alc0@pci0:2:0:0: class=0x02 card=0x05621028 chip=0x10911969 rev=0x10 hdr=0x00
    vendor = 'Atheros Communications Inc.'
    device = 'AR8161 Gigabit Ethernet'
    class = network
    subclass = ethernet
    bar [10] = type Memory, range 64, base 0xd040, size 262144, enabled
    bar [18] = type I/O Port, range 32, base 0x2000, size 128, enabled
    cap 01[40] = powerspec 3 supports D0 D3 current D0
    cap 10[58] = PCI-Express 1 endpoint max data 128(4096) link x1(x1) speed 2.5(2.5) ASPM L1(L0s/L1)
    cap 05[c0] = MSI supports 16 messages, 64 bit, vector masks
    cap 11[d8] = MSI-X supports 16 messages, enabled
                 Table in map 0x10[0x2000], PBA in map 0x10[0x3000]
    ecap 0001[100] = AER 1 0 fatal 1 non-fatal 1 corrected
    ecap 0003[180] = Serial 1 ff55c9fb5cf9ddff
    PCI-e errors = Correctable Error Detected
                   Non-Fatal Error Detected
                   Unsupported Request Detected
       Non-fatal = Unsupported Request
       Corrected = Bad DLLP
---

Thanks for the info.

I thought I had verified the link-lost condition before requesting testing. After reading your mail, I was able to reproduce it with an engineering sample board. It seems that when the link stays lost for long enough, alc(4) fails to re-establish a link.

Confirmed - if the network cable is disconnected long enough I cannot get a link either. As a workaround I un- and reload the if_alc module; then everything is working again as before...

Updated the diff to address link establishment issue.
http://people.freebsd.org/~yongari/alc/pci.quirk.diff
http://people.freebsd.org/~yongari/alc/alc.diff.20141001

Thanks.
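For context on the 935 Mbit/s result and the remark about jumbo frames in this thread, the ceiling for TCP payload on a gigabit link can be estimated from per-frame overhead. This is a rough sketch under stated assumptions: 38 bytes of Ethernet overhead per frame on the wire (14 header, 4 FCS, 8 preamble, 12 inter-frame gap) and 52 bytes of IPv4 and TCP headers including the 12-byte timestamp option. ex_tcp_goodput_kbps() is a hypothetical helper name.

```c
#include <stdint.h>

#define EX_WIRE_OVERHEAD  38u	/* assumed: Ethernet header, FCS, preamble, IFG */
#define EX_L3L4_OVERHEAD  52u	/* assumed: IPv4 + TCP headers with timestamps */

/*
 * Best-case TCP payload rate in kbit/s for a link of link_kbps when
 * every frame carries a full MTU-sized IP packet. Returns 0 if the
 * MTU is too small to carry any payload.
 */
static uint64_t
ex_tcp_goodput_kbps(uint64_t link_kbps, uint32_t mtu)
{
	if (mtu <= EX_L3L4_OVERHEAD)
		return (0);
	return (link_kbps * (mtu - EX_L3L4_OVERHEAD) /
	    (mtu + EX_WIRE_OVERHEAD));
}
```

Called with a link rate of 1000000 kbit/s, an MTU of 1500 yields a ceiling of roughly 941 Mbit/s, so 935 Mbit/s is close to line rate, while an MTU of 9000 raises the ceiling to roughly 990 Mbit/s.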
Re: Success with Qualcomm Atheros QCA8171
On Thu, Sep 25, 2014 at 11:44:24AM +0200, Nils Beyer wrote:

Hi Gulyaev,

Gulyaev Ghosh wrote:

Since I asked on the FreeBSD forums, there was a suggestion to check alx-freebsd, and I now have an initialized interface. So if someone has similar hardware, you can try that project and share your experience.

I've got an onboard AR8161 Gigabit Ethernet adapter. With your proposed Git repository and using its branch master I'm able to rudimentarily use the NIC. But only if the network cable is short (< 2m). Using longer cables gives me a no carrier status. When it is connected, speed is abysmal and functionality ceases as soon as I perform an iperf benchmark until I reboot.

FYI: I've completed adding AR816x/AR817x support. You can find the diffs at the following URLs.
http://people.freebsd.org/~yongari/alc/pci.quirk.diff
http://people.freebsd.org/~yongari/alc/alc.diff.20140930
Re: Success with Qualcomm Atheros QCA8171
On Fri, Sep 26, 2014 at 09:17:57AM +0800, Kevin Lo wrote: On Thu, Sep 25, 2014 at 11:44:24AM +0200, Nils Beyer wrote: Hi Gulyaev, Gulyaev Ghosh wrote: Since I ask on the FreeBSD forums, there is a proposition to check alx-freebsd and have initialized interface. So if someone have similar hardware, you can got your experience with that project and share info. I've got an onboard AR8161 Gigabit Ethernet adapter. With your proposed GIT- repository and using its branch master I'm able to rudimentarily use the NIC. But only if the network cable is short ( 2m). Using longer cables gives me a no carrier status. When it is connected, speed is abysmal and functio- Yeah, the controller requires lots of DSP fixup codes under various situations. nality ceases as soon as I perform an iperf benchmark until I reboot. But for small data transfers (SSH, RSYNC, RDP) it's okay. Mark has done a good job so far: === #dmesg | grep alx alx0: Qualcomm Atheros AR8161 Gigabit Ethernet port 0x2000-0x207f mem 0xd040-0xd043 irq 16 at device 0.0 on pci2 alx0: Ethernet address: 5c:f9:dd:55:c9:fb #ifconfig alx0 alx0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 ether 5c:f9:dd:55:c9:fb inet6 fe80::5ef9:ddff:fe55:c9fb%alx0 prefixlen 64 scopeid 0x2 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect status: no carrier #pciconf -evl | tail -13 alx0@pci0:2:0:0:class=0x02 card=0x05621028 chip=0x10911969 rev=0x10 hdr=0x00 vendor = 'Atheros Communications Inc.' device = 'AR8161 Gigabit Ethernet' class = network subclass = ethernet PCI-e errors = Correctable Error Detected Non-Fatal Error Detected Unsupported Request Detected Non-fatal = Unsupported Request Corrected = Bad TLP Bad DLLP REPLAY_NUM Rollover Replay Timer Timeout === AFAIK yongari@ is working on it. Yes, I'm working on it. Very basic things seems to work at this moment but it still has lots of things to be resolved. 
Due to lack of spare time the progress is very slow but I'll let you guys know when it's ready for public testing.
Re: jme interface bounces up and down, up and down....
On Mon, Sep 15, 2014 at 08:19:37AM -0600, Brett Glass wrote: At 12:08 AM 9/15/2014, Yonghyeon PYUN wrote: Would you show me the output of dmesg(jme(4) and jmphy(4) only) to know exact chip revision? Here you are. jme0: JMicron Inc, JMC25x Gigabit Ethernet port 0xec80-0xecff,0xe800-0xe8ff mem 0xfbffc000-0xfbfff fff irq 18 at device 0.0 on pci1 jme0: MSIX count : 8 jme0: MSI count : 8 jme0: attempting to allocate 1 MSI-X vectors (8 supported) msi: routing MSI-X IRQ 256 to local APIC 0 vector 52 jme0: using IRQ 256 for MSI-X jme0: Using 1 MSIX messages. jme0: PCI device revision : 0x0250 jme0: Chip revision : 0x11 ^^ Initially I suspected you might have relatively new JMC25x controller but it seems you have early revision of the controller(JMC250 A2). Early revision of the controller has 1000baseT link establishment issue with 802.3az capable switches. The issue is explained in jme(4) man page. The known workaround is to manually set 100baseTX media. I recall you mentioned Linux had no problems so I wonder Linux was able to establish a 1000baseT link. In theory, the workaround could be implemented in driver but it is layering violation and will have to duplicate lots of work done by mii(4). jme0: ethernet hardware address not found in EEPROM. jme0: PHY is at address 1. jme0: Read request size : 512 bytes. jme0: TLP payload size : 128 bytes. miibus0: MII bus on jme0 jmphy0: JMP211 10/100/1000 media interface PHY 1 on miibus0 jmphy0: OUI 0x00d831, model 0x0021, rev. 1 jmphy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX -flow-master, auto, auto-flow jme0: bpf attached jme0: Ethernet address: e0:cb:4e:54:23:ac I do not particularly like JMicron chips (their faulty SSD controllers have cost me a lot of time and money), but that's I don't have experiences with other JMicron products so can't comment on SSD. 
JMC25x may not be the world's best gigabit Ethernet controller, but I was quite satisfied with its high performance, clear documentation and the vendor's support. what's on the motherboard of this Asus. So, I need to find a way to make it work. --Brett Glass
Re: jme interface bounces up and down, up and down....
On Tue, Sep 16, 2014 at 05:53:51PM -0600, Brett Glass wrote: At 05:27 PM 9/16/2014, Chris Hill wrote: On Tue, 16 Sep 2014, Brett Glass wrote: So, what is the best solution? I cannot throw out the machine, and because I am using a VLAN switch to multiplex the port to three LANs I do not want to reduce the speed to 100 Mbps. Ideas? The man page mentioned says that if the link partner enabled the IEEE 802.3az Energy Efficient Ethernet feature, the controller will not be able to establish a 1000baseT link. Maybe disable 802.3az on that port, if you can. Just a thought. It's a Netgear green switch, model GS105E. It has no way to disable 802.3az. Then, probably the only available option to establish a link against the switch would be using reduced speed(100Mbps) with ifconfig(8). If you can't reduce the speed due to other reasons I'm afraid there is no way to establish a link at this moment. As I said in previous mail, did you check what resolved speed Linux shows? Also it would be good idea to know whether you're really seeing the PHY hardware issue or not. Directly connect the jme(4) to other box without switch and see whether jme(4) can establish a 1000baseT link. --Brett ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: jme interface bounces up and down, up and down....
On Tue, Sep 16, 2014 at 07:59:18PM -0500, Jim Thompson wrote: On Sep 16, 2014, at 6:53 PM, Brett Glass br...@lariat.net wrote: At 05:27 PM 9/16/2014, Chris Hill wrote: On Tue, 16 Sep 2014, Brett Glass wrote: So, what is the best solution? I cannot throw out the machine, and because I am using a VLAN switch to multiplex the port to three LANs I do not want to reduce the speed to 100 Mbps. Ideas? The man page mentioned says that if the link partner enabled the IEEE 802.3az Energy Efficient Ethernet feature, the controller will not be able to establish a 1000baseT link. Maybe disable 802.3az on that port, if you can. Just a thought. It's a Netgear green switch, model GS105E. It has no way to disable 802.3az. The Linux jmebp-1.0.8.5 driver from the JMicron website ftp://driver.jmicron.com.tw/Ethernet/Linux/jmebp-1.0.8.5.tar.bz provides a workaround for the issue. It adds the delay_time module parameter, which causes the network card to attempt a fall back to 100 Mbps after it cannot connect for several seconds (by default 11). With this, link detection "works", but the connection is 100 Mbps. This is likely the reason the problem didn't seem to occur with the bundled Linux distro. I recall the workaround was suggested by the vendor but I didn't incorporate it into the driver due to other reasons. The end result is the same if users can manually reduce the speed to 100 Mbps.
Re: jme interface bounces up and down, up and down....
On Sat, Sep 13, 2014 at 07:13:44PM -0600, Brett Glass wrote: Everyone: I just installed FreeBSD 10.0-RELEASE on an Asus EeeBox B202 (which comes with Linux). This particular version of the product comes with a JMicron gigabit Ethernet adapter that uses the jme(4) driver. Because it only has one port and I need several, I've set it up with multiple VLANs, which are trunked out to a little Netgear VLAN switch. Unfortunately, the interface is bouncing up and down every few minutes: Sep 13 12:44:44 kern.notice testbed kernel: jme0: link state changed to UP Sep 13 12:44:44 kern.notice testbed kernel: jme0_1: link state changed to UP Sep 13 12:44:44 kern.notice testbed kernel: jme0_2: link state changed to UP Sep 13 12:44:44 kern.notice testbed kernel: jme0_3: link state changed to UP Sep 13 12:50:04 kern.notice testbed kernel: jme0: link state changed to DOWN Sep 13 12:50:04 kern.notice testbed kernel: jme0_1: link state changed to DOWN Sep 13 12:50:04 kern.notice testbed kernel: jme0_2: link state changed to DOWN Sep 13 12:50:04 kern.notice testbed kernel: jme0_3: link state changed to DOWN Sep 13 12:50:43 kern.notice testbed kernel: jme0: link state changed to UP ... The problem didn't seem to occur with the bundled Linux distro. Has anyone else seen this problem? Know of a fix? Would you show me the output of dmesg(jme(4) and jmphy(4) only) to know exact chip revision? ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: NFS client READ performance on -current
On Sat, Jul 12, 2014 at 05:14:00PM -0400, Rick Macklem wrote: Yonghyeon Pyun wrote: On Fri, Jul 11, 2014 at 09:54:23AM -0400, John Baldwin wrote: On Thursday, July 10, 2014 6:31:43 pm Rick Macklem wrote: John Baldwin wrote: On Thursday, July 03, 2014 8:51:01 pm Rick Macklem wrote: Russell L. Carter wrote: On 07/02/14 19:09, Rick Macklem wrote: Could you please post the dmesg stuff for the network interface, so I can tell what driver is being used? I'll take a look at it, in case it needs to be changed to use m_defrag(). em0: Intel(R) PRO/1000 Network Connection 7.4.2 port 0xd020-0xd03f mem 0xfe4a-0xfe4b,0xfe48-0xfe49 irq 44 at device 0.0 on pci2 em0: Using an MSI interrupt em0: Ethernet address: 00:15:17:bc:29:ba 001.07 [2323] netmap_attach success for em0 tx 1/1024 rx 1/1024 queues/slots This is one of those dual nic cards, so there is em1 as well... Well, I took a quick look at the driver and it does use m_defrag(), but I think that the retry: label it does a goto after doing so might be in the wrong place. The attached untested patch might fix this. Is it convenient to build a kernel with this patch applied and then try it with TSO enabled? rick ps: It does have the transmit segment limit set to 32. I have no idea if this is a hardware limitation. I think the retry is not in the wrong place, but the overhead of all those pullups is apparently quite severe. The m_defrag() call after the first failure will just barely squeeze the just under 64K TSO segment into 32 mbuf clusters. Then I think any m_pullup() done during the retry will allocate an mbuf (at a glance it seems to always do this when the old mbuf is a cluster) and prepend that to the list. -- Now the list is 32 mbufs again and the bus_dmammap_load_mbuf_sg() will fail again on the retry, this time fatally, I think? I can't see any reason to re-do all the stuff using m_pullup() and Russell reported that moving the retry: fixed his problem, from what I understood. 
Ah, I had assumed (incorrectly) that the m_pullup()s would all be nops in this case. It seems the NIC would really like to have all those things in a single segment, but it is not required, so I agree that your patch is fine. I recall em(4) controllers have various limitation in TSO. Driver has to update IP header to make TSO work so driver has to get a writable mbufs. bpf(4) consumers will see IP packet length is 0 after this change. I think tcpdump has a compile time option to guess correct IP packet length. The firmware of controller also should be able to access complete IP/TCP header in a single buffer. I don't remember more details in TSO limitation but I guess you may be able to get more details TSO limitation from publicly available Intel data sheet. I think that the patch should handle this ok. All of the m_pullup() stuff gets done the first time. Then, if the result is more than 32 mbufs in the list, m_defrag() is called to copy the chain. This should result in all the header stuff in the first mbuf cluster and the map call is done again with this list of clusters. (Without the patch, m_pullup() would allocate another prepended mbuf and make the chain more than 32mbufs again.) Yes, your patch looks right. Russell seemed to confirm that the patch fixed the problem for him, but since I don't have em(4) hardware, it would be nice to have someone with commit privilege and access to em(4) hardware test and commit it. Due to breakage of power supply on a box with em(4) controller, I can't test the patch. But I guess it's ok to commit it and Russel already tested it. Thanks for your patch. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: NFS client READ performance on -current
On Fri, Jul 11, 2014 at 09:54:23AM -0400, John Baldwin wrote: On Thursday, July 10, 2014 6:31:43 pm Rick Macklem wrote: John Baldwin wrote: On Thursday, July 03, 2014 8:51:01 pm Rick Macklem wrote: Russell L. Carter wrote: On 07/02/14 19:09, Rick Macklem wrote: Could you please post the dmesg stuff for the network interface, so I can tell what driver is being used? I'll take a look at it, in case it needs to be changed to use m_defrag(). em0: Intel(R) PRO/1000 Network Connection 7.4.2 port 0xd020-0xd03f mem 0xfe4a-0xfe4b,0xfe48-0xfe49 irq 44 at device 0.0 on pci2 em0: Using an MSI interrupt em0: Ethernet address: 00:15:17:bc:29:ba 001.07 [2323] netmap_attach success for em0 tx 1/1024 rx 1/1024 queues/slots This is one of those dual nic cards, so there is em1 as well... Well, I took a quick look at the driver and it does use m_defrag(), but I think that the retry: label it does a goto after doing so might be in the wrong place. The attached untested patch might fix this. Is it convenient to build a kernel with this patch applied and then try it with TSO enabled? rick ps: It does have the transmit segment limit set to 32. I have no idea if this is a hardware limitation. I think the retry is not in the wrong place, but the overhead of all those pullups is apparently quite severe. The m_defrag() call after the first failure will just barely squeeze the just under 64K TSO segment into 32 mbuf clusters. Then I think any m_pullup() done during the retry will allocate an mbuf (at a glance it seems to always do this when the old mbuf is a cluster) and prepend that to the list. -- Now the list is 32 mbufs again and the bus_dmammap_load_mbuf_sg() will fail again on the retry, this time fatally, I think? I can't see any reason to re-do all the stuff using m_pullup() and Russell reported that moving the retry: fixed his problem, from what I understood. Ah, I had assumed (incorrectly) that the m_pullup()s would all be nops in this case. 
It seems the NIC would really like to have all those things in a single segment, but it is not required, so I agree that your patch is fine. I recall em(4) controllers have various limitation in TSO. Driver has to update IP header to make TSO work so driver has to get a writable mbufs. bpf(4) consumers will see IP packet length is 0 after this change. I think tcpdump has a compile time option to guess correct IP packet length. The firmware of controller also should be able to access complete IP/TCP header in a single buffer. I don't remember more details in TSO limitation but I guess you may be able to get more details TSO limitation from publicly available Intel data sheet. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: [RFC] Allow m_dup() to use JUMBO clusters
On Mon, Jul 07, 2014 at 10:12:07AM +0200, Hans Petter Selasky wrote: Hi, I'm asking for some input on the attached m_dup() patch, so that existing functionality or dependencies are not broken. The background for the change is to allow m_dup() to defrag long mbuf chains that doesn't fit into a specific hardware's scatter gather entries, typically when doing TSO. In my case the HW limit is 16 entries of length 4K for doing a 64KByte I wonder how HW can handle a full-sized TSO packet(64KB + Ethernet header + VLAN tag). TSO packet. Currently m_dup() is at best producing 32 entries of each 2K for a 64Kbytes TSO packet. By allowing m_dup() to get JUMBO clusters when allocating mbufs, we avoid creating a new function, specific to the hardware, to defrag some rare-occurring very long mbuf chains into a mbuf chain below 16 entries. I think m_dup() was used to get a copy of writable mbuf chains. If m_dup() starts to allocate jumbo mbufs it will eventually fail on long running boxes. This will break firewall(ipfw divert, pf/ipf dup-to) rules and several ethernet drivers. I don't know how many TSO requests could be queued by HW but if the number is very small, the driver may be able to pre-allocate that number of buffers (N * (64KB + Ethernet header + VLAN tag)) in driver. Upper stack will almost always generate more than 16 mbufs for TSO packets. When driver knows the length of mbuf chain of TSO packet is more than 16, you can copy the mbuf chain to the pre-allocated buffer. I recall I didn't implement TSO on txp(4) because the firmware of txp(4) controller does not support more than 16 fragment descriptors. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: RX checksum offloading problem
On Mon, May 12, 2014 at 01:22:03PM +0200, Michael Tuexen wrote: On 12 May 2014, at 03:36, Yonghyeon PYUN pyu...@gmail.com wrote: On Fri, May 09, 2014 at 12:46:48PM +0200, Michael Tuexen wrote: On 09 May 2014, at 03:35, Yonghyeon PYUN pyu...@gmail.com wrote: [...] Oops, sorry. You're right. Probably I was confused with old memory when I worked on that area. I've quickly read IP reassembly code again and as you said, it should work. However it seems there is a checksumming bug here. /* * In order to do checksumming faster we do 'end-around carry' here * (and not in for{} loop), though it implies we are not going to * reassemble more than 64k fragments. */ m->m_pkthdr.csum_data = (m->m_pkthdr.csum_data & 0xffff) + (m->m_pkthdr.csum_data >> 16); I guess the line above didn't account for a possible carry produced by the computation itself. Probably it could be rewritten as the following. while (m->m_pkthdr.csum_data & 0xffff0000) m->m_pkthdr.csum_data = (m->m_pkthdr.csum_data & 0xffff) + (m->m_pkthdr.csum_data >> 16); I think you are right here. Good catch. Will you fix it? Done in r265942.
Re: TX Checksum offloading issue with re interfaces
On Mon, May 12, 2014 at 01:09:18PM +0200, Michael Tuexen wrote: On 12 May 2014, at 06:38, Yonghyeon PYUN pyu...@gmail.com wrote: On Fri, May 09, 2014 at 12:33:24PM +0200, Michael Tuexen wrote: On 09 May 2014, at 03:47, Yonghyeon PYUN pyu...@gmail.com wrote: On Thu, May 08, 2014 at 08:50:48PM +0200, Michael Tuexen wrote: Dear all, while testing checksum offloading of UDP packets over IP with IP options, I figured out that my card dev.re.1.%desc: RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet dev.re.1.%driver: re dev.re.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PE1F.LAN2 dev.re.1.%pnpinfo: vendor=0x10ec device=0x8168 subvendor=0x1734 subdevice=0x1159 class=0x02 dev.re.1.%parent: pci13 dev.re.1.stats: -1 dev.re.1.int_rx_mod: 65 computes the UDP checksum, but stores it in the packet at the place, where it would be, if there are no IP options. So it corrupts the options in the packet... I looked at sys/dev/re/if_re.c, but couldn't figure out how to fix it. Any idea? re(4) has a very long history on its broken TX checksum offloading. So re(4) has many workarounds for known issues on several variants. re(4) controllers support TX IPv4/TCP/UDP checksum offloading. For 8168C/8168CP, TX IPv4 checksum offloading was disabled due to generation of corrupted frames. Could you show me the dmesg output(only re(4)/rgephy(4))? The vendor uses the same PCI id for its RTL8168/8111 family chips so dmesg output is necessary to know exact controller revision. Sure (re1 was used during the test): re0: RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet port 0x8000-0x80ff mem 0xf6104000-0xf6104fff,0xf610-0xf6103fff irq 16 at device 0.0 on pci12 re0: Using 1 MSI-X message re0: Chip rev. 0x2880 re0: MAC rev. 
0x0020 miibus0: MII bus on re0 rgephy0: RTL8251 1000BASE-T media interface PHY 1 on miibus0 rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow re0: Ethernet address: 00:19:99:85:31:d9 re1: RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet port 0x9000-0x90ff mem 0xf5c2-0xf5c20fff,0xf620-0xf620 irq 17 at device 0.0 on pci13 re1: Using 1 MSI-X message re1: Chip rev. 0x3c80 re1: MAC rev. 0x0030 miibus1: MII bus on re1 rgephy1: RTL8169S/8110S/8211 1000BASE-T media interface PHY 1 on miibus1 rgephy1: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow re1: Ethernet address: 00:19:99:7e:c7:46 It seems you have two variants. You are right, I didn't know. Both are on-board interfaces... re0 is RTL8168DP and re1 is RTL8168CP. Do you see the issue on both controllers? I guess you may see the issue on re1 only since you've posted dev.re.1 output. I've attached a diff which may It wasn't intentionally, but by accident, based on the addresses I was using. However, I now tested both interfaces and re0 works without any patch, but re1 needs your patch. address the issue on re1 interface. If you see the issue on re0, I have to change the diff to include RTL8168D. Your patch looks good. Please go ahead and commit it. Thanks for your help! Fixed in r265943. Thanks for testing! ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: RX checksum offloading problem
On Thu, May 08, 2014 at 08:40:22PM +0200, Michael Tuexen wrote: On 07 May 2014, at 10:37, Yonghyeon PYUN pyu...@gmail.com wrote: On Wed, May 07, 2014 at 10:07:09AM +0200, Michael Tuexen wrote: On 07 May 2014, at 09:56, Yonghyeon PYUN pyu...@gmail.com wrote: On Sat, May 03, 2014 at 11:52:47AM +0200, Michael Tuexen wrote: On 02 May 2014, at 16:02, Bjoern A. Zeeb b...@freebsd.org wrote: On 02 May 2014, at 10:22 , Michael Tuexen michael.tue...@lurchi.franken.de wrote: Dear all, during testing I found that FreeBSD head (on a raspberry pi) accepts SCTP packet with bad checksums. After debugging this I figured out that this is a problem with the csum_flags defined in mbuf.h. The SCTP code on its input path checks for CSUM_SCTP_VALID, which is defined in mbuf.h: #define CSUM_SCTP_VALID CSUM_L4_VALID This makes sense: If CSUM_SCTP_VALID is set in csum_flags, the packet is considered to have a correct checksum. For UDP and TCP some drivers calculate the UDP/TCP checksum and set CSUM_DATA_VALID in csum_flags to indicate that the UDP/TCP should consider csum_data to figure out if the packet has a correct checksum. The problem is that CSUM_DATA_VALID is defined as #define CSUM_DATA_VALID CSUM_L4_VALID In this case the semantic is not that the packet has a valid checksum, but the csum_data field contains information. Now the following happens (on the raspberry pi the driver used is dev/usb/net/if_smsc.c 1. A packet is received and if it is not too short, the checksum computed is stored in csum_data and the flag CSUM_DATA_VALID is set. This happens for all IP packets, not only for UDP and TCP packets. 2. In case of SCTP packets, the SCTP interprets CSUM_DATA_VALID as CSUM_SCTP_VALID and accepts the packet. So no SCTP checksum check ever happened. Alternatives to fix this: 1. Change all drivers to set CSUM_DATA_VALID only in case of UDP or TCP packets, since it only makes sense in these cases. 
Wait, or for SCTP in cad the crc32 (I think it was) was actually checked but not otherwise. This is how it should be imho. It seems like a driver bug. I went through the list of drivers and you are right, it seems to be a bug in if_smsc.c. Most of the other drivers check for UDP/TCP, a small set I can't tell. I'm not sure how the controller computes TCP/UDP checksum values. It seems the publicly available data sheet was highly sanitized so it was useless to me. The comment in the driver says that the Same for me... controller computes RX checksum after the IPv4 header to the end of ethernet frame. After seeing that comment, three questions popped up: OK, I did some testing. It looks like the card is just computing the checksum over the IP payload taking the correct IP header length into account. 1. Is the controller smart enough to skip IP options header in TCP/UDP checksum offloading? Yes, I can send fragmented and un-fragmented UDP packets with IP options and they are handled correctly. Even if the last fragment is too short. I'm assuming you're taking about receiving fragmented UDP packets with RX checksum offloading, right? 2. How controller handles UDP checksum value 0x(i.e. sender didn't compute UDP checksum)? This case isn't handled. However, udp_input() looks first for zero checksums and only after that in the csum_flags. So it doesn't result in any problems. Would you prefer not to set CSUM_DATA_VALID in this case? At least, it correctly updates UDP stats of netstat(1). 3. How the controller can compute TCP checksum of fragmented packets? At least it does it right for UDP... Hmm, CSUM_DATA_VALID indicates driver computed RX TCP/UDP checksum without pseudo header. As you know, controller can't compute TCP/UDP checksum until all its fragmented payload are read from wire. Packets may arrive out of order and may be mixed with other packets. 
Advanced controllers with enough memory may be able to compute TCP/UDP checksums by tracking each connections(e.g LRO) but low-end controllers may be not. It seems the controller does not even support RX TCP/UDP pseudo header checksum offloading so I wonder how this controller can support RX TCP/UDP checksum offloading for fragmented TCP/UDP packets. Some controllers maintain two bits for TCP/UDP checksum offloading result in status word. One bit is used to indicate whether controller performed TCP/UDP checksum offloading and the other bit is used to indicate whether the computed checksum is correct or not. For UDP checksum value 0x and fragmented TCP/UDP packets, these controllers do not attempt to compute TCP/UDP checksum. Best regards Michael Since you have the controller I guess it's easy to verify all cases. For case 3, I believe the controller can't handle fragmented frames so driver should have to explicitly check ip_off field of IPv4
Re: TX Checksum offloading issue with re interfaces
On Thu, May 08, 2014 at 08:50:48PM +0200, Michael Tuexen wrote: Dear all, while testing checksum offloading of UDP packets over IP with IP options, I figured out that my card dev.re.1.%desc: RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet dev.re.1.%driver: re dev.re.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PE1F.LAN2 dev.re.1.%pnpinfo: vendor=0x10ec device=0x8168 subvendor=0x1734 subdevice=0x1159 class=0x02 dev.re.1.%parent: pci13 dev.re.1.stats: -1 dev.re.1.int_rx_mod: 65 computes the UDP checksum, but stores it in the packet at the place, where it would be, if there are no IP options. So it corrupts the options in the packet... I looked at sys/dev/re/if_re.c, but couldn't figure out how to fix it. Any idea? re(4) has a very long history on its broken TX checksum offloading. So re(4) has many workarounds for known issues on several variants. re(4) controllers support TX IPv4/TCP/UDP checksum offloading. For 8168C/8168CP, TX IPv4 checksum offloading was disabled due to generation of corrupted frames. Could you show me the dmesg output(only re(4)/rgephy(4))? The vendor uses the same PCI id for its RTL8168/8111 family chips so dmesg output is necessary to know exact controller revision. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: RX checksum offloading problem
On Sat, May 03, 2014 at 11:52:47AM +0200, Michael Tuexen wrote: On 02 May 2014, at 16:02, Bjoern A. Zeeb b...@freebsd.org wrote: On 02 May 2014, at 10:22 , Michael Tuexen michael.tue...@lurchi.franken.de wrote: Dear all, during testing I found that FreeBSD head (on a raspberry pi) accepts SCTP packet with bad checksums. After debugging this I figured out that this is a problem with the csum_flags defined in mbuf.h. The SCTP code on its input path checks for CSUM_SCTP_VALID, which is defined in mbuf.h: #define CSUM_SCTP_VALID CSUM_L4_VALID This makes sense: If CSUM_SCTP_VALID is set in csum_flags, the packet is considered to have a correct checksum. For UDP and TCP some drivers calculate the UDP/TCP checksum and set CSUM_DATA_VALID in csum_flags to indicate that the UDP/TCP should consider csum_data to figure out if the packet has a correct checksum. The problem is that CSUM_DATA_VALID is defined as #define CSUM_DATA_VALID CSUM_L4_VALID In this case the semantic is not that the packet has a valid checksum, but the csum_data field contains information. Now the following happens (on the raspberry pi the driver used is dev/usb/net/if_smsc.c 1. A packet is received and if it is not too short, the checksum computed is stored in csum_data and the flag CSUM_DATA_VALID is set. This happens for all IP packets, not only for UDP and TCP packets. 2. In case of SCTP packets, the SCTP interprets CSUM_DATA_VALID as CSUM_SCTP_VALID and accepts the packet. So no SCTP checksum check ever happened. Alternatives to fix this: 1. Change all drivers to set CSUM_DATA_VALID only in case of UDP or TCP packets, since it only makes sense in these cases. Wait, or for SCTP in cad the crc32 (I think it was) was actually checked but not otherwise. This is how it should be imho. It seems like a driver bug. I went through the list of drivers and you are right, it seems to be a bug in if_smsc.c. Most of the other drivers check for UDP/TCP, a small set I can't tell. 
I'm not sure how the controller computes TCP/UDP checksum values. It seems the publicly available data sheet was highly sanitized so it was useless to me. The comment in the driver says that the controller computes RX checksum after the IPv4 header to the end of ethernet frame. After seeing that comment, three questions popped up: 1. Is the controller smart enough to skip IP options header in TCP/UDP checksum offloading? 2. How does the controller handle UDP checksum value 0x0000 (i.e. sender didn't compute UDP checksum)? 3. How can the controller compute the TCP checksum of fragmented packets? Since you have the controller I guess it's easy to verify all cases. For case 3, I believe the controller can't handle fragmented frames so the driver has to explicitly check the ip_off field of the IPv4 header. See how gem(4)/sk(4)/hme(4) and fxp(4) handle it. Best regards Michael -- Bjoern A. Zeeb "Come on. Learn, goddamn it.", WarGames, 1983
Re: re(4) causes memory corruption?
On Tue, Apr 08, 2014 at 11:21:12AM +0300, Andriy Gapon wrote: I have this network card (it's actually integrated into a motherboard): re0: RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet port 0xde00-0xdeff mem 0xfdaff000-0xfdaf,0xfdae-0xfdae irq 18 at device 0.0 on pci2 re0: Using 1 MSI-X message re0: Chip rev. 0x3c00 re0: MAC rev. 0x0040 miibus0: MII bus on re0 rgephy0: RTL8169S/8110S/8211 1000BASE-T media interface PHY 1 on miibus0 rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow When there is little traffic through the interface I do not observe any problems with it. But within 15 seconds of applying some moderate traffic I would always observe a heavy screen corruption often followed by a total freeze or a hardware self-reset. An example of the moderate traffic is 6 MBytes/s which results in about 10K interrupts per second. PCIe re(4) controllers do not seem to have an intelligent interrupt moderation feature. At least it's not documented at all. To overcome the H/W limitation, re(4) uses a one-shot timer interrupt to mitigate interrupt processing overhead. However the maximum time allowed for the one-shot timer is less than or equal to 65us so you may still see lots of interrupts under heavy load. I am not sure what causes the problem. Could it be some driver using memory that it should not, or hardware writing where it should not, or is this something completely in the hardware? I will appreciate any hints on possible ways to analyze this issue. It seems your controller is old RTL8168C and I'm not aware of any memory corruption issues with the RTL8168C. There were a couple of re(4) instability reports but they were using relatively recent re(4) controllers and none of them showed memory corruption. Thanks!
-- Andriy Gapon ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
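As a rough back-of-the-envelope check of the 65us remark, the sketch below computes the interrupt rate a one-shot moderation timer would allow if it expired back to back. It assumes one interrupt per timer expiry under sustained load; the function name is made up for illustration and nothing here is taken from re(4).

```c
#include <assert.h>

/*
 * Upper bound on interrupts per second when every interrupt is
 * delayed by a one-shot timer of timer_us microseconds.
 */
static unsigned
sk_max_irqs_per_sec(unsigned timer_us)
{
	assert(timer_us > 0);
	return (1000000u / timer_us);
}
```

With the 65us ceiling mentioned above this gives roughly 15K interrupts per second, the same order of magnitude as the 10K per second reported, so the timer alone cannot push the rate much lower.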
Re: re0: watchdog timeout
On Mon, Apr 07, 2014 at 08:09:04AM +0200, Frank Volf wrote:
> Yonghyeon PYUN wrote on 7-4-2014 3:22:
> > On Sun, Apr 06, 2014 at 07:37:08PM -0400, Rick Macklem wrote:
> > > Frank Volf wrote:
> > > > Hello,
> > > >
> > > > I'm experiencing watchdog timeouts with my Realtek interface card.
> > > >
> > > > I'm using a fairly new system (Shuttle DS47), running FreeBSD 10-STABLE. For this shuttle a patch has been recently committed to SVN to make this card work at all (revision *262391* http://svnweb.freebsd.org/base?view=revision&revision=262391).
> > > >
> > > > The timeout is only experienced under heavy network load (the system is running a bacula backup server that backs up to NFS connected storage), and typically large full backups trigger this. Normal traffic works fine (this system is e.g. also my firewall to the Internet).
> > >
> > > Since you mention NFS, you could try disabling TSO on the interface and see if that helps. (I'm beginning to feel like a parrot saying this, but...)
> > >
> > > If you care about why it might help, read this email thread:
> > > http://docs.FreeBSD.org/cgi/mid.cgi?1850411724.1687820.1395621539316.JavaMail.root
> > >
> > > If it happens to help, please email again, since there are probably better ways to fix the problem than disabling TSO.
> >
> > re(4) controllers support TSO but it was disabled a long time ago (r217832). It's still allowed to enable TSO, but users have to explicitly enable it with ifconfig. If Frank didn't explicitly enable TSO on the box, TSO may have nothing to do with the watchdog timeout, I guess.
>
> I haven't explicitly enabled TSO; the only option that has been explicitly set is -vlanhwtag. Here is the interface config:
>
> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8208b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>         ether 80:ee:73:77:e9:ab
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (1000baseT full-duplex)
>         status: active

It would be even better to know your network configuration. I'm not sure why you have to disable VLAN hardware tagging. But given that you've disabled it, could you also try disabling VLAN hardware checksum offloading?

> Regards,
> Frank
Re: re0: watchdog timeout
On Mon, Apr 07, 2014 at 08:45:00PM +0200, Frank Volf wrote:
> Yonghyeon PYUN wrote on 7-4-2014 10:32:
> > It would be even better to know your network configuration. I'm not sure why you have to disable VLAN hardware tagging. But given that you've disabled it, could you also try disabling VLAN hardware checksum offloading?
>
> Hi,
>
> The reason that I disable VLAN hardware tagging is that the system does not work with it enabled. To show this, see the following transcript (on a freshly booted system):
> [...]

Okay, I'll check VLAN hardware tagging with the RTL8168G, but the watchdog timeout is a different issue. I have no idea why this happens at the moment, but I'll let you know if I find a clue. Anyway, thanks for reporting.
Re: re0: watchdog timeout
On Sun, Apr 06, 2014 at 07:37:08PM -0400, Rick Macklem wrote:
> Frank Volf wrote:
> > Hello,
> >
> > I'm experiencing watchdog timeouts with my Realtek interface card.
> >
> > I'm using a fairly new system (Shuttle DS47), running FreeBSD 10-STABLE. For this shuttle a patch has been recently committed to SVN to make this card work at all (revision *262391* http://svnweb.freebsd.org/base?view=revision&revision=262391).
> >
> > The timeout is only experienced under heavy network load (the system is running a bacula backup server that backs up to NFS connected storage), and typically large full backups trigger this. Normal traffic works fine (this system is e.g. also my firewall to the Internet).
>
> Since you mention NFS, you could try disabling TSO on the interface and see if that helps. (I'm beginning to feel like a parrot saying this, but...)
>
> If you care about why it might help, read this email thread:
> http://docs.FreeBSD.org/cgi/mid.cgi?1850411724.1687820.1395621539316.JavaMail.root
>
> If it happens to help, please email again, since there are probably better ways to fix the problem than disabling TSO.

re(4) controllers support TSO but it was disabled a long time ago (r217832). It's still allowed to enable TSO, but users have to explicitly enable it with ifconfig. If Frank didn't explicitly enable TSO on the box, TSO may have nothing to do with the watchdog timeout, I guess.

> Good luck with it, rick
>
> > What might not be standard is that I use sub-interfaces on this system. First of all, the only way that I can get the sub-interfaces to work at all is by using ifconfig_re0=-vlanhwtag
> > I'm not sure that is related.
> >
> > The question is how can we debug this to solve the problem? I have no clue, but I'm happy to assist if somebody can tell me what I should do.
> >
> > Some information that might be useful:
> >
> > root@drawbridge:/usr/local/etc/bacula # dmesg | grep re0
> > re0: RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet port 0xd000-0xd0ff mem 0xf7a0-0xf7a00fff,0xf010-0xf0103fff irq 17 at device 0.0 on pci2
> > re0: Using 1 MSI-X message
> > re0: ASPM disabled
> > re0: Chip rev. 0x4c00
> > re0: MAC rev. 0x
> > miibus0: MII bus on re0
> > re0: Ethernet address: 80:ee:73:77:e9:ab
> > re0: watchdog timeout
> > re0: link state changed to DOWN
> > re0.98: link state changed to DOWN
> > re0.10: link state changed to DOWN
> > re0.11: link state changed to DOWN
> > re0.12: link state changed to DOWN
> > re0: link state changed to UP
> > re0.98: link state changed to UP
> > re0.10: link state changed to UP
> > re0.11: link state changed to UP
> > re0.12: link state changed to UP
> > ...
> >
> > root@drawbridge:/usr/local/etc/bacula # uname -a
> > FreeBSD drawbridge.internal.deze.org 10.0-STABLE FreeBSD 10.0-STABLE #0 r262433: Mon Feb 24 16:25:35 CET 2014 r...@drawbridge-new.internal.deze.org:/usr/obj/usr/sources/src10-stable/sys/SHUTTLE i386
> >
> > root@drawbridge:/usr/local/etc/bacula # pciconf -lv re0
> > re0@pci0:2:0:0: class=0x02 card=0x40181297 chip=0x816810ec rev=0x0c hdr=0x00
> >     vendor = 'Realtek Semiconductor Co., Ltd.'
> >     device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller'
> >     class = network
> >     subclass = ethernet
> >
> > Kind regards,
> > Frank
Re: miibus0: mii_mediachg: can't handle non-zero PHY instance 31
On Thu, Apr 03, 2014 at 01:18:19PM -0700, Chris H wrote:
> > On Tue, Apr 01, 2014 at 05:53:51PM -0700, Chris H wrote:
> > > > On Tue, Apr 01, 2014 at 01:40:58PM -0700, Chris H wrote:
> > > > > > On Tue, 2014-04-01 at 13:19 -0700, Chris H wrote:
> > > > > > > [...]
> > > > > > > miibus0: MII bus on nfe0
> > > > > > > rlphy0: RTL8201L 10/100 media interface PHY 0 on miibus0
> > > > > > > rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> > > > > > > rlphy1: RTL8201L 10/100 media interface PHY 1 on miibus0
> > > > > > > [...]---big-snip--8---
> > > > > > > miibus0: mii_mediachg: can't handle non-zero PHY instance 1
> > > > > > >
> > > > > > > As you can see, it looks much the same. I have no idea what I should do to better inform the driver/kernel how to better handle it. Or is it the driver, itself?
> > > > > > >
> > > > > > > Thank you again, for your thoughtful response.
> > > > > > >
> > > > > > > --Chris
> > > > > >
> > > > > > I think the way to fix a phy that responds at all addresses is to set a hint in loader.conf masking out the ones that aren't real, like so:
> > > > > >
> > > > > > hint.miibus.0.phymask=1
> > > > > >
> > > > > > You might be able to set =0x0001 to make it more clear it's a bitmask, but I'm not sure of that.
> > > > >
> > > > > Thank you very much for the hint. I'll give it a shot. Any idea why this is happening? I have 4 other MB's using the Nvidia chipset, and the nfe(4) driver. But they don't respond this way.
> > > >
> > > > If some nfe(4) variants behave badly in the probing stage, this should be handled by the driver. We already have too many hints and tunables and I don't think most users know that. In addition, adding an additional NIC may change the miibus instance number.
> > > >
> > > > Could you show me the output of 'kenv | grep smbios'?
> > >
> > > Yes, of course. Here it is:
> > >
> > > smbios.bios.reldate=11/22/2010
> > > smbios.bios.vendor=American Megatrends Inc.
> > > smbios.bios.version=V2.7
> > > smbios.chassis.maker=MSI
> > > smbios.chassis.serial=To Be Filled By O.E.M.
> > > smbios.chassis.tag=To Be Filled By O.E.M.
> > > smbios.chassis.version=2.0
> > > smbios.memory.enabled=2097152
> > > smbios.planar.maker=MSI
> > > smbios.planar.product=K9N6PGM2-V2 (MS-7309)
> > > smbios.planar.serial=To be filled by O.E.M.
> > > smbios.planar.version=2.0
> > > smbios.socket.enabled=1
> > > smbios.socket.populated=1
> > > smbios.system.maker=MSI
> > > smbios.system.product=MS-7309
> > > smbios.system.serial=To Be Filled By O.E.M.
> > > smbios.system.uuid=----406186cd4497
> > > smbios.system.version=2.0
> > > smbios.version=2.6
> > >
> > > Hope this helps, and thank you for all your time, and trouble.
> >
> > Thanks for the info. Try the attached patch and let me know how it works. Make sure to remove the hint (hint.miibus.0.phymask=1) set in loader.conf before testing it.
>
> Hello, and thanks for all the attention. Sorry for the delay. I chose to perform a dump(8) before attempting the kernel rebuild with the patch. But the kernel threw a read error message on one of the drives. So I had to sort out the problem on the drive before I could complete the dump. Then, of course, I had to reslice and format another drive to replace the ailing one before I could perform a restore(8) and start the nfe patch; build install kernel. Weird; the drive had only a few hours on it.
>
> Well, anyway. The patch applied cleanly. So I built and installed a new kernel with it, X'd out the hint.miibus.0.phymask=0x0001 in loader.conf(5), and bounced the box.
>
> Bad news:
>
> miibus0: mii_mediachg: can't handle non-zero PHY instance 31
> miibus0: mii_mediachg: can't handle non-zero PHY instance 30
> miibus0: mii_mediachg: can't handle non-zero PHY instance 29
> miibus0: mii_mediachg: can't handle non-zero PHY instance 28
> miibus0: mii_mediachg: can't handle non-zero PHY instance 27
> miibus0: mii_mediachg: can't handle non-zero PHY instance 26
> miibus0: mii_mediachg: can't handle non-zero PHY instance 25
> miibus0: mii_mediachg: can't handle non-zero PHY instance 24
> miibus0: mii_mediachg: can't handle non-zero PHY instance 23
> miibus0: mii_mediachg: can't handle non-zero PHY instance 22
> miibus0: mii_mediachg: can't handle non-zero PHY instance 21
> miibus0: mii_mediachg: can't handle non-zero PHY instance 20
> miibus0: mii_mediachg: can't handle non-zero PHY instance 19
> miibus0: mii_mediachg: can't handle non-zero PHY instance 18
> miibus0: mii_mediachg: can't handle non-zero PHY instance 17
> miibus0: mii_mediachg: can't handle non-zero PHY instance 16
> miibus0: mii_mediachg: can't handle non-zero PHY instance 15
> miibus0: mii_mediachg: can't handle non-zero PHY instance 14
> miibus0: mii_mediachg: can't handle non-zero PHY instance 13
> miibus0: mii_mediachg: can't handle non-zero PHY instance 12
> miibus0: mii_mediachg: can't handle non-zero PHY instance 11
> miibus0: mii_mediachg: can't handle non-zero PHY instance 10
> miibus0: mii_mediachg: can't handle non-zero PHY instance 9
> miibus0: mii_mediachg: can't handle non-zero PHY instance 8
> miibus0: mii_mediachg: can't handle non-zero PHY instance 7
> miibus0: mii_mediachg: can't handle non-zero PHY instance 6
> miibus0: mii_mediachg: can't handle non-zero PHY instance 5
> miibus0:
Re: miibus0: mii_mediachg: can't handle non-zero PHY instance 31
On Mon, Mar 31, 2014 at 07:57:28AM -0700, Chris H wrote:
> > On Sun, Mar 30, 2014 at 01:12:20PM -0700, chr...@ultimatedns.net wrote:
> > > Greetings,
> > >
> > > I'm not sure whether this best belonged on net@, or stable@ so I'm using both. :)
> > > I'm testing both releng_9, and MB, and I encountered a new message I don't usually see using the nfe(4) driver:
> > >
> > > miibus0: mii_mediachg: can't handle non-zero PHY instance 1
> > > ...
> > > miibus0: mii_mediachg: can't handle non-zero PHY instance 31
> > >
> > > Truncated for brevity (31 lines in total; 1-31). I don't know how to interpret this. An issue with my version of the driver, or the hardware itself? This occurred with both GENERIC, as well as my custom kernel.
> >
> > Would you show me the dmesg output?
>
> Happily:
>
> Calibrating TSC clock ... TSC clock: 3231132841 Hz
> CPU: AMD Sempron(tm) 140 Processor (3231.13-MHz K8-class CPU)
>   Origin = AuthenticAMD Id = 0x100f62 Family = 0x10 Model = 0x6 Stepping = 2
>   Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
>   Features2=0x802009<SSE3,MON,CX16,POPCNT>
>   AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
>   AMD Features2=0x37fd<LAHF,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
> [...]
> nfe0: NVIDIA nForce MCP61 Networking Adapter port 0xe480-0xe487 mem 0xdff7d000-0xdff7dfff irq 20 at device 7.0 on pci0
> nfe0: attempting to allocate 8 MSI vectors (8 supported)
> msi: routing MSI IRQ 257 to local APIC 0 vector 56
> msi: routing MSI IRQ 258 to local APIC 0 vector 57
> msi: routing MSI IRQ 259 to local APIC 0 vector 58
> msi: routing MSI IRQ 260 to local APIC 0 vector 59
> msi: routing MSI IRQ 261 to local APIC 0 vector 60
> msi: routing MSI IRQ 262 to local APIC 0 vector 61
> msi: routing MSI IRQ 263 to local APIC 0 vector 62
> msi: routing MSI IRQ 264 to local APIC 0 vector 63
> nfe0: using IRQs 257-264 for MSI
> nfe0: Using 8 MSI messages
> miibus0: MII bus on nfe0
> rlphy0: RTL8201L 10/100 media interface PHY 0 on miibus0
> rlphy0: OUI 0x04, model 0x0020, rev. 1
> rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> rlphy1: RTL8201L 10/100 media interface PHY 1 on miibus0
> rlphy1: OUI 0x04, model 0x0020, rev. 1
> rlphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> [...]
> rlphy30: RTL8201L 10/100 media interface PHY 30 on miibus0
> rlphy30: OUI 0x04, model 0x0020, rev. 1
> rlphy30: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> rlphy31: RTL8201L 10/100 media interface PHY 31 on miibus0
> rlphy31: OUI 0x04, model 0x0020, rev. 1
> rlphy31: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> nfe0: bpf attached
> nfe0: Ethernet address: 40:61:86:cd:44:97

mii(4) thinks it has 32 PHYs and this is the reason why mii(4) complains. For some unknown reason, accessing PHY registers in the device probe stage got a valid response at every address, which in turn makes the driver think there are 32 PHYs. Did you ever see this kind of message on an older FreeBSD release? Or could you try a cold boot and see whether it makes any difference?
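To illustrate how a bus that answers at every address ends up with 32 PHY instances, here is a small userland sketch of a probe loop. It is not the mii(4) code: the callback type, the two fake buses and the status value 0x7849 are made up for the example, and the only assumption carried over is that a status register reading of all zeros or all ones means "nothing at this address".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MII_NPHY 32	/* an MII management bus has 32 addresses */

/* Hypothetical callback: read the basic status register of one PHY. */
typedef uint16_t (*sk_read_bmsr_fn)(int phy);

/* Count the addresses that look populated to the probe loop. */
static int
sk_count_phys(sk_read_bmsr_fn read_bmsr)
{
	int phy, found = 0;

	assert(read_bmsr != NULL);
	for (phy = 0; phy < SK_MII_NPHY; phy++) {
		uint16_t bmsr = read_bmsr(phy);

		if (bmsr == 0x0000 || bmsr == 0xffff)
			continue;	/* no device answered here */
		found++;
	}
	return (found);
}

/* Well-behaved bus: only address 0 answers. */
static uint16_t
sk_bus_single(int phy)
{
	return (phy == 0 ? 0x7849 : 0xffff);
}

/* Misbehaving bus: the same PHY answers at every address. */
static uint16_t
sk_bus_echo_all(int phy)
{
	(void)phy;
	return (0x7849);
}
```

Running the loop against sk_bus_single finds one PHY, while sk_bus_echo_all yields 32, which matches the rlphy0 through rlphy31 attachments in the dmesg output above.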
Re: miibus0: mii_mediachg: can't handle non-zero PHY instance 31
On Tue, Apr 01, 2014 at 01:40:58PM -0700, Chris H wrote:
> > On Tue, 2014-04-01 at 13:19 -0700, Chris H wrote:
> > > [...]
> > > miibus0: MII bus on nfe0
> > > rlphy0: RTL8201L 10/100 media interface PHY 0 on miibus0
> > > rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> > > rlphy1: RTL8201L 10/100 media interface PHY 1 on miibus0
> > > [...]---big-snip--8---
> > > miibus0: mii_mediachg: can't handle non-zero PHY instance 1
> > >
> > > As you can see, it looks much the same. I have no idea what I should do to better inform the driver/kernel how to better handle it. Or is it the driver, itself?
> > >
> > > Thank you again, for your thoughtful response.
> > >
> > > --Chris
> >
> > I think the way to fix a phy that responds at all addresses is to set a hint in loader.conf masking out the ones that aren't real, like so:
> >
> > hint.miibus.0.phymask=1
> >
> > You might be able to set =0x0001 to make it more clear it's a bitmask, but I'm not sure of that.
>
> Thank you very much for the hint. I'll give it a shot. Any idea why this is happening? I have 4 other MB's using the Nvidia chipset, and the nfe(4) driver. But they don't respond this way.

If some nfe(4) variants behave badly in the probing stage, this should be handled by the driver. We already have too many hints and tunables and I don't think most users know that. In addition, adding an additional NIC may change the miibus instance number.

Could you show me the output of 'kenv | grep smbios'?
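The phymask hint quoted above is described as a bitmask. As a minimal sketch of that reading, the code below assumes that bit n of the mask stands for PHY address n and that a set bit means "probe this address"; that interpretation is inferred from the thread, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Non-zero when PHY address addr (0..31) is enabled in the mask. */
static int
sk_phy_enabled(uint32_t phymask, unsigned addr)
{
	return (addr < 32 && ((phymask >> addr) & 1u) != 0);
}

/* Number of PHY addresses left enabled by the mask. */
static unsigned
sk_phy_count(uint32_t phymask)
{
	unsigned addr, n = 0;

	for (addr = 0; addr < 32; addr++)
		n += (unsigned)sk_phy_enabled(phymask, addr);
	return (n);
}
```

Under that assumption a mask of 1 (or 0x0001, the same value) keeps only address 0, while an all-ones mask keeps all 32 addresses.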
Re: miibus0: mii_mediachg: can't handle non-zero PHY instance 31
On Tue, Apr 01, 2014 at 05:53:51PM -0700, Chris H wrote:
> > On Tue, Apr 01, 2014 at 01:40:58PM -0700, Chris H wrote:
> > > > On Tue, 2014-04-01 at 13:19 -0700, Chris H wrote:
> > > > > [...]
> > > > > miibus0: MII bus on nfe0
> > > > > rlphy0: RTL8201L 10/100 media interface PHY 0 on miibus0
> > > > > rlphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> > > > > rlphy1: RTL8201L 10/100 media interface PHY 1 on miibus0
> > > > > [...]---big-snip--8---
> > > > > miibus0: mii_mediachg: can't handle non-zero PHY instance 1
> > > > >
> > > > > As you can see, it looks much the same. I have no idea what I should do to better inform the driver/kernel how to better handle it. Or is it the driver, itself?
> > > > >
> > > > > Thank you again, for your thoughtful response.
> > > > >
> > > > > --Chris
> > > >
> > > > I think the way to fix a phy that responds at all addresses is to set a hint in loader.conf masking out the ones that aren't real, like so:
> > > >
> > > > hint.miibus.0.phymask=1
> > > >
> > > > You might be able to set =0x0001 to make it more clear it's a bitmask, but I'm not sure of that.
> > >
> > > Thank you very much for the hint. I'll give it a shot. Any idea why this is happening? I have 4 other MB's using the Nvidia chipset, and the nfe(4) driver. But they don't respond this way.
> >
> > If some nfe(4) variants badly behave in probing stage, this should be handled by driver. We already have too many hints and tunables and I don't think most users know that. In addition, adding additional NIC may change miibus instance number.
> >
> > Could you show me the output of 'kenv | grep smbios'?
>
> Yes, of course. Here it is:
>
> smbios.bios.reldate=11/22/2010
> smbios.bios.vendor=American Megatrends Inc.
> smbios.bios.version=V2.7
> smbios.chassis.maker=MSI
> smbios.chassis.serial=To Be Filled By O.E.M.
> smbios.chassis.tag=To Be Filled By O.E.M.
> smbios.chassis.version=2.0
> smbios.memory.enabled=2097152
> smbios.planar.maker=MSI
> smbios.planar.product=K9N6PGM2-V2 (MS-7309)
> smbios.planar.serial=To be filled by O.E.M.
> smbios.planar.version=2.0
> smbios.socket.enabled=1
> smbios.socket.populated=1
> smbios.system.maker=MSI
> smbios.system.product=MS-7309
> smbios.system.serial=To Be Filled By O.E.M.
> smbios.system.uuid=----406186cd4497
> smbios.system.version=2.0
> smbios.version=2.6
>
> Hope this helps, and thank you for all your time, and trouble.

Thanks for the info. Try attached patch and let me know how it works. Make sure to remove the hint(hint.miibus.0.phymask=1) set in loader.conf before testing it.

> --Chris

Index: sys/dev/nfe/if_nfe.c
===================================================================
--- sys/dev/nfe/if_nfe.c	(revision 264031)
+++ sys/dev/nfe/if_nfe.c	(working copy)
@@ -79,6 +79,7 @@ static int nfe_suspend(device_t);
 static int nfe_resume(device_t);
 static int nfe_shutdown(device_t);
 static int nfe_can_use_msix(struct nfe_softc *);
+static int nfe_detect_msik9(struct nfe_softc *);
 static void nfe_power(struct nfe_softc *);
 static int nfe_miibus_readreg(device_t, int, int);
 static int nfe_miibus_writereg(device_t, int, int, int);
@@ -334,13 +335,38 @@ nfe_alloc_msix(struct nfe_softc *sc, int count)
 	}
 }
 
+
 static int
+nfe_detect_msik9(struct nfe_softc *sc)
+{
+	static char *maker = "MSI";
+	static char *product = "K9N6PGM2-V2 (MS-7309)";
+	char *m, *p;
+	int found;
+
+	found = 0;
+	m = getenv("smbios.planar.maker");
+	p = getenv("smbios.planar.product");
+	if (m != NULL && p != NULL) {
+		if (strcmp(m, maker) == 0 && strcmp(p, product) == 0)
+			found = 1;
+	}
+	if (m != NULL)
+		freeenv(m);
+	if (p != NULL)
+		freeenv(p);
+
+	return (found);
+}
+
+
+static int
 nfe_attach(device_t dev)
 {
 	struct nfe_softc *sc;
 	struct ifnet *ifp;
 	bus_addr_t dma_addr_max;
-	int error = 0, i, msic, reg, rid;
+	int error = 0, i, msic, phyloc, reg, rid;
 
 	sc = device_get_softc(dev);
 	sc->nfe_dev = dev;
@@ -608,8 +634,13 @@ nfe_attach(device_t dev)
 #endif
 
 	/* Do MII setup */
+	phyloc = MII_PHY_ANY;
+	if (sc->nfe_devid == PCI_PRODUCT_NVIDIA_MCP61_LAN1) {
+		if (nfe_detect_msik9(sc) != 0)
+			phyloc = 0;
+	}
 	error = mii_attach(dev, &sc->nfe_miibus, ifp, nfe_ifmedia_upd,
-	    nfe_ifmedia_sts, BMSR_DEFCAPMASK, MII_PHY_ANY, MII_OFFSET_ANY,
+	    nfe_ifmedia_sts, BMSR_DEFCAPMASK, phyloc, MII_OFFSET_ANY,
 	    MIIF_DOPAUSE);
 	if (error != 0) {
 		device_printf(dev, "attaching PHYs failed\n");
Re: RFC: How to fix the NFS/iSCSI vs TSO problem
On Wed, Mar 26, 2014 at 08:27:48PM -0400, Rick Macklem wrote:
> pyu...@gmail.com wrote:
>> On Tue, Mar 25, 2014 at 07:10:35PM -0400, Rick Macklem wrote:
>>> Hi,
>>>
>>> First off, I hope you don't mind that I cross-posted this, but
>>> I wanted to make sure both the NFS/iSCSI and networking types
>>> see it.
>>> If you look in this mailing list thread:
>>> http://docs.FreeBSD.org/cgi/mid.cgi?1850411724.1687820.1395621539316.JavaMail.root
>>> you'll see that several people have been working hard at testing
>>> and thanks to them, I think I now know what is going on.
>>
>> Thanks for your hard work on narrowing down that issue. I'm too
>> busy for $work in these days so I couldn't find time to
>> investigate the issue.
>>
>>> (This applies to network drivers that support TSO and are limited
>>> to 32 transmit segments->32 mbufs in chain.) Doing a quick search
>>> I found the following drivers that appear to be affected (I may
>>> have missed some):
>>> jme, fxp, age, sge, msk, alc, ale, ixgbe/ix, nfe, e1000/em, re
>>
>> The magic number 32 was chosen a long time ago when I implemented
>> TSO in non-Intel drivers. I tried to find an optimal number to
>> reduce kernel stack usage at that time. bus_dma(9) will coalesce
>> with the previous segment if possible so I thought the number 32
>> was not an issue. Not sure current bus_dma(9) also has the same
>> code though. The number 32 is an arbitrary one so you can increase
>> it if you want.
>
> Well, in the case of ix Jack Vogel says it is a hardware limitation.
> I can't change drivers that I can't test and don't know anything
> about the hardware. Maybe replacing m_collapse() with m_defrag() is
> an exception, since I know what that is doing and it isn't hardware
> related, but I would still prefer a review by the driver
> author/maintainer before making such a change.
>
> If there are drivers that you know can be increased from 32->35
> please do so, since that will not only avoid the EFBIG failures but
> also avoid a lot of calls to m_defrag().
>
>>> Further, of these drivers, the following use m_collapse() and not
>>> m_defrag() to try and reduce the # of mbufs in the chain.
>>> m_collapse() is not going to get the 35 mbufs down to 32 mbufs, as
>>> far as I can see, so these ones are more badly broken:
>>> jme, fxp, age, sge, alc, ale, nfe, re
>>
>> I guess m_defeg(9) is more optimized for non-TSO packets. You don't
>> want to waste CPU cycles to copy the full frame to reduce the
>> number of mbufs in the chain. For TSO packets, m_defrag(9) looks
>> better but if we always have to copy a full TSO packet to make TSO
>> work, driver writers have to invent a better scheme rather than
>> blindly relying on m_defrag(9), I guess.
>
> Yes, avoiding m_defrag() calls would be nice. For this issue,
> increasing the transmit segment limit from 32->35 does that, if the
> change can be done easily/safely.
>
> Otherwise, all I can think of is my suggestion to add something like
> if_hw_tsomaxseg which the driver can use to tell tcp_output() the
> driver's limit for # of mbufs in the chain.
>
>>> The long description is in the above thread, but the short version
>>> is:
>>> - NFS generates a chain with 35 mbufs in it for (read/readdir
>>>   replies and write requests) made up of (tcpip header, RPC header,
>>>   NFS args, 32 clusters of file data)
>>> - tcp_output() usually trims the data size down to tp->t_tsomax
>>>   (65535) and then some more to make it an exact multiple of TCP
>>>   transmit data size.
>>> - the net driver prepends an ethernet header, growing the length
>>>   by 14 (or sometimes 18 for vlans), but in the first mbuf and not
>>>   adding one to the chain.
>>> - m_defrag() copies this to a chain of 32 mbuf clusters (because
>>>   the total data length is <= 64K) and it gets sent
>>>
>>> However, if the data length is a little less than 64K when passed
>>> to tcp_output() so that the length including headers is in the
>>> range 65519-65535...
>>> - tcp_output() doesn't reduce its size.
>>> - the net driver adds an ethernet header, making the total data
>>>   length slightly greater than 64K
>>> - m_defrag() copies it to a chain of 33 mbuf clusters, which fails
>>>   with EFBIG
>>> --> trainwrecks NFS performance, because the TSO segment is dropped
>>>     instead of sent.
>>>
>>> A tester also stated that the problem could be reproduced using
>>> iSCSI. Maybe Edward Napierala might know some details w.r.t. what
>>> kind of mbuf chain iSCSI generates?
>>>
>>> Also, one tester has reported that setting if_hw_tsomax in the
>>> driver before the ether_ifattach() call didn't make the value of
>>> tp->t_tsomax smaller. However, reducing IP_MAXPACKET (which is what
>>> it is set to by default) did reduce it. I have no idea why this
>>> happens or how to fix it, but it implies that setting if_hw_tsomax
>>> in the driver isn't a solution until this is resolved.
>>>
>>> So, what to do about this?
>>> First, I'd like a simple fix/workaround that can go into 9.3.
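The failure window described above can be checked with a little arithmetic. The following is only a sketch of the thread's description, not code from any driver: it assumes 2048-byte mbuf clusters, a 32-segment transmit limit, and that m_defrag() packs the frame into full clusters. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values taken from the discussion, not read from kernel headers. */
#define SKETCH_MCLBYTES    2048u	/* bytes per mbuf cluster */
#define SKETCH_MAX_TX_SEGS 32u		/* driver transmit segment limit */

/* Clusters needed to hold len bytes when every cluster is filled. */
static size_t
sketch_clusters_needed(size_t len)
{
	return (len + SKETCH_MCLBYTES - 1) / SKETCH_MCLBYTES;
}

/*
 * Returns 1 if a TSO frame of tcpip_len bytes still fits in the segment
 * limit once the driver has prepended ether_hdr_len bytes, 0 if the
 * defragmented chain would be too long (the EFBIG case).
 */
static int
sketch_fits_after_defrag(size_t tcpip_len, size_t ether_hdr_len)
{
	size_t frame_len = tcpip_len + ether_hdr_len;

	assert(frame_len >= tcpip_len);	/* guard against wrap-around */
	return sketch_clusters_needed(frame_len) <= SKETCH_MAX_TX_SEGS;
}
```

Under these assumptions, with an 18-byte VLAN header the first length that no longer fits is 65519, which matches the lower bound of the range quoted in the thread; with a plain 14-byte header the boundary moves to 65523.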
Re: miibus0: mii_mediachg: can't handle non-zero PHY instance 31
On Sun, Mar 30, 2014 at 01:12:20PM -0700, chr...@ultimatedns.net wrote:
> Greetings,
> I'm not sure whether this best belonged on net@, or stable@ so I'm
> using both. :)
> I'm testing both releng_9, and MB, and I encountered a new message
> I don't usually see using the nfe(4) driver:
>
> miibus0: mii_mediachg: can't handle non-zero PHY instance 1
> ...
> miibus0: mii_mediachg: can't handle non-zero PHY instance 31
>
> Truncated for brevity (31 lines in total; 1-31).
>
> I don't know how to interpret this. An issue with my version of the
> driver, or the hardware itself?
> This occurred with both GENERIC, as well as my custom kernel.

Would you show me the dmesg output?

> # uname -a
> FreeBSD demon0 9.2-STABLE FreeBSD 9.2-STABLE #0 r263756: Wed Mar 26 11:28:10 PDT 2014 root@demon0:/usr/obj/usr/src/sys/DEMON0 amd64
>
> Thank you for all your time, and consideration.
>
> --Chris
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: RFC: How to fix the NFS/iSCSI vs TSO problem
On Tue, Mar 25, 2014 at 07:10:35PM -0400, Rick Macklem wrote:
> Hi,
>
> First off, I hope you don't mind that I cross-posted this, but I
> wanted to make sure both the NFS/iSCSI and networking types see it.
> If you look in this mailing list thread:
> http://docs.FreeBSD.org/cgi/mid.cgi?1850411724.1687820.1395621539316.JavaMail.root
> you'll see that several people have been working hard at testing and
> thanks to them, I think I now know what is going on.

Thanks for your hard work on narrowing down that issue. I'm too busy
for $work in these days so I couldn't find time to investigate the
issue.

> (This applies to network drivers that support TSO and are limited to
> 32 transmit segments->32 mbufs in chain.) Doing a quick search I
> found the following drivers that appear to be affected (I may have
> missed some):
> jme, fxp, age, sge, msk, alc, ale, ixgbe/ix, nfe, e1000/em, re

The magic number 32 was chosen a long time ago when I implemented TSO
in non-Intel drivers. I tried to find an optimal number to reduce
kernel stack usage at that time. bus_dma(9) will coalesce with the
previous segment if possible so I thought the number 32 was not an
issue. Not sure current bus_dma(9) also has the same code though.
The number 32 is an arbitrary one so you can increase it if you want.

> Further, of these drivers, the following use m_collapse() and not
> m_defrag() to try and reduce the # of mbufs in the chain.
> m_collapse() is not going to get the 35 mbufs down to 32 mbufs, as
> far as I can see, so these ones are more badly broken:
> jme, fxp, age, sge, alc, ale, nfe, re

I guess m_defeg(9) is more optimized for non-TSO packets. You don't
want to waste CPU cycles to copy the full frame to reduce the number
of mbufs in the chain. For TSO packets, m_defrag(9) looks better but
if we always have to copy a full TSO packet to make TSO work, driver
writers have to invent a better scheme rather than blindly relying on
m_defrag(9), I guess.

> The long description is in the above thread, but the short version
> is:
> - NFS generates a chain with 35 mbufs in it for (read/readdir
>   replies and write requests) made up of (tcpip header, RPC header,
>   NFS args, 32 clusters of file data)
> - tcp_output() usually trims the data size down to tp->t_tsomax
>   (65535) and then some more to make it an exact multiple of TCP
>   transmit data size.
> - the net driver prepends an ethernet header, growing the length by
>   14 (or sometimes 18 for vlans), but in the first mbuf and not
>   adding one to the chain.
> - m_defrag() copies this to a chain of 32 mbuf clusters (because the
>   total data length is <= 64K) and it gets sent
>
> However, if the data length is a little less than 64K when passed to
> tcp_output() so that the length including headers is in the range
> 65519-65535...
> - tcp_output() doesn't reduce its size.
> - the net driver adds an ethernet header, making the total data
>   length slightly greater than 64K
> - m_defrag() copies it to a chain of 33 mbuf clusters, which fails
>   with EFBIG
> --> trainwrecks NFS performance, because the TSO segment is dropped
>     instead of sent.
>
> A tester also stated that the problem could be reproduced using
> iSCSI. Maybe Edward Napierala might know some details w.r.t. what
> kind of mbuf chain iSCSI generates?
>
> Also, one tester has reported that setting if_hw_tsomax in the driver
> before the ether_ifattach() call didn't make the value of
> tp->t_tsomax smaller. However, reducing IP_MAXPACKET (which is what
> it is set to by default) did reduce it. I have no idea why this
> happens or how to fix it, but it implies that setting if_hw_tsomax in
> the driver isn't a solution until this is resolved.
>
> So, what to do about this?
> First, I'd like a simple fix/workaround that can go into 9.3. (which
> is code freeze in May). The best thing I can think of is setting
> if_hw_tsomax to a smaller default value. (Line# 658 of sys/net/if.c
> in head.)
>
> Version A:
> replace
>   ifp->if_hw_tsomax = IP_MAXPACKET;
> with
>   ifp->if_hw_tsomax = min(32 * MCLBYTES - (ETHER_HDR_LEN + ETHER_VLAN_ENCAP_LEN),
>       IP_MAXPACKET);
> plus
> replace m_collapse() with m_defrag() in the drivers listed above.
>
> This would only reduce the default from 65535->65518, so it only
> impacts the uncommon case where the output size (with tcpip header)
> is within this range. (As such, I don't think it would have a
> negative impact for drivers that handle more than 32 transmit
> segments.)
> From the testers, it seems that this is sufficient to get rid of the
> EFBIG errors. (The total data length including ethernet header
> doesn't exceed 64K, so m_defrag() fits it into 32 mbuf clusters.)
>
> The main downside of this is that there will be a lot of m_defrag()
> calls being done and they do quite a bit of bcopy()'ng.
>
> Version B:
> replace
>   ifp->if_hw_tsomax = IP_MAXPACKET;
> with
>   ifp->if_hw_tsomax = min(29 * MCLBYTES, IP_MAXPACKET);
>
> This one would avoid the m_defrag() calls, but might have a negative
> impact on TSO performance for
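For reference, the two proposed defaults work out as follows. This is a sketch only: the constants are hard-coded here rather than pulled from kernel headers (2048 for MCLBYTES is an assumption about the common configuration), and the sk_* names are hypothetical.

```c
#include <assert.h>

/* Stand-ins for the kernel constants named in the proposal. */
#define SK_MCLBYTES             2048u	/* assumed cluster size */
#define SK_IP_MAXPACKET         65535u
#define SK_ETHER_HDR_LEN        14u
#define SK_ETHER_VLAN_ENCAP_LEN 4u

static unsigned
sk_min(unsigned a, unsigned b)
{
	return a < b ? a : b;
}

/* Version A: leave room for an Ethernet + VLAN header within 32 clusters. */
static unsigned
sk_tsomax_version_a(void)
{
	return sk_min(32u * SK_MCLBYTES -
	    (SK_ETHER_HDR_LEN + SK_ETHER_VLAN_ENCAP_LEN), SK_IP_MAXPACKET);
}

/* Version B: cap the payload at 29 clusters. */
static unsigned
sk_tsomax_version_b(void)
{
	return sk_min(29u * SK_MCLBYTES, SK_IP_MAXPACKET);
}
```

Version A evaluates to 65518, the figure quoted in the message, and Version B to 59392. The choice of 29 presumably leaves room for the three header mbufs (tcpip header, RPC header, NFS args) within a 32-segment limit, but that reading is an inference from the chain layout described earlier, not something the message states.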
Re: Problem with Lenovo SL500
On Mon, Oct 07, 2013 at 03:14:58PM -0700, Kurt Buff wrote:
> All,
>
> This machine has for its wired port a RealTek unit:
>
> re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet> port 0xe800-0xe8ff mem 0xfcfff000-0xfcff,0xfcfe-0xfcfe irq 19 at device 0.0 on pci12
> re0: Using 1 MSI-X message
> re0: ASPM disabled
> re0: Chip rev. 0x3c00
> re0: MAC rev. 0x0040
> miibus0: <MII bus> on re0
> rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
> re0: Ethernet address: 00:26:18:45:77:51
>
> I've got wireless working for iwn (thanks Adrian!), and I'm trying
> to use the wired NIC (re0) as an unnumbered port to monitor a mirror
> port on an HP switch.
>
> However, when I connect it, it shows up as only 10mbit, half-duplex
> on the switch, and it refuses to send packets.
>
> I've tried 'ifconfig re0 media 1000baseT -mediaopt full-duplex' with
> no particular luck, as it shows the following
>
> re0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>         ether 00:26:18:45:77:51
>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>         media: Ethernet 1000baseT (10baseT/UTP <half-duplex>)
>         status: active
>
> I get no output from 'tcpdump -npi re0'. I get link light, and it
> worked great on re0 when I did the install for FreeBSD, but no joy
> for capturing packets from the switch.

It seems you didn't UP the interface. And you may have to put the
interface into promiscuous mode to capture all packets (i.e. remove
the -p option).

> Anyone have some thoughts to share on this?
>
> Yes, I tried using a new cable, too...
>
> Kurt
Re: re0 not working at boot on -CURRENT
On Wed, Sep 11, 2013 at 01:31:49AM +0200, Guido Falsi wrote:
> On 09/10/13 04:15, Yonghyeon PYUN wrote:
>> On Fri, Sep 06, 2013 at 10:42:56PM +0200, Guido Falsi wrote:
>>> On 09/06/13 08:15, Yonghyeon PYUN wrote:
>>>> On Wed, Jul 10, 2013 at 07:47:01PM +0200, Guido Falsi wrote:
>>>>> On 07/10/13 09:04, Yonghyeon PYUN wrote:
>>>>>> On Tue, Jul 09, 2013 at 10:28:29PM +0200, Guido Falsi wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have a PC with an integrated re ethernet interface, pciconf
>>>>>>> identifies it like this:
>>>>>>>
>>>>>>> re0@pci0:3:0:0: class=0x02 card=0x11c01734 chip=0x816810ec rev=0x07 hdr=0x00
>>>>>>>
>>>>>>> I'm running FreeBSD current r252261.
>>>>>>>
>>>>>>> As stated in the subject after boot the interface does not
>>>>>>> work correctly.
>>>>>>>
>>>>>>> Using tcpdump on another host I noticed that packets (ICMP
>>>>>>> echo requests for example) do get sent, and replies generated
>>>>>>> by the other host, but the kernel does not seem to see them.
>>>>>>> Except that every now and then some packet does get to the
>>>>>>> system.
>>>>>>>
>>>>>>> I'm seeing packet 7, 27, 47, 66, 86, 106, 125, 144, 164, 183
>>>>>>> and so on from a ping which has been running for some time.
>>>>>>> Just about one every twenty. Some pattern is showing up.
>>>>>>>
>>>>>>> this is the output of ifconfig re0 after boot:
>>>>>>>
>>>>>>> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>>>>>>>         ether 00:19:99:f8:d3:0b
>>>>>>>         inet 172.24.42.13 netmask 0xff00 broadcast 172.24.42.255
>>>>>>>         inet6 fe80::219:99ff:fef8:d30b%re0 prefixlen 64 scopeid 0x2
>>>>>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>>>>>>         media: Ethernet autoselect (100baseTX <full-duplex>)
>>>>>>>         status: active
>>>>>>>
>>>>>>> If I just touch any interface flag with ifconfig, anyone, tso,
>>>>>>> -txcsum -rxcsum, it starts working flawlessly. It keeps
>>>>>>> working also if I perform the opposite operation with ifconfig
>>>>>>> afterwards, so it is not the flag itself fixing it.
>>>>>>>
>>>>>>> This is an ifconfig after performing this exercise (it's the
>>>>>>> same, since I disabled txcsum and reactivated it in this
>>>>>>> instance):
>>>>>>>
>>>>>>> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>>>>>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>>>>>>>         ether 00:19:99:f8:d3:0b
>>>>>>>         inet 172.24.42.13 netmask 0xff00 broadcast 172.24.42.255
>>>>>>>         inet6 fe80::219:99ff:fef8:d30b%re0 prefixlen 64 scopeid 0x2
>>>>>>>         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>>>>>>>         media: Ethernet autoselect (100baseTX <full-duplex>)
>>>>>>>         status: active
>>>>>>>
>>>>>>> I don't know much about FreeBSD network drivers so I can't
>>>>>>> make theories about this.
>>>>>>>
>>>>>>> I hope someone has an idea what the problem could be. I'm
>>>>>>> available for any further information needed, test,
>>>>>>> experiment and so on.
>>>>>>
>>>>>> Could you show me dmesg output (re(4) and rgephy(4) only)?
>>>>>
>>>>> re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet> port 0xd000-0xd0ff mem 0xf2104000-0xf2104fff,0xf210-0xf2103fff irq 17 at device 0.0 on pci3
>>>>> re0: Using 1 MSI-X message
>>>>> re0: turning off MSI enable bit.
>>>>> re0: Chip rev. 0x2c80
>>>>> re0: MAC rev. 0x
>>>>> re0: Ethernet address: 00:19:99:f8:d3:0b
>>>>> miibus0: <MII bus> on re0
>>>>> rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
>>>>> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
>>>>>
>>>>> Also, I'm loading this as a module, but, for as much as I know,
>>>>> this should not make any difference.
>>>>>
>>>>>> Did it ever work or you see the issue only on CURRENT?
>>>>>
>>>>> Never worked on this machine (I own it since the last days of
>>>>> February). I only installed current on it. If needed I can find
>>>>> time to test a recent 9.x snapshot on it.
>>>>>
>>>>> I worked around the problem till now using an USB ethernet
>>>>> adapter, always wanted to report this problem, but I've been
>>>>> lazy :)
>>>>
>>>> Would you try attached patch and let me know whether it makes
>>>> any difference?
>>>
>>> Hi!
>>>
>>> Thanks for looking into this and sorry for the delay in reporting
>>> back.
>>>
>>> Unluckily the patch does not solve nor mitigates the problem.
>>> Symptoms are very similar.
>>>
>>> [...]
>>>
>>> Only real difference is the re_eri_read timeout. It did not
>>> output that error message before.
>>
>> Oops, sorry. It seems there is logic error in the diff.
>> Try attached one again.
>
> Hi,
>
> This patch shows the same behavior as the unpatched kernel:
>
> [...]
>
> I'd like to note that if I perform a tcpdump from the other machine
> (which is also the dns server) I do see the packets getting out as
> usual from this machine, and replies being sent.
>
> So the problem seems to be to receive packets, while sending them
> works fine.

Hmm, I thought the diff may reset internal RX filter but it seems it
has no effect. If I find a clue I'll let you know.
BTW, I guess the diff have showed IC revision of MAC. Could you show
me the output of driver?

Thanks for testing
Re: re0 not working at boot on -CURRENT
On Fri, Sep 06, 2013 at 10:42:56PM +0200, Guido Falsi wrote:
> On 09/06/13 08:15, Yonghyeon PYUN wrote:
>> [...]
>> Would you try attached patch and let me know whether it makes any
>> difference?
>
> Hi!
>
> Thanks for looking into this and sorry for the delay in reporting
> back.
>
> Unluckily the patch does not solve nor mitigates the problem.
> Symptoms are very similar.
>
> [...]
>
> Only real difference is the re_eri_read timeout. It did not output
> that error message before.

Oops, sorry. It seems there is logic error in the diff.
Try attached one again.

Index: sys/dev/re/if_re.c
===================================================================
--- sys/dev/re/if_re.c (revision 255410)
+++ sys/dev/re/if_re.c (working copy)
@@ -289,6 +289,9 @@
 static int re_miibus_readreg(device_t, int, int);
 static int re_miibus_writereg (device_t, int, int, int);
 static void re_miibus_statchg (device_t);
+static uint32_t re_eri_read(struct rl_softc *, bus_size_t, int);
+static void re_eri_write (struct rl_softc *, bus_size_t, uint32_t, int);
+
 static void re_set_jumbo (struct rl_softc *, int);
 static void re_set_rxmode (struct rl_softc *);
 static void re_reset (struct rl_softc *);
@@ -602,6 +605,7 @@ re_miibus_statchg(device_t dev
Re: re0 not working at boot on -CURRENT
On Wed, Jul 10, 2013 at 07:47:01PM +0200, Guido Falsi wrote:
> [...]

Would you try attached patch and let me know whether it makes any
difference?

Index: sys/dev/re/if_re.c
===================================================================
--- sys/dev/re/if_re.c (revision 255289)
+++ sys/dev/re/if_re.c (working copy)
@@ -289,6 +289,9 @@
 static int re_miibus_readreg (device_t, int, int);
 static int re_miibus_writereg (device_t, int, int, int);
 static void re_miibus_statchg (device_t);
+static uint32_t re_eri_read (struct rl_softc *, bus_size_t, int);
+static void re_eri_write (struct rl_softc *, bus_size_t, uint32_t, int);
+
 static void re_set_jumbo (struct rl_softc *, int);
 static void re_set_rxmode (struct rl_softc *);
 static void re_reset (struct rl_softc *);
@@ -602,6 +605,7 @@ re_miibus_statchg(device_t dev)
  struct rl_softc *sc;
  struct ifnet *ifp;
  struct mii_data *mii;
+ uint32_t exgmac;
 
  sc = device_get_softc(dev);
  mii = device_get_softc(sc->rl_miibus);
@@ -627,14 +631,108 @@ re_miibus_statchg(device_t dev)
    break;
   }
  }
+
+ if ((sc->rl_flags & RL_FLAG_LINK) == 0)
+  return;
+
  /*
   * RealTek controllers does not provide any interface to
   * Tx/Rx MACs for resolved speed, duplex and flow-control
   * parameters.
   */
+
+ switch (sc->rl_hwrev->rl_rev) {
+ case RL_HWREV_8168E_VL:
+  if (sc->rl_icrev
Re: bce(4) panics, 9.2rc1 [redux]
On Wed, Jul 31, 2013 at 03:54:06PM +0900, Hiroki Sato wrote:
> [Added yongari@ and davidch@ to the To:/Cc: list]
>
> I confirmed that my issue reported on -current@ is due to the bxe(4)
> driver (BCM57711). If it is disabled, shutdown works fine without
> NMI.
>
> Also, I received several reports about the same box that NMI
> occurred even on bge(4) (BCM5717) driver when probing during
> power-cycle test. The probability was about once per 30
> power-cycles. Once it occurred, an AC on/off cycle was required
> (resetting a system reproduced the NMI in the same timing).

Hmm, Hiroki, could you add bge_reset()/bge_chipinit() after
bge_stop() in bge_shutdown() and let me know whether that change
makes any difference?

> Sean Bruno <sean_br...@yahoo.com> wrote
>   in <1375208841.1496.3.camel@localhost>:
>
> se> http://svnweb.freebsd.org/base?view=revision&revision=236216
> se>
> se> Ok, confirmed after ~50 reboots.
> se>
> se> There is a timing problem in this revision that I don't fully
> se> understand. Adding printf's inside bce_reset() will cause the
> se> existing code to succeed, and sometimes the existing code in this
> se> revision will work (about 10% of the time).
> se>
> se> In the failure mode, the network interface, bce0, will not come
> se> up into service *without* a network restart, after which it
> se> works fine.
> se>
> se> I suspect that we are missing a DELAY or UDELAY somewhere in the
> se> restoral of the emac_status settings that needs to be
> se> implemented.
> se>
> se> Sean
> se>
> se> p.s. sorry for the late report as the commit is well over a year
> se> old.
Re: kern/180382: [ae] kernel: ae0: watchdog timeout - resetting.
On Tue, Jul 16, 2013 at 11:35:50PM +0200, claudiu vasadi wrote:
> Hi again,
>
> UPDATE: all is well. 24h have long passed and the server is running
> fine with the patch.

Thanks a lot for testing. Fixed in r253404.

> Will it be merged to 9-STABLE?

Not sure but I'll request re approval after settlement.

> On Mon, Jul 15, 2013 at 3:06 PM, claudiu vasadi <claudiu.vas...@gmail.com> wrote:
>
>> Hi,
>>
>> Quick update: works now. Will let you know if it continues like
>> this or if we hit another problem.
>>
>> Huge thanks for the patch.
>>
>> On Mon, Jul 15, 2013 at 2:43 AM, Yonghyeon PYUN <pyu...@gmail.com> wrote:
>>
>>> On Sun, Jul 14, 2013 at 04:20:30PM +0200, claudiu vasadi wrote:
>>>> Hi,
>>>>
>>>> The patch applied without any problems and the "Size mismatch"
>>>> messages are gone now. However, we cannot get an IP via dhclient
>>>> and we also don't have any network connectivity if we set a
>>>> manual IP. The only messages we see now (with the manual IP) is
>>>> "no buffer space available". I checked the IP, netmask, gateway
>>>> and mbufs but they all seem to be ok. "ifconfig ae0 down;
>>>> ifconfig ae0 up" does not fix the problem. We also tried a reboot
>>>> but again, it did not fix the problem. We have no ping or
>>>> anything. No other error messages were observed.
>>>>
>>>> PS: Sorry for the late reply but the machine is on a remote site.
>>>> PS2: I can arrange a ssh account if this would help you.
>>>
>>> Sorry, I couldn't test the patch due to lack of access to the L2
>>> hardware. Actually there was a typo such that driver thought it
>>> had no link at all. I've updated diff (URL is the same) so please
>>> try again.
>>
>> --
>> Best regards,
>> Claudiu Vasadi
>
> --
> Best regards,
> Claudiu Vasadi
Re: sis(4) flow control
On Sun, Jul 14, 2013 at 09:52:38PM +0200, Andreas Longwitz wrote:
> Yonghyeon PYUN wrote:
>>> Maybe there is a bug in vr(4) that generates the hang, but why is
>>
>> Probably yes and I shall have to narrow down the issue.
>
> One more hint: No hang - but of course no TX support - anymore, when
> I use
>
> --- if_vr.c.orig 2013-06-25 09:58:29.0 +0200
> +++ if_vr.c 2013-07-14 18:09:12.0 +0200
> @@ -351,7 +351,6 @@
>     fc |= VR_FLOWCR1_RXPAUSE;
>    if ((IFM_OPTIONS(mii->mii_media_active) &
>        IFM_ETH_TXPAUSE) != 0) {
> -   fc |= VR_FLOWCR1_TXPAUSE;
>     sc->vr_flags |= VR_F_TXPAUSE;
>    }
>    CSR_WRITE_1(sc, VR_FLOWCR1, fc);
>
> Probably the RX pause frames will work with this patch.

Yes but it also disables generating TX pause frames. The controller
is not smart enough to know how many RX buffers are available so the
driver has to explicitly tell it the amount of free RX buffers. It
seems the logic has a bug.

>>> negotiation of flowcontrol on vr(4) not done at boot time as shown
>>> for msk(4) ?
>>
>> msk(4) supported flow-control from day 1 with a hack and it was
>> re-implemented later with proper way such that it always announces
>> flow-control. However for other drivers (i.e. vr(4)) that didn't
>> support the feature in the beginning, you have to explicitly
>> enable the feature. The decision was made to provide compatibility
>> and to not introduce a POLA violation.
>
> Thanks for clarification, I see the flag MIIF_FORCEPAUSE does the
> job.
>
>>> If you need more information about the hang let me know.
>>
>> I guess it would be good idea to use a link partner that shows
>> hardware MAC statistics. If your switch provides such information
>> that's fine. If you use direct connection between two hosts
>> without switch, use other network drivers (most gigabit
>> controllers support hardware MAC counters).
>
> I can easy realize to use my laptop with msk(4) as a link partner
> for my soekris box with vr(4).
>
> --
> Dr. Andreas Longwitz
Re: sis(4) flow control
On Sat, Jul 13, 2013 at 10:30:40PM +0200, Andreas Longwitz wrote:
> Yonghyeon PYUN wrote:
>>>> Try attached patch and let me know how it works.
>>>
>>> Thanks for your patch. I will test it on next update of my soekris
>>> boxes with sis interfaces. Because they are all remote far away
>>> this will need some time.
>>
>> Ok. Make sure to check established link before testing
>> flow-control. 'ifconfig sis0' will show current media and you
>> should have something like the following.
>> ...
>> media: Ethernet autoselect <flowcontrol> (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
>>
>> If you don't see 'rxpause', re-negotiate flow-control with
>> 'ifconfig sis0 mediaopt flow'.
>
> Because sis(4) (soekris 4801) is not available for me at the moment,
> I tried with vr(4) (soekris 5501). In production I run both types of
> boxes with FreeBSD 6 and a simple SETBIT patch to honor RX pause
> frames. Now I want to go with FreeBSD 8 Stable and eliminate my
> patch.
>
> ifconfig vr0 gives
>   media: Ethernet autoselect (100baseTX <full-duplex>),
> therefore I tried (using serial console)
>   ifconfig vr0 flow
> and now ifconfig vr0 as expected gives
>   media: Ethernet autoselect <flowcontrol> (100baseTX <full-duplex,flowcontrol,rxpause,txpause>),
> but the interface vr0 hangs. Outgoing packets are ok, but all
> incoming packets are blocked.
>
> In this situation I can give
>   ifconfig vr0 -mediaopt flowcontrol
> and see after ifconfig vr0
>   media: Ethernet autoselect (none)
>   status: no carrier
> and one second later
>   media: Ethernet autoselect (100baseTX <full-duplex>)
>   status: active
> and interface works correctly again.

Hmm, I recall flow control worked on VT6105 when I initially added
the feature but it seems there is an issue on that. vr(4) needs
driver assistance to generate TX pause frames so I guess vr(4) may
not be a good link partner to verify flow-control. vr(4) controllers
also do not have hardware MAC counters so it would be hard to know
how many pause frames were processed in the controller.

> From console:
> vr0: <VIA VT6105M Rhine III 10/100BaseTX> port 0xe100-0xe1ff mem 0xa0004000-0xa00040ff irq 11 at device 6.0 on pci0
> vr0: Quirks: 0x2
> vr0: Revision: 0x96
> miibus0: <MII bus> on vr0
> ukphy0: <Generic IEEE 802.3u media interface> PHY 1 on miibus0
> ukphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> vr0: Ethernet address: 00:00:24:cb:1e:34
> vr0: [ITHREAD]
>
> My switch is D-Link DGS-1008D green Ethernet (has support for IEEE
> 802.3x Flow-Control). On the same switch I have connected two other
> machines using msk driver (88E8050 and 88E8055) and ifconfig msk0
> gives always
>   media: Ethernet autoselect (1000baseT <full-duplex,flowcontrol,rxpause,txpause>)
>
> Maybe there is a bug in vr(4) that generates the hang, but why is

Probably yes and I shall have to narrow down the issue.

> negotiation of flowcontrol on vr(4) not done at boot time as shown
> for msk(4) ?

msk(4) supported flow-control from day 1 with a hack and it was
re-implemented later with proper way such that it always announces
flow-control. However for other drivers (i.e. vr(4)) that didn't
support the feature in the beginning, you have to explicitly enable
the feature. The decision was made to provide compatibility and to
not introduce a POLA violation.

> If you need more information about the hang let me know.

I guess it would be good idea to use a link partner that shows
hardware MAC statistics. If your switch provides such information
that's fine. If you use direct connection between two hosts without
switch, use other network drivers (most gigabit controllers support
hardware MAC counters).
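The 'rxpause' and 'txpause' flags that ifconfig reports are the outcome of pause resolution during auto-negotiation. As a rough illustration only, the sketch below applies the usual IEEE 802.3 resolution rule to the PAUSE and ASM_DIR bits advertised by each side. It is not taken from vr(4), sis(4) or mii(4); the SK_* constants and sk_resolve_pause() are hypothetical names.

```c
#include <assert.h>

/* Advertised ability bits (hypothetical encoding for this sketch). */
#define SK_ADV_PAUSE 0x1u	/* symmetric PAUSE */
#define SK_ADV_ASYM  0x2u	/* asymmetric direction (ASM_DIR) */

/* Resolved result for the local end. */
#define SK_FC_RX 0x1u		/* honor received pause frames */
#define SK_FC_TX 0x2u		/* may send pause frames */

/*
 * Resolve flow control for the local end of a full-duplex link from the
 * abilities advertised by the local side and by the link partner.
 */
static unsigned
sk_resolve_pause(unsigned local, unsigned partner)
{
	/* Both sides advertise symmetric PAUSE: pause in both directions. */
	if (local & partner & SK_ADV_PAUSE)
		return SK_FC_RX | SK_FC_TX;

	/* Otherwise both must advertise ASM_DIR for a one-way result. */
	if (local & partner & SK_ADV_ASYM) {
		if (local & SK_ADV_PAUSE)
			return SK_FC_RX;
		if (partner & SK_ADV_PAUSE)
			return SK_FC_TX;
	}
	return 0;
}
```

A result of SK_FC_RX | SK_FC_TX corresponds to the "rxpause,txpause" media line shown in the message; a result of 0 corresponds to a link with no flow control negotiated.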
Re: kern/180382: [ae] kernel: ae0: watchdog timeout - resetting.
On Sun, Jul 14, 2013 at 04:20:30PM +0200, claudiu vasadi wrote:
> Hi,
>
> The patch applied without any problems and the "Size mismatch" messages are gone now. However, we cannot get an IP via dhclient, and we also don't have any network connectivity if we set a manual IP. The only message we see now (with the manual IP) is "no buffer space available". I checked the IP, netmask, gateway and mbufs, but they all seem to be ok.
>
> "ifconfig ae0 down; ifconfig ae0 up" does not fix the problem. We also tried a reboot but, again, it did not fix the problem. We have no ping or anything. No other error messages were observed.
>
> PS: Sorry for the late reply, but the machine is on a remote site.
> PS2: I can arrange an ssh account if this would help you.

Sorry, I couldn't test the patch due to lack of access to the L2 hardware. Actually there was a typo such that the driver thought it had no link at all. I've updated the diff (the URL is the same), so please try again.
Re: re0 not working at boot on -CURRENT
On Tue, Jul 09, 2013 at 10:28:29PM +0200, Guido Falsi wrote:
> Hi,
>
> I have a PC with an integrated re ethernet interface; pciconf identifies it like this:
>
>   re0@pci0:3:0:0: class=0x02 card=0x11c01734 chip=0x816810ec rev=0x07 hdr=0x00
>
> I'm running FreeBSD current r252261.
>
> As stated in the subject, after boot the interface does not work correctly.
>
> Using tcpdump on another host I noticed that packets (ICMP echo requests, for example) do get sent, and replies are generated by the other host, but the kernel does not seem to see them. Except that every now and then some packet does get to the system. I'm seeing packets 7, 27, 47, 66, 86, 106, 125, 144, 164, 183 and so on from a ping which has been running for some time. Just about one in every twenty. Some pattern is showing up.
>
> This is the output of ifconfig re0 after boot:
>
>   re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>     ether 00:19:99:f8:d3:0b
>     inet 172.24.42.13 netmask 0xff00 broadcast 172.24.42.255
>     inet6 fe80::219:99ff:fef8:d30b%re0 prefixlen 64 scopeid 0x2
>     nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>     media: Ethernet autoselect (100baseTX full-duplex)
>     status: active
>
> If I just touch any interface flag with ifconfig (any of them: tso, -txcsum, -rxcsum), it starts working flawlessly. It keeps working even if I perform the opposite operation with ifconfig afterwards, so it is not the flag itself fixing it.
>
> This is an ifconfig after performing this exercise (it's the same, since I disabled txcsum and reactivated it in this instance):
>
>   re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>     ether 00:19:99:f8:d3:0b
>     inet 172.24.42.13 netmask 0xff00 broadcast 172.24.42.255
>     inet6 fe80::219:99ff:fef8:d30b%re0 prefixlen 64 scopeid 0x2
>     nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>     media: Ethernet autoselect (100baseTX full-duplex)
>     status: active
>
> I don't know much about FreeBSD network drivers, so I can't make theories about this. I hope someone has an idea what the problem could be.
>
> I'm available for any further information needed, tests, experiments and so on.

Could you show me the dmesg output (re(4) and rgephy(4) only)? Did it ever work, or do you see the issue only on CURRENT?
Re: sis(4) flow control
On Thu, Jul 11, 2013 at 12:18:19AM +0200, Andreas Longwitz wrote: Yonghyeon PYUN wrote: Hmm, does the change really make flow-control work? I believe flow-control should be negotiated with remote link partner so you have to announce flow-control capability to link partner. In addition, it seems DP83815/DP83816 does not support TX flow-control so it just honors RX pause frames. Excuse me, the comment in my patch was wrong. Better would be /* Enable reception of 802.3x pause frames. */. My soekris boxes are connected to a slow so called Ethernet Connect line (2 Mbit/s). The line works correct and stable if I respect incoming RX pause frames from the line. I do not need TX flow-control. Try attached patch and let me know how it works. Thanks for your patch. I will test it on next update of my soekris boxes with sis interfaces. Because they are all remote far away this will need some time. Ok. Make sure to check established link before testing flow-control. 'ifconfig sis0' will show current media and you should have something like the following. ... media: Ethernet autoselect flowcontrol (100baseTX full-duplex,flowcontrol,rxpause,txpause) If you don't see 'rxpause', re-negotiate flow-control with 'ifconfig sis0 mediaopt flow'. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: sis(4) flow control
On Tue, Jul 09, 2013 at 03:52:25PM +0200, Andreas Longwitz wrote:
> Some of my soekris boxes run with sis interfaces. Because I need ethernet flow control on these boxes, I have used the following patch (against 8-Stable) for some years:
>
> --- if_sis.c.orig 2013-05-15 20:01:16.0 +0200
> +++ if_sis.c 2013-06-24 15:58:05.0 +0200
> @@ -1965,6 +1965,18 @@
>          }
>  #endif
>
> +        if (sc->sis_type == SIS_TYPE_83815 && sc->sis_srr >= NS_SRR_16A) {
> +                if (ifp->if_flags & IFF_LINK0) {
> +                        /*
> +                         * Configure Ethernet flow control for outgoing frames.
> +                         * Enable reception of 802.3x multicast pause frames.
> +                         */
> +                        SIS_SETBIT(sc, NS_PCR, NS_PCR_PAUSE );
> +                } else {
> +                        SIS_CLRBIT(sc, NS_PCR, NS_PCR_PAUSE );
> +                }
> +        }
> +
>          mii = device_get_softc(sc->sis_miibus);
>
>          /* Set MAC address */
>
> Other network drivers (e.g. vr) have this functionality built in; it would be fine if sis learned flow control too.

Hmm, does the change really make flow-control work? I believe flow-control should be negotiated with the remote link partner, so you have to announce flow-control capability to the link partner. In addition, it seems DP83815/DP83816 does not support TX flow-control, so it just honors RX pause frames.

Try attached patch and let me know how it works.

Index: sys/dev/sis/if_sis.c
===================================================================
--- sys/dev/sis/if_sis.c (revision 253125)
+++ sys/dev/sis/if_sis.c (working copy)
@@ -619,10 +619,22 @@ sis_miibus_statchg(device_t dev)
                 SIS_SETBIT(sc, SIS_TX_CFG,
                     (SIS_TXCFG_IGN_HBEAT | SIS_TXCFG_IGN_CARR));
                 SIS_SETBIT(sc, SIS_RX_CFG, SIS_RXCFG_RX_TXPKTS);
+                if (sc->sis_type == SIS_TYPE_83815) {
+                        if ((IFM_OPTIONS(mii->mii_media_active) &
+                            IFM_ETH_RXPAUSE) != 0)
+                                SIS_SETBIT(sc, NS_PCR, NS_PCR_PAUSE_DA |
+                                    NS_PCR_PAUSE_MCAST | NS_PCR_PAUSE_ENABLE);
+                        else
+                                SIS_CLRBIT(sc, NS_PCR, NS_PCR_PAUSE_DA |
+                                    NS_PCR_PAUSE_MCAST | NS_PCR_PAUSE_ENABLE);
+                }
         } else {
                 SIS_CLRBIT(sc, SIS_TX_CFG,
                     (SIS_TXCFG_IGN_HBEAT | SIS_TXCFG_IGN_CARR));
                 SIS_CLRBIT(sc, SIS_RX_CFG, SIS_RXCFG_RX_TXPKTS);
+                if (sc->sis_type == SIS_TYPE_83815)
+                        SIS_CLRBIT(sc, NS_PCR, NS_PCR_PAUSE_DA |
+                            NS_PCR_PAUSE_MCAST | NS_PCR_PAUSE_ENABLE);
         }

         if (sc->sis_type == SIS_TYPE_83815 && sc->sis_srr >= NS_SRR_16A) {
@@ -1074,7 +1086,8 @@ sis_attach(device_t dev)
          * Do MII setup.
          */
         error = mii_attach(dev, &sc->sis_miibus, ifp, sis_ifmedia_upd,
-            sis_ifmedia_sts, BMSR_DEFCAPMASK, MII_PHY_ANY, MII_OFFSET_ANY, 0);
+            sis_ifmedia_sts, BMSR_DEFCAPMASK, MII_PHY_ANY, MII_OFFSET_ANY,
+            sc->sis_type == SIS_TYPE_83815 ? MIIF_DOPAUSE : 0);
         if (error != 0) {
                 device_printf(dev, "attaching PHYs failed\n");
                 goto fail;
Index: sys/dev/sis/if_sisreg.h
===================================================================
--- sys/dev/sis/if_sisreg.h (revision 253125)
+++ sys/dev/sis/if_sisreg.h (working copy)
@@ -78,6 +78,7 @@
 #define NS_IHR 0x1C
 #define NS_CLKRUN 0x3C
 #define NS_WCSR 0x40
+#define NS_PCR 0x44
 #define NS_SRR 0x58
 #define NS_BMCR 0x80
 #define NS_BMSR 0x84
@@ -123,6 +124,15 @@
 #define NS_WCSR_DET_PATTERN3 0x4000
 #define NS_WCSR_DET_MAGIC 0x8000

+#define NS_PCR_PAUSE_CNT_MASK 0x
+#define NS_PCR_MLD_ENABLE 0x0001
+#define NS_PCR_PAUSE_NEG 0x0020
+#define NS_PCR_PAUSE_RCVD 0x0040
+#define NS_PCR_PAUSE_ACT 0x0080
+#define NS_PCR_PAUSE_DA 0x2000
+#define NS_PCR_PAUSE_MCAST 0x4000
+#define NS_PCR_PAUSE_ENABLE 0x8000
+
 /* NS silicon revisions */
 #define NS_SRR_15C 0x302
 #define NS_SRR_15D 0x403
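For readers following the patch, here is a minimal stand-alone sketch of the set/clear decision made in the sis_miibus_statchg() hunk. It is not the driver code: the register is simulated by a plain integer, the bit positions and MEDIA_OPT_RXPAUSE are placeholders assumed for illustration, and full_duplex stands in for the enclosing condition, which the hunk does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, assumed for illustration only. */
#define PCR_PAUSE_DA     (UINT32_C(1) << 29)
#define PCR_PAUSE_MCAST  (UINT32_C(1) << 30)
#define PCR_PAUSE_ENABLE (UINT32_C(1) << 31)
#define PCR_PAUSE_ALL    (PCR_PAUSE_DA | PCR_PAUSE_MCAST | PCR_PAUSE_ENABLE)

/* Hypothetical media option flag standing in for IFM_ETH_RXPAUSE. */
#define MEDIA_OPT_RXPAUSE (UINT32_C(1) << 0)

/*
 * Return the new value of the simulated pause control register.
 * Pause frame reception is enabled only when the enclosing branch is
 * taken (assumed: full duplex) and the resolved media options include
 * RX pause; in every other case the pause bits are cleared.
 */
uint32_t
pcr_update(uint32_t pcr, bool full_duplex, uint32_t media_options)
{
	if (full_duplex && (media_options & MEDIA_OPT_RXPAUSE) != 0)
		return (pcr | PCR_PAUSE_ALL);
	return (pcr & ~PCR_PAUSE_ALL);
}
```

The point of keying on the resolved media options rather than on an interface flag such as IFF_LINK0 is that the bits are only set once the link partner has actually agreed to flow control.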
Re: misc/179033: [dc] dc ethernet driver seems to have issues with some multiport card and mother board combinations
On Mon, Jun 10, 2013 at 12:13:11PM -0700, Mr. Clif wrote: Hi John and Pyun, Ok got the new kernel installed and tested. Yes it works! :-) Maybe that Thanks, probably John can fix PCI-PCI bridge code. will also fix a simular problem with the sun cards (cas[03]), except I don't see a define like that in if_cas.c. Suggestions? Cassini does not support I/O port BARs so I guess you're seeing different issue. Would you start a new thread that explains cas(4) issues you're suffering from? Thanks, Clif John Baldwin wrote: On Thursday, May 30, 2013 1:12:14 am YongHyeon PYUN wrote: On Wed, May 29, 2013 at 08:58:10PM -0700, Mr. Clif wrote: Sorry for the confusion Pyun, I started looking at it in the context of pfsense, but they rejected my bug report which was understandable because it's an upstream issue. They suggested I resubmit it to you guys if I could reproduce it. So I booted FreeBSD and lo and behold the same two ports failed in exactly the same Ok, I'd like to fix that. Hmmm, the dc(4) driver is using the I/O port BARs rather than the memory BARs for its registers and this bug seems to be that the dc(4) device can't properly access its registers on dc0 and dc1 on the Atom box. The one thing I see is that the BIOS on the Atom box assigns addresses in the 0x1100-0x11ff range for dc0 and dc1. Those addresses conflict with ISA I/O aliases for the 0x100-0x1ff range. The Dell BIOS is more careful to avoid these ranges. I think the fix is that I need to fix the PCI-PCI bridge to reject these resource ranges if the ISA enable bit is set in the bridge's control register. 
However, for the time being you can change dc(4) to use the memory BAR instead of the I/O port BAR as a workaround:

Index: if_dc.c
===================================================================
--- if_dc.c (revision 251132)
+++ if_dc.c (working copy)
@@ -128,7 +128,7 @@ __FBSDID("$FreeBSD$");
 #include <dev/pci/pcireg.h>
 #include <dev/pci/pcivar.h>

-#define DC_USEIOSPACE
+//#define DC_USEIOSPACE

 #include <dev/dc/if_dcreg.h>

If this fixes it then I can take this PR as a test case for handling the ISA enable bit in the PCI-PCI bridge code.
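The address conflict John describes can be sketched in a few lines of C. The sketch assumes the usual explanation for ISA I/O aliases, namely that legacy ISA devices decode only the low 10 address bits, so any port whose bits 9:8 are non-zero lands on an ISA address in 0x100-0x3ff. The function names are made up for illustration; this is not the PCI-PCI bridge code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address seen by a device that decodes only the low 10 I/O address bits. */
uint16_t
isa_alias_of(uint16_t port)
{
	return (port & 0x03ff);
}

/*
 * True if any port in [start, end] aliases the ISA range 0x100-0x3ff,
 * i.e. has a non-zero value in address bits 9:8.
 */
bool
range_hits_isa_alias(uint16_t start, uint16_t end)
{
	uint32_t port;	/* wider than uint16_t so end == 0xffff terminates */

	for (port = start; port <= end; port++) {
		if ((port & 0x0300) != 0)
			return (true);
	}
	return (false);
}
```

With these definitions the range the Atom BIOS assigned, 0x1100-0x11ff, maps onto 0x100-0x1ff, which matches the conflict named in the message, while a range such as 0x1000-0x10ff would not alias.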
Re: misc/179033: [dc] dc ethernet driver seems to have issues with some multiport card and mother board combinations
On Mon, Jun 10, 2013 at 06:26:34PM -0700, Mr. Clif wrote:
> Is there any down side to using that dc fix in pfsense for now?

If dc(4) works as expected there is no reason not to use memory BARs. Generally, using memory BARs is more efficient. Many old PCI controllers used to have bugs with memory BARs, so the driver used the safer I/O port BARs.

> Yes, I would like to have time to submit the cas bug as well. Maybe in the next week, but probably by August I hope. ;-)

Ok.

> Thanks for your help,
> Clif
Re: misc/179033: [dc] dc ethernet driver seems to have issues with some multiport card and mother board combinations
On Wed, May 29, 2013 at 08:58:10PM -0700, Mr. Clif wrote:
> Sorry for the confusion Pyun,
>
> I started looking at it in the context of pfsense, but they rejected my bug report, which was understandable because it's an upstream issue. They suggested I resubmit it to you guys if I could reproduce it. So I booted FreeBSD and lo and behold the same two ports failed in exactly the same

Ok, I'd like to fix that.

> way. I didn't see the point in re-running all the tests because I was assuming that FreeBSD would work as well as pfsense for the ports that worked, and there were no further tests I could think of for the dead ports.

There are too many different dc(4) controllers out there, and each controller will require a different hack to make it work.

> This Atom board only has serial headers, not a DB9 on the back, so I have to look for the proper back panel adapter for that. Otherwise I should be able to set up that test environment. Though it might take me a

If you can't set up two systems, attach a USB Ethernet controller to the box and let me know the login information for the box. That wouldn't be enough configuration for remote debugging, but I can probably experiment with some basic things with your help.

> couple of days, sorry it's crunch time for me on a volunteer project. One which I would like to deploy routers like this on. :-)
>
> Thanks,
> Clif
>
> yong...@freebsd.org wrote:
> > Synopsis: [dc] dc ethernet driver seems to have issues with some multiport card and mother board combinations
> >
> > State-Changed-From-To: open->feedback
> > State-Changed-By: yongari
> > State-Changed-When: Thu May 30 01:11:55 UTC 2013
> > State-Changed-Why:
> > The information you gave looks confusing to me. If you're using pfSense on Atom D510MO and seeing the issue, I'm afraid I'm not able to help with that. pfSense may have some local changes and I'm not familiar with them.
> > Did you try stock FreeBSD 9.1-RELEASE on Atom D510MO?
> > If you still see the same issue with stock FreeBSD 9.1-RELEASE, could you set up the remote debugging environment mentioned in the following URL?
> > http://people.freebsd.org/~yongari/remote_debugging.txt
> >
> > Given that dc(4) works fine with Dell machines, I guess the issue may be in pci(4), which can't correctly handle a device that sits behind a PCI-PCI bridge.
> >
> > Note, Holland Consulting's document does not apply to FreeBSD. dc(4) can handle multiple instances of dc(4) and should be able to support dual/quad port dc(4) controllers.
> >
> > Responsible-Changed-From-To: freebsd-net->yongari
> > Responsible-Changed-By: yongari
> > Responsible-Changed-When: Thu May 30 01:11:55 UTC 2013
> > Responsible-Changed-Why:
> > Grab.
> >
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=179033
Re: bge(4) sysctl tuneables -- a blast from the past.
On Tue, Apr 16, 2013 at 05:14:54PM +1000, Bruce Evans wrote: On Tue, 16 Apr 2013, YongHyeon PYUN wrote: On Mon, Apr 15, 2013 at 03:35:56PM -0700, Sean Bruno wrote: FreeBSD has too many knobs, but it would be nice if the bge defaults weren't so broken, so that they don't need overriding. So many knobs ... well here's more. :-) http://people.freebsd.org/~sbruno/bge_config_update.txt At least this gets a man page update with references to manuals. You have to change BGE_STD_RX_RING_CNT to change number of RX descriptors. It's hard-coded and it needs much more work to change that. And I don't see any reason to modify that though(Max # of RX descriptor is 512). I thought that at first too, but a simple change along these lines must be OK since old versions had it. There was a BGE_SSLOTS option that was 256. This was used instead of BGE_STD_RX_RING_CNT in much the same places that the tunable is now used, since 512 bds used to be a lot. From FreeBSD-~5.2: I'm afraid that will allocate more DMAable memory than specified with the tunable and the amount of data that bus_dmamap_sync(9) have to handle is the same as before. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: bge(4) sysctl tuneables -- a blast from the past.
On Mon, Apr 15, 2013 at 03:35:56PM -0700, Sean Bruno wrote:
> > FreeBSD has too many knobs, but it would be nice if the bge defaults weren't so broken, so that they don't need overriding.
> >
> > Bruce
>
> So many knobs ... well here's more. :-)
>
> http://people.freebsd.org/~sbruno/bge_config_update.txt
>
> At least this gets a man page update with references to manuals.

You have to change BGE_STD_RX_RING_CNT to change the number of RX descriptors. It's hard-coded and it needs much more work to change that. And I don't see any reason to modify it anyway (the maximum number of RX descriptors is 512).

I think bge(4) touches a minimal set of coalescing parameters, but the publicly available bge(4) data sheet shows more coalescing parameters. These parameters could be programmed with different values (BDs, ticks) during interrupt. And some parameters are not applicable to certain controllers. In addition, the allowed value range for certain parameters varies between controller models. So I think it's a good idea to mention the allowed value range for each parameter, as well as a warning about the possible loss of connectivity caused by a wrongly programmed value (e.g. no RX interrupt when bge_rx_coal_ticks == 0 and bge_rx_max_coal_bds == 0).

It's common to see multiple instances of bge(4) in a box, so I think it would be better to implement them as sysctl tunables rather than loader tunables (i.e. each controller may need different coalescing parameters). Except for the hw.bge.allow_asf tunable, all others were implemented to support multiple bge(4) instances. sysctl tunables also allow dynamic changes, so you don't have to reboot your box to change coalescing parameters.

> Sean
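The kind of range check suggested above could look roughly like the following sketch. It only illustrates the idea: struct rx_coal, rx_coal_valid() and the caller-supplied limits are hypothetical and do not come from bge(4) or from the proposed patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for two RX coalescing knobs. */
struct rx_coal {
	uint32_t ticks;		/* stands in for bge_rx_coal_ticks */
	uint32_t max_bds;	/* stands in for bge_rx_max_coal_bds */
};

/*
 * Reject the combination the message warns about: with both values 0
 * no RX interrupt would be generated. ticks_max and bds_max are
 * assumed per-controller limits supplied by the caller.
 */
bool
rx_coal_valid(const struct rx_coal *c, uint32_t ticks_max, uint32_t bds_max)
{
	if (c->ticks == 0 && c->max_bds == 0)
		return (false);
	if (c->ticks > ticks_max || c->max_bds > bds_max)
		return (false);
	return (true);
}
```

Passing the limits in as arguments reflects the remark that the allowed ranges differ between controller models, so a per-instance sysctl handler would be a natural place to run such a check before touching the hardware.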
Re: enable tcpdump GUESS_TSO flag?
On Thu, Apr 04, 2013 at 11:24:18AM +, Eggert, Lars wrote:
> Hi,
>
> I wonder whether it'd be a good idea to enable tcpdump's GUESS_TSO flag by default? It enables a heuristic that lets tcpdump understand pcaps that include segments generated by TCP TSO (which otherwise show up as "IP bad-len 0").

I don't have a strong opinion on enabling that flag, but I think it would be even better to have an option to enable/disable that feature (default off). em(4) controllers require that the IP length be 0 before the controller performs the TSO operation. fxp(4) controllers require that the IP length be set to the IP length of the first TCP segment after the TSO operation. bpf listeners see the modified packet, so it can confuse them. AFAIK, except for em(4)/fxp(4) controllers, no other controllers in tree have such a limitation with TSO. Enabling the GUESS_TSO flag may make it harder to debug network/driver issues, I guess.

> See the discussion at http://www.mail-archive.com/tcpdump-workers@lists.tcpdump.org/msg01051.html for details.
>
> Lars
>
> diff --git a/usr.sbin/tcpdump/tcpdump/Makefile b/usr.sbin/tcpdump/tcpdump/Makefile
> index ca8ec4c..5fd73a1 100644
> --- a/usr.sbin/tcpdump/tcpdump/Makefile
> +++ b/usr.sbin/tcpdump/tcpdump/Makefile
> @@ -45,6 +45,10 @@ CFLAGS+= -I${.CURDIR} -I${TCPDUMP_DISTDIR}
>  CFLAGS+= -DHAVE_CONFIG_H
>  CFLAGS+= -D_U_="__attribute__((unused))"
>
> +# Enable tcpdump heuristic to identify TSO-generated packets; see
> +# http://www.mail-archive.com/tcpdump-workers@lists.tcpdump.org/msg01051.html
> +CFLAGS+= -DGUESS_TSO
> +
>  .if ${MK_INET6_SUPPORT} != "no"
>  SRCS+= print-ip6.c print-ip6opts.c print-mobility.c print-ripng.c \
>          print-icmp6.c print-babel.c print-frag6.c print-rt6.c print-ospf6.c \
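To make the heuristic under discussion concrete, here is a rough sketch of the idea as described in this thread: if the IP total-length field is 0, guess that the packet was captured before TSO segmentation and use the captured length instead. The function name and parameters are invented for this example; this is not tcpdump's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the IP datagram length a dissector should use.
 * ip_len_field: total-length field from the IP header (host byte order).
 * avail: bytes available after the link-layer header in the capture.
 * If the field is 0, assume a TSO send and fall back to avail;
 * otherwise trust the header field.
 */
size_t
effective_ip_len(uint16_t ip_len_field, size_t avail)
{
	if (ip_len_field == 0)
		return (avail);
	return (ip_len_field);
}
```

A check keyed on a zero length would cover the em(4) behaviour described above but not the fxp(4) case, where the field holds a plausible non-zero value, which is one reason such a guess can get in the way when debugging a driver.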
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Mon, Mar 18, 2013 at 10:45:24AM +0600, Eugene M. Zheganin wrote:
> Hi.
>
> On 14.03.2013 13:29, YongHyeon PYUN wrote:
> > I thought you were using stable/8 but it seems you have slightly older stable/8. The bge(4) code difference between CURRENT and stable9/stable8 is very minor.
>
> Nah, I really am running recent 8/stable. My mistake was to try to apply the whole code of the if_bge.c to 8/stable.
>
> > I've attached diff against stable/8 which will address ASF/IPMI issue but I'm not sure whether it helps watchdog timeout or not. I have plan to MFC the change but I need time for settlement before the MFC.
>
> Thanks a lot. Now the 'watchdog timeout' behaviour has stopped and the driver reports nothing on the console, but the freezes, unfortunately, didn't stop. The machine freezes at random intervals, usually every 1-2 days; it partially stops answering on the network (network services running on this machine close the connection, and other weird stuff happens), it partially responds on the console (I can type but I cannot log in) and so

I have no idea how this change can freeze your box. It would be even better to know whether the issue was triggered by the bge(4) changes. I think you can use bge(4)/brgphy(4) of 8.3-RELEASE on your stable/8. Copy the required files from 8.3-RELEASE to stable/8 and rebuild your kernel. For instance:

- Copy /usr/src/sys/dev/bge/if_bge.c from 8.3-RELEASE to /usr/src/sys/dev/bge on stable/8
- Copy /usr/src/sys/dev/bge/if_bgereg.h from 8.3-RELEASE to /usr/src/sys/dev/bge on stable/8
- Copy /usr/src/sys/dev/mii/brgphy.c from 8.3-RELEASE to /usr/src/sys/dev/mii on stable/8
- Copy /usr/src/sys/dev/mii/brgphyreg.h from 8.3-RELEASE to /usr/src/sys/dev/mii on stable/8

And rebuild your kernel.

> on. The asf feature has no influence on this; I tried to enable it, and this happens anyway. Should I try upgrading to 9 or 10? Do they have more improvements in bge(4)?

No, there are no functional bge(4) differences between stable/8 and stable/9.
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Thu, Mar 14, 2013 at 12:44:33PM +0600, Eugene M. Zheganin wrote: Hi. On 13.03.2013 07:57, YongHyeon PYUN wrote: If your controller supports ASF/IPMI access please apply r248226 to stable/8 and let me know whether that makes any difference. I believe ignoring ASF/IPMI firmware is not good idea since the ASF/IPMI firmware will run regardless of hw.bge.allow_asf loader tunable configuration. You may have to set hw.bge.allow_asf=1 since it's off by default on stable/8. I'm sorry, but obviously my C skills are way to low to make a local MFC of if_bge.c - it differs way too much from the 8.3-STABLE version, and it affects lots of other stuff (mostly in /usr/src/dev/mii and /usr/src/sys/sys). I tried to do this, but I failed completely. Could I thought you were using stable/8 but it seems you have slightly older stable/8. The bge(4) code difference between CURRENT and stable9/stable8 is very minor. you please do a patch for me or can this be really MFC'd ? I got a bunch of bge timeouts on a 8.3-STABLE machine we talked about earlier just this morning. I really think this may become a problem for other people too. Yup, there's another option - to upgrade to 9.x, but right now I have unresolved issues with 9.x. I've attached diff against stable/8 which will address ASF/IPMI issue but I'm not sure whether it helps watchdog timeout or not. I have plan to MFC the change but I need time for settlement before the MFC. 
Index: sys/dev/bge
===================================================================
--- sys/dev/bge (revision 248264)
+++ sys/dev/bge (working copy)

Property changes on: sys/dev/bge
___________________________________________________________________
Modified: svn:mergeinfo
   Merged /head/sys/dev/bge:r248226
Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c (revision 248264)
+++ sys/dev/bge/if_bge.c (working copy)
@@ -3637,15 +3637,15 @@ bge_attach(device_t dev)
         }

         bge_stop_fw(sc);
-        bge_sig_pre_reset(sc, BGE_RESET_START);
+        bge_sig_pre_reset(sc, BGE_RESET_SHUTDOWN);
         if (bge_reset(sc)) {
                 device_printf(sc->bge_dev, "chip reset failed\n");
                 error = ENXIO;
                 goto fail;
         }

-        bge_sig_legacy(sc, BGE_RESET_START);
-        bge_sig_post_reset(sc, BGE_RESET_START);
+        bge_sig_legacy(sc, BGE_RESET_SHUTDOWN);
+        bge_sig_post_reset(sc, BGE_RESET_SHUTDOWN);

         if (bge_chipinit(sc)) {
                 device_printf(sc->bge_dev, "chip initialization failed\n");
@@ -3998,6 +3998,20 @@ bge_reset(struct bge_softc *sc)
         } else
                 write_op = bge_writereg_ind;

+        if (sc->bge_asicrev != BGE_ASICREV_BCM5700 &&
+            sc->bge_asicrev != BGE_ASICREV_BCM5701) {
+                CSR_WRITE_4(sc, BGE_NVRAM_SWARB, BGE_NVRAMSWARB_SET1);
+                for (i = 0; i < 8000; i++) {
+                        if (CSR_READ_4(sc, BGE_NVRAM_SWARB) &
+                            BGE_NVRAMSWARB_GNT1)
+                                break;
+                        DELAY(20);
+                }
+                if (i == 8000) {
+                        if (bootverbose)
+                                device_printf(dev, "NVRAM lock timedout!\n");
+                }
+        }
         /* Take APE lock when performing reset. */
         bge_ape_lock(sc, BGE_APE_LOCK_GRC);

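The NVRAM arbitration hunk in bge_reset() is an instance of a bounded polling loop: request a lock, then re-read a register a fixed number of times until a grant bit appears. A stand-alone sketch of that pattern follows. The register is replaced by a callback so it can run in userland, GRANT_BIT is a placeholder value rather than BGE_NVRAMSWARB_GNT1, and fake_read() is a made-up test device.

```c
#include <stdbool.h>
#include <stdint.h>

#define GRANT_BIT UINT32_C(0x100)	/* placeholder, not BGE_NVRAMSWARB_GNT1 */

/* Hypothetical register read callback; ctx lets a caller supply a fake device. */
typedef uint32_t (*reg_read_fn)(void *ctx);

/*
 * Poll until the grant bit is set or max_polls reads have been made.
 * Returns true if the grant was seen. A real driver would delay
 * between reads (the patch uses DELAY(20)); that is omitted here.
 */
bool
wait_for_grant(reg_read_fn read_reg, void *ctx, int max_polls)
{
	int i;

	for (i = 0; i < max_polls; i++) {
		if ((read_reg(ctx) & GRANT_BIT) != 0)
			return (true);
	}
	return (false);
}

/* Fake device: returns 0 for the first *ctx reads, then grants. */
uint32_t
fake_read(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return (0);
	}
	return (GRANT_BIT);
}
```

With the numbers in the patch, 8000 polls at 20 microseconds each bound the wait at roughly 160 ms, and a timeout is only reported under bootverbose rather than treated as fatal.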
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Thu, Feb 28, 2013 at 12:37:03PM +0100, Miroslav Lachman wrote: YongHyeon PYUN wrote: On Wed, Feb 27, 2013 at 12:09:28PM +0100, Miroslav Lachman wrote: [...] I can provide you full access to this machine (if you want) or let me know, what version I should check. Older versions (6.x - 8.3) are working fine with hw.bge.allow_asf=1 in loader.conf. I didn't test newer releases on these old machines. The reporter said the machine was Sun Fire X2200 M2 so I guess you may see the same issue on both stable/9 and stable/8. Ideally the loader tunable hw.bge.allow_asf should not be there and driver should take care of it by checking the existence of ASF/IPMI firmware. Can you setup a remote debugging environments(+ IPMI access) like the following URL? http://people.freebsd.org/~yongari/remote_debugging.txt The one Sun Fire X2100 M2 is idling in datacenter and connected to internet, so I can remotely reinstall it to stable/9 withing day or two and give you full access to it (ssh user, root, BMC / IPMI admin account with remote KVM + remote media). But as I understand, you need another machine connected to it with serial and another ethernet. It will take me some more time, as I will need to go to the datacenter, find some serial cable etc. Let me know if ssh + ipmi access to X2100 alone is useful for you to start, or only full remote debugging setup is needed. Can you point me to the original problem report with X2200 M2? I had been working on fixing the IPMI regression with the help of Miroslav. It was fixed in r248226. Many thanks to Miroslav for providing full remote debugging environments. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Thu, Mar 07, 2013 at 05:15:48PM +0900, YongHyeon PYUN wrote: On Thu, Mar 07, 2013 at 01:14:03PM +0600, Eugene M. Zheganin wrote: Hi. On 07.03.2013 12:23, YongHyeon PYUN wrote: On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote: It was definitely older than months. It was running something similar to FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011, this is the uname from a neighbor machine. I have, as I said, identical servers running FreeBSD. Here are some of the unames that I don't see timeouts on: 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days) 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous uptime around 180 days) These servers do not have 5718/5719/5720 changes. 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days) This server has the bge(4) change but it didn't trigger watchdog timeouts. Does this server use the same controller? If yes, the issue didn't come from bge(4) change. How's that ? It's running even older version than previous two. I guess you misread the year. Oops, you're right. If your controller supports ASF/IPMI access please apply r248226 to stable/8 and let me know whether that makes any difference. I believe ignoring ASF/IPMI firmware is not good idea since the ASF/IPMI firmware will run regardless of hw.bge.allow_asf loader tunable configuration. You may have to set hw.bge.allow_asf=1 since it's off by default on stable/8. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: Limits on jumbo mbuf cluster allocation
On Fri, Mar 08, 2013 at 12:27:37AM -0800, Jack Vogel wrote: On Thu, Mar 7, 2013 at 11:54 PM, YongHyeon PYUN pyu...@gmail.com wrote: On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote: I have a machine (actually six of them) with an Intel dual-10G NIC on the motherboard. Two of them (so far) are connected to a network using jumbo frames, with an MTU a little under 9k, so the ixgbe driver allocates 32,000 9k clusters for its receive rings. I have noticed, on the machine that is an active NFS server, that it can get into a state where allocating more 9k clusters fails (as reflected in the mbuf failure counters) at a utilization far lower than the configured limits -- in fact, quite close to the number allocated by the driver for its rx ring. Eventually, network traffic grinds completely to a halt, and if one of the interfaces is administratively downed, it cannot be brought back up again. There's generally plenty of physical memory free (at least two or three GB). There are no console messages generated to indicate what is going on, and overall UMA usage doesn't look extreme. I'm guessing that this is a result of kernel memory fragmentation, although I'm a little bit unclear as to how this actually comes about. I am assuming that this hardware has only limited scatter-gather capability and can't receive a single packet into multiple buffers of a smaller size, which would reduce the requirement for two-and-a-quarter consecutive pages of KVA for each packet. In actual usage, most of our clients aren't on a jumbo network, so most of the time, all the packets will fit into a normal 2k cluster, and we've never observed this issue when the *server* is on a non-jumbo network. AFAIK all Intel controllers generate jumbo frame by concatenating multiple mbufs on RX side so there is no physically contiguous 9KB allocation. I vaguely guess there could be mbuf leakage when jumbo frame is enabled. 
I would check how driver handles mbuf shortage or frame errors while mbuf concatenation for jumbo frame is in progress. No, this is not true, if using a 9K jumbo it will actually use the larger mbuf pool, the code has been this way for a little while now. Ah, thanks for correcting me. If H/W is still able to support old style chaining like em(4), wouldn't it better to use that rather than allocating a 9KB buffer? Allocating a 9KB buffer to handle a pure TCP ACK segment looks inefficient. Jack Does anyone have suggestions for dealing with this issue? Will increasing the amount of KVA (to, say, twice physical memory) help things? It seems to me like a bug that these large packets don't have their own submap to ensure that allocation is always possible when sufficient physical pages are available. -GAWollman ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
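The buffer arithmetic behind this exchange can be written down as a small sketch. The sizes are assumptions chosen for illustration: 4096-byte pages, 2048-byte standard clusters and 9216-byte (9 * 1024) jumbo clusters; bufs_needed() is a made-up helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ     4096u	/* assumed page size */
#define CLUSTER_2K  2048u	/* assumed standard mbuf cluster size */
#define CLUSTER_9K  9216u	/* assumed 9k jumbo cluster size */

/* Number of buffers of size buf_sz needed to hold frame_len bytes. */
size_t
bufs_needed(size_t frame_len, size_t buf_sz)
{
	assert(buf_sz > 0);
	return ((frame_len + buf_sz - 1) / buf_sz);
}
```

Under these assumptions a 9216-byte cluster is 2.25 pages, so it spans three contiguous pages, which fits the "two-and-a-quarter consecutive pages" figure in the original report. A frame of about 9000 bytes would need five chained 2k clusters, while a 64-byte ACK received into a single 9k cluster still ties up the whole cluster.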
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Thu, Mar 07, 2013 at 01:14:03PM +0600, Eugene M. Zheganin wrote: Hi. On 07.03.2013 12:23, YongHyeon PYUN wrote: On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote: It was definitely older than months. It was running something similar to FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011, this is the uname from a neighbor machine. I have, as I said, identical servers running FreeBSD. Here are some of the unames that I don't see timeouts on: 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days) 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous uptime around 180 days) These servers do not have 5718/5719/5720 changes. 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days) This server has the bge(4) change but it didn't trigger watchdog timeouts. Does this server use the same controller? If yes, the issue didn't come from bge(4) change. How's that ? It's running even older version than previous two. I guess you misread the year. Oops, you're right. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: Limits on jumbo mbuf cluster allocation
On Fri, Mar 08, 2013 at 02:10:41AM -0500, Garrett Wollman wrote: I have a machine (actually six of them) with an Intel dual-10G NIC on the motherboard. Two of them (so far) are connected to a network using jumbo frames, with an MTU a little under 9k, so the ixgbe driver allocates 32,000 9k clusters for its receive rings. I have noticed, on the machine that is an active NFS server, that it can get into a state where allocating more 9k clusters fails (as reflected in the mbuf failure counters) at a utilization far lower than the configured limits -- in fact, quite close to the number allocated by the driver for its rx ring. Eventually, network traffic grinds completely to a halt, and if one of the interfaces is administratively downed, it cannot be brought back up again. There's generally plenty of physical memory free (at least two or three GB). There are no console messages generated to indicate what is going on, and overall UMA usage doesn't look extreme. I'm guessing that this is a result of kernel memory fragmentation, although I'm a little bit unclear as to how this actually comes about. I am assuming that this hardware has only limited scatter-gather capability and can't receive a single packet into multiple buffers of a smaller size, which would reduce the requirement for two-and-a-quarter consecutive pages of KVA for each packet. In actual usage, most of our clients aren't on a jumbo network, so most of the time, all the packets will fit into a normal 2k cluster, and we've never observed this issue when the *server* is on a non-jumbo network. AFAIK all Intel controllers generate jumbo frame by concatenating multiple mbufs on RX side so there is no physically contiguous 9KB allocation. I vaguely guess there could be mbuf leakage when jumbo frame is enabled. I would check how driver handles mbuf shortage or frame errors while mbuf concatenation for jumbo frame is in progress. Does anyone have suggestions for dealing with this issue? 
Will increasing the amount of KVA (to, say, twice physical memory) help things? It seems to me like a bug that these large packets don't have their own submap to ensure that allocation is always possible when sufficient physical pages are available. -GAWollman
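To make the "two-and-a-quarter consecutive pages" figure in this message concrete, here is a minimal C sketch of the arithmetic. It assumes 4096-byte pages and a 9 KB cluster of 9 * 1024 bytes; the macro and function names are hypothetical and are not taken from any driver or from the mbuf code.

```c
#include <stddef.h>

/* Assumed values for illustration: 4 KB pages, 9 KB jumbo cluster. */
#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_JUMBO9_SIZE (9u * 1024u)

/* Whole pages needed to back one buffer of `size` bytes (rounds up). */
static size_t
pages_spanned(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Buffer size in hundredths of a page, e.g. 225 means 2.25 pages. */
static size_t
centipages(size_t size)
{
	return (size * 100) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions a 9 KB cluster is 2.25 pages of data and therefore occupies three virtually contiguous pages, while a 2 KB cluster fits in one page. That is at least consistent with the fragmentation concern raised above, although it does not by itself confirm the diagnosis.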
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Wed, Mar 06, 2013 at 04:00:34PM +0600, Eugene M. Zheganin wrote:

Hi.

Hi.

On 06.03.2013 12:26, YongHyeon PYUN wrote:

If you were using the latest stable/8, the result would be the same on CURRENT. How frequently do you see the watchdog timeouts? Is there a way to reproduce it? Would you show me the output of dmesg (bge(4) and brgphy(4) only) and pciconf -lcbv?

I upgraded one of my routers 2 days ago to 8.3-STABLE, and got a freeze today. Uptime was less than a day. I have dozens of these IBM System x3250 machines, all of them running various 8.2-STABLE builds, which is why I worry so much.

What was the previous SVN revision number on that machine? The support for 5718/5719/5720 was merged to stable/8 about 3 months ago.

I don't know if this is triggered by some of my actions. These routers run gre/ipsec, different routing stuff (quagga, bird), proxies and pf. In 2011/early 2012 I saw similar watchdog issues on these machines, and I disabled TSO on them. I don't know whether this is a coincidence or whether it really helps, but after that I didn't see these watchdog issues until today.

I'm not aware of a TSO issue on your controller. pf(4) had a TSO issue but I guess it was fixed a long time ago.

I've also discovered that this particular server is running some old BIOS/firmware versions, and that it is missing some NetXtreme updates available from IBM. Would applying such updates resolve the situation?

Updating the Ethernet controller firmware is always a good idea, but I'm not sure whether it addresses this issue.

I am OK with the fact that I cannot run ipmi/sol on these machines, but it would be nice if this watchdog issue could somehow be resolved.

Actually this is the first report after the merge which seems to break bge(4).

Furthermore, I have some spare machines that I can provide full access to, including ipkvm stuff. Since the machine is only partially freezing, I cannot even rely on ichwd and watchdogd to reboot it.

Sorry, no clue yet.
pciconf (there's two controllers in this server, I use the first, but anyway): Thanks for the info. [...]
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Thu, Mar 07, 2013 at 11:08:50AM +0600, Eugene M. Zheganin wrote: Hi. On 07.03.2013 8:24, YongHyeon PYUN wrote: What was previous SVN revision number on that machine? The support for 5718/5719/5720 was merged to stable/8 about 3 months ago. It was definitely older than months. It was running something similar to FreeBSD 8.2-STABLE #0: Mon Sep 19 08:10:00 YEKST 2011, this is the uname from a neighbor machine. I have, as I said, identical servers running FreeBSD. Here are some of the unames that I don't see timeouts on: 8.3-STABLE #2: Wed Aug 29 13:00:02 YEKT 2012 (up 187 days) 8.3-PRERELEASE #1: Thu Mar 29 16:14:11 MSK 2012 (up 15 days, previous uptime around 180 days) These servers do not have 5718/5719/5720 changes. 8.2-STABLE #0: Wed Dec 14 16:56:11 YEKT 2011 (up 99 days) This server has the bge(4) change but it didn't trigger watchdog timeouts. Does this server use the same controller? If yes, the issue didn't come from bge(4) change. One more question: could it be a zfs-related issue ? Some kernel-level locking ? All of those run zfs also (no ufs at all). Sorry I have no idea on ZFS. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Thu, Mar 07, 2013 at 08:22:51AM +0300, Zeus Panchenko wrote:

Hi, here is my situation, much like the issue

No, your issue is a completely different one.

On 06.03.2013 12:26, YongHyeon PYUN wrote:

If you were using the latest stable/8, the result would be the same on CURRENT.

I use FreeBSD 9.1-RELEASE #0 r243825: amd64 + ZFS on an HP ProLiant DL360e Gen8. The box has two 4-port cards, igb(4) I350 and bge(4) NetXtreme BCM5719, according to the pciconf data.

How frequently do you see the watchdog timeouts? Is there a way to reproduce it?

I noticed that after activation, bge(4) stops responding and the interface becomes useless, while igb(4) works fine after some sysctl-ing. For now I'm forced not to use bge(4) at all :(

9.1-RELEASE does not have the required code to support your controller. Use stable/9.
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Wed, Mar 06, 2013 at 11:48:13AM +0600, Eugene M. Zheganin wrote:

Hi.

On 28.02.2013 11:35, YongHyeon PYUN wrote:

The reporter said the machine was a Sun Fire X2200 M2, so I guess you may see the same issue on both stable/9 and stable/8. Ideally the loader tunable hw.bge.allow_asf should not be there and the driver should take care of it by checking for the existence of ASF/IPMI firmware.

Unfortunately, I just had the 'bge0 - watchdog timeout - resetting' on a recent 8.3-STABLE and a 'Broadcom NetXtreme BCM5722 Gigabit (94309)' controller (according to pciconf -lv). I haven't seen this in a year or two (I guess); the machine was running 8.2-STABLE. So, in order to fight this (the machine freezes during these messages), what should I do? Is upgrading to 10.0-CURRENT an option? hw.bge.allow_asf is 0 already.

If you were using the latest stable/8, the result would be the same on CURRENT. How frequently do you see the watchdog timeouts? Is there a way to reproduce it? Would you show me the output of dmesg (bge(4) and brgphy(4) only) and pciconf -lcbv?
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Wed, Feb 27, 2013 at 12:09:28PM +0100, Miroslav Lachman wrote: YongHyeon PYUN wrote: On Wed, Feb 27, 2013 at 12:05:47AM +0600, Eugene M. Zheganin wrote: [...] bge(4)'s IPMI support for old controllers had many issues and didn't work well. Only some of users had luck to enjoy it. However IPMI support for 5717/5718/5719/5720 has no known issues and it should work. I also got a report that mentions IPMI does not work any more on 5715 after adding support for 5717/5718/5719/5720. The sanitized public data sheet does not mention IPMI interface so Linux tg3 would be the only source of information. Given that I don't have access to IPMI-capable controllers I have no idea when it could be fixed. Somebody with the IPMI-capable controllers have to sit down and verify all possible combinations. I have a spare machine Sun Fire X2100 M2 with 5715C: bge0@pci0:6:4:0:class=0x02 card=0x534c108e chip=0x167814e4 rev=0xa3 hdr=0x00 vendor = 'Broadcom Corporation' device = 'BCM5715C 10/100/100 PCIe Ethernet Controller' class = network subclass = ethernet bge1@pci0:6:4:1:class=0x02 card=0x534c108e chip=0x167814e4 rev=0xa3 hdr=0x00 vendor = 'Broadcom Corporation' device = 'BCM5715C 10/100/100 PCIe Ethernet Controller' class = network subclass = ethernet I can provide you full access to this machine (if you want) or let me know, what version I should check. Older versions (6.x - 8.3) are working fine with hw.bge.allow_asf=1 in loader.conf. I didn't test newer releases on these old machines. The reporter said the machine was Sun Fire X2200 M2 so I guess you may see the same issue on both stable/9 and stable/8. Ideally the loader tunable hw.bge.allow_asf should not be there and driver should take care of it by checking the existence of ASF/IPMI firmware. Can you setup a remote debugging environments(+ IPMI access) like the following URL? 
http://people.freebsd.org/~yongari/remote_debugging.txt Miroslav Lachman
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Wed, Feb 27, 2013 at 12:05:47AM +0600, Eugene M. Zheganin wrote: Hi. On 25.02.2013 14:20, YongHyeon PYUN wrote: On Sun, Feb 24, 2013 at 11:06:42AM +0100, Kajetan Staszkiewicz wrote: Dnia sobota, 23 lutego 2013 o 04:54:07 Marc Fournier napisał(a): We just picked up 5 new HP DL 360p Gen8 E5-2630 2P servers … just installed 9.1-RELEASE, and it looks like all of the hardware is detected properly, and being configured … After reboot, I start getting the 'watchdog timeout - resetting' message on bge0 … I've searched the web, and found the references to setting: Have a look at the following patch: http://svnweb.freebsd.org/base?view=revisionrevision=243546 When I encountered the same error on Dell machines, using bge driver from HEAD helped me, although it seems that the aforementioned patch should be enough. That change is just one of changes required to make BCM5718/5718/5719/5720 work. You need entire bge(4)/brgphy(4) changes to get working bge(4) driver on your machines. Just to clear some things: this (or earlier changes) doesn't affect and doesn't enable IPMI sol and stuff on a BCM5722 ? bge(4)'s IPMI support for old controllers had many issues and didn't work well. Only some of users had luck to enjoy it. However IPMI support for 5717/5718/5719/5720 has no known issues and it should work. I also got a report that mentions IPMI does not work any more on 5715 after adding support for 5717/5718/5719/5720. The sanitized public data sheet does not mention IPMI interface so Linux tg3 would be the only source of information. Given that I don't have access to IPMI-capable controllers I have no idea when it could be fixed. Somebody with the IPMI-capable controllers have to sit down and verify all possible combinations. Thanks. Eugene. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Fri, Feb 22, 2013 at 07:54:07PM -0800, Marc Fournier wrote: We just picked up 5 new HP DL 360p Gen8 E5-2630 2P servers … just installed 9.1-RELEASE, and it looks like all of the hardware is detected properly, and being configured … After reboot, I start getting the 'watchdog timeout - resetting' message on bge0 … I've searched the web, and found the references to setting: hw.bge.allow_asf=0 hw.pci.enable_msi=0 but after reboot with those set in /boot/loader.conf (and confirmed via sysctl -a after login), its still doing it … Looking at sysctl -a, even though: hw.pci.enable_msi=0 is set, I do see: dev.bge.0.msi=1 dev.bge.1.msi=1 dev.bge.2.msi=1 dev.bge.3.msi=1 still all set to 1 … is that right? Don't know if this is useful, but, again, according to sysctl -a: dev.bge.0.%desc: Broadcom unknown BCM5719, ASIC rev. 0x5719001 === If I do an 'ifconfig bge0', it does show the interface as being active, but I can't ping out on it … I even found someone's reference to doing a 'ifconfig bge0 -tso -vlanhwtso' and tried that … no go … Something else I can look at? You have to use latest stable/9 or stable/8. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: FreeBSD 9.1-RELEASE + bge0 == watchdog timeout
On Sun, Feb 24, 2013 at 11:06:42AM +0100, Kajetan Staszkiewicz wrote: Dnia sobota, 23 lutego 2013 o 04:54:07 Marc Fournier napisał(a): We just picked up 5 new HP DL 360p Gen8 E5-2630 2P servers … just installed 9.1-RELEASE, and it looks like all of the hardware is detected properly, and being configured … After reboot, I start getting the 'watchdog timeout - resetting' message on bge0 … I've searched the web, and found the references to setting: Have a look at the following patch: http://svnweb.freebsd.org/base?view=revisionrevision=243546 When I encountered the same error on Dell machines, using bge driver from HEAD helped me, although it seems that the aforementioned patch should be enough. That change is just one of changes required to make BCM5718/5718/5719/5720 work. You need entire bge(4)/brgphy(4) changes to get working bge(4) driver on your machines. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: 9.1-stable crashes while copying data from a NFS mounted directory
On Fri, Jan 25, 2013 at 06:09:50PM +0100, Christian Gusenbauer wrote: On Friday 25 January 2013 05:50:48 YongHyeon PYUN wrote: On Fri, Jan 25, 2013 at 01:30:43PM +0900, YongHyeon PYUN wrote: On Thu, Jan 24, 2013 at 05:21:50PM -0500, John Baldwin wrote: On Thursday, January 24, 2013 4:22:12 pm Konstantin Belousov wrote: On Thu, Jan 24, 2013 at 09:50:52PM +0100, Christian Gusenbauer wrote: On Thursday 24 January 2013 20:37:09 Konstantin Belousov wrote: On Thu, Jan 24, 2013 at 07:50:49PM +0100, Christian Gusenbauer wrote: On Thursday 24 January 2013 19:07:23 Konstantin Belousov wrote: On Thu, Jan 24, 2013 at 08:03:59PM +0200, Konstantin Belousov wrote: On Thu, Jan 24, 2013 at 06:05:57PM +0100, Christian Gusenbauer wrote: Hi! I'm using 9.1 stable svn revision 245605 and I get the panic below if I execute the following commands (as single user): # swapon -a # dumpon /dev/ada0s3b # mount -u / # ifconfig age0 inet 192.168.2.2 mtu 6144 up # mount -t nfs -o rsize=32768 data:/multimedia /mnt # cp /mnt/Movies/test/a.m2ts /tmp then the system panics almost immediately. I'll attach the stack trace. Note, that I'm using jumbo frames (6144 byte) on a 1Gbit network, maybe that's the cause for the panic, because the bcopy (see stack frame #15) fails. Any clues? I tried a similar operation with the nfs mount of rsize=32768 and mtu 6144, but the machine runs HEAD and em instead of age. I was unable to reproduce the panic on the copy of the 5GB file from nfs mount. Hmmm, I did a quick test. If I do not change the MTU, so just configuring age0 with # ifconfig age0 inet 192.168.2.2 up then I can copy all files from the mounted directory without any problems, too. So it's probably age0 related? From your backtrace and the buffer printout, I see somewhat strange thing. The buffer data address is 0xff8171418000, while kernel faulted at the attempt to write at 0xff8171413000, which is is lower then the buffer data pointer, at the attempt to bcopy to the buffer. 
The other data suggests that there was no overflow of the data from the server response. So it might be that mbuf_len(mp) returned a negative number? I am not sure whether that is possible at all. Try this debugging patch, please. You need to add INVARIANTS etc. to the kernel config.

diff --git a/sys/fs/nfs/nfs_commonsubs.c b/sys/fs/nfs/nfs_commonsubs.c
index efc0786..9a6bda5 100644
--- a/sys/fs/nfs/nfs_commonsubs.c
+++ b/sys/fs/nfs/nfs_commonsubs.c
@@ -218,6 +218,7 @@ nfsm_mbufuio(struct nfsrv_descript *nd, struct uio *uiop, int siz)
 				}
 				mbufcp = NFSMTOD(mp, caddr_t);
 				len = mbuf_len(mp);
+				KASSERT(len > 0, ("len %d", len));
 			}
 			xfer = (left > len) ? len : left;
 #ifdef notdef
@@ -239,6 +240,8 @@ nfsm_mbufuio(struct nfsrv_descript *nd, struct uio *uiop, int siz)
 			uiop->uio_resid -= xfer;
 		}
 		if (uiop->uio_iov->iov_len <= siz) {
+			KASSERT(uiop->uio_iovcnt > 1, ("uio_iovcnt %d",
+			    uiop->uio_iovcnt));
 			uiop->uio_iovcnt--;
 			uiop->uio_iov++;
 		} else {

I thought that the server had returned too long a response, but that does not seem to be the case from your data. Still, I think the patch below might be due.

diff --git a/sys/fs/nfsclient/nfs_clrpcops.c b/sys/fs/nfsclient/nfs_clrpcops.c
index be0476a..a89b907 100644
--- a/sys/fs/nfsclient/nfs_clrpcops.c
+++ b/sys/fs/nfsclient/nfs_clrpcops.c
@@ -1444,7 +1444,7 @@ nfsrpc_readrpc(vnode_t vp, struct uio *uiop, struct ucred *cred,
 			NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
 			eof = fxdr_unsigned(int, *tl);
 		}
-		NFSM_STRSIZ(retlen, rsize);
+		NFSM_STRSIZ(retlen, len);
 		error = nfsm_mbufuio(nd, uiop, retlen);
 		if (error)
 			goto nfsmout;

I applied your patches and now I get a

panic: len -4
cpuid = 1
KDB: enter: panic
Dumping 377 out of 6116 MB:..5%..13%..22%..34%..43%..51%..64%..73%..81%..94%

This means
Re: [SOLVED] if_vr(4) and DFE520-TX [working with patched if_rl]
On Tue, Jan 15, 2013 at 01:04:49PM +0400, Ruslan Makhmatkhanov wrote:

YongHyeon PYUN wrote on 15.01.2013 10:51:

Hmm, I don't get it. Diff inlined again.

Index: sys/pci/if_rlreg.h
===
--- sys/pci/if_rlreg.h (revision 245199)
+++ sys/pci/if_rlreg.h (working copy)
@@ -1048,6 +1048,11 @@ struct rl_softc {
 #define DLINK_DEVICEID_530TXPLUS        0x1300
 
 /*
+ * D-Link DFE-520TX rev. C1 device ID
+ */
+#define DLINK_DEVICEID_520TX_REVC1      0x4200
+
+/*
  * D-Link DFE-5280T device ID
  */
 #define DLINK_DEVICEID_528T             0x4300
Index: sys/pci/if_rl.c
===
--- sys/pci/if_rl.c (revision 245199)
+++ sys/pci/if_rl.c (working copy)
@@ -148,6 +148,8 @@ static const struct rl_type rl_devs[] = {
         "Delta Electronics 8139 10/100BaseTX" },
     { ADDTRON_VENDORID, ADDTRON_DEVICEID_8139, RL_8139,
         "Addtron Technology 8139 10/100BaseTX" },
+    { DLINK_VENDORID, DLINK_DEVICEID_520TX_REVC1, RL_8139,
+        "D-Link DFE-520TX (rev. C1) 10/100BaseTX" },
     { DLINK_VENDORID, DLINK_DEVICEID_530TXPLUS, RL_8139,
         "D-Link DFE-530TX+ 10/100BaseTX" },
     { DLINK_VENDORID, DLINK_DEVICEID_690TXD, RL_8139,

Hooray! It is working with if_rl with your patch (loader tunable isn't used). Thanks a lot for this! Can this be committed and merged to 8/9?

Yes, committed in r245485. I will MFC to stable 9/8 after a week.

rl1@pci0:4:1:0: class=0x02 card=0x11031186 chip=0x42001186 rev=0x10 hdr=0x00
    vendor = 'D-Link System Inc'
    class = network
    subclass = ethernet

rl1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
    options=2008<VLAN_MTU,WOL_MAGIC>
    ether 90:94:e4:82:d5:e6
    inet 192.168.0.208 netmask 0xff00 broadcast 192.168.0.255
    nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
    media: Ethernet autoselect (100baseTX full-duplex)
    status: active

Ping and all other working fine.

Thanks for testing!

--
Regards, Ruslan

Tinderboxing kills... the drives.
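For readers wondering why adding one table entry is enough to make the card attach, the following is a rough C sketch of how a probe routine can match a PCI vendor/device pair against a supported-device table. It is not the rl(4) code: the struct, the SKETCH_ macros and both functions are hypothetical, and only the numeric IDs (vendor 0x1186, devices 0x4200 and 0x1300) and the description strings come from this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a driver's supported-device table. */
struct sketch_dev_type {
	uint16_t    vid;   /* PCI vendor ID */
	uint16_t    did;   /* PCI device ID */
	const char *name;  /* human-readable description */
};

#define SKETCH_DLINK_VENDORID    0x1186
#define SKETCH_DLINK_520TX_REVC1 0x4200
#define SKETCH_DLINK_530TXPLUS   0x1300

static const struct sketch_dev_type sketch_devs[] = {
	{ SKETCH_DLINK_VENDORID, SKETCH_DLINK_520TX_REVC1,
	    "D-Link DFE-520TX (rev. C1) 10/100BaseTX" },
	{ SKETCH_DLINK_VENDORID, SKETCH_DLINK_530TXPLUS,
	    "D-Link DFE-530TX+ 10/100BaseTX" },
};

/* Return the matching table entry, or NULL if the device is unknown. */
static const struct sketch_dev_type *
sketch_lookup(uint16_t vid, uint16_t did)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_devs) / sizeof(sketch_devs[0]); i++) {
		if (sketch_devs[i].vid == vid && sketch_devs[i].did == did)
			return &sketch_devs[i];
	}
	return NULL;
}

/*
 * pciconf prints chip=0xDDDDVVVV: the device ID is in the upper 16 bits
 * and the vendor ID in the lower 16 bits.
 */
static const struct sketch_dev_type *
sketch_lookup_chip(uint32_t chip)
{
	return sketch_lookup((uint16_t)(chip & 0xffff), (uint16_t)(chip >> 16));
}
```

With the chip value shown above (0x42001186), the lookup splits it into vendor 0x1186 and device 0x4200 and finds the new entry. Without that entry the lookup returns NULL, which corresponds to the unattached none2@pci0 device reported at the start of the thread.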
Re: if_vr(4) and DFE520-TX
On Mon, Jan 14, 2013 at 03:52:18PM +0400, Ruslan Makhmatkhanov wrote: YongHyeon PYUN wrote on 14.01.2013 10:15: On Sat, Jan 12, 2013 at 06:49:13PM +0400, Ruslan Makhmatkhanov wrote: Ok, I got some details. It's an DFE-520TX (/C1 or rev. C1). I crafted an patch attached, but whenever kldloading the modified if_vr, I got this: [...] I also tried to apply VR_Q_NEEDALIGN quirk, but nothing is changed. Any hints? I recall D-Link was one of notorious vendor which used to completely change its chip set in later revisions without notice. So I'm afraid the controller you have may not be a VIA manufactured one. Could you take a picture of the chip set of controller and let others see it? I guess it could be a RealTek 8139 or 8139C+. Here they are. Both front and back for the case (see no traces of RealTek though): http://s2.postimage.org/9nvkrlpqx/IMAG1040.jpg http://s2.postimage.org/4qi06hnrt/IMAG1041.jpg Thanks. Try attached patch and let me know how it works. If that patch does not work, try setting a loader tunable like the following. dev.rl.0.prefer_iomap=0 diff -r ffd9aeb1e7ef sys/dev/re/if_re.c --- a/sys/dev/re/if_re.c Mon May 07 23:58:27 2012 +0200 +++ b/sys/dev/re/if_re.c Tue Jan 15 01:10:46 2013 +0100 @@ -174,6 +174,8 @@ static const struct rl_type const re_devs[] = { { DLINK_VENDORID, DLINK_DEVICEID_528T, 0, D-Link DGE-528(T) Gigabit Ethernet Adapter }, + { DLINK_VENDORID, DLINK_DEVICEID_520TX, 0, + D-Link DFE-520(TX) Gigabit Ethernet Adapter }, { DLINK_VENDORID, DLINK_DEVICEID_530T_REVC, 0, D-Link DGE-530(T) Gigabit Ethernet Adapter }, { RT_VENDORID, RT_DEVICEID_8139, 0, @@ -1214,7 +1216,7 @@ * Because RTL8169SC does not seem to work when memory mapping * is used always activate io mapping. 
*/ - if (devid == RT_DEVICEID_8169SC) + if (devid == RT_DEVICEID_8169SC || devid == DLINK_DEVICEID_520TX) prefer_iomap = 1; if (prefer_iomap == 0) { sc-rl_res_id = PCIR_BAR(1); diff -r ffd9aeb1e7ef sys/pci/if_rlreg.h --- a/sys/pci/if_rlreg.h Mon May 07 23:58:27 2012 +0200 +++ b/sys/pci/if_rlreg.h Tue Jan 15 01:10:46 2013 +0100 @@ -1048,6 +1048,11 @@ #define DLINK_DEVICEID_530TXPLUS 0x1300 /* + * D-Link DFE-520TX device ID + */ +#define DLINK_DEVICEID_520TX 0x4200 + +/* * D-Link DFE-5280T device ID */ #define DLINK_DEVICEID_528T 0x4300 ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: if_vr(4) and DFE520-TX
On Tue, Jan 15, 2013 at 10:32:06AM +0400, Ruslan Makhmatkhanov wrote: YongHyeon PYUN wrote on 15.01.2013 06:44: On Mon, Jan 14, 2013 at 03:52:18PM +0400, Ruslan Makhmatkhanov wrote: YongHyeon PYUN wrote on 14.01.2013 10:15: On Sat, Jan 12, 2013 at 06:49:13PM +0400, Ruslan Makhmatkhanov wrote: Ok, I got some details. It's an DFE-520TX (/C1 or rev. C1). I crafted an patch attached, but whenever kldloading the modified if_vr, I got this: [...] I also tried to apply VR_Q_NEEDALIGN quirk, but nothing is changed. Any hints? I recall D-Link was one of notorious vendor which used to completely change its chip set in later revisions without notice. So I'm afraid the controller you have may not be a VIA manufactured one. Could you take a picture of the chip set of controller and let others see it? I guess it could be a RealTek 8139 or 8139C+. Here they are. Both front and back for the case (see no traces of RealTek though): http://s2.postimage.org/9nvkrlpqx/IMAG1040.jpg http://s2.postimage.org/4qi06hnrt/IMAG1041.jpg Thanks. Try attached patch and let me know how it works. If that patch does not work, try setting a loader tunable like the following. dev.rl.0.prefer_iomap=0 Terrific! It's now attaching fine, but network over it doesn't seems working (can't ping/access machine via this interface): Please use my patch. I think rl(4) is the right driver for your controller. Jeremie's patch forces re(4) to attach. 
re0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8209bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE ether 90:94:e4:82:d5:e6 inet 192.168.0.208 netmask 0xff00 broadcast 192.168.0.255 inet6 fe80::9294:e4ff:fe82:d5e6%re0 prefixlen 64 scopeid 0x5 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (100baseTX full-duplex) status: active re0@pci0:4:1:0: class=0x02 card=0x11031186 chip=0x42001186 rev=0x10 hdr=0x00 vendor = 'D-Link System Inc' class = network subclass = ethernet I also tried to add dev.rl.0.prefer_iomap=0 to /boot/loader.conf with no difference. I'll try to experiment with this later this day when there will be no active users on this machine, then let you know the results. It's not a valid option when you use re(4). Thank you! ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: if_vr(4) and DFE520-TX
On Tue, Jan 15, 2013 at 10:47:38AM +0400, Ruslan Makhmatkhanov wrote: YongHyeon PYUN wrote on 15.01.2013 10:40: On Tue, Jan 15, 2013 at 10:32:06AM +0400, Ruslan Makhmatkhanov wrote: YongHyeon PYUN wrote on 15.01.2013 06:44: On Mon, Jan 14, 2013 at 03:52:18PM +0400, Ruslan Makhmatkhanov wrote: YongHyeon PYUN wrote on 14.01.2013 10:15: On Sat, Jan 12, 2013 at 06:49:13PM +0400, Ruslan Makhmatkhanov wrote: Ok, I got some details. It's an DFE-520TX (/C1 or rev. C1). I crafted an patch attached, but whenever kldloading the modified if_vr, I got this: [...] I also tried to apply VR_Q_NEEDALIGN quirk, but nothing is changed. Any hints? I recall D-Link was one of notorious vendor which used to completely change its chip set in later revisions without notice. So I'm afraid the controller you have may not be a VIA manufactured one. Could you take a picture of the chip set of controller and let others see it? I guess it could be a RealTek 8139 or 8139C+. Here they are. Both front and back for the case (see no traces of RealTek though): http://s2.postimage.org/9nvkrlpqx/IMAG1040.jpg http://s2.postimage.org/4qi06hnrt/IMAG1041.jpg Thanks. Try attached patch and let me know how it works. If that patch does not work, try setting a loader tunable like the following. dev.rl.0.prefer_iomap=0 Terrific! It's now attaching fine, but network over it doesn't seems working (can't ping/access machine via this interface): Please use my patch. I think rl(4) is the right driver for your controller. Jeremie's patch forces re(4) to attach. To be honest, your and Jeremie patches are identical. Your patch is against if_re/if_rlreg.h too :) Hmm, I don't get it. Diff inlined again. Index: sys/pci/if_rlreg.h === --- sys/pci/if_rlreg.h (revision 245199) +++ sys/pci/if_rlreg.h (working copy) @@ -1048,6 +1048,11 @@ struct rl_softc { #defineDLINK_DEVICEID_530TXPLUS0x1300 /* + * D-Link DFE-520TX rev. 
C1 device ID + */ +#defineDLINK_DEVICEID_520TX_REVC1 0x4200 + +/* * D-Link DFE-5280T device ID */ #defineDLINK_DEVICEID_528T 0x4300 Index: sys/pci/if_rl.c === --- sys/pci/if_rl.c (revision 245199) +++ sys/pci/if_rl.c (working copy) @@ -148,6 +148,8 @@ static const struct rl_type rl_devs[] = { Delta Electronics 8139 10/100BaseTX }, { ADDTRON_VENDORID, ADDTRON_DEVICEID_8139, RL_8139, Addtron Technology 8139 10/100BaseTX }, + { DLINK_VENDORID, DLINK_DEVICEID_520TX_REVC1, RL_8139, + D-Link DFE-520TX (rev. C1) 10/100BaseTX }, { DLINK_VENDORID, DLINK_DEVICEID_530TXPLUS, RL_8139, D-Link DFE-530TX+ 10/100BaseTX }, { DLINK_VENDORID, DLINK_DEVICEID_690TXD, RL_8139, re0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8209bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE ether 90:94:e4:82:d5:e6 inet 192.168.0.208 netmask 0xff00 broadcast 192.168.0.255 inet6 fe80::9294:e4ff:fe82:d5e6%re0 prefixlen 64 scopeid 0x5 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (100baseTX full-duplex) status: active re0@pci0:4:1:0: class=0x02 card=0x11031186 chip=0x42001186 rev=0x10 hdr=0x00 vendor = 'D-Link System Inc' class = network subclass = ethernet I also tried to add dev.rl.0.prefer_iomap=0 to /boot/loader.conf with no difference. I'll try to experiment with this later this day when there will be no active users on this machine, then let you know the results. It's not a valid option when you use re(4). Thank you! Yes, it was unmindful copy/paste, sorry. -- Regards, Ruslan Tinderboxing kills... the drives. ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: if_vr(4) and DFE520-TX
On Sat, Jan 12, 2013 at 06:49:13PM +0400, Ruslan Makhmatkhanov wrote: Ok, I got some details. It's an DFE-520TX (/C1 or rev. C1). I crafted an patch attached, but whenever kldloading the modified if_vr, I got this: kernel: vr0: D-Link System Inc 4200 10/100BaseTX port 0xd100-0xd1ff mem 0xf7c11000-0xf7c110ff irq 19 at device 0.0 on pci4 kernel: vr0: Quirks: 0x0 kernel: vr0: Revision: 0x10 kernel: vr0: reset never completed! kernel: vr0: attaching PHYs failed kernel: device_attach: vr0 attach returned 6 kernel: vr0: D-Link System Inc 4200 10/100BaseTX port 0xd000-0xd0ff mem 0xf7c1-0xf7c100ff irq 16 at device 1.0 on pci4 kernel: vr0: Quirks: 0x0 kernel: vr0: Revision: 0x10 kernel: vr0: reset never completed! kernel: vr0: attaching PHYs failed kernel: device_attach: vr0 attach returned 6 I also tried to apply VR_Q_NEEDALIGN quirk, but nothing is changed. Any hints? I recall D-Link was one of notorious vendor which used to completely change its chip set in later revisions without notice. So I'm afraid the controller you have may not be a VIA manufactured one. Could you take a picture of the chip set of controller and let others see it? I guess it could be a RealTek 8139 or 8139C+. Ruslan Makhmatkhanov wrote on 12.01.2013 15:26: Here is also verbose boot log for what it's worth: http://pastebin.com/SnivrtFr Please keep me in cc:, I'm not subscribed. Thanks. Ruslan Makhmatkhanov wrote on 12.01.2013 11:28: Hello, I bought two D-link DFE520-TX ethernet adapters that supposed to work with if_vr(4) according to man-page. But the driver cannot attach (tested in 9.1-R and pfSense 2.0.2/2.1 (8.1-R and 8.3-R respectively)). none2@pci0:4:0:0:class=0x02 card=0x11031186 chip=0x42001186 rev=0x10 hdr=0x00 vendor = 'D-Link System Inc' class = network subclass = ethernet Can please anybody suggest proper changes for /sys/dev/vr/if_vrreg.h|if_vr.c (pci ids would be enought, right?) to test if it works. Thanks in advance. -- Regards, Ruslan Tinderboxing kills... the drives. 
diff -uN vr.orig/if_vr.c vr/if_vr.c --- vr.orig/if_vr.c 2013-01-12 13:19:28.0 +0400 +++ vr/if_vr.c2013-01-12 18:42:52.0 +0400 @@ -138,6 +138,9 @@ { DELTA_VENDORID, DELTA_DEVICEID_RHINE_II, VR_Q_NEEDALIGN, Delta Electronics Rhine II 10/100BaseTX }, + { DLINK_VENDORID, DLINK_DEVICEID_RHINE_II, + 0, +D-Link System Inc 4200 10/100BaseTX }, { ADDTRON_VENDORID, ADDTRON_DEVICEID_RHINE_II, VR_Q_NEEDALIGN, Addtron Technology Rhine II 10/100BaseTX }, diff -uN vr.orig/if_vrreg.h vr/if_vrreg.h --- vr.orig/if_vrreg.h2013-01-12 13:19:28.0 +0400 +++ vr/if_vrreg.h 2013-01-12 14:29:26.0 +0400 @@ -557,6 +557,16 @@ #define DELTA_DEVICEID_RHINE_II 0x1320 /* + * D-Link System Inc device ID. + */ +#define DLINK_VENDORID 0x1186 + +/* + * D-Link System Inc device IDs. + */ +#define DLINK_DEVICEID_RHINE_II 0x4200 + +/* * Addtron vendor ID. */ #define ADDTRON_VENDORID 0x4033 ___ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to freebsd-net-unsubscr...@freebsd.org
Re: kern/174851: [bxe] [patch] UDP checksum offload is wrong in bxe driver
On Mon, Dec 31, 2012 at 03:04:47PM -0800, Barney Cordoba wrote:
> --- On Mon, 12/31/12, Adrian Chadd adr...@freebsd.org wrote:
> > From: Adrian Chadd adr...@freebsd.org
> > Subject: Re: kern/174851: [bxe] [patch] UDP checksum offload is wrong in bxe driver
> > To: Garrett Cooper yaneg...@gmail.com
> > Cc: Barney Cordoba barney_cord...@yahoo.com, David Christensen davi...@freebsd.org, lini...@freebsd.org, freebsd-net@freebsd.org
> > Date: Monday, December 31, 2012, 2:00 PM
> >
> > On 31 December 2012 07:58, Garrett Cooper yaneg...@gmail.com wrote:
> > > I would ask David about whether or not there was a performance
> > > difference because they might have some numbers for if_bxe. Not
> > > sure about the concept in general, but it seems like a reasonable
> > > application protocol specific request. But by and large, I agree
> > > that UDP checksumming doesn't make logical sense because it adds
> > > unnecessary overhead on a L3 protocol that's assumed to be
> > > unreliable.
> >
> > People are terminating millions of VoIP calls on FreeBSD devices.
> > All using UDP. I can imagine large scale VoIP gateways wanting to
> > try and benefit from this.
>
> The statement above assumes that there is a benefit. VoIP packets are
> short, so the benefit of offloading is reduced. There is some delay
> added by the hardware, and there are CPU cycles used in managing the
> offload code. So those operations not only muddy the code, but they
> may not be faster than simply doing the checksum on a much, much
> faster CPU.

I'm under the impression that recent Intel controllers tend to put more
burden on the driver to set up the checksum offloading context. In
addition, some controllers have DMA restrictions for checksum offloading
to work, so it may add overhead unless the DMA engine supports multiple
outstanding reads. As you said, checksum offloading may not make things
faster, but the CPU cycles it saves can be used for other system
activities. You can disable checksum offloading at any time if you find
it's not good for a specific load.
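
For a sense of what "simply doing the checksum on the CPU" costs per packet, here is a minimal sketch of the Internet one's-complement checksum as described in RFC 1071. It is an illustration only: `inet_cksum` is a hypothetical helper name, not the kernel's in_cksum(9) implementation, and a real UDP checksum also covers a pseudo-header that is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the FreeBSD in_cksum code): RFC 1071
 * one's-complement checksum over a byte buffer, treating the data as
 * big-endian 16-bit words. The work is one add per 16-bit word, so it
 * grows linearly with packet length.
 */
static uint16_t
inet_cksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;

	/* Add up all complete 16-bit words. */
	while (len > 1) {
		sum += ((uint32_t)buf[0] << 8) | buf[1];
		buf += 2;
		len -= 2;
	}
	/* A trailing odd byte is padded with a zero low byte. */
	if (len == 1)
		sum += (uint32_t)buf[0] << 8;
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}
```

For a short VoIP payload of a couple hundred bytes this loop runs on the order of a hundred iterations, which is the scale of work being weighed against the cost of setting up an offload context in the driver.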
Re: svn commit: r242739 - stable/9/sys/dev/ti
On Wed, Nov 07, 2012 at 06:15:30PM -0800, Adrian Chadd wrote:
> So I am curious - did this give a real benefit?

In the 3.x/4.x days it surely would have helped a lot, I guess, mainly
because the CPU was not fast enough to saturate the link with software
checksumming (e.g. NFS over UDP). Generally I prefer correctness to
performance, and it seems there is no easy way to take full advantage
of the controller's TCP/UDP checksum offloading on fragmented IP
packets on FreeBSD 8+. So I disabled it to reduce the chance of
generating corrupted packets.

> If so, may I suggest we perhaps accelerate discussing if_transmit() of
> multiple frames per call?

Hmm, actually I'm still not a fan of if_transmit() at this moment.
Honestly, I don't have good queuing code in the driver to handle the
queue-full condition. Interaction with altq(9) is also one of my
concerns, as is the packet reordering issue of the drbr(9) interface.

> That would allow features like this to be re-enabled.
>
> Adrian
>
> On 7 November 2012 18:06, Pyun YongHyeon yong...@freebsd.org wrote:
> > Author: yongari
> > Date: Thu Nov 8 02:06:27 2012
> > New Revision: 242739
> > URL: http://svnweb.freebsd.org/changeset/base/242739
> >
> > Log:
> >   MFC r242425:
> >   Remove TCP/UDP checksum offloading feature for IP fragmented
> >   datagrams. Traditionally upper stack fragmented packets without
> >   computing TCP/UDP checksum and these datagrams were passed to
> >   driver. But there are chances that other packets slip into the
> >   interface queue in SMP world. If this happens firmware running on
> >   MIPS 4000 processor in the controller would see mixed packets and
> >   it shall send out corrupted packets.
> >   While I'm here simplify checksum offloading setup.
> >
> > Modified:
> >   stable/9/sys/dev/ti/if_ti.c
> > Directory Properties:
> >   stable/9/sys/ (props changed)
> >   stable/9/sys/dev/ (props changed)
> >
> > Modified: stable/9/sys/dev/ti/if_ti.c
> > ==============================================================================
> > --- stable/9/sys/dev/ti/if_ti.c Thu Nov 8 02:01:04 2012 (r242738)
> > +++ stable/9/sys/dev/ti/if_ti.c Thu Nov 8 02:06:27 2012 (r242739)
> > @@ -127,7 +127,7 @@ __FBSDID("$FreeBSD$");
> >
> >  #include <sys/sysctl.h>
> >
> > -#define TI_CSUM_FEATURES (CSUM_IP | CSUM_TCP | CSUM_UDP | CSUM_IP_FRAGS)
> > +#define TI_CSUM_FEATURES (CSUM_IP | CSUM_TCP | CSUM_UDP)
> >  /*
> >   * We can only turn on header splitting if we're using extended receive
> >   * BDs.
> > @@ -3083,16 +3083,10 @@ ti_encap(struct ti_softc *sc, struct mbu
> >
> >  	m = *m_head;
> >  	csum_flags = 0;
> > -	if (m->m_pkthdr.csum_flags) {
> > -		if (m->m_pkthdr.csum_flags & CSUM_IP)
> > -			csum_flags |= TI_BDFLAG_IP_CKSUM;
> > -		if (m->m_pkthdr.csum_flags & (CSUM_TCP | CSUM_UDP))
> > -			csum_flags |= TI_BDFLAG_TCP_UDP_CKSUM;
> > -		if (m->m_flags & M_LASTFRAG)
> > -			csum_flags |= TI_BDFLAG_IP_FRAG_END;
> > -		else if (m->m_flags & M_FRAG)
> > -			csum_flags |= TI_BDFLAG_IP_FRAG;
> > -	}
> > +	if (m->m_pkthdr.csum_flags & CSUM_IP)
> > +		csum_flags |= TI_BDFLAG_IP_CKSUM;
> > +	if (m->m_pkthdr.csum_flags & (CSUM_TCP | CSUM_UDP))
> > +		csum_flags |= TI_BDFLAG_TCP_UDP_CKSUM;
> >
> >  	frag = sc->ti_tx_saved_prodidx;
> >  	for (i = 0; i < nseg; i++) {
Re: bxe + if_lagg
On Wed, Oct 31, 2012 at 12:05:37PM -0400, Tom Judge wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 10/31/12 3:47 AM, YongHyeon PYUN wrote:
> > On Tue, Oct 30, 2012 at 11:23:37AM -0400, Tom Judge wrote:
> >
> > [...]
> >
> > > I am trying to get if_lagg working in an HP blade for failover
> > > between the 2 in-chassis Cisco switches, but it would seem that
> > > the link state is not being propagated up to the lagg device.
> > > Any hints/ideas?
> > >
> > > dmesg:
> > > bxe1: Broadcom NetXtreme II BCM57711E 10GbE (A0) BXE v:1.5.52
> > > bxe1: Ethernet address: 00:25:b3:a8:76:e4
> > > bxe1: ASIC (0x1650); Rev (A0); Bus (PCIe x4, 5Gbps); Flags (MSI-X); Queues (RSS:16); BD's (RX:510,TX:255); Firmware (5.2.13); Bootcode (4.8.0)
> >
> > Try attached patch and let me know whether it makes any difference.
>
> This results in zero network connectivity, even with an IP assigned
> to the bxe device directly. It does show status: active and the same
> 10Gbase-SR media however. Ping results in no route to host. :-(

It seems the upper stack still thinks the link is down, so it didn't
even bother to send any packets. Probably that was the reason why
bxe(4) did not announce the IFCAP_LINKSTATE capability to the stack. It
seems the slow path handler (the link state change tracker) does not
work as expected.
Re: kern/171520: [alc] alc network driver + tso + vlan does not work.
On Mon, Oct 22, 2012 at 10:15:28AM +0600, Nikolay Nevzorov wrote:
> 2012/10/22 Eugene Grosbein egrosb...@rdtc.ru
> > 22.10.2012 10:55, Nikolay Nevzorov wrote:
> > > I can make a clean test. What do you mean by clean: reboot with
> > > mpd disabled (so kernel NAT will not be enabled in the system),
> > > or remove LIBALIAS from the kernel too?
> >
> > You need not reboot or disable mpd. Just make sure your testing
> > traffic does not pass through NAT.
>
> Traffic through NAT does not cause any problems, and neither does any
> routed traffic. The problem is only with traffic generated on the
> host with alc0, because the host generates packets much bigger than
> the MTU (about 2300 bytes per packet with MTU 1500). I see it with
> tcpdump on alc0.

It's completely normal to see packets bigger than the MTU on
TSO-capable controllers. bpf sees these packets *before* the hardware
actually segments them.
Re: FreeBSD 9.0-RELEASE Realtek 8168/81111 Driver
On Wed, Sep 26, 2012 at 04:12:19PM +0300, Vladimir Vladimir wrote:
> Hi all,
>
> I've got an issue with a new motherboard, ASUS P8Z77-M PRO. My FreeBSD
> 9.0-RELEASE can't set up the network interface for the integrated NIC,
> Realtek 8168/8. I have tried FreeBSD-8.2 and FreeBSD-9.0. I downloaded
> new distributions from freebsd.org and tried booting from them, but
> the result is the same. FreeBSD couldn't set up the re0 interface.
>
> In the dmesg output:
>
> re0: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet port 0xd000-0xd0ff mem 0xdc104000-0xdc104fff,0xdc10-0xdc103fff irq 17 at device 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: turning off MSI enable bit.
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x
>
> So it looks like FreeBSD detected the NIC, but ifconfig shows the
> loopback interface only.

There have been lots of re(4) changes since 9.0-RELEASE. Try
8.3-RELEASE or 9.1-RC1.

> lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
>     options=3<RXCSUM,TXCSUM>
>     inet6 ::1 prefixlen 128
>     inet6 fe80::1%lo0 prefixlen 64 scopeid 0xa
>     inet 127.0.0.1 netmask 0xff00
>     nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>
> If I try the command
>
> ifconfig re0 create
> ifconfig: SIOCIFCREATE2: Invalid argument
>
> At the same time Ubuntu works perfectly on this motherboard, so there
> aren't any hardware issues.
>
> According to the official ASUS documentation, the ASUS P8Z77-M PRO
> motherboard has a Realtek® 8111F NIC, 1 x Gigabit LAN Controller(s):
> http://www.asus.ua/Motherboards/Intel_Socket_1155/P8Z77M_PRO/#specifications
>
> Maybe the re(4) driver has not been updated for this motherboard yet?
> http://www.freebsd.org/cgi/man.cgi?query=re&apropos=0&sektion=4&manpath=FreeBSD+9.0-stable&arch=default&format=html
>
> If anybody has faced such an issue, let me know.
>
> Thanks,
> Vlad.
Re: kern/167325: [netinet] [patch] sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC
On Fri, Sep 07, 2012 at 05:44:48PM -0400, Jeremiah Lott wrote:
> On Apr 27, 2012, at 2:07 AM, lini...@freebsd.org wrote:
> > Old Synopsis: sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC
> > New Synopsis: [netinet] [patch] sosend sometimes return EINVAL with TSO and VLAN on 82599 NIC
> >
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=167325
>
> I did an analysis of this PR a while back and I figured I'd share.
> Definitely looks like a real problem here, but at least in 8.2 it is
> difficult to hit it. First off, VLAN tagging is not required to hit
> this. The code in question does not account for any amount of
> link-local header, so you can reproduce the bug even without VLANs.
>
> In order to trigger it, the TCP stack must choose to send a TSO packet
> with a total size (including TCP+IP header and options, but not
> link-local header) between 65522 and 65535 bytes (because adding the
> 14-byte link-local header will then exceed the 64K limit). In 8.1, the
> TCP stack only chooses to send TSO bursts that will result in full
> MTU-size on-wire packets. To achieve this, it will truncate the TSO
> packet size to be a multiple of MSS, not including header and TCP
> options. The check has been relaxed a little in head, but the same
> basic check is still there. None of the normal MTUs have multiples
> falling in this range.
>
> To reproduce it I used an MTU of 1445. When timestamps are in use,
> every packet has a 40-byte TCP/IP header + 10 bytes for the timestamp
> option + 2 bytes of pad. You can get a packet length of 65523 as
> follows:
>
> 65523 - (40 + 10 + 2) = 65471 (size of tso packet data)
> 65471 / 47 = 1393 (size of data per on-wire packet)
> 1393 + (40 + 10 + 2) = 1445 (mtu is data + header + options + pad)
>
> Once you set your MTU to 1445, you need a program that can get the
> stack to send a maximum-sized packet. With the congestion window that
> can be more difficult than it seems.
> I used some Python that sends enough data to open the window, sleeps
> long enough to drain all outstanding data, but not long enough for the
> congestion window to go stale and close again, then sends a bunch more
> data. It also helps to turn off delayed ACKs on the receiver.
> Sometimes you will not drain the entire send buffer because an ACK for
> the final chunk is still delayed when you start the second transmit.
> When the problem described in the PR hits, the EINVAL from
> bus_dmamap_load_mbuf_sg bubbles right up to userspace.
>
> At first I thought this was a driver bug rather than a stack bug. The
> code in question does what it is commented to do (limit the TSO packet
> so that ip->ip_len does not overflow). However, it also seems
> reasonable that the driver limits its DMA tag to 64K (do we really
> want it allocating another whole page just for the 14-byte link-local
> header?). Perhaps the TCP stack should ensure that the TSO packet +
> max_linkhdr stays within 64K. Comments?

Hmm, I think it's a driver bug. The upper stack may not know whether L2
includes a VLAN tag. Almost all drivers in the tree include the L2
header size in the DMA tag. If Ethernet hardware can handle these
oversized frames (64KB + L2 header) with TSOv4/TSOv6, I think there is
no reason not to support it.

> As an aside, the patch attached to the PR is also slightly wrong.
> Taking max_linkhdr into account when rounding the packet to be a
> multiple of MSS does not make sense; it should only be taken into
> account when calculating the max TSO length.
>
> Jeremiah Lott
> Avere Systems
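
The MTU arithmetic in the analysis above can be checked with a small sketch. It is a model of the description in this thread, not the actual tcp_output() code: the helper names are hypothetical, the limit is modeled as 65535 bytes (an assumption chosen so that the quoted 65522..65535 range comes out), and the per-packet overhead is the 40 + 10 + 2 bytes given above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed constants, taken from the numbers quoted in the thread. */
enum {
	LEN_LIMIT = 65535,	/* modeled 64K limit on ip_len / DMA tag */
	L2_HDR_LEN = 14,	/* Ethernet header without VLAN tag */
	TCPIP_HDR_LEN = 40,	/* IPv4 + TCP headers, no options */
	TS_OPT_LEN = 12		/* 10-byte timestamp option + 2 bytes pad */
};

/*
 * Largest TSO packet length (headers included, L2 header excluded)
 * whose payload is a whole multiple of the per-packet data size.
 */
static uint32_t
max_tso_len(uint32_t mtu)
{
	uint32_t hdr = TCPIP_HDR_LEN + TS_OPT_LEN;
	uint32_t mss = mtu - hdr;		/* data per on-wire packet */
	uint32_t nseg = (LEN_LIMIT - hdr) / mss;	/* whole segments that fit */

	return (nseg * mss + hdr);
}

/* True if adding the L2 header pushes the packet past the limit. */
static bool
exceeds_limit_with_l2(uint32_t mtu)
{
	return (max_tso_len(mtu) + L2_HDR_LEN > LEN_LIMIT);
}
```

Under these assumptions an MTU of 1445 gives 47 segments of 1393 bytes and a 65523-byte TSO packet, which no longer fits once the 14-byte header is added, while an MTU of 1500 tops out well below the limit.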
Re: Dell PowerEdge R820 Broadcom BCM57800 support
On Fri, Sep 07, 2012 at 01:45:59PM -0700, Sean Bruno wrote:
> On Thu, 2012-08-16 at 09:56 -0700, John wrote:
> > Hi Folks,
> >
> > I have an R820 I'm testing. The system seems to boot up fine, but no
> > network adapters show up. From pciconf -l :
> >
> > none4@pci0:1:0:0: class=0x02 card=0x1f5c1028 chip=0x168a14e4 rev=0x10 hdr=0x00
> > none5@pci0:1:0:1: class=0x02 card=0x1f5c1028 chip=0x168a14e4 rev=0x10 hdr=0x00
> > none6@pci0:1:0:2: class=0x02 card=0x1f671028 chip=0x168a14e4 rev=0x10 hdr=0x00
> > none7@pci0:1:0:3: class=0x02 card=0x1f671028 chip=0x168a14e4 rev=0x10 hdr=0x00
> >
> > which appears to be these:
> >
> > Broadcom BCM57800 NetXtreme II 10 GigE   1f5c
> > Broadcom BCM57800 NetXtreme II 1 GigE    1f67

The chip ID is 0x168a14e4, which indicates the vendor is Broadcom and
the device is a NetXtreme II BCM57800 10 Gigabit Ethernet controller. I
guess bxe(4) would be the right driver to pick up the controller, but
it seems there is no support for BCM57800 in bxe(4) at this moment.
Probably David can comment more on this (CCed).

> > Does anyone have any experience with these?
> >
> > Thanks,
> > John
>
> John:
>
> Hey, I'm currently testing a patchset that enables the use of the 1Gig
> adapter via bge(4). I'm not sure about the 10Gig adapter though, is
> that bxe(4)?
>
> At this time, there is no functional version of bge(4) that works on a
> stable release. You'd have the best luck compiling your own kernel
> from stable/9 and applying the following updates from
>
> http://people.freebsd.org/~yongari/bge/if_bge.c
> http://people.freebsd.org/~yongari/bge/if_bgereg.h
> http://people.freebsd.org/~yongari/bge/brgphy.c
>
> You'll need to overwrite brgphy.c in sys/dev/mii, move if_bge.c and
> if_bgereg.h to sys/dev/bge, and recompile your kernel.

bge(4) does not support Broadcom NetXtreme II BCM57800 controllers. It
wouldn't make any difference.
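
As a small illustration of how the chip= value from pciconf -l is read: the low 16 bits are the PCI vendor ID and the high 16 bits are the device ID. The helper names below are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: low 16 bits of "chip=" are the PCI vendor ID. */
static uint16_t
chip_vendor(uint32_t chip)
{
	return ((uint16_t)(chip & 0xffff));
}

/* Hypothetical helper: high 16 bits of "chip=" are the PCI device ID. */
static uint16_t
chip_device(uint32_t chip)
{
	return ((uint16_t)(chip >> 16));
}
```

With chip=0x168a14e4 this yields vendor 0x14e4 (Broadcom) and device 0x168a. The same split applied to chip=0x42001186 from the D-Link thread gives vendor 0x1186 and device 0x4200, matching the DLINK_VENDORID and DLINK_DEVICEID_RHINE_II values in that patch.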
Re: [patch] if_bxe shutdown fix
On Wed, Sep 05, 2012 at 12:11:14AM -0500, Mike Silbersack wrote:
> On 9/5/12 3:56 PM, YongHyeon PYUN wrote:
> > On Tue, Sep 04, 2012 at 11:35:13PM -0500, Mike Silbersack wrote:
> > > Does anyone want to review this patch before I check it in? The
> > > change has been reviewed and tested by coworkers, but not yet
> > > reviewed by any other FreeBSD committers.
> > >
> > > http://www.silby.com/patches/if_bxe.c-safestop.patch
> > >
> > > This resolves an issue we saw at work where IPMI would report bus
> > > errors when you rebooted a system with bxe NICs if you had not
> > > UP'd all of the bxe NICs before the shutdown.
> >
> > Yeah, I also have a similar patch. But I checked sc->state after
> > getting the BXE_CORE_LOCK, as the state is protected by the lock.
> >
> > > Thanks,
> > > Mike "Silby" Silbersack
>
> Good catch. How does this look?
>
> http://www.silby.com/patches/if_bxe.c-safestop-2.patch

Patch looks good to me.
Re: Broadcom NetXtreme BCM5719 support
On Wed, Jul 25, 2012 at 01:21:05PM -0700, Jason Wolfe wrote:
> On Thu, Jul 12, 2012 at 11:02 PM, Eugene M. Zheganin e...@norma.perm.ru wrote:
> > Hi.
> >
> > On 13.07.2012 04:39, Jason Wolfe wrote:
> > > bge0: Broadcom unknown BCM5719, ASIC rev 0x5719001 mem 0xf6bf-0xf6bf,0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 at device 0.0 on pci3
> > > bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> > > miibus0: MII bus on bge0
> > > bge0: Ethernet Address: xx:xx:xx:xx:xx:xx
> > > ...
> > > bge0: watchdog timeout -- resetting
> > > bge0: link state changed to UP
> > > bge0: link state changed to DOWN
> > > bge0: link state changed to UP
> > > bge0: link state changed to DOWN
> > > bge0: link state changed to UP
> > > bge0: link state changed to DOWN
> > > ...
> > > bge0@pci:0:3:0:0: class=0x02 card=0x169d103c chip=0x165714e4 rev=0x01 hdr=0x00
> > >     vendor = 'Broadcom Corporation'
> > >     device = 'NetXtreme BCM5719 Gigabit Ethernet PCIe'
> > >     class = network
> > >     subclass = ethernet
> > >
> > > Anything in the pipe on this one, or any access I can provide that
> > > might assist us?
> >
> > I've got a BCM5722 chip (and IBM x3250 mX systems), but the same
> > stuff with timeouts/resets. I can say 8.1/8.2 were more stable
> > concerning bge(4). You could try to switch off tso and vlanhwtso, at
> > least it did the trick for me (did it? not sure though. I was having
> > problems once a month on 8.1/8.2; after upgrading to 8.3-STABLE I
> > started having problems with it every day, literally; after ifconfig
> > bge0 -tso -vlanhwtso it's been running for 5 days now.)
> >
> > Eugene.
>
> Yeah, I had no luck even with all options disabled. The NIC constantly
> bounces, and ifconfig never reports anything but status: no carrier.
> No love on the driver side for this NetXtreme BCM5719 Gigabit Ethernet
> PCIe card?

See kern/171121. Give the latest WIP version a spin and let me know how
it goes on your box.

> Jason
Re: vr(4) troubles for AMD Geode CS5536 chipset
On Fri, Aug 31, 2012 at 12:45:53PM +0700, Eugene Grosbein wrote:
> In my previous letter I described my attempts to try vr(4) from HEAD.
> Now I'd like to explain why I tried it.
>
> The problem is that stock vr(4) from 8.3-STABLE/i386 has serious
> issues on my system. I have a home router with two vr interfaces: vr0
> is for LAN (IPoE) and vr1 is for WAN (PPPoE/mpd).
>
> Presently, every day my WAN vr interface stops running correctly:
> sometimes it stops receiving all packets - tcpdump shows none of them.
> Sometimes, it receives some but with great delay - up to 10 seconds
> (not milliseconds) and even more. tcpdump shows that the delay occurs
> on the receive path. Sometimes, it even rearranges packets - tcpdump
> shows that some incoming ICMP echo requests with lower sequence
> numbers come in later than already answered higher-numbered requests.

Hmm, it seems the driver's consumer/producer indices of the RX path
were corrupted.

> ifconfig vr1 down/up revives the interface completely until the next
> morning. sysctl net.inet.ip.fw.enable=0 does not solve the problem.
>
> I have control over the WAN switching/routing network and can assure
> you it runs just fine. However, I can't guarantee it has no soft
> anomalies like short storms or some silly broadcasts. I've tried to
> create an incoming flood with ng_source(4), generating a UDP flood at
> 100M rate for 60 seconds, and failed to reproduce the problem
> artificially.
>
> I've tried to move WAN from vr1 to vr0 and the problem moved to vr0
> too. My LAN has very little traffic and the corresponding vr interface
> exhibits no problems.
>
> This router also routinely runs transmission (torrent client from
> ports) serving torrents from a USB-attached HDD, causing severe CPU
> load, so I suspect the problem may be related to CPU load.
> I've also checked mbuf/mbuf cluster usage and they are all right:
>
> # netstat -m
> 1539/2076/3615 mbufs in use (current/cache/total)
> 1200/1278/2478/65536 mbuf clusters in use (current/cache/total/max)
> 1200/306 mbuf+clusters out of packet secondary zone in use (current/cache)
> 318/181/499/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 4056K/3799K/7855K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/4/6656 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines
>
> # vmstat -z | egrep -i 'ITEM|mbuf'
> ITEM                   SIZE   LIMIT   USED   FREE   REQUESTS  FAILURES
> mbuf_packet:            256,      0,  1429,    77, 112854470,        0
> mbuf:                   256,      0,   489,  1620, 369073316,        0
> mbuf_cluster:          2048,  65536,  1506,   604,   5401864,        0
> mbuf_jumbo_page:       4096,  12800,   469,   158,   8306777,        0
> mbuf_jumbo_9k:         9216,   6400,     0,     0,         0,        0
> mbuf_jumbo_16k:       16384,   3200,     0,     0,         0,        0
> mbuf_ext_refcnt:          4,      0,     0,     0,         0,        0
> NetGraph items:          36,   4130,     1,   117,    263123,        0
> NetGraph data items:     36,    531,     0,   295, 106663377,        0
>
> While ifconfig vr1 down/up solves the problem completely (for some
> long time), taking the link down/up using the switch solves it only by
> half - the huge packet delays disappear and turn into 25% packet loss
> happening in regular short intervals, once a second or so. ifconfig
> down/up clears this mess too.
>
> Please help me to debug this, it's pretty annoying.

By chance, did vr(4) spew some kind of diagnostic messages to the
console? If I remember correctly, vr(4) automatically restarts the
controller and shows these errors when it detects an abnormal
condition.
Abnormal conditions for vr(4) would be:
 - TX/RX MAC stuck
 - RX MAC stop due to FIFO overflow or no RX buffers
 - PCI bus errors
 - TX abort
 - TX underrun

> I had a hope new vr(4) driver would help but it takes my system down
> under average load and is unusable.
>
> Here is start of dmesg.boot:
>
> Copyright (c) 1992-2012 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>     The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 8.3-STABLE #1: Wed Aug 29 22:49:45 NOVT 2012
>     r...@grosbein.pp.ru:/usr/local/obj/nanobsd.gw/i386/usr/local/src/sys/GW i386
> Timecounter i8254 frequency 1193182 Hz quality 0
> CPU: Geode(TM) Integrated Processor by AMD PCS (499.91-MHz 586-class CPU)
>   Origin = AuthenticAMD  Id = 0x5a2  Family = 5  Model = a  Stepping = 2