Re: watchdog timeout problem
On Thu, Nov 02, 2017 at 10:13:15AM -0400, Ernie Luzar wrote:
> Posted this 10/31/2017 got no reply.
>
> Been getting these error messages since about release 10.0 I think.
> Have changed to new hardware box and new cable modem and still having
> the same error messages. Also occurs when I use em0 interface to connect
> to the public internet instead of vge0.
>
> vge0: flags=8843> metric 0 mtu 1500
> options=389b WOL_UCAST,WOL_MCAST,WOL_MAGIC>
> ether 00:0b:db:19:33:18
> hwaddr 10:00:60:21:00:93
> inet xxx.xxx.xxx.xxx netmask 0xf000
> broadcast 255.255.255.255
> nd6 options=29
> media: Ethernet autoselect (1000baseT )
> status: active
>
> Oct 30 23:43:38 fbsd kernel: vge0: watchdog timeout
> Oct 30 23:43:38 fbsd kernel: vge0: link state changed to DOWN
> Oct 30 23:43:42 fbsd kernel: vge0: link state changed to UP

[...]

Would you show me the output of dmesg?

>
> 11/2/2017 posting this now as a update
>
> I have continued to research this problem.
> The "man watchdog" says that the command,
> watchdog -d will provide debugging info, and
> watchdog -t will set a new timeout timer value
>
> When I issue either of those commands I get this error message
> watchdog: patting the dog: Operation not supported
>
> The man page also says a value of -t 0 disables the watchdog function.
>
> Issuing "watchdog -t 0" does not get that above error message, but the
> watchdog function is still enabled because I am still getting the
>
> kernel: vge0: watchdog timeout
> kernel: vge0: link state changed to DOWN
> kernel: vge0: link state changed to UP
>
> messages.
>

The watchdog timeout message is generated when vge(4) didn't see transmit completion interrupts. It has nothing to do with watchdog(8) or watchdog(4).

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
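For readers wondering how a driver-level transmit watchdog differs from watchdog(8), the following is a minimal standalone sketch of the usual pattern: a countdown is armed when a frame is queued, cleared when a transmit completion is seen, and fires if it runs out. It is not the vge(4) source; the names tx_watchdog, tx_start, tx_complete, tx_tick and the 5-tick value are assumptions chosen only for illustration.

```c
#include <stdbool.h>

/* Hypothetical value: ticks allowed between queuing a frame and its completion. */
#define WATCHDOG_TICKS 5

struct tx_watchdog {
	int timer;	/* 0 means disarmed (no transmit pending) */
};

/* Arm the countdown when a frame is handed to the hardware. */
static void
tx_start(struct tx_watchdog *wd)
{
	wd->timer = WATCHDOG_TICKS;
}

/* Disarm the countdown once a transmit completion has been seen. */
static void
tx_complete(struct tx_watchdog *wd)
{
	wd->timer = 0;
}

/*
 * Called once per tick. Returns true when the countdown expires, which is
 * the point where a real driver would typically log "watchdog timeout"
 * and reinitialize the controller.
 */
static bool
tx_tick(struct tx_watchdog *wd)
{
	if (wd->timer == 0 || --wd->timer > 0)
		return (false);
	return (true);
}
```

Calling tx_start() and then tx_tick() WATCHDOG_TICKS times without an intervening tx_complete() makes the last tx_tick() return true; none of this involves the system watchdog device.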
Re: Problem with re(4) Ethernet driver has resurfaced in 11-STABLE and HEAD
On Sun, May 21, 2017 at 09:29:42AM +, Thomas Mueller wrote:
> from YongHyeon PYUN:
>
> > [removed stable@ from CC]
>
> > > I recently updated my 10.1-STABLE to 11.0-STABLE and find I can no longer
> > > connect with the Ethernet.
>
> > > dhclient re0 produces
>
> > > DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 4
> > > DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 11
> > > DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 19
> > > DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 7
> > > DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 14
> > > DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 5
> > > No DHCPOFFERS received.
> > > No working leases in persistent database - sleeping.
>
> > If you assign an static IPv4 address to re(4) are you able to use
> > the network interface?
>
> > AFAIK there was no significant re(4) changes for a long time. Could
> > you show us back trace information?
>
> Problem with re(4) reappeared in both 11.0-STABLE and HEAD, but OK to trim
> stable@ since changes/fixes would go to HEAD first.
>
> No connection with static IPv4 address.
>
> Where do I get back trace information?
>

https://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html

> Problem was more severe with HEAD in that OS immediately crashed into
> debugger, while in 11.0-STABLE, only the connection failed but may have left
> memory unstable.
>

I guess you have two issues: no connection with a static IPv4 address, and a kernel crash. Can't you obtain a kernel crash dump when you encounter the kernel panic? If there is no crash dump, you are probably missing some kernel configuration.

> I can still connect on that computer with Hiro H50191 USB wireless adapter,
> driver rsu.
>
> Tom
>
Re: Problem with re(4) Ethernet driver has resurfaced in 11-STABLE and HEAD
On Thu, May 18, 2017 at 07:04:51AM +, Thomas Mueller wrote:

[removed stable@ from CC]

> I recently updated my 10.1-STABLE to 11.0-STABLE and find I can no longer
> connect with the Ethernet.
>
> dhclient re0 produces
>
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 4
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 11
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 19
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 7
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 14
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 5
> No DHCPOFFERS received.
> No working leases in persistent database - sleeping.
>

If you assign a static IPv4 address to re(4), are you able to use the network interface?

> uname -a shows
>
> FreeBSD amelia2 11.0-STABLE FreeBSD 11.0-STABLE #1 r317932: Mon May 8
> 23:23:37 UTC 2017 root@amelia2:/usr/obj/usr/src11/sys/SANDY11NC amd64
>
> Relevant lines from /var/run/dmesg.boot are
>
> re0: port
> 0xe000-0xe0ff mem 0xf7d04000-0xf7d04fff,0xf7d0-0xf7d03fff irq 17 at
> device 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x0010
> miibus0: on re0
> rgephy0: PHY 1 on miibus0
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX,
> 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master,
> 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow,
> 1000baseT-FDX-flow-master, auto, auto-flow
> re0: Using defaults for TSO: 65518/35/2048
> re0: Ethernet address: d4:3d:7e:97:17:e2
> re0: netmap queues/slots: TX 1/256, RX 1/256
>
> Problem shows much quicker in my recent build of HEAD (12-current), where
> dhclient re0
> gives just a couple lines screen output before crashing into debugger
> db> prompt
>

AFAIK there were no significant re(4) changes for a long time. Could you show us back trace information?

Thanks.
Re: CURRENT: re(4) crashing system
On Sat, Nov 19, 2016 at 07:44:35PM +0100, O. Hartmann wrote: > Am Mon, 7 Nov 2016 11:16:23 +0900 > YongHyeon PYUN <pyu...@gmail.com> schrieb: > > > On Sun, Nov 06, 2016 at 01:20:36PM +0100, Hartmann, O. wrote: > > > On Mon, 31 Oct 2016 11:12:22 +0900 > > > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > > > > > On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote: > > > > > On Thu, 27 Oct 2016 10:00:04 +0900 > > > > > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > > > > > > > > > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote: > > > > > > > On Tue, 25 Oct 2016 11:05:38 +0900 > > > > > > > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > > I'm not sure but it's likely the issue is related with > > > > > > > > EEE/Green Ethernet handling. EEE is negotiated feature with > > > > > > > > link partner. If you directly connect your laptop to non-EEE > > > > > > > > capable link partner like other re(4) box without switches > > > > > > > > you may be able to tell whether the issue is EEE/Green > > > > > > > > Ethernet related one or not. > > > > > > > > > > > > > > Me either since when I discovered a problem the first time with > > > > > > > CURRENT, that was the Friday before last week's Friday, there > > > > > > > was a unlucky coicidence: I got the new switch, FreeBSD > > > > > > > introduced a serious bug and I changed the NICs. > > > > > > > > > > > > > > The laptop, the last in the row of re(4) equipted systems on > > > > > > > which I use the Realtek NIC, does well now with Green IT > > > > > > > technology, but crashes on plugging/unplugging - not on each > > > > > > > event, but at least in one of ten. > > > > > > > > > > > > Hmm, it seems you know how to trigger the issue. When you unplug > > > > > > UTP cable was there active network traffic on re(4) device? > > > > > > It would be helpful to know which event triggers the crash(e.g. > > > > > > unplugging or plugging). 
And would you show me backtrace of > > > > > > panic? > > > > > > > I guess the Green IT issue is more a unlucky guess of mine and > > > > > > > went hand in hand with the problem I face with CURRENT right > > > > > > > now on some older, Non UEFI machines. > > > > > > > > > > > > > > > > > > > Ok. > > > > > > > > > > > > [...] > > > > > > > > > > > > > > As requested the informations about re0 and rgephy0 on the > > > > > > > laptop (Lenovo E540) > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > > rgephy0: PHY 1 on miibus0 > > > > > > > rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, > > > > > > > 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, > > > > > > > 1000baseT-FDX-master, 1000baseT-FDX-flow, > > > > > > > 1000baseT-FDX-flow-master, auto, auto-flow > > > > > > > > > > > > > > re0: > > > > > > > port 0x3000-0x30ff mem > > > > > > > 0xf0d04000-0xf0d04fff,0xf0d0-0xf0d03fff at device 0.0 on > > > > > > > pci2 re0: Using 1 MSI-X message re0: ASPM disabled re0: Chip > > > > > > > rev. 0x5080 re0: MAC rev. 0x0010 > > > > > > > > > > > > This looks like 8168GU controller. > > > > > > > > > > > > [...] > > > > > > > > > > > > > I use options netmap in kernel config, but the problem is also > > > > > > > present without this option - just for the record. > > > > > > > > > > > > > > > > > > > Yup, netmap(4) has nothing to do with the crash. > > > > > > > > > > > > Thanks. > > > &g
Re: CURRENT: re(4) crashing system
On Sun, Nov 06, 2016 at 01:20:36PM +0100, Hartmann, O. wrote: > On Mon, 31 Oct 2016 11:12:22 +0900 > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote: > > > On Thu, 27 Oct 2016 10:00:04 +0900 > > > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > > > > > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote: > > > > > On Tue, 25 Oct 2016 11:05:38 +0900 > > > > > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > > > > > > > > > > > [...] > > > > > > > > > > I'm not sure but it's likely the issue is related with > > > > > > EEE/Green Ethernet handling. EEE is negotiated feature with > > > > > > link partner. If you directly connect your laptop to non-EEE > > > > > > capable link partner like other re(4) box without switches > > > > > > you may be able to tell whether the issue is EEE/Green > > > > > > Ethernet related one or not. > > > > > > > > > > Me either since when I discovered a problem the first time with > > > > > CURRENT, that was the Friday before last week's Friday, there > > > > > was a unlucky coicidence: I got the new switch, FreeBSD > > > > > introduced a serious bug and I changed the NICs. > > > > > > > > > > The laptop, the last in the row of re(4) equipted systems on > > > > > which I use the Realtek NIC, does well now with Green IT > > > > > technology, but crashes on plugging/unplugging - not on each > > > > > event, but at least in one of ten. > > > > > > > > Hmm, it seems you know how to trigger the issue. When you unplug > > > > UTP cable was there active network traffic on re(4) device? > > > > It would be helpful to know which event triggers the crash(e.g. > > > > unplugging or plugging). And would you show me backtrace of > > > > panic? > > > > > I guess the Green IT issue is more a unlucky guess of mine and > > > > > went hand in hand with the problem I face with CURRENT right > > > > > now on some older, Non UEFI machines. > > > > > > > > > > > > > Ok. > > > > > > > > [...] 
> > > > > > > > > > As requested the informations about re0 and rgephy0 on the > > > > > laptop (Lenovo E540) > > > > > > > > > > [...] > > > > > > > > > > rgephy0: PHY 1 on miibus0 > > > > > rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, > > > > > 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, > > > > > 1000baseT-FDX-master, 1000baseT-FDX-flow, > > > > > 1000baseT-FDX-flow-master, auto, auto-flow > > > > > > > > > > re0: > > > > > port 0x3000-0x30ff mem > > > > > 0xf0d04000-0xf0d04fff,0xf0d0-0xf0d03fff at device 0.0 on > > > > > pci2 re0: Using 1 MSI-X message re0: ASPM disabled re0: Chip > > > > > rev. 0x5080 re0: MAC rev. 0x0010 > > > > > > > > This looks like 8168GU controller. > > > > > > > > [...] > > > > > > > > > I use options netmap in kernel config, but the problem is also > > > > > present without this option - just for the record. > > > > > > > > > > > > > Yup, netmap(4) has nothing to do with the crash. > > > > > > > > Thanks. > > > > > > Attached, you'll find the backtrace of the crash. This time it was > > > really easy - just one pull of the LAN cabling - and we are > > > happy :-/ > > > > > > Please let me know if you need something else. I will return to > > > normal operations (disabling debugging) due to CURRENT is very > > > unstable at the moment on other hosts beyond r307157. > > > > > > > It seems the attachment was stripped. > > This time I hope I got it right! > > Attached you'll find the latest CURRENT's backtrace on the provoked > crash (plug and unplug). > > I also saved the kernel and coredump, so if you need me to do further > investigations,please let me know. > Thanks a lot for the backtrace. This backtrace is not the one I expected and I guess the issue is related with cached route removal on interface down. Quick looking over the code didn't reveal the cause of crash(I'm not familiar with that part code). Probably gnn@ may have better idea what's going on here(CCed). Thanks. 
Re: CURRENT: re(4) crashing system
On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote: > On Thu, 27 Oct 2016 10:00:04 +0900 > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote: > > > On Tue, 25 Oct 2016 11:05:38 +0900 > > > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > > > > > [...] > > > > > > I'm not sure but it's likely the issue is related with EEE/Green > > > > Ethernet handling. EEE is negotiated feature with link partner. If > > > > you directly connect your laptop to non-EEE capable link partner > > > > like other re(4) box without switches you may be able to tell > > > > whether the issue is EEE/Green Ethernet related one or not. > > > > > > Me either since when I discovered a problem the first time with > > > CURRENT, that was the Friday before last week's Friday, there was a > > > unlucky coicidence: I got the new switch, FreeBSD introduced a > > > serious bug and I changed the NICs. > > > > > > The laptop, the last in the row of re(4) equipted systems on which I > > > use the Realtek NIC, does well now with Green IT technology, but > > > crashes on plugging/unplugging - not on each event, but at least in > > > one of ten. > > > > Hmm, it seems you know how to trigger the issue. When you unplug > > UTP cable was there active network traffic on re(4) device? > > It would be helpful to know which event triggers the crash(e.g. > > unplugging or plugging). And would you show me backtrace of panic? > > > > > I guess the Green IT issue is more a unlucky guess of mine and went > > > hand in hand with the problem I face with CURRENT right now on some > > > older, Non UEFI machines. > > > > > > > Ok. > > > > [...] > > > > > > As requested the informations about re0 and rgephy0 on the laptop > > > (Lenovo E540) > > > > > > [...] 
> > > > > > rgephy0: PHY 1 on miibus0 > > > rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, > > > 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, > > > 1000baseT-FDX-master, 1000baseT-FDX-flow, > > > 1000baseT-FDX-flow-master, auto, auto-flow > > > > > > re0: > > > port 0x3000-0x30ff mem 0xf0d04000-0xf0d04fff,0xf0d0-0xf0d03fff > > > at device 0.0 on pci2 re0: Using 1 MSI-X message re0: ASPM disabled > > > re0: Chip rev. 0x5080 > > > re0: MAC rev. 0x0010 > > > > This looks like 8168GU controller. > > > > [...] > > > > > I use options netmap in kernel config, but the problem is also > > > present without this option - just for the record. > > > > > > > Yup, netmap(4) has nothing to do with the crash. > > > > Thanks. > > Attached, you'll find the backtrace of the crash. This time it was > really easy - just one pull of the LAN cabling - and we are happy :-/ > > Please let me know if you need something else. I will return to normal > operations (disabling debugging) due to CURRENT is very unstable at the > moment on other hosts beyond r307157. > It seems the attachment was stripped. ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: CURRENT: re(4) crashing system
On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote:
> On Tue, 25 Oct 2016 11:05:38 +0900
> YongHyeon PYUN <pyu...@gmail.com> wrote:
> [...]
> > I'm not sure but it's likely the issue is related with EEE/Green
> > Ethernet handling. EEE is negotiated feature with link partner. If
> > you directly connect your laptop to non-EEE capable link partner
> > like other re(4) box without switches you may be able to tell
> > whether the issue is EEE/Green Ethernet related one or not.
>
> Me either since when I discovered a problem the first time with
> CURRENT, that was the Friday before last week's Friday, there was an
> unlucky coincidence: I got the new switch, FreeBSD introduced a serious
> bug and I changed the NICs.
>
> The laptop, the last in the row of re(4) equipped systems on which I
> use the Realtek NIC, does well now with Green IT technology, but
> crashes on plugging/unplugging - not on each event, but at least in one
> of ten.

Hmm, it seems you know how to trigger the issue. When you unplugged the UTP cable, was there active network traffic on the re(4) device? It would be helpful to know which event triggers the crash (e.g. unplugging or plugging). And would you show me the backtrace of the panic?

> I guess the Green IT issue is more an unlucky guess of mine and went
> hand in hand with the problem I face with CURRENT right now on some
> older, Non UEFI machines.
>

Ok.

[...]

>
> As requested the information about re0 and rgephy0 on the laptop
> (Lenovo E540)
>
> [...]
>
> rgephy0: PHY 1 on miibus0
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX,
> 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master,
> 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
>
> re0: port
> 0x3000-0x30ff mem 0xf0d04000-0xf0d04fff,0xf0d0-0xf0d03fff at device
> 0.0 on pci2 re0: Using 1 MSI-X message re0: ASPM disabled
> re0: Chip rev. 0x5080
> re0: MAC rev. 0x0010

This looks like an 8168GU controller.

[...]

> I use options netmap in kernel config, but the problem is also present
> without this option - just for the record.
>

Yup, netmap(4) has nothing to do with the crash.

Thanks.
Re: CURRENT: re(4) crashing system
On Mon, Oct 24, 2016 at 02:03:37PM +0200, O. Hartmann wrote: > On Mon, 24 Oct 2016 14:14:00 +0900 > YongHyeon PYUN <pyu...@gmail.com> wrote: > > > On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote: > > > I tried to report earlier here that CURRENT does have some serious > > > problems right now and one of those problems seems to be triggered by > > > the recent re(4) driver. The problem is also present in recen 11-STABLE! > > > > > > Below, you'll find pciconf-output reagrding the device on a Lenovo E540 > > > Laptop I can test on and trigger the problem. > > > > > > The phenomenon is that this NIC does not negotiate 1000baseTX, it is > > > always falling back to 100baseTX although the device claims to be a 1 > > > GBit capable device. > > > > > > When I try to put the device manually into 1000basTX mode via > > > > > > ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver) > > > > > > it is possible to crash the system. The system also crashes when > > > plugging/unplugging the LAN cord - I guess the renegotiation is > > > triggering this crash immediately. > > > > > > I tried with several switches and routers capable of 1 GBit and it > > > seems to be independent from the network hardware in use. > > > > > > I tried to capture a backtrace when the kernel crashes, but I do not > > > know how to save the the kernel debugger output. Although I configured > > > according the handbook debugging, there is no coredump at all. > > > > > > Advice is appreciated - if anybody is interesetd in solving this. > > > > > > > There were several instability reports on re(4). I vaguely guess > > it would be related with some missing initializations for certain > > controllers. Unfortunately, there is no publicly available > > datasheet for those controllers and it's not likely to get access > > to it in near future. It seems vendor's FreeBSD driver accesses > > lots of magic registers as well as loading DSP fixups. 
I have no > > idea what it wants to do and re(4) used to heavily rely on power-on > > default register values. Engineering samples I have do not show > > instabilities so it wouldn't be easy to identify the issue. > > > > Probably the first step to address the issue would be identifying > > those chips and narrowing down the scope of guessing. Would you > > show me the dmesg output(re(4) and regphy(4) only)? pciconf(8) > > output is useless here since RealTek uses the same PCI id for > > PCIe variants. > > > > BTW, I was told that the vendor's FreeBSD driver seems to work fine > > for normal usage pattern. The vendor's driver triggered an instant > > panic and lacked H/W offloading features in the past. It might > > have changed though. > > The problemacy with re(4) drivers arose again, when I bought some "green" > equipment, mainly switches, which reduces power emission on short cables or > non-connected ports. This brought down some servers with re(4) chipsets > immediately and I had no clue what happend. I do not know whether this is a I'm not sure but it's likely the issue is related with EEE/Green Ethernet handling. EEE is negotiated feature with link partner. If you directly connect your laptop to non-EEE capable link partner like other re(4) box without switches you may be able to tell whether the issue is EEE/Green Ethernet related one or not. > single fate so to speak, or this problem will arise for others, too. We > exchanged on serving hardware all Realtek NICs with those from Intel, and > luckily some server mainboards already have Intel PHY or NICs. The Broadcom > devices we have on some older Fujitus hardware is also stable like a charme, > even with the new power saving switches. > bge(4) also lacks EEE support(Publicly available datasheet is too sanitized one). bge(4) firmware probably does not announce EEE capability by default in link establishment while recent re(4) devices seem to unconditionally announce EEE. 
Generally EEE handling requires a kind of handshake for link state change from MAC/PHY. > While we can swap on server or workstation platforms the NIC, it is almost > impossible on laptops and the number of laptops with realtek chips seems to > grow. It is a pity that the venodr of the chipsets reject supporting other > OSes > than Windows - or in some rare cases only Linux. After you wrote the answer, I > checked on the net who's suiatble drivers and the situation seems bad for > almost all OSes apart from commercial ones like Windooze and Apple OS X. > > As so
Re: CURRENT: re(4) crashing system
On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote:
> I tried to report earlier here that CURRENT does have some serious
> problems right now and one of those problems seems to be triggered by
> the recent re(4) driver. The problem is also present in recent 11-STABLE!
>
> Below, you'll find pciconf-output regarding the device on a Lenovo E540
> Laptop I can test on and trigger the problem.
>
> The phenomenon is that this NIC does not negotiate 1000baseTX, it is
> always falling back to 100baseTX although the device claims to be a 1
> GBit capable device.
>
> When I try to put the device manually into 1000baseTX mode via
>
> ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver)
>
> it is possible to crash the system. The system also crashes when
> plugging/unplugging the LAN cord - I guess the renegotiation is
> triggering this crash immediately.
>
> I tried with several switches and routers capable of 1 GBit and it
> seems to be independent from the network hardware in use.
>
> I tried to capture a backtrace when the kernel crashes, but I do not
> know how to save the kernel debugger output. Although I configured
> according the handbook debugging, there is no coredump at all.
>
> Advice is appreciated - if anybody is interested in solving this.
>

There were several instability reports on re(4). My vague guess is that they are related to some missing initialization for certain controllers. Unfortunately, there is no publicly available datasheet for those controllers, and it is unlikely that we will get access to one in the near future. It seems the vendor's FreeBSD driver accesses lots of magic registers as well as loading DSP fixups. I have no idea what it wants to do, and re(4) used to rely heavily on power-on default register values. The engineering samples I have do not show instabilities, so it won't be easy to identify the issue.

Probably the first step to address the issue would be identifying those chips and narrowing down the scope of guessing. Would you show me the dmesg output (re(4) and rgephy(4) only)? pciconf(8) output is useless here since RealTek uses the same PCI id for PCIe variants.

BTW, I was told that the vendor's FreeBSD driver seems to work fine for normal usage patterns. The vendor's driver triggered an instant panic and lacked H/W offloading features in the past. It might have changed though.
Re: Realtek 8168/8111 if_re not working in current r295091
On Wed, Feb 03, 2016 at 08:57:01PM +0100, s@web.de wrote:
> After updating -current at Jan, 31st (r295091) the Realtek ethernet device
> driver of my Zotac ZBox RI323 mini pc seems to be broken: I can neither
> connect to the host even though the interface is shown as active, nor can I
> initiate connection from the host through re0.
> Reverting the kernel to my previous build -current r290151 (install date Nov
> 1st, 2015) the re0 interface is working OK.
>
> Looking through the svn logs regarding /head/sys/dev/re/if_re.c I suspect,
> that Revision 290566 might have something to do with this and that I have to
> include my Realtek Chipset to the exclusion list for "enabling RX/TX after
> initial configuration" (or vice versa; I am really confused here), but I haven't
> got a clue how; as I do not know how to find the right RL_HWREV_XXX flag for
> my device.
>

You can get the Chip/MAC revision information from the dmesg output (dmesg | grep re0 would do).
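As a small illustration of what to look for in that output, here is a hedged sketch that pulls the hexadecimal value out of a "Chip rev." line as it is printed in the dmesg excerpts quoted elsewhere in this archive. The helper name parse_chip_rev is made up for this example, the "re0:" prefix is hard-coded on purpose, and mapping the value to an RL_HWREV_* constant is not attempted here.

```c
#include <stdio.h>

/*
 * Extract the hexadecimal value from a line such as
 * "re0: Chip rev. 0x2c80".
 * Returns 1 on success, 0 if the line has a different shape.
 */
static int
parse_chip_rev(const char *line, unsigned int *rev)
{
	/* %x accepts an optional "0x" prefix, like strtoul() with base 16. */
	return (sscanf(line, "re0: Chip rev. %x", rev) == 1);
}
```

Feeding it each line of `dmesg | grep re0` and keeping the first match would give the value to compare against the driver's revision table.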
Re: Kernel panic with fresh current, probably nfs related
On Sat, Aug 22, 2015 at 11:25:58AM -0700, Sean Bruno wrote:

I'm going to guess that you're using an em net driver, since that is the only one that sets if_hw_tsomax > IP_MAXPACKET (65535) from what I can see.

Sean, EM_TSO_SIZE is defined as (65535 + sizeof(struct ether_vlan_header)), which makes it > IP_MAXPACKET. The value of if_hw_tsomax must be <= IP_MAXPACKET and I'm guessing this is what caused the above panic. (Someday it would be nice if TSO segments > IP_MAXPACKET could be handled, but that will take changes in the ip layer and router software so that a bogus ip_len field doesn't cause problems.)

if_hw_tsomax needs to be the maximum segment size that the driver can accept from IP. Since the driver adds any MAC header after accepting the TSO segment from the IP layer, it shouldn't include MAC header(s) in the value for if_hw_tsomax. (If its limit includes MAC header(s), it needs to subtract those out when setting if_hw_tsomax, not add them.)

Since I am working up a patch for the value of if_hw_tsomaxsegcount, I think I'll add a check for > IP_MAXPACKET for if_hw_tsomax as well.

rick

Huh, ok. You want to try something like this then?

sean

Index: if_em.h
===
--- if_em.h (revision 286991)
+++ if_em.h (working copy)
@@ -268,7 +268,7 @@
 #define EM_MAX_SCATTER 64
 #define EM_VFTA_SIZE 128
-#define EM_TSO_SIZE (65535 + sizeof(struct ether_vlan_header))
+#define EM_TSO_SIZE (65535 - sizeof(struct ether_vlan_header))
 #define EM_TSO_SEG_SIZE 4096 /* Max dma segment size */
 #define EM_MSIX_MASK 0x01F0 /* For 82574 use */
 #define EM_MSIX_LINK 0x0100 /* For 82574 use */

I don't remember the TSO details of em(4) controllers at this moment (it has been a long time since I last touched it), but I think the controller has no additional limit on TSO size (it claims the controller supports MS Large Send Offload, so it should support up to a 64KB IP datagram), so the change would be sub-optimal. I've attached a new diff. It was not tested though; I don't have em(4) controllers.

Index: if_lem.h
===
--- if_lem.h (revision 286991)
+++ if_lem.h (working copy)
@@ -238,7 +238,7 @@
 #define EM_MAX_SCATTER 64
 #define EM_VFTA_SIZE 128
-#define EM_TSO_SIZE (65535 + sizeof(struct ether_vlan_header))
+#define EM_TSO_SIZE (65535 - sizeof(struct ether_vlan_header))
 #define EM_TSO_SEG_SIZE 4096 /* Max dma segment size */
 #define EM_MSIX_MASK 0x01F0 /* For 82574 use */
 #define ETH_ZLEN 60

I think lem(4) does not support TSO, so the change would have no effect. Actually, all references to TSO in lem(4) should be removed, I guess.

Index: sys/dev/e1000/if_em.c
===
--- sys/dev/e1000/if_em.c (revision 287087)
+++ sys/dev/e1000/if_em.c (working copy)
@@ -3044,7 +3044,7 @@ em_setup_interface(device_t dev, struct adapter *a
 if_setioctlfn(ifp, em_ioctl);
 if_setgetcounterfn(ifp, em_get_counter);
 /* TSO parameters */
- ifp->if_hw_tsomax = EM_TSO_SIZE;
+ ifp->if_hw_tsomax = IP_MAXPACKET;
 ifp->if_hw_tsomaxsegcount = EM_MAX_SCATTER;
 ifp->if_hw_tsomaxsegsize = EM_TSO_SEG_SIZE;
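To make the arithmetic in this exchange concrete, here is a small userland sketch. It assumes an 18-byte VLAN-tagged Ethernet header and uses a local stand-in struct (vlan_hdr) instead of the kernel's struct ether_vlan_header; clamp_tsomax is a hypothetical helper, not something taken from either diff.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest value the 16-bit ip_len field can describe. */
#define IP_MAXPACKET 65535

/*
 * Local stand-in for the kernel's VLAN-tagged Ethernet header:
 * two MAC addresses, the 802.1Q TPID, the tag, and the EtherType.
 */
struct vlan_hdr {
	uint8_t  dhost[6];
	uint8_t  shost[6];
	uint16_t encap_proto;
	uint16_t tag;
	uint16_t proto;
};

/* Shape of the old definition: the MAC header is added, not subtracted. */
#define TSO_SIZE_OLD (65535 + sizeof(struct vlan_hdr))

/* Hypothetical helper: never advertise more than IP can describe. */
static size_t
clamp_tsomax(size_t hw_limit)
{
	return (hw_limit > IP_MAXPACKET ? IP_MAXPACKET : hw_limit);
}
```

With these definitions TSO_SIZE_OLD comes to 65553, which is 18 bytes over IP_MAXPACKET, while clamp_tsomax(TSO_SIZE_OLD) yields 65535; that is the same value the if_em.c hunk assigns directly.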
Re: E1000 mbuf leaks
On Mon, Jul 27, 2015 at 01:02:32PM +0200, Hans Petter Selasky wrote:
> Hi,
>
> I'm currently doing some busdma work, and possibly stepped over some
> driver bugs. When bus_dmamap_load_mbuf_sg() returns ENOMEM the mbuf chain
> is not freed. Is there some magic in bus_dmamap_load_mbuf_sg() for that
> error code or is there a possible memory leak in all E1000 drivers?
>
> See attached patch.

I don't think it's an mbuf leak, since lem(4) just prepends the mbuf to the if sendq (the driver will retry it later). But I think your patch looks more correct from a bus_dma(9) perspective. If bus_dmamap_load_mbuf_sg(9) returns an error other than EFBIG, it would be correct for lem(4) to free the mbuf chain rather than retrying bus_dmamap_load_mbuf_sg(9) later, which will fail again with ENOMEM.
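The policy suggested in the reply can be written down as a tiny decision function. This is only a sketch of that policy under the assumption that EFBIG means the chain needs more segments than the DMA tag allows; it is not code from lem(4) or from the attached patch, and the enum and function names are invented for the example.

```c
#include <errno.h>

/* Hypothetical outcomes for a failed or successful DMA load of a TX mbuf chain. */
enum tx_action {
	TX_SEND,		/* load succeeded: hand the segments to the hardware */
	TX_DEFRAG_RETRY,	/* EFBIG: collapse/defragment the chain and load again */
	TX_DROP_FREE		/* any other error: free the chain and report the error */
};

/* Map the return value of the DMA load to the action described in the reply. */
static enum tx_action
tx_action_for(int error)
{
	switch (error) {
	case 0:
		return (TX_SEND);
	case EFBIG:
		return (TX_DEFRAG_RETRY);
	default:
		/* Includes ENOMEM: requeueing would just fail the same way later. */
		return (TX_DROP_FREE);
	}
}
```

Keeping the decision in one place like this makes it easy to see that ENOMEM ends in a free rather than in a requeue.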
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Thu, Oct 02, 2014 at 02:07:30PM +0900, Yonghyeon PYUN wrote:
> On Wed, Oct 01, 2014 at 10:36:37AM +0900, Yonghyeon PYUN wrote:
> > On Tue, Sep 30, 2014 at 10:57:41AM +0900, Yonghyeon PYUN wrote:
> > > Hi,
> > >
> > > I've added support for QAC AR816x/AR817x ethernet controllers.
> > > It passed my limited testing and I need more testers.
> > > You can find patches from the following URLs.
> > >
> > > http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> > > and
> > > http://people.freebsd.org/~yongari/alc/alc.diff.20140930
> > >
> > > pci.quirk.diff is to work around a silicon bug of AR816x. Without it
> > > MSI/MSIX interrupt wouldn't work. If you just want to use legacy
> > > INTx interrupt you don't have to apply it but you have to tell
> > > alc(4) not to use MSI/MSIX interrupt with tunables
> > > (hw.alc.msi_disable and hw.alc.msix_disable).
> > > alc.diff.20140930 will add support for AR8161/AR8162/AR8171/AR8172
> > > and E2200 controllers. It supports all hardware features except
> > > RSS.
> > >
> > > If you have any QAC AR816x/AR817x or old AR813x/AR815x controllers
> > > please test and report how the diff works for you.
> > >
> > > Thanks.
> >
> > http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> > http://people.freebsd.org/~yongari/alc/alc.diff.20141001
> >
> > Patch updated to address link establishment issue.
>
> http://people.freebsd.org/~yongari/alc/alc.diff.20141002
> Patch updated again to correct wrong lock assertion.

FYI:
I've committed all the changes required to support AR816x/AR817x.
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Sat, Oct 04, 2014 at 08:10:06PM +0000, Craig Wiesen wrote:
> Yonghyeon PYUN <pyunyh at gmail.com> writes:
> > On Wed, Oct 01, 2014 at 10:36:37AM +0900, Yonghyeon PYUN wrote:
> > > On Tue, Sep 30, 2014 at 10:57:41AM +0900, Yonghyeon PYUN wrote:
> > > > Hi,
> > > >
> > > > I've added support for QAC AR816x/AR817x ethernet controllers. It passed my limited testing and I need more testers. You can find the patches at the following URLs:
> > > > http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> > > > and
> > > > http://people.freebsd.org/~yongari/alc/alc.diff.20140930
> > > >
> > > > pci.quirk.diff works around a silicon bug of AR816x. Without it, MSI/MSIX interrupts won't work. If you just want to use the legacy INTx interrupt you don't have to apply it, but you have to tell alc(4) not to use MSI/MSIX interrupts with the tunables hw.alc.msi_disable and hw.alc.msix_disable.
> > > >
> > > > alc.diff.20140930 adds support for AR8161/AR8162/AR8171/AR8172 and E2200 controllers. It supports all hardware features except RSS.
> > > >
> > > > If you have any QAC AR816x/AR817x or old AR813x/AR815x controllers, please test and report how the diff works for you.
> > > >
> > > > Thanks.
> > >
> > > http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> > > http://people.freebsd.org/~yongari/alc/alc.diff.20141001
> > >
> > > Patch updated to address link establishment issue.
> >
> > http://people.freebsd.org/~yongari/alc/alc.diff.20141002
> >
> > Patch updated again to correct wrong lock assertion.
>
> Hi-
>
> I can add that I tested your patches on a 9.3 Stable machine. The motherboard is a GA-Z77-D3H (rev. 1.1) with onboard Atheros AR816x. I did have to apply one of the patch hunks by hand, see below. I am able to ssh into the machine, and remotely access apache/poudriere. I have not seen any problems so far. I've included a few outputs for you to examine.

Thanks for your testing!

[...]

> Rejected hunk:
> # cat if_alc.c.rej

This was caused by not MFCing r240693. I'll see whether it could be merged to stable/9.

> ***************
> *** 831,843 ****
>       CSR_WRITE_4(sc, ALC_PCIE_PHYMISC2, val);
>   }
>   /* Disable ASPM L0S and L1. */
> - cap = CSR_READ_2(sc, base + PCIER_LINK_CAP);
>   if ((cap & PCIEM_LINK_CAP_ASPM) != 0) {
> -     ctl = CSR_READ_2(sc, base + PCIER_LINK_CTL);
>       if ((ctl & PCIEM_LINK_CTL_RCB) != 0)
>           sc->alc_rcb = DMA_CFG_RCB_128;
>       if (bootverbose)
> -         device_printf(dev, "RCB %u bytes\n",
>               sc->alc_rcb == DMA_CFG_RCB_64 ? 64 : 128);
>       state = ctl & PCIEM_LINK_CTL_ASPMC;
>       if (state & PCIEM_LINK_CTL_ASPMC_L0S)
> --- 1279,1291 ----
>       CSR_WRITE_4(sc, ALC_PCIE_PHYMISC2, val);
>   }
>   /* Disable ASPM L0S and L1. */
> + cap = CSR_READ_2(sc, sc->alc_expcap + PCIER_LINK_CAP);
>   if ((cap & PCIEM_LINK_CAP_ASPM) != 0) {
> +     ctl = CSR_READ_2(sc, sc->alc_expcap + PCIER_LINK_CTL);
>       if ((ctl & PCIEM_LINK_CTL_RCB) != 0)
>           sc->alc_rcb = DMA_CFG_RCB_128;
>       if (bootverbose)
> +         device_printf(sc->alc_dev, "RCB %u bytes\n",
>               sc->alc_rcb == DMA_CFG_RCB_64 ? 64 : 128);
>       state = ctl & PCIEM_LINK_CTL_ASPMC;
>       if (state & PCIEM_LINK_CTL_ASPMC_L0S)
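As a reading aid for the rejected hunk, here is a small standalone sketch of the two decodes it performs on the PCIe Link Control value: the Read Completion Boundary (RCB) bit and the ASPM control field. The macro names are local to the sketch and only mirror the PCIEM_LINK_CTL_* names in the patch; the bit positions are my understanding of the PCIe Link Control register layout and should be checked against the kernel's PCI register header.

```c
#include <stdint.h>

/* Local stand-ins for the PCIEM_LINK_CTL_* masks (assumed bit layout). */
#define LINK_CTL_ASPMC		0x0003	/* ASPM control field, bits 1:0 */
#define LINK_CTL_ASPMC_L0S	0x0001	/* L0s entry enabled */
#define LINK_CTL_ASPMC_L1	0x0002	/* L1 entry enabled */
#define LINK_CTL_RCB		0x0008	/* Read Completion Boundary, bit 3 */

/* RCB bit clear means a 64-byte boundary, set means 128 bytes. */
static unsigned
rcb_bytes(uint16_t link_ctl)
{
	return ((link_ctl & LINK_CTL_RCB) != 0 ? 128 : 64);
}

/* Nonzero if the ASPM control field has L0s enabled. */
static int
aspm_l0s_enabled(uint16_t link_ctl)
{
	uint16_t state = link_ctl & LINK_CTL_ASPMC;

	return ((state & LINK_CTL_ASPMC_L0S) != 0);
}
```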
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Fri, Oct 03, 2014 at 09:29:46PM +0200, Dariusz Wierzbicki wrote:
> On 2014-10-02, at 14:07:30, Yonghyeon PYUN pyu...@gmail.com wrote:
> > On Wed, Oct 01, 2014 at 10:36:37AM +0900, Yonghyeon PYUN wrote:
> > > On Tue, Sep 30, 2014 at 10:57:41AM +0900, Yonghyeon PYUN wrote:
> > > > Hi,
> > > >
> > > > I've added support for QAC AR816x/AR817x ethernet controllers. It passed my limited testing and I need more testers. You can find the patches at the following URLs:
> > > > http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> > > > and
> > > > http://people.freebsd.org/~yongari/alc/alc.diff.20140930
> > > >
> > > > pci.quirk.diff works around a silicon bug of AR816x. Without it, MSI/MSIX interrupts won't work. If you just want to use the legacy INTx interrupt you don't have to apply it, but you have to tell alc(4) not to use MSI/MSIX interrupts with the tunables hw.alc.msi_disable and hw.alc.msix_disable.
> > > >
> > > > alc.diff.20140930 adds support for AR8161/AR8162/AR8171/AR8172 and E2200 controllers. It supports all hardware features except RSS.
> > > >
> > > > If you have any QAC AR816x/AR817x or old AR813x/AR815x controllers, please test and report how the diff works for you.
> > > >
> > > > Thanks.
> > >
> > > http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> > > http://people.freebsd.org/~yongari/alc/alc.diff.20141001
> > >
> > > Patch updated to address link establishment issue.
> >
> > http://people.freebsd.org/~yongari/alc/alc.diff.20141002
> >
> > Patch updated again to correct wrong lock assertion.
>
> Hi!
>
> Thanks for your work! Are your patches only for current? I tried on 10 stable.

No, it should apply to stable/10 as well. I intentionally didn't include the additional diff for MAC statistics, which will not work on stable/10 and stable/9 due to the if_inc_counter changes made in HEAD. I tried to apply the diff again against stable/10 and it succeeded with minor fuzz and offset differences.

> My system:
> dw@dw:~ % uname -a
> FreeBSD dw 10.1-RC1 FreeBSD 10.1-RC1 #1 r272477M: Fri Oct 3 20:48:05 CEST 2014 dw@dw:/usr/obj/usr/src/sys/DW amd64

[...]

> I applied that part manually. Compiled and rebooted system.
>
> dmesg | grep alc :
> alc0: could not disable Rx/Tx MAC(0x4000cb20)!
> alc0: reset timeout(0x4000cb20)!
> alc0: could not disable Rx/Tx MAC(0x4000cb20)!

I'm more worried about the MAC reset and master reset timeouts shown below. The MAC reset timeout makes me wonder how this can happen: since the driver just checks bit 0 and bit 1, the low nibble of the register value can't be 0 when the timeout triggers.

> alc0: link state changed to UP
> alc0: could not disable Rx/Tx MAC(0x4000cb20)!
> alc0: <Qualcomm Atheros AR8161 Gigabit Ethernet> port 0xd000-0xd07f mem 0xf7200000-0xf723ffff irq 18 at device 0.0 on pci3
> alc0: reset timeout(0x4000cd00)!

I think this also can't happen: since the driver checks bit[0-3], the low byte should be non-zero when the timeout triggers.

> alc0: 11776 Tx FIFO, 12032 Rx FIFO
> miibus0: <MII bus> on alc0
> alc0: Ethernet address: 74:d4:35:91:32:04

[...]

> If you need other data or more testing, let me know.

Do you have any local changes in alc(4)? As I said, the diff could be applied to stable/10 without any manual modification.

Thanks for testing!
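The reasoning about the reported register values can be checked with a few lines of C. This is only a sketch: the masks 0x3 ("bit 0 and bit 1") and 0xf ("bit[0-3]") are read off the reply above, and the helper name is made up for the example.

```c
#include <stdint.h>

/* Returns nonzero if any of the status bits selected by mask are still set. */
static int
bits_still_set(uint32_t reg, uint32_t mask)
{
	return ((reg & mask) != 0);
}
```

With the values from the dmesg output, neither 0x4000cb20 under mask 0x3 nor 0x4000cd00 under mask 0xf has a busy bit set, which is why the timeouts look impossible with an unmodified driver.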
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Wed, Oct 01, 2014 at 10:36:37AM +0900, Yonghyeon PYUN wrote:
> On Tue, Sep 30, 2014 at 10:57:41AM +0900, Yonghyeon PYUN wrote:
> > Hi,
> >
> > I've added support for QAC AR816x/AR817x ethernet controllers. It passed my limited testing and I need more testers. You can find the patches at the following URLs:
> > http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> > and
> > http://people.freebsd.org/~yongari/alc/alc.diff.20140930
> >
> > pci.quirk.diff works around a silicon bug of AR816x. Without it, MSI/MSIX interrupts won't work. If you just want to use the legacy INTx interrupt you don't have to apply it, but you have to tell alc(4) not to use MSI/MSIX interrupts with the tunables hw.alc.msi_disable and hw.alc.msix_disable.
> >
> > alc.diff.20140930 adds support for AR8161/AR8162/AR8171/AR8172 and E2200 controllers. It supports all hardware features except RSS.
> >
> > If you have any QAC AR816x/AR817x or old AR813x/AR815x controllers, please test and report how the diff works for you.
> >
> > Thanks.
>
> http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> http://people.freebsd.org/~yongari/alc/alc.diff.20141001
>
> Patch updated to address link establishment issue.

http://people.freebsd.org/~yongari/alc/alc.diff.20141002

Patch updated again to correct wrong lock assertion.
Re: [CFT] alc(4) QAC AR816x/AR817x ethernet controller support
On Tue, Sep 30, 2014 at 10:57:41AM +0900, Yonghyeon PYUN wrote:
> Hi,
>
> I've added support for QAC AR816x/AR817x ethernet controllers. It passed my limited testing and I need more testers. You can find the patches at the following URLs:
> http://people.freebsd.org/~yongari/alc/pci.quirk.diff
> and
> http://people.freebsd.org/~yongari/alc/alc.diff.20140930
>
> pci.quirk.diff works around a silicon bug of AR816x. Without it, MSI/MSIX interrupts won't work. If you just want to use the legacy INTx interrupt you don't have to apply it, but you have to tell alc(4) not to use MSI/MSIX interrupts with the tunables hw.alc.msi_disable and hw.alc.msix_disable.
>
> alc.diff.20140930 adds support for AR8161/AR8162/AR8171/AR8172 and E2200 controllers. It supports all hardware features except RSS.
>
> If you have any QAC AR816x/AR817x or old AR813x/AR815x controllers, please test and report how the diff works for you.
>
> Thanks.

http://people.freebsd.org/~yongari/alc/pci.quirk.diff
http://people.freebsd.org/~yongari/alc/alc.diff.20141001

Patch updated to address link establishment issue.
[CFT] alc(4) QAC AR816x/AR817x ethernet controller support
Hi,

I've added support for QAC AR816x/AR817x ethernet controllers. It passed my limited testing and I need more testers. You can find the patches at the following URLs:
http://people.freebsd.org/~yongari/alc/pci.quirk.diff
and
http://people.freebsd.org/~yongari/alc/alc.diff.20140930

pci.quirk.diff works around a silicon bug of AR816x. Without it, MSI/MSIX interrupts won't work. If you just want to use the legacy INTx interrupt you don't have to apply it, but you have to tell alc(4) not to use MSI/MSIX interrupts with the tunables hw.alc.msi_disable and hw.alc.msix_disable.

alc.diff.20140930 adds support for AR8161/AR8162/AR8171/AR8172 and E2200 controllers. It supports all hardware features except RSS.

If you have any QAC AR816x/AR817x or old AR813x/AR815x controllers, please test and report how the diff works for you.

Thanks.
Re: [RFC] Allow m_dup() to use JUMBO clusters
On Mon, Jul 07, 2014 at 10:12:07AM +0200, Hans Petter Selasky wrote:
> Hi,
>
> I'm asking for some input on the attached m_dup() patch, so that existing functionality or dependencies are not broken. The background for the change is to allow m_dup() to defrag long mbuf chains that don't fit into a specific hardware's scatter/gather entries, typically when doing TSO. In my case the HW limit is 16 entries of length 4K for doing a 64KByte TSO packet.

I wonder how the HW can handle a full-sized TSO packet (64KB + Ethernet header + VLAN tag).

> Currently m_dup() is at best producing 32 entries of 2K each for a 64KByte TSO packet. By allowing m_dup() to get JUMBO clusters when allocating mbufs, we avoid creating a new function, specific to the hardware, to defrag some rarely occurring very long mbuf chains into an mbuf chain below 16 entries.

I think m_dup() was used to get a writable copy of mbuf chains. If m_dup() starts to allocate jumbo mbufs it will eventually fail on long-running boxes. This will break firewall (ipfw divert, pf/ipf dup-to) rules and several ethernet drivers.

I don't know how many TSO requests could be queued by the HW, but if the number is very small, the driver may be able to pre-allocate that number of buffers (N * (64KB + Ethernet header + VLAN tag)) in the driver. The upper stack will almost always generate more than 16 mbufs for TSO packets. When the driver knows the mbuf chain of a TSO packet is longer than 16 entries, it can copy the mbuf chain to the pre-allocated buffer. I recall I didn't implement TSO on txp(4) because the firmware of the txp(4) controller does not support more than 16 fragment descriptors.
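The segment counts quoted in this exchange follow from a ceiling division, sketched below. The helper name is made up for the example, and the 14-byte Ethernet header and 4-byte VLAN tag are the usual sizes rather than figures stated in the thread.

```c
#include <stddef.h>

/* Number of fixed-size buffers needed to hold len bytes (ceiling division). */
static size_t
segs_needed(size_t len, size_t seg_size)
{
	return ((len + seg_size - 1) / seg_size);
}
```

For a 64KB payload this gives 32 entries with 2K clusters and 16 with 4K clusters, matching the numbers above. Adding the Ethernet header and VLAN tag (64KB + 14 + 4 bytes) needs 17 entries of 4K, which is the concern raised in the reply about a 16-entry hardware limit.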
Re: ale(4) cannot negotiate as GigE
On Thu, May 08, 2014 at 05:23:32PM +0000, Alexey Dokuchaev wrote:
> On Tue, Mar 05, 2013 at 09:14:11AM +0000, Alexey Dokuchaev wrote:
> > On Tue, Mar 05, 2013 at 05:57:03PM +0900, YongHyeon PYUN wrote:
> > > Hmm, does the switch support the EEE feature? If yes, would you try disabling it?
> >
> > I do not think it [1] does; plus I cannot do much about this switch, as I'm pretty far away from it right now.
> >
> > [1] http://netgear.com/home/products/switches-and-access-points/unmanaged-switches/GS608.aspx (got it about 4 years ago)
>
> I just had a chance to plug the Ethernet cable directly into my laptop's bge(4) port, and it immediately negotiated at 1000baseT; but with the switch, it only settles on 10baseT/UTP (after some flip-flopping between 1000baseT and no carrier). So it looks like it fails to talk to the switch. Given that this switch of mine is a simple (dumb) piece of equipment, any ideas how to help ale(4) negotiate with it at full speed?

Because there is no publicly available data sheet for the Atheros F1 PHY, I'm not sure what could be done in this case. The only thing I can think of at this moment is announcing next page in auto-negotiation. atphy(4) does not directly manipulate the master/slave, single port/multi port configuration, and this configuration may need next page if the link partner also announces next page capability.

Try the attached patch and let me know whether it makes any difference for you. You may have to cold boot the box because the stock driver used to clear the next page bit in auto-negotiation.

> ./danfe

Index: sys/dev/mii/atphy.c
===================================================================
--- sys/dev/mii/atphy.c (revision 265477)
+++ sys/dev/mii/atphy.c (working copy)
@@ -338,7 +338,9 @@ atphy_setmedia(struct mii_softc *sc, int media)
 {
     uint16_t anar;
 
-    anar = BMSR_MEDIA_TO_ANAR(sc->mii_capabilities) | ANAR_CSMA;
+    anar = PHY_READ(sc, MII_ANAR);
+    anar &= ANAR_NP;
+    anar |= BMSR_MEDIA_TO_ANAR(sc->mii_capabilities) | ANAR_CSMA;
     if ((IFM_SUBTYPE(media) == IFM_AUTO || (media & IFM_FDX) != 0) &&
         ((media & IFM_FLOW) != 0 ||
         (sc->mii_flags & MIIF_FORCEPAUSE) != 0))
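To make the effect of the three added lines concrete, here is a standalone sketch of the same computation: keep only the next page bit from the current advertisement register, then OR in the media capabilities. The constants are defined locally with the values I recall from the MII register definitions and should be verified against sys/dev/mii/mii.h; the function name is hypothetical.

```c
#include <stdint.h>

/* Local copies for the sketch; verify against sys/dev/mii/mii.h. */
#define ANAR_NP			0x8000	/* next page bit */
#define ANAR_CSMA		0x0001	/* selector field: IEEE 802.3 */
#define BMSR_MEDIAMASK		0xf800	/* media capability bits in BMSR */
#define BMSR_MEDIA_TO_ANAR(x)	(((x) & BMSR_MEDIAMASK) >> 6)

/*
 * Build the advertisement value the way the patch does: preserve the
 * next page bit already in the register, replace everything else.
 */
static uint16_t
anar_keep_np(uint16_t cur_anar, uint16_t capabilities)
{
	uint16_t anar;

	anar = cur_anar & ANAR_NP;
	anar |= BMSR_MEDIA_TO_ANAR(capabilities) | ANAR_CSMA;
	return (anar);
}
```

If the register already had next page set, the bit survives; if it was clear (as after the stock driver has run), it stays clear, which is consistent with the advice to cold boot before testing.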
Re: RFC: deprecation of nve(4) in 10-STABLE and removal from 11-CURRENT
On Mon, Feb 03, 2014 at 02:56:37PM +0100, Christian Brueffer wrote:
> Hi,
>
> for some time now we have had two drivers for NVIDIA NForce/MCP network chips, namely nve(4) and nfe(4). The former came first and is based on a binary blob. The latter was later ported from OpenBSD and is blob-free.
>
> nfe(4) supports all chips nve(4) supports, in addition to all the newer hardware. In essence, nfe(4) has been the de-facto standard driver for a long time. nve(4) has been commented out in GENERIC since 2007.
>
> For this reason I propose deprecating nve(4) in 10-STABLE and removing it from HEAD. Does anyone see a reason not to do this?

A couple of users were still using nve(4) in the past. I guess the issue might be lack of code for waking up MAC/PHY from powerdown. nfe(4) already has the needed code and should support all known NVIDIA ethernet controllers with full offloading support. So no objection from me.
Re: regression: msk0 watchdog timeout and interrupt storm
On Sat, Feb 01, 2014 at 12:18:59PM +0400, Boris Samorodov wrote:
> Hi Yonghyeon and All,
>
> (this time it's a CURRENT issue)
>
> 31.10.2013 17:33, Boris Samorodov wrote:
> > 30.10.2013 06:16, Yonghyeon PYUN wrote:
> > > On Tue, Oct 29, 2013 at 05:38:27PM +0400, Boris Samorodov wrote:
> > > > From time to time I use a notebook and boot FreeBSD from a USB stick. FreeBSD 9.2-i386 works OK. So I tried to use FreeBSD 10.0-i386 BETA2, and the network adapter works for some 10-15 seconds and then stops with the diagnostic message "msk0: watchdog timeout". I've found a similar case at freebsd-current@ with no workaround. Yes, there is an interrupt storm as well.
> > >
> > > There had been no functional changes for a very long time so I'm not sure what's going on here. I've attached the local change I have at this moment, but I'm afraid it wouldn't address the issue above. I recall jhb also reported an interrupt storm in the past, but the root cause was not identified yet. Could you change msk_intr() and let me know which interrupt is firing?
> >
> > I've yet to organize a build. Here is some additional info:
> > -----
> > mskc0@pci0:3:0:0: class=0x020000 card=0xff501179 chip=0x435511ab rev=0x12 hdr=0x00
> >     vendor     = 'Marvell Technology Group Ltd.'
> >     device     = '88E8040T PCI-E Fast Ethernet Controller'
> >     class      = network
> >     subclass   = ethernet
> >     cap 01[48] = powerspec 3 supports D0 D1 D2 D3 current D0
> >     cap 05[5c] = MSI supports 1 message, 64 bit enabled with 1 message
> >     cap 10[c0] = PCI-Express 2 legacy endpoint max data 128(128) link x1(x1) speed 2.5(2.5) ASPM disabled(L0s/L1)
> >     ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected
> >     ecap 0003[130] = Serial 1 b8b063681e00
> > -----
> >
> > Meanwhile some more investigations, vmstat -i for calm and storm:
> > -----
> > interrupt                  total   rate
> > irq1: atkbd0                1025      2
> > irq9: acpi0                  204      0
> > irq14: ata0                  327      0
> > irq16: uhci0+                246      0
> > irq20: hpet0               22472     52
> > irq23: uhci2 ehci1         10341     24
> > irq256: hdac0                 52      0
> > irq257: mskc0                258      0
> > irq258: ahci0                221      0
> > Total                      35146     81
> > -----
> > interrupt                  total   rate
> > irq1: atkbd0                1508      2
> > irq9: acpi0                  234      0
> > irq14: ata0                  409      0
> > irq16: uhci0+                246      0
> > irq20: hpet0               72288    131
> > irq23: uhci2 ehci1         10846     19
> > irq256: hdac0                 52      0
> > irq257: mskc0            4419760   8021
> > irq258: ahci0                221      0
> > Total                    4505564   8177
> > -----
> >
> > And vmstat -w1 for calm and storm:
> > -----
> >  procs      memory       page                  disks      faults           cpu
> >  r b w    avm    fre  flt re pi po  fr sr mm0 ad0     in  sy     cs us sy  id
> >  0 0 0 206928 956040  277  0  2  0 330  4   0   0    117 476    454  0  1  99
> >  0 0 0 206928 956036    0  0  0  0   8  4   0   0     50 123    137  0  0 100
> >  0 0 0 206928 956036    0  0  0  0   0  4   0   0     47 120     92  0  1  99
> >  0 0 0 206928 956036    0  0  0  0   0  4   0   0     43 123    119  0  1  99
> >  0 0 0 206928 956036    0  0  0  0   0  4   0   0     55 132    123  0  1  99
> >  0 0 0 206928 956004    0  0  0  0   0  4   0   0     68 123    185  0  1  99
> >  0 0 0 206928 956036    0  0  0  0   8  4   0   0     86 123    266  0  1  99
> >  0 0 0 206928 956036    0  0  0  0   0  4   0   0     44 125    124  0  0 100
> >  0 0 0 206928 956036    0  0  0  0   0  4   0   0     64 128    164  0  1  99
> >  0 0 0 206928 956036    0  0  0  0   0  4   0   0     42 131    101  0  1  99
> > -----
> >  procs      memory       page                  disks      faults           cpu
> >  r b w    avm    fre  flt re pi po  fr sr mm0 ad0     in  sy     cs us sy  id
> >  0 0 0 213648 954676  104  0  1  0 121  4   0   0  22299 204  44262  0 10  90
> >  0 0 0 213648 954672    0  0  0  0   8  4   0   0 112259 123 222379  0 44  56
> >  0 0 0 213648 954672    0  0  0  0   0  4   0   0 111792 123 221489  0 43  57
> >  0 0 0 213648 954672    1  0  0  0   0  4   0   0 109887 183 217754  0 43  57
> >  0 0 0 213648 954668    2  0  0  0   0  4   0   0 109543 146 216963  0 44  56
> >  0 0 0 213648 954668    0  0  0  0   0  4   0   0 110142 123 218187  0 45  55
> >  0 0 0 213648 954660  472  0  0  0 474  4   0   0 109340 717 216674  0 42  57
> >  0 0 0 213648 954656    2  0  0  0   0  4   0   0 109459
Re: FreeBSD 10-RC4: Got crash in igb driver
On Fri, Jan 10, 2014 at 02:35:29PM +0400, Gleb Smirnoff wrote:
> Yonghyeon,
>
> On Fri, Jan 10, 2014 at 10:21:14AM +0900, Yonghyeon PYUN wrote:
> Y> > I experience some troubles with the igb device driver on FreeBSD 10-RC4.
> Y> >
> Y> > The kernel takes a page fault in the igb_tx_ctx_setup function when accessing an IPv6 header.
> Y> >
> Y> > The network configuration is the following:
> Y> > - box acting as an IPv6 router
> Y> > - one interface with an IPv6 address (igb0)
> Y> > - another interface with a vlan, and IPv6 on it (vlan0 on igb1)
> Y> >
> Y> > VLAN hardware tagging is set on both interfaces.
> Y> >
> Y> > The packet that causes the crash comes from igb0 and goes to vlan0.
> Y> >
> Y> > After investigation, I see that the mbuf is split in two. The first one carries the ethernet header, the second the IPv6 header and data payload.
> Y> >
> Y> > The split is due to the m_copy done in ip6_forward, which makes the mbuf not writable, and the M_PREPEND in ether_output, which inserts the new mbuf before the original one.
> Y> >
> Y> > The kernel crashes only if the newly allocated mbuf is at the end of a memory page, and no page is available after this one. So, it's extremely rare.
> Y> >
> Y> > I inserted a KASSERT into the function (see attached patch) to check this behavior, and it fires on every IPv6 packet forwarded to the vlan. The problem disappears if I remove hardware tagging.
> Y> >
> Y> > In commit 256200, I see that the pullups have been removed. May it be related?
> Y>
> Y> I think I introduced the header parsing code to meet a controller requirement in em(4), and Jack borrowed that code in the past, but it seems it was removed in r256200. It seems igb_tx_ctx_setup() assumes it can access ethernet/IP/TCP/UDP headers in the first mbuf of the chain.
> Y> This looks wrong to me.
>
> Can you please restore the important code in head ASAP? Although crashes happen only when the mbuf is last in a page and the next page isn't mapped, we read trash from the next allocation on almost every packet.

It seems other Intel ethernet drivers except em(4) have similar issues. I haven't checked recent Intel controllers/drivers for a long time, so I don't know the details of the hardware requirements for offloading. Since Jack is very responsive and has the hardware to verify, he would be the more appropriate person to handle these issues.
Re: FreeBSD 10-RC4: Got crash in igb driver
On Fri, Jan 10, 2014 at 09:37:33AM +0100, Fabien Thomas wrote:
> On 10 Jan 2014, at 02:21, Yonghyeon PYUN pyu...@gmail.com wrote:
> > On Thu, Jan 09, 2014 at 04:06:09PM +0100, Alexandre Martins wrote:
> > > Dear,
> > >
> > > I experience some troubles with the igb device driver on FreeBSD 10-RC4.
> > >
> > > The kernel takes a page fault in the igb_tx_ctx_setup function when accessing an IPv6 header.
> > >
> > > The network configuration is the following:
> > > - box acting as an IPv6 router
> > > - one interface with an IPv6 address (igb0)
> > > - another interface with a vlan, and IPv6 on it (vlan0 on igb1)
> > >
> > > VLAN hardware tagging is set on both interfaces.
> > >
> > > The packet that causes the crash comes from igb0 and goes to vlan0.
> > >
> > > After investigation, I see that the mbuf is split in two. The first one carries the ethernet header, the second the IPv6 header and data payload.
> > >
> > > The split is due to the m_copy done in ip6_forward, which makes the mbuf not writable, and the M_PREPEND in ether_output, which inserts the new mbuf before the original one.
> > >
> > > The kernel crashes only if the newly allocated mbuf is at the end of a memory page, and no page is available after this one. So, it's extremely rare.
> > >
> > > I inserted a KASSERT into the function (see attached patch) to check this behavior, and it fires on every IPv6 packet forwarded to the vlan. The problem disappears if I remove hardware tagging.
> > >
> > > In commit 256200, I see that the pullups have been removed. May it be related?
> >
> > I think I introduced the header parsing code to meet a controller requirement in em(4), and Jack borrowed that code in the past, but it seems it was removed in r256200. It seems igb_tx_ctx_setup() assumes it can access ethernet/IP/TCP/UDP headers in the first mbuf of the chain. This looks wrong to me.
>
> Instead of patching each driver with pullup code, could we add generic pullup code?
> - get the contiguous protocol requirement (L2, L3, L4) from the underlying driver.
> - do the pullup

I believe Andre already planned that, and he would be working on removing the home-grown header parsers implemented in drivers.

> > > Can you confirm the problem?
> >
> > Probably Jack can tell more about the change made in r256200. It's not easy for me to verify the correctness of igb(4) at this moment.
> >
> > > Best regards
> > >
> > > --
> > > Alexandre Martins
> > > NETASQ -- We secure IT
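The condition being discussed, whether the headers a driver wants to parse sit contiguously in the first buffer of the chain, can be modelled in userland as below. This is a simplified sketch: struct seg is a stand-in for an mbuf with only the two fields the check needs, and headers_contiguous() is a made-up name, not a kernel or igb(4) function. In the kernel, the usual remedy when the check fails is m_pullup(9).

```c
#include <stddef.h>

/* Simplified stand-in for an mbuf: only the fields this sketch needs. */
struct seg {
	size_t		len;	/* bytes stored in this segment */
	struct seg	*next;	/* next segment in the chain, or NULL */
};

/*
 * Return 1 if the first `need` bytes of the packet are contiguous in the
 * first segment, 0 if the data is there but spread over several segments
 * (a pullup would be needed before parsing), and -1 if the whole chain is
 * shorter than `need`.
 */
static int
headers_contiguous(const struct seg *head, size_t need)
{
	size_t total = 0;
	const struct seg *s;

	if (head != NULL && head->len >= need)
		return (1);
	for (s = head; s != NULL; s = s->next)
		total += s->len;
	return (total >= need ? 0 : -1);
}
```

In the reported case the first segment holds only the 14-byte Ethernet header, so asking for Ethernet plus the 40-byte IPv6 header returns 0: reading past the first segment without a pullup is exactly the out-of-bounds access described above.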
Re: FreeBSD 10-RC4: Got crash in igb driver
On Thu, Jan 09, 2014 at 04:06:09PM +0100, Alexandre Martins wrote:
> Dear,
>
> I experience some troubles with the igb device driver on FreeBSD 10-RC4.
>
> The kernel takes a page fault in the igb_tx_ctx_setup function when accessing an IPv6 header.
>
> The network configuration is the following:
> - box acting as an IPv6 router
> - one interface with an IPv6 address (igb0)
> - another interface with a vlan, and IPv6 on it (vlan0 on igb1)
>
> VLAN hardware tagging is set on both interfaces.
>
> The packet that causes the crash comes from igb0 and goes to vlan0.
>
> After investigation, I see that the mbuf is split in two. The first one carries the ethernet header, the second the IPv6 header and data payload.
>
> The split is due to the m_copy done in ip6_forward, which makes the mbuf not writable, and the M_PREPEND in ether_output, which inserts the new mbuf before the original one.
>
> The kernel crashes only if the newly allocated mbuf is at the end of a memory page, and no page is available after this one. So, it's extremely rare.
>
> I inserted a KASSERT into the function (see attached patch) to check this behavior, and it fires on every IPv6 packet forwarded to the vlan. The problem disappears if I remove hardware tagging.
>
> In commit 256200, I see that the pullups have been removed. May it be related?

I think I introduced the header parsing code to meet a controller requirement in em(4), and Jack borrowed that code in the past, but it seems it was removed in r256200. It seems igb_tx_ctx_setup() assumes it can access ethernet/IP/TCP/UDP headers in the first mbuf of the chain. This looks wrong to me.

> Can you confirm the problem?

Probably Jack can tell more about the change made in r256200. It's not easy for me to verify the correctness of igb(4) at this moment.

> Best regards
>
> --
> Alexandre Martins
> NETASQ -- We secure IT
Re: dhclient failure with Realtek 8111E Ethernet on new MSI motherboard
On Thu, Nov 07, 2013 at 02:25:18AM +0000, Thomas Mueller wrote:
> I tried the patch on 9.2-STABLE, rebuilt the kernel and modules, and installed to the correct place on the USB stick, /media/zip0/boot/kernelre (the USB stick was mounted on /media/zip0 when I did this).
>
> Then I unmounted it and took the USB stick to the new computer with the MSI Z77 MPOWER motherboard.
>
> I booted that USB stick, escaped to the loader prompt, ran
> unload
> and
> boot /boot/kernelre/kernel
> and got the same error when running dhclient re0.

Hmm, then I have no idea at this moment. :-( If I manage to find any clue, I'll let you know. Thanks a lot for testing!

> Now I also have to update NetBSD-current and then build a Linux installation. Linux may offer a better chance of configuring wireless adapters.
>
> I was hoping a fix to the re(4) bug could make it for FreeBSD 10.0-RELEASE but am not betting on it.
>
> Tom
Re: dhclient failure with Realtek 8111E Ethernet on new MSI motherboard
On Wed, Nov 06, 2013 at 02:36:07AM +0000, Thomas Mueller wrote:
> from Yonghyeon PYUN:
> > Thomas, would you try attached patch on your system?
> >
> > [-- Attachment #2: re.8168evl.diff --]
> > [-- Type: text/x-diff, Encoding: 7bit, Size: 3.6K --]
> >
> > Content-Type: text/x-diff; charset=us-ascii
> > Content-Disposition: attachment; filename=re.8168evl.diff
> >
> > Index: sys/dev/re/if_re.c
> > ===================================================================
> > --- sys/dev/re/if_re.c (revision 257422)
> > +++ sys/dev/re/if_re.c (working copy)
> > @@ -295,6 +295,8 @@
> >  static int re_miibus_writereg (device_t, int, int, int);
> >  static void re_miibus_statchg (device_t);
> >
> > +static void re_eri_write (struct rl_softc *, bus_size_t, uint32_t, int);
> > +
> >  static void re_set_jumbo (struct rl_softc *, int);
> >  static void re_set_rxmode (struct rl_softc *);
> >  static void re_reset (struct rl_softc *);
> > @@ -641,6 +643,32 @@
> >  }
>
> (snip)
>
> Which version/branch of FreeBSD is this for?

I guess the diff would apply to CURRENT and any stable.

> 9.2_STABLE, 10-stable or 11-head? Does it require a specific svn revision?

No.

> I just updated FreeBSD-current on the new MSI motherboard (svn revision 257695). dhclient re0 still gives the same error.

That's expected behavior, since there is no code to activate the workaround at this moment. Given that you have CURRENT at this moment, apply the diff and let me know how it goes.

> Now I have to update FreeBSD-current amd64 on the same computer. I go through this in the hope of being able to configure wifi with a Hiro 50191 USB-stick-type WLAN adapter, driver rsu. So far, I can't see the wifi network. I'll see what more I need to do, or maybe there is no wifi signal?

Sorry, I'm dumb on wireless drivers so I have nothing to comment. :-(

> Tom
Re: Shuttle DS47 - Realtek RT 8111G
On Sun, Sep 15, 2013 at 09:04:28PM -0600, Scott Long wrote:
> On Sep 15, 2013, at 8:17 PM, Yonghyeon PYUN pyu...@gmail.com wrote:
> > On Sat, Sep 14, 2013 at 08:47:06PM -0600, Scott Long wrote:
> > > Index: sys/dev/re/if_re.c
> > > ===================================================================
> > > --- sys/dev/re/if_re.c (revision 255582)
> > > +++ sys/dev/re/if_re.c (working copy)
> > > @@ -234,6 +234,10 @@
> > >      { RL_HWREV_8168E_VL, RL_8169, "8168E/8111E-VL", RL_JUMBO_MTU_6K},
> > >      { RL_HWREV_8168F, RL_8169, "8168F/8111F", RL_JUMBO_MTU_9K},
> > >      { RL_HWREV_8411, RL_8169, "8411", RL_JUMBO_MTU_9K},
> > > +    { RL_HWREV_8168_0, RL_8169, "8168G/8111G", RL_JUMBO_MTU_9K},
> > > +    { RL_HWREV_8168_1, RL_8169, "8168G/8111G", RL_JUMBO_MTU_9K},
> > > +    { RL_HWREV_8168_2, RL_8169, "8168G/8111G", RL_JUMBO_MTU_9K},
> > > +    { RL_HWREV_8168_4, RL_8169, "8411", RL_JUMBO_MTU_9K},
> > >      { 0, 0, NULL, 0 }
> > >  };
> > >
> > > @@ -1457,6 +1461,10 @@
> > >      case RL_HWREV_8168E_VL:
> > >      case RL_HWREV_8168F:
> > >      case RL_HWREV_8411:
> > > +    case RL_HWREV_8168G_0:
> > > +    case RL_HWREV_8168G_1:
> > > +    case RL_HWREV_8168G_2:
> > > +    case RL_HWREV_8168G_4:
> > >          sc->rl_flags |= RL_FLAG_PHYWAKE | RL_FLAG_PAR |
> > >              RL_FLAG_DESCV2 | RL_FLAG_MACSTAT | RL_FLAG_CMDSTOP |
> > >              RL_FLAG_AUTOPAD | RL_FLAG_JUMBOV2 |
> > > Index: sys/pci/if_rlreg.h
> > > ===================================================================
> > > --- sys/pci/if_rlreg.h (revision 255582)
> > > +++ sys/pci/if_rlreg.h (working copy)
> > > @@ -191,6 +191,10 @@
> > >  #define RL_HWREV_8402       0x44000000
> > >  #define RL_HWREV_8168F      0x48000000
> > >  #define RL_HWREV_8411       0x48800000
> > > +#define RL_HWREV_8168G_0    0x4c000000
> > > +#define RL_HWREV_8168G_1    0x4c100000
> >
> > I don't know the exact model number for these MACs but it may be 8168G.
> >
> > > +#define RL_HWREV_8168G_2    0x50900000
> >
> > This looks like 8168GU.
> >
> > > +#define RL_HWREV_8168G_4    0x5c800000
> >
> > This looks like 8411B.
> >
> > RL_TXCFG_HWREV is 0x7CC00000, so the driver will not see RL_HWREV_8168G_1 (0x4c100000) and RL_HWREV_8168G_2 (0x50900000). It seems newer RealTek controllers use ODP to access the PHY. In addition, these controllers may need to set an RX DMA parameter (bit 11 of RL_RXCFG). I'm not sure what this bit does though.
> >
> > Scott, did you test your patch on real H/W? If it works I'm fine with your patch. Just remove RL_HWREV_8168G_1 and RL_HWREV_8168G_2, as the current driver has no way to get these revisions.
>
> I tested the 0x4c0 on real hardware, an MSI Z87I motherboard. The rest came from looking at the linux driver. That driver is structured very differently (and better, IMHO) than the FreeBSD one, so there's a lot that wasn't obvious to me. I'd be very happy to work more on this with your guidance.

FYI: Fixed in r257304-257306.
Re: Shuttle DS47 - Realtek RT 8111G
On Sat, Sep 14, 2013 at 08:47:06PM -0600, Scott Long wrote:
> Index: sys/dev/re/if_re.c
> ===
> --- sys/dev/re/if_re.c (revision 255582)
> +++ sys/dev/re/if_re.c (working copy)
> @@ -234,6 +234,10 @@
>   { RL_HWREV_8168E_VL, RL_8169, "8168E/8111E-VL", RL_JUMBO_MTU_6K},
>   { RL_HWREV_8168F, RL_8169, "8168F/8111F", RL_JUMBO_MTU_9K},
>   { RL_HWREV_8411, RL_8169, "8411", RL_JUMBO_MTU_9K},
> + { RL_HWREV_8168_0, RL_8169, "8168G/8111G", RL_JUMBO_MTU_9K},
> + { RL_HWREV_8168_1, RL_8169, "8168G/8111G", RL_JUMBO_MTU_9K},
> + { RL_HWREV_8168_2, RL_8169, "8168G/8111G", RL_JUMBO_MTU_9K},
> + { RL_HWREV_8168_4, RL_8169, "8411", RL_JUMBO_MTU_9K},
>   { 0, 0, NULL, 0 }
>  };
> @@ -1457,6 +1461,10 @@
>   case RL_HWREV_8168E_VL:
>   case RL_HWREV_8168F:
>   case RL_HWREV_8411:
> + case RL_HWREV_8168G_0:
> + case RL_HWREV_8168G_1:
> + case RL_HWREV_8168G_2:
> + case RL_HWREV_8168G_4:
>    sc->rl_flags |= RL_FLAG_PHYWAKE | RL_FLAG_PAR |
>        RL_FLAG_DESCV2 | RL_FLAG_MACSTAT | RL_FLAG_CMDSTOP |
>        RL_FLAG_AUTOPAD | RL_FLAG_JUMBOV2 |
> Index: sys/pci/if_rlreg.h
> ===
> --- sys/pci/if_rlreg.h (revision 255582)
> +++ sys/pci/if_rlreg.h (working copy)
> @@ -191,6 +191,10 @@
>  #define RL_HWREV_8402 0x44000000
>  #define RL_HWREV_8168F 0x48000000
>  #define RL_HWREV_8411 0x48800000
> +#define RL_HWREV_8168G_0 0x4c000000
> +#define RL_HWREV_8168G_1 0x4c100000

I don't know the exact model number for these MACs, but it may be 8168G.

> +#define RL_HWREV_8168G_2 0x50900000

This looks like 8168GU.

> +#define RL_HWREV_8168G_4 0x5c800000

This looks like 8411B.

RL_TXCFG_HWREV is 0x7CC00000, so the driver will not see RL_HWREV_8168G_1 (0x4c100000) and RL_HWREV_8168G_2 (0x50900000).
It seems newer RealTek controllers use ODP to access the PHY. In addition, these controllers may need to set an RX DMA parameter (bit 11 of RL_RXCFG). I'm not sure what this bit does though.

Scott, did you test your patch on real H/W? If it works, I'm fine with your patch. Just remove RL_HWREV_8168G_1 and RL_HWREV_8168G_2, as the current driver has no way to get these revisions.

>  #define RL_HWREV_8139 0x60000000
>  #define RL_HWREV_8139A 0x70000000
>  #define RL_HWREV_8139AG 0x70800000
>
> On Sep 14, 2013, at 3:41 PM, Thomas Guldener <tgulde...@bluewin.ch> wrote:
>> FreeBSD 10 Alpha Release is Booting on the Shuttle DS47 - But still no support for the Realtek RT 8111G Network Cards.
>>
>> g. Thomas
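[Editor's note] The remark about RL_TXCFG_HWREV can be checked with a few lines of C. This is only a sketch of the masking arithmetic, not code from re(4): the function names are hypothetical, and the mask and revision IDs are the ones discussed in the message, written out as full 32-bit register values.

```c
#include <assert.h>
#include <stdint.h>

/* Mask and revision IDs as discussed in the message (32-bit register values). */
#define RL_TXCFG_HWREV    0x7CC00000u
#define RL_HWREV_8168G_0  0x4c000000u
#define RL_HWREV_8168G_1  0x4c100000u
#define RL_HWREV_8168G_2  0x50900000u
#define RL_HWREV_8168G_4  0x5c800000u

/* Hypothetical helper: the revision a driver observes after applying the mask. */
uint32_t hwrev_visible(uint32_t txcfg)
{
    return txcfg & RL_TXCFG_HWREV;
}

/*
 * Hypothetical helper: a table entry can only ever match if it
 * survives the mask unchanged. Returns 1 if matchable, 0 otherwise.
 */
int hwrev_matchable(uint32_t id)
{
    return hwrev_visible(id) == id;
}

/* Small self-check showing which IDs from the patch are reachable. */
void hwrev_selfcheck(void)
{
    assert(hwrev_matchable(RL_HWREV_8168G_0));
    assert(hwrev_matchable(RL_HWREV_8168G_4));
    assert(!hwrev_matchable(RL_HWREV_8168G_1)); /* bit outside the mask */
    assert(!hwrev_matchable(RL_HWREV_8168G_2)); /* bit outside the mask */
}
```

Under these assumptions the two IDs the reply asks to remove each have a bit set outside the mask, so a masked register read can never equal them.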
CFT: bge(4) TX/RX checksum offloading
Hi,

It was known that bge(4) generated wrong TCP/UDP checksums when the frame length was less than 60 bytes, so bge(4) implemented a padding workaround for such runt frames. bge(4) also ignored the H/W assisted TCP/UDP checksum result when the length of a received frame was less than 60 bytes. This workaround came from NetBSD about 7 years ago.

Recently I started to wonder why bge(4) needs such a workaround, given that 1) the publicly available data sheet does not mention the issue and 2) Linux tg3 does not have any workaround for it. I also put the question to Broadcom and was told that they (both Linux and Windows software developers) can't recall having the issue. Linux does not use the IP checksum offloading feature of the controller, so it's possible for the controller to have an IP checksum offloading issue on runt frames. But I was not able to reproduce the issue on my box.

Here is the patch that removes the workaround in bge(4).
http://people.freebsd.org/~yongari/bge/bge.csum.diff

The diff was generated against HEAD but it will also apply cleanly to stable/9. If you use bge(4) devices, please give it a whirl and let me know how well it works on your configuration.

Thanks.
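[Editor's note] For readers unfamiliar with the kind of workaround being removed, the sketch below shows what padding a runt frame generally looks like. It is not the bge(4) code: the helper name, its signature, and the flat buffer are assumptions for illustration (the real driver works on mbuf chains). The 60-byte figure is the 64-byte Ethernet minimum minus the 4-byte FCS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimum Ethernet frame length without the 4-byte FCS. */
#define MIN_FRAME_NO_CRC 60

/*
 * Hypothetical helper: zero-pad a frame of `len` bytes held in a buffer
 * of capacity `cap` up to the 60-byte minimum.
 * Returns the resulting frame length, or 0 if the buffer is too small.
 */
size_t pad_runt_frame(unsigned char *buf, size_t len, size_t cap)
{
    assert(buf != NULL);
    assert(len <= cap);

    if (len >= MIN_FRAME_NO_CRC)
        return len;                 /* not a runt frame, nothing to do */
    if (cap < MIN_FRAME_NO_CRC)
        return 0;                   /* no room to pad */

    memset(buf + len, 0, MIN_FRAME_NO_CRC - len);
    return MIN_FRAME_NO_CRC;
}
```

With software padding like this, the controller never sees a frame shorter than 60 bytes on transmit; the patch linked above drops that step and relies on the hardware instead.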
Re: msk0 watchdog timeout and interrupt storm
On Sat, Jul 13, 2013 at 01:39:06PM +0200, Denis D wrote:
>> If you use dual-boot, please try cold-boot it. Other OS may have put the PHY into weird state. Cold-boot shall make firmware restore its PHY configuration.
>
> Hello pyunyh,
> if I understand the word cold-boot correctly, it means that I have to shut down my PC, then start it again (after it has been off) and boot into FreeBSD. My PC was off for 9 hours because of work, but I still get the same watchdog timeout error.

Did you completely remove the power cord and wait 1 ~ 2 min. before booting?
There are many Yukon II variants, and each controller seems to require special handling to work around silicon bugs. Your controller also has an Audio Video Bridging (AVB) feature, which may or may not need special handling in the TX/RX path. At the least it may need to initialize or disable a QoS-specific feature of the controller, I guess. Unfortunately, errata and detailed programming information are not available to open source developers.
The interrupt storm seems to indicate that some important event was not properly handled in the driver. Not sure what it is.

> Maybe some other solutions?

Sorry, I have no further idea at this moment. I hoped a cold boot would put the controller into a compatible mode, but it seems it does not. I'll let you know if I happen to find a clue.
Re: msk0 watchdog timeout and interrupt storm
On Sun, Jul 07, 2013 at 10:10:42PM +0200, Denis D wrote:
> Hello Community,
> I hope someone can help me with this problem. Over the last few days I have tried to find a solution, but haven't found one.
> The watchdog timeout happens when I download something or copy a file to my FTP server. When I start the transfer of the file, I wait a moment and then my down-/upload freezes at something around 500 KB. After waiting a little while or pressing a key like return, it comes to the interrupt storm.
>
> interrupt storm detected on "irq51:"; throttling interrupt source.
>
> Here is some information about my system:
>
> ifconfig msk0
> msk0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=c009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE>
>     ether bc:ae:c5:5a:ef:ec
>     inet 192.168.2.30 netmask 0xffffff00 broadcast 192.168.2.255
>     nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>     media: Ethernet autoselect (100baseTX <full-duplex,flowcontrol,rxpause,txpause>)
>     status: active
>
> pciconf -lv
> mskc0@pci0:3:0:0: class=0x020000 card=0x84391043 chip=0x438111ab rev=0x11 hdr=0x00
>     vendor = 'Marvell Technology Group Ltd.'
>     device = 'Yukon Optima 88E8059 [PCIe Gigabit Ethernet Controller with AVB]'
>     class = network
>     subclass = ethernet
>
> vmstat -i
> interrupt             total   rate
> irq1: atkbd0            916      2
> irq16: hdac1             97      0
> irq17: ehci0 ehci1+    8729     21
> irq18: ohci0 ohci1*      67      0
> irq19: ahci1           2883      7
> irq25: hdac0              4      0
> irq51: mskc0             90      0
> irq256: hpet0:t0      30332     75
> Total                 43118    107
>
> My loader.conf:
> hw.msk.msi_disable=1
> hw.pci.enable_msi=0
> hw.pci.enable_msix=0
>
> My rc.conf:
> hostname=FreeBSD.local.domain
> keymap=german.iso.acc.kbd
> ifconfig_msk0=DHCP
> sshd_enable=YES
> moused_enable=YES
> powerd_enable=YES
> # Set dumpdev to AUTO to enable crash dumps, NO to disable
> dumpdev=AUTO
>
> I have also tried to change ifconfig_msk0=DHCP to ifconfig_msk0=SYNCDHCP but nothing changed.
> If nothing helps, I will buy a new network card.

If you use dual-boot, please try a cold boot. The other OS may have put the PHY into a weird state. A cold boot should make the firmware restore its PHY configuration.

> P.S: Can someone delete my other 2 posts? The format of them was horrible and the other one has no subject :(
Re: Problems with ipfw/natd and axe(4)
On Sat, May 11, 2013 at 12:04:09AM +0400, Gleb Smirnoff wrote:
> Spil,
>
> On Fri, May 10, 2013 at 09:06:35AM +0200, Spil Oss wrote:
> S> There seems to be quite a bit of overhaul on the firewall code, pf and
> S> ipfw have been moved to sys/netpfil? Can there be some regressions in
> S> there that I hit?
>
> Yes, a regression is possible there. However, the issue seems to be axe(4) specific, since there are no reports on more common NICs.

There was no change to axe(4) except the addition of a new device ID, so it seems the issue is not in the driver. In addition, the AX88772B engineering sample I have works without problems on CURRENT. I didn't use ipfw(4) or natd though.
Re: Problems with axe(4) and checksum offloading
On Sat, Apr 13, 2013 at 03:25:11PM +0200, Spil Oss wrote:
> Hi YongHyeon,
>
> Will post on freebsd-ipfw@ as well...
>
> Does your engineering sample function normally with rxcsum/txcsum disabled?

Yes.

> Kind regards, Spil.
>
> On Thu, Apr 11, 2013 at 3:11 AM, YongHyeon PYUN <pyu...@gmail.com> wrote:
>> On Wed, Apr 10, 2013 at 07:48:00PM +0200, Spil Oss wrote:
>> [...]
Re: Problems with axe(4) and checksum offloading
On Wed, Apr 10, 2013 at 07:48:00PM +0200, Spil Oss wrote:
> Hi YongHyeon,
>
> With the original unmodified .ko... ifconfig output as requested at bottom.
> Static IP-configuration does not make a difference with the ipfw behaviour.
>
> ipfw ruleset (based on /etc/rc.firewall simple ruleset)
> 00010 allow ip from any to me dst-port 22 recv ue0
> 00010 allow tcp from me 22 to any xmit ue0
> 00100 allow ip from any to any via lo0
> 00200 deny ip from any to 127.0.0.0/8
> 00300 deny ip from 127.0.0.0/8 to any
> 00400 deny ip from any to ::1
> 00500 deny ip from ::1 to any
> 00600 allow ipv6-icmp from :: to ff02::/16
> 00700 allow ipv6-icmp from fe80::/10 to fe80::/10
> 00800 allow ipv6-icmp from fe80::/10 to ff02::/16
> 00900 allow ipv6-icmp from any to any ip6 icmp6types 1
> 01000 allow ipv6-icmp from any to any ip6 icmp6types 2,135,136
> 01100 deny ip from 10.16.2.1 to any in via ue0
> 01200 deny ip from 172.17.2.111 to any in via re0
> 01300 deny ip from any to 10.0.0.0/8 via ue0
> 01500 deny ip from any to 192.168.0.0/16 via ue0
> 01600 deny ip from any to 0.0.0.0/8 via ue0
> 01700 deny ip from any to 169.254.0.0/16 via ue0
> 01800 deny ip from any to 192.0.2.0/24 via ue0
> 01900 deny ip from any to 224.0.0.0/4 via ue0
> 02000 deny ip from any to 240.0.0.0/4 via ue0
> 02100 divert 8668 ip4 from any to any via ue0
> 02200 deny ip from 10.0.0.0/8 to any via ue0
> 02400 deny ip from 192.168.0.0/16 to any via ue0
> 02500 deny ip from 0.0.0.0/8 to any via ue0
> 02600 deny ip from 169.254.0.0/16 to any via ue0
> 02700 deny ip from 192.0.2.0/24 to any via ue0
> 02800 deny ip from 224.0.0.0/4 to any via ue0
> 02900 deny ip from 240.0.0.0/4 to any via ue0
> 03000 allow tcp from any to any established
> 03100 allow ip from any to any frag
> 03200 allow tcp from any to me dst-port 22 setup
> 03300 allow tcp from any to me dst-port 25 setup
> 03400 allow tcp from any to me dst-port 465 setup
> 03500 allow tcp from any to me dst-port 587 setup
> 03600 allow tcp from any to me dst-port 80 setup
> 03700 allow tcp from any to me dst-port 443 setup
> 03800 deny log logamount 5 ip4 from any to any in via ue0 setup proto tcp
> 03900 allow tcp from any to any setup
> 04000 allow udp from me to any dst-port 53 keep-state
> 04100 allow udp from me to any dst-port 123 keep-state
> 04200 allow ip from any to any dst-port 22 recv ue0
> 65535 deny ip from any to any
>
> If I remove rule 10 it will NOT work with ue0, the ruleset without rule 10 DOES work with re0.
>
> Found an older PR about em or fxp having trouble with natd when rxcsum/txcsum was enabled, that is why I started fiddling with rxcsum/txcsum and found that the NIC would be unusable without rxcsum/txcsum enabled. If only I could find that PR now (kern/170081???)... Was fixed in base...

If you don't use ipfw/natd, checksum offloading of axe(4) work? If yes, you'd get better answer from ipfw mailing list.

> Some other post reported fake AX88772A chips (32-pin packaging vs 64 in the original) on cheap USB NICs so I checked the hardware as well and could not

AX88772A does not support TX/RX checksum offloading.

> see an issue (64-pin packaging).
>
> # ifconfig ue0
> ue0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
>     ether 00:60:6e:42:5b:53
>     nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>     media: Ethernet autoselect (100baseTX <full-duplex>)
>     status: active
> # dhclient ue0
> DHCPDISCOVER on ue0 to 255.255.255.255 port 67 interval 4
> DHCPOFFER from 172.17.2.1
> DHCPREQUEST on ue0 to 255.255.255.255 port 67
> DHCPACK from 172.17.2.1
> bound to 172.17.2.111 -- renewal in 43200 seconds.
> # ifconfig ue0
> ue0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
>     ether 00:60:6e:42:5b:53
>     inet6 fe80::260:6eff:fe42:5b53%ue0 prefixlen 64 scopeid 0x7
>     inet 172.17.2.111 netmask 0xffffff00 broadcast 172.17.2.255
>     nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>     media: Ethernet autoselect (100baseTX <full-duplex>)
>     status: active

I can see TX/RX checksum offloading is active and it successfully got an IP address via DHCP.

> On Wed, Apr 10, 2013 at 4:14 AM, YongHyeon PYUN <pyu...@gmail.com> wrote:
>> On Mon, Apr 08, 2013 at 08:45:58PM +0200, Spil Oss wrote:
>> [...]
Re: Problems with axe(4) and checksum offloading
On Mon, Apr 08, 2013 at 08:45:58PM +0200, Spil Oss wrote:
> Hi YongHyeon,
>
> output from verbose boot
> ugen3.2: vendor 0x0b95 at usbus3
> axe0: vendor 0x0b95 product 0x772b, rev 2.00/0.01, addr 2 on usbus3
> axe0: PHYADDR 0xe0:0x10
> miibus1: MII bus on axe0
> ukphy0: Generic IEEE 802.3u media interface PHY 16 on miibus1
> ukphy0: OUI 0x007063, model 0x0008, rev. 1
> ukphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
> ue0: USB Ethernet on axe0
> ue0: bpf attached
> ue0: Ethernet address: 00:60:6e:42:5b:53
> ue0: link state changed to UP
> ue0: link state changed to DOWN
> ue0: link state changed to UP

The AX88772B engineering sample I have still worked on latest current. Could you use a static IP rather than DHCP and see whether that makes any difference? (Note, you have to revert your changes made to axe(4) before trying that.)
Also show me the output of 'ifconfig ue0' before/after running dhclient(8).

> Apart from what I originally described... Networking does work, but not when packets pass through ipfw and nat. If I add my ipfw rules before the divert natd rule, networking works as expected; without that, the SYN,ACK response packets are not accepted if I e.g. connect to something on the axe interface.
> I have validated the ipfw ruleset with the onboard realtek NIC and it then works as expected.
>
> # usbconfig -u 3 -a 2 dump_info
> ugen3.2: product 0x772b vendor 0x0b95 at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON (200mA)
>
> Kind regards, Spil.
>
> On Mon, Apr 8, 2013 at 8:35 AM, YongHyeon PYUN <pyu...@gmail.com> wrote:
>> On Sun, Apr 07, 2013 at 09:14:16PM +0200, Spil Oss wrote:
>> [...]
Re: Problems with axe(4) and checksum offloading
On Sun, Apr 07, 2013 at 09:14:16PM +0200, Spil Oss wrote:
> Hi all,
>
> With checksum offloading enabled I cannot use my axe NIC (ASIX AX88772B).
> ifconfig ue0 -txcsum -rxcsum
> will make dhclient ue0 return
> if I re-enable txcsum and rxcsum I get an immediate response from the dhcp server.
>
> Tried to remove the csum features by commenting out
>     ifp->if_capabilities |= IFCAP_TXCSUM | IFCAP_RXCSUM;
>     ifp->if_hwassist = AXE_CSUM_FEATURES;
> (lines 855 and 856 in /usr/src/sys/dev/usb/net/if_axe.c) and rebuild the module. This does remove RXCSUM and TXCSUM from options and behaves the same as disabling the features with ifconfig (i.e. does not work)
>
> 10.0-CURRENT r248351
>
> Hope someone can help me...
> Spil.

Last time I tried, checksum offloading worked as expected. Would you show me the verbose dmesg output after attaching the axe(4) NIC?
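[Editor's note] As background for this thread: when TXCSUM/RXCSUM are switched off, the host stack has to compute the standard Internet checksum (RFC 1071) itself, and when they are on, the hardware is expected to produce the same value. The sketch below is a plain C version of that checksum over a flat buffer. It is not taken from axe(4) or the FreeBSD stack, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: RFC 1071 Internet checksum over `len` bytes.
 * Bytes are summed as big-endian 16-bit words; an odd trailing byte is
 * treated as the high byte of a final word. The 32-bit accumulator is
 * ample for packet-sized inputs.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits (one's complement add). */
    while ((sum >> 16) != 0)
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can verify a packet by running the same function over the data with the checksum field included; a result of 0 means the checksum is consistent.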
Re: ale(4) cannot negotiate as GigE
On Tue, Mar 05, 2013 at 08:06:20AM +0000, Alexey Dokuchaev wrote:
> On Tue, Mar 05, 2013 at 04:43:15PM +0900, YongHyeon PYUN wrote:
>>>> Could you disable WOL before rebooting your box?
>>>
>>> # ifconfig ale0 -wol
>>> # reboot
>>>
>>> It came up as 100baseTX. :(
>>
>> You don't use any manual link configuration, right?
>
> Right, everything is auto (that is, the defaults).
>
>> When you see the controller established a 100Mbps link, how about restarting auto-negotiation? Does that also result in 100Mbps link?
>
> # ifconfig ale0 media auto
> # ifconfig ale0 | egrep -v ether\|inet
> ale0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=c319a<TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MCAST,WOL_MAGIC,VLAN_HWTSO,LINKSTATE>
>     nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
>     media: Ethernet autoselect (100baseTX <full-duplex>)
>     status: active
>
> Tried a few times, no difference.

Hmm, does the switch support the EEE feature? If yes, would you try disabling it?

> ./danfe
Re: ale(4) cannot negotiate as GigE
On Mon, Mar 04, 2013 at 08:18:58AM +0000, Alexey Dokuchaev wrote:
> On Mon, Mar 04, 2013 at 04:06:32PM +0900, YongHyeon PYUN wrote:
>> On Mon, Mar 04, 2013 at 06:59:40AM +0000, Alexey Dokuchaev wrote:
>>> Better this time, I'm having 1000baseT again! :-)
>>
>> Thanks a lot for testing and patience! Could you reboot multiple times and check whether you reliably get a gigabit link?
>
> Yes, multiple reboots was a good idea, it's not very stable:
>
> 1st reboot: 100baseTX (!)
> 2nd reboot: 1000baseT
> 3rd reboot: 1000baseT
> 4th reboot: 1000baseT
> 5th reboot: 100baseTX (!)
> 6th reboot: 100baseTX (!)
> 7th reboot: 1000baseT
> 8th reboot: 100baseTX (!)
> 9th reboot: 1000baseT
> 10th reboot: 1000baseT
>
> I've tried various combinations of just reboot, shutdown -r +1m and pinging some host while waiting for reboot.

Could you disable WOL before rebooting your box? You can disable WOL like the following.
#ifconfig ale0 -wol

> ./danfe
Re: ale(4) cannot negotiate as GigE
On Tue, Mar 05, 2013 at 06:59:10AM +0000, Alexey Dokuchaev wrote:
> On Tue, Mar 05, 2013 at 02:49:20PM +0900, YongHyeon PYUN wrote:
>> On Mon, Mar 04, 2013 at 08:18:58AM +0000, Alexey Dokuchaev wrote:
>>> Yes, multiple reboots was a good idea, it's not very stable: [...]
>>>
>>> I've tried various combinations of just reboot, shutdown -r +1m and pinging some host while waiting for reboot.
>>
>> Could you disable WOL before rebooting your box?
>
> # ifconfig ale0 -wol
> # reboot
>
> It came up as 100baseTX. :(

You don't use any manual link configuration, right?
When you see the controller established a 100Mbps link, how about restarting auto-negotiation? Does that also result in a 100Mbps link?
#ifconfig ale0 media auto

> ./danfe
Re: ale(4) cannot negotiate as GigE
On Sun, Mar 03, 2013 at 12:00:10PM +0000, Alexey Dokuchaev wrote:
> On Sun, Mar 03, 2013 at 09:53:30AM +0000, Alexey Dokuchaev wrote:
>> However, after reboot ale0 come up at 1000baseT full-duplex, with patched driver (longer delays in ale_phy_reset()). I've reverted this change and rebooted again, but it again come up as GigE.
>
> Alas, after make kernel, link come up as 100mbps again, playing with delays and rebooting (several times) did not make it GigE. I'm not sure what's actually affecting it. :-(

Would you try attached patch?

> ./danfe

Index: sys/dev/mii/atphy.c
===
--- sys/dev/mii/atphy.c (revision 247382)
+++ sys/dev/mii/atphy.c (working copy)
@@ -287,9 +287,11 @@ atphy_reset(struct mii_softc *sc)
  uint32_t reg;
  int i;
 
+#if 0
  /* Take PHY out of power down mode. */
  PHY_WRITE(sc, 29, 0x29);
  PHY_WRITE(sc, 30, 0);
+#endif
 
  reg = PHY_READ(sc, ATPHY_SCR);
  /* Enable automatic crossover. */
Re: ale(4) cannot negotiate as GigE
On Mon, Mar 04, 2013 at 01:53:44AM +0000, Alexey Dokuchaev wrote:
> On Mon, Mar 04, 2013 at 09:50:44AM +0900, YongHyeon PYUN wrote:
>> On Sun, Mar 03, 2013 at 12:00:10PM +0000, Alexey Dokuchaev wrote:
>>> Alas, after make kernel, link come up as 100mbps again, playing with delays and rebooting (several times) did not make it GigE. I'm not sure what's actually affecting it. :-(
>>
>> Would you try attached patch?
>
> Yes, it did help. With 2000us delays (I didn't revert them since you didn't

Great! But it seems 2ms delays is too much.

> ask), machine came up after make kernel and reboot with ale0 in GigE mode. I'll be happy to conduct more tests for you, if needed, thanks!

Could you revert the change (2000us delays) and try it again? If that change works I still have to find a specific PHY model to exclude the blind PHY wakeup.

> ./danfe
Re: ale(4) cannot negotiate as GigE
On Mon, Mar 04, 2013 at 02:46:31AM +0000, Alexey Dokuchaev wrote:
> On Mon, Mar 04, 2013 at 11:10:59AM +0900, YongHyeon PYUN wrote:
>> On Mon, Mar 04, 2013 at 01:53:44AM +0000, Alexey Dokuchaev wrote:
>>> Yes, it did help. With 2000us delays (I didn't revert them [...]
>>
>> Great! But it seems 2ms delays is too much. Could you revert the change (2000us delays) and try it again?
>
> Reverting if_ale.c, making kernel, and reboot gave me 100baseTX again; :-( second reboot (with the same kernel) did not help. Bumping delays to 2ms (just to make sure) restored GigE mode upon 1st reboot after make kernel.

Ok, here is the final diff, which combines the two things you've tested. So revert any changes before applying it. Let me know how it goes on your box.

> ./danfe

Index: sys/dev/ale/if_ale.c
===
--- sys/dev/ale/if_ale.c (revision 247382)
+++ sys/dev/ale/if_ale.c (working copy)
@@ -406,11 +406,11 @@ ale_phy_reset(struct ale_softc *sc)
  CSR_WRITE_2(sc, ALE_GPHY_CTRL,
      GPHY_CTRL_HIB_EN | GPHY_CTRL_HIB_PULSE | GPHY_CTRL_SEL_ANA_RESET |
      GPHY_CTRL_PHY_PLL_ON);
- DELAY(1000);
+ DELAY(2000);
  CSR_WRITE_2(sc, ALE_GPHY_CTRL,
      GPHY_CTRL_EXT_RESET | GPHY_CTRL_HIB_EN | GPHY_CTRL_HIB_PULSE |
      GPHY_CTRL_SEL_ANA_RESET | GPHY_CTRL_PHY_PLL_ON);
- DELAY(1000);
+ DELAY(2000);
 
 #define ATPHY_DBG_ADDR 0x1D
 #define ATPHY_DBG_DATA 0x1E
@@ -635,7 +635,7 @@ ale_attach(device_t dev)
  /* Set up MII bus. */
  error = mii_attach(dev, &sc->ale_miibus, ifp, ale_mediachange,
      ale_mediastatus, BMSR_DEFCAPMASK, sc->ale_phyaddr, MII_OFFSET_ANY,
-     MIIF_DOPAUSE);
+     MIIF_DOPAUSE | MIIF_MACPRIV0);
  if (error != 0) {
   device_printf(dev, "attaching PHYs failed\n");
   goto fail;
Index: sys/dev/mii/atphy.c
===
--- sys/dev/mii/atphy.c (revision 247382)
+++ sys/dev/mii/atphy.c (working copy)
@@ -100,8 +100,14 @@ atphy_probe(device_t dev)
 static int
 atphy_attach(device_t dev)
 {
+ struct mii_attach_args *ma;
+ u_int flags;
 
- mii_phy_dev_attach(dev, MIIF_NOMANPAUSE, &atphy_funcs, 1);
+ ma = device_get_ivars(dev);
+ flags = MIIF_NOMANPAUSE;
+ if ((miibus_get_flags(dev) & MIIF_MACPRIV0) != 0)
+  flags |= MIIF_PHYPRIV0;
+ mii_phy_dev_attach(dev, flags, &atphy_funcs, 1);
  return (0);
 }
 
@@ -287,9 +293,11 @@ atphy_reset(struct mii_softc *sc)
  uint32_t reg;
  int i;
 
- /* Take PHY out of power down mode. */
- PHY_WRITE(sc, 29, 0x29);
- PHY_WRITE(sc, 30, 0);
+ if ((sc->mii_flags & MIIF_PHYPRIV0) != 0) {
+  /* Take PHY out of power down mode. */
+  PHY_WRITE(sc, 29, 0x29);
+  PHY_WRITE(sc, 30, 0);
+ }
 
  reg = PHY_READ(sc, ATPHY_SCR);
  /* Enable automatic crossover. */
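[Editor's note] The diff in this message passes a driver-private flag from the MAC driver to the PHY driver, and the PHY reset routine then makes the two "power down" register writes conditional on it. The stand-alone sketch below mimics that flow as posted, outside the kernel. The flag bit values, the mock_phy structure, and both function names are assumptions for illustration only; the real MIIF_* constants and mii_softc live in the kernel's MII headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only so the sketch compiles. */
#define MIIF_NOMANPAUSE 0x0001u
#define MIIF_MACPRIV0   0x0100u
#define MIIF_PHYPRIV0   0x1000u

/* Hypothetical stand-in for the PHY softc: flags plus a register file. */
struct mock_phy {
    uint32_t flags;
    uint16_t regs[32];
    int      wake_writes;   /* how many wakeup register writes were made */
};

/*
 * Mirrors the atphy_attach() hunk: start from MIIF_NOMANPAUSE and
 * translate the MAC's private flag into a PHY-private one.
 */
uint32_t phy_attach_flags(uint32_t mac_flags)
{
    uint32_t flags = MIIF_NOMANPAUSE;

    if ((mac_flags & MIIF_MACPRIV0) != 0)
        flags |= MIIF_PHYPRIV0;
    return flags;
}

/*
 * Mirrors the atphy_reset() hunk as posted: registers 29 and 30 are
 * written only when the PHY-private flag is set.
 */
void phy_reset(struct mock_phy *phy)
{
    assert(phy != NULL);

    if ((phy->flags & MIIF_PHYPRIV0) != 0) {
        phy->regs[29] = 0x29;
        phy->regs[30] = 0;
        phy->wake_writes += 2;
    }
}
```

The point of the design is that the generic PHY driver stays unaware of which MAC it sits behind; the MAC driver opts in through a single flag at mii_attach() time.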
Re: ale(4) cannot negotiate as GigE
On Mon, Mar 04, 2013 at 05:59:48AM +0000, Alexey Dokuchaev wrote:
> On Mon, Mar 04, 2013 at 02:23:28PM +0900, YongHyeon PYUN wrote:
>> Ok, here is final diff which combines two things you've tested. So revert any changes before applying it. Let me know how it goes on your box.
>
> Hmm, apparently something went wrong, as I'm back to 100baseTX after make kernel and reboot...

Hmm, updated diff again.

> ./danfe

Index: sys/dev/ale/if_ale.c
===
--- sys/dev/ale/if_ale.c (revision 247382)
+++ sys/dev/ale/if_ale.c (working copy)
@@ -406,11 +406,13 @@ ale_phy_reset(struct ale_softc *sc)
  CSR_WRITE_2(sc, ALE_GPHY_CTRL,
      GPHY_CTRL_HIB_EN | GPHY_CTRL_HIB_PULSE | GPHY_CTRL_SEL_ANA_RESET |
      GPHY_CTRL_PHY_PLL_ON);
- DELAY(1000);
+ CSR_READ_2(sc, ALE_GPHY_CTRL);
+ DELAY(2000);
  CSR_WRITE_2(sc, ALE_GPHY_CTRL,
      GPHY_CTRL_EXT_RESET | GPHY_CTRL_HIB_EN | GPHY_CTRL_HIB_PULSE |
      GPHY_CTRL_SEL_ANA_RESET | GPHY_CTRL_PHY_PLL_ON);
- DELAY(1000);
+ CSR_READ_2(sc, ALE_GPHY_CTRL);
+ DELAY(2000);
 
 #define ATPHY_DBG_ADDR 0x1D
 #define ATPHY_DBG_DATA 0x1E
@@ -635,7 +637,7 @@ ale_attach(device_t dev)
  /* Set up MII bus. */
  error = mii_attach(dev, &sc->ale_miibus, ifp, ale_mediachange,
      ale_mediastatus, BMSR_DEFCAPMASK, sc->ale_phyaddr, MII_OFFSET_ANY,
-     MIIF_DOPAUSE);
+     MIIF_DOPAUSE | MIIF_MACPRIV0);
  if (error != 0) {
   device_printf(dev, "attaching PHYs failed\n");
   goto fail;
@@ -1515,6 +1517,7 @@ ale_setwol(struct ale_softc *sc)
       GPHY_CTRL_HIB_PULSE | GPHY_CTRL_PHY_PLL_ON |
       GPHY_CTRL_SEL_ANA_RESET | GPHY_CTRL_PHY_IDDQ |
       GPHY_CTRL_PCLK_SEL_DIS | GPHY_CTRL_PWDOWN_HW);
+  CSR_READ_2(sc, ALE_GPHY_CTRL);
   return;
  }
 
@@ -1547,6 +1550,7 @@ ale_setwol(struct ale_softc *sc)
       GPHY_CTRL_HIB_PULSE | GPHY_CTRL_SEL_ANA_RESET |
       GPHY_CTRL_PHY_IDDQ | GPHY_CTRL_PCLK_SEL_DIS |
       GPHY_CTRL_PWDOWN_HW);
+  CSR_READ_2(sc, ALE_GPHY_CTRL);
  }
  /* Request PME. */
  pmstat = pci_read_config(sc->ale_dev, pmc + PCIR_POWER_STATUS, 2);
Index: sys/dev/mii/atphy.c
===
--- sys/dev/mii/atphy.c (revision 247382)
+++ sys/dev/mii/atphy.c (working copy)
@@ -100,8 +100,14 @@ atphy_probe(device_t dev)
 static int
 atphy_attach(device_t dev)
 {
+ struct mii_attach_args *ma;
+ u_int flags;
 
- mii_phy_dev_attach(dev, MIIF_NOMANPAUSE, &atphy_funcs, 1);
+ ma = device_get_ivars(dev);
+ flags = MIIF_NOMANPAUSE;
+ if ((miibus_get_flags(dev) & MIIF_MACPRIV0) != 0)
+  flags |= MIIF_PHYPRIV0;
+ mii_phy_dev_attach(dev, flags, &atphy_funcs, 1);
  return (0);
 }
 
@@ -287,9 +293,11 @@ atphy_reset(struct mii_softc *sc)
  uint32_t reg;
  int i;
 
- /* Take PHY out of power down mode. */
- PHY_WRITE(sc, 29, 0x29);
- PHY_WRITE(sc, 30, 0);
+ if ((sc->mii_flags & MIIF_PHYPRIV0) != 0) {
+  /* Take PHY out of power down mode. */
+  PHY_WRITE(sc, 29, 0x29);
+  PHY_WRITE(sc, 30, 0);
+ }
 
  reg = PHY_READ(sc, ATPHY_SCR);
  /* Enable automatic crossover. */
Re: ale(4) cannot negotiate as GigE
On Mon, Mar 04, 2013 at 06:59:40AM +0000, Alexey Dokuchaev wrote:
> On Mon, Mar 04, 2013 at 03:29:44PM +0900, YongHyeon PYUN wrote:
>> On Mon, Mar 04, 2013 at 05:59:48AM +0000, Alexey Dokuchaev wrote:
>>> On Mon, Mar 04, 2013 at 02:23:28PM +0900, YongHyeon PYUN wrote:
>>>> Ok, here is final diff which combines two things you've tested. So revert any changes before applying it. Let me know how it goes on your box.
>>>
>>> Hmm, apparently something went wrong, as I'm back to 100baseTX after make kernel and reboot...
>>
>> Hmm, updated diff again.
>
> Better this time, I'm having 1000baseT again! :-)

Thanks a lot for testing and patience! Could you reboot multiple times and check whether you reliably get a gigabit link?

> ./danfe
Re: ale(4) cannot negotiate as GigE
On Fri, Feb 22, 2013 at 01:56:07AM +0000, Alexey Dokuchaev wrote:
> On Fri, Feb 22, 2013 at 10:13:08AM +0900, YongHyeon PYUN wrote:
>> On Thu, Feb 21, 2013 at 12:43:44PM +0000, Alexey Dokuchaev wrote:
>>> ale_flags = 0x00000040
>>
>> Thanks for the info. Indeed, your controller is AR8121 Gigabit ethernet (L1E). I guess the PHY initialization is not complete. Would you try attached patch?
>
> Thanks for the patch. Unfortunately, it's still 100baseTX full-duplex after driver reload. Even tried delaying for 3000, no difference. :(

Then I have no idea at this moment. Can you try another OS and check whether it can establish a gigabit link?

> ./danfe
Re: ale(4) cannot negotiate as GigE
On Wed, Feb 20, 2013 at 06:08:53AM +0000, Alexey Dokuchaev wrote:
> On Wed, Feb 20, 2013 at 01:37:39PM +0900, YongHyeon PYUN wrote:
> > On Tue, Feb 19, 2013 at 08:23:02AM +0000, Alexey Dokuchaev wrote:
> > > ale0@pci0:2:0:0: class=0x02 card=0x82261043 chip=0x10261969 rev=0xb0 hdr=0x00
> > >     vendor   = 'Atheros Communications Inc.'
> > >     device   = 'AR8121/AR8113/AR8114 Gigabit or Fast Ethernet'
> > >     class    = network
> > >     subclass = ethernet
> > >
> > > According to the specs, it should be GigE. [...]
> >
> > There is a Fast Ethernet version (L2E) so I'm not sure what you have.
> > Could you show me dmesg output (ale(4) and atphy(4) only) and
> > devinfo -rv | grep atphy?
>
> $ dmesg | egrep ale\|atphy
> ale0: <Atheros AR8121/AR8113/AR8114 PCIe Ethernet> port 0xcc00-0xcc7f mem 0xfe9c-0xfe9f irq 17 at device 0.0 on pci2
> ale0: 960 Tx FIFO, 1024 Rx FIFO
> ale0: Using 1 MSI messages.
> ale0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
> miibus0: <MII bus> on ale0
> atphy0: <Atheros F1 10/100/1000 PHY> PHY 0 on miibus0
> atphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
>
> $ devinfo -rv | grep atphy
> atphy0 pnpinfo oui=0xc82e model=0x1 rev=0x9 at phyno=0

Hmm, it's still not clear whether the controller is Gigabit or not.
Could you try the attached patch and let me know the output?

> > > I'm not sure why it happens; maybe it's somehow related to a
> > > handful of those ale0: link state changed to DOWN/UP flip-flops
> > > I see in dmesg(8), before it can finally obtain DHCP lease?
> >
> > That's normal when you initiate auto-negotiation with dhclient.
>
> Yes, I've already seen your lengthy explanation [1], thanks!
>
> > > I remember these adapters had problems in the past, like infamous
> > > "Corrupted MAC on input" disconnect messages, but I don't recall
> > > that I could not use it in GigE mode. [...]
> >
> > If you still see the "Corrupted MAC on input" message, let me know.
>
> No, those are long gone now (hopefully; at least I haven't seen them
> for a while).
> ./danfe
>
> [1] http://lists.freebsd.org/pipermail/freebsd-net/2009-January/020662.html

Index: sys/dev/ale/if_ale.c
===================================================================
--- sys/dev/ale/if_ale.c	(revision 246937)
+++ sys/dev/ale/if_ale.c	(working copy)
@@ -497,6 +497,9 @@ ale_attach(device_t dev)
 			sc->ale_flags |= ALE_FLAG_FASTETHER;
 		}
 	}
+#if 1
+	printf("ale_flags = 0x%08x\n", sc->ale_flags);
+#endif
 	/*
 	 * All known controllers seems to require 4 bytes alignment
 	 * of Tx buffers to make Tx checksum offload with custom
Re: ale(4) cannot negotiate as GigE
On Thu, Feb 21, 2013 at 12:43:44PM +0000, Alexey Dokuchaev wrote:
> On Thu, Feb 21, 2013 at 05:33:35PM +0900, YongHyeon PYUN wrote:
> > On Wed, Feb 20, 2013 at 06:08:53AM +0000, Alexey Dokuchaev wrote:
> > > $ dmesg | egrep ale\|atphy
> > > ale0: <Atheros AR8121/AR8113/AR8114 PCIe Ethernet> port 0xcc00-0xcc7f mem 0xfe9c-0xfe9f irq 17 at device 0.0 on pci2
> > > ale0: 960 Tx FIFO, 1024 Rx FIFO
> > > ale0: Using 1 MSI messages.
> > > ale0: 4GB boundary crossed, switching to 32bit DMA addressing mode.
> > > miibus0: <MII bus> on ale0
> > > atphy0: <Atheros F1 10/100/1000 PHY> PHY 0 on miibus0
> > > atphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > >
> > > $ devinfo -rv | grep atphy
> > > atphy0 pnpinfo oui=0xc82e model=0x1 rev=0x9 at phyno=0
> >
> > Hmm, it's still not clear whether the controller is Gigabit or not.
> > Could you try the attached patch and let me know the output?
>
> ale_flags = 0x0040

Thanks for the info. Indeed, your controller is AR8121 Gigabit
Ethernet (L1E). I guess the PHY initialization is not complete.
Would you try the attached patch?

> ./danfe

Index: sys/dev/ale/if_ale.c
===================================================================
--- sys/dev/ale/if_ale.c	(revision 246937)
+++ sys/dev/ale/if_ale.c	(working copy)
@@ -406,11 +406,11 @@
 	CSR_WRITE_2(sc, ALE_GPHY_CTRL,
 	    GPHY_CTRL_HIB_EN | GPHY_CTRL_HIB_PULSE | GPHY_CTRL_SEL_ANA_RESET |
 	    GPHY_CTRL_PHY_PLL_ON);
-	DELAY(1000);
+	DELAY(2000);
 	CSR_WRITE_2(sc, ALE_GPHY_CTRL,
 	    GPHY_CTRL_EXT_RESET | GPHY_CTRL_HIB_EN | GPHY_CTRL_HIB_PULSE |
 	    GPHY_CTRL_SEL_ANA_RESET | GPHY_CTRL_PHY_PLL_ON);
-	DELAY(1000);
+	DELAY(2000);
 
 #define	ATPHY_DBG_ADDR		0x1D
 #define	ATPHY_DBG_DATA		0x1E
Re: ale(4) cannot negotiate as GigE
On Tue, Feb 19, 2013 at 08:23:02AM +0000, Alexey Dokuchaev wrote:
> Hi there,
>
> I've recently put back online one of my home servers, updated to the
> latest -CURRENT code. All went fine, but one thing bothers me. This
> box bears Asus P5Q Pro mobo, with the following onboard NIC:
>
> ale0@pci0:2:0:0: class=0x02 card=0x82261043 chip=0x10261969 rev=0xb0 hdr=0x00
>     vendor   = 'Atheros Communications Inc.'
>     device   = 'AR8121/AR8113/AR8114 Gigabit or Fast Ethernet'
>     class    = network
>     subclass = ethernet
>
> According to the specs, it should be GigE. In fact, when plugged into
> a capable switch, it displays green (gig) status (same on the switch),
> but once being initialized by the kernel, it downgrades to yellowish
> 100mbps (real speeds agree).

There is a Fast Ethernet version (L2E) so I'm not sure what you have.
Could you show me dmesg output (ale(4) and atphy(4) only) and
devinfo -rv | grep atphy?

> I'm not sure why it happens; maybe it's somehow related to a handful
> of those ale0: link state changed to DOWN/UP flip-flops I see in
> dmesg(8), before it can finally obtain DHCP lease?

That's normal when you initiate auto-negotiation with dhclient.

> I remember these adapters had problems in the past, like infamous
> "Corrupted MAC on input" disconnect messages, but I don't recall that
> I could not use it in GigE mode.
>
> Anything I can do about it? Googling did not help much: most reports
> date back to ca. 2009, and apparently were ironed out in later
> revisions (e.g. selectively disabling checksum offloading). Thanks,

If you still see the "Corrupted MAC on input" message, let me know.

> ./danfe
Re: About 802.1Q tag
On Mon, Nov 26, 2012 at 09:54:14AM +0900, Kohji Okuno wrote:
> Hi,
>
> Would someone check the following code?
>
> If the hardware does not process an 802.1Q tag, the kernel repacks
> the mbuf in lines 578-580. But `struct ether_header *eh' was assigned
> at line 484, and in lines 611-637 the kernel still refers to the old
> eh pointer, so it will misjudge the Ethernet packet.
>
> I think that `eh = mtod(m, struct ether_header *);' is needed after
> line 580.

Yes, your analysis looks correct.

> Thanks,
>  Kohji Okuno
>
> sys/net/if_ethersubr.c:
>
> 448 static void
> 449 ether_input_internal(struct ifnet *ifp, struct mbuf *m)
> 450 {
> 451     struct ether_header *eh;
>
> 484     eh = mtod(m, struct ether_header *);
>
> 554     /*
> 555      * If the hardware did not process an 802.1Q tag, do this now,
> 556      * to allow 802.1P priority frames to be passed to the main input
> 557      * path correctly.
> 558      * TODO: Deal with Q-in-Q frames, but not arbitrary nesting levels.
> 559      */
> 560     if ((m->m_flags & M_VLANTAG) == 0 && etype == ETHERTYPE_VLAN) {
>
> 578             bcopy((char *)evl, (char *)evl + ETHER_VLAN_ENCAP_LEN,
> 579                 ETHER_HDR_LEN - ETHER_TYPE_LEN);
> 580             m_adj(m, ETHER_VLAN_ENCAP_LEN);
> 581     }
>
> 610
> 611 #if defined(INET) || defined(INET6)
> 612     /*
> 613      * Clear M_PROMISC on frame so that carp(4) will see it when the
> 614      * mbuf flows up to Layer 3.
> 615      * FreeBSD's implementation of carp(4) uses the inprotosw
> 616      * to dispatch IPPROTO_CARP. carp(4) also allocates its own
> 617      * Ethernet addresses of the form 00:00:5e:00:01:xx, which
> 618      * is outside the scope of the M_PROMISC test below.
> 619      * TODO: Maintain a hash table of ethernet addresses other than
> 620      * ether_dhost which may be active on this ifp.
> 621      */
> 622     if (ifp->if_carp && (*carp_forus_p)(ifp, eh->ether_dhost)) {
> 623             m->m_flags &= ~M_PROMISC;
> 624     } else
> 625 #endif
> 626     {
> 627             /*
> 628              * If the frame received was not for our MAC address, set the
> 629              * M_PROMISC flag on the mbuf chain. The frame may need to
> 630              * be seen by the rest of the Ethernet input path in case of
> 631              * re-entry (e.g. bridge, vlan, netgraph) but should not be
> 632              * seen by upper protocol layers.
> 633              */
> 634             if (!ETHER_IS_MULTICAST(eh->ether_dhost) &&
> 635                 bcmp(IF_LLADDR(ifp), eh->ether_dhost, ETHER_ADDR_LEN) != 0)
> 636                     m->m_flags |= M_PROMISC;
> 637     }
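The stale-pointer problem described above can be reproduced outside the kernel. The following is a minimal user-space sketch, not the kernel code: struct frame is a hypothetical stand-in for an mbuf, frame_hdr() plays the role of mtod(), and frame_strip_vlan() mimics the bcopy() plus m_adj() pair from lines 578-580. It only illustrates why a header pointer taken before the tag is stripped has to be re-derived afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN	6
#define ETH_TYPE_LEN	2
#define ETH_HDR_LEN	14
#define VLAN_ENCAP_LEN	4

/* Hypothetical stand-in for an mbuf: a data pointer plus a length. */
struct frame {
	uint8_t	*data;
	size_t	 len;
};

struct eth_hdr {
	uint8_t	dhost[ETH_ADDR_LEN];
	uint8_t	shost[ETH_ADDR_LEN];
	uint8_t	type[ETH_TYPE_LEN];
};

/* Plays the role of mtod(): header pointer at the current data start. */
static struct eth_hdr *
frame_hdr(struct frame *f)
{
	return ((struct eth_hdr *)(void *)f->data);
}

/* Remove the 4-byte 802.1Q tag that follows the two MAC addresses. */
static void
frame_strip_vlan(struct frame *f)
{
	assert(f->len >= ETH_HDR_LEN + VLAN_ENCAP_LEN);
	/* Slide both MAC addresses forward over the tag (the bcopy step). */
	memmove(f->data + VLAN_ENCAP_LEN, f->data, ETH_HDR_LEN - ETH_TYPE_LEN);
	/* Trim the front (the m_adj step): the header now starts later. */
	f->data += VLAN_ENCAP_LEN;
	f->len -= VLAN_ENCAP_LEN;
}
```

A pointer obtained with frame_hdr() before frame_strip_vlan() still points at the old start of the buffer, so its dhost field mixes leftover and shifted bytes. Calling frame_hdr() again after the strip gives the correct header, which is the user-space equivalent of adding `eh = mtod(m, struct ether_header *);' after line 580.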
Re: Call for bge(4) testers
On Tue, Oct 02, 2012 at 11:10:23AM -0700, Sean Bruno wrote:
> On Tue, 2012-10-02 at 15:59 -0700, YongHyeon PYUN wrote:
> > Sean, do you have a box with BCM5703/5704/5714/5715 controller?
>
> I have a 5704C in an HP DL380G4 here that seems to be working. I'll
> have to poke around further to see what else I have lying around.
>
> bge0: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdef-0xfdef irq 25 at device 1.0 on pci3
> bge0: CHIP ID 0x2100; ASIC REV 0x02; CHIP REV 0x21; PCI-X 133 MHz
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5704 1000BASE-T media interface> PHY 1 on miibus0
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge0: Ethernet address: 00:0f:20:f6:e6:23
> bge1: <HP NC7782 Gigabit Server Adapter, ASIC rev. 0x002100> mem 0xfdee-0xfdee irq 26 at device 1.1 on pci3
> bge1: CHIP ID 0x2100; ASIC REV 0x02; CHIP REV 0x21; PCI-X 133 MHz
> miibus1: <MII bus> on bge1
> brgphy1: <BCM5704 1000BASE-T media interface> PHY 1 on miibus1
> brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge1: Ethernet address: 00:0f:20:f6:e6:22
>
> Sean

I have checked in all changes except one in the WIP version to HEAD.
If you happen to see any abnormal bge(4) behavior on CURRENT let me
know.

Thanks.
Re: Call for bge(4) testers
On Thu, Sep 27, 2012 at 05:09:34PM -0700, Sean Bruno wrote: On Wed, 2012-09-19 at 09:44 -0700, Sean Bruno wrote: On Fri, 2012-09-14 at 14:27 -0700, YongHyeon PYUN wrote: All, There were lots of reports that stock bge(4) does not work on Dell Rx20/HP DL 360 G8. With the help of Broadcom and BCM5719/BCM5720 users I managed to address the issue but I had to touch very sensitive part of driver. Before committing the change to tree I'd like to know whether this change introduces regressions on old bge(4) controllers. If you're bge(4) user, please try latest WIP version at the following URL and let me know how it goes on your box. I'm especially interested in whether there is any ASF/IPMI regression on BCM570x/571x. http://people.freebsd.org/~yongari/bge/if_bge.c http://people.freebsd.org/~yongari/bge/if_bgereg.h http://people.freebsd.org/~yongari/bge/brgphy.c We're starting to gather data and have a couple of machines (pciconf, ifconfig, dmesg) here that may provide some insights. Everything seems to be working at a cursory level. http://people.freebsd.org/~sbruno/new_bge/ Thanks for testing! Sean, do you have a box with BCM5703/5704/5714/5715 controller? If the answer is yes, would you give it spin on the box? Due to the reset sequence changes I'd like to know whether there are any regressions on these controllers. The reset sequence change will also affect BCM5906/5906M controller. I guess bge(4) didn't completely reset BCM5906 such that it may have resulted in RX CPU handing under device resume. The WIP version wouldn't completely solve resume issue but it would make one step forward to right direction. We have seen 2 instances of one or more of the HP machines failing and dropping off the network. however, we don't have specifics yet. Sean ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Call for bge(4) testers
On Fri, Sep 21, 2012 at 08:34:29PM +0900, Wanpeng Qian wrote: On Thu, Sep 20, 2012 at 06:56:09AM +0900, Wanpeng Qian wrote: Hi, On Mon, Sep 17, 2012 at 09:37:21PM +0900, Wanpeng Qian wrote: Hi, here is the dmesg output. bge0: HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100 mem 0xfe9f-0xfe9f irq 18 at device 0.0 on pci4 bge0: CHIP ID 0x05784100; ASIC REV 0x5784; CHIP REV 0x57841; PCI-E miibus0: MII bus on bge0 brgphy0: BCM5784 10/100/1000baseT PHY PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow It seems your controller is BCM5784 A1. The latest WIP have one change that may affect its DMA behavior. So it would be good to know how the WIP version works on your box. I update my system to 9-STABLE and using your WIP files. after I reboot the whole system. I cannot find bge anymore. here is the pciconf -lv output. none1@pci0:4:0:0: class=0x02 card=0x705d103c chip=0x165b14e4 rev=0x10 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM5723 Gigabit Ethernet PCIe' class = network subclass = ethernet Hmm, the WIP version didn't remove the chip id so bge(4) may have failed to attach. Could you check any message printed by bge(4) in dmesg output? There is neither message related to bge in the dmesg output. nor ifconfig -a output. anything else I can try ? Does stock bge(4) in latest stable/9 recognize your controller? If the answer is yes, would you post full verbose boot message? ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Call for bge(4) testers
On Thu, Sep 20, 2012 at 06:56:09AM +0900, Wanpeng Qian wrote: Hi, On Mon, Sep 17, 2012 at 09:37:21PM +0900, Wanpeng Qian wrote: Hi, here is the dmesg output. bge0: HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100 mem 0xfe9f-0xfe9f irq 18 at device 0.0 on pci4 bge0: CHIP ID 0x05784100; ASIC REV 0x5784; CHIP REV 0x57841; PCI-E miibus0: MII bus on bge0 brgphy0: BCM5784 10/100/1000baseT PHY PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow It seems your controller is BCM5784 A1. The latest WIP have one change that may affect its DMA behavior. So it would be good to know how the WIP version works on your box. I update my system to 9-STABLE and using your WIP files. after I reboot the whole system. I cannot find bge anymore. here is the pciconf -lv output. none1@pci0:4:0:0: class=0x02 card=0x705d103c chip=0x165b14e4 rev=0x10 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM5723 Gigabit Ethernet PCIe' class = network subclass = ethernet Hmm, the WIP version didn't remove the chip id so bge(4) may have failed to attach. Could you check any message printed by bge(4) in dmesg output? Regards. Qian FreeBSD 9.0 RELEASE. Regards. Qian watchdog timeouts can be triggered by various issues so it's hard to guess the root cause of the issue. Would you show me the dmesg output(bge(4)/brgphy(4) output only)? ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Call for bge(4) testers
On Mon, Sep 17, 2012 at 09:37:21PM +0900, Wanpeng Qian wrote: Hi, here is the dmesg output. bge0: HP NC107i PCIe Gigabit Server Adapter, ASIC rev. 0x5784100 mem 0xfe9f-0xfe9f irq 18 at device 0.0 on pci4 bge0: CHIP ID 0x05784100; ASIC REV 0x5784; CHIP REV 0x57841; PCI-E miibus0: MII bus on bge0 brgphy0: BCM5784 10/100/1000baseT PHY PHY 1 on miibus0 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow It seems your controller is BCM5784 A1. The latest WIP have one change that may affect its DMA behavior. So it would be good to know how the WIP version works on your box. FreeBSD 9.0 RELEASE. Regards. Qian watchdog timeouts can be triggered by various issues so it's hard to guess the root cause of the issue. Would you show me the dmesg output(bge(4)/brgphy(4) output only)? ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
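The three values on the "CHIP ID" dmesg line above appear to be related by simple shifts: 0x05784100 shifted right by 12 bits gives the printed ASIC REV 0x5784, and shifted right by 8 bits gives the printed CHIP REV 0x57841. The sketch below only encodes that observation. The shift amounts are inferred from the printed values, and the helper names are hypothetical, not the driver's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* ASIC revision: the chip ID with its low 12 bits dropped (inferred). */
static uint32_t
chipid_asic_rev(uint32_t chip_id)
{
	return (chip_id >> 12);
}

/* Chip revision: the chip ID with its low 8 bits dropped (inferred). */
static uint32_t
chipid_chip_rev(uint32_t chip_id)
{
	return (chip_id >> 8);
}

/* Check the helpers against the values printed in the dmesg line. */
static void
chipid_self_check(void)
{
	assert(chipid_asic_rev(0x05784100) == 0x5784);
	assert(chipid_chip_rev(0x05784100) == 0x57841);
}
```

The same relation holds for the BCM5704 report elsewhere in this thread, where CHIP ID 0x2100 is printed with ASIC REV 0x02 and CHIP REV 0x21.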
Re: Call for bge(4) testers
On Mon, Sep 17, 2012 at 05:39:09PM +0600, Eugene M. Zheganin wrote:
> Hi.
>
> On 15.09.2012 03:27, YongHyeon PYUN wrote:
> > I'm especially interested in whether there is any ASF/IPMI
> > regression on BCM570x/571x.
>
> There's a reopened bug concerning the 8.x releases' version of the
> bge(4) driver not working with IPMI
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/122252). I can also
> say that enabling ASF on RELENG_8 still leads to locking and hangups.
>
> Does this CFT mean that this situation may be improved with the new
> bge(4) version, on 9.x?

I'm afraid it wouldn't. ASF/IPMI support of bge(4) has many issues.
Only a small number of lucky users were able to use IPMI. I did not
want to break IPMI for these users in the WIP version. But ASF/IPMI
should work for controllers with APE (BCM5719/BCM5720).

> Eugene.
Re: Call for bge(4) testers
On Fri, Sep 14, 2012 at 09:11:02PM +0900, Wanpeng Qian wrote:
> > It seems BCM5723 support code was not added by me so I don't know
> > how well it works in previous FreeBSD releases. Did bge(4) ever
> > work with your controller?
>
> The driver works fine except for the "bge0: Watchdog timeout", which
> brings the interface down/up for a while and makes it unstable for
> network share service.
>
> This card works fine under Windows and OpenSolaris, so I think this
> is a driver issue.

Watchdog timeouts can be triggered by various issues so it's hard to
guess the root cause of the issue. Would you show me the dmesg output
(bge(4)/brgphy(4) output only)?

> When I search with Google, many users report this issue, from
> FreeBSD 7 to CURRENT. There is no workaround at this time except to
> buy another card.
>
> Regards.
> Qian
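For readers unfamiliar with the message, a driver watchdog of this kind is usually a countdown that is armed when a frame is queued for transmit and cleared when the hardware reports completion, as the vge(4) reply at the top of this archive describes. The following is a generic user-space sketch of that idea, not bge(4) code. The 5-tick timeout, struct nic and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_TIMEOUT_TICKS	5	/* assumed timeout, one tick per second */

/* Hypothetical per-interface state. */
struct nic {
	int	wd_timer;	/* 0 means the watchdog is disarmed */
	int	resets;		/* how many times the watchdog fired */
};

/* Arm the watchdog whenever a frame is handed to the hardware. */
static void
nic_tx_start(struct nic *n)
{
	n->wd_timer = TX_TIMEOUT_TICKS;
}

/* Disarm it once the hardware has completed everything queued. */
static void
nic_tx_done(struct nic *n, bool queue_empty)
{
	if (queue_empty)
		n->wd_timer = 0;
}

/*
 * Periodic tick. Returns true when the countdown expires, which is the
 * point where a driver would log "watchdog timeout" and reinitialize.
 */
static bool
nic_tick(struct nic *n)
{
	if (n->wd_timer == 0 || --n->wd_timer > 0)
		return (false);
	n->resets++;
	return (true);
}
```

The sketch shows why the message says little about the root cause: anything that keeps nic_tx_done() from being called in time, such as a lost interrupt or a stalled DMA engine, ends in the same timeout.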
Re: Call for bge(4) testers
On Fri, Sep 14, 2012 at 01:04:50PM -0500, Pedro Giffuni wrote:
> Success !!!
>
> It fixed kern/169634 for me.

Great, would you write a follow-up to the PR?

> If still possible it should be pushed into 9.1-RELEASE.

I'm afraid it was too late.

> Thank you so much for working on this!

No problem!

> Pedro.
>
> On 09/14/2012 16:27, YongHyeon PYUN wrote:
> > All,
> >
> > There were lots of reports that stock bge(4) does not work on Dell
> > Rx20/HP DL 360 G8. With the help of Broadcom and BCM5719/BCM5720
> > users I managed to address the issue but I had to touch very
> > sensitive part of driver. Before committing the change to tree I'd
> > like to know whether this change introduces regressions on old
> > bge(4) controllers.
> > If you're bge(4) user, please try latest WIP version at the
> > following URL and let me know how it goes on your box. I'm
> > especially interested in whether there is any ASF/IPMI regression
> > on BCM570x/571x.
> >
> > http://people.freebsd.org/~yongari/bge/if_bge.c
> > http://people.freebsd.org/~yongari/bge/if_bgereg.h
> > http://people.freebsd.org/~yongari/bge/brgphy.c
> >
> > Build instructions
> > 1. Copy both if_bge.c/if_bgereg.h to /usr/src/sys/dev/bge directory
> > 2. Copy brgphy.c to /usr/src/sys/dev/mii
> > 3. Rebuild kernel and reboot to take the change effect.
> >
> > You can also use the files above for 9.1/stable/9. For stable/8 it
> > needs slight modification and I couldn't find time to regenerate
> > the patch.
> >
> > Thanks.
Re: Call for bge(4) testers
On Fri, Sep 14, 2012 at 03:19:52PM +0900, Wanpeng Qian wrote: Hi, I encounter a watchdog timeout issue on NetXtreme BCM5723 Gigabit Ethernet PCIe, dose this patch solve this issue? I'm not aware of BCM5723. Could you show me the output of pciconf -lcbv? If so, I can test it. Regards. Qian All, There were lots of reports that stock bge(4) does not work on Dell Rx20/HP DL 360 G8. With the help of Broadcom and BCM5719/BCM5720 users I managed to address the issue but I had to touch very sensitive part of driver. Before committing the change to tree I'd like to know whether this change introduces regressions on old bge(4) controllers. If you're bge(4) user, please try latest WIP version at the following URL and let me know how it goes on your box. I'm especially interested in whether there is any ASF/IPMI regression on BCM570x/571x. http://people.freebsd.org/~yongari/bge/if_bge.c http://people.freebsd.org/~yongari/bge/if_bgereg.h http://people.freebsd.org/~yongari/bge/brgphy.c Build instructions 1. Copy both if_bge.c/if_bgereg.h to /usr/src/sys/dev/bge directory 2. Copy brgphy.c /usr/src/sys/dev/mii 3. Rebuild kernel and reboot to take the change effect. You can also use the files above for for 9.1/stable/9. For stable/8 it needs slight modification and I couldn't find time to regenerate the patch. Thanks. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Call for bge(4) testers
On Fri, Sep 14, 2012 at 05:38:36PM +0900, Wanpeng Qian wrote: Hi, Here is the output. the machine is HP Microserver N36L. onboard lan. bge0@pci0:4:0:0: class=0x02 card=0x705d103c chip=0x165b14e4 rev=0x10 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM5723 Gigabit Ethernet PCIe' class = network subclass = ethernet bar [10] = type Memory, range 64, base 0xfe9f, size 65536, enabled cap 01[48] = powerspec 3 supports D0 D3 current D0 cap 03[40] = VPD cap 09[60] = vendor (length 108) cap 05[50] = MSI supports 1 message, 64 bit enabled with 1 message cap 10[cc] = PCI-Express 2 endpoint max data 128(256) link x1(x1) ecap 0001[100] = AER 1 0 fatal 0 non-fatal 1 corrected ecap 0002[13c] = VC 1 max VC0 ecap 0003[160] = Serial 1 d8d385fffeaf9f38 ecap 0004[16c] = unknown 1 It seems BCM5723 support code was not added by me so I don't know how well it works in previous FreeBSD releases. Did bge(4) ever work with your controller? Regards. Qian On Fri, Sep 14, 2012 at 03:19:52PM +0900, Wanpeng Qian wrote: Hi, I encounter a watchdog timeout issue on NetXtreme BCM5723 Gigabit Ethernet PCIe, dose this patch solve this issue? I'm not aware of BCM5723. Could you show me the output of pciconf -lcbv? If so, I can test it. Regards. Qian All, There were lots of reports that stock bge(4) does not work on Dell Rx20/HP DL 360 G8. With the help of Broadcom and BCM5719/BCM5720 users I managed to address the issue but I had to touch very sensitive part of driver. Before committing the change to tree I'd like to know whether this change introduces regressions on old bge(4) controllers. If you're bge(4) user, please try latest WIP version at the following URL and let me know how it goes on your box. I'm especially interested in whether there is any ASF/IPMI regression on BCM570x/571x. http://people.freebsd.org/~yongari/bge/if_bge.c http://people.freebsd.org/~yongari/bge/if_bgereg.h http://people.freebsd.org/~yongari/bge/brgphy.c Build instructions 1. 
Copy both if_bge.c/if_bgereg.h to /usr/src/sys/dev/bge directory 2. Copy brgphy.c /usr/src/sys/dev/mii 3. Rebuild kernel and reboot to take the change effect. You can also use the files above for for 9.1/stable/9. For stable/8 it needs slight modification and I couldn't find time to regenerate the patch. Thanks. ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
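As a reading aid for the pciconf output quoted above: the chip= and card= fields each pack two 16-bit PCI IDs into one 32-bit value, with the vendor (or subvendor) ID in the low half and the device (or subdevice) ID in the high half. A small sketch that splits them follows. The struct and function names are hypothetical, and the self-check simply uses the values from the listing.

```c
#include <assert.h>
#include <stdint.h>

/* A vendor/device pair as packed into pciconf's chip= or card= field. */
struct pci_id {
	uint16_t vendor;	/* low 16 bits */
	uint16_t device;	/* high 16 bits */
};

static struct pci_id
pci_split_id(uint32_t combined)
{
	struct pci_id id;

	id.vendor = (uint16_t)(combined & 0xffff);
	id.device = (uint16_t)(combined >> 16);
	return (id);
}

/* Check against chip=0x165b14e4 from the listing. */
static void
pci_split_self_check(void)
{
	struct pci_id chip = pci_split_id(0x165b14e4);

	assert(chip.vendor == 0x14e4);	/* Broadcom, as the listing says */
	assert(chip.device == 0x165b);
}
```

Applied to card=0x705d103c, the same split gives subvendor 0x103c (HP, consistent with the Microserver) and subdevice 0x705d.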
Call for bge(4) testers
All,

There were lots of reports that stock bge(4) does not work on Dell
Rx20/HP DL 360 G8. With the help of Broadcom and BCM5719/BCM5720
users I managed to address the issue but I had to touch very
sensitive part of driver. Before committing the change to tree I'd
like to know whether this change introduces regressions on old
bge(4) controllers.
If you're bge(4) user, please try latest WIP version at the following
URL and let me know how it goes on your box. I'm especially
interested in whether there is any ASF/IPMI regression on
BCM570x/571x.

http://people.freebsd.org/~yongari/bge/if_bge.c
http://people.freebsd.org/~yongari/bge/if_bgereg.h
http://people.freebsd.org/~yongari/bge/brgphy.c

Build instructions
1. Copy both if_bge.c/if_bgereg.h to /usr/src/sys/dev/bge directory
2. Copy brgphy.c to /usr/src/sys/dev/mii
3. Rebuild kernel and reboot to take the change effect.

You can also use the files above for 9.1/stable/9. For stable/8 it
needs slight modification and I couldn't find time to regenerate the
patch.

Thanks.
Re: dhclient cause up/down cycle after 239356 ?
On Thu, Aug 23, 2012 at 11:35:34AM +1000, Peter Jeremy wrote:
> On 2012-Aug-22 15:35:01 -0400, John Baldwin j...@freebsd.org wrote:
> > Hmm. Perhaps we could use a debouncer to ignore short link flaps?
> > Kind of gross (and OpenBSD doesn't do this). For now this change
> > basically ignores link up events if they occur within 5 seconds of
> > the link down event. The 5 is hardcoded which is kind of yuck.
>
> I'm also a bit concerned about this for similar reasons to adrian@.
> We need to distinguish between short link outages caused by (eg) a
> switch admin reconfiguring the switch (which needs the lease to be
> re-checked) and those caused by broken NICs which report link status
> changes when they are touched. Maybe an alternative is to just
> ignore link flaps when they occur within a few seconds of a
> script_go(). (And/or make the ignore timeout configurable).
>
> Apart from fxp(4), does anyone know how many NICs are similarly
> broken?

FreeBSD used to blindly call the driver's if_init() in ether_ioctl()
whenever an IP address is assigned to an interface. This results in
calling foo_init() in the driver, so a controller/link reset happens
after IP address assignment. I tried to fix many Ethernet drivers in
the tree to ignore a redundant foo_init() call by checking whether
this foo_init() call is the very first initialization of the
interface.
Both NetBSD and OpenBSD seem not to call if_init() if the driver is
already running. Because some controllers (e.g. fxp(4)) may require a
full controller reset to make multicast work, I couldn't follow their
approach.

I still don't know what other drivers except fxp(4) require a full
controller reset. There are too many old Ethernet drivers I don't
have access to.

Another reason why fxp(4) requires a redundant controller reset is
the flow control support of the driver. Due to a hardware limitation,
MAC configuration for the negotiated link's flow control parameters
also requires a controller reset.

> Does anyone know why this issue doesn't bite OpenBSD? Does it have
> a work-around to avoid resetting the link, not report link status
> changes or just no-one has noticed the issue?

I guess OpenBSD's fxp(4) has to reset the controller to update the
multicast filter, but it does not support flow control for fxp(4)
yet, so OpenBSD may see fewer link flips than FreeBSD.

> BTW to jhb: Can you check your mailer's list configuration. You
> appear to be adding freebsd-current@freebsd.org and leaving
> curr...@freebsd.org in the Cc list.
>
> --
> Peter Jeremy
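The debouncer idea discussed above can be sketched in a few lines. This is one possible interpretation, not the actual dhclient change: a DOWN event is held as pending, an UP event arriving inside the window cancels it as a flap, and only a DOWN that outlives the window is reported as a real link loss. The struct, the function names and the polling style are all assumptions. The window is a field rather than a constant, following the suggestion to make the timeout configurable.

```c
#include <stdbool.h>

/* Hypothetical debounce state for one interface. */
struct link_debounce {
	long	window;		/* seconds a DOWN may last and still be a flap */
	bool	pending_down;	/* a DOWN was seen and not yet resolved */
	long	down_at;	/* time of that DOWN, in seconds */
};

/* Feed every link state change into the filter. */
static void
link_debounce_event(struct link_debounce *d, bool link_up, long now)
{
	if (!link_up) {
		if (!d->pending_down) {
			d->pending_down = true;
			d->down_at = now;
		}
	} else if (d->pending_down && now - d->down_at < d->window) {
		/* UP came back quickly: treat the pair as a flap. */
		d->pending_down = false;
	}
}

/*
 * Poll periodically. Returns true once when a DOWN has lasted at least
 * `window' seconds, i.e. when the caller should act on the link loss.
 */
static bool
link_debounce_lost(struct link_debounce *d, long now)
{
	if (d->pending_down && now - d->down_at >= d->window) {
		d->pending_down = false;
		return (true);
	}
	return (false);
}
```

With a 5 second window, the fxp0 DOWN/UP pair two seconds apart reported in this thread would be swallowed. The trade-off raised above still applies: a genuine short outage, such as a switch reconfiguration, is swallowed too, so the lease would not be re-checked in that case.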
Re: dhclient cause up/down cycle after 239356 ?
On Wed, Aug 22, 2012 at 08:27:01AM +1000, Peter Jeremy wrote:
> On 2012-Aug-21 19:42:17 +0300, Vitalij Satanivskij sa...@ukr.net wrote:
> > Looks like dhclient does a down/up sequence -
>
> Not intentionally.
>
> > Aug 21 19:21:00 home kernel: fxp0: link state changed to UP
> > Aug 21 19:21:01 home kernel: fxp0: link state changed to DOWN
> > Aug 21 19:21:01 home dhclient: New IP Address (fxp0): xx.xx.xx.xx
> > Aug 21 19:21:01 home dhclient: New Subnet Mask (fxp0): 255.255.255.0
> > Aug 21 19:21:01 home dhclient: New Broadcast Address (fxp0): xx.xx.xx.xx
> > Aug 21 19:21:01 home dhclient: New Routers (fxp0): xx.xx.xx.xx
> > Aug 21 19:21:03 home kernel: fxp0: link state changed to UP
>
> I can reproduce this behaviour - but only on fxp (i82559 in my case)
> NICs. My bge (BCM5750) and rl (RTL8139) NICs do not report the
> spurious DOWN/UP. (I don't normally run DHCP on any fxp interfaces,
> so I didn't see it during my testing).
>
> The problem appears to be the
>
>   $IFCONFIG $interface inet alias 0.0.0.0 netmask 255.0.0.0 broadcast 255.255.255.255 up
>
> executed by /sbin/dhclient-script during PREINIT. This is making the
> fxp NIC reset the link (actually, assigning _any_ IP address to an
> fxp NIC causes it to reset the link). The post r239356 dhclient
> detects the link going down and exits.

This comes from a hardware limitation. Assigning addresses results in
programming the multicast filter, and fxp(4) controllers require a
full controller reset to reprogram the multicast filter.

> > Before r239356 the iface was just doing down/up without dhclient
> > exiting, and everything worked fine.
>
> For you, anyway. Failing to detect link down causes problems for me
> because my dhclient was not seeing my cable-modem resets and
> therefore failing to reacquire a DHCP lease.
>
> --
> Peter Jeremy
Re: sk0 link bouncing
On Wed, Jul 04, 2012 at 12:08:16PM +0200, Willem Jan Withagen wrote:
> Hi,
>
> I've got tons of these since I stopped loading the port with traffic.
> It seems to have a pretty steady 27 min interval.
>
> Jul 4 07:00:05 freetest kernel: sk0: link state changed to DOWN
> Jul 4 07:00:05 freetest kernel: sk0: link state changed to UP
> Jul 4 07:27:21 freetest kernel: sk0: link state changed to DOWN
> Jul 4 07:27:21 freetest kernel: sk0: link state changed to UP
> Jul 4 07:53:48 freetest kernel: sk0: link state changed to DOWN
> Jul 4 07:53:48 freetest kernel: sk0: link state changed to UP
> Jul 4 08:21:16 freetest kernel: sk0: link state changed to DOWN
> Jul 4 08:21:16 freetest kernel: sk0: link state changed to UP
> Jul 4 08:48:10 freetest kernel: sk0: link state changed to DOWN
> Jul 4 08:48:11 freetest kernel: sk0: link state changed to UP
> Jul 4 09:13:38 freetest kernel: sk0: link state changed to DOWN
> Jul 4 09:13:38 freetest kernel: sk0: link state changed to UP
> Jul 4 09:39:06 freetest kernel: sk0: link state changed to DOWN
> Jul 4 09:39:06 freetest kernel: sk0: link state changed to UP
>
> Very recent 10-current install with std GENERIC kernel.
> FreeBSD freetest.digiware.nl 10.0-CURRENT FreeBSD 10.0-CURRENT #1:
> Sat Jun 30 09:35:43 UTC 2012
> r...@freetest.digiware.nl:/usr/obj/usr/src/sys/GENERIC amd64
>
> The port is connected to a basic netgear 10/100/1000 switch with
> nothing modified in the config of that port. Other connections do
> not seem to suffer from disconnecting.
>
> Used the server to 'zfs send' a 360G backup to, and then it did not
> do anything like this, the port just stayed up.
>
> Suggestions on where or what to look for?

Probably you have to implement a link state change handler for the
Marvell controller (i.e. sk_marv_miibus_statchg()). Locking for MII
access should be revisited too. Sorry, I don't have spare time to do
that.

> Thanx,
> --WjW
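The "pretty steady 27 min" estimate can be checked from the DOWN timestamps in the log above. The sketch below is just that arithmetic. The helper names are hypothetical, and the timestamps are assumed to fall on the same day.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a same-day hh:mm:ss timestamp to seconds since midnight. */
static long
hms_to_secs(int h, int m, int s)
{
	return (h * 3600L + m * 60L + s);
}

struct gap_stats {
	long	min;	/* shortest gap between consecutive events, seconds */
	long	max;	/* longest gap, seconds */
	long	mean;	/* average gap, rounded down, seconds */
};

/* Gap statistics over n ascending timestamps (n must be at least 2). */
static struct gap_stats
gap_stats_of(const long *t, size_t n)
{
	struct gap_stats gs;
	size_t i;

	assert(n >= 2);
	gs.min = gs.max = t[1] - t[0];
	for (i = 2; i < n; i++) {
		long gap = t[i] - t[i - 1];

		if (gap < gs.min)
			gs.min = gap;
		if (gap > gs.max)
			gs.max = gap;
	}
	gs.mean = (t[n - 1] - t[0]) / (long)(n - 1);
	return (gs);
}
```

Fed with the seven DOWN times from 07:00:05 to 09:39:06, the gaps range from 1528 to 1648 seconds with a mean of 1590 seconds, i.e. roughly 25.5 to 27.5 minutes and about 26.5 minutes on average.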
Re: NICs not in GENERIC
On Tue, Feb 21, 2012 at 03:56:56PM +0100, Alexander Leidinger wrote:
> Hi,
>
> is there a specific reason that the following NICs are not (or shall not be) in GENERIC (at least on i386)?
> - if_cas: is compiled as a module, Sun hardware, non-x86 only?
> - if_cxgb
> - if_cxgbe

Last time I tried cas(4) on i386, it worked without problems and I think all Sun add-on cards would work. However, as Scott said, it would be rare to see these Sun controllers in the x86 world.

> - if_gem: is compiled as a module, Apple/Sun, non-x86 only?
> - if_hme: is compiled as a module, Sun hardware, non-x86 only?
> - if_ic: no man-page
> - if_ipheth: no man-page
> - if_mos: USB NIC
> - if_mxge
> - if_my
> - if_nxge
> - if_vtnet: virtual NIC for hypervisors
>
> Bye,
> Alexander.
>
> --
> Progress might have been all right once, but it's gone on too long.
> -- Ogden Nash
> http://www.Leidinger.net  Alexander @ Leidinger.net: PGP ID = B0063FE7
> http://www.FreeBSD.org  netchild @ FreeBSD.org : PGP ID = 72077137
Re: Intermittent re0 phy failure
On Mon, Jan 30, 2012 at 05:32:06PM -0500, Michael Butler wrote:
> On 01/18/12 20:52, YongHyeon PYUN wrote:
>> On Wed, Jan 18, 2012 at 08:01:42PM -0500, Michael Butler wrote:
>>> On 01/18/12 19:54, YongHyeon PYUN wrote:
>>>> On Wed, Jan 18, 2012 at 05:48:47PM -0500, Michael Butler wrote:
>>>>> At random intervals, when re0 is without any significant load; idle for lengthy periods, I see ..
>>>>>
>>>>> kernel: re0: PHY read failed
>>>>> last message repeated 4 times
>>>>> kernel: re0: link state changed to DOWN
>>>>>
>>>>> Unplugging the cable and re-inserting is sufficient to restore functionality.
>
> [ .. snip .. ]
>
>> Thanks a lot. Would you try attached patch?
>
> Since applying this (for 8168D) and the patch at SVN r230336 (which affected another system of mine), neither system has gone deaf,

Thanks for testing. Committed to HEAD(r231622).

> Thanks!
>
> imb
Re: Intermittent re0 phy failure
On Wed, Jan 18, 2012 at 05:48:47PM -0500, Michael Butler wrote:
> At random intervals, when re0 is without any significant load; idle for lengthy periods, I see ..
>
> kernel: re0: PHY read failed
> last message repeated 4 times
> kernel: re0: link state changed to DOWN
>
> Unplugging the cable and re-inserting is sufficient to restore functionality.
>
> kernel is @ SVN r230276
>
> Any ideas how to track this down?

Knowing which kind of controller you have would be more helpful. Show me both the re(4)/rgephy(4) related messages from dmesg and the 'devinfo -rv | grep rgephy' output.
Re: Intermittent re0 phy failure
On Wed, Jan 18, 2012 at 08:01:42PM -0500, Michael Butler wrote:
> On 01/18/12 19:54, YongHyeon PYUN wrote:
>> On Wed, Jan 18, 2012 at 05:48:47PM -0500, Michael Butler wrote:
>>> At random intervals, when re0 is without any significant load; idle for lengthy periods, I see ..
>>>
>>> kernel: re0: PHY read failed
>>> last message repeated 4 times
>>> kernel: re0: link state changed to DOWN
>>>
>>> Unplugging the cable and re-inserting is sufficient to restore functionality.
>>>
>>> kernel is @ SVN r230276
>>>
>>> Any ideas how to track this down?
>>
>> Knowing which kind of controller you have would be more helpful. Show me both re(4)/rgephy(4) related message from dmesg and 'devinfo -rv | grep rgephy' output.
>
> As requested:
>
> dmesg:
>
> re0: RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet port 0x2000-0x20ff mem 0xf070-0xf0700fff,0xf020-0xf0203fff irq 17 at device 0.0 on pci3
> re0: MSI count : 1
> re0: MSI-X count : 4
> re0: attempting to allocate 1 MSI-X vectors (4 supported)
> msi: routing MSI-X IRQ 257 to local APIC 0 vector 51
> re0: using IRQ 257 for MSI-X
> re0: Using 1 MSI-X message
> re0: ASPM disabled
> re0: Chip rev. 0x2800
> re0: MAC rev. 0x
> miibus0: MII bus on re0
> rgephy0: RTL8169S/8110S/8211 1000BASE-T media interface PHY 1 on miibus0
> rgephy0: OUI 0x00e04c, model 0x0011, rev. 2
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
>
> devinfo -rv | grep rgephy
> rgephy0 pnpinfo oui=0xe04c model=0x11 rev=0x2 at phyno=1

Thanks a lot. Would you try attached patch?
Index: sys/dev/re/if_re.c
===
--- sys/dev/re/if_re.c	(revision 230315)
+++ sys/dev/re/if_re.c	(working copy)
@@ -1433,11 +1433,16 @@
 			sc->rl_flags |= RL_FLAG_MACSLEEP;
 			/* FALLTHROUGH */
 		case RL_HWREV_8168CP:
-		case RL_HWREV_8168D:
 			sc->rl_flags |= RL_FLAG_PHYWAKE | RL_FLAG_PAR |
 			    RL_FLAG_DESCV2 | RL_FLAG_MACSTAT | RL_FLAG_CMDSTOP |
 			    RL_FLAG_AUTOPAD | RL_FLAG_JUMBOV2 | RL_FLAG_WOL_MANLINK;
 			break;
+		case RL_HWREV_8168D:
+			sc->rl_flags |= RL_FLAG_PHYWAKE | RL_FLAG_PHYWAKE_PM |
+			    RL_FLAG_PAR | RL_FLAG_DESCV2 | RL_FLAG_MACSTAT |
+			    RL_FLAG_CMDSTOP | RL_FLAG_AUTOPAD | RL_FLAG_JUMBOV2 |
+			    RL_FLAG_WOL_MANLINK;
+			break;
 		case RL_HWREV_8168DP:
 			sc->rl_flags |= RL_FLAG_PHYWAKE | RL_FLAG_PAR |
 			    RL_FLAG_DESCV2 | RL_FLAG_MACSTAT | RL_FLAG_AUTOPAD |
Re: re(4) driver dropping packets when reading NFS files
On Mon, Dec 26, 2011 at 10:55:50PM -0500, Rick Macklem wrote:
> Way back in Nov 2010, this thread was related to a problem I had, where an re(4) { 810xE PCIe 10/100baseTX, according to the driver } interface dropped received packets, resulting in a significant impact of NFS performance.
>
> Well, it turns out that a recent (post r224506) commit seems to have fixed the problem. It hasn't dropped any packets since I upgraded to a kernel with a r228281 version of if_re.c.
>
> So, good news. Thanks to those maintaining this driver, rick
> ps: If you have a need to know which commit fixed this, I can probably test variants to find out. Otherwise, I'm just happy that it's fixed.:-)

Glad to know the issue was fixed. Probably the change made in r227593 or r227854 fixed it.
Re: lock order reversals with netmap
On Thu, Dec 01, 2011 at 05:54:44PM +0100, Luigi Rizzo wrote:
> On Thu, Dec 01, 2011 at 04:44:24PM +0100, Rene Ladan wrote:
>> Hi,
>>
>> on FreeBSD 10.0-CURRENT #7 r228176M: Thu Dec 1 13:56:02 CET 2011 (GENERIC + CAPABILITIES + netmap with head.diff and bge patches applied) I get these lock order reversals when running a netmap-enabled program (details in the attachment) with syscall (54, FreeBSD ELF64, sys_ioctl):
>
> Rene, thanks for the report.
>
> As i mentioned earlier to Rene, the 'bge' driver is neither complete nor tested so i am even surprised that it does not crash right away. I'll keep his report in mind when we will complete the support for bge.
>
> BTW is someone is familiar with the architecture of the 'bge' NICs please can she/he contact me. I am unclear on why there are two lists of rx buffers (std and jumbo) and one ring -- perhaps the NIC first receives the frame in its fifo and then decides which type of buffer to use to store it ?

Actually there are three rings, but the additional mini ring is only available on BCM5700. The controller determines which ring (mini, standard or jumbo) is used to receive a frame based on the frame size. For example, if jumbo frames are enabled and the controller receives a pure TCP ACK, the controller will use the standard RX ring (or the mini RX ring on BCM5700), which in turn can save system resources. The controller maintains a pool of TX/RX buffers in the NIC's internal memory space (there are 2 MIPS processors in the NIC), and all these decisions are made by the firmware of the NIC with the help of the driver.

Broadcom provides publicly available data sheets for open source developers. See the following URL.
http://www.broadcom.com/support/ethernet_nic/open_source.php

Having two RX buffers is common for controllers that support header splitting. igb(4) and ti(4) have the feature, but I think it was disabled in igb(4) due to bugs or an incomplete implementation in the driver.
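The size-based ring choice described in the reply can be modelled roughly as follows. This is only a sketch of the idea, not bge(4) or firmware code: the struct, the function name and the byte thresholds are hypothetical, and the real decision is made by the controller firmware.

```c
#include <stddef.h>

enum rx_ring { RX_RING_MINI, RX_RING_STD, RX_RING_JUMBO };

/* Hypothetical description of what a controller offers. */
struct rx_ring_cfg {
	int	has_mini;	/* mini ring exists (BCM5700 only, per the thread) */
	size_t	mini_max;	/* largest frame for the mini ring (assumed value) */
	size_t	std_max;	/* largest frame for the standard ring (assumed value) */
};

/* Pick the smallest ring whose buffers can hold the frame. */
static enum rx_ring pick_rx_ring(const struct rx_ring_cfg *cfg, size_t frame_len)
{
	if (cfg->has_mini && frame_len <= cfg->mini_max)
		return (RX_RING_MINI);
	if (frame_len <= cfg->std_max)
		return (RX_RING_STD);
	return (RX_RING_JUMBO);
}
```

With assumed limits of 128 and 1518 bytes, a 64-byte pure TCP ACK lands in the mini ring when one exists and in the standard ring otherwise, so a large jumbo buffer is not spent on it.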
> cheers
> luigi
>
>> Dec 1 16:23:09 acer kernel: exclusive sleep mutex netmap memory allocator lock (netmap memory allocator lock) r = 0 (0xfe00027d1880) locked @ /usr/src/sys/dev/netmap/netmap.c:1484
>> Dec 1 16:23:09 acer kernel: exclusive sleep mutex bge0 (network driver) r = 0 (0xff8000768010) locked @ /usr/src/sys/dev/netmap/if_bge_netmap.h:60
>>
>> The application does not invoke the offending function (netmap_malloc()) itself.
>>
>> Regards,
>> René
>> --
>> http://www.rene-ladan.nl:8080/
>> GPG fingerprint = ADBC ECCD EB5F A6B4 549F 600D 8C9E 647A E564 2BFC
>> (subkeys.pgp.net)
>
>> Dec 1 15:41:20 acer kernel: FreeBSD 10.0-CURRENT #7 r228176M: Thu Dec 1 13:56:02 CET 2011
>> Dec 1 15:41:20 acer kernel: real memory = 4294967296 (4096 MB)
>> Dec 1 15:41:20 acer kernel: avail memory = 4080091136 (3891 MB)
>> Dec 1 15:41:20 acer kernel: 001.05 netmap_memory_init [1627] netmap_buffer_base 0xff8117eaa000 (offset 679936)
>> Dec 1 15:41:20 acer kernel: 001.06 netmap_memory_init [1636] Have 129 MB, use 661KB for rings, 65862 buffers at 0xff8117eaa000
>> Dec 1 15:41:20 acer kernel: netmap: loaded module with 129 Mbytes
>> Dec 1 15:41:20 acer kernel: bge0: Broadcom NetLink Gigabit Ethernet Controller, ASIC rev. 0x5784100 mem 0xf510-0xf510 irq 16 at device 0.0 on pci2
>> Dec 1 15:41:20 acer kernel: bge0: CHIP ID 0x05784100; ASIC REV 0x5784; CHIP REV 0x57841; PCI-E
>> Dec 1 15:41:20 acer kernel: miibus0: MII bus on bge0
>> Dec 1 15:41:20 acer kernel: brgphy0: BCM5784 10/100/1000baseT PHY PHY 1 on miibus0
>> Dec 1 15:41:20 acer kernel: brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
>> Dec 1 15:41:20 acer kernel: bge0: Ethernet address: 00:26:2d:5e:d8:ee
>> Dec 1 15:41:20 acer kernel: 001.09 netmap_attach [1243] ok for bge0
>> Dec 1 16:23:09 acer kernel: 989.882634 netmap_set_ringid [779] ringid bge0 set to SW RING
>> Dec 1 16:23:09 acer kernel: uma_zalloc_arg: zone 64 with the following non-sleepable locks held:
>> Dec 1 16:23:09 acer kernel: exclusive sleep mutex netmap memory allocator lock (netmap memory allocator lock) r = 0 (0xfe00027d1880) locked @ /usr/src/sys/dev/netmap/netmap.c:1484
>> Dec 1 16:23:09 acer kernel: exclusive sleep mutex bge0 (network driver) r = 0 (0xff8000768010) locked @ /usr/src/sys/dev/netmap/if_bge_netmap.h:60
>> Dec 1 16:23:09 acer kernel: KDB: stack backtrace:
>> Dec 1 16:23:09 acer kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
>> Dec 1 16:23:10 acer kernel: kdb_backtrace() at kdb_backtrace+0x37
>> Dec 1 16:23:10 acer kernel: _witness_debugger() at _witness_debugger+0x2c
>> Dec 1 16:23:10 acer kernel: witness_warn() at witness_warn+0x2c2
>> Dec 1 16:23:10 acer kernel: uma_zalloc_arg() at uma_zalloc_arg+0x335
>> Dec 1 16:23:10 acer kernel: malloc() at malloc+0xbe
>> Dec 1 16:23:10 acer kernel: netmap_malloc() at netmap_malloc+0x86
>> Dec 1 16:23:10 acer kernel: netmap_ioctl() at
Re: if_bce tx / rx tick limits
On Wed, Nov 30, 2011 at 03:22:37PM +0100, Florian Wilkemeyer wrote:
> Hi,
>
> i wonder about the bce driver's tx / rx tick limits (ticks and ticks_int are limited to 100; otherwise default value (80) gets used) (if_bce.c line 921 / 933 .. )

I think this highly depends on the firmware of the controller. David may be able to answer (CCed).

> On DragonFly BSD the values can be set much higher (such as 1000 ..) which would be great in a high-traffic setup. (On linux there's no limit too as far as i remember)

No, the value should be represented with 10 bits, so having no limit looks like a bug in Linux.

> Is there any reason why its limited down to 100?
>
> Thanks
> Florian
>
> P.S. I'm sorry if this was the wrong mailing list for it; i don't know whats the right list for this (probably net or driver ?)
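If the tick value really has to fit in a 10-bit field, as the reply suggests, the largest representable value is 2^10 - 1 = 1023, so a setting of 1000 would still fit while anything above 1023 could not be stored. A small sketch of such a check, assuming a plain unsigned 10-bit field; the macro and function names are hypothetical and do not come from if_bce.c:

```c
#include <stdint.h>

/* Assumed width of the coalescing tick field, per the reply above. */
#define TICKS_FIELD_BITS	10u
#define TICKS_FIELD_MAX		((1u << TICKS_FIELD_BITS) - 1u)	/* 1023 */

/* Non-zero when the requested value can be stored without truncation. */
static int ticks_fits_field(uint32_t ticks)
{
	return (ticks <= TICKS_FIELD_MAX);
}

/* Clamp a requested value to what the field can hold. */
static uint32_t ticks_clamp(uint32_t ticks)
{
	return (ticks > TICKS_FIELD_MAX ? TICKS_FIELD_MAX : ticks);
}
```

This only shows the arithmetic behind the 10-bit remark. Why the driver stops at 100 is a separate question, which the reply leaves to the firmware side.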
Re: CFT: msk(4) 64bit DMA support
On Thu, May 26, 2011 at 06:40:43PM -0700, YongHyeon PYUN wrote:
> Hi,
>
> Here is a patch that implements 64bit DMA on msk(4). If you use msk(4) on a system that has more than 4GB memory, please try the patch at the following URL and let me know whether it works or not. You need latest msk(4) in HEAD to apply the patch.
>
> http://people.freebsd.org/~yongari/msk/msk.64bit.dma.diff
>
> Previously msk(4) may have used bounce buffers on systems that have more than 4GB memory. You can verify whether msk(4) is using bounce buffers by checking the output of sysctl hw.busdma. For instance, hw.busdma.zone0.total_bounced counter would increase while network operation is in progress. If patch above works you wouldn't see the counter change anymore and it would also enhance network performance since it wouldn't have to copy from or to bounce buffers.

Committed to HEAD(r227582).

> Thanks.
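The reason a 32-bit DMA limit causes bouncing on machines with more than 4GB can be shown with a simplified model. This is not the busdma implementation (which works per page and also honours alignment and boundary constraints); needs_bounce() and its parameters are hypothetical names used for illustration only.

```c
#include <stdint.h>

/*
 * Simplified model: a buffer must be bounced when any byte of it lies
 * above the highest physical address the device can reach (lowaddr).
 */
static int needs_bounce(uint64_t paddr, uint64_t len, uint64_t lowaddr)
{
	if (len == 0)
		return (0);
	return (paddr + len - 1 > lowaddr);
}
```

With lowaddr set to 0xFFFFFFFF (32-bit DMA), a buffer at physical address 0x100000000 has to be copied through a bounce buffer. With a 64-bit capable limit the same buffer is used directly, which is why total_bounced should stop increasing once 64bit DMA is in use.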
Re: Call for testers : ALi/ULi M5261/M5263 ethernet controller
On Sun, Oct 16, 2011 at 05:22:13PM -0700, YongHyeon PYUN wrote:
> Hi,
>
> If you have ALi/ULi M5261/M5263 ethernet controller please try the patch at the following URL and let me know how it works.
> http://people.freebsd.org/~yongari/dc/dc.uli562x.diff
>
> The patch was generated against latest HEAD and it should be cleanly applied to latest stable/8 and stable/7.

I committed a revised version to HEAD(r226699, r226701).

> Thanks.
Call for testers : ALi/ULi M5261/M5263 ethernet controller
Hi,

If you have an ALi/ULi M5261/M5263 ethernet controller, please try the patch at the following URL and let me know how it works.
http://people.freebsd.org/~yongari/dc/dc.uli562x.diff

The patch was generated against latest HEAD and it should apply cleanly to latest stable/8 and stable/7.

Thanks.
Re: Queue drop not accounted ?
On Thu, Sep 15, 2011 at 09:25:19PM -0400, Arnaud Lacombe wrote:
> Hi,
>
> Shouldn't packet freed in IFQ_ENQUEUE() because the queue is full be accounted as dropped, cf attached patch ?

Hmm, I think err would be set to ENOBUFS for the queue full case, and this err will increase ifq_drops.

> Thanks,
> - Arnaud
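The accounting described in the reply, where a full queue yields ENOBUFS and that error is what bumps the drop counter, can be modelled in userland like this. It is a sketch of the idea only, not the IFQ_ENQUEUE() macro; struct pkt_queue and pkt_enqueue() are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

/* Minimal stand-in for an interface send queue. */
struct pkt_queue {
	size_t		len;	/* packets currently queued */
	size_t		maxlen;	/* queue limit */
	unsigned long	drops;	/* packets dropped because the queue was full */
};

/*
 * Try to queue one packet.  On a full queue the packet is not queued,
 * the error is ENOBUFS, and that error path counts the drop.
 */
static int pkt_enqueue(struct pkt_queue *q)
{
	int err = 0;

	if (q->len >= q->maxlen)
		err = ENOBUFS;
	else
		q->len++;
	if (err != 0)
		q->drops++;
	return (err);
}
```

If the real macro behaves like this model, a packet freed because the queue is full is already counted, and a second increment would count it twice.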
Re: Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces
On Sun, Aug 21, 2011 at 04:48:56PM -0700, YongHyeon PYUN wrote: On Fri, Aug 19, 2011 at 12:17:12AM -0700, Garrett Cooper wrote: On Thu, Aug 18, 2011 at 9:31 PM, m...@freebsd.org wrote: On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper yaneg...@gmail.com wrote: ? ?When loading if_alc as a module on my netbook and running /etc/rc.d/netif restart, I can deterministically panic my netbook with the following message: These repro steps were overly simplified. The complete steps are: 1. Attach ethernet cable to alc(4) enabled NIC. 2. Boot up machine. 3. Login. 4. Physically remove ethernet cable from alc(4) enabled NIC. 5. Run `/etc/rc.d/netif restart' as root. I can't reproduce this with AR8151 sample board. Could you give me dmesg output to know exact controller revision? One issue I'm aware of is lack of re-establishing link when controller firmware put its PHY to deep sleep mode. The deep sleep mode seems to be automatically activated by firmware when it detects no energy signal(i.e. cable unplugged) so I had to down and up the interface again to take the PHY out of the sleep mode. Not re-establishing link issue was fixed in r225088. I'm not sure whether this also fixes kern/148772 though. Because you also seem to have the same issue of the PR, it would be good to know whether it makes any difference or not. ) at _bus_dmamap_sync+0x51 alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98 soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401 kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7 ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118 syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) 
at syscallenter+0x23f syscall(e6ca3d28) at syscall+0x2e Xint0x80_syscall() at Xint0x80_syscall+0x21 --- syscall (54kernel trap 12 with interrupts disabled Kernel page fault with the following non-sleepable locks held: exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked @ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362 KDB: stack backtrace: db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) at db_trace_self_wrapper+0x26 kdb_backtrace(93a,0,,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) at witness_warn+0x1f1 trap(e6ca32dc) at trap+0x15a calltrap() at calltrap+0x6 ? ?I tried to track down what the exact issue was, but I got lost (the locking sort of looks ok to me, but I'm still not an expert with mutex(9)). ? ?I still have the vmcore and can provide more helpful details when requested. The locking itself is almost certainly fine. ?The error message is not very helpful, but what went wrong was the page fault. ?You just happen to panic on a witness warning before vm_fault can panic due to a bad address. The alc(4) maintainer would probably like info on the trap (line of code and where the bad pointer came from). I talked to Xin a bit and as he noted the panic was just a symptom of the actual issue at hand. I think the problem is that the rx ring's rx_m value isn't set to NULL when an error occurred, but getting to the exact problem at hand, the following call is failing: if (bus_dmamap_load_mbuf_sg(sc-alc_cdata.alc_rx_tag, // -- HERE sc-alc_cdata.alc_rx_sparemap, m, segs, nsegs, 0) != 0) { m_freem(m); return (ENOBUFS); } It's failing with ENOMEM. Still trying to determine what the exact Even if bus_dmamap_load_mbuf_sg(9) fails driver should not panic. Could you show me full back-trace? reason for ENOMEM is from the x86 busdma code though.. 
> Thanks,
> -Garrett
Re: Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces
On Mon, Aug 22, 2011 at 02:26:47PM -0700, Garrett Cooper wrote:
> On Mon, Aug 22, 2011 at 1:40 PM, YongHyeon PYUN pyu...@gmail.com wrote:
>> On Sun, Aug 21, 2011 at 04:48:56PM -0700, YongHyeon PYUN wrote:
>>> On Fri, Aug 19, 2011 at 12:17:12AM -0700, Garrett Cooper wrote:
>>>> On Thu, Aug 18, 2011 at 9:31 PM, m...@freebsd.org wrote:
>>>>> On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper yaneg...@gmail.com wrote:
>>>>>> When loading if_alc as a module on my netbook and running /etc/rc.d/netif restart, I can deterministically panic my netbook with the following message:
>>>>
>>>> These repro steps were overly simplified. The complete steps are:
>>>>
>>>> 1. Attach ethernet cable to alc(4) enabled NIC.
>>>> 2. Boot up machine.
>>>> 3. Login.
>>>> 4. Physically remove ethernet cable from alc(4) enabled NIC.
>>>> 5. Run `/etc/rc.d/netif restart' as root.
>>>
>>> I can't reproduce this with AR8151 sample board. Could you give me dmesg output to know exact controller revision? One issue I'm aware of is lack of re-establishing link when controller firmware put its PHY to deep sleep mode. The deep sleep mode seems to be automatically activated by firmware when it detects no energy signal(i.e. cable unplugged) so I had to down and up the interface again to take the PHY out of the sleep mode.
>>
>> Not re-establishing link issue was fixed in r225088. I'm not sure whether this also fixes kern/148772 though. Because you also seem to have the same issue of the PR, it would be good to know whether it makes any difference or not.
>
> The panic no longer occurs with that commit when running /etc/rc.d/netif restart after inserting and reinserting the ethernet cable (I've done it several times for good measure); the failing case was potentially being triggered somehow by the hibernation code path.

Hmm, I have no idea how this can be related to the panic. :-(
BTW, does the commit also fix kern/148772?

> Thanks,
> -Garrett
Re: Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces
On Fri, Aug 19, 2011 at 12:17:12AM -0700, Garrett Cooper wrote: On Thu, Aug 18, 2011 at 9:31 PM, m...@freebsd.org wrote: On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper yaneg...@gmail.com wrote: ? ?When loading if_alc as a module on my netbook and running /etc/rc.d/netif restart, I can deterministically panic my netbook with the following message: These repro steps were overly simplified. The complete steps are: 1. Attach ethernet cable to alc(4) enabled NIC. 2. Boot up machine. 3. Login. 4. Physically remove ethernet cable from alc(4) enabled NIC. 5. Run `/etc/rc.d/netif restart' as root. I can't reproduce this with AR8151 sample board. Could you give me dmesg output to know exact controller revision? One issue I'm aware of is lack of re-establishing link when controller firmware put its PHY to deep sleep mode. The deep sleep mode seems to be automatically activated by firmware when it detects no energy signal(i.e. cable unplugged) so I had to down and up the interface again to take the PHY out of the sleep mode. ) at _bus_dmamap_sync+0x51 alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98 soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401 kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7 ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118 syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) at syscallenter+0x23f syscall(e6ca3d28) at syscall+0x2e Xint0x80_syscall() at Xint0x80_syscall+0x21 --- syscall (54kernel trap 12 with interrupts disabled Kernel page fault with the following non-sleepable locks held: exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked @ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362 KDB: stack backtrace: db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) 
at db_trace_self_wrapper+0x26 kdb_backtrace(93a,0,,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) at witness_warn+0x1f1 trap(e6ca32dc) at trap+0x15a calltrap() at calltrap+0x6 ? ?I tried to track down what the exact issue was, but I got lost (the locking sort of looks ok to me, but I'm still not an expert with mutex(9)). ? ?I still have the vmcore and can provide more helpful details when requested. The locking itself is almost certainly fine. ?The error message is not very helpful, but what went wrong was the page fault. ?You just happen to panic on a witness warning before vm_fault can panic due to a bad address. The alc(4) maintainer would probably like info on the trap (line of code and where the bad pointer came from). I talked to Xin a bit and as he noted the panic was just a symptom of the actual issue at hand. I think the problem is that the rx ring's rx_m value isn't set to NULL when an error occurred, but getting to the exact problem at hand, the following call is failing: if (bus_dmamap_load_mbuf_sg(sc-alc_cdata.alc_rx_tag, // -- HERE sc-alc_cdata.alc_rx_sparemap, m, segs, nsegs, 0) != 0) { m_freem(m); return (ENOBUFS); } It's failing with ENOMEM. Still trying to determine what the exact Even if bus_dmamap_load_mbuf_sg(9) fails driver should not panic. Could you show me full back-trace? reason for ENOMEM is from the x86 busdma code though.. Thanks, -Garrett ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces
On Sun, Aug 21, 2011 at 06:26:45PM -0700, Garrett Cooper wrote: On Sun, Aug 21, 2011 at 4:48 PM, YongHyeon PYUN pyu...@gmail.com wrote: On Fri, Aug 19, 2011 at 12:17:12AM -0700, Garrett Cooper wrote: On Thu, Aug 18, 2011 at 9:31 PM, ?m...@freebsd.org wrote: On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper yaneg...@gmail.com wrote: ? ?When loading if_alc as a module on my netbook and running /etc/rc.d/netif restart, I can deterministically panic my netbook with the following message: ? ? These repro steps were overly simplified. The complete steps are: 1. Attach ethernet cable to alc(4) enabled NIC. 2. Boot up machine. 3. Login. 4. Physically remove ethernet cable from alc(4) enabled NIC. 5. Run `/etc/rc.d/netif restart' as root. I can't reproduce this with AR8151 sample board. Could you give me dmesg output to know exact controller revision? One issue I'm aware of is lack of re-establishing link when controller firmware put its PHY to deep sleep mode. ?The deep sleep mode seems to be automatically activated by firmware when it detects no energy signal(i.e. cable unplugged) so I had to down and up the interface again to take the PHY out of the sleep mode. ) at _bus_dmamap_sync+0x51 alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98 soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401 kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7 ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118 syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) 
at syscallenter+0x23f syscall(e6ca3d28) at syscall+0x2e Xint0x80_syscall() at Xint0x80_syscall+0x21 --- syscall (54kernel trap 12 with interrupts disabled Kernel page fault with the following non-sleepable locks held: exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked @ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362 KDB: stack backtrace: db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) at db_trace_self_wrapper+0x26 kdb_backtrace(93a,0,,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) at witness_warn+0x1f1 trap(e6ca32dc) at trap+0x15a calltrap() at calltrap+0x6 ? ?I tried to track down what the exact issue was, but I got lost (the locking sort of looks ok to me, but I'm still not an expert with mutex(9)). ? ?I still have the vmcore and can provide more helpful details when requested. The locking itself is almost certainly fine. ?The error message is not very helpful, but what went wrong was the page fault. ?You just happen to panic on a witness warning before vm_fault can panic due to a bad address. The alc(4) maintainer would probably like info on the trap (line of code and where the bad pointer came from). ? ? I talked to Xin a bit and as he noted the panic was just a symptom of the actual issue at hand. I think the problem is that the rx ring's rx_m value isn't set to NULL when an error occurred, but getting to the exact problem at hand, the following call is failing: ? ? ? ? if (bus_dmamap_load_mbuf_sg(sc-alc_cdata.alc_rx_tag, // -- HERE ? ? ? ? ? ? sc-alc_cdata.alc_rx_sparemap, m, segs, nsegs, 0) != 0) { ? ? ? ? ? ? ? ? m_freem(m); ? ? ? ? ? ? ? ? return (ENOBUFS); ? ? ? ? } ? ? It's failing with ENOMEM. Still trying to determine what the exact Even if bus_dmamap_load_mbuf_sg(9) fails driver should not panic. Could you show me full back-trace? 
> I tried to hack the kernel to get it to dump properly, but that inevitably failed (one of the buffers or the stack data associated probably got stomped on when the system panicked). Here are some pics.

Thanks a lot. I see that alc(4) failed to allocate RX buffers and it seems the panic happened in alc_stop(). But I can't understand how it could be triggered. When RX buffer allocation failed, the mbuf pointer would have been NULL, such that bus_dmamap_sync(9) wouldn't be invoked in alc_stop(). I also see you have a wireless network setup in the back trace. Could you also reproduce the alc(4) panic without the wireless network configuration?

> Thanks,
> -Garrett
Re: Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces
On Fri, Aug 19, 2011 at 12:17:12AM -0700, Garrett Cooper wrote: On Thu, Aug 18, 2011 at 9:31 PM, m...@freebsd.org wrote: On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper yaneg...@gmail.com wrote: ? ?When loading if_alc as a module on my netbook and running /etc/rc.d/netif restart, I can deterministically panic my netbook with the following message: These repro steps were overly simplified. The complete steps are: 1. Attach ethernet cable to alc(4) enabled NIC. 2. Boot up machine. 3. Login. 4. Physically remove ethernet cable from alc(4) enabled NIC. 5. Run `/etc/rc.d/netif restart' as root. ) at _bus_dmamap_sync+0x51 alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98 soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401 kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7 ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118 syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) at syscallenter+0x23f syscall(e6ca3d28) at syscall+0x2e Xint0x80_syscall() at Xint0x80_syscall+0x21 --- syscall (54kernel trap 12 with interrupts disabled Kernel page fault with the following non-sleepable locks held: exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked @ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362 KDB: stack backtrace: db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) at db_trace_self_wrapper+0x26 kdb_backtrace(93a,0,,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) at witness_warn+0x1f1 trap(e6ca32dc) at trap+0x15a calltrap() at calltrap+0x6 ? ?I tried to track down what the exact issue was, but I got lost (the locking sort of looks ok to me, but I'm still not an expert with mutex(9)). ? 
>>> I still have the vmcore and can provide more helpful details when requested.
>>
>> The locking itself is almost certainly fine. The error message is not very helpful, but what went wrong was the page fault. You just happen to panic on a witness warning before vm_fault can panic due to a bad address.
>>
>> The alc(4) maintainer would probably like info on the trap (line of code and where the bad pointer came from).
>
> I talked to Xin a bit and as he noted the panic was just a symptom of the actual issue at hand. I think the problem is that the rx ring's rx_m value isn't set to NULL when an error occurred, but getting to the exact problem at hand, the following call is failing:

Could you elaborate on this issue? alc(4) was designed to cope with this kind of error, and not resetting rx_m to NULL is intentional behavior such that alc(4) will reuse the previously loaded DMA map and buffer.

>         if (bus_dmamap_load_mbuf_sg(sc->alc_cdata.alc_rx_tag, // -- HERE
>             sc->alc_cdata.alc_rx_sparemap, m, segs, nsegs, 0) != 0) {
>                 m_freem(m);
>                 return (ENOBUFS);
>         }
>
> It's failing with ENOMEM. Still trying to determine what the exact reason for ENOMEM is from the x86 busdma code though..

bus_dmamap_load_mbuf_sg(9) can return ENOMEM if there is no available resource, and the driver should be prepared to recover from this kind of error. However, I wonder why this happens on an almost idle system.

> Thanks,
> -Garrett
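The "reuse the previously loaded DMA map and buffer" behavior described above can be pictured with a small userland model. It is a sketch under stated assumptions, not the alc(4) code: struct rx_slot and rx_slot_refill() are hypothetical, and load_err stands in for the return value of the DMA load of the new buffer.

```c
#include <errno.h>
#include <stddef.h>

/* One RX descriptor slot and the buffer currently attached to it. */
struct rx_slot {
	void	*buf;
};

/*
 * Try to replace the slot's buffer with newbuf.  If allocation failed
 * (newbuf == NULL) or the DMA load failed (load_err != 0), the slot is
 * left untouched so the old, already loaded buffer is reused.
 */
static int rx_slot_refill(struct rx_slot *slot, void *newbuf, int load_err)
{
	if (newbuf == NULL || load_err != 0)
		return (ENOBUFS);
	slot->buf = newbuf;
	return (0);
}
```

In this model a failed load never leaves the slot without a buffer, which matches the point that the pointer is deliberately not reset to NULL on error.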
Re: Deterministic panic due to non-sleepable lock with if_alc when reconfiguring interfaces
On Fri, Aug 19, 2011 at 08:10:31AM -0400, John Baldwin wrote: On Friday, August 19, 2011 3:17:12 am Garrett Cooper wrote: On Thu, Aug 18, 2011 at 9:31 PM, m...@freebsd.org wrote: On Thu, Aug 18, 2011 at 5:50 PM, Garrett Cooper yaneg...@gmail.com wrote: When loading if_alc as a module on my netbook and running /etc/rc.d/netif restart, I can deterministically panic my netbook with the following message: These repro steps were overly simplified. The complete steps are: 1. Attach ethernet cable to alc(4) enabled NIC. 2. Boot up machine. 3. Login. 4. Physically remove ethernet cable from alc(4) enabled NIC. 5. Run `/etc/rc.d/netif restart' as root. ) at _bus_dmamap_sync+0x51 alc_stop(c3dbb000,0,c0c51844,93a,80206910,...) at alc_stop+0x24e alc_ioctl(c3d07400,80206910,c40423c0,c06a7935,c0914e3c,...) at alc_ioctl+0x22e ifioctl(c45029c0,80206910,c40423c0,c40505c0,c4528c00,...) at ifioctl+0xc98 soo_ioctl(c4574e00,80206910,c40423c0,c413e680,c40505c0,...) at soo_ioctl+0x401 kern_ioctl(c40505c0,3,80206910,c40423c0,c40423c0,...) at kern_ioctl+0x1d7 ioctl(c40505c0,e6ca3cec,e6ca3d28,c08e929d,0,...) at ioctl+0x118 syscallenter(c40505c0,e6ca3ce4,e6ca3ce4,0,0,...) at syscallenter+0x23f syscall(e6ca3d28) at syscall+0x2e Xint0x80_syscall() at Xint0x80_syscall+0x21 --- syscall (54kernel trap 12 with interrupts disabled Kernel page fault with the following non-sleepable locks held: exclusive sleep mutex alc0 (network driver) r = 0 (0xc3dbc608) locked @ /usr/src/sys/modules/alc/../../dev/alc/if_alc.c:2362 KDB: stack backtrace: db_trace_self_wrapper(c08e727a,80,6e726500,74206c65,20706172,...) at db_trace_self_wrapper+0x26 kdb_backtrace(93a,0,,c0ad6114,e6ca323c,...) at kdb_backtrace+0x2a _witness_debugger(c08e9f67,e6ca3250,4,1,0,...) at _witness_debugger+0x1e witness_warn(5,0,c0924fe1,c097df50,c3e42b00,...) 
at witness_warn+0x1f1
trap(e6ca32dc) at trap+0x15a
calltrap() at calltrap+0x6

I tried to track down what the exact issue was, but I got lost (the locking sort of looks ok to me, but I'm still not an expert with mutex(9)). I still have the vmcore and can provide more helpful details when requested.

The locking itself is almost certainly fine. The error message is not very helpful, but what went wrong was the page fault. You just happen to panic on a witness warning before vm_fault can panic due to a bad address. The alc(4) maintainer would probably like info on the trap (line of code and where the bad pointer came from).

I talked to Xin a bit and as he noted the panic was just a symptom of the actual issue at hand. I think the problem is that the rx ring's rx_m value isn't set to NULL when an error occurred, but getting to the exact problem at hand, the following call is failing:

	if (bus_dmamap_load_mbuf_sg(sc->alc_cdata.alc_rx_tag, // <-- HERE
	    sc->alc_cdata.alc_rx_sparemap, m, segs, &nsegs, 0) != 0) {
		m_freem(m);
		return (ENOBUFS);
	}

It's failing with ENOMEM. Still trying to determine what the exact reason for ENOMEM is from the x86 busdma code though..

	ENOMEM	The load request has failed due to insufficient resources, and the caller specifically used the BUS_DMA_NOWAIT flag.

(bus_dmamap_load_mbuf*() imply BUS_DMA_NOWAIT.)

You couldn't allocate enough bounce pages:

	/* Reserve Necessary Bounce Pages */
	if (map->pagesneeded != 0) {
		mtx_lock(&bounce_lock);
		if (flags & BUS_DMA_NOWAIT) {
			if (reserve_bounce_pages(dmat, map, 0) != 0) {
				mtx_unlock(&bounce_lock);
				return (ENOMEM);
			}

Of course, now the question is why you even need bounce pages for alc(4):

	/* Create DMA tag for Rx buffers. */
	error = bus_dma_tag_create(
	    sc->alc_cdata.alc_buffer_tag,	/* parent */
	    ALC_RX_BUF_ALIGN, 0,	/* alignment, boundary */
	    BUS_SPACE_MAXADDR,		/* lowaddr */
	    BUS_SPACE_MAXADDR,		/* highaddr */
	    NULL, NULL,			/* filter, filterarg */
	    MCLBYTES,			/* maxsize */
	    1,				/* nsegments */
	    MCLBYTES,			/* maxsegsize */
	    0,				/* flags */
	    NULL, NULL,			/* lockfunc, lockarg */
	    &sc->alc_cdata.alc_rx_tag);

It can handle 64-bit DMA just fine, and mbuf clusters used for RX should always be aligned and never need bounce pages.

Right. alc(4) hardware has no DMA address limit for TX/RX buffers but its descriptors/status block DMA address should be within a 4GB. alc(4) explicitly
Re: AX88772A AX88772B chipset differences?
On Wed, Jul 20, 2011 at 11:46:54PM +0400, Andrey Smagin wrote:
July 13, 2011, 04:07, from YongHyeon PYUN pyu...@gmail.com:
On Thu, Jun 30, 2011 at 10:19:14AM -0700, YongHyeon PYUN wrote:
On Thu, Jun 30, 2011 at 02:44:48PM +0400, Andrey Smagin wrote:

I have a card based on the AX88772B. I tried patching the axe driver with its vendor and device IDs. The card is detected and the link comes up, but no data is received. What else needs to be patched in this driver? Does anybody have the datasheet?

ASIX requires a login account to get the data sheet, so it's not publicly available to open source developers. AFAIK the difference between AX88772A and AX88772B is the IPv4/IPv6 checksum offloading support of AX88772B. The introduction of checksum offloading means they might have changed its RX header format, which in turn would break the current RX handler. The other difference would be the more advanced power saving used in AX88772B, but it should make little difference to the axe(4) driver once the PHY is correctly woken in the initialization phase. Could you show me your diff and verbose boot output so I can see the PHY model and EEPROM data?

I have a minimal patch for AX88772B. It requires more work to support TX/RX checksum offloading, flow-control and power saving, but the attached patch should be enough for most cases. Let me know whether it works or not.

Great, thanks!!! It works, but I have not tested it under heavy load. Only ping and some Mbytes via NFS.

Thanks for testing. The patch was already committed to HEAD (r224020). I'm implementing TX/RX checksum offloading and flow-control, and those features should be available in the near future.

Thanks.
Re: AX88772A AX88772B chipset differences?
On Thu, Jun 30, 2011 at 10:19:14AM -0700, YongHyeon PYUN wrote:
On Thu, Jun 30, 2011 at 02:44:48PM +0400, Andrey Smagin wrote:

I have a card based on the AX88772B. I tried patching the axe driver with its vendor and device IDs. The card is detected and the link comes up, but no data is received. What else needs to be patched in this driver? Does anybody have the datasheet?

ASIX requires a login account to get the data sheet, so it's not publicly available to open source developers. AFAIK the difference between AX88772A and AX88772B is the IPv4/IPv6 checksum offloading support of AX88772B. The introduction of checksum offloading means they might have changed its RX header format, which in turn would break the current RX handler. The other difference would be the more advanced power saving used in AX88772B, but it should make little difference to the axe(4) driver once the PHY is correctly woken in the initialization phase. Could you show me your diff and verbose boot output so I can see the PHY model and EEPROM data?

I have a minimal patch for AX88772B. It requires more work to support TX/RX checksum offloading, flow-control and power saving, but the attached patch should be enough for most cases. Let me know whether it works or not.

Index: sys/dev/usb/usbdevs
===================================================================
--- sys/dev/usb/usbdevs (revision 223958)
+++ sys/dev/usb/usbdevs (working copy)
@@ -1045,6 +1045,7 @@
 product ASIX AX88178 0x1780 AX88178
 product ASIX AX88772 0x7720 AX88772
 product ASIX AX88772A 0x772a AX88772A USB 2.0 10/100 Ethernet
+product ASIX AX88772B 0x772b AX88772B USB 2.0 10/100 Ethernet
 /* ASUS products */
 product ASUS2 USBN11 0x0b05 USB-N11
Index: sys/dev/usb/net/if_axereg.h
===================================================================
--- sys/dev/usb/net/if_axereg.h (revision 223958)
+++ sys/dev/usb/net/if_axereg.h (working copy)
@@ -92,6 +92,8 @@
 #define AXE_CMD_SW_PHY_STATUS 0x0021
 #define AXE_CMD_SW_PHY_SELECT 0x0122
+#define AXE_772B_CMD_RXCTL_WRITE_CFG 0x012A
+
 /* AX88772A and AX88772B only. */
 #define AXE_CMD_READ_VLAN_CTRL 0x4027
 #define AXE_CMD_WRITE_VLAN_CTRL 0x4028
@@ -132,12 +134,18 @@
 #define AXE_178_RXCMD_KEEP_INVALID_CRC 0x0004
 #define AXE_RXCMD_BROADCAST 0x0008
 #define AXE_RXCMD_MULTICAST 0x0010
+#define AXE_RXCMD_ACCEPT_RUNT 0x0040 /* AX88772B */
 #define AXE_RXCMD_ENABLE 0x0080
 #define AXE_178_RXCMD_MFB_MASK 0x0300
 #define AXE_178_RXCMD_MFB_2048 0x
 #define AXE_178_RXCMD_MFB_4096 0x0100
 #define AXE_178_RXCMD_MFB_8192 0x0200
 #define AXE_178_RXCMD_MFB_16384 0x0300
+#define AXE_772B_RXCMD_HDR_TYPE_0 0x
+#define AXE_772B_RXCMD_HDR_TYPE_1 0x0100
+#define AXE_772B_RXCMD_IPHDR_ALIGN 0x0200
+#define AXE_772B_RXCMD_ADD_CHKSUM 0x0400
+#define AXE_RXCMD_LOOPBACK 0x1000 /* AX88772A/AX88772B */
 #define AXE_PHY_SEL_PRI 1
 #define AXE_PHY_SEL_SEC 0
@@ -176,7 +184,7 @@
 #define AXE_PHY_MODE_REALTEK_8251CL 0x0E
 #define AXE_PHY_MODE_ATTANSIC 0x40
-/* AX88772A only. */
+/* AX88772A/AX88772B only. */
 #define AXE_SW_PHY_SELECT_EXT 0x
 #define AXE_SW_PHY_SELECT_EMBEDDED 0x0001
 #define AXE_SW_PHY_SELECT_AUTO 0x0002
@@ -199,6 +207,24 @@
 #define AXE_CONFIG_IDX 0 /* config number 1 */
 #define AXE_IFACE_IDX 0
+/* EEPROM Map. */
+#define AXE_EEPROM_772B_NODE_ID 0x04
+#define AXE_EEPROM_772B_PHY_PWRCFG 0x18
+
+struct ax88772b_mfb {
+ int byte_cnt;
+ int threshold;
+ int size;
+};
+#define AX88772B_MFB_2K 0
+#define AX88772B_MFB_4K 1
+#define AX88772B_MFB_6K 2
+#define AX88772B_MFB_8K 3
+#define AX88772B_MFB_16K 4
+#define AX88772B_MFB_20K 5
+#define AX88772B_MFB_24K 6
+#define AX88772B_MFB_32K 7
+
 struct axe_sframe_hdr {
 uint16_t len;
 uint16_t ilen;
@@ -228,6 +254,7 @@
 uint8_t sc_ipgs[3];
 uint8_t sc_phyaddrs[2];
+ uint16_t sc_pwrcfg;
 int sc_tx_bufsz;
 };
Index: sys/dev/usb/net/if_axe.c
===================================================================
--- sys/dev/usb/net/if_axe.c (revision 223958)
+++ sys/dev/usb/net/if_axe.c (working copy)
@@ -142,6 +142,7 @@
 AXE_DEV(ASIX, AX88178, AXE_FLAG_178),
 AXE_DEV(ASIX, AX88772, AXE_FLAG_772),
 AXE_DEV(ASIX, AX88772A, AXE_FLAG_772A),
+ AXE_DEV(ASIX, AX88772B, AXE_FLAG_772B),
 AXE_DEV(ATEN, UC210T, 0),
 AXE_DEV(BELKIN, F5D5055, AXE_FLAG_178),
 AXE_DEV(BILLIONTON, USB2AR, 0),
@@ -190,7 +191,9 @@
 static int axe_cmd(struct axe_softc *, int, int, int, void *);
 static void axe_ax88178_init(struct axe_softc *);
 static void axe_ax88772_init(struct axe_softc *);
+static void axe_ax88772_phywake(struct axe_softc *);
 static void axe_ax88772a_init(struct axe_softc *);
+static void axe_ax88772b_init(struct axe_softc *);
 static int axe_get_phyno(struct axe_softc *, int);
 static const struct usb_config axe_config[AXE_N_TRANSFER] = {
@@ -217,6 +220,17 @@
 },
 };
+static const struct ax88772b_mfb ax88772b_mfb_table[] = {
+ { 0x8000, 0x8001, 2048 },
+{ 0x8100, 0x8147, 4096},
+{ 0x8200, 0x81EB, 6144},
+{ 0x8300, 0x83D7, 8192},
+{ 0x8400, 0x851E
Re: AX88772A AX88772B chipset differences?
On Thu, Jun 30, 2011 at 02:44:48PM +0400, Andrey Smagin wrote:

I have a card based on the AX88772B. I tried patching the axe driver with its vendor and device IDs. The card is detected and the link comes up, but no data is received. What else needs to be patched in this driver? Does anybody have the datasheet?

ASIX requires a login account to get the data sheet, so it's not publicly available to open source developers. AFAIK the difference between AX88772A and AX88772B is the IPv4/IPv6 checksum offloading support of AX88772B. The introduction of checksum offloading means they might have changed its RX header format, which in turn would break the current RX handler. The other difference would be the more advanced power saving used in AX88772B, but it should make little difference to the axe(4) driver once the PHY is correctly woken in the initialization phase. Could you show me your diff and verbose boot output so I can see the PHY model and EEPROM data?
Re: CFT: msk(4) 64bit DMA support
On Sun, Jun 05, 2011 at 02:23:57PM -0400, David Schultz wrote:
On Thu, May 26, 2011, YongHyeon PYUN wrote:

Here is a patch that implements 64bit DMA on msk(4). If you use msk(4) on a system that has more than 4GB memory, please try the patch at the following URL and let me know whether it works or not. You need latest msk(4) in HEAD to apply the patch.

http://people.freebsd.org/~yongari/msk/msk.64bit.dma.diff

Previously msk(4) may have used bounce buffers on systems that have more than 4GB memory. You can verify whether msk(4) is using bounce buffers by checking the output of sysctl hw.busdma. For instance, hw.busdma.zone0.total_bounced counter would increase while network operation is in progress. If patch above works you wouldn't see the counter change anymore and it would also enhance network performance since it wouldn't have to copy from or to bounce buffers.

Sorry for late reply. After applying this patch, I still see total_bounced increasing:

hw.busdma.zone0.total_bounced: 441

Hmm, I guess it could be caused by other drivers in the system. Can you verify whether all other drivers in the system use 64bit DMA? I think just testing msk(4) with netperf/iperf will make it clear (i.e. no disk access).

Note that I have MSI disabled to work around some issues with the card becoming wedged:

hw.pci.enable_msix=0
hw.pci.enable_msi=0

MSI has nothing to do with 64bit DMA.

Possibly relevant bits of dmesg:

FreeBSD 9.0-CURRENT #4 r222717M: Sun Jun 5 12:27:07 EDT 2011
CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz (3000.06-MHz K8-class CPU)
  Origin = GenuineIntel Id = 0x10676 Family = 6 Model = 17 Stepping = 6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x8e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
real memory = 8589934592 (8192 MB)
avail memory = 8246677504 (7864 MB)
Event timer LAPIC quality 400
ACPI APIC Table: <IntelR AWRDACPI>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
ioapic0: Changing APIC ID to 4
ioapic0 <Version 2.0> irqs 0-23 on motherboard
mskc0: <Marvell Yukon 88E8053 Gigabit Ethernet> port 0xae00-0xaeff mem 0xfdefc000-0xfdef irq 17 at device 0.0 on pci4
msk0: <Marvell Technology Group Ltd. Yukon EC Id 0xb6 Rev 0x02> on mskc0
msk0: Ethernet address: 00:01:29:a3:3c:a3
miibus0: <MII bus> on msk0
e1000phy0: <Marvell 88E Gigabit PHY> PHY 0 on miibus0
e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
msk0: link state changed to UP
CFT: msk(4) 64bit DMA support
Hi,

Here is a patch that implements 64bit DMA on msk(4). If you use msk(4) on a system that has more than 4GB of memory, please try the patch at the following URL and let me know whether it works or not. You need the latest msk(4) in HEAD to apply the patch.

http://people.freebsd.org/~yongari/msk/msk.64bit.dma.diff

Previously msk(4) may have used bounce buffers on systems that have more than 4GB of memory. You can verify whether msk(4) is using bounce buffers by checking the output of sysctl hw.busdma. For instance, the hw.busdma.zone0.total_bounced counter would increase while network operation is in progress. If the patch above works, you should no longer see the counter change, and network performance should also improve since the driver no longer has to copy to or from bounce buffers.

Thanks.
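Why a 32-bit DMA limit leads to bounce buffers on a machine with more than 4GB of memory can be illustrated with a small check. This is a rough sketch and not the busdma implementation: `sketch_needs_bounce` and its parameters are hypothetical, and real busdma also considers alignment and boundary constraints.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: does a buffer of len bytes at physical address
 * paddr extend above the highest address the device can reach?  If so,
 * the data has to be copied through a bounce buffer below that limit.
 * The comparison is arranged so that paddr + len cannot overflow.
 */
int
sketch_needs_bounce(uint64_t paddr, uint64_t len, uint64_t dev_maxaddr)
{
	assert(len > 0);
	if (paddr > dev_maxaddr)
		return (1);
	return (len - 1 > dev_maxaddr - paddr);
}
```

With a device limit of 0xFFFFFFFF, any buffer that lands above the 4GB mark needs bouncing; with the limit raised to the full 64-bit range, the same buffer can be used directly, which is the copy the patch is meant to avoid.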
Re: [head tinderbox] failure on i386/i386
On Sat, May 07, 2011 at 04:18:08AM +, FreeBSD Tinderbox wrote: TB --- 2011-05-07 02:10:00 - tinderbox 2.7 running on freebsd-current.sentex.ca TB --- 2011-05-07 02:10:00 - starting HEAD tinderbox run for i386/i386 TB --- 2011-05-07 02:10:00 - cleaning the object tree TB --- 2011-05-07 02:10:28 - cvsupping the source tree TB --- 2011-05-07 02:10:28 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/HEAD/i386/i386/supfile TB --- 2011-05-07 02:10:41 - building world TB --- 2011-05-07 02:10:41 - MAKEOBJDIRPREFIX=/obj TB --- 2011-05-07 02:10:41 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2011-05-07 02:10:41 - TARGET=i386 TB --- 2011-05-07 02:10:41 - TARGET_ARCH=i386 TB --- 2011-05-07 02:10:41 - TZ=UTC TB --- 2011-05-07 02:10:41 - __MAKE_CONF=/dev/null TB --- 2011-05-07 02:10:41 - cd /src TB --- 2011-05-07 02:10:41 - /usr/bin/make -B buildworld World build started on Sat May 7 02:10:42 UTC 2011 Rebuilding the temporary build tree stage 1.1: legacy release compatibility shims stage 1.2: bootstrap tools stage 2.1: cleaning up the object tree stage 2.2: rebuilding the object tree stage 2.3: build tools stage 3: cross tools stage 4.1: building includes stage 4.2: building libraries stage 4.3: make dependencies stage 4.4: building everything World build completed on Sat May 7 04:07:08 UTC 2011 TB --- 2011-05-07 04:07:08 - generating LINT kernel config TB --- 2011-05-07 04:07:08 - cd /src/sys/i386/conf TB --- 2011-05-07 04:07:08 - /usr/bin/make -B LINT TB --- 2011-05-07 04:07:08 - building LINT kernel TB --- 2011-05-07 04:07:08 - MAKEOBJDIRPREFIX=/obj TB --- 2011-05-07 04:07:08 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2011-05-07 04:07:08 - TARGET=i386 TB --- 2011-05-07 04:07:08 - TARGET_ARCH=i386 TB --- 2011-05-07 04:07:08 - TZ=UTC TB --- 2011-05-07 04:07:08 - __MAKE_CONF=/dev/null TB --- 2011-05-07 04:07:08 - cd /src TB --- 2011-05-07 04:07:08 - /usr/bin/make -B buildkernel KERNCONF=LINT Kernel build for LINT started on Sat May 7 04:07:08 UTC 2011 stage 1: 
configuring the kernel stage 2.1: cleaning up the object tree stage 2.2: rebuilding the object tree stage 2.3: build tools stage 3.1: making dependencies stage 3.2: building everything [...] ld -b binary -d -warn-common -r -d -o wpifw.fwo wpi.fw cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/xe/if_xe.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/xe/if_xe_pccard.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/xl/if_xl.c
cc1: warnings being treated as errors
/src/sys/dev/xl/if_xl.c: In function 'xl_poll_locked':
/src/sys/dev/xl/if_xl.c:2383: warning: implicit declaration of function 'xl_stats_update_locked'
/src/sys/dev/xl/if_xl.c:2383: warning: nested extern declaration of 'xl_stats_update_locked'
*** Error code 1

Sorry for the breakage. Should be fixed now.
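The "implicit declaration" warning in the log above means the compiler reached a call before seeing any declaration of the function, and -Werror turned that into a build failure. The actual fix committed to if_xl.c is not shown in this thread; the sketch below only illustrates one common remedy, a forward declaration ahead of the first use, with hypothetical `sketch_` names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical softc with two counters, for illustration only. */
struct sketch_softc {
	int polls;
	int stats_updates;
};

/*
 * Forward declaration.  Without this line the call inside
 * sketch_poll_locked() would be an implicit declaration, which
 * -Werror turns into a build failure.
 */
static void	sketch_stats_update_locked(struct sketch_softc *);

int
sketch_poll_locked(struct sketch_softc *sc)
{
	assert(sc != NULL);
	sc->polls++;
	sketch_stats_update_locked(sc);
	return (sc->stats_updates);
}

static void
sketch_stats_update_locked(struct sketch_softc *sc)
{
	sc->stats_updates++;
}
```

A prototype that exists only under some kernel option can produce the same warning when a LINT build exercises a different combination of options, which is one reason such breakage may show up on the tinderbox first.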
Re: [regression] unable to boot: no GEOM devices found.
On Tue, Apr 12, 2011 at 11:12:55PM +0300, Alexander Motin wrote:
David Naylor wrote:
On Tuesday 12 April 2011 08:17:51 Alexander Motin wrote:
David Naylor wrote:

I am running -current and since a few days ago (at least 2011/04/11) I am unable to boot. The boot process stops when it looks for a bootable device. The prompt (when pressing '?') does not display any device, and yielding one second (or more) to the kernel (by pressing '.') does not improve the situation. A known working date is 2011/02/20. I am running amd64 on an nVidia MCP51 chipset.

MCP51... again...

I am willing to help any way I can.

You could start by capturing and showing a verbose dmesg. Full, or at least the parts related to disks.

I captured the dmesg output for both the old (working) kernel and the new (bad) kernel. See attached for the difference between the two. If you need the full dmesg please let me know. One thing I found is that the old kernel would not boot if I simply rebooted from the bad kernel. I had to do a hard power off before the old kernel would work again. Is some device state surviving between reboots?

+ata2: reiniting channel ..
+ata2: SATA connect time=0ms status=0113
+ata2: reset tp1 mask=01 ostat0=58 ostat1=00
+ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
+ata2: reset tp2 stat0=50 stat1=00 devices=0x1
+ata2: reinit done ..
+unknown: FAILURE - ATA_IDENTIFY timed out LBA=0

Since all devices are detected but not responding to commands, I would suppose that there is something wrong with ATA interrupts. There is a long chain of interrupt problems in this chipset. I have already tried to debug one case where ATA wasn't generating interrupts at all. Unfortunately, without success -- requests were executing, but not generating interrupts; it didn't look like an ATA driver problem.

As for a candidate revision that may have triggered your problem, I would look at this message:

+pcib0: Enabling MSI window for HyperTransport slave at pci0:0:9:0

At least it is recent (SVN revs 219737,219740 on 2011-03-18 by jhb) and it is interrupt related.

Does the driver disable MSI for MCP51? I think jhb's patch fixed one MSI issue affecting all MCP chipsets.
Re: xterm -C and TIOCCONS vs. PRIV_TTY_CONSOLE
On Fri, Jan 07, 2011 at 11:53:06AM +0100, Gary Jennejohn wrote:
On Thu, 6 Jan 2011 21:09:05 -0800 Garrett Cooper gcoo...@freebsd.org wrote:
On Thu, Jan 6, 2011 at 8:49 PM, Craig Leres le...@ee.lbl.gov wrote:
On 01/06/11 20:05, Garrett Cooper wrote:

Just to make sure we're both on the same page:

$ grep xterm /etc/ttys
ttyv0	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv1	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv2	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv3	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv4	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv5	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv6	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv7	"/usr/libexec/getty Pc"		xterm	on  secure
ttyv8	"/usr/local/bin/xdm -nodaemon"	xterm	off secure

No, that's not what mine looks like. I changed it to match and rebooted but it doesn't help with the TIOCCONS issue. When I run xinit, it starts up the xterm -C which does a TIOCCONS. The 8.1 kernel checks for PRIV_TTY_CONSOLE which isn't set and denies the request:

	case TIOCCONS:
		/* Set terminal as console TTY. */
		if (*(int *)data) {
			error = priv_check(td, PRIV_TTY_CONSOLE);
			if (error)
				return (error);
			/*
			 * XXX: constty should really need to be locked!
			 * XXX: allow disconnected constty's to be stolen!
			 */
			if (constty == tp)
				return (0);
			if (constty != NULL)
				return (EBUSY);
			tty_unlock(tp);
			constty_set(tp);
			tty_lock(tp);
		} else if (constty == tp) {
			constty_clear();
		}
		return (0);

There's nothing I see in all of /usr/src that turns on PRIV_TTY_CONSOLE in any case. You could rewrite the above like this:

	case TIOCCONS:
		/* Set terminal as console TTY. */
		if (*(int *)data) {
			return (EPERM);
		} else if (constty == tp) {
			constty_clear();
		}
		return (0);

and it won't change any behavior.

Ok -- figured I would ask about the obvious. I wish I could help you further right now, but unfortunately I have a lot on my plate. I've CCed ed@ and the list again so that someone else might be able to chime in and help you further.

I'd say there are a few factors which come into play here:
1) the setting of security.bsd.suser_enabled, default 1
2) kern_tty.c checking for a cred which is never set
3) whether xterm is setuid root

a) suser_enabled is almost guaranteed to be 1 on OP's system since just about nothing works when it is set to 0 (tried that)
b) the kernel checking for the cred PRIV_TTY_CONSOLE is probably a bug since it never gets set anywhere. However, this usually isn't noticed because
c) xterm is generally setuid root and the logic in priv_check_cred() in kern_priv.c doesn't even look at what cred is set to, except for a few which can raise some limits, because cr_uid is 0 (super user)

So, the crux of the matter is whether OP's xterm is setuid root. My xterm is and I can run 'xterm -C' without a problem.

It seems I'm seeing this one on 8.2R. Of course, xterm is setuid root. I hacked tty.c to return success for TIOCCONS, so I was able to see kernel messages, but messages passed through syslog still do not work. :-(
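The decision described in points a) to c) above can be modeled in a few lines of userland C. This is a deliberately simplified sketch: `sketch_priv_check` and `sketch_tioccons_set` are hypothetical names, the real priv_check_cred() handles more cases (jails, MAC policies, a few privileges treated specially), and the "console already taken" state is reduced to a single flag.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical simplification of the privilege decision: with the
 * superuser policy enabled, uid 0 is granted the request regardless
 * of which privilege was named; everyone else is refused.
 */
int
sketch_priv_check(unsigned int uid, int suser_enabled)
{
	if (suser_enabled && uid == 0)
		return (0);
	return (EPERM);
}

/*
 * Models the "set console" branch of TIOCCONS: returns 0 on success,
 * otherwise an errno value.  console_busy stands for "another tty
 * already holds the console".
 */
int
sketch_tioccons_set(unsigned int uid, int suser_enabled, int console_busy)
{
	int error;

	assert(console_busy == 0 || console_busy == 1);
	error = sketch_priv_check(uid, suser_enabled);
	if (error != 0)
		return (error);
	if (console_busy)
		return (EBUSY);
	return (0);
}
```

Under this model a setuid-root xterm (effective uid 0) passes the check, while an xterm running with an ordinary uid gets EPERM, which is why the thread narrows the question down to whether xterm is setuid root.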