Re: CURRENT: re(4) crashing system
On Sun, 20 Nov 2016 16:43:52 +0900
YongHyeon PYUN wrote:

> On Sat, Nov 19, 2016 at 07:44:35PM +0100, O. Hartmann wrote:
>
> [...]
>
> > In another thread I complained about permanent crashes on several "older"
> > Intel archi
Re: CURRENT: re(4) crashing system
On Sat, Nov 19, 2016 at 07:44:35PM +0100, O. Hartmann wrote:

[...]

> In another thread I complained about permanent crashes on several "older"
> Intel architectures (IvyBridge and down). It has been revealed, that
>
> option FLOWTABLE
>
> in the kernel, which is part of my custom kernels a long time for now, has
> been identified as the culprit on those systems. Commenting out that special
> option solved the problem!
>
> Interestingly, also commenting out this option from the ker
Re: CURRENT: re(4) crashing system
On Mon, 7 Nov 2016 11:16:23 +0900
YongHyeon PYUN wrote:

> [...]
>
> Thanks a lot for the backtrace. This backtrace is not the one I
> expected and I guess the issue is related with cached route removal
> on interface down. Quick looking over the code didn't reveal the
> cause of crash(I'm not familiar with that part code). Probably
> gnn@ may have better idea what's going on here(CCed).
>
> Thanks.

In another thread I complained about permanent crashes on several "older"
Intel architectures (Ivy Bridge and older). It turned out that

option FLOWTABLE

in the kernel configuration, which has been part of my custom kernels for a
long time, is the culprit on those systems. Commenting out that option solved
the problem!

Interestingly, after also commenting out this option in the kernel config of
the laptop discussed in this thread, I have not been able, as of this writing,
to reproduce the crashes. So it may be that the same FLOWTABLE issue was
triggered by plugging and/or unplugging the LAN cable. Usually I was able to
trigger the core dump after two or three rounds,
Re: CURRENT: re(4) crashing system
On Sun, Nov 06, 2016 at 01:20:36PM +0100, Hartmann, O. wrote:

[...]

> This time I hope I got it right!
>
> Attached you'll find the latest CURRENT's backtrace on the provoked
> crash (plug and unplug).
>
> I also saved the kernel and coredump, so if you need me to do further
> investigations, please let me know.
>

Thanks a lot for the backtrace. This backtrace is not the one I
expected, and I guess the issue is related to cached route removal
on interface down. A quick look over the code didn't reveal the
cause of the crash (I'm not familiar with that part of the code).
gnn@ may have a better idea of what's going on here (CCed).

Thanks.
___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: CURRENT: re(4) crashing system
On Mon, 31 Oct 2016 11:12:22 +0900
YongHyeon PYUN wrote:

> [...]
>
> It seems the attachment was stripped.

This time I hope I got it right!

Attached you'll find the latest CURRENT's backtrace of the provoked
crash (plug and unplug).

I also saved the kernel and core dump, so if you need me to investigate
further, please let me know.

Thanks in advance and kind regards,

oliver

core.txt.0
Description: Binary data
Re: CURRENT: re(4) crashing system
On Mon, 31 Oct 2016 11:12:22 +0900
YongHyeon PYUN wrote:

> [...]
>
> It seems the attachment was stripped.

[...]

Sorry for the late reply. Indeed, someone forgot to attach the dump/core
info, and that someone seems to be me. I have severe time constraints, but I
will prepare another crash dump this weekend with a most recent CURRENT.

My apologies for this,

kind regards,
Oliver
Re: CURRENT: re(4) crashing system
On Fri, Oct 28, 2016 at 09:21:13PM +0200, Hartmann, O. wrote:

[...]

> Attached, you'll find the backtrace of the crash. This time it was
> really easy - just one pull of the LAN cabling - and we are happy :-/
>
> Please let me know if you need something else. I will return to normal
> operations (disabling debugging) due to CURRENT is very unstable at the
> moment on other hosts beyond r307157.
>

It seems the attachment was stripped.
Re: CURRENT: re(4) crashing system
On Thu, 27 Oct 2016 10:00:04 +0900
YongHyeon PYUN wrote:

> [...]
>
> Hmm, it seems you know how to trigger the issue. When you unplug
> UTP cable was there active network traffic on re(4) device?
> It would be helpful to know which event triggers the crash(e.g.
> unplugging or plugging). And would you show me backtrace of panic?
>
> [...]
>
> Yup, netmap(4) has nothing to do with the crash.
>
> Thanks.

Attached you'll find the backtrace of the crash. This time it was really
easy: just one pull of the LAN cable, and we are happy :-/

Please let me know if you need anything else. I will return to normal
operation (disabling debugging) because CURRENT is very unstable at the
moment on other hosts beyond r307157.

Kind regards,

oh
Re: CURRENT: re(4) crashing system
Am Thu, 27 Oct 2016 10:00:04 +0900 YongHyeon PYUN schrieb: > On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote: > > On Tue, 25 Oct 2016 11:05:38 +0900 > > YongHyeon PYUN wrote: > > > > [...] > > > > I'm not sure but it's likely the issue is related with EEE/Green > > > Ethernet handling. EEE is negotiated feature with link partner. If > > > you directly connect your laptop to non-EEE capable link partner > > > like other re(4) box without switches you may be able to tell > > > whether the issue is EEE/Green Ethernet related one or not. > > > > Me either since when I discovered a problem the first time with > > CURRENT, that was the Friday before last week's Friday, there was a > > unlucky coicidence: I got the new switch, FreeBSD introduced a serious > > bug and I changed the NICs. > > > > The laptop, the last in the row of re(4) equipted systems on which I > > use the Realtek NIC, does well now with Green IT technology, but > > crashes on plugging/unplugging - not on each event, but at least in one > > of ten. > > Hmm, it seems you know how to trigger the issue. When you unplug > UTP cable was there active network traffic on re(4) device? > It would be helpful to know which event triggers the crash(e.g. > unplugging or plugging). And would you show me backtrace of panic? Yes, as I wrote, plugging and unplugging. Usually, there is no traffic I'm aware of, simply the - a hunch - attempt to renegotiate the connection triggers the crash. As I can force by bringing up and down the port on the switch. Of course you can get a panic/backtrace, but I need the weekend. I complained in another thread about the inability of getting a core - I use ELI encrypted swap, so I shot myself at that point. > > > I guess the Green IT issue is more a unlucky guess of mine and went > > hand in hand with the problem I face with CURRENT right now on some > > older, Non UEFI machines. > > > > Ok. > > [...] 
> >
> > As requested the information about re0 and rgephy0 on the laptop
> > (Lenovo E540)
> >
> > [...]
> >
> > rgephy0: PHY 1 on miibus0
> > rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX,
> > 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master,
> > 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
> >
> > re0: port
> > 0x3000-0x30ff mem 0xf0d04000-0xf0d04fff,0xf0d0-0xf0d03fff at device
> > 0.0 on pci2
> > re0: Using 1 MSI-X message
> > re0: ASPM disabled
> > re0: Chip rev. 0x5080
> > re0: MAC rev. 0x0010
>
> This looks like 8168GU controller.
>
> [...]
>
> > I use options netmap in kernel config, but the problem is also present
> > without this option - just for the record.
> >
>
> Yup, netmap(4) has nothing to do with the crash.
>
> Thanks.
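On the missing core dump mentioned above: a minimal rc.conf sketch, assuming the dump is pointed at a dedicated, unencrypted partition instead of the ELI-encrypted swap. The device name /dev/ada0p4 is hypothetical and not taken from this thread. A dump overwrites whatever is on that partition, so it must not hold data. dumpdev and dumpdir are the usual rc.conf knobs; whether a given CURRENT revision can dump to encrypted swap directly is not settled here.

```shell
# /etc/rc.conf fragment (sh syntax); a sketch, not a tested configuration.
# ASSUMPTION: /dev/ada0p4 is a hypothetical, unencrypted, otherwise unused
# partition that is large enough to hold the kernel dump.
dumpdev="/dev/ada0p4"   # device the kernel writes the dump to on panic
dumpdir="/var/crash"    # directory where savecore(8) stores the dump at next boot
```

With something like this in place, the next boot after a panic should leave the saved dump under /var/crash, from which a backtrace can be extracted.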
Re: CURRENT: re(4) crashing system
On Tue, Oct 25, 2016 at 07:03:38AM +0200, Hartmann, O. wrote:
> On Tue, 25 Oct 2016 11:05:38 +0900
> YongHyeon PYUN wrote:
>
> [...]
>
> > I'm not sure but it's likely the issue is related with EEE/Green
> > Ethernet handling. EEE is negotiated feature with link partner. If
> > you directly connect your laptop to non-EEE capable link partner
> > like other re(4) box without switches you may be able to tell
> > whether the issue is EEE/Green Ethernet related one or not.
>
> Me either since when I discovered a problem the first time with
> CURRENT, that was the Friday before last week's Friday, there was an
> unlucky coincidence: I got the new switch, FreeBSD introduced a serious
> bug and I changed the NICs.
>
> The laptop, the last in the row of re(4) equipped systems on which I
> use the Realtek NIC, does well now with Green IT technology, but
> crashes on plugging/unplugging - not on each event, but at least in one
> of ten.

Hmm, it seems you know how to trigger the issue. When you unplugged the UTP cable, was there active network traffic on the re(4) device? It would be helpful to know which event triggers the crash (e.g. unplugging or plugging). And would you show me the backtrace of the panic?

> I guess the Green IT issue is more an unlucky guess of mine and went
> hand in hand with the problem I face with CURRENT right now on some
> older, non-UEFI machines.
>

Ok.

[...]

>
> As requested the information about re0 and rgephy0 on the laptop
> (Lenovo E540)
>
> [...]
>
> rgephy0: PHY 1 on miibus0
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX,
> 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master,
> 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
>
> re0: port
> 0x3000-0x30ff mem 0xf0d04000-0xf0d04fff,0xf0d0-0xf0d03fff at device
> 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: ASPM disabled
> re0: Chip rev. 0x5080
> re0: MAC rev. 0x0010

This looks like 8168GU controller.

[...]
> I use options netmap in kernel config, but the problem is also present
> without this option - just for the record.
>

Yup, netmap(4) has nothing to do with the crash.

Thanks.
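The rgephy0 line quoted above is a comma-separated list of the media the PHY reports. A small sketch, assuming one only wants to check whether an exact media name occurs in such a list; has_media is a hypothetical helper, not part of any FreeBSD tool. Note that these dmesg names are not necessarily the words ifconfig expects; on the machine itself, `ifconfig -m re0` lists the media the driver accepts.

```shell
#!/bin/sh
# has_media LIST NAME
# Succeeds (exit status 0) if NAME is an exact entry of the comma-separated LIST.
# Hypothetical helper for illustration only.
has_media() {
    printf '%s\n' "$1" |
        tr ',' '\n' |                 # one entry per line
        sed 's/^ *//; s/ *$//' |      # trim surrounding blanks
        grep -qxF -- "$2"             # whole-line, fixed-string match
}

# Media list as reported by rgephy0 in this thread.
media='none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow'

if has_media "$media" "1000baseT-FDX"; then
    echo "PHY reports 1000baseT-FDX"
fi
```

The exact whole-line match matters here: a plain substring search for 100baseTX would also hit longer names such as 100baseTX-FDX.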
Re: CURRENT: re(4) crashing system
On Tue, 25 Oct 2016 11:05:38 +0900 YongHyeon PYUN wrote:

> On Mon, Oct 24, 2016 at 02:03:37PM +0200, O. Hartmann wrote:
> > On Mon, 24 Oct 2016 14:14:00 +0900
> > YongHyeon PYUN wrote:
> >
> > > On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote:
> > > > I tried to report earlier here that CURRENT does have some
> > > > serious problems right now and one of those problems seems to
> > > > be triggered by the recent re(4) driver. The problem is also
> > > > present in recent 11-STABLE!
> > > >
> > > > Below, you'll find pciconf-output regarding the device on a
> > > > Lenovo E540 Laptop I can test on and trigger the problem.
> > > >
> > > > The phenomenon is that this NIC does not negotiate 1000baseTX,
> > > > it is always falling back to 100baseTX although the device
> > > > claims to be a 1 GBit capable device.
> > > >
> > > > When I try to put the device manually into 1000baseTX mode via
> > > >
> > > > ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4)
> > > > driver)
> > > >
> > > > it is possible to crash the system. The system also crashes when
> > > > plugging/unplugging the LAN cord - I guess the renegotiation is
> > > > triggering this crash immediately.
> > > >
> > > > I tried with several switches and routers capable of 1 GBit and
> > > > it seems to be independent from the network hardware in use.
> > > >
> > > > I tried to capture a backtrace when the kernel crashes, but I
> > > > do not know how to save the kernel debugger output.
> > > > Although I configured according the handbook debugging, there
> > > > is no coredump at all.
> > > >
> > > > Advice is appreciated - if anybody is interested in solving
> > > > this.
> > >
> > > There were several instability reports on re(4). I vaguely guess
> > > it would be related with some missing initializations for certain
> > > controllers. Unfortunately, there is no publicly available
> > > datasheet for those controllers and it's not likely to get access
> > > to it in near future.
> > > It seems vendor's FreeBSD driver accesses
> > > lots of magic registers as well as loading DSP fixups. I have no
> > > idea what it wants to do and re(4) used to heavily rely on
> > > power-on default register values. Engineering samples I have do
> > > not show instabilities so it wouldn't be easy to identify the
> > > issue.
> > >
> > > Probably the first step to address the issue would be identifying
> > > those chips and narrowing down the scope of guessing. Would you
> > > show me the dmesg output (re(4) and rgephy(4) only)? pciconf(8)
> > > output is useless here since RealTek uses the same PCI id for
> > > PCIe variants.
> > >
> > > BTW, I was told that the vendor's FreeBSD driver seems to work
> > > fine for normal usage pattern. The vendor's driver triggered an
> > > instant panic and lacked H/W offloading features in the past. It
> > > might have changed though.
> >
> > The problems with re(4) drivers arose again, when I bought some
> > "green" equipment, mainly switches, which reduces power emission on
> > short cables or non-connected ports. This brought down some servers
> > with re(4) chipsets immediately and I had no clue what happened. I
> > do not know whether this is a
>
> I'm not sure but it's likely the issue is related with EEE/Green
> Ethernet handling. EEE is negotiated feature with link partner. If
> you directly connect your laptop to non-EEE capable link partner
> like other re(4) box without switches you may be able to tell
> whether the issue is EEE/Green Ethernet related one or not.

Me neither, since when I discovered the problem the first time with CURRENT, that was the Friday before last week's Friday, there was an unlucky coincidence: I got the new switch, FreeBSD introduced a serious bug, and I changed the NICs.

The laptop, the last of the re(4)-equipped systems on which I use the Realtek NIC, now does well with Green IT technology, but crashes on plugging/unplugging - not on every event, but at least one time in ten.
I guess the Green IT issue is more an unlucky guess of mine and went hand in hand with the problem I face with CURRENT right now on some older, non-UEFI machines.

>
> > single fate so to speak, or this problem will arise for others,
> > too. We exchanged on serving hardware all Realtek NICs with those
> > from Intel, and luckily some server mainboards already have Intel
> > PHY or NICs. The Broadcom devices we have on some older Fujitsu
> > hardware is also stable like a charm, even with the new power
> > saving switches.
>
> bge(4) also lacks EEE support (Publicly available datasheet is too
> sanitized one). bge(4) firmware probably does not announce EEE
> capability by default in link establishment while recent re(4)
> devices seem to unconditionally announce EEE. Generally EEE
> handling requires a kind of handshake for link state change from
> MAC/PHY.
>
> > While we can swap on server or workstation platforms the NIC, it is
> > almost impossible on laptops
Re: CURRENT: re(4) crashing system
On Mon, Oct 24, 2016 at 02:03:37PM +0200, O. Hartmann wrote:
> On Mon, 24 Oct 2016 14:14:00 +0900
> YongHyeon PYUN wrote:
>
> > On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote:
> > > I tried to report earlier here that CURRENT does have some serious
> > > problems right now and one of those problems seems to be triggered by
> > > the recent re(4) driver. The problem is also present in recent 11-STABLE!
> > >
> > > Below, you'll find pciconf-output regarding the device on a Lenovo E540
> > > Laptop I can test on and trigger the problem.
> > >
> > > The phenomenon is that this NIC does not negotiate 1000baseTX, it is
> > > always falling back to 100baseTX although the device claims to be a 1
> > > GBit capable device.
> > >
> > > When I try to put the device manually into 1000baseTX mode via
> > >
> > > ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver)
> > >
> > > it is possible to crash the system. The system also crashes when
> > > plugging/unplugging the LAN cord - I guess the renegotiation is
> > > triggering this crash immediately.
> > >
> > > I tried with several switches and routers capable of 1 GBit and it
> > > seems to be independent from the network hardware in use.
> > >
> > > I tried to capture a backtrace when the kernel crashes, but I do not
> > > know how to save the kernel debugger output. Although I configured
> > > according the handbook debugging, there is no coredump at all.
> > >
> > > Advice is appreciated - if anybody is interested in solving this.
> > >
> >
> > There were several instability reports on re(4). I vaguely guess
> > it would be related with some missing initializations for certain
> > controllers. Unfortunately, there is no publicly available
> > datasheet for those controllers and it's not likely to get access
> > to it in near future. It seems vendor's FreeBSD driver accesses
> > lots of magic registers as well as loading DSP fixups.
> > I have no
> > idea what it wants to do and re(4) used to heavily rely on power-on
> > default register values. Engineering samples I have do not show
> > instabilities so it wouldn't be easy to identify the issue.
> >
> > Probably the first step to address the issue would be identifying
> > those chips and narrowing down the scope of guessing. Would you
> > show me the dmesg output (re(4) and rgephy(4) only)? pciconf(8)
> > output is useless here since RealTek uses the same PCI id for
> > PCIe variants.
> >
> > BTW, I was told that the vendor's FreeBSD driver seems to work fine
> > for normal usage pattern. The vendor's driver triggered an instant
> > panic and lacked H/W offloading features in the past. It might
> > have changed though.
>
> The problems with re(4) drivers arose again, when I bought some "green"
> equipment, mainly switches, which reduces power emission on short cables or
> non-connected ports. This brought down some servers with re(4) chipsets
> immediately and I had no clue what happened. I do not know whether this is a

I'm not sure, but it's likely the issue is related to EEE/Green Ethernet handling. EEE is a feature negotiated with the link partner. If you connect your laptop directly to a non-EEE-capable link partner, such as another re(4) box, without switches in between, you may be able to tell whether the issue is EEE/Green Ethernet related or not.

> single fate so to speak, or this problem will arise for others, too. We
> exchanged on serving hardware all Realtek NICs with those from Intel, and
> luckily some server mainboards already have Intel PHY or NICs. The Broadcom
> devices we have on some older Fujitsu hardware is also stable like a charm,
> even with the new power saving switches.
>

bge(4) also lacks EEE support (the publicly available datasheet is too sanitized). The bge(4) firmware probably does not announce EEE capability by default during link establishment, while recent re(4) devices seem to announce EEE unconditionally.
Generally, EEE handling requires a kind of handshake for link state changes from the MAC/PHY.

> While we can swap on server or workstation platforms the NIC, it is almost
> impossible on laptops and the number of laptops with realtek chips seems to
> grow. It is a pity that the vendor of the chipsets reject supporting other
> OSes than Windows - or in some rare cases only Linux. After you wrote the
> answer, I checked on the net for suitable drivers and the situation seems
> bad for almost all OSes apart from commercial ones like Windooze and Apple
> OS X.
>
> As soon as I get hands on the laptop again, I'll send the requested
> information. I know that I played around with re(4) and rgephy(4) in the
> kernel, the rgephy(4) showed up on the dmesg, but I didn't see any effect -
> except that it offered some additional "media xxx-options-xxx" mostly appended
> with "flow" - but trying brought also down the system as plugging or unplugging.

rgephy(4) will show the recognized PHY H/W model. Another piece of information I'd like to know is OUI information of the PH
Re: CURRENT: re(4) crashing system
On Mon, 24 Oct 2016 14:14:00 +0900 YongHyeon PYUN wrote:

> On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote:
> > I tried to report earlier here that CURRENT does have some serious
> > problems right now and one of those problems seems to be triggered by
> > the recent re(4) driver. The problem is also present in recent 11-STABLE!
> >
> > Below, you'll find pciconf-output regarding the device on a Lenovo E540
> > Laptop I can test on and trigger the problem.
> >
> > The phenomenon is that this NIC does not negotiate 1000baseTX, it is
> > always falling back to 100baseTX although the device claims to be a 1
> > GBit capable device.
> >
> > When I try to put the device manually into 1000baseTX mode via
> >
> > ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver)
> >
> > it is possible to crash the system. The system also crashes when
> > plugging/unplugging the LAN cord - I guess the renegotiation is
> > triggering this crash immediately.
> >
> > I tried with several switches and routers capable of 1 GBit and it
> > seems to be independent from the network hardware in use.
> >
> > I tried to capture a backtrace when the kernel crashes, but I do not
> > know how to save the kernel debugger output. Although I configured
> > according the handbook debugging, there is no coredump at all.
> >
> > Advice is appreciated - if anybody is interested in solving this.
> >
>
> There were several instability reports on re(4). I vaguely guess
> it would be related with some missing initializations for certain
> controllers. Unfortunately, there is no publicly available
> datasheet for those controllers and it's not likely to get access
> to it in near future. It seems vendor's FreeBSD driver accesses
> lots of magic registers as well as loading DSP fixups. I have no
> idea what it wants to do and re(4) used to heavily rely on power-on
> default register values.
> Engineering samples I have do not show
> instabilities so it wouldn't be easy to identify the issue.
>
> Probably the first step to address the issue would be identifying
> those chips and narrowing down the scope of guessing. Would you
> show me the dmesg output (re(4) and rgephy(4) only)? pciconf(8)
> output is useless here since RealTek uses the same PCI id for
> PCIe variants.
>
> BTW, I was told that the vendor's FreeBSD driver seems to work fine
> for normal usage pattern. The vendor's driver triggered an instant
> panic and lacked H/W offloading features in the past. It might
> have changed though.

The problems with re(4) drivers arose again when I bought some "green" equipment, mainly switches, which reduce power emission on short cables or unconnected ports. This brought down some servers with re(4) chipsets immediately, and I had no clue what happened. I do not know whether this is an isolated case, so to speak, or whether this problem will arise for others, too. On server hardware we replaced all Realtek NICs with ones from Intel, and luckily some server mainboards already have Intel PHYs or NICs. The Broadcom devices we have on some older Fujitsu hardware are also stable like a charm, even with the new power-saving switches.

While we can swap the NIC on server or workstation platforms, that is almost impossible on laptops, and the number of laptops with Realtek chips seems to grow. It is a pity that the vendor of the chipsets refuses to support OSes other than Windows - or, in some rare cases, Linux. After you wrote your answer, I searched the net for suitable drivers, and the situation seems bad for almost all OSes apart from commercial ones like Windooze and Apple OS X.

As soon as I get my hands on the laptop again, I'll send the requested information.
I know that I played around with re(4) and rgephy(4) in the kernel; rgephy(4) showed up in dmesg, but I didn't see any effect - except that it offered some additional "media xxx-options-xxx" entries, mostly with a "flow" suffix - but trying them also brought down the system, just as plugging or unplugging does. The last kernel I compiled was then without rgephy(4) - the NIC worked as expected, but plugging/unplugging or power-down activity on a Netgear SoHo green-power switch brings the system down as usual.
Re: CURRENT: re(4) crashing system
On Sun, Oct 23, 2016 at 01:25:38PM +0200, Hartmann, O. wrote:
> I tried to report earlier here that CURRENT does have some serious
> problems right now and one of those problems seems to be triggered by
> the recent re(4) driver. The problem is also present in recent 11-STABLE!
>
> Below, you'll find pciconf-output regarding the device on a Lenovo E540
> Laptop I can test on and trigger the problem.
>
> The phenomenon is that this NIC does not negotiate 1000baseTX, it is
> always falling back to 100baseTX although the device claims to be a 1
> GBit capable device.
>
> When I try to put the device manually into 1000baseTX mode via
>
> ifconfig re0 media 1000baseTX mediaopt full-duplex (with re(4) driver)
>
> it is possible to crash the system. The system also crashes when
> plugging/unplugging the LAN cord - I guess the renegotiation is
> triggering this crash immediately.
>
> I tried with several switches and routers capable of 1 GBit and it
> seems to be independent from the network hardware in use.
>
> I tried to capture a backtrace when the kernel crashes, but I do not
> know how to save the kernel debugger output. Although I configured
> according the handbook debugging, there is no coredump at all.
>
> Advice is appreciated - if anybody is interested in solving this.
>

There were several instability reports on re(4). My vague guess is that it is related to some missing initializations for certain controllers. Unfortunately, there is no publicly available datasheet for those controllers and it's not likely that we will get access to one in the near future. It seems the vendor's FreeBSD driver accesses lots of magic registers as well as loading DSP fixups. I have no idea what it wants to do, and re(4) used to rely heavily on power-on default register values. The engineering samples I have do not show instabilities, so it won't be easy to identify the issue.

Probably the first step to address the issue would be identifying those chips and narrowing down the scope of guessing.
Would you show me the dmesg output (re(4) and rgephy(4) only)? pciconf(8) output is useless here since RealTek uses the same PCI ID for PCIe variants.

BTW, I was told that the vendor's FreeBSD driver seems to work fine for normal usage patterns. The vendor's driver triggered an instant panic and lacked H/W offloading features in the past. It might have changed though.
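For the dmesg request above, a minimal sketch of how the re(4) and rgephy(4) lines could be picked out of the boot messages. filter_re_lines is a hypothetical helper name; it only assumes that the driver messages start with the device name, a unit number and a colon, as in the output quoted elsewhere in this thread.

```shell
#!/bin/sh
# filter_re_lines: read dmesg-style text on stdin and keep only lines
# that start with "re<N>:" or "rgephy<N>:".
# Hypothetical helper for illustration only.
filter_re_lines() {
    grep -E '^(re|rgephy)[0-9]+:'
}

# Typical use on the affected machine:
#   dmesg | filter_re_lines
```

Anchoring the pattern at the start of the line keeps unrelated messages that merely contain "re" out of the result.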