Re: D-Link DGE530T issue
On Fri, Nov 17, 2017 at 03:12:58PM +0300, Mike Black wrote:
> Hello
>
> VPD is
> none0@pci0:5:1:0: class=0x020000 card=0x4b011086 chip=0x4b011086 rev=0x11 hdr=0x00
>     vendor     = 'J. Bond Computer Systems'
>     class      = network
>     subclass   = ethernet
>     bar   [10] = type Memory, range 32, base 0xfebec000, size 16384, enabled
>     bar   [14] = type I/O Port, range 32, base 0xee00, size 512, enabled
>     cap 01[48] = powerspec 2  supports D0 D1 D2 D3  current D3
>     cap 03[50] = VPD
>     VPD ident  = 'DGD-530T Ghgabht Ethernet @dapte'
                      ^       ^               ^

As Boris said, there are bit-flipping errors. I think it's better to
find another NIC. Even if it works after adding this device's vendor ID
to sk(4), you may encounter silent data corruption in the near future.

>
> But I see... it says some DGD-530T... Do not know why, because it's
> DGE-530T for sure.
>
> So you're saying this is hardware degradation?
> I will try to find some Windows host and plug it in there to check it.
>
> 2017-11-17 13:19 GMT+03:00 YongHyeon PYUN <pyu...@gmail.com>:
>
> > On Fri, Nov 17, 2017 at 09:55:03AM +0300, Mike Black wrote:
> > > Hello. I looked into svn code for 8.3R and 11.1R and there seems no
> > > changes in descriptors/identifiers. So I think that NIC is being
> > > wrongly identified during startup process - it is being recognized
> > > with a wrong PCI VID. How can this be checked or fixed?
> > > I use a loadable kernel module after a startup, so there is no
> > > useful messages during boot process.
> >
> > It seems to be a single-bit error, but if the card is dying there
> > would be no way to fix it. Given that pciconf(8) reports a VPD
> > capability, try to read it (i.e. pciconf -lcbvV). Generally VPD
> > contains a readable product string, so you may be able to tell
> > whether there are other errors. If the vendor ID is the only
> > corrupted field, you can simply patch the device ID in the driver.
> >
> > > On 15 Nov 2017, 11:00 PM, "Mike Black" <amdm...@gmail.com> wrote:
> > >
> > > > Hello
> > > >
> > > > I've got old PCI NIC D-Link DGE530T Rev 11 with SysKonnect chip on it.
> > > > Years ago it worked in FreeBSD 8/9 Stable with if_sk driver.
> > > >
> > > > Now I'm running
> > > > 11.1-STABLE FreeBSD 11.1-STABLE #1 r323214: Sat Nov 11 19:06:20 MSK 2017
> > > > amd_miek@diablo.miekoff.local:/usr/obj/usr/src/sys/DIABLO64 amd64
> > > > 1101502 1101502
> > > >
> > > > But recently I plugged this card back and it's not being recognized
> > > > by a driver.
> > > >
> > > > pciconf says that is
> > > > none0@pci0:5:1:0: class=0x020000 card=0x4b011086 chip=0x4b011086 rev=0x11 hdr=0x00
> > > >     vendor     = 'J. Bond Computer Systems'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > >     bar   [10] = type Memory, range 32, base 0xfebec000, size 16384, enabled
> > > >     bar   [14] = type I/O Port, range 32, base 0xee00, size 512, enabled
> > > >     cap 01[48] = powerspec 2  supports D0 D1 D2 D3  current D3
> > > >     cap 03[50] = VPD
> > > >
> > > > According to /usr/share/misc/pci_vendors this D-Link should have
> > > > 4b011186, not 4b011086.
> > > > I looked into driver code (if_sk) and it expects 1186 card also.
> > > > I googled about this issue but found no one similar in recent years.
> > > > So I'd like to know what's wrong - some changes in driver in recent
> > > > years or smth going wrong while OS detecting this NIC. But that's
> > > > confusing, because this exact NIC worked years ago...
> > > >
> > > > --
> > > > amd_miek
> > > > Think different.
> > > > Just superior.

_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
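The "single bit error" reading of the vendor ID, and the flipped characters in the VPD string, can be checked by XOR-ing what pciconf reported against what was expected. The following is a minimal sketch, not driver code: the helper names are hypothetical, and the "expected" VPD string is an assumption reconstructed from the card's model name rather than something taken from a healthy card.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the bits set in v (portable popcount, clears lowest set bit per pass). */
static unsigned bits_set(uint32_t v)
{
	unsigned n = 0;

	while (v != 0) {
		v &= v - 1;
		n++;
	}
	return (n);
}

/* Number of bit positions in which two 16-bit PCI IDs differ. */
static unsigned id_bit_distance(uint16_t seen, uint16_t expected)
{
	return (bits_set((uint32_t)(seen ^ expected)));
}

/* Total number of differing bits between two byte strings of length len. */
static unsigned str_bit_distance(const char *seen, const char *expected,
    size_t len)
{
	unsigned n = 0;

	for (size_t i = 0; i < len; i++)
		n += bits_set((uint8_t)(seen[i] ^ expected[i]));
	return (n);
}
```

With the values from this thread, id_bit_distance(0x1086, 0x1186) is 1, and comparing 'DGD-530T Ghgabht Ethernet @dapte' with the assumed 'DGE-530T Gigabit Ethernet Adapte' gives 4 differing bits over 32 bytes, each in the lowest bit of a byte. That pattern would be consistent with a failing data line rather than random corruption, although this sketch alone cannot prove it.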
Re: D-Link DGE530T issue
On Fri, Nov 17, 2017 at 09:55:03AM +0300, Mike Black wrote:
> Hello. I looked into svn code for 8.3R and 11.1R and there seems no
> changes in descriptors/identifiers. So I think that NIC is being
> wrongly identified during startup process - it is being recognized
> with a wrong PCI VID. How can this be checked or fixed?
> I use a loadable kernel module after a startup, so there is no useful
> messages during boot process.
>

It seems to be a single-bit error, but if the card is dying there would
be no way to fix it. Given that pciconf(8) reports a VPD capability, try
to read it (i.e. pciconf -lcbvV). Generally VPD contains a readable
product string, so you may be able to tell whether there are other
errors. If the vendor ID is the only corrupted field, you can simply
patch the device ID in the driver.

> On 15 Nov 2017, 11:00 PM, "Mike Black" wrote:
>
> > Hello
> >
> > I've got old PCI NIC D-Link DGE530T Rev 11 with SysKonnect chip on it.
> > Years ago it worked in FreeBSD 8/9 Stable with if_sk driver.
> >
> > Now I'm running
> > 11.1-STABLE FreeBSD 11.1-STABLE #1 r323214: Sat Nov 11 19:06:20 MSK 2017
> > amd_miek@diablo.miekoff.local:/usr/obj/usr/src/sys/DIABLO64 amd64
> > 1101502 1101502
> >
> > But recently I plugged this card back and it's not being recognized by
> > a driver.
> >
> > pciconf says that is
> > none0@pci0:5:1:0: class=0x020000 card=0x4b011086 chip=0x4b011086 rev=0x11 hdr=0x00
> >     vendor     = 'J. Bond Computer Systems'
> >     class      = network
> >     subclass   = ethernet
> >     bar   [10] = type Memory, range 32, base 0xfebec000, size 16384, enabled
> >     bar   [14] = type I/O Port, range 32, base 0xee00, size 512, enabled
> >     cap 01[48] = powerspec 2  supports D0 D1 D2 D3  current D3
> >     cap 03[50] = VPD
> >
> > According to /usr/share/misc/pci_vendors this D-Link should have
> > 4b011186, not 4b011086.
> > I looked into driver code (if_sk) and it expects 1186 card also.
> > I googled about this issue but found no one similar in recent years.
> > So I'd like to know what's wrong - some changes in driver in recent
> > years or smth going wrong while OS detecting this NIC. But that's
> > confusing, because this exact NIC worked years ago...
> >
> > --
> > amd_miek
> > Think different.
> > Just superior.
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Thu, Aug 27, 2015 at 11:29:28AM +0200, Johann Hugo wrote:
> It's working for me so far and I haven't seen any watchdog timeouts.
> With 10.2-RELEASE I got timeouts and lost connectivity in less than a
> minute.

Ok, great. Committed in r287238.
Thanks again.

> Johann
>
> On Wed, Aug 26, 2015 at 10:28 AM, Yonghyeon PYUN pyu...@gmail.com wrote:
> > On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote:
> > > 10.2-RELEASE does not work for me. It works for a very short while
> > > and then it stops with "msk0 watchdog timeout" errors.
> >
> > Thanks a lot for your report. This is the first report of msk(4)
> > watchdog timeouts on 10.2-RELEASE.
> >
> > > I'm not sure what patch Roosevelt was talking about, but the patch
> > > in this thread works for me:
> > > https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html
> > >
> > > I've changed MSK_STAT_ALIGN from 4096 to 8192 in if_mskreg.h and
> > > it's been running stable for the last week.
> >
> > I see. I'm under the impression that RX/TX descriptor ring alignment
> > would trigger the same issue, so it would be good to know how the
> > attached patch works on your box.
> > Thanks.
> >
> > > Johann
> > >
> > > On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN pyu...@gmail.com wrote:
> > > > On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
> > > > > Hi,
> > > > > So, I can confirm with the attached patch. I have a working
> > > > > msk0 that hasn't failed for the past month. I consider this
> > > > > problem fixed for me, since I have gone a long time without
> > > > > any problems. Thanks!
> > > >
> > > > I'm not sure which patch you used. Given that users reported
> > > > 10.2-RELEASE works, it would be great if you revert local patch
> > > > and try it again on 10.2-RELEASE.
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Wed, Aug 26, 2015 at 10:06:29AM +0200, Johann Hugo wrote:
> 10.2-RELEASE does not work for me. It works for a very short while and
> then it stops with "msk0 watchdog timeout" errors.

Thanks a lot for your report. This is the first report of msk(4)
watchdog timeouts on 10.2-RELEASE.

> I'm not sure what patch Roosevelt was talking about, but the patch in
> this thread works for me:
> https://lists.freebsd.org/pipermail/freebsd-stable/2015-April/082226.html
>
> I've changed MSK_STAT_ALIGN from 4096 to 8192 in if_mskreg.h and it's
> been running stable for the last week.

I see. I'm under the impression that RX/TX descriptor ring alignment
would trigger the same issue, so it would be good to know how the
attached patch works on your box.
Thanks.

> Johann
>
> On Sun, Aug 16, 2015 at 2:08 PM, Yonghyeon PYUN pyu...@gmail.com wrote:
> > On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
> > > Hi,
> > > So, I can confirm with the attached patch. I have a working msk0
> > > that hasn't failed for the past month. I consider this problem
> > > fixed for me, since I have gone a long time without any problems.
> > > Thanks!
> >
> > I'm not sure which patch you used. Given that users reported
> > 10.2-RELEASE works, it would be great if you revert local patch and
> > try it again on 10.2-RELEASE.

Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define MSK_STAT_ALIGN	4096
+#define MSK_RING_ALIGN	32768
+#define MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 09:00:35AM -0400, Rick Macklem wrote: Hans Petter Selasky wrote: On 08/19/15 09:42, Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote: On 08/18/15 23:54, Rick Macklem wrote: Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part. I think all drivers in tree don't subtract 1 for if_hw_tsomaxsegcount. Probably touching Mellanox driver would be simpler than fixing all other drivers in tree. Maybe it can be controlled by some kind of flag, if all the three TSO limits should include the TCP/IP/ethernet headers too. I'm pretty sure we want both versions. Hmm, I'm afraid it's already complex. Drivers have to tell almost the same information to both bus_dma(9) and network stack. Don't forget that not all drivers in the tree set the TSO limits before if_attach(), so possibly the subtraction of one TSO fragment needs to go into ip_output() Ok, I realized that some drivers may not know the answers before ether_ifattach(), due to the way they are configured/written (I saw the use of if_hw_tsomax_update() in the patch). 
I was not able to find an interface that configures TSO parameters after the if_t conversion. I'm under the impression that if_hw_tsomax_update() is not designed to be used this way. Probably we need a better one? (CCed to Gleb.)

If it is subtracted as a part of the assignment to if_hw_tsomaxsegcount in tcp_output() at line #791, like the following, I don't think it should matter if the values are set before ether_ifattach()?

	/*
	 * Subtract 1 for the tcp/ip header mbuf that
	 * will be prepended to the mbuf chain in this
	 * function in the code below this block.
	 */
	if_hw_tsomaxsegcount = tp->t_tsomaxsegcount - 1;

I don't have a good solution for the case where a driver doesn't plan on using the tcp/ip header provided by tcp_output(), except to say the driver can add one to the setting to compensate for that (and if they fail to do so, it still works, although somewhat suboptimally).

When I now read the comment in sys/net/if_var.h it is clear what it means, but for some reason I didn't read it that way before? (I think it was the part that said the driver didn't have to subtract for the headers that confused me?) In any case, we need to try and come up with a clear definition of what they need to be set to.

I can now think of two ways to deal with this:

1 - Leave tcp_output() as is, but provide a macro for the device driver authors to use that sets if_hw_tsomaxsegcount with a flag for "driver uses tcp/ip header mbuf", documenting that this flag should normally be true.

OR

2 - Change tcp_output() as above, noting that this is a workaround for confusion w.r.t. whether or not if_hw_tsomaxsegcount should include the tcp/ip header mbuf, and update the comment in if_var.h to reflect this. Then drivers that don't use the tcp/ip header mbuf can increase their value for if_hw_tsomaxsegcount by 1. (The comment should also mention that a value of 35 or greater is much preferred to 32 if the hardware will support that.)

Both work for me. My preference is option 2, simply because most drivers use the tcp/ip header mbuf.
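Option 2 from the message above can be sketched as two small helpers: one for what a driver would advertise, one for what tcp_output() would then allow for data mbufs. This is only an illustration under stated assumptions; both function names are hypothetical and neither exists in the FreeBSD tree.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical driver-side helper: the segment count a driver would
 * advertise under option 2. A driver that places the TCP/IP headers in
 * separate inline space (so the header mbuf does not consume a
 * scatter/gather entry) adds one to compensate for the stack's
 * subtraction.
 */
static uint32_t tso_advertised_segcount(uint32_t hw_max_segs,
    bool headers_inline)
{
	return (headers_inline ? hw_max_segs + 1 : hw_max_segs);
}

/*
 * Hypothetical stack-side helper: the number of data mbufs the stack
 * may chain, after reserving one segment for the TCP/IP header mbuf
 * that is prepended later. Guards against underflow when a driver
 * advertises zero.
 */
static uint32_t tso_data_segcount(uint32_t advertised)
{
	return (advertised > 0 ? advertised - 1 : 0);
}
```

For a controller limited to 32 segments that uses the header mbuf, the stack would chain at most 31 data mbufs; a controller with inline header space would advertise 33 and keep all 32 entries for data.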
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Tue, Aug 18, 2015 at 06:04:25PM -0400, Rick Macklem wrote:
> Hans Petter Selasky wrote:
> > On 08/18/15 14:53, Rick Macklem wrote:
> > > If this is just a test machine, maybe you could test with these
> > > lines (at about #880) in sys/netinet/tcp_output.c commented out?
> > > (It looks to me like this will disable TSO for almost all the NFS
> > > writes.)
> > > - around line #880 in sys/netinet/tcp_output.c:
> > >         /*
> > >          * In case there are too many small fragments
> > >          * don't use TSO:
> > >          */
> > >         if (len <= max_len) {
> > >                 len = max_len;
> > >                 sendalot = 1;
> > >                 tso = 0;
> > >         }
> > >
> > > This was added along with the other stuff that did the
> > > if_hw_tsomaxsegcount, etc and I never noticed it until now (not my
> > > patch).
> >
> > FYI:
> >
> > These lines are needed by other hardware, like the mlxen driver. If
> > you remove them mlxen will start doing m_defrag(). I believe if you
> > set the correct parameters in the "struct ifnet" for the TSO
> > size/count limits this problem will go away. If you print the "len"
> > and "max_len" and also the cases where TSO limits are reached, you'll
> > see what parameter is triggering it and needs to be increased.
> >
> Well, if the driver isn't setting if_hw_tsomaxsegcount correctly, then
> it is the driver that needs to be fixed.
> Having the above code block disable TSO for all of the NFS writes,
> including the ones that set if_hw_tsomaxsegcount correctly, doesn't
> make sense to me.
> If the driver authors don't set these, the drivers do lots of
> m_defrag() calls. I have posted more than once to freebsd-net@ asking
> the driver authors to set these and some now have. (I can't do it,
> because I don't have the hardware to test it with.)
>

Thanks for the reminder. I have generated a diff against HEAD.
https://people.freebsd.org/~yongari/tso.param.diff
The diff restores optimal TSO parameters which were lost in r271946 for
drivers that relied on sane default values. I'll commit it after some
testing.

> I do think that most/all of them don't subtract 1 for the tcp/ip
> header and I don't think they should be expected to, since the driver
> isn't supposed to worry about the protocol at that level.
>

I agree.

> -- I think tcp_output() should subtract one from the
>    if_hw_tsomaxsegcount provided by the driver to handle this, since
>    it chooses to count mbufs (the while() loop at around line #825 in
>    sys/netinet/tcp_output.c) before it prepends the tcp/ip header
>    mbuf.
>
> rick
>
> > --HPS
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 09:51:44AM +0200, Hans Petter Selasky wrote: On 08/19/15 09:42, Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote: On 08/18/15 23:54, Rick Macklem wrote: Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part. I think all drivers in tree don't subtract 1 for if_hw_tsomaxsegcount. Probably touching Mellanox driver would be simpler than fixing all other drivers in tree. Hi, If you change the behaviour don't forget to update and/or add comments describing it. Maybe the amount of subtraction could be defined by some macro? Then drivers which inline the headers can subtract it? I'm also ok with your suggestion. Your suggestion is fine by me. The initial TSO limits were tried to be preserved, and I believe that TSO limits never accounted for IP/TCP/ETHERNET/VLAN headers! I guess FreeBSD used to follow MS LSOv1 specification with minor exception in pseudo checksum computation. If I recall correctly the specification says upper stack can generate up to IP_MAXPACKET sized packet. 
Other L2 headers like ethernet/vlan header size is not included in the packet and it's drivers responsibility to allocate additional DMA buffers/segments for L2 headers. Maybe it can be controlled by some kind of flag, if all the three TSO limits should include the TCP/IP/ethernet headers too. I'm pretty sure we want both versions. Hmm, I'm afraid it's already complex. Drivers have to tell almost the same information to both bus_dma(9) and network stack. You're right it's complicated. Not sure if bus_dma can provide an API for this though. --HPS ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote: On 08/18/15 23:54, Rick Macklem wrote: Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part. I think all drivers in tree don't subtract 1 for if_hw_tsomaxsegcount. Probably touching Mellanox driver would be simpler than fixing all other drivers in tree. Maybe it can be controlled by some kind of flag, if all the three TSO limits should include the TCP/IP/ethernet headers too. I'm pretty sure we want both versions. Hmm, I'm afraid it's already complex. Drivers have to tell almost the same information to both bus_dma(9) and network stack. ___ freebsd-stable@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: ix(intel) vs mlxen(mellanox) 10Gb performance
On Wed, Aug 19, 2015 at 08:13:59AM -0400, Rick Macklem wrote: Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:51:44AM +0200, Hans Petter Selasky wrote: On 08/19/15 09:42, Yonghyeon PYUN wrote: On Wed, Aug 19, 2015 at 09:00:52AM +0200, Hans Petter Selasky wrote: On 08/18/15 23:54, Rick Macklem wrote: Ouch! Yes, I now see that the code that counts the # of mbufs is before the code that adds the tcp/ip header mbuf. In my opinion, this should be fixed by setting if_hw_tsomaxsegcount to whatever the driver provides - 1. It is not the driver's responsibility to know if a tcp/ip header mbuf will be added and is a lot less confusing that expecting the driver author to know to subtract one. (I had mistakenly thought that tcp_output() had added the tc/ip header mbuf before the loop that counts mbufs in the list. Btw, this tcp/ip header mbuf also has leading space for the MAC layer header.) Hi Rick, Your question is good. With the Mellanox hardware we have separate so-called inline data space for the TCP/IP headers, so if the TCP stack subtracts something, then we would need to add something to the limit, because then the scatter gather list is only used for the data part. I think all drivers in tree don't subtract 1 for if_hw_tsomaxsegcount. Probably touching Mellanox driver would be simpler than fixing all other drivers in tree. Hi, If you change the behaviour don't forget to update and/or add comments describing it. Maybe the amount of subtraction could be defined by some macro? Then drivers which inline the headers can subtract it? I'm also ok with your suggestion. Your suggestion is fine by me. The initial TSO limits were tried to be preserved, and I believe that TSO limits never accounted for IP/TCP/ETHERNET/VLAN headers! I guess FreeBSD used to follow MS LSOv1 specification with minor exception in pseudo checksum computation. If I recall correctly the specification says upper stack can generate up to IP_MAXPACKET sized packet. 
Other L2 headers, such as the ethernet/vlan header, are not included in the packet size, and it's the driver's responsibility to allocate additional DMA buffers/segments for L2 headers.

Yep. The default for if_hw_tsomax was reduced from IP_MAXPACKET to 32 * MCLBYTES - max_ethernet_header_size as a workaround/hack so that devices limited to 32 transmit segments would work (i.e. the entire packet, including MAC header, would fit in 32 MCLBYTES clusters). This implied that many drivers did end up using m_defrag() to copy the mbuf list to one made up of 32 MCLBYTES clusters.

If a driver sets if_hw_tsomaxsegcount correctly, then it can set if_hw_tsomax to the largest TSO packet (without MAC header) the hardware can handle. If it can handle IP_MAXPACKET, then it can set it to that.

I thought the upper limit was still IP_MAXPACKET. If a driver increases it (i.e. > IP_MAXPACKET), the length field in the IP header would overflow, which in turn may break firewalls and other packet handling in the IPv4/IPv6 code path. If the limit no longer applies to the network stack, that's great. Some controllers can handle up to 256KB TCP/UDP segmentation and supporting that feature wouldn't be hard.
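The arithmetic behind the reduced default can be worked through in a few lines. This is a sketch under assumptions: the constants are copied in as typical values (IP_MAXPACKET 65535, MCLBYTES 2048, a 14-byte Ethernet header plus a 4-byte VLAN tag as the "max ethernet header size") and the SKETCH_ names and helpers are made up for illustration, not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed typical values; check the headers of the target system. */
enum {
	SKETCH_IP_MAXPACKET	= 65535,	/* largest IPv4 total length */
	SKETCH_MCLBYTES		= 2048,		/* bytes per mbuf cluster */
	SKETCH_ETHER_HDR_LEN	= 14,		/* Ethernet header */
	SKETCH_VLAN_ENCAP_LEN	= 4,		/* 802.1Q tag */
	SKETCH_MAX_SEGS		= 32		/* transmit segments */
};

/*
 * Largest TSO payload such that payload plus L2 header still fits in
 * SKETCH_MAX_SEGS clusters of SKETCH_MCLBYTES bytes each.
 */
static uint32_t sketch_default_tsomax(void)
{
	return (SKETCH_MAX_SEGS * SKETCH_MCLBYTES -
	    (SKETCH_ETHER_HDR_LEN + SKETCH_VLAN_ENCAP_LEN));
}

/* True if len can be represented in the 16-bit IPv4 length field. */
static bool sketch_fits_ip_length(uint32_t len)
{
	return (len <= SKETCH_IP_MAXPACKET);
}
```

Under these assumptions the default works out to 32 * 2048 - 18 = 65518 bytes, slightly below IP_MAXPACKET, while a 256KB segment would not fit in the 16-bit IP length field, which is the overflow concern raised in the reply.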
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Wed, Aug 12, 2015 at 09:44:06AM -0400, Roosevelt Littleton wrote:
> Hi,
> So, I can confirm with the attached patch. I have a working msk0 that
> hasn't failed for the past month. I consider this problem fixed for
> me, since I have gone a long time without any problems. Thanks!

I'm not sure which patch you used. Given that users reported that
10.2-RELEASE works, it would be great if you could revert your local
patch and try again on 10.2-RELEASE.
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Sat, Jul 25, 2015 at 02:08:10PM +0300, Alnis Morics wrote:
> Just tried 10.2-RC1 amd64 GENERIC, and the problem seems to be gone. I
> was even able to scp a 500 MB file.
>
> Could it be related to this fix in BETA2, as mentioned in the
> announcement, "The watchdog(4) device has been fixed to print to the
> correct buffer."?
>

msk(4) will show watchdog timeouts when it detects that the driver TX
path is stuck, but I believe this has nothing to do with watchdog(4).
There was no msk(4) code change in 10.2-RC1. If you happen to see the
watchdog timeouts again, please try the attached patch and let me know
whether it makes any difference for you. I didn't get much feedback on
the patch, so I'm not sure whether it really fixes the root cause.

> pciconf -lv
>
> [..]
> mskc0@pci0:9:0:0: class=0x020000 card=0xc072144d chip=0x435411ab rev=0x00 hdr=0x00
>     vendor     = 'Marvell Technology Group Ltd.'
>     device     = '88E8040 PCI-E Fast Ethernet Controller'
>     class      = network
>     subclass   = ethernet

Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define MSK_STAT_ALIGN	4096
+#define MSK_RING_ALIGN	32768
+#define MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {
Re: 10.1-STABLE bce: Watchdog timeout occurred
On Wed, Apr 22, 2015 at 12:39:16AM -0400, Chris Ross wrote: On Apr 21, 2015, at 10:10 , Gareth Wyn Roberts g.w.robe...@glyndwr.ac.uk wrote: This may be caused by DMA alignment problems. See https://docs.freebsd.org/cgi/getmsg.cgi?fetch=145859+0+archive/2015/freebsd-stable/20150419.freebsd-stable for a recent thread about the msk driver. The msk maintainer Yonghyeon Pyun has opted for super safe options of 32K alignment! It's a long shot, but you could try increasing BCE_DMA_ALIGN and/or BCE_RX_BUF_ALIGN in the include file if_bcereg.h, say up to 4096, to see whether it makes any difference. Well, after making that change, I was able to confirm that the problem doesn't seem to occur. However, in trying to verify the problem on an unmodified kernel, I've rebooted a GENERIC from r281672 without that change, and am also not seeing the problem. :-/ I'm not sure whether the gremlins have fixed something, or if I was just too critical in my initial analysis. For now I'll take that change out of my tree and run without it. If I see the flapping again, I'll confirm that it's repeatable, then change the alignments as suggested and see if I see a change. I guess the alignment issue of msk(4) has nothing to do with bce(4) watchdog timeouts. It would be more helpful to know details of your controller(bce(4)/brgphy(4) related dmesg output, pciconf output etc) and network setup. If you know a reliable way that triggers the watchdog timeouts, please share that info too. I would have tried to disable all hardware offloading features(TSO, checksum, VLAN H/W tagging etc) and see whether that makes any differences in the first step to narrow down the issue. Thanks. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Wed, Apr 15, 2015 at 09:52:09PM +0000, Gareth Wyn Roberts wrote:
> I've inserted code to print some values which show the differences
> between specifying 4096 or 8192 for MSK_STAT_ALIGN. In both cases the
> status buffer has length 0x4000 (8x2048=16K) but the alignments are
> different as expected, respectively start addresses 0x5c3b000 or
> 0xbdc2c000.
>
> The following values were output from functions
> msk_status_dma_alloc(), msk_dmamap_cb() and msk_handle_events().
> The "Break #n" refer to breaks in msk_handle_events(). #1 occurs if
> ((control & HW_OWNER) == 0), #5 is OP_RXSTAT and #6 is OP_TXINDEXLE.
>
> The first output is for MSK_STAT_ALIGN=8192. It continues normally.
> Although not shown here, it reaches cons=2047 then cons=0 as expected.
>
> The second output is for MSK_STAT_ALIGN=4096. Although there can be
> isolated occurrences of Break #1 (e.g. cons=196) (are these to be
> expected?), it continues normally until cons=512. At this point it
> continually invokes the #1 block because the msk_control from
> msk_stat_ring[512] is always zero and the network hangs immediately.
> This suggests the Yukon Ultra 2 88E8057 can't access the next 4096
> memory block, but why not?

Yes, it seems the status LE block is not updated at all for
MSK_STAT_ALIGN == 4096, and some elements of the status block look
suspicious (the put index increases but the value in that location is
0). I vaguely guess this indicates DMA alignment and/or DMA boundary
issues.

The maximum number of elements of the status block is 4096, so the
maximum size of the status block is 32KB. For i386, msk(4) uses an 8KB
status block (1024 elements). For 64bit architectures, the block size
is increased to 16KB (2048 elements). Probably the safe alignment value
for the status block would be 32K. This looks like an excessive value
to me, but it avoids having to guess at the DMA boundary issue.

> Please let me know if any further information would be helpful.

Thanks a lot.
I've attached a diff which sets the alignment of the TX/RX rings and
the status block to 32KB. Not sure whether this also addresses other
msk(4) related watchdog timeouts.

Index: sys/dev/msk/if_mskreg.h
===================================================================
--- sys/dev/msk/if_mskreg.h	(revision 281587)
+++ sys/dev/msk/if_mskreg.h	(working copy)
@@ -2175,13 +2175,8 @@
 #define MSK_ADDR_LO(x)	((uint64_t) (x) & 0xffffffffUL)
 #define MSK_ADDR_HI(x)	((uint64_t) (x) >> 32)
 
-/*
- * At first I guessed 8 bytes, the size of a single descriptor, would be
- * required alignment constraints. But, it seems that Yukon II have 4096
- * bytes boundary alignment constraints.
- */
-#define MSK_RING_ALIGN	4096
-#define MSK_STAT_ALIGN	4096
+#define MSK_RING_ALIGN	32768
+#define MSK_STAT_ALIGN	32768
 
 /* Rx descriptor data structure */
 struct msk_rx_desc {
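The two start addresses reported in this message can be checked against the alignments under discussion with a few lines of arithmetic. The following is a rough sketch; the helper names are hypothetical, and the 8-byte element size is inferred from the "8x2048=16K" figure in the report rather than taken from the driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_STAT_LE_SIZE	8u	/* bytes per status list element */

/* True if addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
	return ((addr & (align - 1)) == 0);
}

/*
 * Largest power of two that divides addr, i.e. its lowest set bit.
 * Returns 0 for addr == 0.
 */
static uint64_t natural_alignment(uint64_t addr)
{
	return (addr & (~addr + 1));
}

/* Size in bytes of a status ring holding nelem list elements. */
static uint64_t stat_ring_bytes(uint32_t nelem)
{
	return ((uint64_t)nelem * SKETCH_STAT_LE_SIZE);
}
```

By this arithmetic 0x5c3b000 is aligned to 4096 but not to 8192, while 0xbdc2c000 happens to be aligned to 16384, the full size of the 2048-element ring. Index 512, where the hang was observed, sits stat_ring_bytes(512) = 4096 bytes into the buffer, which is at least consistent with the "next 4096 memory block" guess in the report.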
Re: msk msk0 watchdog timeout freeze hang lock stop problem
On Sun, Apr 12, 2015 at 05:57:34PM +, Gareth Wyn Roberts wrote: I've run in to problems using the msk device where initially it works well enough to set DHCP etc. but stops/freezes as soon as any appreciable network traffic occurs . There are several threads describing similar symptoms over the past two years or more. I've been following several false leads but have finally found a solution (at least it solves my problem). I'm running a standard FreeBSD 10.1-RELEASE and the NIC is detected as: mskc0: Marvell Yukon 88E8057 Gigabit Ethernet mem 0xfa00-0xfa003fff irq 19 at device 0.0 on pci6 msk0: Marvell Technology Group Ltd. Yukon Ultra 2 Id 0xba Rev 0x00 on mskc0 msk0: Ethernet address: 00:13:77:e9:df:eb miibus0: MII bus on msk0 e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0 e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-ma ster, auto, auto-flow The network worked when using the i386 release, but failed for the amd64 release (as reported previously) which prompted me to disable 64-bit DMA (the patch for this is attached below). This worked for the first kernel built but mysteriously failed when another unrelated part of the kernel was changed (a usb driver) and the kernel recompiled. So identical msk driver code worked in one kernel but not the second! This suggested that alignment differences between the two kernels were causing the msk driver to fail. Others have reported varying behaviour depending on different circumstances. It transpires that changing just one value in the if_mskreg.h file solved all my problems. Subsequently I have not been able to make it fail under heavy network traffic in either 32-bit or 64-bit mode. I'm working on 10.1-RELEASE source, i.e. if_msk.c revision 262524 and if_mskreg.h revision 264442. Thanks for letting me know your findings. I really appreciate that. 
I recall that the alignment requirement of status LEs (List Elements in Marvell terms) is 2048 and the maximum size of the status LEs is 4096 bytes. (The actual alignment seems to be a much lower value like 32 or 64 bytes, but alignment 2048 was chosen to avoid silicon bugs.) Later experiments showed some variants of Yukon II require 4096-byte alignment and I changed the alignment to 4096 in the past. Your finding seems to indicate that msk(4) needs 8192 alignment for status LEs. However this does not explain how and why the same code in 8.x/9.x works well. In addition, it's not common to require an alignment greater than PAGE_SIZE on x86 given that the maximum size of the DMA buffer is 4096 bytes.

I have to check whether there was a change in bus_dma(9) between 8.x/9.x and 10.x, but that will take a while due to lack of spare time. Probably you can verify that the DMA address of the status LEs meets the following requirements on both i386 and amd64.
 - Alignment is 4096.
 - Number of DMA segments is 1.
 - DMA segment base address plus DMA segment size does not cross a PAGE_SIZE boundary.

> Here's the patch to if_mskreg.h
>
> --- if_mskreg.h-orig	2014-11-11 20:02:58.0 +
> +++ if_mskreg.h	2015-04-12 18:47:20.0 +0100
> @@ -2179,9 +2179,11 @@
>   * At first I guessed 8 bytes, the size of a single descriptor, would be
>   * required alignment constraints. But, it seems that Yukon II have 4096
>   * bytes boundary alignment constraints.
> + * And it seems that the DMA status region for the Yukon Ultra 2 (88E8057)
> + * requires 8192 byte alignment to prevent locking.
>   */
>  #define MSK_RING_ALIGN	4096
> -#define MSK_STAT_ALIGN	4096
> +#define MSK_STAT_ALIGN	8192
>
> The patches to both files which also implement a MSK_64BIT_DMA_DISABLE flag are attached. Perhaps the developers would consider committing these as it may be useful for future debugging.

If you have more than 4GB of memory installed and disable 64bit DMA addressing, msk(4) will use bounce buffers. Passing packets through bounce buffers involves a copy operation and it costs a lot. You can check the hw.busdma sysctl node to see whether there are drivers that use bounce buffers. And if you want to disable 64bit DMA on 64bit architectures, add '#undef MSK_64BIT_DMA' just below the BUS_SPACE_MAXADDR check in if_mskreg.h.
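The three requirements listed in this message can be written down as a small predicate. The following is only a sketch in plain userland C: the helper names are hypothetical and are not part of msk(4) or bus_dma(9), and the 4096-byte page size is an assumption that holds on x86. It also shows why a 4096-aligned address is not necessarily 8192-aligned, which is the difference the reported fix depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when addr is a multiple of align (a power of two). */
static bool
dma_is_aligned(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr & (align - 1)) == 0);
}

/* Hypothetical helper: true when [base, base + size) stays inside one page. */
static bool
dma_within_one_page(uint64_t base, uint64_t size, uint64_t page)
{
	assert(size != 0 && page != 0 && (page & (page - 1)) == 0);
	return ((base / page) == ((base + size - 1) / page));
}

/* Combines the three checks from the message: alignment, one segment, no crossing. */
static bool
dma_status_ring_ok(uint64_t base, uint64_t size, int nsegs)
{
	const uint64_t page = 4096;	/* assumed PAGE_SIZE on x86 */

	return (nsegs == 1 && dma_is_aligned(base, 4096) &&
	    dma_within_one_page(base, size, page));
}
```

With these helpers, an address such as 0x11000 passes the 4096 check but fails `dma_is_aligned(0x11000, 8192)`, so two kernels that differ only in where the status ring lands could behave differently if the hardware really wants the larger alignment.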
Re: dhclient failure with Realtek 8111E Etnernet on new MSI motherboard
On Thu, Sep 26, 2013 at 08:09:36AM +, Thomas Mueller wrote:
> I rebuilt the kernel while keeping the existing kernel, installing to /boot/kernelre on the USB stick. Unfortunately all the modules were redundantly rebuilt. Maybe I should have had -D NO_MODULES instead of -DNO_MODULES?
>
> I typed unload at the loader prompt, then boot /boot/kernelre/kernel.
>
> I had the same problem as before with dhclient, looked like nothing different.

The patch was not intended to address your issue. It was for getting the correct MAC revision number. So seeing no behavioral change is normal.

The MAC revision number now indicates 0x0010 which means you have a slightly different variant. I'll let you know if I happen to find more clues on that MAC revision.

> Lines in /var/run/dmesg.boot relating to re0 were:
>
> re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet> port 0xe000-0xe0ff mem 0xf7d04000-0xf7d04fff,0xf7d0-0xf7d03fff irq 17 at device 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x0010
> miibus0: <MII bus> on re0
> rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
> re0: Ethernet address: d4:3d:7e:97:17:e2
>
> Tom
Re: dhclient failure with Realtek 8111E Etnernet on new MSI motherboard
On Sun, Sep 22, 2013 at 08:28:08PM +, Thomas Mueller wrote:
> I've been unable to establish Internet connection from a new computer with Realtek 8111E Ethernet despite this Ethernet chip working on another computer with another MSI motherboard.
>
> Problem motherboard is MSI Z77 MPOWER. Older, by two years, motherboard is MSI Z68MA-ED55(B3).
>
> uname -a shows
>
> FreeBSD amelia2 9.2-PRERELEASE FreeBSD 9.2-PRERELEASE #17 r254196: Sun Aug 11 00:36:49 UTC 2013 root@amelia2:/usr/obj/usr/src/sys/SANDY amd64
>
> I get the same problem with FreeBSD 9.1_STABLE i386. These are USB-stick installations.
>
> I was able to connect to the Internet with (MSI) Winki 3 (Linux-based), included on a DVD included in the motherboard package.
>
> After nothing but frustration trying to boot USB-stick installations of NetBSD 6.1-STABLE and HEAD (i386), I successfully booted NetBSD-HEAD amd64 from early last May, and dhclient re0 was successful, whereupon I downloaded, by cvs, the HEAD source tree and pkgsrc tree. This proves or strongly suggests the Ethernet chip is healthy.
>
> Anything I can do (at loader prompt or loader.conf?) to make this Ethernet work in FreeBSD?
>
> I could update NetBSD-HEAD from source, update the packages through pkgsrc, and build subversion, then checkout the FreeBSD-HEAD source tree, ports and doc trees too, and build FreeBSD-HEAD from source on hard drive using USB-stick installation of FreeBSD 9.2 (prerelease or release).
>
> Related part of /var/run/dmesg.boot is
>
> re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet> port 0xe000-0xe0ff mem 0xf7d04000-0xf7d04fff,0xf7d0-0xf7d03fff irq 17 at device 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x

It looks like 8168E-VL. Could you try the attached patch and show me the dmesg output (re(4) and rgephy(4) only)? The patch was generated to support 8106E but it will correctly show the MAC revision number.

> miibus0: <MII bus> on re0
> rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
> re0: Ethernet address: d4:3d:7e:97:17:e2
>
> Log of dhclient re0 was
>
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 3
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 3
> DHCPOFFER from 192.168.1.1

Driver got a response but it seems it was discarded.

> DHCPREQUEST on re0 to 255.255.255.255 port 67
> DHCPREQUEST on re0 to 255.255.255.255 port 67
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 6
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 13
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 14
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 17
> DHCPDISCOVER on re0 to 255.255.255.255 port 67 interval 11
> No DHCPOFFERS received.
> No working leases in persistent database - sleeping.
>
> Somewhat later I got
>
> Memory modified after free 0xfe0011546800(2048) val=977e3dd4 @ 0xfe0011546800
> Memory modified after free 0xfe001153b800(2048) val= @ 0xfe001153b800
> Memory modified after free 0xfe0011524800(2048) val=977e3dd4 @ 0xfe0011524800
> VESA: set_mode(): 24(18) - 24(18)
> Memory modified after free 0xfe0011594000(2048) val=977e3dd4 @ 0xfe0011594000

The size (2048) indicates an mbuf cluster, which in turn means bad things happened in re(4). I have no idea how this can happen though. If you assign a static IP address to re(4), does the driver work as expected?

> In one case, when I went to bed on this, hours later the system crashed and went into the debugger (db), where I was rather lost, couldn't kill dhclient, after some time typed reboot.
>
> Should I have posted this to a different list (hardware, questions?)?
>
> I would like to find if FreeBSD HEAD (10.0 alphas) would do better. Also, because of nearness of 10.0-RELEASE, I would rather go this track than 9.2 and then update; I already have 9.2 prerelease on other computer.
>
> Motherboard also has Atheros Wi-Fi (Atheros AR9271 802.11n), and I also have a USB stick-type WLAN adapter (Hiro Inc H50191, Realtek RTL8191SU 802.11n chip).
>
> Tom

Index: sys/dev/re/if_re.c
===================================================================
--- sys/dev/re/if_re.c	(revision 255757)
+++ sys/dev/re/if_re.c	(working copy)
@@ -223,6 +223,7 @@
 	{ RL_HWREV_8402, RL_8169, "8402", RL_MTU },
 	{ RL_HWREV_8105E, RL_8169, "8105E", RL_MTU },
 	{ RL_HWREV_8105E_SPIN1, RL_8169, "8105E", RL_MTU },
+	{ RL_HWREV_8106E, RL_8169, "8106E", RL_MTU },
 	{ RL_HWREV_8168B_SPIN2, RL_8169, "8168", RL_JUMBO_MTU },
 	{ RL_HWREV_8168B_SPIN3, RL_8169, "8168", RL_JUMBO_MTU },
 	{ RL_HWREV_8168C, RL_8169, "8168C/8111C", RL_JUMBO_MTU_6K },
@@ -1367,10 +1368,11 @@
 		break;
 	default:
 		device_printf(dev, "Chip rev. 0x%08x\n",
Re: dhclient failure with Realtek 8111E Etnernet on new MSI motherboard
On Thu, Sep 26, 2013 at 02:31:30AM +, Thomas Mueller wrote:
> > It looks like 8168E-VL. Could you try attached patch and show me the dmesg output(re(4) and rgephy(4) only)? The patch was generated to support 8106E but it will correctly show MAC revision number.
>
> I assume I go to /usr/src and run
>
> patch < /home/arlene/computer/re.8106.diff

Yes.

> Then rebuild the kernel with -DNO_MODULES and install under a different name, like kernelre?

Rebuilding kernel should be enough. See http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig-building.html for more information.

> I would install this on USB-stick installation, could do this for i386 USB-stick installation as well.
>
> > > Somewhat later I got
> > >
> > > Memory modified after free 0xfe0011546800(2048) val=977e3dd4 @ 0xfe0011546800
> > > Memory modified after free 0xfe001153b800(2048) val= @ 0xfe001153b800
> > > Memory modified after free 0xfe0011524800(2048) val=977e3dd4 @ 0xfe0011524800
> > > VESA: set_mode(): 24(18) - 24(18)
> > > Memory modified after free 0xfe0011594000(2048) val=977e3dd4 @ 0xfe0011594000
> >
> > The size(2048) indicates mbuf cluster which in turn means bad things happened in re(4). I have no idea how this can happen though. If you assign static IP addressi to re(4), does the driver works as expected?
>
> I can try assigning a static address to re4, not really sure how to set up manually, though I did it long ago in Slackware Linux.
>
> I wouldn't have known size 2048 indicated something bad, though the message's presence and system crash indicated that something was fouled up in memory.
>
> Tom
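One observation that may help when reading the "Memory modified after free" lines quoted above: the reported value 977e3dd4, stored little-endian as on amd64, is the byte sequence d4 3d 7e 97, which equals the first four octets of the re0 Ethernet address d4:3d:7e:97:17:e2 from the dmesg output. That would be consistent with the controller writing a received frame (whose destination is the card's own address) into a cluster that had already been freed, but this is only an inference from the posted logs, not something the thread confirms. The sketch below just performs the byte comparison; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* re0 Ethernet address as shown in the posted dmesg output. */
static const uint8_t re0_mac[6] = { 0xd4, 0x3d, 0x7e, 0x97, 0x17, 0xe2 };

/* Lay out a 32-bit value the way a little-endian CPU stores it in memory. */
static void
le32_to_bytes(uint32_t val, uint8_t out[4])
{
	out[0] = (uint8_t)(val & 0xff);
	out[1] = (uint8_t)((val >> 8) & 0xff);
	out[2] = (uint8_t)((val >> 16) & 0xff);
	out[3] = (uint8_t)((val >> 24) & 0xff);
}

/* True when the in-memory bytes of val equal the first four octets of mac. */
static bool
val_matches_mac_prefix(uint32_t val, const uint8_t mac[6])
{
	uint8_t bytes[4];

	le32_to_bytes(val, bytes);
	return (memcmp(bytes, mac, sizeof(bytes)) == 0);
}
```

Calling `val_matches_mac_prefix(0x977e3dd4u, re0_mac)` returns true for the value in the log.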
Re: hme0 interface going up/down (dhclient ?)
On Wed, Jul 10, 2013 at 02:40:09AM +, dcx dcy wrote:
> Hello,
>
> the patch corrected this issue. Thank you very much for your help and time, it is appreciated!
>
> Best Regards,
> Dominic.

Thanks for testing. Committed in r253134.
Re: hme0 interface going up/down (dhclient ?)
On Tue, Jul 09, 2013 at 02:05:30PM +, dcx dcy wrote:
> Hi all,
>
> I am having an issue where my hme0 interface is always turning up and down with dhclient requesting a lease. I am thinking this could be the same issue described by Jeremy Chadwick on June 9th:
> http://lists.freebsd.org/pipermail/freebsd-stable/2013-June/073711.html
>
> Everything was fine on 8.2-STABLE and older versions. It was then upgraded directly to 9.0-STABLE and this is where I started having issues. I am currently running 9.1-STABLE (July 7th) and issue persists.
>
> This is a Sun Netra T1 acting as my home gateway, hme0 is connected to a Cisco switch. I am normally using DHCP to get an ip from my ISP. For now, the only way it works is to set a static ip (I tried different dhclient options ... sync, etc). hme1 and ath0 (AR5413) is serving internal network.

Try attached patch and let me know whether it makes any difference for you.

Index: sys/dev/hme/if_hme.c
===================================================================
--- sys/dev/hme/if_hme.c	(revision 253125)
+++ sys/dev/hme/if_hme.c	(working copy)
@@ -742,6 +742,10 @@ hme_init_locked(struct hme_softc *sc)
 	u_int32_t n, v;
 
 	HME_LOCK_ASSERT(sc, MA_OWNED);
+
+	if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0)
+		return;
+
 	/*
 	 * Initialization sequence. The numbered steps below correspond
 	 * to the sequence outlined in section 6.3.5.1 in the Ethernet
@@ -1324,6 +1328,7 @@ hme_eint(struct hme_softc *sc, u_int status)
 	/* check for fatal errors that needs reset to unfreeze DMA engine */
 	if ((status & HME_SEB_STAT_FATAL_ERRORS) != 0) {
 		HME_WHINE(sc->sc_dev, "error signaled, status=%#x\n", status);
+		sc->sc_ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 		hme_init_locked(sc);
 	}
 }
@@ -1370,6 +1375,7 @@ hme_watchdog(struct hme_softc *sc)
 		device_printf(sc->sc_dev, "device timeout (no link)\n");
 	++ifp->if_oerrors;
+	ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 	hme_init_locked(sc);
 	hme_start_locked(ifp);
 	return (EJUSTRETURN);
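The patch above follows one pattern throughout: the init routine returns early when the interface is already marked running, and the few callers that genuinely need a hardware reset (fatal error, watchdog) clear the running flag first. The following userland sketch models that pattern only; the struct, the flag value, and the function names are hypothetical stand-ins and are not the hme(4) code.

```c
#define FAKE_DRV_RUNNING	0x40u	/* hypothetical stand-in for IFF_DRV_RUNNING */

/* Minimal model of a driver softc: a flag word and a hardware reset counter. */
struct fake_softc {
	unsigned int	drv_flags;
	int		hw_resets;
};

/* Init is a no-op while the interface is already running. */
static void
fake_init_locked(struct fake_softc *sc)
{
	if ((sc->drv_flags & FAKE_DRV_RUNNING) != 0)
		return;
	sc->hw_resets++;			/* stands for the chip reset */
	sc->drv_flags |= FAKE_DRV_RUNNING;
}

/* A recovery path forces a real reinit by clearing the flag first. */
static void
fake_watchdog(struct fake_softc *sc)
{
	sc->drv_flags &= ~FAKE_DRV_RUNNING;
	fake_init_locked(sc);
}

/* Three init requests plus one watchdog recovery; returns the reset count. */
static int
demo_reset_count(void)
{
	struct fake_softc sc = { 0, 0 };

	fake_init_locked(&sc);	/* first bring-up: touches hardware */
	fake_init_locked(&sc);	/* redundant request: ignored */
	fake_init_locked(&sc);	/* redundant request: ignored */
	fake_watchdog(&sc);	/* forced recovery: touches hardware */
	return (sc.hw_resets);
}
```

In this model a redundant init request, such as one triggered each time dhclient configures an address, no longer resets the chip, so it cannot cause a link down/up cycle by itself.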
Re: Sanity Check on Mac Mini
On Sun, Jul 07, 2013 at 05:56:09PM -0700, Doug Hardie wrote:
> As I previously indicated, I have tested a couple more Minis and updated the instructions with what I learned. Here is the revised version:

[...]

> 2.12.3 Rebuilding the kernel to support the Ethernet Interface
>
> Once the system has been rebooted, you will notice that ifconfig may not show the ethernet interface. There are at least two different chips being used for that interface. Some of the units work right out of the box. Others do not. I have two units and the only visible difference is the Part No. Part No. MC815LL/A appears to be the older unit and the bge interface worked on install. Part No. MD387LL/A is newer and has the newer chips that require the driver update.
>
> If the bge interface does not show, then the bge driver needs to be updated to recognize the NIC. Mount the second memstick with the files retrieved earlier and move them into the kernel source. I used the following commands:
>
>   cp -p brgphy.c /usr/src/sys/dev/mii
>   cp -p if_bgereg.h /usr/src/sys/dev/bge
>   cp -p if_bge.c /usr/src/sys/dev/bge
>
> then rebuild the kernel. Note the instructions here are for GENERIC, but you can use KERNCONF to specify a custom kernel.
>
>   cd /usr/src
>   make buildkernel
>   make installkernel
>
> Reboot the server as before. Now ifconfig will show bge0 and it will work. The mini is now running a useable version of 9.1-Release.
>
> There are still some items remaining to be resolved: Updating the kernel with the recent security patches, Disabling Bluetooth and Wireless to save power, and unattended rebooting. These issues are still being addressed.

I'm not sure whether this bge(4) controller is sitting behind a TB (Apple Thunderbolt) bridge. The Apple TB bridge has a known performance issue and some BCM controllers have a work-around to mitigate it. The work-around is not enabled by default, so I'm interested in bge(4) performance numbers on your box. If you can't get more than 920 ~ 930Mbps (950Mbps or higher with jumbo frames), please let me know. I didn't enable the work-around yet since it will hurt other BCM controllers when the TB bridge is absent.
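For context on the throughput figures mentioned above, the theoretical TCP payload rate over gigabit Ethernet can be estimated from per-frame overhead. The sketch below is a back-of-the-envelope calculation, not a measurement: it assumes IPv4, TCP with the 12-byte timestamp option, no VLAN tag, and 38 bytes of on-wire framing per packet (preamble/SFD, Ethernet header, FCS, inter-frame gap). The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
	WIRE_OVERHEAD = 8 + 14 + 4 + 12,	/* preamble/SFD, header, FCS, inter-frame gap */
	L3L4_OVERHEAD = 20 + 20 + 12		/* IPv4, TCP, TCP timestamp option */
};

/* Best-case TCP payload rate in kbit/s for a given MTU and link speed. */
static uint64_t
tcp_goodput_kbps(uint32_t mtu, uint64_t link_kbps)
{
	assert(mtu > L3L4_OVERHEAD);
	return ((uint64_t)(mtu - L3L4_OVERHEAD) * link_kbps /
	    (mtu + WIRE_OVERHEAD));
}
```

Under these assumptions `tcp_goodput_kbps(1500, 1000000)` gives 941482 (about 941 Mbps) and `tcp_goodput_kbps(9000, 1000000)` gives 990042 (about 990 Mbps), so the 920 ~ 930 Mbps and 950 Mbps thresholds in the message sit a little below the respective ceilings.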
Re: fxp0 interface going up/down/up/down (dhclient related?)
On Sun, Jun 09, 2013 at 12:21:37PM +0200, Alban Hertroys wrote:
> I'm having an issue where my fxp0 interface keeps looping between DOWN/UP, with dhclient requesting a lease each time in between. I think it's caused by dhclient:
>
> solfertje # dhclient -d fxp0
> DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> send_packet: Network is down
> DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> DHCPACK from 109.72.40.1
> bound to 141.105.10.89 -- renewal in 7200 seconds.
> fxp0 link state up -> down
> fxp0 link state down -> up
> DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> DHCPACK from 109.72.40.1
> bound to 141.105.10.89 -- renewal in 7200 seconds.
> fxp0 link state up -> down
> fxp0 link state down -> up
> DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> DHCPACK from 109.72.40.1
> bound to 141.105.10.89 -- renewal in 7200 seconds.
> fxp0 link state up -> down
> fxp0 link state down -> up
> DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> DHCPACK from 109.72.40.1
> bound to 141.105.10.89 -- renewal in 7200 seconds.
> fxp0 link state up -> down
> fxp0 link state down -> up
> DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> DHCPACK from 109.72.40.1
> bound to 141.105.10.89 -- renewal in 7200 seconds.
> fxp0 link state up -> down
> ^C
>
> In the above test I turned off devd (/etc/rc.d/devd stop) and background dhclient (/etc/rc.d/dhclient stop fxp0), and I still got the above result.
>
> There's practically no time spent between up/down cycles, this just keeps going on and on.
>
> fxp0 is the only interface that runs on DHCP. The others have static IP's.

Try attached patch and let me know whether it also works for you.

Index: sys/dev/fxp/if_fxp.c
===================================================================
--- sys/dev/fxp/if_fxp.c	(revision 251021)
+++ sys/dev/fxp/if_fxp.c	(working copy)
@@ -1075,7 +1075,8 @@ fxp_suspend(device_t dev)
 		pmstat |= PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE;
 		sc->flags |= FXP_FLAG_WOL;
 		/* Reconfigure hardware to accept magic frames. */
-		fxp_init_body(sc, 1);
+		ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
+		fxp_init_body(sc, 0);
 	}
 	pci_write_config(sc->dev, pmc + PCIR_POWER_STATUS, pmstat, 2);
 }
@@ -2141,8 +2142,10 @@ fxp_tick(void *xsc)
 	 */
 	if (sc->rx_idle_secs > FXP_MAX_RX_IDLE) {
 		sc->rx_idle_secs = 0;
-		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0)
+		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) {
+			ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 			fxp_init_body(sc, 1);
+		}
 		return;
 	}
 	/*
@@ -2240,6 +2243,7 @@ fxp_watchdog(struct fxp_softc *sc)
 	device_printf(sc->dev, "device timeout\n");
 	sc->ifp->if_oerrors++;
 
+	sc->ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 	fxp_init_body(sc, 1);
 }
 
@@ -2274,6 +2278,10 @@ fxp_init_body(struct fxp_softc *sc, int setmedia)
 	int i, prm;
 
 	FXP_LOCK_ASSERT(sc, MA_OWNED);
+
+	if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0)
+		return;
+
 	/*
 	 * Cancel any pending I/O
 	 */
@@ -2813,6 +2821,7 @@ fxp_miibus_statchg(device_t dev)
 	 */
 	if (sc->revision == FXP_REV_82557)
 		return;
+	ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 	fxp_init_body(sc, 0);
 }
 
@@ -2836,9 +2845,10 @@ fxp_ioctl(struct ifnet *ifp, u_long command, caddr
 		if (ifp->if_flags & IFF_UP) {
 			if (((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) &&
 			    ((ifp->if_flags ^ sc->if_flags) &
-			    (IFF_PROMISC | IFF_ALLMULTI | IFF_LINK0)) != 0)
+			    (IFF_PROMISC | IFF_ALLMULTI | IFF_LINK0)) != 0) {
+				ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 				fxp_init_body(sc, 0);
-			else if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
+			} else if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
 				fxp_init_body(sc, 1);
 		} else {
 			if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0)
@@ -2851,8 +2861,10 @@ fxp_ioctl(struct ifnet *ifp, u_long command, caddr
 	case SIOCADDMULTI:
 	case SIOCDELMULTI:
 		FXP_LOCK(sc);
-		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0)
+		if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) {
+			ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 			fxp_init_body(sc, 0);
+		}
 		FXP_UNLOCK(sc);
 		break;
 
@@ -2942,8 +2954,10 @@ fxp_ioctl(struct ifnet *ifp, u_long command, caddr
 			    ~(IFCAP_VLAN_HWTSO | IFCAP_VLAN_HWCSUM);
 			reinit++;
 		}
-		if (reinit > 0 && ifp->if_flags & IFF_UP)
+		if (reinit > 0 && (ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) {
+			ifp->if_drv_flags &= ~IFF_DRV_RUNNING;
 			fxp_init_body(sc, 0);
+		}
 		FXP_UNLOCK(sc);
 		VLAN_CAPABILITIES(ifp);
 		break;
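One detail of the ioctl hunk in the patch above is the test that decides whether a flag change warrants reinitializing the controller: the new and old flag words are XORed and the result is masked with the flags the hardware configuration depends on. A minimal sketch of that test follows. The `FAKE_` constants are stand-ins chosen for illustration and the function name is hypothetical.

```c
#include <stdbool.h>

/* Stand-in flag bits for illustration; not taken from the FreeBSD headers. */
#define FAKE_IFF_UP		0x0001u
#define FAKE_IFF_PROMISC	0x0100u
#define FAKE_IFF_ALLMULTI	0x0200u
#define FAKE_IFF_LINK0		0x1000u
#define FAKE_IFF_UNWATCHED	0x8000u	/* some bit the hardware setup ignores */

#define FAKE_WATCHED	(FAKE_IFF_PROMISC | FAKE_IFF_ALLMULTI | FAKE_IFF_LINK0)

/* True when any watched flag differs between the new and the saved flags. */
static bool
flags_need_reinit(unsigned int new_flags, unsigned int old_flags)
{
	return (((new_flags ^ old_flags) & FAKE_WATCHED) != 0);
}
```

The XOR yields a 1 bit exactly where the two words differ, so toggling promiscuous mode triggers a reinit while a change confined to unwatched bits does not.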
Re: fxp0 interface going up/down/up/down (dhclient related?)
On Sun, Jun 09, 2013 at 03:39:45PM +0200, Alban Hertroys wrote:
> On Jun 9, 2013, at 13:45, YongHyeon PYUN <pyu...@gmail.com> wrote:
>
> > On Sun, Jun 09, 2013 at 12:21:37PM +0200, Alban Hertroys wrote:
> > > I'm having an issue where my fxp0 interface keeps looping between DOWN/UP, with dhclient requesting a lease each time in between. I think it's caused by dhclient:
> > >
> > > solfertje # dhclient -d fxp0
> > > DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> > > send_packet: Network is down
> > > DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> > > DHCPACK from 109.72.40.1
> > > bound to 141.105.10.89 -- renewal in 7200 seconds.
> > > fxp0 link state up -> down
> > > fxp0 link state down -> up
> > > DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> > > DHCPACK from 109.72.40.1
> > > bound to 141.105.10.89 -- renewal in 7200 seconds.
> > > fxp0 link state up -> down
> > > fxp0 link state down -> up
> > > DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> > > DHCPACK from 109.72.40.1
> > > bound to 141.105.10.89 -- renewal in 7200 seconds.
> > > fxp0 link state up -> down
> > > fxp0 link state down -> up
> > > DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> > > DHCPACK from 109.72.40.1
> > > bound to 141.105.10.89 -- renewal in 7200 seconds.
> > > fxp0 link state up -> down
> > > fxp0 link state down -> up
> > > DHCPREQUEST on fxp0 to 255.255.255.255 port 67
> > > DHCPACK from 109.72.40.1
> > > bound to 141.105.10.89 -- renewal in 7200 seconds.
> > > fxp0 link state up -> down
> > > ^C
> > >
> > > In above test I turned off devd (/etc/rc.d/devd stop) and background dhclient (/etc/rc.d/dhclient stop fxp0), and I still go the above result.
> > >
> > > There's practically no time spent between up/down cycles, this just keeps going on and on.
> > >
> > > fxp0 is the only interface that runs on DHCP. The others have static IP's.
> >
> > Try attached patch and let me know whether it also works for you.
> > fxp.init.diff
>
> I'm now running with this patch and the symptoms seem to have gone away. Thanks!
>
> Is there anything I should be aware of with this patch or anything you'd like to know about how it runs?

No, I already tested the patch and will commit today.
Re: fxp0 interface going up/down/up/down (dhclient related?)
On Sun, Jun 09, 2013 at 09:13:33PM +0400, Lev Serebryakov wrote:
> Hello, Jeremy.
> You wrote on June 9, 2013, 14:44:01:
>
> JC> The issue is described in the 8.4-RELEASE Errata Notes; the driver is
> JC> using the same driver version as in stable/9, hence you're experiencing
> JC> the same problem. See Open Issues:
>
> I had some memory, that I had had this problem on my router some time (year? two years? three?) ago, and it was fixed somehow at then-HEAD (9?) system with disabling link down event on fxp(4), caused by chip reset after address setting. Is it deja-vu or true memory?

There was a bug at the time but it was fixed a long time ago. The current issue is a different one but the end result looks very similar to the old bug.
Re: SunFire X2200 ilo's bge1 DOWN/UP
On Mon, Jun 03, 2013 at 09:25:33AM +0300, Daniel Braniss wrote:
On Fri, May 31, 2013 at 08:24:47AM +0300, Daniel Braniss wrote:
On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote:

Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c	(revision 251021)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar
 
 	BGE_LOCK(sc);
 
+	if ((ifp->if_flags & IFF_UP) == 0) {
+		BGE_UNLOCK(sc);
+		return;
+	}
 	if (sc->bge_flags & BGE_FLAG_TBI) {
 		ifmr->ifm_status = IFM_AVALID;
 		ifmr->ifm_active = IFM_ETHER;

after 18hs, the logs are empty! it seems the patch fixes the problem. now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts ...

It could be any number of daemons that query interface state such as an SNMP server, ladvd, etc. If you wanted help you could modify the patch so that it does something like this:

#include <sys/proc.h>

	if (/* test for IFF_UP */) {
		BGE_UNLOCK(sc);
		if_printf(ifp, "state queried on down interface by pid %d (%s)",   <--| add a \n
		    curthread->td_proc->p_pid, curthread->td_proc->p_comm);
		return;
	}

-- John Baldwin

snmpd calls this several times a second (difficult to measure since syslog just says "last message repeated 22 times"). in any case, the DOWN/UP appears once every few hours, oh well. I have now stopped the snmpd daemon, maybe there is someone else ...

I have no idea why snmpd wants to know media status for interfaces that are put into down state. The media status resolved after bringing up the interface may be different one that was seen before. The patch also makes dhclient think driver got a valid link regardless of link establishment. I guess that wouldn't be issue though. I'll commit the patch after some more testing. Thanks for reporting and testing!

no problem! after more than 3 days, there were no more 'reports', so snmpd was the culprit. the snmpd we use is from ports, i'll try and see what's going on ...

FYI: Committed in r251481.
Re: SunFire X2200 ilo's bge1 DOWN/UP
On Fri, May 31, 2013 at 08:24:47AM +0300, Daniel Braniss wrote: On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote: --/04w6evG8XlLl3ft Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename=bge.media_sts.diff Index: sys/dev/bge/if_bge.c === --- sys/dev/bge/if_bge.c(revision 251021) +++ sys/dev/bge/if_bge.c(working copy) @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar BGE_LOCK(sc); + if ((ifp-if_flags IFF_UP) == 0) { + BGE_UNLOCK(sc); + return; + } if (sc-bge_flags BGE_FLAG_TBI) { ifmr-ifm_status = IFM_AVALID; ifmr-ifm_active = IFM_ETHER; --/04w6evG8XlLl3ft-- after 18hs, the logs are empty! it seems the patch fixes the problem. now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts ... It could be any number of daemons that query interface state such as an SNMP server, ladvd, etc. If you wanted help you could modify the patch so that it does something like this: #include sys/proc.h if (/* test for IFF_UP */) { BGE_UNLOCK(sc); if_printf(ifp, state queried on down interface by pid %d (%s), --| add a \n curthread-td_proc-p_pid, curthread-td_proc-p_comm); return; } -- John Baldwin snmpd call this several times a second, (difficult to measeure since sysolog just says last message repeated 22 times in any case, the DOWN/UP appears once every few hours, oh well. I have now stopped the snmpd daemon, maybe there is someone else ... I have no idea why snmpd wants to know media status for interfaces that are put into down state. The media status resolved after bringing up the interface may be different one that was seen before. The patch also makes dhclient think driver got a valid link regardless of link establishment. I guess that wouldn't be issue though. I'll commit the patch after some more testing. Thanks for reporting and testing! 
thanks, danny
Re: Apparent fxp regression in FreeBSD 8.4-RC3
On Wed, May 29, 2013 at 08:47:14AM +0900, Hiroki Sato wrote:
> YongHyeon PYUN <pyu...@gmail.com> wrote in <20130528023300.ga3...@michelle.cdnetworks.com>:
>
> py> > I'll have access to the other box on Wednesday and will try the other test.
> py>
> py> Here is patch I'm testing and it seems to work with dhclient on CURRENT.
> py> Mike, could you try attached patch?
>
> On my box it worked without problem. Link status change of fxp0 was down->up only in the patched driver.

Thanks for testing!

> -- Hiroki
Re: Apparent fxp regression in FreeBSD 8.4-RC3
On Tue, May 28, 2013 at 03:34:00PM -0400, Michael L. Squires wrote:
> Short answer: it didn't work.

[...]

> Patch did not solve the problem on the home NAT box. I'll try it on the second 1U box at work tomorrow.

Hmm, I can't reproduce it on my box. I double checked every possible controller initialization sequence in the driver but couldn't find a clue. I'll let you know if I manage to narrow down the issue.

> I applied the patch (see below) and recompiled/reinstalled world.

Rebuilding kernel should be enough.

> root@familysquires:/usr/src/sys/dev/fxp # uname -a
> FreeBSD familysquires.net 8.4-RELEASE FreeBSD 8.4-RELEASE #54: Sun May 26 22:56:19 EDT 2013 r...@familysquires.net:/usr/obj/usr/src/sys/NEWGATE i386
>
> drwxr-xr-x 236 root  3584 May 28 10:28 ../
> -rw-r--r--   1 root 95366 May 28 10:28 if_fxp.c
> -rw-r--r--   1 root 94968 Mar 28 09:04 if_fxp.c.orig
> -rw-r--r--   1 root 15638 Mar 28 09:04 if_fxpreg.h
> -rw-r--r--   1 root  8717 Mar 28 09:04 if_fxpvar.h
> -rw-r--r--   1 root 23009 Mar 28 09:04 rcvbundl.h
>
> One immediate difference in behavior is that without the modified rc.conf the box was unable to use ntp to the outside world; it eventually sync'd on my internal ntp server. With the modified rc.conf the box immediately sync'd to an ntp server in the outside world.

There is a side-effect of the rc.conf workaround. Parallel detection may or may not work and generally can result in duplex mismatch.
Re: SunFire X2200 ilo's bge1 DOWN/UP
On Tue, May 28, 2013 at 09:55:24AM +0300, Daniel Braniss wrote:
On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote:
On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote:
On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:

hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,

Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.

bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus2: <MII bus> on bge0
brgphy0: <BCM5714 1000BASE-T media interface> PHY 1 on miibus2
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:1b:24:5d:5b:bd
bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus3: <MII bus> on bge1
brgphy1: <BCM5714 1000BASE-T media interface> PHY 1 on miibus3
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:1b:24:5d:5b:be

sf-10 ifconfig bge1
bge1: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
	ether 00:1b:24:5d:5b:be
	nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
	media: Ethernet autoselect (100baseTX <full-duplex>)
	status: active

Because bge1 is not UP, I wonder how you get link UP/DOWN events. Do you have some network script run by cron?

no scripts. this port is shared with the ILO/IPMI, and back in March you fixed a problem that it was hanging soon after it was initialized by the driver, (r248226 - but I'm not sure if it was ever MFC'ed).

It was MFCed.

Initially I thought it could be caused by connections to it from other hosts (either via the web, or ssh) so I killed them, but it didn't help. without that patch the connection fails, and I don't see any DOWN/UP.

Could you check how many interrupts you get from bge1? Ideally you shouldn't get any interrupts for bge1.

it's not even mentioned :-)

sf-04 vmstat -i
interrupt                          total       rate
irq3: uart1                          964          0
irq4: uart0                            6          0
irq14: ata0                       227354          0
irq17: bge0                      1021981          2
irq21: ohci0                          28          0
irq22: ehci0                           2          0
irq23: atapci1                    293228          0
cpu0:timer                     383244076       1124
cpu1:timer                       2225144          6
cpu2:timer                       2056087          6
cpu3:timer                       2093943          6
Total                          391162813       1147

Then the only way a link UP/DOWN event could be generated for a DOWN interface would be invocation of a media status query (i.e. ifconfig -a) triggered by an external application. Most drivers I touched check the IFF_UP flag before poking the media status register. However I'm not sure this is what you're seeing, because you do not use any network script run by cron. Anyway, try the attached patch and let me know whether it makes any difference.

is toggling bge1 DOWN/UP every few hours, this port is being used by the ILO. To check, I upgraded another identical host, and the same problem appears.

What is the last known working revision?

I have no idea, but I have older versions, and I'll start from the oldest (9.1-prerelease), but it will take time, since it takes hours till it happens.

ok.

Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c	(revision 251021)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar
 
 	BGE_LOCK(sc);
 
+	if ((ifp->if_flags & IFF_UP) == 0) {
+		BGE_UNLOCK(sc);
+		return;
+	}
 	if (sc->bge_flags & BGE_FLAG_TBI) {
 		ifmr->ifm_status = IFM_AVALID;
 		ifmr->ifm_active = IFM_ETHER;
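The idea behind the bge(4) patch in this message, returning from the media status handler before touching the PHY when the interface is administratively down, can be modelled in a few lines. This is a userland sketch under stated assumptions: the structs, the flag value, and the function names are hypothetical, and the counter merely stands in for the register access that apparently disturbed the link shared with the management controller.

```c
#define FAKE_IFF_UP	0x1u	/* hypothetical stand-in for IFF_UP */

/* Minimal model of an interface: flags plus a count of PHY register polls. */
struct fake_if {
	unsigned int	flags;
	int		phy_polls;
};

/* Minimal model of the media status reply filled in for the caller. */
struct fake_ifmediareq {
	int	status_valid;
	int	link_active;
};

/* Media status handler: leaves the PHY alone while the interface is down. */
static void
fake_media_sts(struct fake_if *ifp, struct fake_ifmediareq *ifmr)
{
	if ((ifp->flags & FAKE_IFF_UP) == 0)
		return;			/* down: report nothing, poll nothing */
	ifp->phy_polls++;		/* stands for reading PHY status registers */
	ifmr->status_valid = 1;
	ifmr->link_active = 1;
}

/* Issue a number of status queries and return how often the PHY was polled. */
static int
demo_phy_polls(unsigned int if_flags, int queries)
{
	struct fake_if ifp = { if_flags, 0 };
	struct fake_ifmediareq ifmr = { 0, 0 };
	int i;

	for (i = 0; i < queries; i++)
		fake_media_sts(&ifp, &ifmr);
	return (ifp.phy_polls);
}
```

In this model a monitoring daemon that queries a down interface many times a second causes no PHY access at all, while queries on an interface that is up behave as before.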
Re: SunFire X2200 ilo's bge1 DOWN/UP
On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote: On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. bge0: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003 mem 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6 bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus2: MII bus on bge0 brgphy0: BCM5714 1000BASE-T media interface PHY 1 on miibus2 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: Ethernet address: 00:1b:24:5d:5b:bd bge1: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003 mem 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6 bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus3: MII bus on bge1 brgphy1: BCM5714 1000BASE-T media interface PHY 1 on miibus3 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge1: Ethernet address: 00:1b:24:5d:5b:be sf-10 ifconfig bge1 bge1: flags=8802BROADCAST,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8009bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTA TE ether 00:1b:24:5d:5b:be nd6 options=21PERFORMNUD,AUTO_LINKLOCAL media: Ethernet autoselect (100baseTX full-duplex) status: active Because bge1 is not UP, I wonder how you get link UP/DOWN events. Do you have some network script run by cron? no scripts. this port is shared with the ILO/IPMI, and back in March you fixed a problem that it was hanging soon after it was initialized by the driver, (r248226 - but I'm not sure if it was ever MFC'ed). It was MFCed. 
Initialy I thought it could be caused by connections to it from other hosts (either via the web, or ssh) so I killed them, but it didn't help. without that patch the connection fails, and I don't see any DOWN/UP. Could you check how many number of interrupts you get from bge1? Ideally you shouldn't get any interrupts for bge1. is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. To check, I upgraded another identical host, and the same problem appears. What is the last known working revision? I have no idea, but I have older versions, and ill start from the oldets (9.1-prerelease), but it will take time, since it takes hours till it happens. ok. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: SunFire X2200 ilo's bge1 DOWN/UP
On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:

hi,
after upgrading to 9.1-stable, this particular hardware - SunFire X2200,

Show me dmesg (bge(4) and brgphy(4) only) and 'ifconfig bge1' output.

is toggling bge1 DOWN/UP every few hours, this port is being used by the ILO. To check, I upgraded another identical host, and the same problem appears.

What is the last known working revision?

There is no correlation with time, since they happened at totally different times. I rebooted both hosts at almost the same time.

one host:
uptime: 5:24PM up 6:15, 0 users, load averages: 0.00, 0.00, 0.00
May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP

and
uptime: 5:24PM up 6:14, 0 users, load averages: 0.00, 0.00, 0.00
May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP

this is not serious, the ilo (ssh) connection is ok, but it's annoying, we have more than 10 of these hosts, and if I upgrade all of them, the logs will fill up with this :-)

any ideas?

cheers,
danny
Re: SunFire X2200 ilo's bge1 DOWN/UP
On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. bge0: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003 mem 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6 bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus2: MII bus on bge0 brgphy0: BCM5714 1000BASE-T media interface PHY 1 on miibus2 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: Ethernet address: 00:1b:24:5d:5b:bd bge1: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003 mem 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6 bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus3: MII bus on bge1 brgphy1: BCM5714 1000BASE-T media interface PHY 1 on miibus3 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge1: Ethernet address: 00:1b:24:5d:5b:be sf-10 ifconfig bge1 bge1: flags=8802BROADCAST,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8009bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTA TE ether 00:1b:24:5d:5b:be nd6 options=21PERFORMNUD,AUTO_LINKLOCAL media: Ethernet autoselect (100baseTX full-duplex) status: active Because bge1 is not UP, I wonder how you get link UP/DOWN events. Do you have some network script run by cron? is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. To check, I upgraded another identical host, and the same problem appears. What is the last known working revision? I have no idea, but I have older versions, and ill start from the oldets (9.1-prerelease), but it will take time, since it takes hours till it happens. ok. 
Re: Apparent fxp regression in FreeBSD 8.4-RC3
On Fri, May 24, 2013 at 04:36:46PM +0900, Hiroki Sato wrote:

Hiroki Sato h...@freebsd.org wrote in 20130524.162926.395058052118975996@allbsd.org:

hr> YongHyeon PYUN pyu...@gmail.com wrote
hr>   in 20130524054720.ga1...@michelle.cdnetworks.com:
hr>
hr> A workaround is specifying the following line in rc.conf:
hr>
hr> ifconfig_fxp0="DHCP media 100baseTX mediaopt full-duplex"

Hmm, I guess this can happen on other NICs when the link negotiation causes a link-state flap. Is it true?

Probably not. AFAIK fxp(4) is the only controller that requires two full resets to support flow control. Multicast programming for fxp(4) also requires a full controller reset, so trying to renew its existing lease for fxp(4) looks wrong to me.

-- Hiroki
Re: Apparent fxp regression in FreeBSD 8.4-RC3
On Fri, May 24, 2013 at 03:32:29AM -0400, Charles Sprickman wrote: On May 24, 2013, at 1:47 AM, YongHyeon PYUN wrote: On Thu, May 23, 2013 at 09:49:19PM -0700, Jeremy Chadwick wrote: On Thu, May 23, 2013 at 09:40:35PM -0700, Jeremy Chadwick wrote: On Thu, May 23, 2013 at 11:42:44PM -0400, Glen Barber wrote: On Thu, May 23, 2013 at 08:38:06PM -0700, Jeremy Chadwick wrote: If someone wants me to test DHCP via fxp(4) on the above system (I can do so with both NICs), just let me know; it should only take me half an hour or so. I'll politely wait for someone to say please do so else won't bother. For the sake of completeness... Please do so. :) Issue reproduced 100% reliably, even within sysinstall. {snip} Forgot to add: This issue ONLY happens when using DHCP. Statically assigning the IP address works fine; fxp0 goes down once, up once, then stays up indefinitely. I asked Mike to try backing out dhclient(8) change(r247336) but it seems he missed that. Jeremy, could you try that? I have a system up and running and showing the problem (that was non-trival, just for the record - one machine blew the PSU after POST, the other refused to boot off an IDE drive, and then required two CD-ROM drives before I found a functional one, and it took a good half-hour to find what's apparently the last piece of writable CD-R media I own). I am not awesome with svn, but I'll see if I can manually undo r247336 and give it a spin. Download http://svnweb.freebsd.org/base/stable/8/sbin/dhclient/dhclient.c?r1=231278r2=247336view=patch And apply the patch with -R. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Apparent fxp regression in FreeBSD 8.4-RC3
On Sun, May 26, 2013 at 08:38:41PM +0900, YongHyeon PYUN wrote: On Fri, May 24, 2013 at 04:36:46PM +0900, Hiroki Sato wrote: Hiroki Sato h...@freebsd.org wrote in 20130524.162926.395058052118975996@allbsd.org: hr YongHyeon PYUN pyu...@gmail.com wrote hr in 20130524054720.ga1...@michelle.cdnetworks.com: hr hr A workaround is specifying the following line in rc.conf: hr hr ifconfig_fxp0=DHCP media 100baseTX mediaopt full-duplex Hmm, I guess this can happen on other NICs when the link negotiation causes a link-state flap. Is it true? Probably not. AFAIK fxp(4) is the only controller that requires two full resets to support flow control. Multicast programming for fxp(4) also requires full controller reset so trying to renew its existing lease for fxp(4) looks wrong to me. After reading code again, I think the dhclient change may affect all controllers that don't have protection against multiple initialization of upper stack. if_init() of driver is called whenever an IP address is assigned to an interface. The stack could be changed to call if_init() only when IFF_DRV_RUNNING flag is not set but that would break old drivers which may require full controller reset for multicast filter reprogramming. I also guess there may be several drivers that do not implement reinitialization protection in arm/mips. It seems fxp(4)'s simple protection against unnecessary controller initialization does not work well due to the limitation of controller. We may be able to improve fxp(4) case but other old/buggy drivers should be fixed too. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
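The "protection against multiple initialization" mentioned above can be sketched roughly as follows. This is a minimal userspace illustration, not fxp(4) code: `struct sketch_softc`, `SKETCH_DRV_RUNNING` (a stand-in for IFF_DRV_RUNNING with an arbitrary value) and the `sketch_*` functions are all hypothetical names introduced here.

```c
/* Stand-in for IFF_DRV_RUNNING; the value is arbitrary in this sketch. */
#define SKETCH_DRV_RUNNING 0x40

/* Hypothetical driver state: a flag word and a counter of full resets. */
struct sketch_softc {
	unsigned int drv_flags;
	unsigned int hw_resets;
};

/* Pretend to fully reset the controller; a real reset drops the link. */
static void
sketch_hw_reset(struct sketch_softc *sc)
{
	sc->hw_resets++;
}

/*
 * Init entry point, called for example each time an address is assigned.
 * If the driver is already running, return without resetting, so a
 * repeated call does not cause another link flap.
 */
static void
sketch_init(struct sketch_softc *sc)
{
	if ((sc->drv_flags & SKETCH_DRV_RUNNING) != 0)
		return;
	sketch_hw_reset(sc);
	sc->drv_flags |= SKETCH_DRV_RUNNING;
}

/* Stop path: clear the running flag so the next init resets again. */
static void
sketch_stop(struct sketch_softc *sc)
{
	sc->drv_flags &= ~SKETCH_DRV_RUNNING;
}
```

As the message notes, a guard like this is only safe when the hardware does not need a full reset for other operations such as multicast filter reprogramming, which is exactly where it falls short for fxp(4).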
Re: Apparent fxp regression in FreeBSD 8.4-RC3
On Thu, May 23, 2013 at 09:49:19PM -0700, Jeremy Chadwick wrote: On Thu, May 23, 2013 at 09:40:35PM -0700, Jeremy Chadwick wrote: On Thu, May 23, 2013 at 11:42:44PM -0400, Glen Barber wrote: On Thu, May 23, 2013 at 08:38:06PM -0700, Jeremy Chadwick wrote: If someone wants me to test DHCP via fxp(4) on the above system (I can do so with both NICs), just let me know; it should only take me half an hour or so. I'll politely wait for someone to say please do so else won't bother. For the sake of completeness... Please do so. :) Issue reproduced 100% reliably, even within sysinstall. {snip} Forgot to add: This issue ONLY happens when using DHCP. Statically assigning the IP address works fine; fxp0 goes down once, up once, then stays up indefinitely. I asked Mike to try backing out dhclient(8) change(r247336) but it seems he missed that. Jeremy, could you try that? I guess dhclient(8) does not like flow-control negotiation of fxp(4) after link establishment. I also tested network I/O in the statically-assigned scenario. Pinging the box from another machine on the LAN: $ ping 192.168.1.192 PING 192.168.1.192 (192.168.1.192): 56 data bytes 64 bytes from 192.168.1.192: icmp_seq=0 ttl=64 time=0.180 ms 64 bytes from 192.168.1.192: icmp_seq=1 ttl=64 time=0.138 ms 64 bytes from 192.168.1.192: icmp_seq=2 ttl=64 time=0.214 ms 64 bytes from 192.168.1.192: icmp_seq=3 ttl=64 time=0.165 ms 64 bytes from 192.168.1.192: icmp_seq=4 ttl=64 time=0.114 ms ^C --- 192.168.1.192 ping statistics --- 5 packets transmitted, 5 packets received, 0.0% packet loss round-trip min/avg/max/stddev = 0.114/0.162/0.214/0.034 ms ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: Apparent fxp regression in FreeBSD 8.4-RC3
On Sat, May 11, 2013 at 10:57:44PM -0400, Michael L. Squires wrote:

I upgraded to FreeBSD 8.4-RC3 and noticed a problem with the fxp driver on an older Supermicro single CPU single core Xeon motherboard. I know that 8.3-Release does not have this issue, but don't know when in the updates to that release the regression was introduced.

I use the fxp driver to connect to a Motorola Surfboard cable modem, and immediately saw the following occur many times:

May 10 23:00:04 familysquires kernel: fxp0: link state changed to DOWN
May 10 23:00:04 familysquires dhclient: New Subnet Mask (fxp0): 255.255.240.0
May 10 23:00:04 familysquires dhclient: New Broadcast Address (fxp0): 255.255.255.255
May 10 23:00:04 familysquires dhclient: New Routers (fxp0): xx.xxx.xxx.1
May 10 23:00:06 familysquires kernel: fxp0: link state changed to UP
May 10 23:00:22 familysquires dhclient: New IP Address (fxp0): xx.xxx.xxx.163
May 10 23:00:22 familysquires kernel: fxp0: link state changed to DOWN
May 10 23:00:22 familysquires dhclient: New Subnet Mask (fxp0): 255.255.240.0
May 10 23:00:22 familysquires dhclient: New Broadcast Address (fxp0): 255.255.255.255
May 10 23:00:22 familysquires dhclient: New Routers (fxp0): xx.xxx.xxx.1
May 10 23:00:24 familysquires kernel: fxp0: link state changed to UP

repeated without end.

If you assign a static IP address, does fxp(4) work?

I reinstalled 8.3-Release p8

FreeBSD familysquires.net 8.3-RELEASE-p8 FreeBSD 8.3-RELEASE-p8 #46: Sat May 11 00:05:26 EDT 2013

which ended the string of fxp up/down messages. This kernel has now operated for 24 hours without generating this error.

There were several fxp(4) changes made since FreeBSD 8.3-RELEASE but I don't see any fxp(4) commits that may result in the DHCP issue above. I recall there was a dhclient(8) change that makes dhclient track link state. Could you rebuild dhclient(8) and try again without that change (i.e. locally back out r247336)?

I've attached a verbose dmesg from 8.4-RC3 and a standard dmesg from 8.3-Release p8, and can provide whatever else you need. This is not a critical issue for me. The system has an unused bge interface (replaced by an Intel em0 interface during a previous bout of a problem with the bge driver).

Mike Squires
mi...@siralan.org
Re: A few problems
On Sat, Mar 16, 2013 at 01:08:06PM +0400, Michael BlackHeart wrote: Hello there. I've got a couple of things I don't get or can't handle. [...] re0@pci0:4:0:0: class=0x02 card=0x512c1462 chip=0x816810ec rev=0x02 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller' class = network subclass = ethernet bar [10] = type I/O Port, range 32, base 0xd800, size 256, enabled bar [18] = type Memory, range 64, base 0xfeaff000, size 4096, enabled bar [20] = type Prefetchable Memory, range 64, base 0xf8ff, size 65536, enabled cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit cap 10[70] = PCI-Express 1 endpoint IRQ 1 max data 128(256) link x1(x1) speed 2.5(2.5) cap 11[b0] = MSI-X supports 2 messages in map 0x20 enabled cap 03[d0] = VPD ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected ecap 0002[140] = VC 1 max VC0 ecap 0003[160] = Serial 1 0100684ce000 re0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 description: ToISP options=8218bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWCSUM,TSO4,WOL_MAGIC,LINKSTATE ether 00:21:85:1c:24:fa media: Ethernet autoselect (100baseTX full-duplex) status: active [...] One is that re0 doesn't neogatiate direct link with a connected PC (using non-crossover UTP), but sk0 does that easy. It seems to me that according to RTL8111 chip specification there shouldn't be any problem, probably it's a driver problem? What is your link parter for re0? I don't remember whether the PHY hardware really supports automatic MDI crossover detection. Even if the PHY hardware does not support it, the link partner would be able to do that. And could you show me the output of dmesg(re(4) and rgephy(4) only) and devinfo -rv | grep rgephy? ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: A few problems
On Mon, Mar 18, 2013 at 05:19:11PM +0400, Michael BlackHeart wrote: 2013/3/18 YongHyeon PYUN pyu...@gmail.com: On Sat, Mar 16, 2013 at 01:08:06PM +0400, Michael BlackHeart wrote: Hello there. I've got a couple of things I don't get or can't handle. [...] re0@pci0:4:0:0: class=0x02 card=0x512c1462 chip=0x816810ec rev=0x02 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller' class = network subclass = ethernet bar [10] = type I/O Port, range 32, base 0xd800, size 256, enabled bar [18] = type Memory, range 64, base 0xfeaff000, size 4096, enabled bar [20] = type Prefetchable Memory, range 64, base 0xf8ff, size 65536, enabled cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit cap 10[70] = PCI-Express 1 endpoint IRQ 1 max data 128(256) link x1(x1) speed 2.5(2.5) cap 11[b0] = MSI-X supports 2 messages in map 0x20 enabled cap 03[d0] = VPD ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected ecap 0002[140] = VC 1 max VC0 ecap 0003[160] = Serial 1 0100684ce000 re0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 description: ToISP options=8218bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWCSUM,TSO4,WOL_MAGIC,LINKSTATE ether 00:21:85:1c:24:fa media: Ethernet autoselect (100baseTX full-duplex) status: active [...] One is that re0 doesn't neogatiate direct link with a connected PC (using non-crossover UTP), but sk0 does that easy. It seems to me that according to RTL8111 chip specification there shouldn't be any problem, probably it's a driver problem? What is your link parter for re0? I don't remember whether the PHY hardware really supports automatic MDI crossover detection. Even if the PHY hardware does not support it, the link partner would be able to do that. And could you show me the output of dmesg(re(4) and rgephy(4) only) and devinfo -rv | grep rgephy? 
Here's info:

re0: RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet port 0xd800-0xd8ff mem 0xfeaff000-0xfeaf,0xf8ff-0xf8ff irq 17 at device 0.0 on pci4
re0: Using 1 MSI-X message
re0: Chip rev. 0x3c00
re0: MAC rev. 0x0040
miibus0: MII bus on re0
rgephy0: RTL8169S/8110S/8211 1000BASE-T media interface PHY 1 on miibus0
rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
re0: Ethernet address: 00:21:85:1c:24:fa

devinfo -rv | grep rgephy
rgephy0 pnpinfo oui=0xe04c model=0x11 rev=0x2 at phyno=1

This link connected to Realtek 8111E under Win7. I'll repeat that when it's connected to sk0, everything works.

e1000phy(4) supports automatic crossover detection/correction. I thought newer RealTek 8211 PHYs also support the feature but it seems it's not enabled by default. Could you try attached patch and let me know how it goes?

Of course when I'm switching links, I change IPs and other configuration in rc.conf and reboots system. For example I'll provide info for sk0 (Dlink DGE-530T):

skc0: D-Link DGE-530T Gigabit Ethernet port 0xe800-0xe8ff mem 0xfebec000-0xfebe irq 17 at device 1.0 on pci5
skc0: DGE-530T Gigabit Ethernet Adapter rev. (0x9)
sk0: Marvell Semiconductor, Inc. Yukon on skc0
sk0: Ethernet address: 00:19:5b:86:3b:53
miibus1: MII bus on sk0
e1000phy0: Marvell 88E1011 Gigabit PHY PHY 0 on miibus1
e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto

e1000phy0 pnpinfo oui=0xac2 model=0x2 rev=0x5 at phyno=0
Re: A few problems
On Tue, Mar 19, 2013 at 01:56:29PM +0900, YongHyeon PYUN wrote: On Mon, Mar 18, 2013 at 05:19:11PM +0400, Michael BlackHeart wrote: 2013/3/18 YongHyeon PYUN pyu...@gmail.com: On Sat, Mar 16, 2013 at 01:08:06PM +0400, Michael BlackHeart wrote: Hello there. I've got a couple of things I don't get or can't handle. [...] re0@pci0:4:0:0: class=0x02 card=0x512c1462 chip=0x816810ec rev=0x02 hdr=0x00 vendor = 'Realtek Semiconductor Co., Ltd.' device = 'RTL8111/8168B PCI Express Gigabit Ethernet controller' class = network subclass = ethernet bar [10] = type I/O Port, range 32, base 0xd800, size 256, enabled bar [18] = type Memory, range 64, base 0xfeaff000, size 4096, enabled bar [20] = type Prefetchable Memory, range 64, base 0xf8ff, size 65536, enabled cap 01[40] = powerspec 3 supports D0 D1 D2 D3 current D0 cap 05[50] = MSI supports 1 message, 64 bit cap 10[70] = PCI-Express 1 endpoint IRQ 1 max data 128(256) link x1(x1) speed 2.5(2.5) cap 11[b0] = MSI-X supports 2 messages in map 0x20 enabled cap 03[d0] = VPD ecap 0001[100] = AER 1 0 fatal 0 non-fatal 2 corrected ecap 0002[140] = VC 1 max VC0 ecap 0003[160] = Serial 1 0100684ce000 re0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 description: ToISP options=8218bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWCSUM,TSO4,WOL_MAGIC,LINKSTATE ether 00:21:85:1c:24:fa media: Ethernet autoselect (100baseTX full-duplex) status: active [...] One is that re0 doesn't neogatiate direct link with a connected PC (using non-crossover UTP), but sk0 does that easy. It seems to me that according to RTL8111 chip specification there shouldn't be any problem, probably it's a driver problem? What is your link parter for re0? I don't remember whether the PHY hardware really supports automatic MDI crossover detection. Even if the PHY hardware does not support it, the link partner would be able to do that. And could you show me the output of dmesg(re(4) and rgephy(4) only) and devinfo -rv | grep rgephy? 
Here's info: re0: RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet port 0xd800-0xd8ff mem 0xfeaff000-0xfeaf,0xf8ff-0xf8ff irq 17 at device 0.0 on pci4 re0: Using 1 MSI-X message re0: Chip rev. 0x3c00 re0: MAC rev. 0x0040 miibus0: MII bus on re0 rgephy0: RTL8169S/8110S/8211 1000BASE-T media interface PHY 1 on miibus0 rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow re0: Ethernet address: 00:21:85:1c:24:fa devinfo -rv | grep rgephy rgephy0 pnpinfo oui=0xe04c model=0x11 rev=0x2 at phyno=1 This link connected to Realtek 8111E under Win7. I'll repeat that when it's connected to sk0, everything works. Of e1000phy(4) supports automatic crossover detection/correction. I thought newer RealTek 8211 PHYs also support the feature but it seems it's not enabled by default. Could you try attached patch and let me know how it goes? Attached patch. 
Index: sys/dev/mii/rgephy.c
===================================================================
--- sys/dev/mii/rgephy.c	(revision 248449)
+++ sys/dev/mii/rgephy.c	(working copy)
@@ -488,7 +488,7 @@ rgephy_load_dspcode(struct mii_softc *sc)
 static void
 rgephy_reset(struct mii_softc *sc)
 {
-	uint16_t ssr;
+	uint16_t pcr, ssr;
 
 	if ((sc->mii_flags & MIIF_PHYPRIV0) == 0 && sc->mii_mpd_rev == 3) {
 		/* RTL8211C(L) */
@@ -499,6 +499,15 @@ rgephy_reset(struct mii_softc *sc)
 		}
 	}
 
+	if (sc->mii_mpd_rev >= 2) {
+		pcr = PHY_READ(sc, RGEPHY_MII_PCR);
+		if ((pcr & RGEPHY_PCR_MDIX_AUTO) == 0) {
+			pcr &= ~RGEPHY_PCR_MDI_MASK;
+			pcr |= RGEPHY_PCR_MDIX_AUTO;
+			PHY_WRITE(sc, RGEPHY_MII_PCR, pcr);
+		}
+	}
+
 	mii_phy_reset(sc);
 	DELAY(1000);
 	rgephy_load_dspcode(sc);
Index: sys/dev/mii/rgephyreg.h
===================================================================
--- sys/dev/mii/rgephyreg.h	(revision 248449)
+++ sys/dev/mii/rgephyreg.h	(working copy)
@@ -137,6 +137,16 @@
 #define RGEPHY_EXTSTS_T_FD_CAP 0x2000 /* 1000base-T FD capable */
 #define RGEPHY_EXTSTS_T_HD_CAP 0x1000 /* 1000base-T HD capable */
 
+#define RGEPHY_MII_PCR 0x10 /* PHY Specific control register */
+#define RGEPHY_PCR_ASSERT_CRS 0x0800
+#define RGEPHY_PCR_FORCE_LINK 0x0400
+#define RGEPHY_PCR_MDI_MASK 0x0060
+#define RGEPHY_PCR_MDIX_AUTO 0x0040
+#define RGEPHY_PCR_MDIX_MANUAL 0x0020
+#define RGEPHY_PCR_MDI_MANUAL 0x
+#define RGEPHY_PCR_CLK125_DIS 0x0010
+#define RGEPHY_PCR_JABBER_DIS 0x0001
+
 /* RTL8211B(L)/RTL8211C(L) */
 #define RGEPHY_MII_SSR 0x11 /* PHY Specific status register */
 #define RGEPHY_SSR_S1000
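The register update in the patch above is a read-modify-write on the MDI mode field. A minimal standalone sketch of just that bit manipulation, assuming the mask values shown in the patch (0x0060 for the MDI field, 0x0040 for automatic crossover); the `SK_*` names and `sketch_pcr_enable_auto_mdix()` are hypothetical and no real PHY access is performed:

```c
#include <stdint.h>

/* Bit values copied from the patch; the SK_ names are local to this sketch. */
#define SK_PCR_MDI_MASK  0x0060 /* MDI/MDI-X mode field */
#define SK_PCR_MDIX_AUTO 0x0040 /* automatic crossover */

/*
 * Given the current register value, return the value that should be
 * written back. If the auto bit is already set the value is returned
 * unchanged, matching the patch, which skips the write in that case.
 */
static uint16_t
sketch_pcr_enable_auto_mdix(uint16_t pcr)
{
	if ((pcr & SK_PCR_MDIX_AUTO) == 0) {
		pcr &= (uint16_t)~SK_PCR_MDI_MASK; /* clear the mode field */
		pcr |= SK_PCR_MDIX_AUTO;           /* select auto crossover */
	}
	return (pcr);
}
```

Bits outside the mode field are preserved: a value of 0x0820 (manual MDI-X plus the 0x0800 bit) becomes 0x0840.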
Re: Strange reboot since 9.1
On Thu, Mar 07, 2013 at 08:38:27AM -0800, Jeremy Chadwick wrote: On Thu, Mar 07, 2013 at 04:38:54PM +0100, Lo?c Blot wrote: Hi Marcelo, thanks. Here is a better trace: - kgdb /boot/kernel/kernel.symbols /var/crash/vmcore.11 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type show copying to see the conditions. There is absolutely no warranty for GDB. Type show warranty for details. This GDB was configured as amd64-marcel-freebsd... Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x0 fault code = supervisor read data, page not present instruction pointer = 0x20:0x80a84414 stack pointer = 0x28:0xff822fc267a0 frame pointer = 0x28:0xff822fc26830 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= interrupt enabled, resume, IOPL = 0 current process = 12 (irq265: bce0) trap number = 12 panic: page fault cpuid = 0 KDB: stack backtrace: #0 0x809208a6 at kdb_backtrace+0x66 #1 0x808ea8be at panic+0x1ce #2 0x80bd8240 at trap_fatal+0x290 #3 0x80bd857d at trap_pfault+0x1ed #4 0x80bd8b9e at trap+0x3ce #5 0x80bc315f at calltrap+0x8 #6 0x80a861d5 at udp_input+0x475 #7 0x80a043dc at ip_input+0xac #8 0x809adafb at netisr_dispatch_src+0x20b #9 0x809a35cd at ether_demux+0x14d #10 0x809a38a4 at ether_nh_input+0x1f4 #11 0x809adafb at netisr_dispatch_src+0x20b #12 0x80438fd7 at bce_intr+0x487 #13 0x808be8d4 at intr_event_execute_handlers+0x104 #14 0x808c0076 at ithread_loop+0xa6 #15 0x808bb9ef at fork_exit+0x11f #16 0x80bc368e at fork_trampoline+0xe Uptime: 27m20s Dumping 1265 out of 8162 MB:..2%..11%..21%..31%..41%..51%..61%..71%..81%..92% #0 doadump (textdump=Variable textdump is not available. ) at pcpu.h:224 224 pcpu.h: No such file or directory. 
in pcpu.h (kgdb) bt f #0 doadump (textdump=Variable textdump is not available. ) at pcpu.h:224 No locals. #1 0x808ea3a1 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:448 _ep = Variable _ep is not available. (kgdb) bt #0 doadump (textdump=Variable textdump is not available. ) at pcpu.h:224 #1 0x808ea3a1 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:448 #2 0x808ea897 in panic (fmt=0x1 Address 0x1 out of bounds) at /usr/src/sys/kern/kern_shutdown.c:636 #3 0x80bd8240 in trap_fatal (frame=0xc, eva=Variable eva is not available. ) at /usr/src/sys/amd64/amd64/trap.c:857 #4 0x80bd857d in trap_pfault (frame=0xff822fc266f0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:773 #5 0x80bd8b9e in trap (frame=0xff822fc266f0) at /usr/src/sys/amd64/amd64/trap.c:456 #6 0x80bc315f in calltrap () at /usr/src/sys/amd64/amd64/exception.S:228 #7 0x80a84414 in udp_append (inp=0xfe019e2a1000, ip=0xfe00444b6c80, n=0xfe00444b6c00, off=20, udp_in=0xff822fc268a0) at /usr/src/sys/netinet/udp_usrreq.c:252 #8 0x80a861d5 in udp_input (m=0xfe00444b6c00, off=Variable off is not available. ) at /usr/src/sys/netinet/udp_usrreq.c:618 #9 0x80a043dc in ip_input (m=0xfe00444b6c00) at /usr/src/sys/netinet/ip_input.c:760 #10 0x809adafb in netisr_dispatch_src (proto=1, source=Variable source is not available. ) at /usr/src/sys/net/netisr.c:1013 #11 0x809a35cd in ether_demux (ifp=0xfe00053fa000, m=0xfe00444b6c00) at /usr/src/sys/net/if_ethersubr.c:940 #12 0x809a38a4 in ether_nh_input (m=Variable m is not available. ) at /usr/src/sys/net/if_ethersubr.c:759 #13 0x809adafb in netisr_dispatch_src (proto=9, source=Variable source is not available. ) at /usr/src/sys/net/netisr.c:1013 #14 0x80438fd7 in bce_intr (xsc=Variable xsc is not available. ) at /usr/src/sys/dev/bce/if_bce.c:6903 #15 0x808be8d4 in intr_event_execute_handlers (p=Variable p is not available. 
) at /usr/src/sys/kern/kern_intr.c:1262 #16 0x808c0076 in ithread_loop (arg=0xfe00057424e0) at /usr/src/sys/kern/kern_intr.c:1275 #17 0x808bb9ef in fork_exit (callout=0x808bffd0 ithread_loop, arg=0xfe00057424e0, frame=0xff822fc26c40) at /usr/src/sys/kern/kern_fork.c:992 #18 0x80bc368e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:602 #19 0x in ?? ()
Re: Unusual TCP/IP Packet Size
On Wed, Feb 13, 2013 at 02:01:32PM -0800, Jeremy Chadwick wrote: On Wed, Feb 13, 2013 at 01:57:38PM -0800, Doug Hardie wrote: On 13 February 2013, at 02:29, Eugene Grosbein egrosb...@rdtc.ru wrote: 13.02.2013 17:25, Doug Hardie ??: Monitoring a tcpdump between two systems, a FreeBSD 9.1 system has the following interface: msk0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=c011bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,VLAN_HWTSO,LINKSTATE ether 00:11:2f:2a:c7:03 inet 10.0.1.199 netmask 0xff00 broadcast 10.0.1.255 inet6 fe80::211:2fff:fe2a:c703%msk0 prefixlen 64 scopeid 0x1 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (100baseTX full-duplex,flowcontrol,rxpause,txpause) status: active It sent the following packet: (data content abbreviated) 02:14:42.081617 IP 10.0.1.199.443 10.0.1.2.61258: Flags [P.], seq 930:4876, ack 846, win 1040, options [nop,nop,TS val 401838072 ecr 920110183], length 3946 0x: 4500 0f9e ea89 4000 4006 2a08 0a00 01c7 E.@.@.*. 0x0010: 0a00 0102 01bb ef4a ece1 680b ae37 1bbc ...J..h..7.. 0x0020: 8018 0410 3407 0101 080a 17f3 8ff8 4...??. The indicated packet length is 3946 and the load of data shown is that size. The MTU on both interfaces is 1500. The receiving system received 3 packets. There is a router and switch between them. One of them fragmented that packet. This is part of a SSL/TLS exchange and one side or the other is hanging on this and just dropping the connection. I suspect the packet size is the issue. ssldump complains about the packet too and stops monitoring. Could this possibly be related to the hardware checksums? You have TSO enabled on the interface, so large outgoing TCP packet is pretty normal. It will be split by the NIC. Disable TSO with ifconfig if it interferes with your ssldump. Thanks. Now all the packets are 1500 or under. They all are received with a SSL header. 
If disabling TSO on msk(4) fixed the issue of the remote end dropping/ignoring the packet, that sounds like a bug in msk(4). Yong-Hyeon, do you have any recent msk(4) patches relating to TSO?

No, I'm not aware of msk(4) related TSO issues. For some controllers, msk(4) used to touch reserved registers, which in turn produced unexpected results. This was fixed a long time ago, but it would be a good idea to cold-boot the box and see whether that makes any difference.
Re: Unusual TCP/IP Packet Size
On Wed, Feb 13, 2013 at 05:00:59AM -0800, Jeremy Chadwick wrote: On Wed, Feb 13, 2013 at 05:29:53PM +0700, Eugene Grosbein wrote: 13.02.2013 17:25, Doug Hardie ??: Monitoring a tcpdump between two systems, a FreeBSD 9.1 system has the following interface: msk0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=c011bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,VLAN_HWTSO,LINKSTATE ether 00:11:2f:2a:c7:03 inet 10.0.1.199 netmask 0xff00 broadcast 10.0.1.255 inet6 fe80::211:2fff:fe2a:c703%msk0 prefixlen 64 scopeid 0x1 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (100baseTX full-duplex,flowcontrol,rxpause,txpause) status: active It sent the following packet: (data content abbreviated) 02:14:42.081617 IP 10.0.1.199.443 10.0.1.2.61258: Flags [P.], seq 930:4876, ack 846, win 1040, options [nop,nop,TS val 401838072 ecr 920110183], length 3946 0x: 4500 0f9e ea89 4000 4006 2a08 0a00 01c7 E.@.@.*. 0x0010: 0a00 0102 01bb ef4a ece1 680b ae37 1bbc ...J..h..7.. 0x0020: 8018 0410 3407 0101 080a 17f3 8ff8 4...??. The indicated packet length is 3946 and the load of data shown is that size. The MTU on both interfaces is 1500. The receiving system received 3 packets. There is a router and switch between them. One of them fragmented that packet. This is part of a SSL/TLS exchange and one side or the other is hanging on this and just dropping the connection. I suspect the packet size is the issue. ssldump complains about the packet too and stops monitoring. Could this possibly be related to the hardware checksums? You have TSO enabled on the interface, so large outgoing TCP packet is pretty normal. It will be split by the NIC. Disable TSO with ifconfig if it interferes with your ssldump. This is not the behaviour I see with em(4) on a 82573E with all defaults used (which includes TSO4). Note that Doug is using msk(4). 
I can provide packet captures on both ends of a LAN segment using both tcpdump (on the FreeBSD side) and Wireshark (on the Windows side) that show a difference in behaviour compared to what Doug sees. This is strange. tcpdump sees a (big) TCP segment right before controller actually transmits it. So if TSO is active for the TCP segment, you should see a series of small TCP packets on receiver side(i.e. 3 TCP packets in Doug's case). If you don't see a big TCP segment with tcpdump on TX path, probably TSO was not used for the TCP segment. It's possible for controller to corrupt the TCP segment during segmentation but Doug's tcpdump looks completely normal to me since tcpdump sees the segment before TCP segmentation. What I see on the FreeBSD side with tcpdump is repeated bad-len 0 messages for payloads which are chunked or segmented as a result of TSO. I do not see a 1:1 ratio of bad-len entries to chunked payloads; I only see one bad-len entry for all chunks (up until the next ACK or PSH+ACK of course). I vaguely recall that some users reported similar TSO issues on various drivers. The root cause of the issue was not identified though. Personally I couldn't reproduce the issue at that time. It could be a driver or network stack bug. The important part: I do not see captured TCP packets reporting a length greater than MTU (or MSS for that matter (remember: IP header + TCP header + MSS = MTU). Also note for Doug: remember that if you're doing packet captures between two devices that have NAT involved, you may see different behaviour. 
Example case: 03:58:47.907582 IP 67.180.84.87.2983 206.125.172.42.80: Flags [.], ack 13419, win 64240, length 0 03:58:47.907649 IP 206.125.172.42.80 67.180.84.87.2983: Flags [.], seq 17799:19259, ack 292, win 1026, length 1460 03:58:47.907679 IP 206.125.172.42.80 67.180.84.87.2983: Flags [.], seq 19259:20719, ack 292, win 1026, length 1460 03:58:47.912546 IP 67.180.84.87.2983 206.125.172.42.80: Flags [.], ack 16339, win 64240, length 0 In the above example there's a Linux NAT router (67.180.84.87) with a client (192.168.1.50) behind it, talking to 206.125.172.42. MTU is 1500 (I obviously didn't include the initial SYN :-) ). -- | Jeremy Chadwick j...@koitsu.org | | UNIX Systems Administratorhttp://jdc.koitsu.org/ | | Mountain View, CA, US| | Making life hard for others since 1977. PGP 4BD6C0CB | ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
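The length that tcpdump printed for Doug's packet can be cross-checked against the raw header bytes he quoted: the IPv4 total length field is 0x0f9e (3998), the IP header is 20 bytes, and the TCP data offset nibble is 8 (32 bytes). The sketch below does that subtraction in C. The function and array names are invented for illustration, and only the first 34 bytes of the quoted dump are used.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * First 34 bytes of the packet quoted in the thread: the 20-byte IPv4
 * header followed by the start of the TCP header. (Illustrative name.)
 */
const uint8_t doug_capture[] = {
	0x45, 0x00, 0x0f, 0x9e, 0xea, 0x89, 0x40, 0x00,
	0x40, 0x06, 0x2a, 0x08, 0x0a, 0x00, 0x01, 0xc7,
	0x0a, 0x00, 0x01, 0x02, 0x01, 0xbb, 0xef, 0x4a,
	0xec, 0xe1, 0x68, 0x0b, 0xae, 0x37, 0x1b, 0xbc,
	0x80, 0x18,
};

/*
 * Return the TCP payload length claimed by the headers of an IPv4/TCP
 * packet, or -1 if the buffer is too short or is not IPv4/TCP.
 * Only the headers need to be present in `pkt`, not the payload.
 * (Hypothetical helper, not taken from tcpdump or the kernel.)
 */
int
tcp_payload_len(const uint8_t *pkt, size_t len)
{
	size_t ip_hlen, tcp_hlen, total;

	if (len < 20 || (pkt[0] >> 4) != 4)
		return (-1);
	ip_hlen = (size_t)(pkt[0] & 0x0f) * 4;		/* IHL in 32-bit words */
	total = ((size_t)pkt[2] << 8) | pkt[3];		/* IPv4 total length */
	if (ip_hlen < 20 || pkt[9] != 6 || len < ip_hlen + 13)
		return (-1);
	tcp_hlen = (size_t)(pkt[ip_hlen + 12] >> 4) * 4; /* TCP data offset */
	if (tcp_hlen < 20 || total < ip_hlen + tcp_hlen)
		return (-1);
	return ((int)(total - ip_hlen - tcp_hlen));
}
```

Calling tcp_payload_len(doug_capture, sizeof(doug_capture)) returns 3946, the same length tcpdump reported, so the oversized value comes from the headers the stack built before segmentation rather than from a display error.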
Re: Unusual TCP/IP Packet Size
On Wed, Feb 13, 2013 at 09:10:36PM -0800, Kevin Oberman wrote: On Wed, Feb 13, 2013 at 5:37 PM, YongHyeon PYUN pyu...@gmail.com wrote: On Wed, Feb 13, 2013 at 05:00:59AM -0800, Jeremy Chadwick wrote: On Wed, Feb 13, 2013 at 05:29:53PM +0700, Eugene Grosbein wrote: 13.02.2013 17:25, Doug Hardie ??: Monitoring a tcpdump between two systems, a FreeBSD 9.1 system has the following interface: msk0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=c011bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,TSO4,VLAN_HWTSO,LINKSTATE ether 00:11:2f:2a:c7:03 inet 10.0.1.199 netmask 0xff00 broadcast 10.0.1.255 inet6 fe80::211:2fff:fe2a:c703%msk0 prefixlen 64 scopeid 0x1 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (100baseTX full-duplex,flowcontrol,rxpause,txpause) status: active It sent the following packet: (data content abbreviated) 02:14:42.081617 IP 10.0.1.199.443 10.0.1.2.61258: Flags [P.], seq 930:4876, ack 846, win 1040, options [nop,nop,TS val 401838072 ecr 920110183], length 3946 0x: 4500 0f9e ea89 4000 4006 2a08 0a00 01c7 E.@.@.*. 0x0010: 0a00 0102 01bb ef4a ece1 680b ae37 1bbc ...J..h..7.. 0x0020: 8018 0410 3407 0101 080a 17f3 8ff8 4...??. The indicated packet length is 3946 and the load of data shown is that size. The MTU on both interfaces is 1500. The receiving system received 3 packets. There is a router and switch between them. One of them fragmented that packet. This is part of a SSL/TLS exchange and one side or the other is hanging on this and just dropping the connection. I suspect the packet size is the issue. ssldump complains about the packet too and stops monitoring. Could this possibly be related to the hardware checksums? You have TSO enabled on the interface, so large outgoing TCP packet is pretty normal. It will be split by the NIC. Disable TSO with ifconfig if it interferes with your ssldump. This is not the behaviour I see with em(4) on a 82573E with all defaults used (which includes TSO4). 
Note that Doug is using msk(4). I can provide packet captures on both ends of a LAN segment using both tcpdump (on the FreeBSD side) and Wireshark (on the Windows side) that show a difference in behaviour compared to what Doug sees. This is strange. tcpdump sees a (big) TCP segment right before controller actually transmits it. So if TSO is active for the TCP segment, you should see a series of small TCP packets on receiver side(i.e. 3 TCP packets in Doug's case). If you don't see a big TCP segment with tcpdump on TX path, probably TSO was not used for the TCP segment. It's possible for controller to corrupt the TCP segment during segmentation but Doug's tcpdump looks completely normal to me since tcpdump sees the segment before TCP segmentation. What I see on the FreeBSD side with tcpdump is repeated bad-len 0 messages for payloads which are chunked or segmented as a result of TSO. I do not see a 1:1 ratio of bad-len entries to chunked payloads; I only see one bad-len entry for all chunks (up until the next ACK or PSH+ACK of course). I vaguely recall that some users reported similar TSO issues on various drivers. The root cause of the issue was not identified though. Personally I couldn't reproduce the issue at that time. It could be a driver or network stack bug. Beware TSO. It can significantly improve throughput on high speed networks, but it really has issues. TSO segments the data and transmits all of them back-to-back with no delay beyond IFG (the 802.3 mandated space between frames) TSO does not understand congestion control. If there is congestion and TSO sends several frames in a row, it is entirely possible that a queue is full or getting close enough to full to start dropping packets and these segmented frames are excellent candidates. I'm not saying the drawback of TSO. Sometimes segmented packets have malformed IP header length under certain circumstances such that these packets were dropped on receiver side. 
Re: re(4) problems with GA-H77N-WIFI
On Fri, Feb 08, 2013 at 08:27:55PM +0100, Oliver Fromme wrote:
> I'm sorry for the late reply. I didn't have much time this week to investigate this issue. At the moment I implemented a work-around with an additional switch using VLANs, but I'd really like to get the second NIC working.
>
> YongHyeon PYUN wrote:
> > On Mon, Feb 04, 2013 at 07:15:51PM +0100, Oliver Fromme wrote:
> > > Recently I got a new mainboard for a router, it's a Gigabyte GA-H77N-WIFI with two onboard re(4) NICs. The problem is that re0 works fine and re1 doesn't: It doesn't receive any packets. Tcpdump displays all outgoing packets, but no incoming ones on re1.
> >
> > Can you see the packets sent from re1 on other box?
>
> No. I can only see them locally in tcpdump, but they never hit the wire.
>
> > If not, it probably indicates GMAC is in weird state which in turn indicates initialization was not complete for the controller.
> >
> > > Ifconfig shows the link correctly (100 or 1000 Mbit, depending on where I plug the cable in). I also swapped cables just to be sure, but it made no difference.
> >
> > If you cold-boot the box with UTP cable plugged in to re1 does it make any difference?
>
> No, it doesn't.
>
> > > I'm running a recent stable/9 (about 14 days old). What's the best way to debug this problem? At the
> >
> > I would check whether GMAC is active when driver detects a valid link. Add a code like the following to re_miibus_statchg() to get the status of RL_COMMAND register. You would get the status whenever a link is established with link partner.
>
> re0: CMD 0x0c
> re0: link state changed to UP
> re0: link state changed to DOWN
> re1: link state changed to UP
> re1: link state changed to DOWN
> re1: CMD 0x0c
> re1: link state changed to UP
> re0: CMD 0x0c
> re0: link state changed to UP
> re1: link state changed to DOWN
>
> I always seem to get 0x0c for both re0 and re1.

Hmm, it seems GMAC is in sane state. Would you show me the output of devinfo -rv | grep rgephy? To rule out hardware issues, could you also try other OS like Linux?

> Best regards
> Oliver
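A note on why 0x0c reads as a sane state: in the command register of these RealTek controllers, bit 2 enables the transmitter and bit 3 enables the receiver, so 0x0c means both are switched on. The sketch below decodes the value in C. The macro names are local stand-ins and the bit values are written from memory of if_rlreg.h (RL_CMD_TX_ENB, RL_CMD_RX_ENB, RL_CMD_RESET), so treat them as assumptions and check them against your own source tree.

```c
#include <stdint.h>

/*
 * Local stand-ins for the command register bits. The values are an
 * assumption taken from memory of if_rlreg.h; verify before relying on them.
 */
#define CMD_TX_ENB	0x04	/* transmitter enabled */
#define CMD_RX_ENB	0x08	/* receiver enabled */
#define CMD_RESET	0x10	/* software reset still in progress */

/*
 * Return 1 if the command register value shows both Tx and Rx MACs
 * enabled and no reset pending, 0 otherwise.
 * (Hypothetical helper, not part of the re(4) driver.)
 */
int
mac_tx_rx_enabled(uint8_t cmd)
{
	if ((cmd & CMD_RESET) != 0)
		return (0);
	return ((cmd & (CMD_TX_ENB | CMD_RX_ENB)) ==
	    (CMD_TX_ENB | CMD_RX_ENB));
}
```

For the value printed for both interfaces, mac_tx_rx_enabled(0x0c) returns 1, which is consistent with the conclusion that the MAC itself is enabled and the problem on re1 lies elsewhere.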
Re: re(4) problems with GA-H77N-WIFI
On Mon, Feb 04, 2013 at 07:15:51PM +0100, Oliver Fromme wrote:
> Hello,
>
> I need some advice how to debug this issue ...
>
> Recently I got a new mainboard for a router, it's a Gigabyte GA-H77N-WIFI with two onboard re(4) NICs. The problem is that re0 works fine and re1 doesn't: It doesn't receive any packets. Tcpdump displays all outgoing packets, but no incoming ones on re1.

Can you see the packets sent from re1 on other box? If not, it probably indicates GMAC is in weird state which in turn indicates initialization was not complete for the controller.

> Ifconfig shows the link correctly (100 or 1000 Mbit, depending on where I plug the cable in). I also swapped cables just to be sure, but it made no difference.

If you cold-boot the box with UTP cable plugged in to re1 does it make any difference?

> I'm running a recent stable/9 (about 14 days old). What's the best way to debug this problem? At the

I would check whether GMAC is active when driver detects a valid link. Add a code like the following to re_miibus_statchg() to get the status of RL_COMMAND register. You would get the status whenever a link is established with link partner.

Index: if_re.c
===================================================================
--- if_re.c	(revision 246338)
+++ if_re.c	(working copy)
@@ -626,6 +626,9 @@
 		default:
 			break;
 		}
+		if (sc->rl_flags & RL_FLAG_LINK)
+			device_printf(sc->rl_dev, "CMD 0x%02x\n",
+			    CSR_READ_1(sc, RL_COMMAND));
 	}
 	/*
 	 * RealTek controllers does not provide any interface to

> moment I'm not even sure if it's the hardware, or if it's FreeBSD's fault (or my fault) ...
>
> Best regards
> Oliver
>
> PS: dmesg ...
>
> pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.4 on pci0
> pci2: <ACPI PCI bus> on pcib2
> re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet> port 0xe000-0xe0ff mem 0xf0104000-0xf0104fff,0xf010-0xf0103fff irq 16 at device 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x
> miibus0: <MII bus> on re0
> rgephy0: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus0
> rgephy0: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
> re0: Ethernet address: 90:2b:34:5f:bd:21
> pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.5 on pci0
> pci3: <ACPI PCI bus> on pcib3
> re1: <RealTek 8168/8111 B/C/CP/D/DP/E/F PCIe Gigabit Ethernet> port 0xd000-0xd0ff mem 0xf0004000-0xf0004fff,0xf000-0xf0003fff irq 17 at device 0.0 on pci3
> re1: Using 1 MSI-X message
> re1: Chip rev. 0x2c80
> re1: MAC rev. 0x
> miibus1: <MII bus> on re1
> rgephy1: <RTL8169S/8110S/8211 1000BASE-T media interface> PHY 1 on miibus1
> rgephy1: none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, auto, auto-flow
> re1: Ethernet address: 90:2b:34:5f:bd:11
>
> ifconfig ...
>
> re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>         ether 90:2b:34:5f:bd:21
>         inet ...
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (1000baseT <full-duplex>)
>         status: active
> re1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         options=8209b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_MAGIC,LINKSTATE>
>         ether 90:2b:34:5f:bd:11
>         inet ...
>         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>         media: Ethernet autoselect (100baseTX <full-duplex>)
>         status: active
Re: FreeBSD-9.1 would not boot on pentium3 laptop
On Wed, Feb 06, 2013 at 01:40:11AM -0500, Mikhail T. wrote:
> On 06.02.2013 01:24, Mikhail T. wrote:
> > Now, if only I could figure out, why my network card (3COM's 3C556 Mini PCI) is not seen by the 9.1...
>
> Disabling Wake on LAN in the BIOS solved this problem. Now xl0 is seen and functional. Solved.

Because I added WOL support to xl(4) in the past, I'm interested in knowing whether that change broke your controller when the BIOS enables WOL. If you boot with bootverbose, do you see a message like "No auxiliary remote wakeup connector!" from xl(4)? (Of course, WOL should be enabled in the BIOS before boot.) Also show the output of 'pciconf -lcbv'.

> I struggle to understand, how a less seasoned user could be expected to figure these two issues out...
>
> -mi
Re: stable 9.1, bge and ipmi not cooperating
On Thu, Jan 03, 2013 at 08:45:05AM +0200, Daniel Braniss wrote: On Wed, Jan 02, 2013 at 08:33:41AM +0200, Daniel Braniss wrote: Hi, the last batch of changes to bge caused the ipmi to stop working on a Sun Fire X2200 M2, and bge0: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003 mem 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6 bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus2: MII bus on bge0 bge1: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003 mem 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6 bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus3: MII bus on bge1 Could you narrow down which revision broke IPMI? And what problems are you seeing? I don't have IPMI-capable controller to test. IPMI support for old controllers(pre-5717 controllers) was not complete and it was luck if it worked on a couple of controllers. first approximation: kernels upto Oct 5th work OK. so it must be one of these: changeset: 16420:855e9b949c7a branch: 9 user:yongari date:Mon Nov 26 04:39:41 2012 + summary: MFC r242426: changeset: 16419:5fa365c1dcdb branch: 9 user:yongari date:Mon Nov 26 04:34:05 2012 + summary: MFC r241983-241985: changeset: 16418:b69de3b6a9e8 branch: 9 user:yongari date:Mon Nov 26 04:25:41 2012 + summary: MFC r241438: changeset: 16416:564dcb92a91e branch: 9 user:yongari date:Mon Nov 26 04:10:27 2012 + summary: MFC r241436: changeset: 16414:a49ee7f76f76 branch: 9 user:yongari date:Mon Nov 26 02:41:30 2012 + summary: MFC r241388-241393: changeset: 16413:0f885258cea4 branch: 9 user:yongari date:Mon Nov 26 02:31:28 2012 + summary: MFC r241215-241216,241219-241220,241341,241343: changeset: 16308:27fd493a855c branch: 9 user:dim date:Mon Nov 12 07:34:05 2012 + summary: MFC r242625: changeset: 16169:e292988ae112 branch: 9 user:gavin date:Wed Oct 24 19:04:17 2012 + summary: Merge r240680 from head: btw, if you send me patches I can try them out Sorry, I need more exact 
revision number. There have been too many changes.

cheers, danny

PS: I'm sorry I caught this now, but these machines are production, and I didn't want to touch them :-(
Re: stable 9.1, bge and ipmi not cooperating
On Wed, Jan 02, 2013 at 08:33:41AM +0200, Daniel Braniss wrote:
> Hi,
> the last batch of changes to bge caused the ipmi to stop working on a Sun Fire X2200 M2, and
>
> bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6
> bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> miibus2: <MII bus> on bge0
> bge1: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x009003> mem 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 on pci6
> bge1: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
> miibus3: <MII bus> on bge1

Could you narrow down which revision broke IPMI? And what problems are you seeing? I don't have an IPMI-capable controller to test with. IPMI support for old controllers (pre-5717 controllers) was not complete, and it was a matter of luck if it worked on a couple of controllers.
Re: bge on the new Mac Mini
On Thu, Nov 29, 2012 at 08:14:12AM -0500, Richard Kuhns wrote: On 11/28/12 19:08, YongHyeon PYUN wrote: On Wed, Nov 28, 2012 at 10:12:05AM -0500, Richard Kuhns wrote: On 11/27/12 19:19, YongHyeon PYUN wrote: On Tue, Nov 27, 2012 at 08:34:13AM -0500, Richard Kuhns wrote: On 11/27/12 00:24, YongHyeon PYUN wrote: On Mon, Nov 26, 2012 at 10:13:47AM -0500, Richard Kuhns wrote: On 11/21/12 21:08, YongHyeon PYUN wrote: On Thu, Nov 22, 2012 at 10:49:21AM +0900, YongHyeon PYUN wrote: On Wed, Nov 21, 2012 at 02:59:34PM -0500, Richard Kuhns wrote: On 11/20/12 03:52, YongHyeon PYUN wrote: On Fri, Nov 16, 2012 at 10:30:04AM -0500, Richard Kuhns wrote: Hi all, Over the last month or so I've installed FreeBSD 9 (-stable) on several Mac Minis via the memstick image; they seem to be pretty good little boxes for things like offsite secondary nameservers, for example, and they're easily replaced in case of problems. However, the newest minis have slightly different hardware, and FreeBSD can't find the built-in NIC. pciconf -lv on the new mini shows it as none3@pci0:1:0:0: class=0x02 card=0x168614e4 chip=0x168614e4 rev=0x01 It seems this controller is BCM57766. hdr=0x00 vendor = 'Broadcom Corporation' class = network subclass = ethernet The previous edition mini (that works) reports bge0@pci0:2:0:0: class=0x02 card=0x16b414e4 chip=0x16b414e4 rev=0x10 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM57765 Gigabit Ethernet PCIe' class = network subclass = ethernet Is there a chance that adding the new card/chip info to the current driver would allow it to work? I'll be happy to test and report back. I'm afraid I'm not familiar enough with hardware at that level to figure out the patch myself. Try attached patch and let me know whether the patch works or not. If the patch works please share dmesg output(bge(4) and brgphy(4) output only). Note, the patch was generated against CURRENT. I'm afraid it didn't help. 
I ended up grabbing if_bge.c and if_bgereg.h from I guess you also need to copy brgphy.c from HEAD to /usr/src/sys/dev/mii directory. HEAD using svnweb.freebsd.org. The patch installed cleanly and there were no errors during the build, but still no NIC. Does it mean you're not seeing bge0 interface? Or you can't pass any traffic via bge0? Oops, it seems I've not included your device ID in the diff. Try attach one instead. Make sure you use brgphy.c from HEAD. There's progress! With your latest patch using brgphy.c, if_bge.c, and if_bgereg.h from head I'm now seeing the bge0 interface. Unfortunately, the moment I try to configure it the box locks up completely; it won't even toggle the caps lock LED. Booting single user and running ifconfig shows: bge0: flags=8802BROADCAST,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8009bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE ether a8:20:66:11:3b:d6 nd6 options=21PERFORMNUD,AUTO_LINKLOCAL media: Ethernet autoselect (1000baseT full-duplex) status: active I did a verbose boot; here's the part that seems to be relevant to bge0: bge0: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x57766001 mem 0xa040-0xa040,0xa041-0xa041 irq 16 at device 0.0 on pci1 bge0: CHIP ID 0x10110142; ASIC REV 0x10110; CHIP REV 0x101101; PCI-E ^ All these information are garbage which indicates a bug in the diff. miibus0: MII bus on bge0 brgphy0: BCM57765 1000BASE-T media interface PHY 1 on miibus0 brgphy0: OUI 0x001be9, model 0x0024, rev. 1 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: bpf attached bge0: Ethernet address: a8:20:66:11:3b:d6 ioapic0: routing intpin 16 (PCI IRQ 16) to lapic 0 vector 61 I greatly appreciate your efforts. I'm sorry for the delay getting back with you, but we had a busy Thanksgiving weekend. Try again with attached bge.57766.diff3. Thanks for testing! 
I don't think the patch actually got attached :-( Oops, attached. And there was great rejoicing... It seems to take longer than I'm used to for it to decide it has link (about halfway through 'waiting for the default route interface'), but it works! Great. Could you show me dmesg(bge(4) and brgphy(4) only) and ifconfig bge0 output? Sure. Here's the 'ifconfig bge0' output: bge0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=c019bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE ether a8:20:66:11:3b:d6 inet 172.28.1.90 netmask 0xff00 broadcast 172.28.1.255
Re: bge on the new Mac Mini
On Wed, Nov 28, 2012 at 10:12:05AM -0500, Richard Kuhns wrote: On 11/27/12 19:19, YongHyeon PYUN wrote: On Tue, Nov 27, 2012 at 08:34:13AM -0500, Richard Kuhns wrote: On 11/27/12 00:24, YongHyeon PYUN wrote: On Mon, Nov 26, 2012 at 10:13:47AM -0500, Richard Kuhns wrote: On 11/21/12 21:08, YongHyeon PYUN wrote: On Thu, Nov 22, 2012 at 10:49:21AM +0900, YongHyeon PYUN wrote: On Wed, Nov 21, 2012 at 02:59:34PM -0500, Richard Kuhns wrote: On 11/20/12 03:52, YongHyeon PYUN wrote: On Fri, Nov 16, 2012 at 10:30:04AM -0500, Richard Kuhns wrote: Hi all, Over the last month or so I've installed FreeBSD 9 (-stable) on several Mac Minis via the memstick image; they seem to be pretty good little boxes for things like offsite secondary nameservers, for example, and they're easily replaced in case of problems. However, the newest minis have slightly different hardware, and FreeBSD can't find the built-in NIC. pciconf -lv on the new mini shows it as none3@pci0:1:0:0: class=0x02 card=0x168614e4 chip=0x168614e4 rev=0x01 It seems this controller is BCM57766. hdr=0x00 vendor = 'Broadcom Corporation' class = network subclass = ethernet The previous edition mini (that works) reports bge0@pci0:2:0:0:class=0x02 card=0x16b414e4 chip=0x16b414e4 rev=0x10 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM57765 Gigabit Ethernet PCIe' class = network subclass = ethernet Is there a chance that adding the new card/chip info to the current driver would allow it to work? I'll be happy to test and report back. I'm afraid I'm not familiar enough with hardware at that level to figure out the patch myself. Try attached patch and let me know whether the patch works or not. If the patch works please share dmesg output(bge(4) and brgphy(4) output only). Note, the patch was generated against CURRENT. I'm afraid it didn't help. I ended up grabbing if_bge.c and if_bgereg.h from I guess you also need to copy brgphy.c from HEAD to /usr/src/sys/dev/mii directory. HEAD using svnweb.freebsd.org. 
The patch installed cleanly and there were no errors during the build, but still no NIC. Does it mean you're not seeing bge0 interface? Or you can't pass any traffic via bge0? Oops, it seems I've not included your device ID in the diff. Try attach one instead. Make sure you use brgphy.c from HEAD. There's progress! With your latest patch using brgphy.c, if_bge.c, and if_bgereg.h from head I'm now seeing the bge0 interface. Unfortunately, the moment I try to configure it the box locks up completely; it won't even toggle the caps lock LED. Booting single user and running ifconfig shows: bge0: flags=8802BROADCAST,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8009bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE ether a8:20:66:11:3b:d6 nd6 options=21PERFORMNUD,AUTO_LINKLOCAL media: Ethernet autoselect (1000baseT full-duplex) status: active I did a verbose boot; here's the part that seems to be relevant to bge0: bge0: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x57766001 mem 0xa040-0xa040,0xa041-0xa041 irq 16 at device 0.0 on pci1 bge0: CHIP ID 0x10110142; ASIC REV 0x10110; CHIP REV 0x101101; PCI-E ^ All these information are garbage which indicates a bug in the diff. miibus0: MII bus on bge0 brgphy0: BCM57765 1000BASE-T media interface PHY 1 on miibus0 brgphy0: OUI 0x001be9, model 0x0024, rev. 1 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: bpf attached bge0: Ethernet address: a8:20:66:11:3b:d6 ioapic0: routing intpin 16 (PCI IRQ 16) to lapic 0 vector 61 I greatly appreciate your efforts. I'm sorry for the delay getting back with you, but we had a busy Thanksgiving weekend. Try again with attached bge.57766.diff3. Thanks for testing! I don't think the patch actually got attached :-( Oops, attached. And there was great rejoicing... 
It seems to take longer than I'm used to for it to decide it has link (about halfway through 'waiting for the default route interface'), but it works!

Great. Could you show me the dmesg output (bge(4) and brgphy(4) only) and the ifconfig bge0 output?

I've just installed subversion, and I'm doing an 'svn co' of stable/9. Many thanks for the work you've done!
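As background to the device ID discussion in this thread: the chip= value printed by pciconf packs the PCI device ID into the upper 16 bits and the vendor ID into the lower 16 bits, so chip=0x168614e4 is device 0x1686 from vendor 0x14e4, and the working mini's chip=0x16b414e4 is device 0x16b4. A driver only attaches when that pair is in its table. The C sketch below splits the value. The function names and the vendor constant's name are made up for illustration and are not the bge(4) probe code.

```c
#include <stdint.h>

/* Vendor ID seen in both pciconf listings in this thread (illustrative name). */
#define VENDOR_BROADCOM	0x14e4

/* Lower 16 bits of pciconf's chip= value: the PCI vendor ID. */
uint16_t
pci_vendor_of(uint32_t chip)
{
	return ((uint16_t)(chip & 0xffff));
}

/* Upper 16 bits of pciconf's chip= value: the PCI device ID. */
uint16_t
pci_device_of(uint32_t chip)
{
	return ((uint16_t)(chip >> 16));
}

/*
 * Return 1 if the chip= value matches the given vendor/device pair,
 * which is the kind of comparison a probe routine makes against each
 * entry of its ID table. (Hypothetical helper.)
 */
int
pci_id_matches(uint32_t chip, uint16_t vendor, uint16_t device)
{
	return (pci_vendor_of(chip) == vendor && pci_device_of(chip) == device);
}
```

For the new mini, pci_device_of(0x168614e4) is 0x1686, which an unpatched table that only lists 0x16b4 would not match; that is why the interface showed up as none3 until the ID was added.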
Re: bge on the new Mac Mini
On Tue, Nov 27, 2012 at 08:34:13AM -0500, Richard Kuhns wrote: On 11/27/12 00:24, YongHyeon PYUN wrote: On Mon, Nov 26, 2012 at 10:13:47AM -0500, Richard Kuhns wrote: On 11/21/12 21:08, YongHyeon PYUN wrote: On Thu, Nov 22, 2012 at 10:49:21AM +0900, YongHyeon PYUN wrote: On Wed, Nov 21, 2012 at 02:59:34PM -0500, Richard Kuhns wrote: On 11/20/12 03:52, YongHyeon PYUN wrote: On Fri, Nov 16, 2012 at 10:30:04AM -0500, Richard Kuhns wrote: Hi all, Over the last month or so I've installed FreeBSD 9 (-stable) on several Mac Minis via the memstick image; they seem to be pretty good little boxes for things like offsite secondary nameservers, for example, and they're easily replaced in case of problems. However, the newest minis have slightly different hardware, and FreeBSD can't find the built-in NIC. pciconf -lv on the new mini shows it as none3@pci0:1:0:0: class=0x02 card=0x168614e4 chip=0x168614e4 rev=0x01 It seems this controller is BCM57766. hdr=0x00 vendor = 'Broadcom Corporation' class = network subclass = ethernet The previous edition mini (that works) reports bge0@pci0:2:0:0: class=0x02 card=0x16b414e4 chip=0x16b414e4 rev=0x10 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme BCM57765 Gigabit Ethernet PCIe' class = network subclass = ethernet Is there a chance that adding the new card/chip info to the current driver would allow it to work? I'll be happy to test and report back. I'm afraid I'm not familiar enough with hardware at that level to figure out the patch myself. Try attached patch and let me know whether the patch works or not. If the patch works please share dmesg output(bge(4) and brgphy(4) output only). Note, the patch was generated against CURRENT. I'm afraid it didn't help. I ended up grabbing if_bge.c and if_bgereg.h from I guess you also need to copy brgphy.c from HEAD to /usr/src/sys/dev/mii directory. HEAD using svnweb.freebsd.org. The patch installed cleanly and there were no errors during the build, but still no NIC. 
Does it mean you're not seeing bge0 interface? Or you can't pass any traffic via bge0? Oops, it seems I've not included your device ID in the diff. Try attach one instead. Make sure you use brgphy.c from HEAD. There's progress! With your latest patch using brgphy.c, if_bge.c, and if_bgereg.h from head I'm now seeing the bge0 interface. Unfortunately, the moment I try to configure it the box locks up completely; it won't even toggle the caps lock LED. Booting single user and running ifconfig shows: bge0: flags=8802BROADCAST,SIMPLEX,MULTICAST metric 0 mtu 1500 options=8009bRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE ether a8:20:66:11:3b:d6 nd6 options=21PERFORMNUD,AUTO_LINKLOCAL media: Ethernet autoselect (1000baseT full-duplex) status: active I did a verbose boot; here's the part that seems to be relevant to bge0: bge0: Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x57766001 mem 0xa040-0xa040,0xa041-0xa041 irq 16 at device 0.0 on pci1 bge0: CHIP ID 0x10110142; ASIC REV 0x10110; CHIP REV 0x101101; PCI-E ^ All these information are garbage which indicates a bug in the diff. miibus0: MII bus on bge0 brgphy0: BCM57765 1000BASE-T media interface PHY 1 on miibus0 brgphy0: OUI 0x001be9, model 0x0024, rev. 1 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: bpf attached bge0: Ethernet address: a8:20:66:11:3b:d6 ioapic0: routing intpin 16 (PCI IRQ 16) to lapic 0 vector 61 I greatly appreciate your efforts. I'm sorry for the delay getting back with you, but we had a busy Thanksgiving weekend. Try again with attached bge.57766.diff3. Thanks for testing! I don't think the patch actually got attached :-( Oops, attached. 
Index: sys/dev/bge/if_bgereg.h
===================================================================
--- sys/dev/bge/if_bgereg.h	(revision 243552)
+++ sys/dev/bge/if_bgereg.h	(working copy)
@@ -360,6 +360,7 @@
 #define	BGE_ASICREV_BCM5784		0x5784
 #define	BGE_ASICREV_BCM5785		0x5785
 #define	BGE_ASICREV_BCM57765		0x57785
+#define	BGE_ASICREV_BCM57766		0x57766
 #define	BGE_ASICREV_BCM57780		0x57780
 
 /* chip revisions */
@@ -2483,7 +2484,9 @@ struct bge_status_block {
 #define	BCOM_DEVICEID_BCM5906M		0x1713
 #define	BCOM_DEVICEID_BCM57760		0x1690
 #define	BCOM_DEVICEID_BCM57761		0x16B0
+#define	BCOM_DEVICEID_BCM57762		0x1682
 #define	BCOM_DEVICEID_BCM57765		0x16B4
Re: bge on the new Mac Mini
On Mon, Nov 26, 2012 at 10:13:47AM -0500, Richard Kuhns wrote:
> On 11/21/12 21:08, YongHyeon PYUN wrote:
> > On Thu, Nov 22, 2012 at 10:49:21AM +0900, YongHyeon PYUN wrote:
> > > On Wed, Nov 21, 2012 at 02:59:34PM -0500, Richard Kuhns wrote:
> > > > On 11/20/12 03:52, YongHyeon PYUN wrote:
> > > > > On Fri, Nov 16, 2012 at 10:30:04AM -0500, Richard Kuhns wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > Over the last month or so I've installed FreeBSD 9 (-stable)
> > > > > > on several Mac Minis via the memstick image; they seem to be
> > > > > > pretty good little boxes for things like offsite secondary
> > > > > > nameservers, for example, and they're easily replaced in case
> > > > > > of problems.
> > > > > >
> > > > > > However, the newest minis have slightly different hardware,
> > > > > > and FreeBSD can't find the built-in NIC. pciconf -lv on the
> > > > > > new mini shows it as
> > > > > >
> > > > > > none3@pci0:1:0:0: class=0x02 card=0x168614e4 chip=0x168614e4 rev=0x01
> > > > >
> > > > > It seems this controller is BCM57766.
> > > > >
> > > > > > hdr=0x00
> > > > > >     vendor     = 'Broadcom Corporation'
> > > > > >     class      = network
> > > > > >     subclass   = ethernet
> > > > > >
> > > > > > The previous edition mini (that works) reports
> > > > > >
> > > > > > bge0@pci0:2:0:0: class=0x02 card=0x16b414e4 chip=0x16b414e4 rev=0x10 hdr=0x00
> > > > > >     vendor     = 'Broadcom Corporation'
> > > > > >     device     = 'NetXtreme BCM57765 Gigabit Ethernet PCIe'
> > > > > >     class      = network
> > > > > >     subclass   = ethernet
> > > > > >
> > > > > > Is there a chance that adding the new card/chip info to the
> > > > > > current driver would allow it to work? I'll be happy to test
> > > > > > and report back. I'm afraid I'm not familiar enough with
> > > > > > hardware at that level to figure out the patch myself.
> > > > >
> > > > > Try attached patch and let me know whether the patch works or
> > > > > not. If the patch works please share dmesg output(bge(4) and
> > > > > brgphy(4) output only).
> > > > > Note, the patch was generated against CURRENT.
> > > >
> > > > I'm afraid it didn't help. I ended up grabbing if_bge.c and
> > > > if_bgereg.h from
> > >
> > > I guess you also need to copy brgphy.c from HEAD to
> > > /usr/src/sys/dev/mii directory.
> > >
> > > > HEAD using svnweb.freebsd.org. The patch installed cleanly and
> > > > there were no errors during the build, but still no NIC.
> > >
> > > Does it mean you're not seeing bge0 interface? Or you can't pass
> > > any traffic via bge0?
> >
> > Oops, it seems I've not included your device ID in the diff.
> > Try attach one instead. Make sure you use brgphy.c from HEAD.
>
> There's progress! With your latest patch using brgphy.c, if_bge.c, and
> if_bgereg.h from head I'm now seeing the bge0 interface. Unfortunately,
> the moment I try to configure it the box locks up completely; it won't
> even toggle the caps lock LED.
>
> Booting single user and running ifconfig shows:
>
> bge0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>     options=8009b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LINKSTATE>
>     ether a8:20:66:11:3b:d6
>     nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
>     media: Ethernet autoselect (1000baseT <full-duplex>)
>     status: active
>
> I did a verbose boot; here's the part that seems to be relevant to
> bge0:
>
> bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x57766001> mem 0xa040-0xa040,0xa041-0xa041 irq 16 at device 0.0 on pci1
> bge0: CHIP ID 0x10110142; ASIC REV 0x10110; CHIP REV 0x101101; PCI-E
  ^
All these information are garbage which indicates a bug in
the diff.

> miibus0: <MII bus> on bge0
> brgphy0: <BCM57765 1000BASE-T media interface> PHY 1 on miibus0
> brgphy0: OUI 0x001be9, model 0x0024, rev. 1
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge0: bpf attached
> bge0: Ethernet address: a8:20:66:11:3b:d6
> ioapic0: routing intpin 16 (PCI IRQ 16) to lapic 0 vector 61
>
> I greatly appreciate your efforts. I'm sorry for the delay getting
> back with you, but we had a busy Thanksgiving weekend.

Try again with attached bge.57766.diff3.
Thanks for testing!
Re: bge on the new Mac Mini
> On Wed, Nov 21, 2012 at 02:59:34PM -0500, Richard Kuhns wrote:
> > On 11/20/12 03:52, YongHyeon PYUN wrote:
> > > On Fri, Nov 16, 2012 at 10:30:04AM -0500, Richard Kuhns wrote:
> > > > Hi all,
> > > >
> > > > Over the last month or so I've installed FreeBSD 9 (-stable) on
> > > > several Mac Minis via the memstick image; they seem to be pretty
> > > > good little boxes for things like offsite secondary nameservers,
> > > > for example, and they're easily replaced in case of problems.
> > > >
> > > > However, the newest minis have slightly different hardware, and
> > > > FreeBSD can't find the built-in NIC. pciconf -lv on the new mini
> > > > shows it as
> > > >
> > > > none3@pci0:1:0:0: class=0x02 card=0x168614e4 chip=0x168614e4 rev=0x01
> > >
> > > It seems this controller is BCM57766.
> > >
> > > > hdr=0x00
> > > >     vendor     = 'Broadcom Corporation'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > >
> > > > The previous edition mini (that works) reports
> > > >
> > > > bge0@pci0:2:0:0: class=0x02 card=0x16b414e4 chip=0x16b414e4 rev=0x10 hdr=0x00
> > > >     vendor     = 'Broadcom Corporation'
> > > >     device     = 'NetXtreme BCM57765 Gigabit Ethernet PCIe'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > >
> > > > Is there a chance that adding the new card/chip info to the
> > > > current driver would allow it to work? I'll be happy to test and
> > > > report back. I'm afraid I'm not familiar enough with hardware at
> > > > that level to figure out the patch myself.
> > >
> > > Try attached patch and let me know whether the patch works or not.
> > > If the patch works please share dmesg output(bge(4) and brgphy(4)
> > > output only).
> > > Note, the patch was generated against CURRENT.
> >
> > I'm afraid it didn't help. I ended up grabbing if_bge.c and
> > if_bgereg.h from
>
> I guess you also need to copy brgphy.c from HEAD to
> /usr/src/sys/dev/mii directory.
>
> > HEAD using svnweb.freebsd.org. The patch installed cleanly and there
> > were no errors during the build, but still no NIC.
>
> Does it mean you're not seeing bge0 interface? Or you can't pass any
> traffic via bge0?

Just to make sure you know, I've made no local modifications at all.
This was a fresh install and I've touched nothing except for these 2
files.

Thanks!

-- 
Richard Kuhns r...@wintek.com    My Desk: 765-269-8541
Wintek Corporation               Internet Support: 765-269-8503
427 N 6th Street                 Consulting: 765-269-8504
Lafayette, IN 47901-2211         Accounting: 765-269-8502
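For readers comparing the pciconf lines with the device IDs in the patches: the chip= value packs the PCI device ID in its upper 16 bits and the vendor ID in its lower 16 bits. The following is only a minimal sketch of that split; the names pci_id and split_pci_id are made up for illustration and do not come from pciconf(8) or the bge(4) sources.

```c
#include <stdint.h>

/* Hypothetical helper type: the two halves of a pciconf "chip=" value. */
struct pci_id {
    uint16_t vendor;    /* low 16 bits, e.g. 0x14e4 for Broadcom */
    uint16_t device;    /* high 16 bits, e.g. 0x16b4 for BCM57765 */
};

/* Split a 32-bit ID as printed by "pciconf -l" into vendor and device. */
static struct pci_id
split_pci_id(uint32_t id)
{
    struct pci_id out;

    out.vendor = (uint16_t)(id & 0xffff);
    out.device = (uint16_t)(id >> 16);
    return (out);
}
```

Applied to chip=0x168614e4 from the new mini this gives vendor 0x14e4 and device 0x1686; the working mini's chip=0x16b414e4 gives device 0x16b4, which is the value the patches list as BCOM_DEVICEID_BCM57765.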
Re: bge on the new Mac Mini
On Thu, Nov 22, 2012 at 10:49:21AM +0900, YongHyeon PYUN wrote:
> On Wed, Nov 21, 2012 at 02:59:34PM -0500, Richard Kuhns wrote:
> > On 11/20/12 03:52, YongHyeon PYUN wrote:
> > > On Fri, Nov 16, 2012 at 10:30:04AM -0500, Richard Kuhns wrote:
> > > > Hi all,
> > > >
> > > > Over the last month or so I've installed FreeBSD 9 (-stable) on
> > > > several Mac Minis via the memstick image; they seem to be pretty
> > > > good little boxes for things like offsite secondary nameservers,
> > > > for example, and they're easily replaced in case of problems.
> > > >
> > > > However, the newest minis have slightly different hardware, and
> > > > FreeBSD can't find the built-in NIC. pciconf -lv on the new mini
> > > > shows it as
> > > >
> > > > none3@pci0:1:0:0: class=0x02 card=0x168614e4 chip=0x168614e4 rev=0x01
> > >
> > > It seems this controller is BCM57766.
> > >
> > > > hdr=0x00
> > > >     vendor     = 'Broadcom Corporation'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > >
> > > > The previous edition mini (that works) reports
> > > >
> > > > bge0@pci0:2:0:0: class=0x02 card=0x16b414e4 chip=0x16b414e4 rev=0x10 hdr=0x00
> > > >     vendor     = 'Broadcom Corporation'
> > > >     device     = 'NetXtreme BCM57765 Gigabit Ethernet PCIe'
> > > >     class      = network
> > > >     subclass   = ethernet
> > > >
> > > > Is there a chance that adding the new card/chip info to the
> > > > current driver would allow it to work? I'll be happy to test and
> > > > report back. I'm afraid I'm not familiar enough with hardware at
> > > > that level to figure out the patch myself.
> > >
> > > Try attached patch and let me know whether the patch works or not.
> > > If the patch works please share dmesg output(bge(4) and brgphy(4)
> > > output only).
> > > Note, the patch was generated against CURRENT.
> >
> > I'm afraid it didn't help. I ended up grabbing if_bge.c and
> > if_bgereg.h from
>
> I guess you also need to copy brgphy.c from HEAD to
> /usr/src/sys/dev/mii directory.
>
> > HEAD using svnweb.freebsd.org. The patch installed cleanly and there
> > were no errors during the build, but still no NIC.
>
> Does it mean you're not seeing bge0 interface? Or you can't pass any
> traffic via bge0?

Oops, it seems I've not included your device ID in the diff.
Try attach one instead. Make sure you use brgphy.c from HEAD.
Index: sys/dev/bge/if_bgereg.h
===================================================================
--- sys/dev/bge/if_bgereg.h (revision 243366)
+++ sys/dev/bge/if_bgereg.h (working copy)
@@ -360,6 +360,7 @@
 #define BGE_ASICREV_BCM5784 0x5784
 #define BGE_ASICREV_BCM5785 0x5785
 #define BGE_ASICREV_BCM57765 0x57785
+#define BGE_ASICREV_BCM57766 0x57766
 #define BGE_ASICREV_BCM57780 0x57780
 
 /* chip revisions */
@@ -2483,7 +2484,9 @@ struct bge_status_block {
 #define BCOM_DEVICEID_BCM5906M 0x1713
 #define BCOM_DEVICEID_BCM57760 0x1690
 #define BCOM_DEVICEID_BCM57761 0x16B0
+#define BCOM_DEVICEID_BCM57762 0x1682
 #define BCOM_DEVICEID_BCM57765 0x16B4
+#define BCOM_DEVICEID_BCM57766 0x1686
 #define BCOM_DEVICEID_BCM57780 0x1692
 #define BCOM_DEVICEID_BCM57781 0x16B1
 #define BCOM_DEVICEID_BCM57785 0x16B5
@@ -2961,6 +2964,7 @@ struct bge_softc {
 #define BGE_FLAG_5755_PLUS 0x0010
 #define BGE_FLAG_5788 0x0020
 #define BGE_FLAG_5717_PLUS 0x0040
+#define BGE_FLAG_57765_PLUS 0x0080
 #define BGE_FLAG_40BIT_BUG 0x0100
 #define BGE_FLAG_4G_BNDRY_BUG 0x0200
 #define BGE_FLAG_RX_ALIGNBUG 0x0400
Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c (revision 243366)
+++ sys/dev/bge/if_bge.c (working copy)
@@ -216,7 +216,9 @@ static const struct bge_type {
     { BCOM_VENDORID, BCOM_DEVICEID_BCM5906M },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57760 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57761 },
+    { BCOM_VENDORID, BCOM_DEVICEID_BCM57762 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57765 },
+    { BCOM_VENDORID, BCOM_DEVICEID_BCM57766 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57780 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57781 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57785 },
@@ -347,6 +349,7 @@ static const struct bge_revision bge_majorrevs[] =
     { BGE_ASICREV_BCM5787, "unknown BCM5754/5787" },
     { BGE_ASICREV_BCM5906, "unknown BCM5906" },
     { BGE_ASICREV_BCM57765, "unknown BCM57765" },
+    { BGE_ASICREV_BCM57766, "unknown BCM57766" },
     { BGE_ASICREV_BCM57780, "unknown BCM57780" },
     { BGE_ASICREV_BCM5717, "unknown BCM5717" },
     { BGE_ASICREV_BCM5719, "unknown BCM5719" },
@@ -362,6 +365,7 @@ static const struct bge_revision
Re: bge on the new Mac Mini
On Fri, Nov 16, 2012 at 10:30:04AM -0500, Richard Kuhns wrote:
> Hi all,
>
> Over the last month or so I've installed FreeBSD 9 (-stable) on several
> Mac Minis via the memstick image; they seem to be pretty good little
> boxes for things like offsite secondary nameservers, for example, and
> they're easily replaced in case of problems.
>
> However, the newest minis have slightly different hardware, and FreeBSD
> can't find the built-in NIC. pciconf -lv on the new mini shows it as
>
> none3@pci0:1:0:0: class=0x02 card=0x168614e4 chip=0x168614e4 rev=0x01

It seems this controller is BCM57766.

> hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     class      = network
>     subclass   = ethernet
>
> The previous edition mini (that works) reports
>
> bge0@pci0:2:0:0: class=0x02 card=0x16b414e4 chip=0x16b414e4 rev=0x10 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     device     = 'NetXtreme BCM57765 Gigabit Ethernet PCIe'
>     class      = network
>     subclass   = ethernet
>
> Is there a chance that adding the new card/chip info to the current
> driver would allow it to work? I'll be happy to test and report back.
> I'm afraid I'm not familiar enough with hardware at that level to
> figure out the patch myself.

Try attached patch and let me know whether the patch works or not.
If the patch works please share dmesg output(bge(4) and brgphy(4)
output only).
Note, the patch was generated against CURRENT.
Index: sys/dev/bge/if_bgereg.h
===================================================================
--- sys/dev/bge/if_bgereg.h (revision 243255)
+++ sys/dev/bge/if_bgereg.h (working copy)
@@ -360,6 +360,7 @@
 #define BGE_ASICREV_BCM5784 0x5784
 #define BGE_ASICREV_BCM5785 0x5785
 #define BGE_ASICREV_BCM57765 0x57785
+#define BGE_ASICREV_BCM57766 0x57766
 #define BGE_ASICREV_BCM57780 0x57780
 
 /* chip revisions */
@@ -2484,6 +2485,7 @@ struct bge_status_block {
 #define BCOM_DEVICEID_BCM57760 0x1690
 #define BCOM_DEVICEID_BCM57761 0x16B0
 #define BCOM_DEVICEID_BCM57765 0x16B4
+#define BCOM_DEVICEID_BCM57766 0x1682
 #define BCOM_DEVICEID_BCM57780 0x1692
 #define BCOM_DEVICEID_BCM57781 0x16B1
 #define BCOM_DEVICEID_BCM57785 0x16B5
@@ -2961,6 +2963,7 @@ struct bge_softc {
 #define BGE_FLAG_5755_PLUS 0x0010
 #define BGE_FLAG_5788 0x0020
 #define BGE_FLAG_5717_PLUS 0x0040
+#define BGE_FLAG_57765_PLUS 0x0080
 #define BGE_FLAG_40BIT_BUG 0x0100
 #define BGE_FLAG_4G_BNDRY_BUG 0x0200
 #define BGE_FLAG_RX_ALIGNBUG 0x0400
Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c (revision 243255)
+++ sys/dev/bge/if_bge.c (working copy)
@@ -217,6 +217,7 @@ static const struct bge_type {
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57760 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57761 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57765 },
+    { BCOM_VENDORID, BCOM_DEVICEID_BCM57766 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57780 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57781 },
     { BCOM_VENDORID, BCOM_DEVICEID_BCM57785 },
@@ -362,6 +363,7 @@ static const struct bge_revision bge_majorrevs[] =
 #define BGE_IS_575X_PLUS(sc) ((sc)->bge_flags & BGE_FLAG_575X_PLUS)
 #define BGE_IS_5755_PLUS(sc) ((sc)->bge_flags & BGE_FLAG_5755_PLUS)
 #define BGE_IS_5717_PLUS(sc) ((sc)->bge_flags & BGE_FLAG_5717_PLUS)
+#define BGE_IS_57765_PLUS(sc) ((sc)->bge_flags & BGE_FLAG_57765_PLUS)
 
 const struct bge_revision * bge_lookup_rev(uint32_t);
 const struct bge_vendor * bge_lookup_vendor(uint16_t);
@@ -2243,7 +2245,7 @@ bge_blockinit(struct bge_softc *sc)
     } else if (!BGE_IS_5705_PLUS(sc))
         limit = BGE_RX_RINGS_MAX;
     else if (sc->bge_asicrev == BGE_ASICREV_BCM5755 ||
-        sc->bge_asicrev == BGE_ASICREV_BCM57765)
+        BGE_IS_57765_PLUS(sc))
         limit = 4;
     else
         limit = 1;
@@ -2658,6 +2660,7 @@ bge_probe(device_t dev)
             break;
         case BCOM_DEVICEID_BCM57761:
         case BCOM_DEVICEID_BCM57765:
+case BCOM_DEVICEID_BCM57766:
         case BCOM_DEVICEID_BCM57781:
         case BCOM_DEVICEID_BCM57785:
         case BCOM_DEVICEID_BCM57791:
@@ -3321,10 +3324,13 @@ bge_attach(device_t dev)
 
     /* Save chipset family. */
     switch (sc->bge_asicrev) {
+    case BGE_ASICREV_BCM57765:
+    case BGE_ASICREV_BCM57766:
+        sc->bge_flags |= BGE_FLAG_57765_PLUS;
+        /* FALLTHROUGH */
     case BGE_ASICREV_BCM5717:
     case BGE_ASICREV_BCM5719:
     case BGE_ASICREV_BCM5720:
-    case BGE_ASICREV_BCM57765:
         sc->bge_flags |= BGE_FLAG_5717_PLUS | BGE_FLAG_5755_PLUS |
             BGE_FLAG_575X_PLUS | BGE_FLAG_5705_PLUS | BGE_FLAG_JUMBO |
             BGE_FLAG_JUMBO_FRAME;
@@ -3738,12 +3744,9 @@ bge_attach(device_t dev)
         sc->bge_phy_flags |= BGE_PHY_NO_3LED;
     if ((BGE_IS_5705_PLUS(sc)) &&
         sc->bge_asicrev != BGE_ASICREV_BCM5906 &&
-        sc->bge_asicrev != BGE_ASICREV_BCM5717 &&
-        sc->bge_asicrev != BGE_ASICREV_BCM5719 &&
-        sc->bge_asicrev != BGE_ASICREV_BCM5720 &&
         sc->bge_asicrev != BGE_ASICREV_BCM5785 &&
-        sc->bge_asicrev != BGE_ASICREV_BCM57765 &&
-        sc->bge_asicrev != BGE_ASICREV_BCM57780) {
+
Re: bge problems in RELENG_9, bge0: watchdog timeout -- resetting
On Thu, Aug 23, 2012 at 06:15:05PM +0200, Anders Nordby wrote:
> Hi,
>
> On Wed, Jul 04, 2012 at 06:01:36pm -0700, YongHyeon PYUN wrote:
> > There is a WIP version at the following URL.
> > http://people.freebsd.org/~yongari/bge/if_bge.c
> > http://people.freebsd.org/~yongari/bge/if_bgereg.h
> > http://people.freebsd.org/~yongari/bge/brgphy.c
> >
> > I have a couple of positive feedbacks but it seems it still has
> > some issues. Let me know whether it makes any difference on your
> > box.
>
> I tried these bge source files in 9.1-PRERELEASE this week, and it does
> not help. If I try to log in with SSH I get:
>
> Aug 23 17:30:32 login: ROOT LOGIN (root) ON ttyu0
> bge0: watchdog timeout -- resetting
> Aug 23 17:31:31 kernel: bge0: watchdog timeout -- resetting
> Aug 23 17:31:31 kernel: bge0: link state changed to DOWN
> Aug 23 17:31:35 kernel: bge0: link state changed to UP
> bge0: watchdog timeout -- resetting
> Aug 23 17:33:24 kernel: bge0: watchdog timeout -- resetting
> Aug 23 17:33:24 kernel: bge0: link state changed to DOWN
> Aug 23 17:33:28 kernel: bge0: link state changed to UP
>
> I tried setting hw.bge.allow_asf to 0, but it did not help.

The loader tunable has no effect for controllers with
APE (Application Processor Engine).

> During boot I get:
>
> pcib3: <ACPI PCI-PCI bridge> at device 2.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> pci0:3:0:0: failed to read VPD data.
> bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bf-0xf6bf0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 at device 0.0 on pci3
> bge0: APE FW version: NCSI v1.0.80.0

It seems your APE runs slightly newer NC-SI firmware. I was able
to reproduce watchdog timeouts on Dell R820 but I'm not sure
you're also seeing the same issue here.
For an unknown reason, programming the RX MTU register seems to have
no effect with BCM5720 on R820. Receiving frames larger than 175(?)
bytes seems to hang the controller on R820. The current workaround for
the issue is to set the MTU of the sender (i.e. link partner or switch)
to some low value, 128 for example. That would show poor performance
but should make your controller work.
I have asked Broadcom for help and am waiting for answers or hints
from them.

> bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5719C 1000BASE-T media interface> PHY 1 on miibus0
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge0: Ethernet address: 2c:76:8a:54:08:14
> pci0:3:0:1: failed to read VPD data.
> bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bc-0xf6bc0xf6bb-0xf6bb,0xf6ba-0xf6ba irq 36 at device 0.1 on pci3
> bge1: APE FW version: NCSI v1.0.80.0
> bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus1: <MII bus> on bge1
> brgphy1: <BCM5719C 1000BASE-T media interface> PHY 2 on miibus1
> brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge1: Ethernet address: 2c:76:8a:54:08:15
> pci0:3:0:2: failed to read VPD data.
> bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b9-0xf6b90xf6b8-0xf6b8,0xf6b7-0xf6b7 irq 32 at device 0.2 on pci3
> bge2: APE FW version: NCSI v1.0.80.0
> bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus2: <MII bus> on bge2
> brgphy2: <BCM5719C 1000BASE-T media interface> PHY 3 on miibus2
> brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge2: Ethernet address: 2c:76:8a:54:08:16
> pci0:3:0:3: failed to read VPD data.
> bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b6-0xf6b60xf6b5-0xf6b5,0xf6b4-0xf6b4 irq 36 at device 0.3 on pci3
> bge3: APE FW version: NCSI v1.0.80.0
> bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus3: <MII bus> on bge3
> brgphy3: <BCM5719C 1000BASE-T media interface> PHY 4 on miibus3
> brgphy3: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge3: Ethernet address: 2c:76:8a:54:08:17
>
> Regards,
>
> --
> Anders.
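The three numbers on the "CHIP ID ...; ASIC REV ...; CHIP REV ..." line are related by simple shifts, which makes the line a quick sanity check on whether the driver read the chip ID correctly. The sketch below is inferred only from the log lines quoted in this archive; the macro and function names are illustrative and are not claimed to match if_bgereg.h.

```c
#include <stdint.h>

/*
 * Shifts inferred from the dmesg lines in this archive (assumption):
 * ASIC REV is the chip ID without its low 12 bits, CHIP REV is the
 * chip ID without its low 8 bits.
 */
#define CHIPID_TO_ASICREV(id)   ((uint32_t)(id) >> 12)
#define CHIPID_TO_CHIPREV(id)   ((uint32_t)(id) >> 8)

/* Return nonzero when the decoded ASIC revision equals the expected one. */
static int
asicrev_matches(uint32_t chipid, uint32_t expected_asicrev)
{
    return (CHIPID_TO_ASICREV(chipid) == expected_asicrev);
}
```

With the BCM5719 above, 0x05719001 decodes to ASIC REV 0x5719 and CHIP REV 0x57190, as logged. The same arithmetic on the Mac mini log from the other thread (CHIP ID 0x10110142) gives 0x10110, which does not match the 0x57766 that the patch defines for BCM57766; that mismatch is consistent with the remark there that the printed values are garbage.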
Re: pf nat fails on msk0 from packets deriving from a jail interface
On Wed, Aug 08, 2012 at 02:33:25PM +0300, George Mamalakis wrote:
> Hi all,
>
> Suddenly I am facing a problem on a new PC, using a configuration that
> I have been using on more than 10 servers for the last few years. The
> only thing that I find that differs from my other configurations is
> the NIC of the PC. If not, I must be missing something very trivial.
>
> I have built a jail on this PC, following the handbook's guidelines
> (section: application of jails). The PC has one NIC, msk0, where I run
> pf on (built on my kernel; I have already tried using the module). My
> pf.conf is as simple as possible:
>
> # cat /etc/pf.conf
> nat on msk0 from any to any -> 10.0.3.6
> pass quick all
>
> When I jexec inside the jail, and pf is running, I am unable to reach
> any machine except my jail (not even the host). If pf is off, the
> network works just fine (of course my router knows where to find my
> jail's subnet). What is strange is that if I tcpdump on msk0, then a
> few seconds after I request something from within the jail, I see the
> packets going and coming on msk0 using the correct IP (the NAT IP),
> but it seems that the machine fails to route them back inside the
> jail.

I guess this is the same issue reported in kern/170081.
Some msk(4) controllers lack full hardware checksum offloading
capability, so the pseudo checksum has to be computed by the upper
layer. It seems pf(4) NAT was broken for controllers that lack
pseudo checksumming. This indicates the following ethernet
controllers do not work with pf(4) NAT:
sk(4), msk(4), fxp(4), hme(4) and gem(4)

Try disabling RX checksum offloading as a work-around.
# ifconfig msk0 -rxcsum
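As background for the term "pseudo checksum": TCP and UDP checksums cover a pseudo-header (source address, destination address, protocol, segment length) in addition to the segment itself, using the ones' complement sum described in RFC 1071. The sketch below illustrates that arithmetic only. It is not the pf(4) or msk(4) code, and every function name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Add the 16-bit big-endian words of buf into a running sum. */
static uint32_t
sum_words(const uint8_t *buf, size_t len, uint32_t sum)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;    /* pad odd byte with zero */
    return (sum);
}

/* Fold the carries back into the low 16 bits (ones' complement add). */
static uint16_t
fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return ((uint16_t)sum);
}

/* Ones' complement sum of the IPv4 pseudo-header, not yet inverted. */
static uint16_t
pseudo_sum(const uint8_t src[4], const uint8_t dst[4], uint8_t proto,
    uint16_t l4len)
{
    uint32_t sum = 0;

    sum = sum_words(src, 4, sum);
    sum = sum_words(dst, 4, sum);
    sum += proto;
    sum += l4len;
    return (fold(sum));
}

/* Full TCP/UDP checksum: pseudo-header plus segment, then inverted. */
static uint16_t
l4_checksum(const uint8_t src[4], const uint8_t dst[4], uint8_t proto,
    const uint8_t *seg, size_t len)
{
    uint32_t sum = pseudo_sum(src, dst, proto, (uint16_t)len);

    return ((uint16_t)~fold(sum_words(seg, len, sum)));
}
```

Because NAT rewrites an address that is part of the pseudo-header, the pseudo-header contribution changes with every translated packet, which is why a controller that does not handle the pseudo-header itself depends on the software path getting this right.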
Re: bge problems in RELENG_9, bge0: watchdog timeout -- resetting
On Mon, Jul 09, 2012 at 10:34:21AM -0700, Sean Bruno wrote:
> On Wed, 2012-07-04 at 18:01 -0700, YongHyeon PYUN wrote:
> > There is a WIP version at the following URL.
> > http://people.freebsd.org/~yongari/bge/if_bge.c
> > http://people.freebsd.org/~yongari/bge/if_bgereg.h
> > http://people.freebsd.org/~yongari/bge/brgphy.c
> >
> > I have a couple of positive feedbacks but it seems it still has
> > some issues. Let me know whether it makes any difference on your
> > box.
>
> I grabbed these updates and applied them cleanly to stable/9 on a Dell
> R620 with a quad port BCM5720, I still see watchdog timeouts and reset
> indications. I am able to ping out of the box for a short amount of
> time before the device hangs and times out.

Sean, sorry for the late reply.
Given that I have no problems on my sample 5720 controller, I still
have no clue.

> -bash-4.2# ping XXX.XXX.XXX.1
> PING XXX.XXX.XXX.1 (XXX.XXX.XXX.XXX): 56 data bytes
> ping: sendto: Network is down
> ping: sendto: Network is down
> ping: sendto: Network is down
> ping: sendto: Network is down
> ping: sendto: Network is down
> Jul 9 17:31:41 kern.crit x89 kernel: bge2: watchdog timeout -- resetting
> Jul 9 17:31:41 kern.notice x89 kernel: bge2: link state changed to DOWN
> Jul 9 17:31:41 kern.notice x89 kernel: bge2: link state changed to DOWN

Two link state change messages indicate there is an issue in
state tracking. I'm experimenting with a different approach but it
is taking too long due to lack of time.
Anyway, I've uploaded an updated bge(4) (the URL is the same as
before).

> ping: sendto: No route to host
> ping: sendto: No route to host
> ping: sendto: No route to host
> ping: sendto: No route to host
> 64 bytes from XXX.XXX.XXX.1: icmp_seq=9 ttl=64 time=1.408 ms
> Jul 9 17:31:45 kern.notice x89 kernel: bge2: link state changed to UP
> Jul 9 17:31:45 kern.notice x89 kernel: bge2: link state changed to UP

[...]
Re: bge problems in RELENG_9, bge0: watchdog timeout -- resetting
On Tue, Jul 03, 2012 at 08:57:04PM +0200, Anders Nordby wrote:
> Hi,
>
> I'm having lots of difficulties with BCM5719, which is the default
> network card of HP Proliant DL 360 G8 servers. I can get a few ping
> replies before I get a couple of these:
>
> bge0: watchdog timeout -- resetting
> bge0: watchdog timeout -- resetting
>
> Then everything hangs. Can not log in using ssh. I'm running:
>
> FreeBSD-9.0-RELENG_9-20120701-JPSNAP-amd64
>
> Info about the NIC:
>
> # devinfo -rv | grep phy
> brgphy0 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=1
> brgphy1 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=2
> brgphy2 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=3
> brgphy3 pnpinfo oui=0x1be9 model=0x22 rev=0x0 at phyno=4
> # grep bge /var/run/dmesg.boot
> bge0: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bf-0xf6bf, 0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 at device 0.0 on pci3
> bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus0: <MII bus> on bge0
> bge0: Ethernet address: 2c:76:8a:54:08:14
> bge1: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6bc-0xf6bc, 0xf6bb-0xf6bb,0xf6ba-0xf6ba irq 36 at device 0.1 on pci3
> bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus1: <MII bus> on bge1
> bge1: Ethernet address: 2c:76:8a:54:08:15
> bge2: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b9-0xf6b9, 0xf6b8-0xf6b8,0xf6b7-0xf6b7 irq 32 at device 0.2 on pci3
> bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus2: <MII bus> on bge2
> bge2: Ethernet address: 2c:76:8a:54:08:16
> bge3: <Broadcom unknown BCM5719, ASIC rev. 0x5719001> mem 0xf6b6-0xf6b6, 0xf6b5-0xf6b5,0xf6b4-0xf6b4 irq 36 at device 0.3 on pci3
> bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E
> miibus3: <MII bus> on bge3
> bge3: Ethernet address: 2c:76:8a:54:08:17
>
> Searching other bug reports and posts, I've tried:
>
> hw.bge.allow_asf=0
> hw.pci.enable_msi=0
>
> But it didn't help. Any ideas?
>
> If I don't use the loader.conf settings above, I also get (before the
> watchdog timeouts):
>
> bge0: 2 link states coalesced
> bge0: 2 link states coalesced
> bge0: 2 link states coalesced

There is a WIP version at the following URL.
http://people.freebsd.org/~yongari/bge/if_bge.c
http://people.freebsd.org/~yongari/bge/if_bgereg.h
http://people.freebsd.org/~yongari/bge/brgphy.c

I have a couple of positive feedbacks but it seems it still has
some issues. Let me know whether it makes any difference on your
box.

> Best regards,
>
> --
> Anders.
Re: Network unavailable when booting directly to FreeBSD.
On Mon, Jun 25, 2012 at 02:22:54PM -0700, Pedro Giffuni wrote:
> Hi again;
>
> --- Sun 24/6/12, Pedro Giffuni p...@freebsd.org wrote:
> ...
> > --- Mon 25/6/12, YongHyeon PYUN pyu...@gmail.com wrote:
> ...
> > > Could you narrow down which commit broke bge(4)?
> >
> > Sean Bruno suggested it may be r233495, but I haven't
> > found the time to revert it. I will let you know tomorrow.
>
> Reverting only r233495 didn't fix it either. This will

Because your controller is not BCM5704, r233495 should have no
effects.

> take some time :(.

Ok, if you happen to find guilty commit let me know.

> Pedro.
Re: Network unavailable when booting directly to FreeBSD.
On Sat, Jun 23, 2012 at 01:04:44PM -0700, Pedro Giffuni wrote:
> Hello;
>
> [...]
> > > Iff I boot Windows first and then reboot to start FreeBSD
> > > the network works fine.
> >
> > This looks strange and I can't narrow down what other changes
> > made since 9.0-RELEASE broke the driver.
> > Would you try reverting r235821?
>
> I reverted it manually but things didn't change.
>
> > If that does not solve the issue, would you try a WIP version at
> > the following URL? It's mainly written to improve BCM5720 with
> > APE firmware support and it exactly follows recommendations
> > suggested by Broadcom so it may have some differences.
> >
> > http://people.freebsd.org/~yongari/bge/if_bge.c
> > http://people.freebsd.org/~yongari/bge/if_bgereg.h
> > http://people.freebsd.org/~yongari/bge/brgphy.c
>
> No joy either :(

Could you narrow down which commit broke bge(4)?

> Pedro.
Re: Network unavailable when booting directly to FreeBSD.
On Thu, Jun 21, 2012 at 02:05:33PM -0500, Pedro Giffuni wrote:
> Hello;
>
> I noticed a regression from 9.0 and I cannot boot directly FreeBSD and
> access the network. Unfortunately I cannot recall the exact commit
> where this started happening.
>
> uname -a
> FreeBSD pcbsd-8555 9.0-STABLE FreeBSD 9.0-STABLE #12: Wed May 30 11:16:35 PDT 2012 r...@build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC amd64
>
> From my dmesg
> __
> ...
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pcib0: Length mismatch for 4 range: 81 vs 7f
> pci0: <ACPI PCI bus> on pcib0
> pci0: <memory, RAM> at device 0.0 (no driver attached)
> pci0: <memory, RAM> at device 0.1 (no driver attached)
> pci0: <memory, RAM> at device 0.2 (no driver attached)
> pci0: <memory, RAM> at device 0.3 (no driver attached)
> pci0: <memory, RAM> at device 0.4 (no driver attached)
> pci0: <memory, RAM> at device 0.5 (no driver attached)
> pci0: <memory, RAM> at device 0.6 (no driver attached)
> pci0: <memory, RAM> at device 0.7 (no driver attached)
> pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
> pci1: <ACPI PCI bus> on pcib1
> pcib2: <ACPI PCI-PCI bridge> at device 3.0 on pci0
> pci2: <ACPI PCI bus> on pcib2
> bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev. 0x00b002> mem 0xfdef-0xfdef irq 16 at device 0.0 on pci2
> bge0: CHIP ID 0xb002; ASIC REV 0x0b; CHIP REV 0xb0; PCI-E
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5754/5787 1000BASE-T media interface> PHY 1 on miibus0
> brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> bge0: Ethernet address: 00:18:8b:76:a4:1e
> pcib3: <ACPI PCI-PCI bridge> at device 4.0 on pci0
> pci3: <ACPI PCI bus> on pcib3
> Cuse4BSD v0.1.23 @ /dev/cuse
> bge0: watchdog timeout -- resetting
> bge0: watchdog timeout -- resetting
> WARNING: attempt to domain_add(bluetooth) after domainfinalize()
> fuse4bsd: version 0.3.9-pre1, FUSE ABI 7.8
> bge0: watchdog timeout -- resetting
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> bge0: watchdog timeout -- resetting
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> bge0: watchdog timeout -- resetting
> bge0: link state changed to DOWN
> bge0: link state changed to UP
> _
>
> Iff I boot Windows first and then reboot to start FreeBSD the network
> works fine.

This looks strange and I can't narrow down what other changes
made since 9.0-RELEASE broke the driver.
Would you try reverting r235821?
If that does not solve the issue, would you try a WIP version at
the following URL? It's mainly written to improve BCM5720 with
APE firmware support and it exactly follows recommendations
suggested by Broadcom so it may have some differences.

http://people.freebsd.org/~yongari/bge/if_bge.c
http://people.freebsd.org/~yongari/bge/if_bgereg.h
http://people.freebsd.org/~yongari/bge/brgphy.c

> Pedro.
Re: msk0: interrupt storm
On Wed, Apr 25, 2012 at 09:35:28AM -0400, John Baldwin wrote:
> On Tuesday, April 24, 2012 3:07:14 pm John Baldwin wrote:
> > On Tuesday, April 24, 2012 4:07:19 pm YongHyeon PYUN wrote:
> > > On Mon, Apr 23, 2012 at 10:24:41AM -0400, John Baldwin wrote:
> > > > On Wednesday, March 07, 2012 3:40:53 pm YongHyeon PYUN wrote:
> > > > > On Tue, Mar 06, 2012 at 10:36:05AM -0500, John Baldwin wrote:
> > > > > > On Thursday, March 01, 2012 8:29:55 pm YongHyeon PYUN wrote:
> > > > > > > On Wed, Feb 29, 2012 at 01:03:29AM +0400, Pavel Gorshkov wrote:
> > > > > > > > My laptop running 9.0-RELEASE/amd64/GENERIC freezes and
> > > > > > > > (sometimes) unfreezes intermittently, logging the
> > > > > > > > following:
> > > > > > > >
> > > > > > > > Feb 28 23:07:36 lifebook kernel: interrupt storm detected on "irq259:"; throttling interrupt source
> > > > > > > >
> > > > > > > > $ vmstat -i
> > > > > > > > ...
> > > > > > > > irq259: mskc0 11669511 3456
> > > > > > > >
> > > > > > > > Looks very similar to this:
> > > > > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=164569
> > > > > > > >
> > > > > > > > Any suggestions?
> > > > > > >
> > > > > > > Try disabling MSI and see whether that makes any difference.
> > > > > >
> > > > > > I also get interrupt storms with msk. They do fix themselves
> > > > > > when they happen, and I've seen it happen when the machine is
> > > > > > idle. This is on my little netbook where msk had several
> > > > > > problems initially that have since been fixed.
> > > > > >
> > > > > > mskc0: <Marvell Yukon 88E8072 Gigabit Ethernet> port 0x2000-0x20ff mem 0xe000-0xe0003fff irq 19 at device 0.0 on pci32
> > > > > > msk0: <Marvell Technology Group Ltd. Yukon EX Id 0xb5 Rev 0x02> on mskc0
> > > > > > msk0: Ethernet address: 00:24:81:40:e3:ef
> > > > > > miibus0: <MII bus> on msk0
> > > > > > e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
> > > > > > e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > > > > >
> > > > > > mskc0@pci0:32:0:0: class=0x02 card=0x3056103c chip=0x436c11ab rev=0x10 hdr=0x00
> > > > > >     vendor     = 'Marvell Technology Group Ltd.'
> > > > > >     device     = '88E8072 PCI-E Gigabit Ethernet Controller'
> > > > > >     class      = network
> > > > > >     subclass   = ethernet
> > > > >
> > > > > John, can you let me know the value of B0_Y2_SP_ISRC2 register
> > > > > in interrupt handler when you see the interrupt storm?
> > > >
> > > > I finally tested this. I added some KTR traces to dump ISRC2 on
> > > > each call to msk_intr() and hacked the interrupt thread code to
> > > > turn KTR tracing off when a storm occurred. The traces look like
> > > > this:
> > > >
> > > > index cpu timestamp        trace
> > > > ----- --- ---------------- -----
> > > >   148   0 111662766108828  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   147   0 111662765994576  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   146   0 111662765380260  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   145   0 111662765257308  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   144   0 111662765134356  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   143   0 111662765011560  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   142   0 111662764888656  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   141   0 111662764773924  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   140   0 111662764659360  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   139   0 111662764528140  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   138   0 111662764413576  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > >   137   0 111662764287852  msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> > > > ...
> > > >
> > > > (All traces have the same register value.)
> > > >
> > > > The TSC on this netbook runs at
> > > >
> > > > machdep.tsc_freq: 1596035244
> > > >
> > > > (The timestamps above are TSC values.)
> > > >
> > > > Let me know if you'd like me to log more stuff in the driver.
> > >
> > > Thanks! I wonder why the device gets a TWSI completion interrupt
> > > since the driver does not monitor the temperature sensor. In
> > > addition, the interrupt was already disabled so I have no idea how
> > > this can happen.
> > > Here, I assume your controller implements the optional temperature
> > > sensor and it is monitored by H/W.
> > > Anyway, try the attached patch and let me know whether it makes any
> > > difference.
> >
> > It does fix the interrupt storms. I added a debugging printf to fire
> > each time msk_intr() sees this bit to see if it storms, etc. What I
> > see is that each time I would previously get a single printf
> > reporting an interrupt storm, I now get a single printf reporting
> > that the TWSI_RDY bit was set.
>
> Sadly, I spoke too soon. With this patch applied I got another storm
> last night where this bit was not set during the storm:
>
> index cpu timestamp        trace
> ----- --- ---------------- -----
>    71   0 36775451301480   msk_intr: B0_Y2_SP_ISRC2 = 0x4000
>    70   0 36775450145436   msk_intr: B0_Y2_SP_ISRC2 = 0x4000
>    69   0 36775449956940   msk_intr: B0_Y2_SP_ISRC2 = 0x4000
>    68   0 36775449768564   msk_intr: B0_Y2_SP_ISRC2 = 0x4000
>    67   0 36775448604912   msk_intr: B0_Y2_SP_ISRC2
Re: msk0: interrupt storm
On Mon, Apr 23, 2012 at 10:24:41AM -0400, John Baldwin wrote:
> On Wednesday, March 07, 2012 3:40:53 pm YongHyeon PYUN wrote:
> > On Tue, Mar 06, 2012 at 10:36:05AM -0500, John Baldwin wrote:
> > > On Thursday, March 01, 2012 8:29:55 pm YongHyeon PYUN wrote:
> > > > On Wed, Feb 29, 2012 at 01:03:29AM +0400, Pavel Gorshkov wrote:
> > > > > My laptop running 9.0-RELEASE/amd64/GENERIC freezes and
> > > > > (sometimes) unfreezes intermittently, logging the following:
> > > > >
> > > > > Feb 28 23:07:36 lifebook kernel: interrupt storm detected on irq259:; throttling interrupt source
> > > > >
> > > > > $ vmstat -i
> > > > > ...
> > > > > irq259: mskc0 11669511 3456
> > > > >
> > > > > Looks very similar to this:
> > > > > http://www.freebsd.org/cgi/query-pr.cgi?pr=164569
> > > > >
> > > > > Any suggestions?
> > > >
> > > > Try disabling MSI and see whether that makes any difference.
> > >
> > > I also get interrupt storms with msk. They do fix themselves when
> > > they happen, and I've seen it happen when the machine is idle. This
> > > is on my little netbook where msk had several problems initially
> > > that have since been fixed.
> > >
> > > mskc0: Marvell Yukon 88E8072 Gigabit Ethernet port 0x2000-0x20ff mem 0xe000-0xe0003fff irq 19 at device 0.0 on pci32
> > > msk0: Marvell Technology Group Ltd. Yukon EX Id 0xb5 Rev 0x02 on mskc0
> > > msk0: Ethernet address: 00:24:81:40:e3:ef
> > > miibus0: MII bus on msk0
> > > e1000phy0: Marvell 88E1149 Gigabit PHY PHY 0 on miibus0
> > > e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
> > >
> > > mskc0@pci0:32:0:0: class=0x02 card=0x3056103c chip=0x436c11ab rev=0x10 hdr=0x00
> > >     vendor   = 'Marvell Technology Group Ltd.'
> > >     device   = '88E8072 PCI-E Gigabit Ethernet Controller'
> > >     class    = network
> > >     subclass = ethernet
> >
> > John, can you let me know the value of B0_Y2_SP_ISRC2 register in
> > interrupt handler when you see the interrupt storm?
>
> I finally tested this. I added some KTR traces to dump ISRC2 on each
> call to msk_intr() and hacked the interrupt thread code to turn KTR
> tracing off when a storm occurred. The traces look like this:
>
> index cpu timestamp       trace
> ----- --- --------------- -----
>   148   0 111662766108828 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   147   0 111662765994576 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   146   0 111662765380260 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   145   0 111662765257308 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   144   0 111662765134356 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   143   0 111662765011560 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   142   0 111662764888656 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   141   0 111662764773924 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   140   0 111662764659360 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   139   0 111662764528140 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   138   0 111662764413576 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
>   137   0 111662764287852 msk_intr: B0_Y2_SP_ISRC2 = 0x4400
> ...
>
> (All traces have the same register value.)
>
> The TSC on this netbook runs at
>
> machdep.tsc_freq: 1596035244
>
> (The timestamps above are TSC values.)
>
> Let me know if you'd like me to log more stuff in the driver.

Thanks!
I wonder why the device gets a TWSI completion interrupt, since the driver does not monitor the temperature sensor. In addition, the interrupt was already disabled, so I have no idea how this can happen. Here, I assume your controller implements the optional temperature sensor and that it is monitored by H/W.

Anyway, try the attached patch and let me know whether it makes any difference.

Index: sys/dev/msk/if_msk.c
===================================================================
--- sys/dev/msk/if_msk.c        (revision 234591)
+++ sys/dev/msk/if_msk.c        (working copy)
@@ -3734,6 +3734,9 @@
         if ((status & Y2_IS_STAT_BMU) != 0 && domore == 0)
                 CSR_WRITE_4(sc, STAT_CTRL, SC_STAT_CLR_IRQ);
 
+        /* Clear TWSI IRQ. */
+        if ((status & Y2_IS_TWSI_RDY) != 0)
+                CSR_WRITE_4(sc, B2_I2C_IRQ, 1);
         /* Reenable interrupts. */
         CSR_WRITE_4(sc, B0_Y2_SP_ICR, 2);
 
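Not part of the original thread: since the trace timestamps are raw TSC values and the message quotes machdep.tsc_freq, the gap between two msk_intr() calls can be converted to wall-clock time. The sketch below is only an illustration; the helper name tsc_delta_usec and the TSC_FREQ_HZ macro are made up here, and it assumes the TSC ticks at a constant rate equal to the quoted frequency.

```c
#include <assert.h>
#include <stdint.h>

/* TSC frequency quoted in the message (machdep.tsc_freq), in Hz. */
#define TSC_FREQ_HZ 1596035244ULL

/*
 * Hypothetical helper: convert the gap between two TSC readings into
 * microseconds. "later" must not be smaller than "earlier".
 */
static double
tsc_delta_usec(uint64_t earlier, uint64_t later, uint64_t tsc_hz)
{
	assert(tsc_hz > 0 && later >= earlier);
	return ((double)(later - earlier) * 1e6 / (double)tsc_hz);
}
```

Calling tsc_delta_usec(111662765994576ULL, 111662766108828ULL, TSC_FREQ_HZ) for trace entries 147 and 148 gives a gap of roughly 72 microseconds between handler invocations.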
Re: fxp entering promiscuous mode causing link to bounce
On Sat, Mar 17, 2012 at 09:53:31PM -0400, Mike Tancsa wrote:
> On 3/17/2012 6:58 PM, YongHyeon PYUN wrote:
> > On Fri, Mar 16, 2012 at 04:49:54PM -0400, Mike Tancsa wrote:
> > > tcpdump -ni fxp0 -c 20
> > >
> > > fxp0: link state changed to DOWN
> > > fxp0: promiscuous mode enabled
> > > fxp0: link state changed to UP
> > > fxp0: link state changed to DOWN
> > > fxp0: promiscuous mode disabled
> > > fxp0: link state changed to UP
> > >
> > > I verified it on 2 different boxes. Is there a way to prevent this
> > > from happening?
> >
> > It looks like a regression introduced in flow control support.
>
> Thanks very much, that indeed did fix it!!
>
> 0(smtp1)# patch < fxp.p
> Hmm... Looks like a unified diff to me...
> The text leading up to this was:
> --------------------------
> |Index: sys/dev/fxp/if_fxp.c
> |===================================================================
> |--- sys/dev/fxp/if_fxp.c (revision 233076)
> |+++ sys/dev/fxp/if_fxp.c (working copy)
> --------------------------
> Patching file sys/dev/fxp/if_fxp.c using Plan A...
> Hunk #1 succeeded at 900 (offset -2 lines).
> Hunk #2 succeeded at 2808 (offset -2 lines).
> Hunk #3 succeeded at 2914 (offset -2 lines).
>
> fxp0: promiscuous mode enabled
> fxp0: promiscuous mode disabled
>
> ... and not bounced link/dropped packets.

Thanks for testing. Committed in r233158.
fxp(4) controllers require controller reinitialization for a promiscuous mode change, so it's normal to miss packets during that transition. fxp(4) controllers have many limitations and this is one of them.
Re: fxp entering promiscuous mode causing link to bounce
On Fri, Mar 16, 2012 at 04:49:54PM -0400, Mike Tancsa wrote:
> I don't recall seeing this on RELENG_7, but I don't have a box to test
> with anymore to confirm. On one box I upgraded to RELENG_8 I just
> noticed the NIC will bounce if I enable tcpdump on it. Sure enough,
> trying on a different RELENG_8 box with an fxp NIC shows the same
> result, e.g.
>
> tcpdump -ni fxp0 -c 20
>
> fxp0: link state changed to DOWN
> fxp0: promiscuous mode enabled
> fxp0: link state changed to UP
> fxp0: link state changed to DOWN
> fxp0: promiscuous mode disabled
> fxp0: link state changed to UP
>
> I verified it on 2 different boxes. Is there a way to prevent this
> from happening?

It looks like a regression introduced in flow control support. I think stable/7 also has the same code, so you will see the same issue on stable/7. However, if you don't see the issue on stable/7 I can't explain that.

Anyway, try the attached patch and let me know how it works. I found other places which will result in link DOWN/UP, so I changed them to get the previous good behavior.

Index: sys/dev/fxp/if_fxp.c
===================================================================
--- sys/dev/fxp/if_fxp.c        (revision 233076)
+++ sys/dev/fxp/if_fxp.c        (working copy)
@@ -902,7 +902,7 @@
         FXP_LOCK(sc);
         /* Clear wakeup events. */
         CSR_WRITE_1(sc, FXP_CSR_PMDR, CSR_READ_1(sc, FXP_CSR_PMDR));
-        fxp_init_body(sc, 1);
+        fxp_init_body(sc, 0);
         fxp_stop(sc);
         FXP_UNLOCK(sc);
 }
@@ -2810,7 +2810,7 @@
                 if (((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) &&
                     ((ifp->if_flags ^ sc->if_flags) &
                     (IFF_PROMISC | IFF_ALLMULTI | IFF_LINK0)) != 0)
-                        fxp_init_body(sc, 1);
+                        fxp_init_body(sc, 0);
                 else if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
                         fxp_init_body(sc, 1);
         } else {
@@ -2916,7 +2916,7 @@
                         reinit++;
                 }
                 if (reinit > 0 && ifp->if_flags & IFF_UP)
-                        fxp_init_body(sc, 1);
+                        fxp_init_body(sc, 0);
                 FXP_UNLOCK(sc);
                 VLAN_CAPABILITIES(ifp);
                 break;
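Not part of the original thread: the second hunk of the patch decides whether to reinitialize by XOR-ing the new interface flags against the driver's cached copy and masking the result with the flags it cares about. A minimal standalone sketch of that idiom follows. The DEMO_ constants and the function name are illustrative stand-ins, not the kernel's definitions (the real flags live in <net/if.h>).

```c
/* Illustrative flag values; the real definitions are in <net/if.h>. */
#define DEMO_IFF_PROMISC   0x0100u
#define DEMO_IFF_ALLMULTI  0x0200u
#define DEMO_IFF_LINK0     0x1000u

/*
 * Return nonzero when any of the watched receive-filter flags differ
 * between the cached flags and the newly requested flags.
 */
static unsigned
rx_filter_flags_changed(unsigned cached_flags, unsigned new_flags)
{
	const unsigned watched =
	    DEMO_IFF_PROMISC | DEMO_IFF_ALLMULTI | DEMO_IFF_LINK0;

	/* XOR leaves a 1 only in the bit positions that changed. */
	return ((cached_flags ^ new_flags) & watched);
}
```

Toggling only the promiscuous bit, as tcpdump does, makes the function return nonzero, while changes to unwatched bits are ignored.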
Re: bce: Device not configured
On Fri, Mar 16, 2012 at 04:50:22PM +0100, Jan Winter wrote: On 03/16/12 18:32, YongHyeon PYUN wrote: On Thu, Mar 15, 2012 at 03:20:10PM +0100, Jan Winter wrote: On 03/15/12 18:29, YongHyeon PYUN wrote: On Wed, Mar 14, 2012 at 03:34:20PM +0100, Jan Winter wrote: On 03/14/12 19:40, YongHyeon PYUN wrote: On Tue, Mar 13, 2012 at 02:08:46PM +0100, Jan Winter wrote: Hello, on an Dell Blade m610 is not possible to change the network media option: ifconfig bce0 media 100baseTX mediaopt full-duplex up ifconfig: SIOCSIFMEDIA (media): Device not configured Setting the media option to autoselect and connecting the m610 to a 100 MBit switch, I always get no carrier only 1g full-duplex seems to be working. I have tested this on 8.3-prerelease and 9-stable any Ideas? cheers Jan pciconf -lv bce0@pci0:1:0:0:class=0x02 card=0x02871028 chip=0x163a14e4 rev=0x20 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme II BCM5709S Gigabit Ethernet' class = network subclass = ethernet dmesg bce0:Broadcom NetXtreme II BCM5709 1000Base-SX (C0)mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 miibus0:MII buson bce0 brgphy0:BCM5709S 1000/2500baseSX PHYPHY 2 on miibus0 brgphy0: 1000baseSX-FDX, auto bce0: Ethernet address: 00:26:b9:fb:04:0c bce0: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) bce1:Broadcom NetXtreme II BCM5709 1000Base-SX (C0)mem 0xdc00-0xddff irq 48 at device 0.1 on pci1 miibus1:MII buson bce1 brgphy1:BCM5709S 1000/2500baseSX PHYPHY 2 on miibus1 brgphy1: 1000baseSX-FDX, auto bce1: Ethernet address: 00:26:b9:fb:04:0e bce1: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) I'm not sure you're seeing one of long standing remote PHY issue of blade box but would you try the patch at the following URL? 
http://people.freebsd.org/~yongari/bce/bce.rphy.diff After applying the patch, show me the dmesg output(bce(4) and brgphy(4) related ones) and 'ifconfig -m bce0'. Note, the patch was not tested at all(lack of hardware). Hello, thank you very much, for your quick support Now its looking much better ifconfig -m bce0 bce0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=c01bbRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE capabilities=c01bbRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE ether 00:26:b9:fb:04:0c inet 192.168.100.30 netmask 0xff00 broadcast 192.168.100.255 inet6 fe80::226:b9ff:fefb:40c%bce0 prefixlen 64 tentative scopeid 0x1 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (1000baseTfull-duplex) status: active supported media: media autoselect media 1000baseT mediaopt full-duplex media 1000baseT media 100baseTX mediaopt full-duplex media 100baseTX media 10baseT/UTP mediaopt full-duplex media 10baseT/UTP dmesg: . 
bce0:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 bce0: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 256 to local APIC 16 vector 52 bce0: using IRQ 256 for MSI bce0: Remote PHY : TP bce0: bpf attached bce0: Ethernet address: 00:26:b9:fb:04:0c bce0: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|Remote PHY(TP)|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) bce1:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xdc00-0xddff irq 48 at device 0.1 on pci1 bce1: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 257 to local APIC 16 vector 53 bce1: using IRQ 257 for MSI bce1: Remote PHY : TP bce1: bpf attached bce1: Ethernet address: 00:26:b9:fb:04:0e bce1: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|Remote PHY(TP)|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) . I have done a quick test with 100 and 1000 MBit, both working very well. Thanks a lot for testing. This patch was made long time ago but I haven't had chance to commit it due to lack of access to hardware. Because the patch bypasses mii(4) layer and makes it hard to read code, I didn't like the patch but it seems the patch makes bce(4) usable on blade boxes at least. I'll commit the patch next week. Its possible to get a Patch for 8 Stable? I will do MFC to stable/[7-9]. And bce.rphy.diff should
Re: Changes brought to bce(4) disabling ipmi access during boot
On Fri, Mar 16, 2012 at 11:41:51PM +0100, Paul Guyot wrote: Le 16 mars 2012 ? 18:06, YongHyeon PYUN a ?crit : On Thu, Mar 15, 2012 at 09:19:27AM +0100, Paul Guyot wrote: Le 15 mars 2012 ? 18:10, YongHyeon PYUN a ?crit : On Wed, Mar 14, 2012 at 11:44:37PM +0100, Paul Guyot wrote: Hello, Changes brought to bce(4) prevents booting a R410 Dell server with GELI-encrypted root ZFS partition requiring a passphrase, something that was possible with 9-RELEASE. Using a binary search, the bug comes from the following revision: Updating collection src-all/cvs Edit src/sys/dev/bce/if_bce.c Add delta 1.89.2.4 2012.01.09.19.07.14 yongari Edit src/sys/dev/bce/if_bcereg.h Add delta 1.35.2.3 2012.01.09.19.07.14 yongari Shutting down connection to server Could you try attach patch and let me know whether it recovers IPMI functionality? Thank you for your quick patch. Unfortunately, it does not recover IPMI functionality with STABLE@2012.01.09.19.08.00. Hmm, how about this one? It did not work either. So I patched the original (RELEASE) driver to print information about the various conditions newly tested by the STABLE driver in bce_miibus_statchg. The result is the following. The box has two bce interfaces, the one connected is bce0. The loader was configured with boot_verbose. Before the passphrase is entered: bce0: Broadcom NetXtreme II BCM5716 1000Base-T (C0) mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 bce0: attempting to allocate 1 MSI vectors (16 supported) bce0: using IRQ 256 for MSI miibus0: MII bus on bce0 bce0: bpf attached bce0: Ethernet address: 78:2b:cb:18:22:75 bce0: [1998] ifp != NULL bce0: [2000] (ifp-if_drv_flags IFF_DRV_RUNNING) == 0 bce0: [2008] mii != NULL bce0: [2023] (mii-mii_media_status IFM_ACTIVE) != IFM_ACTIVE) bce0: [2026] (mii-mii_media_status IFM_AVALID) == IFM_AVALID) bce0: [2058] Unknown link speed, enabling default GMII interface. bce0: [2082] Disabling RX flow control. bce0: [2095] Disabling TX flow control. 
bce0: ASIC (0x57092008); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.2.3); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.11) bce1: Broadcom NetXtreme II BCM5716 1000Base-T (C0) mem 0xdc00-0xddff irq 48 at device 0.1 on pci1 bce1: attempting to allocate 1 MSI vectors (16 supported) bce1: using IRQ 257 for MSI miibus1: MII bus on bce1 bce1: bpf attached bce1: Ethernet address: 78:2b:cb:18:22:76 bce1: [1998] ifp != NULL bce1: [2000] (ifp-if_drv_flags IFF_DRV_RUNNING) == 0 bce1: [2008] mii != NULL bce1: [2023] (mii-mii_media_status IFM_ACTIVE) != IFM_ACTIVE) bce1: [2026] (mii-mii_media_status IFM_AVALID) == IFM_AVALID) bce1: [2058] Unknown link speed, enabling default GMII interface. bce1: [2082] Disabling RX flow control. bce1: [2095] Disabling TX flow control. bce1: ASIC (0x57092008); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.2.3); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.11) After the passphrase is entered and network is started: bce0: [1998] ifp != NULL bce0: [2000] (ifp-if_drv_flags IFF_DRV_RUNNING) == 0 bce0: [2008] mii != NULL bce0: [2023] (mii-mii_media_status IFM_ACTIVE) != IFM_ACTIVE) bce0: [2026] (mii-mii_media_status IFM_AVALID) == IFM_AVALID) bce0: [2058] Unknown link speed, enabling default GMII interface. bce0: [2082] Disabling RX flow control. bce0: [2095] Disabling TX flow control. bce0: [1998] ifp != NULL bce0: [2002] (ifp-if_drv_flags IFF_DRV_RUNNING) != 0 bce0: [2008] mii != NULL bce0: [2018] (mii-mii_media_status (IFM_ACTIVE | IFM_AVALID)) == (IFM_ACTIVE | IFM_AVALID) bce0: [2053] Enabling GMII interface. bce0: [2082] Disabling RX flow control. bce0: [2095] Disabling TX flow control. bce0: link state changed to UP bce0: Gigabit link up! bce0: Gigabit link up! bce0: Gigabit link up! From what I understand, both new conditions that may return early are true ((ifp-if_drv_flags IFF_DRV_RUNNING) == 0 and later (mii-mii_media_status IFM_ACTIVE) != IFM_ACTIVE), which yields bce_link_up to be FALSE. 
Yet I am confused by the role of actually writing to BCE_EMAC_MODE in order to keep the iDRAC link up, and wether the issue would not come from another part of the change. Because there is no publicly available documentation for ASF/IPMI link handling it's not clear what steps should be taken whenever its link state is changed. Previously bce(4) seems to drive bce_tick regardless of driver running state. New patch attached. Paul Index: sys/dev/bce/if_bce.c === --- sys/dev/bce/if_bce.c(revision 233076) +++ sys/dev/bce/if_bce.c(working copy) @@ -1462,6 +1462,10 @@ * still running. */ bce_pulse(sc); + /* Track ASF/IPMI link state change. */ + sc-bce_link_tick = TRUE; + sc-bce_link_up = FALSE; + callout_reset(sc-bce_tick_callout
Re: Changes brought to bce(4) disabling ipmi access during boot
On Thu, Mar 15, 2012 at 09:19:27AM +0100, Paul Guyot wrote:
> On 15 March 2012 at 18:10, YongHyeon PYUN wrote:
> > On Wed, Mar 14, 2012 at 11:44:37PM +0100, Paul Guyot wrote:
> > > Hello,
> > >
> > > Changes brought to bce(4) prevent booting a R410 Dell server with
> > > a GELI-encrypted root ZFS partition requiring a passphrase,
> > > something that was possible with 9-RELEASE.
> > >
> > > Using a binary search, the bug comes from the following revision:
> > >
> > > Updating collection src-all/cvs
> > >  Edit src/sys/dev/bce/if_bce.c
> > >   Add delta 1.89.2.4 2012.01.09.19.07.14 yongari
> > >  Edit src/sys/dev/bce/if_bcereg.h
> > >   Add delta 1.35.2.3 2012.01.09.19.07.14 yongari
> > > Shutting down connection to server
> >
> > Could you try the attached patch and let me know whether it recovers
> > IPMI functionality?
>
> Thank you for your quick patch.
> Unfortunately, it does not recover IPMI functionality with
> STABLE@2012.01.09.19.08.00.

Hmm, how about this one?

Index: sys/dev/bce/if_bce.c
===================================================================
--- sys/dev/bce/if_bce.c        (revision 232950)
+++ sys/dev/bce/if_bce.c        (working copy)
@@ -1992,8 +1992,7 @@
         ifp = sc->bce_ifp;
         mii = device_get_softc(sc->bce_miibus);
 
-        if (mii == NULL || ifp == NULL ||
-            (ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
+        if (mii == NULL || ifp == NULL)
                 return;
 
         sc->bce_link_up = FALSE;
@@ -2038,9 +2037,6 @@
                 }
         }
 
-        if (sc->bce_link_up == FALSE)
-                return;
-
         /* Set half or full duplex based on PHY settings. */
         if ((mii->mii_media_active & IFM_GMASK) == IFM_HDX) {
                 DBPRINT(sc, BCE_INFO_PHY,
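Not part of the original thread: the patch above removes two early returns from the link status-change handler, one taken when the driver is not running and one taken when the link is not reported up. The toy model below only illustrates what that changes; it is not the bce(4) code. All DEMO_/demo_ names are hypothetical, and "link up" is simplified to "media status is both valid and active".

```c
#include <stdbool.h>

/* Illustrative stand-ins for IFM_AVALID / IFM_ACTIVE from <net/if_media.h>. */
#define DEMO_IFM_AVALID 0x1u
#define DEMO_IFM_ACTIVE 0x2u

/* Simplified: link counts as up only when status is valid AND active. */
static bool
demo_link_is_up(unsigned media_status)
{
	const unsigned both = DEMO_IFM_AVALID | DEMO_IFM_ACTIVE;

	return ((media_status & both) == both);
}

/*
 * Would the handler get as far as programming the MAC?
 * guards == true models the code before the patch (early returns kept),
 * guards == false models the code after the patch (early returns removed).
 */
static bool
demo_statchg_reaches_mac(bool drv_running, unsigned media_status, bool guards)
{
	if (guards && !drv_running)
		return (false);	/* first removed check */
	if (guards && !demo_link_is_up(media_status))
		return (false);	/* second removed check */
	return (true);
}
```

With the guards in place, an interface that is not yet running (the state while the box waits for the passphrase) never reaches the MAC programming step; with them removed it always does. Whether that is what keeps the IPMI link alive is exactly what the thread is trying to find out.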
Re: bce: Device not configured
On Thu, Mar 15, 2012 at 03:20:10PM +0100, Jan Winter wrote: On 03/15/12 18:29, YongHyeon PYUN wrote: On Wed, Mar 14, 2012 at 03:34:20PM +0100, Jan Winter wrote: On 03/14/12 19:40, YongHyeon PYUN wrote: On Tue, Mar 13, 2012 at 02:08:46PM +0100, Jan Winter wrote: Hello, on an Dell Blade m610 is not possible to change the network media option: ifconfig bce0 media 100baseTX mediaopt full-duplex up ifconfig: SIOCSIFMEDIA (media): Device not configured Setting the media option to autoselect and connecting the m610 to a 100 MBit switch, I always get no carrier only 1g full-duplex seems to be working. I have tested this on 8.3-prerelease and 9-stable any Ideas? cheers Jan pciconf -lv bce0@pci0:1:0:0:class=0x02 card=0x02871028 chip=0x163a14e4 rev=0x20 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme II BCM5709S Gigabit Ethernet' class = network subclass = ethernet dmesg bce0:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 miibus0:MII bus on bce0 brgphy0:BCM5709S 1000/2500baseSX PHY PHY 2 on miibus0 brgphy0: 1000baseSX-FDX, auto bce0: Ethernet address: 00:26:b9:fb:04:0c bce0: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) bce1:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xdc00-0xddff irq 48 at device 0.1 on pci1 miibus1:MII bus on bce1 brgphy1:BCM5709S 1000/2500baseSX PHY PHY 2 on miibus1 brgphy1: 1000baseSX-FDX, auto bce1: Ethernet address: 00:26:b9:fb:04:0e bce1: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) I'm not sure you're seeing one of long standing remote PHY issue of blade box but would you try the patch at the following URL? 
http://people.freebsd.org/~yongari/bce/bce.rphy.diff After applying the patch, show me the dmesg output(bce(4) and brgphy(4) related ones) and 'ifconfig -m bce0'. Note, the patch was not tested at all(lack of hardware). Hello, thank you very much, for your quick support Now its looking much better ifconfig -m bce0 bce0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=c01bbRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE capabilities=c01bbRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE ether 00:26:b9:fb:04:0c inet 192.168.100.30 netmask 0xff00 broadcast 192.168.100.255 inet6 fe80::226:b9ff:fefb:40c%bce0 prefixlen 64 tentative scopeid 0x1 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (1000baseTfull-duplex) status: active supported media: media autoselect media 1000baseT mediaopt full-duplex media 1000baseT media 100baseTX mediaopt full-duplex media 100baseTX media 10baseT/UTP mediaopt full-duplex media 10baseT/UTP dmesg: . 
bce0:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 bce0: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 256 to local APIC 16 vector 52 bce0: using IRQ 256 for MSI bce0: Remote PHY : TP bce0: bpf attached bce0: Ethernet address: 00:26:b9:fb:04:0c bce0: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|Remote PHY(TP)|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) bce1:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xdc00-0xddff irq 48 at device 0.1 on pci1 bce1: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 257 to local APIC 16 vector 53 bce1: using IRQ 257 for MSI bce1: Remote PHY : TP bce1: bpf attached bce1: Ethernet address: 00:26:b9:fb:04:0e bce1: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|Remote PHY(TP)|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) . I have done a quick test with 100 and 1000 MBit, both working very well. Thanks a lot for testing. This patch was made long time ago but I haven't had chance to commit it due to lack of access to hardware. Because the patch bypasses mii(4) layer and makes it hard to read code, I didn't like the patch but it seems the patch makes bce(4) usable on blade boxes at least. I'll commit the patch next week. Its possible to get a Patch for 8 Stable? I will do MFC to stable/[7-9]. And bce.rphy.diff should be applied cleanly to stable/[7-9]. I getting erros with the 8-stable source Oops, try this one for stable/8. http
Re: Changes brought to bce(4) disabling ipmi access during boot
On Wed, Mar 14, 2012 at 11:44:37PM +0100, Paul Guyot wrote:
> Hello,
>
> Changes brought to bce(4) prevent booting a R410 Dell server with a
> GELI-encrypted root ZFS partition requiring a passphrase, something
> that was possible with 9-RELEASE.
>
> Using a binary search, the bug comes from the following revision:
>
> Updating collection src-all/cvs
>  Edit src/sys/dev/bce/if_bce.c
>   Add delta 1.89.2.4 2012.01.09.19.07.14 yongari
>  Edit src/sys/dev/bce/if_bcereg.h
>   Add delta 1.35.2.3 2012.01.09.19.07.14 yongari
> Shutting down connection to server

Could you try the attached patch and let me know whether it recovers IPMI functionality?

> RELEASE as well as STABLE with date=2012.01.09.19.00.00 boot properly.
> The boot fails with date=2012.01.09.19.08.00
>
> For more details: the box is configured to boot from a plain ZFS pool
> that contains the kernel (zboot) and then to request a passphrase for
> a GELI-encrypted ZFS pool containing everything else (including
> /etc/rc.d), in a way similar to what is described here:
> http://www.keltia.net/howtos/freebsd-dedibox
>
> The passphrase should be entered from the virtual console (KVM)
> simulated by the ipmi controller (through Dell's iDRAC6).
>
> On RELEASE, the boot works properly and can be followed from the KVM
> console, where the passphrase can be entered.
> On STABLE, the KVM gets disconnected. Besides, the ipmi is down, and
> the box is eventually bricked: since plugging in a real console is not
> an option, the only way to get access to the server is to reboot it
> electrically (and to configure the PXE to perform a netboot in order
> to switch the kernel).
>
> I believe the ipmi controller uses the main ethernet port to simulate
> a physical console and the change in the bce driver disables the
> ethernet port. Since the box waits for the passphrase before
> configuring the network, the box becomes unreachable.
>
> Paul
> --
> Semiocast http://semiocast.com/
> +33.183627948 - 20 rue Lacaze, 75014 Paris

Index: sys/dev/bce/if_bce.c
===================================================================
--- sys/dev/bce/if_bce.c        (revision 232950)
+++ sys/dev/bce/if_bce.c        (working copy)
@@ -1992,8 +1992,7 @@
         ifp = sc->bce_ifp;
         mii = device_get_softc(sc->bce_miibus);
 
-        if (mii == NULL || ifp == NULL ||
-            (ifp->if_drv_flags & IFF_DRV_RUNNING) == 0)
+        if (mii == NULL || ifp == NULL)
                 return;
 
         sc->bce_link_up = FALSE;
Re: bce: Device not configured
On Wed, Mar 14, 2012 at 03:34:20PM +0100, Jan Winter wrote: On 03/14/12 19:40, YongHyeon PYUN wrote: On Tue, Mar 13, 2012 at 02:08:46PM +0100, Jan Winter wrote: Hello, on an Dell Blade m610 is not possible to change the network media option: ifconfig bce0 media 100baseTX mediaopt full-duplex up ifconfig: SIOCSIFMEDIA (media): Device not configured Setting the media option to autoselect and connecting the m610 to a 100 MBit switch, I always get no carrier only 1g full-duplex seems to be working. I have tested this on 8.3-prerelease and 9-stable any Ideas? cheers Jan pciconf -lv bce0@pci0:1:0:0:class=0x02 card=0x02871028 chip=0x163a14e4 rev=0x20 hdr=0x00 vendor = 'Broadcom Corporation' device = 'NetXtreme II BCM5709S Gigabit Ethernet' class = network subclass = ethernet dmesg bce0:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 miibus0:MII bus on bce0 brgphy0:BCM5709S 1000/2500baseSX PHY PHY 2 on miibus0 brgphy0: 1000baseSX-FDX, auto bce0: Ethernet address: 00:26:b9:fb:04:0c bce0: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) bce1:Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xdc00-0xddff irq 48 at device 0.1 on pci1 miibus1:MII bus on bce1 brgphy1:BCM5709S 1000/2500baseSX PHY PHY 2 on miibus1 brgphy1: 1000baseSX-FDX, auto bce1: Ethernet address: 00:26:b9:fb:04:0e bce1: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) I'm not sure you're seeing one of long standing remote PHY issue of blade box but would you try the patch at the following URL? http://people.freebsd.org/~yongari/bce/bce.rphy.diff After applying the patch, show me the dmesg output(bce(4) and brgphy(4) related ones) and 'ifconfig -m bce0'. Note, the patch was not tested at all(lack of hardware). 
Hello, thank you very much, for your quick support Now its looking much better ifconfig -m bce0 bce0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST metric 0 mtu 1500 options=c01bbRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE capabilities=c01bbRXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,VLAN_HWTSO,LINKSTATE ether 00:26:b9:fb:04:0c inet 192.168.100.30 netmask 0xff00 broadcast 192.168.100.255 inet6 fe80::226:b9ff:fefb:40c%bce0 prefixlen 64 tentative scopeid 0x1 nd6 options=29PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL media: Ethernet autoselect (1000baseT full-duplex) status: active supported media: media autoselect media 1000baseT mediaopt full-duplex media 1000baseT media 100baseTX mediaopt full-duplex media 100baseTX media 10baseT/UTP mediaopt full-duplex media 10baseT/UTP dmesg: . bce0: Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xda00-0xdbff irq 36 at device 0.0 on pci1 bce0: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 256 to local APIC 16 vector 52 bce0: using IRQ 256 for MSI bce0: Remote PHY : TP bce0: bpf attached bce0: Ethernet address: 00:26:b9:fb:04:0c bce0: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|Remote PHY(TP)|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) bce1: Broadcom NetXtreme II BCM5709 1000Base-SX (C0) mem 0xdc00-0xddff irq 48 at device 0.1 on pci1 bce1: attempting to allocate 1 MSI vectors (16 supported) msi: routing MSI IRQ 257 to local APIC 16 vector 53 bce1: using IRQ 257 for MSI bce1: Remote PHY : TP bce1: bpf attached bce1: Ethernet address: 00:26:b9:fb:04:0e bce1: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|Remote PHY(TP)|MFW); MFW (NCSI 2.0.5) Coal (RX:6,6,18,18; TX:20,20,80,80) . I have done a quick test with 100 and 1000 MBit, both working very well. Thanks a lot for testing. 
This patch was made long time ago but I haven't had chance to commit it due to lack of access to hardware. Because the patch bypasses mii(4) layer and makes it hard to read code, I didn't like the patch but it seems the patch makes bce(4) usable on blade boxes at least. I'll commit the patch next week. Its possible to get a Patch for 8 Stable? I will do MFC to stable/[7-9]. And bce.rphy.diff should be applied cleanly to stable/[7-9]. thank in advance Jan ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any
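Not part of the original thread: the ifconfig output quoted in this message shows flags=8843 followed by the names UP, BROADCAST, RUNNING, SIMPLEX, MULTICAST. The sketch below shows how such a hexadecimal flag word maps to names. The table covers only those five flags, with bit values as I understand them from <net/if.h>; the struct, table, and function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct flag_name {
	unsigned bit;
	const char *name;
};

/* Only the five flags seen in the quoted output; not the full list. */
static const struct flag_name demo_if_flags[] = {
	{ 0x0001u, "UP" },
	{ 0x0002u, "BROADCAST" },
	{ 0x0040u, "RUNNING" },
	{ 0x0800u, "SIMPLEX" },
	{ 0x8000u, "MULTICAST" },
};

/* Write a comma-separated list of the set flag names into buf; returns buf. */
static char *
demo_decode_if_flags(unsigned flags, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(demo_if_flags) / sizeof(demo_if_flags[0]); i++) {
		if ((flags & demo_if_flags[i].bit) == 0)
			continue;
		/* Separate names with a comma, never overrunning buf. */
		if (buf[0] != '\0')
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, demo_if_flags[i].name, len - strlen(buf) - 1);
	}
	return (buf);
}
```

Decoding 0x8843 with a 64-byte buffer yields "UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST", matching the names printed next to flags=8843 in the quoted output.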
Re: bce: Device not configured
On Wed, Mar 14, 2012 at 10:58:33PM +0200, Sami Halabi wrote:
> Hi,
> I have this card in my IBM X3550, running FreeBSD 8.1-RELEASE-p8:
>
> # pciconf -lv
> bce0@pci0:11:0:0: class=0x02 card=0x03a91014 chip=0x163914e4 rev=0x20 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     device     = 'NetXtreme II Gigabit Ethernet (BCM5709)'
>     class      = network
>     subclass   = ethernet
> bce1@pci0:11:0:1: class=0x02 card=0x03a91014 chip=0x163914e4 rev=0x20 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     device     = 'NetXtreme II Gigabit Ethernet (BCM5709)'
>     class      = network
>     subclass   = ethernet
>
> # lspci
> 0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
> 0b:00.1 Ethernet controller: Broadcom Corporation NetXtreme II BCM5709 Gigabit Ethernet (rev 20)
>
> The box has been running for about 200 days, so I don't have dmesg output.
> Do I need to patch?

I think you don't need the patch unless you're suffering from a link establishment issue. The patch was made to support the remote PHY capability of Broadcom controllers, which is commonly found on Dell blade systems.
Re: bce: Device not configured
On Tue, Mar 13, 2012 at 02:08:46PM +0100, Jan Winter wrote:
> Hello,
>
> on a Dell blade M610 it is not possible to change the network media option:
>
> ifconfig bce0 media 100baseTX mediaopt full-duplex up
> ifconfig: SIOCSIFMEDIA (media): Device not configured
>
> With the media option set to autoselect and the M610 connected to a
> 100 MBit switch, I always get "no carrier"; only 1G full-duplex seems to work.
>
> I have tested this on 8.3-PRERELEASE and 9-STABLE.
>
> Any ideas?
>
> cheers
> Jan
>
> pciconf -lv
> bce0@pci0:1:0:0: class=0x02 card=0x02871028 chip=0x163a14e4 rev=0x20 hdr=0x00
>     vendor     = 'Broadcom Corporation'
>     device     = 'NetXtreme II BCM5709S Gigabit Ethernet'
>     class      = network
>     subclass   = ethernet
>
> dmesg
> bce0: <Broadcom NetXtreme II BCM5709 1000Base-SX (C0)> mem 0xda00-0xdbff irq 36 at device 0.0 on pci1
> miibus0: <MII bus> on bce0
> brgphy0: <BCM5709S 1000/2500baseSX PHY> PHY 2 on miibus0
> brgphy0: 1000baseSX-FDX, auto
> bce0: Ethernet address: 00:26:b9:fb:04:0c
> bce0: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5)
> Coal (RX:6,6,18,18; TX:20,20,80,80)
> bce1: <Broadcom NetXtreme II BCM5709 1000Base-SX (C0)> mem 0xdc00-0xddff irq 48 at device 0.1 on pci1
> miibus1: <MII bus> on bce1
> brgphy1: <BCM5709S 1000/2500baseSX PHY> PHY 2 on miibus1
> brgphy1: 1000baseSX-FDX, auto
> bce1: Ethernet address: 00:26:b9:fb:04:0e
> bce1: ASIC (0x57092000); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.0.11); Bufs (RX:2;TX:2;PG:8); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.5)
> Coal (RX:6,6,18,18; TX:20,20,80,80)

I'm not sure whether you're seeing one of the long-standing remote PHY issues of blade boxes, but would you try the patch at the following URL?
http://people.freebsd.org/~yongari/bce/bce.rphy.diff

After applying the patch, show me the dmesg output (the bce(4) and brgphy(4) related lines) and the output of 'ifconfig -m bce0'.
Note that the patch has not been tested at all (lack of hardware).
Re: msk0: interrupt storm
On Tue, Mar 06, 2012 at 10:36:05AM -0500, John Baldwin wrote:
> On Thursday, March 01, 2012 8:29:55 pm YongHyeon PYUN wrote:
> > On Wed, Feb 29, 2012 at 01:03:29AM +0400, Pavel Gorshkov wrote:
> > > My laptop running 9.0-RELEASE/amd64/GENERIC freezes and (sometimes)
> > > unfreezes intermittently, logging the following:
> > >
> > > Feb 28 23:07:36 lifebook kernel: interrupt storm detected on "irq259:"; throttling interrupt source
> > >
> > > $ vmstat -i
> > > ...
> > > irq259: mskc0 11669511 3456
> > >
> > > Looks very similar to this:
> > > http://www.freebsd.org/cgi/query-pr.cgi?pr=164569
> > >
> > > Any suggestions?
> >
> > Try disabling MSI and see whether that makes any difference.
>
> I also get interrupt storms with msk. They do fix themselves when they
> happen, and I've seen it happen when the machine is idle. This is on my
> little netbook where msk had several problems initially that have since
> been fixed.
>
> mskc0: <Marvell Yukon 88E8072 Gigabit Ethernet> port 0x2000-0x20ff mem 0xe000-0xe0003fff irq 19 at device 0.0 on pci32
> msk0: <Marvell Technology Group Ltd. Yukon EX Id 0xb5 Rev 0x02> on mskc0
> msk0: Ethernet address: 00:24:81:40:e3:ef
> miibus0: <MII bus> on msk0
> e1000phy0: <Marvell 88E1149 Gigabit PHY> PHY 0 on miibus0
> e1000phy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
>
> mskc0@pci0:32:0:0: class=0x02 card=0x3056103c chip=0x436c11ab rev=0x10 hdr=0x00
>     vendor     = 'Marvell Technology Group Ltd.'
>     device     = '88E8072 PCI-E Gigabit Ethernet Controller'
>     class      = network
>     subclass   = ethernet

John, can you let me know the value of the B0_Y2_SP_ISRC2 register in the interrupt handler when you see the interrupt storm?
Re: msk0: interrupt storm
On Wed, Feb 29, 2012 at 01:03:29AM +0400, Pavel Gorshkov wrote:
> My laptop running 9.0-RELEASE/amd64/GENERIC freezes and (sometimes)
> unfreezes intermittently, logging the following:
>
> Feb 28 23:07:36 lifebook kernel: interrupt storm detected on "irq259:"; throttling interrupt source
>
> $ vmstat -i
> ...
> irq259: mskc0 11669511 3456
>
> Looks very similar to this:
> http://www.freebsd.org/cgi/query-pr.cgi?pr=164569
>
> Any suggestions?

Try disabling MSI and see whether that makes any difference.
Re: msk0: interrupt storm
On Fri, Mar 02, 2012 at 01:13:56AM +0400, Pavel Gorshkov wrote:
> On Thu, Mar 01, 2012 at 05:29:55PM -0800, YongHyeon PYUN wrote:
> > On Wed, Feb 29, 2012 at 01:03:29AM +0400, Pavel Gorshkov wrote:
> > > My laptop running 9.0-RELEASE/amd64/GENERIC freezes and (sometimes)
> > > unfreezes intermittently, logging the following:
> > >
> > > Feb 28 23:07:36 lifebook kernel: interrupt storm detected on "irq259:"; throttling interrupt source
> > >
> > > $ vmstat -i
> > > ...
> > > irq259: mskc0 11669511 3456
> > >
> > > Looks very similar to this:
> > > http://www.freebsd.org/cgi/query-pr.cgi?pr=164569
> > >
> > > Any suggestions?
> >
> > Try disabling MSI and see whether that makes any difference.
>
> hw.msk.msi_disable is not recognized as a valid sysctl variable and I'm
> not sure about it having any effect whatsoever, but

hw.msk.msi_disable is a loader tunable, so it can't be set after boot. See msk(4) for more information.

> putting hw.msk.msi_disable=1 into /boot/loader.conf seems to have
> resulted in this:
>
> irq16: mskc0 uhci0 355402884
>
> that is, msk0 is now on irq16, but the freezes are still there:
>
> Mar 1 23:55:12 lifebook kernel: interrupt storm detected on "irq16:"; throttling interrupt source

I still have no idea. Would you post the dmesg output? If you know how to reproduce the issue, let me know.
Re: Regression in 8.2-STABLE bge code (from 7.4-STABLE)
On Thu, Feb 23, 2012 at 09:46:20AM -0500, John Baldwin wrote:
> On Tuesday, February 14, 2012 7:56:00 pm YongHyeon PYUN wrote:
> > On Sat, Jan 28, 2012 at 09:24:53PM -0500, Michael L. Squires wrote:
> >
> > Sorry for the late reply. I had been busy due to relocation.
> >
> > > There is a bug in the Tyan S4881/S4882 PCI-X bridges that was fixed
> > > with a patch in 7.x (thank you very much). This patch is not present
> > > in the 8.2-STABLE code and the symptoms (watchdog timeouts) have recurred.
> >
> > Hmm, I thought the mailbox reordering bug was avoided by limiting the
> > DMA address space to 32 bits, but it seems that was not the right
> > workaround for the AMD 8131 PCI-X bridge.
> >
> > > The watchdog timeouts do not appear to be present after I switched to an
> > > Intel gigabit PCI-X card.
> > >
> > > I did a brute-force patch of the 8.2-STABLE bge code using the patches for
> > > 7.4-STABLE; the resulting code compiled and, other than odd behavior at
> > > startup, seems to be working normally.
> > >
> > > This is using FreeBSD 8.2-STABLE amd64; I don't know what happens with i386.
> > >
> > > Given the age of the boards it may be easier if I just continue using the
> > > Intel gigabit card but am happy to test anything that comes my way.
> >
> > Try the attached patch and let me know how it goes.
> > I didn't enable 64-bit DMA addressing though. I think the AMD-8131
> > PCI-X bridge needs both workarounds.
>
> Eh, please don't do the thing where you walk all pcib devices. Instead,
> walk up the tree like so:
>
> static int
> bge_mbox_reorder(struct bge_softc *sc)
> {
>         devclass_t pcib, pci;
>         device_t dev, bus;
>
>         pci = devclass_find("pci");
>         pcib = devclass_find("pcib");
>         dev = sc->dev;
>         bus = device_get_parent(dev);
>         for (;;) {
>                 dev = device_get_parent(bus);
>                 bus = device_get_parent(dev);
>                 if (device_get_devclass(dev) != pcib ||
>                     device_get_devclass(bus) != pci)
>                         break;
>                 /* Probe device ID. */
>         }
>         return (0);
> }
>
> It is not safe to use pci_get_vendor() with non-PCI devices (you may get
> random junk, and Host-PCI bridges are not PCI devices). Also, this will only
> apply the quirk if a relevant bridge is in the bge device's path.
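For readers who want to try the idea outside the kernel, here is a minimal userspace sketch of the upward walk suggested above. It is only an illustration under stated assumptions: struct mock_dev, enum dev_class and find_reorder_bridge() are hypothetical stand-ins and not the newbus API; the AMD-8131 vendor/device IDs are the ones from the patch in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel device tree; not the newbus API. */
enum dev_class { CLASS_NIC, CLASS_PCI_BUS, CLASS_PCIB, CLASS_HOST_BRIDGE };

struct mock_dev {
	enum dev_class		 cls;
	uint16_t		 vendor;	/* meaningful for PCI devices only */
	uint16_t		 device;
	const struct mock_dev	*parent;
};

struct bridge_id {
	uint16_t	 vendor;
	uint16_t	 device;
	const char	*desc;
};

/* IDs taken from the patch discussed in this thread. */
static const struct bridge_id reorder_bridges[] = {
	{ 0x1022, 0x7450, "AMD-8131 PCI-X Bridge" },
};

/*
 * Walk upward from a NIC: NIC -> pci bus -> pcib bridge -> pci bus -> ...
 * Only ancestors that are PCI-PCI bridges have their IDs inspected, so a
 * host bridge (which is not a PCI device) is never probed.  Returns the
 * description of the first matching bridge, or NULL if none is in the path.
 */
static const char *
find_reorder_bridge(const struct mock_dev *nic)
{
	const struct mock_dev *bus, *dev;
	size_t i, n;

	n = sizeof(reorder_bridges) / sizeof(reorder_bridges[0]);
	bus = nic->parent;
	for (;;) {
		if (bus == NULL || bus->cls != CLASS_PCI_BUS)
			break;
		dev = bus->parent;
		if (dev == NULL || dev->cls != CLASS_PCIB)
			break;
		for (i = 0; i < n; i++) {
			if (dev->vendor == reorder_bridges[i].vendor &&
			    dev->device == reorder_bridges[i].device)
				return (reorder_bridges[i].desc);
		}
		bus = dev->parent;
	}
	return (NULL);
}
```

The point of walking the parent chain rather than scanning every bridge in the system is that a NIC on an unaffected bus is left alone even when an affected bridge exists elsewhere in the machine.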
Thanks for the review and the suggestion.
Would you review the updated one?

Index: sys/dev/bge/if_bgereg.h
===================================================================
--- sys/dev/bge/if_bgereg.h	(revision 232144)
+++ sys/dev/bge/if_bgereg.h	(working copy)
@@ -2828,6 +2828,7 @@
 #define	BGE_FLAG_RX_ALIGNBUG	0x0400
 #define	BGE_FLAG_SHORT_DMA_BUG	0x0800
 #define	BGE_FLAG_4K_RDMA_BUG	0x1000
+#define	BGE_FLAG_MBOX_REORDER	0x2000
 	uint32_t		bge_phy_flags;
 #define	BGE_PHY_NO_WIRESPEED	0x0001
 #define	BGE_PHY_ADC_BUG		0x0002
Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c	(revision 232144)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -380,6 +380,8 @@
 static int bge_dma_ring_alloc(struct bge_softc *, bus_size_t, bus_size_t,
     bus_dma_tag_t *, uint8_t **, bus_dmamap_t *, bus_addr_t *, const char *);
 
+static int bge_mbox_reorder(struct bge_softc *);
+
 static int bge_get_eaddr_fw(struct bge_softc *sc, uint8_t ether_addr[]);
 static int bge_get_eaddr_mem(struct bge_softc *, uint8_t[]);
 static int bge_get_eaddr_nvram(struct bge_softc *, uint8_t[]);
@@ -635,6 +637,8 @@
 		off += BGE_LPMBX_IRQ0_HI - BGE_MBX_IRQ0_HI;
 
 	CSR_WRITE_4(sc, off, val);
+	if ((sc->bge_flags & BGE_FLAG_MBOX_REORDER) != 0)
+		CSR_READ_4(sc, off);
 }
 
 /*
@@ -2609,8 +2613,8 @@
 		 * XXX
 		 * watchdog timeout issue was observed on BCM5704 which
 		 * lives behind PCI-X bridge(e.g AMD 8131 PCI-X bridge).
-		 * Limiting DMA address space to 32bits seems to address
-		 * it.
+		 * Both limiting DMA address space to 32bits and flushing
+		 * mailbox write seem to address the issue.
 		 */
 		if (sc->bge_flags & BGE_FLAG_PCIX)
 			lowaddr = BUS_SPACE_MAXADDR_32BIT;
@@ -2775,6 +2779,47 @@
 }
 
 static int
+bge_mbox_reorder(struct bge_softc *sc)
+{
+	/* Lists of PCI bridges that are known to reorder mailbox writes. */
+	static const struct mbox_reorder {
+		const uint16_t vendor;
+		const uint16_t device;
+		const char *desc;
+	} const mbox_reorder_lists[] = {
+		{ 0x1022, 0x7450, "AMD-8131 PCI-X Bridge" },
+	};
+	devclass_t pci, pcib;
+	device_t bus, dev;
+	int count, i;
+
+	count = sizeof(mbox_reorder_lists) / sizeof(mbox_reorder_lists[0]);
+	pci = devclass_find("pci");
+	pcib = devclass_find("pcib");
+	dev = sc->bge_dev;
+	bus
Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems
On Thu, Feb 23, 2012 at 07:41:25AM +0100, Attila Nagy wrote:
> On 02/23/12 21:44, YongHyeon PYUN wrote:
> > I have to ask Broadcom for more information about the controller. Not
> > sure whether I can get some hint at this moment though. :-(
>
> Is there anything I can do? I ask this because I have to give back this
> server very soon.
>
> > Given that you also have USB related errors, could you completely
> > remove bge(4) from your kernel and see whether it can successfully
> > boot up? I think you can add the following entries to
> > /boot/device.hints without rebuilding the kernel.
> >
> > hint.bge.0.disabled="1"
> > hint.bge.1.disabled="1"
> > hint.bge.2.disabled="1"
> > hint.bge.3.disabled="1"
>
> This does not help. Removing bge makes it stop here:
>
> da0 at ciss0 bus 0 scbus0 target 0 lun 0
> da0: <COMPAQ RAID 0 VOLUME OK> Fixed Direct Access SCSI-5 device
> da0: 135.168MB/s transfers
> da0: Command Queueing enabled
> da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C)
> panic: bootpc_init: no eligible interfaces
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> panic() at panic+0x187
> bootpc_init() at bootpc_init+0x1205
> mi_startup() at mi_startup+0x77
> btext() at btext+0x2c
> KDB: enter: panic
> [ thread pid 0 tid 10 ]
> Stopped at kdb_enter+0x3b: movq $0,0x976972(%rip)
> db>
>
> Which is completely OK, because there are really no interfaces to boot from.
> Note that there is no NMI either (maybe because it would happen later in
> the initialization process).
> Sadly, I can't boot from disk, but I assume it would work.

OK, I guess you're seeing an issue similar to the one Sean reported. I'll let you know when I have an experimental patch.
Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems
On Wed, Feb 22, 2012 at 08:49:31AM +0100, Attila Nagy wrote: Hi, I get this on a recent stable/9 system with uhci support removed from the kernel config: da0 at ciss0 bus 0 scbus0 target 0 lun 0 da0: COMPAQ RAID 0 VOLUME OK Fixed Direct Access SCSI-5 device da0: 135.168MB/s transfers da0: Command Queueing enabled da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C) cd0 at ata3 bus 0 scbus3 target 0 lun 0 cd0: HP DV-W28S-W G.W3 Removable CD-ROM SCSI-0 device cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes) cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed NMI ISA 70, EISA ff I/O channel check, likely hardware failure. Fatal trap 19: non-maskable interrupt trap while in kernel mode cpuid = 0; apic id = 00 instruction pointer = 0x20:0x804543fb stack pointer = 0x28:0x81251e40 frame pointer = 0x28:0x814cf660 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= interrupt enabled, IOPL = 0 current process = 0 (swapper) [ thread pid 0 tid 10 ] Stopped at bge_init_locked+0x233b: movl0x81c(%rsi),%eax db and this with a plain GENERIC kernel: da0 at ciss0 bus 0 scbus0 target 0 lun 0 da0: COMPAQ RAID 0 VOLUME OK Fixed Direct Access SCSI-5 device da0: 135.168MB/s transfers da0: Command Queueing enabled da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C) cd0 at ata3 bus 0 scbus3 target 0 lun 0 cd0: HP DV-W28S-W G.W3 Removable CD-ROM SCSI-0 device cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes) cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed NMI ISA 70, EISA ff I/O channel check, likely hardware failure. 
Fatal trap 19: non-maskable interrupt trap while in kernel mode cpuid = 0; apic id = 00 instruction pointer = 0x20:0x80711dc5 stack pointer = 0x28:0x81272040 frame pointer = 0x28:0xff907cf44b40 code segment= base 0x0, limit 0xf, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags= interrupt enabled, IOPL = 0 current process = 12 (irq16: uhci0) [ thread pid 12 tid 100098 ] Stopped at uhci_interrupt+0x65:movzwl %ax,%eax db KDB: stack backtrace: KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a kdb_backtrace() at kdb_backtrace+0x37 mi_switch() at mi_switch+0x27a turnstile_wait() at turnstile_wait+0x1cb _mtx_lock_sleep() at _mtx_lock_sleep+0xb0 ukbd_poll() at ukbd_poll+0xbe kbdmux_poll() at kbdmux_poll+0x3f sc_cngetc() at sc_cngetc+0xec cncheckc() at cncheckc+0x4a cngetc() at cngetc+0x1c db_readline() at db_readline+0x77 db_read_line() at db_read_line+0x15 db_command_loop() at db_command_loop+0x38 db_trap() at db_trap+0x89 kdb_trap() at kdb_trap+0x101 trap_fatal() at trap_fatal+0x29d trap() at trap+0x10a nmi_calltrap() at nmi_calltrap+0x8 --- trap 0x13, rip = 0x80711dc5, rsp = 0x81272040, rbp = 0xff907cf44b40 --- uhci_interrupt() at uhci_interrupt+0x65 intr_event_execute_handlers() at intr_event_execute_handlers+0x104 ithread_loop() at ithread_loop+0xa4 fork_exit() at fork_exit+0x11f fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xff907cf44d00, rbp = 0 --- db After disabling stopping on NMI (kdb_on_nmi), I still can't boot from bge (this is a PXE booted machine), I get this in an infinite loop: bge1: link state changed to DOWN DHCP/BOOTP timeout for server 255.255.255.255 bge1: 3 link states coalesced bge1: link state changed to UP bge0: 2 link states coalesced bge0: link state changed to DOWN bge0: link state changed to UP bge1: link state changed to DOWN bge0: link state changed to DOWN bge0: link state changed to UP bge0: link state changed to DOWN bge1: 2 link states coalesced bge1: link state changed 
to DOWN bge0: link state changed to UP bge0: link state changed to DOWN bge0: 2 link states coalesced bge0: link state changed to DOWN bge1: 2 link states coalesced bge1: link state changed to DOWN bge0: link state changed to UP bge0: link state changed to DOWN bge0: link state changed to UP bge0: link state changed to DOWN Linux and Windows boot fine on the machine. dmesg up to the point where it crashes: Copyright (c) 1992-2012 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 9.0-STABLE #3: Tue Feb 21 11:57:33 CET 2012 r...@boot.lab:/usr/obj/usr/src/sys/BOOTCLNT amd64 CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz (2693.57-MHz K8-class CPU) Origin = GenuineIntel Id = 0x206d6 Family = 6 Model =
Re: Fatal trap 19, Stopped at bge_init_locked+ and bge booting problems
On Wed, Feb 22, 2012 at 03:43:54PM +0100, Attila Nagy wrote: On 02/23/12 05:15, YongHyeon PYUN wrote: bge0:Broadcom unknown BCM5719, ASIC rev. 0x5719001 mem 0xf6bf-0xf6bf,0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 at device 0.0 on pci3 bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E ^^ This controller is new one. Probably BCM5719 A1 but not sure. Yes, it's in a new machine. bge0: Try again This message indicates your controller has ASF/IPMI firmware. Try disabling ASF and see whether it makes any difference. (Change hw.bge.allow_asf tunable to 0). Oh, I always forget that (on the other machines this is set). This is what I get with machdep.panic_on_nmi: 0 machdep.kdb_on_nmi: 0 hw.bge.allow_asf: 0 I have to ask more information for the controller to Broadcom. Not sure whether I can get some hint at this moment though. :-( Given that you also have USB related errors, could you completely remove bge(4) in your kernel and see whether it can successfully boot up? I think you can add the following entries to /boot/device.hints without rebuilding kernel. hint.bge.0.disabled=1 hint.bge.1.disabled=1 hint.bge.2.disabled=1 hint.bge.3.disabled=1 bge0: Broadcom unknown BCM5719, ASIC rev. 0x5719001 mem 0xf6bf-0xf6bf,0xf6be-0xf6be,0xf6bd-0xf6bd irq 32 at device 0.0 on pci3 bge0: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E bge0: Try again miibus0: MII bus on bge0 ukphy0: Generic IEEE 802.3u media interface PHY 1 on miibus0 ukphy0: none, 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: Ethernet address: 3c:4a:92:b2:3c:08 pci0:3:0:1: failed to read VPD data. bge1: Broadcom unknown BCM5719, ASIC rev. 
0x5719001 mem 0xf6bc-0xf6bc,0xf6bb-0xf6bb,0xf6ba-0xf6ba irq 36 at device 0.1 on pci3 bge1: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E miibus1: MII bus on bge1 brgphy0: BCM5719C 1000BASE-T media interface PHY 2 on miibus1 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge1: Ethernet address: 3c:4a:92:b2:3c:09 pci0:3:0:2: failed to read VPD data. bge2: Broadcom unknown BCM5719, ASIC rev. 0x5719001 mem 0xf6b9-0xf6b9,0xf6b8-0xf6b8,0xf6b7-0xf6b7 irq 32 at device 0.2 on pci3 bge2: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E miibus2: MII bus on bge2 brgphy1: BCM5719C 1000BASE-T media interface PHY 3 on miibus2 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge2: Ethernet address: 3c:4a:92:b2:3c:0a pci0:3:0:3: failed to read VPD data. bge3: Broadcom unknown BCM5719, ASIC rev. 0x5719001 mem 0xf6b6-0xf6b6,0xf6b5-0xf6b5,0xf6b4-0xf6b4 irq 36 at device 0.3 on pci3 bge3: CHIP ID 0x05719001; ASIC REV 0x5719; CHIP REV 0x57190; PCI-E miibus3: MII bus on bge3 brgphy2: BCM5719C 1000BASE-T media interface PHY 4 on miibus3 brgphy2: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge3: Ethernet address: 3c:4a:92:b2:3c:0b [...] 
da0: 286070MB (585871964 512 byte sectors: 255H 32S/T 65535C) NMI ISA 60, EISA ff I/O channel check, likely hardware failure.Sending DHCP Discover packet from interface bge0 (3c:4a:92:b2:3c:08) cd0 at ata3 bus 0 scbus3 target 0 lun 0 cd0: HP DV-W28S-W G.W3 Removable CD-ROM SCSI-0 device cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes) cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed bge0: 11 link states coalesced bge0: link state changed to DOWN ugen0.2: vendor 0x8087 at usbus0 uhub3: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2 on usbus0 bge1: 5 link states coalesced bge1: link state changed to DOWN bge2: link state changed to DOWN bge3: link state changed to DOWN bge0: ugen2.2: vendor 0x8087 at usbus2 uhub4: vendor 0x8087 product 0x0024, class 9/0, rev 2.00/0.00, addr 2 on usbus2 2 link states coalesced bge0: link state changed to DOWN bge1: 4 link states coalesced bge1: link state changed to DOWN bge0: 4 link states coalesced bge0: link state changed to DOWN Sending DHCP Discover packet from interface bge1 (3c:4a:92:b2:3c:09) uhub3: 6 ports with 6 removable, self powered bge0: usb_alloc_device: set address 2 failed (USB_ERR_TIMEOUT, ignored) 6 link states coalesced bge0: link state changed to DOWN bge1: 2 link states coalesced bge1: link state changed to DOWN Sending DHCP Discover packet from interface bge2 (3c:4a:92:b2:3c:0a) bge0: 2 link states coalesced bge0: link state changed to DOWN bge1: usbd_setup_device_desc: getting device descriptor at addr 2 failed
Re: Regression in 8.2-STABLE bge code (from 7.4-STABLE)
On Sat, Jan 28, 2012 at 09:24:53PM -0500, Michael L. Squires wrote:

Sorry for the late reply. I had been busy due to relocation.

> There is a bug in the Tyan S4881/S4882 PCI-X bridges that was fixed with a
> patch in 7.x (thank you very much). This patch is not present in the
> 8.2-STABLE code and the symptoms (watchdog timeouts) have recurred.

Hmm, I thought the mailbox reordering bug was avoided by limiting the DMA address space to 32 bits, but it seems that was not the right workaround for the AMD 8131 PCI-X bridge.

> The watchdog timeouts do not appear to be present after I switched to an
> Intel gigabit PCI-X card.
>
> I did a brute-force patch of the 8.2-STABLE bge code using the patches for
> 7.4-STABLE; the resulting code compiled and, other than odd behavior at
> startup, seems to be working normally.
>
> This is using FreeBSD 8.2-STABLE amd64; I don't know what happens with i386.
>
> Given the age of the boards it may be easier if I just continue using the
> Intel gigabit card but am happy to test anything that comes my way.

Try the attached patch and let me know how it goes.
I didn't enable 64-bit DMA addressing though. I think the AMD-8131 PCI-X bridge needs both workarounds.
> Thanks,
> Mike Squires
> mikes at siralan.org

Index: sys/dev/bge/if_bgereg.h
===================================================================
--- sys/dev/bge/if_bgereg.h	(revision 231621)
+++ sys/dev/bge/if_bgereg.h	(working copy)
@@ -2828,6 +2828,7 @@
 #define	BGE_FLAG_RX_ALIGNBUG	0x0400
 #define	BGE_FLAG_SHORT_DMA_BUG	0x0800
 #define	BGE_FLAG_4K_RDMA_BUG	0x1000
+#define	BGE_FLAG_MBOX_REORDER	0x2000
 	uint32_t		bge_phy_flags;
 #define	BGE_PHY_NO_WIRESPEED	0x0001
 #define	BGE_PHY_ADC_BUG		0x0002
Index: sys/dev/bge/if_bge.c
===================================================================
--- sys/dev/bge/if_bge.c	(revision 231621)
+++ sys/dev/bge/if_bge.c	(working copy)
@@ -380,6 +380,8 @@
 static int bge_dma_ring_alloc(struct bge_softc *, bus_size_t, bus_size_t,
     bus_dma_tag_t *, uint8_t **, bus_dmamap_t *, bus_addr_t *, const char *);
 
+static int bge_mbox_reorder(struct bge_softc *);
+
 static int bge_get_eaddr_fw(struct bge_softc *sc, uint8_t ether_addr[]);
 static int bge_get_eaddr_mem(struct bge_softc *, uint8_t[]);
 static int bge_get_eaddr_nvram(struct bge_softc *, uint8_t[]);
@@ -635,6 +637,8 @@
 		off += BGE_LPMBX_IRQ0_HI - BGE_MBX_IRQ0_HI;
 
 	CSR_WRITE_4(sc, off, val);
+	if ((sc->bge_flags & BGE_FLAG_MBOX_REORDER) != 0)
+		CSR_READ_4(sc, off);
 }
 
 /*
@@ -2609,8 +2613,8 @@
 		 * XXX
 		 * watchdog timeout issue was observed on BCM5704 which
 		 * lives behind PCI-X bridge(e.g AMD 8131 PCI-X bridge).
-		 * Limiting DMA address space to 32bits seems to address
-		 * it.
+		 * Both limiting DMA address space to 32bits and flushing
+		 * mailbox write seem to address the issue.
 		 */
 		if (sc->bge_flags & BGE_FLAG_PCIX)
 			lowaddr = BUS_SPACE_MAXADDR_32BIT;
@@ -2775,6 +2779,42 @@
 }
 
 static int
+bge_mbox_reorder(struct bge_softc *sc)
+{
+	/* Lists of PCI bridges that are known to reorder mailbox writes. */
+	static const struct mbox_reorder {
+		const uint16_t vendor;
+		const uint16_t device;
+		const char *desc;
+	} const mbox_reorder_lists[] = {
+		{ 0x1022, 0x7450, "AMD-8131 PCI-X Bridge" },
+	};
+	devclass_t pcib;
+	device_t dev;
+	int i, count, unit;
+
+	count = sizeof(mbox_reorder_lists) / sizeof(mbox_reorder_lists[0]);
+	pcib = devclass_find("pcib");
+	for (unit = 0; unit < devclass_get_maxunit(pcib); unit++) {
+		dev = devclass_get_device(pcib, unit);
+		if (dev == NULL)
+			continue;
+		for (i = 0; i < count; i++) {
+			if (pci_get_vendor(dev) ==
+			    mbox_reorder_lists[i].vendor &&
+			    pci_get_device(dev) ==
+			    mbox_reorder_lists[i].device) {
+				device_printf(sc->bge_dev,
+				    "enabling MBOX workaround for %s\n",
+				    mbox_reorder_lists[i].desc);
+				return (1);
+			}
+		}
+	}
+	return (0);
+}
+
+static int
 bge_attach(device_t dev)
 {
 	struct ifnet *ifp;
@@ -3094,6 +3134,14 @@
 	if (BGE_IS_5714_FAMILY(sc) && (sc->bge_flags & BGE_FLAG_PCIX))
 		sc->bge_flags |= BGE_FLAG_40BIT_BUG;
 	/*
+	 * Some PCI-X bridges are known to trigger write reordering to
+	 * the mailbox registers. Typical phenomena is watchdog timeouts
+	 * caused by out-of-order TX completions. Enable workaround for
+	 * PCI-X devices that live behind these bridges.
+	 */
+	if (sc->bge_flags & BGE_FLAG_PCIX && bge_mbox_reorder(sc) != 0)
+		sc->bge_flags |= BGE_FLAG_MBOX_REORDER;
+	/*
 	 * Allocate the interrupt, using MSI if possible. These devices
 	 * support 8 MSI messages, but only the first one is used in
 	 * normal operation.
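The mailbox part of the patch follows a common pattern: write the register, then read it back on affected systems, because a read from the device cannot complete until earlier posted writes to it have been delivered. The sketch below only models that pattern in userspace. struct mock_nic, reg_write(), reg_read(), mbox_write() and FLAG_MBOX_REORDER are hypothetical names for illustration, and an in-memory array stands in for the device registers, so no real flushing takes place.

```c
#include <stdint.h>

#define FLAG_MBOX_REORDER	0x1	/* hypothetical flag value */

/* Hypothetical register window that counts accesses for demonstration. */
struct mock_nic {
	uint32_t	regs[64];
	uint32_t	flags;
	unsigned	nreads;
	unsigned	nwrites;
};

static void
reg_write(struct mock_nic *sc, unsigned off, uint32_t val)
{
	sc->regs[off / 4] = val;
	sc->nwrites++;
}

static uint32_t
reg_read(struct mock_nic *sc, unsigned off)
{
	sc->nreads++;
	return (sc->regs[off / 4]);
}

/*
 * Write a mailbox register.  When the reorder quirk is flagged, read the
 * same register back so that, on real hardware, the write is pushed out
 * to the device before the next mailbox write is issued.
 */
static void
mbox_write(struct mock_nic *sc, unsigned off, uint32_t val)
{
	reg_write(sc, off, val);
	if ((sc->flags & FLAG_MBOX_REORDER) != 0)
		(void)reg_read(sc, off);
}
```

The read-back costs one extra bus round trip per mailbox update, which is presumably why the patch enables it only when a known-bad bridge is detected instead of unconditionally.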
Re: 9-stable - ifmedia_set: no match for 0x0/0xfffffff
On Sun, Jan 29, 2012 at 01:19:40PM +0900, Randy Bush wrote:
> > What happens if you set hw.bge.allow_asf to 0 and use
> > auto-negotiation on both sides?
>
> it works! the switch was already auto-neg, and i forced auto-neg on
> the server side.

Apart from the suspend/resume issue, bge(4) still needs more code to handle controllers with ASF/IPMI firmware. This part is mostly undocumented and hard to experiment with due to lack of hardware access. The current IPMI/ASF handling code shows mixed results, and setting hw.bge.allow_asf to 0 will break IPMI support.

> thanks. this was not pleasant. did i remember to whine that i am in
> tokyo and the server is on the beast coast of the states? :)
>
> i think a bit of a warning about hw.bge.allow_asf in UPDATING might help
> folk.
>
> thank you *very* much for your help.
>
> randy
Re: [releng_9 tinderbox] failure on powerpc/powerpc
On Wed, Jan 04, 2012 at 01:53:46AM +, FreeBSD Tinderbox wrote: TB --- 2012-01-03 23:59:19 - tinderbox 2.8 running on freebsd-stable.sentex.ca TB --- 2012-01-03 23:59:19 - starting RELENG_9 tinderbox run for powerpc/powerpc TB --- 2012-01-03 23:59:19 - cleaning the object tree TB --- 2012-01-03 23:59:42 - cvsupping the source tree TB --- 2012-01-03 23:59:42 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_9/powerpc/powerpc/supfile TB --- 2012-01-04 00:00:22 - building world TB --- 2012-01-04 00:00:22 - CROSS_BUILD_TESTING=YES TB --- 2012-01-04 00:00:22 - MAKEOBJDIRPREFIX=/obj TB --- 2012-01-04 00:00:22 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-01-04 00:00:22 - SRCCONF=/dev/null TB --- 2012-01-04 00:00:22 - TARGET=powerpc TB --- 2012-01-04 00:00:22 - TARGET_ARCH=powerpc TB --- 2012-01-04 00:00:22 - TZ=UTC TB --- 2012-01-04 00:00:22 - __MAKE_CONF=/dev/null TB --- 2012-01-04 00:00:22 - cd /src TB --- 2012-01-04 00:00:22 - /usr/bin/make -B buildworld World build started on Wed Jan 4 00:00:23 UTC 2012 Rebuilding the temporary build tree stage 1.1: legacy release compatibility shims stage 1.2: bootstrap tools stage 2.1: cleaning up the object tree stage 2.2: rebuilding the object tree stage 2.3: build tools stage 3: cross tools stage 4.1: building includes stage 4.2: building libraries stage 4.3: make dependencies stage 4.4: building everything World build completed on Wed Jan 4 01:52:02 UTC 2012 TB --- 2012-01-04 01:52:02 - generating LINT kernel config TB --- 2012-01-04 01:52:02 - cd /src/sys/powerpc/conf TB --- 2012-01-04 01:52:02 - /usr/bin/make -B LINT TB --- 2012-01-04 01:52:02 - cd /src/sys/powerpc/conf TB --- 2012-01-04 01:52:02 - /usr/sbin/config -m LINT TB --- 2012-01-04 01:52:02 - building LINT kernel TB --- 2012-01-04 01:52:02 - CROSS_BUILD_TESTING=YES TB --- 2012-01-04 01:52:02 - MAKEOBJDIRPREFIX=/obj TB --- 2012-01-04 01:52:02 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2012-01-04 01:52:02 - SRCCONF=/dev/null TB --- 2012-01-04 
01:52:02 - TARGET=powerpc TB --- 2012-01-04 01:52:02 - TARGET_ARCH=powerpc TB --- 2012-01-04 01:52:02 - TZ=UTC TB --- 2012-01-04 01:52:02 - __MAKE_CONF=/dev/null TB --- 2012-01-04 01:52:02 - cd /src TB --- 2012-01-04 01:52:02 - /usr/bin/make -B buildkernel KERNCONF=LINT Kernel build for LINT started on Wed Jan 4 01:52:02 UTC 2012 stage 1: configuring the kernel stage 2.1: cleaning up the object tree stage 2.2: rebuilding the object tree stage 2.3: build tools stage 3.1: making dependencies [...] awk -f /src/sys/tools/makeobjops.awk /src/sys/powerpc/powerpc/iommu_if.m -h awk -f /src/sys/tools/makeobjops.awk /src/sys/powerpc/powerpc/mmu_if.m -h awk -f /src/sys/tools/makeobjops.awk /src/sys/powerpc/powerpc/pic_if.m -h awk -f /src/sys/tools/makeobjops.awk /src/sys/powerpc/powerpc/platform_if.m -h rm -f .newdep /usr/bin/make -V CFILES -V SYSTEM_CFILES -V GEN_CFILES | MKDEP_CPP=cc -E CC=cc xargs mkdep -a -f .newdep -O -pipe -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -I/src/sys/contrib/ipfilter -I/src/sys/contrib/pf -I/src/sys/dev/ath -I/src/sys/dev/ath/ath_hal -I/src/sys/contrib/ngatm -I/src/sys/dev/twa -I/src/sys/gnu/fs/xfs/FreeBSD -I/src/sys/gnu/fs/xfs/FreeBSD/support -I/src/sys/gnu/fs/xfs -I/src/sys/dev/cxgb -I/src/sys/dev/cxgbe -I/src/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=15000 --param inline-unit-growth=100 --param large-function-growth=1000 -fno-builtin -msoft-float -Wa,-many -fno-omit-frame-pointer -msoft-float -mno-altivec -ffreestanding -fstack-protector /src/sys/dev/ti/if_ti.c:134:2: error: #error options TI_JUMBO_HDRSPLIT requires TI_SF_BUF_JUMBO mkdep: compile failed *** Error code 1 I think this is transient tinderbox error triggered by r229432. 
Stop in /obj/powerpc.powerpc/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2012-01-04 01:53:45 - WARNING: /usr/bin/make returned exit code 1 TB --- 2012-01-04 01:53:45 - ERROR: failed to build LINT kernel TB --- 2012-01-04 01:53:45 - 5441.09 user 688.28 system 6866.24 real http://tinderbox.freebsd.org/tinderbox-releng_9-RELENG_9-powerpc-powerpc.full ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 9.0-RC2 re(4) no memory for jumbo buffers issue
On Sun, Jan 01, 2012 at 09:03:07PM -0500, Mike Andrews wrote: On Fri, 30 Dec 2011, YongHyeon PYUN wrote: On Thu, Dec 29, 2011 at 10:51:25PM -0500, Mike Andrews wrote: On 11/28/2011 6:42 PM, YongHyeon PYUN wrote: On Mon, Nov 28, 2011 at 05:38:16PM -0500, Mike Andrews wrote: On 11/27/11 8:39 PM, YongHyeon PYUN wrote: On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote: I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek RTL8111C-GR gigabit NICs on it. As far as I can tell, these support jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on Actually the maximum size is 6KB for RTL8111C, not 7422. RTL8111C and newer PCIe based gigabit controllers no longer support scattering a jumbo frame into multiple RX buffers so a single RX buffer has to receive an entire jumbo frame. This adds more burden to system because it has to allocate a jumbo frame even when it receives a pure TCP ACK. OK, that makes sense. FreeBSD 9.0-RC2, after a week or so of update, with fairly light network activity, the interfaces die with no memory for jumbo buffers errors on the console. Unloading and reloading the driver (via serial console) doesn't help; only rebooting seems to clear it up. The jumbo code path is the same as normal MTU sized one so I think possibility of leaking mbufs in driver is very low. And the message no memory for jumbo RX buffers can only happen either when you up the interface again or interface restart triggered by watchdog timeout handler. I don't think you're seeing watchdog timeouts though. I'm fairly certain the interface isn't changing state when this happens -- it just kinda spontaneously happens after a week or two, with no interface up/down transitions. I don't see any watchdog messages when this happens. There is another code path that causes controller reinitialization. If you change MTU or offloading configuration(TSO, VLAN tagging, checksum offloading etc) it will reinitialize the controller. 
So do you happen to trigger one of these code path during a week or two? When you see no memory for jumbo RX buffers message, did you check available mbuf pool? Not yet, that's why I asked for debugging tips -- I'll do that the next time this happens. What's the best way to go about debugging this... which sysctl's should I be looking at first? I have already tried raising kern.ipc.nmbjumbo9 to 16384 and it doesn't seem to help things... maybe prolonging it slightly, but not by much. The problem is it takes a week or so to reproduce the problem each time... I vaguely guess it could be related with other subsystem which leaks mbufs such that driver was not able to get more jumbo RX buffers from system. For instance, r228016 would be worth to try on your box. I can't clearly explain why em(4) does not suffer from the issue though. I've just this morning built a kernel with that fix, so we'll see how that goes. Ok. OK, this just happened again with a 9.0-RC3 kernel rev r228247. whitedog# ifconfig re0 down;ifconfig re0 up;ifconfig re1 down;ifconfig Ah, sorry. I should have spotted this issue earlier. Try attached patch and let me know whether it makes any difference. 
re1 up re0: no memory for jumbo RX buffers re1: no memory for jumbo RX buffers whitedog# netstat -m 526/1829/2355 mbufs in use (current/cache/total) 0/1278/1278/25600 mbuf clusters in use (current/cache/total/max) 0/356 mbuf+clusters out of packet secondary zone in use (current/cache) 0/336/336/12800 4k (page size) jumbo clusters in use (current/cache/total/max) 512/385/897/6400 9k jumbo clusters in use (current/cache/total/max) 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) 4739K/7822K/12561K bytes allocated to network (current/cache/total) 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/4560/0 requests for jumbo clusters denied (4k/9k/16k) 0/0/0 sfbufs in use (current/peak/max) 0 requests for sfbufs denied 0 requests for sfbufs delayed 0 requests for I/O initiated by sendfile 0 calls to protocol drain routines OK, well, the patch changes things... kind of :) After putting a lot of stress on the network -- namely about three passes 'make buildworld buildkernel' over NFS/TCP with a 5000 byte MTU -- the interface hangs again, but the symptoms are now different. First, no When you think the interface is stuck, can you check which part(TX, RX or both) of MAC is in stuck condition? If you can see receiving packets with tcpdump it means RX MAC is still working. If you can see packets sent from host with re(4) on destination host that means TX MAC works. console messages whatsoever, other than NFS timeouts -- even if you ifconfig up/down the interface, which previously would generate the 'no memory for jumbo RX buffers' message. That message no longer appears, ever. Even weirder, the interface will revive itself on its own after about 15
Re: 9.0-RC2 re(4) no memory for jumbo buffers issue
On Thu, Dec 29, 2011 at 10:51:25PM -0500, Mike Andrews wrote: On 11/28/2011 6:42 PM, YongHyeon PYUN wrote: On Mon, Nov 28, 2011 at 05:38:16PM -0500, Mike Andrews wrote: On 11/27/11 8:39 PM, YongHyeon PYUN wrote: On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote: I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek RTL8111C-GR gigabit NICs on it. As far as I can tell, these support jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on Actually the maximum size is 6KB for RTL8111C, not 7422. RTL8111C and newer PCIe based gigabit controllers no longer support scattering a jumbo frame into multiple RX buffers so a single RX buffer has to receive an entire jumbo frame. This adds more burden to system because it has to allocate a jumbo frame even when it receives a pure TCP ACK. OK, that makes sense. FreeBSD 9.0-RC2, after a week or so of update, with fairly light network activity, the interfaces die with no memory for jumbo buffers errors on the console. Unloading and reloading the driver (via serial console) doesn't help; only rebooting seems to clear it up. The jumbo code path is the same as normal MTU sized one so I think possibility of leaking mbufs in driver is very low. And the message no memory for jumbo RX buffers can only happen either when you up the interface again or interface restart triggered by watchdog timeout handler. I don't think you're seeing watchdog timeouts though. I'm fairly certain the interface isn't changing state when this happens -- it just kinda spontaneously happens after a week or two, with no interface up/down transitions. I don't see any watchdog messages when this happens. There is another code path that causes controller reinitialization. If you change MTU or offloading configuration(TSO, VLAN tagging, checksum offloading etc) it will reinitialize the controller. So do you happen to trigger one of these code path during a week or two? 
When you see no memory for jumbo RX buffers message, did you check available mbuf pool? Not yet, that's why I asked for debugging tips -- I'll do that the next time this happens. What's the best way to go about debugging this... which sysctl's should I be looking at first? I have already tried raising kern.ipc.nmbjumbo9 to 16384 and it doesn't seem to help things... maybe prolonging it slightly, but not by much. The problem is it takes a week or so to reproduce the problem each time... I vaguely guess it could be related with other subsystem which leaks mbufs such that driver was not able to get more jumbo RX buffers from system. For instance, r228016 would be worth to try on your box. I can't clearly explain why em(4) does not suffer from the issue though. I've just this morning built a kernel with that fix, so we'll see how that goes. Ok. OK, this just happened again with a 9.0-RC3 kernel rev r228247. whitedog# ifconfig re0 down;ifconfig re0 up;ifconfig re1 down;ifconfig Ah, sorry. I should have spotted this issue earlier. Try attached patch and let me know whether it makes any difference. 
re1 up
re0: no memory for jumbo RX buffers
re1: no memory for jumbo RX buffers
whitedog# netstat -m
526/1829/2355 mbufs in use (current/cache/total)
0/1278/1278/25600 mbuf clusters in use (current/cache/total/max)
0/356 mbuf+clusters out of packet secondary zone in use (current/cache)
0/336/336/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
512/385/897/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
4739K/7822K/12561K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

Index: sys/dev/re/if_re.c
===================================================================
--- sys/dev/re/if_re.c	(revision 229006)
+++ sys/dev/re/if_re.c	(working copy)
@@ -3558,7 +3558,6 @@
 	}
 
 	/* Free the TX list buffers. */
-
 	for (i = 0; i < sc->rl_ldata.rl_tx_desc_cnt; i++) {
 		txd = &sc->rl_ldata.rl_tx_desc[i];
 		if (txd->tx_m != NULL) {
@@ -3572,11 +3571,10 @@
 	}
 
 	/* Free the RX list buffers. */
-
 	for (i = 0; i < sc->rl_ldata.rl_rx_desc_cnt; i++) {
 		rxd = &sc->rl_ldata.rl_rx_desc[i];
 		if (rxd->rx_m != NULL) {
-			bus_dmamap_sync(sc->rl_ldata.rl_tx_mtag,
+			bus_dmamap_sync(sc->rl_ldata.rl_rx_mtag,
 			    rxd->rx_dmamap, BUS_DMASYNC_POSTREAD);
 			bus_dmamap_unload(sc->rl_ldata.rl_rx_mtag,
 			    rxd->rx_dmamap);
@@ -3584,6 +3582,20 @@
 			rxd->rx_m = NULL;
 		}
 	}
+
+	if ((sc->rl_flags & RL_FLAG_JUMBOV2) != 0) {
+		for (i = 0; i < sc->rl_ldata.rl_rx_desc_cnt; i++) {
+			rxd = &sc->rl_ldata.rl_jrx_desc[i];
+			if (rxd->rx_m != NULL) {
+				bus_dmamap_sync(sc
Re: 9.0-RC2 re(4) no memory for jumbo buffers issue
On Mon, Nov 28, 2011 at 05:38:16PM -0500, Mike Andrews wrote: On 11/27/11 8:39 PM, YongHyeon PYUN wrote: On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote: I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek RTL8111C-GR gigabit NICs on it. As far as I can tell, these support jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on Actually the maximum size is 6KB for RTL8111C, not 7422. RTL8111C and newer PCIe based gigabit controllers no longer support scattering a jumbo frame into multiple RX buffers so a single RX buffer has to receive an entire jumbo frame. This adds more burden to system because it has to allocate a jumbo frame even when it receives a pure TCP ACK. OK, that makes sense. FreeBSD 9.0-RC2, after a week or so of update, with fairly light network activity, the interfaces die with no memory for jumbo buffers errors on the console. Unloading and reloading the driver (via serial console) doesn't help; only rebooting seems to clear it up. The jumbo code path is the same as normal MTU sized one so I think possibility of leaking mbufs in driver is very low. And the message no memory for jumbo RX buffers can only happen either when you up the interface again or interface restart triggered by watchdog timeout handler. I don't think you're seeing watchdog timeouts though. I'm fairly certain the interface isn't changing state when this happens -- it just kinda spontaneously happens after a week or two, with no interface up/down transitions. I don't see any watchdog messages when this happens. There is another code path that causes controller reinitialization. If you change MTU or offloading configuration(TSO, VLAN tagging, checksum offloading etc) it will reinitialize the controller. So do you happen to trigger one of these code path during a week or two? When you see no memory for jumbo RX buffers message, did you check available mbuf pool? 
Not yet, that's why I asked for debugging tips -- I'll do that the next time this happens. What's the best way to go about debugging this... which sysctl's should I be looking at first? I have already tried raising kern.ipc.nmbjumbo9 to 16384 and it doesn't seem to help things... maybe prolonging it slightly, but not by much. The problem is it takes a week or so to reproduce the problem each time... I vaguely guess it could be related with other subsystem which leaks mbufs such that driver was not able to get more jumbo RX buffers from system. For instance, r228016 would be worth to try on your box. I can't clearly explain why em(4) does not suffer from the issue though. I've just this morning built a kernel with that fix, so we'll see how that goes. Ok.
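For anyone who wants to watch the mbuf pool while waiting for the problem to reproduce, here is a minimal sketch that pulls the 9k counter out of the "requests for jumbo clusters denied (4k/9k/16k)" line of netstat -m. The function name denied_jumbo9 is made up for this example, and the parsing assumes the line layout shown in the netstat -m output posted in this thread.

```sh
#!/bin/sh
# denied_jumbo9 (hypothetical helper): read `netstat -m` output on stdin and
# print the number of denied requests for 9k jumbo clusters.
# Assumes a line of the form:
#   0/4560/0 requests for jumbo clusters denied (4k/9k/16k)
denied_jumbo9() {
    awk '/requests for jumbo clusters denied/ {
        # The first field is "4k/9k/16k"; the second counter is the 9k zone.
        split($1, n, "/")
        print n[2]
    }'
}
```

Usage would be something like `netstat -m | denied_jumbo9`, run from cron or a loop, so that a counter that starts growing is noticed before the interface stops working.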
Re: 9.0-RC2 re(4) no memory for jumbo buffers issue
On Sat, Nov 26, 2011 at 04:05:58PM -0500, Mike Andrews wrote: I have a Supermicro 5015A-H (Intel Atom 330) server with two Realtek RTL8111C-GR gigabit NICs on it. As far as I can tell, these support jumbo frames up to 7422 bytes. When running them at an MTU of 5000 on Actually the maximum size is 6KB for RTL8111C, not 7422. RTL8111C and newer PCIe based gigabit controllers no longer support scattering a jumbo frame into multiple RX buffers so a single RX buffer has to receive an entire jumbo frame. This adds more burden to system because it has to allocate a jumbo frame even when it receives a pure TCP ACK. FreeBSD 9.0-RC2, after a week or so of update, with fairly light network activity, the interfaces die with no memory for jumbo buffers errors on the console. Unloading and reloading the driver (via serial console) doesn't help; only rebooting seems to clear it up. The jumbo code path is the same as normal MTU sized one so I think possibility of leaking mbufs in driver is very low. And the message no memory for jumbo RX buffers can only happen either when you up the interface again or interface restart triggered by watchdog timeout handler. I don't think you're seeing watchdog timeouts though. When you see no memory for jumbo RX buffers message, did you check available mbuf pool? I don't have this issue with any of my em(4) based systems that are also using a 5000 byte MTU -- and they push considerably more traffic. I don't really consider this a regression from FreeBSD 8.2 because 8.2 didn't support jumbos at all on this hardware... :) What's the best way to go about debugging this... which sysctl's should I be looking at first? I have already tried raising kern.ipc.nmbjumbo9 to 16384 and it doesn't seem to help things... maybe prolonging it slightly, but not by much. The problem is it takes a week or so to reproduce the problem each time... 
My vague guess is that some other subsystem is leaking mbufs, so the driver could not get more jumbo RX buffers from the system. For instance, r228016 would be worth trying on your box. I can't clearly explain why em(4) does not suffer from the issue, though.
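Since a single RX buffer has to hold the entire frame, a little arithmetic shows why kern.ipc.nmbjumbo9 is the relevant limit at a 5000-byte MTU. The sketch below is only an illustration: the function name rx_cluster_size is invented, the per-frame overhead (14-byte Ethernet header, 4-byte VLAN tag, 4-byte CRC) and the zone sizes (2048, 4096, 9216, 16384 bytes) are assumptions, and picking the smallest zone that fits is not necessarily what re(4) itself does.

```sh
#!/bin/sh
# rx_cluster_size (hypothetical helper): print the smallest mbuf cluster zone,
# in bytes, that can hold one whole frame for the given MTU.
# Assumed overhead: 14 (Ethernet header) + 4 (VLAN tag) + 4 (CRC).
# Assumed zone sizes: 2048, 4096 (page size), 9216 (9k), 16384 (16k).
rx_cluster_size() {
    frame=$(($1 + 14 + 4 + 4))
    for size in 2048 4096 9216 16384; do
        if [ "$frame" -le "$size" ]; then
            echo "$size"
            return 0
        fi
    done
    echo "frame of $frame bytes fits no cluster zone" >&2
    return 1
}
```

Under these assumptions `rx_cluster_size 5000` lands in the 9k zone while `rx_cluster_size 1500` fits an ordinary 2k cluster, which is consistent with the 9k counters being the ones that move in the netstat -m output above.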
Re: Call for testers : ALi/ULi M5261/M5263 ethernet controller
On Sun, Oct 16, 2011 at 05:22:13PM -0700, YongHyeon PYUN wrote:
> Hi,
>
> If you have an ALi/ULi M5261/M5263 ethernet controller, please try the
> patch at the following URL and let me know how it works.
>
> http://people.freebsd.org/~yongari/dc/dc.uli562x.diff
>
> The patch was generated against the latest HEAD and it should apply
> cleanly to the latest stable/8 and stable/7.

I committed a revised version to HEAD (r226699, r226701).

> Thanks.
Re: Call for testers : ALi/ULi M5261/M5263 ethernet controller
On Wed, Oct 19, 2011 at 12:09:41PM +0200, Marco Steinbach wrote: YongHyeon PYUN wrote on 17.10.2011 02:22: Hi, If you have ALi/ULi M5261/M5263 ethernet controller please try the patch at the following URL and let me know how it works. http://people.freebsd.org/~yongari/dc/dc.uli562x.diff The patch was generated against latest HEAD and it should be cleanly applied to latest stable/8 and stable/7. Thanks. Thank you for working on this. Although the patch applies cleanly, it doesn't seem to work for me. I'm getting the following message upon boot: dc0: ULi M5263 FastEthernet port 0xe400-0xe4ff mem 0xff6fec00-0xff6fecff irq 17 at device 17.0 on pci0 dc0: attaching PHYs failed device_attach: dc0 attach returned 6 The device doesn't show up in ifconfig. FreeBSD x2.c0c0.intra 8.2-STABLE FreeBSD 8.2-STABLE #2 r226509M: Wed Oct 19 11:37:33 CEST 2011 root@x2.c0c0.intra:/usr/obj/usr/src/sys/GENERIC i386 dc0@pci0:0:17:0:class=0x02 card=0x52631849 chip=0x526310b9 rev=0x40 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ULi PCI Fast Ethernet Controller (Albatron K8ULTRA-U Pro)' class = network subclass = ethernet It's a lab machine, so I'm completely free to try out anything you might suggest. Thanks for testing! I already got a feedback from user and the user also said the patch does not work. I'm trying to debug the issue as the user is willing to provide remote access. However it seems it's somewhat hard for the user to setup remote debugging environments. BTW, can you setup remote debugging environments like the following URL? http://people.freebsd.org/~yongari/remote_debugging.txt I'll let you know if I manage to make it work. 
Here's the complete output of pciconf -lv: hostb0@pci0:0:0:0: class=0x06 card=0x chip=0x169510b9 rev=0x00 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ULi M1695 K8 Northbridge with PCIe and hypertransport' class = bridge subclass = HOST-PCI pcib1@pci0:0:1:0: class=0x060400 card=0x chip=0x524b10b9 rev=0x00 hdr=0x01 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ALi PCIe Bridge' class = bridge subclass = PCI-PCI pcib2@pci0:0:2:0: class=0x060400 card=0x chip=0x524c10b9 rev=0x00 hdr=0x01 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ALi PCIe Bridge' class = bridge subclass = PCI-PCI pcib3@pci0:0:3:0: class=0x060400 card=0x chip=0x524d10b9 rev=0x00 hdr=0x01 vendor = 'Acer Labs Incorporated (ALi/ULi)' class = bridge subclass = PCI-PCI hostb1@pci0:0:4:0: class=0x06 card=0x chip=0x168910b9 rev=0x00 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ULi M1689 K8 Northbridge with AGP and hypertransport' class = bridge subclass = HOST-PCI pcib4@pci0:0:5:0: class=0x060400 card=0x chip=0x524610b9 rev=0x00 hdr=0x01 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ULi AGP 3.0 Controller' class = bridge subclass = PCI-PCI pcib5@pci0:0:6:0: class=0x060401 card=0x chip=0x524910b9 rev=0x00 hdr=0x01 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'HyperTransport to PCI Bridge (M5249)' class = bridge subclass = PCI-PCI isab0@pci0:0:7:0: class=0x060100 card=0x15631849 chip=0x156310b9 rev=0x70 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ALI M1563 South Bridge with Hypertransport Support' class = bridge subclass = PCI-ISA none0@pci0:0:7:1: class=0x068000 card=0x71011849 chip=0x710110b9 rev=0x00 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ALI M7101 Power Management Controller' class = bridge none1@pci0:0:8:0: class=0x040100 card=0x08501849 chip=0x545510b9 rev=0x20 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'AC'97 Audio Controller (M1563M Southbridge)' class = multimedia subclass 
= audio dc0@pci0:0:17:0:class=0x02 card=0x52631849 chip=0x526310b9 rev=0x40 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'ULi PCI Fast Ethernet Controller (Albatron K8ULTRA-U Pro)' class = network subclass = ethernet atapci2@pci0:0:18:0:class=0x01018a card=0x52291849 chip=0x522910b9 rev=0xc7 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'EIDE Controller (M5229 Southbridge)' class = mass storage subclass = ATA atapci3@pci0:0:18:1:class=0x01018f card=0x52891849 chip=0x528910b9 rev=0x10 hdr=0x00 vendor = 'Acer Labs Incorporated (ALi/ULi)' device = 'M5289 SATA/Raid controller (ULI
Re: Call for testers : ALi/ULi M5261/M5263 ethernet controller
On Wed, Oct 19, 2011 at 11:50:10AM -0700, Jeremy Chadwick wrote:
> On Wed, Oct 19, 2011 at 11:43:17AM -0700, YongHyeon PYUN wrote:
> > Thanks for testing!
> > I already got feedback from another user, who also said the patch does
> > not work. I'm trying to debug the issue, as that user is willing to
> > provide remote access. However, it seems to be somewhat hard for the
> > user to set up a remote debugging environment.
> > BTW, can you set up a remote debugging environment like the one
> > described at the following URL?
> > http://people.freebsd.org/~yongari/remote_debugging.txt
> >
> > I'll let you know if I manage to make it work.
>
> YongHyeon and others,
>
> If you guys can't get a good development environment going for
> YongHyeon, let me know and I can invest in one of these motherboards and
> either send it to YongHyeon (back in South Korea?) or I can set it up

I'm still living in the US. :-)

> locally and get him serial console access to boot. Just let me know if
> all other avenues are exhausted.

Thanks for the offer. If all remote debugging fails I'll ask for your help.

> --
> | Jeremy Chadwick                          jdc at parodius.com |
> | Parodius Networking              http://www.parodius.com/ |
> | UNIX Systems Administrator          Mountain View, CA, US |
> | Making life hard for others since 1977.      PGP 4BD6C0CB |
Call for testers : ALi/ULi M5261/M5263 ethernet controller
Hi,

If you have an ALi/ULi M5261/M5263 ethernet controller, please try the patch at the following URL and let me know how it works.

http://people.freebsd.org/~yongari/dc/dc.uli562x.diff

The patch was generated against the latest HEAD and it should apply cleanly to the latest stable/8 and stable/7.

Thanks.
Re: Realtek integrated nic problem
On Sun, Sep 25, 2011 at 03:57:02AM -0300, Nenhum_de_Nos wrote: On Sun, September 25, 2011 03:46, Jeremy Chadwick wrote: On Sun, Sep 25, 2011 at 03:32:36AM -0300, Nenhum_de_Nos wrote: On Sat, September 24, 2011 22:12, Adrian Chadd wrote: Surely this is something to take up with the pfsense team? about the compiling issue yes, but yet the info about the if_re.ko is of great value here :) as Jeremy said, the maintainer would be the best person to answer this :) I believe Adrian's point (and it's 100% valid) is that for pfSense issues you really need to bring them up with the pfSense folks. We all recognise pfSense is based on FreeBSD, but it's a fairly customised environment. Point is that mailing a FreeBSD list about issues centralised to pfSense isn't the best choice; for example, you wouldn't mail the lkml list about an issue with Red Hat. You have to bring these issues to the distributor's attention first. The other benefit is that by bringing it to the pfSense folks' attention, it may be possible to get a patch or updated driver brought in to the pfSense tree, which could fix the problem for future users. The FreeBSD mailing lists, generally speaking, have no idea what the state of things is with pfSense. For example *I* have no idea what FreeBSD version they use, what custom modifications they have in place, etc.. I follow FreeBSD, I don't follow pfSense. :-) I know Jeremy. I just said about pfSense as I thought it would give some context about the problem, but my issue now that I know 8.2 would run the nic fine (thing FreeBSD related, I suppose), is if is safe to run one .ko from 8.2 on 8.1. I just asked here because I think its related to FreeBSD, No, you should have to rebuild kernel with updated re(4) driver. Probably pfSense guys would be able to release new driver. Or they can give you right direction to rebuild kernel. Sorry, I don't use pfSense so don't know how this could be done. regardless of pfSense being the target box. 
If it is possible on FreeBSD, I'll find out about pfSense afterward. My focus here is FreeBSD. I have no intention of making a big deal about this, nor of asking off-topic questions here. If it is seen as such, I'll shut up then. Thanks, matheus
Re: busdma MFC broke ipfw fwd for RELENG_6
On Sat, Sep 17, 2011 at 09:03:50PM +0700, Eugene Grosbein wrote:
> 17.09.2011 02:13, YongHyeon PYUN wrote:
> > On Fri, Sep 16, 2011 at 11:45:25AM +0700, Eugene Grosbein wrote:
> > > 16.09.2011 02:19, YongHyeon PYUN wrote:
> > > > On Fri, Sep 16, 2011 at 02:02:37AM +0700, Eugene Grosbein wrote:
> > > > > 16.09.2011 01:15, YongHyeon PYUN wrote:
> > > > > > I remember re(4) in 6.x also has a couple of bus_dma(9) bugs.
> > > > > > How about applying the following revision?
> > > > > > http://svnweb.freebsd.org/base?view=revision&revision=175337
> > > > > > Not sure whether it will apply cleanly.
> > > > > It does not, and there are too many differences in the code for
> > > > > my skills to apply it manually :-)
> > > > > > Alternatively try replacing BUS_DMA_ALLOCNOW with 0 in
> > > > > > bus_dma_tag_create(9).
> > > > > I'm not sure I understand this right... Do you mean this change?
> > > > No, change BUS_DMA_ALLOCNOW used in re(4).
> > > With clean RELENG_6 sources and only the following patch, the
> > > problem still persists. :-(
> > I have back-ported re(4)/rl(4) for latest 6.x.
> > http://people.freebsd.org/~yongari/re/6.x/README.txt
> > Just compile tested and not sure whether it fixes the issue.
> I confirm that the problem disappears using this driver with clean
> RELENG_6 sources and RealTek 8169/8169S/8169SB(L)/8110S/8110SB(L)
> Gigabit Ethernet:
>
> re0@pci1:11:0: class=0x02 card=0x816910ec chip=0x816910ec rev=0x10 hdr=0x00
>     vendor = 'Realtek Semiconductor'
>     device = 'RTL8110SB Single-Chip Gigabit LOM Ethernet Controller'
>     class = network
>     subclass = ethernet
>     cap 01[dc] = powerspec 2 supports D0 D1 D2 D3 current D0
>
> Also, this machine uses RT8139 (rl0) for its LAN and it works too.

Glad to hear that. Unfortunately I have no plan or time to merge all the changes made in re(4)/rl(4) to stable/6, so you may have to stick to this unofficial driver. As you already know, I overhauled these drivers a long time ago and there were too many changes.

> Eugene Grosbein
Re: busdma MFC broke ipfw fwd for RELENG_6
On Sun, Sep 18, 2011 at 03:11:46AM +0700, Eugene Grosbein wrote:
> 18.09.2011 03:05, YongHyeon PYUN wrote:
> > > > I have back-ported re(4)/rl(4) for latest 6.x.
> > > > http://people.freebsd.org/~yongari/re/6.x/README.txt
> > > > Just compile tested and not sure whether it fixes the issue.
> > > I confirm that the problem disappears using this driver with clean
> > > RELENG_6 sources and RealTek 8169/8169S/8169SB(L)/8110S/8110SB(L)
> > > Gigabit Ethernet:
> > >
> > > re0@pci1:11:0: class=0x02 card=0x816910ec chip=0x816910ec rev=0x10 hdr=0x00
> > >     vendor = 'Realtek Semiconductor'
> > >     device = 'RTL8110SB Single-Chip Gigabit LOM Ethernet Controller'
> > >     class = network
> > >     subclass = ethernet
> > >     cap 01[dc] = powerspec 2 supports D0 D1 D2 D3 current D0
> > >
> > > Also, this machine uses RT8139 (rl0) for its LAN and it works too.
> > Glad to hear that. Unfortunately I have no plan or time to merge all
> > the changes made in re(4)/rl(4) to stable/6, so you may have to stick
> > to this unofficial driver. As you already know, I overhauled these
> > drivers a long time ago and there were too many changes.
> Well, given that this hardware worked just fine with the stock driver
> before the busdma commit, it could be less overhead for me to roll back
> that one small busdma chunk :-)
> Who knows which other drivers, besides re(4), were broken by the busdma
> change in 6.4-STABLE back in 2010...

I agree, but that MFC may have fixed bugs in other drivers which correctly used the bus_dma(9) KPI.
Re: busdma MFC broke ipfw fwd for RELENG_6
On Fri, Sep 16, 2011 at 11:45:25AM +0700, Eugene Grosbein wrote:
> 16.09.2011 02:19, YongHyeon PYUN wrote:
> > On Fri, Sep 16, 2011 at 02:02:37AM +0700, Eugene Grosbein wrote:
> > > 16.09.2011 01:15, YongHyeon PYUN wrote:
> > > > I remember re(4) in 6.x also has a couple of bus_dma(9) bugs.
> > > > How about applying the following revision?
> > > > http://svnweb.freebsd.org/base?view=revision&revision=175337
> > > > Not sure whether it will apply cleanly.
> > > It does not, and there are too many differences in the code for my
> > > skills to apply it manually :-)
> > > > Alternatively try replacing BUS_DMA_ALLOCNOW with 0 in
> > > > bus_dma_tag_create(9).
> > > I'm not sure I understand this right... Do you mean this change?
> > No, change BUS_DMA_ALLOCNOW used in re(4).
> With clean RELENG_6 sources and only the following patch, the problem
> still persists. :-(

I have back-ported re(4)/rl(4) for latest 6.x.
http://people.freebsd.org/~yongari/re/6.x/README.txt
Just compile tested and not sure whether it fixes the issue.
Re: busdma MFC broke ipfw fwd for RELENG_6
On Fri, Sep 16, 2011 at 12:41:23AM +0700, Eugene Grosbein wrote:
> 16.09.2011 00:14, John Baldwin wrote:
> > On Thursday, September 15, 2011 12:44:32 pm Eugene Grosbein wrote:
> > > Hi!
> > >
> > > I understand that it is a bit late for RELENG_6 reports, as
> > > 6.4-RELEASE came out in 2008, but the breakage happened due to an
> > > MFC, so it's possible the same problem exists in newer branches.
> > >
> > > Long story short: I've updated my old 6.4-STABLE system for recent
> > > zoneinfo updates and found the update broke the 'ipfw fwd' feature:
> > > forwarded packets get corrupted, while routed packets go through
> > > just fine.
> > >
> > > The commit in question was made on 2010/08/06:
> > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/i386/i386/Attic/busdma_machdep.c.diff?r1=1.74.2.6;r2=1.74.2.7
> > >
> > > I've rolled it back using recent RELENG_6 sources and the packet
> > > corruption has disappeared.
> >
> > It may be a bug in the rl(4) driver? Perhaps this is a candidate to
> > try?
> > http://svnweb.freebsd.org/base?view=revision&revision=184240
>
> My system is i386 with only 512MB RAM, and outgoing packets go through
> re(4), not rl(4). Nevertheless, I've just applied revision 184240 to
> RELENG_6 manually (it did not apply cleanly) and reapplied the MFC.
> The problem has returned.

I remember re(4) in 6.x also has a couple of bus_dma(9) bugs. How about applying the following revision?
http://svnweb.freebsd.org/base?view=revision&revision=175337
Not sure whether it will apply cleanly.
Alternatively try replacing BUS_DMA_ALLOCNOW with 0 in bus_dma_tag_create(9).

> Eugene Grosbein
Re: busdma MFC broke ipfw fwd for RELENG_6
On Fri, Sep 16, 2011 at 02:02:37AM +0700, Eugene Grosbein wrote:
> 16.09.2011 01:15, YongHyeon PYUN wrote:
> > I remember re(4) in 6.x also has a couple of bus_dma(9) bugs.
> > How about applying the following revision?
> > http://svnweb.freebsd.org/base?view=revision&revision=175337
> > Not sure whether it will apply cleanly.
> It does not, and there are too many differences in the code for my
> skills to apply it manually :-)
> > Alternatively try replacing BUS_DMA_ALLOCNOW with 0 in
> > bus_dma_tag_create(9).
> I'm not sure I understand this right... Do you mean this change?

No, change BUS_DMA_ALLOCNOW used in re(4).

> This is the only place in busdma_machdep.c where BUS_DMA_ALLOCNOW is
> used.
>
> --- busdma_machdep.c.orig 2011-09-16 01:56:52.0 +0700
> +++ busdma_machdep.c 2011-09-16 01:57:01.0 +0700
> @@ -284,7 +284,7 @@
>  		newtag->flags |= BUS_DMA_COULD_BOUNCE;
>  
>  	if (((newtag->flags & BUS_DMA_COULD_BOUNCE) != 0) &&
> -	    (flags & BUS_DMA_ALLOCNOW) != 0) {
> +	    (flags & 0) != 0) {
>  		struct bounce_zone *bz;
>  
>  		/* Must bounce */
>
> I'm going to lose control of this remote box if I apply a wrong patch
> for re(4).
>
> Eugene Grosbein
Re: Unknown Re0 Hardware version
On Sun, Aug 21, 2011 at 04:01:10PM +0200, Willem Jan Withagen wrote:
> Hi,
>
> I'm assembling a few systems with an ASUS P8 H161-MLE motherboard, which
> was supposed to have a 'Realtek® 8112L, 1 x Gigabit LAN Controller(s)'
> onboard. To be honest, I never expected that version not to be
> supported.
>
> I just booted 8.2-RELEASE on it, and the installer crashed when I wanted
> it to configure the Ethernet. After a reboot, re0 kicks in, but reports
> that the HW revision is not supported. It claims HW revision 0x2c80.
>
> Is this supported in later 8.2-Stable, or in 9.x?
> I'm willing to tinker with the code to recompile the re0 driver.

Your controller looks like an RTL8168E VL; support for that controller
was added after 8.2-RELEASE. Either update your source to stable/8, or
patch your source tree with the back-ported re(4) driver for 8.2-RELEASE
as follows.

1. Fetch http://people.freebsd.org/~yongari/re/8.2R/if_re.c and copy it
   to the /usr/src/sys/dev/re directory.
2. Fetch http://people.freebsd.org/~yongari/re/8.2R/if_rlreg.h and copy
   it to the /usr/src/sys/pci directory.

Then rebuild your kernel; your controller should be recognized on the
next boot.

> --WjW
Re: disable 64-bit dma for one PCI slot only?
On Wed, Jul 20, 2011 at 11:54:06AM +0200, Stefan Esser wrote:
> Am 19.07.2011 20:17, schrieb Artem Belevich:
> > On Tue, Jul 19, 2011 at 6:31 AM, John Baldwin <j...@freebsd.org> wrote:
> > > The only reason it might be nice to stick with two fields is due to
> > > the line length (though the first line is over 80 cols even in the
> > > current format). Here are two possible suggestions:
> > >
> > > old:
> > >
> > > hostb0@pci0:0:0:0: class=0x06 card=0x20108086 chip=0x01008086 rev=0x09 hdr=0x00
> > > pcib1@pci0:0:1:0: class=0x060400 card=0x20108086 chip=0x01018086 rev=0x09 hdr=0x01
> > > pcib2@pci0:0:1:1: class=0x060400 card=0x20108086 chip=0x01058086 rev=0x09 hdr=0x01
> > > none0@pci0:0:22:0: class=0x078000 card=0x47428086 chip=0x1c3a8086 rev=0x04 hdr=0x00
> > > em0@pci0:0:25:0: class=0x02 card=0x8086 chip=0x15038086 rev=0x04 hdr=0x00
> > > ...
> > >
> > > A)
> > >
> > > hostb0@pci0:0:0:0: class=0x06 vendor=0x8086 device=0x0100 subvendor=0x8086 subdevice=0x2010 rev=0x09 hdr=0x00
> > > pcib1@pci0:0:1:0: class=0x060400 vendor=0x8086 device=0x0101 subvendor=0x8086 subdevice=0x2010 rev=0x09 hdr=0x01
> > > pcib2@pci0:0:1:1: class=0x060400 vendor=0x8086 device=0x0105 subvendor=0x8086 subdevice=0x2010 rev=0x09 hdr=0x01
> > > none0@pci0:0:22:0: class=0x078000 vendor=0x8086 device=0x1c3a subvendor=0x8086 subdevice=0x4742 rev=0x04 hdr=0x00
> > > em0@pci0:0:25:0: class=0x02 vendor=0x8086 device=0x1503 subvendor=0x8086 subdevice=0x rev=0x04 hdr=0x00
> > > ...
> > >
> > > B)
> > >
> > > hostb0@pci0:0:0:0: class=0x06 devid=0x8086:0100 subid=0x8086:2010 rev=0x09 hdr=0x00
> > > pcib1@pci0:0:1:0: class=0x060400 devid=0x8086:0101 subid=0x8086:2010 rev=0x09 hdr=0x01
> > > pcib2@pci0:0:1:1: class=0x060400 devid=0x8086:0105 subid=0x8086:2010 rev=0x09 hdr=0x01
> > > none0@pci0:0:22:0: class=0x078000 devid=0x8086:1c3a subid=0x8086:4742 rev=0x04 hdr=0x00
> > > em0@pci0:0:25:0: class=0x02 devid=0x8086:1503 subid=0x8086: rev=0x04 hdr=0x00
> > > ...
> > >
> > > I went with the vendor word first for both A) and B), as in my
> > > experience that is the more common ordering in driver tables, etc.
> >
> > Do we need to print (class|devid|device|subvendor|etc.)= on every
> > line? IMHO they belong in a header line.
> >
> > Something like this:
> >
> > Driver Handle      Class    Vnd:Dev     Sub Vnd:Dev Rev  Hdr
> > ------ ----------- -------- ----------- ----------- ---- ----
> > hostb0 pci0:0:0:0  0x06     0x8086:0100 0x8086:2010 0x09 0x00
> > pcib1  pci0:0:1:0  0x060400 0x8086:0101 0x8086:2010 0x09 0x01
> > pcib2  pci0:0:1:1  0x060400 0x8086:0105 0x8086:2010 0x09 0x01
> > none0  pci0:0:22:0 0x078000 0x8086:1c3a 0x8086:4742 0x04 0x00
> > em0    pci0:0:25:0 0x02     0x8086:1503 0x8086:     0x04 0x00
>
> This is a very good idea, IMHO.
>
> When I committed pciconf back in 1996 (it had been contributed by
> gwollman) for PCI 1.0 (at a time when there was no standard for PCI to
> PCI bridges yet ;-) ), the current format seemed sensible, but the
> tabular form suggested by Artem is much easier to parse.
>
> I'd want to suggest another, slightly different format:
>
> Driver Handle         Class    Vnd    Dev    SubVnd SubDev Rev  Hdr
> hostb0 0:0:0:0        0x06     0x8086 0x0100 0x8086 0x2010 0x09 0x00
> pcib1  0:0:1:0        0x060400 0x8086 0x0101 0x8086 0x2010 0x09 0x01
> pcib2  0:0:1:1        0x060400 0x8086 0x0105 0x8086 0x2010 0x09 0x01
> none0  0:0:22:0       0x078000 0x8086 0x1c3a 0x8086 0x4742 0x04 0x00
> em0    0:0:25:0       0x02     0x8086 0x1503 0x8086 0x     0x04 0x00
> dummy0 65535:255:31:7 0x02     0x8086 0x1503 0x8086 0x     0x04 0x00
>
> I.e., print only one header line (no ---), make the Handle column wide
> enough to hold the longest possible value, use only white space to
> separate columns, and print 0x as a prefix for all hex numbers.
>
> Instead of pci0:0:0:0 for the PCI handle, just 0:0:0:0 could be
> printed, IMHO. (But this is bikeshed material, I guess ...)
>
> The Rev column is required for devices that are not uniquely
> identified by their Vnd/Dev IDs. (These used to exist, e.g. the Symbios
> SCSI controllers, though I'm not aware of any device that needed a
> different driver depending on the PCI revision number.)

re(4) and rl(4) are examples of drivers that need the Rev.

> I'd be happy to modify pciconf to print the new format in -CURRENT
> (having been the maintainer of the PCI code for quite some time), if
> consensus is reached on a format and if this change is accepted by RE.
>
> Regards, STefan
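For anyone who wants to try the tabular layout before pciconf itself changes, here is a minimal Python sketch that reads lines in the current `pciconf -l` format and prints them in the column order proposed above. It is an illustration only, not part of pciconf: the names `LINE_RE`, `split_id`, `to_row` and `format_table` are made up for the example, and the parser assumes full 8-digit `card=` and `chip=` words with the vendor ID in the low 16 bits.

```python
import re

# Matches lines in the current format, for example:
#   pcib1@pci0:0:1:0: class=0x060400 card=0x20108086 chip=0x01018086 rev=0x09 hdr=0x01
LINE_RE = re.compile(
    r"^(?P<driver>\w+)@pci(?P<handle>[\d:]+):\s*"
    r"class=(?P<cls>0x[0-9a-fA-F]+)\s+"
    r"card=0x(?P<card>[0-9a-fA-F]{8})\s+"
    r"chip=0x(?P<chip>[0-9a-fA-F]{8})\s+"
    r"rev=(?P<rev>0x[0-9a-fA-F]+)\s+"
    r"hdr=(?P<hdr>0x[0-9a-fA-F]+)"
)

HEADER = ("Driver", "Handle", "Class", "Vnd", "Dev",
          "SubVnd", "SubDev", "Rev", "Hdr")


def split_id(word):
    """Split a 32-bit ID word (hex digits, no 0x prefix) into
    (vendor, device). The vendor ID is the low 16 bits."""
    value = int(word, 16)
    return value & 0xFFFF, value >> 16


def to_row(line):
    """Convert one line of the current format into a tuple of column
    strings, or return None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    vnd, dev = split_id(m["chip"])
    subvnd, subdev = split_id(m["card"])
    return (m["driver"], m["handle"], m["cls"],
            f"0x{vnd:04x}", f"0x{dev:04x}",
            f"0x{subvnd:04x}", f"0x{subdev:04x}",
            m["rev"], m["hdr"])


def format_table(lines):
    """Render matching lines as one header row plus whitespace-separated,
    left-aligned columns; non-matching lines are skipped."""
    rows = [HEADER] + [r for r in map(to_row, lines) if r is not None]
    widths = [max(len(row[i]) for row in rows) for i in range(len(HEADER))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )
```

Feeding it the output of `pciconf -l` (for instance via `format_table(sys.stdin)`) yields one header line followed by one row per device. The handle is printed without the `pci` prefix, as in the last proposal in the thread.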