Re: Question on behavior of tg3_self_test() (ethtool -t on tg3 driver)
On 8/12/2015 6:02 PM, Douglas Miller wrote:

Oh, I had missed the extra if condition on tg3_test_link(). So external_lb is not a true superset of offline. So you are not surprised by the (about) 20 second link down period after this test? If this is expected (albeit undocumented) behavior we can change the test scenario to work around it. It seems as though not all adapters exhibit this same symptom. From a testing standpoint, it is a long delay to add that may only be needed for this one adapter (Broadcom BCM5719, or adapter family).

We executed the ethtool -t dev offline in a loop on our local test machine with 5719 and linkup time is = 5 secs.

Script:

    #!/bin/bash
    echo -OS Information-
    uname -a
    echo --Card Information--
    lspci | grep 5719
    echo --Interface information--
    ethtool -i p4p4
    echo -Offline test start--
    for i in 1 2 3
    do
        date
        ethtool -t p4p4 offline
    done

Output:

    -OS Information-
    Linux siva-dev 4.2.0-rc4+ #1 SMP Thu Aug 13 20:24:11 IST 2015 x86_64 x86_64 x86_64 GNU/Linux
    --Card Information--
    03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
    03:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
    03:00.2 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
    03:00.3 Ethernet controller: Broadcom Corporation NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
    --Interface information--
    driver: tg3
    version: 3.137
    firmware-version: 5719-v1.41 NCSI v1.3.6.0
    bus-info: :03:00.3
    supports-statistics: yes
    supports-test: yes
    supports-eeprom-access: yes
    supports-register-dump: yes
    supports-priv-flags: no
    -Offline test start--
    Thu Aug 13 22:05:59 IST 2015
    The test result is PASS
    The test extra info:
    nvram test(online) 0
    link test (online) 0
    register test (offline) 0
    memory test (offline) 0
    mac loopback test (offline) 0
    phy loopback test (offline) 0
    ext loopback test (offline) 0
    interrupt test(offline) 0
    Thu Aug 13 22:06:00 IST 2015
    The test result is PASS
    The test extra info:
    nvram test(online) 0
    link test (online) 0
    register test (offline) 0
    memory test (offline) 0
    mac loopback test (offline) 0
    phy loopback test (offline) 0
    ext loopback test (offline) 0
    interrupt test(offline) 0
    Thu Aug 13 22:06:05 IST 2015
    The test result is PASS
    The test extra info:
    nvram test(online) 0
    link test (online) 0
    register test (offline) 0
    memory test (offline) 0
    mac loopback test (offline) 0
    phy loopback test (offline) 0
    ext loopback test (offline) 0
    interrupt test(offline) 0

Please check your test environment.

Thanks,
Doug

On 08/11/2015 03:31 PM, Michael Chan wrote:

On Tue, 2015-08-11 at 14:24 -0500, Douglas Miller wrote:

Yes, the wrap plugs are the loopback cables/plugs. It is my understanding that the offline tests do not require anything to be plugged into the ports, as they do not in any way touch the external port. They perform an internal loopback test which does not depend on any external connection.

Correct.

From what I can tell, the only difference between offline and external_lb is that external_lb performs the external loopback tests, *in addition to* all the tests done for offline.

Correct.

This would imply that the only test that depends on anything connected to the physical port is external_lb, and there is no requirement that the wrap plugs be removed/replaced in order to run offline tests.

When you do external loopback test, we skip the link test because you no longer have normal connection to the network. You now use a special loopback cable, which will fail the link up test because the link up test assumes connection to the network using normal cable.

In the case I was debugging, wrap plugs were installed because the ports were, later, being tested in an external loopback way. What I am observing is that it takes about 20 seconds for the kernel to declare that the link is up, after running the offline or external_lb test. In the case of offline I cannot run the test again until the kernel declares the link up. In the case of external_lb I can run the test again immediately and it passes.

As stated earlier, because we skip the link test when we are performing external_lb. So, you should always do ethtool -t dev external_lb if you have a loopback cable connected. We will perform the external loopback test and skip the link test. If you don't have an external loopback cable connected, you should run ethtool -t dev offline. It will not do the external loopback test and will do the link test for proper link up with the network.

This suggests to me that the external_lb case (again, it is a superset of offline) is performing
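When a run does fail, the per-subtest lines of the ethtool -t output shown above indicate which step failed (in this thread, the link test). The sketch below filters that output down to the subtests with a non-zero result. It assumes each result line ends in a decimal code, as in the output above; failed_subtests is a hypothetical helper name, not part of ethtool.

```shell
#!/bin/bash
# Print the name of every subtest whose result code is non-zero.
# Assumption: result lines look like "link test (online) 1", i.e. they
# contain " test" and end in a decimal number (as in the output above).
failed_subtests() {
    awk '/ test/ && $NF ~ /^[0-9]+$/ && $NF != 0 {
        $NF = ""          # drop the trailing result code
        sub(/ +$/, "")    # trim the separator left behind
        print
    }'
}

# Typical use (needs root and a real interface, so it is left commented out):
#   ethtool -t p4p4 offline | failed_subtests
```

An empty result means every listed subtest reported 0.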
Re: Question on behavior of tg3_self_test() (ethtool -t on tg3 driver)
Very interesting. I was running a RHEL 7.1 kernel, 3.10.0-229.ael7b.ppc64le (PowerPC), with tg3 version 3.137 and firmware 5719-v1.24i, but I do not know what patches were added to either of our modules. We will investigate the environment more, under the assumption that we should not be required to insert any delay between runs of ethtool -t ... offline.

Thanks Siva,
Doug

On 08/13/2015 03:40 AM, Siva Reddy (Siva) Kallam wrote:

On 8/12/2015 6:02 PM, Douglas Miller wrote:

Oh, I had missed the extra if condition on tg3_test_link(). So external_lb is not a true superset of offline. So you are not surprised by the (about) 20 second link down period after this test? If this is expected (albeit undocumented) behavior we can change the test scenario to work around it. It seems as though not all adapters exhibit this same symptom. From a testing standpoint, it is a long delay to add that may only be needed for this one adapter (Broadcom BCM5719, or adapter family).

We executed the ethtool -t dev offline in a loop on our local test machine with 5719 and linkup time is = 5 secs.
Re: Question on behavior of tg3_self_test() (ethtool -t on tg3 driver)
Oh, I had missed the extra if condition on tg3_test_link(). So external_lb is not a true superset of offline. So you are not surprised by the (about) 20 second link down period after this test? If this is expected (albeit undocumented) behavior we can change the test scenario to work around it. It seems as though not all adapters exhibit this same symptom. From a testing standpoint, it is a long delay to add that may only be needed for this one adapter (Broadcom BCM5719, or adapter family).

Thanks,
Doug

On 08/11/2015 03:31 PM, Michael Chan wrote:

On Tue, 2015-08-11 at 14:24 -0500, Douglas Miller wrote:

Yes, the wrap plugs are the loopback cables/plugs. It is my understanding that the offline tests do not require anything to be plugged into the ports, as they do not in any way touch the external port. They perform an internal loopback test which does not depend on any external connection.

Correct.

From what I can tell, the only difference between offline and external_lb is that external_lb performs the external loopback tests, *in addition to* all the tests done for offline.

Correct.

This would imply that the only test that depends on anything connected to the physical port is external_lb, and there is no requirement that the wrap plugs be removed/replaced in order to run offline tests.

When you do external loopback test, we skip the link test because you no longer have normal connection to the network. You now use a special loopback cable, which will fail the link up test because the link up test assumes connection to the network using normal cable.

In the case I was debugging, wrap plugs were installed because the ports were, later, being tested in an external loopback way. What I am observing is that it takes about 20 seconds for the kernel to declare that the link is up, after running the offline or external_lb test. In the case of offline I cannot run the test again until the kernel declares the link up. In the case of external_lb I can run the test again immediately and it passes.

As stated earlier, because we skip the link test when we are performing external_lb. So, you should always do ethtool -t dev external_lb if you have a loopback cable connected. We will perform the external loopback test and skip the link test. If you don't have an external loopback cable connected, you should run ethtool -t dev offline. It will not do the external loopback test and will do the link test for proper link up with the network.

This suggests to me that the external_lb case (again, it is a superset of offline) is performing some configuration on the port that allows the subsequent test to work. The one significant difference between offline and external_lb is that external_lb performs the tg3_phy_lpbk_set(tp, 0, true); changes to configuration (immediately prior to running the loopback tests again). I believe this call is to switch from internal loopback to normal, in order to leverage the wrap plugs and perform the external loopback tests. But this call is not made for offline and I am wondering if that leaves the port in a state where it cannot be used until the kernel completes the link up.

--
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Question on behavior of tg3_self_test() (ethtool -t on tg3 driver)
On Tue, 2015-08-11 at 10:59 -0500, Douglas Miller wrote:

(Sorry if you got several duplicates, am trying to work through rejected messages due to supposed HTML content)

The following behavior is being observed when running ethtool -t dev offline on ports on the Broadcom BCM5719 adapter (tg3 driver). The ports have wrap plugs on them, although I'm not sure why that would have any effect.

I'm not sure what wrap plugs are.

The test ethtool -t dev offline was being run continuously. The first invocation passes, all subsequent ones fail (at least in the link test step) after ~20 second timeout. When running the test once, I see the following: Looking at /var/log/messages, I see a Link is down message during the test. Then, 20 seconds after the test completes, there is a Link is up... message. If I wait for the Link is up... message I can run the test without problems. If the test is run again while the link is still down, it fails and seems to delay the link up by an additional 20 seconds.

When you do offline test, the chip is reset and the PHY is also reset, causing the link to go down. Normally, link should come back up within a few seconds. The selftest code will wait for 6 seconds for copper and 2 seconds for serdes link to be up before declaring there is no link. So for whatever reason, the link in your setup takes longer than that to come up and therefore it fails the link test when you run it in a loop starting on the 2nd iteration.

If I run external_lb instead of offline, I am able to run the test repeatedly without error. So it seems that some action taken in the external_lb case actually repairs the port. But the external_lb test also exhibits the link-down for 20 seconds symptom, although it can be run while the link is considered down without failure.

External loopback requires a loopback cable. So you must have a loopback cable for this test to pass. Maybe that's what you meant by wrap plugs.

The first question is whether we should expect to be able to run ethtool -t dev offline continually, with no delay between runs. I presume this is supported.

If your intention is to run external loopback, yes you should specify external loopback. Otherwise the driver expects normal link behavior and that's why it fails. If you connect a normal cable, then ethtool -t dev offline works repeatedly, right?

Second question, I would like someone with experience with the tg3 driver and this adapter to comment on what might be done to fix this. My first, simple, guess would be to move the tg3_phy_lpbk_set(tp, 0, true); setting (in tg3_test_loopback()) to be done for both offline and external_lb cases. I am awaiting time on a system with this adapter in order to try out some possible fixes and/or debug what might be wrong/different with the configuration after the offline test.

I would appreciate any help,
Thanks,
Doug Miller
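The link test described in this reply waits a bounded time (6 seconds for copper, 2 for serdes) for the link to come up. A test script can apply the same idea from user space by polling for carrier before starting the next run. This is a minimal sketch, assuming the sysfs carrier file reflects the link state the kernel logs; wait_for_link and the SYSFS_NET override are hypothetical names introduced here, and the 30-second budget in the usage comment is an illustrative value chosen only to exceed the roughly 20 seconds reported in this thread.

```shell
#!/bin/bash
# Poll the carrier flag of an interface until the link is up or the
# timeout (in seconds) expires. Returns 0 if the link came up, 1 if not.
# SYSFS_NET is a hypothetical override so the function can be exercised
# against a scratch directory; it defaults to /sys/class/net.
wait_for_link() {
    local dev=$1 timeout=$2 waited=0
    local carrier="${SYSFS_NET:-/sys/class/net}/$dev/carrier"
    while [ "$waited" -le "$timeout" ]; do
        # carrier reads "1" while the kernel considers the link up;
        # a failed read is treated the same as link down
        if [ "$(cat "$carrier" 2>/dev/null)" = "1" ]; then
            return 0
        fi
        if [ "$waited" -lt "$timeout" ]; then
            sleep 1
        fi
        waited=$((waited + 1))
    done
    return 1
}

# Usage in a loop (needs root and a real interface, so left commented out):
#   for i in 1 2 3; do
#       wait_for_link p4p4 30 || echo "link still down before run $i"
#       ethtool -t p4p4 offline
#   done
```

Polling once per second keeps the added delay close to the actual link-up time, so adapters that recover quickly are not penalised by a fixed sleep.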
Re: Question on behavior of tg3_self_test() (ethtool -t on tg3 driver)
Thanks Michael for getting back to me.

Yes, the wrap plugs are the loopback cables/plugs. It is my understanding that the offline tests do not require anything to be plugged into the ports, as they do not in any way touch the external port. They perform an internal loopback test which does not depend on any external connection.

From what I can tell, the only difference between offline and external_lb is that external_lb performs the external loopback tests, *in addition to* all the tests done for offline. This would imply that the only test that depends on anything connected to the physical port is external_lb, and there is no requirement that the wrap plugs be removed/replaced in order to run offline tests.

In the case I was debugging, wrap plugs were installed because the ports were, later, being tested in an external loopback way. What I am observing is that it takes about 20 seconds for the kernel to declare that the link is up, after running the offline or external_lb test. In the case of offline I cannot run the test again until the kernel declares the link up. In the case of external_lb I can run the test again immediately and it passes.

This suggests to me that the external_lb case (again, it is a superset of offline) is performing some configuration on the port that allows the subsequent test to work. The one significant difference between offline and external_lb is that external_lb performs the tg3_phy_lpbk_set(tp, 0, true); changes to configuration (immediately prior to running the loopback tests again). I believe this call is to switch from internal loopback to normal, in order to leverage the wrap plugs and perform the external loopback tests. But this call is not made for offline and I am wondering if that leaves the port in a state where it cannot be used until the kernel completes the link up.
Thanks,
Doug

On 08/11/2015 12:41 PM, Michael Chan wrote:

On Tue, 2015-08-11 at 10:59 -0500, Douglas Miller wrote:

(Sorry if you got several duplicates, am trying to work through rejected messages due to supposed HTML content)

The following behavior is being observed when running ethtool -t dev offline on ports on the Broadcom BCM5719 adapter (tg3 driver). The ports have wrap plugs on them, although I'm not sure why that would have any effect.

I'm not sure what wrap plugs are.

The test ethtool -t dev offline was being run continuously. The first invocation passes, all subsequent ones fail (at least in the link test step) after ~20 second timeout. When running the test once, I see the following: Looking at /var/log/messages, I see a Link is down message during the test. Then, 20 seconds after the test completes, there is a Link is up... message. If I wait for the Link is up... message I can run the test without problems. If the test is run again while the link is still down, it fails and seems to delay the link up by an additional 20 seconds.

When you do offline test, the chip is reset and the PHY is also reset, causing the link to go down. Normally, link should come back up within a few seconds. The selftest code will wait for 6 seconds for copper and 2 seconds for serdes link to be up before declaring there is no link. So for whatever reason, the link in your setup takes longer than that to come up and therefore it fails the link test when you run it in a loop starting on the 2nd iteration.

If I run external_lb instead of offline, I am able to run the test repeatedly without error. So it seems that some action taken in the external_lb case actually repairs the port. But the external_lb test also exhibits the link-down for 20 seconds symptom, although it can be run while the link is considered down without failure.

External loopback requires a loopback cable. So you must have a loopback cable for this test to pass. Maybe that's what you meant by wrap plugs.

The first question is whether we should expect to be able to run ethtool -t dev offline continually, with no delay between runs. I presume this is supported.

If your intention is to run external loopback, yes you should specify external loopback. Otherwise the driver expects normal link behavior and that's why it fails. If you connect a normal cable, then ethtool -t dev offline works repeatedly, right?

Second question, I would like someone with experience with the tg3 driver and this adapter to comment on what might be done to fix this. My first, simple, guess would be to move the tg3_phy_lpbk_set(tp, 0, true); setting (in tg3_test_loopback()) to be done for both offline and external_lb cases. I am awaiting time on a system with this adapter in order to try out some possible fixes and/or debug what might be wrong/different with the configuration after the offline test.

I would appreciate any help,
Thanks,
Doug Miller
Re: Question on behavior of tg3_self_test() (ethtool -t on tg3 driver)
On Tue, 2015-08-11 at 14:24 -0500, Douglas Miller wrote:

Yes, the wrap plugs are the loopback cables/plugs. It is my understanding that the offline tests do not require anything to be plugged into the ports, as they do not in any way touch the external port. They perform an internal loopback test which does not depend on any external connection.

Correct.

From what I can tell, the only difference between offline and external_lb is that external_lb performs the external loopback tests, *in addition to* all the tests done for offline.

Correct.

This would imply that the only test that depends on anything connected to the physical port is external_lb, and there is no requirement that the wrap plugs be removed/replaced in order to run offline tests.

When you do external loopback test, we skip the link test because you no longer have normal connection to the network. You now use a special loopback cable, which will fail the link up test because the link up test assumes connection to the network using normal cable.

In the case I was debugging, wrap plugs were installed because the ports were, later, being tested in an external loopback way. What I am observing is that it takes about 20 seconds for the kernel to declare that the link is up, after running the offline or external_lb test. In the case of offline I cannot run the test again until the kernel declares the link up. In the case of external_lb I can run the test again immediately and it passes.

As stated earlier, because we skip the link test when we are performing external_lb. So, you should always do ethtool -t dev external_lb if you have a loopback cable connected. We will perform the external loopback test and skip the link test. If you don't have an external loopback cable connected, you should run ethtool -t dev offline. It will not do the external loopback test and will do the link test for proper link up with the network.

This suggests to me that the external_lb case (again, it is a superset of offline) is performing some configuration on the port that allows the subsequent test to work. The one significant difference between offline and external_lb is that external_lb performs the tg3_phy_lpbk_set(tp, 0, true); changes to configuration (immediately prior to running the loopback tests again). I believe this call is to switch from internal loopback to normal, in order to leverage the wrap plugs and perform the external loopback tests. But this call is not made for offline and I am wondering if that leaves the port in a state where it cannot be used until the kernel completes the link up.
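The rule of thumb in this reply (external_lb when a loopback cable is fitted, offline otherwise) can be written down as a small helper for test scripts. This is only a sketch of that advice: selftest_mode and its yes/no argument are hypothetical conventions introduced here, and the script still has to be told by the operator whether a wrap plug is installed.

```shell
#!/bin/bash
# Map "is a loopback (wrap) plug fitted?" to the ethtool self-test mode
# recommended in this thread. The yes/no argument is a convention of this
# sketch only; nothing here detects the plug automatically.
selftest_mode() {
    case "$1" in
        yes) echo external_lb ;;  # loopback plug: link test is skipped
        no)  echo offline ;;      # normal cable: link test is performed
        *)   echo "usage: selftest_mode yes|no" >&2
             return 2 ;;
    esac
}

# Usage (needs root and a real interface, so left commented out):
#   ethtool -t p4p4 "$(selftest_mode yes)"
```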