Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Caglar, [...] Unfortunately emac driver is not stable after this series. I face lock-ups time to time, followed by attached kernel trace. Could you elaborate on your test scenario so that I can try and reproduce the problem at my end? Also, did you have the contents of my commit stack in this particular kernel build? Assuming that the DMA got stuck at some point leading up to the transmit timeout, any ideas as to why a host error was not thrown? To help debug, I'll post out a set of patches that dump out the MAC (and DMA) registers on timeout. That should give us some visibility into the problem. [ 1651.44] nfs: server 192.168.2.34 not responding, still trying [ 1859.01] [ cut here ] [ 1859.01] WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x184/0x294() [ 1859.02] NETDEV WATCHDOG: eth0 (davinci_emac): transmit queue 0 timed out [...] Regards Cyril. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Caglar, [...] Assuming that the DMA got stuck at some point leading up to the transmit timeout, any ideas as to why a host error was not thrown? To help debug, I'll post out a set of patches that dump out the MAC (and DMA) registers on timeout. That should give us some visibility into the problem. I have posted a couple of additional patches for this on [1]. Would you mind giving these a quick try? The register dumps should prove useful in figuring this out. Regards Cyril. [1] http://arago-project.org/git/people/?p=cyril/linux-tnetv107x.git;a=shortlog;h=refs/heads/emac-cpdma-mdio-fixes -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
On Monday 13 September 2010 05:09:15 pm Cyril Chemparathy wrote: Hi Caglar, [...] Unfortunately emac driver is not stable after this series. I face lock-ups time to time, followed by attached kernel trace. Could you elaborate on your test scenario so that I can try and reproduce the problem at my end? Also, did you have the contents of my commit stack in this particular kernel build? Ooops! I didn't noticed your commits till you noted, I was working with DaVinci head. Applying patches in your tree solves all my problems and all my use cases are working now. However, I just barely tested them. To make myself forgiven here are my 'netperf' numbers before and after your patches applied: TCP_STREAM: 54.64 Mbit vs 52.84 Mbit UDP_STREAM: 96.27 Mbit vs 96.22 Mbit Regards, Caglar Assuming that the DMA got stuck at some point leading up to the transmit timeout, any ideas as to why a host error was not thrown? To help debug, I'll post out a set of patches that dump out the MAC (and DMA) registers on timeout. That should give us some visibility into the problem. [ 1651.44] nfs: server 192.168.2.34 not responding, still trying [ 1859.01] [ cut here ] [ 1859.01] WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x184/0x294() [ 1859.02] NETDEV WATCHDOG: eth0 (davinci_emac): transmit queue 0 timed out [...] Regards Cyril. -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
On Friday 10 September 2010 06:23:52 pm Caglar Akyuz wrote: On Friday 10 September 2010 12:25:40 am Michael Williamson wrote: Hi Cyril, On 09/09/2010 03:51 PM, Cyril Chemparathy wrote: Hi Mike, [...] The hang is in wait_for_user_access() in the davinci_mdio_read() call. Looks like the state machine got put back into IDLE somewhere between the MDIO probe and the EMAC probe. Seems like there should be some sort of time-out and error message in the wait_for_user_access() method (maybe even a check for IDLE??) If I add a patch to check the state machine for IDLE and then re-enable it in the davinci_mdio_read() call, it is able to press on and come up. I don't see any calls to the davinci_mdio_suspend() call, so I am wondering if the EMAC probe routine, particularly the application of the SOFTRESET, is causing the MDIO to drop back to IDLE / disabled. I can post the patch if you like, but it is a bit of a hack... An EMAC soft-reset clobbering the MDIO controller state is a possibility. I will poll TI designers to see if this could be the case. In any case, a couple of unanswered questions remain: 1. Why don't other davinci devices display similar behavior? 2. If the answer to #1 above is that the timing window is pretty slim (i.e., only if an MDIO read/write is in progress during EMAC soft-reset), why do we hit this situation consistently on mityomap? Has it been confirmed that this only happens on mityomap? Has anyone had success using a da850 evm or other da850 platform? The configuration for Same problem exists on another DA850 board, Hawkboard.(Sorry no support in mainline yet) the mityomap, wrt to the EMAC/MII/MDIO, is pretty much identical to the da850 evm using the MII interface. The only difference I am aware of is the assigned address to the PHY chip. The reference clocks and rates are identical, AFAIK, to the evm. I have put together a quick patch (tested dm365). See attached. Your patch doesn't work with my board. It does attempt to reset the bus on This patch fixes the problem here. I'm using kernel IP auto configuration and mounting fs over NFS. My system boots and I can login to my board. Regards, Caglar Unfortunately emac driver is not stable after this series. I face lock-ups time to time, followed by attached kernel trace. Regards, Caglar _ [ 1651.44] nfs: server 192.168.2.34 not responding, still trying [ 1859.01] [ cut here ] [ 1859.01] WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x184/0x294() [ 1859.02] NETDEV WATCHDOG: eth0 (davinci_emac): transmit queue 0 timed out [ 1859.02] Modules linked in: [ 1859.03] Backtrace: [ 1859.03] [c0030444] (dump_backtrace+0x0/0x10c) from [c02d9820] (dump_stack+0x18/0x1c) [ 1859.04] r7:c03b5de8 r6:c025a13c r5:c039045c r4:0102 [ 1859.04] [c02d9808] (dump_stack+0x0/0x1c) from [c0042e44] (warn_slowpath_common+0x58/0x70) [ 1859.05] [c0042dec] (warn_slowpath_common+0x0/0x70) from [c0042f00] (warn_slowpath_fmt+0x38/0x40) [ 1859.06] r8:c03b4000 r7:0030 r6:c0437564 r5:c7bbc000 r4: [ 1859.07] [c0042ec8] (warn_slowpath_fmt+0x0/0x40) from [c025a13c] (dev_watchdog+0x184/0x294) [ 1859.08] r3:c7bbc000 r2:c0390474 [ 1859.08] [c0259fb8] (dev_watchdog+0x0/0x294) from [c004e918] (run_timer_softirq+0x1c4/0x29c) [ 1859.09] [c004e754] (run_timer_softirq+0x0/0x29c) from [c0048b94] (__do_softirq+0x98/0x12c) [ 1859.10] [c0048afc] (__do_softirq+0x0/0x12c) from [c0048c70] (irq_exit+0x48/0x9c) [ 1859.11] [c0048c28] (irq_exit+0x0/0x9c) from [c002c080] (asm_do_IRQ+0x80/0xa0) [ 1859.12] [c002c000] (asm_do_IRQ+0x0/0xa0) from [c002cb2c] (__irq_svc+0x4c/0x9c) [ 1859.12] Exception stack(0xc03b5f38 to 0xc03b5f80) [ 1859.13] 5f20: 0005317f [ 1859.14] 5f40: 0005217f 6013 c03b4000 c03b8ba0 c03b89d4 c03dcc50 c0023e04 41069265 [ 1859.15] 5f60: c0023d94 c03b5f8c 60d3 c03b5f80 c002da6c c002da78 6013 [ 1859.15] r5:febfd000 r4: [ 1859.16] [c002da48] (default_idle+0x0/0x34) from [c002dff0] (cpu_idle+0x74/0xdc) [ 1859.17] [c002df7c] (cpu_idle+0x0/0xdc) from [c02d59d4] (rest_init+0xa4/0xbc) [ 1859.17] r7:c03b89c8 r6:c0026018 r5:c03dcc1c r4:0002 [ 1859.18] [c02d5930] (rest_init+0x0/0xbc) from [c0008bc4] (start_kernel+0x270/0x2d0) [ 1859.19] r4:c042d70c [ 1859.19] [c0008954] (start_kernel+0x0/0x2d0) from [c0008034] (__enable_mmu+0x0/0x2c) [ 1859.20] r6:c002641c r5:c03dcc78 r4:00053175 [ 1859.20] ---[ end trace d278f645c502dc20 ]--- -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Cyril, On 09/10/2010 06:59 PM, Cyril Chemparathy wrote: Hi Mike, I have merged your latest two emails and responded to both here. [...] One of the patches posted on my repo [1] replaces the dumb mdelay() with a bit more logic that calculates the worst-case access time. This mechanism may work a lot better for you. Would you mind trying it out? You patch from [1] is working much more reliably now (well, 6 for 6 boot cycles as well as several ifup/ifdown cycles). I do get the resetting idled controlled console message every cycle, it seems that will be expected now. [...] Also, your while(1) loops with the continue conditions on the second wait_for_user_access() in the read and writes might need some consideration, i.e.: while (1) { ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; if (ret 0) break; __raw_writel(reg, data-regs-user[0].access); ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; ^ --- this will re-issue the request what you want? Yes, the wait_for_user_access() would have reset the controller, throwing the current transaction out. The intent here is to restart the current transaction with a newly initialized controller. OK. Makes sense. Looking at it felt like there was a chance for end endless spin, but that seems unlikely given how that condition might fire. [...] Also, on the shutdown, I get a major kernel trace. Here is the dump, as much as I could catch of it (I need a better terminal program) [...] WARNING: at drivers/net/davinci_emac.c:1025 __cpdma_chan_free+0xac/0xb0() The current code spits out a huge volume of stuff as a result of a WARN_ON in the rx handler. The gitweb on [1] has a patch that fixes this. Yes, these messages are no longer issued with the patches from [1]. Thanks. [...] The MDIO module upgrade (rev 1.4 - 1.5) could have something to do with this behavior. Even so, I can't explain why this issue wasn't seen on da8xx prior to this series. The original code should (at least in theory) have sporadically locked up on emac open. I think, if I understand it correctly, that in the previous version of this code, the emac was reset *prior* to enabling, scanning, and assigning the associated phy on the MDIO bus. The new implementation sets up and scans the MDIO bus first, then comes back around to the EMAC second... hits a reset, and doesn't re-ENABLE the MDIO. AFAICS, that isn't entirely accurate. In the previous version, the mdio bus was being brought up at probe time in davinci_emac_probe(). The soft-reset was happening later on when the device is opened, in emac_hw_enable(). The difference, however, is that the original code forced an emac_mii_reset() immediately after the emac_hw_enable(). This is not being done with the separated mdio, and that is the problem. In terms of behavior, with the current work around, the new and old versions should be close to identical. More below... Also, maybe hitting the EMAC reset while the MDIO state machine is up is *bad*, I seem to recall some text in the user's guide about waiting for the state machine to stop before disabling it. I wonder if that also applies to reset? You are correct. EMAC soft-reset stops the MDIO mid-transaction, quite unlike disabling the module via the control register. Therefore, there is a risk that a badly designed phy could be left hanging in an arbitrary state. However, all earlier versions of the emac code have been exposed to this very same vulnerability (i.e. arbitrary emac soft-reset regardless of mdio state) all along. Thanks for straightening me out on this, Cryil. Your patch series in [1] seems to have resolved the issues I've been able to see on the da850 based board I'm using here. I appreciate your patience and quick response. I may try to beat on it a bit more with some network performance tests (even though it's not at all related to the immediate problems you've fixed) -- we've had that on our list of todos anyway for this module. I think it would be good to float your patches over to davinci-next, if possible, before the 37 merge window opens... Speaking, of course, from a very partial perspective. [1] http://arago-project.org/git/people/?p=cyril/linux-tnetv107x.git;a=shortlog;h=refs/heads/emac-cpdma-mdio-fixes -Mike -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
On Friday 10 September 2010 12:25:40 am Michael Williamson wrote: Hi Cyril, On 09/09/2010 03:51 PM, Cyril Chemparathy wrote: Hi Mike, [...] The hang is in wait_for_user_access() in the davinci_mdio_read() call. Looks like the state machine got put back into IDLE somewhere between the MDIO probe and the EMAC probe. Seems like there should be some sort of time-out and error message in the wait_for_user_access() method (maybe even a check for IDLE??) If I add a patch to check the state machine for IDLE and then re-enable it in the davinci_mdio_read() call, it is able to press on and come up. I don't see any calls to the davinci_mdio_suspend() call, so I am wondering if the EMAC probe routine, particularly the application of the SOFTRESET, is causing the MDIO to drop back to IDLE / disabled. I can post the patch if you like, but it is a bit of a hack... An EMAC soft-reset clobbering the MDIO controller state is a possibility. I will poll TI designers to see if this could be the case. In any case, a couple of unanswered questions remain: 1. Why don't other davinci devices display similar behavior? 2. If the answer to #1 above is that the timing window is pretty slim (i.e., only if an MDIO read/write is in progress during EMAC soft-reset), why do we hit this situation consistently on mityomap? Has it been confirmed that this only happens on mityomap? Has anyone had success using a da850 evm or other da850 platform? The configuration for Same problem exists on another DA850 board, Hawkboard.(Sorry no support in mainline yet) the mityomap, wrt to the EMAC/MII/MDIO, is pretty much identical to the da850 evm using the MII interface. The only difference I am aware of is the assigned address to the PHY chip. The reference clocks and rates are identical, AFAIK, to the evm. I have put together a quick patch (tested dm365). See attached. Your patch doesn't work with my board. It does attempt to reset the bus on This patch fixes the problem here. I'm using kernel IP auto configuration and mounting fs over NFS. My system boots and I can login to my board. Regards, Caglar the read call, but following wait_for_user_access() calls are timing out and the _read() and _write() calls punt. I bumped up the MDIO_TIMEOUT to 100 ms, and it works. I'm wondering if the scanning logic has to complete an entire cycle (all 32 phys) before issuing a request, and if it's a lot slower than expected. I also found that the initial scanning logic would not reliably find the PHY until I bumped up the delay time following the reset operation. Sometimes it would, sometimes no. Also, your while(1) loops with the continue conditions on the second wait_for_user_access() in the read and writes might need some consideration, i.e.: while (1) { ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; if (ret 0) break; __raw_writel(reg, data-regs-user[0].access); ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; ^ --- this will re-issue the request what you want? if (ret 0) break; reg = __raw_readl(data-regs-user[0].access); ret = (reg USERACCESS_ACK) ? (reg USERACCESS_DATA) : -EIO; break; } Also, on the shutdown, I get a major kernel trace. Here is the dump, as much as I could catch of it (I need a better terminal program) Deconfiguring network interfaces... [ cut here ] WARNING: at kernel/softirq.c:143 local_bh_enable+0x40/0xb4() Modules linked in: [c002e684] (unwind_backtrace+0x0/0xec) from [c003e284] (warn_slowpath_common+0x4c/0x64) [c003e284] (warn_slowpath_common+0x4c/0x64) from [c003e2b8] (warn_slowpath_null+0x1c/0x24) [c003e2b8] (warn_slowpath_null+0x1c/0x24) from [c0043e80] (local_bh_enable+0x40/0xb4) [c0043e80] (local_bh_enable+0x40/0xb4) from [c01f3760] (__netif_receive_skb+0xf8/0x3d0) [c01f3760] (__netif_receive_skb+0xf8/0x3d0) from [c01d344c] (emac_rx_handler+0x40/0xcc) [c01d344c] (emac_rx_handler+0x40/0xcc) from [c01d3fe8] (__cpdma_chan_free+0xac/0xb0) [c01d3fe8] (__cpdma_chan_free+0xac/0xb0) from [c01d40d0] (__cpdma_chan_process+0xe4/0x114) [c01d40d0] (__cpdma_chan_process+0xe4/0x114) from [c01d4238] (cpdma_chan_stop+0xf0/0x1c8) [c01d4238] (cpdma_chan_stop+0xf0/0x1c8) from [c01d4390] (cpdma_ctlr_stop+0x80/0xe4) [c01d4390] (cpdma_ctlr_stop+0x80/0xe4) from [c01d2c00] (emac_dev_stop+0xb0/0x13c) [c01d2c00] (emac_dev_stop+0xb0/0x13c) from [c01f5b40] (__dev_close+0x74/0x98) [c01f5b40] (__dev_close+0x74/0x98) from [c01f3168] (__dev_change_flags+0xac/0x13c) [c01f3168] (__dev_change_flags+0xac/0x13c) from
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Mike, I have merged your latest two emails and responded to both here. [...] Your patch doesn't work with my board. It does attempt to reset the bus on the read call, but following wait_for_user_access() calls are timing out and the _read() and _write() calls punt. I bumped up the MDIO_TIMEOUT to 100 ms, and it works. I'm wondering if the scanning logic has to complete an entire cycle (all 32 phys) before issuing a request, and if it's a lot slower than expected. Based on the mdio module's state machine code, the scan does not need to complete before a user-request is issued. However, when the module is first enabled, it _must_ poll at least one phy (phy id 0) before it handles the user-access request. Consequently, the first access after coming out of reset may be slower than subsequent accesses. One of the patches posted on my repo [1] replaces the dumb mdelay() with a bit more logic that calculates the worst-case access time. This mechanism may work a lot better for you. Would you mind trying it out? Since the MDIO_TIMEOUT is simply a defensive measure, I have no objections to raising this timeout (included in the updated stack). I also found that the initial scanning logic would not reliably find the PHY until I bumped up the delay time following the reset operation. Sometimes it would, sometimes no. Also, your while(1) loops with the continue conditions on the second wait_for_user_access() in the read and writes might need some consideration, i.e.: while (1) { ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; if (ret 0) break; __raw_writel(reg, data-regs-user[0].access); ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; ^ --- this will re-issue the request what you want? Yes, the wait_for_user_access() would have reset the controller, throwing the current transaction out. The intent here is to restart the current transaction with a newly initialized controller. if (ret 0) break; reg = __raw_readl(data-regs-user[0].access); ret = (reg USERACCESS_ACK) ? (reg USERACCESS_DATA) : -EIO; break; } Also, on the shutdown, I get a major kernel trace. Here is the dump, as much as I could catch of it (I need a better terminal program) [...] WARNING: at drivers/net/davinci_emac.c:1025 __cpdma_chan_free+0xac/0xb0() The current code spits out a huge volume of stuff as a result of a WARN_ON in the rx handler. The gitweb on [1] has a patch that fixes this. [...] The MDIO module upgrade (rev 1.4 - 1.5) could have something to do with this behavior. Even so, I can't explain why this issue wasn't seen on da8xx prior to this series. The original code should (at least in theory) have sporadically locked up on emac open. I think, if I understand it correctly, that in the previous version of this code, the emac was reset *prior* to enabling, scanning, and assigning the associated phy on the MDIO bus. The new implementation sets up and scans the MDIO bus first, then comes back around to the EMAC second... hits a reset, and doesn't re-ENABLE the MDIO. AFAICS, that isn't entirely accurate. In the previous version, the mdio bus was being brought up at probe time in davinci_emac_probe(). The soft-reset was happening later on when the device is opened, in emac_hw_enable(). The difference, however, is that the original code forced an emac_mii_reset() immediately after the emac_hw_enable(). This is not being done with the separated mdio, and that is the problem. In terms of behavior, with the current work around, the new and old versions should be close to identical. More below... Also, maybe hitting the EMAC reset while the MDIO state machine is up is *bad*, I seem to recall some text in the user's guide about waiting for the state machine to stop before disabling it. I wonder if that also applies to reset? You are correct. EMAC soft-reset stops the MDIO mid-transaction, quite unlike disabling the module via the control register. Therefore, there is a risk that a badly designed phy could be left hanging in an arbitrary state. However, all earlier versions of the emac code have been exposed to this very same vulnerability (i.e. arbitrary emac soft-reset regardless of mdio state) all along. Regards Cyril. [1] http://arago-project.org/git/people/?p=cyril/linux-tnetv107x.git;a=shortlog;h=refs/heads/emac-cpdma-mdio-fixes -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
On 9/8/2010 8:47 PM, Michael Williamson wrote: Hi Cyril, On 09/08/2010 05:59 PM, Cyril Chemparathy wrote: Hi Mike, So I went in and set the debug_level to netif_msg_init() to -1 in the davinci_emac driver. The only thing I get during the boot process is: net eth0: DaVinci EMAC Probe found device (regs: 01e23000, irq: 33) which is good (I also added some debug to make sure probe actually completes), but I don't see *anything* else. I guess I would have expected to see something out of the emac_dev_open. So I added some more debug in there, it's hanging in the phy_connect() call. The phy_id it came up with was correct 0:03. The hang is in wait_for_user_access() in the davinci_mdio_read() call. Looks like the state machine got put back into IDLE somewhere between the MDIO probe and the EMAC probe. Seems like there should be some sort of time-out and error message in the wait_for_user_access() method (maybe even a check for IDLE??) If I add a patch to check the state machine for IDLE and then re-enable it in the davinci_mdio_read() call, it is able to press on and come up. I don't see any calls to the davinci_mdio_suspend() call, so I am wondering if the EMAC probe routine, particularly the application of the SOFTRESET, is causing the MDIO to drop back to IDLE / disabled. I can post the patch if you like, but it is a bit of a hack... -Mike -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Mike, [...] The hang is in wait_for_user_access() in the davinci_mdio_read() call. Looks like the state machine got put back into IDLE somewhere between the MDIO probe and the EMAC probe. Seems like there should be some sort of time-out and error message in the wait_for_user_access() method (maybe even a check for IDLE??) If I add a patch to check the state machine for IDLE and then re-enable it in the davinci_mdio_read() call, it is able to press on and come up. I don't see any calls to the davinci_mdio_suspend() call, so I am wondering if the EMAC probe routine, particularly the application of the SOFTRESET, is causing the MDIO to drop back to IDLE / disabled. I can post the patch if you like, but it is a bit of a hack... An EMAC soft-reset clobbering the MDIO controller state is a possibility. I will poll TI designers to see if this could be the case. In any case, a couple of unanswered questions remain: 1. Why don't other davinci devices display similar behavior? 2. If the answer to #1 above is that the timing window is pretty slim (i.e., only if an MDIO read/write is in progress during EMAC soft-reset), why do we hit this situation consistently on mityomap? I have put together a quick patch (tested dm365). See attached. Regards Cyril. diff --git a/drivers/net/davinci_mdio.c b/drivers/net/davinci_mdio.c index d34a53a..96a0f9e 100644 --- a/drivers/net/davinci_mdio.c +++ b/drivers/net/davinci_mdio.c @@ -36,6 +36,7 @@ #include linux/io.h #include linux/davinci_emac.h +#define MDIO_TIMEOUT 10 /* msecs */ #define PHY_REG_MASK 0x1f #define PHY_ID_MASK 0x1f @@ -85,31 +86,74 @@ struct davinci_mdio_data { bool suspended; }; +static void __davinci_mdio_reset(struct mii_bus *bus) +{ + struct davinci_mdio_data *data = bus-priv; + u32 mdio_in_freq, div; + + mdio_in_freq = clk_get_rate(data-clk); + div = (mdio_in_freq / data-pdata.bus_freq) - 1; + if (div CONTROL_MAX_DIV) + div = CONTROL_MAX_DIV; + + /* set enable and clock divider */ + __raw_writel(div | CONTROL_ENABLE, data-regs-control); +} + /* wait until hardware is ready for another user access */ -static inline u32 wait_for_user_access(struct davinci_mdio_data *data) +static inline int wait_for_user_access(struct davinci_mdio_data *data) { - struct davinci_mdio_regs __iomem *regs = data-regs; + unsigned long timeout = jiffies + msecs_to_jiffies(MDIO_TIMEOUT); u32 reg; + int ret = -ETIMEDOUT; - while ((reg = __raw_readl(regs-user[0].access)) USERACCESS_GO) - ; + while (time_after(timeout, jiffies)) { + reg = __raw_readl(data-regs-user[0].access); + if ((reg USERACCESS_GO) == 0) { + ret = 0; + break; + } - return reg; + reg = __raw_readl(data-regs-control); + if (reg CONTROL_IDLE) { + /* + * An emac soft_reset may have clobbered the mdio + * controller's state machine. We need to reset and + * retry the current operation + */ + dev_warn(data-dev, controller idle in transaction, + resetting\n); + __davinci_mdio_reset(data-bus); + ret = -EAGAIN; + break; + } + } + return ret; } /* wait until hardware state machine is idle */ -static inline void wait_for_idle(struct davinci_mdio_data *data) +static inline int wait_for_idle(struct davinci_mdio_data *data) { - struct davinci_mdio_regs __iomem *regs = data-regs; + unsigned long timeout = jiffies + msecs_to_jiffies(MDIO_TIMEOUT); + int ret = -ETIMEDOUT; + u32 reg; - while ((__raw_readl(regs-control) CONTROL_IDLE) == 0) - ; + while (time_after(timeout, jiffies)) { + reg = __raw_readl(data-regs-control); + if (reg CONTROL_IDLE) { + ret = 0; + break; + } + } + + return ret; } static int davinci_mdio_read(struct mii_bus *bus, int phy_id, int phy_reg) { struct davinci_mdio_data *data = bus-priv; u32 reg; + int ret; if (phy_reg ~PHY_REG_MASK || phy_id ~PHY_ID_MASK) return -EINVAL; @@ -121,14 +165,32 @@ static int davinci_mdio_read(struct mii_bus *bus, int phy_id, int phy_reg) return -ENODEV; } - wait_for_user_access(data); reg = (USERACCESS_GO | USERACCESS_READ | (phy_reg 21) | (phy_id 16)); - __raw_writel(reg, data-regs-user[0].access); - reg = wait_for_user_access(data); + + while (1) { + ret = wait_for_user_access(data); + if (ret == -EAGAIN) + continue; + if (ret 0) + break; + + __raw_writel(reg, data-regs-user[0].access); + + ret = wait_for_user_access(data); + if (ret == -EAGAIN) + continue; + if (ret 0) + break; + + reg = __raw_readl(data-regs-user[0].access); + ret = (reg USERACCESS_ACK) ? (reg USERACCESS_DATA) : -EIO; + break; + } + spin_unlock(data-lock); - return (reg USERACCESS_ACK) ? (reg USERACCESS_DATA) : -EIO; + return ret; } static int davinci_mdio_write(struct mii_bus *bus, int phy_id, @@ -136,6 +198,7 @@ static int davinci_mdio_write(struct mii_bus *bus, int phy_id, { struct davinci_mdio_data *data = bus-priv; u32 reg; + int ret; if (phy_reg ~PHY_REG_MASK || phy_id
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Mike, [...] An EMAC soft-reset clobbering the MDIO controller state is a possibility. I will poll TI designers to see if this could be the case. To test this theory out, I hacked up a crude beat-it-to-death-and-see-if-it-breaks kinda patch (attached). This tests 1 mdio read cycles while constantly doing an emac soft-reset. I ran this on a dm365 evm, and the test didn't raise a single failed read: davinci_mdio davinci_mdio.0: davinci mdio revision 1.4 davinci_mdio davinci_mdio.0: detected phy mask fffc 1 test loops completed, 1 reads ok The failure in question seems to be limited to the da8xx family (tested da830 evm), where: davinci_mdio davinci_mdio.0: davinci mdio revision 1.5 davinci_mdio davinci_mdio.0: detected phy mask fff1 idle triggered!! The MDIO module upgrade (rev 1.4 - 1.5) could have something to do with this behavior. Even so, I can't explain why this issue wasn't seen on da8xx prior to this series. The original code should (at least in theory) have sporadically locked up on emac open. Regards Cyril. diff --git a/drivers/net/davinci_mdio.c b/drivers/net/davinci_mdio.c index d34a53a..8cd6d28 100644 --- a/drivers/net/davinci_mdio.c +++ b/drivers/net/davinci_mdio.c @@ -157,6 +157,56 @@ static int davinci_mdio_write(struct mii_bus *bus, int phy_id, return 0; } +#if 0 /* DM365 */ +#define EMAC_BASE (0x01D07000) +#define EMAC_CNTRL_OFFSET (0x) +#define PHY_ADDR 1 +#endif + +#if 1 /* DA8XX */ +#define EMAC_BASE (0x01e2) +#define EMAC_CNTRL_OFFSET (0x3000) +#define PHY_ADDR 1 /* 1-3 is ok on da830 evm */ +#endif + +#define EMAC_SOFTRESET (0x174) +#define PHY_REGMII_PHYSID1 + +static void mdio_emac_soft_reset_test(struct davinci_mdio_data *data) +{ + struct davinci_mdio_regs __iomem *regs = data-regs; + int loops = 0, status_ok = 0; + void __iomem *emac; + u32 reg; + + emac = ioremap(EMAC_BASE + EMAC_CNTRL_OFFSET, SZ_4K); + if (WARN_ON(!emac)) + return; + + for (loops = 0; loops 1; loops++) { + while (__raw_readl(regs-user[0].access) +USERACCESS_GO) + ; + + reg = (USERACCESS_GO | USERACCESS_READ | + (PHY_REG 21) | (PHY_ADDR 16)); + __raw_writel(reg, regs-user[0].access); + + while (__raw_readl(regs-user[0].access) +USERACCESS_GO) { + __raw_writel(1, emac + EMAC_SOFTRESET); + if (__raw_readl(regs-control) CONTROL_IDLE) { +printk(KERN_ERR idle triggered!!\n); +return; + } + } + if (__raw_readl(regs-user[0].access) USERACCESS_ACK) + status_ok++; + } + printk(KERN_ERR %d test loops completed, %d reads ok\n, loops, + status_ok); +} + static int __devinit davinci_mdio_probe(struct platform_device *pdev) { struct mdio_platform_data *pdata = pdev-dev.platform_data; @@ -262,6 +312,8 @@ static int __devinit davinci_mdio_probe(struct platform_device *pdev) } data-bus-phy_mask = phy_mask; + mdio_emac_soft_reset_test(data); + /* register the mii bus */ ret = mdiobus_register(data-bus); if (ret)
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Cyril, On 09/09/2010 03:51 PM, Cyril Chemparathy wrote: Hi Mike, [...] The hang is in wait_for_user_access() in the davinci_mdio_read() call. Looks like the state machine got put back into IDLE somewhere between the MDIO probe and the EMAC probe. Seems like there should be some sort of time-out and error message in the wait_for_user_access() method (maybe even a check for IDLE??) If I add a patch to check the state machine for IDLE and then re-enable it in the davinci_mdio_read() call, it is able to press on and come up. I don't see any calls to the davinci_mdio_suspend() call, so I am wondering if the EMAC probe routine, particularly the application of the SOFTRESET, is causing the MDIO to drop back to IDLE / disabled. I can post the patch if you like, but it is a bit of a hack... An EMAC soft-reset clobbering the MDIO controller state is a possibility. I will poll TI designers to see if this could be the case. In any case, a couple of unanswered questions remain: 1. Why don't other davinci devices display similar behavior? 2. If the answer to #1 above is that the timing window is pretty slim (i.e., only if an MDIO read/write is in progress during EMAC soft-reset), why do we hit this situation consistently on mityomap? Has it been confirmed that this only happens on mityomap? Has anyone had success using a da850 evm or other da850 platform? The configuration for the mityomap, wrt to the EMAC/MII/MDIO, is pretty much identical to the da850 evm using the MII interface. The only difference I am aware of is the assigned address to the PHY chip. The reference clocks and rates are identical, AFAIK, to the evm. I have put together a quick patch (tested dm365). See attached. Your patch doesn't work with my board. It does attempt to reset the bus on the read call, but following wait_for_user_access() calls are timing out and the _read() and _write() calls punt. I bumped up the MDIO_TIMEOUT to 100 ms, and it works. I'm wondering if the scanning logic has to complete an entire cycle (all 32 phys) before issuing a request, and if it's a lot slower than expected. I also found that the initial scanning logic would not reliably find the PHY until I bumped up the delay time following the reset operation. Sometimes it would, sometimes no. Also, your while(1) loops with the continue conditions on the second wait_for_user_access() in the read and writes might need some consideration, i.e.: while (1) { ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; if (ret 0) break; __raw_writel(reg, data-regs-user[0].access); ret = wait_for_user_access(data); if (ret == -EAGAIN) continue; ^ --- this will re-issue the request what you want? if (ret 0) break; reg = __raw_readl(data-regs-user[0].access); ret = (reg USERACCESS_ACK) ? (reg USERACCESS_DATA) : -EIO; break; } Also, on the shutdown, I get a major kernel trace. Here is the dump, as much as I could catch of it (I need a better terminal program) Deconfiguring network interfaces... [ cut here ] WARNING: at kernel/softirq.c:143 local_bh_enable+0x40/0xb4() Modules linked in: [c002e684] (unwind_backtrace+0x0/0xec) from [c003e284] (warn_slowpath_common+0x4c/0x64) [c003e284] (warn_slowpath_common+0x4c/0x64) from [c003e2b8] (warn_slowpath_null+0x1c/0x24) [c003e2b8] (warn_slowpath_null+0x1c/0x24) from [c0043e80] (local_bh_enable+0x40/0xb4) [c0043e80] (local_bh_enable+0x40/0xb4) from [c01f3760] (__netif_receive_skb+0xf8/0x3d0) [c01f3760] (__netif_receive_skb+0xf8/0x3d0) from [c01d344c] (emac_rx_handler+0x40/0xcc) [c01d344c] (emac_rx_handler+0x40/0xcc) from [c01d3fe8] (__cpdma_chan_free+0xac/0xb0) [c01d3fe8] (__cpdma_chan_free+0xac/0xb0) from [c01d40d0] (__cpdma_chan_process+0xe4/0x114) [c01d40d0] (__cpdma_chan_process+0xe4/0x114) from [c01d4238] (cpdma_chan_stop+0xf0/0x1c8) [c01d4238] (cpdma_chan_stop+0xf0/0x1c8) from [c01d4390] (cpdma_ctlr_stop+0x80/0xe4) [c01d4390] (cpdma_ctlr_stop+0x80/0xe4) from [c01d2c00] (emac_dev_stop+0xb0/0x13c) [c01d2c00] (emac_dev_stop+0xb0/0x13c) from [c01f5b40] (__dev_close+0x74/0x98) [c01f5b40] (__dev_close+0x74/0x98) from [c01f3168] (__dev_change_flags+0xac/0x13c) [c01f3168] (__dev_change_flags+0xac/0x13c) from [c01f5994] (dev_change_flags+0x10/0x44) [c01f5994] (dev_change_flags+0x10/0x44) from [c023cf60] (devinet_ioctl+0x2dc/0x6f4) [c023cf60] (devinet_ioctl+0x2dc/0x6f4) from [c01e484c] (sock_ioctl+0x1fc/0x258) [c01e484c] (sock_ioctl+0x1fc/0x258) from [c00aa0f0] (do_vfs_ioctl+0x550/0x5c0) [c00aa0f0] (do_vfs_ioctl+0x550/0x5c0) from [c00aa198] (sys_ioctl+0x38/0x5c) [c00aa198]
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Hi Cyril, On 09/08/2010 05:59 PM, Cyril Chemparathy wrote: Hi Mike, [...] So I just pulled this to test it out on a da850 based board (mitydspl138), and I'm having some problems. I'm hoping it's just operator error. Probably not, considering that this series had not been tested on da8xx platforms. Thanks for jumping in! Jumped, pushed, you say tomato, ... ;^) I have a TI TLK100PHP PHY at address 0x3. The boot log shows: [snip] davinci_mdio davinci_mdio.0: davinci mdio revision 1.5 davinci_mdio davinci_mdio.0: detected phy mask fff7 davinci_mdio.0: probed davinci_mdio davinci_mdio.0: phy[3]: device 0:03, driver unknown [snip] So far so good. The MDIO code seems to have done its part and latched on to the correct phy. To answer your question from below, the driver unknown implies that the generic phy driver has been attached to the detected phy. and then in the init scripts, following udev population I get a Configuring network interfaces... and the boot process just hangs. Not a peep from emac? Strange. So I went in and set the debug_level to netif_msg_init() to -1 in the davinci_emac driver. The only thing I get during the boot process is: net eth0: DaVinci EMAC Probe found device (regs: 01e23000, irq: 33) which is good (I also added some debug to make sure probe actually completes), but I don't see *anything* else. I guess I would have expected to see something out of the emac_dev_open. So I added some more debug in there, it's hanging in the phy_connect() call. The phy_id it came up with was correct 0:03. [snip] I spent some time today on testing this series on a da830 evm board (sorry, couldn't manage to scrounge up a da850 platform). In the process, I have had to put in some fixes, particularly for the phy-association piece. You can preview these commits on my repo's gitweb [1]. I did apply the requeue on early end-of-queue patch. No difference, which is not surprising as it's dying in the phy_connect. I don't think I need the others, as the correct phy_id is being located by the MDIO probe headed into the phy_connect call (pointer is valid, string is OK, etc.). If I have time I can continue to poke around, but I'm really not up on this chunk of code... -Mike -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 00/10] split out emac cpdma and mdio for reuse
Davinci's EMAC device has an in-built MDIO controller and a CPDMA engine. These hardware modules are not restricted to EMAC device alone. For example, CPSW3G (3-port gigabit ethernet switch) hardware uses these very same modules internally. This patch series separates out EMAC's MDIO and CPDMA functionality, allowing these individual pieces to be reused across TI hardware. This patch series has been broadly organized as follows: MDIO: - Add new functionality netdev: separate out davinci mdio controller code - Hookup new functionality davinci: add mdio platform devices omap: add mdio platform devices netdev: switch davinci emac to new mdio driver - Cleanup left over cruft davinci: cleanup unused davinci mdio arch code omap: cleanup unused davinci mdio arch code netdev: cleanup unused davinci mdio emac code CPDMA: - Add new functionality netdev: separate out davinci cpdma controller code - Hookup new functionality netdev: switch davinci emac to new cpdma layer - Cleanup left over cruft netdev: cleanup unused davinci emac cpdma code This series has been tested on dm365 and tnetv107x (with additional cpsw patches) hardware. Although am3517 (omap) board support code has been updated as needed, emac does not work on this platform. Changes from v1: 1. Fixed memory leak in cpdma_chan_create() failure case 2. Included new omap patches for am3517, avoids build breakage Changes from v2: 1. Updated series to include mityomapl138 board 2. Minor white-space fixes Cyril Chemparathy (10): net: davinci_emac: separate out davinci mdio davinci: add mdio platform devices omap: add mdio platform devices net: davinci_emac: switch to new mdio davinci: cleanup unused davinci mdio arch code omap: cleanup unused davinci mdio arch code net: davinci_emac: cleanup unused mdio emac code net: davinci_emac: separate out cpdma code net: davinci_emac: switch to new cpdma layer net: davinci_emac: cleanup unused cpdma code arch/arm/mach-davinci/board-da830-evm.c |5 - arch/arm/mach-davinci/board-da850-evm.c |6 - arch/arm/mach-davinci/board-dm365-evm.c |7 - arch/arm/mach-davinci/board-dm644x-evm.c|7 - arch/arm/mach-davinci/board-dm646x-evm.c|8 - arch/arm/mach-davinci/board-mityomapl138.c |7 - arch/arm/mach-davinci/board-neuros-osd2.c |7 - arch/arm/mach-davinci/board-sffsdr.c|7 - arch/arm/mach-davinci/devices-da8xx.c | 31 +- arch/arm/mach-davinci/dm365.c | 23 +- arch/arm/mach-davinci/dm644x.c | 23 +- arch/arm/mach-davinci/dm646x.c | 22 +- arch/arm/mach-davinci/include/mach/dm365.h |2 +- arch/arm/mach-davinci/include/mach/dm644x.h |2 +- arch/arm/mach-davinci/include/mach/dm646x.h |2 +- arch/arm/mach-omap2/board-am3517evm.c | 31 +- drivers/net/Kconfig | 21 + drivers/net/Makefile|2 + drivers/net/davinci_cpdma.c | 837 + drivers/net/davinci_cpdma.h | 105 +++ drivers/net/davinci_emac.c | 1325 --- drivers/net/davinci_mdio.c | 386 include/linux/davinci_emac.h|8 +- 23 files changed, 1635 insertions(+), 1239 deletions(-) create mode 100644 drivers/net/davinci_cpdma.c create mode 100644 drivers/net/davinci_cpdma.h create mode 100644 drivers/net/davinci_mdio.c -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
Cyril Chemparathy cy...@ti.com writes: Davinci's EMAC device has an in-built MDIO controller and a CPDMA engine. These hardware modules are not restricted to EMAC device alone. For example, CPSW3G (3-port gigabit ethernet switch) hardware uses these very same modules internally. This patch series separates out EMAC's MDIO and CPDMA functionality, allowing these individual pieces to be reused across TI hardware. OK, queuing this series for 2.6.37 in davinci-next. Kevin This patch series has been broadly organized as follows: MDIO: - Add new functionality netdev: separate out davinci mdio controller code - Hookup new functionality davinci: add mdio platform devices omap: add mdio platform devices netdev: switch davinci emac to new mdio driver - Cleanup left over cruft davinci: cleanup unused davinci mdio arch code omap: cleanup unused davinci mdio arch code netdev: cleanup unused davinci mdio emac code CPDMA: - Add new functionality netdev: separate out davinci cpdma controller code - Hookup new functionality netdev: switch davinci emac to new cpdma layer - Cleanup left over cruft netdev: cleanup unused davinci emac cpdma code This series has been tested on dm365 and tnetv107x (with additional cpsw patches) hardware. Although am3517 (omap) board support code has been updated as needed, emac does not work on this platform. Changes from v1: 1. Fixed memory leak in cpdma_chan_create() failure case 2. Included new omap patches for am3517, avoids build breakage Changes from v2: 1. Updated series to include mityomapl138 board 2. Minor white-space fixes Cyril Chemparathy (10): net: davinci_emac: separate out davinci mdio davinci: add mdio platform devices omap: add mdio platform devices net: davinci_emac: switch to new mdio davinci: cleanup unused davinci mdio arch code omap: cleanup unused davinci mdio arch code net: davinci_emac: cleanup unused mdio emac code net: davinci_emac: separate out cpdma code net: davinci_emac: switch to new cpdma layer net: davinci_emac: cleanup unused cpdma code arch/arm/mach-davinci/board-da830-evm.c |5 - arch/arm/mach-davinci/board-da850-evm.c |6 - arch/arm/mach-davinci/board-dm365-evm.c |7 - arch/arm/mach-davinci/board-dm644x-evm.c|7 - arch/arm/mach-davinci/board-dm646x-evm.c|8 - arch/arm/mach-davinci/board-mityomapl138.c |7 - arch/arm/mach-davinci/board-neuros-osd2.c |7 - arch/arm/mach-davinci/board-sffsdr.c|7 - arch/arm/mach-davinci/devices-da8xx.c | 31 +- arch/arm/mach-davinci/dm365.c | 23 +- arch/arm/mach-davinci/dm644x.c | 23 +- arch/arm/mach-davinci/dm646x.c | 22 +- arch/arm/mach-davinci/include/mach/dm365.h |2 +- arch/arm/mach-davinci/include/mach/dm644x.h |2 +- arch/arm/mach-davinci/include/mach/dm646x.h |2 +- arch/arm/mach-omap2/board-am3517evm.c | 31 +- drivers/net/Kconfig | 21 + drivers/net/Makefile|2 + drivers/net/davinci_cpdma.c | 837 + drivers/net/davinci_cpdma.h | 105 +++ drivers/net/davinci_emac.c | 1325 --- drivers/net/davinci_mdio.c | 386 include/linux/davinci_emac.h|8 +- 23 files changed, 1635 insertions(+), 1239 deletions(-) create mode 100644 drivers/net/davinci_cpdma.c create mode 100644 drivers/net/davinci_cpdma.h create mode 100644 drivers/net/davinci_mdio.c -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 00/10] split out emac cpdma and mdio for reuse
On 09/07/2010 09:18 PM, Kevin Hilman wrote: Cyril Chemparathy cy...@ti.com writes: Davinci's EMAC device has an in-built MDIO controller and a CPDMA engine. These hardware modules are not restricted to EMAC device alone. For example, CPSW3G (3-port gigabit ethernet switch) hardware uses these very same modules internally. This patch series separates out EMAC's MDIO and CPDMA functionality, allowing these individual pieces to be reused across TI hardware. OK, queuing this series for 2.6.37 in davinci-next. So I just pulled this to test it out on a da850 based board (mitydspl138), and I'm having some problems. I'm hoping it's just operator error. I have a TI TLK100PHP PHY at address 0x3. The boot log shows: [snip] davinci_mdio davinci_mdio.0: davinci mdio revision 1.5 davinci_mdio davinci_mdio.0: detected phy mask fff7 davinci_mdio.0: probed davinci_mdio davinci_mdio.0: phy[3]: device 0:03, driver unknown [snip] and then in the init scripts, following udev population I get a Configuring network interfaces... and the boot process just hangs. I went back and did a menuconfig, I didn't see an option for TI PHY support, but shouldn't there be a generic PHY driver that defaults if one isn't matched? I do have TI DaVinci EMAC/MDIO/CPDMA support enabled under the Ethernet (10 or 100 Mbit). Anyway, I've only looked at this for about 15 minutes, and I was hoping anyone might point out something obvious. If there are hints anyone might have for debugging, I'd appreciate them as well. Or, if anyone has tested this with a da850 evm, that would point me to a problem in the board file. Thanks for any insight. -Mike -- To unsubscribe from this list: send the line unsubscribe linux-omap in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html