Re: Xilinx axienet + DP83620 in fiber mode won't set netif_carrier_on

2018-05-16 Thread Alvaro G. M.
On Wed, May 16, 2018 at 03:11:34PM +0200, Andrew Lunn wrote:
> Hi Alvaro
> 
> What should happen in general terms is that at some point the link to
> the peer is established. phylib, the generic PHY code, polls the PHY
> every second, asking what the link state is. When the link changes from
> down to up, phylib will call the adjust_link callback in the MAC, and
> netif_carrier_on().
> 
> When the PHY reports the link has gone down, it does similar, calls
> the adjust_link callback, and netif_carrier_off().
> 
> So what you need to do is find out why the PHY driver never reports
> link up. Does the PHY even know when the link is up? Often SFF/SFP
> modules have a Signal Detect pin, which is connected to a gpio. Do you
> have something like that? If so, you should look at the PHYLINK code
> and the SFP device which was added recently.
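
For my own understanding, the adjust_link pattern you describe would look
roughly like this in a MAC driver (a generic sketch on my side, not the
actual axienet code; the speed/duplex programming is hardware specific and
only hinted at in a comment):

#include <linux/netdevice.h>
#include <linux/phy.h>

/* Generic sketch of a MAC adjust_link callback: phylib calls this (and
 * handles netif_carrier_on/off itself) whenever the PHY link state changes.
 */
static void my_mac_adjust_link(struct net_device *ndev)
{
	struct phy_device *phydev = ndev->phydev;

	if (phydev->link) {
		/* Link is up: reprogram the MAC speed/duplex registers
		 * according to phydev->speed and phydev->duplex
		 * (hardware specific, omitted here).
		 */
	}

	/* Print the usual "Link is Up/Down ..." message */
	phy_print_status(phydev);
}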


I didn't know about the SFP device. I don't think it will help in my specific
case, though, because my board doesn't route the I2C bus from the SFP cage,
so the module basically sits there doing its own thing and I can't
communicate with it.

I see that net/phy/marvell.c has a custom marvell_update_link that reads a
different register to check the fiber link status instead of using
genphy_update_link, which reads BMSR_LSTATUS from MII_BMSR.

It looks like the DP83620 may do something similar, and the fiber status may
be accessible from some other register. This starts to make sense, thanks for
setting me on the right track!
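
If that turns out to be the case, I imagine the fix would look something like
what marvell.c does: a driver-specific status read that checks the fiber link
bit instead of BMSR_LSTATUS. A rough sketch only; the register and bit below
are placeholders that I'd still have to confirm against the DP83620
datasheet:

#include <linux/bitops.h>
#include <linux/phy.h>

/* Placeholder register/bit: the real location of the DP83620 fiber link
 * status has to be taken from the datasheet.
 */
#define DP83620_FIBER_STATUS_REG	0x10
#define DP83620_FIBER_LINK_UP		BIT(0)

/* Sketch of a fiber-aware read_status, modelled on marvell_update_link() */
static int dp83620_fiber_read_status(struct phy_device *phydev)
{
	int status = phy_read(phydev, DP83620_FIBER_STATUS_REG);

	if (status < 0)
		return status;

	phydev->link = !!(status & DP83620_FIBER_LINK_UP);

	/* Assuming 100BASE-FX fixed at 100 Mbit/s full duplex, with no
	 * autonegotiation result to parse.
	 */
	phydev->speed = SPEED_100;
	phydev->duplex = DUPLEX_FULL;
	phydev->pause = 0;
	phydev->asym_pause = 0;

	return 0;
}

I guess the driver could then use something like this as its .read_status
when the PHY is strapped for fiber mode.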

Best regards!

-- 
Alvaro G. M.


Xilinx axienet + DP83620 in fiber mode won't set netif_carrier_on

2018-05-16 Thread Alvaro G. M.
Hi,

I have a custom board with a Xilinx FPGA running Microblaze and fitting a
Xilinx AXI Ethernet IP core.  This core communicates in MII mode with a
DP83620 PHY from Texas Instruments that supports both copper and fiber
interfaces, of which I'm using the latter.

Under these circumstances, I've noticed that the interface is pretty much
dead except for receiving broadcast packets, so I tried to dig into the
driver to find the cause. Please bear in mind that I'm not very familiar with
the netdev subsystem, so I may be mistaken about a lot of things.

It seems that of_phy_connect ends up calling netif_carrier_off:

phy_device.c:1036
/* Initial carrier state is off as the phy is about to be
 * (re)initialized.
 */
netif_carrier_off(phydev->attached_dev);

/* Do initial configuration here, now that
 * we have certain key parameters
 * (dev_flags and interface)
 */
err = phy_init_hw(phydev);
if (err)
	goto error;

phy_resume(phydev);
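
For reference, the connect/start sequence in the open() path looks roughly
like this (a simplified generic sketch, not the exact axienet code; phy_node
and the adjust_link stub are just placeholders):

#include <linux/netdevice.h>
#include <linux/of_mdio.h>
#include <linux/phy.h>

/* Placeholder adjust_link callback; the real driver reprograms the MAC
 * speed/duplex here.
 */
static void my_mac_adjust_link(struct net_device *ndev)
{
	phy_print_status(ndev->phydev);
}

/* Simplified open() sequence: of_phy_connect() attaches the PHY (this is
 * where the netif_carrier_off() above happens) and phy_start() starts the
 * phylib state machine that polls the PHY from then on.
 */
static int my_mac_open_phy(struct net_device *ndev,
			   struct device_node *phy_node)
{
	struct phy_device *phydev;

	phydev = of_phy_connect(ndev, phy_node, my_mac_adjust_link, 0,
				PHY_INTERFACE_MODE_MII);
	if (!phydev)
		return -ENODEV;

	phy_start(phydev);
	return 0;
}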

However, neither xilinx_axienet_main.c nor dp83848.c ever runs
netif_carrier_on. As a simple test, I tried this patch, and that was enough
to make the interface work.

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index e74e1e897864..d8bbe4c51b8a 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -957,6 +957,8 @@ static int axienet_open(struct net_device *ndev)
 	if (ret)
 		goto err_rx_irq;
 
+	netif_carrier_on(ndev);
+
 	return 0;
 
 err_rx_irq:


I understand, however, that this is just a proof of concept that shows the
underlying issue. I'd like to help turn this into a proper patch, or maybe
someone who is familiar with the netdev subsystem can see the proper solution
at first sight.

My understanding is that this code works fine with other PHY chips, as
pretty much the same code has been in the kernel for a long time, but that
before ee06b1728b95643668e40fc58ae118aeb7c1753e (which I instigated) this
Xilinx core and driver had probably never been tested with any interface
other than GMII and RGMII, which back then were hardcoded explicitly, with an
unknown PHY chip.

I should also note that axienet_adjust_link is never called in this
configuration, which is where I think the call to netif_carrier_on should end
up happening (based on what I've read in other Ethernet drivers), but it
seems that the DP83620 never reports any autonegotiation/link change (at
least while in fiber mode).

I'm open to reading and testing whatever is needed, and please feel free to
correct me if I've said anything incorrect, which I most probably have.

Best regards

-- 
Alvaro G. M.


xilinx_axienet: No interrupts asserted in Tx path?

2017-04-03 Thread Alvaro G. M.
Hi

I'm trying to use the tri_mode_ethernet_mac & axi_dma cores from Vivado
2016.2, but I'm facing several problems that I believe are related to the
Linux driver for these modules, xilinx_axienet_main.c.

First of all, I've noticed that __axienet_device_reset is called twice (once
for TX and once again for RX).  However, this results in both
{mm2s,s2mm}_prmry_reset_out_n staying in reset (0), and they never change
back to 1, as seen with an ILA probe.  Also, bit 2 of S2MM_DMACR (30h)
remains active, as if the DMA core were still trying to reset itself.  Even
if I modify __axienet_device_reset to write a 0 to that bit after the timeout
expires, it keeps its previous value of 1, meaning that the core is somehow
stalled?
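
For reference, a throwaway helper like the one below could dump the state of
both reset bits from the driver side (it relies on the axienet_dma_in32()
accessor and the XAXIDMA_* defines already in xilinx_axienet.h):

#include <linux/printk.h>
#include "xilinx_axienet.h"

/* Debug-only: dump the reset bit of both DMA channel control registers
 * after __axienet_device_reset() has run.
 */
static void axienet_dump_reset_state(struct axienet_local *lp)
{
	u32 tx_cr = axienet_dma_in32(lp, XAXIDMA_TX_CR_OFFSET); /* MM2S_DMACR */
	u32 rx_cr = axienet_dma_in32(lp, XAXIDMA_RX_CR_OFFSET); /* S2MM_DMACR */

	pr_info("MM2S_DMACR=0x%08x (%s), S2MM_DMACR=0x%08x (%s)\n",
		tx_cr, tx_cr & XAXIDMA_CR_RESET_MASK ? "still in reset" : "out of reset",
		rx_cr, rx_cr & XAXIDMA_CR_RESET_MASK ? "still in reset" : "out of reset");
}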
 

If I remove all calls to the __axienet_device_reset function, the core stays
active, and the whole system almost begins to work.  In this case, I try to
ping my computer, on which I have Wireshark running.  I see that ARP packets
are being sent from the FPGA to the network through the PHY, and my computer
sees them.  However, the Linux system on the FPGA keeps telling me this:

net eth0: No interrupts asserted in Tx path
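
As far as I can tell, that message is printed by the Tx interrupt handler
when the DMA status register shows no interrupt bits set at all; roughly this
pattern (my paraphrase of axienet_tx_irq, not a verbatim copy):

#include <linux/interrupt.h>
#include <linux/netdevice.h>
#include "xilinx_axienet.h"

/* Paraphrased sketch of the check in axienet_tx_irq() that produces the
 * message above: the handler fires, but XAXIDMA_TX_SR_OFFSET reports no
 * pending interrupt bits, which smells like a spurious or mis-wired
 * interrupt rather than a normal Tx completion.
 */
static irqreturn_t tx_irq_sketch(int irq, void *_ndev)
{
	struct net_device *ndev = _ndev;
	struct axienet_local *lp = netdev_priv(ndev);
	u32 status = axienet_dma_in32(lp, XAXIDMA_TX_SR_OFFSET);

	if (status & (XAXIDMA_IRQ_IOC_MASK | XAXIDMA_IRQ_DELAY_MASK)) {
		/* Normal completion: ack and let the driver reclaim BDs */
		axienet_dma_out32(lp, XAXIDMA_TX_SR_OFFSET, status);
	} else if (!(status & XAXIDMA_IRQ_ALL_MASK)) {
		dev_err(&ndev->dev, "No interrupts asserted in Tx path\n");
	}

	return IRQ_HANDLED;
}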

And even though my computer is answering the ARP requests, the embedded
system does not seem to receive the replies.  If I manually set the ARP entry
for my computer on the embedded Linux, I then see the ICMP echo request
packets being sent out, but once again the same interrupt-related message is
printed and of course it doesn't seem to react to the answer the host is
sending.


So, there is something wrong around here, but I don't know what else to try.
I don't expect it to be a bug in the DMA core (although the fact that it
remains in reset is quite strange), but I guess it could still be a hardware
(VHDL/block design) problem.

Any ideas?

Thanks, best regards!


-- 
Alvaro G. M.