Re: [PATCH net-next 1/3] net: ethtool: add support for forward error correction modes

2017-06-28 Thread Dustin Byford
Hi Andrew,

On Wed Jun 28 15:41, Andrew Lunn wrote:
> On Tue, Jun 27, 2017 at 03:22:39AM -0700, Jakub Kicinski wrote:
> > On Sat, 24 Jun 2017 12:19:43 -0700, Roopa Prabhu wrote:
> > > Encoding: Types of encoding
> > > Off:  Turning off any encoding
> > > RS :  enforcing RS-FEC encoding on supported speeds
> > > BaseR  :  enforcing Base R encoding on supported speeds
> > > Auto   :  IEEE defaults for the speed/medium combination
> > 
> > Just to be sure - does auto mean autonegotiate as defined by IEEE or
> > some presets?
> 
> I don't know this field very well. Is this confusion likely to happen
> a lot? Is there a better name for Auto which is less likely to be
> confused?

You're not the first, or the second to ask that question.  I agree it
could use clarification.

I always read auto in this context as automatic rather than autoneg.
The best I can come up with is to perhaps fully spell out "automatic" in
the documentation and the associated uapi enums.  It's accurate, and
hopefully different enough from "autoneg" to hint people away from the
IEEE autoneg concept.

--Dustin


Re: [PATCH net-next 1/3] net: ethtool: add support for forward error correction modes

2017-06-27 Thread Dustin Byford
Hi Jakub,

On Tue Jun 27 03:22, Jakub Kicinski wrote:
> On Sat, 24 Jun 2017 12:19:43 -0700, Roopa Prabhu wrote:
> > Encoding: Types of encoding
> > Off:  Turning off any encoding
> > RS :  enforcing RS-FEC encoding on supported speeds
> > BaseR  :  enforcing Base R encoding on supported speeds
> > Auto   :  IEEE defaults for the speed/medium combination
> 
> Just to be sure - does auto mean autonegotiate as defined by IEEE or
> some presets?  IIUC there is a notion of different length cables
> defaulting to different strength of FEC in 25GE?

In this context, "auto" doesn't mean autoneg.  Typically, if it's a
copper (CR) link autoneg has taken care of the FEC settings.  If you're
using sfecparam, you're probably dealing with optics where there is no
real autoneg.

Here, the term "auto", in its simplest implementation, would mean the
driver picks a preset based on the speed, cable (typically an SFF
cable), and hardware capability.  "IEEE defaults" in the comment refers
to a couple of tables you'll find in IEEE specs (sorry I can't dig them
up at the moment) that suggest FEC modes based on speed, medium
(CR/SR/LR) and perhaps length in the CR case.  So, yes, perhaps length
is part of the calculation for the appropriate "auto" mode.  This is
essentially why this patch set is important.  Drivers up until now can
be thought of as implementing "auto".  Sometimes that matches the
expectation of a link partner, especially if both sides support the
"IEEE defaults", but it's somewhat common for there to be a mismatch.
That can be because one side isn't using the "IEEE default" FEC mode for
the speed/medium/length, but it can also be because the hardware doesn't
support the most common FEC mode for a speed/medium/length.  We need a
way to explicitly set the FEC mode.  But, the hope is that "auto" is a
reasonable default.

--Dustin


Re: [PATCH net-next 1/3] net: ethtool: add support for forward error correction modes

2017-06-27 Thread Dustin Byford
Hi Gal,

On Sun Jun 25 16:38, Gal Pressman wrote:
> 
> > ...
> >
> > SHOW FEC option:
> > root@tor: ethtool --show-fec  swp1
> > FEC parameters for swp1:
> > Active FEC encodings: RS
> > Configured FEC encodings:  RS | BaseR
> >
> > ETHTOOL DEVNAME output modification:
> >
> > ethtool devname output:
> > root@tor:~# ethtool swp1
> > Settings for swp1:
> > root@hpe-7712-03:~# ethtool swp18
> > Settings for swp18:
> > Supported ports: [ FIBRE ]
> > Supported link modes:   4baseCR4/Full
> > 4baseSR4/Full
> > 4baseLR4/Full
> > 10baseSR4/Full
> > 10baseCR4/Full
> > 10baseLR4_ER4/Full
> > Supported pause frame use: No
> > Supports auto-negotiation: Yes
> > Supported FEC modes: [RS | BaseR | None | Not reported]
> > Advertised link modes:  Not reported
> > Advertised pause frame use: No
> > Advertised auto-negotiation: No
> > Advertised FEC modes: [RS | BaseR | None | Not reported]
> >  One or more FEC modes
> > Speed: 10Mb/s
> > Duplex: Full
> > Port: FIBRE
> > PHYAD: 106
> > Transceiver: internal
> > Auto-negotiation: off
> > Link detected: yes

> What is the difference between the information in ethtool DEVNAME and ethtool 
> --show-fec DEVNAME?

I think there are two questions there.  First, how does the FEC-related
information from glinksettings differ from what's retrieved via
gfecparam.  Second, how is that expressed through the ethtool UI.

Regarding the uapi (as we imagined it), glinksettings returns FEC
information through three fields:

@supported: the complete set of FEC modes the hardware supports, at any
speed, medium, or autoneg combination.

@advertising: the set of modes advertised to the link partner through
the relevant autoneg mechanism.

@lp_advertising: the set of modes the link partner is advertising
through autoneg.


gfecparam is used to fetch a couple more important facts about the FEC
configuration:

1) What FEC mode is currently active, either as a result of the autoneg
process, or a previous call to sfecparam.  This is returned in
sfecparam->active_fec

2) If autoneg is off, what is the currently configured FEC mode.  This
is a bitmask returned in gfecparam->fec.  I imagine it's typically a
single mode, but a mask makes it easier to implement a "don't care" policy,
or otherwise allow the NIC/driver to pick between a set of modes.


Regarding the UI.  ethtool DEVNAME gets most of its info from
glinksettings and it's easy to represent the FEC parameters affected by
autoneg there.  ethtool --show-fec simply reports the output of
gfecparam.  I agree the difference is subtle, perhaps it makes sense to
combine all the FEC information into ethtool DEVNAME?

> I can't find a usage of LINK_MODE_FEC_* bits in downstream patches.

I'm not sure what patches you're looking at, but I think those bits
directly affect the "Advertised FEC modes" and "Supported FEC Modes"
fields.


--Dustin


Re: [PATCH RFC 00/26] Phylink & SFP support

2015-12-28 Thread Dustin Byford
Hi Florian,

On Sun Dec 27 18:08, Florian Fainelli wrote:
> On December 14, 2015 11:26:21 PM PST, Dustin Byford 
>  wrote:
> >On Mon Dec 07 17:35, Russell King - ARM Linux wrote:
> >> Hi,
> >
> >Hello.
> >
> >> SFP modules are hot-pluggable ethernet transceivers; they can be
> >> detected at runtime and accordingly configured.  There are a range of
> >> modules offering many different features.
> >> 
> >> Some SFP modules have PHYs conventional integrated into them, others
> >> drive a laser diode from the Serdes bus.  Some have monitoring,
> >others
> >> do not.
> >> 
> >> Some SFP modules want to use SGMII over the Serdes link, others want
> >> to use 1000base-X over the Serdes link.
> >> 
> >> This makes it non-trivial to support with the existing code
> >structure.
> >> Not wanting to write something specific to the mvneta driver, I
> >decided
> >> to have a go at coming up with something more generic.
> >> 
> >> My initial attempts were to provide a PHY driver, but I found that
> >> phylib's state machine got in the way, and it was hard to support two
> >> chained PHYs.  Conversely, having a fixed DT specified setup (via
> >> the fixed phy infrastructure) would allow some SFP modules to work,
> >but
> >> not others.  The same is true of the "managed" in-band status (which
> >> is SGMII.)
> >> 
> >> The result is that I came up with phylink - an infrastructure layer
> >> which sits between the network driver and any attached PHY, and a
> >> SFP module layer detects the SFP module, and configures phylink
> >> accordingly.
> >> 
> >> Overall, this supports:
> >> 
> >> * switching the serdes mode at the NIC driver
> >> * controlling autonegotiation and autoneg results
> >> * allowing PHYs to be hotplugged
> >> * allowing SFP modules to be hotplugged with proper link indication
> >> * fixed-mode links without involving phylib
> >> * flow control
> >> * EEE support
> >> * reading SFP module EEPROMs
> >> 
> >> Overall, phylink supports several link modes, with dynamic switching
> >> possible between these:
> >> * A true fixed link mode, where the parameters are set by DT.
> >> * PHY mode, where we read the negotiation results from the PHY
> >registers
> >>   and pass them to the NIC driver.
> >> * SGMII mode, where the in-band status indicates the speed, duplex
> >and
> >>   flow control settings of the link partner.
> >> * 1000base-X mode, where the in-band status indicates only duplex and
> >>   flow control settings (different, incompatible bit layout from
> >SGMII.)
> >
> >I've been working on some similar code to handle interactions with a
> >wide range of SFF modules, 1G to 100G, on Linux network switches for
> >some time.  For practical reasons a lot of that was in userspace but
> >I've been planning and recently working on an SFF kernel driver that
> >does some of what's done in this series.  I think the model you're
> >proposing is right on, and since you're further along in implementation
> >I'd like to help round out support for the other SFF modules if I can.
> >Then make this work on the network ASICs I have access to.
> >
> >Any concrete plans for QSFP or the new 25G modules?
> >
> >> Ethtool support is included, as well as emulation of the MII
> >registers
> >> for situations where a PHY is not attached, giving compatible
> >emulation
> >> of existing user interfaces where required.
> >> 
> >> The patches here include modification of mvneta (against 4.4-rc1, so
> >> probably won't apply to current development tips.)  It basically
> >> hooks into the places where the phylib would hook into.
> >> 
> >> DT wise, the changes needed to support SFP look like this (example
> >> taken from Clearfog):
> >> 
> >>ethernet@34000 {
> >> +  managed = "in-band-status";
> >>phy-mode = "sgmii";
> >>status = "okay";
> >> -
> >> -  fixed-link {
> >> -  speed = <1000>;
> >> -  full-duplex;
> >> -  };
> >>};
> >> ...
> >&

Re: [PATCH RFC 00/26] Phylink & SFP support

2015-12-14 Thread Dustin Byford
On Mon Dec 07 17:35, Russell King - ARM Linux wrote:
> Hi,

Hello.

> SFP modules are hot-pluggable ethernet transceivers; they can be
> detected at runtime and accordingly configured.  There are a range of
> modules offering many different features.
> 
> Some SFP modules have PHYs conventional integrated into them, others
> drive a laser diode from the Serdes bus.  Some have monitoring, others
> do not.
> 
> Some SFP modules want to use SGMII over the Serdes link, others want
> to use 1000base-X over the Serdes link.
> 
> This makes it non-trivial to support with the existing code structure.
> Not wanting to write something specific to the mvneta driver, I decided
> to have a go at coming up with something more generic.
> 
> My initial attempts were to provide a PHY driver, but I found that
> phylib's state machine got in the way, and it was hard to support two
> chained PHYs.  Conversely, having a fixed DT specified setup (via
> the fixed phy infrastructure) would allow some SFP modules to work, but
> not others.  The same is true of the "managed" in-band status (which
> is SGMII.)
> 
> The result is that I came up with phylink - an infrastructure layer
> which sits between the network driver and any attached PHY, and a
> SFP module layer detects the SFP module, and configures phylink
> accordingly.
> 
> Overall, this supports:
> 
> * switching the serdes mode at the NIC driver
> * controlling autonegotiation and autoneg results
> * allowing PHYs to be hotplugged
> * allowing SFP modules to be hotplugged with proper link indication
> * fixed-mode links without involving phylib
> * flow control
> * EEE support
> * reading SFP module EEPROMs
> 
> Overall, phylink supports several link modes, with dynamic switching
> possible between these:
> * A true fixed link mode, where the parameters are set by DT.
> * PHY mode, where we read the negotiation results from the PHY registers
>   and pass them to the NIC driver.
> * SGMII mode, where the in-band status indicates the speed, duplex and
>   flow control settings of the link partner.
> * 1000base-X mode, where the in-band status indicates only duplex and
>   flow control settings (different, incompatible bit layout from SGMII.)

I've been working on some similar code to handle interactions with a
wide range of SFF modules, 1G to 100G, on Linux network switches for
some time.  For practical reasons a lot of that was in userspace but
I've been planning and recently working on an SFF kernel driver that
does some of what's done in this series.  I think the model you're
proposing is right on, and since you're further along in implementation
I'd like to help round out support for the other SFF modules if I can.
Then make this work on the network ASICs I have access to.

Any concrete plans for QSFP or the new 25G modules?

> Ethtool support is included, as well as emulation of the MII registers
> for situations where a PHY is not attached, giving compatible emulation
> of existing user interfaces where required.
> 
> The patches here include modification of mvneta (against 4.4-rc1, so
> probably won't apply to current development tips.)  It basically
> hooks into the places where the phylib would hook into.
> 
> DT wise, the changes needed to support SFP look like this (example
> taken from Clearfog):
> 
>   ethernet@34000 {
> + managed = "in-band-status";
>   phy-mode = "sgmii";
>   status = "okay";
> -
> - fixed-link {
> - speed = <1000>;
> - full-duplex;
> - };
>   };
> ...
> + sfp: sfp {
> + compatible = "sff,sfp";
> + i2c-bus = <&i2c1>;
> + los-gpio = <&expander0 12 GPIO_ACTIVE_HIGH>;
> + moddef0-gpio = <&expander0 15 GPIO_ACTIVE_LOW>;
> + sfp,ethernet = <ð2>;

Using ð2 is unambiguous in the this case because there's only one
serdes and one mac involved.  To specify the mac/serdes/cage
associations at the same level of detail as the gpios it might be nice
(at least for some devices) to point to a serdes node (or 4 in the case
of QSFP) instead of ð2.  Any thoughts on that?

Switch ASICs, and I imagine at least some NICs, are really flexible in
terms of how serdes are wired to a cage.  Both in the sense that the
board designer gets to pick which wires route to the cage based on
physical constraints and the user gets to pick which serdes or group of
serdes compose the ethernet device.  For example, using a breakout cable
to get 4xSFP out of a QSFP or the other way around.

Perhaps the simple case (sfp,ethernet -> ð2) can remain simple, but
I'd be interested in any thoughts you have on introducing a serdes
layer here.

I think adding such a layer would make it easier to 1) make serdes to
cage mappings part of the platform description (DT or ACPI) and 2) al