Re: [RFC PATCH] arm: Don't trap conditional UDF instructions

2020-05-13 Thread Russell King - ARM Linux admin
On Wed, May 13, 2020 at 05:41:58PM +0200, Fredrik Strupe wrote:
> Hi,
> 
> This is more of a question than a patch, but I hope the attached patch makes
> the issue a bit clearer.
> 
> The arm port of Linux supports hooking/trapping of undefined instructions.
> Some parts of the code use this to trap UDF instructions with certain
> immediates in order to use them for other purposes, like 'UDF #16' which is
> equivalent to a BKPT instruction in A32.
> 
> Moreover, most of the undef hooks on UDF instructions assume that UDF is
> conditional and mask out the condition prefix during matching. The attached
> patch shows the locations where this happens. However, the Arm architecture
> reference manual explicitly states that UDF is *not* conditional, making
> any instruction encoding with a condition prefix other than 0xe (always
> execute) unallocated.

The latest version of the ARM architecture reference manual may say
that, but earlier versions say different things. The latest reference
manual does not apply to earlier architectures, so if you're writing
code to cover multiple different architectures, you must have an
understanding of each of those architectures.

So, from the code:

ARM:    0111      

From DDI0100E:

3.13.1 Undefined instruction space
   Instructions with the following opcodes are undefined
   instruction space:

   opcode[27:25] = 0b011
   opcode[4] = 1

   31 28 27 26 25 24 5 4 3 0
   cond  0  1  1  x  x x x x x x x x x x x x x x x x x x x 1 x x x x

So, in this version of the architecture, undefined instructions may
be conditional - and indeed that used to be the case.  The condition
code was always respected, and cond=1111 meant "never" (NV).

Hence, trapping them if the condition code is not 1110 (AL) is
entirely reasonable, legal and safe.  If an ARM CPU defines an
instruction coding that matches the above, then it won't take the
undefined instruction trap, and we'll never see it.

Now, as for UDF usage in the kernel, it may be quite correct that we
always use the AL condition code for them, but it would be very odd
for there to be an instruction implemented with a different (non-NV)
condition code that can't also have its AL condition code encoding.
You could never execute such an instruction unconditionally.
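
The matching argument above can be sketched in a few lines of standalone C.
This is only an illustration, not the kernel's undef hook code: the helper
names are made up for the example, the bit positions are taken from the
DDI0100E excerpt, and 0xe7f001f0 is the A32 encoding of 'UDF #16'.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [31:28] hold the condition code in A32. */
#define COND_MASK	0xf0000000u

/* DDI0100E 3.13.1: opcode[27:25] == 0b011 and opcode[4] == 1. */
static bool in_undef_space(uint32_t insn)
{
	return (insn & 0x0e000010u) == 0x06000010u;
}

/* Generic (instruction & mask) == value test, as a hook table would apply. */
static bool hook_matches(uint32_t insn, uint32_t mask, uint32_t val)
{
	return (insn & mask) == val;
}

/* Match 'UDF #16' while ignoring the condition field (hypothetical helper). */
static bool is_udf16_any_cond(uint32_t insn)
{
	return hook_matches(insn, ~COND_MASK, 0xe7f001f0u & ~COND_MASK);
}

/* Match 'UDF #16' only with the AL (1110) condition (hypothetical helper). */
static bool is_udf16_al_only(uint32_t insn)
{
	return hook_matches(insn, 0xffffffffu, 0xe7f001f0u);
}
```

Both forms agree on 0xe7f001f0; they differ only for encodings such as
0x17f001f0, which still lie in the undefined instruction space described
above and so would only ever reach the hook if the CPU trapped them.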

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH net-next 1/4] net: ethernet: validate pause autoneg setting

2020-05-13 Thread Russell King - ARM Linux admin
On Wed, May 13, 2020 at 03:49:25PM +0200, Andrew Lunn wrote:
> Hi Russell, Doug
> 
> With netlink ethtool we have the possibility of adding a new API to
> control this. And we can leave the IOCTL API alone, and the current
> ethtool commands. We can add a new command to ethtool which uses the new API.
> 
> Question is, do we want to do this? Would we be introducing yet more
> confusion, rather than making the situation better?

The conclusion I came to was that I would document the deficiencies
and do no more; I think people are used to its current quirky
behaviour.



Re: [PATCH v1] net: phy: at803x: add cable test support

2020-05-13 Thread Russell King - ARM Linux admin
On Wed, May 13, 2020 at 03:32:09PM +0200, Andrew Lunn wrote:
> On Wed, May 13, 2020 at 02:06:48PM +0200, Oleksij Rempel wrote:
> > The cable test seems to be supported by all of the currently supported Atheros
> > PHYs, so add support for all of them. This patch was tested only on
> > AR9331 PHY with following results:
> > - No cable is detected as short
> > - A 15m long cable connected only on one side is detected as 9m open.
> 
> That sounds wrong. What about a shorted 15m cable? Is it also 9m?  Do
> you have any other long cables you can test with? Is it always 1/2 the
> cable length?

I had similar inaccuracies with my recent faulty cable when testing
with a Marvell PHY as I mentioned.

"Using the VCT in the Marvell PHY points to it being pair 3, at a
distance of 0x190 or 0x50 depending on which way round the cable is
connected.  That's in cm.  The cable isn't 480cm long, it's 278cm
long, and the problem is up by one of the connectors."

0x190 = 400cm, 0x50 = 80cm.

Given that the issue was at one of the connectors on the cable, and
I tried VCT with it plugged into the same port, you can't even say
"well, if we define the start of the cable at 80cm, then that works
for the cable connected the other way around" - it gets us closer
but it's still about 30cm wrong.

It doesn't even work if you think maybe the figures have forgotten
to take into account the fact that the TDR pulse has to go out and
then return (so travel twice the distance, so maybe the figures are
doubled.)
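
The arithmetic behind those two hypotheses can be checked with a short
standalone C sketch.  The function names are made up for the example, and
it assumes the fault sits right at the far connector, which the mail only
approximates as "up by one of the connectors".

```c
#include <stdlib.h>

/* VCT readings and cable length from the mail, in centimetres. */
enum {
	READING_A_CM = 0x190,	/* 400 */
	READING_B_CM = 0x50,	/* 80 */
	CABLE_CM     = 278,
};

/* Hypothesis 1: treat the smaller reading as a fixed offset to subtract. */
static int offset_corrected_cm(int reading_cm, int offset_cm)
{
	return reading_cm - offset_cm;
}

/* Hypothesis 2: the figure counts the round trip, so halve it. */
static int halved_cm(int reading_cm)
{
	return reading_cm / 2;
}

/* Absolute error of an estimate against the expected distance. */
static int error_cm(int estimate_cm, int expected_cm)
{
	return abs(estimate_cm - expected_cm);
}
```

Under that assumption the offset hypothesis gives 320cm against a 278cm
cable, and halving gives 200cm and 40cm, neither of which lands on a
connector.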

So, it seems we have more than one PHY that produces only wildly
inaccurate guesses at the distance to the fault.

I'd say this technology is an "it would be nice if we could" but the
results cannot be relied upon.  It may be grounded in hard physics,
but there's clearly something causing incorrect results.



Re: [PATCH v4 3/5] net: macb: fix macb_get/set_wol() when moving to phylink

2020-05-13 Thread Russell King - ARM Linux admin
On Wed, May 06, 2020 at 01:37:39PM +0200, nicolas.fe...@microchip.com wrote:
> From: Nicolas Ferre 
> 
> Keep previous function goals and integrate phylink actions to them.
> 
> phylink_ethtool_get_wol() is not enough to figure out if Ethernet driver
> supports Wake-on-Lan.
> Initialization of "supported" and "wolopts" members is done in phylink
> function, no need to keep them in calling function.
> 
> phylink_ethtool_set_wol() return value is not enough to determine
> if WoL is enabled for the calling Ethernet driver. Call it first
> but don't rely on its return value as most of simple PHY drivers
> don't implement a set_wol() function.
> 
> Fixes: 7897b071ac3b ("net: macb: convert to phylink")
> Signed-off-by: Nicolas Ferre 
> Reviewed-by: Florian Fainelli 
> Cc: Claudiu Beznea 
> Cc: Harini Katakam 
> Cc: Antoine Tenart 
> ---
>  drivers/net/ethernet/cadence/macb_main.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index 53e81ab048ae..24c044dc7fa0 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -2817,21 +2817,23 @@ static void macb_get_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
>  {
>  	struct macb *bp = netdev_priv(netdev);
>  
> -	wol->supported = 0;
> -	wol->wolopts = 0;
> -
> -	if (bp->wol & MACB_WOL_HAS_MAGIC_PACKET)
> +	if (bp->wol & MACB_WOL_HAS_MAGIC_PACKET) {
>  		phylink_ethtool_get_wol(bp->phylink, wol);
> +		wol->supported |= WAKE_MAGIC;
> +
> +		if (bp->wol & MACB_WOL_ENABLED)
> +			wol->wolopts |= WAKE_MAGIC;
> +	}
>  }
>  
>  static int macb_set_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
>  {
>  	struct macb *bp = netdev_priv(netdev);
> -	int ret;
>  
> -	ret = phylink_ethtool_set_wol(bp->phylink, wol);
> -	if (!ret)
> -		return 0;
> +	/* Pass the order to phylink layer.
> +	 * Don't test return value as set_wol() is often not supported.
> +	 */
> +	phylink_ethtool_set_wol(bp->phylink, wol);

If this returns an error, does that mean WOL works or does it not?

Note that if set_wol() is not supported, this will return -EOPNOTSUPP.
What about other errors?

If you want to just ignore the case where it's not supported, then
this looks like a sledge hammer to crack a nut.
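
One way to read that suggestion, as a standalone sketch rather than the
actual macb code: tolerate only the "not supported" case and pass every
other error back to the caller.  The helper name is hypothetical.

```c
#include <errno.h>

/*
 * Hypothetical helper: decide what to do with the value returned by the
 * PHY layer's set_wol path.  Only "not supported" is tolerated.
 */
static int wol_filter_phy_error(int ret)
{
	if (ret == -EOPNOTSUPP)
		return 0;	/* PHY has no set_wol(); the MAC may still do WoL */

	return ret;		/* 0 on success, or a real error to report */
}
```

A caller would then do something like "ret = wol_filter_phy_error(ret);
if (ret) return ret;" instead of discarding the return value entirely.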



Re: [PATCH net-next 4/4] net: bcmgenet: add support for ethtool flow control

2020-05-13 Thread Russell King - ARM Linux admin
On Mon, May 11, 2020 at 05:24:10PM -0700, Doug Berger wrote:
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
> index 511d553a4d11..788da1ecea0c 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
> @@ -25,6 +25,21 @@
>  
>  #include "bcmgenet.h"
>  
> +static u32 _flow_control_autoneg(struct phy_device *phydev)
> +{
> +	bool tx_pause, rx_pause;
> +	u32 cmd_bits = 0;
> +
> +	phy_get_pause(phydev, &tx_pause, &rx_pause);
> +
> +	if (!tx_pause)
> +		cmd_bits |= CMD_TX_PAUSE_IGNORE;
> +	if (!rx_pause)
> +		cmd_bits |= CMD_RX_PAUSE_IGNORE;
> +
> +	return cmd_bits;
> +}
> +
>  /* setup netdev link state when PHY link status change and
>   * update UMAC and RGMII block when link up
>   */
> @@ -71,12 +86,20 @@ void bcmgenet_mii_setup(struct net_device *dev)
>   cmd_bits <<= CMD_SPEED_SHIFT;
>  
>   /* duplex */
> - if (phydev->duplex != DUPLEX_FULL)
> - cmd_bits |= CMD_HD_EN;
> -
> - /* pause capability */
> - if (!phydev->pause)
> - cmd_bits |= CMD_RX_PAUSE_IGNORE | CMD_TX_PAUSE_IGNORE;
> + if (phydev->duplex != DUPLEX_FULL) {
> + cmd_bits |= CMD_HD_EN |
> + CMD_RX_PAUSE_IGNORE | CMD_TX_PAUSE_IGNORE;

phy_get_pause() already takes account of whether the PHY is in half
duplex mode.  So:

	bool tx_pause, rx_pause;

	if (phydev->autoneg && priv->autoneg_pause) {
		phy_get_pause(phydev, &tx_pause, &rx_pause);
	} else if (phydev->duplex == DUPLEX_FULL) {
		tx_pause = priv->tx_pause;
		rx_pause = priv->rx_pause;
	} else {
		tx_pause = false;
		rx_pause = false;
	}

	if (!tx_pause)
		cmd_bits |= CMD_TX_PAUSE_IGNORE;
	if (!rx_pause)
		cmd_bits |= CMD_RX_PAUSE_IGNORE;

would be entirely sufficient here.

I wonder whether your implementation (which mine follows) is really
correct though.  Consider this:

# ethtool -A eth0 autoneg on tx on rx on
# ethtool -s eth0 autoneg off speed 1000 duplex full

At this point, what do you expect the resulting pause state to be?  It
may not be what you actually think it should be - it will be tx and rx
pause enabled (it's easier to see why that happens with my rewritten
version of your implementation, which is functionally identical.)

If we take the view that if link autoneg is disabled, and therefore the
link partner's advertisement is zero, shouldn't it result in tx and rx
pause being disabled?
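
To make the difference concrete, here is a standalone model of both
readings.  Everything in it is an assumption for illustration: the struct
fields stand in for the phydev/priv state used above, neg_tx/neg_rx stand
in for what phy_get_pause() would report, and resolve_alternative() is
just one possible way to implement the view in the question.

```c
#include <stdbool.h>

/* Hypothetical model of the state the snippet above consults. */
struct pause_state {
	bool link_autoneg;	/* ethtool -s ... autoneg on|off */
	bool duplex_full;
	bool pause_autoneg;	/* ethtool -A ... autoneg on|off */
	bool cfg_tx, cfg_rx;	/* ethtool -A ... tx/rx */
	bool neg_tx, neg_rx;	/* negotiated result, if any */
};

struct pause_result {
	bool tx, rx;
};

/* Same decision tree as the rewritten snippet in this mail. */
static struct pause_result resolve_as_written(const struct pause_state *s)
{
	struct pause_result r = { false, false };

	if (s->link_autoneg && s->pause_autoneg) {
		r.tx = s->neg_tx;
		r.rx = s->neg_rx;
	} else if (s->duplex_full) {
		r.tx = s->cfg_tx;
		r.rx = s->cfg_rx;
	}
	return r;
}

/* Alternative: pause autoneg on with link autoneg off resolves to no pause. */
static struct pause_result resolve_alternative(const struct pause_state *s)
{
	struct pause_result r = { false, false };

	if (s->pause_autoneg) {
		if (s->link_autoneg) {
			r.tx = s->neg_tx;
			r.rx = s->neg_rx;
		}
	} else if (s->duplex_full) {
		r.tx = s->cfg_tx;
		r.rx = s->cfg_rx;
	}
	return r;
}
```

For the two ethtool commands above (pause autoneg on, tx on, rx on, then
link autoneg off at full duplex), resolve_as_written() yields tx and rx
enabled, while resolve_alternative() yields both disabled.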



Re: [PATCH net-next 3/4] net: ethernet: introduce phy_set_pause

2020-05-13 Thread Russell King - ARM Linux admin
On Mon, May 11, 2020 at 05:24:09PM -0700, Doug Berger wrote:
> This commit introduces the phy_set_pause function to the phylib as
> a helper to support the set_pauseparam ethtool method.
> 
> It is hoped that the new behavior introduced by this function will
> be widely embraced and the phy_set_sym_pause and phy_set_asym_pause
> functions can be deprecated. Those functions are retained for all
> existing users and for any dissenting opinions on my interpretation
> of the functionality.
> 
> Signed-off-by: Doug Berger 
> ---
>  drivers/net/phy/phy_device.c | 31 +++++++
>  include/linux/phy.h  |  1 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 48ab9efa0166..e6dafb3c3e5f 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -2614,6 +2614,37 @@ void phy_set_asym_pause(struct phy_device *phydev, bool rx, bool tx)
>  EXPORT_SYMBOL(phy_set_asym_pause);
>  
>  /**
> + * phy_set_pause - Configure Pause and Asym Pause with autoneg
> + * @phydev: target phy_device struct
> + * @rx: Receiver Pause is supported
> + * @tx: Transmit Pause is supported
> + * @autoneg: Auto neg should be used
> + *
> + * Description: Configure advertised Pause support depending on if
> + * receiver pause and pause auto neg is supported. Generally called
> + * from the set_pauseparam ethtool_ops.
> + *
> + * Note: Since pause is really a MAC level function it should be
> + * notified via adjust_link to update its pause functions.
> + */
> +void phy_set_pause(struct phy_device *phydev, bool rx, bool tx, bool autoneg)
> +{
> +	linkmode_set_pause(phydev->advertising, tx, rx, autoneg);
> +
> +	/* Reset the state of an already running link to force a new
> +	 * link up event when advertising doesn't change or when PHY
> +	 * autoneg is disabled.
> +	 */
> +	mutex_lock(&phydev->lock);
> +	if (phydev->state == PHY_RUNNING)
> +		phydev->state = PHY_UP;
> +	mutex_unlock(&phydev->lock);

I wonder about this - will drivers cope with having two link-up events
via adjust_link without a corresponding link-down event?  What if they
touch registers that are only supposed to be touched while the link is
down?  Obviously, drivers have to opt-in to this interface, so it may
be okay provided we don't get wholesale changes.

> +
> +	phy_start_aneg(phydev);

Should we be making that conditional on something changing and autoneg
being enabled, like phy_set_asym_pause() does?  There is no point
interrupting an established link if the advertisement didn't change.
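
As a standalone sketch of that condition (not phylib code): remember the
old advertisement, apply the new pause bits, and only renegotiate when the
bits changed and autoneg is enabled.  The struct, the bit values and the
restart counter are made up for the example; the rx / (rx XOR tx) mapping
onto Pause / Asym_Pause follows my understanding of how phylib encodes it.

```c
#include <stdbool.h>
#include <stdint.h>

#define ADV_PAUSE	(1u << 0)
#define ADV_ASYM_PAUSE	(1u << 1)

/* Hypothetical stand-in for the parts of struct phy_device used here. */
struct phy_model {
	uint32_t advertising;
	bool autoneg;
	unsigned int aneg_restarts;	/* counts simulated phy_start_aneg() calls */
};

/* Pause follows rx; Asym_Pause is set when rx and tx differ. */
static uint32_t pause_adv_bits(bool rx, bool tx)
{
	uint32_t adv = 0;

	if (rx)
		adv |= ADV_PAUSE;
	if (rx != tx)
		adv |= ADV_ASYM_PAUSE;
	return adv;
}

/* Update the advertisement; renegotiate only if it changed and autoneg is on. */
static bool set_pause_adv(struct phy_model *phy, bool rx, bool tx)
{
	uint32_t old = phy->advertising;

	phy->advertising &= ~(ADV_PAUSE | ADV_ASYM_PAUSE);
	phy->advertising |= pause_adv_bits(rx, tx);

	if (old != phy->advertising && phy->autoneg) {
		phy->aneg_restarts++;
		return true;
	}
	return false;
}
```

Calling set_pause_adv() twice with the same arguments restarts negotiation
at most once, so an established link is left alone on the second call.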

> +}
> +EXPORT_SYMBOL(phy_set_pause);
> +
> +/**
>   * phy_validate_pause - Test if the PHY/MAC support the pause configuration
>   * @phydev: phy_device struct
>   * @pp: requested pause configuration
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index 5d8ff5428010..71e484424e68 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -1403,6 +1403,7 @@ void phy_support_asym_pause(struct phy_device *phydev);
>  void phy_set_sym_pause(struct phy_device *phydev, bool rx, bool tx,
>  bool autoneg);
>  void phy_set_asym_pause(struct phy_device *phydev, bool rx, bool tx);
> +void phy_set_pause(struct phy_device *phydev, bool rx, bool tx, bool autoneg);
>  bool phy_validate_pause(struct phy_device *phydev,
>   struct ethtool_pauseparam *pp);
>  void phy_get_pause(struct phy_device *phydev, bool *tx_pause, bool *rx_pause);
> -- 
> 2.7.4
> 
> 



Re: [PATCH net-next 1/4] net: ethernet: validate pause autoneg setting

2020-05-13 Thread Russell King - ARM Linux admin
On Wed, May 13, 2020 at 06:34:05AM +0100, Russell King - ARM Linux admin wrote:
> On Tue, May 12, 2020 at 08:48:22PM -0700, Doug Berger wrote:
> > On 5/12/2020 11:55 AM, Russell King - ARM Linux admin wrote:
> > > On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> > >> This was intended as a fix, but I thought it would be better to keep it
> > >> as part of this set for context and since net-next is currently open.
> > >>
> > >> The context is trying to improve the phylib support for offloading
> > >> ethtool pause configuration and this is something that could be checked
> > >> in a single location rather than by individual drivers.
> > >>
> > >> I included it here to get feedback about its appropriateness as a common
> > >> behavior. I should have been more explicit about that.
> > >>
> > >> Personally, I'm actually not that fond of this change since it can
> > >> easily be a source of confusion with the ethtool interface because the
> > >> link autonegotiation and the pause autonegotiation are controlled by
> > >> different commands.
> > >>
> > >> Since the ethtool -A command performs a read/modify/write of pause
> > >> parameters, you can get strange results like these:
> > >> # ethtool -s eth0 speed 100 duplex full autoneg off
> > >> # ethtool -A eth0 tx off
> > >> Cannot set device pause parameters: Invalid argument
> > >> #
> > >> This is because the get operation read pause autoneg as enabled, and only
> > >> the tx_pause member of the structure was updated.
> > > 
> > > This looks like the same argument I've been having with Heiner over
> > > the EEE interface, except there's a difference here.
> > > 
> > > # ethtool -A eth0 autoneg on
> > > # ethtool -s eth0 autoneg off speed 100 duplex full
> > > 
> > > After those two commands, what is the state of pause mode?  The answer
> > > is, it's disabled.
> > > 
> > > # ethtool -A eth0 autoneg off rx on tx on
> > > 
> > > is perfectly acceptable, as we are forcing pause modes at the local
> > > end of the link.
> > > 
> > > # ethtool -A eth0 autoneg on
> > > 
> > > Now, the question is whether that should be allowed or not - but this
> > > is merely restoring the "pause" settings that were in effect prior
> > > to the previous command.  It does not enable pause negotiation,
> > > because autoneg as a whole is disabled, but it _allows_ pause
> > > negotiation to occur when autoneg is enabled at some point in the
> > > future.
> > > 
> > > Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
> > > autoneg off" means you can configure the negotiation parameters
> > > _before_ triggering a negotiation cycle on the link.  In other words,
> > > it would avoid:
> > > 
> > > # ethtool -s eth0 autoneg on
> > > # # Link renegotiates
> > > # ethtool -A eth0 autoneg on
> > > # # Link renegotiates a second time
> > > 
> > > and it also means that if stuff has already been scripted to avoid
> > > this, nothing breaks.
> > > 
> > > If we start rejecting ethtool -A because autoneg is disabled, then
> > > things get difficult to configure - we would need ethtool documentation
> > > to state that autoneg must be enabled before configuration of pause
> > > and EEE can be done.  IMHO, that hurts usability, and adds confusion.
> > > 
> > Thanks for your input and I agree with what you have said here. I will
> > remove this commit from the set when I resubmit and I assume that, like
> > Michal, you would like to see the comment in ethtool.h revised.
> > 
> > I think the crux of the matter is that the meaning of the autoneg pause
> > parameter is not well specified, and that is fundamentally what I am
> > trying to clarify in a common implementation that might help unify a
> > consistent behavior across network drivers.
> > 
> > My interpretation is that the link autonegotiation and the pause
> > autonegotiation can be meaningfully set independently from each other
> > and that the interplay between the two has easily overlooked subtleties.
> > 
> > My opinion (which is at least in part drawn from my interpretation of
> > your opinion) is as follows with regard to pause behaviors:
> > 
> > The link autonegotiation parameter concerns itself with whether the
> > Pau

Re: [PATCH net-next 1/4] net: ethernet: validate pause autoneg setting

2020-05-12 Thread Russell King - ARM Linux admin
On Tue, May 12, 2020 at 08:48:22PM -0700, Doug Berger wrote:
> On 5/12/2020 11:55 AM, Russell King - ARM Linux admin wrote:
> > On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> >> This was intended as a fix, but I thought it would be better to keep it
> >> as part of this set for context and since net-next is currently open.
> >>
> >> The context is trying to improve the phylib support for offloading
> >> ethtool pause configuration and this is something that could be checked
> >> in a single location rather than by individual drivers.
> >>
> >> I included it here to get feedback about its appropriateness as a common
> >> behavior. I should have been more explicit about that.
> >>
> >> Personally, I'm actually not that fond of this change since it can
> >> easily be a source of confusion with the ethtool interface because the
> >> link autonegotiation and the pause autonegotiation are controlled by
> >> different commands.
> >>
> >> Since the ethtool -A command performs a read/modify/write of pause
> >> parameters, you can get strange results like these:
> >> # ethtool -s eth0 speed 100 duplex full autoneg off
> >> # ethtool -A eth0 tx off
> >> Cannot set device pause parameters: Invalid argument
> >> #
> >> This is because the get operation read pause autoneg as enabled, and only
> >> the tx_pause member of the structure was updated.
> > 
> > This looks like the same argument I've been having with Heiner over
> > the EEE interface, except there's a difference here.
> > 
> > # ethtool -A eth0 autoneg on
> > # ethtool -s eth0 autoneg off speed 100 duplex full
> > 
> > After those two commands, what is the state of pause mode?  The answer
> > is, it's disabled.
> > 
> > # ethtool -A eth0 autoneg off rx on tx on
> > 
> > is perfectly acceptable, as we are forcing pause modes at the local
> > end of the link.
> > 
> > # ethtool -A eth0 autoneg on
> > 
> > Now, the question is whether that should be allowed or not - but this
> > is merely restoring the "pause" settings that were in effect prior
> > to the previous command.  It does not enable pause negotiation,
> > because autoneg as a whole is disabled, but it _allows_ pause
> > negotiation to occur when autoneg is enabled at some point in the
> > future.
> > 
> > Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
> > autoneg off" means you can configure the negotiation parameters
> > _before_ triggering a negotiation cycle on the link.  In other words,
> > it would avoid:
> > 
> > # ethtool -s eth0 autoneg on
> > # # Link renegotiates
> > # ethtool -A eth0 autoneg on
> > # # Link renegotiates a second time
> > 
> > and it also means that if stuff has already been scripted to avoid
> > this, nothing breaks.
> > 
> > If we start rejecting ethtool -A because autoneg is disabled, then
> > things get difficult to configure - we would need ethtool documentation
> > to state that autoneg must be enabled before configuration of pause
> > and EEE can be done.  IMHO, that hurts usability, and adds confusion.
> > 
> Thanks for your input and I agree with what you have said here. I will
> remove this commit from the set when I resubmit and I assume that, like
> Michal, you would like to see the comment in ethtool.h revised.
> 
> I think the crux of the matter is that the meaning of the autoneg pause
> parameter is not well specified, and that is fundamentally what I am
> trying to clarify in a common implementation that might help unify a
> consistent behavior across network drivers.
> 
> My interpretation is that the link autonegotiation and the pause
> autonegotiation can be meaningfully set independently from each other
> and that the interplay between the two has easily overlooked subtleties.
> 
> My opinion (which is at least in part drawn from my interpretation of
> your opinion) is as follows with regard to pause behaviors:
> 
> The link autonegotiation parameter concerns itself with whether the
> Pause capabilities are advertised as part of autonegotiation of link
> parameters.
> 
> The pause autonegotiation parameter concerns itself with whether the
> local node is willing to accept the advertised capabilities of its peer
> as input into its pause configuration.
> 
> The Tx_Pause and Rx_Pause parameters indicate in which directions pause
> frames should be supported.

This is where the ethtool interface breaks down - they are unable
to sanely define which should be supported, as wha

Re: [PATCH net-next 1/4] net: ethernet: validate pause autoneg setting

2020-05-12 Thread Russell King - ARM Linux admin
On Tue, May 12, 2020 at 11:31:39AM -0700, Doug Berger wrote:
> This was intended as a fix, but I thought it would be better to keep it
> as part of this set for context and since net-next is currently open.
> 
> The context is trying to improve the phylib support for offloading
> ethtool pause configuration and this is something that could be checked
> in a single location rather than by individual drivers.
> 
> I included it here to get feedback about its appropriateness as a common
> behavior. I should have been more explicit about that.
> 
> Personally, I'm actually not that fond of this change since it can
> easily be a source of confusion with the ethtool interface because the
> link autonegotiation and the pause autonegotiation are controlled by
> different commands.
> 
> Since the ethtool -A command performs a read/modify/write of pause
> parameters, you can get strange results like these:
> # ethtool -s eth0 speed 100 duplex full autoneg off
> # ethtool -A eth0 tx off
> Cannot set device pause parameters: Invalid argument
> #
> This is because the get operation read pause autoneg as enabled, and only
> the tx_pause member of the structure was updated.

This looks like the same argument I've been having with Heiner over
the EEE interface, except there's a difference here.

# ethtool -A eth0 autoneg on
# ethtool -s eth0 autoneg off speed 100 duplex full

After those two commands, what is the state of pause mode?  The answer
is, it's disabled.

# ethtool -A eth0 autoneg off rx on tx on

is perfectly acceptable, as we are forcing pause modes at the local
end of the link.

# ethtool -A eth0 autoneg on

Now, the question is whether that should be allowed or not - but this
is merely restoring the "pause" settings that were in effect prior
to the previous command.  It does not enable pause negotiation,
because autoneg as a whole is disabled, but it _allows_ pause
negotiation to occur when autoneg is enabled at some point in the
future.

Also, allowing "ethtool -A eth0 autoneg on" when "ethtool -s eth0
autoneg off" means you can configure the negotiation parameters
_before_ triggering a negotiation cycle on the link.  In other words,
it would avoid:

# ethtool -s eth0 autoneg on
# # Link renegotiates
# ethtool -A eth0 autoneg on
# # Link renegotiates a second time

and it also means that if stuff has already been scripted to avoid
this, nothing breaks.

If we start rejecting ethtool -A because autoneg is disabled, then
things get difficult to configure - we would need ethtool documentation
to state that autoneg must be enabled before configuration of pause
and EEE can be done.  IMHO, that hurts usability, and adds confusion.
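
The behaviour argued for above can be modelled in a few lines of
standalone C.  This is a sketch with made-up names, not driver or ethtool
code: the two autoneg settings are stored independently, the pause
setting is never rejected, and pause is only actually negotiated when
both are on.

```c
#include <stdbool.h>

/* Hypothetical per-interface settings; field names are illustrative. */
struct link_cfg {
	bool link_autoneg;	/* set by "ethtool -s ethX autoneg on|off" */
	bool pause_autoneg;	/* set by "ethtool -A ethX autoneg on|off" */
};

/* "ethtool -A ... autoneg": always accepted, merely recorded. */
static int set_pause_autoneg(struct link_cfg *cfg, bool on)
{
	cfg->pause_autoneg = on;
	return 0;
}

/* "ethtool -s ... autoneg": recorded independently of the pause setting. */
static void set_link_autoneg(struct link_cfg *cfg, bool on)
{
	cfg->link_autoneg = on;
}

/* Pause is negotiated only when both settings are on. */
static bool pause_is_negotiated(const struct link_cfg *cfg)
{
	return cfg->link_autoneg && cfg->pause_autoneg;
}
```

With this model, enabling pause autoneg while link autoneg is off succeeds
and has no immediate effect; it takes effect on the single negotiation
cycle triggered when link autoneg is later enabled.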



Re: [net-next PATCH v3 4/5] net: phy: Introduce fwnode_get_phy_id()

2020-05-11 Thread Russell King - ARM Linux admin
On Mon, May 11, 2020 at 03:04:57PM +0200, Andrew Lunn wrote:
> > NXP's LX2160ARDB platform currently has the following MDIO-PHY connection.
> > 
> > MDIO-1 ==> one 40G PHY, two 1G PHYs(C45), two 10G PHYs(C22)
> > MDIO-2 ==> one 25G PHY
> 
> It has been suggested that ACPI only support a one to one
> mapping. Each MAC has one MDIO bus, with one PHY on it. KISS.
> 
> This clearly does not work for your hardware. So not only do we need
> to solve how PHY properties are described, we also need an equivalent
> of phy-handle, so a MAC can indicate which PHY it is connected to.

I'd suggest that doesn't work for a lot of hardware. It won't work for
the Macchiatobin for example, where there are two Clause 45 NBASE-T
PHYs on one MDIO bus.

The same is likely true on the LX2160A - there can be multiple ethernet
interfaces, but IIRC only two external MDIO buses that one can hang
PHYs off of.



Re: [PATCH net-next 01/15] net: dsa: provide an option for drivers to always receive bridge VLANs

2020-05-11 Thread Russell King - ARM Linux admin
On Mon, May 11, 2020 at 02:59:12PM +0300, Vladimir Oltean wrote:
> On Mon, 11 May 2020 at 14:54, Russell King - ARM Linux admin
>  wrote:
> >
> > On Mon, May 11, 2020 at 02:40:29PM +0300, Vladimir Oltean wrote:
> > > On Mon, 11 May 2020 at 14:38, Russell King - ARM Linux admin
> > >  wrote:
> > > >
> > > > On Sun, May 10, 2020 at 07:42:41PM +0300, Vladimir Oltean wrote:
> > > > > From: Russell King 
> > > > >
> > > > > DSA assumes that a bridge which has vlan filtering disabled is not
> > > > > vlan aware, and ignores all vlan configuration. However, the kernel
> > > > > software bridge code allows configuration in this state.
> > > > >
> > > > > This causes the kernel's idea of the bridge vlan state and the
> > > > > hardware state to disagree, so "bridge vlan show" indicates a correct
> > > > > configuration but the hardware lacks all configuration. Even worse,
> > > > > enabling vlan filtering on a DSA bridge immediately blocks all traffic
> > > > > which, given the output of "bridge vlan show", is very confusing.
> > > > >
> > > > > Provide an option that drivers can set to indicate they want to receive
> > > > > vlan configuration even when vlan filtering is disabled. At the very
> > > > > least, this is safe for Marvell DSA bridges, which do not look up
> > > > > ingress traffic in the VTU if the port is in 8021Q disabled state. It is
> > > > > also safe for the Ocelot switch family. Whether this change is suitable
> > > > > for all DSA bridges is not known.
> > > > >
> > > > > Signed-off-by: Russell King 
> > > > > Signed-off-by: Vladimir Oltean 
> > > >
> > > > This patch was NAK'd because of objections to the "vlan_bridge_vtu"
> > > > name.  Unfortunately, this means that the bug for Marvell switches
> > > > remains unfixed to this day.
> > > >
> > >
> > > How about "accept_vlan_while_unaware"?
> >
> > It's up to DSA maintainers.
> >
> > However, I find that rather confusing. What's "unaware"? The point of
> > this boolean is to program the vlan tables while vlan filtering is
> > disabled. "accept_vlan_while_vlan_filtering_disabled" is way too long.
> >
> 
> Considering the VLAN filtering modes as "disabled", "check",
> "fallback" and "secure", I think a slight improvement over your
> wording might be "install_vlans_while_disabled". I hope that is not
> confusing and also not too long.

Well, it's not only about "installing" vlans, but also about removing
them as well.  "configure_vlans_while_disabled" would probably work
better.
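
For what the boolean is meant to gate, a minimal standalone sketch; the
struct is hypothetical and is not the real struct dsa_switch, and the
flag simply uses the name proposed above:

```c
#include <stdbool.h>

/* Hypothetical per-switch state, for illustration only. */
struct switch_model {
	bool vlan_filtering;			/* bridge vlan_filtering setting */
	bool configure_vlans_while_disabled;	/* driver opt-in */
};

/* Should a bridge VLAN add or delete be passed on to the driver? */
static bool should_program_vlan(const struct switch_model *sw)
{
	return sw->vlan_filtering || sw->configure_vlans_while_disabled;
}
```

The same test covers both additions and removals, which is why a neutral
verb such as "configure" fits better than "install".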



Re: [PATCH net-next 01/15] net: dsa: provide an option for drivers to always receive bridge VLANs

2020-05-11 Thread Russell King - ARM Linux admin
On Mon, May 11, 2020 at 02:40:29PM +0300, Vladimir Oltean wrote:
> On Mon, 11 May 2020 at 14:38, Russell King - ARM Linux admin
>  wrote:
> >
> > On Sun, May 10, 2020 at 07:42:41PM +0300, Vladimir Oltean wrote:
> > > From: Russell King 
> > >
> > > DSA assumes that a bridge which has vlan filtering disabled is not
> > > vlan aware, and ignores all vlan configuration. However, the kernel
> > > software bridge code allows configuration in this state.
> > >
> > > This causes the kernel's idea of the bridge vlan state and the
> > > hardware state to disagree, so "bridge vlan show" indicates a correct
> > > configuration but the hardware lacks all configuration. Even worse,
> > > enabling vlan filtering on a DSA bridge immediately blocks all traffic
> > > which, given the output of "bridge vlan show", is very confusing.
> > >
> > > Provide an option that drivers can set to indicate they want to receive
> > > vlan configuration even when vlan filtering is disabled. At the very
> > > least, this is safe for Marvell DSA bridges, which do not look up
> > > ingress traffic in the VTU if the port is in 8021Q disabled state. It is
> > > also safe for the Ocelot switch family. Whether this change is suitable
> > > for all DSA bridges is not known.
> > >
> > > Signed-off-by: Russell King 
> > > Signed-off-by: Vladimir Oltean 
> >
> > This patch was NAK'd because of objections to the "vlan_bridge_vtu"
> > name.  Unfortunately, this means that the bug for Marvell switches
> > remains unfixed to this day.
> >
> 
> How about "accept_vlan_while_unaware"?

It's up to DSA maintainers.

However, I find that rather confusing. What's "unaware"? The point of
this boolean is to program the vlan tables while vlan filtering is
disabled. "accept_vlan_while_vlan_filtering_disabled" is way too long.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH net-next 01/15] net: dsa: provide an option for drivers to always receive bridge VLANs

2020-05-11 Thread Russell King - ARM Linux admin
On Sun, May 10, 2020 at 07:42:41PM +0300, Vladimir Oltean wrote:
> From: Russell King 
> 
> DSA assumes that a bridge which has vlan filtering disabled is not
> vlan aware, and ignores all vlan configuration. However, the kernel
> software bridge code allows configuration in this state.
> 
> This causes the kernel's idea of the bridge vlan state and the
> hardware state to disagree, so "bridge vlan show" indicates a correct
> configuration but the hardware lacks all configuration. Even worse,
> enabling vlan filtering on a DSA bridge immediately blocks all traffic
> which, given the output of "bridge vlan show", is very confusing.
> 
> Provide an option that drivers can set to indicate they want to receive
> vlan configuration even when vlan filtering is disabled. At the very
> least, this is safe for Marvell DSA bridges, which do not look up
> ingress traffic in the VTU if the port is in 8021Q disabled state. It is
> also safe for the Ocelot switch family. Whether this change is suitable
> for all DSA bridges is not known.
> 
> Signed-off-by: Russell King 
> Signed-off-by: Vladimir Oltean 

This patch was NAK'd because of objections to the "vlan_bridge_vtu"
name.  Unfortunately, this means that the bug for Marvell switches
remains unfixed to this day.



Re: [net-next PATCH v3 4/5] net: phy: Introduce fwnode_get_phy_id()

2020-05-11 Thread Russell King - ARM Linux admin
On Mon, May 11, 2020 at 03:59:30PM +0530, Calvin Johnson wrote:
> On Mon, May 11, 2020 at 10:38:49AM +0100, Russell King - ARM Linux admin 
> wrote:
> > On Mon, May 11, 2020 at 01:30:40PM +0530, Calvin Johnson wrote:
> > > On Sat, May 09, 2020 at 01:42:57AM +0200, Andrew Lunn wrote:
> > > > On Fri, May 08, 2020 at 05:48:33PM -0500, Jeremy Linton wrote:
> > > > > Hi,
> > > > > 
> > > > > On 5/8/20 3:27 PM, Andrew Lunn wrote:
> > > > > > > > There is a very small number of devices where the vendor messed 
> > > > > > > > up,
> > > > > > > > and did not put valid contents in the ID registers. In such 
> > > > > > > > cases, we
> > > > > > > > can read the IDs from device tree. These are then used in 
> > > > > > > > exactly the
> > > > > > > > same way as if they were read from the device.
> > > > > > > > 
> > > > > > > 
> > > > > > > Is that the case here?
> > > > > > 
> > > > > > Sorry, I don't understand the question?
> > > > > 
> > > > > I was asking in general, does this machine report the ID's correctly.
> > > > 
> > > > Very likely, it does.
> > > > 
> > > > > The embedded single mac:mdio per nic case seems like the normal case, 
> > > > > and
> > > > > most of the existing ACPI described devices are setup that way.
> > > > 
> > > > Somebody in this thread pointed to ACPI patches for the
> > > > MACCHIATOBin. If i remember the hardware correctly, it has 4 Ethernet
> > > > interfaces, and two MDIO bus masters. One of the bus masters can only
> > > > do C22 and the other can only do C45. It is expected that the busses
> > > > are shared, not a nice one to one mapping.
> > > > 
> > > > > But at the same time, that shifts the c22/45 question to the nic
> > > > > driver, where use of a DSD property before instantiating/probing
> > > > > MDIO isn't really a problem if needed.
> > > > 
> > > > This in fact does not help you. The MAC driver has no idea what PHY is
> > > > connected to it. The MAC does not know if it is C22 or C45. It uses
> > > > the phylib abstraction which hides all this. Even if you assume 1:1,
> > > > use phy_find_first(), it will not find a C45 PHY because without
> > > > knowing there is a C45 PHY, we don't scan for it. And we should expect
> > > > C45 PHYs to become more popular in the next few years.
> > > 
> > > Agree.
> > > 
> > > NXP's LX2160ARDB platform currently has the following MDIO-PHY connection.
> > > 
> > > MDIO-1 ==> one 40G PHY, two 1G PHYs(C45), two 10G PHYs(C22)
> > 
> > I'm not entirely sure you have that correct.  The Clause 45 register set
> > as defined by IEEE 802.3 does not define registers for 1G negotiation,
> > unless the PHY either supports Clause 22 accesses, or implements some
> > kind of vendor extension.  For a 1G PHY, this would be wasteful, and
> > likely incompatible with a lot of hardware/software.
> > 
> > Conversely, Clause 22 does not define registers for 10G speeds, except
> > accessing Clause 45 registers indirectly through clause 22 registers,
> > which would also be wasteful.
> > 
> Got your point.
> Let me try to clarify.
> 
> MDIO-1 ==> one 40G PHY, two 1G PHYs(C45), two 10G PHYs(C22)
> MDIO-2 ==> one 25G PHY
> This is the physical connection of MDIO & PHYs on the platform.
> 
> For the c45 PHYs(two 10G), we use compatible "ethernet-phy-ieee802.3-c45"(not
> yet upstreamed).
> For c22 PHYs(two 1G), we don't mention the c45 compatible string and hence the
> access also will be using c22, if I'm not wrong.

You seem to have just repeated the same mistake (it seems to be a direct
copy-n-paste of what you sent in the email I replied to) - and then gone
on to say something different.  Either you're confused or you're not
writing in your email what you intend to.

You first say "MDIO-1 ==> two 1G PHYs(C45)".  You then say lower down
"For C22 PHYs (two 1G)".  Both these statements can't be true.

Similarly, you first say "MDIO-1 ==> two 10G PHYs(C22)".  You then say
lower down "For the c45 PHYs(two 10G)".  Again, both these statements
can't be true.

Given that this discussion in this thread has been about C22 vs C45, I
would have thought accuracy in regard to this point would have been of
the utmost importance.



Re: [net-next PATCH v3 4/5] net: phy: Introduce fwnode_get_phy_id()

2020-05-11 Thread Russell King - ARM Linux admin
On Mon, May 11, 2020 at 01:30:40PM +0530, Calvin Johnson wrote:
> On Sat, May 09, 2020 at 01:42:57AM +0200, Andrew Lunn wrote:
> > On Fri, May 08, 2020 at 05:48:33PM -0500, Jeremy Linton wrote:
> > > Hi,
> > > 
> > > On 5/8/20 3:27 PM, Andrew Lunn wrote:
> > > > > > There is a very small number of devices where the vendor messed up,
> > > > > > and did not put valid contents in the ID registers. In such cases, 
> > > > > > we
> > > > > > can read the IDs from device tree. These are then used in exactly 
> > > > > > the
> > > > > > same way as if they were read from the device.
> > > > > > 
> > > > > 
> > > > > Is that the case here?
> > > > 
> > > > Sorry, I don't understand the question?
> > > 
> > > I was asking in general, does this machine report the ID's correctly.
> > 
> > Very likely, it does.
> > 
> > > The embedded single mac:mdio per nic case seems like the normal case, and
> > > most of the existing ACPI described devices are setup that way.
> > 
> > Somebody in this thread pointed to ACPI patches for the
> > MACCHIATOBin. If i remember the hardware correctly, it has 4 Ethernet
> > interfaces, and two MDIO bus masters. One of the bus masters can only
> > do C22 and the other can only do C45. It is expected that the busses
> > are shared, not a nice one to one mapping.
> > 
> > > But at the same time, that shifts the c22/45 question to the nic
> > > driver, where use of a DSD property before instantiating/probing
> > > MDIO isn't really a problem if needed.
> > 
> > This in fact does not help you. The MAC driver has no idea what PHY is
> > connected to it. The MAC does not know if it is C22 or C45. It uses
> > the phylib abstraction which hides all this. Even if you assume 1:1,
> > use phy_find_first(), it will not find a C45 PHY because without
> > knowing there is a C45 PHY, we don't scan for it. And we should expect
> > C45 PHYs to become more popular in the next few years.
> 
> Agree.
> 
> NXP's LX2160ARDB platform currently has the following MDIO-PHY connection.
> 
> MDIO-1 ==> one 40G PHY, two 1G PHYs(C45), two 10G PHYs(C22)

I'm not entirely sure you have that correct.  The Clause 45 register set
as defined by IEEE 802.3 does not define registers for 1G negotiation,
unless the PHY either supports Clause 22 accesses, or implements some
kind of vendor extension.  For a 1G PHY, this would be wasteful, and
likely incompatible with a lot of hardware/software.

Conversely, Clause 22 does not define registers for 10G speeds, except
accessing Clause 45 registers indirectly through clause 22 registers,
which would also be wasteful.



Re: [PATCH net] mvpp2: enable rxhash only on the first port

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 04:15:46PM +0200, Matteo Croce wrote:
> Currently rxhash only works on the first port of the CP (Communication
> Processor). Enabling it on other ports completely blocks packet reception.
> This patch only adds rxhash as supported feature to the first port,
> so rxhash can't be enabled on other ports:
> 
>   # ethtool -K eth0 rxhash on
>   # ethtool -K eth1 rxhash on
>   # ethtool -K eth2 rxhash on
>   Cannot change receive-hashing
>   Could not change any device features
>   # ethtool -K eth3 rxhash on
>   Cannot change receive-hashing
>   Could not change any device features
> 
> Fixes: 895586d5dc32 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
> Signed-off-by: Matteo Croce 
> ---
>  drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c 
> b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> index 2b5dad2ec650..ba71583c7ae3 100644
> --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> @@ -5423,7 +5423,8 @@ static int mvpp2_port_probe(struct platform_device 
> *pdev,
>   NETIF_F_HW_VLAN_CTAG_FILTER;
>  
>   if (mvpp22_rss_is_supported()) {
> - dev->hw_features |= NETIF_F_RXHASH;
> + if (port->id == 0)
> + dev->hw_features |= NETIF_F_RXHASH;
>   dev->features |= NETIF_F_NTUPLE;
>   }

I seem to have discovered the cause of the problem in the old thread,
so I suggest we wait and see whether anyone offers up a proper
solution to this regression before we rush to completely disable
this feature.

Based on my research, I would suggest with a high degree of confidence
that prior to the offending commit (895586d5dc32), rx hashing was
working fine, distributing interrupts across the cores.



Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to handle RSS tables

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 08:52:46PM +0100, Russell King - ARM Linux admin wrote:
> On Sat, May 09, 2020 at 03:14:05PM +0200, Matteo Croce wrote:
> > Hi,
> > 
> > When git bisect pointed to 895586d5dc32 ("net: mvpp2: cls: Use RSS
> > contexts to handle RSS tables"), which was merged
> > almost an year after d33ec4525007 ("net: mvpp2: add an RSS
> > classification step for each flow"), so I assume that between these
> > two commits either the feature was working or it was disable and we
> > didn't notice
> > 
> > Without knowing what was happening, which commit should my Fixes tag point 
> > to?
> 
> It is highly likely that 895586d5dc32 is responsible for this breakage.
> I've been investigating this afternoon, and what I've found, comparing
> a kernel without 895586d5dc32 and with 895586d5dc32 applied is:
> 
> - The table programmed into the hardware via mvpp22_rss_fill_table()
>   appears to be identical with or without the commit.
> 
> - When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports
>   that c2.attr[0] and c2.attr[2] are written back containing:
> 
>- with 895586d5dc32, failing:0020 4000
>- without 895586d5dc32, working: 0400 4000
> 
> - When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as:
> 
>0400 
> 
> The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the
> first value is the queue number, which comprises two fields.  The high
> 5 bits are 24:29 and the low three are 21:23 inclusive.  This comes
> from:
> 
>c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) |
>  MVPP22_CLS_C2_ATTR0_QLOW(ql);
> #define MVPP22_CLS_C2_ATTR0_QHIGH(qh)   (((qh) & 0x1f) << 24)
> #define MVPP22_CLS_C2_ATTR0_QLOW(ql)(((ql) & 0x7) << 21)
> 
> So, the working case gives eth2 a queue id of 4.0, or 32 as per
> port->first_rxq, and the non-working case a queue id of 0.1, or 1.
> 
> The allocation of queue IDs seems to be in mvpp2_port_probe():
> 
> if (priv->hw_version == MVPP21)
> port->first_rxq = port->id * port->nrxqs;
> else
> port->first_rxq = port->id * priv->max_port_rxqs;
> 
> Where:
> 
> if (priv->hw_version == MVPP21)
> priv->max_port_rxqs = 8;
> else
> priv->max_port_rxqs = 32;
> 
> Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1
> (eth2) be 32.  It seems the idea is that the first 32 queues belong to
> port 0, the second 32 queues belong to port 1, etc.
> 
> mvpp2_rss_port_c2_enable() gets the queue number from it's parameter,
> 'ctx', which comes from mvpp22_rss_ctx(port, 0).  This returns
> port->rss_ctx[0].
> 
> mvpp22_rss_context_create() is responsible for allocating that, which
> it does by looking for an unallocated priv->rss_tables[] pointer.  This
> table is shared amongst all ports on the CP silicon.
> 
> When we write the tables in mvpp22_rss_fill_table(), the RSS table
> entry is defined by:
> 
>   u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) |
>   MVPP22_RSS_INDEX_TABLE_ENTRY(i);
> 
> where rss_ctx is the context ID (queue number) and i is the index in
> the table.
> 
> #define MVPP22_RSS_INDEX_TABLE_ENTRY(idx)   (idx)
> #define MVPP22_RSS_INDEX_TABLE(idx) ((idx) << 8)
> #define MVPP22_RSS_INDEX_QUEUE(idx) ((idx) << 16)
> 
> If we look at what is written:
> 
> - The first table to be written has "sel" values of ..001f,
>   containing values 0..3. This appears to be for eth1.  This is table 0,
>   RX queue number 0.
> - The second table has "sel" values of 0100..011f, and appears
>   to be for eth2.  These contain values 0x20..0x23.  This is table 1,
>   RX queue number 0.
> - The third table has "sel" values of 0200..021f, and appears
>   to be for eth3.  These contain values 0x40..0x43.  This is table 2,
>   RX queue number 0.
> 
> Okay, so how do queue numbers translate to the RSS table?  There is
> another table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE
> field of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE
> register.  Before 895586d5dc32, it was:
> 
>mvpp2_write(priv, MVPP22_RSS_INDEX,
>MVPP22_RSS_INDEX_QUEUE(port->first_rxq));
>mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE,
>MVPP22_RSS_TABLE_POINTER(port->id));
> 
> and after:
> 
>mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx));

Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to handle RSS tables

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 03:14:05PM +0200, Matteo Croce wrote:
> Hi,
> 
> When git bisect pointed to 895586d5dc32 ("net: mvpp2: cls: Use RSS
> contexts to handle RSS tables"), which was merged
> almost an year after d33ec4525007 ("net: mvpp2: add an RSS
> classification step for each flow"), so I assume that between these
> two commits either the feature was working or it was disable and we
> didn't notice
> 
> Without knowing what was happening, which commit should my Fixes tag point to?

It is highly likely that 895586d5dc32 is responsible for this breakage.
I've been investigating this afternoon, and what I've found, comparing
a kernel without 895586d5dc32 and with 895586d5dc32 applied is:

- The table programmed into the hardware via mvpp22_rss_fill_table()
  appears to be identical with or without the commit.

- When rxhash is enabled on eth2, mvpp2_rss_port_c2_enable() reports
  that c2.attr[0] and c2.attr[2] are written back containing:

   - with 895586d5dc32, failing:0020 4000
   - without 895586d5dc32, working: 0400 4000

- When disabling rxhash, c2.attr[0] and c2.attr[2] are written back as:

   0400 

The second value represents the MVPP22_CLS_C2_ATTR2_RSS_EN bit, the
first value is the queue number, which comprises two fields.  The high
5 bits are 24:28 and the low three are 21:23 inclusive.  This comes
from:

   c2.attr[0] = MVPP22_CLS_C2_ATTR0_QHIGH(qh) |
                MVPP22_CLS_C2_ATTR0_QLOW(ql);
#define MVPP22_CLS_C2_ATTR0_QHIGH(qh)   (((qh) & 0x1f) << 24)
#define MVPP22_CLS_C2_ATTR0_QLOW(ql)    (((ql) & 0x7) << 21)

So, the working case gives eth2 a queue id of 4.0, or 32 as per
port->first_rxq, and the non-working case a queue id of 0.1, or 1.

The allocation of queue IDs seems to be in mvpp2_port_probe():

        if (priv->hw_version == MVPP21)
                port->first_rxq = port->id * port->nrxqs;
        else
                port->first_rxq = port->id * priv->max_port_rxqs;

Where:

        if (priv->hw_version == MVPP21)
                priv->max_port_rxqs = 8;
        else
                priv->max_port_rxqs = 32;

Making the port 0 (eth0 / eth1) have port->first_rxq = 0, and port 1
(eth2) be 32.  It seems the idea is that the first 32 queues belong to
port 0, the second 32 queues belong to port 1, etc.

mvpp2_rss_port_c2_enable() gets the queue number from its parameter,
'ctx', which comes from mvpp22_rss_ctx(port, 0).  This returns
port->rss_ctx[0].

mvpp22_rss_context_create() is responsible for allocating that, which
it does by looking for an unallocated priv->rss_tables[] pointer.  This
table is shared amongst all ports on the CP silicon.

When we write the tables in mvpp22_rss_fill_table(), the RSS table
entry is defined by:

        u32 sel = MVPP22_RSS_INDEX_TABLE(rss_ctx) |
                  MVPP22_RSS_INDEX_TABLE_ENTRY(i);

where rss_ctx is the context ID (queue number) and i is the index in
the table.

#define MVPP22_RSS_INDEX_TABLE_ENTRY(idx)   (idx)
#define MVPP22_RSS_INDEX_TABLE(idx) ((idx) << 8)
#define MVPP22_RSS_INDEX_QUEUE(idx) ((idx) << 16)

If we look at what is written:

- The first table to be written has "sel" values of 0000..001f,
  containing values 0..3. This appears to be for eth1.  This is table 0,
  RX queue number 0.
- The second table has "sel" values of 0100..011f, and appears
  to be for eth2.  These contain values 0x20..0x23.  This is table 1,
  RX queue number 0.
- The third table has "sel" values of 0200..021f, and appears
  to be for eth3.  These contain values 0x40..0x43.  This is table 2,
  RX queue number 0.

Okay, so how do queue numbers translate to the RSS table?  There is
another table - the RXQ2RSS table, indexed by the MVPP22_RSS_INDEX_QUEUE
field of MVPP22_RSS_INDEX and accessed through the MVPP22_RXQ2RSS_TABLE
register.  Before 895586d5dc32, it was:

   mvpp2_write(priv, MVPP22_RSS_INDEX,
               MVPP22_RSS_INDEX_QUEUE(port->first_rxq));
   mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE,
               MVPP22_RSS_TABLE_POINTER(port->id));

and after:

   mvpp2_write(priv, MVPP22_RSS_INDEX, MVPP22_RSS_INDEX_QUEUE(ctx));
   mvpp2_write(priv, MVPP22_RXQ2RSS_TABLE, MVPP22_RSS_TABLE_POINTER(ctx));

So, before the commit, for eth2, that would've contained '32' for the
index and '1' for the table pointer - mapping queue 32 to table 1.
Remember that this is queue-high.queue-low of 4.0.

After the commit, we appear to map queue 1 to table 1.  That again
looks fine on the face of it.

Section 9.3.1 of the A8040 manual seems to indicate the reason that the
queue number is separated.  queue-low seems to always come from the
classifier, whereas queue-high can be from the ingress physical port
number or the classifier depending on the MVPP2_CLS_SWFWD_PCTRL_REG.

We set the port bit in MVPP2_CLS_SWFWD_PCTRL_REG, meaning that queue-high
comes from the MVPP2_CLS_SWFWD_P2HQ_REG() register... and this 

Re: [PATCH v1 2/3] armv8: gpio: add gpio feature

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 07:18:45PM +0100, Russell King - ARM Linux admin wrote:
> On Sat, May 09, 2020 at 11:34:59PM +0530, Amit Tomer wrote:
> > > From what I can tell, these patches are not for the kernel.  The
> > > filenames don't match the kernel layout.
> > 
> > These files looks to be from U-boot, and must be intended for U-boot
> > as I see U-boot mailing address in recipient's address?
> 
> So why is it copied to:
> 
> devicet...@vger.kernel.org - a kernel mailing list
> linux-kernel@vger.kernel.org - the main kernel mailing list
> linux-g...@vger.kernel.org - the gpio driver kernel mailing list
> linux-arm-ker...@lists.infradead.org - the ARM kernel mailing list
> 
> Given that it includes four kernel mailing lists (ok, devicetree
> may be argued to have a wider application), then I don't think the
> conclusion that "it's for u-boot, because there's _one_ u-boot
> mailing list in the recipients" is particularly obvious.
> 
> The author really needs to state that up front if they're sending
> it to a wide audience, rather than leaving people to guess, thereby
> potentially wasting their time.
> 
> Not only did Andrew review the patch as if it were for the kernel,
> but I also wasted time on this as well when I double-took the
> ifdefs, and wanted to check the current driver in the kernel.

Oh, and... u-b...@linux.nxdi.nxp.com bounces because that domain is
not resolvable - I guess that is internal to NXP, and this patch
should have remained within NXP and not been posted publicly.



Re: [PATCH v1 2/3] armv8: gpio: add gpio feature

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 11:34:59PM +0530, Amit Tomer wrote:
> > From what I can tell, these patches are not for the kernel.  The
> > filenames don't match the kernel layout.
> 
> These files looks to be from U-boot, and must be intended for U-boot
> as I see U-boot mailing address in recipient's address?

So why is it copied to:

devicet...@vger.kernel.org - a kernel mailing list
linux-kernel@vger.kernel.org - the main kernel mailing list
linux-g...@vger.kernel.org - the gpio driver kernel mailing list
linux-arm-ker...@lists.infradead.org - the ARM kernel mailing list

Given that it includes four kernel mailing lists (ok, devicetree
may be argued to have a wider application), then I don't think the
conclusion that "it's for u-boot, because there's _one_ u-boot
mailing list in the recipients" is particularly obvious.

The author really needs to state that up front if they're sending
it to a wide audience, rather than leaving people to guess, thereby
potentially wasting their time.

Not only did Andrew review the patch as if it were for the kernel,
but I also wasted time on this as well when I double-took the
ifdefs, and wanted to check the current driver in the kernel.



Re: [PATCH v1 2/3] armv8: gpio: add gpio feature

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 05:33:15PM +0200, Andrew Lunn wrote:
> On Sat, May 09, 2020 at 06:39:55PM +0800, Hui Song wrote:
> > From: "hui.song" 
> > 
> > add one struct mpc8xxx_gpio_plat to enable gpio feature.
> > 
> > Signed-off-by: hui.song 
> > ---
> >  .../include/asm/arch-fsl-layerscape/gpio.h| 22 +++
> >  1 file changed, 22 insertions(+)
> >  create mode 100644 arch/arm/include/asm/arch-fsl-layerscape/gpio.h
> > 
> > diff --git a/arch/arm/include/asm/arch-fsl-layerscape/gpio.h 
> > b/arch/arm/include/asm/arch-fsl-layerscape/gpio.h
> > new file mode 100644
> > index 00..d8dd750a72
> > --- /dev/null
> > +++ b/arch/arm/include/asm/arch-fsl-layerscape/gpio.h
> > @@ -0,0 +1,22 @@
> > +/* SPDX-License-Identifier: GPL-2.0+ */
> > +/*
> > + * Copyright 2014 Freescale Semiconductor, Inc.
> > + */
> > +
> > +/*
> > + * Dummy header file to enable CONFIG_OF_CONTROL.
> > + * If CONFIG_OF_CONTROL is enabled, lib/fdtdec.c is compiled.
> > + * It includes  via , so those SoCs that 
> > enable
> > + * OF_CONTROL must have arch/gpio.h.
> > + */
> 
> This does not seem right. You would expect each sub arch to have a
> subdirectory in arch/arm/include/asm/ when in fact none do.

From what I can tell, these patches are not for the kernel.  The
filenames don't match the kernel layout.



Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to handle RSS tables

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 02:51:05PM +0100, Russell King - ARM Linux admin wrote:
> On Sat, May 09, 2020 at 03:14:05PM +0200, Matteo Croce wrote:
> > On Sat, May 9, 2020 at 1:45 PM Russell King - ARM Linux admin
> >  wrote:
> > >
> > > > On Sat, May 09, 2020 at 11:15:58AM +0000, Stefan Chulski wrote:
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Matteo Croce 
> > > > > Sent: Saturday, May 9, 2020 3:13 AM
> > > > > To: David S . Miller 
> > > > > Cc: Maxime Chevallier ; netdev
> > > > > ; LKML ; Antoine
> > > > > Tenart ; Thomas Petazzoni
> > > > > ; gregory.clem...@bootlin.com;
> > > > > miquel.ray...@bootlin.com; Nadav Haklai ; Stefan
> > > > > Chulski ; Marcin Wojtas ; 
> > > > > Linux
> > > > > ARM ; Russell King - ARM Linux 
> > > > > admin
> > > > > 
> > > > > Subject: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS 
> > > > > contexts to
> > > > > handle RSS tables
> > > > >
> > > > > Hi,
> > > > >
> > > > > What do you think about temporarily disabling it like this?
> > > > >
> > > > > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > > > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > > > @@ -5775,7 +5775,8 @@ static int mvpp2_port_probe(struct 
> > > > > platform_device
> > > > > *pdev,
> > > > > NETIF_F_HW_VLAN_CTAG_FILTER;
> > > > >
> > > > > if (mvpp22_rss_is_supported()) {
> > > > > -   dev->hw_features |= NETIF_F_RXHASH;
> > > > > +   if (port->phy_interface != PHY_INTERFACE_MODE_SGMII)
> > > > > +   dev->hw_features |= NETIF_F_RXHASH;
> > > > > dev->features |= NETIF_F_NTUPLE;
> > > > > }
> > > > >
> > > > >
> > > > > David, is this "workaround" too bad to get accepted?
> > > >
> > > > Not sure that RSS related to physical interface(SGMII), better just 
> > > > remove NETIF_F_RXHASH as "workaround".
> > >
> > > Hmm, I'm not sure this is the right way forward.  This patch has the
> > > effect of disabling:
> > >
> > > d33ec4525007 ("net: mvpp2: add an RSS classification step for each flow")
> > >
> > > but the commit you're pointing at which caused the regression is:
> > >
> > > 895586d5dc32 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
> > >
> > >
> > 
> > Hi,
> > 
> > When git bisect pointed to 895586d5dc32 ("net: mvpp2: cls: Use RSS
> > contexts to handle RSS tables"), which was merged
> > almost an year after d33ec4525007 ("net: mvpp2: add an RSS
> > classification step for each flow"), so I assume that between these
> > two commits either the feature was working or it was disable and we
> > didn't notice
> > 
> > Without knowing what was happening, which commit should my Fixes tag point 
> > to?
> 
> Let me make sure that I get this clear:
> 
> - Prior to 895586d5dc32, you can turn on and off rxhash without issue
>   on any port.
> - After 895586d5dc32, turning rxhash on eth2 prevents reception.
> 
> Prior to 895586d5dc32, with rxhash on, it looks like hashing using
> CRC32 is supported but only one context.  So, if it's possible to
> enable rxhash on any port on the mcbin without 895586d5dc32, and the
> port continues to work, I'd say the bug was introduced by
> 895586d5dc32.
> 
> Of course, that would be reinforced if there was a measurable
> difference in performance due to rxhash on each port.

I've just run this test, but I can detect no difference in performance
with or without 895586d5dc32 on eth0 or eth2 on the mcbin (apart from
eth2 stopping working with 895586d5dc32 applied.)  I tested this by
reverting almost all changes to the mvpp2 driver between 5.6 and that
commit.

That's not too surprising; I'm using my cex7 platform with the Mellanox
card in for one end of the 10G link, and that platform doesn't seem to
be able to saturate a 10G link - it only seems to manage around 4Gbps.



Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to handle RSS tables

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 03:14:05PM +0200, Matteo Croce wrote:
> On Sat, May 9, 2020 at 1:45 PM Russell King - ARM Linux admin
>  wrote:
> >
> > > On Sat, May 09, 2020 at 11:15:58AM +0000, Stefan Chulski wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Matteo Croce 
> > > > Sent: Saturday, May 9, 2020 3:13 AM
> > > > To: David S . Miller 
> > > > Cc: Maxime Chevallier ; netdev
> > > > ; LKML ; Antoine
> > > > Tenart ; Thomas Petazzoni
> > > > ; gregory.clem...@bootlin.com;
> > > > miquel.ray...@bootlin.com; Nadav Haklai ; Stefan
> > > > Chulski ; Marcin Wojtas ; Linux
> > > > ARM ; Russell King - ARM Linux 
> > > > admin
> > > > 
> > > > Subject: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS 
> > > > contexts to
> > > > handle RSS tables
> > > >
> > > > Hi,
> > > >
> > > > What do you think about temporarily disabling it like this?
> > > >
> > > > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > > > @@ -5775,7 +5775,8 @@ static int mvpp2_port_probe(struct platform_device
> > > > *pdev,
> > > > NETIF_F_HW_VLAN_CTAG_FILTER;
> > > >
> > > > if (mvpp22_rss_is_supported()) {
> > > > -   dev->hw_features |= NETIF_F_RXHASH;
> > > > +   if (port->phy_interface != PHY_INTERFACE_MODE_SGMII)
> > > > +   dev->hw_features |= NETIF_F_RXHASH;
> > > > dev->features |= NETIF_F_NTUPLE;
> > > > }
> > > >
> > > >
> > > > David, is this "workaround" too bad to get accepted?
> > >
> > > Not sure that RSS related to physical interface(SGMII), better just 
> > > remove NETIF_F_RXHASH as "workaround".
> >
> > Hmm, I'm not sure this is the right way forward.  This patch has the
> > effect of disabling:
> >
> > d33ec4525007 ("net: mvpp2: add an RSS classification step for each flow")
> >
> > but the commit you're pointing at which caused the regression is:
> >
> > 895586d5dc32 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")
> >
> >
> 
> Hi,
> 
> When git bisect pointed to 895586d5dc32 ("net: mvpp2: cls: Use RSS
> contexts to handle RSS tables"), which was merged
> almost an year after d33ec4525007 ("net: mvpp2: add an RSS
> classification step for each flow"), so I assume that between these
> two commits either the feature was working or it was disable and we
> didn't notice
> 
> Without knowing what was happening, which commit should my Fixes tag point to?

Let me make sure that I get this clear:

- Prior to 895586d5dc32, you can turn on and off rxhash without issue
  on any port.
- After 895586d5dc32, turning rxhash on eth2 prevents reception.

Prior to 895586d5dc32, with rxhash on, it looks like hashing using
CRC32 is supported but only one context.  So, if it's possible to
enable rxhash on any port on the mcbin without 895586d5dc32, and the
port continues to work, I'd say the bug was introduced by
895586d5dc32.

Of course, that would be reinforced if there was a measurable
difference in performance due to rxhash on each port.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to handle RSS tables

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 12:31:21PM +, Stefan Chulski wrote:
> > -Original Message-
> > From: Matteo Croce 
> > Sent: Saturday, May 9, 2020 3:16 PM
> > To: Stefan Chulski 
> > Cc: David S . Miller ; Maxime Chevallier
> > ; netdev ; LKML
> > ; Antoine Tenart
> > ; Thomas Petazzoni
> > ; gregory.clem...@bootlin.com;
> > miquel.ray...@bootlin.com; Nadav Haklai ; Marcin
> > Wojtas ; Linux ARM  > ker...@lists.infradead.org>; Russell King - ARM Linux admin
> > 
> > Subject: Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS 
> > contexts to
> > handle RSS tables
> > 
> > Hi,
> > 
> > The point is that RXHASH works fine on all interfaces except the
> > gigabit one (usually eth2).
> > On the 10 gbit interface it is very effective: the throughput goes up 4x
> > when enabled, so it would be a big drawback to disable it on all
> > interfaces.
> > 
> > Honestly, I don't have any 2.5 gbit hardware to test it on eth3, so I
> > don't know if rxhash actually only works on the first interface of a
> > unit (so eth0 and eth1), or if it just doesn't work on the gigabit one.
> > 
> > If someone could test it on the 2.5 gbit port, that would be helpful.
> 
> RSS tables are part of the Packet Processor IP, not the MAC (so this is
> not related to a specific speed). The issue probably exists on specific
> packet processor ports.
> Since RSS works fine on the first port of the CP, we can do the following:
> if (port->id == 0)
>   dev->hw_features |= NETIF_F_RXHASH;

I can confirm that Macchiatobin Single Shot eth0 port works with a
1G Fibre SFP or 10G DA SFP with or without rxhash on.

So it seems Stefan's hunch that it is port related is correct.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to handle RSS tables

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 02:16:44PM +0200, Thomas Petazzoni wrote:
> Hello,
> 
> On Sat, 9 May 2020 12:45:18 +0100
> Russell King - ARM Linux admin  wrote:
> 
> > Looking at the timeline here, it looks like Matteo raised the issue
> > very quickly after the patch was sent on the 14th April, and despite
> > following up on it, despite me following up on it, bootlin have
> > remained quiet.
> 
> Unfortunately, we are no longer actively working on Marvell platform
> support at the moment. We might have a look on a best effort basis, but
> this is potentially a non-trivial issue, so I'm not sure when we will
> have the chance to investigate and fix this.

That may be the case, but that doesn't excuse the fact that we have a
regression and we need to do something.

Please can you suggest how we resolve this regression prior to
5.7-final?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to handle RSS tables

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 11:15:58AM +, Stefan Chulski wrote:
> 
> 
> > -Original Message-
> > From: Matteo Croce 
> > Sent: Saturday, May 9, 2020 3:13 AM
> > To: David S . Miller 
> > Cc: Maxime Chevallier ; netdev
> > ; LKML ; Antoine
> > Tenart ; Thomas Petazzoni
> > ; gregory.clem...@bootlin.com;
> > miquel.ray...@bootlin.com; Nadav Haklai ; Stefan
> > Chulski ; Marcin Wojtas ; Linux
> > ARM ; Russell King - ARM Linux admin
> > 
> > Subject: [EXT] Re: [PATCH net-next 3/5] net: mvpp2: cls: Use RSS contexts to
> > handle RSS tables
> > 
> > Hi,
> > 
> > What do you think about temporarily disabling it like this?
> > 
> > --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
> > @@ -5775,7 +5775,8 @@ static int mvpp2_port_probe(struct platform_device
> > *pdev,
> > NETIF_F_HW_VLAN_CTAG_FILTER;
> > 
> > if (mvpp22_rss_is_supported()) {
> > -   dev->hw_features |= NETIF_F_RXHASH;
> > +   if (port->phy_interface != PHY_INTERFACE_MODE_SGMII)
> > +           dev->hw_features |= NETIF_F_RXHASH;
> > dev->features |= NETIF_F_NTUPLE;
> > }
> > 
> > 
> > David, is this "workaround" too bad to get accepted?
> 
> I'm not sure that RSS is related to the physical interface (SGMII);
> better to just remove NETIF_F_RXHASH as a "workaround".

Hmm, I'm not sure this is the right way forward.  This patch has the
effect of disabling:

d33ec4525007 ("net: mvpp2: add an RSS classification step for each flow")

but the commit you're pointing at which caused the regression is:

895586d5dc32 ("net: mvpp2: cls: Use RSS contexts to handle RSS tables")


Looking at the timeline here, it looks like Matteo raised the issue
very quickly after the patch was sent on the 14th April, and despite
following up on it, despite me following up on it, bootlin have
remained quiet.  For a regression, that's not particularly good, and
doesn't leave many options but to ask davem to revert the commit or,
if possible, to fix it (there doesn't seem to be any willingness for
either - maybe it's a feature no one uses on this platform?)

Would reverting the commit you point to as the cause (895586d5dc32)
resolve the problem, and have any advantage over entirely disabling
RSS?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH v1 1/3] gpio: mpc8xxx: support fsl-layerscape platform.

2020-05-09 Thread Russell King - ARM Linux admin
On Sat, May 09, 2020 at 06:35:35PM +0800, Hui Song wrote:
> From: "hui.song" 
> 
> Make the MPC8XXX gpio driver to support the fsl-layerscape.
> 
> Signed-off-by: hui.song 
> ---
>  drivers/gpio/mpc8xxx_gpio.c | 59 +
>  1 file changed, 59 insertions(+)

What project are these for?  There is no such file in the kernel tree.

I think you've sent these patches to the wrong people and mailing lists.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] ARM: OMAP2+: remove unneeded variable "errata" in configure_dma_errata()

2020-05-06 Thread Russell King - ARM Linux admin
On Wed, May 06, 2020 at 02:19:00PM +0800, Jason Yan wrote:
> Fix the following coccicheck warning:
> 
> arch/arm/mach-omap2/dma.c:82:10-16: Unneeded variable: "errata". Return
> "0" on line 161

NAK.  Look closer at what the code is doing, thanks.

This warning is basically incorrect.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [net-next PATCH v3 1/5] net: phy: Introduce phy related fwnode functions

2020-05-05 Thread Russell King - ARM Linux admin
On Tue, May 05, 2020 at 06:59:01PM +0530, Calvin Johnson wrote:
> +static inline struct phy_device *device_phy_find_device(struct device *dev)
> +{
> + return NULL;
> +}
> +
> +struct fwnode_handle *fwnode_get_phy_node(struct fwnode_handle *fwnode)
> +{
> + return NULL;
> +}

This wants to be "static inline" to avoid the issue the 0-day robot
found.

Thanks.
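
For illustration only, a minimal userspace sketch of the stub pattern.
CONFIG_PHYLIB_SKETCH is a made-up macro standing in for the
IS_ENABLED(CONFIG_PHYLIB) test, and the three structs are left opaque;
this is not the actual header.

```c
#include <stddef.h>

/* Opaque stand-ins for the kernel types; only pointers are passed around. */
struct device;
struct phy_device;
struct fwnode_handle;

#ifdef CONFIG_PHYLIB_SKETCH	/* hypothetical stand-in for IS_ENABLED(CONFIG_PHYLIB) */
struct phy_device *device_phy_find_device(struct device *dev);
struct fwnode_handle *fwnode_get_phy_node(struct fwnode_handle *fwnode);
#else
/*
 * Stubs that live in a header should be "static inline".  A plain
 * definition would be emitted by every file that includes the header,
 * which typically shows up as duplicate-definition or missing-prototype
 * complaints at build time.
 */
static inline struct phy_device *device_phy_find_device(struct device *dev)
{
	(void)dev;
	return NULL;
}

static inline struct fwnode_handle *
fwnode_get_phy_node(struct fwnode_handle *fwnode)
{
	(void)fwnode;
	return NULL;
}
#endif
```

With the config macro undefined, both stubs compile away to a constant
NULL and nothing is exported from the including file.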

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [net-next PATCH v3 1/5] net: phy: Introduce phy related fwnode functions

2020-05-05 Thread Russell King - ARM Linux admin
On Tue, May 05, 2020 at 06:59:01PM +0530, Calvin Johnson wrote:
> Define fwnode_phy_find_device() to iterate an mdiobus and find the
> phy device of the provided phy fwnode. Additionally define
> device_phy_find_device() to find phy device of provided device.
> 
> Define fwnode_get_phy_node() to get phy_node using named reference.
> 
> Signed-off-by: Calvin Johnson 
> ---
> 
> Changes in v3:
>   move fwnode APIs to appropriate place
>   stubs fwnode APIs for !CONFIG_PHYLIB
>   improve comment on function return condition.
> 
> Changes in v2:
>   move phy code from base/property.c to net/phy/phy_device.c
>   replace acpi & of code to get phy-handle with fwnode_find_reference
> 
>  drivers/net/phy/phy_device.c | 53 
>  include/linux/phy.h  | 19 +
>  2 files changed, 72 insertions(+)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 7e1ddd5745d2..3e8224132218 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -31,6 +31,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  MODULE_DESCRIPTION("PHY library");
>  MODULE_AUTHOR("Andy Fleming");
> @@ -2436,6 +2437,58 @@ static bool phy_drv_supports_irq(struct phy_driver 
> *phydrv)
>   return phydrv->config_intr && phydrv->ack_interrupt;
>  }
>  
> +/**
> + * fwnode_phy_find_device - Find phy_device on the mdiobus for the provided
> + * phy_fwnode.
> + * @phy_fwnode: Pointer to the phy's fwnode.
> + *
> + * If successful, returns a pointer to the phy_device with the embedded
> + * struct device refcount incremented by one, or NULL on failure.
> + */
> +struct phy_device *fwnode_phy_find_device(struct fwnode_handle *phy_fwnode)
> +{
> + struct device *d;
> + struct mdio_device *mdiodev;
> +
> + if (!phy_fwnode)
> + return NULL;
> +
> + d = bus_find_device_by_fwnode(&mdio_bus_type, phy_fwnode);
> + if (d) {
> + mdiodev = to_mdio_device(d);
> + if (mdiodev->flags & MDIO_DEVICE_FLAG_PHY)
> + return to_phy_device(d);
> + put_device(d);
> + }
> +
> + return NULL;
> +}
> +EXPORT_SYMBOL(fwnode_phy_find_device);

This is basically functionally equivalent to of_phy_find_device().  If
we replaced of_mdio_find_device() with a fwnode equivalent and used that
above, we could have both of_mdio_find_device() and of_phy_find_device()
be wrappers around their fwnode equivalents.

That also means fewer lines of code to maintain, and means that we're
unlikely to have two implementations that may drift apart functionally
over time because they're separated in two different parts of the kernel.
That is an especially important point given that fwnodes can be DT
nodes, so one may call fwnode APIs on a DT platform.
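
To make the wrapper idea concrete, here is a rough userspace sketch.
The struct layouts, the toy registry and the helpers register_phy() and
node_to_fwnode() are invented for illustration; only the shape (one
fwnode-based lookup, with the OF variant as a thin wrapper) reflects
what is being suggested.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (hypothetical layouts). */
struct fwnode_handle { int unused; };
struct device_node { struct fwnode_handle fwnode; };
struct phy_device { struct fwnode_handle *fwnode; };

/* Toy registry standing in for the devices on the mdio bus. */
#define MAX_PHYS 4
static struct phy_device *phys[MAX_PHYS];
static size_t nr_phys;

void register_phy(struct phy_device *phy)
{
	assert(nr_phys < MAX_PHYS);
	phys[nr_phys++] = phy;
}

/* The single real lookup: match on the fwnode pointer. */
struct phy_device *fwnode_phy_find_device(struct fwnode_handle *fwnode)
{
	size_t i;

	if (!fwnode)
		return NULL;

	for (i = 0; i < nr_phys; i++)
		if (phys[i]->fwnode == fwnode)
			return phys[i];

	return NULL;
}

/* A DT node embeds a fwnode, so converting is just taking its address. */
struct fwnode_handle *node_to_fwnode(struct device_node *np)
{
	return np ? &np->fwnode : NULL;
}

/* The OF variant becomes a wrapper; there is no second implementation. */
struct phy_device *of_phy_find_device(struct device_node *phy_np)
{
	return fwnode_phy_find_device(node_to_fwnode(phy_np));
}
```

Both entry points then share one code path, so they cannot drift apart.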

> +
> +/**
> + * device_phy_find_device - For the given device, get the phy_device
> + * @dev: Pointer to the given device
> + *
> + * Refer return conditions of fwnode_phy_find_device().
> + */
> +struct phy_device *device_phy_find_device(struct device *dev)
> +{
> + return fwnode_phy_find_device(dev_fwnode(dev));
> +}
> +EXPORT_SYMBOL_GPL(device_phy_find_device);
> +
> +/**
> + * fwnode_get_phy_node - Get the phy_node using the named reference.
> + * @fwnode: Pointer to fwnode from which phy_node has to be obtained.
> + *
> + * Refer return conditions of fwnode_find_reference().
> + */
> +struct fwnode_handle *fwnode_get_phy_node(struct fwnode_handle *fwnode)
> +{
> + return fwnode_find_reference(fwnode, "phy-handle", 0);
> +}
> +EXPORT_SYMBOL_GPL(fwnode_get_phy_node);

What if the fwnode is a DT device handle?  Shouldn't this also check for
the legacy properties as well, so we can transition code over to this
new interface?

> +
>  /**
>   * phy_probe - probe and init a PHY device
>   * @dev: device to probe and init
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index e2bfb9240587..f2664730a331 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -1141,10 +1141,29 @@ struct phy_device *phy_device_create(struct mii_bus 
> *bus, int addr, u32 phy_id,
>bool is_c45,
>struct phy_c45_device_ids *c45_ids);
>  #if IS_ENABLED(CONFIG_PHYLIB)
> +struct phy_device *fwnode_phy_find_device(struct fwnode_handle *phy_fwnode);
> +struct phy_device *device_phy_find_device(struct device *dev);
> +struct fwnode_handle *fwnode_get_phy_node(struct fwnode_handle *fwnode);
>  struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool 
> is_c45);
>  int phy_device_register(

Re: [net-next PATCH v3 3/5] phylink: Introduce phylink_fwnode_phy_connect()

2020-05-05 Thread Russell King - ARM Linux admin
On Tue, May 05, 2020 at 06:59:03PM +0530, Calvin Johnson wrote:
> Define phylink_fwnode_phy_connect() to connect phy specified by
> a fwnode to a phylink instance.
> 
> Signed-off-by: Calvin Johnson 
> ---
> 
> Changes in v3:
>   remove NULL return check as it is invalid
>   remove unused phylink_device_phy_connect()
> 
> Changes in v2:
>   replace of_ and acpi_ code with generic fwnode to get phy-handle.
> 
>  drivers/net/phy/phylink.c | 48 +++
>  include/linux/phylink.h   |  3 +++
>  2 files changed, 51 insertions(+)
> 
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index 0f23bec431c1..560d1069426c 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -961,6 +961,54 @@ int phylink_connect_phy(struct phylink *pl, struct 
> phy_device *phy)
>  }
>  EXPORT_SYMBOL_GPL(phylink_connect_phy);
>  
> +/**
> + * phylink_fwnode_phy_connect() - connect the PHY specified in the fwnode.
> + * @pl: a pointer to a &struct phylink returned from phylink_create()
> + * @fwnode: a pointer to a &struct fwnode_handle.
> + * @flags: PHY-specific flags to communicate to the PHY device driver
> + *
> + * Connect the phy specified @fwnode to the phylink instance specified
> + * by @pl. Actions specified in phylink_connect_phy() will be
> + * performed.
> + *
> + * Returns 0 on success or a negative errno.
> + */
> +int phylink_fwnode_phy_connect(struct phylink *pl,
> +struct fwnode_handle *fwnode,
> +u32 flags)
> +{
> + struct fwnode_handle *phy_fwnode;
> + struct phy_device *phy_dev;
> + int ret = 0;
> +
> + /* Fixed links and 802.3z are handled without needing a PHY */
> + if (pl->cfg_link_an_mode == MLO_AN_FIXED ||
> + (pl->cfg_link_an_mode == MLO_AN_INBAND &&
> +  phy_interface_mode_is_8023z(pl->link_interface)))
> + return 0;
> +
> + phy_fwnode = fwnode_get_phy_node(fwnode);
> + if ((IS_ERR(phy_fwnode)) && pl->cfg_link_an_mode == MLO_AN_PHY)
> + return -ENODEV;

This doesn't reflect the behaviour of phylink_of_phy_connect() - it is
*not* a cleanup of what is there, which is:

	if (!phy_node) {
		if (pl->cfg_link_an_mode == MLO_AN_PHY)
			return -ENODEV;
		return 0;
	}

which does:

- if there is a PHY node, find the PHY and connect it.
- if there is no PHY node, then:
   + if we are expecting a PHY to be present, return an error.
   + otherwise, it is not a problem, continue.

That is very important behaviour - it allows drivers to call
phylink_*_phy_connect() without knowing whether there should or should
not be a PHY - and keeps that knowledge within phylink.  It means
network drivers don't have to parse the firmware to find out if there's
a fixed link or SFP cage attached, and decide whether to call these
functions.
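
The decision described above can be sketched in plain C as follows.
This is a userspace illustration, not phylink code: the enum, the
function name phy_node_policy() and the "connect" out-parameter are all
made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the three link AN modes discussed above. */
enum sketch_an_mode { AN_PHY, AN_FIXED, AN_INBAND };

/*
 * Decide what to do given the configured mode and whether firmware
 * described a PHY node.  Returns 0 or -ENODEV; *connect says whether
 * the caller should go on to find and attach the PHY.
 */
int phy_node_policy(enum sketch_an_mode mode, bool is_8023z,
		    const void *phy_node, bool *connect)
{
	assert(connect != NULL);
	*connect = false;

	/* Fixed links and 802.3z in-band links never need a PHY. */
	if (mode == AN_FIXED || (mode == AN_INBAND && is_8023z))
		return 0;

	/* No PHY node: only an error if a PHY was expected. */
	if (!phy_node)
		return mode == AN_PHY ? -ENODEV : 0;

	*connect = true;
	return 0;
}
```

The caller never needs to know in advance whether a PHY should exist;
it calls the function unconditionally and acts on the result.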

> +
> + phy_dev = fwnode_phy_find_device(phy_fwnode);
> + fwnode_handle_put(phy_fwnode);
> + if (!phy_dev)
> + return -ENODEV;
> +
> + ret = phy_attach_direct(pl->netdev, phy_dev, flags,
> + pl->link_interface);
> + if (ret)
> + return ret;
> +
> + ret = phylink_bringup_phy(pl, phy_dev, pl->link_config.interface);
> + if (ret)
> + phy_detach(phy_dev);
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(phylink_fwnode_phy_connect);
> +

I think we need to go further with this, and we need to have
phylink_fwnode_phy_connect() functionally identical to
phylink_of_phy_connect() for DT-based fwnodes.  Doing so will avoid
introducing errors such as the one you've added above.

The only difference between these two is that DT has a number of
legacy properties - these can be omitted if the fwnode is not a DT
node.

Remember that fwnode is compatible with DT, so fwnode_phy_find_device()
can internally decide whether to look for the ACPI property or one of
the three DT properties.

It also means that phylink_of_phy_connect() can become:

int phylink_of_phy_connect(struct phylink *pl, struct device_node *dn,
			   u32 flags)
{
	return phylink_fwnode_phy_connect(pl, of_fwnode_handle(dn), flags);
}

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [net-next PATCH v3 4/5] net: phy: Introduce fwnode_get_phy_id()

2020-05-05 Thread Russell King - ARM Linux admin
On Tue, May 05, 2020 at 05:15:16PM +0300, Andy Shevchenko wrote:
> On Tue, May 5, 2020 at 4:29 PM Calvin Johnson
> > +   if (sscanf(cp, "ethernet-phy-id%4x.%4x",
> > +              &upper, &lower) == 2) {
> 
> > +   *phy_id = ((upper & 0xFFFF) << 16) | (lower & 0xFFFF);
> 
> How upper can be bigger than 0xfff? Same for lower.

I think your comment is incorrect here.  Four hex digits can be larger
than 0xfff.  "1000" interpreted as hex is four hex digits and larger
than 0xfff, for example.
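
A small userspace sketch of the same parse shows this.  The function
name parse_phy_id() and its return convention are made up for the
example, and the 16-bit mask is assumed from the "%4x" field width.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an "ethernet-phy-idAAAA.BBBB" compatible string into a 32-bit
 * PHY ID.  Returns 0 on success, -1 if the string does not match.
 */
int parse_phy_id(const char *compat, uint32_t *phy_id)
{
	unsigned int upper, lower;

	assert(compat != NULL && phy_id != NULL);

	/* "%4x" consumes at most four hex digits, i.e. values up to 0xffff. */
	if (sscanf(compat, "ethernet-phy-id%4x.%4x", &upper, &lower) != 2)
		return -1;

	*phy_id = ((uint32_t)(upper & 0xffff) << 16) | (lower & 0xffff);
	return 0;
}
```

Feeding it "ethernet-phy-id1000.0001" gives an upper half of 0x1000,
which is already larger than 0xfff.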

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] ARM: use ARM unwinder for gcov

2020-05-05 Thread Russell King - ARM Linux admin
On Tue, May 05, 2020 at 04:11:56PM +0200, Arnd Bergmann wrote:
> Using gcov on ARM fails when the frame pointer unwinder is used:
> 
> arm-linux-gnueabi-ld: kernel/softirq.o:(.ARM.exidx+0x120): undefined 
> reference to `__aeabi_unwind_cpp_pr0'
> arm-linux-gnueabi-ld: init/main.o:(.ARM.exidx+0x98): undefined reference to 
> `__aeabi_unwind_cpp_pr0'
> arm-linux-gnueabi-ld: init/version.o:(.ARM.exidx+0x0): undefined reference to 
> `__aeabi_unwind_cpp_pr0'
> arm-linux-gnueabi-ld: init/do_mounts.o:(.ARM.exidx+0x28): undefined reference 
> to `__aeabi_unwind_cpp_pr0'
> arm-linux-gnueabi-ld: init/do_mounts_initrd.o:(.ARM.exidx+0x0): undefined 
> reference to `__aeabi_unwind_cpp_pr0'
> arm-linux-gnueabi-ld: init/initramfs.o:(.ARM.exidx+0x8): more undefined 
> references to `__aeabi_unwind_cpp_pr0' follow
> 
> This is likely a bug in clang that should be fixed in the compiler.
> Forcing the use of the ARM unwinder in this configuration however
> works around the problem.

Or should the stub functions in arch/arm/kernel/unwind.c be moved out?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [Patch include request] ARM: dts: imx6qdl-sr-som-ti: indicate powering off wifi is safe

2020-05-04 Thread Russell King - ARM Linux admin
On Mon, May 04, 2020 at 11:58:32AM +0200, Greg KH wrote:
> On Fri, May 01, 2020 at 09:23:49PM +0100, Miguel Borges de Freitas wrote:
> > Dear all,
> > 
> > This is a request to backport b7dc7205b2ae6b6c9d9cfc3e47d6f08da8647b10
> > (Arm: dts:  imx6qdl-sr-som-ti: indicate powering off wifi is safe),
> > already in Linus tree
> > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/arm/boot/dts/imx6qdl-sr-som-ti.dtsi?h=v5.7-rc3&id=b7dc7205b2ae6b6c9d9cfc3e47d6f08da8647b10)
> > to LTS kernel 5.4 and to stable 5.6.8.
> > 
> > Reasoning:
> > 
> > Changes to the wlcore driver during kernel 5.x development made the
> > Cubox-i with the IMX SOM v1.5 (which includes a TI Wilink 8 wifi
> > chipset) not power the wireless interface on boot, leaving it
> > completely unusable. This has happened since at least kernel 5.3 (the
> > oldest one I tested) and affects the latest stable and LTS kernels. The
> > linked commit, already in linux mainline, restores the wifi
> > functionality.
> > 
> > Thanks in advance,
> 
> Now queued up, thanks.

Just be aware that there's a good reason the patch was never marked
with a Fixes: tag - that is because no one seems to know exactly which
commit broke it, and hence it hasn't been clear which stable kernels
it should be backported to.

So, it's good that someone has put up this backport request.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] arm: Drop CONFIG_MTD_M25P80 in various defconfig files

2020-05-02 Thread Russell King - ARM Linux admin
> diff --git a/arch/arm/configs/pxa_defconfig b/arch/arm/configs/pxa_defconfig
> index b817c57..50bbfdd 100644
> --- a/arch/arm/configs/pxa_defconfig
> +++ b/arch/arm/configs/pxa_defconfig
> @@ -181,7 +181,6 @@ CONFIG_MTD_RAM=m
>  CONFIG_MTD_ROM=m
>  CONFIG_MTD_COMPLEX_MAPPINGS=y
>  CONFIG_MTD_PXA2XX=m
> -CONFIG_MTD_M25P80=m
>  CONFIG_MTD_BLOCK2MTD=y
>  CONFIG_MTD_DOCG3=m
>  CONFIG_MTD_RAW_NAND=m
> diff --git a/arch/arm/configs/qcom_defconfig b/arch/arm/configs/qcom_defconfig
> index c882167..0a90c8d 100644
> --- a/arch/arm/configs/qcom_defconfig
> +++ b/arch/arm/configs/qcom_defconfig
> @@ -62,7 +62,6 @@ CONFIG_DEVTMPFS=y
>  CONFIG_DEVTMPFS_MOUNT=y
>  CONFIG_MTD=y
>  CONFIG_MTD_BLOCK=y
> -CONFIG_MTD_M25P80=y
>  CONFIG_MTD_RAW_NAND=y
>  CONFIG_MTD_NAND_QCOM=y
>  CONFIG_MTD_SPI_NOR=y
> diff --git a/arch/arm/configs/sama5_defconfig 
> b/arch/arm/configs/sama5_defconfig
> index bab7861..7e9ec6f 100644
> --- a/arch/arm/configs/sama5_defconfig
> +++ b/arch/arm/configs/sama5_defconfig
> @@ -63,7 +63,6 @@ CONFIG_MTD=y
>  CONFIG_MTD_CMDLINE_PARTS=y
>  CONFIG_MTD_BLOCK=y
>  CONFIG_MTD_CFI=y
> -CONFIG_MTD_M25P80=y
>  CONFIG_MTD_RAW_NAND=y
>  CONFIG_MTD_NAND_ATMEL=y
>  CONFIG_MTD_SPI_NOR=y
> diff --git a/arch/arm/configs/socfpga_defconfig 
> b/arch/arm/configs/socfpga_defconfig
> index e73c97b..04c8bd3 100644
> --- a/arch/arm/configs/socfpga_defconfig
> +++ b/arch/arm/configs/socfpga_defconfig
> @@ -48,7 +48,6 @@ CONFIG_DEVTMPFS=y
>  CONFIG_DEVTMPFS_MOUNT=y
>  CONFIG_MTD=y
>  CONFIG_MTD_BLOCK=y
> -CONFIG_MTD_M25P80=y
>  CONFIG_MTD_RAW_NAND=y
>  CONFIG_MTD_NAND_DENALI_DT=y
>  CONFIG_MTD_SPI_NOR=y
> diff --git a/arch/arm/configs/tegra_defconfig 
> b/arch/arm/configs/tegra_defconfig
> index aa94369..6a7988a 100644
> --- a/arch/arm/configs/tegra_defconfig
> +++ b/arch/arm/configs/tegra_defconfig
> @@ -76,7 +76,6 @@ CONFIG_DEVTMPFS=y
>  CONFIG_DEVTMPFS_MOUNT=y
>  CONFIG_TEGRA_GMI=y
>  CONFIG_MTD=y
> -CONFIG_MTD_M25P80=y
>  CONFIG_MTD_SPI_NOR=y
>  CONFIG_BLK_DEV_LOOP=y
>  CONFIG_AD525X_DPOT=y
> -- 
> 2.7.4
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] net: dsa: sja1105: fix speed setting for 10 MBPS

2020-05-01 Thread Russell King - ARM Linux admin
On Fri, May 01, 2020 at 06:00:52PM +, Walter Harms wrote:
> IMHO it would be better to use switch case here to improve readability.
> 
> switch (bmcr & mask) {
> case BMCR_SPEED1000:
> 	speed = SPEED_1000;
> 	break;
> case BMCR_SPEED100:
> 	speed = SPEED_100;
> 	break;
> case BMCR_SPEED10:
> 	speed = SPEED_10;
> 	break;
> default:
> 	speed = SPEED_UNKNOWN;
> }
> 
> jm2c,
>  wh
> 
> btw: an_enabled ? why not !enabled, much easier to read

You misinterpret "an_enabled".  It's not "negated enabled".  It's not
even "disabled".  It's short for "autonegotiation enabled".  It's
positive logic too.
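
As a rough userspace sketch of both points (the switch form and the
meaning of "an_enabled"): the bit values below are the usual BMCR
definitions, copied locally so the example stands alone, and the two
function names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the standard BMCR bits so the sketch is standalone. */
#define BMCR_SPEED1000	0x0040
#define BMCR_SPEED100	0x2000
#define BMCR_SPEED10	0x0000
#define BMCR_ANENABLE	0x1000

#define SPEED_10	10
#define SPEED_100	100
#define SPEED_1000	1000
#define SPEED_UNKNOWN	(-1)

/* Decode the forced speed from the two speed-select bits. */
int bmcr_to_speed(uint16_t bmcr)
{
	switch (bmcr & (BMCR_SPEED1000 | BMCR_SPEED100)) {
	case BMCR_SPEED1000:
		return SPEED_1000;
	case BMCR_SPEED100:
		return SPEED_100;
	case BMCR_SPEED10:
		return SPEED_10;
	default:
		/* Both bits set is a reserved combination. */
		return SPEED_UNKNOWN;
	}
}

/* "an_enabled" is positive logic: true when autonegotiation is on. */
bool bmcr_an_enabled(uint16_t bmcr)
{
	return (bmcr & BMCR_ANENABLE) != 0;
}
```

Here an_enabled is simply the state of the autonegotiation enable bit;
nothing is negated.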

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH 04/18] ARM: debug-ll: Add support for r8a7742

2020-04-29 Thread Russell King - ARM Linux admin
On Wed, Apr 29, 2020 at 10:56:41PM +0100, Lad Prabhakar wrote:
> @@ -1701,6 +1709,7 @@ config DEBUG_UART_PHYS
>   default 0xe6e60000 if DEBUG_RCAR_GEN2_SCIF0
>   default 0xe6e68000 if DEBUG_RCAR_GEN2_SCIF1
>   default 0xe6ee0000 if DEBUG_RCAR_GEN2_SCIF4
> + default 0xe6c60000 if DEBUG_RCAR_GEN2_SCIFA2

Hi,

This is ordered by address.  Please keep it so.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH v2 0/5] Fix ELF / FDPIC ELF core dumping, and use mmap_sem properly in there

2020-04-29 Thread Russell King - ARM Linux admin
On Wed, Apr 29, 2020 at 11:49:49PM +0200, Jann Horn wrote:
> At the moment, we have that rather ugly mmget_still_valid() helper to
> work around : ELF core dumping
> doesn't take the mmap_sem while traversing the task's VMAs, and if
> anything (like userfaultfd) then remotely messes with the VMA tree,
> fireworks ensue. So at the moment we use mmget_still_valid() to bail
> out in any writers that might be operating on a remote mm's VMAs.
> 
> With this series, I'm trying to get rid of the need for that as
> cleanly as possible.
> In particular, I want to avoid holding the mmap_sem across unbounded
> sleeps.
> 
> 
> Patches 1, 2 and 3 are relatively unrelated cleanups in the core
> dumping code.
> 
> Patches 4 and 5 implement the main change: Instead of repeatedly
> accessing the VMA list with sleeps in between, we snapshot it at the
> start with proper locking, and then later we just use our copy of
> the VMA list. This ensures that the kernel won't crash, that VMA
> metadata in the coredump is consistent even in the presence of
> concurrent modifications, and that any virtual addresses that aren't
> being concurrently modified have their contents show up in the core
> dump properly.
> 
> The disadvantage of this approach is that we need a bit more memory
> during core dumping for storing metadata about all VMAs.
> 
> After this series has landed, we should be able to rip out
> mmget_still_valid().
> 
> 
> Testing done so far:
> 
>  - Creating a simple core dump on X86-64 still works.
>  - The created coredump on X86-64 opens in GDB, and both the stack and the
>    executable look vaguely plausible.
>  - 32-bit ARM compiles with FDPIC support, both with MMU and !MMU config.
> 
> I'm CCing some folks from the architectures that use FDPIC in case
> anyone wants to give this a spin.

I've never had any reason to use FDPIC, and I don't have any binaries
that would use it.  Nicolas Pitre added ARM support, so I guess he
would be the one to talk to about it.  (Added Nicolas.)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] net: dsa: mv88e6xxx: remove duplicate assignment of struct members

2020-04-29 Thread Russell King - ARM Linux admin
On Wed, Apr 29, 2020 at 10:10:01PM +0800, Jason Yan wrote:
> These struct members named 'phylink_validate' were assigned twice:
> 
> static const struct mv88e6xxx_ops mv88e6190_ops = {
>   ..
>   .phylink_validate = mv88e6390_phylink_validate,
>   ..
>   .phylink_validate = mv88e6390_phylink_validate,
> };
> 
> static const struct mv88e6xxx_ops mv88e6190x_ops = {
>   ..
>   .phylink_validate = mv88e6390_phylink_validate,
>   ..
>   .phylink_validate = mv88e6390x_phylink_validate,
> };
> 
> static const struct mv88e6xxx_ops mv88e6191_ops = {
>   ..
>   .phylink_validate = mv88e6390_phylink_validate,
>   ..
>   .phylink_validate = mv88e6390_phylink_validate,
> };
> 
> static const struct mv88e6xxx_ops mv88e6290_ops = {
>   ..
>   .phylink_validate = mv88e6390_phylink_validate,
>   ..
>   .phylink_validate = mv88e6390_phylink_validate,
> };
> 
> Remove the first assignment in each case and keep the second, which is
> the one actually used. Be aware that for 'mv88e6190x_ops' the two
> assigned functions differ, while for the others they are the same. This
> fixes the following coccicheck warning:
> 
> drivers/net/dsa/mv88e6xxx/chip.c:3911:48-49: phylink_validate: first
> occurrence line 3965, second occurrence line 3967
> drivers/net/dsa/mv88e6xxx/chip.c:3970:49-50: phylink_validate: first
> occurrence line 4024, second occurrence line 4026
> drivers/net/dsa/mv88e6xxx/chip.c:4029:48-49: phylink_validate: first
> occurrence line 4082, second occurrence line 4085
> drivers/net/dsa/mv88e6xxx/chip.c:4184:48-49: phylink_validate: first
> occurrence line 4238, second occurrence line 4242

This looks like a mistake while rebasing / updating the code which
resulted in commit 4262c38dc42e ("net: dsa: mv88e6xxx: Add SERDES stats
counters to all 6390 family members").

In light of what the commit which introduced this did, this patch looks
correct to me.

Fixes: 4262c38dc42e ("net: dsa: mv88e6xxx: Add SERDES stats counters to all 
6390 family members")
Reviewed-by: Russell King 

Thanks.

> 
> Signed-off-by: Jason Yan 
> ---
>  drivers/net/dsa/mv88e6xxx/chip.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c 
> b/drivers/net/dsa/mv88e6xxx/chip.c
> index dd8a5666a584..2b4a723c8306 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
> @@ -3962,7 +3962,6 @@ static const struct mv88e6xxx_ops mv88e6190_ops = {
>   .serdes_get_stats = mv88e6390_serdes_get_stats,
>   .serdes_get_regs_len = mv88e6390_serdes_get_regs_len,
>   .serdes_get_regs = mv88e6390_serdes_get_regs,
> - .phylink_validate = mv88e6390_phylink_validate,
>   .gpio_ops = _gpio_ops,
>   .phylink_validate = mv88e6390_phylink_validate,
>  };
> @@ -4021,7 +4020,6 @@ static const struct mv88e6xxx_ops mv88e6190x_ops = {
>   .serdes_get_stats = mv88e6390_serdes_get_stats,
>   .serdes_get_regs_len = mv88e6390_serdes_get_regs_len,
>   .serdes_get_regs = mv88e6390_serdes_get_regs,
> - .phylink_validate = mv88e6390_phylink_validate,
>   .gpio_ops = _gpio_ops,
>   .phylink_validate = mv88e6390x_phylink_validate,
>  };
> @@ -4079,7 +4077,6 @@ static const struct mv88e6xxx_ops mv88e6191_ops = {
>   .serdes_get_stats = mv88e6390_serdes_get_stats,
>   .serdes_get_regs_len = mv88e6390_serdes_get_regs_len,
>   .serdes_get_regs = mv88e6390_serdes_get_regs,
> - .phylink_validate = mv88e6390_phylink_validate,
>   .avb_ops = _avb_ops,
>   .ptp_ops = _ptp_ops,
>   .phylink_validate = mv88e6390_phylink_validate,
> @@ -4235,7 +4232,6 @@ static const struct mv88e6xxx_ops mv88e6290_ops = {
>   .serdes_get_stats = mv88e6390_serdes_get_stats,
>   .serdes_get_regs_len = mv88e6390_serdes_get_regs_len,
>   .serdes_get_regs = mv88e6390_serdes_get_regs,
> - .phylink_validate = mv88e6390_phylink_validate,
>   .gpio_ops = _gpio_ops,
>   .avb_ops = _avb_ops,
>   .ptp_ops = _ptp_ops,
> -- 
> 2.21.1
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up


Re: [PATCH] [v2] ARM: oabi-compat: fix epoll_ctl build failure

2020-04-29 Thread Russell King - ARM Linux admin
On Wed, Apr 29, 2020 at 03:23:24PM +0200, Arnd Bergmann wrote:
> Two functions are not declared or defined when CONFIG_EPOLL is
> disabled:
> 
> arch/arm/kernel/sys_oabi-compat.c: In function 'sys_oabi_epoll_ctl':
> arch/arm/kernel/sys_oabi-compat.c:258:6: error: implicit declaration of 
> function 'ep_op_has_event' [-Werror=implicit-function-declaration]
>   258 |  if (ep_op_has_event(op) &&
>   |  ^~~
> arch/arm/kernel/sys_oabi-compat.c:265:9: error: implicit declaration of 
> function 'do_epoll_ctl'; did you mean 'sys_epoll_ctl'? 
> [-Werror=implicit-function-declaration]
>   265 |  return do_epoll_ctl(epfd, op, fd, , false);
>   | ^~~~
>   | sys_epoll_ctl
> 
> Replace the function with the sys_ni_syscall stub in this case.
> 
> Fixes: c281634c8652 ("ARM: compat: remove KERNEL_DS usage in 
> sys_oabi_epoll_ctl()")
> Signed-off-by: Arnd Bergmann 
> ---
> v2: use sys_ni_syscall() instead of removing the function body
> ---
>  arch/arm/kernel/sys_oabi-compat.c | 2 ++
>  kernel/sys_ni.c   | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/arm/kernel/sys_oabi-compat.c 
> b/arch/arm/kernel/sys_oabi-compat.c
> index 85a1e95341d8..2488c69242cf 100644
> --- a/arch/arm/kernel/sys_oabi-compat.c
> +++ b/arch/arm/kernel/sys_oabi-compat.c
> @@ -249,6 +249,7 @@ struct oabi_epoll_event {
>   __u64 data;
>  } __attribute__ ((packed,aligned(4)));
>  
> +#ifdef CONFIG_EPOLL
>  asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int fd,
>  struct oabi_epoll_event __user *event)
>  {
> @@ -264,6 +265,7 @@ asmlinkage long sys_oabi_epoll_ctl(int epfd, int op, int 
> fd,
>  
>   return do_epoll_ctl(epfd, op, fd, , false);
>  }
> +#endif
>  
>  asmlinkage long sys_oabi_epoll_wait(int epfd,
>   struct oabi_epoll_event __user *events,
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 42ce28c460f6..9ee6a46b1795 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -68,6 +68,7 @@ COND_SYSCALL(epoll_create1);
>  COND_SYSCALL(epoll_ctl);
>  COND_SYSCALL(epoll_pwait);
>  COND_SYSCALL_COMPAT(epoll_pwait);
> +COND_SYSCALL(oabi_epoll_ctl); /* ARM OABI specific */
>  
>  /* fs/fcntl.c */
>  

I know what Chris said, but do we really want to be polluting generic
kernel files with arch specific stuff like this?



Re: Xilinx axienet 1000BaseX support

2020-04-29 Thread Russell King - ARM Linux admin
On Tue, Apr 28, 2020 at 05:51:58PM -0600, Robert Hancock wrote:
> On 2020-04-28 5:01 p.m., Russell King - ARM Linux admin wrote:
> > On Tue, Apr 28, 2020 at 03:59:45PM -0600, Robert Hancock wrote:
> > > On 2020-04-22 1:51 a.m., Russell King - ARM Linux admin wrote:
> > > > On Tue, Apr 21, 2020 at 07:45:47PM -0600, Robert Hancock wrote:
> > > > > Hi Andre/Russell,
> > > > > 
> > > > > Just wondering where things got to with the changes for SGMII on Xilinx
> > > > > axienet that you were discussing (below)? I am looking into our Xilinx setup
> > > > > using 1000BaseX SFP and trying to get it working "properly" with newer
> > > > > kernels. My understanding is that the requirements for 1000BaseX and SGMII
> > > > > are somewhat similar. I gathered that SGMII was working somewhat already,
> > > > > but that not all link modes had been tested. However, it appears 1000BaseX
> > > > > is not yet working in the stock kernel.
> > > > > 
> > > > > The way I had this working before with a 4.19-based kernel was basically a
> > > > > hack to phylink to allow the Xilinx PCS/PMA PHY to be configured
> > > > > sufficiently as a PHY for it to work, and mostly ignored the link status of
> > > > > the SFP PHY itself, even though we were using in-band signalling mode with
> > > > > an SFP module. That was using this patch:
> > > > > 
> > > > > https://patchwork.ozlabs.org/project/netdev/patch/1559330285-30246-5-git-send-email-hanc...@sedsystems.ca/
> > > > > 
> > > > > Of course, that's basically just a hack which I suspect mostly worked by
> > > > > luck. I see that there are some helpers that were added to phylink to allow
> > > > > setting PHY advertisements and reading PHY status from clause 22 PHY
> > > > > devices, so I'm guessing that is the way to go in this case? Something like:
> > > > > 
> > > > > axienet_mac_config: if using in-band mode, use
> > > > > phylink_mii_c22_pcs_set_advertisement to configure the Xilinx PHY.
> > > > > 
> > > > > axienet_mac_pcs_get_state: use phylink_mii_c22_pcs_get_state to get the MAC
> > > > > PCS state from the Xilinx PHY
> > > > > 
> > > > > axienet_mac_an_restart: if using in-band mode, use
> > > > > phylink_mii_c22_pcs_an_restart to restart autonegotiation on Xilinx PHY
> > > > > 
> > > > > To use those c22 functions, we need to find the mdio_device that's
> > > > > referenced by the phy-handle in the device tree - I guess we can just use
> > > > > some of the guts of of_phy_find_device to do that?
> > > > 
> > > > Please see the code for DPAA2 - it's changed slightly since I sent a
> > > > copy to the netdev mailing list, and it still isn't clear whether this
> > > > is the final approach (DPAA2 has some fun stuff such as several
> > > > different PHYs at address 0.) NXP basically didn't like the approach
> > > > I had in the patches I sent to netdev, we had a call, they presented
> > > > an alternative approach, I implemented it, then they decided my
> > > > original approach was the better solution for their situation.
> > > > 
> > > > See http://git.armlinux.org.uk/cgit/linux-arm.git/log/?h=cex7
> > > > 
> > > > specifically the patches from:
> > > > 
> > > > "dpaa2-mac: add 1000BASE-X/SGMII PCS support"
> > > > 
> > > > through to:
> > > > 
> > > > "net: phylink: add interface to configure clause 22 PCS PHY"
> > > > 
> > > > You may also need some of the patches further down in the net-queue
> > > > branch:
> > > > 
> > > > "net: phylink: avoid mac_config calls"
> > > > 
> > > > through to:
> > > > 
> > > > "net: phylink: rejig link state tracking"
> > > 
> > > I've been playing with this a bit on a 5.4 kernel with some of these patches
> > > back

Re: Xilinx axienet 1000BaseX support

2020-04-28 Thread Russell King - ARM Linux admin
On Tue, Apr 28, 2020 at 03:59:45PM -0600, Robert Hancock wrote:
> On 2020-04-22 1:51 a.m., Russell King - ARM Linux admin wrote:
> > On Tue, Apr 21, 2020 at 07:45:47PM -0600, Robert Hancock wrote:
> > > Hi Andre/Russell,
> > > 
> > > Just wondering where things got to with the changes for SGMII on Xilinx
> > > axienet that you were discussing (below)? I am looking into our Xilinx setup
> > > using 1000BaseX SFP and trying to get it working "properly" with newer
> > > kernels. My understanding is that the requirements for 1000BaseX and SGMII
> > > are somewhat similar. I gathered that SGMII was working somewhat already,
> > > but that not all link modes had been tested. However, it appears 1000BaseX
> > > is not yet working in the stock kernel.
> > > 
> > > The way I had this working before with a 4.19-based kernel was basically a
> > > hack to phylink to allow the Xilinx PCS/PMA PHY to be configured
> > > sufficiently as a PHY for it to work, and mostly ignored the link status of
> > > the SFP PHY itself, even though we were using in-band signalling mode with
> > > an SFP module. That was using this patch:
> > > 
> > > https://patchwork.ozlabs.org/project/netdev/patch/1559330285-30246-5-git-send-email-hanc...@sedsystems.ca/
> > > 
> > > Of course, that's basically just a hack which I suspect mostly worked by
> > > luck. I see that there are some helpers that were added to phylink to allow
> > > setting PHY advertisements and reading PHY status from clause 22 PHY
> > > devices, so I'm guessing that is the way to go in this case? Something like:
> > > 
> > > axienet_mac_config: if using in-band mode, use
> > > phylink_mii_c22_pcs_set_advertisement to configure the Xilinx PHY.
> > > 
> > > axienet_mac_pcs_get_state: use phylink_mii_c22_pcs_get_state to get the MAC
> > > PCS state from the Xilinx PHY
> > > 
> > > axienet_mac_an_restart: if using in-band mode, use
> > > phylink_mii_c22_pcs_an_restart to restart autonegotiation on Xilinx PHY
> > > 
> > > To use those c22 functions, we need to find the mdio_device that's
> > > referenced by the phy-handle in the device tree - I guess we can just use
> > > some of the guts of of_phy_find_device to do that?
> > 
> > Please see the code for DPAA2 - it's changed slightly since I sent a
> > copy to the netdev mailing list, and it still isn't clear whether this
> > is the final approach (DPAA2 has some fun stuff such as several
> > different PHYs at address 0.) NXP basically didn't like the approach
> > I had in the patches I sent to netdev, we had a call, they presented
> > an alternative approach, I implemented it, then they decided my
> > original approach was the better solution for their situation.
> > 
> > See http://git.armlinux.org.uk/cgit/linux-arm.git/log/?h=cex7
> > 
> > specifically the patches from:
> > 
> >"dpaa2-mac: add 1000BASE-X/SGMII PCS support"
> > 
> > through to:
> > 
> >"net: phylink: add interface to configure clause 22 PCS PHY"
> > 
> > You may also need some of the patches further down in the net-queue
> > branch:
> > 
> >"net: phylink: avoid mac_config calls"
> > 
> > through to:
> > 
> >"net: phylink: rejig link state tracking"
> 
> I've been playing with this a bit on a 5.4 kernel with some of these patches
> backported. However, I'm running into something that my previous hacks for
> this basically dealt with as a side effect: when phylink_start is called,
> sfp_upstream_start gets called, an SFP module is detected,
> phylink_connect_phy gets called, but then it hits this condition and bails
> out, because we are using INBAND mode with 1000BaseX:
> 
>   if (WARN_ON(pl->cfg_link_an_mode == MLO_AN_FIXED ||
>   (pl->cfg_link_an_mode == MLO_AN_INBAND &&
>phy_interface_mode_is_8023z(interface
>   return -EINVAL;

I'm expecting SGMII mode to be used when there's an external PHY as
that gives greatest flexibility (as it allows 10 and 100Mbps speeds
as well.)  From what I remember, these blocks support SGMII, so it
should just be a matter of adding that.

> I guess I'm not sure how this is supposed to work when the PHY on the SFP
> module gets detected, i.e. if there's supposed to be another code path that
> this is supposed to go down, or this is something that just hasn't been
> fully implemented yet?

Re: [PATCH 1/2] Revert "ASoC: hdmi-codec: re-introduce mutex locking"

2019-10-23 Thread Russell King - ARM Linux admin
On Wed, Oct 23, 2019 at 05:37:16PM +0100, Mark Brown wrote:
> On Wed, Oct 23, 2019 at 06:12:02PM +0200, Jerome Brunet wrote:
> > This reverts commit eb1ecadb7f67dde94ef0efd3ddaed5cb6c9a65ed.
> > 
> > This fixes the following warning reported by lockdep and a potential
> > issue with hibernation
> 
> Please submit patches using subject lines reflecting the style for the
> subsystem, this makes it easier for people to identify relevant patches.
> Look at what existing commits in the area you're changing are doing and
> make sure your subject lines visually resemble what they're doing.
> There's no need to resubmit to fix this alone.

Hi Mark,

If you look at the git log for reverted commits, the vast majority
of them follow _this_ style.  From 5.3 back to the start of current
git history, there are 3665 commits with "Revert" in their subject
line, 3050 of those start with "Revert" with no subsystem prefix.

It seems that there are a small number of subsystems that want
something different, ASoC included.  That will be an ongoing problem:
people won't remember which subsystems want it when the majority don't.

Maybe the revert format should be standardised in some manner?



Re: [PATCH v3 1/5] net: ag71xx: port to phylink

2019-10-21 Thread Russell King - ARM Linux admin
On Mon, Oct 21, 2019 at 07:38:07AM +0200, Oleksij Rempel wrote:
> +static void ag71xx_mac_validate(struct phylink_config *config,
> + unsigned long *supported,
> + struct phylink_link_state *state)
>  {
> - struct ag71xx *ag = netdev_priv(ndev);
> + __ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
> +
> + if (state->interface != PHY_INTERFACE_MODE_NA &&
> + state->interface != PHY_INTERFACE_MODE_GMII &&
> + state->interface != PHY_INTERFACE_MODE_MII) {
> + bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
> + return;
> + }
> +
> + phylink_set(mask, MII);
> +
> + /* flow control is not supported */
> +
> + phylink_set(mask, 10baseT_Half);
> + phylink_set(mask, 10baseT_Full);
> + phylink_set(mask, 100baseT_Half);
> + phylink_set(mask, 100baseT_Full);
>  
> - ag71xx_link_adjust(ag, true);
> + if (state->interface == PHY_INTERFACE_MODE_NA &&
> + state->interface == PHY_INTERFACE_MODE_GMII) {

This is always false.
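
To spell that out with a standalone sketch: a single value can never equal
two different constants at once, so the && form cannot hold for any input.
The enum and function names below are stand-ins invented for illustration
(not the kernel's phy_interface_t), and the || variant is only an assumption
about what the patch meant to test.

```c
#include <stdbool.h>

/* Stand-in for the kernel's phy_interface_t; illustrative values only. */
enum iface_mode { IFACE_NA, IFACE_MII, IFACE_GMII };

/* The condition as posted: one value compared against two different
 * constants with &&, which is false for every possible input. */
static bool gigabit_allowed_as_posted(enum iface_mode m)
{
	return m == IFACE_NA && m == IFACE_GMII;
}

/* Assumed intent: accept either of the two modes. */
static bool gigabit_allowed_assumed(enum iface_mode m)
{
	return m == IFACE_NA || m == IFACE_GMII;
}
```

With the first form, whatever is inside the if block is dead code for every
interface mode, GMII included.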

Apart from that, from just reading the patch I have no further concerns.

Thanks.



Re: [PATCH net-next 2/2] net: phy: Add ability to debug RGMII connections

2019-10-18 Thread Russell King - ARM Linux admin
On Fri, Oct 18, 2019 at 04:37:55PM +0300, Vladimir Oltean wrote:
> On Fri, 18 Oct 2019 at 16:23, Russell King - ARM Linux admin
>  wrote:
> >
> > On Fri, Oct 18, 2019 at 04:09:30PM +0300, Vladimir Oltean wrote:
> > > Hi Andrew,
> > >
> > > On Fri, 18 Oct 2019 at 16:01, Andrew Lunn  wrote:
> > > >
> > > > > Well, that's the tricky part. You're sending a frame out, with no
> > > > > guarantee you'll get the same frame back in. So I'm not sure that any
> > > > > identifiers put inside the frame will survive.
> > > > > How do the tests pan out for you? Do you actually get to trigger this
> > > > > check? As I mentioned, my NIC drops the frames with bad FCS.
> > > >
> > > > > My experience is, the NIC drops the frame and increments some
> > > > > counter about bad FCS. I do very occasionally see a frame delivered,
> > > > > but i guess that is 1/65536 where the FCS just happens to be good by
> > > > > accident. So i think some other algorithm should be used which is
> > > > > unlikely to be good when the FCS is accidentally good, or just check
> > > > > the contents of the packet, you know what it should contain.
> > > >
> > > > Are there any NICs which don't do hardware FCS? Is that something we
> > > > realistically need to consider?
> > > >
> > > > > Yes, but remember, nobody guarantees that a frame with DMAC
> > > > > ff:ff:ff:ff:ff:ff on egress will still have it on its way back. Again,
> > > > > this all depends on how you plan to manage the rx-all ethtool feature.
> > > >
> > > > Humm. Never heard that before. Are you saying some NICs rewrite the
> > > > > DMAC?
> > > >
> > >
> > > I'm just trying to understand the circumstances under which this
> > > kernel thread makes sense.
> > > Checking for FCS validity means that the intention was to enable the
> > > reception of frames with bad FCS.
> > > Bad FCS after bad RGMII setup/hold times doesn't mean there's a small
> > > guy in there who rewrites the checksum. It means that frame octets get
> > > garbled. All octets are just as likely to get garbled, including the
> > > SFD, preamble, DMAC, etc.
> > > All I'm saying is that, if the intention of the patch is to actually
> > > process the FCS of frames before and after, then it should actually
> > > put the interface in promiscuous mode, so that frames with a
> > > non-garbled SFD and preamble can still be received, even though their
> > > DMAC was the one that got garbled.
> >
> > Isn't the point of this to see which RGMII setting results in a working
> > setup?
> >
> > So, is it not true that what we're after is receiving a _correct_ frame
> > that corresponds to the frame that was sent out?
> >
> 
> Only true if the MAC does not drop bad frames by itself. Then the FCS
> check in the kernel thread is superfluous.

If a MAC driver doesn't drop bad frames, then surely it's buggy, since
there isn't (afaik) a way of marking a received skb with a FCS error.
Therefore, forwarding frames with bad FCS into the Linux networking
stack will allow the reception of bad frames as if they were good.

All the network drivers I've looked at (and written), when encountering
a packet with an error, update the statistic counters and drop the
errored packet.

Do you know of any that don't?
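
As a sketch of the pattern I mean (the status bit, the struct and the
function name here are invented for illustration and are not taken from any
particular driver):

```c
#include <stdbool.h>

/* Hypothetical receive-descriptor status bit, for illustration only. */
#define RX_STATUS_FCS_ERR	0x1u

/* Hypothetical per-device counters, loosely modelled on netdev stats. */
struct rx_stats {
	unsigned long rx_packets;
	unsigned long rx_errors;
	unsigned long rx_crc_errors;
};

/* Returns true if the frame may be passed up the stack.  An errored
 * frame is only counted; the caller is expected to drop it. */
static bool rx_accept_frame(unsigned int status, struct rx_stats *stats)
{
	if (status & RX_STATUS_FCS_ERR) {
		stats->rx_errors++;
		stats->rx_crc_errors++;
		return false;
	}

	stats->rx_packets++;
	return true;
}
```

With that shape, a frame with a bad FCS never reaches the networking stack,
so a second FCS check further up has nothing left to catch.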



Re: [PATCH net-next 2/2] net: phy: Add ability to debug RGMII connections

2019-10-18 Thread Russell King - ARM Linux admin
On Fri, Oct 18, 2019 at 04:09:30PM +0300, Vladimir Oltean wrote:
> Hi Andrew,
> 
> On Fri, 18 Oct 2019 at 16:01, Andrew Lunn  wrote:
> >
> > > Well, that's the tricky part. You're sending a frame out, with no
> > > guarantee you'll get the same frame back in. So I'm not sure that any
> > > identifiers put inside the frame will survive.
> > > How do the tests pan out for you? Do you actually get to trigger this
> > > check? As I mentioned, my NIC drops the frames with bad FCS.
> >
> > > My experience is, the NIC drops the frame and increments some
> > > counter about bad FCS. I do very occasionally see a frame delivered,
> > > but i guess that is 1/65536 where the FCS just happens to be good by
> > > accident. So i think some other algorithm should be used which is
> > > unlikely to be good when the FCS is accidentally good, or just check
> > > the contents of the packet, you know what it should contain.
> >
> > Are there any NICs which don't do hardware FCS? Is that something we
> > realistically need to consider?
> >
> > > Yes, but remember, nobody guarantees that a frame with DMAC
> > > ff:ff:ff:ff:ff:ff on egress will still have it on its way back. Again,
> > > this all depends on how you plan to manage the rx-all ethtool feature.
> >
> > Humm. Never heard that before. Are you saying some NICs rewrite the
> > > DMAC?
> >
> 
> I'm just trying to understand the circumstances under which this
> kernel thread makes sense.
> Checking for FCS validity means that the intention was to enable the
> reception of frames with bad FCS.
> Bad FCS after bad RGMII setup/hold times doesn't mean there's a small
> guy in there who rewrites the checksum. It means that frame octets get
> garbled. All octets are just as likely to get garbled, including the
> SFD, preamble, DMAC, etc.
> All I'm saying is that, if the intention of the patch is to actually
> process the FCS of frames before and after, then it should actually
> put the interface in promiscuous mode, so that frames with a
> non-garbled SFD and preamble can still be received, even though their
> DMAC was the one that got garbled.

Isn't the point of this to see which RGMII setting results in a working
setup?

So, is it not true that what we're after is receiving a _correct_ frame
that corresponds to the frame that was sent out?

Hence, if the DMAC got changed, it's irrelevant whether we received the
packet or not - since "no packet" || "changed packet" = fail.
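
In other words, the test verdict could be as simple as the sketch below.
loopback_frame_ok() is a made-up helper for illustration, not something from
the patch; it assumes the test keeps a copy of the frame it transmitted and
passes a NULL receive pointer when nothing came back.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: pass only if the frame that came back is
 * byte-for-byte the frame that went out.  rx == NULL means that no
 * frame was received at all. */
static bool loopback_frame_ok(const unsigned char *tx, size_t tx_len,
			      const unsigned char *rx, size_t rx_len)
{
	if (!rx)
		return false;	/* "no packet" */
	if (rx_len != tx_len)
		return false;	/* truncated or extended frame */

	/* Any differing octet, DMAC included, is a "changed packet". */
	return memcmp(tx, rx, tx_len) == 0;
}
```

Whether a garbled frame was filtered by the MAC or delivered and then
rejected by the compare makes no difference to the result.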



Re: [PATCH v2 1/5] net: ag71xx: port to phylink

2019-10-18 Thread Russell King - ARM Linux admin
Hi,

On Fri, Oct 18, 2019 at 11:39:25AM +0200, Oleksij Rempel wrote:
> The port to phylink was done as close as possible to initial
> functionality.
> Theoretically this HW can support flow control, practically seems to be not
> enough to just enable it. So, more work should be done.
> 
> Signed-off-by: Oleksij Rempel 
> ---
>  drivers/net/ethernet/atheros/Kconfig  |   2 +-
>  drivers/net/ethernet/atheros/ag71xx.c | 146 +++---
>  2 files changed, 87 insertions(+), 61 deletions(-)
> 
> diff --git a/drivers/net/ethernet/atheros/Kconfig b/drivers/net/ethernet/atheros/Kconfig
> index 0058051ba925..2720bde5034e 100644
> --- a/drivers/net/ethernet/atheros/Kconfig
> +++ b/drivers/net/ethernet/atheros/Kconfig
> @@ -20,7 +20,7 @@ if NET_VENDOR_ATHEROS
>  config AG71XX
>   tristate "Atheros AR7XXX/AR9XXX built-in ethernet mac support"
>   depends on ATH79
> - select PHYLIB
> + select PHYLINK
>   help
> If you wish to compile a kernel for AR7XXX/91XXX and enable
> ethernet support, then you should always answer Y to this.
> diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
> index 1b1a09095c0d..4ad587d6a8e8 100644
> --- a/drivers/net/ethernet/atheros/ag71xx.c
> +++ b/drivers/net/ethernet/atheros/ag71xx.c
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -314,6 +315,8 @@ struct ag71xx {
>   dma_addr_t stop_desc_dma;
>  
>   int phy_if_mode;
> + struct phylink *phylink;
> + struct phylink_config phylink_config;
>  
>   struct delayed_work restart_work;
>   struct timer_list oom_timer;
> @@ -845,24 +848,20 @@ static void ag71xx_hw_start(struct ag71xx *ag)
>   netif_wake_queue(ag->ndev);
>  }
>  
> -static void ag71xx_link_adjust(struct ag71xx *ag, bool update)
> +static void ag71xx_mac_config(struct phylink_config *config, unsigned int mode,
> +   const struct phylink_link_state *state)
>  {
> - struct phy_device *phydev = ag->ndev->phydev;
> + struct ag71xx *ag = netdev_priv(to_net_dev(config->dev));
>   u32 cfg2;
>   u32 ifctl;
>   u32 fifo5;
>  
> - if (!phydev->link && update) {
> - ag71xx_hw_stop(ag);
> - return;
> - }
> -
>   if (!ag71xx_is(ag, AR7100) && !ag71xx_is(ag, AR9130))
>   ag71xx_fast_reset(ag);
>  
>   cfg2 = ag71xx_rr(ag, AG71XX_REG_MAC_CFG2);
>   cfg2 &= ~(MAC_CFG2_IF_1000 | MAC_CFG2_IF_10_100 | MAC_CFG2_FDX);
> - cfg2 |= (phydev->duplex) ? MAC_CFG2_FDX : 0;
> + cfg2 |= (state->duplex) ? MAC_CFG2_FDX : 0;
>  
>   ifctl = ag71xx_rr(ag, AG71XX_REG_MAC_IFCTL);
>   ifctl &= ~(MAC_IFCTL_SPEED);
> @@ -870,7 +869,7 @@ static void ag71xx_link_adjust(struct ag71xx *ag, bool update)
>   fifo5 = ag71xx_rr(ag, AG71XX_REG_FIFO_CFG5);
>   fifo5 &= ~FIFO_CFG5_BM;
>  
> - switch (phydev->speed) {
> + switch (state->speed) {

Please see the documentation for the mac_config() method in
include/linux/phylink.h wrt state->speed and state->duplex validity.

>   case SPEED_1000:
>   cfg2 |= MAC_CFG2_IF_1000;
>   fifo5 |= FIFO_CFG5_BM;
> @@ -883,7 +882,6 @@ static void ag71xx_link_adjust(struct ag71xx *ag, bool update)
>   cfg2 |= MAC_CFG2_IF_10_100;
>   break;
>   default:
> - WARN(1, "not supported speed %i\n", phydev->speed);
>   return;
>   }
>  
> @@ -897,58 +895,78 @@ static void ag71xx_link_adjust(struct ag71xx *ag, bool update)
>   ag71xx_wr(ag, AG71XX_REG_MAC_CFG2, cfg2);
>   ag71xx_wr(ag, AG71XX_REG_FIFO_CFG5, fifo5);
>   ag71xx_wr(ag, AG71XX_REG_MAC_IFCTL, ifctl);
> -
> - ag71xx_hw_start(ag);
> -
> - if (update)
> - phy_print_status(phydev);
>  }
>  
> -static void ag71xx_phy_link_adjust(struct net_device *ndev)
> +static void ag71xx_mac_validate(struct phylink_config *config,
> + unsigned long *supported,
> + struct phylink_link_state *state)
>  {
> - struct ag71xx *ag = netdev_priv(ndev);
> + __ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
> +
> + if (state->interface != PHY_INTERFACE_MODE_NA &&
> + state->interface != PHY_INTERFACE_MODE_GMII &&
> + state->interface != PHY_INTERFACE_MODE_MII) {
> + bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
> + return;
> + }
> +
> + phyl

Re: [RFC PATCH 0/3] watchdog servicing during decompression

2019-10-17 Thread Russell King - ARM Linux admin
On Thu, Oct 17, 2019 at 02:34:52PM +0200, Rasmus Villemoes wrote:
> On 17/10/2019 14.03, Russell King - ARM Linux admin wrote:
> > We used to have this on ARM - it was called from the decompressor
> > code via an arch_decomp_wdog() hook.
> > 
> > That code got removed because it is entirely unsuitable for a multi-
> > platform kernel.  This looks like it takes an address for the watchdog
> > from the Kconfig, and builds that into the decompressor, making the
> > decompressor specific to that board or platform.
> > 
> > I'm not sure distros are going to like that given where we are with
> > multiplatform kernels.
> 
> This is definitely not for multiplatform kernels or general distros,
> it's for kernels that are built as part of a BSP for a specific board -
> hence the "Say N unless you know you need this.".
> 
> I didn't know it used to exist. But I do know that something like this
> is carried out-of-tree for lots of boards, so I thought I'd try to get
> upstream support.

Sorry, it does still exist, just been moved around a bit.

See lib/inflate.c:

STATIC int INIT inflate(void)
{
...
#ifdef ARCH_HAS_DECOMP_WDOG
arch_decomp_wdog();
#endif

Given that it still exists, maybe this hook name should be used for
this same issue in the LZ4 code?
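
For illustration only, this is roughly what reusing that hook in another
decompressor's main loop could look like.  decompress_blocks() and the
wdog_kicks counter are made up so the sketch builds outside the kernel; only
ARCH_HAS_DECOMP_WDOG and arch_decomp_wdog() come from the lib/inflate.c
snippet above.

```c
/* A platform wanting watchdog servicing would define this and supply
 * arch_decomp_wdog(); everything else compiles the hook away. */
#define ARCH_HAS_DECOMP_WDOG

/* Stand-in so the effect is observable in a userspace build. */
static unsigned int wdog_kicks;

static void arch_decomp_wdog(void)
{
	wdog_kicks++;	/* a real hook would poke the watchdog hardware */
}

/* Hypothetical stand-in for a decompressor's main loop. */
static void decompress_blocks(unsigned int nblocks)
{
	unsigned int i;

	for (i = 0; i < nblocks; i++) {
		/* ... decode one block here ... */
#ifdef ARCH_HAS_DECOMP_WDOG
		arch_decomp_wdog();
#endif
	}
}
```

How often the hook gets called then depends on the loop granularity, which
would need checking against the shortest watchdog timeout people care about.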

> The first two patches, or something like them, would be nice on their
> own, as that would minimize the conflicts when forward-porting the
> board-specific patch. But such a half-implemented feature that requires
> out-of-tree patches to actually do anything useful of course won't fly.
> 
> I'm not really a big fan of the third patch, even though it does work
> for all the cases I've encountered so far - I'm sure there would be
> boards where a much more complicated solution would be needed. Another
> method I thought of was to just supply a __weak no-op
> decompress_keepalive(), and then have a config option to point at an
> extra object file to link into the compressed/vmlinux (similar to the
> initramfs_source option that also points to some external resource).
> 
> But if the mainline kernel doesn't want anything like this
> re-introduced, that's also fine, that just means a bit of job security.

Well, we'll see whether the arm-soc people have anything to add...



Re: [RFC PATCH 0/3] watchdog servicing during decompression

2019-10-17 Thread Russell King - ARM Linux admin
We used to have this on ARM - it was called from the decompressor
code via an arch_decomp_wdog() hook.

That code got removed because it is entirely unsuitable for a multi-
platform kernel.  This looks like it takes an address for the watchdog
from the Kconfig, and builds that into the decompressor, making the
decompressor specific to that board or platform.

I'm not sure distros are going to like that given where we are with
multiplatform kernels.

On Thu, Oct 17, 2019 at 01:49:03PM +0200, Rasmus Villemoes wrote:
> Many custom boards have an always-running external watchdog
> circuit. When the timeout of that watchdog is small, one cannot boot a
> compressed kernel since the board gets reset before it even starts
> booting the kernel proper.
> 
> One way around that is to do the decompression in a bootloader which
> knows how to service the watchdog. However, one reason to prefer using
> the kernel's own decompressor is to be able to take advantage of
> future compression enhancements (say, a faster implementation of the
> current method, or switching over when a new method such as zstd is
> invented) - often, the bootloader cannot be updated without physical
> access or is locked down for other reasons, so the decompressor has to
> be bundled with the kernel image for that to be possible.
> 
> This POC adds a linux/decompress/keepalive.h header which provides a
> decompress_keepalive() macro. Wiring up any given decompressor just
> amounts to including that header and adding decompress_keepalive() in
> the main loop - for simplicity, this series just does it for lz4.
> 
> The actual decompress_keepalive() implementation is of course very
> board-specific. The third patch adds a kconfig knob that handles a
> common case (and in fact suffices for all the various boards I've come
> across): An external watchdog serviced by toggling a gpio, with the
> value of that gpio being settable in a memory-mapped register.
> 
> Rasmus Villemoes (3):
>   decompress/keepalive.h: prepare for watchdog keepalive during kernel
> decompression
>   lib: lz4: wire up watchdog keepalive during decompression
>   decompress/keepalive.h: add config option for toggling a set of bits
> 
>  include/linux/decompress/keepalive.h | 22 +++
>  init/Kconfig | 33 
>  lib/lz4/lz4_decompress.c |  2 ++
>  3 files changed, 57 insertions(+)
>  create mode 100644 include/linux/decompress/keepalive.h
> 
> -- 
> 2.20.1
> 
> 



Re: [PATCH 3/6] PCI: mobiveil: Add PCIe Gen4 EP driver for NXP Layerscape SoCs

2019-10-15 Thread Russell King - ARM Linux admin
On Tue, Oct 15, 2019 at 09:14:00AM +, Xiaowei Bao wrote:
> > -Original Message-
> > From: Russell King - ARM Linux admin 
> > Sent: 2019年10月15日 17:08
> > To: Xiaowei Bao 
> > Cc: Z.q. Hou ; bhelg...@google.com;
> > robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li
> > ; kis...@ti.com; lorenzo.pieral...@arm.com; M.h. Lian
> > ; andrew.mur...@arm.com; Mingkai Hu
> > ; linux-...@vger.kernel.org;
> > linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org;
> > linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 3/6] PCI: mobiveil: Add PCIe Gen4 EP driver for NXP
> > Layerscape SoCs
> > 
> > On Tue, Oct 15, 2019 at 07:46:12AM +, Xiaowei Bao wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: Russell King - ARM Linux admin 
> > > > Sent: 2019年9月25日 0:39
> > > > To: Xiaowei Bao 
> > > > Cc: Z.q. Hou ; bhelg...@google.com;
> > > > robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li
> > > > ; kis...@ti.com; lorenzo.pieral...@arm.com; M.h.
> > > > Lian ; andrew.mur...@arm.com; Mingkai Hu
> > > > ; linux-...@vger.kernel.org;
> > > > linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org;
> > > > linux-kernel@vger.kernel.org
> > > > Subject: Re: [PATCH 3/6] PCI: mobiveil: Add PCIe Gen4 EP driver for
> > > > NXP Layerscape SoCs
> > > >
> > > > On Mon, Sep 16, 2019 at 10:17:39AM +0800, Xiaowei Bao wrote:
> > > > > > This PCIe controller is based on the Mobiveil GPEX IP; it works in
> > > > > > EP mode if this config option is selected.
> > > > >
> > > > > Signed-off-by: Xiaowei Bao 
> > > > > ---
> > > > > >  MAINTAINERS|   2 +
> > > > >  drivers/pci/controller/mobiveil/Kconfig|  17 ++-
> > > > >  drivers/pci/controller/mobiveil/Makefile   |   1 +
> > > > >  .../controller/mobiveil/pcie-layerscape-gen4-ep.c  | 156
> > > > > +
> > > > >  4 files changed, 173 insertions(+), 3 deletions(-)  create mode
> > > > > 100644 drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> > > > >
> > > > > diff --git a/MAINTAINERS b/MAINTAINERS index b997056..0858b54
> > > > > 100644
> > > > > --- a/MAINTAINERS
> > > > > +++ b/MAINTAINERS
> > > > > @@ -12363,11 +12363,13 @@ F:
> > > > drivers/pci/controller/dwc/*layerscape*
> > > > >
> > > > >  PCI DRIVER FOR NXP LAYERSCAPE GEN4 CONTROLLER
> > > > >  M:   Hou Zhiqiang 
> > > > > +M:   Xiaowei Bao 
> > > > >  L:   linux-...@vger.kernel.org
> > > > >  L:   linux-arm-ker...@lists.infradead.org
> > > > >  S:   Maintained
> > > > >  F:   Documentation/devicetree/bindings/pci/layerscape-pcie-gen4.txt
> > > > >  F:   drivers/pci/controller/mobibeil/pcie-layerscape-gen4.c
> > > > > +F:   drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> > > > >
> > > > >  PCI DRIVER FOR GENERIC OF HOSTS
> > > > >  M:   Will Deacon 
> > > > > diff --git a/drivers/pci/controller/mobiveil/Kconfig
> > > > > b/drivers/pci/controller/mobiveil/Kconfig
> > > > > index 2054950..0696b6e 100644
> > > > > --- a/drivers/pci/controller/mobiveil/Kconfig
> > > > > +++ b/drivers/pci/controller/mobiveil/Kconfig
> > > > > @@ -27,13 +27,24 @@ config PCIE_MOBIVEIL_PLAT
> > > > > for address translation and it is a PCIe Gen4 IP.
> > > > >
> > > > >  config PCIE_LAYERSCAPE_GEN4
> > > > > - bool "Freescale Layerscape PCIe Gen4 controller"
> > > > > + bool "Freescale Layerscpe PCIe Gen4 controller in RC mode"
> > > > >   depends on PCI
> > > > >   depends on OF && (ARM64 || ARCH_LAYERSCAPE)
> > > > >   depends on PCI_MSI_IRQ_DOMAIN
> > > > >   select PCIE_MOBIVEIL_HOST
> > > > >   help
> > > > > Say Y here if you want PCIe Gen4 controller support on
> > > > > -   Layerscape SoCs. The PCIe controller can work in RC or
> > > > > -   EP mode according to RCW[HOST_AGT_PEX] setting.
> > > > >

Re: [PATCH 3/6] PCI: mobiveil: Add PCIe Gen4 EP driver for NXP Layerscape SoCs

2019-10-15 Thread Russell King - ARM Linux admin
On Tue, Oct 15, 2019 at 07:46:12AM +, Xiaowei Bao wrote:
> 
> 
> > -Original Message-
> > From: Russell King - ARM Linux admin 
> > Sent: 2019年9月25日 0:39
> > To: Xiaowei Bao 
> > Cc: Z.q. Hou ; bhelg...@google.com;
> > robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li
> > ; kis...@ti.com; lorenzo.pieral...@arm.com; M.h. Lian
> > ; andrew.mur...@arm.com; Mingkai Hu
> > ; linux-...@vger.kernel.org;
> > linux-arm-ker...@lists.infradead.org; devicet...@vger.kernel.org;
> > linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 3/6] PCI: mobiveil: Add PCIe Gen4 EP driver for NXP
> > Layerscape SoCs
> > 
> > On Mon, Sep 16, 2019 at 10:17:39AM +0800, Xiaowei Bao wrote:
> > > This PCIe controller is based on the Mobiveil GPEX IP; it works in EP
> > > mode if this config option is selected.
> > >
> > > Signed-off-by: Xiaowei Bao 
> > > ---
> > >  MAINTAINERS|   2 +
> > >  drivers/pci/controller/mobiveil/Kconfig|  17 ++-
> > >  drivers/pci/controller/mobiveil/Makefile   |   1 +
> > >  .../controller/mobiveil/pcie-layerscape-gen4-ep.c  | 156
> > > +
> > >  4 files changed, 173 insertions(+), 3 deletions(-)  create mode
> > > 100644 drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> > >
> > > diff --git a/MAINTAINERS b/MAINTAINERS index b997056..0858b54 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -12363,11 +12363,13 @@ F:
> > drivers/pci/controller/dwc/*layerscape*
> > >
> > >  PCI DRIVER FOR NXP LAYERSCAPE GEN4 CONTROLLER
> > >  M:   Hou Zhiqiang 
> > > +M:   Xiaowei Bao 
> > >  L:   linux-...@vger.kernel.org
> > >  L:   linux-arm-ker...@lists.infradead.org
> > >  S:   Maintained
> > >  F:   Documentation/devicetree/bindings/pci/layerscape-pcie-gen4.txt
> > >  F:   drivers/pci/controller/mobibeil/pcie-layerscape-gen4.c
> > > +F:   drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> > >
> > >  PCI DRIVER FOR GENERIC OF HOSTS
> > >  M:   Will Deacon 
> > > diff --git a/drivers/pci/controller/mobiveil/Kconfig
> > > b/drivers/pci/controller/mobiveil/Kconfig
> > > index 2054950..0696b6e 100644
> > > --- a/drivers/pci/controller/mobiveil/Kconfig
> > > +++ b/drivers/pci/controller/mobiveil/Kconfig
> > > @@ -27,13 +27,24 @@ config PCIE_MOBIVEIL_PLAT
> > > for address translation and it is a PCIe Gen4 IP.
> > >
> > >  config PCIE_LAYERSCAPE_GEN4
> > > - bool "Freescale Layerscape PCIe Gen4 controller"
> > > + bool "Freescale Layerscpe PCIe Gen4 controller in RC mode"
> > >   depends on PCI
> > >   depends on OF && (ARM64 || ARCH_LAYERSCAPE)
> > >   depends on PCI_MSI_IRQ_DOMAIN
> > >   select PCIE_MOBIVEIL_HOST
> > >   help
> > > Say Y here if you want PCIe Gen4 controller support on
> > > -   Layerscape SoCs. The PCIe controller can work in RC or
> > > -   EP mode according to RCW[HOST_AGT_PEX] setting.
> > > +   Layerscape SoCs. And the PCIe controller work in RC mode
> > > +   by setting the RCW[HOST_AGT_PEX] to 0.
> > > +
> > > +config PCIE_LAYERSCAPE_GEN4_EP
> > > + bool "Freescale Layerscpe PCIe Gen4 controller in EP mode"
> > > + depends on PCI
> > > + depends on OF && (ARM64 || ARCH_LAYERSCAPE)
> > > + depends on PCI_ENDPOINT
> > > + select PCIE_MOBIVEIL_EP
> > > + help
> > > +   Say Y here if you want PCIe Gen4 controller support on
> > > +   Layerscape SoCs. And the PCIe controller work in EP mode
> > > +   by setting the RCW[HOST_AGT_PEX] to 1.
> > >  endmenu
> > > diff --git a/drivers/pci/controller/mobiveil/Makefile
> > > b/drivers/pci/controller/mobiveil/Makefile
> > > index 686d41f..6f54856 100644
> > > --- a/drivers/pci/controller/mobiveil/Makefile
> > > +++ b/drivers/pci/controller/mobiveil/Makefile
> > > @@ -4,3 +4,4 @@ obj-$(CONFIG_PCIE_MOBIVEIL_HOST) +=
> > > pcie-mobiveil-host.o
> > >  obj-$(CONFIG_PCIE_MOBIVEIL_EP) += pcie-mobiveil-ep.o
> > >  obj-$(CONFIG_PCIE_MOBIVEIL_PLAT) += pcie-mobiveil-plat.o
> > >  obj-$(CONFIG_PCIE_LAYERSCAPE_GEN4) += pcie-layerscape-gen4.o
> > > +obj-$(CONFIG_PCIE_LAYERSCAPE_GEN4_EP) +=
> > pcie-layerscape-gen4-ep.o
> > > diff --git a/drivers/pci/controller/m

[no subject]

2019-10-14 Thread linux-kernel
Hello! Are you interested in client databases?



Re: [PATCH v2 1/3] net: phylink: switch to using fwnode_gpiod_get_index()

2019-10-14 Thread Russell King - ARM Linux admin
On Mon, Oct 14, 2019 at 10:40:20AM -0700, Dmitry Torokhov wrote:
> Instead of fwnode_get_named_gpiod() that I plan to hide away, let's use
> the new fwnode_gpiod_get_index() that mimics gpiod_get_index(), but
> works with arbitrary firmware node.
> 
> Reviewed-by: Andy Shevchenko 
> Acked-by: David S. Miller 

Acked-by: Russell King 

> Signed-off-by: Dmitry Torokhov 
> ---
> 
>  drivers/net/phy/phylink.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index a5a57ca94c1a..c34ca644d47e 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -168,8 +168,8 @@ static int phylink_parse_fixedlink(struct phylink *pl,
>   pl->link_config.pause |= MLO_PAUSE_ASYM;
>  
>   if (ret == 0) {
> - desc = fwnode_get_named_gpiod(fixed_node, "link-gpios",
> -   0, GPIOD_IN, "?");
> + desc = fwnode_gpiod_get_index(fixed_node, "link", 0,
> +   GPIOD_IN, "?");
>  
>   if (!IS_ERR(desc))
>   pl->link_gpio = desc;
> -- 
> 2.23.0.700.g56cf767bdb-goog
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


[no subject]

2019-10-13 Thread linux-kernel
Hello! Are you interested in client databases?



Re: [PATCH 3/3] arm64: configs: unset CPU_BIG_ENDIAN

2019-10-12 Thread Russell King - ARM Linux admin
On Sat, Oct 12, 2019 at 12:47:45AM +0200, Arnd Bergmann wrote:
> On Fri, Oct 11, 2019 at 12:33 PM Russell King - ARM Linux admin
>  wrote:
> > 32-bit ARM experience is that telco class users really like big
> > endian.
> 
> Right, basically anyone with a large code base migrated over from a
> big-endian MIPS or PowerPC legacy that found it cheaper to change
> the rest of the world than to fix their own code.

I think you need to step off your soap box!  Big endian isn't going
away, and it likely has nothing to do with code bases.  Just look at
networking and telco protocols.  Everything in that world tends to
be big endian.  BE is what is understood in that world, and there's
little we can do to change it.

Demanding that they switch to LE is tantamount to you demanding that
their entire world change - it ain't going to happen.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] compiler: enable CONFIG_OPTIMIZE_INLINING forcibly

2019-10-12 Thread Russell King - ARM Linux admin
On Sat, Oct 12, 2019 at 12:15:42PM +0200, Stefan Wahren wrote:
> Hi,
> 
> On 03.10.19 at 18:36, Will Deacon wrote:
> > On Wed, Oct 02, 2019 at 01:39:40PM -0700, Linus Torvalds wrote:
> >> On Wed, Oct 2, 2019 at 5:56 AM Geert Uytterhoeven  
> >> wrote:
> >>>> Then use the C preprocessor to force the inlining.  I'm sorry it's not
> >>>> as pretty as static inline functions.
> >>> Which makes us lose the baby^H^H^H^Htype checking performed
> >>> on function parameters, requiring to add more ugly checks.
> >> I'm 100% agreed on this.
> >>
> >> If the inline change is being pushed by people who say "you should
> >> have used macros instead if you wanted inlining", then I will just
> >> revert that stupid commit that is causing problems.
> >>
> >> No, the preprocessor is not the answer.
> >>
> >> That said, code that relies on inlining for _correctness_ should use
> >> "__always_inline" and possibly even have a comment about why.
> >>
> >> But I am considering just undoing commit 9012d011660e ("compiler:
> >> allow all arches to enable CONFIG_OPTIMIZE_INLINING") entirely. The
> >> advantages are questionable, and when the advantages are balanced
> >> against actual regressions and the arguments are "use macros", that
> >> just shows how badly thought out this was.
> > It's clear that opinions are divided on this issue, but you can add
> > an enthusiastic:
> >
> > Acked-by: Will Deacon 
> >
> > if you go ahead with the revert. I'm all for allowing the compiler to
> > make its own inlining decisions, but not when the potential for
> > miscompilation isn't fully understood and the proposed alternatives turn
> > the source into an unreadable mess. Perhaps we can do something different
> > for 5.5 (arch opt-in? clang only? invert the logic? work to move functions
> > over to __always_inline /before/ flipping the CONFIG option? ...?)
> 
> what's the status on this?
> 
> I need to prepare my pull requests for 5.5 and all recent kernelci
> targets (including linux-next) with bcm2835_defconfig are still broken.

I merged the patches late on Thursday; that may have been too late for
linux-next to pick them up - and because of the time difference between
UK and Australia, it means they won't be in linux-next until next week
(basically, tomorrow).  linux-next is basically a Sunday to Thursday
operation from my point of view.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 3/3] arm64: configs: unset CPU_BIG_ENDIAN

2019-10-11 Thread Russell King - ARM Linux admin
On Fri, Oct 11, 2019 at 11:27:48AM +0100, Will Deacon wrote:
> Does anybody use BIG_ENDIAN? If we're not even building it then maybe we
> should get rid of it altogether on arm64. I don't know of any supported
> userspace that supports it or any CPUs that are unable to run little-endian
> binaries.

32-bit ARM experience is that telco class users really like big
endian.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 34/36] ARM: s3c: stop including mach/hardware.h from mach/io.h

2019-10-10 Thread Russell King - ARM Linux admin
>  #include 
>  
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> diff --git a/arch/arm/mach-s3c24xx/s3c244x.c b/arch/arm/mach-s3c24xx/s3c244x.c
> index f5bd489bac85..0ca188d0ffe5 100644
> --- a/arch/arm/mach-s3c24xx/s3c244x.c
> +++ b/arch/arm/mach-s3c24xx/s3c244x.c
> @@ -25,7 +25,7 @@
>  #include 
>  #include 
>  
> -#include 
> +#include 
>  #include 
>  
>  #include 
> diff --git a/arch/arm/mach-s3c24xx/sleep-s3c2410.S 
> b/arch/arm/mach-s3c24xx/sleep-s3c2410.S
> index 659f9eff9de2..e4f6f64e7826 100644
> --- a/arch/arm/mach-s3c24xx/sleep-s3c2410.S
> +++ b/arch/arm/mach-s3c24xx/sleep-s3c2410.S
> @@ -13,7 +13,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #include 
> diff --git a/arch/arm/mach-s3c24xx/sleep-s3c2412.S 
> b/arch/arm/mach-s3c24xx/sleep-s3c2412.S
> index c373f1ca862b..434f5082b2ed 100644
> --- a/arch/arm/mach-s3c24xx/sleep-s3c2412.S
> +++ b/arch/arm/mach-s3c24xx/sleep-s3c2412.S
> @@ -8,7 +8,6 @@
>  
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #include 
> diff --git a/arch/arm/mach-s3c24xx/sleep.S b/arch/arm/mach-s3c24xx/sleep.S
> index f0f11ad60c52..4bda4a413584 100644
> --- a/arch/arm/mach-s3c24xx/sleep.S
> +++ b/arch/arm/mach-s3c24xx/sleep.S
> @@ -13,7 +13,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  
>  #include 
> diff --git a/drivers/spi/spi-s3c24xx-regs.h b/drivers/spi/spi-s3c24xx-regs.h
> index 37b93ff7c7fe..b76d591eba8c 100644
> --- a/drivers/spi/spi-s3c24xx-regs.h
> +++ b/drivers/spi/spi-s3c24xx-regs.h
> @@ -8,6 +8,8 @@
>  #ifndef __ASM_ARCH_REGS_SPI_H
>  #define __ASM_ARCH_REGS_SPI_H
>  
> +#include <mach/map.h>
> +

If this is outside of arch/arm, it shouldn't need anything from
mach/map.h - mach/map.h is not for driver use.

>  #define S3C2410_SPCON (0x00)
>  
>  #define S3C2410_SPCON_SMOD_DMA   (2 << 5) /* DMA mode */
> diff --git a/drivers/usb/gadget/udc/s3c2410_udc_regs.h 
> b/drivers/usb/gadget/udc/s3c2410_udc_regs.h
> index d8d2eeaca088..4df279342cdd 100644
> --- a/drivers/usb/gadget/udc/s3c2410_udc_regs.h
> +++ b/drivers/usb/gadget/udc/s3c2410_udc_regs.h
> @@ -6,6 +6,8 @@
>  #ifndef __ASM_ARCH_REGS_UDC_H
>  #define __ASM_ARCH_REGS_UDC_H
>  
> +#include <mach/map.h>
> +

If this is outside of arch/arm, it shouldn't need anything from
mach/map.h - mach/map.h is not for driver use.

>  #define S3C2410_USBDREG(x) (x)
>  
>  #define S3C2410_UDC_FUNC_ADDR_REG S3C2410_USBDREG(0x0140)
> -- 
> 2.20.0
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 5.3 076/148] mmc: sdhci-of-esdhc: set DMA snooping based on DMA coherence

2019-10-10 Thread Russell King - ARM Linux admin
Hi Greg,

On 5th October, Christian Zigotzky  reported a
problem with this on PowerPC (at a guess, it looks like there's a
PowerPC user of this where the DT does not mark the device as
dma-coherent, but the hardware requires it to be DMA coherent.)

However, despite sending a reply to him within minutes of his email
arriving, I've heard nothing since, so there's been no progress on
working out what is really going on.

Given that the reporter hasn't responded to my reply, I'm not sure
what we should be doing with this... maybe the reporter has solved
his problem, maybe he was using an incorrect DT, we just don't know.

On Thu, Oct 10, 2019 at 10:35:37AM +0200, Greg Kroah-Hartman wrote:
> From: Russell King 
> 
> commit 121bd08b029e03404c451bb237729cdff76eafed upstream.
> 
> We must not unconditionally set the DMA snoop bit; if the DMA API is
> assuming that the device is not DMA coherent, and the device snoops the
> CPU caches, the device can see stale cache lines brought in by
> speculative prefetch.
> 
> This leads to the device seeing stale data, potentially resulting in
> corrupted data transfers.  Commonly, this results in a descriptor fetch
> error such as:
> 
> mmc0: ADMA error
> mmc0: sdhci:  SDHCI REGISTER DUMP ===
> mmc0: sdhci: Sys addr:  0x | Version:  0x2202
> mmc0: sdhci: Blk size:  0x0008 | Blk cnt:  0x0001
> mmc0: sdhci: Argument:  0x | Trn mode: 0x0013
> mmc0: sdhci: Present:   0x01f50008 | Host ctl: 0x0038
> mmc0: sdhci: Power: 0x0003 | Blk gap:  0x
> mmc0: sdhci: Wake-up:   0x | Clock:0x40d8
> mmc0: sdhci: Timeout:   0x0003 | Int stat: 0x0001
> mmc0: sdhci: Int enab:  0x037f108f | Sig enab: 0x037f108b
> mmc0: sdhci: ACmd stat: 0x | Slot int: 0x2202
> mmc0: sdhci: Caps:  0x35fa | Caps_1:   0xaf00
> mmc0: sdhci: Cmd:   0x333a | Max curr: 0x
> mmc0: sdhci: Resp[0]:   0x0920 | Resp[1]:  0x001d8a33
> mmc0: sdhci: Resp[2]:   0x325b5900 | Resp[3]:  0x3f400e00
> mmc0: sdhci: Host ctl2: 0x
> mmc0: sdhci: ADMA Err:  0x0009 | ADMA Ptr: 0x00236d43820c
> mmc0: sdhci: 
> mmc0: error -5 whilst initialising SD card
> 
> but can lead to other errors, and potentially direct the SDHCI
> controller to read/write data to other memory locations (e.g. if a valid
> descriptor is visible to the device in a stale cache line.)
> 
> Fix this by ensuring that the DMA snoop bit corresponds with the
> behaviour of the DMA API.  Since the driver currently only supports DT,
> use of_dma_is_coherent().  Note that device_get_dma_attr() can not be
> used as that risks re-introducing this bug if/when the driver is
> converted to ACPI.
> 
> Signed-off-by: Russell King 
> Acked-by: Adrian Hunter 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Ulf Hansson 
> Signed-off-by: Greg Kroah-Hartman 
> 
> ---
>  drivers/mmc/host/sdhci-of-esdhc.c |7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> --- a/drivers/mmc/host/sdhci-of-esdhc.c
> +++ b/drivers/mmc/host/sdhci-of-esdhc.c
> @@ -495,7 +495,12 @@ static int esdhc_of_enable_dma(struct sd
>   dma_set_mask_and_coherent(dev, DMA_BIT_MASK(40));
>  
>   value = sdhci_readl(host, ESDHC_DMA_SYSCTL);
> - value |= ESDHC_DMA_SNOOP;
> +
> + if (of_dma_is_coherent(dev->of_node))
> + value |= ESDHC_DMA_SNOOP;
> + else
> + value &= ~ESDHC_DMA_SNOOP;
> +
>   sdhci_writel(host, value, ESDHC_DMA_SYSCTL);
>   return 0;
>  }
> 
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v6] gpio/mpc8xxx: change irq handler from chained to normal

2019-10-09 Thread Russell King - ARM Linux admin
On Wed, Oct 09, 2019 at 04:30:21PM +0800, Hui Song wrote:
> From: Song Hui 
> 
> More than one gpio controllers can share one interrupt, change the
> driver to request shared irq.
> 
> While this will work, it will mess up userspace accounting of the number
> of interrupts per second in tools such as vmstat.  The reason is that
> for every GPIO interrupt, /proc/interrupts records the count against GIC
> interrupt 68 or 69, as well as the GPIO itself.  So, for every GPIO
> interrupt, the total number of interrupts that the system has seen
> increments by two
> 
> Signed-off-by: Laurentiu Tudor 
> Signed-off-by: Alex Marginean 
> Signed-off-by: Song Hui 
> ---
>  Changes in v6:
>   - change request_irq to devm_request_irq and add commit message.
>  Changes in v5:
>   - add traverse every bit function.
>  Changes in v4:
>   - convert 'pr_err' to 'dev_err'.
>  Changes in v3:
>   - update the patch description.
>  Changes in v2:
>   - delete the compatible of ls1088a.
> 
>  drivers/gpio/gpio-mpc8xxx.c | 31 ---
>  1 file changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpio/gpio-mpc8xxx.c b/drivers/gpio/gpio-mpc8xxx.c
> index 16a47de..f0be284 100644
> --- a/drivers/gpio/gpio-mpc8xxx.c
> +++ b/drivers/gpio/gpio-mpc8xxx.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define MPC8XXX_GPIO_PINS 32
>  
> @@ -127,20 +128,20 @@ static int mpc8xxx_gpio_to_irq(struct gpio_chip *gc, 
> unsigned offset)
>   return -ENXIO;
>  }
>  
> -static void mpc8xxx_gpio_irq_cascade(struct irq_desc *desc)
> +static irqreturn_t mpc8xxx_gpio_irq_cascade(int irq, void *data)
>  {
> - struct mpc8xxx_gpio_chip *mpc8xxx_gc = irq_desc_get_handler_data(desc);
> - struct irq_chip *chip = irq_desc_get_chip(desc);
> + struct mpc8xxx_gpio_chip *mpc8xxx_gc = data;
>   struct gpio_chip *gc = &mpc8xxx_gc->gc;
>   unsigned int mask;

This needs to be "unsigned long mask;" for for_each_set_bit() not to
complain.
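
As a rough illustration of why the type matters: the kernel's bit iterators
take a pointer to unsigned long, so the mask has to be declared that way
before its address is passed in.  The userspace sketch below is not the
driver code; next_set_bit() and pending_hwirqs() are hypothetical stand-ins
for for_each_set_bit() and the cascade loop, and only the "31 - i" mapping
is taken from the patch.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in one unsigned long word on this machine. */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical userspace stand-in for the kernel's bit search helper:
 * return the index of the first set bit at or after 'start', or 'size'
 * if there is none.  Note the unsigned long pointer argument.
 */
unsigned int next_set_bit(const unsigned long *addr, unsigned int size,
                          unsigned int start)
{
	unsigned int i;

	for (i = start; i < size; i++)
		if (addr[i / BITS_PER_ULONG] & (1UL << (i % BITS_PER_ULONG)))
			return i;
	return size;
}

/*
 * Walk every pending bit of a 32-bit register value held in an
 * unsigned long and record the hardware IRQ number for each one.
 * The "31 - i" mapping mirrors the quoted patch.
 */
size_t pending_hwirqs(unsigned long mask, unsigned int *out, size_t max)
{
	size_t n = 0;
	unsigned int i;

	for (i = next_set_bit(&mask, 32, 0); i < 32 && n < max;
	     i = next_set_bit(&mask, 32, i + 1))
		out[n++] = 31 - i;
	return n;
}
```

With the mask declared as unsigned int instead, taking its address for a
helper like next_set_bit() would be a pointer type mismatch, which is the
complaint being referred to.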

> + int i;
>  
>   mask = gc->read_reg(mpc8xxx_gc->regs + GPIO_IER)
>   & gc->read_reg(mpc8xxx_gc->regs + GPIO_IMR);
> - if (mask)
> + for_each_set_bit(i, &mask, 32)
>   generic_handle_irq(irq_linear_revmap(mpc8xxx_gc->irq,
> -  32 - ffs(mask)));
> - if (chip->irq_eoi)
> - chip->irq_eoi(&desc->irq_data);
> +  31 - i));
> +
> + return IRQ_HANDLED;
>  }
>  
>  static void mpc8xxx_irq_unmask(struct irq_data *d)
> @@ -388,8 +389,8 @@ static int mpc8xxx_probe(struct platform_device *pdev)
>  
>   ret = gpiochip_add_data(gc, mpc8xxx_gc);
>   if (ret) {
> - pr_err("%pOF: GPIO chip registration failed with status %d\n",
> -np, ret);
> + dev_err(&pdev->dev, "%pOF: GPIO chip registration failed with status %d\n",
> + np, ret);
>   goto err;
>   }
>  
> @@ -409,8 +410,16 @@ static int mpc8xxx_probe(struct platform_device *pdev)
>   if (devtype->gpio_dir_in_init)
>   devtype->gpio_dir_in_init(gc);
>  
> - irq_set_chained_handler_and_data(mpc8xxx_gc->irqn,
> -  mpc8xxx_gpio_irq_cascade, mpc8xxx_gc);
> + ret = devm_request_irq(&pdev->dev, mpc8xxx_gc->irqn,
> +mpc8xxx_gpio_irq_cascade,
> +IRQF_NO_THREAD | IRQF_SHARED, "gpio-cascade",
> +    mpc8xxx_gc);
> + if (ret) {
> + dev_err(&pdev->dev, "%s: failed to devm_request_irq(%d), ret = %d\n",
> + np->full_name, mpc8xxx_gc->irqn, ret);
> + goto err;
> + }
> +
>   return 0;
>  err:
>   iounmap(mpc8xxx_gc->regs);
> -- 
> 2.9.5
> 
> 
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 -next] ASoc: tas2770: Fix build error without GPIOLIB

2019-10-06 Thread mirq-linux
On Sun, Oct 06, 2019 at 06:46:31PM +0800, YueHaibing wrote:
> If GPIOLIB is not set, building fails:
> 
> sound/soc/codecs/tas2770.c: In function tas2770_reset:
> sound/soc/codecs/tas2770.c:38:3: error: implicit declaration of function 
> gpiod_set_value_cansleep; did you mean gpio_set_value_cansleep? 
> [-Werror=implicit-function-declaration]
>gpiod_set_value_cansleep(tas2770->reset_gpio, 0);
>^~~~
>gpio_set_value_cansleep
> sound/soc/codecs/tas2770.c: In function tas2770_i2c_probe:
> sound/soc/codecs/tas2770.c:749:24: error: implicit declaration of function 
> devm_gpiod_get_optional; did you mean devm_regulator_get_optional? 
> [-Werror=implicit-function-declaration]
>   tas2770->reset_gpio = devm_gpiod_get_optional(tas2770->dev,
> ^~~
> devm_regulator_get_optional
> sound/soc/codecs/tas2770.c:751:13: error: GPIOD_OUT_HIGH undeclared (first 
> use in this function); did you mean GPIOF_INIT_HIGH?
>  GPIOD_OUT_HIGH);
>  ^~
>  GPIOF_INIT_HIGH
> 
> Reported-by: Hulk Robot 
> Fixes: 1a476abc723e ("tas2770: add tas2770 smart PA kernel driver")
> Suggested-by: Ladislav Michl 
> Signed-off-by: YueHaibing 
> ---
> v2: Add missing include file
> ---
>  sound/soc/codecs/tas2770.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/sound/soc/codecs/tas2770.c b/sound/soc/codecs/tas2770.c
> index 9da88cc..a36d0d7 100644
> --- a/sound/soc/codecs/tas2770.c
> +++ b/sound/soc/codecs/tas2770.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 

The Kconfig part is missing - is this intended? If I guess correctly,
the driver won't work without GPIOLIB, so it should either
'select GPIOLIB' or 'depends on GPIOLIB || COMPILE_TEST' or even
'select GPIOLIB if !COMPILE_TEST'.

Best Regards,
Michał Mirosław


Re: MAP_FIXED_NOREPLACE appears to break older i386 binaries

2019-10-06 Thread Russell King - ARM Linux admin
On Sat, Oct 05, 2019 at 05:18:05PM -0700, Linus Torvalds wrote:
> Duh.
> 
> I only looked at recent issues in this area, and overlooked your
> sentence in between the two ELF section dumps, and it appears that you
> have already bisected it to something else:

I hadn't - I'd looked at the changes and identified a likely culprit
that fit with the symptoms and the layout of the binary.

What I'm basically trying to do is update my laptop - it was running
an x86_64 4.5.7 kernel but with 32-bit userland.  I've just installed
into a separate partition Debian Stable with the view to seeing
whether I like it, which means migrating stuff over - and I hit a
problem with the newer Evolution not wanting to recognise the
configuration/data from the previous version.

So I thought... I can just chroot into the old setup, run up evolution
there, export its configuration, so I can import it into the newer
version without having to go through a reboot cycle.

The chroot and exec of bin/bash in the old setup was successful, as
was dmesg, but useful tools like ls failed with a segfault.

The difference between working binaries and non-working binaries seems
to be whether the r-x and rw- LOAD sections in the ELF program headers
overlap on a page.  Here's bash:

    LOAD off    0x00000000 vaddr 0x08047000 paddr 0x08047000 align 2**12
         filesz 0x000bbb08 memsz 0x000bbb08 flags r-x
    LOAD off    0x000bc000 vaddr 0x08103000 paddr 0x08103000 align 2**12
         filesz 0x00004864 memsz 0x00009648 flags rw-

So, the r-x load covers 0x08047000-0x08102b08, and the following rw-
load covers 0x08103000 onwards - so next page.  dmesg is similar:

    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x00009c64 memsz 0x00009c64 flags r-x
    LOAD off    0x00009e28 vaddr 0x08052e28 paddr 0x08052e28 align 2**12
         filesz 0x000028ce memsz 0x000028ce flags rw-

0x08048000-0x08051c64 vs 0x08052e28 - so next page.  In contrast, ls:

    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x0001d620 memsz 0x0001d620 flags r-x
    LOAD off    0x0001d950 vaddr 0x08065950 paddr 0x08065950 align 2**12
         filesz 0x00000a50 memsz 0x000016e4 flags rw-

0x08048000-0x08065620 vs 0x08065950 - so same page, and fails.

Looking at the commit I referred to, what we end up with is:

- Initially, elf_fixed is MAP_FIXED_NOREPLACE and load_addr_set is false
- elf_brk and elf_bss are initially zero
- The first LOAD requests a mapping for 0x08048000 .. 0x08065fff inclusive
- since this is an executable mapping, we use elf_fixed to set the
  MAP_FIXED* flags, so this mapping is established with
  MAP_FIXED_NOREPLACE.
- load_addr_set is now set to true
- elf_bss is set to vaddr + filesz => 0x08065620
- elf_brk is set to vaddr + memsz => 0x08065620
- Moving on to the second LOAD, this is a mapping starting at 0x08065950
- Since elf_brk > elf_bss is false, we don't take that path through the
  code, which _would_ have set elf_fixed to MAP_FIXED (that's the only
  case which we would do - for the BSS.)
- As load_addr_set is true, we again use elf_fixed to set the
  MAP_FIXED* flags.  elf_fixed is still MAP_FIXED_NOREPLACE, so this
  mapping uses MAP_FIXED_NOREPLACE.
- Since this mapping overlaps the previous mapping, it fails with the
  error mentioned.
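
The page arithmetic behind this can be sketched in userspace.  This is
only an illustration, assuming 4 KiB pages; segments_share_page() is a
hypothetical helper, not something in fs/binfmt_elf.c, and the addresses
to feed it are the LOAD entries quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on i386. */
#define SKETCH_PAGE_SIZE 0x1000u

/* Round an address down to the start of its page. */
uint32_t page_start(uint32_t addr)
{
	return addr & ~(SKETCH_PAGE_SIZE - 1);
}

/* Round an address up to the next page boundary. */
uint32_t page_end(uint32_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper: true if the page-aligned mapping for the second
 * LOAD segment starts inside the page-aligned mapping already made for
 * the first one.  That is the situation in which a second
 * MAP_FIXED_NOREPLACE mapping is refused.
 */
bool segments_share_page(uint32_t first_vaddr, uint32_t first_memsz,
                         uint32_t second_vaddr)
{
	return page_start(second_vaddr) < page_end(first_vaddr + first_memsz);
}
```

For bash (0x08047000, 0x000bbb08, 0x08103000) and dmesg (0x08048000,
0x9c64, 0x08052e28) this returns false; for ls (0x08048000, 0x0001d620,
0x08065950) it returns true, with the second mapping starting at
0x08065000, the address in the kernel message.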

Since the ELF load_binary() method returns -EEXIST, we end up in this
code path in fs/exec.c:

if (retval < 0 && !bprm->mm) {
/* we got to flush_old_exec() and failed after it */
read_unlock(&binfmt_lock);
force_sigsegv(SIGSEGV);
return retval;
}

and the program is killed with a SIGSEGV.

So, from a code inspection point of view, it seems that this is likely
the culprit.

I don't yet have the debian stable system setup enough to build kernels;
that may be today's project, but I'd first like to solve the original
issue (migrating the evolution setup) so I can first see whether it's
going to be worth me continuing, or whether I persist with my existing
setup.

However, I think it _is_ worth highlighting that we seem to have broken
binary compatibility with older i386 userspace with newer kernels.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


MAP_FIXED_NOREPLACE appears to break older i386 binaries

2019-10-05 Thread Russell King - ARM Linux admin
Under a 4.19 kernel (debian stable), I am surprised to find that some
previously working i386 binaries no longer work, whereas others are
fine.  ls, for example, dies with a SEGV, but bash is fine.

Looking at the kernel log reveals:

[13117.361000] 20899 (ls): Uhuuh, elf segment at 08065000 requested but
the memory is mapped already
[13120.367221] 20935 (vdir): Uhuuh, elf segment at 08065000 requested 
but the memory is mapped already
[13122.891253] 20936 (ls): Uhuuh, elf segment at 08065000 requested but
the memory is mapped already
[13137.719143] 20940 (ls): Uhuuh, elf segment at 08065000 requested but
the memory is mapped already
[13139.202469] 20978 (ls): Uhuuh, elf segment at 08065000 requested but
the memory is mapped already
[13158.093533] 21007 (ls): Uhuuh, elf segment at 08065000 requested but
the memory is mapped already
[13221.920939] 21021 (objdump): Uhuuh, elf segment at 080a1000 
requested but the memory is mapped already

Looking at /bin/ls:

Program Header:
    PHDR off    0x00000034 vaddr 0x08048034 paddr 0x08048034 align 2**2
         filesz 0x00000120 memsz 0x00000120 flags r-x
  INTERP off    0x00000154 vaddr 0x08048154 paddr 0x08048154 align 2**0
         filesz 0x00000013 memsz 0x00000013 flags r--
    LOAD off    0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
         filesz 0x0001d620 memsz 0x0001d620 flags r-x
    LOAD off    0x0001d950 vaddr 0x08065950 paddr 0x08065950 align 2**12
         filesz 0x00000a50 memsz 0x000016e4 flags rw-
 DYNAMIC off    0x0001dec4 vaddr 0x08065ec4 paddr 0x08065ec4 align 2**2
         filesz 0x00000100 memsz 0x00000100 flags rw-
    NOTE off    0x00000168 vaddr 0x08048168 paddr 0x08048168 align 2**2
         filesz 0x00000044 memsz 0x00000044 flags r--
EH_FRAME off    0x00018e68 vaddr 0x08060e68 paddr 0x08060e68 align 2**2
         filesz 0x00000774 memsz 0x00000774 flags r--
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rw-
   RELRO off    0x0001d950 vaddr 0x08065950 paddr 0x08065950 align 2**0
         filesz 0x000006b0 memsz 0x000006b0 flags r--

Note that the executable part of ls extends from 0x08048000 for
0x0001d620 bytes in memory and file, which takes that up to
0x08065620.  The rw data section starts at 0x08065950.

Seems we've broken older i386 binaries with commit ad55eac74f20
("elf: enforce MAP_FIXED on overlaying elf segments").  Maybe the
MAP_FIXED_NOREPLACE stuff needs to have an on/off switch?
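
For anyone unfamiliar with the flag, its behaviour can be seen from
userspace.  The following is a minimal sketch, assuming Linux 4.17 or
later for the flag to be honoured; noreplace_refuses_overlap() is a
hypothetical test function, and the fallback #define is the value used on
most architectures, for libc headers that predate the flag.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
/* Assumption: generic Linux value; a few architectures use another. */
#define MAP_FIXED_NOREPLACE 0x100000
#endif

/*
 * Map two pages, then ask for a MAP_FIXED_NOREPLACE mapping over the
 * second of them.
 *
 * Returns  1 if the kernel refused the overlap with EEXIST,
 *          0 if the kernel placed the mapping elsewhere (older kernels
 *            treat the address as a hint and ignore the flag),
 *         -1 on any other outcome.
 */
int noreplace_refuses_overlap(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t len = (size_t)page * 2;
	char *first;
	void *second;
	int saved_errno;
	int result;

	first = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (first == MAP_FAILED)
		return -1;

	errno = 0;
	second = mmap(first + page, (size_t)page, PROT_READ | PROT_WRITE,
		      MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
	saved_errno = errno;

	if (second == MAP_FAILED) {
		result = (saved_errno == EEXIST) ? 1 : -1;
	} else {
		/* Landing on the requested address would mean a replace. */
		result = (second == (void *)(first + page)) ? -1 : 0;
		munmap(second, (size_t)page);
	}

	munmap(first, len);
	return result;
}
```

Plain MAP_FIXED in the same position would silently replace the existing
page, which is what the ELF loader relied upon for these binaries before
that commit.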

Here's the objdump -h output for the same binary:

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .interp       00000013  08048154  08048154  00000154  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.ABI-tag 00000020  08048168  08048168  00000168  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .note.gnu.build-id 00000024  08048188  08048188  00000188  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .gnu.hash     0000003c  080481ac  080481ac  000001ac  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .dynsym       00000840  080481e8  080481e8  000001e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .gnu.liblist  000000c8  08048a28  08048a28  00000a28  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .gnu.version  00000108  08049020  08049020  00001020  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  7 .gnu.version_r 000000c0  08049128  08049128  00001128  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  8 .rel.dyn      00000048  080491e8  080491e8  000011e8  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  9 .rel.plt      00000390  08049230  08049230  00001230  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 10 .init         00000023  080495c0  080495c0  000015c0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 11 .plt          00000730  080495f0  080495f0  000015f0  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 12 .text         00013274  08049d20  08049d20  00001d20  2**4
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 13 .fini         00000014  0805cf94  0805cf94  00014f94  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 14 .rodata       00003ea8  0805cfc0  0805cfc0  00014fc0  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 15 .eh_frame_hdr 00000774  08060e68  08060e68  00018e68  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 16 .eh_frame     0000341c  080615dc  080615dc  000195dc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 17 .dynstr       0000064c  080649f8  080649f8  0001c9f8  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 18 .gnu.conflict 000005dc  08065044  08065044  0001d044  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
 19 .init_array   00000004  08065950  08065950  0001d950

Re: [PATCH] panic: Ensure preemption is disabled during panic()

2019-10-04 Thread Russell King - ARM Linux admin
On Fri, Oct 04, 2019 at 11:11:42AM +0200, Petr Mladek wrote:
> On Thu 2019-10-03 21:56:34, Will Deacon wrote:
> > Hi Kees,
> > 
> > On Wed, Oct 02, 2019 at 01:58:46PM -0700, Kees Cook wrote:
> > > On Wed, Oct 02, 2019 at 01:35:38PM +0100, Will Deacon wrote:
> > > > Calling 'panic()' on a kernel with CONFIG_PREEMPT=y can leave the
> > > > calling CPU in an infinite loop, but with interrupts and preemption
> > > > enabled. From this state, userspace can continue to be scheduled,
> > > > despite the system being "dead" as far as the kernel is concerned. This
> > > > is easily reproducible on arm64 when booting with "nosmp" on the command
> > > > line; a couple of shell scripts print out a periodic "Ping" message
> > > > whilst another triggers a crash by writing to /proc/sysrq-trigger:
> > > > 
> > > >   | sysrq: Trigger a crash
> > > >   | Kernel panic - not syncing: sysrq triggered crash
> > > >   | CPU: 0 PID: 1 Comm: init Not tainted 5.2.15 #1
> > > >   | Hardware name: linux,dummy-virt (DT)
> > > >   | Call trace:
> > > >   |  dump_backtrace+0x0/0x148
> > > >   |  show_stack+0x14/0x20
> > > >   |  dump_stack+0xa0/0xc4
> > > >   |  panic+0x140/0x32c
> > > >   |  sysrq_handle_reboot+0x0/0x20
> > > >   |  __handle_sysrq+0x124/0x190
> > > >   |  write_sysrq_trigger+0x64/0x88
> > > >   |  proc_reg_write+0x60/0xa8
> > > >   |  __vfs_write+0x18/0x40
> > > >   |  vfs_write+0xa4/0x1b8
> > > >   |  ksys_write+0x64/0xf0
> > > >   |  __arm64_sys_write+0x14/0x20
> > > >   |  el0_svc_common.constprop.0+0xb0/0x168
> > > >   |  el0_svc_handler+0x28/0x78
> > > >   |  el0_svc+0x8/0xc
> > > >   | Kernel Offset: disabled
> > > >   | CPU features: 0x0002,24002004
> > > >   | Memory Limit: none
> > > >   | ---[ end Kernel panic - not syncing: sysrq triggered crash ]---
> > > >   |  Ping 2!
> > > >   |  Ping 1!
> > > >   |  Ping 1!
> > > >   |  Ping 2!
> > > > 
> > > > The issue can also be triggered on x86 kernels if CONFIG_SMP=n, otherwise
> > > > local interrupts are disabled in 'smp_send_stop()'.
> > > > 
> > > > Disable preemption in 'panic()' before re-enabling interrupts.
> > > 
> > > Is this perhaps the correct solution for what commit c39ea0b9dd24 ("panic:
> > > avoid the extra noise dmesg") was trying to fix?
> > 
> > Hmm, maybe, although that looks like it's focussed more on irq handling
> > than preemption.
> 
> Exactly, the backtrace mentioned in commit c39ea0b9dd24 ("panic: avoid
> the extra noise dmesg") is printed by wake_up() called from
> wake_up_klogd_work_func(). It is irq_work. Therefore disabling
> preemption would not prevent this.
> 
> 
> > I've deliberately left the irq part alone, since I think
> > having magic sysrq work via the keyboard interrupt is desirable from the
> > panic loop.
> 
> I agree that we should keep sysrq working.
> 
> One pity is that led_panic_blink() in
> drivers/leds/trigger/ledtrig-panic.c uses workqueues:
> 
>   + led_panic_blink()
>     + led_trigger_event()
>       + led_set_brightness()
>         + schedule_work()
> 
> It means that it depends on the scheduler. I guess that it
> does not work in many panic situations. But this patch
> will always block it.
> 
> I agree that it is strange that userspace still works at
> this stage. But does it cause any real problems?

Yes, there are watchdog drivers that continue to pat their watchdog
after the kernel has panic'd.  It makes watchdogs useless (which is
exactly how this problem was discovered.)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v2 00/21] Refine memblock API

2019-10-04 Thread Russell King - ARM Linux admin
On Thu, Oct 03, 2019 at 02:30:10PM +0300, Mike Rapoport wrote:
> On Thu, Oct 03, 2019 at 09:49:14AM +0100, Russell King - ARM Linux admin 
> wrote:
> > On Thu, Oct 03, 2019 at 08:34:52AM +0300, Mike Rapoport wrote:
> > > (trimmed the CC)
> > > 
> > > On Wed, Oct 02, 2019 at 06:14:11AM -0500, Adam Ford wrote:
> > > > On Wed, Oct 2, 2019 at 2:36 AM Mike Rapoport  wrote:
> > > > >
> > > > 
> > > > Before the patch:
> > > > 
> > > > # cat /sys/kernel/debug/memblock/memory
> > > >0: 0x1000..0x8fff
> > > > # cat /sys/kernel/debug/memblock/reserved
> > > >0: 0x10004000..0x10007fff
> > > >   34: 0x2f88..0x3fff
> > > > 
> > > > 
> > > > After the patch:
> > > > # cat /sys/kernel/debug/memblock/memory
> > > >0: 0x1000..0x8fff
> > > > # cat /sys/kernel/debug/memblock/reserved
> > > >0: 0x10004000..0x10007fff
> > > >   36: 0x8000..0x8fff
> > > 
> > > I'm still not convinced that the memblock refactoring didn't uncover an
> > > issue in the etnaviv driver.
> > > 
> > > Why does moving the CMA area from 0x8000 to 0x3000 make it fail?
> > 
> > I think you have that the wrong way round.
> 
> I'm relying on Adam's reports of working and non-working versions.
> According to that etnaviv works when CMA area is at 0x8000 and does not
> work when it is at 0x3000.
> 
> He also sent logs a few days ago [1], they also confirm that.
> 
> [1] 
> https://lore.kernel.org/linux-mm/CAHCN7xJEvS2Si=M+BYtz+kY0M4NxmqDjiX9Nwq6_3GGBh3yg=w...@mail.gmail.com/

Sorry, yes, you're right.  Still, I've reported this same regression
a while back, and it's never gone away.

> > > BTW, the code that complained about "command buffer outside valid memory
> > > window" has been removed by the commit 17e4660ae3d7 ("drm/etnaviv:
> > > implement per-process address spaces on MMUv2"). 
> > > 
> > > Could be that recent changes to MMU management of etnaviv resolve the
> > > issue?
> > 
> > The iMX6 does not have MMUv2 hardware, it has MMUv1.  MMUv1
> > hardware requires command buffers to be within the first 2GiB of
> > physical RAM.
> 
> I've mentioned that patch because it removed the check for cmdbuf address
> for MMUv1:
> 
> @@ -785,15 +768,7 @@ int etnaviv_gpu_init(struct etnaviv_gpu *gpu)
>  				  PAGE_SIZE);
>  	if (ret) {
>  		dev_err(gpu->dev, "could not create command buffer\n");
> -		goto unmap_suballoc;
> -	}
> -
> -	if (!(gpu->identity.minor_features1 & chipMinorFeatures1_MMU_VERSION) &&
> -	    etnaviv_cmdbuf_get_va(&gpu->buffer, &gpu->cmdbuf_mapping) > 0x8000) {
> -		ret = -EINVAL;
> -		dev_err(gpu->dev,
> -			"command buffer outside valid memory window\n");
> -		goto free_buffer;
> +		goto fail;
>  	}
>  
>  	/* Setup event management */
> 
> 
> I really don't know how etnaviv works, so I hoped that people who
> understand it would help.

From what I can see, removing that check is a completely insane thing
to do, and I note that these changes are _not_ described in the commit
message.  The problem was known about _before_ (June 22) the patch was
created (July 5).

Lucas, please can you explain why removing the above check, which is
well known to correctly trigger on various platforms to prevent
incorrect GPU behaviour, is safe?

Thanks.



Re: [PATCH v2 00/21] Refine memblock API

2019-10-04 Thread Russell King - ARM Linux admin
On Thu, Oct 03, 2019 at 07:46:06AM -0700, Chris Healy wrote:
> >
> > The iMX6 does not have MMUv2 hardware, it has MMUv1.  MMUv1
> > hardware requires command buffers to be within the first 2GiB of
> > physical RAM.
> >
> I thought that the i.MX6q has the MMUv1 and GC2000 GPU while the
> i.MX6qp has the MMUv2 and GC3000?  Meaning the i.MX6 has both MMUv1
> and MMUv2 depending on which i.MX6 part we are talking about.

The report says iMX6Q with GC2000 - which is what I was referring to
here.  I'm not aware of what the later SoCs use, since I've never used
them.

Thanks.



Re: [PATCH v2 00/21] Refine memblock API

2019-10-03 Thread Russell King - ARM Linux admin
On Thu, Oct 03, 2019 at 08:34:52AM +0300, Mike Rapoport wrote:
> (trimmed the CC)
> 
> On Wed, Oct 02, 2019 at 06:14:11AM -0500, Adam Ford wrote:
> > On Wed, Oct 2, 2019 at 2:36 AM Mike Rapoport  wrote:
> > >
> > 
> > Before the patch:
> > 
> > # cat /sys/kernel/debug/memblock/memory
> >0: 0x1000..0x8fff
> > # cat /sys/kernel/debug/memblock/reserved
> >0: 0x10004000..0x10007fff
> >   34: 0x2f88..0x3fff
> > 
> > 
> > After the patch:
> > # cat /sys/kernel/debug/memblock/memory
> >0: 0x1000..0x8fff
> > # cat /sys/kernel/debug/memblock/reserved
> >0: 0x10004000..0x10007fff
> >   36: 0x8000..0x8fff
> 
> I'm still not convinced that the memblock refactoring didn't uncover an
> issue in the etnaviv driver.
> 
> Why does moving the CMA area from 0x8000 to 0x3000 make it fail?

I think you have that the wrong way round.

> BTW, the code that complained about "command buffer outside valid memory
> window" has been removed by the commit 17e4660ae3d7 ("drm/etnaviv:
> implement per-process address spaces on MMUv2"). 
> 
> Could be that recent changes to MMU management of etnaviv resolve the
> issue?

The iMX6 does not have MMUv2 hardware, it has MMUv1.  MMUv1
hardware requires command buffers to be within the first 2GiB of
physical RAM.

I've reported the problem previously but there was no resolution,
other than pointing the blame at CMA.

https://lists.freedesktop.org/archives/dri-devel/2019-June/thread.html#223516



Re: [PATCH 0/6] Add the Mobiveil EP and Layerscape Gen4 EP driver support

2019-10-02 Thread Russell King - ARM Linux admin
On Wed, Oct 02, 2019 at 04:14:21PM -0500, Bjorn Helgaas wrote:
> On Tue, Sep 24, 2019 at 04:52:23PM +0100, Russell King - ARM Linux admin 
> wrote:
> > On Tue, Sep 24, 2019 at 03:18:47PM +0100, Russell King - ARM Linux admin 
> > wrote:
> > > On Mon, Sep 16, 2019 at 10:17:36AM +0800, Xiaowei Bao wrote:
> > > > This patch set are for adding Mobiveil EP driver and adding PCIe Gen4
> > > > EP driver of NXP Layerscape platform.
> > > > 
> > > > This patch set depends on:
> > > > https://patchwork.kernel.org/project/linux-pci/list/?series=159139
> > > > 
> > > > Xiaowei Bao (6):
> > > >   PCI: mobiveil: Add the EP driver support
> > > >   dt-bindings: Add DT binding for PCIE GEN4 EP of the layerscape
> > > >   PCI: mobiveil: Add PCIe Gen4 EP driver for NXP Layerscape SoCs
> > > >   PCI: mobiveil: Add workaround for unsupported request error
> > > >   arm64: dts: lx2160a: Add PCIe EP node
> > > >   misc: pci_endpoint_test: Add the layerscape PCIe GEN4 EP device
> > > > support
> > > > 
> > > >  .../bindings/pci/layerscape-pcie-gen4.txt  |  28 +-
> > > >  MAINTAINERS|   3 +
> > > >  arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi |  56 ++
> > > >  drivers/misc/pci_endpoint_test.c   |   2 +
> > > >  drivers/pci/controller/mobiveil/Kconfig|  22 +-
> > > >  drivers/pci/controller/mobiveil/Makefile   |   2 +
> > > >  .../controller/mobiveil/pcie-layerscape-gen4-ep.c  | 169 ++
> > > >  drivers/pci/controller/mobiveil/pcie-mobiveil-ep.c | 568 +
> > > >  drivers/pci/controller/mobiveil/pcie-mobiveil.c|  99 +++-
> > > >  drivers/pci/controller/mobiveil/pcie-mobiveil.h|  72 +++
> > > >  10 files changed, 1009 insertions(+), 12 deletions(-)
> > > >  create mode 100644 drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> > > >  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil-ep.c
> > > 
> > > Hi,
> > > 
> > > I've applied "PCI: mobiveil: Fix the CPU base address setup in inbound
> > > window" and your patch set to 5.3, which seems to be able to detect the
> > > PCIe card I have plugged in:
> > > 
> > > > layerscape-pcie-gen4 380.pcie: host bridge /soc/pcie@380 ranges:
> > > > layerscape-pcie-gen4 380.pcie:   MEM 0xa04000..0xa07fff -> 0x4000
> > > > layerscape-pcie-gen4 380.pcie: PCI host bridge to bus :00
> > > > pci_bus :00: root bus resource [bus 00-ff]
> > > > pci_bus :00: root bus resource [mem 0xa04000-0xa07fff] (bus address [0x4000-0x7fff])
> > > > pci :00:00.0: [1957:8d90] type 01 class 0x060400
> > > > pci :00:00.0: enabling Extended Tags
> > > > pci :00:00.0: supports D1 D2
> > > > pci :00:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> > > > pci :01:00.0: [15b3:6750] type 00 class 0x02
> > > > pci :01:00.0: reg 0x10: [mem 0xa04000-0xa0400f 64bit]
> > > > pci :01:00.0: reg 0x18: [mem 0xa04080-0xa040ff 64bit pref]
> > > > pci :01:00.0: reg 0x30: [mem 0xa04100-0xa0410f pref]
> > > > pci :00:00.0: up support 3 enabled 0
> > > > pci :00:00.0: dn support 1 enabled 0
> > > > pci :00:00.0: BAR 9: assigned [mem 0xa04000-0xa0407f 64bit pref]
> > > > pci :00:00.0: BAR 8: assigned [mem 0xa04080-0xa0409f]
> > > > pci :01:00.0: BAR 2: assigned [mem 0xa04000-0xa0407f 64bit pref]
> > > > pci :01:00.0: BAR 0: assigned [mem 0xa04080-0xa0408f 64bit]
> > > > pci :01:00.0: BAR 6: assigned [mem 0xa04090-0xa0409f pref]
> > > > pci :00:00.0: PCI bridge to [bus 01-ff]
> > > > pci :00:00.0:   bridge window [mem 0xa04080-0xa0409f]
> > > > pci :00:00.0:   bridge window [mem 0xa04000-0xa0407f 64bit pref]
> > > > pci :00:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  256
> > > > pci :01:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  256
> > > > pcieport :00:00.0: PCIe capabilities: 0x13
> > > > pcieport :00:00.0: init_service_irqs: -19
> > > 
> > > However, a bit later in the kernel boot, I get:
> > > 
> > > SError Interrupt on CPU1, cod

Re: [PATCHv3] ARM: drivers/amba: release and cleanup the resource to allow for deferred probe

2019-10-02 Thread Russell King - ARM Linux admin
On Wed, Oct 02, 2019 at 09:35:51AM -0500, Dinh Nguyen wrote:
> With commit "79bdcb202a35 ARM: 8906/1: drivers/amba: add reset control to
> amba bus probe", the amba bus driver needs to be deferred probe because the
> reset driver is probed later. However with a deferred probe, the call to
> request_resource() in the driver returns -EBUSY. The reason is the driver
> has not released the resource from the previous probe attempt.
> 
> This patch fixes how we handle the condition of EPROBE_DEFER that is returned
> from getting the reset controls. For this condition, the patch will jump
> to defer_probe, which will iounmap, dev_pm_domain_detach, and release the
> resource.
> 
> Fixes: 79bdcb202a35 ("ARM: 8906/1: drivers/amba: add reset control to
> amba bus probe")
> Signed-off-by: Dinh Nguyen 
> ---
> v3: jump to defer_probe where the driver will unmap and pm_detach the
> driver resource for the next probe attempt
> v2: release the resource when of_reset_control_array_get_optional_shared()
> returns EPROBE_DEFER
> ---
>  drivers/amba/bus.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
> index f39f075abff9..4a021b1dab3d 100644
> --- a/drivers/amba/bus.c
> +++ b/drivers/amba/bus.c
> @@ -409,9 +409,12 @@ static int amba_device_try_add(struct amba_device *dev, struct resource *parent)
>  		 */
>  		rstc = of_reset_control_array_get_optional_shared(dev->dev.of_node);
>  		if (IS_ERR(rstc)) {
> -			if (PTR_ERR(rstc) != -EPROBE_DEFER)
> +			ret = PTR_ERR(rstc);
> +			if (ret == -EPROBE_DEFER)
> +				goto defer_probe;
> +			else
>  				dev_err(&dev->dev, "Can't get amba reset!\n");
> -			return PTR_ERR(rstc);
> +			return ret;

So, if of_reset_control_array_get_optional_shared() returns an error,
we end up leaking the ioremap(), the resource claim, the pclk enable
and pm domain?  If it returns -EPROBE_DEFER, we end up leaking the
pclk enable?

I think this is going to be quicker if I write the patch - I haven't
build-tested this yet though.  Please check whether this works for
you.

Thanks.

8<=
From: Russell King 
Subject: [PATCH] drivers/amba: fix reset control error handling

With commit 79bdcb202a35 ("ARM: 8906/1: drivers/amba: add reset control
to amba bus probe") it is possible for the amba bus driver to defer
probing the device for its IDs because the reset driver may be probed
later.

However when a subsequent probe occurs, the call to request_resource()
in the driver returns -EBUSY as the driver has not released the resource
from the initial probe attempt - or cleaned up any of the preceding
actions.

Fix this both for the deferred probe case as well as a failure to get
the reset.

Fixes: 79bdcb202a35 ("ARM: 8906/1: drivers/amba: add reset control to amba bus probe")
Signed-off-by: Russell King 
---
 drivers/amba/bus.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index f39f075abff9..fe1523664816 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -409,9 +409,11 @@ static int amba_device_try_add(struct amba_device *dev, struct resource *parent)
 		 */
 		rstc = of_reset_control_array_get_optional_shared(dev->dev.of_node);
 		if (IS_ERR(rstc)) {
-			if (PTR_ERR(rstc) != -EPROBE_DEFER)
-				dev_err(&dev->dev, "Can't get amba reset!\n");
-			return PTR_ERR(rstc);
+			ret = PTR_ERR(rstc);
+			if (ret != -EPROBE_DEFER)
+				dev_err(&dev->dev, "can't get reset: %d\n",
+					ret);
+			goto err_reset;
 		}
 		reset_control_deassert(rstc);
 		reset_control_put(rstc);
@@ -472,6 +474,12 @@ static int amba_device_try_add(struct amba_device *dev, struct resource *parent)
 	release_resource(&dev->res);
  err_out:
 	return ret;
+
+ err_reset:
+	amba_put_disable_pclk(dev);
+	iounmap(tmp);
+	dev_pm_domain_detach(&dev->dev, true);
+	goto err_release;
 }
 
 /*
-- 
2.7.4




Re: [PATCHv2] ARM: drivers/amba: release the resource to allow for deferred probe

2019-10-02 Thread Russell King - ARM Linux admin
On Wed, Oct 02, 2019 at 07:33:49AM -0500, Dinh Nguyen wrote:
> With commit "79bdcb202a35 ARM: 8906/1: drivers/amba: add reset control to
> amba bus probe", the amba bus driver needs to be deferred probe because the
> reset driver is probed later. However with a deferred probe, the call to
> request_resource() in the driver returns -EBUSY. The reason is the driver
> has not released the resource from the previous probe attempt.
> 
> This patch fixes how we handle the condition of EPROBE_DEFER that is returned
> from getting the reset controls. For this condition, the patch will jump
> to err_release, which will release the resource.
> 
> Fixes: 79bdcb202a35 ("ARM: 8906/1: drivers/amba: add reset control to
> amba bus probe")
> Signed-off-by: Dinh Nguyen 
> ---
> v2: release the resource when of_reset_control_array_get_optional_shared()
> returns EPROBE_DEFER
> ---
>  drivers/amba/bus.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
> index f39f075abff9..1109437815eb 100644
> --- a/drivers/amba/bus.c
> +++ b/drivers/amba/bus.c
> @@ -409,9 +409,12 @@ static int amba_device_try_add(struct amba_device *dev, struct resource *parent)
>  		 */
>  		rstc = of_reset_control_array_get_optional_shared(dev->dev.of_node);
>  		if (IS_ERR(rstc)) {
> -			if (PTR_ERR(rstc) != -EPROBE_DEFER)
> +			ret = PTR_ERR(rstc);
> +			if (ret == -EPROBE_DEFER)
> +				goto err_release;
> +			else
>  				dev_err(&dev->dev, "Can't get amba reset!\n");
> -			return PTR_ERR(rstc);
> +			return ret;

Still a negative.

Remember in the comments to the previous patch I talked about ioremap().

Please read the code that you are modifying and carefully consider what
needs to happen at this site to properly clean up on failure.



Re: [PATCH v2] ARM: add __always_inline to functions called from __get_user_check()

2019-10-02 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 10:03:50AM -0700, Nick Desaulniers wrote:
> On Tue, Oct 1, 2019 at 1:37 AM Masahiro Yamada
>  wrote:
> >
> > KernelCI reports that bcm2835_defconfig is no longer booting since
> > commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> > forcibly") (https://lkml.org/lkml/2019/9/26/825).
> >
> > I also received a regression report from Nicolas Saenz Julienne
> > (https://lkml.org/lkml/2019/9/27/263).
> >
> > This problem has cropped up on bcm2835_defconfig because it enables
> > CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends to prefer not inlining
> > functions with -Os. I was able to reproduce it with other boards and
> > defconfig files by manually enabling CONFIG_CC_OPTIMIZE_FOR_SIZE.
> >
> > The __get_user_check() specifically uses r0, r1, r2 registers.
> > So, uaccess_save_and_enable() and uaccess_restore() must be inlined.
> > Otherwise, those register assignments would be entirely dropped,
> > according to my analysis of the disassembly.
> >
> > Prior to commit 9012d011660e ("compiler: allow all arches to enable
> > CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> > inlining functions, except on x86.
> >
> > Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> > So, __always_inline is now the only guaranteed way of forcible inlining.
> 
> No, the C preprocessor is the only guaranteed way of inlining.  I
> preferred v1; if you're going to play with fire (write assembly),
> don't get burned.

It seems we disagree on that.

Masahiro Yamada, please send this to the patch system, thanks.

> 
> >
> > I also added __always_inline to 4 functions in the call-graph from the
> > __get_user_check() macro.
> >
> > Fixes: 9012d011660e ("compiler: allow all arches to enable 
> > CONFIG_OPTIMIZE_INLINING")
> > Reported-by: "kernelci.org bot" 
> > Reported-by: Nicolas Saenz Julienne 
> > Signed-off-by: Masahiro Yamada 
> > ---
> >
> > Changes in v2:
> >   - Use __always_inline instead of changing the function call places
> >  (per Russell King)
> >   - The previous submission is: 
> > https://lore.kernel.org/patchwork/patch/1132459/
> >
> >  arch/arm/include/asm/domain.h  | 8 
> >  arch/arm/include/asm/uaccess.h | 4 ++--
> >  2 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/arch/arm/include/asm/domain.h b/arch/arm/include/asm/domain.h
> > index 567dbede4785..f1d0a7807cd0 100644
> > --- a/arch/arm/include/asm/domain.h
> > +++ b/arch/arm/include/asm/domain.h
> > @@ -82,7 +82,7 @@
> >  #ifndef __ASSEMBLY__
> >
> >  #ifdef CONFIG_CPU_CP15_MMU
> > -static inline unsigned int get_domain(void)
> > +static __always_inline unsigned int get_domain(void)
> >  {
> > unsigned int domain;
> >
> > @@ -94,7 +94,7 @@ static inline unsigned int get_domain(void)
> > return domain;
> >  }
> >
> > -static inline void set_domain(unsigned val)
> > +static __always_inline void set_domain(unsigned int val)
> >  {
> > asm volatile(
> > "mcrp15, 0, %0, c3, c0  @ set domain"
> > @@ -102,12 +102,12 @@ static inline void set_domain(unsigned val)
> > isb();
> >  }
> >  #else
> > -static inline unsigned int get_domain(void)
> > +static __always_inline unsigned int get_domain(void)
> >  {
> > return 0;
> >  }
> >
> > -static inline void set_domain(unsigned val)
> > +static __always_inline void set_domain(unsigned int val)
> >  {
> >  }
> >  #endif
> > diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> > index 303248e5b990..98c6b91be4a8 100644
> > --- a/arch/arm/include/asm/uaccess.h
> > +++ b/arch/arm/include/asm/uaccess.h
> > @@ -22,7 +22,7 @@
> >   * perform such accesses (eg, via list poison values) which could then
> >   * be exploited for priviledge escalation.
> >   */
> > -static inline unsigned int uaccess_save_and_enable(void)
> > +static __always_inline unsigned int uaccess_save_and_enable(void)
> >  {
> >  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> > unsigned int old_domain = get_domain();
> > @@ -37,7 +37,7 @@ static inline unsigned int uaccess_save_and_enable(void)
> >  #endif
> >  }
> >
> > -static inline void uaccess_restore(unsigned int flags)
> > +static __always_inline void uaccess_restore(unsigned int flags)
> >  {
> >  #ifdef CONFIG_CPU_SW_DOMAIN_PAN
> > /* Restore the user access mask */
> > --
> > 2.17.1
> >
> 
> 
> -- 
> Thanks,
> ~Nick Desaulniers
> 



Re: [PATCH] compiler: enable CONFIG_OPTIMIZE_INLINING forcibly

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 02:32:54PM -0700, Nick Desaulniers wrote:
> On Tue, Oct 1, 2019 at 2:26 PM Russell King - ARM Linux admin
>  wrote:
> >
> > On Tue, Oct 01, 2019 at 09:59:38PM +0100, Russell King - ARM Linux admin 
> > wrote:
> > > On Tue, Oct 01, 2019 at 01:21:44PM -0700, Nick Desaulniers wrote:
> > > > On Tue, Oct 1, 2019 at 11:14 AM Russell King - ARM Linux admin
> > > >  wrote:
> > > > >
> > > > > The whole "let's make inline not really mean inline" is nothing more
> > > > > than a band-aid to the overuse (and abuse) of "inline".
> > > >
> > > > Let's triple check the ISO C11 draft spec just to be sure:
> > > > § 6.7.4.6: A function declared with an inline function specifier is an
> > > > inline function. Making a
> > > > function an inline function suggests that calls to the function be as
> > > > fast as possible.
> > > > The extent to which such suggestions are effective is
> > > > implementation-defined. 139)
> > > > 139) For example, an implementation might never perform inline
> > > > substitution, or might only perform inline
> > > > substitutions to calls in the scope of an inline declaration.
> > > > § J.3.8 [Undefined Behavior] Hints: The extent to which suggestions
> > > > made by using the inline function specifier are effective (6.7.4).
> > > >
> > > > My translation:
> > > > "Please don't assume inline means anything."
> > > >
> > > > For the unspecified GNU C extension __attribute__((always_inline)), it
> > > > seems to me like it's meant more for performing inlining (an
> > > > optimization) at -O0.  Whether the compiler warns or not seems like a
> > > > nice side effect, but provides no strong guarantee otherwise.
> > > >
> > > > I'm sorry that so much code may have been written with that
> > > > assumption, and I'm sorry to be the bearer of bad news, but this isn't
> > > > a recent change.  If code was written under false assumptions, it
> > > > should be rewritten. Sorry.
> > >
> > > You may quote C11, but that is not relevant.  The kernel is coded to
> > > gnu89 standard - see the -std=gnu89 flag.
> >
> > There's more to this and why C11 is entirely irrelevant.  The "inline"
> > you see in our headers is not the compiler keyword that you find in
> > various C standards, it is a macro that gets expanded to either:
> >
> > #define inline inline __attribute__((__always_inline__)) __gnu_inline \
> > __maybe_unused notrace
> >
> > or
> >
> > #define inline inline __gnu_inline \
> > __maybe_unused notrace
> >
> > __gnu_inline is defined as:
> >
> > #define __gnu_inline __attribute__((__gnu_inline__))
> >
> > So this attaches the gnu_inline attribute to the function:
> >
> > `gnu_inline'
> >  This attribute should be used with a function that is also declared
> >  with the `inline' keyword.  It directs GCC to treat the function
> >  as if it were defined in gnu90 mode even when compiling in C99 or
> >  gnu99 mode.
> > ...
> >  Since ISO C99 specifies a different semantics for `inline', this
> >  function attribute is provided as a transition measure and as a
> >  useful feature in its own right.  This attribute is available in
> >  GCC 4.1.3 and later.  It is available if either of the
> >  preprocessor macros `__GNUC_GNU_INLINE__' or
> >  `__GNUC_STDC_INLINE__' are defined.  *Note An Inline Function is
> >  As Fast As a Macro: Inline.
> >
> > which is quite clear that C99 semantics do not apply to _this_ inline.
> > The manual goes on to explain:
> >
> >  GCC implements three different semantics of declaring a function
> > inline.  One is available with `-std=gnu89' or `-fgnu89-inline' or when
> > `gnu_inline' attribute is present on all inline declarations, another
> > when `-std=c99', `-std=c11', `-std=gnu99' or `-std=gnu11' (without
> > `-fgnu89-inline'), and the third is used when compiling C++.
> 
> (I wrote the kernel patch for gnu_inline; it only comes into play when
> `inline` appears on a function *also defined as `extern`*).

From what I can tell reading the GCC manual, the patch adding
gnu_inline should have no effect.  Maybe it was written before
-std=gnu89 was in use by the kernel makefiles?

> > I'd suggest gnu90 mode is

Re: [PATCH] ARM: drivers/amba: release the resource to allow for deferred probe

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 04:40:26PM -0500, Dinh Nguyen wrote:
> With commit "79bdcb202a35 ARM: 8906/1: drivers/amba: add reset control to
> amba bus probe", the amba bus driver needs to be deferred probe because the
> reset driver is probed later than the amba bus. However with a deferred
> probe, the call to request_resource() in the driver returns -EBUSY. The
> reason is the driver has not released the resource from the previous probe
> attempt.
> 
> This patch releases the resource when amba_device_try_add() returns
> -EPROBE_DEFER. This allows the deferred probe to continue.
> 
> Fixes: 79bdcb202a35 ("ARM: 8906/1: drivers/amba: add reset control to
> amba bus probe")
> Signed-off-by: Dinh Nguyen 
> ---
>  drivers/amba/bus.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
> index f39f075abff9..f246b847c991 100644
> --- a/drivers/amba/bus.c
> +++ b/drivers/amba/bus.c
> @@ -535,6 +535,7 @@ int amba_device_add(struct amba_device *dev, struct resource *parent)
>  
>  	if (ret == -EPROBE_DEFER) {
>  		struct deferred_device *ddev;
> +		release_resource(&dev->res);

This is in the wrong place, and misses more serious leaks.

>   ddev = kmalloc(sizeof(*ddev), GFP_KERNEL);
>   if (!ddev)

What we have is bad error cleanup code in amba_device_try_add().
Consider what would happen if dev_pm_domain_attach() inside that
function were to return with -EPROBE_DEFER with your patch in
place - we would call release_resource() twice on the same
resource.  Clearly, that's incorrect.

The problem is that an error from
of_reset_control_array_get_optional_shared() just returns, leaving
everything that amba_device_try_add() already did still in place.
So, for example, a subsequent call to amba_device_try_add() will
remap the resource, leaking the previous mapping.



Re: [PATCH] compiler: enable CONFIG_OPTIMIZE_INLINING forcibly

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 09:59:38PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Oct 01, 2019 at 01:21:44PM -0700, Nick Desaulniers wrote:
> > On Tue, Oct 1, 2019 at 11:14 AM Russell King - ARM Linux admin
> >  wrote:
> > >
> > > On Tue, Oct 01, 2019 at 11:00:11AM -0700, Nick Desaulniers wrote:
> > > > On Tue, Oct 1, 2019 at 10:55 AM Russell King - ARM Linux admin
> > > >  wrote:
> > > > >
> > > > > On Tue, Oct 01, 2019 at 10:44:43AM -0700, Nick Desaulniers wrote:
> > > > > > I apologize; I don't mean to be difficult.  I would just like to 
> > > > > > avoid
> > > > > > surprises when code written with the assumption that it will be
> > > > > > inlined is not.  It sounds like we found one issue in arm32 and one 
> > > > > > in
> > > > > > arm64 related to outlining.  If we fix those two cases, I think 
> > > > > > we're
> > > > > > close to proceeding with Masahiro's cleanup, which I view as a good
> > > > > > thing for the health of the Linux kernel codebase.
> > > > >
> > > > > Except, using the C preprocessor for this turns the arm32 code into
> > > > > yuck:
> > > > >
> > > > > 1. We'd need to turn get_domain() and set_domain() into multi-line
> > > > >preprocessor macro definitions, using the GCC ({ }) extension
> > > > >so that get_domain() can return a value.
> > > > >
> > > > > 2. uaccess_save_and_enable() and uaccess_restore() also need to
> > > > >become preprocessor macro definitions too.
> > > > >
> > > > > So, we end up with multiple levels of nested preprocessor macros.
> > > > > When something goes wrong, the compiler warning/error message is
> > > > > going to be utterly _horrid_.
> > > >
> > > > That's why I preferred V1 of Masahiro's patch, that fixed the inline
> > > > asm not to make use of caller saved registers before calling a
> > > > function that might not be inlined.
> > >
> > > ... which I objected to based on the fact that this uaccess stuff is
> > > supposed to add protection against the kernel being fooled into
> > > accessing userspace when it shouldn't.  The whole intention there is
> > > that [sg]et_domain(), and uaccess_*() are _always_ inlined as close
> > > as possible to the call site of the accessor touching userspace.
> > 
> > Then use the C preprocessor to force the inlining.  I'm sorry it's not
> > as pretty as static inline functions.
> > 
> > >
> > > Moving it before the assignments mean that the compiler is then free
> > > to issue memory loads/stores to load up those registers, which is
> > > exactly what we want to avoid.
> > >
> > >
> > > In any case, I violently disagree with the idea that stuff we have
> > > in header files should be permitted not to be inlined because we
> > > have soo much that is marked inline.
> > 
> > > So there's a very important subtlety here.  There's:
> > 1. code that adds `inline` cause "oh maybe it would be nice to inline
> > this, but if it isn't no big deal"
> > 2. code that if not inlined is somehow not correct.
> > 3. avoid ODR violations via `static inline`
> > 
> > I'll posit that "we have soo much that is marked inline [is
> > predominantly case 1 or 3, not case 2]."  Case 2 is a code smell, and
> > requires extra scrutiny.
> > 
> > > Having it moved out of line,
> > > and essentially the same function code appearing in multiple C files
> > > is really not an improvement over the current situation with excessive
> > > use of inlining.  Anyone who has looked at the code resulting from
> > > dma_map_single() will know exactly what I'm talking about, which is
> > > way in excess of the few instructions we have for the uaccess_* stuff
> > > here.
> > >
> > > The right approach is to move stuff out of line - and by that, I
> > > mean _actually_ move the damn code, so that different compilation
> > > units can use the same instructions, and thereby gain from the
> > > whole point of an instruction cache.
> > 
> > And be marked __attribute__((noinline)), otherwise might be inlined via LTO.
> > 
> > >
> > > The whole "let's make inline not really mean inline" is nothing more
> > > than a band-aid to the overus

Re: [PATCH] compiler: enable CONFIG_OPTIMIZE_INLINING forcibly

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 01:21:44PM -0700, Nick Desaulniers wrote:
> On Tue, Oct 1, 2019 at 11:14 AM Russell King - ARM Linux admin
>  wrote:
> >
> > On Tue, Oct 01, 2019 at 11:00:11AM -0700, Nick Desaulniers wrote:
> > > On Tue, Oct 1, 2019 at 10:55 AM Russell King - ARM Linux admin
> > >  wrote:
> > > >
> > > > On Tue, Oct 01, 2019 at 10:44:43AM -0700, Nick Desaulniers wrote:
> > > > > I apologize; I don't mean to be difficult.  I would just like to avoid
> > > > > surprises when code written with the assumption that it will be
> > > > > inlined is not.  It sounds like we found one issue in arm32 and one in
> > > > > arm64 related to outlining.  If we fix those two cases, I think we're
> > > > > close to proceeding with Masahiro's cleanup, which I view as a good
> > > > > thing for the health of the Linux kernel codebase.
> > > >
> > > > Except, using the C preprocessor for this turns the arm32 code into
> > > > yuck:
> > > >
> > > > > 1. We'd need to turn get_domain() and set_domain() into multi-line
> > > > >    preprocessor macro definitions, using the GCC ({ }) extension
> > > > >    so that get_domain() can return a value.
> > > > >
> > > > > 2. uaccess_save_and_enable() and uaccess_restore() also need to
> > > > >    become preprocessor macro definitions too.
> > > >
> > > > So, we end up with multiple levels of nested preprocessor macros.
> > > > When something goes wrong, the compiler warning/error message is
> > > > going to be utterly _horrid_.
> > >
> > > That's why I preferred V1 of Masahiro's patch, that fixed the inline
> > > asm not to make use of caller saved registers before calling a
> > > function that might not be inlined.
> >
> > ... which I objected to based on the fact that this uaccess stuff is
> > supposed to add protection against the kernel being fooled into
> > accessing userspace when it shouldn't.  The whole intention there is
> > that [sg]et_domain(), and uaccess_*() are _always_ inlined as close
> > as possible to the call site of the accessor touching userspace.
> 
> Then use the C preprocessor to force the inlining.  I'm sorry it's not
> as pretty as static inline functions.
> 
> >
> > Moving it before the assignments mean that the compiler is then free
> > to issue memory loads/stores to load up those registers, which is
> > exactly what we want to avoid.
> >
> >
> > In any case, I violently disagree with the idea that stuff we have
> > in header files should be permitted not to be inlined because we
> > have soo much that is marked inline.
> 
> So there's a very important subtlety here.  There's:
> 1. code that adds `inline` cause "oh maybe it would be nice to inline
> this, but if it isn't no big deal"
> 2. code that if not inlined is somehow not correct.
> 3. avoid ODR violations via `static inline`
> 
> I'll posit that "we have soo much that is marked inline [is
> predominantly case 1 or 3, not case 2]."  Case 2 is a code smell, and
> requires extra scrutiny.
> 
> > Having it moved out of line,
> > and essentially the same function code appearing in multiple C files
> > is really not an improvement over the current situation with excessive
> > use of inlining.  Anyone who has looked at the code resulting from
> > dma_map_single() will know exactly what I'm talking about, which is
> > way in excess of the few instructions we have for the uaccess_* stuff
> > here.
> >
> > The right approach is to move stuff out of line - and by that, I
> > mean _actually_ move the damn code, so that different compilation
> > units can use the same instructions, and thereby gain from the
> > whole point of an instruction cache.
> 
> And be marked __attribute__((noinline)), otherwise might be inlined via LTO.
> 
> >
> > The whole "let's make inline not really mean inline" is nothing more
> > than a band-aid to the overuse (and abuse) of "inline".
> 
> Let's triple check the ISO C11 draft spec just to be sure:
> § 6.7.4.6: A function declared with an inline function specifier is an
> inline function. Making a
> function an inline function suggests that calls to the function be as
> fast as possible.
> The extent to which such suggestions are effective is
> implementation-defined. 139)
> 139) For example, an implementation might never perform inline
> substitution, or might only perform inline
> substitutions to calls in the scope of an inline

Re: [PATCH] compiler: enable CONFIG_OPTIMIZE_INLINING forcibly

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 11:00:11AM -0700, Nick Desaulniers wrote:
> On Tue, Oct 1, 2019 at 10:55 AM Russell King - ARM Linux admin
>  wrote:
> >
> > On Tue, Oct 01, 2019 at 10:44:43AM -0700, Nick Desaulniers wrote:
> > > I apologize; I don't mean to be difficult.  I would just like to avoid
> > > surprises when code written with the assumption that it will be
> > > inlined is not.  It sounds like we found one issue in arm32 and one in
> > > arm64 related to outlining.  If we fix those two cases, I think we're
> > > close to proceeding with Masahiro's cleanup, which I view as a good
> > > thing for the health of the Linux kernel codebase.
> >
> > Except, using the C preprocessor for this turns the arm32 code into
> > yuck:
> >
> > > 1. We'd need to turn get_domain() and set_domain() into multi-line
> > >    preprocessor macro definitions, using the GCC ({ }) extension
> > >    so that get_domain() can return a value.
> > >
> > > 2. uaccess_save_and_enable() and uaccess_restore() also need to
> > >    become preprocessor macro definitions too.
> >
> > So, we end up with multiple levels of nested preprocessor macros.
> > When something goes wrong, the compiler warning/error message is
> > going to be utterly _horrid_.
> 
> That's why I preferred V1 of Masahiro's patch, that fixed the inline
> asm not to make use of caller saved registers before calling a
> function that might not be inlined.

... which I objected to based on the fact that this uaccess stuff is
supposed to add protection against the kernel being fooled into
accessing userspace when it shouldn't.  The whole intention there is
that [sg]et_domain(), and uaccess_*() are _always_ inlined as close
as possible to the call site of the accessor touching userspace.

Moving it before the assignments mean that the compiler is then free
to issue memory loads/stores to load up those registers, which is
exactly what we want to avoid.


In any case, I violently disagree with the idea that stuff we have
in header files should be permitted not to be inlined because we
have soo much that is marked inline.  Having it moved out of line,
and essentially the same function code appearing in multiple C files
is really not an improvement over the current situation with excessive
use of inlining.  Anyone who has looked at the code resulting from
dma_map_single() will know exactly what I'm talking about, which is
way in excess of the few instructions we have for the uaccess_* stuff
here.

The right approach is to move stuff out of line - and by that, I
mean _actually_ move the damn code, so that different compilation
units can use the same instructions, and thereby gain from the
whole point of an instruction cache.

The whole "let's make inline not really mean inline" is nothing more
than a band-aid to the overuse (and abuse) of "inline".

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] compiler: enable CONFIG_OPTIMIZE_INLINING forcibly

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 10:44:43AM -0700, Nick Desaulniers wrote:
> I apologize; I don't mean to be difficult.  I would just like to avoid
> surprises when code written with the assumption that it will be
> inlined is not.  It sounds like we found one issue in arm32 and one in
> arm64 related to outlining.  If we fix those two cases, I think we're
> close to proceeding with Masahiro's cleanup, which I view as a good
> thing for the health of the Linux kernel codebase.

Except, using the C preprocessor for this turns the arm32 code into
yuck:

1. We'd need to turn get_domain() and set_domain() into multi-line
   preprocessor macro definitions, using the GCC ({ }) extension
   so that get_domain() can return a value.

2. uaccess_save_and_enable() and uaccess_restore() also need to
   become preprocessor macro definitions too.

So, we end up with multiple levels of nested preprocessor macros.
When something goes wrong, the compiler warning/error message is
going to be utterly _horrid_.
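
To make the trade-off concrete, here is a minimal user-space sketch of the
two forms. It is not the kernel's code: fake_domain_reg, get_domain_macro()
and get_domain_inline() are hypothetical stand-ins, and the real helpers
access a coprocessor register through inline asm rather than a variable.

```c
/* Hypothetical stand-in for the register the real helpers access. */
static unsigned int fake_domain_reg = 0x55;

/*
 * Macro form: relies on the GCC/Clang statement-expression extension
 * ({ ... }) so that the macro can yield a value.
 */
#define get_domain_macro()				\
({							\
	unsigned int val_ = fake_domain_reg;		\
	val_;						\
})

/* Function form: type-checked, and diagnostics name a real function. */
static inline __attribute__((always_inline))
unsigned int get_domain_inline(void)
{
	return fake_domain_reg;
}

/* Both forms look the same at the call site. */
int domains_agree(void)
{
	return get_domain_macro() == get_domain_inline();
}
```

Once such macros are nested inside other macros, any diagnostic is reported
against the outermost expansion, which is the readability cost described
above.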

Now, as to whether an __attribute__((always_inline)) can or can not
be inlined...

`always_inline'
 Generally, functions are not inlined unless optimization is
 specified.  For functions declared inline, this attribute inlines
 the function even if no optimization level is specified.

Is this another instance of the compiler folk changing the rules of
already documented semantics?  This says nothing about "might not be
inlined if someone passes some random combination of -f flags".
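
For readers less familiar with the attributes being argued about, the
sketch below puts the three spellings side by side. The function names are
made up for illustration, and whether a given compiler version honours
always_inline under every flag combination is exactly the open question
here, so treat the comments as a summary of the documented intent only.

```c
/* Plain "inline": a hint; the compiler may still emit an out-of-line call. */
static inline int add_hint(int a, int b)
{
	return a + b;
}

/* always_inline: documented by GCC as inlining even without optimization. */
static inline __attribute__((always_inline)) int add_forced(int a, int b)
{
	return a + b;
}

/* noinline: asks for a single out-of-line copy that callers share. */
static __attribute__((noinline)) int add_out_of_line(int a, int b)
{
	return a + b;
}

/* All three behave identically; only the generated code may differ. */
int sum_all(int a, int b)
{
	return add_hint(a, b) + add_forced(a, b) + add_out_of_line(a, b);
}
```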

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 04:28:27PM +0100, Andrew Murray wrote:
> I hadn't noticed the use of __OPTIMIZE__ - indeed if __compiletime_assert
> is no-op'd and you reach it then you won't have a build error - but you
> may get uninitialised values instead.
> 
> Presumably the purpose of __OPTIMIZE__ in this case is to prevent getting
> an undefined function error for the __compiletime_assert line, even though
> it doesn't get called (when using a compiler that doesn't optimize out the
> call to the unused function).
> 
> Why is the call to __get_user_bad not guarded in this way for when
> __OPTIMIZE__ isn't set, i.e. why doesn't it suffer from the issue
> that the following fixes?

Officially, the kernel does not support building with -O0.  To start
with, the top level makefile has:

ifdef CONFIG_CC_OPTIMIZE_FOR_SIZE
KBUILD_CFLAGS   += -Os
else
KBUILD_CFLAGS   += -O2
endif

and we've said for years that the kernel relies upon the compiler
optimiser to build correctly.  You may be lucky if you pass it via
some method to 'make' but that's going to rely on the argument order
to the compiler, and the order in which the compiler processes its
arguments, and whether it (for example) correctly disables all
optimisations if it encounters -O0 somewhere.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH] Partially revert "compiler: enable CONFIG_OPTIMIZE_INLINING forcibly"

2019-10-01 Thread Russell King - ARM Linux admin
On Tue, Oct 01, 2019 at 12:41:30PM +0100, Andrew Murray wrote:
> On Tue, Oct 01, 2019 at 11:42:54AM +0100, Will Deacon wrote:
> > On Tue, Oct 01, 2019 at 06:40:26PM +0900, Masahiro Yamada wrote:
> > > On Mon, Sep 30, 2019 at 8:45 PM Will Deacon  wrote:
> > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > > index 93d97f9b0157..c37c72adaeff 100644
> > > > --- a/lib/Kconfig.debug
> > > > +++ b/lib/Kconfig.debug
> > > > @@ -312,6 +312,7 @@ config HEADERS_CHECK
> > > >
> > > >  config OPTIMIZE_INLINING
> > > > def_bool y
> > > > +   depends on !(ARM || ARM64) # 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9
> > > 
> > > 
> > > This is a too big hammer.
> > 
> > It matches the previous default behaviour!
> > 
> > > For ARM, it is not a compiler bug, so I am trying to fix the kernel code.
> > > 
> > > For ARM64, even if it is a compiler bug, you can add __always_inline
> > > to the functions in question.
> > > (arch_atomic64_dec_if_positive in this case).
> > > 
> > > You do not need to force __always_inline globally.
> > 
> > So you'd prefer I do something like the diff below? I mean, it's a start,
> > but I do worry that we're hanging arch/arm/ out to dry.
> 
> If I've understood one part of this issue correctly - and using the
> c2p_unsupported build failure as an example [1], there are instances in
> the kernel where it is assumed that the compiler will optimise out a call
> to an undefined function, and also assumed that the compiler will know
> at compile time that the function will never get called. It's common to
> satisfy this assumption when the calling function is inlined.
> 
> But I suspect there may be other cases similar to c2p_unsupported which
> are still lurking.
> 
> For example the following functions are called but non-existent, and thus
> may be an area worth investigating:
> 
> __buggy_use_of_MTHCA_PUT, __put_dbe_unknown, __cmpxchg_wrong_size,
> __bad_percpu_size, __put_user_bad, __get_user_unknown,
> __bad_unaligned_access_size, __bad_xchg
> 
> But more generally, as this is a common pattern - isn't there a benefit
> here for changing all of these to BUILD_BUG? (So they can be found easily).

Precisely, what is your suggestion?

If you are suggesting replacing the call to __get_user_bad with
BUILD_BUG(), note that BUILD_BUG() becomes a no-op when __OPTIMIZE__ is
not defined (see the definition of __compiletime_assert() in
linux/compiler.h); this means such places will be reachable, which
leads to uninitialised variables.
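
The shape of that behaviour can be sketched as follows. This is a
simplified illustration, not the definition from linux/compiler.h:
compiletime_assert_sketch(), sketch_assert_failed() and checked_int_size()
are hypothetical names, and the "error" function attribute is assumed to be
available (GCC, and recent Clang).

```c
#ifdef __OPTIMIZE__
/*
 * With optimisation enabled, a call that survives dead-code elimination
 * breaks the build through the "error" function attribute.
 */
# define compiletime_assert_sketch(cond, msg)				\
	do {								\
		extern void sketch_assert_failed(void)			\
			__attribute__((error(msg)));			\
		if (!(cond))						\
			sketch_assert_failed();				\
	} while (0)
#else
/* Without optimisation the check silently disappears. */
# define compiletime_assert_sketch(cond, msg) do { } while (0)
#endif

/* The condition is a compile-time constant, so the call is never emitted. */
int checked_int_size(void)
{
	compiletime_assert_sketch(sizeof(int) <= 8, "unsupported access size");
	return (int)sizeof(int);
}
```

In the unoptimised branch nothing stops control from flowing past a failed
check, which is how the uninitialised-variable case arises.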

> Or to avoid this class of issues, change them to BUG or unreachable - but
> lose the benefit of compile time detection?

I think you ought to read the GCC manual wrt __builtin_unreachable().
"If control flow reaches the point of the `__builtin_unreachable',
 the program is undefined.  It is useful in situations where the
 compiler cannot deduce the unreachability of the code."

I have seen cases where the instructions following an unreachable
code section have been the literal pool for the function - which,
if reached, would be quite confusing to debug.  If you're lucky, you
might get an undefined instruction exception.  If not, you could
continue and start executing another part of the function, leading
to possibly no crash at all - but unexpected results (which may end
up leaking sensitive data.)

For example, in our BUG() implementation on 32-bit ARM, we use
unreachable() after the asm() statement creating the bug table
entry and inserting the undefined instruction into the text.
Here's the resulting disassembly:

 278:   ebfffffe        bl      0
                        278: R_ARM_CALL page_mapped
 27c:   e3500000        cmp     r0, #0
 280:   1a00006c        bne     438

...
 2d4:   ebfffffe        bl      0 <_raw_spin_lock_irqsave>
                        2d4: R_ARM_CALL _raw_spin_lock_irqsave
 2d8:   e5943008        ldr     r3, [r4, #8]
 2dc:   e3130001        tst     r3, #1
 2e0:   e1a02000        mov     r2, r0
 2e4:   1a000054        bne     43c

...
 438:   e7f001f2        .word   0xe7f001f2
 43c:   e2433001        sub     r3, r3, #1
 440:   eaffffa9        b       2ec


Now, consider what unreachable() actually gets you here - it tells
the compiler that we do not expect to reach this point (that being
the point between 438 and 43c.)  If we were to reach that point, we
would continue executing the code at 43c.

In this case, it would be like...

if (BUG_ON(page_mapped(page)))
	goto random-location-in-xa_lock_irqsave()-inside-invalidate_complete_page2();
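
As a general illustration of the difference (not something proposed in this
thread), compare a path that is merely declared unreachable with one that
ends in an instruction that really stops execution. irq_priority_sketch()
is a hypothetical function; __builtin_trap() is the GCC/Clang builtin.

```c
/*
 * Dangerous variant, shown only as a comment:
 *
 *	default:
 *		__builtin_unreachable();
 *
 * The compiler emits nothing for that path, so reaching it means running
 * whatever bytes happen to follow.
 */
int irq_priority_sketch(int level)
{
	switch (level) {
	case 0:
		return 10;
	case 1:
		return 20;
	default:
		/* Emits a real trapping instruction and never returns. */
		__builtin_trap();
	}
}
```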

So no.  unreachable() is not an 

Re: [PATCH] ARM: fix __get_user_check() in case uaccess_* calls are not inlined

2019-09-30 Thread Russell King - ARM Linux admin
On Mon, Sep 30, 2019 at 02:59:25PM +0900, Masahiro Yamada wrote:
> KernelCI reports that bcm2835_defconfig is no longer booting since
> commit ac7c3e4ff401 ("compiler: enable CONFIG_OPTIMIZE_INLINING
> forcibly"):
> 
>   https://lkml.org/lkml/2019/9/26/825
> 
> I also received a regression report from Nicolas Saenz Julienne:
> 
>   https://lkml.org/lkml/2019/9/27/263
> 
> This problem has cropped up on arch/arm/config/bcm2835_defconfig
> because it enables CONFIG_CC_OPTIMIZE_FOR_SIZE. The compiler tends
> to prefer not inlining functions with -Os. I was able to reproduce
> it with other boards and defconfig files by manually enabling
> CONFIG_CC_OPTIMIZE_FOR_SIZE.
> 
> The __get_user_check() specifically uses r0, r1, r2 registers.
> So, uaccess_save_and_enable() and uaccess_restore() must be inlined
> in order to avoid those registers being overwritten in the callees.
> 
> Prior to commit 9012d011660e ("compiler: allow all arches to enable
> CONFIG_OPTIMIZE_INLINING"), the 'inline' marker was always enough for
> inlining functions, except on x86.
> 
> Since that commit, all architectures can enable CONFIG_OPTIMIZE_INLINING.
> So, __always_inline is now the only guaranteed way of forcible inlining.
> 
> I want to keep as much compiler's freedom as possible about the inlining
> decision. So, I changed the function call order instead of adding
> __always_inline around.
> 
> Call uaccess_save_and_enable() before assigning the __p ("r0"), and
> uaccess_restore() after evacuating the __e ("r0").
> 
> Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
> Reported-by: "kernelci.org bot" 
> Reported-by: Nicolas Saenz Julienne 
> Signed-off-by: Masahiro Yamada 
> ---
> 
>  arch/arm/include/asm/uaccess.h | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
> index 303248e5b990..559f252d7e3c 100644
> --- a/arch/arm/include/asm/uaccess.h
> +++ b/arch/arm/include/asm/uaccess.h
> @@ -191,11 +191,12 @@ extern int __get_user_64t_4(void *);
>  #define __get_user_check(x, p)					\
>   ({  \
>   unsigned long __limit = current_thread_info()->addr_limit - 1; \
> + unsigned int __ua_flags = uaccess_save_and_enable();\

If the compiler is moving uaccess_save_and_enable(), that's something
we really don't want - the idea is to _minimise_ the number of kernel
memory accesses between enabling userspace access and performing the
actual access.

Fixing it in this way widens the window for the kernel to be doing
something it shouldn't in userspace.

So, the right solution is to ensure that the compiler always inlines
the uaccess_*() helpers - which should be nothing more than four
instructions for uaccess_save_and_enable() and two for the
restore.

I.O.W. it should look something like this:

 144:   ee134f10        mrc     15, 0, r4, cr3, cr0, {0}
 148:   e3c4200c        bic     r2, r4, #12
 14c:   e24e1001        sub     r1, lr, #1
 150:   e3822004        orr     r2, r2, #4
 154:   ee032f10        mcr     15, 0, r2, cr3, cr0, {0}
 158:   f57ff06f        isb     sy
 15c:   ebfffffe        bl      0 <__get_user_4>
 160:   ee034f10        mcr     15, 0, r4, cr3, cr0, {0}
 164:   f57ff06f        isb     sy

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


perf annotate fails with "Invalid -1 error code"

2019-09-30 Thread Russell King - ARM Linux admin
Hi,

While using perf report on aarch64, I try to annotate
__arch_copy_to_user, and it fails with:

Error: Couldn't annotate __arch_copy_to_user: Internal error: Invalid -1 error code

which is not very helpful.  Looking at the code, the error message
appended to the "Couldn't annotate ...:" comes from
symbol__strerror_disassemble(), which expects either an errno or
one of the special SYMBOL_ANNOTATE_ERRNO_* constants in its 3rd
argument.

symbol__tui_annotate() passes the 3rd argument as the return value
from symbol__annotate2().  symbol__annotate2() returns either zero or
-1.  This calls symbol__annotate(), which returns -1 (which would
generally conflict with -EPERM), -errno, the return value of
arch->init, or the return value of symbol__disassemble().

This seems to be something of a mess - different places seem to use
different approaches to handling errors, and some don't bother
propagating the error code up.

The upshot is, the error message reported when trying to annotate
gives the user no clue why perf is unable to annotate, and you have
to resort to stracing perf in an attempt to find out - which also
isn't useful:

3431  pselect6(1, [0], NULL, NULL, NULL, NULL) = 1 (in [0])
3431  pselect6(5, [4], NULL, NULL, {tv_sec=10, tv_nsec=0}, NULL) = 1 (in [4], left {tv_sec=9, tv_nsec=95480})
3431  read(4, "\r", 1)  = 1
3431  uname({sysname="Linux", nodename="cex7", ...}) = 0
3431  openat(AT_FDCWD, "/usr/lib/aarch64-linux-gnu/gconv/gconv-modules.cache", O_RDONLY) = 26
3431  fstat(26, {st_mode=S_IFREG|0644, st_size=26404, ...}) = 0
3431  mmap(NULL, 26404, PROT_READ, MAP_SHARED, 26, 0) = 0x7fa1fd9000
3431  close(26) = 0
3431  futex(0x7fa172b830, FUTEX_WAKE_PRIVATE, 2147483647) = 0
3431  write(1, "\33[10;21H\33[37m\33[40m\342\224\214\342\224\200Error:\342\224"..., 522) = 522
3431  pselect6(1, [0], NULL, NULL, NULL, NULL 

Which makes it rather difficult to know what is actually failing...
so the only way is to resort to gdb.

It seems that dso__disassemble_filename() is returning -1, which
seems to be SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX and as described above,
this is lost due to the lack of error code propagation.

Specifically, the failing statement is:

if (dso->symtab_type == DSO_BINARY_TYPE__KALLSYMS &&
!dso__is_kcore(dso))
return SYMBOL_ANNOTATE_ERRNO__NO_VMLINUX;

Looking at "dso" shows:

kernel = DSO_TYPE_KERNEL,
symtab_type = DSO_BINARY_TYPE__KALLSYMS,
binary_type = DSO_BINARY_TYPE__KALLSYMS,
load_errno = DSO_LOAD_ERRNO__MISMATCHING_BUILDID,
name = 0x88781c "/boot/vmlinux",

and we finally get to the reason - it's using the wrong vmlinux.
So, obvious solution (once the failure reason is known), give it
the correct vmlinux.

Should it really be necessary to resort to gdb to discover why perf
is failing?

It looks like this was introduced by ecda45bd6cfe ("perf annotate:
Introduce symbol__annotate2 method") which did this:

-   err = symbol__annotate(sym, map, evsel, 0, );
+   err = symbol__annotate2(sym, map, evsel, _browser__opts, 
);

+int symbol__annotate2(struct symbol *sym, struct map *map, struct perf_evsel 
*evsel,
+ struct annotation_options *options, struct arch **parch)
+{
...
+   err = symbol__annotate(sym, map, evsel, 0, parch);
+   if (err)
+   goto out_free_offsets;
...
+out_free_offsets:
+   zfree(>offsets);
+   return -1;
+}

introducing this problem by the "return -1" disease.

So, given that this function's return value is used as an error code
in the way I've described above, should this function also be fixed
to return ENOMEM when the zalloc fails, as well as propagating the
return value from symbol__annotate() ?
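
The difference between the two conventions can be sketched in isolation.
This is a stand-alone illustration rather than perf code: the function
names are hypothetical, and the allocation merely stands in for the zalloc
mentioned above.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical inner step: fails with a specific negative errno value. */
static int inner_step_sketch(int fail_code)
{
	return fail_code ? -fail_code : 0;
}

/* The "return -1" pattern: the caller cannot tell why the call failed. */
int outer_lossy_sketch(int fail_code)
{
	int err = inner_step_sketch(fail_code);

	if (err)
		return -1;	/* also indistinguishable from -EPERM */
	return 0;
}

/* Propagating pattern: each failure keeps its own error code. */
int outer_propagating_sketch(int fail_code, size_t nbytes)
{
	void *buf = calloc(1, nbytes);
	int err;

	if (!buf)
		return -ENOMEM;

	err = inner_step_sketch(fail_code);
	free(buf);
	return err;	/* 0 on success, the original negative code otherwise */
}
```

With the second form, a caller that turns the return value into a message
has something meaningful to report.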

I haven't yet checked to see if there's other places that call this
function but now rely on it returning -1... but I'd like to lodge a
plea that perf gets some consistency wrt how errors are passed and
propagated from one function to another.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH v5] gpio/mpc8xxx: change irq handler from chained to normal

2019-09-26 Thread Russell King - ARM Linux admin
On Mon, Sep 16, 2019 at 01:58:17PM +0800, Hui Song wrote:
> From: Song Hui 
> 
> More than one gpio controllers can share one interrupt, change the
> driver to request shared irq.
> 
> Signed-off-by: Laurentiu Tudor 
> Signed-off-by: Alex Marginean 
> Signed-off-by: Song Hui 

While this will work, it will mess up userspace accounting of the
number of interrupts per second in tools such as vmstat.  The reason
is that for every GPIO interrupt, /proc/interrupts records the count
against GIC interrupt 68 or 69, as well as the GPIO itself.  So, for
every GPIO interrupt, the total number of interrupts that the system
has seen increments by two.

If we don't care about accurate interrupt statistics, then this is
fine, but I think it should be mentioned in the commit message.
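
The accounting effect can be shown with a toy model. This is not kernel
code: the structure and function names are hypothetical, and the model
assumes that a chained handler is accounted only against the GPIO interrupt
while a handler installed with request_irq() is accounted at both levels.

```c
/* Toy counters standing in for the per-interrupt statistics. */
struct irq_counts_sketch {
	unsigned long parent;	/* e.g. the GIC line shared by the banks */
	unsigned long gpio;	/* the individual GPIO interrupt */
};

/* Chained handler: only the GPIO interrupt itself is counted. */
void gpio_event_chained(struct irq_counts_sketch *c)
{
	c->gpio++;
}

/* Parent obtained via request_irq(): both levels are counted. */
void gpio_event_requested(struct irq_counts_sketch *c)
{
	c->parent++;
	c->gpio++;
}

/* What a tool that sums all interrupt counts would see. */
unsigned long total_interrupts(const struct irq_counts_sketch *c)
{
	return c->parent + c->gpio;
}
```

One physical GPIO event therefore shows up as one interrupt in the first
case and as two in the second.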

> ---
> Changes in v5:
>   - add traverse every bit function.
> Changes in v4:
>   - convert 'pr_err' to 'dev_err'.
> Changes in v3:
>   - update the patch description.
> Changes in v2:
>   - delete the compatible of ls1088a.
>  drivers/gpio/gpio-mpc8xxx.c | 30 +++---
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpio/gpio-mpc8xxx.c b/drivers/gpio/gpio-mpc8xxx.c
> index 16a47de..3a06ca9 100644
> --- a/drivers/gpio/gpio-mpc8xxx.c
> +++ b/drivers/gpio/gpio-mpc8xxx.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define MPC8XXX_GPIO_PINS32
>  
> @@ -127,20 +128,20 @@ static int mpc8xxx_gpio_to_irq(struct gpio_chip *gc, 
> unsigned offset)
>   return -ENXIO;
>  }
>  
> -static void mpc8xxx_gpio_irq_cascade(struct irq_desc *desc)
> +static irqreturn_t mpc8xxx_gpio_irq_cascade(int irq, void *data)
>  {
> - struct mpc8xxx_gpio_chip *mpc8xxx_gc = irq_desc_get_handler_data(desc);
> - struct irq_chip *chip = irq_desc_get_chip(desc);
> + struct mpc8xxx_gpio_chip *mpc8xxx_gc = (struct mpc8xxx_gpio_chip *)data;
>   struct gpio_chip *gc = _gc->gc;
>   unsigned int mask;
> + int i;
>  
>   mask = gc->read_reg(mpc8xxx_gc->regs + GPIO_IER)
>   & gc->read_reg(mpc8xxx_gc->regs + GPIO_IMR);
> - if (mask)
> + for_each_set_bit(i, , 32)
>   generic_handle_irq(irq_linear_revmap(mpc8xxx_gc->irq,
> -  32 - ffs(mask)));
> - if (chip->irq_eoi)
> - chip->irq_eoi(>irq_data);
> +  31 - i));
> +
> + return IRQ_HANDLED;
>  }
>  
>  static void mpc8xxx_irq_unmask(struct irq_data *d)
> @@ -388,8 +389,8 @@ static int mpc8xxx_probe(struct platform_device *pdev)
>  
>   ret = gpiochip_add_data(gc, mpc8xxx_gc);
>   if (ret) {
> - pr_err("%pOF: GPIO chip registration failed with status %d\n",
> -np, ret);
> + dev_err(>dev, "%pOF: GPIO chip registration failed with 
> status %d\n",
> + np, ret);
>   goto err;
>   }
>  
> @@ -409,8 +410,15 @@ static int mpc8xxx_probe(struct platform_device *pdev)
>   if (devtype->gpio_dir_in_init)
>   devtype->gpio_dir_in_init(gc);
>  
> - irq_set_chained_handler_and_data(mpc8xxx_gc->irqn,
> -  mpc8xxx_gpio_irq_cascade, mpc8xxx_gc);
> + ret = request_irq(mpc8xxx_gc->irqn, mpc8xxx_gpio_irq_cascade,
> +   IRQF_NO_THREAD | IRQF_SHARED, "gpio-cascade",
> +   mpc8xxx_gc);
> +     if (ret) {
> + dev_err(>dev, "%s: failed to request_irq(%d), ret = %d\n",
> + np->full_name, mpc8xxx_gc->irqn, ret);
> + goto err;
> + }
> +
>   return 0;
>  err:
>   iounmap(mpc8xxx_gc->regs);
> -- 
> 2.9.5
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 4/6] PCI: mobiveil: Add workaround for unsupported request error

2019-09-24 Thread Russell King - ARM Linux admin
On Mon, Sep 16, 2019 at 10:17:40AM +0800, Xiaowei Bao wrote:
> Errata: unsupported request error on inbound posted write
> transaction, PCIe controller reports advisory error instead
> of uncorrectable error message to RC.
> 
> Signed-off-by: Xiaowei Bao 
> ---
>  drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c | 13 +
>  drivers/pci/controller/mobiveil/pcie-mobiveil.h   |  4 
>  2 files changed, 17 insertions(+)
> 
> diff --git a/drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c 
> b/drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> index 7bfec51..5bc9ed7 100644
> --- a/drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> +++ b/drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> @@ -49,6 +49,19 @@ static void ls_pcie_g4_ep_init(struct mobiveil_pcie_ep *ep)
>   struct mobiveil_pcie *mv_pci = to_mobiveil_pcie_from_ep(ep);
>   int win_idx;
>   u8 bar;
> + u32 val;
> +
> + /*
> +  * Errata: unsupported request error on inbound posted write
> +  * transaction, PCIe controller reports advisory error instead
> +  * of uncorrectable error message to RC.
> +  * workaround: set the bit20(unsupported_request_Error_severity) with
> +  * value 1 in uncorrectable_Error_Severity_Register, make the
> +  * unsupported request error generate the fatal error.
> +  */
> + val =  csr_readl(mv_pci, CFG_UNCORRECTABLE_ERROR_SEVERITY);
> + val |= 1 << UNSUPPORTED_REQUEST_ERROR_SHIFT;

   BIT(UNSUPPORTED_REQUEST_ERROR_SHIFT) ?
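
For reference, the suggestion amounts to something like the sketch below.
BIT_SKETCH() is an assumed definition modelled on the kernel's BIT()
helper, and set_ur_severity_sketch() is a hypothetical wrapper used only to
show the read-modify-write step on a plain value.

```c
/* Assumed definition, modelled on the kernel's BIT() helper. */
#define BIT_SKETCH(nr)	(1UL << (nr))

#define UNSUPPORTED_REQUEST_ERROR_SHIFT	20

/* Sets the severity bit in a register value, leaving other bits alone. */
unsigned int set_ur_severity_sketch(unsigned int val)
{
	return val | BIT_SKETCH(UNSUPPORTED_REQUEST_ERROR_SHIFT);
}
```

Using an unsigned constant also avoids shifting into the sign bit of a
plain int when the shift count is 31.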

> + csr_writel(mv_pci, val, CFG_UNCORRECTABLE_ERROR_SEVERITY);
>  
>   ep->bar_num = PCIE_LX2_BAR_NUM;
>  
> diff --git a/drivers/pci/controller/mobiveil/pcie-mobiveil.h 
> b/drivers/pci/controller/mobiveil/pcie-mobiveil.h
> index 7308fa4..a40707e 100644
> --- a/drivers/pci/controller/mobiveil/pcie-mobiveil.h
> +++ b/drivers/pci/controller/mobiveil/pcie-mobiveil.h
> @@ -123,6 +123,10 @@
>  #define GPEX_BAR_SIZE_UDW0x4DC
>  #define GPEX_BAR_SELECT  0x4E0
>  
> +#define CFG_UNCORRECTABLE_ERROR_SEVERITY 0x10c
> +#define UNSUPPORTED_REQUEST_ERROR_SHIFT  20
> +#define CFG_UNCORRECTABLE_ERROR_MASK 0x108
> +
>  /* starting offset of INTX bits in status register */
>  #define PAB_INTX_START   5
>  
> -- 
> 2.9.5
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up


Re: [PATCH 3/6] PCI: mobiveil: Add PCIe Gen4 EP driver for NXP Layerscape SoCs

2019-09-24 Thread Russell King - ARM Linux admin
On Mon, Sep 16, 2019 at 10:17:39AM +0800, Xiaowei Bao wrote:
> This PCIe controller is based on the Mobiveil GPEX IP; it works in EP
> mode if this config option is selected.
> 
> Signed-off-by: Xiaowei Bao 
> ---
>  MAINTAINERS|   2 +
>  drivers/pci/controller/mobiveil/Kconfig|  17 ++-
>  drivers/pci/controller/mobiveil/Makefile   |   1 +
>  .../controller/mobiveil/pcie-layerscape-gen4-ep.c  | 156 
> +
>  4 files changed, 173 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b997056..0858b54 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -12363,11 +12363,13 @@ F:  drivers/pci/controller/dwc/*layerscape*
>  
>  PCI DRIVER FOR NXP LAYERSCAPE GEN4 CONTROLLER
>  M:   Hou Zhiqiang 
> +M:   Xiaowei Bao 
>  L:   linux-...@vger.kernel.org
>  L:   linux-arm-ker...@lists.infradead.org
>  S:   Maintained
>  F:   Documentation/devicetree/bindings/pci/layerscape-pcie-gen4.txt
>  F:   drivers/pci/controller/mobibeil/pcie-layerscape-gen4.c
> +F:   drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
>  
>  PCI DRIVER FOR GENERIC OF HOSTS
>  M:   Will Deacon 
> diff --git a/drivers/pci/controller/mobiveil/Kconfig 
> b/drivers/pci/controller/mobiveil/Kconfig
> index 2054950..0696b6e 100644
> --- a/drivers/pci/controller/mobiveil/Kconfig
> +++ b/drivers/pci/controller/mobiveil/Kconfig
> @@ -27,13 +27,24 @@ config PCIE_MOBIVEIL_PLAT
> for address translation and it is a PCIe Gen4 IP.
>  
>  config PCIE_LAYERSCAPE_GEN4
> - bool "Freescale Layerscape PCIe Gen4 controller"
> + bool "Freescale Layerscpe PCIe Gen4 controller in RC mode"
>   depends on PCI
>   depends on OF && (ARM64 || ARCH_LAYERSCAPE)
>   depends on PCI_MSI_IRQ_DOMAIN
>   select PCIE_MOBIVEIL_HOST
>   help
> Say Y here if you want PCIe Gen4 controller support on
> -   Layerscape SoCs. The PCIe controller can work in RC or
> -   EP mode according to RCW[HOST_AGT_PEX] setting.
> +   Layerscape SoCs. And the PCIe controller work in RC mode
> +   by setting the RCW[HOST_AGT_PEX] to 0.
> +
> +config PCIE_LAYERSCAPE_GEN4_EP
> + bool "Freescale Layerscpe PCIe Gen4 controller in EP mode"
> + depends on PCI
> + depends on OF && (ARM64 || ARCH_LAYERSCAPE)
> + depends on PCI_ENDPOINT
> + select PCIE_MOBIVEIL_EP
> + help
> +   Say Y here if you want PCIe Gen4 controller support on
> +   Layerscape SoCs. And the PCIe controller work in EP mode
> +   by setting the RCW[HOST_AGT_PEX] to 1.
>  endmenu
> diff --git a/drivers/pci/controller/mobiveil/Makefile 
> b/drivers/pci/controller/mobiveil/Makefile
> index 686d41f..6f54856 100644
> --- a/drivers/pci/controller/mobiveil/Makefile
> +++ b/drivers/pci/controller/mobiveil/Makefile
> @@ -4,3 +4,4 @@ obj-$(CONFIG_PCIE_MOBIVEIL_HOST) += pcie-mobiveil-host.o
>  obj-$(CONFIG_PCIE_MOBIVEIL_EP) += pcie-mobiveil-ep.o
>  obj-$(CONFIG_PCIE_MOBIVEIL_PLAT) += pcie-mobiveil-plat.o
>  obj-$(CONFIG_PCIE_LAYERSCAPE_GEN4) += pcie-layerscape-gen4.o
> +obj-$(CONFIG_PCIE_LAYERSCAPE_GEN4_EP) += pcie-layerscape-gen4-ep.o
> diff --git a/drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c 
> b/drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> new file mode 100644
> index 000..7bfec51
> --- /dev/null
> +++ b/drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> @@ -0,0 +1,156 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PCIe controller EP driver for Freescale Layerscape SoCs
> + *
> + * Copyright (C) 2019 NXP Semiconductor.
> + *
> + * Author: Xiaowei Bao 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "pcie-mobiveil.h"
> +
> +#define PCIE_LX2_BAR_NUM 4
> +
> +#define to_ls_pcie_g4_ep(x)  dev_get_drvdata((x)->dev)
> +
> +struct ls_pcie_g4_ep {
> + struct mobiveil_pcie*mv_pci;
> +};
> +
> +static const struct of_device_id ls_pcie_g4_ep_of_match[] = {
> + { .compatible = "fsl,lx2160a-pcie-ep",},
> + { },
> +};
> +
> +static const struct pci_epc_features ls_pcie_g4_epc_features = {
> + .linkup_notifier = false,
> + .msi_capable = true,
> + .msix_capable = true,
> + .reserved_bar = (1 << BAR_4) | (1 << BAR_5),

BIT(BAR_4) | BIT(BAR_5) ?

> +};
> +
> +static const struct pci_epc_features*
> +ls_pcie_g4_ep_get_features(struct mobiveil_pcie_ep 

Re: [PATCH 0/6] Add the Mobiveil EP and Layerscape Gen4 EP driver support

2019-09-24 Thread Russell King - ARM Linux admin
On Tue, Sep 24, 2019 at 03:18:47PM +0100, Russell King - ARM Linux admin wrote:
> On Mon, Sep 16, 2019 at 10:17:36AM +0800, Xiaowei Bao wrote:
> > This patch set adds the Mobiveil EP driver and the PCIe Gen4 EP
> > driver for the NXP Layerscape platform.
> > 
> > This patch set depends on:
> > https://patchwork.kernel.org/project/linux-pci/list/?series=159139
> > 
> > Xiaowei Bao (6):
> >   PCI: mobiveil: Add the EP driver support
> >   dt-bindings: Add DT binding for PCIE GEN4 EP of the layerscape
> >   PCI: mobiveil: Add PCIe Gen4 EP driver for NXP Layerscape SoCs
> >   PCI: mobiveil: Add workaround for unsupported request error
> >   arm64: dts: lx2160a: Add PCIe EP node
> >   misc: pci_endpoint_test: Add the layerscape PCIe GEN4 EP device
> > support
> > 
> >  .../bindings/pci/layerscape-pcie-gen4.txt  |  28 +-
> >  MAINTAINERS|   3 +
> >  arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi |  56 ++
> >  drivers/misc/pci_endpoint_test.c   |   2 +
> >  drivers/pci/controller/mobiveil/Kconfig|  22 +-
> >  drivers/pci/controller/mobiveil/Makefile   |   2 +
> >  .../controller/mobiveil/pcie-layerscape-gen4-ep.c  | 169 ++
> >  drivers/pci/controller/mobiveil/pcie-mobiveil-ep.c | 568 
> > +
> >  drivers/pci/controller/mobiveil/pcie-mobiveil.c|  99 +++-
> >  drivers/pci/controller/mobiveil/pcie-mobiveil.h|  72 +++
> >  10 files changed, 1009 insertions(+), 12 deletions(-)
> >  create mode 100644 
> > drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
> >  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil-ep.c
> 
> Hi,
> 
> I've applied "PCI: mobiveil: Fix the CPU base address setup in inbound
> window" and your patch set to 5.3, which seems to be able to detect the
> PCIe card I have plugged in:
> 
> layerscape-pcie-gen4 380.pcie: host bridge /soc/pcie@380 ranges:
> layerscape-pcie-gen4 380.pcie:   MEM 0xa04000..0xa07fff -> 0x4000
> layerscape-pcie-gen4 380.pcie: PCI host bridge to bus :00
> pci_bus :00: root bus resource [bus 00-ff]
> pci_bus :00: root bus resource [mem 0xa04000-0xa07fff] (bus address [0x4000-0x7fff])
> pci :00:00.0: [1957:8d90] type 01 class 0x060400
> pci :00:00.0: enabling Extended Tags
> pci :00:00.0: supports D1 D2
> pci :00:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> pci :01:00.0: [15b3:6750] type 00 class 0x02
> pci :01:00.0: reg 0x10: [mem 0xa04000-0xa0400f 64bit]
> pci :01:00.0: reg 0x18: [mem 0xa04080-0xa040ff 64bit pref]
> pci :01:00.0: reg 0x30: [mem 0xa04100-0xa0410f pref]
> pci :00:00.0: up support 3 enabled 0
> pci :00:00.0: dn support 1 enabled 0
> pci :00:00.0: BAR 9: assigned [mem 0xa04000-0xa0407f 64bit pref]
> pci :00:00.0: BAR 8: assigned [mem 0xa04080-0xa0409f]
> pci :01:00.0: BAR 2: assigned [mem 0xa04000-0xa0407f 64bit pref]
> pci :01:00.0: BAR 0: assigned [mem 0xa04080-0xa0408f 64bit]
> pci :01:00.0: BAR 6: assigned [mem 0xa04090-0xa0409f pref]
> pci :00:00.0: PCI bridge to [bus 01-ff]
> pci :00:00.0:   bridge window [mem 0xa04080-0xa0409f]
> pci :00:00.0:   bridge window [mem 0xa04000-0xa0407f 64bit pref]
> pci :00:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  256
> pci :01:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  256
> pcieport :00:00.0: PCIe capabilities: 0x13
> pcieport :00:00.0: init_service_irqs: -19
> 
> However, a bit later in the kernel boot, I get:
> 
> SError Interrupt on CPU1, code 0xbf02 -- SError
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.3.0+ #392
> Hardware name: SolidRun LX2160A COM express type 7 module (DT)
> pstate: 60400085 (nZCv daIf +PAN -UAO)
> pc : pci_generic_config_read+0xb0/0xc0
> lr : pci_generic_config_read+0x1c/0xc0
> sp : ff8010f9baf0
> x29: ff8010f9baf0 x28: ff8010d620a0
> x27: ff8010d79000 x26: ff8010d62000
> x25: ff8010cb06d4 x24: 
> x23: ff8010e499b8 x22: ff8010f9bbaf
> x21:  x20: ffe2eda11800
> x19: ff8010f62158 x18: ff8010bdede0
> x17: ff8010bdede8 x16: ff8010b96970
> x15:  x14: ff00
> x13:  x12: 0030
> x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
> x9 : 2dff716475687163 x8 : 
> x7 : fefefefefefefefe x6 : 
> x5 :  x4 : ff8010f9bb6c
> x3

Re: [PATCH 0/6] Add the Mobiveil EP and Layerscape Gen4 EP driver support

2019-09-24 Thread Russell King - ARM Linux admin
On Mon, Sep 16, 2019 at 10:17:36AM +0800, Xiaowei Bao wrote:
> This patch set adds the Mobiveil EP driver and the PCIe Gen4 EP
> driver for the NXP Layerscape platform.
> 
> This patch set depends on:
> https://patchwork.kernel.org/project/linux-pci/list/?series=159139
> 
> Xiaowei Bao (6):
>   PCI: mobiveil: Add the EP driver support
>   dt-bindings: Add DT binding for PCIE GEN4 EP of the layerscape
>   PCI: mobiveil: Add PCIe Gen4 EP driver for NXP Layerscape SoCs
>   PCI: mobiveil: Add workaround for unsupported request error
>   arm64: dts: lx2160a: Add PCIe EP node
>   misc: pci_endpoint_test: Add the layerscape PCIe GEN4 EP device
> support
> 
>  .../bindings/pci/layerscape-pcie-gen4.txt  |  28 +-
>  MAINTAINERS|   3 +
>  arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi |  56 ++
>  drivers/misc/pci_endpoint_test.c   |   2 +
>  drivers/pci/controller/mobiveil/Kconfig|  22 +-
>  drivers/pci/controller/mobiveil/Makefile   |   2 +
>  .../controller/mobiveil/pcie-layerscape-gen4-ep.c  | 169 ++
>  drivers/pci/controller/mobiveil/pcie-mobiveil-ep.c | 568 
> +
>  drivers/pci/controller/mobiveil/pcie-mobiveil.c|  99 +++-
>  drivers/pci/controller/mobiveil/pcie-mobiveil.h|  72 +++
>  10 files changed, 1009 insertions(+), 12 deletions(-)
>  create mode 100644 drivers/pci/controller/mobiveil/pcie-layerscape-gen4-ep.c
>  create mode 100644 drivers/pci/controller/mobiveil/pcie-mobiveil-ep.c

Hi,

I've applied "PCI: mobiveil: Fix the CPU base address setup in inbound
window" and your patch set to 5.3, which seems to be able to detect the
PCIe card I have plugged in:

layerscape-pcie-gen4 380.pcie: host bridge /soc/pcie@380 ranges:
layerscape-pcie-gen4 380.pcie:   MEM 0xa04000..0xa07fff -> 0x4000
layerscape-pcie-gen4 380.pcie: PCI host bridge to bus :00
pci_bus :00: root bus resource [bus 00-ff]
pci_bus :00: root bus resource [mem 0xa04000-0xa07fff] (bus address [0x4000-0x7fff])
pci :00:00.0: [1957:8d90] type 01 class 0x060400
pci :00:00.0: enabling Extended Tags
pci :00:00.0: supports D1 D2
pci :00:00.0: PME# supported from D0 D1 D2 D3hot D3cold
pci :01:00.0: [15b3:6750] type 00 class 0x02
pci :01:00.0: reg 0x10: [mem 0xa04000-0xa0400f 64bit]
pci :01:00.0: reg 0x18: [mem 0xa04080-0xa040ff 64bit pref]
pci :01:00.0: reg 0x30: [mem 0xa04100-0xa0410f pref]
pci :00:00.0: up support 3 enabled 0
pci :00:00.0: dn support 1 enabled 0
pci :00:00.0: BAR 9: assigned [mem 0xa04000-0xa0407f 64bit pref]
pci :00:00.0: BAR 8: assigned [mem 0xa04080-0xa0409f]
pci :01:00.0: BAR 2: assigned [mem 0xa04000-0xa0407f 64bit pref]
pci :01:00.0: BAR 0: assigned [mem 0xa04080-0xa0408f 64bit]
pci :01:00.0: BAR 6: assigned [mem 0xa04090-0xa0409f pref]
pci :00:00.0: PCI bridge to [bus 01-ff]
pci :00:00.0:   bridge window [mem 0xa04080-0xa0409f]
pci :00:00.0:   bridge window [mem 0xa04000-0xa0407f 64bit pref]
pci :00:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  256
pci :01:00.0: Max Payload Size set to  256/ 256 (was  128), Max Read Rq  256
pcieport :00:00.0: PCIe capabilities: 0x13
pcieport :00:00.0: init_service_irqs: -19

However, a bit later in the kernel boot, I get:

SError Interrupt on CPU1, code 0xbf02 -- SError
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.3.0+ #392
Hardware name: SolidRun LX2160A COM express type 7 module (DT)
pstate: 60400085 (nZCv daIf +PAN -UAO)
pc : pci_generic_config_read+0xb0/0xc0
lr : pci_generic_config_read+0x1c/0xc0
sp : ff8010f9baf0
x29: ff8010f9baf0 x28: ff8010d620a0
x27: ff8010d79000 x26: ff8010d62000
x25: ff8010cb06d4 x24: 
x23: ff8010e499b8 x22: ff8010f9bbaf
x21:  x20: ffe2eda11800
x19: ff8010f62158 x18: ff8010bdede0
x17: ff8010bdede8 x16: ff8010b96970
x15:  x14: ff00
x13:  x12: 0030
x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
x9 : 2dff716475687163 x8 : 
x7 : fefefefefefefefe x6 : 
x5 :  x4 : ff8010f9bb6c
x3 : 0001 x2 : 0003
x1 :  x0 : 
Kernel panic - not syncing: Asynchronous SError Interrupt
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.3.0+ #392
Hardware name: SolidRun LX2160A COM express type 7 module (DT)
Call trace:
 dump_backtrace+0x0/0x120
 show_stack+0x14/0x1c
 dump_stack+0x9c/0xc0
 panic+0x148/0x34c
 print_tainted+0x0/0xa8
 arm64_serror_panic+0x74/0x80
 do_serror+0x8c/0x13c
 el1_error+0xbc/0x160
 pci_generic_config_read+0xb0/0xc0
 pci_bus_read_config_byte+0x64/0x90
 pci_read_co

Re: [PATCH] arm: export memblock_reserve()d regions via /proc/iomem

2019-09-23 Thread Russell King - ARM Linux admin
On Mon, Sep 23, 2019 at 11:42:54PM +0800, Yu Chen wrote:
> From: Yu Chen 
> 
> On Sat, 21 Sep 2019 15:51:38, Russell King - ARM Linux admin wrote:
> > On Sat, Sep 21, 2019 at 09:02:49PM +0800, Yu Chen wrote:
> > > From: Yu Chen  
> > >  
> > > > memblock reserved regions are not reported via /proc/iomem on ARM, so kexec's
> > > > user-space doesn't know about memblock_reserve()d regions and it is thus
> > > > possible for kexec to overwrite them with the new kernel or initrd.
> > 
> > Many reserved regions come from the kernel allocating memory during
> > boot.  We don't want to prevent kexec re-using those regions.
> > 
> > > > [    0.000000] Booting Linux on physical CPU 0xf00
> > > > [    0.000000] Linux version 4.9.115-rt93-dirty (yuchen@localhost.localdomain) (gcc version 6.2.0 (ZTE Embsys-TSP V3.07.20) ) #62 SMP PREEMPT Fri Sep 20 10:39:29 CST 2019
> > > > [    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=30c5387d
> > > > [    0.000000] CPU: div instructions available: patching division code
> > > > [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > > > [    0.000000] OF: fdt:Machine model: LS1021A TWR Board
> > > > [    0.000000] INITRD: 0x80f7f000+0x03695e40 overlaps in-use memory region - disabling initrd
> > 
> > Is the overlapping region one that is marked as reserved in DT?
> 
> the overlapping region is not reserved in DT.
> 
> > Where is the reserved region that overlaps the initrd coming from?
> 
> I found that the reserved region that overlaps the initrd is kernel code & data.
> Starting the new kernel with memblock=debug on the command line:
> 
> / # kexec -l uImage-ls1021a --ramdisk=ramdisk-ls1021a --dtb=fdt --append="root=/dev/ram0 rw console=ttyS0,115200 earlyprintk memblock=debug" -d
> Try gzip decompression.
> Try LZMA decompression.
> lzma_decompress_file: read on uImage-ls1021a of 65536 bytes failed
> kernel: 0xb6c71008 kernel_size: 0x317ab8
> MEMORY RANGES
> 8000-bfff (0)
> 80003000-80007fff (1)
> 80e0-80ff (1)
> 810c45a4-810c4fff (1)
> 81ac4000-85159fff (1)
> 8515a000-8515 (1)
> 8800-8b695fff (1)
> 8f00-8f004fff (1)
> af709000-af7eafff (1)
> af7ed000-afffbfff (1)
> afffc000-afffcfff (1)
> afffd000-afff (1)
> bc00-bfff (1)
> zImage header: 0x016f2818 0x 0x00317a78
> zImage size 0x317a78, file size 0x317a78

I see nothing here that suggests either a new kexec or a sufficiently
new kernel.  Hence, kexec lacks the information it needs to lay out
the images correctly in physical memory.
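
To make the problem concrete, here is a rough sketch of the overlap check that the placement has to pass.  It uses the initrd placement reported by the kernel (0x80f7f000+0x03695e40) and two of the ranges kexec printed under MEMORY RANGES.  The struct and helper names are made up for illustration; this is not kexec-tools code.

```c
#include <stdbool.h>
#include <stdint.h>

/* A physical address range, end inclusive, as printed under MEMORY RANGES. */
struct phys_range {
	uint32_t start;
	uint32_t end;
};

/* Two inclusive ranges overlap when each starts at or before the other ends. */
static bool ranges_overlap(struct phys_range a, struct phys_range b)
{
	return a.start <= b.end && b.start <= a.end;
}

/* Hypothetical helper: build an inclusive range from a base and a size. */
static struct phys_range range_from_size(uint32_t base, uint32_t size)
{
	struct phys_range r = { base, base + size - 1 };
	return r;
}
```

With those numbers the initrd spans 0x80f7f000-0x84614e3f, which covers the range 810c45a4-810c4fff listed by kexec, whereas af709000-af7eafff lies well clear of it.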

The kernel was augmented with additional information around the
v4.15 time.  See commits:

c772568788b5 ARM: add additional table to compressed kernel
429f7a062e3b ARM: decompressor: fix BSS size calculation
99cf8f903148 ARM: better diagnostics with missing/corrupt dtb

There may be some others also needed, but I forget now, it was two
years ago.

For kexec, you need at least 2.0.17 (2.0.16 merged the wrong version
of one of my patches.)

> kexec_load: entry = 0x80008000 flags = 0x28
> nr_segments = 3
> segment[0].buf   = 0xb6c71048
> segment[0].bufsz = 0x317a78
> segment[0].mem   = 0x80008000
> segment[0].memsz = 0x318000
> segment[1].buf   = 0xb35db048
> segment[1].bufsz = 0x3695e40
> segment[1].mem   = 0x80f7f000
> segment[1].memsz = 0x3696000
> segment[2].buf   = 0x100b108
> segment[2].bufsz = 0x5090
> segment[2].mem   = 0x84615000
> segment[2].memsz = 0x6000
> / # kexec -e
> [  126.583598] kexec_core: Starting new kernel
> [  126.587815] Disabling non-boot CPUs ...
> [  126.626917] CPU1: shutdown
> [  126.656344] Retrying again to check for CPU kill
> [  126.660947] CPU1 killed.
> [  126.687585] Bye!
> [0.00] Booting Linux on physical CPU 0xf00
> [0.00] Linux version 4.9.115-rt93-CGEL-V6.02.10.R4-dirty 
> (yuchen@localhost.localdomain) (gcc version 6.2.0 (ZTE Embsys-TSP V3.07.20) ) 
> #62 SMP PREEMPT Fri Sep 20 10:39:29 CST 2019
> [0.00] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=30c5387d
> [0.00] CPU: div instructions available: patching division code
> [0.00] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing 
> instruction cache
> [0.00] OF: fdt:Machine model: LS1021A TWR Board
> [0.00] memblock_reserve: [0x008020-0x00810c45a3] flags 
> 0x0 arm_memblock_init+0x44/0x23c
> [0.00

Re: [PATCH] arm: export memblock_reserve()d regions via /proc/iomem

2019-09-21 Thread Russell King - ARM Linux admin
On Sat, Sep 21, 2019 at 09:02:49PM +0800, Yu Chen wrote:
> From: Yu Chen 
> 
> memblock reserved regions are not reported via /proc/iomem on ARM, so kexec's
> user-space doesn't know about memblock_reserve()d regions and it is thus
> possible for kexec to overwrite them with the new kernel or initrd.

Many reserved regions come from the kernel allocating memory during
boot.  We don't want to prevent kexec re-using those regions.

> [    0.000000] Booting Linux on physical CPU 0xf00
> [    0.000000] Linux version 4.9.115-rt93-dirty (yuchen@localhost.localdomain) (gcc version 6.2.0 (ZTE Embsys-TSP V3.07.20) ) #62 SMP PREEMPT Fri Sep 20 10:39:29 CST 2019
> [    0.000000] CPU: ARMv7 Processor [410fc075] revision 5 (ARMv7), cr=30c5387d
> [    0.000000] CPU: div instructions available: patching division code
> [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> [    0.000000] OF: fdt:Machine model: LS1021A TWR Board
> [    0.000000] INITRD: 0x80f7f000+0x03695e40 overlaps in-use memory region - disabling initrd

Is the overlapping region one that is marked as reserved in DT?
Where is the reserved region that overlaps the initrd coming from?

Thanks.


Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-20 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 10:42:01PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Sep 17, 2019 at 06:19:13PM +0100, Russell King - ARM Linux admin 
> wrote:
> > whether you can get the link to come up at all.  You might need to see
> > whether wiggling the RJ45 helps (I've had that sort of thing with some
> > cables.)
> > 
> > You might also need "ethtool -s eth0 advertise ffcf" after trying that
> > if it doesn't work to take the gigabit speeds out of the advertisement.
> > 
> > Thanks.
> > 
> >  drivers/net/phy/at803x.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> > index b3893347804d..85cf4a4a5e81 100644
> > --- a/drivers/net/phy/at803x.c
> > +++ b/drivers/net/phy/at803x.c
> > @@ -296,6 +296,11 @@ static int at803x_config_init(struct phy_device 
> > *phydev)
> > if (ret < 0)
> > return ret;
> >  
> > +   /* Disable smartspeed */
> > +   ret = phy_modify(phydev, 0x14, BIT(5), 0);
> > +   if (ret < 0)
> > +   return ret;
> > +
> > /* The RX and TX delay default is:
> >  *   after HW reset: RX delay enabled and TX delay disabled
> >  *   after SW reset: RX delay enabled, while TX delay retains the
> 
> Hi,
> 
> Could you try this patch instead - it seems that the PHY needs to be
> soft-reset for the write to take effect, and _even_ for the clearance
> of the bit to become visible in the register.
> 
> I'm not expecting this on its own to solve anything, but it should at
> least mean that the at803x doesn't modify the advertisement registers
> itself.  It may mean that the link doesn't even come up without forcing
> the advertisement via the ethtool command I mentioned before.
> 
> Thanks.
> 
>  drivers/net/phy/at803x.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> index b3893347804d..69a58c0e6b42 100644
> --- a/drivers/net/phy/at803x.c
> +++ b/drivers/net/phy/at803x.c
> @@ -296,6 +296,16 @@ static int at803x_config_init(struct phy_device *phydev)
>   if (ret < 0)
>   return ret;
>  
> + /* Disable smartspeed */
> + ret = phy_modify(phydev, 0x14, BIT(5), 0);
> + if (ret < 0)
> + return ret;
> +
> + /* Must soft-reset the PHY for smartspeed disable to take effect */
> + ret = genphy_soft_reset(phydev);
> + if (ret < 0)
> + return ret;
> +
>   /* The RX and TX delay default is:
>*   after HW reset: RX delay enabled and TX delay disabled
>*   after SW reset: RX delay enabled, while TX delay retains the

Bad news I'm afraid.  It looks like the AR8035 has a bug in it.
Disabling the SmartSpeed feature appears to make register 9, the
1000BASET control register, read-only.

For example:

Reading 0x0009=0x0200
Writing 0x0014=0x082c   <= smartspeed enabled
Writing 0x0000=0xb100   <= soft reset
Writing 0x0009=0x0600
Reading 0x0009=0x0600   <= it took the value

Reading 0x0009=0x0600
Writing 0x0014=0x080c   <= smartspeed disabled
Writing 0x0000=0xb100   <= soft reset
Writing 0x0009=0x0200
Reading 0x0009=0x0600   <= it ignored the write

Reading 0x0009=0x0600
Writing 0x0014=0x082c   <= smartspeed enabled
Writing 0x0000=0xb100   <= soft reset
Writing 0x0009=0x0200
Reading 0x0009=0x0200   <= it took the value
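
The two values written to register 0x14 above differ only in bit 5, the smartspeed enable bit.  As a sanity check, here is a small userspace model of the read-modify-write that the phy_modify() call in the patch performs (clear the mask bits, then set the new bits).  reg_modify() and the SMARTSPEED_EN name are made up for this sketch and no MDIO access takes place.

```c
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro (assumption: userspace build). */
#define BIT(nr)	(1U << (nr))

/* Smartspeed enable bit in register 0x14, as described in this thread. */
#define SMARTSPEED_EN	BIT(5)

/*
 * Userspace model of a read-modify-write: clear the bits in 'mask',
 * then set the bits in 'set'.  No hardware is touched here.
 */
static uint16_t reg_modify(uint16_t old, uint16_t mask, uint16_t set)
{
	return (uint16_t)((old & ~mask) | set);
}
```

Starting from 0x082c, clearing SMARTSPEED_EN gives 0x080c, and setting it again gives 0x082c, which matches the values in the log.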

If it's going to make register 9 read-only when smartspeed is disabled,
then that's another failure mode and autonegotiation cockup just
waiting to happen - which I spotted when trying to configure the
advertisement using ethtool, and finding that it was impossible to stop
1000baseT/Full being advertised.

I think the only sane approach - at least until we have something more
reasonable in place - is to base the negotiation result off what is
actually stored in the PHY registers at the time the link comes up, and
not on the cached versions of what we should be advertising.

5502b218e001 has caused this regression, and where we are now after
more than a week of trying to come up with some fix for this
regression, the only solution that seems to work without introducing
more failures is to revert that commit.

Adding Heiner (original commit author), Florian, David and netdev.

Thoughts?


Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-20 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 08:39:34PM +0200, Andrew Lunn wrote:
> > > Well, the _correct_ driver needs to be used for the PHY specific
> > > features to be properly controlled.  Using the generic driver
> > > in this situation will not be guaranteed to work.
> 
> I fully agree about the PHY driver. I expect this device is
> violating c22 when it modifies the advertisement register itself. So
> all bets are off for the genphy.
> 
> > Well, this hasn't worked, but not for the obvious reason.  Register 0x14
> > is documented as read/write.  Bits 15:6 are reserved, bit 5 is the
> > smart speed enable, 4:2 configures the attempts, bit 1 sets the link
> > stable condition, bit 0 is reserved.
> > 
> > Writing 0x80c results in the register reading back 0x82c.  Writing
> > 0x800 results in the same.  Writing 0 reads back 0x2c.  Writing 0x
> > seems to prevent packets being passed - and at that point I lost
> > control so I couldn't see what the result was.
> > 
> > There is nothing in the data sheet which suggests that there is any
> > gating of this register.  So it looks like we're stuck with smartspeed
> > enabled.
> > 
> > So, I think there's only two remaining ways forward - to revert commit
> > 5502b218e001 to restore the old behaviour, read back the advertisement
> > from the PHY along with the rest of the status, as I've previously
> > stated.  It means that phylib will modify phydev->advertising at
> > random points, just as it modifies phydev->lp_advertising, so locking
> > may become an issue.  The revert approach is probably best until we
> > have something working along those lines.
> 
> We have a couple of other PHYs which support downshift. We should see
> if we can follow what they do. What I think is important is that
> read_status return the correct speed. So we probably cannot use
> genphy_read_status() as is. Maybe we should split genphy_read_status()
> into two, so the register reading bit can be done unconditionally by
> phy drivers for hardware which don't report link down when they
> should?

I think we need to check how the downshift feature works on other PHYs
and whether it is enabled there.

Looking at the Marvell 88e151x PHYs, they have the feature, but do not
enable it by default.  If firmware has enabled the feature, phylib will
incorrectly resolve the link speed based on just the advertisements.

I think the safest way in the case of both PHYs to ascertain the real
link speed is to read the Specific Status register - register 17 in
both cases.  The top two bits indicate the negotiated speed resolution
and bit 13 indicates the duplex.  Bit 11 indicates whether the
resolution is valid.  This register layout seems to apply to both
88e151x and AR8035.
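
As a sketch of what reading that register would give us, the decode below follows the bit positions described above.  The struct and function names are made up, and the 10/100/1000 encoding of bits 15:14 is my assumption about both parts; only "top two bits are the speed", bit 13 and bit 11 come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

struct ss_result {
	bool resolved;		/* bit 11: speed and duplex resolved */
	bool full_duplex;	/* bit 13: duplex */
	int speed;		/* Mb/s, or -1 for the unused encoding */
};

/* Decode the PHY Specific Status register (register 17). */
static struct ss_result decode_specific_status(uint16_t ss)
{
	struct ss_result r;

	r.resolved = (ss >> 11) & 1;
	r.full_duplex = (ss >> 13) & 1;

	switch (ss >> 14) {	/* bits 15:14, encoding assumed */
	case 0:
		r.speed = 10;
		break;
	case 1:
		r.speed = 100;
		break;
	case 2:
		r.speed = 1000;
		break;
	default:
		r.speed = -1;
		break;
	}
	return r;
}
```

Feeding it 0xbc1c, the value read from a working AR8035 elsewhere in this thread, yields resolved, full duplex, 1000Mb/s.  A driver would only trust the speed and duplex fields when the resolved bit is set.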

The register also contains the pause mode resolution in terms of
receive or transmit pause enabled, but this is not useful to phylib
as that is not what phylib wants to know.  However, it probably makes
sense for phylib to resolve the pause mode negotiation itself rather
than having that logic in the MAC drivers.


Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 11:30:13PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Sep 17, 2019 at 04:32:53PM +0300, tinywrkb wrote:
> > Here's the output of # mii-tool -v -v eth0 
> > 
> > * linux-test-5.1rc1-a2703de70942-without_bad_commit
> > 
> > Using SIOCGMIIPHY=0x8947
> > eth0: negotiated 100baseTx-FD flow-control, link ok
> >   registers for MII PHY 0:
> > 3100 796d 004d d072 15e1 c5e1 000f 
> >   0800     a000
> >    f420 082c  04e8 
> > 3200 3000  063d    
> 
> I'll also mention some other discrepancies that I've just spotted in
> this register set.
> 
> The BMSR is 0x796d.  Bit 2 is the link status, which is indicating
> that link is up.  Bit 5 indicates negotiation complete, which it
> claims it is.
> 
> The PHY has a second status register at 0x11 which gives real time
> information.  That is 0x0000.  Bit 10 indicates link up, and is
> indicating that the link is down.  Bit 11 is saying that the speed
> and duplex is not resolved either.
> 
> So, there's contradictory information being reported by this PHY.
> 
> This brings up several questions:
> 1. what is the _true_ state of the link?  Is the link up or down?
> 
> 2. what does the link partner think is the current link state and
>results of negotiation?
> 
> 3. should we be reading the register at 0x11 to determine the
>negotiation results and link state (maybe logically anding the
>present state with the BMSR link state)?
> 
> 
> Compare that to a correctly functioning AR8035 such as I have in my
> cubox-i4 connected to a Netgear GS116 switch:
> 
>3100 796d 004d d072 15e1 c5e1 000d 2001
> 0200 3c00   4007 b29a a000
>0862 bc1c   082c  07e8 
>3200 3000  063e  0005 2d47 8100.
> 
> BMSR is again 0x796d.  The PHY specific status register this time
> is 0xbc1c, which indicates 1G, full duplex, resolved, link up, no
> smartspeed downgrade, tx/rx pause.
> 
> The register at 0x10 is a control register, which is strangely also
> different between our two.  Apparently in your PHY configuration,
> auto-MDI crossover mode is disabled, you are forced to MDI mode.
> On hardware reset, this register contains 0x0862, as per my
> example above, but yours is zero.
> 
> I don't think the difference in register 0x10 can be explained away
> by operation of the smartspeed feature - so maybe my theory about
> the advertisement registers being cleared by the PHY is wrong.  The
> question is: how is 0x10 getting reset to zero in your setup?  Maybe
> something has corrupted the configuration of the PHY in ways that
> Linux doesn't know how to reprogram?
> 
> Have you tried power-cycling the cubox-i?

Hopefully one last thing, which will explain why you may not be able
to get an IP address even with some of these tweaks I've been getting
you to try.  Do you have either none or both of these commits in your
kernel?

0672d22a1924 ("ARM: dts: imx: Fix the AR803X phy-mode")
6d4cd041f0af ("net: phy: at803x: disable delay only for RGMII mode")

I think you'll have the latter but not the former.  You will need the
former if you have the latter.

I think this thread is a good illustration why breaking existing DT
compatibility - even for the sake of fixing a bug - is just bad news.


Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 04:32:53PM +0300, tinywrkb wrote:
> Here's the output of # mii-tool -v -v eth0 
> 
> * linux-test-5.1rc1-a2703de70942-without_bad_commit
> 
> Using SIOCGMIIPHY=0x8947
> eth0: negotiated 100baseTx-FD flow-control, link ok
>   registers for MII PHY 0:
> 3100 796d 004d d072 15e1 c5e1 000f 
>   0800     a000
>    f420 082c  04e8 
> 3200 3000  063d    

I'll also mention some other discrepancies that I've just spotted in
this register set.

The BMSR is 0x796d.  Bit 2 is the link status, which is indicating
that link is up.  Bit 5 indicates negotiation complete, which it
claims it is.

The PHY has a second status register at 0x11 which gives real time
information.  That is 0x0000.  Bit 10 indicates link up, and is
indicating that the link is down.  Bit 11 is saying that the speed
and duplex is not resolved either.

So, there's contradictory information being reported by this PHY.

This brings up several questions:
1. what is the _true_ state of the link?  Is the link up or down?

2. what does the link partner think is the current link state and
   results of negotiation?

3. should we be reading the register at 0x11 to determine the
   negotiation results and link state (maybe logically anding the
   present state with the BMSR link state)?
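
To illustrate question 3, here is a rough sketch of ANDing the two link indications, using the bit positions given above.  The macro and function names are local to this example (the BMSR ones mirror the usual MII names), and nothing here reads real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define BMSR_LSTATUS		(1U << 2)	/* BMSR bit 2: link status */
#define BMSR_ANEGCOMPLETE	(1U << 5)	/* BMSR bit 5: negotiation complete */
#define SS_LINK_UP		(1U << 10)	/* register 0x11 bit 10: link up */
#define SS_RESOLVED		(1U << 11)	/* register 0x11 bit 11: resolved */

/* Report link up only when both status registers agree that it is up. */
static bool link_is_up(uint16_t bmsr, uint16_t ss)
{
	return (bmsr & BMSR_LSTATUS) && (ss & SS_LINK_UP);
}
```

With the values in this message, BMSR 0x796d together with a register 0x11 value of zero would be treated as link down, while 0x796d together with 0xbc1c (the working setup) would be treated as link up.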


Compare that to a correctly functioning AR8035 such as I have in my
cubox-i4 connected to a Netgear GS116 switch:

   3100 796d 004d d072 15e1 c5e1 000d 2001
    0200 3c00   4007 b29a a000
   0862 bc1c   082c  07e8 
   3200 3000  063e  0005 2d47 8100.

BMSR is again 0x796d.  The PHY specific status register this time
is 0xbc1c, which indicates 1G, full duplex, resolved, link up, no
smartspeed downgrade, tx/rx pause.

The register at 0x10 is a control register, which is strangely also
different between our two.  Apparently in your PHY configuration,
auto-MDI crossover mode is disabled, you are forced to MDI mode.
On hardware reset, this register contains 0x0862, as per my
example above, but yours is zero.
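
For what it's worth, this is how I am reading the crossover setting out of register 0x10.  The field position (bits 6:5) and its encoding are my reading of the data sheet and should be treated as an assumption; the enum and function names are made up for this sketch.

```c
#include <stdint.h>

/* Assumed encoding of the MDI crossover field, bits 6:5 of register 0x10. */
enum mdi_mode {
	MDI_MANUAL_MDI = 0,
	MDI_MANUAL_MDIX = 1,
	MDI_RESERVED = 2,
	MDI_AUTO = 3,
};

/* Extract the crossover mode from the function control register value. */
static enum mdi_mode decode_mdi_mode(uint16_t func_ctrl)
{
	return (enum mdi_mode)((func_ctrl >> 5) & 0x3);
}
```

The reset value 0x0862 decodes to automatic crossover, and a register value of zero decodes to manual MDI.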

I don't think the difference in register 0x10 can be explained away
by operation of the smartspeed feature - so maybe my theory about
the advertisement registers being cleared by the PHY is wrong.  The
question is: how is 0x10 getting reset to zero in your setup?  Maybe
something has corrupted the configuration of the PHY in ways that
Linux doesn't know how to reprogram?

Have you tried power-cycling the cubox-i?


Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 06:19:13PM +0100, Russell King - ARM Linux admin wrote:
> whether you can get the link to come up at all.  You might need to see
> whether wiggling the RJ45 helps (I've had that sort of thing with some
> cables.)
> 
> You might also need "ethtool -s eth0 advertise ffcf" after trying that
> if it doesn't work to take the gigabit speeds out of the advertisement.
> 
> Thanks.
> 
>  drivers/net/phy/at803x.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> index b3893347804d..85cf4a4a5e81 100644
> --- a/drivers/net/phy/at803x.c
> +++ b/drivers/net/phy/at803x.c
> @@ -296,6 +296,11 @@ static int at803x_config_init(struct phy_device *phydev)
>   if (ret < 0)
>   return ret;
>  
> + /* Disable smartspeed */
> + ret = phy_modify(phydev, 0x14, BIT(5), 0);
> + if (ret < 0)
> + return ret;
> +
>   /* The RX and TX delay default is:
>*   after HW reset: RX delay enabled and TX delay disabled
>*   after SW reset: RX delay enabled, while TX delay retains the

Hi,

Could you try this patch instead - it seems that the PHY needs to be
soft-reset for the write to take effect, and _even_ for the clearance
of the bit to become visible in the register.

I'm not expecting this on its own to solve anything, but it should at
least mean that the at803x doesn't modify the advertisement registers
itself.  It may mean that the link doesn't even come up without forcing
the advertisement via the ethtool command I mentioned before.

Thanks.

 drivers/net/phy/at803x.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index b3893347804d..69a58c0e6b42 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -296,6 +296,16 @@ static int at803x_config_init(struct phy_device *phydev)
 	if (ret < 0)
 		return ret;
 
+	/* Disable smartspeed */
+	ret = phy_modify(phydev, 0x14, BIT(5), 0);
+	if (ret < 0)
+		return ret;
+
+	/* Must soft-reset the PHY for smartspeed disable to take effect */
+	ret = genphy_soft_reset(phydev);
+	if (ret < 0)
+		return ret;
+
 	/* The RX and TX delay default is:
 	 *   after HW reset: RX delay enabled and TX delay disabled
 	 *   after SW reset: RX delay enabled, while TX delay retains the


Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 06:37:28PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Sep 17, 2019 at 07:26:58PM +0200, Andrew Lunn wrote:
> > > diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> > > index b3893347804d..85cf4a4a5e81 100644
> > > --- a/drivers/net/phy/at803x.c
> > > +++ b/drivers/net/phy/at803x.c
> > 
> > Hi Russell
> > 
> > This won't work. In the kernel logs, you see 
> > 
> > kernel: Generic PHY 2188000.ethernet-1:00: attached PHY driver [Generic PHY]
> > 
> > The generic PHY driver is being used, not the at803x driver.
> 
> Well, the _correct_ driver needs to be used for the PHY specific
> features to be properly controlled.  Using the generic driver
> in this situation will not be guaranteed to work.

Well, this hasn't worked, but not for the obvious reason.  Register 0x14
is documented as read/write.  Bits 15:6 are reserved, bit 5 is the
smart speed enable, 4:2 configures the attempts, bit 1 sets the link
stable condition, bit 0 is reserved.

Writing 0x80c results in the register reading back 0x82c.  Writing
0x800 results in the same.  Writing 0 reads back 0x2c.  Writing 0x
seems to prevent packets being passed - and at that point I lost
control so I couldn't see what the result was.
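
To show which bits refuse to follow the write, the written and read-back
values can simply be XORed, as in this small sketch.  The field masks follow
the register layout described above; the names and helpers are hypothetical
and only for illustration.

```c
#include <stdint.h>

/* Register 0x14 fields as described above (names are hypothetical). */
#define SMARTSPEED_EN		(1u << 5)	/* smartspeed enable */
#define SMARTSPEED_RETRY_SHIFT	2
#define SMARTSPEED_RETRY_MASK	(0x7u << SMARTSPEED_RETRY_SHIFT)	/* 4:2 */
#define SMARTSPEED_STABLE	(1u << 1)	/* link stable condition */

/* Bits that differ between what was written and what was read back. */
static uint16_t changed_bits(uint16_t written, uint16_t readback)
{
	return written ^ readback;
}

/* Raw value of the attempts field in bits 4:2. */
static unsigned int smartspeed_retry_field(uint16_t val)
{
	return (val & SMARTSPEED_RETRY_MASK) >> SMARTSPEED_RETRY_SHIFT;
}
```

For the 0x80c write the only differing bit is 0x020, i.e. the enable bit;
for the 0x800 and 0 writes the difference is 0x02c, so in every case the
bits that do not stick lie within 0x2c.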

There is nothing in the data sheet which suggests that there is any
gating of this register.  So it looks like we're stuck with smartspeed
enabled.

So, I think there are only two remaining ways forward: either revert
commit 5502b218e001 to restore the old behaviour, or read back the
advertisement from the PHY along with the rest of the status, as I've
previously stated.  The latter means that phylib will modify
phydev->advertising at random points, just as it modifies
phydev->lp_advertising, so locking may become an issue.  The revert
approach is probably best until we have something working along those
lines.



Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 07:26:58PM +0200, Andrew Lunn wrote:
> > diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> > index b3893347804d..85cf4a4a5e81 100644
> > --- a/drivers/net/phy/at803x.c
> > +++ b/drivers/net/phy/at803x.c
> 
> Hi Russell
> 
> This won't work. In the kernel logs, you see 
> 
> kernel: Generic PHY 2188000.ethernet-1:00: attached PHY driver [Generic PHY]
> 
> The generic PHY driver is being used, not the at803x driver.

Well, the _correct_ driver needs to be used for the PHY specific
features to be properly controlled.  Using the generic driver
in this situation will not be guaranteed to work.



Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 06:04:19PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Sep 17, 2019 at 07:34:27PM +0300, tinywrkb wrote:
> > The patch didn't fix the issue.
> > 
> > # ethtool eth0
> > 
> > Settings for eth0:
> > Supported ports: [ TP MII ]
> > Supported link modes:   10baseT/Half 10baseT/Full
> > 100baseT/Half 100baseT/Full
> > 1000baseT/Full
> > Supported pause frame use: Symmetric
> > Supports auto-negotiation: Yes
> > Supported FEC modes: Not reported
> > Advertised link modes:  10baseT/Half 10baseT/Full
> > 100baseT/Half 100baseT/Full
> > 1000baseT/Full
> > Advertised pause frame use: Symmetric
> > Advertised auto-negotiation: Yes
> > Advertised FEC modes: Not reported
> > Link partner advertised link modes:  10baseT/Half 10baseT/Full
> >  100baseT/Half 100baseT/Full
> >  1000baseT/Full
> > Link partner advertised pause frame use: Symmetric
> > Link partner advertised auto-negotiation: Yes
> > Link partner advertised FEC modes: Not reported
> > Speed: 1000Mb/s
> > Duplex: Full
> > Port: MII
> > PHYAD: 0
> > Transceiver: internal
> > Auto-negotiation: on
> > Supports Wake-on: d
> > Wake-on: d
> > Link detected: yes
> > 
> > # mii-tool -v -v eth0
> > 
> > Using SIOCGMIIPHY=0x8947
> > eth0: negotiated 100baseTx-FD flow-control, link ok
> >   registers for MII PHY 0:
> > 3100 796d 004d d072 15e1 c5e1 000f 
> >   0800     a000
> >    f420 082c  04e8 
> > 3200 3000  063d    
> >   product info: vendor 00:13:74, model 7 rev 2
> >   basic mode:   autonegotiation enabled
> >   basic status: autonegotiation complete, link ok
> >   capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
> >   advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
> >   link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 
> > 10baseT-HD flow-control
> > 
> > # journalctl -b | egrep -i 'phy|eth|fec'|grep -v usb
> > 
> > kernel: Booting Linux on physical CPU 0x0
> > kernel: libphy: Fixed MDIO Bus: probed
> > kernel: libphy: fec_enet_mii_bus: probed
> > kernel: fec 2188000.ethernet eth0: registered PHC device 0
> > kernel: dwhdmi-imx 12.hdmi: Detected HDMI TX controller v1.31a with 
> > HDCP (DWC HDMI 3D TX PHY)
> > kernel: Generic PHY 2188000.ethernet-1:00: attached PHY driver [Generic 
> > PHY] (mii_bus:phy_addr=2188000.ethernet-1:00, irq=POLL)
> > kernel: fec 2188000.ethernet eth0: Link is Up - 1Gbps/Full - flow control 
> > rx/tx
> > kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> > systemd-networkd[242]: eth0: Gained carrier
> 
> Okay, so this is getting weird.
> 
> ethtool still shows that 1000baseT/Full is being advertised, yet the
> PHY disagrees:
> 
>  3100 796d 004d d072 15e1 c5e1 000f 
>    0800     a000
>   
> Gigabit control register, bits 9 should be set, but it's clear.
> 
> Looking at the following registers, brings up another possibility what
> is going on:
> 
>     f420 082c  04e8 
>  
> 
> These two registers may provide a hint.  Of the first register, which
> is the interrupt status register, bit 5 is set, indicating that a
> "smartspeed downgrade occurred".  The second register is the smartspeed
> configuration, which basically says that the feature is enabled.
> 
> Smartspeed is designed to allow the link to come up if two-pair CAT5
> cable is used (are you using a 4-pair or 2-pair cable?) by making the
> link fall back to 100mbit, or with CAT3 cable, 10mbit speeds.  What
> isn't specified is whether it does this by clearing bits in the various
> advertisement registers.
> 
> Given what you've said so far, I'd suggest that this is indeed the
> case - when smartspeed is triggered, advertisement bits are cleared by
> the PHY without the kernel's knowledge, leading to the kernel getting
> the speed resolution incorrect after 5502b218e001.
> 
> There's another issue here - if smartspeed clears advertisement bits,
> then if you connect a 4-pair cable after having used a 2-pair cable,
> you'd still be limited to 100mbit.  The ethtool output will be just
> as confusing.

Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 07:34:27PM +0300, tinywrkb wrote:
> The patch didn't fix the issue.
> 
> # ethtool eth0
> 
> Settings for eth0:
>   Supported ports: [ TP MII ]
>   Supported link modes:   10baseT/Half 10baseT/Full
>   100baseT/Half 100baseT/Full
>   1000baseT/Full
>   Supported pause frame use: Symmetric
>   Supports auto-negotiation: Yes
>   Supported FEC modes: Not reported
>   Advertised link modes:  10baseT/Half 10baseT/Full
>   100baseT/Half 100baseT/Full
>   1000baseT/Full
>   Advertised pause frame use: Symmetric
>   Advertised auto-negotiation: Yes
>   Advertised FEC modes: Not reported
>   Link partner advertised link modes:  10baseT/Half 10baseT/Full
>100baseT/Half 100baseT/Full
>1000baseT/Full
>   Link partner advertised pause frame use: Symmetric
>   Link partner advertised auto-negotiation: Yes
>   Link partner advertised FEC modes: Not reported
>   Speed: 1000Mb/s
>   Duplex: Full
>   Port: MII
>   PHYAD: 0
>   Transceiver: internal
>   Auto-negotiation: on
>   Supports Wake-on: d
>   Wake-on: d
>   Link detected: yes
> 
> # mii-tool -v -v eth0
> 
> Using SIOCGMIIPHY=0x8947
> eth0: negotiated 100baseTx-FD flow-control, link ok
>   registers for MII PHY 0:
> 3100 796d 004d d072 15e1 c5e1 000f 
>   0800     a000
>    f420 082c  04e8 
> 3200 3000  063d    
>   product info: vendor 00:13:74, model 7 rev 2
>   basic mode:   autonegotiation enabled
>   basic status: autonegotiation complete, link ok
>   capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
>   advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
>   link partner: 1000baseT-FD 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD 
> flow-control
> 
> # journalctl -b | egrep -i 'phy|eth|fec'|grep -v usb
> 
> kernel: Booting Linux on physical CPU 0x0
> kernel: libphy: Fixed MDIO Bus: probed
> kernel: libphy: fec_enet_mii_bus: probed
> kernel: fec 2188000.ethernet eth0: registered PHC device 0
> kernel: dwhdmi-imx 12.hdmi: Detected HDMI TX controller v1.31a with HDCP 
> (DWC HDMI 3D TX PHY)
> kernel: Generic PHY 2188000.ethernet-1:00: attached PHY driver [Generic PHY] 
> (mii_bus:phy_addr=2188000.ethernet-1:00, irq=POLL)
> kernel: fec 2188000.ethernet eth0: Link is Up - 1Gbps/Full - flow control 
> rx/tx
> kernel: IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> systemd-networkd[242]: eth0: Gained carrier

Okay, so this is getting weird.

ethtool still shows that 1000baseT/Full is being advertised, yet the
PHY disagrees:

 3100 796d 004d d072 15e1 c5e1 000f 
   0800     a000
  
Gigabit control register, bit 9 should be set, but it's clear.
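
For clarity, this is the check being described, as a stand-alone sketch.
The mask values are the ones I know from linux/mii.h for the 1000BASE-T
control register (register 9); the helper is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 1000BASE-T control register (MII register 9) advertisement bits. */
#define ADVERTISE_1000HALF	0x0100	/* bit 8 */
#define ADVERTISE_1000FULL	0x0200	/* bit 9 */

/* Hypothetical helper: is 1000baseT/Full being advertised? */
static bool advertises_1000full(uint16_t ctrl1000)
{
	return (ctrl1000 & ADVERTISE_1000FULL) != 0;
}
```

A register value with bit 9 clear makes this return false, no matter what
the kernel's cached advertisement mask says.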

Looking at the following registers brings up another possibility for
what is going on:

    f420 082c  04e8 
 

These two registers may provide a hint.  Of the first register, which
is the interrupt status register, bit 5 is set, indicating that a
"smartspeed downgrade occurred".  The second register is the smartspeed
configuration, which basically says that the feature is enabled.

Smartspeed is designed to allow the link to come up if two-pair CAT5
cable is used (are you using a 4-pair or 2-pair cable?) by making the
link fall back to 100mbit, or with CAT3 cable, 10mbit speeds.  What
isn't specified is whether it does this by clearing bits in the various
advertisement registers.

Given what you've said so far, I'd suggest that this is indeed the
case - when smartspeed is triggered, advertisement bits are cleared by
the PHY without the kernel's knowledge, leading to the kernel getting
the speed resolution incorrect after 5502b218e001.
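
A simplified model of how that mismatch arises, assuming the resolved speed
is taken as the highest mode common to the local and link partner
advertisements.  The MODE_* bits and resolve_speed() are hypothetical and
are not phylib's actual data structures.

```c
/* Hypothetical mode bits, one per speed, for illustration only. */
enum {
	MODE_10   = 1 << 0,
	MODE_100  = 1 << 1,
	MODE_1000 = 1 << 2,
};

/* Pick the highest speed present in both advertisements, 0 if none. */
static int resolve_speed(unsigned int local_adv, unsigned int lp_adv)
{
	unsigned int common = local_adv & lp_adv;

	if (common & MODE_1000)
		return 1000;
	if (common & MODE_100)
		return 100;
	if (common & MODE_10)
		return 10;
	return 0;
}
```

With a link partner advertising all three modes, resolving against a cached
local mask that still contains MODE_1000 gives 1000, whereas resolving
against a mask with the gigabit bit cleared gives 100.  That is the
discrepancy between what the kernel reports and what the PHY negotiated.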

There's another issue here - if smartspeed clears advertisement bits,
then if you connect a 4-pair cable after having used a 2-pair cable,
you'd still be limited to 100mbit.  The ethtool output will be just
as confusing.

The only thing I can think we should do is to read back the
advertisement from the PHY whenever we read the rest of the status
and update the phydev->advertising mask, just like we do with the link
partner advertisement.



Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 04:17:07PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Sep 17, 2019 at 02:39:43PM +0100, Russell King - ARM Linux admin 
> wrote:
> > On Tue, Sep 17, 2019 at 04:32:53PM +0300, tinywrkb wrote:
> > > On Tue, Sep 17, 2019 at 02:54:34PM +0200, Andrew Lunn wrote:
> > > > On Tue, Sep 17, 2019 at 03:41:01PM +0300, tinywrkb wrote:
> > > > > On Sun, Sep 15, 2019 at 03:56:52PM +0200, Andrew Lunn wrote:
> > > > > > > Tinywrkb confirmed to me in private communication that revert of
> > > > > > > 5502b218e001 fixes Ethernet for him on effected system.
> > > > > > > 
> > > > > > > He also referred me to an old Cubox-i spec that lists 10/100 
> > > > > > > Ethernet
> > > > > > > only for i.MX6 Solo/DualLite variants of Cubox-i. It turns out 
> > > > > > > that
> > > > > > > there was a plan to use a different 10/100 PHY for Solo/DualLite
> > > > > > > SOMs. This plan never materialized. All SolidRun i.MX6 SOMs use 
> > > > > > > the same
> > > > > > > AR8035 PHY that supports 1Gb.
> > > > > > > 
> > > > > > > Commit 5502b218e001 might be triggering a hardware issue on the 
> > > > > > > affected
> > > > > > > Cubox-i. I could not reproduce the issue here with Cubox-i and a 
> > > > > > > Dual
> > > > > > > SOM variant running v5.3-rc8. I have no Solo/DualLite variant 
> > > > > > > handy at
> > > > > > > the moment.
> > > > > > 
> > > > > > Could somebody with an affected device show us the output of ethtool
> > > > > > with and without 5502b218e001. Does one show 1G has been negotiated,
> > > > > > and the other 100Mbps? If this is true, how does it get 100Mbps
> > > > > > without that patch? We are missing a piece of the puzzle.
> > > > > > 
> > > > > > Andrew
> > > > > 
> > > > > linux-test-5.1rc1-a2703de70942-without_bad_commit
> > > > > 
> > > > > Settings for eth0:
> > > > >   Supported ports: [ TP MII ]
> > > > >   Supported link modes:   10baseT/Half 10baseT/Full
> > > > >   100baseT/Half 100baseT/Full
> > > > >   1000baseT/Full
> > > > 
> > > > So this means the local device says it can do 1000Mbps.
> > > > 
> > > > 
> > > > >   Supported pause frame use: Symmetric
> > > > >   Supports auto-negotiation: Yes
> > > > >   Supported FEC modes: Not reported
> > > > >   Advertised link modes:  10baseT/Half 10baseT/Full
> > > > >   100baseT/Half 100baseT/Full
> > > > >   1000baseT/Full
> > > > 
> > > > The link peer can also do 1000Mbps.
> > > > 
> > > > 
> > > > >   Advertised pause frame use: Symmetric
> > > > >   Advertised auto-negotiation: Yes
> > > > >   Advertised FEC modes: Not reported
> > > > >   Link partner advertised link modes:  10baseT/Half 10baseT/Full
> > > > >100baseT/Half 100baseT/Full
> > > > >1000baseT/Full
> > > > >   Link partner advertised pause frame use: Symmetric
> > > > >   Link partner advertised auto-negotiation: Yes
> > > > >   Link partner advertised FEC modes: Not reported
> > > > >   Speed: 100Mb/s
> > > > 
> > > > Yet they have decided to do 100Mbps. 
> > > > 
> > > > We need to understand Why? The generic PHY driver would not do this on
> > > > its own. So i'm thinking something has poked a PHY register with some
> > > > value, and this patch is causing it to be over written.
> > > > 
> > > > Please can you use mii-tool -v -v to dump the PHY registers in each
> > > > case.
> > > > 
> > > > Thanks
> > > > Andrew
> > > 
> > > Here's the output of # mii-tool -v -v eth0 
> > > 
> > > * linux-test-5.1rc1-a2703de70942-without_bad_commit
> > > 
> > > Using SIOCGMIIPHY=0x8947

Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed

2019-09-17 Thread Russell King - ARM Linux admin
On Tue, Sep 17, 2019 at 02:39:43PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Sep 17, 2019 at 04:32:53PM +0300, tinywrkb wrote:
> > On Tue, Sep 17, 2019 at 02:54:34PM +0200, Andrew Lunn wrote:
> > > On Tue, Sep 17, 2019 at 03:41:01PM +0300, tinywrkb wrote:
> > > > On Sun, Sep 15, 2019 at 03:56:52PM +0200, Andrew Lunn wrote:
> > > > > > Tinywrkb confirmed to me in private communication that revert of
> > > > > > 5502b218e001 fixes Ethernet for him on effected system.
> > > > > > 
> > > > > > He also referred me to an old Cubox-i spec that lists 10/100 
> > > > > > Ethernet
> > > > > > only for i.MX6 Solo/DualLite variants of Cubox-i. It turns out that
> > > > > > there was a plan to use a different 10/100 PHY for Solo/DualLite
> > > > > > SOMs. This plan never materialized. All SolidRun i.MX6 SOMs use the 
> > > > > > same
> > > > > > AR8035 PHY that supports 1Gb.
> > > > > > 
> > > > > > Commit 5502b218e001 might be triggering a hardware issue on the 
> > > > > > affected
> > > > > > Cubox-i. I could not reproduce the issue here with Cubox-i and a 
> > > > > > Dual
> > > > > > SOM variant running v5.3-rc8. I have no Solo/DualLite variant handy 
> > > > > > at
> > > > > > the moment.
> > > > > 
> > > > > Could somebody with an affected device show us the output of ethtool
> > > > > with and without 5502b218e001. Does one show 1G has been negotiated,
> > > > > and the other 100Mbps? If this is true, how does it get 100Mbps
> > > > > without that patch? We are missing a piece of the puzzle.
> > > > > 
> > > > >   Andrew
> > > > 
> > > > linux-test-5.1rc1-a2703de70942-without_bad_commit
> > > > 
> > > > Settings for eth0:
> > > > Supported ports: [ TP MII ]
> > > > Supported link modes:   10baseT/Half 10baseT/Full
> > > > 100baseT/Half 100baseT/Full
> > > > 1000baseT/Full
> > > 
> > > So this means the local device says it can do 1000Mbps.
> > > 
> > > 
> > > > Supported pause frame use: Symmetric
> > > > Supports auto-negotiation: Yes
> > > > Supported FEC modes: Not reported
> > > > Advertised link modes:  10baseT/Half 10baseT/Full
> > > > 100baseT/Half 100baseT/Full
> > > > 1000baseT/Full
> > > 
> > > The link peer can also do 1000Mbps.
> > > 
> > > 
> > > > Advertised pause frame use: Symmetric
> > > > Advertised auto-negotiation: Yes
> > > > Advertised FEC modes: Not reported
> > > > Link partner advertised link modes:  10baseT/Half 10baseT/Full
> > > >  100baseT/Half 100baseT/Full
> > > >  1000baseT/Full
> > > > Link partner advertised pause frame use: Symmetric
> > > > Link partner advertised auto-negotiation: Yes
> > > > Link partner advertised FEC modes: Not reported
> > > > Speed: 100Mb/s
> > > 
> > > Yet they have decided to do 100Mbps. 
> > > 
> > > We need to understand Why? The generic PHY driver would not do this on
> > > its own. So i'm thinking something has poked a PHY register with some
> > > value, and this patch is causing it to be over written.
> > > 
> > > Please can you use mii-tool -v -v to dump the PHY registers in each
> > > case.
> > > 
> > > Thanks
> > >   Andrew
> > 
> > Here's the output of # mii-tool -v -v eth0 
> > 
> > * linux-test-5.1rc1-a2703de70942-without_bad_commit
> > 
> > Using SIOCGMIIPHY=0x8947
> > eth0: negotiated 100baseTx-FD flow-control, link ok
> >   registers for MII PHY 0:
> > 3100 796d 004d d072 15e1 c5e1 000f 
> >   0800     a000
> >    f420 082c  04e8 
> > 3200 3000  063d    
> >   product info: vendor 00:13:74, model 7 rev 2
> >   basic mode:   autonegotiation enabled
> >   basic status: autonegotiation complete, link ok
> >   ca
