date:20171008

[patch net-next repost 0/2] mlxsw: Add more extack error reporting

2017-10-08 Thread Jiri Pirko

From: Jiri Pirko 

Ido says:

Add error messages to VLAN and bridge enslavements to help users
understand why the enslavement failed.

Ido Schimmel (2):
  mlxsw: spectrum: Add extack for VLAN enslavements
  mlxsw: spectrum: Propagate extack further for bridge enslavements

 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 18 
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h |  3 ++-
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   | 25 +++---
 3 files changed, 33 insertions(+), 13 deletions(-)

-- 
2.9.5

[patch net-next repost 1/2] mlxsw: spectrum: Add extack for VLAN enslavements

2017-10-08 Thread Jiri Pirko

From: Ido Schimmel 

Similar to physical ports, enslavement of VLAN devices can also fail.
Use extack to indicate why the enslavement failed.

$ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10
$ ip link add name bond0 type bond mode 802.3ad
$ ip link set dev enp1s0np1.10 master bond0
Error: spectrum: VLAN devices only support bridge and VRF uppers.

Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
Acked-by: David Ahern 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 5cd4df0..5ab4fd7 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -4389,18 +4390,25 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct 
net_device *vlan_dev,
 {
struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
struct netdev_notifier_changeupper_info *info = ptr;
+   struct netlink_ext_ack *extack;
struct net_device *upper_dev;
int err = 0;
 
+   extack = netdev_notifier_info_to_extack(&info->info);
+
switch (event) {
case NETDEV_PRECHANGEUPPER:
upper_dev = info->upper_dev;
-   if (!netif_is_bridge_master(upper_dev))
+   if (!netif_is_bridge_master(upper_dev)) {
+   NL_SET_ERR_MSG(extack, "spectrum: VLAN devices only support bridge and VRF uppers");
return -EINVAL;
+   }
if (!info->linking)
break;
-   if (netdev_has_any_upper_dev(upper_dev))
+   if (netdev_has_any_upper_dev(upper_dev)) {
+   NL_SET_ERR_MSG(extack, "spectrum: Enslaving a port to a device that already has an upper device is not supported");
return -EINVAL;
+   }
break;
case NETDEV_CHANGEUPPER:
upper_dev = info->upper_dev;
-- 
2.9.5

[patch net-next repost 2/2] mlxsw: spectrum: Propagate extack further for bridge enslavements

2017-10-08 Thread Jiri Pirko

From: Ido Schimmel 

The code that actually takes care of bridge offload introduces a few
more non-trivial constraints with regards to bridge enslavements.
Propagate extack there to indicate the reason.

$ ip link add link enp1s0np1 name enp1s0np1.10 type vlan id 10
$ ip link add link enp1s0np1 name enp1s0np1.20 type vlan id 20
$ ip link add name br0 type bridge
$ ip link set dev enp1s0np1.10 master br0
$ ip link set dev enp1s0np1.20 master br0
Error: spectrum: Can not bridge VLAN uppers of the same port.
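
As a rough userspace illustration of the propagation described above (not the
driver code itself), the sketch below threads an extack-like pointer from an
outer join helper down to the callback that knows why the join failed. All
demo_* names are hypothetical, struct demo_ext_ack only stands in for the
kernel's struct netlink_ext_ack, and the failure condition is an assumption
chosen to match the example error message.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct netlink_ext_ack. */
struct demo_ext_ack {
	const char *msg;
};

/* Record a message only when the caller actually supplied an extack. */
static void demo_set_err_msg(struct demo_ext_ack *extack, const char *msg)
{
	if (extack)
		extack->msg = msg;
}

/* Hypothetical ops table; port_join takes extack as an extra argument. */
struct demo_bridge_ops {
	int (*port_join)(int port, int vlan_uppers_in_bridge,
			 struct demo_ext_ack *extack);
};

/* Assumed constraint: refuse a second VLAN upper of the same port. */
static int demo_port_join(int port, int vlan_uppers_in_bridge,
			  struct demo_ext_ack *extack)
{
	(void)port;
	if (vlan_uppers_in_bridge > 0) {
		demo_set_err_msg(extack,
				 "spectrum: Can not bridge VLAN uppers of the same port");
		return -EINVAL;
	}
	return 0;
}

static const struct demo_bridge_ops demo_ops = {
	.port_join = demo_port_join,
};

/* The outer helper only forwards extack to the op that knows the reason. */
static int demo_port_bridge_join(const struct demo_bridge_ops *ops, int port,
				 int vlan_uppers_in_bridge,
				 struct demo_ext_ack *extack)
{
	return ops->port_join(port, vlan_uppers_in_bridge, extack);
}
```

The point of the design is that the error code stays the same (-EINVAL) while
the human-readable reason is set at the deepest level that can explain it.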

Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
Acked-by: David Ahern 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum.c |  6 --
 drivers/net/ethernet/mellanox/mlxsw/spectrum.h |  3 ++-
 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   | 25 +++---
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 5ab4fd7..321988a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4299,7 +4299,8 @@ static int mlxsw_sp_netdevice_port_upper_event(struct 
net_device *lower_dev,
if (info->linking)
err = mlxsw_sp_port_bridge_join(mlxsw_sp_port,
lower_dev,
-   upper_dev);
+   upper_dev,
+   extack);
else
mlxsw_sp_port_bridge_leave(mlxsw_sp_port,
   lower_dev,
@@ -4416,7 +4417,8 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct 
net_device *vlan_dev,
if (info->linking)
err = mlxsw_sp_port_bridge_join(mlxsw_sp_port,
vlan_dev,
-   upper_dev);
+   upper_dev,
+   extack);
else
mlxsw_sp_port_bridge_leave(mlxsw_sp_port,
   vlan_dev,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index ae67e60..8e45183 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -326,7 +326,8 @@ void
 mlxsw_sp_port_vlan_bridge_leave(struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan);
 int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
  struct net_device *brport_dev,
- struct net_device *br_dev);
+ struct net_device *br_dev,
+ struct netlink_ext_ack *extack);
 void mlxsw_sp_port_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port,
struct net_device *brport_dev,
struct net_device *br_dev);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 0f9eac5..2cfdf22 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "spectrum.h"
@@ -107,7 +108,8 @@ struct mlxsw_sp_bridge_vlan {
 struct mlxsw_sp_bridge_ops {
int (*port_join)(struct mlxsw_sp_bridge_device *bridge_device,
 struct mlxsw_sp_bridge_port *bridge_port,
-struct mlxsw_sp_port *mlxsw_sp_port);
+struct mlxsw_sp_port *mlxsw_sp_port,
+struct netlink_ext_ack *extack);
void (*port_leave)(struct mlxsw_sp_bridge_device *bridge_device,
   struct mlxsw_sp_bridge_port *bridge_port,
   struct mlxsw_sp_port *mlxsw_sp_port);
@@ -1735,12 +1737,15 @@ static const struct switchdev_ops 
mlxsw_sp_port_switchdev_ops = {
 static int
 mlxsw_sp_bridge_8021q_port_join(struct mlxsw_sp_bridge_device *bridge_device,
struct mlxsw_sp_bridge_port *bridge_port,
-   struct mlxsw_sp_port *mlxsw_sp_port)
+   struct mlxsw_sp_port *mlxsw_sp_port,
+   struct netlink_ext_ack *extack)
 {
struct mlxsw_sp_port_vlan *mlxsw_sp_port_vlan;
 
-   if (is_vlan_dev(bridge_port->dev))
+   if

[PATCH] wcn36xx: Remove unnecessary rcu_read_unlock in wcn36xx_bss_info_changed

2017-10-08 Thread Jia-Ju Bai

No rcu_read_lock is called, but rcu_read_unlock is still called.
Thus rcu_read_unlock should be removed.

Signed-off-by: Jia-Ju Bai 
---
 drivers/net/wireless/ath/wcn36xx/main.c |1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/wireless/ath/wcn36xx/main.c 
b/drivers/net/wireless/ath/wcn36xx/main.c
index 35bd50b..b83f01d 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -812,7 +812,6 @@ static void wcn36xx_bss_info_changed(struct ieee80211_hw 
*hw,
if (!sta) {
wcn36xx_err("sta %pM is not found\n",
  bss_conf->bssid);
-   rcu_read_unlock();
goto out;
}
sta_priv = wcn36xx_sta_to_priv(sta);
-- 
1.7.9.5

Re: [PATCH net-next RFC 5/9] net: dsa: forward hardware timestamping ioctls to switch driver

2017-10-08 Thread Richard Cochran

On Thu, Sep 28, 2017 at 10:25:34AM -0700, Florian Fainelli wrote:
> This echoes back to Andrew's comments in patch 2, but we may have to
> prefer PHY timestamping over MAC timestamping if both are available?
> Richard, is that usually how the preference should be made?

No, if the MAC supports time stamping, then it will take precedence,
because the MAC driver doesn't know that the PHY also supports this.
In the case where a board design includes the PHYTER (the one and only
PHY PHC) and a MAC PHC, the user must de-select the MAC support in the
Kconfig in order to use the PHYTER.

So in general, we don't support PHC/timestamping simultaneously in the
MAC and PHY.  It would be a lot of work to support this, and the user
timestamping API would have to be extended yet again, and so I think
it is not worth the effort.

Getting back to this patch, it should fall back to PHY timestamping
when the switch device doesn't support timestamping:

	case SIOCGHWTSTAMP:
		if (ds->ops->port_hwtstamp_get)
			return ds->ops->port_hwtstamp_get(ds, port, ifr);
		else
			return phy_mii_ioctl(dev->phydev, ifr, cmd);

That way, if someone combines a PHYTER with a non-PTP capable switch,
it will just work.
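
To make the fallback above concrete outside the kernel, here is a minimal
userspace sketch of the same dispatch: call the switch callback when it is
present, otherwise hand the request to the PHY path. Every demo_* name is
hypothetical, and demo_phy_ioctl() merely stands in for phy_mii_ioctl().

```c
#include <stddef.h>

/* Marker values so a caller can see which path handled the request. */
enum demo_handler { DEMO_BY_SWITCH = 1, DEMO_BY_PHY = 2 };

struct demo_switch;

struct demo_switch_ops {
	/* Optional: left NULL when the switch cannot timestamp. */
	int (*port_hwtstamp_get)(struct demo_switch *ds, int port);
};

struct demo_switch {
	const struct demo_switch_ops *ops;
};

static int demo_switch_hwtstamp_get(struct demo_switch *ds, int port)
{
	(void)ds;
	(void)port;
	return DEMO_BY_SWITCH;
}

/* Stand-in for the PHY ioctl path. */
static int demo_phy_ioctl(int port)
{
	(void)port;
	return DEMO_BY_PHY;
}

/* Prefer the switch callback; fall back to the PHY when it is absent. */
static int demo_hwtstamp_get(struct demo_switch *ds, int port)
{
	if (ds->ops->port_hwtstamp_get)
		return ds->ops->port_hwtstamp_get(ds, port);
	return demo_phy_ioctl(port);
}
```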

Thanks,
Richard

[PATCH iproute2 1/1] color: Fix ip segfault in color_fprintf() when using --color switch

2017-10-08 Thread Petr Vorel

This fixes two regressions:

Commit 959f1428 ("color: add new COLOR_NONE and disable_color function")
caused a segfault when running ip with the --color switch, as 'attr + 8' in
color_fprintf() accesses an array item out of bounds.
The latter value of the ternary operator in the attr_colors[] index is changed
to restore the same colors.
Reproduce the bug with:
$ ip -c a

Commit d0e72011 ("ip: ipaddress.c: add support for json output")
introduced passing -1 as enum color_attr. This is not only wrong, as no
color_attr has the value -1, but also causes another segfault in color_fprintf()
on this setup, as there is no item with index -1 in the attr_colors[] array.
Using 0 is a valid option.

Reproduce the bug with:
$ COLORFGBG='0;15' ip -c a

NOTE: COLORFGBG is an environment variable that indicates whether the user
has a light or dark background.
COLORFGBG="0;15" is used to ask for color set suitable for light background,
COLORFGBG="15;0" is used to ask for color set suitable for dark background.
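
For readers unfamiliar with the variable, the sketch below shows one way a
"foreground;background" value could be classified. It is an illustration only:
demo_is_dark_bg() is a hypothetical helper, and the rule that a single-digit
background of 0-6 or 8 counts as dark is an assumption of this sketch rather
than something stated in this patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Classify a COLORFGBG-style string ("fg;bg").
 * Returns true when the background field looks dark.
 */
static bool demo_is_dark_bg(const char *colorfgbg)
{
	const char *bg;

	if (!colorfgbg)
		return false;

	/* The background is the field after the last ';'. */
	bg = strrchr(colorfgbg, ';');
	if (!bg)
		return false;
	bg++;

	/* Assumption: a single digit 0-6 or 8 means a dark background. */
	return bg[0] != '\0' && bg[1] == '\0' &&
	       ((bg[0] >= '0' && bg[0] <= '6') || bg[0] == '8');
}
```

With this rule "0;15" is classified as light and "15;0" as dark, matching the
two settings quoted in the note.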

Signed-off-by: Petr Vorel 
---
 include/json_print.h | 2 +-
 lib/color.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/json_print.h b/include/json_print.h
index b6ce1f9f..2f3f07c8 100644
--- a/include/json_print.h
+++ b/include/json_print.h
@@ -53,7 +53,7 @@ void close_json_array(enum output_type type, const char 
*delim);
 const char *fmt,   \
 type value)\
{   \
-   print_color_##type_name(t, -1, key, fmt, value);\
+   print_color_##type_name(t, 0, key, fmt, value); \
}
 _PRINT_FUNC(int, int);
 _PRINT_FUNC(bool, bool);
diff --git a/lib/color.c b/lib/color.c
index 79d5e289..e597798f 100644
--- a/lib/color.c
+++ b/lib/color.c
@@ -110,7 +110,7 @@ int color_fprintf(FILE *fp, enum color_attr attr, const 
char *fmt, ...)
}
 
ret += fprintf(fp, "%s",
-  color_codes[attr_colors[is_dark_bg ? attr + 8 : attr]]);
+  color_codes[attr_colors[is_dark_bg ? attr + 6 : attr - 
1]]);
ret += vfprintf(fp, fmt, args);
ret += fprintf(fp, "%s", color_codes[C_CLEAR]);
 
-- 
2.14.2

[patch net] mlxsw: spectrum_router: Avoid expensive lookup during route removal

2017-10-08 Thread Jiri Pirko

From: Ido Schimmel 

In commit fc922bb0dd94 ("mlxsw: spectrum_router: Use one LPM tree for
all virtual routers") I increased the scale of supported VRFs by having
all of them share the same LPM tree.

In order to avoid look-ups for prefix lengths that don't exist, each
route removal would trigger an aggregation across all the active virtual
routers to see which prefix lengths are in use and which aren't and
structure the tree accordingly.

With the way the data structures are currently laid out, this is a very
expensive operation. When performed repeatedly - due to the invocation
of the abort mechanism - and with enough VRFs, this can result in a hung
task.

For now, avoid this optimization until it can be properly re-added in
net-next.

Fixes: fc922bb0dd94 ("mlxsw: spectrum_router: Use one LPM tree for all virtual 
routers")
Signed-off-by: Ido Schimmel 
Reported-by: David Ahern 
Tested-by: David Ahern 
Signed-off-by: Jiri Pirko 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 14 --
 1 file changed, 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c 
b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index 032089e..c16718d 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -3505,20 +3505,6 @@ static int mlxsw_sp_fib_lpm_tree_link(struct mlxsw_sp 
*mlxsw_sp,
 static void mlxsw_sp_fib_lpm_tree_unlink(struct mlxsw_sp *mlxsw_sp,
 struct mlxsw_sp_fib *fib)
 {
-   struct mlxsw_sp_prefix_usage req_prefix_usage = {{ 0 } };
-   struct mlxsw_sp_lpm_tree *lpm_tree;
-
-   /* Aggregate prefix lengths across all virtual routers to make
-* sure we only have used prefix lengths in the LPM tree.
-*/
-   mlxsw_sp_vrs_prefixes(mlxsw_sp, fib->proto, &req_prefix_usage);
-   lpm_tree = mlxsw_sp_lpm_tree_get(mlxsw_sp, &req_prefix_usage,
-fib->proto);
-   if (IS_ERR(lpm_tree))
-   goto err_tree_get;
-   mlxsw_sp_vrs_lpm_tree_replace(mlxsw_sp, fib, lpm_tree);
-
-err_tree_get:
if (!mlxsw_sp_prefix_usage_none(&fib->prefix_usage))
return;
mlxsw_sp_vr_lpm_tree_unbind(mlxsw_sp, fib);
-- 
2.9.5

Re: [PATCH] Add a driver for Renesas uPD60620 and uPD60620A PHYs

2017-10-08 Thread Bernd Edlinger

Hi Andrew,

Sorry for the delayed reply.
It looks like I did not receive a copy of your e-mail.

 >> Do you suggest that there are cases where auto negotiation does not
 >> reach completion, and still provides a usable link status?
 >
 > My experience is that it often returns 10/half, since everything should
 > support that. And depending on what the MAC is doing, packets can
 > sometime get across the link.
 >>
 >> I have tried to connect to link partners with fixed configuration
 >> but even then the auto negotiation always completes normally.
 >
 > Which is a bit odd.
 >
 > There are a few different possibilities here.  The peer PHY driver is
 > broken. Rather than doing fixed, it actually set the possible
 > negotiation options to just the one setting you tried to fix it
 > to. And hence the uPD60620 device negotiated fine. Or the uPD60620 is
 > broken and said it negotiated, but in fact it failed.
 >
 > What was the result? 10/Half, or the fixed values you set the peer to?

This is a dual-channel PHY, so I did just connect both ports and
played with the mii-tool -F / -A in different combinations on each
port and observed what happens when the cable is plugged in.

What happens is that the port with autonegotiation enabled detects
the correct speed and always half duplex, so the ASIC _pretends_ that
autonegotiation completes, when in fact only parallel detection
succeeded.  Of course the other phy may be in full-duplex mode, but
that can not be detected by parallel detection.

The duplex mode would be full duplex by default, but my initialization
overrides a possible strap option and changes that to half duplex:

+   /* Enable support for passive HUBs (could be a strap option) */
+   /* PHYMODE: All speeds, HD in parallel detect */
+   return phy_write(phydev, PHY_SPM, 0x0180 | phydev->mdio.addr);

 >>
 >> Signed-off-by: Bernd Edlinger 
 >
 > Please send this as a new patch. If we were to take this as is, all
 > the comments above would end up in the commit message.
 >
 > ---
 >
 > Under the --- you can however add comments which don't go into the
 > commit log. Good practice is to list the things you changed since the
 > previous version.

Thanks, I did not know that.
I will re-send the patch in a new thread.

Bernd.

Re: [PATCH net-next RFC 1/9] net: dsa: mv88e6xxx: add accessors for PTP/TAI registers

2017-10-08 Thread Richard Cochran

On Thu, Sep 28, 2017 at 10:25:33AM -0500, Brandon Streiff wrote:
> This patch implements support for accessing PTP/TAI registers through

To avoid confusion, it would be helpful to mention what TAI stands for
here and also in the source code comments!

Thanks,
Richard

Re: [patch net-next 1/6] net: bridge: Use the MDB_RTR_TYPE_TEMP on bridge device too

2017-10-08 Thread Nikolay Aleksandrov

On 08/10/17 12:39, Nikolay Aleksandrov wrote:
> On 08/10/17 08:23, Yotam Gigi wrote:
>> On 10/05/2017 03:09 PM, Nikolay Aleksandrov wrote:
>>> On 05/10/17 13:36, Jiri Pirko wrote:
>>>> From: Yotam Gigi
>>>>
>>>> Every bridge port is in one of four mcast router port states:
>>>>  - MDB_RTR_TYPE_PERM - the port is set by the user to be an mrouter port
>>>>    regardless of IGMP queries.
>>>>  - MDB_RTR_TYPE_DISABLED - the port is set by the user to not be an mrouter
>>>>    port regardless of IGMP queries.
>>>>  - MDB_RTR_TYPE_TEMP - the port is set by the user to be in mcast router
>>>>    learning state, but currently it is not an mrouter port as no IGMP query
>>>>    has been received by it for the last multicast_querier_interval.
>>>>  - MDB_RTR_TYPE_TEMP_QUERY - the port is set by the user to be in mcast
>>>>    router learning state, and currently it is an mrouter port due to an
>>>>    IGMP query that has been received by it during the passed
>>>>    multicast_querier_interval.
>>> I think you got the last two partially mixed up, MDB_RTR_TYPE_TEMP marks 
>>> the port as a router
>>> regardless if there were any igmp queries, while TYPE_TEMP_QUERY means it's 
>>> in learning
>>> state. It is the timer (armed vs not) that defines if currently the port is 
>>> a router
>>> when one of the TEMP/TEMP_QUERY are set. In the _TEMP case it is always 
>>> armed as it
>>> is refreshed by user or igmp queries which was the point of that mode.
>>> So this means in br_multicast_router() just check for the timer_pending or 
>>> perm mode.
>>
>>
>> As much as I tried to make this clear, it seems like I failed :)
>>
>> The 4 states I described are currently the "bridged port" states, not the
>> "bridge device" state. A bridged port has these 4 states, all can be set by 
>> the
>> user, while the bridge device only uses 3 of these states. This patch makes 
>> the
>> bridge device use the 4 states too. I thought it makes sense.
> 
> (disclaimer: this is all about bridge ports, not bridge device)
> Right, I'll try to explain again: _TEMP always marks the port as a mcast 
> router,
> it does not put it into just learning state waiting for an igmp query and it 
> can
> be refreshed by either a query or the user again setting the port in _TEMP.
> While _TEMP_QUERY puts the port in learning state waiting for a query to 
> become
> a router, and _TEMP downgrades to _TEMP_QUERY if it expires.
> 
> Does that make it clearer ?
> 
> so for _TEMP you say:
>>>>  - MDB_RTR_TYPE_TEMP - the port is set by the user to be in mcast router
>>>>    learning state, but currently it is not an mrouter port as no IGMP query
>>>>    has been received by it for the last multicast_querier_interval.
> 
> which is not the case, it is always a router when that mode is set on a port.
> Same for _TEMP_QUERY.

Err, sorry by same I meant it is not correct, not that the _TEMP definition is 
the same.
Need to get coffee :-)

> 
>>
>> The first paragraph describes the current states of a bridged port, and the
>> second one explains the difference between bridged port and bridge device. I
>> will (try to) make it clearer if we agree on resending this patch.
>>
>> Is it clearer now?
>>
>>
>>>
>>> In the port code you have the following transitions:
>>>  _TEMP -> TEMP_QUERY (on timer fire or user-set val, port becomes learning 
>>> only)
>>>  _TEMP -> _TEMP (noop on user refresh or igmp query, timer refreshes)
>>>  _TEMP_QUERY -> _TEMP_QUERY (on igmp query the timer is armed, port becomes 
>>> router)
>>>
>>> you never have _TEMP_QUERY -> _TEMP, which you're using here to denote the 
>>> timer
>>> getting armed and the bridge becoming a router.
>>
>>
>> I am not sure I got this one. I do address that: when an IGMP query is 
>> recieved
>> and the current state is _TEMP_QUERY, I arm the timer and set the state to
>> _TEMP. I marked that place on the patch, so you can see below.
>>
> 
> Exactly, there is no such transition for the ports. I tried to say that 
> you're using
> the router type to distinguish between when a query is received and it is 
> just learning.
> I get that you need to do so, but that deviates from how ports are handled, 
> thus I
> suggested to use the timer state instead and drop the _TEMP for bridge device 
> altogether.
> If it's possible then the patch will be much simpler and you will not need 
> the hacks
> to hide the state from user-space which is the part I really don't like.
> 
>>
>>>
>>>> The bridge device (brX) itself can also be configured by the user to be
>>>> either fixed, disabled or learning mrouter port states, but currently there
>>>> is no distinction between the MDB_RTR_TYPE_TEMP_QUERY and MDB_RTR_TYPE_TEMP
>>>> in the bridge internal state. Due to that, when an IGMP query is received,
>>>> it is not straightforward to tell whether it changes the bridge device
>>>> mrouter port status or not.
>>> But before this patch the bridge device could not get that set.
>>>
>>>> Further patches in this

Re: [PATCH net-next RFC 2/9] net: dsa: mv88e6xxx: expose switch time as a PTP hardware clock

2017-10-08 Thread Richard Cochran

On Fri, Sep 29, 2017 at 03:17:02PM +, Brandon Streiff wrote:
>  
> Although now that I'm looking it over again, I'm also not certain of
> the need. Even if we're called more frequently than we expect, that
> doesn't seem to be harmful with regard to timekeeping. Hmm.

Just keep it simple and drop the extra logic.  It doesn't hurt to
over-sample the clock.  Here is what I did:

/* Covers both a 100 or a 125 MHz input clock. */
#define MV88E635X_OVERFLOW_PERIOD (HZ * 16)

static void mv88e635x_overflow_check(struct work_struct *ws)
{
	struct timespec64 ts;
	struct mv88e6xxx_chip *ps =
		container_of(ws, struct mv88e6xxx_chip, oflow_work.work);

	mv88e635x_ptp_gettime(&ps->ptp_info, &ts);
	pr_debug("mv88e635x overflow check at %lld.%09lu\n",
		 ts.tv_sec, ts.tv_nsec);
	schedule_delayed_work(&ps->oflow_work, MV88E635X_OVERFLOW_PERIOD);
}

Thanks,
Richard

[PATCH iproute2 net-next] ip: mroute: Print offload indication

2017-10-08 Thread Yotam Gigi

Since kernel net-next commit c7c0bbeae950 ("net: ipmr: Add MFC offload
indication") the kernel indicates on an MFC entry whether it was offloaded
using the RTNH_F_OFFLOAD flag. Update the "ip mroute show" command to
indicate when a route is offloaded, similarly to the "ip route show"
command.

Example output:
$ ip mroute
(0.0.0.0, 239.255.0.1)  Iif: sw1p7  Oifs: t_br0 State: resolved offload
(192.168.1.1, 239.255.0.1)  Iif: sw1p7  Oifs: sw1p4 State: resolved offload
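
As an illustration of how the State column in the example output is built, the
sketch below formats the same text from a flags word. It is not the iproute2
code: demo_format_state() is a hypothetical helper, and the DEMO_* bit values
are placeholders defined locally so the sketch is self-contained (the real
flag definitions live in the kernel's linux/rtnetlink.h).

```c
#include <stdio.h>

/* Placeholder flag bits for the sketch; see linux/rtnetlink.h for the real ones. */
#define DEMO_RTNH_F_OFFLOAD	8
#define DEMO_RTNH_F_UNRESOLVED	32

/*
 * Write " State: resolved|unresolved" followed by " offload" when the
 * offload bit is set. Returns what snprintf() returns.
 */
static int demo_format_state(char *buf, size_t len, unsigned int flags)
{
	return snprintf(buf, len, " State: %s%s",
			flags & DEMO_RTNH_F_UNRESOLVED ? "unresolved" : "resolved",
			flags & DEMO_RTNH_F_OFFLOAD ? " offload" : "");
}
```

Appending the word only when the bit is set keeps the output unchanged for
routes that are not offloaded.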

Signed-off-by: Yotam Gigi 
---
 ip/ipmroute.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ip/ipmroute.c b/ip/ipmroute.c
index b51c23c..453a6cf 100644
--- a/ip/ipmroute.c
+++ b/ip/ipmroute.c
@@ -161,6 +161,8 @@ int print_mroute(const struct sockaddr_nl *who, struct 
nlmsghdr *n, void *arg)
}
fprintf(fp, " State: %s",
r->rtm_flags & RTNH_F_UNRESOLVED ? "unresolved" : "resolved");
+   if (r->rtm_flags & RTNH_F_OFFLOAD)
+   fprintf(fp, " offload");
if (show_stats && tb[RTA_MFC_STATS]) {
struct rta_mfc_stats *mfcs = RTA_DATA(tb[RTA_MFC_STATS]);
 
-- 
2.8.4

Re: [patch net-next 1/6] net: bridge: Use the MDB_RTR_TYPE_TEMP on bridge device too

2017-10-08 Thread Nikolay Aleksandrov

On 08/10/17 08:23, Yotam Gigi wrote:
> On 10/05/2017 03:09 PM, Nikolay Aleksandrov wrote:
>> On 05/10/17 13:36, Jiri Pirko wrote:
>>> From: Yotam Gigi 
>>>
>>> Every bridge port is in one of four mcast router port states:
>>>  - MDB_RTR_TYPE_PERM - the port is set by the user to be an mrouter port
>>>regardless of IGMP queries.
>>>  - MDB_RTR_TYPE_DISABLED - the port is set by the user to not be an mrouter
>>>port regardless of IGMP queries.
>>>  - MDB_RTR_TYPE_TEMP - the port is set by the user to be in mcast router
>>>learning state, but currently it is not an mrouter port as no IGMP query
>>>has been received by it for the last multicast_querier_interval.
>>>  - MDB_RTR_TYPE_TEMP_QUERY - the port is set by the user to be in mcast
>>>router learning state, and currently it is an mrouter port due to an
>>>IGMP query that has been received by it during the passed
>>>multicast_querier_interval.
>> I think you got the last two partially mixed up, MDB_RTR_TYPE_TEMP marks the 
>> port as a router
>> regardless if there were any igmp queries, while TYPE_TEMP_QUERY means it's 
>> in learning
>> state. It is the timer (armed vs not) that defines if currently the port is 
>> a router
>> when one of the TEMP/TEMP_QUERY are set. In the _TEMP case it is always 
>> armed as it
>> is refreshed by user or igmp queries which was the point of that mode.
>> So this means in br_multicast_router() just check for the timer_pending or 
>> perm mode.
> 
> 
> As much as I tried to make this clear, it seems like I failed :)
> 
> The 4 states I described are currently the "bridged port" states, not the
> "bridge device" state. A bridged port has these 4 states, all can be set by 
> the
> user, while the bridge device only uses 3 of these states. This patch makes 
> the
> bridge device use the 4 states too. I thought it makes sense.

(disclaimer: this is all about bridge ports, not bridge device)
Right, I'll try to explain again: _TEMP always marks the port as a mcast router,
it does not put it into just learning state waiting for an igmp query and it can
be refreshed by either a query or the user again setting the port in _TEMP.
While _TEMP_QUERY puts the port in learning state waiting for a query to become
a router, and _TEMP downgrades to _TEMP_QUERY if it expires.

Does that make it clearer ?

so for _TEMP you say:
>>>  - MDB_RTR_TYPE_TEMP - the port is set by the user to be in mcast router
>>>learning state, but currently it is not an mrouter port as no IGMP query
>>>has been received by it for the last multicast_querier_interval.

which is not the case, it is always a router when that mode is set on a port.
Same for _TEMP_QUERY.

> 
> The first paragraph describes the current states of a bridged port, and the
> second one explains the difference between bridged port and bridge device. I
> will (try to) make it clearer if we agree on resending this patch.
> 
> Is it clearer now?
> 
> 
>>
>> In the port code you have the following transitions:
>>  _TEMP -> TEMP_QUERY (on timer fire or user-set val, port becomes learning 
>> only)
>>  _TEMP -> _TEMP (noop on user refresh or igmp query, timer refreshes)
>>  _TEMP_QUERY -> _TEMP_QUERY (on igmp query the timer is armed, port becomes 
>> router)
>>
>> you never have _TEMP_QUERY -> _TEMP, which you're using here to denote the 
>> timer
>> getting armed and the bridge becoming a router.
> 
> 
> I am not sure I got this one. I do address that: when an IGMP query is 
> recieved
> and the current state is _TEMP_QUERY, I arm the timer and set the state to
> _TEMP. I marked that place on the patch, so you can see below.
> 

Exactly, there is no such transition for the ports. I tried to say that you're 
using
the router type to distinguish between when a query is received and it is just 
learning.
I get that you need to do so, but that deviates from how ports are handled, 
thus I
suggested to use the timer state instead and drop the _TEMP for bridge device 
altogether.
If it's possible then the patch will be much simpler and you will not need the 
hacks
to hide the state from user-space which is the part I really don't like.
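
To summarise the port-side behaviour described in this reply, here is a small
userspace model. It is only a sketch of the explanation given here, not the
bridge code: all demo_* names are hypothetical, and the timer is reduced to a
boolean that says whether it is armed.

```c
#include <stdbool.h>

enum demo_rtr_type {
	DEMO_RTR_DISABLED,
	DEMO_RTR_TEMP_QUERY,	/* learning: router only while the timer is armed */
	DEMO_RTR_PERM,		/* always a router */
	DEMO_RTR_TEMP,		/* router with the timer kept armed */
};

struct demo_port {
	enum demo_rtr_type type;
	bool timer_armed;
};

/* User sets the mode; only _TEMP arms the timer right away. */
static void demo_set_type(struct demo_port *p, enum demo_rtr_type type)
{
	p->type = type;
	p->timer_armed = (type == DEMO_RTR_TEMP);
}

/* An IGMP query (re)arms the timer in both temporary modes. */
static void demo_igmp_query(struct demo_port *p)
{
	if (p->type == DEMO_RTR_TEMP || p->type == DEMO_RTR_TEMP_QUERY)
		p->timer_armed = true;
}

/* On expiry _TEMP downgrades to _TEMP_QUERY; there is no reverse transition. */
static void demo_timer_expired(struct demo_port *p)
{
	p->timer_armed = false;
	if (p->type == DEMO_RTR_TEMP)
		p->type = DEMO_RTR_TEMP_QUERY;
}

/* Router status follows the timer state or the permanent mode. */
static bool demo_is_router(const struct demo_port *p)
{
	return p->type == DEMO_RTR_PERM || p->timer_armed;
}
```

In this model the router status is derived from the timer rather than from a
switch of type on each query, which is the simplification suggested for the
bridge device as well.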

> 
>>
>>> The bridge device (brX) itself can also be configured by the user to be
>>> either fixed, disabled or learning mrouter port states, but currently there
>>> is no distinction between the MDB_RTR_TYPE_TEMP_QUERY and MDB_RTR_TYPE_TEMP
>>> in the bridge internal state. Due to that, when an IGMP query is received,
>>> it is not straightforward to tell whether it changes the bridge device
>>> mrouter port status or not.
>> But before this patch the bridge device could not get that set.
>>
>>> Further patches in this patch-set will introduce notifications upon the
>>> bridge device mrouter port state. In order to prevent resending bridge
>>> mrouter notification when it is not needed, such distinction is necessary.
>>>
>> Granted the bridge device hasn't got a way to clearly distinguish the 
>> transitions
>>

Re: [patch net-next 1/6] net: bridge: Use the MDB_RTR_TYPE_TEMP on bridge device too

2017-10-08 Thread Yotam Gigi

On 10/08/2017 12:42 PM, Nikolay Aleksandrov wrote:
> On 08/10/17 12:39, Nikolay Aleksandrov wrote:
>> On 08/10/17 08:23, Yotam Gigi wrote:
>>> On 10/05/2017 03:09 PM, Nikolay Aleksandrov wrote:
>>>> On 05/10/17 13:36, Jiri Pirko wrote:
>>>>> From: Yotam Gigi
>>>>>
>>>>> Every bridge port is in one of four mcast router port states:
>>>>>  - MDB_RTR_TYPE_PERM - the port is set by the user to be an mrouter port
>>>>>    regardless of IGMP queries.
>>>>>  - MDB_RTR_TYPE_DISABLED - the port is set by the user to not be an mrouter
>>>>>    port regardless of IGMP queries.
>>>>>  - MDB_RTR_TYPE_TEMP - the port is set by the user to be in mcast router
>>>>>    learning state, but currently it is not an mrouter port as no IGMP query
>>>>>    has been received by it for the last multicast_querier_interval.
>>>>>  - MDB_RTR_TYPE_TEMP_QUERY - the port is set by the user to be in mcast
>>>>>    router learning state, and currently it is an mrouter port due to an
>>>>>    IGMP query that has been received by it during the passed
>>>>>    multicast_querier_interval.
>>>> I think you got the last two partially mixed up, MDB_RTR_TYPE_TEMP marks the port as a router
>>>> regardless if there were any igmp queries, while TYPE_TEMP_QUERY means it's in learning
>>>> state. It is the timer (armed vs not) that defines if currently the port is a router
>>>> when one of the TEMP/TEMP_QUERY are set. In the _TEMP case it is always armed as it
>>>> is refreshed by user or igmp queries which was the point of that mode.
>>>> So this means in br_multicast_router() just check for the timer_pending or perm mode.
>>>
>>> As much as I tried to make this clear, it seems like I failed :)
>>>
>>> The 4 states I described are currently the "bridged port" states, not the
>>> "bridge device" state. A bridged port has these 4 states, all can be set by 
>>> the
>>> user, while the bridge device only uses 3 of these states. This patch makes 
>>> the
>>> bridge device use the 4 states too. I thought it makes sense.
>> (disclaimer: this is all about bridge ports, not bridge device)
>> Right, I'll try to explain again: _TEMP always marks the port as a mcast 
>> router,
>> it does not put it into just learning state waiting for an igmp query and it 
>> can
>> be refreshed by either a query or the user again setting the port in _TEMP.
>> While _TEMP_QUERY puts the port in learning state waiting for a query to 
>> become
>> a router, and _TEMP downgrades to _TEMP_QUERY if it expires.
>>
>> Does that make it clearer ?
>>
>> so for _TEMP you say:
>>>>>  - MDB_RTR_TYPE_TEMP - the port is set by the user to be in mcast router
>>>>>    learning state, but currently it is not an mrouter port as no IGMP query
>>>>>    has been received by it for the last multicast_querier_interval.
>> which is not the case, it is always a router when that mode is set on a port.
>> Same for _TEMP_QUERY.
> Err, sorry by same I meant it is not correct, not that the _TEMP definition 
> is the same.
> Need to get coffee :-)

Oh, I see that I was clear but not right :)

I had a look at the code and it seems like you are right - for some reason I
thought that _TEMP is learning-active and _TEMP_QUERY is learning-inactive, and
that the ports change states according to IGMP queries.


>
>>> The first paragraph describes the current states of a bridged port, and the
>>> second one explains the difference between bridged port and bridge device. I
>>> will (try to) make it clearer if we agree on resending this patch.
>>>
>>> Is it clearer now?


Yes it is.

>>>
>>>
>>>> In the port code you have the following transitions:
>>>>  _TEMP -> TEMP_QUERY (on timer fire or user-set val, port becomes learning only)
>>>>  _TEMP -> _TEMP (noop on user refresh or igmp query, timer refreshes)
>>>>  _TEMP_QUERY -> _TEMP_QUERY (on igmp query the timer is armed, port becomes router)
>>>>
>>>> you never have _TEMP_QUERY -> _TEMP, which you're using here to denote the timer
>>>> getting armed and the bridge becoming a router.
>>>
>>> I am not sure I got this one. I do address that: when an IGMP query is 
>>> recieved
>>> and the current state is _TEMP_QUERY, I arm the timer and set the state to
>>> _TEMP. I marked that place on the patch, so you can see below.
>>>
>> Exactly, there is no such transition for the ports. I tried to say that 
>> you're using
>> the router type to distinguish between when a query is received and it is 
>> just learning.
>> I get that you need to do so, but that deviates from how ports are handled, 
>> thus I
>> suggested to use the timer state instead and drop the _TEMP for bridge 
>> device altogether.
>> If it's possible then the patch will be much simpler and you will not need 
>> the hacks
>> to hide the state from user-space which is the part I really don't like.


Yeah, I agree. Thanks for the feedback and sorry for the late answer :)

>>
> The bridge device (brX) itself can also be

Re: [PATCH net-next RFC 2/9] net: dsa: mv88e6xxx: expose switch time as a PTP hardware clock

2017-10-08 Thread Richard Cochran

On Fri, Sep 29, 2017 at 03:28:02PM +, Brandon Streiff wrote:
> 
> NETWORK_PHY_TIMESTAMPING implies NET_PTP_CLASSIFY (which I do use)
> and net/core/timestamping.c (which I didn't). It probably makes more
> sense to just depend on NET_PTP_CLASSIFY directly.

Yes, that makes sense to do, if you can make it work.

With my driver I tried depending on NET_PTP_CLASSIFY, but there was
some Kconfig issue, and rather than figuring it out I did the lazy
thing and used NETWORK_PHY_TIMESTAMPING.

Thanks,
Richard

[PATCHv2] Add a driver for Renesas uPD60620 and uPD60620A PHYs

2017-10-08 Thread Bernd Edlinger

Signed-off-by: Bernd Edlinger 
---
 drivers/net/phy/Kconfig|   5 +++
 drivers/net/phy/Makefile   |   1 +
 drivers/net/phy/uPD60620.c | 109 +
 3 files changed, 115 insertions(+)
 create mode 100644 drivers/net/phy/uPD60620.c

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index cd931cf..e2cf8ff 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -366,6 +366,11 @@ config REALTEK_PHY
---help---
  Supports the Realtek 821x PHY.
 
+config RENESAS_PHY
+   tristate "Driver for Renesas PHYs"
+   ---help---
+ Supports the Renesas PHYs uPD60620 and uPD60620A.
+
 config ROCKCHIP_PHY
 tristate "Driver for Rockchip Ethernet PHYs"
 ---help---
diff --git a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile
index 416df92..1404ad3 100644
--- a/drivers/net/phy/Makefile
+++ b/drivers/net/phy/Makefile
@@ -72,6 +72,7 @@ obj-$(CONFIG_MICROSEMI_PHY)   += mscc.o
 obj-$(CONFIG_NATIONAL_PHY) += national.o
 obj-$(CONFIG_QSEMI_PHY)+= qsemi.o
 obj-$(CONFIG_REALTEK_PHY)  += realtek.o
+obj-$(CONFIG_RENESAS_PHY)  += uPD60620.o
 obj-$(CONFIG_ROCKCHIP_PHY) += rockchip.o
 obj-$(CONFIG_SMSC_PHY) += smsc.o
 obj-$(CONFIG_STE10XP)  += ste10Xp.o
diff --git a/drivers/net/phy/uPD60620.c b/drivers/net/phy/uPD60620.c
new file mode 100644
index 000..96b3347
--- /dev/null
+++ b/drivers/net/phy/uPD60620.c
@@ -0,0 +1,109 @@
+/*
+ * Driver for the Renesas PHY uPD60620.
+ *
+ * Copyright (C) 2015 Softing Industrial Automation GmbH
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ */
+
+#include 
+#include 
+#include 
+
+#define UPD60620_PHY_ID0xb8242824
+
+/* Extended Registers and values */
+/* PHY Special Control/Status*/
+#define PHY_PHYSCR 0x1F  /* PHY.31 */
+#define PHY_PHYSCR_10MB0x0004/* PHY speed = 10mb */
+#define PHY_PHYSCR_100MB   0x0008/* PHY speed = 100mb */
+#define PHY_PHYSCR_DUPLEX  0x0010/* PHY Duplex */
+
+/* PHY Special Modes */
+#define PHY_SPM0x12  /* PHY.18 */
+
+/* Init PHY */
+
+static int upd60620_config_init(struct phy_device *phydev)
+{
+   /* Enable support for passive HUBs (could be a strap option) */
+   /* PHYMODE: All speeds, HD in parallel detect */
+   return phy_write(phydev, PHY_SPM, 0x0180 | phydev->mdio.addr);
+}
+
+/* Get PHY status from common registers */
+
+static int upd60620_read_status(struct phy_device *phydev)
+{
+   int phy_state;
+
+   /* Read negotiated state */
+   phy_state = phy_read(phydev, MII_BMSR);
+   if (phy_state < 0)
+   return phy_state;
+
+   phydev->link = 0;
+   phydev->lp_advertising = 0;
+   phydev->pause = 0;
+   phydev->asym_pause = 0;
+
+   if (phy_state & (BMSR_ANEGCOMPLETE | BMSR_LSTATUS)) {
+   phy_state = phy_read(phydev, PHY_PHYSCR);
+   if (phy_state < 0)
+   return phy_state;
+
+   if (phy_state & (PHY_PHYSCR_10MB | PHY_PHYSCR_100MB)) {
+   phydev->link = 1;
+   phydev->speed = SPEED_10;
+   phydev->duplex = DUPLEX_HALF;
+
+   if (phy_state & PHY_PHYSCR_100MB)
+   phydev->speed = SPEED_100;
+   if (phy_state & PHY_PHYSCR_DUPLEX)
+   phydev->duplex = DUPLEX_FULL;
+
+   phy_state = phy_read(phydev, MII_LPA);
+   if (phy_state < 0)
+   return phy_state;
+
+   phydev->lp_advertising
+   = mii_lpa_to_ethtool_lpa_t(phy_state);
+
+   if (phydev->duplex == DUPLEX_FULL) {
+   if (phy_state & LPA_PAUSE_CAP)
+   phydev->pause = 1;
+   if (phy_state & LPA_PAUSE_ASYM)
+   phydev->asym_pause = 1;
+   }
+   }
+   }
+   return 0;
+}
+
+MODULE_DESCRIPTION("Renesas uPD60620 PHY driver");
+MODULE_AUTHOR("Bernd Edlinger ");
+MODULE_LICENSE("GPL");
+
+static struct phy_driver upd60620_driver[1] = { {
+   .phy_id = UPD60620_PHY_ID,
+   .phy_id_mask= 0xfffe,
+   .name   = "Renesas uPD60620",
+   .features   = PHY_BASIC_FEATURES,
+   .flags  = 0,
+   .config_init= upd60620_config_init,
+   .config_aneg= genphy_config_aneg,
+   .read_status= upd60620_read_status,
+} };
+
+module_phy_driver(upd60620_driver);
+
+static struct mdio_device_id __maybe_unused
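
The speed/duplex decoding in upd60620_read_status() can be tried outside the
kernel. Below is a minimal user-space sketch, not the driver itself: the three
bit masks are copied from the patch, while struct link_mode and
physcr_decode() are hypothetical names made up for illustration. It assumes,
as the patch does, that a set 10MB or 100MB bit in PHY.31 means the link is up.

```c
#include <stdint.h>

/* Bit masks as defined in the patch (PHY.31, Special Control/Status). */
#define PHY_PHYSCR_10MB   0x0004
#define PHY_PHYSCR_100MB  0x0008
#define PHY_PHYSCR_DUPLEX 0x0010

/* Hypothetical result type, standing in for the phy_device fields. */
struct link_mode {
	int link;        /* 1 if a speed bit is set */
	int speed_mbps;  /* 10 or 100, 0 when the link is down */
	int full_duplex; /* 1 for full duplex, 0 for half */
};

/* Decode a raw PHYSCR value the same way the patch's read_status does. */
static struct link_mode physcr_decode(uint16_t physcr)
{
	struct link_mode m = { 0, 0, 0 };

	if (!(physcr & (PHY_PHYSCR_10MB | PHY_PHYSCR_100MB)))
		return m; /* no speed resolved: report link down */

	m.link = 1;
	/* 100MB takes precedence, matching the order of the checks in the patch */
	m.speed_mbps = (physcr & PHY_PHYSCR_100MB) ? 100 : 10;
	m.full_duplex = !!(physcr & PHY_PHYSCR_DUPLEX);
	return m;
}
```

For example, a register value of 0x0018 decodes to 100 Mb/s full duplex and
0x0004 to 10 Mb/s half duplex.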

Re: [PATCH net-next RFC 2/9] net: dsa: mv88e6xxx: expose switch time as a PTP hardware clock

2017-10-08 Thread Richard Cochran

On Thu, Sep 28, 2017 at 10:25:34AM -0500, Brandon Streiff wrote:
> +static int mv88e6xxx_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
> +{
> + if (scaled_ppm == 0)
> + return 0;
> +
> + return -EOPNOTSUPP;
> +}

We really want to have an adjustable clock here.  More below.

> +int mv88e6xxx_ptp_setup(struct mv88e6xxx_chip *chip)
> +{
> + /* Set up the cycle counter */
> + memset(&chip->tstamp_cc, 0, sizeof(chip->tstamp_cc));
> + chip->tstamp_cc.read= mv88e6xxx_ptp_clock_read;
> + chip->tstamp_cc.mask= CYCLECOUNTER_MASK(32);
> + /* Raw timestamps are in units of 8-ns clock periods. */
> + chip->tstamp_cc.mult= 8;
> + chip->tstamp_cc.shift   = 0;

First of all, the switch can use an external clock, and so at the very
least, the period should be a macro so that if and when we support the
external clock, the macro may be converted into a variable.

Secondly, the mult/shift should be chosen to allow the finest possible
frequency adjustment.  Here is what I did:

---
#define N 28
#define CC_MULT (8 << N)

int mv88e635x_setup(struct dsa_switch *ds)
{
struct mv88e6xxx_chip *ps = ds->priv;

ps->cc.read = mv88e635x_global_time_read;
ps->cc.mask = CLOCKSOURCE_MASK(32);
ps->cc.mult = CC_MULT;
ps->cc.shift = N;
timecounter_init(&ps->tc, &ps->cc, ktime_to_ns(ktime_get_real()));
...
}

static int mv88e635x_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb)
{
u64 adj;
u32 diff, mult;
int neg_adj = 0;
struct mv88e6xxx_chip *ps =
container_of(ptp, struct mv88e6xxx_chip, ptp_info);

if (ppb < 0) {
neg_adj = 1;
ppb = -ppb;
}
mult = CC_MULT;
adj = mult;
adj *= ppb;
diff = div_u64(adj, 1000000000ULL);

mutex_lock(&ps->clock_mutex);
timecounter_read(&ps->tc);
ps->cc.mult = neg_adj ? mult - diff : mult + diff;
mutex_unlock(&ps->clock_mutex);

return 0;
}
---

(This is the legacy adjfreq method, but you can easily convert it into
 the adjfine method.)

Of course, this means that you'll have to drop the periodic output
signal code.

Thanks,
Richard
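
To see why the larger shift matters, here is a small user-space sketch of the
arithmetic, under stated assumptions: an 8 ns nominal period and N = 28 as in
the mail above, with the divisor 10^9 because ppb means parts per billion.
All function names are hypothetical, and the scaled_ppm variant relies on the
PTP convention that scaled_ppm is ppm with a 16-bit binary fraction. It is an
illustration of the scaling only, not a drop-in adjfine implementation.

```c
#include <stdint.h>

#define CC_PERIOD_NS 8u   /* assumed nominal clock period in ns */
#define CC_SHIFT     28   /* "N" in the mail above */
#define CC_MULT      ((uint32_t)CC_PERIOD_NS << CC_SHIFT)

/* Apply a signed correction to the nominal multiplier. */
static uint32_t cc_apply(uint32_t diff, int negative)
{
	return negative ? CC_MULT - diff : CC_MULT + diff;
}

/* Legacy adjfreq-style scaling: ppb is parts per billion. */
static uint32_t cc_adjusted_mult(int32_t ppb)
{
	int negative = ppb < 0;
	uint64_t mag = negative ? (uint64_t)(-(int64_t)ppb) : (uint64_t)ppb;

	return cc_apply((uint32_t)(((uint64_t)CC_MULT * mag) / 1000000000ULL),
			negative);
}

/* adjfine-style scaling: scaled_ppm is ppm with a 16-bit binary fraction.
 * The product can overflow for very large inputs; a real driver would
 * bound scaled_ppm first.
 */
static uint32_t cc_adjusted_mult_fine(int64_t scaled_ppm)
{
	int negative = scaled_ppm < 0;
	uint64_t mag = negative ? (uint64_t)(-scaled_ppm) : (uint64_t)scaled_ppm;

	return cc_apply((uint32_t)(((uint64_t)CC_MULT * mag) / (1000000ULL << 16)),
			negative);
}

/* Convert a cycle delta to nanoseconds: ns = (cycles * mult) >> shift. */
static uint64_t cc_cycles_to_ns(uint64_t cycles, uint32_t mult)
{
	return (cycles * mult) >> CC_SHIFT;
}
```

With mult = 8 and shift = 0 the smallest possible change of mult is one part
in 8, which is far too coarse for a servo. With mult = 8 << 28 one step is one
part in 2^31, a little under half a ppb.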

Re: [patch net-next 0/2] mlxsw: Add more extack error reporting

2017-10-08 Thread Jiri Pirko

Sun, Oct 08, 2017 at 12:23:50AM CEST, da...@davemloft.net wrote:
>From: Jiri Pirko 
>Date: Thu,  5 Oct 2017 23:39:58 +0200
>
>> From: Jiri Pirko 
>> 
>> Ido says:
>> 
>> Add error messages to VLAN and bridge enslavements to help users
>> understand why the enslavement failed.
>
>For some reason patch #2 didn't make it to the list and patchwork,
>could you please resend (and add in David Ahern's ACK of course).

Will do. Thanks

Re: [PATCH net-next RFC 8/9] net: dsa: mv88e6xxx: add rx/tx timestamping support

2017-10-08 Thread Richard Cochran

On Thu, Sep 28, 2017 at 10:25:40AM -0500, Brandon Streiff wrote:

> +static bool mv88e6xxx_should_tstamp(struct mv88e6xxx_chip *chip, int port,
> + struct sk_buff *skb, unsigned int type)
> +{
> + struct mv88e6xxx_port_hwtstamp *ps = >port_hwtstamp[port];
> + u8 *ptp_hdr, *msgtype;
> + bool ret;
> +
> + if (port < 0 || port >= mv88e6xxx_num_ports(chip))
> + return false;
> +
> + ptp_hdr = _get_ptp_header(skb, type);
> + if (IS_ERR(ptp_hdr))
> + return false;
> +
> + if (unlikely(type & PTP_CLASS_V1))
> + msgtype = ptp_hdr + OFF_PTP_CONTROL;
> + else
> + msgtype = ptp_hdr;
> +
> + ret = test_bit(MV88E6XXX_HWTSTAMP_ENABLED, >state);

This should be the first test, don't you think?

> + dev_dbg(chip->dev,
> + "p%d: PTP message classification 0x%x type 0x%x, tstamp? %d",
> + port, type, *msgtype, (int)ret);
> +
> + return ret;
> +}
> +
> +/* rxtstamp will be called in interrupt context so we don't want to do
> + * anything like read PTP registers over SMI.
> + */
> +bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int port,
> +  struct sk_buff *skb, unsigned int type)
> +{
> + struct mv88e6xxx_chip *chip = ds->priv;
> + struct skb_shared_hwtstamps *shhwtstamps;
> + __be32 *ptp_rx_ts;
> + u8 *ptp_hdr;
> + u32 raw_ts;
> + u64 ns;
> +
> + if (!chip->info->ptp_support)
> + return false;
> +
> + if (port < 0 || port >= mv88e6xxx_num_ports(chip))
> + return false;

This test is duplicated in mv88e6xxx_should_tstamp().

> + if (!mv88e6xxx_should_tstamp(chip, port, skb, type))
> + return false;
> +
> + shhwtstamps = skb_hwtstamps(skb);
> + memset(shhwtstamps, 0, sizeof(*shhwtstamps));
> +
> + /* Because we configured the arrival timestamper to put the counter
> +  * into the 32-bit "reserved" field of the PTP header, we can retrieve
> +  * the value from the packet directly instead of having to retrieve it
> +  * via SMI.
> +  */
> + ptp_hdr = _get_ptp_header(skb, type);
> + if (IS_ERR(ptp_hdr))
> + return false;
> + ptp_rx_ts = (__be32 *)(ptp_hdr + OFF_PTP_RESERVED);
> + raw_ts = __be32_to_cpu(*ptp_rx_ts);
> + ns = timecounter_cyc2time(&chip->tstamp_tc, raw_ts);
> + shhwtstamps->hwtstamp = ns_to_ktime(ns);
> +
> + dev_dbg(chip->dev, "p%d: rxtstamp %llx\n", port, ns);
> +
> + return false;
> +}
> +
> +static void mv88e6xxx_txtstamp_work(struct work_struct *ugly)
> +{
> + struct mv88e6xxx_port_hwtstamp *ps = container_of(
> + ugly, struct mv88e6xxx_port_hwtstamp, tx_tstamp_work);
> + struct mv88e6xxx_chip *chip = container_of(
> + ps, struct mv88e6xxx_chip, port_hwtstamp[ps->port_id]);
> + struct sk_buff *tmp_skb;
> + unsigned long tmp_tstamp_start;
> + int err;
> + u16 departure_block[4];
> + u16 tmp_seq_id;
> +
> + if (!test_bit(MV88E6XXX_HWTSTAMP_TX_IN_PROGRESS, >state))
> + return;
> +
> + tmp_skb = ps->tx_skb;
> + tmp_seq_id = ps->tx_seq_id;
> + tmp_tstamp_start = ps->tx_tstamp_start;
> +
> + if (!tmp_skb)
> + return;
> +
> + mutex_lock(>reg_lock);
> + err = mv88e6xxx_port_ptp_read(chip, ps->port_id,
> +   MV88E6XXX_PORT_PTP_DEP_STS,
> +   departure_block,
> +   ARRAY_SIZE(departure_block));
> + mutex_unlock(>reg_lock);
> +
> + if (err)
> + goto free_and_clear_skb;
> +
> + if (departure_block[0] & MV88E6XXX_PTP_TS_VALID) {

You can avoid the IfOk anti-pattern here.  Make the test for !VALID
and move the 'else' block up.

> + struct skb_shared_hwtstamps shhwtstamps;
> + u64 ns;
> + u32 time_raw;
> + u16 status;
> +
> + /* We have the timestamp; go ahead and clear valid now */
> + mutex_lock(>reg_lock);
> + mv88e6xxx_port_ptp_write(chip, ps->port_id,
> +  MV88E6XXX_PORT_PTP_DEP_STS, 0);
> + mutex_unlock(>reg_lock);
> +
> + status = departure_block[0] &
> + MV88E6XXX_PTP_TS_STATUS_MASK;
> + if (status != MV88E6XXX_PTP_TS_STATUS_NORMAL) {
> + dev_warn(chip->dev, "p%d: tx timestamp overrun\n",
> +  ps->port_id);
> + goto free_and_clear_skb;
> + }
> +
> + if (departure_block[3] != tmp_seq_id) {
> + dev_warn(chip->dev, "p%d: unexpected sequence id\n",
> +  ps->port_id);
> + goto free_and_clear_skb;
> + }
> +
> + memset(&shhwtstamps, 0, sizeof(shhwtstamps));
> + time_raw = ((u32)departure_block[2] << 16) |
> +

[PATCH] tests: Remove bashisms (s/source/.)

2017-10-08 Thread Petr Vorel

Signed-off-by: Petr Vorel 
---
 testsuite/tests/ip/link/new_link.t| 2 +-
 testsuite/tests/ip/link/show_dev_wo_vf_rate.t | 2 +-
 testsuite/tests/ip/netns/set_nsid.t   | 2 +-
 testsuite/tests/ip/netns/set_nsid_batch.t | 2 +-
 testsuite/tests/ip/route/add_default_route.t  | 4 ++--
 testsuite/tests/ip/tunnel/add_tunnel.t| 2 +-
 testsuite/tests/tc/cls-testbed.t  | 2 +-
 testsuite/tests/tc/dsmark.t   | 2 +-
 testsuite/tests/tc/pedit.t| 2 +-
 9 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/testsuite/tests/ip/link/new_link.t 
b/testsuite/tests/ip/link/new_link.t
index 699adbcd..c17650a2 100755
--- a/testsuite/tests/ip/link/new_link.t
+++ b/testsuite/tests/ip/link/new_link.t
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 ts_log "[Testing add/del virtual links]"
 
diff --git a/testsuite/tests/ip/link/show_dev_wo_vf_rate.t 
b/testsuite/tests/ip/link/show_dev_wo_vf_rate.t
index a600ba65..5b3c004e 100755
--- a/testsuite/tests/ip/link/show_dev_wo_vf_rate.t
+++ b/testsuite/tests/ip/link/show_dev_wo_vf_rate.t
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 NL_FILE="tests/ip/link/dev_wo_vf_rate.nl"
 ts_ip "$0" "Show VF devices w/o VF rate info" -d monitor file $NL_FILE
diff --git a/testsuite/tests/ip/netns/set_nsid.t 
b/testsuite/tests/ip/netns/set_nsid.t
index 606d45ab..8f8c7792 100755
--- a/testsuite/tests/ip/netns/set_nsid.t
+++ b/testsuite/tests/ip/netns/set_nsid.t
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 ts_log "[Testing netns nsid]"
 
diff --git a/testsuite/tests/ip/netns/set_nsid_batch.t 
b/testsuite/tests/ip/netns/set_nsid_batch.t
index abb3f1bb..196fd4b3 100755
--- a/testsuite/tests/ip/netns/set_nsid_batch.t
+++ b/testsuite/tests/ip/netns/set_nsid_batch.t
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 ts_log "[Testing netns nsid in batch mode]"
 
diff --git a/testsuite/tests/ip/route/add_default_route.t 
b/testsuite/tests/ip/route/add_default_route.t
index e5ea6473..0b566f1f 100755
--- a/testsuite/tests/ip/route/add_default_route.t
+++ b/testsuite/tests/ip/route/add_default_route.t
@@ -1,6 +1,6 @@
-#!/bin/sh
+#!/bin/bash
 
-source lib/generic.sh
+. lib/generic.sh
 
 ts_log "[Testing add default route]"
 
diff --git a/testsuite/tests/ip/tunnel/add_tunnel.t 
b/testsuite/tests/ip/tunnel/add_tunnel.t
index 18f6e370..3f5a9d3c 100755
--- a/testsuite/tests/ip/tunnel/add_tunnel.t
+++ b/testsuite/tests/ip/tunnel/add_tunnel.t
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 TUNNEL_NAME="tunnel_test_ip"
 
diff --git a/testsuite/tests/tc/cls-testbed.t b/testsuite/tests/tc/cls-testbed.t
index 2afc26fc..d5c21e5c 100755
--- a/testsuite/tests/tc/cls-testbed.t
+++ b/testsuite/tests/tc/cls-testbed.t
@@ -1,7 +1,7 @@
 #!/bin/bash
 # vim: ft=sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 QDISCS="cbq htb dsmark"
 
diff --git a/testsuite/tests/tc/dsmark.t b/testsuite/tests/tc/dsmark.t
index 6934165e..177585e6 100755
--- a/testsuite/tests/tc/dsmark.t
+++ b/testsuite/tests/tc/dsmark.t
@@ -1,7 +1,7 @@
 #!/bin/bash
 # vim: ft=sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 ts_qdisc_available "dsmark"
 if [ $? -eq 0 ]; then
diff --git a/testsuite/tests/tc/pedit.t b/testsuite/tests/tc/pedit.t
index e9b6c333..8d531a05 100755
--- a/testsuite/tests/tc/pedit.t
+++ b/testsuite/tests/tc/pedit.t
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-source lib/generic.sh
+. lib/generic.sh
 
 DEV="$(rand_dev)"
 ts_ip "$0" "Add $DEV dummy interface" link add dev $DEV type dummy
-- 
2.14.2

[PATCH v2 6/7] crypto:chelsio:Move DMA un/mapping to chcr from lld cxgb4 driver

2017-10-08 Thread Harsh Jain

Allow chcr to do the DMA mapping/unmapping instead of the cxgb4 LLD.
This moves the "Copy AAD to dst buffer" requirement from the driver to
the firmware.

Signed-off-by: Harsh Jain 
---
 drivers/crypto/chelsio/chcr_algo.c   | 1645 ++
 drivers/crypto/chelsio/chcr_algo.h   |   44 +-
 drivers/crypto/chelsio/chcr_crypto.h |  114 ++-
 drivers/net/ethernet/chelsio/cxgb4/sge.c |8 +-
 4 files changed, 1116 insertions(+), 695 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index b13991d..646dfff 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -70,6 +70,8 @@
 #include "chcr_algo.h"
 #include "chcr_crypto.h"
 
+#define IV AES_BLOCK_SIZE
+
 static inline  struct chcr_aead_ctx *AEAD_CTX(struct chcr_context *ctx)
 {
return ctx->crypto_ctx->aeadctx;
@@ -102,7 +104,7 @@ static inline struct uld_ctx *ULD_CTX(struct chcr_context 
*ctx)
 
 static inline int is_ofld_imm(const struct sk_buff *skb)
 {
-   return (skb->len <= CRYPTO_MAX_IMM_TX_PKT_LEN);
+   return (skb->len <= SGE_MAX_WR_LEN);
 }
 
 /*
@@ -117,21 +119,92 @@ static inline unsigned int sgl_len(unsigned int n)
return (3 * n) / 2 + (n & 1) + 2;
 }
 
-static int dstsg_2k(struct scatterlist *sgl, unsigned int reqlen)
+static int sg_nents_xlen(struct scatterlist *sg, unsigned int reqlen,
+unsigned int entlen,
+unsigned int skip)
 {
int nents = 0;
unsigned int less;
+   unsigned int skip_len = 0;
 
-   while (sgl && reqlen) {
-   less = min(reqlen, sgl->length);
-   nents += DIV_ROUND_UP(less, CHCR_SG_SIZE);
-   reqlen -= less;
-   sgl = sg_next(sgl);
+   while (sg && skip) {
+   if (sg_dma_len(sg) <= skip) {
+   skip -= sg_dma_len(sg);
+   skip_len = 0;
+   sg = sg_next(sg);
+   } else {
+   skip_len = skip;
+   skip = 0;
+   }
}
 
+   while (sg && reqlen) {
+   less = min(reqlen, sg_dma_len(sg) - skip_len);
+   nents += DIV_ROUND_UP(less, entlen);
+   reqlen -= less;
+   skip_len = 0;
+   sg = sg_next(sg);
+   }
return nents;
 }
 
+static inline void chcr_handle_ahash_resp(struct ahash_request *req,
+ unsigned char *input,
+ int err)
+{
+   struct chcr_ahash_req_ctx *reqctx = ahash_request_ctx(req);
+   int digestsize, updated_digestsize;
+   struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
+   struct uld_ctx *u_ctx = ULD_CTX(h_ctx(tfm));
+
+   if (input == NULL)
+   goto out;
+   reqctx = ahash_request_ctx(req);
+   digestsize = crypto_ahash_digestsize(crypto_ahash_reqtfm(req));
+   if (reqctx->is_sg_map)
+   chcr_hash_dma_unmap(_ctx->lldi.pdev->dev, req);
+   if (reqctx->dma_addr)
+   dma_unmap_single(_ctx->lldi.pdev->dev, reqctx->dma_addr,
+reqctx->dma_len, DMA_TO_DEVICE);
+   reqctx->dma_addr = 0;
+   updated_digestsize = digestsize;
+   if (digestsize == SHA224_DIGEST_SIZE)
+   updated_digestsize = SHA256_DIGEST_SIZE;
+   else if (digestsize == SHA384_DIGEST_SIZE)
+   updated_digestsize = SHA512_DIGEST_SIZE;
+   if (reqctx->result == 1) {
+   reqctx->result = 0;
+   memcpy(req->result, input + sizeof(struct cpl_fw6_pld),
+  digestsize);
+   } else {
+   memcpy(reqctx->partial_hash, input + sizeof(struct cpl_fw6_pld),
+  updated_digestsize);
+   }
+out:
+   req->base.complete(>base, err);
+
+   }
+
+static inline void chcr_handle_aead_resp(struct aead_request *req,
+unsigned char *input,
+int err)
+{
+   struct chcr_aead_reqctx *reqctx = aead_request_ctx(req);
+   struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+   struct uld_ctx *u_ctx = ULD_CTX(a_ctx(tfm));
+
+
+   chcr_aead_dma_unmap(_ctx->lldi.pdev->dev, req, reqctx->op);
+   if (reqctx->b0_dma)
+   dma_unmap_single(_ctx->lldi.pdev->dev, reqctx->b0_dma,
+reqctx->b0_len, DMA_BIDIRECTIONAL);
+   if (reqctx->verify == VERIFY_SW) {
+   chcr_verify_tag(req, input, );
+   reqctx->verify = VERIFY_HW;
+}
+   req->base.complete(>base, err);
+
+}
 static void chcr_verify_tag(struct aead_request *req, u8 *input, int *err)
 {
u8 temp[SHA512_DIGEST_SIZE];
@@ -166,27 +239,11 @@ int chcr_handle_resp(struct crypto_async_request *req, 
unsigned char *input,
 {
struct crypto_tfm *tfm = req->tfm;
struct chcr_context
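
The counting logic of sg_nents_xlen() above can be exercised in user space by
replacing the scatterlist with a plain array of segment lengths. This is only
a model of the loop structure in the patch: nents_xlen() and div_round_up()
are hypothetical names, and sg_dma_len()/sg_next() become array indexing.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	return (n + d - 1) / d;
}

/* Count the entries of at most entlen bytes needed to cover reqlen bytes,
 * after skipping the first skip bytes of the segment list.
 */
static int nents_xlen(const unsigned int *seg_len, size_t nsegs,
		      unsigned int reqlen, unsigned int entlen,
		      unsigned int skip)
{
	size_t i = 0;
	unsigned int skip_len = 0;
	int nents = 0;

	assert(entlen > 0);

	/* Phase 1: drop whole segments, remember a partial skip. */
	while (i < nsegs && skip) {
		if (seg_len[i] <= skip) {
			skip -= seg_len[i];
			i++;
		} else {
			skip_len = skip;
			skip = 0;
		}
	}

	/* Phase 2: each segment contributes ceil(bytes / entlen) entries. */
	while (i < nsegs && reqlen) {
		unsigned int avail = seg_len[i] - skip_len;
		unsigned int less = reqlen < avail ? reqlen : avail;

		nents += div_round_up(less, entlen);
		reqlen -= less;
		skip_len = 0; /* only the first segment is partially skipped */
		i++;
	}
	return nents;
}
```

With segments of 3000 and 5000 bytes, a 2048-byte entry limit and a 7000-byte
request, the count is 4 without a skip and 3 when the first 3000 bytes are
skipped.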

[PATCH v2 5/7] crypto:chelsio:Remove allocation of sg list to implement 2K limit of dsgl header

2017-10-08 Thread Harsh Jain

Update the DMA address index instead of allocating a new sg list to impose a
2K size limit on each entry.

Signed-off-by: Harsh Jain 
---
 drivers/crypto/chelsio/chcr_algo.c   | 237 +++
 drivers/crypto/chelsio/chcr_algo.h   |   3 +-
 drivers/crypto/chelsio/chcr_core.h   |   2 +-
 drivers/crypto/chelsio/chcr_crypto.h |   6 -
 4 files changed, 76 insertions(+), 172 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index e0ab34a..b13991d 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -117,6 +117,21 @@ static inline unsigned int sgl_len(unsigned int n)
return (3 * n) / 2 + (n & 1) + 2;
 }
 
+static int dstsg_2k(struct scatterlist *sgl, unsigned int reqlen)
+{
+   int nents = 0;
+   unsigned int less;
+
+   while (sgl && reqlen) {
+   less = min(reqlen, sgl->length);
+   nents += DIV_ROUND_UP(less, CHCR_SG_SIZE);
+   reqlen -= less;
+   sgl = sg_next(sgl);
+   }
+
+   return nents;
+}
+
 static void chcr_verify_tag(struct aead_request *req, u8 *input, int *err)
 {
u8 temp[SHA512_DIGEST_SIZE];
@@ -166,8 +181,6 @@ int chcr_handle_resp(struct crypto_async_request *req, 
unsigned char *input,
kfree_skb(ctx_req.ctx.reqctx->skb);
ctx_req.ctx.reqctx->skb = NULL;
}
-   free_new_sg(ctx_req.ctx.reqctx->newdstsg);
-   ctx_req.ctx.reqctx->newdstsg = NULL;
if (ctx_req.ctx.reqctx->verify == VERIFY_SW) {
chcr_verify_tag(ctx_req.req.aead_req, input,
);
@@ -388,31 +401,41 @@ static void write_phys_cpl(struct cpl_rx_phys_dsgl 
*phys_cpl,
 {
struct phys_sge_pairs *to;
unsigned int len = 0, left_size = sg_param->obsize;
-   unsigned int nents = sg_param->nents, i, j = 0;
+   unsigned int j = 0;
+   int offset, ent_len;
 
phys_cpl->op_to_tid = htonl(CPL_RX_PHYS_DSGL_OPCODE_V(CPL_RX_PHYS_DSGL)
| CPL_RX_PHYS_DSGL_ISRDMA_V(0));
+   to = (struct phys_sge_pairs *)((unsigned char *)phys_cpl +
+  sizeof(struct cpl_rx_phys_dsgl));
+   while (left_size && sg) {
+   len = min_t(u32, left_size, sg_dma_len(sg));
+   offset = 0;
+   while (len) {
+   ent_len =  min_t(u32, len, CHCR_SG_SIZE);
+   to->len[j % 8] = htons(ent_len);
+   to->addr[j % 8] = cpu_to_be64(sg_dma_address(sg) +
+ offset);
+   offset += ent_len;
+   len -= ent_len;
+   j++;
+   if ((j % 8) == 0)
+   to++;
+   }
+   left_size -= min(left_size, sg_dma_len(sg));
+   sg = sg_next(sg);
+   }
phys_cpl->pcirlxorder_to_noofsgentr =
htonl(CPL_RX_PHYS_DSGL_PCIRLXORDER_V(0) |
  CPL_RX_PHYS_DSGL_PCINOSNOOP_V(0) |
  CPL_RX_PHYS_DSGL_PCITPHNTENB_V(0) |
  CPL_RX_PHYS_DSGL_PCITPHNT_V(0) |
  CPL_RX_PHYS_DSGL_DCAID_V(0) |
- CPL_RX_PHYS_DSGL_NOOFSGENTR_V(nents));
+ CPL_RX_PHYS_DSGL_NOOFSGENTR_V(j));
phys_cpl->rss_hdr_int.opcode = CPL_RX_PHYS_ADDR;
phys_cpl->rss_hdr_int.qid = htons(sg_param->qid);
phys_cpl->rss_hdr_int.hash_val = 0;
-   to = (struct phys_sge_pairs *)((unsigned char *)phys_cpl +
-  sizeof(struct cpl_rx_phys_dsgl));
-   for (i = 0; nents && left_size; to++) {
-   for (j = 0; j < 8 && nents && left_size; j++, nents--) {
-   len = min(left_size, sg_dma_len(sg));
-   to->len[j] = htons(len);
-   to->addr[j] = cpu_to_be64(sg_dma_address(sg));
-   left_size -= len;
-   sg = sg_next(sg);
-   }
-   }
+
 }
 
 static inline int map_writesg_phys_cpl(struct device *dev,
@@ -523,31 +546,33 @@ static int generate_copy_rrkey(struct ablk_ctx *ablkctx,
 static int chcr_sg_ent_in_wr(struct scatterlist *src,
 struct scatterlist *dst,
 unsigned int minsg,
-unsigned int space,
-short int *sent,
-short int *dent)
+unsigned int space)
 {
int srclen = 0, dstlen = 0;
int srcsg = minsg, dstsg = 0;
+   int offset = 0, less;
 
-   *sent = 0;
-   *dent = 0;
while (src && dst && ((srcsg + 1) <= MAX_SKB_FRAGS) &&
   space > (sgl_ent_len[srcsg + 1] + dsgl_ent_len[dstsg])) {
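
The new loop in write_phys_cpl() splits every DMA segment into pieces of at
most CHCR_SG_SIZE bytes and stores them in blocks of eight address/length
pairs. A user-space sketch of that splitting follows, under these
assumptions: ENT_MAX stands in for CHCR_SG_SIZE and its value of 2048 is
inferred from the "2K" in the subject line, struct seg and struct pair_block
are hypothetical stand-ins for the scatterlist and struct phys_sge_pairs, and
the byte swapping (htons/cpu_to_be64) is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define ENT_MAX 2048u /* assumed value of CHCR_SG_SIZE */

struct seg { uint64_t addr; uint32_t len; };            /* one DMA segment */
struct pair_block { uint16_t len[8]; uint64_t addr[8]; };

/* Split up to 'left' bytes of the segments into entries of at most ENT_MAX
 * bytes. Returns the number of entries written; it stops early if 'out'
 * runs out of blocks.
 */
static unsigned int split_segments(const struct seg *sg, size_t nsegs,
				   uint32_t left, struct pair_block *out,
				   size_t nblocks)
{
	unsigned int j = 0;
	size_t i;

	for (i = 0; i < nsegs && left; i++) {
		uint32_t len = left < sg[i].len ? left : sg[i].len;
		uint32_t offset = 0;

		left -= len;
		while (len) {
			uint32_t ent = len < ENT_MAX ? len : ENT_MAX;

			if (j / 8 >= nblocks)
				return j;
			/* entry j lives in slot j % 8 of block j / 8 */
			out[j / 8].len[j % 8] = (uint16_t)ent;
			out[j / 8].addr[j % 8] = sg[i].addr + offset;
			offset += ent;
			len -= ent;
			j++;
		}
	}
	return j;
}
```

The returned count corresponds to the value the patch passes to
CPL_RX_PHYS_DSGL_NOOFSGENTR_V(), which is why the patch can drop the
precomputed nents.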

[PATCH v2 2/7] crypto:chelsio: Check error code with IS_ERR macro

2017-10-08 Thread Harsh Jain

From: Yeshaswi M R Gowda 

Check and return proper error code.

Signed-off-by: Jitendra Lulla 
Signed-off-by: Harsh Jain 
---
 drivers/crypto/chelsio/chcr_algo.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index bdb1014..e4bf32d 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -1455,8 +1455,8 @@ static int chcr_ahash_update(struct ahash_request *req)
req_ctx->result = 0;
req_ctx->data_len += params.sg_len + params.bfr_len;
skb = create_hash_wr(req, &params);
-   if (!skb)
-   return -ENOMEM;
+   if (IS_ERR(skb))
+   return PTR_ERR(skb);
 
if (remainder) {
u8 *temp;
@@ -1519,8 +1519,8 @@ static int chcr_ahash_final(struct ahash_request *req)
params.more = 0;
}
skb = create_hash_wr(req, &params);
-   if (!skb)
-   return -ENOMEM;
+   if (IS_ERR(skb))
+   return PTR_ERR(skb);
 
skb->dev = u_ctx->lldi.ports[0];
set_wr_txq(skb, CPL_PRIORITY_DATA, ctx->tx_qidx);
@@ -1570,8 +1570,8 @@ static int chcr_ahash_finup(struct ahash_request *req)
}
 
skb = create_hash_wr(req, &params);
-   if (!skb)
-   return -ENOMEM;
+   if (IS_ERR(skb))
+   return PTR_ERR(skb);
 
skb->dev = u_ctx->lldi.ports[0];
set_wr_txq(skb, CPL_PRIORITY_DATA, ctx->tx_qidx);
@@ -1621,8 +1621,8 @@ static int chcr_ahash_digest(struct ahash_request *req)
}
 
skb = create_hash_wr(req, &params);
-   if (!skb)
-   return -ENOMEM;
+   if (IS_ERR(skb))
+   return PTR_ERR(skb);
 
skb->dev = u_ctx->lldi.ports[0];
set_wr_txq(skb, CPL_PRIORITY_DATA, ctx->tx_qidx);
-- 
2.1.4
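
Why the "!skb" checks had to go: the patch implies that create_hash_wr()
reports failure as an error code encoded in the pointer, and such a pointer
is never NULL. The sketch below models the kernel's ERR_PTR/IS_ERR/PTR_ERR
idiom in user space. The lowercase helpers are simplified re-implementations
written for this example, and alloc_wr() is a hypothetical allocator, not
anything from the chelsio driver.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified user-space models of the kernel helpers. */
static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocator that reports the reason for a failure. */
static void *alloc_wr(size_t len)
{
	void *p;

	if (len == 0)
		return err_ptr(-EINVAL);

	p = malloc(len);
	return p ? p : err_ptr(-ENOMEM);
}
```

A caller that only tests for NULL would treat the result of alloc_wr(0) as a
valid buffer. Checking is_err() and returning ptr_err() also passes the real
error code up instead of a blanket -ENOMEM.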

[PATCH v2 1/7] crypto:chelsio: Remove unused parameter

2017-10-08 Thread Harsh Jain

From: Yeshaswi M R Gowda 

Remove unused parameter sent to latest fw.

Signed-off-by: Harsh Jain 
---
 drivers/crypto/chelsio/chcr_algo.c | 43 +++---
 drivers/crypto/chelsio/chcr_algo.h | 12 +--
 2 files changed, 23 insertions(+), 32 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index 0e81607..bdb1014 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -577,36 +577,27 @@ static int chcr_cipher_fallback(struct crypto_skcipher 
*cipher,
 static inline void create_wreq(struct chcr_context *ctx,
   struct chcr_wr *chcr_req,
   void *req, struct sk_buff *skb,
-  int kctx_len, int hash_sz,
-  int is_iv,
+  int hash_sz,
   unsigned int sc_len,
   unsigned int lcb)
 {
struct uld_ctx *u_ctx = ULD_CTX(ctx);
-   int iv_loc = IV_DSGL;
int qid = u_ctx->lldi.rxq_ids[ctx->rx_qidx];
-   unsigned int immdatalen = 0, nr_frags = 0;
+   unsigned int immdatalen = 0;
 
-   if (is_ofld_imm(skb)) {
+   if (is_ofld_imm(skb))
immdatalen = skb->data_len;
-   iv_loc = IV_IMMEDIATE;
-   } else {
-   nr_frags = skb_shinfo(skb)->nr_frags;
-   }
 
-   chcr_req->wreq.op_to_cctx_size = FILL_WR_OP_CCTX_SIZE(immdatalen,
-   ((sizeof(chcr_req->key_ctx) + kctx_len) >> 4));
+   chcr_req->wreq.op_to_cctx_size = FILL_WR_OP_CCTX_SIZE;
chcr_req->wreq.pld_size_hash_size =
-   htonl(FW_CRYPTO_LOOKASIDE_WR_PLD_SIZE_V(sgl_lengths[nr_frags]) |
- FW_CRYPTO_LOOKASIDE_WR_HASH_SIZE_V(hash_sz));
+   htonl(FW_CRYPTO_LOOKASIDE_WR_HASH_SIZE_V(hash_sz));
chcr_req->wreq.len16_pkd =
htonl(FW_CRYPTO_LOOKASIDE_WR_LEN16_V(DIV_ROUND_UP(
(calc_tx_flits_ofld(skb) * 8), 16)));
chcr_req->wreq.cookie = cpu_to_be64((uintptr_t)req);
chcr_req->wreq.rx_chid_to_rx_q_id =
FILL_WR_RX_Q_ID(ctx->dev->rx_channel_id, qid,
-   is_iv ? iv_loc : IV_NOP, !!lcb,
-   ctx->tx_qidx);
+   !!lcb, ctx->tx_qidx);
 
chcr_req->ulptx.cmd_dest = FILL_ULPTX_CMD_DEST(ctx->dev->tx_channel_id,
   qid);
@@ -616,7 +607,7 @@ static inline void create_wreq(struct chcr_context *ctx,
chcr_req->sc_imm.cmd_more = FILL_CMD_MORE(immdatalen);
chcr_req->sc_imm.len = cpu_to_be32(sizeof(struct cpl_tx_sec_pdu) +
   sizeof(chcr_req->key_ctx) +
-  kctx_len + sc_len + immdatalen);
+  sc_len + immdatalen);
 }
 
 /**
@@ -706,8 +697,8 @@ static struct sk_buff *create_cipher_wr(struct 
cipher_wr_param *wrparam)
write_buffer_to_skb(skb, , reqctx->iv, ivsize);
write_sg_to_skb(skb, , wrparam->srcsg, wrparam->bytes);
atomic_inc(>chcr_stats.cipher_rqst);
-   create_wreq(ctx, chcr_req, &(wrparam->req->base), skb, kctx_len, 0, 1,
-   sizeof(struct cpl_rx_phys_dsgl) + phys_dsgl,
+   create_wreq(ctx, chcr_req, &(wrparam->req->base), skb, 0,
+   sizeof(struct cpl_rx_phys_dsgl) + phys_dsgl + kctx_len,
ablkctx->ciph_mode == CHCR_SCMD_CIPHER_MODE_AES_CBC);
reqctx->skb = skb;
skb_get(skb);
@@ -1417,8 +1408,8 @@ static struct sk_buff *create_hash_wr(struct 
ahash_request *req,
if (param->sg_len != 0)
write_sg_to_skb(skb, , req->src, param->sg_len);
atomic_inc(>chcr_stats.digest_rqst);
-   create_wreq(ctx, chcr_req, >base, skb, kctx_len,
-   hash_size_in_response, 0, DUMMY_BYTES, 0);
+   create_wreq(ctx, chcr_req, >base, skb, hash_size_in_response,
+   DUMMY_BYTES + kctx_len, 0);
req_ctx->skb = skb;
skb_get(skb);
return skb;
@@ -2080,8 +2071,8 @@ static struct sk_buff *create_authenc_wr(struct 
aead_request *req,
write_buffer_to_skb(skb, , req->iv, ivsize);
write_sg_to_skb(skb, , src, req->cryptlen);
atomic_inc(>chcr_stats.cipher_rqst);
-   create_wreq(ctx, chcr_req, >base, skb, kctx_len, size, 1,
-  sizeof(struct cpl_rx_phys_dsgl) + dst_size, 0);
+   create_wreq(ctx, chcr_req, >base, skb, size,
+  sizeof(struct cpl_rx_phys_dsgl) + dst_size + kctx_len, 0);
reqctx->skb = skb;
skb_get(skb);
 
@@ -2396,8 +2387,8 @@ static struct sk_buff *create_aead_ccm_wr(struct 
aead_request *req,
skb_set_transport_header(skb, transhdr_len);
frags = fill_aead_req_fields(skb, req, src,

[PATCH v2 4/7] crypto:chelsio:Use x8_ble gf multiplication to calculate IV.

2017-10-08 Thread Harsh Jain

gf128mul_x8_ble() reduces the number of GF multiplication iterations by
roughly a factor of 8.

Signed-off-by: Harsh Jain 
---
 drivers/crypto/chelsio/chcr_algo.c   | 11 +--
 drivers/crypto/chelsio/chcr_crypto.h |  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index e4bf32d..e0ab34a 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -888,9 +888,11 @@ static int chcr_update_tweak(struct ablkcipher_request 
*req, u8 *iv)
int ret, i;
u8 *key;
unsigned int keylen;
+   int round = reqctx->last_req_len / AES_BLOCK_SIZE;
+   int round8 = round / 8;
 
cipher = ablkctx->aes_generic;
-   memcpy(iv, req->info, AES_BLOCK_SIZE);
+   memcpy(iv, reqctx->iv, AES_BLOCK_SIZE);
 
keylen = ablkctx->enckey_len / 2;
key = ablkctx->key + keylen;
@@ -899,7 +901,10 @@ static int chcr_update_tweak(struct ablkcipher_request 
*req, u8 *iv)
goto out;
 
crypto_cipher_encrypt_one(cipher, iv, iv);
-   for (i = 0; i < (reqctx->processed / AES_BLOCK_SIZE); i++)
+   for (i = 0; i < round8; i++)
+   gf128mul_x8_ble((le128 *)iv, (le128 *)iv);
+
+   for (i = 0; i < (round % 8); i++)
gf128mul_x_ble((le128 *)iv, (le128 *)iv);
 
crypto_cipher_decrypt_one(cipher, iv, iv);
@@ -1040,6 +1045,7 @@ static int chcr_handle_cipher_resp(struct 
ablkcipher_request *req,
CRYPTO_ALG_SUB_TYPE_CTR)
bytes = adjust_ctr_overflow(reqctx->iv, bytes);
reqctx->processed += bytes;
+   reqctx->last_req_len = bytes;
wrparam.qid = u_ctx->lldi.rxq_ids[ctx->rx_qidx];
wrparam.req = req;
wrparam.bytes = bytes;
@@ -1132,6 +1138,7 @@ static int process_cipher(struct ablkcipher_request *req,
goto error;
}
reqctx->processed = bytes;
+   reqctx->last_req_len = bytes;
reqctx->dst = reqctx->dstsg;
reqctx->op = op_type;
wrparam.qid = qid;
diff --git a/drivers/crypto/chelsio/chcr_crypto.h 
b/drivers/crypto/chelsio/chcr_crypto.h
index 30af1ee..b3722b3 100644
--- a/drivers/crypto/chelsio/chcr_crypto.h
+++ b/drivers/crypto/chelsio/chcr_crypto.h
@@ -247,6 +247,7 @@ struct chcr_blkcipher_req_ctx {
struct scatterlist *dst;
struct scatterlist *newdstsg;
unsigned int processed;
+   unsigned int last_req_len;
unsigned int op;
short int dst_nents;
u8 iv[CHCR_MAX_CRYPTO_IV_LEN];
-- 
2.1.4

[PATCH v2 3/7] crypto:gf128mul: The x8_ble multiplication functions

2017-10-08 Thread Harsh Jain

Add gf128mul_x8_ble(), which multiplies a GF(2^128) element by x^8 in the
ble format.
It will be used by the chelsio driver to speed up GF multiplication.

Signed-off-by: Harsh Jain 
---
 crypto/gf128mul.c | 13 +
 include/crypto/gf128mul.h |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/crypto/gf128mul.c b/crypto/gf128mul.c
index dc01212..24e6019 100644
--- a/crypto/gf128mul.c
+++ b/crypto/gf128mul.c
@@ -156,6 +156,19 @@ static void gf128mul_x8_bbe(be128 *x)
x->b = cpu_to_be64((b << 8) ^ _tt);
 }
 
+void gf128mul_x8_ble(le128 *r, const le128 *x)
+{
+   u64 a = le64_to_cpu(x->a);
+   u64 b = le64_to_cpu(x->b);
+
+   /* equivalent to gf128mul_table_be[b >> 63] (see crypto/gf128mul.c): */
+   u64 _tt = gf128mul_table_be[a >> 56];
+
+   r->a = cpu_to_le64((a << 8) | (b >> 56));
+   r->b = cpu_to_le64((b << 8) ^ _tt);
+}
+EXPORT_SYMBOL(gf128mul_x8_ble);
+
 void gf128mul_lle(be128 *r, const be128 *b)
 {
be128 p[8];
diff --git a/include/crypto/gf128mul.h b/include/crypto/gf128mul.h
index 0977fb1..fa0a63d 100644
--- a/include/crypto/gf128mul.h
+++ b/include/crypto/gf128mul.h
@@ -227,7 +227,7 @@ struct gf128mul_4k *gf128mul_init_4k_lle(const be128 *g);
 struct gf128mul_4k *gf128mul_init_4k_bbe(const be128 *g);
 void gf128mul_4k_lle(be128 *a, const struct gf128mul_4k *t);
 void gf128mul_4k_bbe(be128 *a, const struct gf128mul_4k *t);
-
+void gf128mul_x8_ble(le128 *r, const le128 *x);
 static inline void gf128mul_free_4k(struct gf128mul_4k *t)
 {
kzfree(t);
-- 
2.1.4
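
Editor's note: the two chelsio/gf128mul patches above compute x^round by taking round / 8 byte-wide steps and then round % 8 single-bit steps. The following is a minimal userspace sketch of that idea, not the kernel code. The struct gf128 type and the gf128_* helpers are hypothetical names chosen for illustration, the little-endian conversions are left out, and the reduction byte is folded with a small loop instead of the kernel's lookup table.

```c
#include <assert.h>
#include <stdint.h>

/* 128-bit field element; hi holds bits 127..64, lo holds bits 63..0.
 * (Hypothetical type for this sketch, not the kernel's le128.) */
struct gf128 {
	uint64_t hi;
	uint64_t lo;
};

/* Multiply by x: shift left one bit and fold the bit that fell off the
 * top back in with 0x87, since x^128 = x^7 + x^2 + x + 1. */
static struct gf128 gf128_mul_x(struct gf128 v)
{
	uint64_t carry = v.hi >> 63;
	struct gf128 r;

	r.hi = (v.hi << 1) | (v.lo >> 63);
	r.lo = (v.lo << 1) ^ (carry ? 0x87 : 0);
	return r;
}

/* Multiply by x^8 in one step: shift left one byte and fold the byte that
 * fell off the top back in as its carry-less product with 0x87. */
static struct gf128 gf128_mul_x8(struct gf128 v)
{
	uint64_t top = v.hi >> 56;
	uint64_t fold = 0;
	struct gf128 r;
	int bit;

	for (bit = 0; bit < 8; bit++)
		if (top & (1ULL << bit))
			fold ^= (uint64_t)0x87 << bit;

	r.hi = (v.hi << 8) | (v.lo >> 56);
	r.lo = (v.lo << 8) ^ fold;
	return r;
}

/* Multiply by x^n using n / 8 byte steps followed by n % 8 bit steps. */
static struct gf128 gf128_mul_x_pow(struct gf128 v, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n / 8; i++)
		v = gf128_mul_x8(v);
	for (i = 0; i < n % 8; i++)
		v = gf128_mul_x(v);
	return v;
}

/* Check that the mixed path agrees with n single-bit steps. */
static void gf128_self_check(struct gf128 v, unsigned int n)
{
	struct gf128 fast = gf128_mul_x_pow(v, n);
	struct gf128 slow = v;
	unsigned int i;

	for (i = 0; i < n; i++)
		slow = gf128_mul_x(slow);
	assert(fast.hi == slow.hi && fast.lo == slow.lo);
}
```

The byte step is valid because the 0x87 feedback injected during eight single-bit steps never climbs back into the top byte, so the eight carries depend only on the original top byte.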

[PATCH v2 7/7] crypto:chelsio: Fix memory leak

2017-10-08 Thread Harsh Jain

Fix memory leak when device does not support crypto.

Reported-by: Dan Carpenter 
Signed-off-by: Harsh Jain 
---
 drivers/crypto/chelsio/chcr_core.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_core.c 
b/drivers/crypto/chelsio/chcr_core.c
index b6dd9cb..4f677b3 100644
--- a/drivers/crypto/chelsio/chcr_core.c
+++ b/drivers/crypto/chelsio/chcr_core.c
@@ -154,15 +154,15 @@ static void *chcr_uld_add(const struct cxgb4_lld_info 
*lld)
struct uld_ctx *u_ctx;
 
/* Create the device and add it in the device list */
+   if (!(lld->ulp_crypto & ULP_CRYPTO_LOOKASIDE))
+   return ERR_PTR(-EOPNOTSUPP);
+
+   /* Create the device and add it in the device list */
u_ctx = kzalloc(sizeof(*u_ctx), GFP_KERNEL);
if (!u_ctx) {
u_ctx = ERR_PTR(-ENOMEM);
goto out;
}
-   if (!(lld->ulp_crypto & ULP_CRYPTO_LOOKASIDE)) {
-   u_ctx = ERR_PTR(-ENOMEM);
-   goto out;
-   }
u_ctx->lldi = *lld;
 out:
return u_ctx;
-- 
2.1.4
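
Editor's note: the fix above moves the capability check in front of the allocation, so the early return has nothing to free. A minimal userspace sketch of the same ordering follows. The names dev_ctx, dev_ctx_create and CAP_CRYPTO are hypothetical, and calloc() stands in for kzalloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define CAP_CRYPTO 0x1u	/* hypothetical capability bit */

struct dev_ctx {
	unsigned int caps;
};

/* Returns a new context, or NULL with *err set to a negative errno.
 * The capability check runs before the allocation, so the
 * "not supported" path cannot leak memory. */
static struct dev_ctx *dev_ctx_create(unsigned int caps, int *err)
{
	struct dev_ctx *ctx;

	assert(err != NULL);

	if (!(caps & CAP_CRYPTO)) {
		*err = -EOPNOTSUPP;
		return NULL;
	}

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx) {
		*err = -ENOMEM;
		return NULL;
	}

	ctx->caps = caps;
	*err = 0;
	return ctx;
}
```

Checking first also lets the error code say what actually went wrong (-EOPNOTSUPP rather than -ENOMEM), as the patch does.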

Re: [PATCH 3/3] batman-adv: Add missing kerneldoc for extack

2017-10-08 Thread Sven Eckelmann

On Samstag, 7. Oktober 2017 20:59:52 CEST David Miller wrote:
> From: Sven Eckelmann 
> Date: Sat,  7 Oct 2017 14:21:22 +0200
> 
> > The parameter extack was added to batadv_softif_slave_add without adding
> > the kernel-doc for it. This caused kernel-doc warnings.
> > 
> > Signed-off-by: Sven Eckelmann 
> > Cc: David Ahern 
> 
> I'm happy to apply this but where are the other two patches of this
> series and the series header posting?

They are on the b.a.t.m@lists.open-mesh.org mailing list. And they deal 
with the "extack" compat problems of the out-of-tree module build.

Why were you cc'ed for the patch 3:
===================================

David Ahern and David Miller were involved in adding the following changes to 
batman-adv:

* net: Add extack to ndo_add_slave
* net: Add extack to upper device linking

These were not posted to the batman-adv mailing list.

An automatic kernel-doc run on a build machine noticed a minor problem in the 
first patch. Since I was not involved in the original change (and have not 
checked what was discussed in the past about it), I could only guess what the 
kernel-doc should have looked like. But you, David Ahern or anyone else from 
netdev@... could have a different, more educated opinion about the correct 
documentation - this is why you were Cc'ed.

Why are you not Cc'ed on the patch 1+2:
=======================================

The changes are about code which is not present in the upstream kernel. It 
only deals with changes required to build the extracted batman-adv module 
against a different kernel. These changes were required to get the "extack" 
stuff working on Linux 4.14 and below.

So, it is not really relevant for you but for some people on the 
b.a.t.m@lists.open-mesh.org mailing list. This is why everything 
(including the third patch) was sent directly ("To:") to the mentioned mailing 
list.

Are you expected to apply this change:
======================================

I was hoping that Simon would pick the patch up and forward it to you in a 
proper pull request. But I doubt that he has a big problem with you 
applying this single line kernel-doc change. At least it is less work for him 
and less extra noise on both mailing lists...

But feel free to tell us your preferred solution.

Kind regards,
Sven


Re: [PATCH iproute2] iproute: build more easily on Android

2017-10-08 Thread Leon Romanovsky

On Tue, Oct 03, 2017 at 02:03:37AM +0900, Lorenzo Colitti wrote:
> iproute2 contains a bunch of kernel headers, including uapi ones.
> Android's libc uses uapi headers almost directly, and uses a
> script to fix kernel types that don't match what userspace
> expects.
>
> For example: https://issuetracker.google.com/36987220 reports
> that our struct ip_mreq_source contains "__be32 imr_multiaddr"
> rather than "struct in_addr imr_multiaddr". The script addresses
> this by replacing the uapi struct definition with a #include
>  which contains the traditional userspace
> definition.
>
> Unfortunately, when we compile iproute2, this definition
> conflicts with the one in iproute2's linux/in.h.
>
> Historically we've just solved this problem by running "git rm"
> on all the iproute2 include/linux headers that break Android's
> libc.  However, deleting the files in this way makes it harder to
> keep up with upstream, because every upstream change to
> an include file causes a merge conflict with the delete.
>
> This patch fixes the problem by moving the iproute2 linux headers
> from include/linux to include/uapi/linux.
>
> Tested: compiles on ubuntu trusty (glibc)
>
> Signed-off-by: Elliott Hughes 
> Signed-off-by: Lorenzo Colitti 
> ---

The include/rdma/* files come from the kernel's uapi too.

BTW, I know that the rdma folder is not relevant for Android, but
I would still like to see the same treatment for that folder as for
include/linux.

Thanks



Re: devlink dump of mlxsw_adj table triggers a panic

2017-10-08 Thread Arkadi Sharshevsky



On 10/05/2017 02:57 AM, David Ahern wrote:
> The following devlink command on a 2700 triggers a panic every time.
> Kernel is net-next at 26873308b21654b6e0785b9f9e2c5414d37a4c4c
> 
> $ devlink  dpipe table dump pci/:03:00.0 name mlxsw_adj
> devlink answers: No buffer space available
> 
> 
> I have seen several different stack traces and varying amounts of EMAD
> errors on console:
> 
> [   77.453364] mlxsw_spectrum :03:00.0: EMAD reg access failed
> (tid=64c24a43688,reg_id=200b(sfn),type=query,status=0(operation
> performed))
> [   77.466568] mlxsw_spectrum :03:00.0: Failed to get FDB notifications
> 
> If it does not reproduce for you let me know and I'll grab a trace.
> 
> David
> 

Thanks, will check it out. How many nexthop groups did you configure, and
how many nexthops overall?

Thanks,
Arkadi

Re: [PATCH 41/47] netfilter: convert hook list to an array

2017-10-08 Thread Tariq Toukan




On 04/09/2017 1:42 AM, Pablo Neira Ayuso wrote:

From: Aaron Conole 

This converts the storage and layout of netfilter hook entries from a
linked list to an array.  After this commit, hook entries will be
stored adjacent in memory.  The next pointer is no longer required.

The ops pointers are stored at the end of the array as they are only
used in the register/unregister path and in the legacy br_netfilter code.

nf_unregister_net_hooks() is slower than needed as it just calls
nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
calls), this will be addressed in followup patch.

Test setup:
  - ixgbe 10gbit
  - netperf UDP_STREAM, 64 byte packets
  - 5 hooks: (raw + mangle prerouting, mangle+filter input, inet filter):
empty mangle and raw prerouting, mangle and filter input hooks:
353.9
this patch:
364.2

Signed-off-by: Aaron Conole 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---


Hi,

We are seeing a regression on a server with IOMMU enabled.
After installing the kernel and rebooting, the server crashes during boot.
Please see trace below.

Bisecting points to this patch.

Any idea what's wrong?

Regards,
Tariq Toukan

[   25.590816] RIP: 0010:_raw_read_lock_bh+0x15/0x40
[   25.596160] RSP: 0018:c90007db77a0 EFLAGS: 00010286
[   25.602089] RAX: 0100 RBX: 0003 RCX: 

[   25.610152] RDX:  RSI: c90007db7898 RDI: 
003c
[   25.618470] RBP: c90007db7840 R08: 0001 R09: 
87c10eef
[   25.626786] R10: 88180f21f040 R11: ea005feeaf00 R12: 

[   25.635103] R13: c90007db7898 R14: 8817fbabdc00 R15: 
8817fbabdc00
[   25.643421] FS:  7fcdb7771740() GS:88180f20() 
knlGS:

[   25.653056] CS:  0010 DS:  ES:  CR0: 80050033
[   25.659818] CR2: 003c CR3: 001809ae0001 CR4: 
001606e0

[   25.668136] Call Trace:
[   25.671215]  ? ebt_do_table+0x3d/0x6e8 [ebtables]
[   25.676817]  ebt_nat_out+0x1f/0x30 [ebtable_nat]
[   25.682326]  nf_hook_slow+0x3c/0xb0
[   25.686576]  __br_forward+0xb1/0x1b0 [bridge]
[   25.691786]  ? br_dev_queue_push_xmit+0x170/0x170 [bridge]
[   25.704333]  br_flood+0x130/0x1b0 [bridge]
[   25.709254]  br_dev_xmit+0x1e5/0x2a0 [bridge]
[   25.714468]  dev_hard_start_xmit+0xa1/0x210
[   25.719485]  __dev_queue_xmit+0x4f6/0x610
[   25.724304]  dev_queue_xmit+0x10/0x20
[   25.728739]  ip_finish_output2+0x233/0x320
[   25.733656]  ip_finish_output+0x12a/0x1d0
[   25.738474]  ? netif_rx_ni+0x33/0x80
[   25.742805]  ip_mc_output+0x84/0x250
[   25.747140]  ip_local_out+0x35/0x40
[   25.751377]  ip_send_skb+0x19/0x40
[   25.755583]  udp_send_skb+0x172/0x280
[   25.760013]  udp_sendmsg+0x2c0/0xa30
[   25.764348]  ? ip_reply_glue_bits+0x50/0x50
[   25.769366]  ? import_iovec+0x2c/0xc0
[   25.773801]  inet_sendmsg+0x31/0xb0
[   25.778042]  sock_sendmsg+0x38/0x50
[   25.782276]  ___sys_sendmsg+0x25c/0x270
[   25.786904]  ? file_update_time+0x3a/0xf0
[   25.791727]  ? __wake_up_sync_key+0x50/0x60
[   25.796741]  ? pipe_write+0x3cc/0x420
[   25.801175]  ? __vfs_write+0xd0/0x130
[   25.805608]  __sys_sendmsg+0x45/0x80
[   25.809938]  SyS_sendmsg+0x12/0x20
[   25.814077]  entry_SYSCALL_64_fastpath+0x1a/0xa5
[   25.819577] RIP: 0033:0x7fcdb64ac7a0
[   25.823908] RSP: 002b:7ffe2b98cb98 EFLAGS: 0246 ORIG_RAX: 
002e
[   25.832961] RAX: ffda RBX: 7ffe2b98c630 RCX: 
7fcdb64ac7a0
[   25.841270] RDX:  RSI: 7ffe2b98cc50 RDI: 
000c
[   25.849583] RBP: 7fcdb69018f8 R08: 7ffe2b98cbc3 R09: 
0004
[   25.857901] R10: 0019 R11: 0246 R12: 

[   25.866213] R13:  R14: 7ffe2b98c6c0 R15: 
7ffe2b98c6e0
[   25.874520] Code: 55 48 89 e5 e8 bd 74 82 ff 5d c3 66 66 2e 0f 1f 84 
00 00 00 00 00 0f 1f 44 00 00 65 81 05 68 78 74 7e 00 02 00 00 b8 00 01 
00 00  0f c1 07 8d b0 00 01 00 00 40 84

[   25.896497] RIP: _raw_read_lock_bh+0x15/0x40 RSP: c90007db77a0
[   25.903744] CR2: 003c
[   25.907808] ---[ end trace 4f824a5c467b1872 ]---
[   25.907811] BUG: unable to handle kernel NULL pointer dereference at 
003c

[   25.907828] IP: _raw_read_lock_bh+0x15/0x40
[   25.907830] PGD 0 P4D 0
[   25.907834] Oops: 0002 [#2] SMP
[   25.907836] Modules linked in: ebtable_nat(+) ebtables ib_ucm mlx4_en 
mlx4_ib rpcrdma mlx4_core rdma_ucm ib_uverbs ib_iser ib_umad rdma_cm 
ib_ipoib iw_cm ib_cm mlx5_ib bridge stp llc sge
[   25.907895] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G  D 
4.13.0-for-linust-perf-2017-09-10_06-48-01-64 #1
[   25.907896] Hardware name: Dell Inc. PowerEdge R720/0HJK12, BIOS 
2.2.3 05/20/2014

[   25.907898] task: 880c0c2f8000 task.stack: c90006318000
[   25.907901] RIP: 0010:_raw_read_lock_bh+0x15/0x40
[   25.907902] RSP:

Re: [PATCH net-next RFC 4/9] net: dsa: mv88e6xxx: add support for event capture

2017-10-08 Thread Richard Cochran


There are some issues here.

On Thu, Sep 28, 2017 at 10:25:36AM -0500, Brandon Streiff wrote:
> +static int mv88e6xxx_config_periodic_trig(struct mv88e6xxx_chip *chip,
> +   u32 ns, u16 picos)
> +{
> + int err;
> + u16 global_config;
> +
> + if (picos >= 1000)
> + return -ERANGE;
> +
> + /* TRIG generation is in units of 8 ns clock periods. Convert ns
> +  * and ps into 8 ns clock periods and up to 8000 additional ps
> +  */
> + picos += (ns & 0x7) * 1000;
> + ns = ns >> 3;

Again, the 8 nanoseconds shouldn't be hard coded.

...

> + return err;
> +}

> +static void mv88e6xxx_tai_event_work(struct work_struct *ugly)
> +{
> + struct delayed_work *dw = to_delayed_work(ugly);
> + struct mv88e6xxx_chip *chip =
> + container_of(dw, struct mv88e6xxx_chip, tai_event_work);
> + u16 ev_status[4];
> + int err;
> +
> + mutex_lock(&chip->reg_lock);
> +
> + err = mv88e6xxx_tai_read(chip, MV88E6XXX_TAI_EVENT_STATUS,
> +  ev_status, ARRAY_SIZE(ev_status));
> + if (err) {
> + mutex_unlock(&chip->reg_lock);
> + return;
> + }
> +
> + if (ev_status[0] & MV88E6XXX_TAI_EVENT_STATUS_ERROR)
> + dev_warn(chip->dev, "missed event capture\n");
> +
> + if (ev_status[0] & MV88E6XXX_TAI_EVENT_STATUS_VALID) {

Avoid IfOk.

> + struct ptp_clock_event ev;
> + u32 raw_ts = ((u32)ev_status[2] << 16) | ev_status[1];
> +
> + /* Clear the valid bit so the next timestamp can come in */
> + ev_status[0] &= ~MV88E6XXX_TAI_EVENT_STATUS_VALID;
> + err = mv88e6xxx_tai_write(chip, MV88E6XXX_TAI_EVENT_STATUS,
> +   ev_status[0]);
> +
> + if (ev_status[0] & MV88E6XXX_TAI_EVENT_STATUS_CAP_TRIG) {
> + /* TAI is configured to timestamp internal events.
> +  * This will be a PPS event.
> +  */
> + ev.type = PTP_CLOCK_PPS;
> + } else {
> + /* Otherwise this is an external timestamp */
> + ev.type = PTP_CLOCK_EXTTS;
> + }
> + /* We only have one timestamping channel. */
> + ev.index = 0;
> + ev.timestamp = timecounter_cyc2time(&chip->tstamp_tc, raw_ts);
> +
> + ptp_clock_event(chip->ptp_clock, &ev);
> + }
> +
> + mutex_unlock(&chip->reg_lock);
> +
> + schedule_delayed_work(&chip->tai_event_work, TAI_EVENT_WORK_INTERVAL);
> +}
> +

> +static int mv88e6xxx_ptp_enable_perout(struct mv88e6xxx_chip *chip,
> +struct ptp_clock_request *rq, int on)
> +{
> + struct timespec ts;
> + u64 ns;
> + int pin;
> + int err;
> +
> + pin = ptp_find_pin(chip->ptp_clock, PTP_PF_PEROUT, rq->extts.index);
> +
> + if (pin < 0)
> + return -EBUSY;
> +
> + ts.tv_sec = rq->perout.period.sec;
> + ts.tv_nsec = rq->perout.period.nsec;
> + ns = timespec_to_ns(&ts);
> +
> + if (ns > U32_MAX)
> + return -ERANGE;
> +
> + mutex_lock(&chip->reg_lock);
> +
> + err = mv88e6xxx_config_periodic_trig(chip, (u32)ns, 0);

Here you ignore the phase of the signal given in the rq->perout.start
field.  That is not what the user expects.  For periodic outputs where
the phase cannot be set, we really would need a new ioctl.

However, in this case, you should just drop this functionality.  I
understand that this works with your adjustable external oscillator,
but we cannot support that in mainline (at least, not yet).

Thanks,
Richard


> + if (err)
> + goto out;
> +
> + if (on) {
> + err = mv88e6xxx_g2_set_gpio_config(
> + chip, pin, MV88E6XXX_G2_SCRATCH_GPIO_MODE_TRIG,
> + MV88E6XXX_G2_SCRATCH_GPIO_DIR_OUT);
> + } else {
> + err = mv88e6xxx_g2_set_gpio_config(
> + chip, pin, MV88E6XXX_G2_SCRATCH_GPIO_MODE_GPIO,
> + MV88E6XXX_G2_SCRATCH_GPIO_DIR_IN);
> + }
> +
> +out:
> + mutex_unlock(&chip->reg_lock);
> +
> + return err;
> +}

Re: [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context

2017-10-08 Thread Eric Dumazet

On Sun, 2017-10-08 at 18:18 +0300, Ido Schimmel wrote:
> Without the rwlock and with PREEMPT_RCU we're no longer guaranteed to be
> in non-preemptible context when performing a route lookup, so use
> raw_cpu_ptr() instead.
> 
> Takes care of the following splat:
> [  122.221814] BUG: using smp_processor_id() in preemptible [] code: 
> sshd/2672
> [  122.221845] caller is debug_smp_processor_id+0x17/0x20
> [  122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 
> 4.14.0-rc3-idosch-next-custom #639
> [  122.221880] Hardware name: Mellanox Technologies Ltd. 
> MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
> [  122.221893] Call Trace:
> [  122.221919]  dump_stack+0xb1/0x10c
> [  122.221946]  ? _atomic_dec_and_lock+0x124/0x124
> [  122.221974]  ? ___ratelimit+0xfe/0x240
> [  122.222020]  check_preemption_disabled+0x173/0x1b0
> [  122.222060]  debug_smp_processor_id+0x17/0x20
> [  122.222083]  ip6_pol_route+0x1482/0x24a0
> ...
> 
> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in 
> fib6_table")
> Signed-off-by: Ido Schimmel 
> ---


Thanks Ido for this patch.

IMO, we no longer play this read_lock() -> write_lock() game since
ip6_dst_gc() could be called from rt6_make_pcpu_route()


So we might simplify things quite a bit, by blocking BH (and thus
preventing preemption)

Something like :

 net/ipv6/route.c |   26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 
399d1bceec4a6e6736c367e706dd2acbd4093d58..606e80325b21c0e10a02e9c7d5b3fcfbfc26a003
 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1136,15 +1136,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct 
rt6_info *rt)
dst_hold(&pcpu_rt->dst);
p = this_cpu_ptr(rt->rt6i_pcpu);
prev = cmpxchg(p, NULL, pcpu_rt);
-   if (prev) {
-   /* If someone did it before us, return prev instead */
-   /* release refcnt taken by ip6_rt_pcpu_alloc() */
-   dst_release_immediate(&pcpu_rt->dst);
-   /* release refcnt taken by above dst_hold() */
-   dst_release_immediate(&pcpu_rt->dst);
-   dst_hold(&prev->dst);
-   pcpu_rt = prev;
-   }
+   BUG_ON(prev);
 
rt6_dst_from_metrics_check(pcpu_rt);
return pcpu_rt;
@@ -1739,31 +1731,25 @@ struct rt6_info *ip6_pol_route(struct net *net, struct 
fib6_table *table,
struct rt6_info *pcpu_rt;
 
dst_use_noref(&rt->dst, jiffies);
+   local_bh_disable();
pcpu_rt = rt6_get_pcpu_route(rt);
 
-   if (pcpu_rt) {
-   rcu_read_unlock();
-   } else {
+   if (!pcpu_rt) {
/* atomic_inc_not_zero() is needed when using rcu */
if (atomic_inc_not_zero(&rt->rt6i_ref)) {
-   /* We have to do the read_unlock first
-* because rt6_make_pcpu_route() may trigger
-* ip6_dst_gc() which will take the write_lock.
-*
-* No dst_hold() on rt is needed because 
grabbing
+   /* No dst_hold() on rt is needed because 
grabbing
 * rt->rt6i_ref makes sure rt can't be released.
 */
-   rcu_read_unlock();
pcpu_rt = rt6_make_pcpu_route(rt);
rt6_release(rt);
} else {
/* rt is already removed from tree */
-   rcu_read_unlock();
pcpu_rt = net->ipv6.ip6_null_entry;
dst_hold(&pcpu_rt->dst);
}
}
-
+   local_bh_enable();
+   rcu_read_unlock();
trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6);
return pcpu_rt;
}
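
Editor's note: the branch removed from rt6_make_pcpu_route() above is the usual "install once, or adopt the winner's entry" pattern around cmpxchg(p, NULL, pcpu_rt). A minimal userspace sketch of that pattern with C11 atomics follows. The names entry and install_once are hypothetical, free() stands in for dropping the references, and nothing here models the BH or preemption rules discussed in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct entry {
	int value;
};

/* Try to publish 'fresh' in an empty slot. If another caller got there
 * first, discard 'fresh' and return the entry that is already installed. */
static struct entry *install_once(_Atomic(struct entry *) *slot,
				  struct entry *fresh)
{
	struct entry *expected = NULL;

	assert(slot != NULL && fresh != NULL);

	if (atomic_compare_exchange_strong(slot, &expected, fresh))
		return fresh;		/* we won: slot now holds 'fresh' */

	free(fresh);			/* we lost: drop our copy */
	return expected;		/* holds the winner's entry */
}
```

The proposal quoted above replaces the losing branch with BUG_ON(prev), on the assumption that with BH disabled nothing else can fill the per-CPU slot between the lookup and the cmpxchg.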

[PATCH 00/12] radix-tree: split out struct radix_tree_root out to

2017-10-08 Thread Masahiro Yamada


The motivation of this series is to cut down unnecessary header
dependency in terms of radix tree.

Sub-systems or drivers that use radix-tree for data management
typically embed struct radix_tree_root in their data structures,
like this:

struct foo {
   ...

   struct radix_tree_root   foo_tree;
   ...
};

So,  needs to include ,
therefore, users of  include a lot of bloat
from .

If you see the definition of radix_tree_root,

   struct radix_tree_root {
   gfp_tgfp_mask;
   struct radix_tree_node   __rcu *rnode;
   };

it is a very simple structure.
It only depends on  for gfp_t and
 for __rcu.

By splitting out the radix_tree_root definition,
we can reduce the header file dependency.
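
Editor's note: a minimal single-file sketch of the idea, assuming hypothetical names (tree_node, tree_root, foo_init). A structure that only embeds the root needs the root's definition and a forward declaration of the node type, nothing more.

```c
#include <assert.h>
#include <stddef.h>

/* What a types-only header would provide. */
struct tree_node;			/* forward declaration; layout not needed */

struct tree_root {
	unsigned int flags;
	struct tree_node *rnode;
};

/* A user that merely embeds the root in its own structure. */
struct foo {
	int id;
	struct tree_root foo_tree;
};

static void foo_init(struct foo *f, int id)
{
	assert(f != NULL);

	f->id = id;
	f->foo_tree.flags = 0;
	f->foo_tree.rnode = NULL;
}
```

Only code that actually walks or modifies the tree would need the full header with the node layout and accessors.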

Reducing the header dependency will help for speeding the kernel
build, suppressing unnecessary recompile of objects during
git-bisect'ing, etc.

The patch 1 is a trivial clean-up; it is just here
to avoid conflict.

The patch 2 is the main part of this series;
split out struct radix_tree_root.

The rest of the series replaces <linux/radix-tree.h>
with <linux/radix-tree-root.h> where appropriate.

Please review if the idea is OK.

If it is OK, I'd like to know how to apply the series.

Perhaps, the first two for v4.15.  Then, rest of series
will be sent per-subsystem for v4.16?

Or, can somebody take care of the whole series?

I checked allmodconfig for x86 and arm64.
I am expecting 0 day testing will check it too.



Masahiro Yamada (12):
  radix-tree: replace  with 
  radix-tree: split struct radix_tree_root to 
  irqdomain: replace  with 
  writeback: replace  with 
  iocontext.h: replace  with

  fs: replace  with 
  blkcg: replace  with 
  fscache: include 
  sh: intc: replace  with 
  net/mlx4: replace  with 
  net/mlx5: replace  with 
  drm/i915: replace  with 

 drivers/gpu/drm/i915/i915_gem.c|  1 +
 drivers/gpu/drm/i915/i915_gem_context.c|  1 +
 drivers/gpu/drm/i915/i915_gem_context.h|  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  1 +
 drivers/gpu/drm/i915/i915_gem_object.h |  1 +
 drivers/net/ethernet/mellanox/mlx4/cq.c|  1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4.h  |  2 +-
 drivers/net/ethernet/mellanox/mlx4/qp.c|  1 +
 drivers/net/ethernet/mellanox/mlx4/srq.c   |  1 +
 drivers/sh/intc/internals.h|  2 +-
 include/linux/backing-dev-defs.h   |  2 +-
 include/linux/blk-cgroup.h |  2 +-
 include/linux/fs.h |  2 +-
 include/linux/fscache.h|  1 +
 include/linux/iocontext.h  |  2 +-
 include/linux/irqdomain.h  |  2 +-
 include/linux/mlx4/device.h|  2 +-
 include/linux/mlx4/qp.h|  1 +
 include/linux/mlx5/driver.h|  2 +-
 include/linux/mlx5/qp.h|  1 +
 include/linux/radix-tree-root.h| 24 
 include/linux/radix-tree.h |  8 ++--
 22 files changed, 46 insertions(+), 16 deletions(-)
 create mode 100644 include/linux/radix-tree-root.h

-- 
2.7.4

Re: [PATCH 11/12] net/mlx5: replace with

2017-10-08 Thread David Miller

From: Masahiro Yamada 
Date: Mon,  9 Oct 2017 01:10:12 +0900

> The header include/linux/mlx5/driver.h requires the definition of
> struct radix_tree_root, but does not need to know anything about
> other radix tree stuff.
> 
> Include  instead of  to
> reduce the number of included header files.
> 
> Also, add  to include/linux/mlx5/gp.h where radix
> tree accessors are used.
> 
> Signed-off-by: Masahiro Yamada 

Same objections as the mlx4 changes, just include both headers in
driver.h

Re: [PATCH v6 05/11] dt-bindings: net: dwmac-sun8i: update documentation about integrated PHY

2017-10-08 Thread Corentin Labbe

On Thu, Sep 28, 2017 at 09:37:08AM +0200, Corentin Labbe wrote:
> On Wed, Sep 27, 2017 at 04:02:10PM +0200, Andrew Lunn wrote:
> > Hi Corentin
> > 
> > > +Required properties for the mdio-mux node:
> > > +  - compatible = "mdio-mux"
> > 
> > This is too generic. Please add a more specific compatible for this
> > particular mux. You can keep "mdio-mux", since that is what the MDIO
> > subsystem will look for.
> > 
> 
> I will add allwinner,sun8i-h3-mdio-mux
> 
> > > +Required properties of the integrated phy node:
> > >  - clocks: a phandle to the reference clock for the EPHY
> > >  - resets: a phandle to the reset control for the EPHY
> > > +- phy-is-integrated
> > 
> > So the last thing you said is that the mux is not the problem
> > here. Something else is locking up. Did you discover what?
> > 
> > I really would like phy-is-integrated to go away.
> > 
> 
> I have found the problem: by enabling ephy clk/reset the timeout does not 
> occur anymore.
> So we could remove phy-is-integrated by:
> Moving internal phy clk/reset handling in mdio_mux_syscon_switch_fn()
> But this means:
> - getting internalphy node always by manually get internal_mdio/internal_phy 
> (and not by the given phyhandle)
> - doing some unnecessary tasks (enable/scan/disable) when external_phy is 
> needed
> 
> Regards

Hello all

Below is the current patch. As you can see, it no longer uses the 
phy-is-integrated property.
So now the mdio-mux must always enable the internal MDIO when switch_fn asks 
for it, which means resetting the MAC, which in turn requires enabling the 
ephy clk/reset.
But for this I need a reference to those clocks and resets (this is done in 
get_ephy_nodes).
The current version sets those clocks in the mdio-mux node, and as you can see 
it is already ugly (lots of "get next node" calls);
if the clk/rst nodes were in the phy nodes, where they should be, it would be 
even worse.

So, since the MAC depends on those clk/rst nodes for doing reset(), 
I am looking for a proper way to get references to them.
Or do you agree that putting the ephy clk/rst in the emac node is acceptable?

thanks
regards

--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -41,14 +42,14 @@
  * This value is used for disabling properly EMAC
  * and used as a good starting value in case of the
  * boot process(uboot) leave some stuff.
- * @internal_phy:  Does the MAC embed an internal PHY
+ * @soc_has_internal_phy:  Does the MAC embed an internal PHY
  * @support_mii:   Does the MAC handle MII
  * @support_rmii:  Does the MAC handle RMII
  * @support_rgmii: Does the MAC handle RGMII
  */
 struct emac_variant {
u32 default_syscon_value;
-   int internal_phy;
+   bool soc_has_internal_phy;
bool support_mii;
bool support_rmii;
bool support_rgmii;
@@ -61,7 +62,7 @@ struct emac_variant {
  * @rst_ephy:  reference to the optional EPHY reset for the internal PHY
  * @variant:   reference to the current board variant
  * @regmap:regmap for using the syscon
- * @use_internal_phy: Does the current PHY choice imply using the internal PHY
+ * @internal_phy_powered: Does the internal PHY is enabled
  */
 struct sunxi_priv_data {
struct clk *tx_clk;
@@ -70,12 +71,13 @@ struct sunxi_priv_data {
struct reset_control *rst_ephy;
const struct emac_variant *variant;
struct regmap *regmap;
-   bool use_internal_phy;
+   bool internal_phy_powered;
+   void *mux_handle;
 };
 
 static const struct emac_variant emac_variant_h3 = {
.default_syscon_value = 0x58000,
-   .internal_phy = PHY_INTERFACE_MODE_MII,
+   .soc_has_internal_phy = true,
.support_mii = true,
.support_rmii = true,
.support_rgmii = true
@@ -83,20 +85,20 @@ static const struct emac_variant emac_variant_h3 = {
 
 static const struct emac_variant emac_variant_v3s = {
.default_syscon_value = 0x38000,
-   .internal_phy = PHY_INTERFACE_MODE_MII,
+   .soc_has_internal_phy = true,
.support_mii = true
 };
 
 static const struct emac_variant emac_variant_a83t = {
.default_syscon_value = 0,
-   .internal_phy = 0,
+   .soc_has_internal_phy = false,
.support_mii = true,
.support_rgmii = true
 };
 
 static const struct emac_variant emac_variant_a64 = {
.default_syscon_value = 0,
-   .internal_phy = 0,
+   .soc_has_internal_phy = false,
.support_mii = true,
.support_rmii = true,
.support_rgmii = true
@@ -195,6 +197,9 @@ static const struct emac_variant emac_variant_a64 = {
 #define H3_EPHY_LED_POLBIT(17) /* 1: active low, 0: active 
high */
 #define H3_EPHY_SHUTDOWN   BIT(16) /* 1: shutdown, 0: power up */
 #define H3_EPHY_SELECT BIT(15)

Re: [PATCH net-next RFC 0/9] net: dsa: PTP timestamping for mv88e6xxx

2017-10-08 Thread Richard Cochran

On Fri, Sep 29, 2017 at 05:43:23AM -0400, Richard Cochran wrote:
> I happy to see this series.  I just finished porting an out-of-tree
> PHC driver for the Marvell mv88e635x, and I want to mainline it, but I
> also have a few uglies.

This series looks really good.  I won't even post mine, as that
would now be too embarrassing.

I will try to get my hands on some HW, perhaps by the end of October,
in order to test and complete your driver...

Thanks,
Richard

Re: [PATCH 10/12] net/mlx4: replace with

2017-10-08 Thread David Miller

From: Masahiro Yamada 
Date: Mon,  9 Oct 2017 01:10:11 +0900

> The headers
>  - include/linux/mlx4/device.h
>  - drivers/net/ethernet/mellanox/mlx4/mlx4.h
> require the definition of struct radix_tree_root, but do not need to
> know anything about other radix tree stuff.
> 
> Include  instead of  to
> reduce the header dependency.
> 
> While we are here, let's add missing  where
> radix tree accessors are used.
> 
> Signed-off-by: Masahiro Yamada 

Honestly this makes things more complicated.

The driver was trying to consolidate all of the header needs
by including them all in one place, the main driver header.

Now you're including headers in several different files.

I really don't like the results of this change and would
ask you to reconsider.

Just add both radix-tree-root.h _and_ radix-tree.h to mlx4.h
and leave the rest of the driver alone.

Re: [patch net] mlxsw: spectrum_router: Avoid expensive lookup during route removal

2017-10-08 Thread David Miller

From: Jiri Pirko 
Date: Sun,  8 Oct 2017 11:53:26 +0200

> From: Ido Schimmel 
> 
> In commit fc922bb0dd94 ("mlxsw: spectrum_router: Use one LPM tree for
> all virtual routers") I increased the scale of supported VRFs by having
> all of them share the same LPM tree.
> 
> In order to avoid look-ups for prefix lengths that don't exist, each
> route removal would trigger an aggregation across all the active virtual
> routers to see which prefix lengths are in use and which aren't and
> structure the tree accordingly.
> 
> With the way the data structures are currently laid out, this is a very
> expensive operation. When preformed repeatedly - due to the invocation
> of the abort mechanism - and with enough VRFs, this can result in a hung
> task.
> 
> For now, avoid this optimization until it can be properly re-added in
> net-next.
> 
> Fixes: fc922bb0dd94 ("mlxsw: spectrum_router: Use one LPM tree for all 
> virtual routers")
> Signed-off-by: Ido Schimmel 
> Reported-by: David Ahern 
> Tested-by: David Ahern 
> Signed-off-by: Jiri Pirko 

Applied.

Re: [patch net-next repost 0/2] mlxsw: Add more extack error reporting

2017-10-08 Thread David Miller

From: Jiri Pirko 
Date: Sun,  8 Oct 2017 11:57:54 +0200

> From: Jiri Pirko 
> 
> Ido says:
> 
> Add error messages to VLAN and bridge enslavements to help users
> understand why the enslavement failed.

Series applied, thanks for the repost.

Re: [PATCH 10/12] net/mlx4: replace with

2017-10-08 Thread Joe Perches

On Mon, 2017-10-09 at 02:29 +0900, Masahiro Yamada wrote:
> The idea is simple; include necessary headers explicitly.

Try that for kernel.h

There's a reason aggregation of #includes is useful.

Re: [PATCH 00/12] radix-tree: split out struct radix_tree_root out to

2017-10-08 Thread Leon Romanovsky

On Mon, Oct 09, 2017 at 01:10:01AM +0900, Masahiro Yamada wrote:

<...>
>
> By splitting out the radix_tree_root definition,
> we can reduce the header file dependency.
>
> Reducing the header dependency will help for speeding the kernel
> build, suppressing unnecessary recompile of objects during
> git-bisect'ing, etc.

If we judge by the diffstat of this series, there won't be any
visible change in anything mentioned above.

<...>

>
> Masahiro Yamada (12):
>   radix-tree: replace  with 
>   radix-tree: split struct radix_tree_root to 
>   irqdomain: replace  with 
>   writeback: replace  with 
>   iocontext.h: replace  with
> 
>   fs: replace  with 
>   blkcg: replace  with 
>   fscache: include 
>   sh: intc: replace  with 
>   net/mlx4: replace  with 
>   net/mlx5: replace  with 
>   drm/i915: replace  with 
>
>  drivers/gpu/drm/i915/i915_gem.c|  1 +
>  drivers/gpu/drm/i915/i915_gem_context.c|  1 +
>  drivers/gpu/drm/i915/i915_gem_context.h|  2 +-
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  1 +
>  drivers/gpu/drm/i915/i915_gem_object.h |  1 +
>  drivers/net/ethernet/mellanox/mlx4/cq.c|  1 +
>  drivers/net/ethernet/mellanox/mlx4/mlx4.h  |  2 +-
>  drivers/net/ethernet/mellanox/mlx4/qp.c|  1 +
>  drivers/net/ethernet/mellanox/mlx4/srq.c   |  1 +
>  drivers/sh/intc/internals.h|  2 +-
>  include/linux/backing-dev-defs.h   |  2 +-
>  include/linux/blk-cgroup.h |  2 +-
>  include/linux/fs.h |  2 +-
>  include/linux/fscache.h|  1 +
>  include/linux/iocontext.h  |  2 +-
>  include/linux/irqdomain.h  |  2 +-
>  include/linux/mlx4/device.h|  2 +-
>  include/linux/mlx4/qp.h|  1 +
>  include/linux/mlx5/driver.h|  2 +-
>  include/linux/mlx5/qp.h|  1 +
>  include/linux/radix-tree-root.h| 24 
>  include/linux/radix-tree.h |  8 ++--
>  22 files changed, 46 insertions(+), 16 deletions(-)
>  create mode 100644 include/linux/radix-tree-root.h
>
> --
> 2.7.4
>



[PATCH] net/core: Fix BUG to BUG_ON conditionals.

2017-10-08 Thread Tim Hansen

Fix BUG() calls to use BUG_ON(conditional) macros.

This was found using make coccicheck M=net/core on linux next
tag next-20170929.

Signed-off-by: Tim Hansen 
---
 net/core/skbuff.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d98c2e3ce2bf..461516f45b33 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t 
gfp_mask)
/* Set the tail pointer and length */
skb_put(n, skb->len);
 
-   if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
-   BUG();
+   BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
 
copy_skb_header(n, skb);
return n;
@@ -1449,8 +1448,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int 
ntail,
 
BUG_ON(nhead < 0);
 
-   if (skb_shared(skb))
-   BUG();
+   BUG_ON(skb_shared(skb));
 
size = SKB_DATA_ALIGN(size);
 
@@ -1595,9 +1593,8 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
head_copy_off = newheadroom - head_copy_len;
 
/* Copy the linear header and data. */
-   if (skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
- skb->len + head_copy_len))
-   BUG();
+   BUG_ON(skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
+skb->len + head_copy_len));
 
copy_skb_header(n, skb);
 
@@ -1878,8 +1875,8 @@ void *__pskb_pull_tail(struct sk_buff *skb, int delta)
return NULL;
}
 
-   if (skb_copy_bits(skb, skb_headlen(skb), skb_tail_pointer(skb), delta))
-   BUG();
+   BUG_ON(skb_copy_bits(skb, skb_headlen(skb),
+skb_tail_pointer(skb), delta))
 
/* Optimization: no fragments, no reasons to preestimate
 * size of pulled pages. Superb.
-- 
2.14.2

[BUG] stmmac: A possible sleep-in-atomic bug in stmmac_suspend

2017-10-08 Thread Jia-Ju Bai


According to stmmac_main.c, the driver may sleep under a spinlock,
and the function call path is:
stmmac_suspend (acquire the spinlock)
  stmmac_disable_all_queues
napi_disable
  might_sleep --> may sleep
  msleep --> may sleep
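
The call path above can be modelled in userspace to show why the ordering matters. This is only a sketch under stated assumptions: every name below (model_spin_lock, model_might_sleep, suspend_buggy, suspend_reordered) is hypothetical and stands in for the kernel helper it is named after, and the reordering shown is one possible shape of a fix, not necessarily the right one for stmmac.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these are the kernel's real helpers. */
static int atomic_depth;       /* > 0 while a (modelled) spinlock is held */
static int sleep_violations;   /* sleeps attempted while in atomic context */

static void model_spin_lock(void)
{
	atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* Stand-in for might_sleep(): record a violation when called atomically. */
static void model_might_sleep(void)
{
	if (atomic_depth > 0)
		sleep_violations++;
}

/* Stand-in for napi_disable(), which may sleep. */
static void model_napi_disable(void)
{
	model_might_sleep();
}

/* Shape reported in the mail: the sleeping call is made under the lock. */
static int suspend_buggy(void)
{
	sleep_violations = 0;
	model_spin_lock();
	model_napi_disable();
	model_spin_unlock();
	return sleep_violations;
}

/* One possible restructuring: do the sleeping work before taking the lock. */
static int suspend_reordered(void)
{
	sleep_violations = 0;
	model_napi_disable();
	model_spin_lock();
	/* ... non-sleeping work would go here ... */
	model_spin_unlock();
	return sleep_violations;
}
```

In this model suspend_buggy() records one violation and suspend_reordered() records none.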

This bug was found by my static analysis tool and my code review.


Thanks,
Jia-Ju Bai

Re: [PATCH net-next RFC 8/9] net: dsa: mv88e6xxx: add rx/tx timestamping support

2017-10-08 Thread Richard Cochran

On Thu, Sep 28, 2017 at 10:25:40AM -0500, Brandon Streiff wrote:
> +void mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> +  struct sk_buff *clone, unsigned int type)
> +{
> + struct mv88e6xxx_chip *chip = ds->priv;
> + struct mv88e6xxx_port_hwtstamp *ps = >port_hwtstamp[port];
> +
> + if (!chip->info->ptp_support)
> + return;
> +
> + if (port < 0 || port >= mv88e6xxx_num_ports(chip))
> + goto out;
> +
> + if (unlikely(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP) &&
> + mv88e6xxx_should_tstamp(chip, port, clone, type)) {
> + __be16 *seq_ptr = (__be16 *)(_get_ptp_header(clone, type) +
> +  OFF_PTP_SEQUENCE_ID);
> +
> + if (!test_and_set_bit_lock(MV88E6XXX_HWTSTAMP_TX_IN_PROGRESS,
> +>state)) {
> + ps->tx_skb = clone;
> + ps->tx_tstamp_start = jiffies;
> + ps->tx_seq_id = be16_to_cpup(seq_ptr);
> +
> + /* Fetching the timestamp is high-priority work because
> +  * 802.1AS bounds the time for a response.

Can you please use this?

commit d9535cb7b7603aeb549c697ecdf92024e4d0a650
Author: Grygorii Strashko 
Date:   Fri Jul 28 17:30:02 2017 -0500

ptp: introduce ptp auxiliary worker

Many PTP drivers are required to perform some asynchronous or periodic work,
like periodically handling PHC counter overflow or handling delayed timestamps
for RX/TX network packets. In most cases, such work is implemented
using workqueues. Unfortunately, kernel workqueues might introduce
significant delay in work scheduling under high system load and on -RT,
which could cause misbehavior of PTP drivers due to internal counter
overflow, for example, and there is no way to tune their execution policy and
priority manually.

Hence, the kthread_worker can be used instead of workqueues, as it creates a
separate named kthread for each worker, and its execution policy and
priority can be configured using the chrt tool.

> +  * No need to check result of queue_work(). ps->tx_skb
> +  * check ensures work item is not pending (it may be
> +  * waiting to exit)
> +  */
> + queue_work(system_highpri_wq, >tx_tstamp_work);
> + return;
> + }
> +
> + /* Otherwise we're already in progress... */
> + dev_dbg(chip->dev,
> + "p%d: tx timestamp already in progress, discarding",
> + port);
> + }
> +
> +out:
> + /* We don't need it after all. */
> + kfree_skb(clone);
> +}

Thanks,
Richard

Re: [PATCH 10/12] net/mlx4: replace with

2017-10-08 Thread Leon Romanovsky

On Mon, Oct 09, 2017 at 02:29:15AM +0900, Masahiro Yamada wrote:
> 2017-10-09 2:00 GMT+09:00 David Miller :
> > From: Masahiro Yamada 
> > Date: Mon,  9 Oct 2017 01:10:11 +0900
> >
> >> The headers
> >>  - include/linux/mlx4/device.h
> >>  - drivers/net/ethernet/mellanox/mlx4/mlx4.h
> >> require the definition of struct radix_tree_root, but do not need to
> >> know anything about other radix tree stuff.
> >>
> >> Include  instead of  to
> >> reduce the header dependency.
> >>
> >> While we are here, let's add missing  where
> >> radix tree accessors are used.
> >>
> >> Signed-off-by: Masahiro Yamada 
> >
> > Honestly this makes things more complicated.
>
>
> The idea is simple; include necessary headers explicitly.
>
> Putting everything into one common header
> means most of C files are forced to parse unnecessary headers.

That cost is negligible: only the first inclusion actually parses the
header file; later inclusions hit the #ifndef include guard without
reparsing the whole file.
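
For reference, a minimal sketch of the include-guard behaviour being discussed. The header is simulated inline in one file, and DEMO_ROOT_H, struct demo_root and DEMO_ROOT_SECOND_BODY are made-up names, not anything from the series. Note that the guard only helps within a single translation unit; every C file that includes the header still parses it once.

```c
#include <assert.h>
#include <stddef.h>

/* ---- first inclusion of a hypothetical "demo_root.h" ---- */
#ifndef DEMO_ROOT_H
#define DEMO_ROOT_H
struct demo_root {
	unsigned int flags;
	void *node;
};
#endif /* DEMO_ROOT_H */

/* ---- second inclusion: the guard macro is set, so the body is skipped ---- */
#ifndef DEMO_ROOT_H
#define DEMO_ROOT_H
#define DEMO_ROOT_SECOND_BODY 1
struct demo_root {              /* would be a redefinition if compiled */
	unsigned int flags;
	void *node;
};
#endif /* DEMO_ROOT_H */

static_assert(sizeof(struct demo_root) >= sizeof(void *),
	      "demo_root is defined exactly once");

/* Returns 1 only if the second, guarded body was compiled. */
static int body_compiled_twice(void)
{
#ifdef DEMO_ROOT_SECOND_BODY
	return 1;
#else
	return 0;
#endif
}

static size_t demo_root_size(void)
{
	return sizeof(struct demo_root);
}
```

GCC's preprocessor additionally documents a "once-only headers" optimization that avoids even rescanning a guarded file within the same translation unit.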

>
>
>
> > The driver was trying to consolidate all of the header needs
> > by including them all in one place, the main driver header.
> >
> > Now you're including headers in several different files.
> >
> > I really don't like the results of this change and would
> > ask you to reconsider.
> >
> > Just add both radix-tree-root.h _and_ radix-tree.h to mlx4.h
> > and leave the rest of the driver alone.
>
>
> If you do not like this, you can just throw it away.
>
>  includes .
> You do not need to include both.
>
>
>
>
> --
> Best Regards
> Masahiro Yamada



[PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context

2017-10-08 Thread Ido Schimmel

Without the rwlock and with PREEMPT_RCU we're no longer guaranteed to be
in non-preemptible context when performing a route lookup, so use
raw_cpu_ptr() instead.

Takes care of the following splat:
[  122.221814] BUG: using smp_processor_id() in preemptible [] code: 
sshd/2672
[  122.221845] caller is debug_smp_processor_id+0x17/0x20
[  122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 
4.14.0-rc3-idosch-next-custom #639
[  122.221880] Hardware name: Mellanox Technologies Ltd. 
MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[  122.221893] Call Trace:
[  122.221919]  dump_stack+0xb1/0x10c
[  122.221946]  ? _atomic_dec_and_lock+0x124/0x124
[  122.221974]  ? ___ratelimit+0xfe/0x240
[  122.222020]  check_preemption_disabled+0x173/0x1b0
[  122.222060]  debug_smp_processor_id+0x17/0x20
[  122.222083]  ip6_pol_route+0x1482/0x24a0
...

Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Signed-off-by: Ido Schimmel 
---
 net/ipv6/route.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 399d1bceec4a..579d4b73beb1 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1112,7 +1112,7 @@ static struct rt6_info *rt6_get_pcpu_route(struct 
rt6_info *rt)
 {
struct rt6_info *pcpu_rt, **p;
 
-   p = this_cpu_ptr(rt->rt6i_pcpu);
+   p = raw_cpu_ptr(rt->rt6i_pcpu);
pcpu_rt = *p;
 
if (pcpu_rt && ip6_hold_safe(NULL, _rt, false))
@@ -1134,7 +1134,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct 
rt6_info *rt)
}
 
dst_hold(_rt->dst);
-   p = this_cpu_ptr(rt->rt6i_pcpu);
+   p = raw_cpu_ptr(rt->rt6i_pcpu);
prev = cmpxchg(p, NULL, pcpu_rt);
if (prev) {
/* If someone did it before us, return prev instead */
-- 
2.13.6

[PATCH 10/12] net/mlx4: replace with

2017-10-08 Thread Masahiro Yamada

The headers
 - include/linux/mlx4/device.h
 - drivers/net/ethernet/mellanox/mlx4/mlx4.h
require the definition of struct radix_tree_root, but do not need to
know anything about other radix tree stuff.

Include  instead of  to
reduce the header dependency.

While we are here, let's add missing  where
radix tree accessors are used.

Signed-off-by: Masahiro Yamada 
---

 drivers/net/ethernet/mellanox/mlx4/cq.c   | 1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4.h | 2 +-
 drivers/net/ethernet/mellanox/mlx4/qp.c   | 1 +
 drivers/net/ethernet/mellanox/mlx4/srq.c  | 1 +
 include/linux/mlx4/device.h   | 2 +-
 include/linux/mlx4/qp.h   | 1 +
 6 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c 
b/drivers/net/ethernet/mellanox/mlx4/cq.c
index 72eb50c..4cbe65c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -36,6 +36,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index c68da19..975ef70 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -38,7 +38,7 @@
 #define MLX4_H
 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/drivers/net/ethernet/mellanox/mlx4/qp.c 
b/drivers/net/ethernet/mellanox/mlx4/qp.c
index 728a2fb..50cbc62 100644
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -35,6 +35,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/drivers/net/ethernet/mellanox/mlx4/srq.c 
b/drivers/net/ethernet/mellanox/mlx4/srq.c
index bedf521..4201a46 100644
--- a/drivers/net/ethernet/mellanox/mlx4/srq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/srq.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "mlx4.h"
 #include "icm.h"
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index b0a57e0..75eac23 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -36,7 +36,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
diff --git a/include/linux/mlx4/qp.h b/include/linux/mlx4/qp.h
index 8e2828d..dfa7d8e 100644
--- a/include/linux/mlx4/qp.h
+++ b/include/linux/mlx4/qp.h
@@ -35,6 +35,7 @@
 
 #include 
 #include 
+#include 
 
 #include 
 
-- 
2.7.4

[PATCH 11/12] net/mlx5: replace with

2017-10-08 Thread Masahiro Yamada

The header include/linux/mlx5/driver.h requires the definition of
struct radix_tree_root, but does not need to know anything about
other radix tree stuff.

Include  instead of  to
reduce the number of included header files.

Also, add  to include/linux/mlx5/qp.h where radix
tree accessors are used.

Signed-off-by: Masahiro Yamada 
---

 include/linux/mlx5/driver.h | 2 +-
 include/linux/mlx5/qp.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 401c897..0aea568 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -40,7 +40,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index 66d19b6..a90996f 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -35,6 +35,7 @@
 
 #include 
 #include 
+#include 
 
 #define MLX5_INVALID_LKEY  0x100
 #define MLX5_SIG_WQE_SIZE  (MLX5_SEND_WQE_BB * 5)
-- 
2.7.4

Re: [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context

2017-10-08 Thread Ido Schimmel

Hi Eric,

On Sun, Oct 08, 2017 at 09:03:53AM -0700, Eric Dumazet wrote:
> Thanks Ido for this patch.
> 
> IMO, we no longer play this read_lock() -> write_lock() game since
> ip6_dst_gc() could be called from rt6_make_pcpu_route()

Right, because we can no longer deadlock as we could with the rwlock.

> 
> So we might simplify things quite a bit, by blocking BH (and thus
> preventing preemption)
> 
> Something like :
> 
>  net/ipv6/route.c |   26 ++
>  1 file changed, 6 insertions(+), 20 deletions(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 
> 399d1bceec4a6e6736c367e706dd2acbd4093d58..606e80325b21c0e10a02e9c7d5b3fcfbfc26a003
>  100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -1136,15 +1136,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct 
> rt6_info *rt)
>   dst_hold(_rt->dst);
>   p = this_cpu_ptr(rt->rt6i_pcpu);
>   prev = cmpxchg(p, NULL, pcpu_rt);
> - if (prev) {
> - /* If someone did it before us, return prev instead */
> - /* release refcnt taken by ip6_rt_pcpu_alloc() */
> - dst_release_immediate(_rt->dst);
> - /* release refcnt taken by above dst_hold() */
> - dst_release_immediate(_rt->dst);
> - dst_hold(>dst);
> - pcpu_rt = prev;
> - }
> + BUG_ON(prev);

Is this BUG_ON() now valid because of the local_bh_disable() in
ip6_pol_route()?

>  
>   rt6_dst_from_metrics_check(pcpu_rt);
>   return pcpu_rt;
> @@ -1739,31 +1731,25 @@ struct rt6_info *ip6_pol_route(struct net *net, 
> struct fib6_table *table,
>   struct rt6_info *pcpu_rt;
>  
>   dst_use_noref(>dst, jiffies);
> + local_bh_disable();
>   pcpu_rt = rt6_get_pcpu_route(rt);
>  
> - if (pcpu_rt) {
> - rcu_read_unlock();
> - } else {
> + if (!pcpu_rt) {
>   /* atomic_inc_not_zero() is needed when using rcu */
>   if (atomic_inc_not_zero(>rt6i_ref)) {
> - /* We have to do the read_unlock first
> -  * because rt6_make_pcpu_route() may trigger
> -  * ip6_dst_gc() which will take the write_lock.
> -  *
> -  * No dst_hold() on rt is needed because 
> grabbing
> + /* No dst_hold() on rt is needed because 
> grabbing
>* rt->rt6i_ref makes sure rt can't be released.
>*/
> - rcu_read_unlock();
>   pcpu_rt = rt6_make_pcpu_route(rt);
>   rt6_release(rt);
>   } else {
>   /* rt is already removed from tree */
> - rcu_read_unlock();
>   pcpu_rt = net->ipv6.ip6_null_entry;
>   dst_hold(_rt->dst);
>   }
>   }
> -
> + local_bh_enable();
> + rcu_read_unlock();
>   trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6);
>   return pcpu_rt;
>   }

I replaced my patch with yours and I don't trigger the bug anymore. Feel
free to add my tag:

Tested-by: Ido Schimmel 

Thanks!

Re: [PATCH net-next,0/3] hv_netvsc: support changing TCP hash level

2017-10-08 Thread David Miller

From: Haiyang Zhang 
Date: Fri,  6 Oct 2017 08:33:56 -0700

> From: Haiyang Zhang 
> 
> The patch set simplifies the existing hash level switching code for
> UDP. It also adds the support for changing TCP hash level. So users
> can switch between L3 and L4 hash levels for TCP and UDP.

Series applied.

Re: [PATCH net-next] ipv6: Do not use this_cpu_ptr() in preemptible context

2017-10-08 Thread Eric Dumazet

On Sun, 2017-10-08 at 19:54 +0300, Ido Schimmel wrote:
> Hi Eric,

> > prev = cmpxchg(p, NULL, pcpu_rt);
> > -   if (prev) {
> > -   /* If someone did it before us, return prev instead */
> > -   /* release refcnt taken by ip6_rt_pcpu_alloc() */
> > -   dst_release_immediate(_rt->dst);
> > -   /* release refcnt taken by above dst_hold() */
> > -   dst_release_immediate(_rt->dst);
> > -   dst_hold(>dst);
> > -   pcpu_rt = prev;
> > -   }
> > +   BUG_ON(prev);
> 
> Is this BUG_ON() now valid because of the local_bh_disable() in
> ip6_pol_route()?

Yes, for this bug to trigger, this code would need to be re-entered from a
hard IRQ, and that would be wrong of course.
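
The install-if-empty step under discussion can be sketched in userspace with C11 atomics. This is an illustration only: demo_slot, struct demo_route and demo_install() are hypothetical names, a single slot stands in for one CPU's cache entry, and atomic_compare_exchange_strong() is used as a rough analogue of the kernel's cmpxchg(). The "loser returns prev" branch corresponds to the fallback that the proposed diff replaces with BUG_ON(prev).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_route {
	int id;
};

/* Hypothetical slot; stands in for one CPU's cached route pointer. */
static _Atomic(struct demo_route *) demo_slot;

/*
 * Try to install @candidate into the slot if it is still empty.
 * Returns the route that ends up cached: @candidate if we won the race,
 * or the previously installed route if someone else got there first.
 */
static struct demo_route *demo_install(struct demo_route *candidate)
{
	struct demo_route *expected = NULL;

	assert(candidate != NULL);
	if (atomic_compare_exchange_strong(&demo_slot, &expected, candidate))
		return candidate;

	/* On failure, expected now holds the value already in the slot. */
	return expected;
}
```

If nothing can interleave between the lookup and the install, the failure branch is unreachable, which is the argument made for asserting instead of handling it.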

[PATCH] net: ethernet: stmmac: Clean up dead code

2017-10-08 Thread Christos Gkekas

Many macros in dwmac-ipq806x are unused and should be removed.
Moreover gmac->id is an unsigned variable and therefore checking
whether it is less than zero is redundant.

Signed-off-by: Christos Gkekas 
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c | 14 +-
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c 
b/drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c
index 866444b..2c6d7c6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-ipq806x.c
@@ -51,15 +51,11 @@
 #define NSS_COMMON_CLK_SRC_CTRL_RGMII(x)   1
 #define NSS_COMMON_CLK_SRC_CTRL_SGMII(x)   ((x >= 2) ? 1 : 0)
 
-#define NSS_COMMON_MACSEC_CTL  0x28
-#define NSS_COMMON_MACSEC_CTL_EXT_BYPASS_EN(x) (1 << x)
-
 #define NSS_COMMON_GMAC_CTL(x) (0x30 + (x * 4))
 #define NSS_COMMON_GMAC_CTL_CSYS_REQ   BIT(19)
 #define NSS_COMMON_GMAC_CTL_PHY_IFACE_SEL  BIT(16)
 #define NSS_COMMON_GMAC_CTL_IFG_LIMIT_OFFSET   8
 #define NSS_COMMON_GMAC_CTL_IFG_OFFSET 0
-#define NSS_COMMON_GMAC_CTL_IFG_MASK   0x3f
 
 #define NSS_COMMON_CLK_DIV_RGMII_1000  1
 #define NSS_COMMON_CLK_DIV_RGMII_100   9
@@ -68,9 +64,6 @@
 #define NSS_COMMON_CLK_DIV_SGMII_100   4
 #define NSS_COMMON_CLK_DIV_SGMII_1049
 
-#define QSGMII_PCS_MODE_CTL0x68
-#define QSGMII_PCS_MODE_CTL_AUTONEG_EN(x)  BIT((x * 8) + 7)
-
 #define QSGMII_PCS_CAL_LCKDT_CTL   0x120
 #define QSGMII_PCS_CAL_LCKDT_CTL_RST   BIT(19)
 
@@ -83,15 +76,10 @@
 #define QSGMII_PHY_TX_DRIVER_ENBIT(3)
 #define QSGMII_PHY_QSGMII_EN   BIT(7)
 #define QSGMII_PHY_PHASE_LOOP_GAIN_OFFSET  12
-#define QSGMII_PHY_PHASE_LOOP_GAIN_MASK0x7
 #define QSGMII_PHY_RX_DC_BIAS_OFFSET   18
-#define QSGMII_PHY_RX_DC_BIAS_MASK 0x3
 #define QSGMII_PHY_RX_INPUT_EQU_OFFSET 20
-#define QSGMII_PHY_RX_INPUT_EQU_MASK   0x3
 #define QSGMII_PHY_CDR_PI_SLEW_OFFSET  22
-#define QSGMII_PHY_CDR_PI_SLEW_MASK0x3
 #define QSGMII_PHY_TX_DRV_AMP_OFFSET   28
-#define QSGMII_PHY_TX_DRV_AMP_MASK 0xf
 
 struct ipq806x_gmac {
struct platform_device *pdev;
@@ -217,7 +205,7 @@ static int ipq806x_gmac_of_parse(struct ipq806x_gmac *gmac)
 * code and keep it consistent with the Linux convention, we'll number
 * them from 0 to 3 here.
 */
-   if (gmac->id < 0 || gmac->id > 3) {
+   if (gmac->id > 3) {
dev_err(dev, "invalid gmac id\n");
return -EINVAL;
}
-- 
2.7.4

Re: devlink dump of mlxsw_adj table triggers a panic

2017-10-08 Thread David Ahern

On 10/8/17 1:43 AM, Arkadi Sharshevsky wrote:
> Thanks, will check it out. How many nexthops groups & overall number of
> nexthops you configured?

8 ports with 62 VLANs on each (496 total vlan devices) and 62 VRFs. BGP
is exchanging routes with neighbors. No multipath routes.

Re: [PATCH net-next RFC 8/9] net: dsa: mv88e6xxx: add rx/tx timestamping support

2017-10-08 Thread Richard Cochran

On Thu, Sep 28, 2017 at 10:25:40AM -0500, Brandon Streiff wrote:
> We also utilize a feature of the "generation 3" PTP hardware that lets
> us embed the timestamp value into one of the reserved fields in the
> PTP header. This lets us extract the timestamp out of the header and
> avoid an SMI access in the RX codepath. (This implementation does not
> presently support the older generations.)

That is fine for the later models, but we really need the code to read
over MDIO as well.  You added .ptp_support = true for those older
switches, and so the present series won't work.

If it helps, maybe I can adapt the relevant code from my driver to
your work.

Thanks,
Richard

Re: [PATCH 3/3] batman-adv: Add missing kerneldoc for extack

2017-10-08 Thread David Miller

From: Sven Eckelmann 
Date: Sun, 08 Oct 2017 09:29:17 +0200

> Are you expected to apply this change:
> =
> 
> I was hoping that Simon would pick the patch up and forward it to you in a 
> proper pull request. But I would doubt that he has a big problem with you 
> applying this single line kernel-doc change. At least it is less work for him 
> and less extra noise on both mailing lists...
> 
> But feel free to tell us your preferred solution.

Ok I'll wait to get this from the next batman-adv pull request then.

Thanks.

Re: [PATCH v2] gso: fix payload length when gso_size is zero

2017-10-08 Thread David Miller

From: Alexey Kodanev 
Date: Fri,  6 Oct 2017 19:02:35 +0300

> When gso_size is reset to zero for the tail segment in skb_segment(), later
> in ipv6_gso_segment(), __skb_udp_tunnel_segment() and gre_gso_segment()
> we will get incorrect results (payload length, pcsum) for that segment.
> inet_gso_segment() already has a check for gso_size before calculating
> payload.
> 
> The issue was found with LTP vxlan & gre tests over ixgbe NIC.
> 
> Fixes: 07b26c9454a2 ("gso: Support partial splitting at the frag_list 
> pointer")
> Signed-off-by: Alexey Kodanev 
> ---
> v2: also added skb_is_gso to gre_gso_segment() and __skb_udp_tunnel_segment()
> 

Applied and queued up for -stable, thanks.

Re: [PATCH 10/12] net/mlx4: replace with

2017-10-08 Thread Masahiro Yamada

2017-10-09 2:00 GMT+09:00 David Miller :
> From: Masahiro Yamada 
> Date: Mon,  9 Oct 2017 01:10:11 +0900
>
>> The headers
>>  - include/linux/mlx4/device.h
>>  - drivers/net/ethernet/mellanox/mlx4/mlx4.h
>> require the definition of struct radix_tree_root, but do not need to
>> know anything about other radix tree stuff.
>>
>> Include  instead of  to
>> reduce the header dependency.
>>
>> While we are here, let's add missing  where
>> radix tree accessors are used.
>>
>> Signed-off-by: Masahiro Yamada 
>
> Honestly this makes things more complicated.


The idea is simple; include necessary headers explicitly.

Putting everything into one common header
means most of C files are forced to parse unnecessary headers.



> The driver was trying to consolidate all of the header needs
> by including them all in one place, the main driver header.
>
> Now you're including headers in several different files.
>
> I really don't like the results of this change and would
> ask you to reconsider.
>
> Just add both radix-tree-root.h _and_ radix-tree.h to mlx4.h
> and leave the rest of the driver alone.


If you do not like this, you can just throw it away.

 includes .
You do not need to include both.




-- 
Best Regards
Masahiro Yamada

Re: [PATCH v2] net/core: Fix BUG to BUG_ON conditionals.

2017-10-08 Thread Tim Hansen

Mistakenly sent the patch previously with a missing semicolon.
Apologies.

Fix BUG() calls to use BUG_ON(conditional) macros.

This was found using make coccicheck M=net/core on linux next
tag next-20170929

Signed-off-by: Tim Hansen 
---
 net/core/skbuff.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index d98c2e3ce2bf..34ce4c1a0f3c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1350,8 +1350,7 @@ struct sk_buff *skb_copy(const struct sk_buff *skb, gfp_t 
gfp_mask)
/* Set the tail pointer and length */
skb_put(n, skb->len);
 
-   if (skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len))
-   BUG();
+   BUG_ON(skb_copy_bits(skb, -headerlen, n->head, headerlen + skb->len));
 
copy_skb_header(n, skb);
return n;
@@ -1449,8 +1448,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int 
ntail,
 
BUG_ON(nhead < 0);
 
-   if (skb_shared(skb))
-   BUG();
+   BUG_ON(skb_shared(skb));
 
size = SKB_DATA_ALIGN(size);
 
@@ -1595,9 +1593,8 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
head_copy_off = newheadroom - head_copy_len;
 
/* Copy the linear header and data. */
-   if (skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
- skb->len + head_copy_len))
-   BUG();
+   BUG_ON(skb_copy_bits(skb, -head_copy_len, n->head + head_copy_off,
+skb->len + head_copy_len));
 
copy_skb_header(n, skb);
 
@@ -1878,8 +1875,8 @@ void *__pskb_pull_tail(struct sk_buff *skb, int delta)
return NULL;
}
 
-   if (skb_copy_bits(skb, skb_headlen(skb), skb_tail_pointer(skb), delta))
-   BUG();
+   BUG_ON(skb_copy_bits(skb, skb_headlen(skb),
+skb_tail_pointer(skb), delta));
 
/* Optimization: no fragments, no reasons to preestimate
 * size of pulled pages. Superb.
-- 
2.14.2
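
A small userspace sketch of the transformation in this patch, under the assumption that BUG_ON(cond) behaves like "if (cond) BUG();". DEMO_BUG(), DEMO_BUG_ON() and demo_copy() are made-up stand-ins, and the modelled BUG() only bumps a counter so the behaviour can be checked. The do { } while (0) wrapper is also why the trailing semicolon that v1 dropped is required.

```c
#include <assert.h>
#include <string.h>

static int demo_bug_hits;   /* counts modelled BUG() firings instead of crashing */

#define DEMO_BUG()         (demo_bug_hits++)
#define DEMO_BUG_ON(cond)  do { if (cond) DEMO_BUG(); } while (0)

/* Stand-in for a copy helper: returns 0 on success, nonzero on failure. */
static int demo_copy(char *dst, size_t dst_len, const char *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len > dst_len)
		return -1;
	memcpy(dst, src, len);
	return 0;
}

/* Old style: explicit test followed by BUG(). */
static int copy_old_style(char *dst, size_t dst_len,
			  const char *src, size_t len)
{
	demo_bug_hits = 0;
	if (demo_copy(dst, dst_len, src, len))
		DEMO_BUG();
	return demo_bug_hits;
}

/* New style: the call is the BUG_ON() condition and still runs exactly once. */
static int copy_new_style(char *dst, size_t dst_len,
			  const char *src, size_t len)
{
	demo_bug_hits = 0;
	DEMO_BUG_ON(demo_copy(dst, dst_len, src, len));
	return demo_bug_hits;
}
```

Both variants perform the copy and fire the modelled BUG() under the same conditions; the difference is that the side-effecting call now lives inside the macro argument.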

RIF/VRF overflow in spectrum and reporting errors back to user

2017-10-08 Thread David Ahern

Jiri / Ido:

I am looking at adding user messages for spectrum failures related to
RIF and VRF overflow coming from the inetaddr and inet6addr notifier
paths. The key is that if the notifiers fail the address add needs to
fail and an error reported to the user as to what happened.

Earlier this year 3ad7d2468f79f added in_validator_info and
in6_validator_info as a way for the notifiers to fail adding an address.
Adding support to spectrum for that notifier is complicated by the fact
that the validator notifier and address notifiers will come in back to
back for the NETDEV_UP case. Ignoring NETDEV_UP in
mlxsw_sp_inetaddr_event seems ok for IPv6 but not clear for IPv4 since
the NETDEV_UP case is emitted on an address delete that involves a
promotion. Handling the back to back NETDEV_UP is complicated since
functions invoked by __mlxsw_sp_inetaddr_event can take multiple
references. Specifically, in mlxsw_sp_port_vlan_router_join():
fid = rif->ops->fid_get(rif);

Can NETDEV_UP be ignored for the inetaddr notifier if it is handled by
the validator notifier?

David

[PATCH] ath10k: spectral: Simplify error checking

2017-10-08 Thread Christos Gkekas

Variable val is unsigned so checking whether it is less than zero is
redundant.

Signed-off-by: Christos Gkekas 
---
 drivers/net/wireless/ath/ath10k/spectral.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/spectral.c 
b/drivers/net/wireless/ath/ath10k/spectral.c
index dd9cc09..1867937 100644
--- a/drivers/net/wireless/ath/ath10k/spectral.c
+++ b/drivers/net/wireless/ath/ath10k/spectral.c
@@ -403,10 +403,7 @@ static ssize_t write_file_spectral_count(struct file *file,
return -EFAULT;
 
buf[len] = '\0';
-   if (kstrtoul(buf, 0, ))
-   return -EINVAL;
-
-   if (val < 0 || val > 255)
+   if (kstrtoul(buf, 0, ) || val > 255)
return -EINVAL;
 
mutex_lock(>conf_mutex);
-- 
2.7.4

[PATCH] ath9k: debug: Simplify error checking

2017-10-08 Thread Christos Gkekas

Variable val is unsigned so checking whether it is less than zero is
redundant.

Signed-off-by: Christos Gkekas 
---
 drivers/net/wireless/ath/ath9k/debug.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/debug.c 
b/drivers/net/wireless/ath/ath9k/debug.c
index 01fa301..3b93c23 100644
--- a/drivers/net/wireless/ath/ath9k/debug.c
+++ b/drivers/net/wireless/ath/ath9k/debug.c
@@ -1164,10 +1164,7 @@ static ssize_t write_file_tpc(struct file *file, const 
char __user *user_buf,
return -EFAULT;
 
buf[len] = '\0';
-   if (kstrtoul(buf, 0, ))
-   return -EINVAL;
-
-   if (val < 0 || val > 1)
+   if (kstrtoul(buf, 0, ) || val > 1)
return -EINVAL;
 
tpc_enabled = !!val;
-- 
2.7.4

[PATCH] qed: Delete redundant check on dcb_app priority

2017-10-08 Thread Christos Gkekas

dcb_app priority is unsigned thus checking whether it is less than zero
is redundant.

Signed-off-by: Christos Gkekas 
---
 drivers/net/ethernet/qlogic/qed/qed_dcbx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c 
b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
index 8f6ccc0..6e15d3c 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dcbx.c
@@ -2308,7 +2308,7 @@ static int qed_dcbnl_ieee_setapp(struct qed_dev *cdev, 
struct dcb_app *app)
 
DP_VERBOSE(hwfn, QED_MSG_DCB, "selector = %d protocol = %d pri = %d\n",
   app->selector, app->protocol, app->priority);
-   if (app->priority < 0 || app->priority >= QED_MAX_PFC_PRIORITIES) {
+   if (app->priority >= QED_MAX_PFC_PRIORITIES) {
DP_INFO(hwfn, "Invalid priority %d\n", app->priority);
return -EINVAL;
}
-- 
2.7.4

[net PATCH] macvlan: Only deliver one copy of the frame to the macvlan interface

2017-10-08 Thread Alexander Duyck

From: Alexander Duyck 

This patch introduces a slight adjustment for macvlan to address the fact
that in source mode I was seeing two copies of any packet addressed to the
macvlan interface being delivered where there should have been only one.

The issue appears to be that one copy was delivered based on the source MAC
address and then the second copy was being delivered based on the
destination MAC address. To fix it I am just freeing the second copy
instead of delivering it up the stack using the same netdev as was already
delivered to.

Fixes: 79cf79abce71 ("macvlan: add source mode")
Signed-off-by: Alexander Duyck 
---
 drivers/net/macvlan.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index d2aea961e0f4..744b0fe6dc78 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -484,7 +484,8 @@ static rx_handler_result_t macvlan_handle_frame(struct 
sk_buff **pskb)
return RX_HANDLER_PASS;
 
dev = vlan->dev;
-   if (unlikely(!(dev->flags & IFF_UP))) {
+   if ((vlan->mode == MACVLAN_MODE_SOURCE) ||
+   unlikely(!(dev->flags & IFF_UP))) {
kfree_skb(skb);
return RX_HANDLER_CONSUMED;
}

Re: [net PATCH] macvlan: Only deliver one copy of the frame to the macvlan interface

2017-10-08 Thread Eric Dumazet

On Sun, 2017-10-08 at 15:54 -0700, Alexander Duyck wrote:
> From: Alexander Duyck 
> 
> This patch introduces a slight adjustment for macvlan to address the fact
> that in source mode I was seeing two copies of any packet addressed to the
> macvlan interface being delivered where there should have been only one.
> 
> The issue appears to be that one copy was delivered based on the source MAC
> address and then the second copy was being delivered based on the
> destination MAC address. To fix it I am just freeing the second copy
> instead of delivering it up the stack using the same netdev as was already
> delivered to.
> 
> Fixes: 79cf79abce71 ("macvlan: add source mode")
> Signed-off-by: Alexander Duyck 
> ---
>  drivers/net/macvlan.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
> index d2aea961e0f4..744b0fe6dc78 100644
> --- a/drivers/net/macvlan.c
> +++ b/drivers/net/macvlan.c
> @@ -484,7 +484,8 @@ static rx_handler_result_t macvlan_handle_frame(struct 
> sk_buff **pskb)
>   return RX_HANDLER_PASS;
>  
>   dev = vlan->dev;
> - if (unlikely(!(dev->flags & IFF_UP))) {
> + if ((vlan->mode == MACVLAN_MODE_SOURCE) ||
> + unlikely(!(dev->flags & IFF_UP))) {
>   kfree_skb(skb);
>   return RX_HANDLER_CONSUMED;
>   }
> 


Shouldn't we have a consume_skb() then instead of kfree_skb() ?

We are not really dropping a packet here, only avoiding some artifact
caused by the cited commit.

Re: [PATCH] net: dsa: mv88e6xxx: rework in-chip bridging

2017-10-08 Thread Greg Ungerer

Hi Florian,

On 07/10/17 13:04, Florian Fainelli wrote:
> Le 10/03/17 à 23:20, Greg Ungerer a écrit :
>> On Wed, Mar 29, 2017 at 04:30:16PM -0400, Vivien Didelot wrote:
>>> All ports -- internal and external, for chips featuring a PVT -- have a
>>> mask restricting to which internal ports a frame is allowed to egress.
>>>
>>> Now that DSA exposes the number of ports and their bridge devices, it is
>>> possible to extract the code generating the VLAN map and make it generic
>>> so that it can be shared later with the cross-chip bridging code.
>>
>> This patch changes the behavior of interfaces on startup if they are
>> not part of a bridge.
>>
>> I have a board with a Marvell 6350 switch with a device tree that sets
>> up the 5 ports as lan1, lan2, lan3, lan4, wan. With kernels before
>> this patch (so linux-4.12 and older) after system startup I could do:
>>
>>   ifconfig lan1 192.168.0.1
>>
>> And then ping out that interface with no problems.
>>
>> After this patch is applied (affects linux-4.13 and newer) then the
>> ping fails:
>>
>>   PING 192.168.0.22 (192.168.0.22) 56(84) bytes of data.
>>   From 192.168.0.1 icmp_seq=1 Destination Host Unreachable
>>   From 192.168.0.1 icmp_seq=2 Destination Host Unreachable
>>   From 192.168.0.1 icmp_seq=3 Destination Host Unreachable
>>
>> If I incorporate an interface into a bridge then it all works ok.
>> So simply:
>>
>>   brctl addbr br0
>>   brctl addif br0 lan1
>>   ifconfig lan1 up
>>   ifconfig br0 192.168.0.1
>>
>> Then pings out work as expected. And if I now remove that lan1
>> interface from the bridge and use it alone again then it will
>> now work ok:
>>
>>   ifconfig br0 down
>>   brctl delif br0 lan1
>>   ifconfig lan1 192.168.0.1
>>
>> And that now pings ok.
>>
>> I fixed this with the attached patch. It is probably not the correct
>> approach, but it does restore the older behavior.
>>
>> What do you think?
> 
> This is strange, the dsa_switch_tree and its associated dsa_switch
> instances should be fully setup by the time ops->setup() is running in
> your driver but your patch suggests this may not be happening?

That is what I am seeing, yep.


> Are you using the new style Device Tree binding or the old style Device
> Tree binding out of curiosity?

This is my device tree fragment for the switch:

dsa@0 {
compatible = "marvell,dsa";
#address-cells = <2>;
#size-cells = <0>;

dsa,ethernet = <>;
dsa,mii-bus = <>;

switch@0 {
#address-cells = <1>;
#size-cells = <0>;
reg = <0x11 0>;

port@0 {
reg = <0>;
label = "lan1";
};
port@1 {
reg = <1>;
label = "lan2";
};
port@2 {
reg = <2>;
label = "lan3";
};
port@3 {
reg = <3>;
label = "lan4";
};
port@4 {
reg = <4>;
label = "wan";
};
port@5 {
reg = <5>;
label = "cpu";
};
};
 };

The board I am using is based around a Marvell Armada 370. This device tree
setup looks pretty similar to the other Marvell boards using marvell,dsa.

Regards
Greg



>>> Signed-off-by: Vivien Didelot 
>>> ---
>>>  drivers/net/dsa/mv88e6xxx/chip.c | 53 
>>> ++--
>>>  1 file changed, 34 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c 
>>> b/drivers/net/dsa/mv88e6xxx/chip.c
>>> index b114bf8e6a11..e5165831e8b5 100644
>>> --- a/drivers/net/dsa/mv88e6xxx/chip.c
>>> +++ b/drivers/net/dsa/mv88e6xxx/chip.c
>>> @@ -1123,27 +1123,42 @@ static int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
>>> return err;
>>>  }
>>>  
>>> +static u16 mv88e6xxx_port_vlan(struct mv88e6xxx_chip *chip, int dev, int port)
>>> +{
>>> +   struct dsa_switch *ds = NULL;
>>> +   struct net_device *br;
>>> +   u16 pvlan;
>>> +   int i;
>>> +
>>> +   if (dev < DSA_MAX_SWITCHES)
>>> +   ds = chip->ds->dst->ds[dev];
>>> +
>>> +   /* Prevent frames from unknown switch or port */
>>> +   if (!ds || port >= ds->num_ports)
>>> +   return 0;
>>> +
>>> +   /* Frames from DSA links and CPU ports can egress any local port */
>>> +   if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port))
>>> +   return
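
The egress-mask rule described in the commit message can be sketched in
plain userspace C. This is only an illustration under assumptions: struct
toy_port, toy_port_vlan() and the bridge_id field are hypothetical stand-ins,
not the mv88e6xxx driver's code, and the sketch does not attempt to complete
the truncated function quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a switch port; not a kernel structure. */
enum toy_port_type { TOY_PORT_USER, TOY_PORT_CPU, TOY_PORT_DSA };

struct toy_port {
	enum toy_port_type type;
	int bridge_id;		/* 0 means "not in any bridge" */
};

/* Return a bitmask of local ports that a frame ingressing on @port may
 * egress. CPU and DSA-link ports may reach every local port; a user port
 * may reach CPU/DSA ports plus the members of its own bridge (its own bit
 * is included when it is bridged). Assumes at most 16 ports.
 */
static uint16_t toy_port_vlan(const struct toy_port *ports, size_t num_ports,
			      size_t port)
{
	uint16_t mask = 0;
	size_t i;

	/* Unknown port: allow nothing. */
	if (port >= num_ports)
		return 0;

	if (ports[port].type != TOY_PORT_USER)
		return (uint16_t)((1u << num_ports) - 1);

	for (i = 0; i < num_ports; i++) {
		if (ports[i].type != TOY_PORT_USER ||
		    (ports[port].bridge_id &&
		     ports[i].bridge_id == ports[port].bridge_id))
			mask |= (uint16_t)(1u << i);
	}
	return mask;
}
```

With a layout like the board discussed in this thread (five user ports and
one CPU port), an unbridged user port in this sketch is only allowed to reach
the CPU port, and a mask of 0 means the port can reach nothing at all.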

Re: BUG: using smp_processor_id() in preemptible [00000000]

2017-10-08 Thread Jakub Kicinski

On Sun, 8 Oct 2017 20:47:10 -0700, Eric Dumazet wrote:
> This was discussed/reported earlier today by Ido.
> 
> https://patchwork.ozlabs.org/patch/823011/
> 
> We will submit an official fix shortly.

Ah, sorry, I missed that!

Re: [PATCH] net/core: Fix BUG to BUG_ON conditionals.

2017-10-08 Thread kbuild test robot

Hi Tim,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.14-rc3 next-20170929]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Tim-Hansen/net-core-Fix-BUG-to-BUG_ON-conditionals/20171009-070451
config: blackfin-allyesconfig (attached as .config)
compiler: bfin-uclinux-gcc (GCC) 6.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=blackfin 

All errors (new ones prefixed by >>):

   net/core/skbuff.c: In function '__pskb_pull_tail':
>> net/core/skbuff.c:1884:2: error: expected ';' before 'if'
 if (!skb_has_frag_list(skb))
 ^~

vim +1884 net/core/skbuff.c

^1da177e4c Linus Torvalds           2005-04-16  1838
^1da177e4c Linus Torvalds           2005-04-16  1839  /**
^1da177e4c Linus Torvalds           2005-04-16  1840   *      __pskb_pull_tail - advance tail of skb header
^1da177e4c Linus Torvalds           2005-04-16  1841   *      @skb: buffer to reallocate
^1da177e4c Linus Torvalds           2005-04-16  1842   *      @delta: number of bytes to advance tail
^1da177e4c Linus Torvalds           2005-04-16  1843   *
^1da177e4c Linus Torvalds           2005-04-16  1844   *      The function makes a sense only on a fragmented &sk_buff,
^1da177e4c Linus Torvalds           2005-04-16  1845   *      it expands header moving its tail forward and copying necessary
^1da177e4c Linus Torvalds           2005-04-16  1846   *      data from fragmented part.
^1da177e4c Linus Torvalds           2005-04-16  1847   *
^1da177e4c Linus Torvalds           2005-04-16  1848   *      &sk_buff MUST have reference count of 1.
^1da177e4c Linus Torvalds           2005-04-16  1849   *
^1da177e4c Linus Torvalds           2005-04-16  1850   *      Returns %NULL (and &sk_buff does not change) if pull failed
^1da177e4c Linus Torvalds           2005-04-16  1851   *      or value of new tail of skb in the case of success.
^1da177e4c Linus Torvalds           2005-04-16  1852   *
^1da177e4c Linus Torvalds           2005-04-16  1853   *      All the pointers pointing into skb header may change and must be
^1da177e4c Linus Torvalds           2005-04-16  1854   *      reloaded after call to this function.
^1da177e4c Linus Torvalds           2005-04-16  1855   */
^1da177e4c Linus Torvalds           2005-04-16  1856
^1da177e4c Linus Torvalds           2005-04-16  1857  /* Moves tail of skb head forward, copying data from fragmented part,
^1da177e4c Linus Torvalds           2005-04-16  1858   * when it is necessary.
^1da177e4c Linus Torvalds           2005-04-16  1859   * 1. It may fail due to malloc failure.
^1da177e4c Linus Torvalds           2005-04-16  1860   * 2. It may change skb pointers.
^1da177e4c Linus Torvalds           2005-04-16  1861   *
^1da177e4c Linus Torvalds           2005-04-16  1862   * It is pretty complicated. Luckily, it is called only in exceptional cases.
^1da177e4c Linus Torvalds           2005-04-16  1863   */
af72868b90 Johannes Berg            2017-06-16  1864  void *__pskb_pull_tail(struct sk_buff *skb, int delta)
^1da177e4c Linus Torvalds           2005-04-16  1865  {
^1da177e4c Linus Torvalds           2005-04-16  1866          /* If skb has not enough free space at tail, get new one
^1da177e4c Linus Torvalds           2005-04-16  1867           * plus 128 bytes for future expansions. If we have enough
^1da177e4c Linus Torvalds           2005-04-16  1868           * room at tail, reallocate without expansion only if skb is cloned.
^1da177e4c Linus Torvalds           2005-04-16  1869           */
4305b54135 Arnaldo Carvalho de Melo 2007-04-19  1870          int i, k, eat = (skb->tail + delta) - skb->end;
^1da177e4c Linus Torvalds           2005-04-16  1871
^1da177e4c Linus Torvalds           2005-04-16  1872          if (eat > 0 || skb_cloned(skb)) {
^1da177e4c Linus Torvalds           2005-04-16  1873                  if (pskb_expand_head(skb, 0, eat > 0 ? eat + 128 : 0,
^1da177e4c Linus Torvalds           2005-04-16  1874                                       GFP_ATOMIC))
^1da177e4c Linus Torvalds           2005-04-16  1875                          return NULL;
^1da177e4c Linus Torvalds           2005-04-16  1876          }
^1da177e4c Linus Torvalds           2005-04-16  1877
b4ef80dbcb Tim Hansen               2017-10-08  1878          BUG_ON(skb_copy_bits(skb, skb_headlen(skb),
b4ef80dbcb Tim Hansen               2017-10-08  1879                 skb_tail_pointer(skb), delta))
^1da177e4c Linus Torvalds           2005-04-16  1880
^1da177e4c Linus Torvalds           2005-04-16  1881          /* Optimization: no fragments, no reasons to preestimate
^1da177e4c Linus Torvalds           2005-04-16  1882           * size of pulled pages. Superb.
^1da177e4c Linus Torvalds           2005-04-16  1883           */
21dc330157 David S.
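
The reported error is consistent with the BUG_ON(...) statement at lines
1878-1879 of the blame listing lacking a terminating semicolon, so the
compiler only notices the problem when it reaches the following "if". A
minimal userspace sketch of the corrected shape, assuming a hypothetical
TOY_BUG_ON() macro and toy_copy_bits() helper in place of the kernel's
BUG_ON() and skb_copy_bits():

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's BUG_ON(). */
#define TOY_BUG_ON(cond) do { if (cond) abort(); } while (0)

/* Hypothetical stand-in for skb_copy_bits(): returns 0 on success. */
static int toy_copy_bits(const char *src, size_t src_len, char *dst,
			 size_t len)
{
	if (len > src_len)
		return -1;
	memcpy(dst, src, len);
	return 0;
}

/* Copy @len bytes and return how many were pulled. */
static int toy_pull(const char *src, size_t src_len, char *dst, size_t len)
{
	/* The macro invocation is an ordinary statement, so it needs its
	 * own semicolon before the next statement begins.
	 */
	TOY_BUG_ON(toy_copy_bits(src, src_len, dst, len));

	if (len == 0)
		return 0;
	return (int)len;
}
```

Dropping the semicolon after TOY_BUG_ON(...) in this sketch should produce
the same class of diagnostic as the one the robot reported.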

Re: [PATCH] net: make ->ndo_get_phys_port_name accept 32-bit len

2017-10-08 Thread Jakub Kicinski

On Sun, 8 Oct 2017 01:19:17 +0300, Alexey Dobriyan wrote:
> Buffer length passed into this hook is always IFNAMSIZ which is 16.
> 
> Code savings on x86_64:
> 
>   add/remove: 0/0 grow/shrink: 1/9 up/down: 2/-45 (-43)
>   function old new   delta
>   rocker_cmd_get_port_settings_phys_name_proc 179 181  +2
>   rocker_port_get_phys_port_name62  61  -1
>   mlxsw_sx_port_get_phys_port_name  54  50  -4
>   mlx5e_rep_get_phys_port_name  61  57  -4
>   efx_get_phys_port_name50  46  -4
>   dsa_slave_get_phys_port_name  54  50  -4
>   bnxt_vf_rep_get_phys_port_name69  65  -4
>   bnxt_get_phys_port_name   70  65  -5
>   mlxsw_sp_port_get_phys_port_name 116 107  -9
>   nfp_port_get_phys_port_name  180 170 -10
> 
> Signed-off-by: Alexey Dobriyan 

I don't think the gains justify the additional burden on backports.

> --- a/drivers/net/ethernet/netronome/nfp/nfp_port.c
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_port.c
> @@ -139,8 +139,8 @@ struct nfp_eth_table_port *nfp_port_get_eth_port(struct 
> nfp_port *port)
>   return __nfp_port_get_eth_port(port);
>  }
>  
> -int
> -nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len)
> +int nfp_port_get_phys_port_name(struct net_device *netdev,
> + char *name, unsigned int len)

Line break between name and len, please.

>  {
>   struct nfp_eth_table_port *eth_port;
>   struct nfp_port *port;
> --- a/drivers/net/ethernet/netronome/nfp/nfp_port.h
> +++ b/drivers/net/ethernet/netronome/nfp/nfp_port.h
> @@ -127,8 +127,7 @@ nfp_port_from_id(struct nfp_pf *pf, enum nfp_port_type 
> type, unsigned int id);
>  struct nfp_eth_table_port *__nfp_port_get_eth_port(struct nfp_port *port);
>  struct nfp_eth_table_port *nfp_port_get_eth_port(struct nfp_port *port);
>  
> -int
> -nfp_port_get_phys_port_name(struct net_device *netdev, char *name, size_t len);
> +int nfp_port_get_phys_port_name(struct net_device *netdev, char *name, unsigned int len);

This line is now longer than 80 characters.

>  int nfp_port_configure(struct net_device *netdev, bool configed);
>  
>  struct nfp_port *

> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1292,7 +1292,7 @@ struct net_device_ops {
>   int (*ndo_get_phys_port_id)(struct net_device *dev,
>                               struct netdev_phys_item_id *ppid);
>   int (*ndo_get_phys_port_name)(struct net_device *dev,
> -                               char *name, size_t len);
> +                               char *name, unsigned int len);

And so is this one.

>   void(*ndo_udp_tunnel_add)(struct net_device *dev,
> struct udp_tunnel_info 
> *ti);
>   void(*ndo_udp_tunnel_del)(struct net_device *dev,
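
For illustration, a callback of this kind with the length passed as
unsigned int, and with the parameters wrapped so that no line exceeds 80
columns, might look like the sketch below. The names toy_get_phys_port_name()
and TOY_IFNAMSIZ, and the "p%u" name format, are assumptions made for this
example and are not taken from any driver in the patch.

```c
#include <errno.h>
#include <stdio.h>

#define TOY_IFNAMSIZ 16	/* mirrors IFNAMSIZ, which the patch says is 16 */

/* Hypothetical callback in the style of ->ndo_get_phys_port_name.
 * Returns 0 on success or -EINVAL if the name does not fit in @len bytes.
 */
static int toy_get_phys_port_name(unsigned int port_index, char *name,
				  unsigned int len)
{
	int n;

	n = snprintf(name, len, "p%u", port_index);
	if (n < 0 || (unsigned int)n >= len)
		return -EINVAL;	/* truncated or could not be formatted */

	return 0;
}
```

A caller would pass a buffer of TOY_IFNAMSIZ bytes, for example
toy_get_phys_port_name(3, buf, sizeof(buf)).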

RE: [PATCH] qed: Delete redundant check on dcb_app priority

2017-10-08 Thread Tayar, Tomer

> dcb_app priority is unsigned thus checking whether it is less than zero
> is redundant.
> 
> Signed-off-by: Christos Gkekas 

Thanks

Acked-By: Tomer Tayar
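
The reasoning in the commit message can be shown with a small sketch. The
struct toy_app type, the TOY_MAX_PRIO bound and toy_priority_valid() are
hypothetical and only illustrate why a "less than zero" test on an unsigned
field is dead code; they are not the qed driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an application-priority entry; the field is
 * unsigned, as the commit message states for the dcb_app priority.
 */
struct toy_app {
	uint8_t priority;
};

#define TOY_MAX_PRIO 8	/* assumed upper bound, for illustration only */

/* Only the upper bound needs checking: an unsigned value can never be
 * negative, so a "priority < 0" test would always be false.
 */
static bool toy_priority_valid(const struct toy_app *app)
{
	return app->priority < TOY_MAX_PRIO;
}
```

Assigning -1 to the field wraps to 255, which the upper-bound check alone
already rejects.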

BUG: using smp_processor_id() in preemptible [00000000]

2017-10-08 Thread Jakub Kicinski

Hi!

This appeared in net-next over the weekend - looks likely related to
this set...

[  164.456668] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/1255
[  164.472259] caller is ip6_pol_route+0xe88/0x4a80
[  164.477571] CPU: 10 PID: 1255 Comm: sshd Not tainted 4.14.0-rc3-debug-00960-g58e545033ed7 #312
[  164.487350] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
[  164.495884] Call Trace:
[  164.498771]  dump_stack+0xb3/0x140
[  164.502720]  ? _atomic_dec_and_lock+0x2b0/0x2b0
[  164.507909]  ? __lock_is_held+0xcf/0x1a0
[  164.512492]  check_preemption_disabled+0x1a0/0x1d0
[  164.517974]  ip6_pol_route+0xe88/0x4a80
[  164.522402]  ? ip6_rt_update_pmtu+0xc0/0xc0
[  164.527205]  ? unwind_get_return_address+0x5f/0xc0
[  164.532685]  ? print_usage_bug+0x5a0/0x5a0
[  164.537399]  ? lock_downgrade+0x740/0x740
[  164.542000]  ? deref_stack_reg+0x75/0x230
[  164.546605]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
[  164.553158]  ? print_usage_bug+0x5a0/0x5a0
[  164.557859]  ? bpf_prog_alloc+0x330/0x330
[  164.562467]  ? lock_downgrade+0x740/0x740
[  164.567076]  ? __mod_tree_remove+0x40/0x40
[  164.571774]  ? deref_stack_reg+0x75/0x230
[  164.576414]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
[  164.582964]  ? deref_stack_reg+0x230/0x230
[  164.587676]  ? __free_insn_slot+0x970/0x970
[  164.592507]  ? print_usage_bug+0x5a0/0x5a0
[  164.597203]  ? deref_stack_reg+0x75/0x230
[  164.601804]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
[  164.608412]  ? show_rcu_gp_kthreads+0x160/0x160
[  164.613601]  ? is_bpf_text_address+0xa4/0x130
[  164.618590]  ? bpf_prog_alloc+0x330/0x330
[  164.623194]  ? kernel_text_address+0xec/0x120
[  164.628200]  ? cyc2ns_read_end+0x20/0x20
[  164.632705]  ? __mod_tree_remove+0x40/0x40
[  164.637403]  ? deref_stack_reg+0x75/0x230
[  164.642006]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
[  164.648552]  ? deref_stack_reg+0x75/0x230
[  164.653165]  ? sched_clock_cpu+0x18/0x290
[  164.657768]  ? lock_downgrade+0x740/0x740
[  164.662383]  ? bpf_prog_alloc+0x330/0x330
[  164.666978]  ? lock_downgrade+0x740/0x740
[  164.671581]  ? debug_check_no_locks_freed+0x2f0/0x2f0
[  164.677425]  ? deref_stack_reg+0x75/0x230
[  164.677432]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
[  164.677438]  ? deref_stack_reg+0x230/0x230
[  164.677445]  ? deref_stack_reg+0x230/0x230
[  164.677458]  ? rcu_read_lock_held+0xc0/0xc0
[  164.677466]  ? __free_insn_slot+0x970/0x970
[  164.677476]  ? rcu_is_watching+0x59/0x1f0
[  164.677483]  ? show_rcu_gp_kthreads+0x160/0x160
[  164.677496]  ? is_bpf_text_address+0xa4/0x130
[  164.677504]  ? kernel_text_address+0xec/0x120
[  164.677509]  ? __kernel_text_address+0xe/0x30
[  164.677516]  ? __kernel_text_address+0xe/0x30
[  164.677523]  ? unwind_get_return_address+0x5f/0xc0
[  164.677532]  ? __save_stack_trace+0x83/0x120
[  164.677549]  ? __lock_acquire+0x1fcf/0x4440
[  164.677557]  ? save_trace+0x350/0x350
[  164.677566]  ? cyc2ns_read_end+0x20/0x20
[  164.677580]  ? debug_check_no_locks_freed+0x2f0/0x2f0
[  164.677589]  ? ip6_pol_route_input+0x90/0x90
[  164.677595]  fib6_rule_lookup+0x10a/0x750
[  164.677604]  ? fib6_rules_seq_read+0x10/0x10
[  164.677623]  ? lock_downgrade+0x740/0x740
[  164.677631]  ? ip6_route_output_flags+0x1ff/0x310
[  164.677642]  ip6_dst_lookup_tail+0x4ce/0xd50
[  164.677671]  ip6_dst_lookup_flow+0xce/0x2a0
[  164.677679]  ? ip6_dst_lookup+0x90/0x90
[  164.677684]  ? __lock_is_held+0xcf/0x1a0
[  164.677690]  ? show_rcu_gp_kthreads+0x160/0x160
[  164.677704]  ip6_datagram_dst_update+0x728/0x1b00
[  164.677711]  ? cyc2ns_read_end+0x20/0x20
[  164.677721]  ? ip6_datagram_send_ctl+0x1400/0x1400
[  164.677733]  ? print_usage_bug+0x5a0/0x5a0
[  164.677741]  ? sched_clock_cpu+0x18/0x290
[  164.677759]  ? lock_downgrade+0x740/0x740
[  164.677765]  ? get_lock_stats+0x18/0x160
[  164.677788]  ? __ip6_datagram_connect+0x51b/0x1410
[  164.677793]  __ip6_datagram_connect+0x51b/0x1410
[  164.677803]  ? ip6_datagram_release_cb+0x5f0/0x5f0
[  164.677809]  ? mark_held_locks+0xce/0x170
[  164.677817]  ? __local_bh_enable_ip+0x5f/0xa0
[  164.677830]  ip6_datagram_connect+0x27/0x40
[  164.677839]  SYSC_connect+0x1e8/0x4a0
[  164.677847]  ? SYSC_bind+0x3e0/0x3e0
[  164.677856]  ? get_unused_fd_flags+0x133/0x210
[  164.677863]  ? trace_raw_output_sys_exit+0x110/0x110
[  164.677874]  ? __close_fd+0x242/0x400
[  164.677883]  ? sock_map_fd+0x3d/0x60
[  164.677893]  ? entry_SYSCALL_64_fastpath+0x5/0xb1
[  164.677901]  ? trace_hardirqs_on_caller+0x3cb/0x680
[  164.677908]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[  164.677919]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  164.677926] RIP: 0033:0x7f44264eb160
[  164.677931] RSP: 002b:7ffe6e705f88 EFLAGS: 0246 ORIG_RAX: 
002a
[  164.677939] RAX: ffda RBX: 0046 RCX: 7f44264eb160
[  164.677943] RDX: 001c RSI: 55de0fedc2b0 RDI: 0003
[  164.677948] RBP: 7ffe6e7060e0 R08: 017f

RE: [PATCH net v1 2/2] net: fec: Let fec_ptp have its own interrupt routine

2017-10-08 Thread Andy Duan

From: Troy Kisky  Sent: Tuesday, October 03, 2017 8:05 AM
>To: shawn@linaro.org; Andy Duan ;
>netdev@vger.kernel.org; da...@davemloft.net
>Cc: Fabio Estevam ; lzn...@gmail.com; Troy Kisky
>
>Subject: [PATCH net v1 2/2] net: fec: Let fec_ptp have its own interrupt
>routine
>
>This is better for code locality and should slightly speed up normal 
>interrupts.
>
>This also allows PPS clock output to start working for i.mx7. This is because
>i.mx7 was already using the limit of 3 interrupts, and needed another.
>
>Signed-off-by: Troy Kisky 
>
>---
>If this patch is taken before the corresponding dtb file patch then time
>stamping will be momentarily broken.
>---

The two patches for PPS seems 

> drivers/net/ethernet/freescale/fec.h  |  1 -
> drivers/net/ethernet/freescale/fec_main.c | 15 +++---
>drivers/net/ethernet/freescale/fec_ptp.c  | 77 ++
>-
> 3 files changed, 52 insertions(+), 41 deletions(-)
>
>diff --git a/drivers/net/ethernet/freescale/fec.h
>b/drivers/net/ethernet/freescale/fec.h
>index ede1876a9a19..782509041102 100644
>--- a/drivers/net/ethernet/freescale/fec.h
>+++ b/drivers/net/ethernet/freescale/fec.h
>@@ -587,7 +587,6 @@ void fec_ptp_stop(struct platform_device *pdev);
> void fec_ptp_start_cyclecounter(struct net_device *ndev);
> int fec_ptp_set(struct net_device *ndev, struct ifreq *ifr);
> int fec_ptp_get(struct net_device *ndev, struct ifreq *ifr);
>-uint fec_ptp_check_pps_event(struct fec_enet_private *fep);
>
>
>/**
>**/
> #endif /* FEC_H */
>diff --git a/drivers/net/ethernet/freescale/fec_main.c
>b/drivers/net/ethernet/freescale/fec_main.c
>index 3dc2d771a222..80fe04165ba0 100644
>--- a/drivers/net/ethernet/freescale/fec_main.c
>+++ b/drivers/net/ethernet/freescale/fec_main.c
>@@ -1602,10 +1602,6 @@ fec_enet_interrupt(int irq, void *dev_id)
>   ret = IRQ_HANDLED;
>   complete(&fep->mdio_done);
>   }
>-
>-  if (fep->ptp_clock)
>-  if (fec_ptp_check_pps_event(fep))
>-  ret = IRQ_HANDLED;
>   return ret;
> }
>
>@@ -3325,6 +3321,7 @@ fec_probe(struct platform_device *pdev)
>   struct device_node *np = pdev->dev.of_node, *phy_node;
>   int num_tx_qs;
>   int num_rx_qs;
>+  int ptp_irq, j;
>
>   fec_enet_get_queue_num(pdev, &num_tx_qs, &num_rx_qs);
>
>@@ -3472,20 +3469,24 @@ fec_probe(struct platform_device *pdev)
>   if (ret)
>   goto failed_init;
>
>-  for (i = 0; i < FEC_IRQ_NUM; i++) {
>-  irq = platform_get_irq(pdev, i);
>+  ptp_irq = platform_get_irq_byname(pdev, "ptp");

Here the interrupt-names property becomes a required property, so the binding
doc should be updated.
The return value also needs to be checked.

>+  i = j = 0;
>+  while (i < FEC_IRQ_NUM) {
>+  irq = platform_get_irq(pdev, j++);
>   if (irq < 0) {
>   if (i)
>   break;
>   ret = irq;
>   goto failed_irq;
>   }
>+  if (irq == ptp_irq)
>+  continue;
>   ret = devm_request_irq(&pdev->dev, irq, fec_enet_interrupt,
>  0, pdev->name, ndev);
>   if (ret)
>   goto failed_irq;
>
>-  fep->irq[i] = irq;
>+  fep->irq[i++] = irq;

I do not think it is necessary to split out the pps interrupt handler.
To support the pps interrupt, just change the maximum interrupt number
FEC_IRQ_NUM to 4, which minimizes the code change.

>   }
>
>   init_completion(&fep->mdio_done);
>diff --git a/drivers/net/ethernet/freescale/fec_ptp.c
>b/drivers/net/ethernet/freescale/fec_ptp.c
>index 6ebad3fac81d..20e01a3ca453 100644
>--- a/drivers/net/ethernet/freescale/fec_ptp.c
>+++ b/drivers/net/ethernet/freescale/fec_ptp.c
>@@ -549,6 +549,38 @@ static void fec_time_keep(struct work_struct *work)
>   schedule_delayed_work(&fep->time_keep, HZ);
> }
>
>+/*
>+ * This function checks the pps event and reloads the timer compare counter.
>+ */
>+static irqreturn_t fec_ptp_interrupt(int irq, void *dev_id)
>+{
>+  struct net_device *ndev = dev_id;
>+  struct fec_enet_private *fep = netdev_priv(ndev);
>+  u32 val;
>+  u8 channel = fep->pps_channel;
>+  struct ptp_clock_event event;
>+
>+  val = readl(fep->hwp + FEC_TCSR(channel));
>+  if (val & FEC_T_TF_MASK) {
>+  /* Write the next next compare(not the next according the
>spec)
>+   * value to the register
>+   */
>+  writel(fep->next_counter, fep->hwp + FEC_TCCR(channel));
>+  do {
>+  writel(val, fep->hwp + FEC_TCSR(channel));
>+  } while (readl(fep->hwp + FEC_TCSR(channel)) &
>FEC_T_TF_MASK);
>+
>+
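
The probe loop discussed in this review, which walks the available
interrupts and leaves out the one reserved for PTP, can be modelled in
userspace as below. This is a simplified sketch under assumptions:
toy_collect_irqs() and TOY_IRQ_NUM are hypothetical, the array stands in for
successive platform_get_irq() results, and the error path taken in the patch
when no interrupt at all is found is not modelled.

```c
#include <stddef.h>

#define TOY_IRQ_NUM 3	/* mirrors the limit of 3 mentioned in the patch */

/* Copy up to TOY_IRQ_NUM entries from @avail into @out, skipping @ptp_irq.
 * A negative entry in @avail ends the walk, much like a failing IRQ lookup
 * ends the loop in the patch. Returns the number of entries stored.
 */
static size_t toy_collect_irqs(const int *avail, size_t n_avail, int ptp_irq,
			       int out[TOY_IRQ_NUM])
{
	size_t i = 0, j = 0;

	while (i < TOY_IRQ_NUM && j < n_avail) {
		int irq = avail[j++];

		if (irq < 0)
			break;
		if (irq == ptp_irq)
			continue;	/* left for a dedicated handler */
		out[i++] = irq;
	}
	return i;
}
```

Because the negative check comes before the comparison, a failed PTP lookup
(a negative ptp_irq) never matches a valid interrupt in this sketch, so all
entries are then treated as ordinary interrupts up to the limit.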

Re: BUG: using smp_processor_id() in preemptible [00000000]

2017-10-08 Thread Eric Dumazet

On Sun, Oct 8, 2017 at 8:43 PM, Jakub Kicinski  wrote:
> Hi!
>
> This appeared in net-next over the weekend - looks likely related to
> this set...
>
> [  164.456668] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/1255
> [  164.472259] caller is ip6_pol_route+0xe88/0x4a80
> [  164.477571] CPU: 10 PID: 1255 Comm: sshd Not tainted 4.14.0-rc3-debug-00960-g58e545033ed7 #312
> [  164.487350] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
> [  164.495884] Call Trace:
> [  164.498771]  dump_stack+0xb3/0x140
> [  164.502720]  ? _atomic_dec_and_lock+0x2b0/0x2b0
> [  164.507909]  ? __lock_is_held+0xcf/0x1a0
> [  164.512492]  check_preemption_disabled+0x1a0/0x1d0
> [  164.517974]  ip6_pol_route+0xe88/0x4a80
> [  164.522402]  ? ip6_rt_update_pmtu+0xc0/0xc0
> [  164.527205]  ? unwind_get_return_address+0x5f/0xc0
> [  164.532685]  ? print_usage_bug+0x5a0/0x5a0
> [  164.537399]  ? lock_downgrade+0x740/0x740
> [  164.542000]  ? deref_stack_reg+0x75/0x230
> [  164.546605]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
> [  164.553158]  ? print_usage_bug+0x5a0/0x5a0
> [  164.557859]  ? bpf_prog_alloc+0x330/0x330
> [  164.562467]  ? lock_downgrade+0x740/0x740
> [  164.567076]  ? __mod_tree_remove+0x40/0x40
> [  164.571774]  ? deref_stack_reg+0x75/0x230
> [  164.576414]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
> [  164.582964]  ? deref_stack_reg+0x230/0x230
> [  164.587676]  ? __free_insn_slot+0x970/0x970
> [  164.592507]  ? print_usage_bug+0x5a0/0x5a0
> [  164.597203]  ? deref_stack_reg+0x75/0x230
> [  164.601804]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
> [  164.608412]  ? show_rcu_gp_kthreads+0x160/0x160
> [  164.613601]  ? is_bpf_text_address+0xa4/0x130
> [  164.618590]  ? bpf_prog_alloc+0x330/0x330
> [  164.623194]  ? kernel_text_address+0xec/0x120
> [  164.628200]  ? cyc2ns_read_end+0x20/0x20
> [  164.632705]  ? __mod_tree_remove+0x40/0x40
> [  164.637403]  ? deref_stack_reg+0x75/0x230
> [  164.642006]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
> [  164.648552]  ? deref_stack_reg+0x75/0x230
> [  164.653165]  ? sched_clock_cpu+0x18/0x290
> [  164.657768]  ? lock_downgrade+0x740/0x740
> [  164.662383]  ? bpf_prog_alloc+0x330/0x330
> [  164.666978]  ? lock_downgrade+0x740/0x740
> [  164.671581]  ? debug_check_no_locks_freed+0x2f0/0x2f0
> [  164.677425]  ? deref_stack_reg+0x75/0x230
> [  164.677432]  ? __read_once_size_nocheck.constprop.3+0x50/0x50
> [  164.677438]  ? deref_stack_reg+0x230/0x230
> [  164.677445]  ? deref_stack_reg+0x230/0x230
> [  164.677458]  ? rcu_read_lock_held+0xc0/0xc0
> [  164.677466]  ? __free_insn_slot+0x970/0x970
> [  164.677476]  ? rcu_is_watching+0x59/0x1f0
> [  164.677483]  ? show_rcu_gp_kthreads+0x160/0x160
> [  164.677496]  ? is_bpf_text_address+0xa4/0x130
> [  164.677504]  ? kernel_text_address+0xec/0x120
> [  164.677509]  ? __kernel_text_address+0xe/0x30
> [  164.677516]  ? __kernel_text_address+0xe/0x30
> [  164.677523]  ? unwind_get_return_address+0x5f/0xc0
> [  164.677532]  ? __save_stack_trace+0x83/0x120
> [  164.677549]  ? __lock_acquire+0x1fcf/0x4440
> [  164.677557]  ? save_trace+0x350/0x350
> [  164.677566]  ? cyc2ns_read_end+0x20/0x20
> [  164.677580]  ? debug_check_no_locks_freed+0x2f0/0x2f0
> [  164.677589]  ? ip6_pol_route_input+0x90/0x90
> [  164.677595]  fib6_rule_lookup+0x10a/0x750
> [  164.677604]  ? fib6_rules_seq_read+0x10/0x10
> [  164.677623]  ? lock_downgrade+0x740/0x740
> [  164.677631]  ? ip6_route_output_flags+0x1ff/0x310
> [  164.677642]  ip6_dst_lookup_tail+0x4ce/0xd50
> [  164.677671]  ip6_dst_lookup_flow+0xce/0x2a0
> [  164.677679]  ? ip6_dst_lookup+0x90/0x90
> [  164.677684]  ? __lock_is_held+0xcf/0x1a0
> [  164.677690]  ? show_rcu_gp_kthreads+0x160/0x160
> [  164.677704]  ip6_datagram_dst_update+0x728/0x1b00
> [  164.677711]  ? cyc2ns_read_end+0x20/0x20
> [  164.677721]  ? ip6_datagram_send_ctl+0x1400/0x1400
> [  164.677733]  ? print_usage_bug+0x5a0/0x5a0
> [  164.677741]  ? sched_clock_cpu+0x18/0x290
> [  164.677759]  ? lock_downgrade+0x740/0x740
> [  164.677765]  ? get_lock_stats+0x18/0x160
> [  164.677788]  ? __ip6_datagram_connect+0x51b/0x1410
> [  164.677793]  __ip6_datagram_connect+0x51b/0x1410
> [  164.677803]  ? ip6_datagram_release_cb+0x5f0/0x5f0
> [  164.677809]  ? mark_held_locks+0xce/0x170
> [  164.677817]  ? __local_bh_enable_ip+0x5f/0xa0
> [  164.677830]  ip6_datagram_connect+0x27/0x40
> [  164.677839]  SYSC_connect+0x1e8/0x4a0
> [  164.677847]  ? SYSC_bind+0x3e0/0x3e0
> [  164.677856]  ? get_unused_fd_flags+0x133/0x210
> [  164.677863]  ? trace_raw_output_sys_exit+0x110/0x110
> [  164.677874]  ? __close_fd+0x242/0x400
> [  164.677883]  ? sock_map_fd+0x3d/0x60
> [  164.677893]  ? entry_SYSCALL_64_fastpath+0x5/0xb1
> [  164.677901]  ? trace_hardirqs_on_caller+0x3cb/0x680
> [  164.677908]  ? trace_hardirqs_on_thunk+0x1a/0x1c
> [  164.677919]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  164.677926] RIP: 0033:0x7f44264eb160
> [  164.677931] RSP:

[PATCH net-next 00/15] nfp: bpf ABIv2 and multi port

2017-10-08 Thread Jakub Kicinski

Hi!

This series migrates our eBPF offload from old PoC firmware to 
a redesigned, faster and more feature rich FW.  Marking support
is dropped for now.  We have to teach the JIT about encoding
local memory accesses (one of NFP memory types).  There is also
code to populate the ECC of instructions (PoC had ECC protection
on instruction store disabled).  There is also a minor ld_field
fix and all 64 bit shifts can now be encoded.


Jakub Kicinski (15):
  nfp: output control messages to trace_devlink_hwmsg()
  nfp: bpf: lift the single-port limitation
  nfp: bpf: use the power of sparse to check we encode registers right
  nfp: bpf: move software reg helpers and cmd table out of translator
  nfp: bpf: encode all 64bit shifts
  nfp: bpf: remove register rename
  nfp: bpf: remove packet marking support
  nfp: add more white space to the instruction defines
  nfp: bpf: encode LMEM accesses
  nfp: bpf: encode extended LM pointer operands
  nfp: bpf: move to datapath ABI version 2
  nfp: bpf: calculate code store ECC
  nfp: bpf: pad code with valid nops
  nfp: bpf: byte swap the instructions
  nfp: bpf: pass dst register to ld_field instruction

 drivers/net/ethernet/netronome/nfp/Makefile   |   1 +
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  | 403 --
 drivers/net/ethernet/netronome/nfp/bpf/main.c |   8 -
 drivers/net/ethernet/netronome/nfp/bpf/main.h |  49 +--
 drivers/net/ethernet/netronome/nfp/nfp_app.h  |   8 +
 drivers/net/ethernet/netronome/nfp/nfp_asm.c  | 254 ++
 drivers/net/ethernet/netronome/nfp/nfp_asm.h  | 293 +++-
 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h |   2 +-
 8 files changed, 640 insertions(+), 378 deletions(-)
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_asm.c

-- 
2.14.1

Re: [PATCH] net: dsa: mv88e6xxx: rework in-chip bridging

2017-10-08 Thread Florian Fainelli

Le 10/08/17 à 20:23, Greg Ungerer a écrit :
> Hi Florian,
> 
> On 07/10/17 13:04, Florian Fainelli wrote:
>> Le 10/03/17 à 23:20, Greg Ungerer a écrit :
>>> On Wed, Mar 29, 2017 at 04:30:16PM -0400, Vivien Didelot wrote:
>>>> All ports -- internal and external, for chips featuring a PVT -- have a
>>>> mask restricting to which internal ports a frame is allowed to egress.
>>>>
>>>> Now that DSA exposes the number of ports and their bridge devices, it is
>>>> possible to extract the code generating the VLAN map and make it generic
>>>> so that it can be shared later with the cross-chip bridging code.
>>>
>>> This patch changes the behavior of interfaces on startup if they are
>>> not part of a bridge.
>>>
>>> I have a board with a Marvell 6350 switch with a device tree that sets
>>> up the 5 ports as lan1, lan2, lan3, lan4, wan. With kernels before
>>> this patch (so linux-4.12 and older) after system startup I could do:
>>>
>>>   ifconfig lan1 192.168.0.1
>>>
>>> And then ping out that interface with no problems.
>>>
>>> After this patch is applied (affects linux-4.13 and newer) then the
>>> ping fails:
>>>
>>>   PING 192.168.0.22 (192.168.0.22) 56(84) bytes of data.
>>>   From 192.168.0.1 icmp_seq=1 Destination Host Unreachable
>>>   From 192.168.0.1 icmp_seq=2 Destination Host Unreachable
>>>   From 192.168.0.1 icmp_seq=3 Destination Host Unreachable
>>>
>>> If I incorporate an interface into a bridge then it all works ok.
>>> So simply:
>>>
>>>   brctl addbr br0
>>>   brctl addif br0 lan1
>>>   ifconfig lan1 up
>>>   ifconfig br0 192.168.0.1
>>>
>>> Then pings out work as expected. And if I now remove that lan1
>>> interface from the bridge and use it alone again then it will
>>> now work ok:
>>>
>>>   ifconfig br0 down
>>>   brctl delif br0 lan1
>>>   ifconfig lan1 192.168.0.1
>>>
>>> And that now pings ok.
>>>
>>> I fixed this with the attached patch. It is probably not the correct
>>> approach, but it does restore the older behavior.
>>>
>>> What do you think?
>>
>> This is strange, the dsa_switch_tree and its associated dsa_switch
>> instances should be fully setup by the time ops->setup() is running in
>> your driver but your patch suggests this may not be happening?
> 
> That is what I am seeing, yep.
> 
> 
>> Are you using the new style Device Tree binding or the old style Device
>> Tree binding out of curiosity?
> 
> This is my device tree fragment for the switch:
> 
> dsa@0 {
> compatible = "marvell,dsa";
> #address-cells = <2>;
> #size-cells = <0>;
> 
> dsa,ethernet = <>;
> dsa,mii-bus = <>;
> 
> switch@0 {
> #address-cells = <1>;
> #size-cells = <0>;
> reg = <0x11 0>;
> 
> port@0 {
> reg = <0>;
> label = "lan1";
> };
> port@1 {
> reg = <1>;
> label = "lan2";
> };
> port@2 {
> reg = <2>;
> label = "lan3";
> };
> port@3 {
> reg = <3>;
> label = "lan4";
> };
> port@4 {
> reg = <4>;
> label = "wan";
> };
> port@5 {
> reg = <5>;
> label = "cpu";
> };
> };
>  };
> 
> The board I am using is based around a Marvell Armada 370. This device tree
> setup looks pretty similar to the other Marvell boards using marvell,dsa.

This is the old Device Tree binding, which unfortunately goes through a
different code path while initializing all the dsa_switch_tree and
dsa_switch structures. While we should definitely look into fixing
this, would you mind trying to update your board using something similar
to this commit:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4cb2ec8cad8c82cd7cfd19edcacd846861d6e703

This would make you go through net/dsa/dsa2.c, which is what most of us
usually test. In the meantime we should probably start issuing warning
messages when people use the old Device Tree binding to encourage them
to migrate over.
-- 
Florian

Re: [PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route()

2017-10-08 Thread David Miller

From: Eric Dumazet 
Date: Sun, 08 Oct 2017 21:07:18 -0700

> From: Eric Dumazet 
> 
> Ido reported following splat and provided a patch.
> 
> [  122.221814] BUG: using smp_processor_id() in preemptible [00000000] code: sshd/2672
> [  122.221845] caller is debug_smp_processor_id+0x17/0x20
> [  122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
> [  122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
> [  122.221893] Call Trace:
> [  122.221919]  dump_stack+0xb1/0x10c
> [  122.221946]  ? _atomic_dec_and_lock+0x124/0x124
> [  122.221974]  ? ___ratelimit+0xfe/0x240
> [  122.222020]  check_preemption_disabled+0x173/0x1b0
> [  122.222060]  debug_smp_processor_id+0x17/0x20
> [  122.222083]  ip6_pol_route+0x1482/0x24a0
> ...
> 
> I believe we can simplify this code path a bit, since we no longer
> hold a read_lock and need to release it to avoid a deadlock.
> 
> By disabling BH, we make sure we'll prevent code re-entry and
> rt6_get_pcpu_route()/rt6_make_pcpu_route() run on the same cpu. 
> 
> Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
> Reported-by: Ido Schimmel 
> Signed-off-by: Eric Dumazet 
> Tested-by: Ido Schimmel 

Applied, thanks Eric.

Re: [pull request][for-next 0/9] Mellanox, mlx5 updates 2017-10-06

2017-10-08 Thread David Miller

From: Saeed Mahameed 
Date: Fri,  6 Oct 2017 16:37:40 -0700

> This series includes some shared code updates for kernel 4.15 to both
> net-next and rdma-next trees.

I've pulled this into net-next, thanks Saeed.

Re: [PATCH net-next 0/6] ipv6: ipv6_dev_get_saddr() rcu works

2017-10-08 Thread David Miller

From: Eric Dumazet 
Date: Sat,  7 Oct 2017 19:30:22 -0700

> Sending IPv6 udp packets on non connected sockets is quite slow,
> because ipv6_dev_get_saddr() is still using an rwlock and silly
> references games on ifa.
> 
> Tested:
> 
> $ ./super_netperf 16 -H ::555:0786 -l 2000 -t UDP_STREAM -- -m 100 &
> [1] 12527
> 
> Performance is boosted from 2.02 Mpps to 4.28 Mpps

Awesome, series applied, thanks!

Re: [PATCH] qed: Delete redundant check on dcb_app priority

2017-10-08 Thread David Miller

From: Christos Gkekas 
Date: Sun,  8 Oct 2017 23:46:47 +0100

> dcb_app priority is unsigned thus checking whether it is less than zero
> is redundant.
> 
> Signed-off-by: Christos Gkekas 

Applied to net-next.

Re: [PATCH net-next 0/2] net: defer cgroups init to accept()

2017-10-08 Thread Eric Dumazet

On Sun, Oct 8, 2017 at 9:44 PM, Eric Dumazet  wrote:
> After TCP 3WHS became lockless, we should not attempt cgroup games
> from sk_clone_lock() since listener/cgroup might be already gone.
>
> Move this business to inet_csk_accept() where we have
> the guarantee both parent and child exist.
>
> Many thanks to John Sperbeck for spotting these issues
>
> Eric Dumazet (2):
>   net: memcontrol: defer call to mem_cgroup_sk_alloc()
>   net: defer call to cgroup_sk_alloc()

This was based on net tree, but I used the wrong script, and thus this
has the [PATCH net-next] tag.

Sorry for the confusion, but I guess this also can be applied to
net-next since this is not a recent regression.

Re: netlink backwards compatibility in userspace tools

2017-10-08 Thread David Miller

From: "Jason A. Donenfeld" 
Date: Fri, 29 Sep 2017 12:22:42 +0200

> One handy aspect of Netlink is that it's backwards compatible. This
> means that you can run old userspace utilities on new kernels, even if
> the new kernel supports new features and netlink attributes. The wire
> format is stable enough that the data marshaled can be extended
> without breaking compat. Neat.
> 
> I was wondering, though, what you think the best stance is toward
> these old userspace utilities. What should they do if the kernel sends
> it netlink attributes that it does not recognize? At the moment, I'm
> doing something like this:
> 
> static void warn_unrecognized(void)
> {
> static bool once = false;
> if (once)
> return;
> once = true;
> fprintf(stderr,
> "Warning: this program received from your kernel one or more\n"
> "attributes that it did not recognize. It is possible that\n"
> "this version of wg(8) is older than your kernel. You may\n"
> "want to update this program.\n");
> }
> 
> This seems like a somewhat sensible warning, but then I wonder about
> distributions like Debian, which has a long stable life cycle, so it
> frequently has very old tools (ancient iproute2 for example). Then,
> VPS providers have these Debian images run on top of newer kernels.
> People in this situation would undoubtedly see the above warning a lot
> and not be able to do anything about it. Not horrible, but a bit
> annoying. Is this an okay annoyance? Or is it advised to just have no
> warning at all? One idea would be to put it behind an environment
> variable flag, but I don't like too many knobs.
> 
> I'm generally wondering about attitudes toward this kind of userspace
> program behavior in response to newer kernels.

Generally, yes you should simply ignore attributes you don't understand.

But we keep coming back to this issue, because it's not always the best
thing to do.

For example, let's say you have settings X and Y for object A.

User A has a newer tool and is able to set both X and Y, as well as
see them in dumps.  And let's further assume that Y's setting has some
kind of influence on the behavior of X.

User B has an older tool, and sees X but not Y because Y is not
understood by the older tool.  User B will not be able to figure out
why X is not behaving the way they expect it to, because of the loss
of information.

Similar, even more serious, issues arise when setting values.  User B
can set X and wonder why it's not doing what they expect it to do
because of setting Y which they can't even see with their tools.

For this reason it might be beneficial to at least say to the user
"Warning, I've seen one or more unrecognized netlink attributes."
so that there is at least a chance for the user to figure out what
might be happening to them.
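
A minimal sketch of that suggestion, not taken from any existing tool: the
parser walks a buffer of type/length/value attributes, counts the types it
does not know, and prints the warning once per dump. The header layout only
mimics struct nlattr, and the names attr_hdr, ATTR_X, ATTR_MAX_KNOWN,
count_unknown_attrs() and warn_if_unknown() are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical TLV header laid out like struct nlattr; len includes the header. */
struct attr_hdr {
	uint16_t len;
	uint16_t type;
};

/* Attributes are padded to 4 bytes, as netlink attributes are. */
#define ATTR_ALIGN(n)	(((n) + 3u) & ~3u)

/* Assumption: this (older) tool only understands setting X. */
enum { ATTR_X = 1, ATTR_MAX_KNOWN = ATTR_X };

/* Count attributes whose type is newer than anything this tool knows. */
static size_t count_unknown_attrs(const unsigned char *buf, size_t len)
{
	size_t off = 0, unknown = 0;

	while (off <= len && len - off >= sizeof(struct attr_hdr)) {
		struct attr_hdr hdr;

		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.len < sizeof(hdr) || hdr.len > len - off)
			break;	/* malformed attribute, stop parsing */
		if (hdr.type > ATTR_MAX_KNOWN)
			unknown++;
		off += ATTR_ALIGN(hdr.len);
	}

	return unknown;
}

/* Warn once per dump instead of silently dropping information. */
static int warn_if_unknown(const unsigned char *buf, size_t len)
{
	if (!count_unknown_attrs(buf, len))
		return 0;

	fprintf(stderr,
		"Warning, I've seen one or more unrecognized netlink attributes.\n");
	return 1;
}
```

With a dump that carries X (type 1) and Y (type 2), the older tool would
still display X but would also tell User B that something was left out.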

[PATCH net-next 06/15] nfp: bpf: remove register rename

2017-10-08 Thread Jakub Kicinski

Remove the register renumbering optimization.  To implement calling
map and other helpers we need more strict register layout.  We can't
freely reassign register numbers.

This will have the effect of running in 4 context/thread mode, which
should be OK since we are moving towards integrating the BPF closer
with FW app datapath anyway, and the target datapath itself runs in
4 context mode.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 46 ++--
 1 file changed, 3 insertions(+), 43 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 095cf50e8450..469dc8a055f2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -1621,38 +1621,6 @@ static void nfp_bpf_opt_reg_init(struct nfp_prog 
*nfp_prog)
}
 }
 
-/* Try to rename registers so that program uses only low ones */
-static int nfp_bpf_opt_reg_rename(struct nfp_prog *nfp_prog)
-{
-   bool reg_used[MAX_BPF_REG] = {};
-   u8 tgt_reg[MAX_BPF_REG] = {};
-   struct nfp_insn_meta *meta;
-   unsigned int i, j;
-
-   list_for_each_entry(meta, &nfp_prog->insns, l) {
-   if (meta->skip)
-   continue;
-
-   reg_used[meta->insn.src_reg] = true;
-   reg_used[meta->insn.dst_reg] = true;
-   }
-
-   for (i = 0, j = 0; i < ARRAY_SIZE(tgt_reg); i++) {
-   if (!reg_used[i])
-   continue;
-
-   tgt_reg[i] = j++;
-   }
-   nfp_prog->num_regs = j;
-
-   list_for_each_entry(meta, &nfp_prog->insns, l) {
-   meta->insn.src_reg = tgt_reg[meta->insn.src_reg];
-   meta->insn.dst_reg = tgt_reg[meta->insn.dst_reg];
-   }
-
-   return 0;
-}
-
 /* Remove masking after load since our load guarantees this is not needed */
 static void nfp_bpf_opt_ld_mask(struct nfp_prog *nfp_prog)
 {
@@ -1729,14 +1697,8 @@ static void nfp_bpf_opt_ld_shift(struct nfp_prog 
*nfp_prog)
 
 static int nfp_bpf_optimize(struct nfp_prog *nfp_prog)
 {
-   int ret;
-
nfp_bpf_opt_reg_init(nfp_prog);
 
-   ret = nfp_bpf_opt_reg_rename(nfp_prog);
-   if (ret)
-   return ret;
-
nfp_bpf_opt_ld_mask(nfp_prog);
nfp_bpf_opt_ld_shift(nfp_prog);
 
@@ -1783,10 +1745,8 @@ nfp_bpf_jit(struct bpf_prog *filter, void *prog_mem,
if (ret)
goto out;
 
-   if (nfp_prog->num_regs <= 7)
-   nfp_prog->regs_per_thread = 16;
-   else
-   nfp_prog->regs_per_thread = 32;
+   nfp_prog->num_regs = MAX_BPF_REG;
+   nfp_prog->regs_per_thread = 32;
 
nfp_prog->prog = prog_mem;
nfp_prog->__prog_alloc_len = prog_sz;
@@ -1799,7 +1759,7 @@ nfp_bpf_jit(struct bpf_prog *filter, void *prog_mem,
}
 
res->n_instr = nfp_prog->prog_len;
-   res->dense_mode = nfp_prog->num_regs <= 7;
+   res->dense_mode = false;
 out:
nfp_prog_free(nfp_prog);
 
-- 
2.14.1

[PATCH net-next 10/15] nfp: bpf: encode extended LM pointer operands

2017-10-08 Thread Jakub Kicinski

Most instructions have special fields which allow switching
between base and extended Local Memory pointers.  Introduce
those to register encoding, we will use the extra LM pointers
to access high addresses of the stack.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 56 +++-
 drivers/net/ethernet/netronome/nfp/nfp_asm.c |  6 +++
 drivers/net/ethernet/netronome/nfp/nfp_asm.h | 28 --
 3 files changed, 70 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 4fa220f710d2..d7dc19feba8d 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -153,6 +153,11 @@ emit_cmd(struct nfp_prog *nfp_prog, enum cmd_tgt_map op,
nfp_prog->error = -EFAULT;
return;
}
+   if (reg.dst_lmextn || reg.src_lmextn) {
+   pr_err("cmd can't use LMextn\n");
+   nfp_prog->error = -EFAULT;
+   return;
+   }
 
__emit_cmd(nfp_prog, op, mode, xfer, reg.areg, reg.breg, size, sync);
 }
@@ -198,7 +203,7 @@ emit_br(struct nfp_prog *nfp_prog, enum br_mask mask, u16 
addr, u8 defer)
 
 static void
 __emit_br_byte(struct nfp_prog *nfp_prog, u8 areg, u8 breg, bool imm8,
-  u8 byte, bool equal, u16 addr, u8 defer)
+  u8 byte, bool equal, u16 addr, u8 defer, bool src_lmextn)
 {
u16 addr_lo, addr_hi;
u64 insn;
@@ -214,32 +219,34 @@ __emit_br_byte(struct nfp_prog *nfp_prog, u8 areg, u8 
breg, bool imm8,
FIELD_PREP(OP_BB_EQ, equal) |
FIELD_PREP(OP_BB_DEFBR, defer) |
FIELD_PREP(OP_BB_ADDR_LO, addr_lo) |
-   FIELD_PREP(OP_BB_ADDR_HI, addr_hi);
+   FIELD_PREP(OP_BB_ADDR_HI, addr_hi) |
+   FIELD_PREP(OP_BB_SRC_LMEXTN, src_lmextn);
 
nfp_prog_push(nfp_prog, insn);
 }
 
 static void
 emit_br_byte_neq(struct nfp_prog *nfp_prog,
-swreg dst, u8 imm, u8 byte, u16 addr, u8 defer)
+swreg src, u8 imm, u8 byte, u16 addr, u8 defer)
 {
struct nfp_insn_re_regs reg;
int err;
 
-   err = swreg_to_restricted(reg_none(), dst, reg_imm(imm), &reg, true);
+   err = swreg_to_restricted(reg_none(), src, reg_imm(imm), &reg, true);
if (err) {
nfp_prog->error = err;
return;
}
 
__emit_br_byte(nfp_prog, reg.areg, reg.breg, reg.i8, byte, false, addr,
-  defer);
+  defer, reg.src_lmextn);
 }
 
 static void
 __emit_immed(struct nfp_prog *nfp_prog, u16 areg, u16 breg, u16 imm_hi,
 enum immed_width width, bool invert,
-enum immed_shift shift, bool wr_both)
+enum immed_shift shift, bool wr_both,
+bool dst_lmextn, bool src_lmextn)
 {
u64 insn;
 
@@ -250,7 +257,9 @@ __emit_immed(struct nfp_prog *nfp_prog, u16 areg, u16 breg, 
u16 imm_hi,
FIELD_PREP(OP_IMMED_WIDTH, width) |
FIELD_PREP(OP_IMMED_INV, invert) |
FIELD_PREP(OP_IMMED_SHIFT, shift) |
-   FIELD_PREP(OP_IMMED_WR_AB, wr_both);
+   FIELD_PREP(OP_IMMED_WR_AB, wr_both) |
+   FIELD_PREP(OP_IMMED_SRC_LMEXTN, src_lmextn) |
+   FIELD_PREP(OP_IMMED_DST_LMEXTN, dst_lmextn);
 
nfp_prog_push(nfp_prog, insn);
 }
@@ -274,13 +283,15 @@ emit_immed(struct nfp_prog *nfp_prog, swreg dst, u16 imm,
}
 
__emit_immed(nfp_prog, reg.areg, reg.breg, imm >> 8, width,
-invert, shift, reg.wr_both);
+invert, shift, reg.wr_both,
+reg.dst_lmextn, reg.src_lmextn);
 }
 
 static void
 __emit_shf(struct nfp_prog *nfp_prog, u16 dst, enum alu_dst_ab dst_ab,
   enum shf_sc sc, u8 shift,
-  u16 areg, enum shf_op op, u16 breg, bool i8, bool sw, bool wr_both)
+  u16 areg, enum shf_op op, u16 breg, bool i8, bool sw, bool wr_both,
+  bool dst_lmextn, bool src_lmextn)
 {
u64 insn;
 
@@ -302,7 +313,9 @@ __emit_shf(struct nfp_prog *nfp_prog, u16 dst, enum 
alu_dst_ab dst_ab,
FIELD_PREP(OP_SHF_SHIFT, shift) |
FIELD_PREP(OP_SHF_OP, op) |
FIELD_PREP(OP_SHF_DST_AB, dst_ab) |
-   FIELD_PREP(OP_SHF_WR_AB, wr_both);
+   FIELD_PREP(OP_SHF_WR_AB, wr_both) |
+   FIELD_PREP(OP_SHF_SRC_LMEXTN, src_lmextn) |
+   FIELD_PREP(OP_SHF_DST_LMEXTN, dst_lmextn);
 
nfp_prog_push(nfp_prog, insn);
 }
@@ -321,12 +334,14 @@ emit_shf(struct nfp_prog *nfp_prog, swreg dst,
}
 
__emit_shf(nfp_prog, reg.dst, reg.dst_ab, sc, shift,
-  reg.areg, op, reg.breg, reg.i8, reg.swap, reg.wr_both);
+  reg.areg, op, reg.breg, reg.i8,

[PATCH net-next 11/15] nfp: bpf: move to datapath ABI version 2

2017-10-08 Thread Jakub Kicinski

Datapath ABI version 2 stores the packet information in LMEM
instead of NNRs.  We also have strict restrictions on which
GPRs we can use.  Only GPRs 0-23 are reserved for BPF.

Adjust the static register locations and "ABI" registers.
Note that packet length is packed with other info so we have
to extract it into one of the scratch registers, OTOH since
LMEM can be used in restricted operands we don't have to
extract packet pointer.
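
As a rough illustration of the length extraction, here is a userspace
sketch. It assumes, based only on the GENMASK(13, 0) used in nfp_intro()
in the diff, that the length sits in the low 14 bits of the packed word.
GENMASK32, PKT_LEN_MASK and pkt_len_extract() are hypothetical names, not
driver symbols.

```c
#include <stdint.h>

/* Local stand-in for the kernel's GENMASK(h, l), restricted to 32 bits. */
#define GENMASK32(h, l) \
	((~UINT32_C(0) >> (31 - (h))) & (~UINT32_C(0) << (l)))

/* Assumption taken from the diff: the length occupies bits 13..0. */
#define PKT_LEN_MASK	GENMASK32(13, 0)

/* Strip the other packed info and keep only the packet length. */
static uint32_t pkt_len_extract(uint32_t packed_word)
{
	return packed_word & PKT_LEN_MASK;
}
```

The JIT does the equivalent at program start: it loads the mask into the
scratch register and ANDs it with the packed LMEM word.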

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  | 19 +-
 drivers/net/ethernet/netronome/nfp/bpf/main.h | 24 ++-
 drivers/net/ethernet/netronome/nfp/nfp_net_ctrl.h |  2 +-
 3 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index d7dc19feba8d..9b6c98ccebfe 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -526,22 +526,22 @@ construct_data_ind_ld(struct nfp_prog *nfp_prog, u16 
offset,
emit_alu(nfp_prog, imm_a(nfp_prog),
 imm_a(nfp_prog), ALU_OP_ADD, reg_imm(size));
emit_alu(nfp_prog, reg_none(),
-NFP_BPF_ABI_LEN, ALU_OP_SUB, imm_a(nfp_prog));
+plen_reg(nfp_prog), ALU_OP_SUB, imm_a(nfp_prog));
wrp_br_special(nfp_prog, BR_BLO, OP_BR_GO_ABORT);
/* Load data */
emit_cmd(nfp_prog, CMD_TGT_READ8, CMD_MODE_32b, 0,
-pkt_reg(nfp_prog), imm_b(nfp_prog), sz - 1, true);
+pptr_reg(nfp_prog), imm_b(nfp_prog), sz - 1, true);
} else {
/* Check packet length */
tmp_reg = ur_load_imm_any(nfp_prog, offset + size,
  imm_a(nfp_prog));
emit_alu(nfp_prog, reg_none(),
-NFP_BPF_ABI_LEN, ALU_OP_SUB, tmp_reg);
+plen_reg(nfp_prog), ALU_OP_SUB, tmp_reg);
wrp_br_special(nfp_prog, BR_BLO, OP_BR_GO_ABORT);
/* Load data */
tmp_reg = re_load_imm_any(nfp_prog, offset, imm_b(nfp_prog));
emit_cmd(nfp_prog, CMD_TGT_READ8, CMD_MODE_32b, 0,
-pkt_reg(nfp_prog), tmp_reg, sz - 1, true);
+pptr_reg(nfp_prog), tmp_reg, sz - 1, true);
}
 
i = 0;
@@ -1024,7 +1024,7 @@ static int mem_ldx4_skb(struct nfp_prog *nfp_prog, struct 
nfp_insn_meta *meta)
 {
if (meta->insn.off == offsetof(struct sk_buff, len))
emit_alu(nfp_prog, reg_both(meta->insn.dst_reg * 2),
-reg_none(), ALU_OP_NONE, NFP_BPF_ABI_LEN);
+reg_none(), ALU_OP_NONE, plen_reg(nfp_prog));
else
return -EOPNOTSUPP;
 
@@ -1039,12 +1039,12 @@ static int mem_ldx4_xdp(struct nfp_prog *nfp_prog, 
struct nfp_insn_meta *meta)
meta->insn.off != offsetof(struct xdp_md, data_end))
return -EOPNOTSUPP;
 
-   emit_alu(nfp_prog, dst, reg_none(), ALU_OP_NONE, NFP_BPF_ABI_PKT);
+   emit_alu(nfp_prog, dst, reg_none(), ALU_OP_NONE, pptr_reg(nfp_prog));
 
if (meta->insn.off == offsetof(struct xdp_md, data))
return 0;
 
-   emit_alu(nfp_prog, dst, dst, ALU_OP_ADD, NFP_BPF_ABI_LEN);
+   emit_alu(nfp_prog, dst, dst, ALU_OP_ADD, plen_reg(nfp_prog));
 
return 0;
 }
@@ -1403,8 +1403,9 @@ static int nfp_fixup_branches(struct nfp_prog *nfp_prog)
 
 static void nfp_intro(struct nfp_prog *nfp_prog)
 {
-   emit_alu(nfp_prog, pkt_reg(nfp_prog),
-reg_none(), ALU_OP_NONE, NFP_BPF_ABI_PKT);
+   wrp_immed(nfp_prog, plen_reg(nfp_prog), GENMASK(13, 0));
+   emit_alu(nfp_prog, plen_reg(nfp_prog),
+plen_reg(nfp_prog), ALU_OP_AND, pv_len(nfp_prog));
 }
 
 static void nfp_outro_tc_legacy(struct nfp_prog *nfp_prog)
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h 
b/drivers/net/ethernet/netronome/nfp/bpf/main.h
index 7d959757a51a..b7a112acbdb7 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
@@ -54,9 +54,13 @@ enum br_special {
 };
 
 enum static_regs {
-   STATIC_REG_PKT  = 1,
-#define REG_PKT_BANK   ALU_DST_A
-   STATIC_REG_IMM  = 2, /* Bank AB */
+   STATIC_REG_IMM  = 21, /* Bank AB */
+   STATIC_REG_PKT_LEN  = 22, /* Bank B */
+};
+
+enum pkt_vec {
+   PKT_VEC_PKT_LEN = 0,
+   PKT_VEC_PKT_PTR = 2,
 };
 
 enum nfp_bpf_action_type {
@@ -66,15 +70,17 @@ enum nfp_bpf_action_type {
NN_ACT_XDP,
 };
 
-#define pkt_reg(np)reg_a((np)->regs_per_thread - STATIC_REG_PKT)
-#define imm_a(np)  reg_a((np)->regs_per_thread - STATIC_REG_IMM)
-#define imm_b(np)

[PATCH net-next 08/15] nfp: add more white space to the instruction defines

2017-10-08 Thread Jakub Kicinski

We need to add longer OP_* defines, move the values away.
Purely whitespace commit.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/nfp_asm.h | 156 +--
 1 file changed, 78 insertions(+), 78 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h 
b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
index 8e87c0676c30..63cfd07da34e 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -53,14 +53,14 @@
 #define UR_REG_IMM_encode(x) (UR_REG_IMM | (x))
 #define UR_REG_IMM_MAX  0x0ffULL
 
-#define OP_BR_BASE 0x0d80020ULL
-#define OP_BR_BASE_MASK0x0f8000c3ce0ULL
-#define OP_BR_MASK 0x01fULL
-#define OP_BR_EV_PIP   0x300ULL
-#define OP_BR_CSS  0x003c000ULL
-#define OP_BR_DEFBR0x030ULL
-#define OP_BR_ADDR_LO  0x007ffc0ULL
-#define OP_BR_ADDR_HI  0x100ULL
+#define OP_BR_BASE 0x0d80020ULL
+#define OP_BR_BASE_MASK0x0f8000c3ce0ULL
+#define OP_BR_MASK 0x01fULL
+#define OP_BR_EV_PIP   0x300ULL
+#define OP_BR_CSS  0x003c000ULL
+#define OP_BR_DEFBR0x030ULL
+#define OP_BR_ADDR_LO  0x007ffc0ULL
+#define OP_BR_ADDR_HI  0x100ULL
 
 #define nfp_is_br(_insn)   \
(((_insn) & OP_BR_BASE_MASK) == OP_BR_BASE)
@@ -83,30 +83,30 @@ enum br_ctx_signal_state {
BR_CSS_NONE = 2,
 };
 
-#define OP_BBYTE_BASE  0x0c8ULL
-#define OP_BB_A_SRC0x0ffULL
-#define OP_BB_BYTE 0x300ULL
-#define OP_BB_B_SRC0x003fc00ULL
-#define OP_BB_I8   0x004ULL
-#define OP_BB_EQ   0x008ULL
-#define OP_BB_DEFBR0x030ULL
-#define OP_BB_ADDR_LO  0x007ffc0ULL
-#define OP_BB_ADDR_HI  0x100ULL
-
-#define OP_BALU_BASE   0x0e8ULL
-#define OP_BA_A_SRC0x3ffULL
-#define OP_BA_B_SRC0x00ffc00ULL
-#define OP_BA_DEFBR0x030ULL
-#define OP_BA_ADDR_HI  0x0007fc0ULL
-
-#define OP_IMMED_A_SRC 0x3ffULL
-#define OP_IMMED_B_SRC 0x00ffc00ULL
-#define OP_IMMED_IMM   0xff0ULL
-#define OP_IMMED_WIDTH 0x0006000ULL
-#define OP_IMMED_INV   0x0008000ULL
-#define OP_IMMED_SHIFT 0x006ULL
-#define OP_IMMED_BASE  0x0f0ULL
-#define OP_IMMED_WR_AB 0x200ULL
+#define OP_BBYTE_BASE  0x0c8ULL
+#define OP_BB_A_SRC0x0ffULL
+#define OP_BB_BYTE 0x300ULL
+#define OP_BB_B_SRC0x003fc00ULL
+#define OP_BB_I8   0x004ULL
+#define OP_BB_EQ   0x008ULL
+#define OP_BB_DEFBR0x030ULL
+#define OP_BB_ADDR_LO  0x007ffc0ULL
+#define OP_BB_ADDR_HI  0x100ULL
+
+#define OP_BALU_BASE   0x0e8ULL
+#define OP_BA_A_SRC0x3ffULL
+#define OP_BA_B_SRC0x00ffc00ULL
+#define OP_BA_DEFBR0x030ULL
+#define OP_BA_ADDR_HI  0x0007fc0ULL
+
+#define OP_IMMED_A_SRC 0x3ffULL
+#define OP_IMMED_B_SRC 0x00ffc00ULL
+#define OP_IMMED_IMM   0xff0ULL
+#define OP_IMMED_WIDTH 0x0006000ULL
+#define OP_IMMED_INV   0x0008000ULL
+#define OP_IMMED_SHIFT 0x006ULL
+#define OP_IMMED_BASE  0x0f0ULL
+#define OP_IMMED_WR_AB 0x200ULL
 
 enum immed_width {
IMMED_WIDTH_ALL = 0,
@@ -120,17 +120,17 @@ enum immed_shift {
IMMED_SHIFT_2B = 2,
 };
 
-#define OP_SHF_BASE0x080ULL
-#define OP_SHF_A_SRC   0x0ffULL
-#define OP_SHF_SC  0x300ULL
-#define OP_SHF_B_SRC   0x003fc00ULL
-#define OP_SHF_I8  0x004ULL
-#define OP_SHF_SW  0x008ULL
-#define OP_SHF_DST 0xff0ULL
-#define OP_SHF_SHIFT   0x001f000ULL
-#define OP_SHF_OP  0x00eULL
-#define OP_SHF_DST_AB  0x010ULL
-#define OP_SHF_WR_AB   0x200ULL
+#define OP_SHF_BASE0x080ULL
+#define OP_SHF_A_SRC   0x0ffULL
+#define OP_SHF_SC  0x300ULL
+#define OP_SHF_B_SRC   0x003fc00ULL
+#define OP_SHF_I8  0x004ULL
+#define OP_SHF_SW  0x008ULL
+#define OP_SHF_DST 0xff0ULL
+#define OP_SHF_SHIFT   0x001f000ULL
+#define OP_SHF_OP  0x00eULL
+#define OP_SHF_DST_AB  0x010ULL
+#define OP_SHF_WR_AB   0x200ULL
 
 enum shf_op {
SHF_OP_NONE = 0,
@@ -145,14 +145,14 @@ enum shf_sc {
SHF_SC_R_DSHF = 3,
 };
 
-#define OP_ALU_A_SRC   0x3ffULL
-#define OP_ALU_B_SRC   0x00ffc00ULL
-#define OP_ALU_DST 0x0003ff0ULL
-#define OP_ALU_SW  0x0004000ULL

[PATCH net-next 03/15] nfp: bpf: use the power of sparse to check we encode registers right

2017-10-08 Thread Jakub Kicinski

Define a new __bitwise type for software representation of registers.
This will allow us to catch incorrect parameter types using sparse.

Accessors we define also allow us to return correct enum type and
therefore ensure all switches handle all register types.
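
A small standalone sketch of the pattern, for readers unfamiliar with it.
Outside of sparse (__CHECKER__ undefined) the annotations compile away, so
this builds with a normal compiler. The SK_* enum values and the sk_*
helper names are hypothetical; the type-in-bits-31..24 and
value-in-bits-7..0 layout is an assumption based on the NN_REG_TYPE and
NN_REG_VAL masks in this series.

```c
#include <stdint.h>

/* sparse understands these attributes; a normal compiler sees nothing. */
#ifdef __CHECKER__
#define __bitwise	__attribute__((bitwise))
#define __force		__attribute__((force))
#else
#define __bitwise
#define __force
#endif

/* Distinct type: sparse flags mixing it with plain integers. */
typedef uint32_t __bitwise sk_swreg;

/* Illustrative register classes, one bit each. */
enum sk_reg_type {
	SK_REG_GPR_A	= 1 << 0,
	SK_REG_GPR_B	= 1 << 1,
	SK_REG_IMM	= 1 << 4,
};

/* The only places where casts to and from the raw integer are allowed. */
static sk_swreg sk_swreg_make(enum sk_reg_type type, uint8_t value)
{
	return (__force sk_swreg)(((uint32_t)type << 24) | value);
}

/* Returning the enum lets the compiler check switch coverage. */
static enum sk_reg_type sk_swreg_type(sk_swreg reg)
{
	return (enum sk_reg_type)(((__force uint32_t)reg >> 24) & 0xff);
}

static uint8_t sk_swreg_value(sk_swreg reg)
{
	return (__force uint32_t)reg & 0xff;
}
```

Passing a bare u32 or u8 where sk_swreg is expected still compiles, but
sparse would warn about it, which is the class of mistake the patch is
meant to catch.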

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  | 99 ++-
 drivers/net/ethernet/netronome/nfp/bpf/main.h | 24 +--
 drivers/net/ethernet/netronome/nfp/nfp_asm.h  | 45 
 3 files changed, 99 insertions(+), 69 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 239dfbe8a0a1..7e8cdfb39607 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -128,11 +128,11 @@ struct nfp_insn_re_regs {
bool i8;
 };
 
-static u16 nfp_swreg_to_unreg(u32 swreg, bool is_dst)
+static u16 nfp_swreg_to_unreg(swreg reg, bool is_dst)
 {
-   u16 val = FIELD_GET(NN_REG_VAL, swreg);
+   u16 val = swreg_value(reg);
 
-   switch (FIELD_GET(NN_REG_TYPE, swreg)) {
+   switch (swreg_type(reg)) {
case NN_REG_GPR_A:
case NN_REG_GPR_B:
case NN_REG_GPR_BOTH:
@@ -149,33 +149,34 @@ static u16 nfp_swreg_to_unreg(u32 swreg, bool is_dst)
return UR_REG_IMM_encode(val);
case NN_REG_NONE:
return is_dst ? UR_REG_NO_DST : REG_NONE;
-   default:
-   pr_err("unrecognized reg encoding %08x\n", swreg);
-   return 0;
}
+
+   pr_err("unrecognized reg encoding %08x\n", reg);
+   return 0;
 }
 
 static int
-swreg_to_unrestricted(u32 dst, u32 lreg, u32 rreg, struct nfp_insn_ur_regs 
*reg)
+swreg_to_unrestricted(swreg dst, swreg lreg, swreg rreg,
+ struct nfp_insn_ur_regs *reg)
 {
memset(reg, 0, sizeof(*reg));
 
/* Decode destination */
-   if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM)
+   if (swreg_type(dst) == NN_REG_IMM)
return -EFAULT;
 
-   if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_B)
+   if (swreg_type(dst) == NN_REG_GPR_B)
reg->dst_ab = ALU_DST_B;
-   if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_BOTH)
+   if (swreg_type(dst) == NN_REG_GPR_BOTH)
reg->wr_both = true;
reg->dst = nfp_swreg_to_unreg(dst, true);
 
/* Decode source operands */
-   if (FIELD_GET(NN_REG_TYPE, lreg) == FIELD_GET(NN_REG_TYPE, rreg))
+   if (swreg_type(lreg) == swreg_type(rreg))
return -EFAULT;
 
-   if (FIELD_GET(NN_REG_TYPE, lreg) == NN_REG_GPR_B ||
-   FIELD_GET(NN_REG_TYPE, rreg) == NN_REG_GPR_A) {
+   if (swreg_type(lreg) == NN_REG_GPR_B ||
+   swreg_type(rreg) == NN_REG_GPR_A) {
reg->areg = nfp_swreg_to_unreg(rreg, false);
reg->breg = nfp_swreg_to_unreg(lreg, false);
reg->swap = true;
@@ -187,11 +188,11 @@ swreg_to_unrestricted(u32 dst, u32 lreg, u32 rreg, struct 
nfp_insn_ur_regs *reg)
return 0;
 }
 
-static u16 nfp_swreg_to_rereg(u32 swreg, bool is_dst, bool has_imm8, bool *i8)
+static u16 nfp_swreg_to_rereg(swreg reg, bool is_dst, bool has_imm8, bool *i8)
 {
-   u16 val = FIELD_GET(NN_REG_VAL, swreg);
+   u16 val = swreg_value(reg);
 
-   switch (FIELD_GET(NN_REG_TYPE, swreg)) {
+   switch (swreg_type(reg)) {
case NN_REG_GPR_A:
case NN_REG_GPR_B:
case NN_REG_GPR_BOTH:
@@ -207,34 +208,37 @@ static u16 nfp_swreg_to_rereg(u32 swreg, bool is_dst, 
bool has_imm8, bool *i8)
return RE_REG_IMM_encode(val & 0x7f);
case NN_REG_NONE:
return is_dst ? RE_REG_NO_DST : REG_NONE;
-   default:
-   pr_err("unrecognized reg encoding\n");
+   case NN_REG_NNR:
+   pr_err("NNRs used with restricted encoding\n");
return 0;
}
+
+   pr_err("unrecognized reg encoding\n");
+   return 0;
 }
 
 static int
-swreg_to_restricted(u32 dst, u32 lreg, u32 rreg, struct nfp_insn_re_regs *reg,
-   bool has_imm8)
+swreg_to_restricted(swreg dst, swreg lreg, swreg rreg,
+   struct nfp_insn_re_regs *reg, bool has_imm8)
 {
memset(reg, 0, sizeof(*reg));
 
/* Decode destination */
-   if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_IMM)
+   if (swreg_type(dst) == NN_REG_IMM)
return -EFAULT;
 
-   if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_B)
+   if (swreg_type(dst) == NN_REG_GPR_B)
reg->dst_ab = ALU_DST_B;
-   if (FIELD_GET(NN_REG_TYPE, dst) == NN_REG_GPR_BOTH)
+   if (swreg_type(dst) == NN_REG_GPR_BOTH)
reg->wr_both = true;
reg->dst = nfp_swreg_to_rereg(dst, true, false, NULL);
 
/* Decode source operands */
-   if (FIELD_GET(NN_REG_TYPE, lreg) ==

[PATCH net-next 07/15] nfp: bpf: remove packet marking support

2017-10-08 Thread Jakub Kicinski

Temporarily drop support for skb->mark.  We are primarily focusing
on XDP offload, and implementing skb->mark on the new datapath has
lower priority.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  | 13 -
 drivers/net/ethernet/netronome/nfp/bpf/main.h |  3 +--
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 469dc8a055f2..4fa220f710d2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -542,16 +542,6 @@ static int construct_data_ld(struct nfp_prog *nfp_prog, 
u16 offset, u8 size)
return construct_data_ind_ld(nfp_prog, offset, 0, false, size);
 }
 
-static int wrp_set_mark(struct nfp_prog *nfp_prog, u8 src)
-{
-   emit_alu(nfp_prog, NFP_BPF_ABI_MARK,
-reg_none(), ALU_OP_NONE, reg_b(src));
-   emit_alu(nfp_prog, NFP_BPF_ABI_FLAGS,
-NFP_BPF_ABI_FLAGS, ALU_OP_OR, reg_imm(NFP_BPF_ABI_FLAG_MARK));
-
-   return 0;
-}
-
 static void
 wrp_alu_imm(struct nfp_prog *nfp_prog, u8 dst, enum alu_op alu_op, u32 imm)
 {
@@ -1053,9 +1043,6 @@ static int mem_ldx4(struct nfp_prog *nfp_prog, struct 
nfp_insn_meta *meta)
 
 static int mem_stx4_skb(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
-   if (meta->insn.off == offsetof(struct sk_buff, mark))
-   return wrp_set_mark(nfp_prog, meta->insn.src_reg * 2);
-
return -EOPNOTSUPP;
 }
 
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.h 
b/drivers/net/ethernet/netronome/nfp/bpf/main.h
index ccc3dbea25f6..7d959757a51a 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.h
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.h
@@ -71,9 +71,8 @@ enum nfp_bpf_action_type {
 #define imm_b(np)  reg_b((np)->regs_per_thread - STATIC_REG_IMM)
 #define imm_both(np)   reg_both((np)->regs_per_thread - STATIC_REG_IMM)
 
-#define NFP_BPF_ABI_FLAGS  reg_nnr(0)
+#define NFP_BPF_ABI_FLAGS  reg_imm(0)
 #define   NFP_BPF_ABI_FLAG_MARK1
-#define NFP_BPF_ABI_MARK   reg_nnr(1)
 #define NFP_BPF_ABI_PKTreg_nnr(2)
 #define NFP_BPF_ABI_LENreg_nnr(3)
 
-- 
2.14.1

[PATCH net-next 15/15] nfp: bpf: pass dst register to ld_field instruction

2017-10-08 Thread Jakub Kicinski

ld_field instruction is a bit special because the encoding uses
two source registers and one of them becomes the output.  We do
need to pass the dst register to our encoding helpers though,
otherwise the "write both banks" flag will not be observed.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c 
b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index f68052367db7..13148f30fc4c 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -408,7 +408,8 @@ emit_ld_field_any(struct nfp_prog *nfp_prog, enum shf_sc 
sc, u8 shift,
struct nfp_insn_re_regs reg;
int err;
 
-   err = swreg_to_restricted(reg_none(), dst, src, &reg, true);
+   /* Note: ld_field is special as it uses one of the src regs as dst */
+   err = swreg_to_restricted(dst, dst, src, &reg, true);
if (err) {
nfp_prog->error = err;
return;
-- 
2.14.1

[PATCH net-next 09/15] nfp: bpf: encode LMEM accesses

2017-10-08 Thread Jakub Kicinski

NFP LMEM is a large, indirectly accessed register file.  There
are two basic indirect access registers.  Each access operation
may either use offset (up to 8 or 16 words) or perform post
decrement/increment.

Add encodings of LMEM indexes as instruction operands.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/nfp_asm.c | 44 
 drivers/net/ethernet/netronome/nfp/nfp_asm.h | 41 ++
 2 files changed, 85 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.c 
b/drivers/net/ethernet/netronome/nfp/nfp_asm.c
index 4c9201bf9331..4bcab43da16d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.c
@@ -48,6 +48,7 @@ const struct cmd_tgt_act cmd_tgt_act[__CMD_TGT_MAP_SIZE] = {
 
 static u16 nfp_swreg_to_unreg(swreg reg, bool is_dst)
 {
+   bool lm_id, lm_dec = false;
u16 val = swreg_value(reg);
 
switch (swreg_type(reg)) {
@@ -59,6 +60,33 @@ static u16 nfp_swreg_to_unreg(swreg reg, bool is_dst)
return UR_REG_NN | val;
case NN_REG_XFER:
return UR_REG_XFR | val;
+   case NN_REG_LMEM:
+   lm_id = swreg_lm_idx(reg);
+
+   switch (swreg_lm_mode(reg)) {
+   case NN_LM_MOD_NONE:
+   if (val & ~UR_REG_LM_IDX_MAX) {
+   pr_err("LM offset too large\n");
+   return 0;
+   }
+   return UR_REG_LM | FIELD_PREP(UR_REG_LM_IDX, lm_id) |
+   val;
+   case NN_LM_MOD_DEC:
+   lm_dec = true;
+   /* fall through */
+   case NN_LM_MOD_INC:
+   if (val) {
+   pr_err("LM offset in inc/dev mode\n");
+   return 0;
+   }
+   return UR_REG_LM | UR_REG_LM_POST_MOD |
+   FIELD_PREP(UR_REG_LM_IDX, lm_id) |
+   FIELD_PREP(UR_REG_LM_POST_MOD_DEC, lm_dec);
+   default:
+   pr_err("bad LM mode for unrestricted operands %d\n",
+  swreg_lm_mode(reg));
+   return 0;
+   }
case NN_REG_IMM:
if (val & ~0xff) {
pr_err("immediate too large\n");
@@ -108,6 +136,7 @@ int swreg_to_unrestricted(swreg dst, swreg lreg, swreg rreg,
 static u16 nfp_swreg_to_rereg(swreg reg, bool is_dst, bool has_imm8, bool *i8)
 {
u16 val = swreg_value(reg);
+   bool lm_id;
 
switch (swreg_type(reg)) {
case NN_REG_GPR_A:
@@ -116,6 +145,21 @@ static u16 nfp_swreg_to_rereg(swreg reg, bool is_dst, bool 
has_imm8, bool *i8)
return val;
case NN_REG_XFER:
return RE_REG_XFR | val;
+   case NN_REG_LMEM:
+   lm_id = swreg_lm_idx(reg);
+
+   if (swreg_lm_mode(reg) != NN_LM_MOD_NONE) {
+   pr_err("bad LM mode for restricted operands %d\n",
+  swreg_lm_mode(reg));
+   return 0;
+   }
+
+   if (val & ~RE_REG_LM_IDX_MAX) {
+   pr_err("LM offset too large\n");
+   return 0;
+   }
+
+   return RE_REG_LM | FIELD_PREP(RE_REG_LM_IDX, lm_id) | val;
case NN_REG_IMM:
if (val & ~(0x7f | has_imm8 << 7)) {
pr_err("immediate too large\n");
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h 
b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
index 63cfd07da34e..d722f6878bd8 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -44,9 +44,17 @@
 #define RE_REG_IMM_encode(x)   \
(RE_REG_IMM | ((x) & 0x1f) | (((x) & 0x60) << 1))
 #define RE_REG_IMM_MAX  0x07fULL
+#define RE_REG_LM  0x050
+#define RE_REG_LM_IDX  0x008
+#define RE_REG_LM_IDX_MAX  0x7
 #define RE_REG_XFR 0x080
 
 #define UR_REG_XFR 0x180
+#define UR_REG_LM  0x200
+#define UR_REG_LM_IDX  0x020
+#define UR_REG_LM_POST_MOD 0x010
+#define UR_REG_LM_POST_MOD_DEC 0x001
+#define UR_REG_LM_IDX_MAX  0xf
 #define UR_REG_NN  0x280
 #define UR_REG_NO_DST  0x300
 #define UR_REG_IMM UR_REG_NO_DST
@@ -235,6 +243,8 @@ enum lcsr_wr_src {
 
 /* Software register representation, independent of operand type */
 #define NN_REG_TYPEGENMASK(31, 24)
+#define NN_REG_LM_IDX  BIT(22)
+#define NN_REG_LM_MOD  GENMASK(21, 20)
 #define NN_REG_VAL GENMASK(7, 0)
 
 enum nfp_bpf_reg_type {
@@ -245,6 +255,13 @@ enum nfp_bpf_reg_type {
NN_REG_XFER =   BIT(3),
NN_REG_IMM =BIT(4),

[PATCH net-next 02/15] nfp: bpf: lift the single-port limitation

2017-10-08 Thread Jakub Kicinski

Limiting the eBPF offload to a single port was a workaround
required for the PoC application FW which has not been
released externally.  It's not necessary any more.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/main.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/main.c 
b/drivers/net/ethernet/netronome/nfp/bpf/main.c
index be2cf10a2cd7..074726980994 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/main.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/main.c
@@ -89,14 +89,6 @@ nfp_bpf_vnic_alloc(struct nfp_app *app, struct nfp_net *nn, unsigned int id)
struct nfp_net_bpf_priv *priv;
int ret;
 
-   /* Limit to single port, otherwise it's just a NIC */
-   if (id > 0) {
-   nfp_warn(app->cpp,
-"BPF NIC doesn't support more than one port right now\n");
-   nn->port = nfp_port_alloc(app, NFP_PORT_INVALID, nn->dp.netdev);
-   return PTR_ERR_OR_ZERO(nn->port);
-   }
-
priv = kmalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
-- 
2.14.1

[PATCH net-next 05/15] nfp: bpf: encode all 64bit shifts

2017-10-08 Thread Jakub Kicinski

Add encodings of all 64bit shift operations.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 48 +---
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 7bcff007075c..095cf50e8450 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -825,12 +825,24 @@ static int sub_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 static int shl_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
const struct bpf_insn *insn = &meta->insn;
-
-   if (insn->imm != 32)
-   return 1; /* TODO */
-
-   wrp_reg_mov(nfp_prog, insn->dst_reg * 2 + 1, insn->dst_reg * 2);
-   wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2), 0);
+   u8 dst = insn->dst_reg * 2;
+
+   if (insn->imm < 32) {
+   emit_shf(nfp_prog, reg_both(dst + 1),
+reg_a(dst + 1), SHF_OP_NONE, reg_b(dst),
+SHF_SC_R_DSHF, 32 - insn->imm);
+   emit_shf(nfp_prog, reg_both(dst),
+reg_none(), SHF_OP_NONE, reg_b(dst),
+SHF_SC_L_SHF, insn->imm);
+   } else if (insn->imm == 32) {
+   wrp_reg_mov(nfp_prog, dst + 1, dst);
+   wrp_immed(nfp_prog, reg_both(dst), 0);
+   } else if (insn->imm > 32) {
+   emit_shf(nfp_prog, reg_both(dst + 1),
+reg_none(), SHF_OP_NONE, reg_b(dst),
+SHF_SC_L_SHF, insn->imm - 32);
+   wrp_immed(nfp_prog, reg_both(dst), 0);
+   }
 
return 0;
 }
@@ -838,12 +850,24 @@ static int shl_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 static int shr_imm64(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 {
const struct bpf_insn *insn = &meta->insn;
-
-   if (insn->imm != 32)
-   return 1; /* TODO */
-
-   wrp_reg_mov(nfp_prog, insn->dst_reg * 2, insn->dst_reg * 2 + 1);
-   wrp_immed(nfp_prog, reg_both(insn->dst_reg * 2 + 1), 0);
+   u8 dst = insn->dst_reg * 2;
+
+   if (insn->imm < 32) {
+   emit_shf(nfp_prog, reg_both(dst),
+reg_a(dst + 1), SHF_OP_NONE, reg_b(dst),
+SHF_SC_R_DSHF, insn->imm);
+   emit_shf(nfp_prog, reg_both(dst + 1),
+reg_none(), SHF_OP_NONE, reg_b(dst + 1),
+SHF_SC_R_SHF, insn->imm);
+   } else if (insn->imm == 32) {
+   wrp_reg_mov(nfp_prog, dst, dst + 1);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+   } else if (insn->imm > 32) {
+   emit_shf(nfp_prog, reg_both(dst),
+reg_none(), SHF_OP_NONE, reg_b(dst + 1),
+SHF_SC_R_SHF, insn->imm - 32);
+   wrp_immed(nfp_prog, reg_both(dst + 1), 0);
+   }
 
return 0;
 }
-- 
2.14.1
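
The three cases in the 05/15 diff above (shift below 32, exactly 32, above 32) can be illustrated in plain C. This is only a minimal userspace sketch, assuming a 64-bit value kept as two 32-bit words the way the JIT keeps it in the register pair (dst, dst + 1). The names pair32, shl64_by_halves() and shr64_by_halves() are made up for this note and do not exist in the driver, and a shift of 0 is handled separately here because a 32-bit shift by 32 is undefined in C.

```c
#include <stdint.h>

/* A 64-bit value split into two 32-bit words, like a (dst, dst + 1) pair. */
struct pair32 {
	uint32_t lo;	/* low word */
	uint32_t hi;	/* high word */
};

static struct pair32 u64_to_pair(uint64_t v)
{
	struct pair32 p = { (uint32_t)v, (uint32_t)(v >> 32) };

	return p;
}

static uint64_t pair_to_u64(struct pair32 p)
{
	return ((uint64_t)p.hi << 32) | p.lo;
}

/* Logical left shift by a constant; imm is assumed to be in 0..63. */
static struct pair32 shl64_by_halves(struct pair32 v, unsigned int imm)
{
	struct pair32 r = v;	/* imm == 0: value is unchanged */

	if (imm > 0 && imm < 32) {
		/* high word takes its own bits plus the bits leaving lo */
		r.hi = (v.hi << imm) | (v.lo >> (32 - imm));
		r.lo = v.lo << imm;
	} else if (imm == 32) {
		r.hi = v.lo;
		r.lo = 0;
	} else if (imm > 32) {
		r.hi = v.lo << (imm - 32);
		r.lo = 0;
	}
	return r;
}

/* Logical right shift by a constant; imm is assumed to be in 0..63. */
static struct pair32 shr64_by_halves(struct pair32 v, unsigned int imm)
{
	struct pair32 r = v;	/* imm == 0: value is unchanged */

	if (imm > 0 && imm < 32) {
		/* low word takes its own bits plus the bits leaving hi */
		r.lo = (v.lo >> imm) | (v.hi << (32 - imm));
		r.hi = v.hi >> imm;
	} else if (imm == 32) {
		r.lo = v.hi;
		r.hi = 0;
	} else if (imm > 32) {
		r.lo = v.hi >> (imm - 32);
		r.hi = 0;
	}
	return r;
}
```

Comparing the result against the native 64-bit shift for a few immediates on each side of 32 is a quick way to check the case split.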

[PATCH net-next 04/15] nfp: bpf: move software reg helpers and cmd table out of translator

2017-10-08 Thread Jakub Kicinski

Move the software reg helpers and some static data to nfp_asm.c.
They are related to the previous patch, but the move is done in a separate
commit for ease of review.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/Makefile  |   1 +
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 147 ---
 drivers/net/ethernet/netronome/nfp/nfp_asm.c | 167 +++
 drivers/net/ethernet/netronome/nfp/nfp_asm.h |  24 
 4 files changed, 192 insertions(+), 147 deletions(-)
 create mode 100644 drivers/net/ethernet/netronome/nfp/nfp_asm.c

diff --git a/drivers/net/ethernet/netronome/nfp/Makefile b/drivers/net/ethernet/netronome/nfp/Makefile
index becaacf1554d..bd3b2bd408bc 100644
--- a/drivers/net/ethernet/netronome/nfp/Makefile
+++ b/drivers/net/ethernet/netronome/nfp/Makefile
@@ -14,6 +14,7 @@ nfp-objs := \
nfpcore/nfp_resource.o \
nfpcore/nfp_rtsym.o \
nfpcore/nfp_target.o \
+   nfp_asm.o \
nfp_app.o \
nfp_app_nic.o \
nfp_devlink.o \
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 7e8cdfb39607..7bcff007075c 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -110,154 +110,7 @@ nfp_prog_offset_to_index(struct nfp_prog *nfp_prog, unsigned int offset)
return offset - nfp_prog->start_off;
 }
 
-/* --- SW reg --- */
-struct nfp_insn_ur_regs {
-   enum alu_dst_ab dst_ab;
-   u16 dst;
-   u16 areg, breg;
-   bool swap;
-   bool wr_both;
-};
-
-struct nfp_insn_re_regs {
-   enum alu_dst_ab dst_ab;
-   u8 dst;
-   u8 areg, breg;
-   bool swap;
-   bool wr_both;
-   bool i8;
-};
-
-static u16 nfp_swreg_to_unreg(swreg reg, bool is_dst)
-{
-   u16 val = swreg_value(reg);
-
-   switch (swreg_type(reg)) {
-   case NN_REG_GPR_A:
-   case NN_REG_GPR_B:
-   case NN_REG_GPR_BOTH:
-   return val;
-   case NN_REG_NNR:
-   return UR_REG_NN | val;
-   case NN_REG_XFER:
-   return UR_REG_XFR | val;
-   case NN_REG_IMM:
-   if (val & ~0xff) {
-   pr_err("immediate too large\n");
-   return 0;
-   }
-   return UR_REG_IMM_encode(val);
-   case NN_REG_NONE:
-   return is_dst ? UR_REG_NO_DST : REG_NONE;
-   }
-
-   pr_err("unrecognized reg encoding %08x\n", reg);
-   return 0;
-}
-
-static int
-swreg_to_unrestricted(swreg dst, swreg lreg, swreg rreg,
- struct nfp_insn_ur_regs *reg)
-{
-   memset(reg, 0, sizeof(*reg));
-
-   /* Decode destination */
-   if (swreg_type(dst) == NN_REG_IMM)
-   return -EFAULT;
-
-   if (swreg_type(dst) == NN_REG_GPR_B)
-   reg->dst_ab = ALU_DST_B;
-   if (swreg_type(dst) == NN_REG_GPR_BOTH)
-   reg->wr_both = true;
-   reg->dst = nfp_swreg_to_unreg(dst, true);
-
-   /* Decode source operands */
-   if (swreg_type(lreg) == swreg_type(rreg))
-   return -EFAULT;
-
-   if (swreg_type(lreg) == NN_REG_GPR_B ||
-   swreg_type(rreg) == NN_REG_GPR_A) {
-   reg->areg = nfp_swreg_to_unreg(rreg, false);
-   reg->breg = nfp_swreg_to_unreg(lreg, false);
-   reg->swap = true;
-   } else {
-   reg->areg = nfp_swreg_to_unreg(lreg, false);
-   reg->breg = nfp_swreg_to_unreg(rreg, false);
-   }
-
-   return 0;
-}
-
-static u16 nfp_swreg_to_rereg(swreg reg, bool is_dst, bool has_imm8, bool *i8)
-{
-   u16 val = swreg_value(reg);
-
-   switch (swreg_type(reg)) {
-   case NN_REG_GPR_A:
-   case NN_REG_GPR_B:
-   case NN_REG_GPR_BOTH:
-   return val;
-   case NN_REG_XFER:
-   return RE_REG_XFR | val;
-   case NN_REG_IMM:
-   if (val & ~(0x7f | has_imm8 << 7)) {
-   pr_err("immediate too large\n");
-   return 0;
-   }
-   *i8 = val & 0x80;
-   return RE_REG_IMM_encode(val & 0x7f);
-   case NN_REG_NONE:
-   return is_dst ? RE_REG_NO_DST : REG_NONE;
-   case NN_REG_NNR:
-   pr_err("NNRs used with restricted encoding\n");
-   return 0;
-   }
-
-   pr_err("unrecognized reg encoding\n");
-   return 0;
-}
-
-static int
-swreg_to_restricted(swreg dst, swreg lreg, swreg rreg,
-   struct nfp_insn_re_regs *reg, bool has_imm8)
-{
-   memset(reg, 0, sizeof(*reg));
-
-   /* Decode destination */
-   if (swreg_type(dst) == NN_REG_IMM)
-   return -EFAULT;
-
-   if (swreg_type(dst) == NN_REG_GPR_B)
-   reg->dst_ab = ALU_DST_B;
-   if (swreg_type(dst) ==

[PATCH net-next 13/15] nfp: bpf: pad code with valid nops

2017-10-08 Thread Jakub Kicinski

We need to append up to 8 nops after the last instruction to make
sure the CPU will not fetch garbage instructions with invalid
ECC if the code store was not initialized.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 12 +++-
 drivers/net/ethernet/netronome/nfp/nfp_asm.h |  2 ++
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index f4aedc89bfc8..e0600d037773 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -426,6 +426,11 @@ emit_ld_field(struct nfp_prog *nfp_prog, swreg dst, u8 bmask, swreg src,
emit_ld_field_any(nfp_prog, sc, shift, dst, bmask, src, false);
 }
 
+static void emit_nop(struct nfp_prog *nfp_prog)
+{
+   __emit_immed(nfp_prog, UR_REG_IMM, UR_REG_IMM, 0, 0, 0, 0, 0, 0, 0);
+}
+
 /* --- Wrappers --- */
 static bool pack_immed(u32 imm, u16 *val, enum immed_shift *shift)
 {
@@ -1550,7 +1555,7 @@ static void nfp_outro(struct nfp_prog *nfp_prog)
 static int nfp_translate(struct nfp_prog *nfp_prog)
 {
struct nfp_insn_meta *meta;
-   int err;
+   int i, err;
 
nfp_intro(nfp_prog);
if (nfp_prog->error)
@@ -1582,6 +1587,11 @@ static int nfp_translate(struct nfp_prog *nfp_prog)
if (nfp_prog->error)
return nfp_prog->error;
 
+   for (i = 0; i < NFP_USTORE_PREFETCH_WINDOW; i++)
+   emit_nop(nfp_prog);
+   if (nfp_prog->error)
+   return nfp_prog->error;
+
return nfp_fixup_branches(nfp_prog);
 }
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
index d95087e5fb73..c4c18dd5630a 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -362,6 +362,8 @@ int swreg_to_unrestricted(swreg dst, swreg lreg, swreg rreg,
 int swreg_to_restricted(swreg dst, swreg lreg, swreg rreg,
struct nfp_insn_re_regs *reg, bool has_imm8);
 
+#define NFP_USTORE_PREFETCH_WINDOW 8
+
 int nfp_ustore_check_valid_no_ecc(u64 insn);
 u64 nfp_ustore_calc_ecc_insn(u64 insn);
 
-- 
2.14.1
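
The padding step in 13/15 can be pictured with a small userspace sketch: emit a fixed number of nops after the translated program and rely on a sticky error flag, checked once after the loop, the same way the diff checks nfp_prog->error. Everything here is an assumption made for illustration: struct prog_buf, emit(), pad_with_nops() and the PLACEHOLDER_NOP value are hypothetical and are not the driver's types or the real nop encoding.

```c
#include <stddef.h>
#include <stdint.h>

#define PREFETCH_WINDOW	8		/* mirrors NFP_USTORE_PREFETCH_WINDOW */
#define PLACEHOLDER_NOP	0x0123ULL	/* hypothetical, not the real encoding */

/* Hypothetical program buffer with a sticky error flag. */
struct prog_buf {
	uint64_t *insns;
	size_t len;	/* instructions emitted so far */
	size_t cap;	/* capacity of insns[] */
	int error;	/* set once, never cleared */
};

/* Append one instruction, or record an error if the buffer is full. */
static void emit(struct prog_buf *p, uint64_t insn)
{
	if (p->len >= p->cap) {
		p->error = -1;
		return;
	}
	p->insns[p->len++] = insn;
}

/* Append PREFETCH_WINDOW nops; the error is checked once after the loop. */
static int pad_with_nops(struct prog_buf *p)
{
	int i;

	for (i = 0; i < PREFETCH_WINDOW; i++)
		emit(p, PLACEHOLDER_NOP);

	return p->error;
}
```

With a sticky flag the emit helpers need no return value, which keeps long emit sequences readable at the cost of detecting overflow slightly late.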

[PATCH net-next 12/15] nfp: bpf: calculate code store ECC

2017-10-08 Thread Jakub Kicinski

In the initial PoC firmware I simply disabled ECC on the instruction
store.  Do the ECC calculation for generated instructions in the driver.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 20 +++
 drivers/net/ethernet/netronome/nfp/nfp_asm.c | 37 
 drivers/net/ethernet/netronome/nfp/nfp_asm.h |  3 +++
 3 files changed, 60 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 9b6c98ccebfe..f4aedc89bfc8 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -1715,6 +1715,23 @@ static int nfp_bpf_optimize(struct nfp_prog *nfp_prog)
return 0;
 }
 
+static int nfp_bpf_ustore_calc(struct nfp_prog *nfp_prog)
+{
+   int i;
+
+   for (i = 0; i < nfp_prog->prog_len; i++) {
+   int err;
+
+   err = nfp_ustore_check_valid_no_ecc(nfp_prog->prog[i]);
+   if (err)
+   return err;
+
+   nfp_prog->prog[i] = nfp_ustore_calc_ecc_insn(nfp_prog->prog[i]);
+   }
+
+   return 0;
+}
+
 /**
  * nfp_bpf_jit() - translate BPF code into NFP assembly
  * @filter:kernel BPF filter struct
@@ -1766,8 +1783,11 @@ nfp_bpf_jit(struct bpf_prog *filter, void *prog_mem,
pr_err("Translation failed with error %d (translated: %u)\n",
   ret, nfp_prog->n_translated);
ret = -EINVAL;
+   goto out;
}
 
+   ret = nfp_bpf_ustore_calc(nfp_prog);
+
res->n_instr = nfp_prog->prog_len;
res->dense_mode = false;
 out:
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.c b/drivers/net/ethernet/netronome/nfp/nfp_asm.c
index 1decc638ea6f..de76e7444fc2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.c
@@ -215,3 +215,40 @@ int swreg_to_restricted(swreg dst, swreg lreg, swreg rreg,
 
return 0;
 }
+
+#define NFP_USTORE_ECC_POLY_WORDS  7
+#define NFP_USTORE_OP_BITS 45
+
+static const u64 nfp_ustore_ecc_polynomials[NFP_USTORE_ECC_POLY_WORDS] = {
+   0x0ff87fffULL,
+   0x11f801ff801fULL,
+   0x1e387e0781e1ULL,
+   0x17cb8e388e22ULL,
+   0x1af5b2c93244ULL,
+   0x1f56d5525488ULL,
+   0x0daf69a46910ULL,
+};
+
+static bool parity(u64 value)
+{
+   return hweight64(value) & 1;
+}
+
+int nfp_ustore_check_valid_no_ecc(u64 insn)
+{
+   if (insn & ~GENMASK_ULL(NFP_USTORE_OP_BITS, 0))
+   return -EINVAL;
+
+   return 0;
+}
+
+u64 nfp_ustore_calc_ecc_insn(u64 insn)
+{
+   u8 ecc = 0;
+   int i;
+
+   for (i = 0; i < NFP_USTORE_ECC_POLY_WORDS; i++)
+   ecc |= parity(nfp_ustore_ecc_polynomials[i] & insn) << i;
+
+   return insn | (u64)ecc << NFP_USTORE_OP_BITS;
+}
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_asm.h b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
index 40a51a45afd7..d95087e5fb73 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_asm.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_asm.h
@@ -362,4 +362,7 @@ int swreg_to_unrestricted(swreg dst, swreg lreg, swreg rreg,
 int swreg_to_restricted(swreg dst, swreg lreg, swreg rreg,
struct nfp_insn_re_regs *reg, bool has_imm8);
 
+int nfp_ustore_check_valid_no_ecc(u64 insn);
+u64 nfp_ustore_calc_ecc_insn(u64 insn);
+
 #endif
-- 
2.14.1
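
The ECC scheme in 12/15 is a set of parity checks: each check bit is the parity of the instruction ANDed with one mask, and the check bits are stored above the instruction bits. The following is a minimal userspace sketch of that idea, not the driver code. The masks in ecc_masks[] are hypothetical values chosen so the result is easy to verify by hand; they are NOT the hardware polynomials from the patch. The validity check here simply rejects any bit at or above OP_BITS, which is this sketch's own choice.

```c
#include <stdint.h>

#define OP_BITS		45	/* instruction bits covered by the check bits */
#define ECC_WORDS	7	/* number of check bits */

/* Hypothetical masks for illustration only, NOT the hardware polynomials. */
static const uint64_t ecc_masks[ECC_WORDS] = {
	0x00000000ffffULL,
	0x0000ffff0000ULL,
	0x1fff00000000ULL,
	0x0f0f0f0f0f0fULL,
	0x133333333333ULL,
	0x155555555555ULL,
	0x1fffffffffffULL,
};

/* Return 1 if an odd number of bits is set in x, 0 otherwise. */
static unsigned int parity64(uint64_t x)
{
	x ^= x >> 32;
	x ^= x >> 16;
	x ^= x >> 8;
	x ^= x >> 4;
	x ^= x >> 2;
	x ^= x >> 1;
	return (unsigned int)(x & 1);
}

/* Reject instructions that already use the bits reserved for ECC. */
static int check_valid_no_ecc(uint64_t insn)
{
	return (insn >> OP_BITS) ? -1 : 0;
}

/* Compute one parity bit per mask and place them above the instruction. */
static uint64_t calc_ecc_insn(uint64_t insn)
{
	uint64_t ecc = 0;
	int i;

	for (i = 0; i < ECC_WORDS; i++)
		ecc |= (uint64_t)parity64(ecc_masks[i] & insn) << i;

	return insn | (ecc << OP_BITS);
}
```

For example, with these illustrative masks an instruction value of 1 has bit 0 covered by masks 0, 3, 4, 5 and 6, so its check bits are 0x79.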

[PATCH net-next 01/15] nfp: output control messages to trace_devlink_hwmsg()

2017-10-08 Thread Jakub Kicinski

Use the standard devlink tracepoint to allow tracing of control
messages.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/nfp_app.h | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_app.h b/drivers/net/ethernet/netronome/nfp/nfp_app.h
index af640b5c2108..857bb33020ba 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_app.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_app.h
@@ -36,6 +36,8 @@
 
 #include 
 
+#include 
+
 #include "nfp_net_repr.h"
 
 struct bpf_prog;
@@ -271,11 +273,17 @@ static inline int nfp_app_xdp_offload(struct nfp_app *app, struct nfp_net *nn,
 
 static inline bool nfp_app_ctrl_tx(struct nfp_app *app, struct sk_buff *skb)
 {
+   trace_devlink_hwmsg(priv_to_devlink(app->pf), false, 0,
+   skb->data, skb->len);
+
return nfp_ctrl_tx(app->ctrl, skb);
 }
 
 static inline void nfp_app_ctrl_rx(struct nfp_app *app, struct sk_buff *skb)
 {
+   trace_devlink_hwmsg(priv_to_devlink(app->pf), true, 0,
+   skb->data, skb->len);
+
app->type->ctrl_msg_rx(app, skb);
 }
 
-- 
2.14.1

[PATCH net-next] ipv6: fix a BUG in rt6_get_pcpu_route()

2017-10-08 Thread Eric Dumazet

From: Eric Dumazet 

Ido reported the following splat and provided a patch.

[  122.221814] BUG: using smp_processor_id() in preemptible [] code: sshd/2672
[  122.221845] caller is debug_smp_processor_id+0x17/0x20
[  122.221866] CPU: 0 PID: 2672 Comm: sshd Not tainted 4.14.0-rc3-idosch-next-custom #639
[  122.221880] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[  122.221893] Call Trace:
[  122.221919]  dump_stack+0xb1/0x10c
[  122.221946]  ? _atomic_dec_and_lock+0x124/0x124
[  122.221974]  ? ___ratelimit+0xfe/0x240
[  122.222020]  check_preemption_disabled+0x173/0x1b0
[  122.222060]  debug_smp_processor_id+0x17/0x20
[  122.222083]  ip6_pol_route+0x1482/0x24a0
...

I believe we can simplify this code path a bit, since we no longer
hold a read_lock that must be released to avoid a deadlock.

By disabling BH, we prevent code re-entry and make sure
rt6_get_pcpu_route()/rt6_make_pcpu_route() run on the same CPU.

Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table")
Reported-by: Ido Schimmel 
Signed-off-by: Eric Dumazet 
Tested-by: Ido Schimmel 
---
 net/ipv6/route.c |   26 ++
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 399d1bceec4a6e6736c367e706dd2acbd4093d58..606e80325b21c0e10a02e9c7d5b3fcfbfc26a003 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1136,15 +1136,7 @@ static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
dst_hold(&pcpu_rt->dst);
p = this_cpu_ptr(rt->rt6i_pcpu);
prev = cmpxchg(p, NULL, pcpu_rt);
-   if (prev) {
-   /* If someone did it before us, return prev instead */
-   /* release refcnt taken by ip6_rt_pcpu_alloc() */
-   dst_release_immediate(&pcpu_rt->dst);
-   /* release refcnt taken by above dst_hold() */
-   dst_release_immediate(&pcpu_rt->dst);
-   dst_hold(&prev->dst);
-   pcpu_rt = prev;
-   }
+   BUG_ON(prev);
 
rt6_dst_from_metrics_check(pcpu_rt);
return pcpu_rt;
@@ -1739,31 +1731,25 @@ struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
struct rt6_info *pcpu_rt;
 
dst_use_noref(&rt->dst, jiffies);
+   local_bh_disable();
pcpu_rt = rt6_get_pcpu_route(rt);
 
-   if (pcpu_rt) {
-   rcu_read_unlock();
-   } else {
+   if (!pcpu_rt) {
/* atomic_inc_not_zero() is needed when using rcu */
if (atomic_inc_not_zero(&rt->rt6i_ref)) {
-   /* We have to do the read_unlock first
-* because rt6_make_pcpu_route() may trigger
-* ip6_dst_gc() which will take the write_lock.
-*
-* No dst_hold() on rt is needed because grabbing
+   /* No dst_hold() on rt is needed because grabbing
 * rt->rt6i_ref makes sure rt can't be released.
 */
-   rcu_read_unlock();
pcpu_rt = rt6_make_pcpu_route(rt);
rt6_release(rt);
} else {
/* rt is already removed from tree */
-   rcu_read_unlock();
pcpu_rt = net->ipv6.ip6_null_entry;
dst_hold(&pcpu_rt->dst);
}
}
-
+   local_bh_enable();
+   rcu_read_unlock();
trace_fib6_table_lookup(net, pcpu_rt, table->tb6_id, fl6);
return pcpu_rt;
}

[PATCH net-next 14/15] nfp: bpf: byte swap the instructions

2017-10-08 Thread Jakub Kicinski

The device expects instructions in little-endian byte order.  Make
sure we byte swap on big-endian hosts.

Signed-off-by: Jakub Kicinski 
Reviewed-by: Simon Horman 
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index e0600d037773..f68052367db7 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -1725,7 +1725,7 @@ static int nfp_bpf_optimize(struct nfp_prog *nfp_prog)
return 0;
 }
 
-static int nfp_bpf_ustore_calc(struct nfp_prog *nfp_prog)
+static int nfp_bpf_ustore_calc(struct nfp_prog *nfp_prog, __le64 *ustore)
 {
int i;
 
@@ -1737,6 +1737,8 @@ static int nfp_bpf_ustore_calc(struct nfp_prog *nfp_prog)
return err;
 
nfp_prog->prog[i] = nfp_ustore_calc_ecc_insn(nfp_prog->prog[i]);
+
+   ustore[i] = cpu_to_le64(nfp_prog->prog[i]);
}
 
return 0;
@@ -1796,7 +1798,7 @@ nfp_bpf_jit(struct bpf_prog *filter, void *prog_mem,
goto out;
}
 
-   ret = nfp_bpf_ustore_calc(nfp_prog);
+   ret = nfp_bpf_ustore_calc(nfp_prog, (__force __le64 *)prog_mem);
 
res->n_instr = nfp_prog->prog_len;
res->dense_mode = false;
-- 
2.14.1
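
The effect of the cpu_to_le64() conversion in 14/15 can be shown outside the kernel with a portable sketch that writes each 64-bit instruction as 8 bytes, least significant byte first, so the output is identical on little-endian and big-endian hosts. The helper names put_le64() and copy_insns_le() are made up for this note. The kernel itself uses cpu_to_le64(), which is a no-op on little-endian hosts and a byte swap on big-endian ones.

```c
#include <stddef.h>
#include <stdint.h>

/* Store one 64-bit value as 8 bytes, least significant byte first. */
static void put_le64(uint8_t *dst, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/* Copy n instructions into a byte buffer in little-endian order. */
static void copy_insns_le(uint8_t *dst, const uint64_t *insns, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		put_le64(dst + 8 * i, insns[i]);
}
```

Shifting out one byte at a time avoids any dependence on how the host lays out integers in memory, which is why no endianness test is needed.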

Re: [PATCH net-next v6 0/3] bridge: neigh msg proxy and flood suppression support

2017-10-08 Thread David Miller

From: Roopa Prabhu 
Date: Fri,  6 Oct 2017 22:12:36 -0700

> From: Roopa Prabhu 
> 
> This series implements arp and nd suppression in the bridge
> driver for ethernet vpns. It implements rfc7432, section 10
> https://tools.ietf.org/html/rfc7432#section-10
> for ethernet VPN deployments. It is similar to the existing
> BR_PROXYARP* flags but has a few semantic differences to conform
> to EVPN standard. Unlike the existing flags, this new flag suppresses
> flood of all neigh discovery packets (arp and nd) to tunnel ports.
> Supports both vlan filtering and non-vlan filtering bridges.
> 
> In case of EVPN, it is mainly used to avoid flooding
> of arp and nd packets to tunnel ports like vxlan.
> 
> v2 : rebase to latest + address some optimization feedback from Nikolay.
> v3 : fix kbuild reported build errors with CONFIG_INET off
> v4 : simplify port flag mask as suggested by stephen
> v5 : address some feedback from Toshiaki
> v6 : some v5 cleanups in nd suppress (keep it consistent with arp suppress)

Series applied, thank you.

Re: [net 1/1] tipc: correct initialization of skb list

2017-10-08 Thread David Miller

From: Jon Maloy 
Date: Sat, 7 Oct 2017 14:32:49 +0200

> We change the initialization of the skb transmit buffer queues
> in the functions tipc_bcast_xmit() and tipc_rcast_xmit() to also
> initialize their spinlocks. This is needed because we may, during
> error conditions, need to call skb_queue_purge() on those queues
> further down the stack.
> 
> Signed-off-by: Jon Maloy 

Applied.

Re: [net 1/1] tipc: Unclone message at secondary destination lookup

2017-10-08 Thread David Miller

From: Jon Maloy 
Date: Sat, 7 Oct 2017 15:07:20 +0200

> When a bundling message is received, the function tipc_link_input()
> calls function tipc_msg_extract() to unbundle all inner messages of
> the bundling message before adding them to input queue.
> 
> The function tipc_msg_extract() just clones all inner skb for all
> inner messages from the bundling skb. This means that the skb
> headroom of an inner message overlaps with the data part of the
> preceding message in the bundle.
> 
> If the message in question is a name addressed message, it may be
> subject to a secondary destination lookup, and eventually be sent out
> on one of the interfaces again. But, since what is perceived as headroom
> by the device driver in reality is the last bytes of the preceding
> message in the bundle, the latter will be overwritten by the MAC
> addresses of the L2 header. If the preceding message has not yet been
> consumed by the user, it will eventually be delivered with corrupted
> contents.
> 
> This commit fixes this by uncloning all messages passing through the
> function tipc_msg_lookup_dest(), hence ensuring that the headroom
> is always valid when the message is passed on.
> 
> Signed-off-by: Tung Nguyen 
> Signed-off-by: Jon Maloy 

Applied.
