Re: [PATCH] iproute2: Extend bridge command to configure ageing interval on bridge devices.

2015-08-14 Thread Michal Kubecek
On Fri, Aug 14, 2015 at 09:50:02AM +, Premkumar Jonnala wrote:
 Extend bridge command to configure and retrieve ageing interval for bridge
 devices.  Netlink messaging is used to configure and retrieve the ageing
 interval.
 
 Signed-off-by: Premkumar Jonnala pjonn...@broadcom.com
 
...
 diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
 index a78f0b3..abc9617 100644
 --- a/include/linux/rtnetlink.h
 +++ b/include/linux/rtnetlink.h
 @@ -139,6 +139,13 @@ enum {
   RTM_GETNSID = 90,
  #define RTM_GETNSID RTM_GETNSID
  
 + RTM_SETAGEING = 92,
 +#define RTM_SETAGEING RTM_SETAGEING
 + RTM_SETDEFAULTAGEING = 93,
 +#define RTM_SETDEFAULTAGEING RTM_SETDEFAULTAGEING
 + RTM_GETAGEING = 94,
 +#define RTM_GETAGEING RTM_GETAGEING
 +
   __RTM_MAX,
  #define RTM_MAX  (((__RTM_MAX + 3) & ~3) - 1)
  };
 --

As far as I can see, this depends on a kernel patch which is still under
review (in particular, adding these new message types was objected to).
I would suggest holding off on submitting the iproute2 patch until the
relevant kernel changes are accepted.

  Michal Kubecek

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


pull-request: mac80211 2015-08-14

2015-08-14 Thread Johannes Berg
Hi Dave,

I'm back from vacation, and found a single bugfix waiting. It's in this
pull request, but I'm not quite up to speed as to what's happening with
the release. If it goes in, great; if not I've already tagged it with
Cc stable anyway.

Thanks,
johannes



The following changes since commit 923b352f19d9ea971ae2536eab55f5fc9e95fedf:

  cfg80211: use RTNL locked reg_can_beacon for IR-relaxation (2015-07-17 15:02:02 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211.git tags/mac80211-for-davem-2015-08-14

for you to fetch changes up to f5eeb5fa191fd7b634cbc4883ac58f3b2184dbc5:

  mac80211: fix invalid read in minstrel_sort_best_tp_rates() (2015-08-13 13:52:34 +0200)


We have a single bugfix for an invalid memory read.


Adrien Schildknecht (1):
  mac80211: fix invalid read in minstrel_sort_best_tp_rates()

 net/mac80211/rc80211_minstrel.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)


Re: [PATCH 1/6] net/bonding: enable LRO if one device supports it

2015-08-14 Thread Michal Kubecek
On Thu, Aug 13, 2015 at 02:02:55PM -0400, Jarod Wilson wrote:
 Currently, all bonding devices come up, and claim to have LRO support,
 which ethtool will let you toggle on and off, even if none of the
 underlying hardware devices actually support it. While the bonding driver
 takes precautions for slaves that don't support all features, this is at
 least a little bit misleading to users.
 
 If we add NETIF_F_LRO to the NETIF_F_ONE_FOR_ALL flags in
 netdev_features.h, then netdev_features_increment() will only enable LRO
 if 1) it's listed in the device's feature mask and 2) if there's actually a
 slave present that supports the feature.
 
 Note that this is going to require some follow-up patches, as not all LRO
 capable device drivers are currently properly reporting LRO support in
 their vlan_features, which is where the bonding driver picks up
 device-specific features.
 
 CC: David S. Miller da...@davemloft.net
 CC: Jiri Pirko j...@resnulli.us
 CC: Tom Herbert therb...@google.com
 CC: Scott Feldman sfel...@gmail.com
 CC: netdev@vger.kernel.org
 Signed-off-by: Jarod Wilson ja...@redhat.com
 ---
  include/linux/netdev_features.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
 index 9672781..6440bf1 100644
 --- a/include/linux/netdev_features.h
 +++ b/include/linux/netdev_features.h
 @@ -159,7 +159,8 @@ enum {
   */
  #define NETIF_F_ONE_FOR_ALL  (NETIF_F_GSO_SOFTWARE | NETIF_F_GSO_ROBUST | \
NETIF_F_SG | NETIF_F_HIGHDMA | \
 -  NETIF_F_FRAGLIST | NETIF_F_VLAN_CHALLENGED)
 +  NETIF_F_FRAGLIST | NETIF_F_VLAN_CHALLENGED | \
 +  NETIF_F_LRO)
  
  /*
   * If one device doesn't support one of these features, then disable it
 -- 

I don't think this is going to work the way you expect. Assume we have a
non-LRO eth1 and LRO capable eth2. If we enslave eth1 first, bond will
lose NETIF_F_LRO so that while enslaving eth2, bond_enslave() does run

	if (!(bond_dev->features & NETIF_F_LRO))
		dev_disable_lro(slave_dev);

and disable LRO on eth2 even before computing the bond features so that
in the end, all three interfaces end up with disabled LRO. If you add
the slaves in the opposite order, you end up with eth2 and bond having
LRO enabled. IMHO features should not depend on the order in which
slaves are added into the bond.
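
To make the ordering problem concrete, here is a minimal user-space sketch. It is a toy model, not the bonding driver: the `toy_*` names are hypothetical, `toy_bond_enslave()` only mimics the check quoted above followed by a feature recomputation, and it assumes NETIF_F_LRO is treated as a "one for all" feature as the patch proposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_SLAVES 4

/* Hypothetical stand-in for a physical device: only its LRO state is modelled. */
struct toy_dev {
	bool lro;
};

/* Hypothetical stand-in for the bond device. */
struct toy_bond {
	bool lro;				/* models NETIF_F_LRO on the bond */
	struct toy_dev *slaves[TOY_MAX_SLAVES];
	size_t nslaves;
};

static void toy_bond_init(struct toy_bond *bond)
{
	bond->lro = true;	/* a fresh bond claims LRO support */
	bond->nslaves = 0;
}

static void toy_bond_enslave(struct toy_bond *bond, struct toy_dev *slave)
{
	size_t i;

	assert(bond->nslaves < TOY_MAX_SLAVES);

	/* Models the quoted check: no LRO on the bond means no LRO on the slave. */
	if (!bond->lro)
		slave->lro = false;

	bond->slaves[bond->nslaves++] = slave;

	/* Models the recomputation with LRO as "one for all":
	 * the bond keeps LRO if at least one slave still has it. */
	bond->lro = false;
	for (i = 0; i < bond->nslaves; i++)
		if (bond->slaves[i]->lro)
			bond->lro = true;
}
```

Enslaving a non-LRO device first and an LRO-capable one second leaves everything with LRO off in this model, while the opposite order leaves the bond and the capable slave with LRO on.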

You would need to remove the code quoted above to make things work the
way you want (or move it after the call to bond_compute_features(), which
is effectively the same). But then the result would be even worse:
adding an LRO-capable slave to a bond that has had dev_disable_lro()
called on it would not disable LRO on that slave, possibly (or rather
likely) causing communication breakage.

I believe NETIF_F_LRO in its original sense should only be considered
for physical devices; even if it's not explicitly said in the commit
message, the logic behind fbe168ba91f7 ("net: generic dev_disable_lro()
stacked device handling") is that for stacked devices like bond or team,
NETIF_F_LRO means "allow slaves to use LRO if they can and want" while
its absence means "disable LRO on all slaves". If you wanted NETIF_F_LRO
for a bond to mean "there is at least one LRO capable slave", you would
need a new flag for the "LRO should be disabled for all lower devices"
state. I don't think it's worth the effort.

 Michal Kubecek



[PATCH] bridge: Enable configuration of ageing interval for bridges and switch devices.

2015-08-14 Thread Premkumar Jonnala
Bridge devices have an ageing interval used to age out MAC addresses
from the FDB.  This ageing interval was not configurable.

Enable netlink based configuration of ageing interval for bridges and
switch devices.  The ageing interval changes the timer used to purge
inactive FDB entries in bridges.  The ageing interval config is
propagated to switch devices, so that platform or hardware based
ageing works according to configuration.

Signed-off-by: Premkumar Jonnala pjonn...@broadcom.com

---
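
Note: the following is a minimal user-space sketch of what "ageing out" FDB entries means, for readers who want the idea in code. It is not taken from the patch. All `toy_*` names are hypothetical; only the 10 second minimum and 300 second default mirror the constants the patch defines. It assumes timestamps are plain seconds and that `now` is never earlier than an entry's `last_seen`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MIN_AGEING_SECS	10	/* mirrors MIN_AGEING_INTERVAL_SECS */
#define TOY_DEFAULT_AGEING_SECS	300	/* mirrors DEFAULT_AGEING_INTERVAL_SECS */
#define TOY_FDB_SIZE		8

/* Hypothetical forwarding database entry. */
struct toy_fdb_entry {
	unsigned char mac[6];
	unsigned long last_seen;	/* time the address was last seen, in seconds */
	bool in_use;
};

/* Hypothetical bridge: an ageing interval plus a small fixed-size FDB. */
struct toy_bridge {
	unsigned long ageing_secs;
	struct toy_fdb_entry fdb[TOY_FDB_SIZE];
};

/* Reject intervals below the minimum, otherwise store the new value. */
static int toy_bridge_set_ageing(struct toy_bridge *br, unsigned long secs)
{
	if (secs < TOY_MIN_AGEING_SECS)
		return -EINVAL;
	br->ageing_secs = secs;
	return 0;
}

/* Drop every entry that has been inactive for at least the ageing
 * interval; returns the number of entries purged. */
static size_t toy_bridge_age_out(struct toy_bridge *br, unsigned long now)
{
	size_t i, purged = 0;

	for (i = 0; i < TOY_FDB_SIZE; i++) {
		struct toy_fdb_entry *e = &br->fdb[i];

		if (e->in_use && now - e->last_seen >= br->ageing_secs) {
			e->in_use = false;
			purged++;
		}
	}
	return purged;
}
```

Shortening the interval makes inactive addresses disappear sooner, which is why a hardware switch doing its own ageing needs to be told about the configured value.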

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 607b5f4..e3b0c45 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1053,7 +1053,16 @@ typedef u16 (*select_queue_fallback_t)(struct net_device 
*dev,
  * This function is used to pass protocol port error state information
  * to the switch driver. The switch driver can react to the proto_down
  *  by doing a phys down on the associated switch port.
- *
+ * int (*ndo_bridge_setageing)(const struct net_device *dev,
+ *int ageing_interval);
+ * Called to set FDB aging interval for a given bridge device.
+ * int (*ndo_bridge_getageing_nl)(struct sk_buff *skb,
+ * const struct net_device *dev,
+ * struct netlink_callback *cb);
+ * Called to return the ageing interval for the given bridge device,
+ * in a format suitable for netlink messaging.
+ * int (*ndo_bridge_getageing)(const struct net_device *dev);
+ * Called to retrieve the ageing interval for the given bridge device.
  */
 struct net_device_ops {
int (*ndo_init)(struct net_device *dev);
@@ -1226,6 +1235,13 @@ struct net_device_ops {
int (*ndo_get_iflink)(const struct net_device *dev);
int (*ndo_change_proto_down)(struct net_device *dev,
 bool proto_down);
+   int (*ndo_bridge_setageing)(const struct net_device *dev,
+   int ageing_interval);
+   int (*ndo_bridge_getageing_nl)(struct sk_buff *skb,
+  const struct net_device *dev,
+  struct netlink_callback *cb);
+
+   int (*ndo_bridge_getageing)(const struct net_device *dev);
 };
 
 /**
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 89da893..7186fea 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -129,6 +129,10 @@ int switchdev_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr);
 int switchdev_port_attr_set(struct net_device *dev,
struct switchdev_attr *attr);
+int netdev_switch_ageing_set(struct net_device *dev, int ageing_interval);
+int netdev_switch_ageing_get(struct sk_buff *skb,
+const struct net_device *dev,
+struct netlink_callback *cb);
 int switchdev_port_obj_add(struct net_device *dev, struct switchdev_obj *obj);
 int switchdev_port_obj_del(struct net_device *dev, struct switchdev_obj *obj);
 int switchdev_port_obj_dump(struct net_device *dev, struct switchdev_obj *obj);
@@ -163,6 +167,17 @@ void switchdev_port_fwd_mark_set(struct net_device *dev,
 
 #else
 
+static inline int netdev_switch_ageing_set(struct net_device *dev,
+   int ageing_interval)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline int netdev_switch_ageing_get(struct net_device *dev)
+{
+   return -EOPNOTSUPP;
+}
+
 static inline int switchdev_port_attr_get(struct net_device *dev,
  struct switchdev_attr *attr)
 {
diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 3635b77..a32ab4d 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -199,4 +199,23 @@ enum {
 };
 #define MDBA_SET_ENTRY_MAX (__MDBA_SET_ENTRY_MAX - 1)
 
+struct admsg {
+   __u8 adm_family;
+   __u8 adm_pad1;
+   __u16 adm_pad2;
+   __s32 adm_ifindex;
+   __u16 adm_ageing_interval;
+};
+
+/* The value of this macro is based on the value recommended by IEEE
+ * standard 802.1d.
+ */
+#define MIN_AGEING_INTERVAL_SECS (10)
+
+/* The value of DEFAULT_AGEING_INTERVAL_SECS is the default ageing
+ * interval that was used in br_device.c.  This default value is also
+ * recommended by IEEE Standard 802.1d.
+ */
+#define DEFAULT_AGEING_INTERVAL_SECS (300)
+
 #endif /* _UAPI_LINUX_IF_BRIDGE_H */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 47d24cb..9321818 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -139,6 +139,13 @@ enum {
RTM_GETNSID = 90,
 #define RTM_GETNSID RTM_GETNSID
 
+   RTM_SETAGEING = 92,
+#define RTM_SETAGEING RTM_SETAGEING
+   RTM_SETDEFAULTAGEING = 93,
+#define 

Re: [PATCH] bridge: Enable configuration of ageing interval for bridges and switch devices.

2015-08-14 Thread roopa

On 8/13/15, 11:23 PM, Premkumar Jonnala wrote:

Bridge devices have an ageing interval used to age out MAC addresses
from the FDB.  This ageing interval was not configurable.

Enable netlink based configuration of ageing interval for bridges and
switch devices.  The ageing interval changes the timer used to purge
inactive FDB entries in bridges.  The ageing interval config is
propagated to switch devices, so that platform or hardware based
ageing works according to configuration.

Signed-off-by: Premkumar Jonnala pjonn...@broadcom.com


How is this different from netlink attribute IFLA_BR_AGEING_TIME ?


---

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 607b5f4..e3b0c45 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1053,7 +1053,16 @@ typedef u16 (*select_queue_fallback_t)(struct net_device 
*dev,
   *This function is used to pass protocol port error state information
   *to the switch driver. The switch driver can react to the proto_down
   *  by doing a phys down on the associated switch port.
- *
+ * int (*ndo_bridge_setageing)(const struct net_device *dev,
+ *int ageing_interval);
+ * Called to set FDB aging interval for a given bridge device.
+ * int (*ndo_bridge_getageing_nl)(struct sk_buff *skb,
+ * const struct net_device *dev,
+ * struct netlink_callback *cb);
+ * Called to return the ageing interval for the given bridge device,
+ * in a format suitable for netlink messaging.
+ * int (*ndo_bridge_getageing)(const struct net_device *dev);
+ * Called to retrieve the ageing interval for the given bridge device.
   */
  struct net_device_ops {
int (*ndo_init)(struct net_device *dev);
@@ -1226,6 +1235,13 @@ struct net_device_ops {
int (*ndo_get_iflink)(const struct net_device *dev);
int (*ndo_change_proto_down)(struct net_device *dev,
 bool proto_down);
+   int (*ndo_bridge_setageing)(const struct net_device *dev,
+   int ageing_interval);
+   int (*ndo_bridge_getageing_nl)(struct sk_buff *skb,
+  const struct net_device *dev,
+  struct netlink_callback *cb);
+
+   int (*ndo_bridge_getageing)(const struct net_device *dev);
  };
  

you cannot add new ndo's for each of these. It should be covered as part of
existing br_link_ops


  /**
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 89da893..7186fea 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -129,6 +129,10 @@ int switchdev_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr);
  int switchdev_port_attr_set(struct net_device *dev,
struct switchdev_attr *attr);
+int netdev_switch_ageing_set(struct net_device *dev, int ageing_interval);
+int netdev_switch_ageing_get(struct sk_buff *skb,
+const struct net_device *dev,
+struct netlink_callback *cb);
  int switchdev_port_obj_add(struct net_device *dev, struct switchdev_obj *obj);
  int switchdev_port_obj_del(struct net_device *dev, struct switchdev_obj *obj);
  int switchdev_port_obj_dump(struct net_device *dev, struct switchdev_obj 
*obj);
@@ -163,6 +167,17 @@ void switchdev_port_fwd_mark_set(struct net_device *dev,
  
  #else
  
+static inline int netdev_switch_ageing_set(struct net_device *dev,

+   int ageing_interval)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline int netdev_switch_ageing_get(struct net_device *dev)
+{
+   return -EOPNOTSUPP;
+}
+
  static inline int switchdev_port_attr_get(struct net_device *dev,
  struct switchdev_attr *attr)
  {
diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 3635b77..a32ab4d 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -199,4 +199,23 @@ enum {
  };
  #define MDBA_SET_ENTRY_MAX (__MDBA_SET_ENTRY_MAX - 1)
  
+struct admsg {

+   __u8 adm_family;
+   __u8 adm_pad1;
+   __u16 adm_pad2;
+   __s32 adm_ifindex;
+   __u16 adm_ageing_interval;
+};
+
+/* The value of this macro is based on the value recommended by IEEE
+ * standard 802.1d.
+ */
+#define MIN_AGEING_INTERVAL_SECS (10)
+
+/* The value of DEFAULT_AGEING_INTERVAL_SECS is the default ageing
+ * interval that was used in br_device.c.  This default value is also
+ * recommended by IEEE Standard 802.1d.
+ */
+#define DEFAULT_AGEING_INTERVAL_SECS (300)
+
  #endif /* _UAPI_LINUX_IF_BRIDGE_H */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 47d24cb..9321818 100644
--- a/include/uapi/linux/rtnetlink.h
+++ 

Re: [PATCH net-next 1/3] lwt: Add support to redirect dst.input

2015-08-14 Thread roopa

On 8/13/15, 9:54 AM, Tom Herbert wrote:

This patch adds the capability to redirect dst input in the same way
that dst output is redirected by LWT.

Also, save the original dst.input and dst.out when setting up
lwtunnel redirection. These can be called by the client as a pass-
through.

Signed-off-by: Tom Herbert t...@herbertland.com
-

LGTM. Acked-by: Roopa Prabhu ro...@cumulusnetworks.com

thanks,
Roopa



Re: [PATCH] iwlwifi: out-of-bounds access in iwl_init_sband_channels

2015-08-14 Thread Kalle Valo
Adrien Schildknecht adrien+...@schischi.me writes:

 Hi,

 On 08/14/2015 03:36 AM, Adrien Schildknecht wrote:
  Both loops of this function compare data from the 'chan' array and
  then check if the index is valid.
  
  The 2 conditions should be inverted to avoid an out-of-bounds
  access.
  
 
 Was that found by a static analyzer or any other automated tool, or
 was that the result of your very careful review?

 The error has been reported by KASan:
 ==
 BUG: KASan: out of bounds access in iwl_init_sband_channels+0x207/0x260 
 [iwlwifi] at addr 8800c2d0aac8
 Read of size 4 by task modprobe/329
 ==

Always try to add information like this to the commit log, it's very
useful.

-- 
Kalle Valo


[PATCH 1/1] Revert net: fec: Ensure clocks are enabled while using mdio bus

2015-08-14 Thread Peter Chen
It causes the i.mx6sx sdb board to hang during boot when using nfsroot
at v4.2-rc6.

This reverts commit 8fff755e9f8d0f70a595e79f248695ce6aef5cc3.

Cc: netdev@vger.kernel.org
Cc: Fugang Duan b38...@freescale.com
Cc: shawn@linaro.org
Cc: fabio.este...@freescale.com
Cc: tyler.ba...@linaro.org
Cc: Lucas Stach l.st...@pengutronix.de
Cc: Andrew Lunn and...@lunn.ch
Signed-off-by: Peter Chen peter.c...@freescale.com
---

According to Fugang Duan, the i.mx series has different clock control
sequence among SoCs, this patch may only consider certain SoCs.

 drivers/net/ethernet/freescale/fec_main.c | 89 +--
 1 file changed, 13 insertions(+), 76 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c 
b/drivers/net/ethernet/freescale/fec_main.c
index 32e3807c..5e8b837 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -24,7 +24,6 @@
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/string.h>
-#include <linux/pm_runtime.h>
 #include <linux/ptrace.h>
 #include <linux/errno.h>
 #include <linux/ioport.h>
@@ -78,7 +77,6 @@ static void fec_enet_itr_coal_init(struct net_device *ndev);
 #define FEC_ENET_RAEM_V	0x8
 #define FEC_ENET_RAFL_V	0x8
 #define FEC_ENET_OPD_V	0xFFF0
-#define FEC_MDIO_PM_TIMEOUT  100 /* ms */
 
 static struct platform_device_id fec_devtype[] = {
{
@@ -1769,13 +1767,7 @@ static void fec_enet_adjust_link(struct net_device *ndev)
 static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 {
 	struct fec_enet_private *fep = bus->priv;
-	struct device *dev = &fep->pdev->dev;
 	unsigned long time_left;
-	int ret = 0;
-
-	ret = pm_runtime_get_sync(dev);
-	if (IS_ERR_VALUE(ret))
-		return ret;
 
 	fep->mii_timeout = 0;
 	init_completion(&fep->mdio_done);
@@ -1791,30 +1783,18 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 	if (time_left == 0) {
 		fep->mii_timeout = 1;
 		netdev_err(fep->netdev, "MDIO read timeout\n");
-		ret = -ETIMEDOUT;
-		goto out;
+		return -ETIMEDOUT;
 	}
 
-	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
-
-out:
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-
-	return ret;
+	/* return value */
+	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
 }
 
 static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
   u16 value)
 {
 	struct fec_enet_private *fep = bus->priv;
-	struct device *dev = &fep->pdev->dev;
 	unsigned long time_left;
-	int ret = 0;
-
-	ret = pm_runtime_get_sync(dev);
-	if (IS_ERR_VALUE(ret))
-		return ret;
 
 	fep->mii_timeout = 0;
 	init_completion(&fep->mdio_done);
@@ -1831,13 +1811,10 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
 	if (time_left == 0) {
 		fep->mii_timeout = 1;
 		netdev_err(fep->netdev, "MDIO write timeout\n");
-		ret  = -ETIMEDOUT;
+		return -ETIMEDOUT;
 	}
 
-	pm_runtime_mark_last_busy(dev);
-	pm_runtime_put_autosuspend(dev);
-
-	return ret;
+	return 0;
 }
 
 static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
@@ -1849,6 +1826,9 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
 		ret = clk_prepare_enable(fep->clk_ahb);
 		if (ret)
 			return ret;
+		ret = clk_prepare_enable(fep->clk_ipg);
+		if (ret)
+			goto failed_clk_ipg;
 		if (fep->clk_enet_out) {
 			ret = clk_prepare_enable(fep->clk_enet_out);
 			if (ret)
@@ -1872,6 +1852,7 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
 		}
 	} else {
 		clk_disable_unprepare(fep->clk_ahb);
+		clk_disable_unprepare(fep->clk_ipg);
 		if (fep->clk_enet_out)
 			clk_disable_unprepare(fep->clk_enet_out);
 		if (fep->clk_ptp) {
@@ -1893,6 +1874,8 @@ failed_clk_ptp:
 	if (fep->clk_enet_out)
 		clk_disable_unprepare(fep->clk_enet_out);
 failed_clk_enet_out:
+	clk_disable_unprepare(fep->clk_ipg);
+failed_clk_ipg:
 	clk_disable_unprepare(fep->clk_ahb);
 
 	return ret;
@@ -2864,14 +2847,10 @@ fec_enet_open(struct net_device *ndev)
struct fec_enet_private *fep = netdev_priv(ndev);
int ret;
 
-	ret = pm_runtime_get_sync(&fep->pdev->dev);
-	if (IS_ERR_VALUE(ret))
-		return ret;
-
 	pinctrl_pm_select_default_state(&fep->pdev->dev);
ret = fec_enet_clk_enable(ndev, true);
if (ret)
-   goto clk_enable;
+   return ret;
 
/* 

Re: [PATCH 1/2] average: provide macro to create static EWMA

2015-08-14 Thread Johannes Berg
On Thu, 2015-08-13 at 17:26 -0700, David Miller wrote:
 From: Johannes Berg johan...@sipsolutions.net
 Date: Thu, 13 Aug 2015 11:11:48 +0200
 
  From: Johannes Berg johannes.b...@intel.com
  
  Having the EWMA parameters stored in the runtime struct imposes
  memory requirements for the constant values that could just be
  inlined in the code. This particularly makes sense if there are
  a lot of such structs, for example in mac80211 in the station
  table where each station has a number of these in an array, and
  there can be many stations.
  
  Provide a macro DECLARE_EWMA() that declares the necessary struct
  and inline functions to access it with the parameters hard-coded;
  using this also means the user no longer needs to 'select AVERAGE'
  as it's entirely self-contained.
  
  In the mac80211 case, on x86-64, this actually slightly *reduces*
  code size, while also saving 80 bytes of runtime memory per sta.
  
  Signed-off-by: Johannes Berg johannes.b...@intel.com
  ---
  As the next patch relies on this, I'll take this through my tree
  unless I hear objections.
 
 This looks fine to me.
 

Thanks, I've applied both.

johannes
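
For reference, a minimal user-space sketch of the idea in the quoted commit message: a macro that generates a per-use struct and inline accessors with the averaging parameters baked in as constants, so the struct only has to hold the running value. This is a simplified illustration, not the kernel's DECLARE_EWMA(). The `TOY_DECLARE_EWMA` name, the use of plain division, and the `signal` instance with factor 1024 and weight 8 are all assumptions made for the example.

```c
/* Generates struct ewma_<name> plus init/add/read helpers.
 * _factor scales the stored value to keep fractional precision;
 * _weight controls how slowly old samples decay. Both are compile-time
 * constants, so neither is stored in the struct. */
#define TOY_DECLARE_EWMA(name, _factor, _weight)			\
struct ewma_##name {							\
	unsigned long internal;						\
};									\
static inline void ewma_##name##_init(struct ewma_##name *e)		\
{									\
	e->internal = 0;						\
}									\
static inline unsigned long						\
ewma_##name##_read(const struct ewma_##name *e)				\
{									\
	return e->internal / (_factor);					\
}									\
static inline void ewma_##name##_add(struct ewma_##name *e,		\
				     unsigned long val)			\
{									\
	/* First sample seeds the average; later samples are blended in. */ \
	e->internal = e->internal ?					\
		(e->internal * ((_weight) - 1) + val * (_factor)) /	\
			(_weight) :					\
		val * (_factor);					\
}

/* Hypothetical instance: declares struct ewma_signal and its helpers. */
TOY_DECLARE_EWMA(signal, 1024, 8)
```

With this instance, each `struct ewma_signal` is a single `unsigned long`, which is where the per-station memory saving described above comes from.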


RE: [PATCH 1/1] Revert net: fec: Ensure clocks are enabled while using mdio bus

2015-08-14 Thread Duan Andy
From: Peter Chen peter.c...@freescale.com Sent: Friday, August 14, 2015 1:48 PM
 To: da...@davemloft.net
 Cc: Chen Peter-B29397; netdev@vger.kernel.org; Duan Fugang-B38611;
 shawn@linaro.org; Estevam Fabio-R49496; tyler.ba...@linaro.org; Lucas
 Stach; Andrew Lunn
 Subject: [PATCH 1/1] Revert net: fec: Ensure clocks are enabled while
 using mdio bus
 
 It causes the i.mx6sx sdb board to hang during boot when using nfsroot
 at v4.2-rc6.
 
 This reverts commit 8fff755e9f8d0f70a595e79f248695ce6aef5cc3.
 
 Cc: netdev@vger.kernel.org
 Cc: Fugang Duan b38...@freescale.com
 Cc: shawn@linaro.org
 Cc: fabio.este...@freescale.com
 Cc: tyler.ba...@linaro.org
 Cc: Lucas Stach l.st...@pengutronix.de
 Cc: Andrew Lunn and...@lunn.ch
 Signed-off-by: Peter Chen peter.c...@freescale.com
 ---
 
 According to Fugang Duan, the i.mx series has different clock control
 sequence among SoCs, this patch may only consider certain SoCs.
 
  drivers/net/ethernet/freescale/fec_main.c | 89 +
 --
  1 file changed, 13 insertions(+), 76 deletions(-)
 

I suggest reverting the patch. The current patch doesn't consider
i.MX6sx/i.MX7d... chips.
Since some customers require the MDIO bus to be usable independently of the
MAC itself, I will submit a separate MDIO driver to decouple the MDIO bus
from the MAC driver.

Regards,
Andy


Re: [PATCH v2 4/4] Added getsynctime64() callback

2015-08-14 Thread Richard Cochran
On Thu, Aug 13, 2015 at 09:10:36PM +, Hall, Christopher S wrote:
   + if (!cpu_has_art)
   + return -EOPNOTSUPP;
  
  Perform this check before registration, setting .getsynctime64
  accordingly.
 
 The problem here is that ART initialization doesn't happen until we
 install TSC as a clocksource.  This design is per Thomas'
 suggestion.  That occurs after the driver is loaded (as a module).

So that 'cpu_has_art' actually means 'cpu_has_art_and_has_been_initialized'?

In any case, returning EOPNOTSUPP early on, but OK later seems mean to
me.  If the clocks aren't ready yet, the error should be EBUSY so that
user space knows it can try again.
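
A minimal sketch of the error-code distinction suggested above, written as plain user-space C. The `toy_*` names and the two state flags are hypothetical and are not taken from the patch under review; the sketch only shows "never supported" and "not ready yet" being reported differently.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical clock state: whether the hardware feature exists at all,
 * and whether its initialization has completed. */
struct toy_clock {
	bool has_art;
	bool art_initialized;
};

/* Returns 0 when a synchronized timestamp could be taken,
 * -EOPNOTSUPP when the hardware will never support it, and
 * -EBUSY when it is supported but not ready yet, so the caller can retry. */
static int toy_getsynctime(const struct toy_clock *clk)
{
	if (!clk->has_art)
		return -EOPNOTSUPP;
	if (!clk->art_initialized)
		return -EBUSY;
	return 0;
}
```

A caller that sees -EBUSY can simply try again later, while -EOPNOTSUPP tells it to give up for good.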

Thanks,
Richard


Re: [PATCH] iwlwifi: out-of-bounds access in iwl_init_sband_channels

2015-08-14 Thread Adrien Schildknecht
Hi,

 On 08/14/2015 03:36 AM, Adrien Schildknecht wrote:
  Both loops of this function compare data from the 'chan' array and
  then check if the index is valid.
  
  The 2 conditions should be inverted to avoid an out-of-bounds
  access.
  
 
 Was that found by a static analyzer or any other automated tool, or
 was that the result of your very careful review?

The error has been reported by KASan:
==
BUG: KASan: out of bounds access in iwl_init_sband_channels+0x207/0x260 
[iwlwifi] at addr 8800c2d0aac8
Read of size 4 by task modprobe/329
==
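
For illustration, a minimal sketch of the loop pattern the fix is about: test the index against the array length first, and only then read the element. This is not the iwlwifi code. The `toy_*` types and the counting function are hypothetical and only show the order of the two conditions.

```c
#include <stddef.h>

enum toy_band { TOY_BAND_2GHZ, TOY_BAND_5GHZ };

/* Hypothetical channel descriptor. */
struct toy_channel {
	enum toy_band band;
};

/* Counts the first contiguous run of channels belonging to 'band'.
 * In both loops the bounds check comes first, so chan[idx] is never
 * read when idx == n (short-circuit evaluation of &&). */
static size_t toy_count_band(const struct toy_channel *chan, size_t n,
			     enum toy_band band)
{
	size_t idx = 0, count = 0;

	/* Skip channels of other bands. */
	while (idx < n && chan[idx].band != band)
		idx++;

	/* Count channels of the requested band. */
	while (idx < n && chan[idx].band == band) {
		idx++;
		count++;
	}

	return count;
}
```

Writing the conditions the other way round (`chan[idx].band == band && idx < n`) reads one element past the end whenever the run extends to the last entry, which is the kind of access KASan reports.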

-- 
Adrien Schildknecht


[net-next:master 751/762] DockBook: Warning(include/linux/netdevice.h:1293): Enum value 'IFF_VRF_MASTER' not described in enum 'netdev_priv_flags'

2015-08-14 Thread kbuild test robot
tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   d52736e24fe2e927c26817256f8d1a3c8b5d51a0
commit: 4e3c89920cd3a6cfce22c6f537690747c26128dd [751/762] net: Introduce VRF 
related flags and helpers
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

   Warning(include/linux/skbuff.h:833): No description found for parameter 'sk'
   Warning(net/core/skbuff.c:407): No description found for parameter 'len'
   Warning(net/core/skbuff.c:407): Excess function parameter 'length' description in '__netdev_alloc_skb'
   Warning(net/core/skbuff.c:476): No description found for parameter 'len'
   Warning(net/core/skbuff.c:476): Excess function parameter 'length' description in '__napi_alloc_skb'
   Warning(net/core/gen_stats.c:155): No description found for parameter 'cpu'
   Warning(net/core/gen_estimator.c:212): No description found for parameter 'cpu_bstats'
   Warning(net/core/gen_estimator.c:303): No description found for parameter 'cpu_bstats'
   Warning(net/core/dev.c:2921): No description found for parameter 'sk'
   Warning(net/core/dev.c:3986): No description found for parameter 'sk'
   Warning(net/core/dev.c:6078): No description found for parameter 'len'
   Warning(include/linux/netdevice.h:1293): Enum value 'IFF_XMIT_DST_RELEASE_PERM' not described in enum 'netdev_priv_flags'
   Warning(include/linux/netdevice.h:1293): Enum value 'IFF_IPVLAN_MASTER' not described in enum 'netdev_priv_flags'
   Warning(include/linux/netdevice.h:1293): Enum value 'IFF_IPVLAN_SLAVE' not described in enum 'netdev_priv_flags'
>> Warning(include/linux/netdevice.h:1293): Enum value 'IFF_VRF_MASTER' not described in enum 'netdev_priv_flags'
   Warning(include/linux/netdevice.h:1795): No description found for parameter 'ptype_all'
   Warning(include/linux/netdevice.h:1795): No description found for parameter 'ptype_specific'
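
For context, warnings of the "Enum value ... not described" kind are raised when a kernel-doc comment for an enum lacks an `@NAME:` line for one of its enumerators. A minimal sketch of a fully described enum follows. The `toy_priv_flags` enum and its descriptions are hypothetical and are not the actual netdev_priv_flags documentation.

```c
/**
 * enum toy_priv_flags - hypothetical private device flags
 * @TOY_IFF_IPVLAN_MASTER: device is an IPvlan master
 * @TOY_IFF_IPVLAN_SLAVE: device is an IPvlan slave
 * @TOY_IFF_VRF_MASTER: device is a VRF master
 *
 * Every enumerator below has a matching "@NAME:" line above, which is
 * what the documentation build checks for.
 */
enum toy_priv_flags {
	TOY_IFF_IPVLAN_MASTER	= 1 << 0,
	TOY_IFF_IPVLAN_SLAVE	= 1 << 1,
	TOY_IFF_VRF_MASTER	= 1 << 2,
};
```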

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
#
# Automatically generated file; DO NOT EDIT.
# Linux/i386 4.2.0-rc6 Kernel Configuration
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT=elf32-i386
CONFIG_ARCH_DEFCONFIG=arch/x86/configs/i386_defconfig
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_32_LAZY_GS=y
CONFIG_ARCH_HWEIGHT_CFLAGS=-fcall-saved-ecx -fcall-saved-edx
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=2
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_BROKEN_ON_SMP=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME=(none)
# CONFIG_SYSVIPC is not set
# CONFIG_CROSS_MEMORY_ATTACH is not set
# CONFIG_FHANDLE is not set
# CONFIG_USELIB is not set
CONFIG_HAVE_ARCH_AUDITSYSCALL=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ is not set
# CONFIG_HIGH_RES_TIMERS is not set

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_IRQ_TIME_ACCOUNTING is not set

#
# RCU Subsystem
#
CONFIG_TINY_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
# CONFIG_TASKS_RCU is not set
# CONFIG_RCU_STALL_COMMON is not set
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_EXPEDITE_BOOT is not set
# CONFIG_BUILD_BIN2C is not set
# CONFIG_IKCONFIG is not set
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
# CONFIG_CGROUPS is not set
# CONFIG_CHECKPOINT_RESTORE is not set
# CONFIG_SCHED_AUTOGROUP is not set
# CONFIG_RELAY is 

Re: [PATCH net-next 1/3] lwt: Add support to redirect dst.input

2015-08-14 Thread Robert Shearman

On 13/08/15 17:54, Tom Herbert wrote:

This patch adds the capability to redirect dst input in the same way
that dst output is redirected by LWT.

Also, save the original dst.input and dst.out when setting up
lwtunnel redirection. These can be called by the client as a pass-
through.


The lwt state is refcounted so it can be shared by different dst 
contexts, so is it safe to be storing per-dst state in the lwt state?
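
To spell out the concern, here is a minimal user-space sketch. It is a toy model only: all `toy_*` names are hypothetical, and it assumes the saved original handler lives in the shared, refcounted state object. Whether the patch actually behaves this way is the open question.

```c
#include <stddef.h>

typedef int (*toy_input_fn)(int pkt);

/* Hypothetical shared tunnel state: refcounted, may be used by several dsts. */
struct toy_lwt_state {
	int refcnt;
	toy_input_fn orig_input;	/* one slot, although each dst has its own handler */
};

/* Hypothetical dst entry. */
struct toy_dst {
	toy_input_fn input;
	struct toy_lwt_state *lwt;
};

/* Two different "original" input handlers and the redirected one. */
static int toy_input_a(int pkt) { return pkt + 1; }
static int toy_input_b(int pkt) { return pkt + 2; }
static int toy_lwt_input(int pkt) { return -pkt; }

/* Redirects dst->input and saves the previous handler in the shared state. */
static void toy_redirect(struct toy_dst *dst, struct toy_lwt_state *st)
{
	st->refcnt++;
	st->orig_input = dst->input;	/* overwrites whatever an earlier dst saved */
	dst->lwt = st;
	dst->input = toy_lwt_input;
}
```

If two dsts with different original handlers share one state in this model, the second redirect overwrites the handler saved for the first, so a pass-through for the first dst would call the wrong function.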


Otherwise, it looks good.

Thanks,
Rob


4.1.5 oops in alloc_skb_with_frags

2015-08-14 Thread Ortwin Glück

Hi,

I got a series of (same) Oopses on a fresh 4.1.5 on KDE startup:

Aug 14 08:45:38 gandalf kernel: PGD 0
Aug 14 08:45:38 gandalf kernel: Oops:  [#1] PREEMPT SMP
Aug 14 08:45:38 gandalf kernel: Modules linked in: radeon cfbfillrect cfbimgblt cfbcopyarea fbcon i2c_algo_bit bitblit softcursor font drm_kms_helper ttm drm fb fbdev
Aug 14 08:45:38 gandalf kernel: CPU: 2 PID: 2726 Comm: X Not tainted 4.1.5 #1
Aug 14 08:45:38 gandalf kernel: Hardware name: Apple Inc. iMac11,2/Mac-F2238AC8, BIOS IM112.88Z.0057.B00.1005031455 05/03/10
Aug 14 08:45:38 gandalf kernel: task: 880092e8b020 ti: 880135248000 task.ti: 880135248000
Aug 14 08:45:38 gandalf kernel: RIP: 0010:[811b41a6]  [811b41a6] __kmalloc_track_caller+0x76/0x190
Aug 14 08:45:38 gandalf kernel: RSP: 0018:88013524ba88  EFLAGS: 00010202
Aug 14 08:45:38 gandalf kernel: RAX:  RBX: 8800812e2f00 
RCX: 9042
Aug 14 08:45:38 gandalf kernel: RDX: 903a RSI: 903a 
RDI: 02bf
Aug 14 08:45:38 gandalf kernel: RBP: 88013524bac8 R08: 00018d80 
R09: 0003
Aug 14 08:45:38 gandalf kernel: R10: 7000 R11: 0160 
R12: 02c0
Aug 14 08:45:38 gandalf kernel: R13: 000106d0 R14: 01d6800b 
R15: 880137001780
Aug 14 08:45:38 gandalf kernel: FS:  7f5233ea1880() GS:88013bc8() 
knlGS:

Aug 14 08:45:38 gandalf kernel: CS:  0010 DS:  ES:  CR0: 
80050033
Aug 14 08:45:38 gandalf kernel: CR2: 01d6800b CR3: 00013528b000 
CR4: 06e0
Aug 14 08:45:38 gandalf kernel: Stack:
Aug 14 08:45:38 gandalf kernel:  8801 8164defd 
88013524bae8 8800812e2f00
Aug 14 08:45:38 gandalf kernel:  88013524bb27 04d0 
02c0 
Aug 14 08:45:38 gandalf kernel:  88013524bb08 8164de3c 
880136799300 8800812e2f00
Aug 14 08:45:38 gandalf kernel: Call Trace:
Aug 14 08:45:38 gandalf kernel:  [8164defd] ? __alloc_skb+0x6d/0x1c0
Aug 14 08:45:38 gandalf kernel:  [8164de3c] 
__kmalloc_reserve.isra.43+0x2c/0x80
Aug 14 08:45:38 gandalf kernel:  [8164defd] __alloc_skb+0x6d/0x1c0
Aug 14 08:45:38 gandalf kernel:  [8164e0a7] 
alloc_skb_with_frags+0x57/0x200
Aug 14 08:45:38 gandalf kernel:  [81649e83] ? sock_wfree+0x53/0x60
Aug 14 08:45:38 gandalf kernel:  [81647dd6] 
sock_alloc_send_pskb+0x196/0x240
Aug 14 08:45:38 gandalf kernel:  [816531bf] ? 
skb_copy_datagram_from_iter+0x4f/0x1f0
Aug 14 08:45:38 gandalf kernel:  [8174b55a] 
unix_stream_sendmsg+0x25a/0x3a0
Aug 14 08:45:38 gandalf kernel:  [81644c52] sock_sendmsg+0x12/0x20
Aug 14 08:45:38 gandalf kernel:  [81644cd3] sock_write_iter+0x73/0xd0
Aug 14 08:45:38 gandalf kernel:  [811bc3d4] 
do_iter_readv_writev+0x54/0x70
Aug 14 08:45:38 gandalf kernel:  [811bca96] 
do_readv_writev+0x196/0x230
Aug 14 08:45:38 gandalf kernel:  [811d7f80] ? __fget_light+0x20/0x70
Aug 14 08:45:38 gandalf kernel:  [811d7f0d] ? __fget+0x6d/0xa0
Aug 14 08:45:38 gandalf kernel:  [811bcba4] vfs_writev+0x34/0x50
Aug 14 08:45:38 gandalf kernel:  [811bd875] SyS_writev+0x45/0xd0
Aug 14 08:45:38 gandalf kernel:  [8189f197] 
system_call_fastpath+0x12/0x6a
Aug 14 08:45:38 gandalf kernel: Code: 48 89 c8 65 48 03 05 8a 5f e5 7e 48 8b 70 08 48 39 f2 75 e7 4c 
8b 30 4d 85 f6 0f 84 bc 00 00 00 49 63 47 20 48 8d 4a 08 4d 8b 07 49 8b 1c 06 4c 89 f0 65 49 0f c7 
08 0f 94 c0 84 c0 74 ba 49 63

Aug 14 08:45:38 gandalf kernel:  RSP 88013524ba88
Aug 14 08:45:38 gandalf kernel: CR2: 01d6800b
Aug 14 08:45:38 gandalf kernel: ---[ end trace 3cf0da471519df5e ]---


Re: [PATCH] mm: make page pfmemalloc check more robust

2015-08-14 Thread Vlastimil Babka

On 08/13/2015 04:40 PM, Eric Dumazet wrote:

On Thu, 2015-08-13 at 11:13 +0200, Vlastimil Babka wrote:


Given that this apparently isn't the first case of this localhost issue,
I wonder if network code should just clear skb->pfmemalloc during send
(or maybe just send over localhost). That would probably be easier than
distinguishing the __skb_fill_page_desc() callers for send vs receive.


Would this still be needed after this patch?


Not until another corner case is discovered :) Or something passes a 
genuine pfmemalloc page to a socket (sending contents of some slab 
objects perhaps, where the slab page was allocated as pfmemalloc? Dunno 
if that can happen right now).



It is sad we do not have a SNMP counter to at least count how often we
drop skb because pfmemalloc is set.

I'll provide such a patch.






Re: [PATCH] IGMP: Inhibit reports for local multicast groups

2015-08-14 Thread Andrew Lunn
Hi Philip

So with a bit of poking and prodding, we have a much better
understanding as to why this is O.K. Maybe your next patch can quote
the relevant RFCs and have a much fuller commit message?

Thanks
Andrew


Re: [PATCH 6/6] MIPS: net: BPF: Introduce BPF ASM helpers

2015-08-14 Thread Markos Chandras
On Thu, Aug 13, 2015 at 10:42:46PM +0200, Aurelien Jarno wrote:
 
 This patch relies on R2 instructions, and thus the Linux kernel fails to
 build when targeting non-R2 CPUs. See for example:
 
 https://buildd.debian.org/status/fetch.php?pkg=linux&arch=mipsel&ver=4.2%7Erc6-1%7Eexp1&stamp=143948
 
 -- 
 Aurelien Jarno  GPG: 4096R/1DDD8C9B
 aurel...@aurel32.net http://www.aurel32.net
Hi,

I think Ralf may have a fix for R1 cores but I am not sure about the status of
that patch. Ralf?

-- 
markos


Re: [PATCH] net/fsl: simplify Kconfig dependency list for fsl networking

2015-08-14 Thread Geert Uytterhoeven
On Fri, Aug 14, 2015 at 12:01 AM, Stuart Yoder
stuart.yo...@freescale.com wrote:
 make the list of Kconfig dependencies for Freescale
 networking more general. Simplify to supported
 architectures: ARM, ARM64, PPC, M68K

 Signed-off-by: Stuart Yoder stuart.yo...@freescale.com
 ---
  drivers/net/ethernet/freescale/Kconfig | 4 +---
  1 file changed, 1 insertion(+), 3 deletions(-)

 diff --git a/drivers/net/ethernet/freescale/Kconfig 
 b/drivers/net/ethernet/freescale/Kconfig
 index ff76d4e..70782d7 100644
 --- a/drivers/net/ethernet/freescale/Kconfig
 +++ b/drivers/net/ethernet/freescale/Kconfig
 @@ -5,9 +5,7 @@
  config NET_VENDOR_FREESCALE
 	bool "Freescale devices"
 	default y
 -	depends on FSL_SOC || QUICC_ENGINE || CPM1 || CPM2 || PPC_MPC512x || \
 -		   M523x || M527x || M5272 || M528x || M520x || M532x || \
 -		   ARCH_MXC || ARCH_MXS || (PPC_MPC52xx && PPC_BESTCOMM)
 +	depends on M68K || PPC || ARM || ARM64
 	---help---
 	  If you have a network (Ethernet) card belonging to this class, say Y.

This breaks m68k/allmodconfig:

drivers/net/ethernet/freescale/gianfar.c: In function ‘gfar_parse_group’:
drivers/net/ethernet/freescale/gianfar.c:684: error: ‘NO_IRQ’
undeclared (first use in this function)
drivers/net/ethernet/freescale/gianfar.c:684: error: (Each undeclared
identifier is reported only once
drivers/net/ethernet/freescale/gianfar.c:684: error: for each function
it appears in.)

P.S. Hint: Would have been caught earlier if the NET_VENDOR_* symbol
 had || COMPILE_TEST among its dependencies.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say programmer or something like that.
-- Linus Torvalds


Re: [PATCH 1/1] Revert net: fec: Ensure clocks are enabled while using mdio bus

2015-08-14 Thread Lucas Stach
On Friday, 14.08.2015 at 08:25 +, Peter Chen wrote:
  
  
  On Friday, 14.08.2015 at 13:47 +0800, Peter Chen wrote:
   It causes the i.mx6sx sdb board hang when using nfsroot during boots
   up at v4.2-rc6.
  
   This reverts commit 8fff755e9f8d0f70a595e79f248695ce6aef5cc3.
  
   Cc: netdev@vger.kernel.org
   Cc: Fugang Duan b38...@freescale.com
   Cc: shawn@linaro.org
   Cc: fabio.este...@freescale.com
   Cc: tyler.ba...@linaro.org
   Cc: Lucas Stach l.st...@pengutronix.de
   Cc: Andrew Lunn and...@lunn.ch
   Signed-off-by: Peter Chen peter.c...@freescale.com
   ---
  
   According to Fugang Duan, the i.mx series has different clock control
   sequence among SoCs, this patch may only consider certain SoCs.
  
  Sorry, but NACK.
  
  Please test current mainline (what will become v4.2-rc7). There is already a
  patch in that fixes i.MX27 and probably fixes the same problem on i.MX6SX.
  
 
 Could you point me to the commit, and to the tree it is in?
 
Mainline, so Linus Torvalds' tree.

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=14d2b7c1a96ef37eb571599c73d4a1a606b964d6

Regards,
Lucas

 Peter
 
drivers/net/ethernet/freescale/fec_main.c | 89
   +--
1 file changed, 13 insertions(+), 76 deletions(-)
  
   diff --git a/drivers/net/ethernet/freescale/fec_main.c
   b/drivers/net/ethernet/freescale/fec_main.c
   index 32e3807c..5e8b837 100644
   --- a/drivers/net/ethernet/freescale/fec_main.c
   +++ b/drivers/net/ethernet/freescale/fec_main.c
   @@ -24,7 +24,6 @@
#include linux/module.h
#include linux/kernel.h
#include linux/string.h
   -#include linux/pm_runtime.h
#include linux/ptrace.h
#include linux/errno.h
#include linux/ioport.h
   @@ -78,7 +77,6 @@ static void fec_enet_itr_coal_init(struct net_device
  *ndev);
#define FEC_ENET_RAEM_V  0x8
#define FEC_ENET_RAFL_V  0x8
#define FEC_ENET_OPD_V   0xFFF0
   -#define FEC_MDIO_PM_TIMEOUT  100 /* ms */
  
static struct platform_device_id fec_devtype[] = {
 {
   @@ -1769,13 +1767,7 @@ static void fec_enet_adjust_link(struct
   net_device *ndev)  static int fec_enet_mdio_read(struct mii_bus *bus,
   int mii_id, int regnum)  {
 struct fec_enet_private *fep = bus-priv;
   - struct device *dev = fep-pdev-dev;
 unsigned long time_left;
   - int ret = 0;
   -
   - ret = pm_runtime_get_sync(dev);
   - if (IS_ERR_VALUE(ret))
   - return ret;
  
 fep-mii_timeout = 0;
 init_completion(fep-mdio_done);
   @@ -1791,30 +1783,18 @@ static int fec_enet_mdio_read(struct mii_bus
  *bus, int mii_id, int regnum)
 if (time_left == 0) {
 fep-mii_timeout = 1;
 netdev_err(fep-netdev, MDIO read timeout\n);
   - ret = -ETIMEDOUT;
   - goto out;
   + return -ETIMEDOUT;
 }
  
   - ret = FEC_MMFR_DATA(readl(fep-hwp + FEC_MII_DATA));
   -
   -out:
   - pm_runtime_mark_last_busy(dev);
   - pm_runtime_put_autosuspend(dev);
   -
   - return ret;
   + /* return value */
   + return FEC_MMFR_DATA(readl(fep-hwp + FEC_MII_DATA));
}
  
static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int 
   regnum,
u16 value)
{
 struct fec_enet_private *fep = bus-priv;
   - struct device *dev = fep-pdev-dev;
 unsigned long time_left;
   - int ret = 0;
   -
   - ret = pm_runtime_get_sync(dev);
   - if (IS_ERR_VALUE(ret))
   - return ret;
  
 fep-mii_timeout = 0;
 init_completion(fep-mdio_done);
   @@ -1831,13 +1811,10 @@ static int fec_enet_mdio_write(struct mii_bus
  *bus, int mii_id, int regnum,
 if (time_left == 0) {
 fep-mii_timeout = 1;
 netdev_err(fep-netdev, MDIO write timeout\n);
   - ret  = -ETIMEDOUT;
   + return -ETIMEDOUT;
 }
  
   - pm_runtime_mark_last_busy(dev);
   - pm_runtime_put_autosuspend(dev);
   -
   - return ret;
   + return 0;
}
  
static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
   @@ -1849,6 +1826,9 @@ static int fec_enet_clk_enable(struct net_device
  *ndev, bool enable)
 ret = clk_prepare_enable(fep-clk_ahb);
 if (ret)
 return ret;
   + ret = clk_prepare_enable(fep-clk_ipg);
   + if (ret)
   + goto failed_clk_ipg;
 if (fep-clk_enet_out) {
 ret = clk_prepare_enable(fep-clk_enet_out);
 if (ret)
   @@ -1872,6 +1852,7 @@ static int fec_enet_clk_enable(struct net_device
  *ndev, bool enable)
 }
 } else {
 clk_disable_unprepare(fep-clk_ahb);
   + clk_disable_unprepare(fep-clk_ipg);
 if (fep-clk_enet_out)
 clk_disable_unprepare(fep-clk_enet_out);
 if (fep-clk_ptp) {
   @@ -1893,6 +1874,8 @@ failed_clk_ptp:
 if (fep-clk_enet_out)
 clk_disable_unprepare(fep-clk_enet_out);
failed_clk_enet_out:
   + 

[PATCH v2] iwlwifi: out-of-bounds access in iwl_init_sband_channels

2015-08-14 Thread Adrien Schildknecht
KASan error report:
==
BUG: KASan: out of bounds access in iwl_init_sband_channels+0x207/0x260 
[iwlwifi] at addr 8800c2d0aac8
Read of size 4 by task modprobe/329
==

Both loops of this function compare data from the 'chan' array and then
check if the index is valid.

The 2 conditions should be inverted to avoid an out-of-bounds access.

Signed-off-by: Adrien Schildknecht adrien+...@schischi.me
---
 drivers/net/wireless/iwlwifi/iwl-eeprom-parse.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-eeprom-parse.c 
b/drivers/net/wireless/iwlwifi/iwl-eeprom-parse.c
index 21302b6..acc3d18 100644
--- a/drivers/net/wireless/iwlwifi/iwl-eeprom-parse.c
+++ b/drivers/net/wireless/iwlwifi/iwl-eeprom-parse.c
@@ -713,12 +713,12 @@ int iwl_init_sband_channels(struct iwl_nvm_data *data,
 	struct ieee80211_channel *chan = &data->channels[0];
 	int n = 0, idx = 0;
 
-	while (chan->band != band && idx < n_channels)
+	while (idx < n_channels && chan->band != band)
 		chan = &data->channels[++idx];
 
 	sband->channels = &data->channels[idx];
 
-	while (chan->band == band && idx < n_channels) {
+	while (idx < n_channels && chan->band == band) {
 		chan = &data->channels[++idx];
 		n++;
 	}
-- 
2.5.0
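
The reordering in this patch relies on C's short-circuit evaluation of `&&`: when the index test comes first and fails, the array element is never read. The following is only a minimal userspace sketch of that idea, not the iwlwifi code; `struct channel` and `count_band_channels()` are hypothetical names introduced for illustration, and the sketch indexes the array directly instead of carrying a `chan` pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ieee80211_channel; only the band matters. */
struct channel {
	int band;
};

/*
 * Count the contiguous run of channels that belong to `band` and store the
 * index of the first one in *first. The index is always tested before the
 * element is read, so chans[n_channels] is never dereferenced.
 */
static size_t count_band_channels(const struct channel *chans,
				  size_t n_channels, int band, size_t *first)
{
	size_t idx = 0, n = 0;

	/* Skip channels of other bands; the bounds check comes first. */
	while (idx < n_channels && chans[idx].band != band)
		idx++;

	*first = idx;

	/* Count channels of the requested band; the bounds check comes first. */
	while (idx < n_channels && chans[idx].band == band) {
		idx++;
		n++;
	}

	return n;
}
```

With the operands in the original order, the last loop iteration would read one element past the end before noticing that the index is out of range, which matches the access KASan reports above.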



Re: GCOV_PROFILE_ALL breaks BUILD_BUG_ON(!is_power_of_2(8))

2015-08-14 Thread Johannes Berg
On Fri, 2015-08-14 at 11:00 +0200, Michal Kubecek wrote:

  but should I have expected this?
 
 It might have something to do with the fact that is_power_of_2() 
 being an inline function, perhaps with this compiler option it 
 translates to something that can't be used in the context 
 BUILD_BUG_ON() uses it in.

Evidently, yeah.

 There is a BUILD_BUG_ON_NOT_POWER_OF_2() macro you could use.
 

Good point, I'll do that, thanks.

johannes


Re: Linux kernel commit breaks IPMI on iface downing

2015-08-14 Thread Sébastien Bocahu
The BNX2 firmware has already been updated to the latest version
available from Dell.

root@debian:~# ethtool -i eth0 | grep firmware
firmware-version: 6.4.5 bc 5.2.3 NCSI 2.0.11

2015-08-14 3:30 GMT+02:00 Michael Chan mc...@broadcom.com:
 +netdev and Harish who is the current maintainer of bnx2 at qlogic.

 The patch in question effectively just removes the
 bnx2_set_power_state() call during ip link set down.  If there is
 IPMI, the firmware should know the link needs to stay up when the driver
 resets the device during bnx2_close().  This should be a very common
 scenario.  Please provide the firmware versions to Harish with ethtool
 -i.  Perhaps upgrading the firmware can resolve this issue.

 On Thu, 2015-08-13 at 15:47 +0200, Sébastien Bocahu wrote:
 Hi,

 Being unable to install Debian Jessie via IPMI on mainstream Dell R410
 servers that used to be well supported by Debian Wheezy, I tracked the
 problem down to a specific commit in the Linux kernel, specifically in
 the bnx2 driver.
 The issue is that ip link set eth0 down takes the Ethernet part of
 the BMC down (shared NIC for BMC+eth0), cutting off the IPMI session.
 The BMC gets back only after power cycling.

 Hardware: Dell R410 w/ a Broadcom 5716 NIC:
 Ethernet controller: Broadcom Corporation NetXtreme II BCM5716 Gigabit
 Ethernet (rev 20)  )
 Part number: BCM95716C1
 Vendor specific: 6.4.5 (Firmware has been updated to the latest
 version available by Dell's support website)

 I built many kernels to track which change caused the issue and it
 seems that's :
 25bfb1dd4ba3b2d9a49ce9d9b0cd7be1840e15ed (bnx2: Add pci shutdown handler.)

 Before, 'ip link set eth0 down' would just cause 2/3s of packet loss
 but IPMI would still be working afterwards.

 I'm available for more information and/or testing if needed.

 Thanks !








-- 
Sébastien Bocahu
IT infrastructure manager

4, Rue Montrochet - 69002 - Lyon, France

+33 (0)437651704 - Phone
ReportLinker.com


Re: [PATCH 1/1] Revert net: fec: Ensure clocks are enabled while using mdio bus

2015-08-14 Thread Lucas Stach
On Friday, 14.08.2015 at 13:47 +0800, Peter Chen wrote:
 It causes the i.mx6sx sdb board hang when using nfsroot during boots
 up at v4.2-rc6.
 
 This reverts commit 8fff755e9f8d0f70a595e79f248695ce6aef5cc3.
 
 Cc: netdev@vger.kernel.org
 Cc: Fugang Duan b38...@freescale.com
 Cc: shawn@linaro.org
 Cc: fabio.este...@freescale.com
 Cc: tyler.ba...@linaro.org
 Cc: Lucas Stach l.st...@pengutronix.de
 Cc: Andrew Lunn and...@lunn.ch
 Signed-off-by: Peter Chen peter.c...@freescale.com
 ---
 
 According to Fugang Duan, the i.mx series has different clock control
 sequence among SoCs, this patch may only consider certain SoCs.
 
Sorry, but NACK.

Please test current mainline (what will become v4.2-rc7). There is
already a patch in that fixes i.MX27 and probably fixes the same problem
on i.MX6SX.

  drivers/net/ethernet/freescale/fec_main.c | 89 
 +--
  1 file changed, 13 insertions(+), 76 deletions(-)
 
 diff --git a/drivers/net/ethernet/freescale/fec_main.c 
 b/drivers/net/ethernet/freescale/fec_main.c
 index 32e3807c..5e8b837 100644
 --- a/drivers/net/ethernet/freescale/fec_main.c
 +++ b/drivers/net/ethernet/freescale/fec_main.c
 @@ -24,7 +24,6 @@
  #include <linux/module.h>
  #include <linux/kernel.h>
  #include <linux/string.h>
 -#include <linux/pm_runtime.h>
  #include <linux/ptrace.h>
  #include <linux/errno.h>
  #include <linux/ioport.h>
 @@ -78,7 +77,6 @@ static void fec_enet_itr_coal_init(struct net_device *ndev);
  #define FEC_ENET_RAEM_V  0x8
  #define FEC_ENET_RAFL_V  0x8
  #define FEC_ENET_OPD_V   0xFFF0
 -#define FEC_MDIO_PM_TIMEOUT  100 /* ms */
  
  static struct platform_device_id fec_devtype[] = {
   {
 @@ -1769,13 +1767,7 @@ static void fec_enet_adjust_link(struct net_device *ndev)
  static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
  {
  	struct fec_enet_private *fep = bus->priv;
 -	struct device *dev = &fep->pdev->dev;
  	unsigned long time_left;
 -	int ret = 0;
 -
 -	ret = pm_runtime_get_sync(dev);
 -	if (IS_ERR_VALUE(ret))
 -		return ret;
  
  	fep->mii_timeout = 0;
  	init_completion(&fep->mdio_done);
 @@ -1791,30 +1783,18 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
  	if (time_left == 0) {
  		fep->mii_timeout = 1;
  		netdev_err(fep->netdev, "MDIO read timeout\n");
 -		ret = -ETIMEDOUT;
 -		goto out;
 +		return -ETIMEDOUT;
  	}
  
 -	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
 -
 -out:
 -	pm_runtime_mark_last_busy(dev);
 -	pm_runtime_put_autosuspend(dev);
 -
 -	return ret;
 +	/* return value */
 +	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
  }
  
  static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
  			       u16 value)
  {
  	struct fec_enet_private *fep = bus->priv;
 -	struct device *dev = &fep->pdev->dev;
  	unsigned long time_left;
 -	int ret = 0;
 -
 -	ret = pm_runtime_get_sync(dev);
 -	if (IS_ERR_VALUE(ret))
 -		return ret;
  
  	fep->mii_timeout = 0;
  	init_completion(&fep->mdio_done);
 @@ -1831,13 +1811,10 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
  	if (time_left == 0) {
  		fep->mii_timeout = 1;
  		netdev_err(fep->netdev, "MDIO write timeout\n");
 -		ret  = -ETIMEDOUT;
 +		return -ETIMEDOUT;
  	}
  
 -	pm_runtime_mark_last_busy(dev);
 -	pm_runtime_put_autosuspend(dev);
 -
 -	return ret;
 +	return 0;
  }
  
  static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
 @@ -1849,6 +1826,9 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
  		ret = clk_prepare_enable(fep->clk_ahb);
  		if (ret)
  			return ret;
 +		ret = clk_prepare_enable(fep->clk_ipg);
 +		if (ret)
 +			goto failed_clk_ipg;
  		if (fep->clk_enet_out) {
  			ret = clk_prepare_enable(fep->clk_enet_out);
  			if (ret)
 @@ -1872,6 +1852,7 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
  		}
  	} else {
  		clk_disable_unprepare(fep->clk_ahb);
 +		clk_disable_unprepare(fep->clk_ipg);
  		if (fep->clk_enet_out)
  			clk_disable_unprepare(fep->clk_enet_out);
  		if (fep->clk_ptp) {
 @@ -1893,6 +1874,8 @@ failed_clk_ptp:
  	if (fep->clk_enet_out)
  		clk_disable_unprepare(fep->clk_enet_out);
  failed_clk_enet_out:
 +		clk_disable_unprepare(fep->clk_ipg);
 +failed_clk_ipg:
  		clk_disable_unprepare(fep->clk_ahb);
  
  	return ret;
 @@ -2864,14 +2847,10 @@ fec_enet_open(struct net_device *ndev)
  	struct fec_enet_private *fep = netdev_priv(ndev);
  	int ret;
  
 -	ret = pm_runtime_get_sync(&fep->pdev->dev);
 - 

[PATCH net-next]r8169.c: Force transmission when nic refuse to start.

2015-08-14 Thread Corcodel Marian
Force transmission when an Rx interrupt is present while the interface
is up. The NIC is guaranteed to start in full duplex, though not at maximum
speed. When half duplex is set, it works the same as without this patch.
To apply this patch, set --whitespace=warn.

Signed-off-by: Corcodel Marian corcodel.mar...@gmail.com

diff --git a/drivers/net/ethernet/realtek/r8169.c 
b/drivers/net/ethernet/realtek/r8169.c
index eb2d2a4..6882eab 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7470,15 +7470,22 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
 	u16 enable_mask = RTL_EVENT_NAPI | tp->event_slow;
 	int work_done= 0;
 	u16 status;
+        int tx_force = 1;
 
 	status = rtl_get_events(tp);
 	rtl_ack_events(tp, status & ~tp->event_slow);
-
+       if (netif_running(dev)) {
 	if (status & RTL_EVENT_NAPI_RX)
 		work_done = rtl_rx(dev, tp, (u32) budget);
+
 
 	if (status & RTL_EVENT_NAPI_TX)
 		rtl_tx(dev, tp);
+	 else if (tx_force == 1) {
+        mdelay(10);
+        rtl_tx(dev, tp);
+	 }
+}
 
 	if (status & tp->event_slow) {
 		enable_mask &= ~tp->event_slow;
-- 
2.1.4



Re: [PATCH v2] net: phy: workaround for buggy cable detection by LAN8700 after cable plugging

2015-08-14 Thread Igor Plyatov

Dear Michael,


Hi Igor,

On Thursday, 13 August 2015, 22:18:34, you wrote:

 * Due to a HW bug, the LAN8700 sometimes does not detect the presence of
   energy in the Ethernet cable in Energy Detect Power-Down mode (e.g. while
   the EDPWRDOWN bit is set, the ENERGYON bit is sometimes not asserted).
   This is a common bug of the LAN87xx family of PHY chips.

Is there any official errata sheet for this PHY family? How do you know that
this is a common HW bug?



The LAN8700, LAN8710, LAN8720 is a product of the SMSC company. 
Microchip acquired SMSC in August 2012.


The LAN8700 is a legacy product for Microchip and they will not update 
anything about it. So, even if Microchip know about HW bug, then there 
is no chance to have Errata sheet or any new documents about LAN8700.


I think same history is for LAN8710/LAN8720 even if they are not marked 
as legacy. They are SMSC products.


The workarounds for same issue in LAN8710/LAN8720 was committed by:
 * Marek Vasut ma...@denx.de as b629820d18fa65cc598390e4b9712fd5f83ee693.
 * Patrick Trantham patrick.trant...@fuel7.com as 
4223dbffed9f89596177ff2b256ef3258b20fa46.


Me too, I think that this family has some problems with this mode; however,
without hard evidence, I would put it more softly.



I discovered this bug simply by monitoring the data going to and from the
MDIO registers of the LAN8700.
The HW issue is proven beyond doubt by the occasional absence of the ENERGYON
bit when the cable is plugged in.

Sometimes it takes 2-20 attempts to catch this issue.

The configuration of CPU pins, responsible for the MDIO interface, was 
checked carefully by oscilloscope and they are fine (no spikes, no 
garbage, good shape of edges).


 * lan87xx_read_status() was improved to acquire the ENERGYON bit. Its
   previous algorithm is still not 100 % reliable and sometimes misses
   cable plugging.



 Signed-off-by: Igor Plyatov plya...@gmail.com

 ---

 drivers/net/phy/smsc.c | 15 ---

 1 file changed, 12 insertions(+), 3 deletions(-)



 diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c

 index c0f6479..8559ff1 100644

 --- a/drivers/net/phy/smsc.c

 +++ b/drivers/net/phy/smsc.c

 @@ -104,6 +104,7 @@ static int lan911x_config_init(struct phy_device *phydev)
  static int lan87xx_read_status(struct phy_device *phydev)
  {
  	int err = genphy_read_status(phydev);
 +	int i;

  	if (!phydev->link) {
  		/* Disable EDPD to wake up PHY */
 @@ -116,8 +117,16 @@ static int lan87xx_read_status(struct phy_device *phydev)
  		if (rc < 0)
  			return rc;

 -		/* Sleep 64 ms to allow ~5 link test pulses to be sent */
 -		msleep(64);
 +		/* Wait max 640 ms to detect energy */

Why 640ms and not e.g. 650ms?

I'm no PHY expert, but this looks like an ugly workaround.



This value was chosen after much trial and error. It allows cable plugging
to be detected 100 % of the time.

Ugly or not, it works and is reliable.

Maybe it would be better to avoid this power saving mode altogether when it
is not reliable, but these are just my 2 cents. :-)



Power saving mode saves around 220 mW of power drawn from the power supply
when the Ethernet cable is not plugged in.

This is a good value for embedded devices.
Better to keep power save mode on.

Anyway, I guess you should also update the explanation on top of the
function to reflect your new approach.



I propose the following comment for lan87xx_read_status():
/*
 * The LAN87xx suffers from rare absence of the ENERGYON bit when the Ethernet
 * cable is plugged in while the LAN87xx is in Energy Detect Power-Down mode.
 * This leads to unstable detection of Ethernet cable plugging.
 * This workaround disables Energy Detect Power-Down mode and waits for a
 * response on link pulses to detect the presence of a plugged Ethernet cable.
 * Energy Detect Power-Down mode is enabled again at the end of the procedure
 * to save approximately 220 mW of power if the cable is unplugged.
 */


 +		for (i = 0; i < 64; i++) {
 +			/* Sleep to allow link test pulses to be sent */
 +			msleep(10);
 +			rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
 +			if (rc < 0)
 +				return rc;
 +			if (rc & MII_LAN83C185_ENERGYON)
 +				break;
 +		};



 /* Re-enable EDPD */

 rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);

 @@ -191,7 +200,7 @@ static struct phy_driver smsc_phy_driver[] = {



 /* basic functions */

 .config_aneg = genphy_config_aneg,

 - .read_status = genphy_read_status,

 + .read_status = lan87xx_read_status,

This one makes sense, since I really guess that the whole PHY family behaves
very similarly.

But this change alone does not solve your problem, right?



Yes, using the unmodified lan87xx_read_status() only reduces the number of
false cable detections; it does not resolve the issue completely.



 .config_init = smsc_phy_config_init,

 .soft_reset = smsc_phy_reset,





Regards,

Michael



Best wishes.

--
Igor Plyatov
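
The patch discussed in this thread replaces a single fixed sleep with a bounded polling loop: up to 64 reads spaced 10 ms apart, i.e. at most 640 ms, stopping early once the status bit appears. The following is a minimal userspace sketch of that pattern only, not the smsc.c code. All names in it (`poll_for_bit`, `fake_phy`, `fake_read`, `no_delay`, `ENERGY_BIT`) are hypothetical, and the fake reader stands in for a real register read such as phy_read().

```c
/* Hypothetical status bit, standing in for an ENERGYON-style flag. */
#define ENERGY_BIT 0x2

/* Hypothetical register reader: returns the value, or a negative error. */
typedef int (*read_reg_fn)(void *ctx);
/* Hypothetical delay hook, so the sketch does not depend on a real sleep. */
typedef void (*delay_fn)(unsigned int ms);

/*
 * Poll until (value & mask) is non-zero, for at most max_polls reads spaced
 * interval_ms apart. Returns 1 if the bit was seen, 0 on timeout, or the
 * negative error returned by the reader.
 */
static int poll_for_bit(read_reg_fn read_reg, void *ctx, int mask,
			unsigned int max_polls, unsigned int interval_ms,
			delay_fn delay)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		int rc;

		delay(interval_ms);
		rc = read_reg(ctx);
		if (rc < 0)
			return rc;
		if (rc & mask)
			return 1;
	}

	return 0;
}

/* Test double: the bit first appears on read number `ready_after`. */
struct fake_phy {
	int ready_after;
	int reads;
};

static int fake_read(void *ctx)
{
	struct fake_phy *phy = ctx;

	phy->reads++;
	return phy->reads >= phy->ready_after ? ENERGY_BIT : 0;
}

/* Delay stub that returns immediately. */
static void no_delay(unsigned int ms)
{
	(void)ms;
}
```

Calling `poll_for_bit(fake_read, &phy, ENERGY_BIT, 64, 10, no_delay)` mirrors the 64 x 10 ms budget from the patch. Returning a distinct value on timeout lets the caller tell "no energy detected" apart from a read error, which the loop in the patch leaves implicit.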


GCOV_PROFILE_ALL breaks BUILD_BUG_ON(!is_power_of_2(8))

2015-08-14 Thread Johannes Berg
+linux-kernel

 +#define DECLARE_EWMA(name, _factor, _weight)				\
 +	struct ewma_##name {						\
 +		unsigned long internal;					\
 +	};								\
 +	static inline void ewma_##name##_init(struct ewma_##name *e)	\
 +	{								\
 +		BUILD_BUG_ON(!__builtin_constant_p(_factor));		\
 +		BUILD_BUG_ON(!__builtin_constant_p(_weight));		\
 +		BUILD_BUG_ON(!is_power_of_2(_factor));			\
 +		BUILD_BUG_ON(!is_power_of_2(_weight));			\
 

So this seemed fine to me, but for some reason the compiler is saying
the BUILD_BUG_ON(!is_power_of_2(x)) fails, if and only if (!)
CONFIG_GCOV_PROFILE_ALL is enabled, which seems to boil down to the
compiler option -fprofile-arcs.

I'm going to replace this with just the code itself, i.e.

/* both must be a power of 2 */
BUILD_BUG_ON(_factor & (_factor - 1));
BUILD_BUG_ON(_weight & (_weight - 1));

but should I have expected this?

johannes
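
The open-coded test above works because a power of two has exactly one bit set, so `x & (x - 1)` clears that bit and yields zero. As a rough userspace illustration (not the kernel macros), the same check can be written as a pure constant expression with C11 `_Static_assert`, so it does not depend on the compiler folding an inline function call. `IS_POW2_CONST`, `ASSERT_POW2` and `is_pow2` are hypothetical names. Unlike the two-line version above, this sketch also rejects zero.

```c
#include <stdbool.h>

/* Hypothetical helper: non-zero when x is a power of 2 (zero is rejected). */
#define IS_POW2_CONST(x) ((x) != 0 && (((x) & ((x) - 1)) == 0))

/* Fails the build unless x is an integer constant that is a power of 2. */
#define ASSERT_POW2(x) \
	_Static_assert(IS_POW2_CONST(x), #x " must be a power of 2")

/* Both of these compile; ASSERT_POW2(6) would stop the build. */
ASSERT_POW2(8);
ASSERT_POW2(1024);

/* Runtime equivalent, written as an inline helper. */
static inline bool is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}
```

Because the macro form is an integer constant expression by construction, the result does not depend on optimization or instrumentation options, whereas a check routed through an inline function relies on the compiler evaluating the call at build time.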


[PATCH net v2] ppp: fix device unregistration upon netns deletion

2015-08-14 Thread Guillaume Nault
PPP devices may get automatically unregistered when their network
namespace is getting removed. This happens if the ppp control plane
daemon (e.g. pppd) exits while it is the last user of this namespace.

This leads to several races:

  * ppp_exit_net() may destroy the per namespace idr (pn->units_idr)
before all file descriptors were released. Successive ppp_release()
calls may then cleanup PPP devices with ppp_shutdown_interface() and
try to use the already destroyed idr.

  * Automatic device unregistration may also happen before the
ppp_release() call for that device gets executed. Once called on
the file owning the device, ppp_release() will then clean it up and
try to unregister it a second time.

To fix these issues, operations defined in ppp_shutdown_interface() are
moved to the PPP device's ndo_uninit() callback. This allows PPP
devices to be properly cleaned up by unregister_netdev() and friends.
So checking for ppp->owner is now an accurate test to decide if a PPP
device should be unregistered.

Setting ppp->owner is done in ppp_create_interface(), before device
registration, in order to avoid unprotected modification of this field.

Finally, ppp_exit_net() now starts by unregistering all remaining PPP
devices to ensure that none will get unregistered after the call to
idr_destroy().

Signed-off-by: Guillaume Nault g.na...@alphalink.fr
---
v2: remove unnecessary curly braces in idr_for_each_entry()

 drivers/net/ppp/ppp_generic.c | 78 +++
 1 file changed, 42 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 9d15566..fa8f504 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -269,9 +269,9 @@ static void ppp_ccp_peek(struct ppp *ppp, struct sk_buff 
*skb, int inbound);
 static void ppp_ccp_closed(struct ppp *ppp);
 static struct compressor *find_compressor(int type);
 static void ppp_get_stats(struct ppp *ppp, struct ppp_stats *st);
-static struct ppp *ppp_create_interface(struct net *net, int unit, int *retp);
+static struct ppp *ppp_create_interface(struct net *net, int unit,
+   struct file *file, int *retp);
 static void init_ppp_file(struct ppp_file *pf, int kind);
-static void ppp_shutdown_interface(struct ppp *ppp);
 static void ppp_destroy_interface(struct ppp *ppp);
 static struct ppp *ppp_find_unit(struct ppp_net *pn, int unit);
 static struct channel *ppp_find_channel(struct ppp_net *pn, int unit);
@@ -392,8 +392,10 @@ static int ppp_release(struct inode *unused, struct file *file)
 		file->private_data = NULL;
 		if (pf->kind == INTERFACE) {
 			ppp = PF_TO_PPP(pf);
+			rtnl_lock();
 			if (file == ppp->owner)
-				ppp_shutdown_interface(ppp);
+				unregister_netdevice(ppp->dev);
+			rtnl_unlock();
 		}
 		if (atomic_dec_and_test(&pf->refcnt)) {
 			switch (pf->kind) {
@@ -593,8 +595,10 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		mutex_lock(&ppp_mutex);
 		if (pf->kind == INTERFACE) {
 			ppp = PF_TO_PPP(pf);
+			rtnl_lock();
 			if (file == ppp->owner)
-				ppp_shutdown_interface(ppp);
+				unregister_netdevice(ppp->dev);
+			rtnl_unlock();
 		}
 		if (atomic_long_read(&file->f_count) < 2) {
 			ppp_release(NULL, file);
@@ -838,11 +842,10 @@ static int ppp_unattached_ioctl(struct net *net, struct ppp_file *pf,
 		/* Create a new ppp unit */
 		if (get_user(unit, p))
 			break;
-		ppp = ppp_create_interface(net, unit, &err);
+		ppp = ppp_create_interface(net, unit, file, &err);
 		if (!ppp)
 			break;
 		file->private_data = &ppp->file;
-		ppp->owner = file;
 		err = -EFAULT;
 		if (put_user(ppp->file.index, p))
 			break;
@@ -916,6 +919,16 @@ static __net_init int ppp_init_net(struct net *net)
 static __net_exit void ppp_exit_net(struct net *net)
 {
 	struct ppp_net *pn = net_generic(net, ppp_net_id);
+	struct ppp *ppp;
+	LIST_HEAD(list);
+	int id;
+
+	rtnl_lock();
+	idr_for_each_entry(&pn->units_idr, ppp, id)
+		unregister_netdevice_queue(ppp->dev, &list);
+
+	unregister_netdevice_many(&list);
+	rtnl_unlock();
 
 	idr_destroy(&pn->units_idr);
 }
@@ -1088,8 +1101,28 @@ static int ppp_dev_init(struct net_device *dev)
 	return 0;
 }
 
+static void ppp_dev_uninit(struct net_device *dev)
+{
+	struct ppp *ppp = netdev_priv(dev);
+  

Re: [PATCH 0/2] net: introduce IFF_NO_QUEUE as successor of zero tx_queue_len

2015-08-14 Thread Phil Sutter
On Thu, Aug 13, 2015 at 12:11:57PM -0700, Stephen Hemminger wrote:
 On Thu, 13 Aug 2015 20:40:37 +0200
 Jesper Dangaard Brouer bro...@redhat.com wrote:
 
  On Thu, 13 Aug 2015 10:49:50 -0700
  Stephen Hemminger step...@networkplumber.org wrote:
  
   On Thu, 13 Aug 2015 19:01:05 +0200
   Phil Sutter p...@nwl.cc wrote:
   
    Up to now, drivers being aware of the above applying to them set
    dev->tx_queue_len to zero to indicate no qdisc should be attached to the
    interface they drive and the kernel reacts upon this by assigning the noop
    qdisc instead of the default pfifo_fast. This implicit agreement though
    leads to an inconvenient situation once a user tries to attach a real
    qdisc to these devices, as the formerly special tx_queue_len value
    becomes a regular one,
   
   So this is a workaround for user ignorance by introducing kernel API
   complexity.
   Before user sets qdisc, why don't they set tx queue length?
  
  Please don't insist on keeping this broken interface... how should users
  know that BEFORE adding a qdisc they MUST change the _device_ tx queue
  length (not zero).
 
 Before setting any qdisc, they should set queue length anyway.

Probably, yes. But if they don't, it depends on the interface driver
whether they're screwed or not. In my opinion, this inconsistency alone
is worth fixing.

   Getting back to the original state, they MUST
  change the device tx queue len back to zero BEFORE deleting the qdisc,
  such that when assigning the default queue qdisc the system detects
  this device can work without a qdisc.  Changing the tx queue len to
  zero after the qdisc is deleted will have no effect.
  
  Listen to the description, that interface is broken. The kernel really
  needs to hide these details from userspace.
  
  It even allows you to misconfigure the kernel, by tricking the kernel
  into assigning noqueue to physical devices that really need a queue.
 
 But adding a flag risks breaking external scripts.

Could you please elaborate on this? As far as I can tell, introducing a
separate flag is the only solution *not* breaking existing scripts. So
if you see the rub, I would like to know where exactly it is.

Cheers, Phil
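
The two conventions discussed in this thread can be contrasted in a small
userspace sketch. This is not kernel code: the struct, the enum and the
SKETCH_IFF_NO_QUEUE bit are hypothetical stand-ins chosen for illustration,
and the real default-qdisc selection may differ in detail. The sketch only
shows why a driver-owned flag keeps its meaning when the user changes
tx_queue_len, while the zero-length convention does not.

```c
/* Hypothetical stand-ins for kernel state; all names are illustrative only. */
#define SKETCH_IFF_NO_QUEUE 0x1u

enum sketch_qdisc { SKETCH_QDISC_NOQUEUE, SKETCH_QDISC_PFIFO_FAST };

struct sketch_dev {
	unsigned int priv_flags;	/* property bits owned by the driver */
	unsigned long tx_queue_len;	/* queue length tunable by the user */
};

/* Old convention: a zero queue length doubles as "no qdisc wanted". */
static enum sketch_qdisc default_qdisc_by_len(const struct sketch_dev *dev)
{
	return dev->tx_queue_len == 0 ? SKETCH_QDISC_NOQUEUE
				      : SKETCH_QDISC_PFIFO_FAST;
}

/* Flag-based convention: the driver states the property explicitly. */
static enum sketch_qdisc default_qdisc_by_flag(const struct sketch_dev *dev)
{
	return (dev->priv_flags & SKETCH_IFF_NO_QUEUE) ? SKETCH_QDISC_NOQUEUE
						       : SKETCH_QDISC_PFIFO_FAST;
}
```

With a virtual device whose user raised tx_queue_len to 1000, the
length-based helper no longer reports "no queue", whereas the flag-based
helper still does. Conversely, a physical device whose length was set to
zero is only misclassified by the length-based helper.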


Re: [PATCH net-next 0/2] ppp: implement x-netns support

2015-08-14 Thread Guillaume Nault
On Thu, Aug 13, 2015 at 09:20:04PM -0700, David Miller wrote:
 From: Guillaume Nault g.na...@alphalink.fr
 Date: Thu, 13 Aug 2015 15:28:02 +0200
 
  This series allows PPP devices to reside in a different netns from the
  PPP unit/channels. Packets only cross netns boundaries when they're
  transmitted between the net_device and the PPP unit (units and channels
  always remain in their creation namespace).
  So only PPP units need to handle cross namespace operations. Channels
  and lower layer protocols aren't affected.
  
  Patch #1 is a bug fix for an existing namespace deletion bug and has
  been separately sent to net.
  Patch #2 is the actual x-netns implementation.
 
 Patch #1 needs to be respun with the change I requested.
 
Ok, done.

 And this is not the way to submit things that have dependencies
 upon bug fixes.
 
Will do. I was actually unsure about how to handle this case.

Thanks.


[patch] net: ethernet: micrel: fix an error code

2015-08-14 Thread Dan Carpenter
The dma_mapping_error() function returns true or false.  We should
return -ENOMEM if there is a dma mapping error.

Signed-off-by: Dan Carpenter dan.carpen...@oracle.com

diff --git a/drivers/net/ethernet/micrel/ks8842.c b/drivers/net/ethernet/micrel/ks8842.c
index f78909a..09d2e16 100644
--- a/drivers/net/ethernet/micrel/ks8842.c
+++ b/drivers/net/ethernet/micrel/ks8842.c
@@ -952,9 +952,8 @@ static int ks8842_alloc_dma_bufs(struct net_device *netdev)
 
 	sg_dma_address(&tx_ctl->sg) = dma_map_single(adapter->dev,
 		tx_ctl->buf, DMA_BUFFER_SIZE, DMA_TO_DEVICE);
-	err = dma_mapping_error(adapter->dev,
-		sg_dma_address(&tx_ctl->sg));
-	if (err) {
+	if (dma_mapping_error(adapter->dev, sg_dma_address(&tx_ctl->sg))) {
+		err = -ENOMEM;
 		sg_dma_address(&tx_ctl->sg) = 0;
 		goto err;
 	}
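
The pattern in this fix, translating a success/failure predicate into a
negative errno instead of returning the predicate's raw value, can be
sketched outside the kernel as follows. The helper names and the "handle of
zero means failure" rule are assumptions made for this illustration only;
they do not describe the DMA API.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical predicate that only reports whether a mapping failed. */
static bool sketch_mapping_failed(unsigned long handle)
{
	return handle == 0;
}

/* Return 0 on success or a negative errno, never the raw boolean. */
static int sketch_alloc_buf(unsigned long handle)
{
	if (sketch_mapping_failed(handle))
		return -ENOMEM;

	return 0;
}
```

A caller that checks for a negative return value now sees a proper error
code rather than a positive "true".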


RE: [PATCH] IGMP: Inhibit reports for local multicast groups

2015-08-14 Thread Philip Downey
Sorry for the duplication - I responded in a similar manner before seeing this.

Thanks

Philip

 -Original Message-
 From: Thadeu Lima de Souza Cascardo [mailto:casca...@redhat.com]
 Sent: Thursday, August 13, 2015 7:08 PM
 To: Andrew Lunn
 Cc: Philip Downey; David Miller; kuz...@ms2.inr.ac.ru; jmor...@namei.org;
 yoshf...@linux-ipv6.org; ka...@trash.net; linux-ker...@vger.kernel.org;
 netdev@vger.kernel.org
 Subject: Re: [PATCH] IGMP: Inhibit reports for local multicast groups
 
 On Thu, Aug 13, 2015 at 07:01:37PM +0200, Andrew Lunn wrote:
  On Thu, Aug 13, 2015 at 04:52:32PM +, Philip Downey wrote:
   Hi Andrew
   IGMP snooping is designed to prevent hosts on a local network from
   receiving traffic for a multicast group they have not explicitly joined.
   Link-Local multicast traffic should not have an IGMP client since it is
   reserved for routing protocols.  One would expect that IGMP snooping
   needs to ignore local multicast traffic in the reserved range intended
   for routers since there should be no IGMP client to make join requests.
 
  The point of this patch is that Linux is sending out group membership
  for these addresses, it is acting as a client. What happens with a
  switch which is applying IGMP snooping to link-local multicast groups?
  You turn on this feature, and you no longer get your routing protocol
  messages.
 
  I had a quick look at RFC 3376. The only mention i spotted for not
  sending IGMP messages is:
 
 The all-systems multicast address, 224.0.0.1, is handled as a special
 case.  On all systems -- that is all hosts and routers, including
 multicast routers -- reception of packets destined to the all-systems
 multicast address, from all sources, is permanently enabled on all
 interfaces on which multicast reception is supported.  No IGMP
 messages are ever sent regarding the all-systems multicast address.
 
  IGMP v2 has something similar:
 
 The all-systems group (address 224.0.0.1) is handled as a special
 case.  The host starts in Idle Member state for that group on every
 interface, never transitions to another state, and never sends a
 report for that group.
 
  But i did not find anything which says all other link-local addresses
  don't need member reports. Did i miss something?
 
Andrew
 
 From RFC 4541 (Considerations for Internet Group Management Protocol
 (IGMP) and Multicast Listener Discovery (MLD) Snooping Switches):
 
  2) Packets with a destination IP (DIP) address in the 224.0.0.X range
   which are not IGMP must be forwarded on all ports.
 
   This recommendation is based on the fact that many host systems do
   not send Join IP multicast addresses in this range before sending
   or listening to IP multicast packets.  Furthermore, since the
   224.0.0.X address range is defined as link-local (not to be
   routed), it seems unnecessary to keep the state for each address
   in this range.  Additionally, some routers operate in the
   224.0.0.X address range without issuing IGMP Joins, and these
   applications would break if the switch were to prune them due to
   not having seen a Join Group message from the router.
 
 So, it looks like some hosts and routers out there in the field do not send
 joins for those local addresses. In fact, IPv4 local multicast addresses are
 ignored when Linux bridge multicast snooping adds a new group.
 
 static int br_ip4_multicast_add_group(struct net_bridge *br, ...
   if (ipv4_is_local_multicast(group))
   return 0;
 
 Cascardo.
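
For reference, a test for the 224.0.0.X range quoted from RFC 4541 can be
written as a mask-and-compare on the address. The sketch below is a
userspace approximation with a hypothetical function name; it is meant to
show the idea behind a check such as ipv4_is_local_multicast(), not to
reproduce the kernel helper.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>

/* True if the address (network byte order) lies in 224.0.0.0/24. */
static bool sketch_is_local_multicast(uint32_t addr_be)
{
	return (addr_be & htonl(0xffffff00u)) == htonl(0xe0000000u);
}
```

An address such as 224.0.0.5 falls inside the range, while 224.0.1.1 and
239.1.2.3 are multicast addresses outside of it.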


RE: [PATCH] IGMP: Inhibit reports for local multicast groups

2015-08-14 Thread Philip Downey
Hi Andrew
Answers inline...

 -Original Message-
 From: Andrew Lunn [mailto:and...@lunn.ch]
 Sent: Thursday, August 13, 2015 6:02 PM
 To: Philip Downey
 Cc: David Miller; kuz...@ms2.inr.ac.ru; jmor...@namei.org; yoshfuji@linux-
 ipv6.org; ka...@trash.net; linux-ker...@vger.kernel.org;
 netdev@vger.kernel.org
 Subject: Re: [PATCH] IGMP: Inhibit reports for local multicast groups
 
 On Thu, Aug 13, 2015 at 04:52:32PM +, Philip Downey wrote:
  Hi Andrew
  IGMP snooping is designed to prevent hosts on a local network from
  receiving traffic for a multicast group they have not explicitly joined.
  Link-Local multicast traffic should not have an IGMP client since it is
  reserved for routing protocols.  One would expect that IGMP snooping
  needs to ignore local multicast traffic in the reserved range intended
  for routers since there should be no IGMP client to make join requests.
 
 The point of this patch is that Linux is sending out group membership for
 these addresses, it is acting as a client. What happens with a switch which is
 applying IGMP snooping to link-local multicast groups?
 You turn on this feature, and you no longer get your routing protocol
 messages.

It is expected that link-local multicast is always forwarded by switches;
otherwise routers may not function correctly.

From the relevant RFC:

RFC 4541 IGMP and MLD Snooping Switches Considerations  May 2006


2.1.2. Data Forwarding Rules


   1) Packets with a destination IP address outside 224.0.0.X which are
  not IGMP should be forwarded according to group-based port
  membership tables and must also be forwarded on router ports.

  This is the main IGMP snooping functionality for the data path.
  One approach that an implementation could take would be to
  maintain separate membership and multicast router tables in
  software and then merge these tables into a forwarding cache.

   2) Packets with a destination IP (DIP) address in the 224.0.0.X range
  which are not IGMP must be forwarded on all ports.

  This recommendation is based on the fact that many host systems do
  not send Join IP multicast addresses in this range before sending
  or listening to IP multicast packets.  Furthermore, since the
  224.0.0.X address range is defined as link-local (not to be
  routed), it seems unnecessary to keep the state for each address
  in this range.  Additionally, some routers operate in the
  224.0.0.X address range without issuing IGMP Joins, and these
  applications would break if the switch were to prune them due to
  not having seen a Join Group message from the router.

 
 I had a quick look at RFC 3376. The only mention i spotted for not sending
 IGMP messages is:
 
The all-systems multicast address, 224.0.0.1, is handled as a special
case.  On all systems -- that is all hosts and routers, including
multicast routers -- reception of packets destined to the all-systems
multicast address, from all sources, is permanently enabled on all
interfaces on which multicast reception is supported.  No IGMP
messages are ever sent regarding the all-systems multicast address.
 
 IGMP v2 has something similar:
 
The all-systems group (address 224.0.0.1) is handled as a special
case.  The host starts in Idle Member state for that group on every
interface, never transitions to another state, and never sends a
report for that group.
 
 But i did not find anything which says all other link-local addresses don't
 need member reports. Did i miss something?

No you did not miss anything - that is correct.
However, the RFCs don't really cover the behavior of routers well in some
areas.  Routing protocols which use the 224.0.0.x address space do not need
IGMP, therefore it makes no sense to distribute membership reports for these
groups.  A router which receives an IGMP membership report which includes
groups from this reserved address range will ignore it, and probably generate
debug messages highlighting an invalid address.

Regards

Philip
 
   Andrew


Re: [PATCHv1 net-next 0/5] netlink: mmap: kernel panic and some issues

2015-08-14 Thread Ken-ichirou MATSUZAWA
 Hi,

Thank you for taking your time.
Please let me explain these with code samples on gist.
I can not describe and arrange it well, sorry.
 
normal socket nflog sample:

https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/nflog.c

set iptables

iptables -A INPUT -p icmp --icmp-type echo-request \
-j NFLOG --nflog-group 2 --nflog-threshold 4

Monitor nlmon (e.g. with netsniff-ng), run this sample and
ping -i 0.2 -c 10 from another host. This sample only shows the receive
size and nlmsg_type. The same can be done with an rx mmaped socket.

rx only mmaped nflog sample:

https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/rxring-nflog.c

This sample gets a panic if monitoring nlmon.

panic message:

https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/mmaped_netlink_panic

I think it's because a skb_shared_info is accessed when releasing the
skb, although an mmaped netlink skb does not have a skb_shared_info. I
tried to fix this in patches 1 and 2 by introducing a helper function
which does not access a skb_shared_info.

And I think nm_status should be set to UNUSED when releasing it, so I
also tried to fix that in patch 3.



With both tx/rx mmaped,

both tx/rx mmaped nflog sample:

https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/ring-nflog.c

This sample will not work, since msg->msg_iter.type in
netlink_sendmsg() is set to 1 (WRITE) when this sample calls
sendto(). Patch 4 fixes this by accepting it.



After applying patches 1 and 2, the rx only sample works, but it behaves
differently from the normal one. Patch 5 may fix this.

It also works well with other code of mine, which sets the frame
nm_status to SKIP and passes the frame to worker threads; the worker
threads then set the status to UNUSED, even though the ring becomes full.

That other code may set the UNUSED status in random order, not
sequentially, so it seems I need to check the whole ring.

Thanks,


RE: [PATCH 1/1] Revert net: fec: Ensure clocks are enabled while using mdio bus

2015-08-14 Thread Peter Chen
 
 
 Am Freitag, den 14.08.2015, 13:47 +0800 schrieb Peter Chen:
  It causes the i.mx6sx sdb board hang when using nfsroot during boots
  up at v4.2-rc6.
 
  This reverts commit 8fff755e9f8d0f70a595e79f248695ce6aef5cc3.
 
  Cc: netdev@vger.kernel.org
  Cc: Fugang Duan b38...@freescale.com
  Cc: shawn@linaro.org
  Cc: fabio.este...@freescale.com
  Cc: tyler.ba...@linaro.org
  Cc: Lucas Stach l.st...@pengutronix.de
  Cc: Andrew Lunn and...@lunn.ch
  Signed-off-by: Peter Chen peter.c...@freescale.com
  ---
 
  According to Fugang Duan, the i.mx series has different clock control
  sequence among SoCs, this patch may only consider certain SoCs.
 
 Sorry, but NACK.
 
 Please test current mainline (what will become v4.2-rc7). There is already a
 patch in that fixes i.MX27 and probably fixes the same problem on i.MX6SX.
 

Could you point me to the commit, and which tree it is in?

Peter

   drivers/net/ethernet/freescale/fec_main.c | 89
  +--
   1 file changed, 13 insertions(+), 76 deletions(-)
 
  diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
  index 32e3807c..5e8b837 100644
  --- a/drivers/net/ethernet/freescale/fec_main.c
  +++ b/drivers/net/ethernet/freescale/fec_main.c
  @@ -24,7 +24,6 @@
   #include <linux/module.h>
   #include <linux/kernel.h>
   #include <linux/string.h>
  -#include <linux/pm_runtime.h>
   #include <linux/ptrace.h>
   #include <linux/errno.h>
   #include <linux/ioport.h>
  @@ -78,7 +77,6 @@ static void fec_enet_itr_coal_init(struct net_device *ndev);
   #define FEC_ENET_RAEM_V	0x8
   #define FEC_ENET_RAFL_V	0x8
   #define FEC_ENET_OPD_V	0xFFF0
  -#define FEC_MDIO_PM_TIMEOUT  100 /* ms */
  
   static struct platform_device_id fec_devtype[] = {
   	{
  @@ -1769,13 +1767,7 @@ static void fec_enet_adjust_link(struct net_device *ndev)
   static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
   {
   	struct fec_enet_private *fep = bus->priv;
  -	struct device *dev = &fep->pdev->dev;
   	unsigned long time_left;
  -	int ret = 0;
  -
  -	ret = pm_runtime_get_sync(dev);
  -	if (IS_ERR_VALUE(ret))
  -		return ret;
  
   	fep->mii_timeout = 0;
   	init_completion(&fep->mdio_done);
  @@ -1791,30 +1783,18 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
   	if (time_left == 0) {
   		fep->mii_timeout = 1;
   		netdev_err(fep->netdev, "MDIO read timeout\n");
  -		ret = -ETIMEDOUT;
  -		goto out;
  +		return -ETIMEDOUT;
   	}
  
  -	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
  -
  -out:
  -	pm_runtime_mark_last_busy(dev);
  -	pm_runtime_put_autosuspend(dev);
  -
  -	return ret;
  +	/* return value */
  +	return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
   }
  
   static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
   			   u16 value)
   {
   	struct fec_enet_private *fep = bus->priv;
  -	struct device *dev = &fep->pdev->dev;
   	unsigned long time_left;
  -	int ret = 0;
  -
  -	ret = pm_runtime_get_sync(dev);
  -	if (IS_ERR_VALUE(ret))
  -		return ret;
  
   	fep->mii_timeout = 0;
   	init_completion(&fep->mdio_done);
  @@ -1831,13 +1811,10 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
   	if (time_left == 0) {
   		fep->mii_timeout = 1;
   		netdev_err(fep->netdev, "MDIO write timeout\n");
  -		ret  = -ETIMEDOUT;
  +		return -ETIMEDOUT;
   	}
  
  -	pm_runtime_mark_last_busy(dev);
  -	pm_runtime_put_autosuspend(dev);
  -
  -	return ret;
  +	return 0;
   }
  
   static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
  @@ -1849,6 +1826,9 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
   		ret = clk_prepare_enable(fep->clk_ahb);
   		if (ret)
   			return ret;
  +		ret = clk_prepare_enable(fep->clk_ipg);
  +		if (ret)
  +			goto failed_clk_ipg;
   		if (fep->clk_enet_out) {
   			ret = clk_prepare_enable(fep->clk_enet_out);
   			if (ret)
  @@ -1872,6 +1852,7 @@ static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
   		}
   	} else {
   		clk_disable_unprepare(fep->clk_ahb);
  +		clk_disable_unprepare(fep->clk_ipg);
   		if (fep->clk_enet_out)
   			clk_disable_unprepare(fep->clk_enet_out);
   		if (fep->clk_ptp) {
  @@ -1893,6 +1874,8 @@ failed_clk_ptp:
   	if (fep->clk_enet_out)
   		clk_disable_unprepare(fep->clk_enet_out);
   failed_clk_enet_out:
  +	clk_disable_unprepare(fep->clk_ipg);
  +failed_clk_ipg:
   	clk_disable_unprepare(fep->clk_ahb);
  
   	return ret;
  @@ -2864,14 +2847,10 @@ fec_enet_open(struct net_device *ndev)
   	struct fec_enet_private *fep = netdev_priv(ndev);
   	int ret;
  
  -   

Re: GCOV_PROFILE_ALL breaks BUILD_BUG_ON(!is_power_of_2(8))

2015-08-14 Thread Michal Kubecek
On Fri, Aug 14, 2015 at 10:29:04AM +0200, Johannes Berg wrote:
 +linux-kernel
 
  +#define DECLARE_EWMA(name, _factor, _weight) \
  +	struct ewma_##name { \
  +		unsigned long internal; \
  +	}; \
  +	static inline void ewma_##name##_init(struct ewma_##name *e) \
  +	{ \
  +		BUILD_BUG_ON(!__builtin_constant_p(_factor)); \
  +		BUILD_BUG_ON(!__builtin_constant_p(_weight)); \
  +		BUILD_BUG_ON(!is_power_of_2(_factor)); \
  +		BUILD_BUG_ON(!is_power_of_2(_weight)); \
  
 
 So this seemed fine to me, but for some reason the compiler is saying
 the BUILD_BUG_ON(!is_power_of_2(x)) fails, if and only if (!)
 CONFIG_GCOV_PROFILE_ALL is enabled, which seems to boil down to the
 compiler option -fprofile-arcs.
 
 I'm going to replace this with just the code itself, i.e.
 
 /* both must be a power of 2 */
 BUILD_BUG_ON(_factor & (_factor - 1));
 BUILD_BUG_ON(_weight & (_weight - 1));
 
 but should I have expected this?

It might have something to do with is_power_of_2() being an inline
function; perhaps with this compiler option it translates to
something that can't be used in the context BUILD_BUG_ON() uses it in.

There is a BUILD_BUG_ON_NOT_POWER_OF_2() macro you could use.

Michal Kubecek
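
As an illustration of the open-coded check, here is a minimal userspace
sketch. It uses C11 _Static_assert in a role similar to BUILD_BUG_ON(); the
macro name, the function name and the two example constants are
hypothetical and not taken from the patch. Unlike the bare x & (x - 1)
expression, the macro below also rejects zero, which is an added
assumption about what callers want.

```c
#include <stdbool.h>

/* Pure constant expression: a power of two has exactly one bit set. */
#define SKETCH_IS_POWER_OF_2(x) ((x) != 0 && (((x) & ((x) - 1)) == 0))

/* Hypothetical example parameters, chosen only for this sketch. */
#define SKETCH_FACTOR 1024
#define SKETCH_WEIGHT 8

/* Compile-time checks; the build fails if either value is not a power of 2. */
_Static_assert(SKETCH_IS_POWER_OF_2(SKETCH_FACTOR), "factor must be a power of 2");
_Static_assert(SKETCH_IS_POWER_OF_2(SKETCH_WEIGHT), "weight must be a power of 2");

/* Run-time equivalent written as an inline function. */
static inline bool sketch_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}
```

Because the macro expands to a plain integer constant expression, the
compile-time check does not depend on the compiler folding a call to an
inline function.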


Re: [PATCH 1/1] Revert net: fec: Ensure clocks are enabled while using mdio bus

2015-08-14 Thread Peter Chen
On Fri, Aug 14, 2015 at 10:27:33AM +0200, Lucas Stach wrote:
 Am Freitag, den 14.08.2015, 08:25 + schrieb Peter Chen:
   
   
   Am Freitag, den 14.08.2015, 13:47 +0800 schrieb Peter Chen:
It causes the i.mx6sx sdb board hang when using nfsroot during boots
up at v4.2-rc6.
   
This reverts commit 8fff755e9f8d0f70a595e79f248695ce6aef5cc3.
   
Cc: netdev@vger.kernel.org
Cc: Fugang Duan b38...@freescale.com
Cc: shawn@linaro.org
Cc: fabio.este...@freescale.com
Cc: tyler.ba...@linaro.org
Cc: Lucas Stach l.st...@pengutronix.de
Cc: Andrew Lunn and...@lunn.ch
Signed-off-by: Peter Chen peter.c...@freescale.com
---
   
According to Fugang Duan, the i.mx series has different clock control
sequence among SoCs, this patch may only consider certain SoCs.
   
   Sorry, but NACK.
   
   Please test current mainline (what will become v4.2-rc7). There is
   already a patch in that fixes i.MX27 and probably fixes the same
   problem on i.MX6SX.
   
  
  Would you help point to me which commit and at which tree?
  
 Mainline, so Linus Torvalds tree.
 
 http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=14d2b7c1a96ef37eb571599c73d4a1a606b964d6

It fixes my imx6sx-sdb board. It is interesting that some platforms
had no problem while others did.
Your fix is a common runtime PM fix.

Again, why do we need this as a bug fix rather than as a new feature for
the next rc1?

-- 

Best Regards,
Peter Chen


RE: [PATCH] IGMP: Inhibit reports for local multicast groups

2015-08-14 Thread Philip Downey
Hi Andrew
Will resubmit with the information you suggest.
There might be a slight delay for this as I am on holiday now for a week.

Best regards

Philip

 -Original Message-
 From: Andrew Lunn [mailto:and...@lunn.ch]
 Sent: Friday, August 14, 2015 2:35 PM
 To: Philip Downey
 Cc: David Miller; kuz...@ms2.inr.ac.ru; jmor...@namei.org; yoshfuji@linux-
 ipv6.org; ka...@trash.net; linux-ker...@vger.kernel.org;
 netdev@vger.kernel.org
 Subject: Re: [PATCH] IGMP: Inhibit reports for local multicast groups
 
 Hi Philip
 
 So with a bit of poking and prodding, we have a much better understanding
 as to why this is O.K. Maybe your next patch can quote the relevant RFCs
 and have a much fuller commit message?
 
 Thanks
   Andrew


[PATCH net-next] lwtunnel: rename ip lwtunnel attributes

2015-08-14 Thread Jiri Benc
We already have IFLA_IPTUN_ netlink attributes. The IP_TUN_ attributes look
very similar, yet they serve very different purpose. This is confusing for
anyone trying to implement a user space tool supporting lwt.

As the IP_TUN_ attributes are used only for the lightweight tunnels, prefix
them with LWTUNNEL_IP_ instead to make their purpose clear. Also, it's more
logical to have them in lwtunnel.h together with the encap enum.

Fixes: 3093fbe7ff4b ("route: Per route IP tunnel metadata via lightweight tunnel")
Signed-off-by: Jiri Benc jb...@redhat.com
---
These are still in net-next only, thus it's safe to change them.

It's still a bit weird these attributes are in RTA_ENCAP, perhaps we should
also rename RTA_ENCAP to RTA_LWT_ENCAP or similar?
---
 include/uapi/linux/lwtunnel.h  | 14 +++
 include/uapi/linux/rtnetlink.h | 15 
 net/ipv4/ip_tunnel_core.c  | 86 +-
 3 files changed, 57 insertions(+), 58 deletions(-)

diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 31377bbea3f8..3bf223bc2367 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -12,5 +12,19 @@ enum lwtunnel_encap_types {
 
 #define LWTUNNEL_ENCAP_MAX (__LWTUNNEL_ENCAP_MAX - 1)
 
+enum lwtunnel_ip_t {
+   LWTUNNEL_IP_UNSPEC,
+   LWTUNNEL_IP_ID,
+   LWTUNNEL_IP_DST,
+   LWTUNNEL_IP_SRC,
+   LWTUNNEL_IP_TTL,
+   LWTUNNEL_IP_TOS,
+   LWTUNNEL_IP_SPORT,
+   LWTUNNEL_IP_DPORT,
+   LWTUNNEL_IP_FLAGS,
+   __LWTUNNEL_IP_MAX,
+};
+
+#define LWTUNNEL_IP_MAX (__LWTUNNEL_IP_MAX - 1)
 
 #endif /* _UAPI_LWTUNNEL_H_ */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 47d24cb3fbc1..0d3d3cc43356 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -286,21 +286,6 @@ enum rt_class_t {
 
 /* Routing message attributes */
 
-enum ip_tunnel_t {
-   IP_TUN_UNSPEC,
-   IP_TUN_ID,
-   IP_TUN_DST,
-   IP_TUN_SRC,
-   IP_TUN_TTL,
-   IP_TUN_TOS,
-   IP_TUN_SPORT,
-   IP_TUN_DPORT,
-   IP_TUN_FLAGS,
-   __IP_TUN_MAX,
-};
-
-#define IP_TUN_MAX (__IP_TUN_MAX - 1)
-
 enum rtattr_type_t {
RTA_UNSPEC,
RTA_DST,
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 5512f4e4ec1b..fd6319681c50 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -192,15 +192,15 @@ struct rtnl_link_stats64 *ip_tunnel_get_stats64(struct net_device *dev,
 }
 EXPORT_SYMBOL_GPL(ip_tunnel_get_stats64);
 
-static const struct nla_policy ip_tun_policy[IP_TUN_MAX + 1] = {
-	[IP_TUN_ID]		= { .type = NLA_U64 },
-	[IP_TUN_DST]		= { .type = NLA_U32 },
-	[IP_TUN_SRC]		= { .type = NLA_U32 },
-	[IP_TUN_TTL]		= { .type = NLA_U8 },
-	[IP_TUN_TOS]		= { .type = NLA_U8 },
-	[IP_TUN_SPORT]		= { .type = NLA_U16 },
-	[IP_TUN_DPORT]		= { .type = NLA_U16 },
-	[IP_TUN_FLAGS]		= { .type = NLA_U16 },
+static const struct nla_policy ip_tun_policy[LWTUNNEL_IP_MAX + 1] = {
+	[LWTUNNEL_IP_ID]	= { .type = NLA_U64 },
+	[LWTUNNEL_IP_DST]	= { .type = NLA_U32 },
+	[LWTUNNEL_IP_SRC]	= { .type = NLA_U32 },
+	[LWTUNNEL_IP_TTL]	= { .type = NLA_U8 },
+	[LWTUNNEL_IP_TOS]	= { .type = NLA_U8 },
+	[LWTUNNEL_IP_SPORT]	= { .type = NLA_U16 },
+	[LWTUNNEL_IP_DPORT]	= { .type = NLA_U16 },
+	[LWTUNNEL_IP_FLAGS]	= { .type = NLA_U16 },
 };
 
 static int ip_tun_build_state(struct net_device *dev, struct nlattr *attr,
@@ -208,10 +208,10 @@ static int ip_tun_build_state(struct net_device *dev, struct nlattr *attr,
 {
 	struct ip_tunnel_info *tun_info;
 	struct lwtunnel_state *new_state;
-	struct nlattr *tb[IP_TUN_MAX + 1];
+	struct nlattr *tb[LWTUNNEL_IP_MAX + 1];
 	int err;
 
-	err = nla_parse_nested(tb, IP_TUN_MAX, attr, ip_tun_policy);
+	err = nla_parse_nested(tb, LWTUNNEL_IP_MAX, attr, ip_tun_policy);
 	if (err < 0)
 		return err;
 
@@ -223,29 +223,29 @@ static int ip_tun_build_state(struct net_device *dev, struct nlattr *attr,
 
 	tun_info = lwt_tun_info(new_state);
 
-	if (tb[IP_TUN_ID])
-		tun_info->key.tun_id = nla_get_u64(tb[IP_TUN_ID]);
+	if (tb[LWTUNNEL_IP_ID])
+		tun_info->key.tun_id = nla_get_u64(tb[LWTUNNEL_IP_ID]);
 
-	if (tb[IP_TUN_DST])
-		tun_info->key.ipv4_dst = nla_get_be32(tb[IP_TUN_DST]);
+	if (tb[LWTUNNEL_IP_DST])
+		tun_info->key.ipv4_dst = nla_get_be32(tb[LWTUNNEL_IP_DST]);
 
-	if (tb[IP_TUN_SRC])
-		tun_info->key.ipv4_src = nla_get_be32(tb[IP_TUN_SRC]);
+	if (tb[LWTUNNEL_IP_SRC])
+		tun_info->key.ipv4_src = nla_get_be32(tb[LWTUNNEL_IP_SRC]);
 
-	if (tb[IP_TUN_TTL])

Re: [PATCH net-next 2/4] packet: add eBPF fanout mode

2015-08-14 Thread Daniel Borkmann

[ @Willem: RH email doesn't exist anymore, I took it out, otherwise
  every reply gets a bounce. ;) ]

On 08/14/2015 07:03 PM, Alexei Starovoitov wrote:

On 8/14/15 8:50 AM, Willem de Bruijn wrote:

...

all looks great except in the above the check:
 if (new->type != BPF_PROG_TYPE_SOCKET_FILTER) {
         bpf_prog_put(new);
         return -EINVAL;
 }
is missing. Otherwise user will be able to attach programs
of wrong types to fanout.

Also instead of:
  #define PACKET_FANOUT_BPF     6
  #define PACKET_FANOUT_EBPF    7

I would call them FANOUT_CBPF and FANOUT_EBPF to be unambiguous.
This is how bpf manpage distinguishes them.


We have SO_ATTACH_FILTER and SO_ATTACH_BPF, could also be
analogous for fanout, if we want to be consistent with the API?

But C/E prefix seems okay too, whichever you prefer ...

Btw, in case someone sets sock_flag(sk, SOCK_FILTER_LOCKED),
perhaps we should also apply it on fanout?

Thanks,
Daniel


Re: [PATCH net-next 2/4] packet: add eBPF fanout mode

2015-08-14 Thread Willem de Bruijn
 [ @Willem: RH email doesn't exist anymore, I took it out, otherwise
   every reply gets a bounce. ;) ]

Sorry for using the wrong address, Daniel.

 Also instead of:
   #define PACKET_FANOUT_BPF     6
   #define PACKET_FANOUT_EBPF    7

 I would call them FANOUT_CBPF and FANOUT_EBPF to be unambiguous.
 This is how bpf manpage distinguishes them.

 We have SO_ATTACH_FILTER and SO_ATTACH_BPF, could also be
 analogous for fanout, if we want to be consistent with the API?

 But C/E prefix seems okay too, how you want ...

I don't feel very strongly, either. But CBPF/EBPF is a bit more
descriptive, so let's do that.

 Btw, in case someone sets sock_flag(sk, SOCK_FILTER_LOCKED),
 perhaps we should also apply it on fanout?

Good point. With classic bpf, packet access control is fully
enforced in per-socket filters, but playing with load balancing
filters could allow an adversary to infer some information
about the dropped packets*. With eBPF and maps, access
is even more direct. Let's support locking of fanout filters in
place.

I intend to test the existing socket flag. No need to add a
separate flag for the fanout group, as far as I can see.


(*) I noticed that a similar unintended effect also causes the
PACKET_FANOUT_LB selftest to be flaky: filters on the
sockets ensure that the test only reads expected packets.
But, all traffic makes it through packet_rcv_fanout. Packets
that are later dropped by sk_filter have already incremented
rr_cur. Worst case, with 2 sockets and each accepted packet
interleaved with a dropped packet, all packets are queued on
only one socket. Test flakiness is fixed, e.g., by running in a
private network namespace. The implementation behavior
may be unexpected in other, production, environments.
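
A toy userspace sketch of the worst case described in the footnote above,
assuming a much simplified fanout group. All names (rr_group, rr_deliver,
rr_feed_interleaved) are hypothetical and this is not the kernel's
packet_rcv_fanout(); it only shows that a round-robin cursor which advances
before the per-socket filter runs can steer every accepted packet to the
same socket.

```c
#include <assert.h>

/* Toy model of a round-robin fanout group (hypothetical names). */
struct rr_group {
	unsigned int rr_cur;       /* round-robin cursor, bumped per packet */
	unsigned int num_socks;    /* number of member sockets (at most 8) */
	unsigned int queued[8];    /* packets actually queued per socket */
};

/* Pick a socket round-robin, then apply the per-socket filter verdict.
 * The cursor advances even when the filter drops the packet. */
static void rr_deliver(struct rr_group *g, int filter_accepts)
{
	unsigned int idx;

	assert(g->num_socks > 0 && g->num_socks <= 8);
	idx = g->rr_cur++ % g->num_socks;
	if (filter_accepts)
		g->queued[idx]++;
}

/* Feed n packets, alternating accepted and dropped, starting with accepted. */
static void rr_feed_interleaved(struct rr_group *g, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		rr_deliver(g, (i % 2) == 0);
}
```

With two sockets and ten interleaved packets, all five accepted packets land
in queued[0] and queued[1] stays empty.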


Re: [PATCH net-next]r8169.c: Force transmission when nic refuse to start.

2015-08-14 Thread Francois Romieu
Corcodel Marian corcodel.mar...@gmail.com :
[...]
 diff --git a/drivers/net/ethernet/realtek/r8169.c 
 b/drivers/net/ethernet/realtek/r8169.c
 index eb2d2a4..6882eab 100644
 --- a/drivers/net/ethernet/realtek/r8169.c
 +++ b/drivers/net/ethernet/realtek/r8169.c
 @@ -7470,15 +7470,22 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
   u16 enable_mask = RTL_EVENT_NAPI | tp->event_slow;
   int work_done= 0;
   u16 status;
 +int tx_force = 1;
  
   status = rtl_get_events(tp);
   rtl_ack_events(tp, status & ~tp->event_slow);
 -
 +   if (netif_running(dev)) {
   if (status & RTL_EVENT_NAPI_RX)
   work_done = rtl_rx(dev, tp, (u32) budget);
 +
  
   if (status & RTL_EVENT_NAPI_TX)
   rtl_tx(dev, tp);
 + else if (tx_force == 1) {
 +mdelay(10);
 +rtl_tx(dev, tp);
 + }
 +}

Please try to use TimerInt instead of this ugly hack.

-- 
Ueimor


Re: [PATCH net-next 2/4] packet: add eBPF fanout mode

2015-08-14 Thread Daniel Borkmann

On 08/14/2015 09:27 PM, Willem de Bruijn wrote:
...

Btw, in case someone sets sock_flag(sk, SOCK_FILTER_LOCKED),
perhaps we should also apply it on fanout?


Good point. With classic bpf, packet access control is fully
enforced in per-socket filters, but playing with load balancing
filters could allow an adversary to infer some information
about the dropped packets*. With eBPF and maps, access
is even more direct. Let's support locking of fanout filters in
place.


Right, a process could share a map between the fanout lb filter
and actual sk filter, i.e. to look up how much actually passed
through on the later sk level filter, and use that information
in addition for its lb decisions.


I intend to test the existing socket flag. No need to add a
separate flag for the fanout group, as far as I can see.


Agreed, should be okay.

Thanks Willem!


(*) I noticed that a similar unintended effect also causes the
PACKET_FANOUT_LB selftest to be flaky: filters on the
sockets ensure that the test only reads expected packets.
But, all traffic makes it through packet_rcv_fanout. Packets
that are later dropped by sk_filter have already incremented
rr_cur. Worst case, with 2 sockets and each accepted packet
interleaved with a dropped packet, all packets are queued on
only one socket. Test flakiness is fixed, e.g., by running in a
private network namespace. The implementation behavior
may be unexpected in other, production, environments.


Re: [PATCH net-next 2/4] packet: add eBPF fanout mode

2015-08-14 Thread Willem de Bruijn
On Fri, Aug 14, 2015 at 1:03 PM, Alexei Starovoitov a...@plumgrid.com wrote:
 On 8/14/15 8:50 AM, Willem de Bruijn wrote:

 +static int fanout_set_data_ebpf(struct packet_fanout *f, char __user *data,
 +   unsigned int len)
 +{
 +   struct bpf_prog *new;
 +   u32 fd;
 +
 +   if (len != sizeof(fd))
 +   return -EINVAL;
 +   if (copy_from_user(&fd, data, len))
 +   return -EFAULT;
 +
 +   new = bpf_prog_get(fd);
 +   if (IS_ERR(new))
 +   return PTR_ERR(new);
 +
 +   __fanout_set_data_bpf(f, new);
 +   return 0;
 +}


 all looks great except in the above the check:
 if (new->type != BPF_PROG_TYPE_SOCKET_FILTER) {
 bpf_prog_put(new);
 return -EINVAL;
 }
 is missing. Otherwise user will be able to attach programs
 of wrong types to fanout.

Ai, good point!

 Also instead of:
  #define PACKET_FANOUT_BPF  6
  #define PACKET_FANOUT_EBPF 7

 I would call them FANOUT_CBPF and FANOUT_EBPF to be unambiguous.
 This is how bpf manpage distinguishes them.


Sounds good. I'll make both changes in v2. Thanks for reviewing, Alexei.


[PATCH net] be2net: avoid vxlan offloading on multichannel configs

2015-08-14 Thread Ivan Vecera
VxLAN offloading is not functional if the NIC is running in multichannel
mode (UMC, FLEX-10, VNIC...). Enabling it additionally kills all
connectivity through the NIC, and the device needs to be brought down and
up to restore it. The firmware should handle this itself and either refuse
the conversion of the interface to tunnel type (be_cmd_manage_iface) or
support VxLAN offloading when a multichannel config is enabled.
I have tested this on the latest available firmware (10.6.144.21).

Result:
[root@sm-04 ~]# ip link set enp5s0f0 up
[root@sm-04 ~]# ip addr add 172.30.10.50/24 dev enp5s0f0
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.317 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.187 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.188 ms

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.187/0.230/0.317/0.063 ms
[root@sm-04 ~]# ip link add link enp5s0f0 vxlan10 type vxlan id 10 remote 172.30.10.60 dstport 4789
[root@sm-04 ~]# ip link set vxlan10 up
[ 7900.442811] be2net :05:00.0: Enabled VxLAN offloads for UDP port 4789
[ 7900.455722] be2net :05:00.1: Enabled VxLAN offloads for UDP port 4789
[ 7900.468635] be2net :05:00.2: Enabled VxLAN offloads for UDP port 4789
[ 7900.481553] be2net :05:00.3: Enabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

[root@sm-04 ~]# ip link set vxlan10 down
[ 7959.434093] be2net :05:00.0: Disabled VxLAN offloads for UDP port 4789
[ 7959.444792] be2net :05:00.1: Disabled VxLAN offloads for UDP port 4789
[ 7959.455592] be2net :05:00.2: Disabled VxLAN offloads for UDP port 4789
[ 7959.466416] be2net :05:00.3: Disabled VxLAN offloads for UDP port 4789
[root@sm-04 ~]# ip link del vxlan10
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

[root@sm-04 ~]# ip link set enp5s0f0 down
[root@sm-04 ~]# ip link set enp5s0f0 up
[ 8071.019003] be2net :05:00.0 enp5s0f0: Link is Up
[root@sm-04 ~]# ping -c 3 172.30.10.254
PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.318 ms
64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.196 ms
64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.194 ms

 --- 172.30.10.254 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.194/0.236/0.318/0.057 ms

Cc: Sathya Perla sathya.pe...@avagotech.com
Cc: Ajit Khaparde ajit.khapa...@avagotech.com
Cc: Padmanabh Ratnakar padmanabh.ratna...@avagotech.com
Cc: Sriharsha Basavapatna sriharsha.basavapa...@avagotech.com
Signed-off-by: Ivan Vecera ivec...@redhat.com
---
 drivers/net/ethernet/emulex/benet/be_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c 
b/drivers/net/ethernet/emulex/benet/be_main.c
index c28e3bf..6ca693b 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -5174,7 +5174,7 @@ static void be_add_vxlan_port(struct net_device *netdev, sa_family_t sa_family,
struct device *dev = &adapter->pdev->dev;
int status;
 
-   if (lancer_chip(adapter) || BEx_chip(adapter))
+   if (lancer_chip(adapter) || BEx_chip(adapter) || be_is_mc(adapter))
return;
 
if (adapter->flags & BE_FLAGS_VXLAN_OFFLOADS) {
@@ -5221,7 +5221,7 @@ static void be_del_vxlan_port(struct net_device *netdev, sa_family_t sa_family,
 {
struct be_adapter *adapter = netdev_priv(netdev);
 
-   if (lancer_chip(adapter) || BEx_chip(adapter))
+   if (lancer_chip(adapter) || BEx_chip(adapter) || be_is_mc(adapter))
return;
 
if (adapter->vxlan_port != port)
-- 
2.4.6



Re: Fw: [Bug 102861] New: soft lockup with inet: fix races with reqsk timers

2015-08-14 Thread Eric Dumazet
On Fri, 2015-08-14 at 08:57 -0700, Stephen Hemminger wrote:
 
 Begin forwarded message:
 
 Date: Fri, 14 Aug 2015 14:32:56 +
 From: bugzilla-dae...@bugzilla.kernel.org 
 bugzilla-dae...@bugzilla.kernel.org
 To: shemmin...@linux-foundation.org shemmin...@linux-foundation.org
 Subject: [Bug 102861] New: soft lockup with inet: fix races with reqsk 
 timers
 
 
 https://bugzilla.kernel.org/show_bug.cgi?id=102861
 
 Bug ID: 102861
Summary: soft lockup with inet: fix races with reqsk timers
Product: Networking
Version: 2.5
 Kernel Version: net-next d52736e2
   Hardware: All
 OS: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: IPV4
   Assignee: shemmin...@linux-foundation.org
   Reporter: andreas.r...@gmail.com
 Regression: No
 
 Created attachment 184921
   --> https://bugzilla.kernel.org/attachment.cgi?id=184921&action=edit
 dmesg picture
 
 Happens within 10min while eg. torrenting. net-next from a few days ago was
 fine.
 
 Picture as ssh hangs and dmesg to disk lands in nirvana instead.
 
 Checked via reverting this commit, which allowed me to merrily torrent without
 lockup again. This does not 100% prove that it was actually the cause, ofc.
 

Fixed with :
http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af




[PATCH v2 net 2/3] ipv6: Add rt6_make_pcpu_route()

2015-08-14 Thread Martin KaFai Lau
It is a prep work for fixing a potential deadlock when creating
a pcpu rt.

The current rt6_get_pcpu_route() will also create a pcpu rt if one does not
exist.  This patch moves the pcpu rt creation logic into another function,
rt6_make_pcpu_route().

Signed-off-by: Martin KaFai Lau ka...@fb.com
CC: Hannes Frederic Sowa han...@stressinduktion.org
---
 net/ipv6/route.c | 20 
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index c95c319..0a82653 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -993,13 +993,21 @@ static struct rt6_info *ip6_rt_pcpu_alloc(struct rt6_info *rt)
 /* It should be called with read_lock_bh(&tb6_lock) acquired */
 static struct rt6_info *rt6_get_pcpu_route(struct rt6_info *rt)
 {
-   struct rt6_info *pcpu_rt, *prev, **p;
+   struct rt6_info *pcpu_rt, **p;
 
p = this_cpu_ptr(rt->rt6i_pcpu);
pcpu_rt = *p;
 
-   if (pcpu_rt)
-   goto done;
+   if (pcpu_rt) {
+   dst_hold(&pcpu_rt->dst);
+   rt6_dst_from_metrics_check(pcpu_rt);
+   }
+   return pcpu_rt;
+}
+
+static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
+{
+   struct rt6_info *pcpu_rt, *prev, **p;
 
pcpu_rt = ip6_rt_pcpu_alloc(rt);
if (!pcpu_rt) {
@@ -1009,6 +1017,7 @@ static struct rt6_info *rt6_get_pcpu_route(struct rt6_info *rt)
goto done;
}
 
+   p = this_cpu_ptr(rt->rt6i_pcpu);
prev = cmpxchg(p, NULL, pcpu_rt);
if (prev) {
/* If someone did it before us, return prev instead */
@@ -1093,8 +1102,11 @@ redo_rt6_select:
rt->dst.lastuse = jiffies;
rt->dst.__use++;
pcpu_rt = rt6_get_pcpu_route(rt);
-   read_unlock_bh(&table->tb6_lock);
 
+   if (!pcpu_rt)
+   pcpu_rt = rt6_make_pcpu_route(rt);
+
+   read_unlock_bh(&table->tb6_lock);
return pcpu_rt;
}
 }
-- 
1.8.1



[PATCH v2 net 0/3] ipv6: Fix a potential deadlock when creating pcpu rt

2015-08-14 Thread Martin KaFai Lau
v1 -> v2:
A minor change in the commit message of patch 2.

This patch series fixes a potential deadlock when creating a pcpu rt.
It happens when dst_alloc() decided to run gc. Something like this:

read_lock(&table->tb6_lock);
ip6_rt_pcpu_alloc()
=> dst_alloc()
=> ip6_dst_gc()
=> write_lock(&table->tb6_lock); /* oops */

Patch 1 and 2 are some prep works.
Patch 3 is the fix.

Original report: https://bugzilla.kernel.org/show_bug.cgi?id=102291
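
The call chain above can be modelled in a few lines of userspace C. This is
only a sketch under loud assumptions: toy_rwlock and every function below are
hypothetical stand-ins, not the kernel's rwlock_t or dst_alloc(), and a failed
try-lock stands in for "would spin forever". The second helper shows one
common way to avoid the problem (dropping the read lock before allocating),
which is not necessarily how patch 3 is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded model of a reader/writer lock (hypothetical). */
struct toy_rwlock {
	int readers;
	bool writer;
};

static void toy_read_lock(struct toy_rwlock *l)
{
	assert(!l->writer);
	l->readers++;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	assert(l->readers > 0);
	l->readers--;
}

/* Returns true if the write lock was taken; false models a deadlock. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	if (l->readers > 0 || l->writer)
		return false;
	l->writer = true;
	return true;
}

static void toy_write_unlock(struct toy_rwlock *l)
{
	assert(l->writer);
	l->writer = false;
}

/* Stand-in for an allocator whose garbage collector needs the write lock. */
static bool toy_alloc_with_gc(struct toy_rwlock *l)
{
	if (!toy_write_trylock(l))
		return false;   /* gc would deadlock against our own read lock */
	toy_write_unlock(l);
	return true;
}

/* Problematic ordering: allocate while still holding the read lock. */
static bool alloc_under_read_lock(struct toy_rwlock *l)
{
	bool ok;

	toy_read_lock(l);
	ok = toy_alloc_with_gc(l);
	toy_read_unlock(l);
	return ok;
}

/* Safer ordering: finish the lookup, drop the read lock, then allocate. */
static bool alloc_after_unlock(struct toy_rwlock *l)
{
	toy_read_lock(l);
	/* ... lookup under the read lock ... */
	toy_read_unlock(l);
	return toy_alloc_with_gc(l);
}
```

In this model alloc_under_read_lock() reports failure while
alloc_after_unlock() succeeds, which mirrors why the allocation has to move
out from under read_lock().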



Re: [PATCH net-next 2/4] packet: add eBPF fanout mode

2015-08-14 Thread Alexei Starovoitov

On 8/14/15 8:50 AM, Willem de Bruijn wrote:

+static int fanout_set_data_ebpf(struct packet_fanout *f, char __user *data,
+   unsigned int len)
+{
+   struct bpf_prog *new;
+   u32 fd;
+
+   if (len != sizeof(fd))
+   return -EINVAL;
+   if (copy_from_user(&fd, data, len))
+   return -EFAULT;
+
+   new = bpf_prog_get(fd);
+   if (IS_ERR(new))
+   return PTR_ERR(new);
+
+   __fanout_set_data_bpf(f, new);
+   return 0;
+}


all looks great except in the above the check:
if (new->type != BPF_PROG_TYPE_SOCKET_FILTER) {
bpf_prog_put(new);
return -EINVAL;
}
is missing. Otherwise user will be able to attach programs
of wrong types to fanout.
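
The shape of the requested check (take a reference, validate the type, drop
the reference again on rejection) can be sketched in userspace. Everything
below is a hypothetical mock: mock_prog, mock_prog_get() and mock_prog_put()
merely stand in for the kernel objects named above and do not reproduce their
real signatures.

```c
#include <assert.h>
#include <errno.h>

/* Userspace mock of a refcounted program object (hypothetical). */
enum mock_prog_type { MOCK_TYPE_SOCKET_FILTER, MOCK_TYPE_KPROBE };

struct mock_prog {
	enum mock_prog_type type;
	int refcnt;
};

static struct mock_prog *mock_prog_get(struct mock_prog *p)
{
	p->refcnt++;
	return p;
}

static void mock_prog_put(struct mock_prog *p)
{
	assert(p->refcnt > 0);
	p->refcnt--;
}

/* Attach a program to a slot. A program of the wrong type is rejected and
 * its reference is dropped so that the error path does not leak it. */
static int mock_fanout_attach(struct mock_prog *candidate,
			      struct mock_prog **slot)
{
	struct mock_prog *new = mock_prog_get(candidate);

	if (new->type != MOCK_TYPE_SOCKET_FILTER) {
		mock_prog_put(new);
		return -EINVAL;
	}
	*slot = new;    /* the slot now owns the reference */
	return 0;
}
```

Attaching a MOCK_TYPE_KPROBE program returns -EINVAL and leaves its refcnt
at zero; attaching a MOCK_TYPE_SOCKET_FILTER program succeeds and leaves one
reference held by the slot.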

Also instead of:
 #define PACKET_FANOUT_BPF  6
 #define PACKET_FANOUT_EBPF 7

I would call them FANOUT_CBPF and FANOUT_EBPF to be unambiguous.
This is how bpf manpage distinguishes them.



[PATCH v3] net: phy: workaround for buggy cable detection by LAN8700 after cable plugging

2015-08-14 Thread Igor Plyatov
* Due to a HW bug, the LAN8700 sometimes does not detect the presence of energy
  in the Ethernet cable in Energy Detect Power-Down mode (i.e. while the
  EDPWRDOWN bit is set, the ENERGYON bit is sometimes not asserted). This is a
  common bug of the LAN87xx family of PHY chips.
* lan87xx_read_status() was improved to poll for the ENERGYON bit. Its previous
  algorithm is still not 100 % reliable and sometimes misses cable plugging.
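
The bounded polling described above can be tried outside the kernel with a
small sketch. It assumes hypothetical names throughout (sketch_wait_for_energy,
the SKETCH_ENERGYON bit value, the fake register); only the loop shape, 64
polls spaced 10 ms apart for a 640 ms budget, is taken from the patch.

```c
#include <assert.h>

#define SKETCH_ENERGYON   0x0002   /* hypothetical bit position */
#define SKETCH_MAX_POLLS  64       /* 64 polls x 10 ms = 640 ms budget */

/* Hypothetical accessors: read returns the register value or a negative
 * error code, mirroring the convention of phy_read() in the patch. */
typedef int (*sketch_read_fn)(void *ctx);
typedef void (*sketch_sleep_fn)(unsigned int ms);

/* Poll until ENERGYON shows up, a read fails, or the budget runs out.
 * Returns 1 if energy was seen, 0 on timeout, negative on read error. */
static int sketch_wait_for_energy(sketch_read_fn read_reg, void *ctx,
				  sketch_sleep_fn sleep_ms)
{
	int i, rc;

	assert(read_reg && sleep_ms);
	for (i = 0; i < SKETCH_MAX_POLLS; i++) {
		sleep_ms(10);
		rc = read_reg(ctx);
		if (rc < 0)
			return rc;
		if (rc & SKETCH_ENERGYON)
			return 1;
	}
	return 0;
}

/* Test double: a fake register that reports ENERGYON from the
 * energy_after-th read onwards. */
struct sketch_fake_phy {
	int reads;
	int energy_after;
};

static int sketch_fake_read(void *ctx)
{
	struct sketch_fake_phy *phy = ctx;

	return (++phy->reads >= phy->energy_after) ? SKETCH_ENERGYON : 0;
}

/* Test double: skip the real delay so the sketch runs instantly. */
static void sketch_no_sleep(unsigned int ms)
{
	(void)ms;
}
```

Compared with a single fixed sleep, the loop returns as soon as the bit is
seen, so the full 640 ms is only spent when no cable is plugged in.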

Signed-off-by: Igor Plyatov plya...@gmail.com
---
 drivers/net/phy/smsc.c |   31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index c0f6479..d64f016 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -91,19 +91,18 @@ static int lan911x_config_init(struct phy_device *phydev)
 }
 
 /*
- * The LAN8710/LAN8720 requires a minimum of 2 link pulses within 64ms of each
- * other in order to set the ENERGYON bit and exit EDPD mode.  If a link partner
- * does send the pulses within this interval, the PHY will remained powered
- * down.
- *
- * This workaround will manually toggle the PHY on/off upon calls to read_status
- * in order to generate link test pulses if the link is down.  If a link partner
- * is present, it will respond to the pulses, which will cause the ENERGYON bit
- * to be set and will cause the EDPD mode to be exited.
+ * The LAN87xx suffers from rare absence of the ENERGYON-bit when Ethernet cable
+ * plugs in while LAN87xx is in Energy Detect Power-Down mode. This leads to
+ * unstable detection of plugging in Ethernet cable.
+ * This workaround disables Energy Detect Power-Down mode and waiting for
+ * response on link pulses to detect presence of plugged Ethernet cable.
+ * The Energy Detect Power-Down mode is enabled again in the end of procedure to
+ * save approximately 220 mW of power if cable is unplugged.
  */
 static int lan87xx_read_status(struct phy_device *phydev)
 {
int err = genphy_read_status(phydev);
+   int i;
 
if (!phydev->link) {
/* Disable EDPD to wake up PHY */
@@ -116,8 +115,16 @@ static int lan87xx_read_status(struct phy_device *phydev)
if (rc < 0)
return rc;
 
-   /* Sleep 64 ms to allow ~5 link test pulses to be sent */
-   msleep(64);
+   /* Wait max 640 ms to detect energy */
+   for (i = 0; i < 64; i++) {
+   /* Sleep to allow link test pulses to be sent */
+   msleep(10);
+   rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
+   if (rc < 0)
+   return rc;
+   if (rc & MII_LAN83C185_ENERGYON)
+   break;
+   };
 
/* Re-enable EDPD */
rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
@@ -191,7 +198,7 @@ static struct phy_driver smsc_phy_driver[] = {
 
/* basic functions */
.config_aneg= genphy_config_aneg,
-   .read_status= genphy_read_status,
+   .read_status= lan87xx_read_status,
.config_init= smsc_phy_config_init,
.soft_reset = smsc_phy_reset,
 
-- 
1.7.9.5



Re: Several races in usbnet module (kernel 4.1.x)

2015-08-14 Thread Eugene Shatokhin

Hi,

21.07.2015 17:22, Oliver Neukum wrote:

On Mon, 2015-07-20 at 21:13 +0300, Eugene Shatokhin wrote:

And here, the code clears EVENT_RX_KILL bit in dev->flags, which may
execute concurrently with the above operation:
#0 clear_bit (bitops.h:113, inlined)
#1 usbnet_bh (usbnet.c:1475)
 /* restart RX again after disabling due to high error rate */
 clear_bit(EVENT_RX_KILL, &dev->flags);

If clear_bit() is atomic w.r.t. setting dev->flags to 0, this race is
not a problem, I guess. Otherwise, it may be.


clear_bit is atomic with respect to other atomic operations.
So how about this:

Regards
Oliver


From 1c4e685b3a9c183e04c46b661830e5c7ed35b513 Mon Sep 17 00:00:00 2001

From: Oliver Neukum oneu...@suse.com
Date: Tue, 21 Jul 2015 16:19:40 +0200
Subject: [PATCH] usbnet: fix race between usbnet_stop() and the BH

Does this do the job?

Signed-off-by: Oliver Neukum oneu...@suse.com
---
  drivers/net/usb/usbnet.c | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 3c86b10..77a9a86 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -778,7 +778,7 @@ int usbnet_stop (struct net_device *net)
  {
struct usbnet   *dev = netdev_priv(net);
struct driver_info  *info = dev->driver_info;
-   int retval, pm;
+   int retval, pm, mpn;

clear_bit(EVENT_DEV_OPEN, &dev->flags);
netif_stop_queue (net);
@@ -813,14 +813,17 @@ int usbnet_stop (struct net_device *net)
 * can't flush_scheduled_work() until we drop rtnl (later),
 * else workers could deadlock; so make workers a NOP.
 */
+   mpn = !test_and_clear_bit(EVENT_NO_RUNTIME_PM, &dev->flags);
dev->flags = 0;
del_timer_sync (&dev->delay);
tasklet_kill (&dev->bh);
+   mpn |= !test_and_clear_bit(EVENT_NO_RUNTIME_PM, &dev->flags);
+   /* in case the bh reset a flag */
+   dev->flags = 0;
if (!pm)
usb_autopm_put_interface(dev->intf);

-   if (info->manage_power &&
-   !test_and_clear_bit(EVENT_NO_RUNTIME_PM, &dev->flags))
+   if (info->manage_power && mpn)
info->manage_power(dev, 0);
else
usb_autopm_put_interface(dev->intf);



From what we have discussed here, I have combined a patch that fixes 
the race #1 in usbnet_stop() and makes #4 harmless by using atomics. I 
will send it shortly.
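
For readers unfamiliar with the idea of replacing plain stores by atomic
flag operations, a userspace analogue with C11 atomics might look like the
sketch below. The helper names are hypothetical and only loosely modelled on
the kernel's bit operations; this is not the patch referred to above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Atomically set bit nr in the flag word. */
static void sketch_set_bit(unsigned int nr, atomic_ulong *flags)
{
	assert(nr < sizeof(unsigned long) * 8);
	atomic_fetch_or(flags, 1UL << nr);
}

/* Atomically clear bit nr and report whether it was set beforehand. */
static bool sketch_test_and_clear_bit(unsigned int nr, atomic_ulong *flags)
{
	unsigned long mask = 1UL << nr;

	assert(nr < sizeof(unsigned long) * 8);
	return (atomic_fetch_and(flags, ~mask) & mask) != 0;
}

/* Atomic replacement for a plain "flags = 0" store. Returns the previous
 * value so the caller can still inspect the bits that were set. */
static unsigned long sketch_clear_all(atomic_ulong *flags)
{
	return atomic_exchange(flags, 0UL);
}
```

Because every access goes through an atomic read-modify-write, a concurrent
sketch_set_bit() is either fully visible in the value returned by
sketch_clear_all() or fully applied after it, never half-applied.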


I had to make some adjustments (e.g. using spin_lock_nested in one place 
for lockdep to see it is OK to take dev->done.lock there).


I have tested the patch on the mainline kernel 4.2-rc6 built for x86-64, 
with the same USB modem. So far, lockdep, Kmemleak (just in case) and my 
tools have not detected problems in the relevant parts of the code. The 
device and the driver seem to work well.


So, what is your opinion?

Regards,
Eugene






Re: [PATCH net-next] lwtunnel: rename ip lwtunnel attributes

2015-08-14 Thread Thomas Graf
On 08/14/15 at 04:40pm, Jiri Benc wrote:
 We already have IFLA_IPTUN_ netlink attributes. The IP_TUN_ attributes look
 very similar, yet they serve very different purpose. This is confusing for
 anyone trying to implement a user space tool supporting lwt.
 
 As the IP_TUN_ attributes are used only for the lightweight tunnels, prefix
 them with LWTUNNEL_IP_ instead to make their purpose clear. Also, it's more
 logical to have them in lwtunnel.h together with the encap enum.
 
 Fixes: 3093fbe7ff4b (route: Per route IP tunnel metadata via lightweight 
 tunnel)
 Signed-off-by: Jiri Benc jb...@redhat.com

Acked-by: Thomas Graf tg...@suug.ch

 ---
 These are still in net-next only, thus it's safe to change them.
 
 It's still a bit weird these attributes are in RTA_ENCAP, perhaps we should
 also rename RTA_ENCAP to RTA_LWT_ENCAP or similar?

I think RTA_ENCAP is fine but I don't mind changing it either.


Re: [PATCH] net: phy: fix PHY_RUNNING in phy_state_machine

2015-08-14 Thread Florian Fainelli
On 08/13/15 21:23, shh@gmail.com wrote:
 From: Shaohui Xie shaohui@freescale.com
 
 Currently, if phy state is PHY_RUNNING, we always register a CHANGE
 when phy works in polling or interrupt ignored, this will make the
 adjust_link being called even the phy link did Not changed.

Right, which is why most drivers do implement a caching scheme.

 
 checking the phy link to make sure the link did changed before we
 register a CHANGE, if link did not changed, we do nothing.

With your change we will end-up with virtually polling a PHY twice as
fast as we used to with the RUNNING -> CHANGELINK -> RUNNING transition
(current state transitions), which is probably fine, but puts a bit more
pressure on the (slow) MDIO bus since we end-up with two additional
reads to latch the link status register.
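
The caching scheme mentioned earlier in this reply can be illustrated with a
small userspace sketch. The structures and the sketch_adjust_link() helper
are hypothetical simplifications, not code from any particular driver; the
point is only that a callback invoked on every poll can ignore calls where
link, speed and duplex are unchanged.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of what a MAC driver might cache. */
struct sketch_link_state {
	int link;     /* 1 = up, 0 = down */
	int speed;    /* Mb/s */
	int duplex;   /* 1 = full, 0 = half */
};

struct sketch_mac {
	struct sketch_link_state cached;
	unsigned int reprograms;   /* how often the hardware was touched */
};

/* Called on every poll; reprograms the MAC only when something changed.
 * Returns true if the cached state was updated. */
static bool sketch_adjust_link(struct sketch_mac *mac,
			       const struct sketch_link_state *now)
{
	assert(mac && now);
	if (mac->cached.link == now->link &&
	    mac->cached.speed == now->speed &&
	    mac->cached.duplex == now->duplex)
		return false;       /* spurious callback, nothing to do */

	mac->cached = *now;
	mac->reprograms++;          /* stand-in for real register writes */
	return true;
}
```

Calling sketch_adjust_link() three times with the same link state touches
the (mock) hardware once.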

PS: I would appreciate if you could CC me on future libphy submissions.

 
 Signed-off-by: Shaohui Xie shaohui@freescale.com
 ---
  drivers/net/phy/phy.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
 index 84b1fba..d972851 100644
 --- a/drivers/net/phy/phy.c
 +++ b/drivers/net/phy/phy.c
 @@ -814,6 +814,7 @@ void phy_state_machine(struct work_struct *work)
   bool needs_aneg = false, do_suspend = false;
   enum phy_state old_state;
   int err = 0;
 + int old_link;
  
   mutex_lock(&phydev->lock);
  
 @@ -899,11 +900,18 @@ void phy_state_machine(struct work_struct *work)
   phydev->adjust_link(phydev->attached_dev);
   break;
   case PHY_RUNNING:
 - /* Only register a CHANGE if we are
 -  * polling or ignoring interrupts
 + /* Only register a CHANGE if we are polling or ignoring
 +  * interrupts and link changed since latest checking.
*/
 - if (!phy_interrupt_is_valid(phydev))
 - phydev->state = PHY_CHANGELINK;
 + if (!phy_interrupt_is_valid(phydev)) {
 + old_link = phydev->link;
 + err = phy_read_status(phydev);
 + if (err)
 + break;
 +
 + if (old_link != phydev->link)
 + phydev->state = PHY_CHANGELINK;
 + }
   break;
   case PHY_CHANGELINK:
   err = phy_read_status(phydev);
 


-- 
Florian


Re: [PATCH v2] net: phy: workaround for buggy cable detection by LAN8700 after cable plugging

2015-08-14 Thread Michael Heimpold
Hi Igor,

On Friday, 14 August 2015, 11:03:04, Igor Plyatov wrote:
 Dear Michael,
 
  Hi Igor,
 
  On Thursday, 13 August 2015, 22:18:34, you wrote:
 
   * Due to HW bug, LAN8700 sometimes does not detect presence of energy in the
   Ethernet cable in Energy Detect Power-Down mode (e.g while EDPWRDOWN bit is
   set, the ENERGYON bit does not asserted sometimes). This is a common bug of
   LAN87xx family of PHY chips.

  Is there any official errata sheet for this PHY family? How do you know that
  this is a common HW bug?
 
 
 The LAN8700, LAN8710, LAN8720 is a product of the SMSC company. 
 Microchip acquired SMSC in August 2012.
 
 The LAN8700 is a legacy product for Microchip and they will not update 
 anything about it. So, even if Microchip know about HW bug, then there 
 is no chance to have Errata sheet or any new documents about LAN8700.

Long time ago, I worked on a custom device with a PHY of the same family.
Errata sheet existed but was only available by signing a NDA. So I simply
wondered whether this changed since SMSC is now Microchip or if they keep
it still so covered...

 
 I think same history is for LAN8710/LAN8720 even if they are not marked 
 as legacy. They are SMSC products.
 
 The workarounds for same issue in LAN8710/LAN8720 was committed by:
   * Marek Vasut ma...@denx.de as b629820d18fa65cc598390e4b9712fd5f83ee693.
   * Patrick Trantham patrick.trant...@fuel7.com as 
 4223dbffed9f89596177ff2b256ef3258b20fa46.
 
  Me too, I think that this family has some problems with this mode, however,
  without hard evidence, I would put it softer.
 
 
 I have discovered this bug by just monitoring of data to/from MDIO 
 registers of LAN8700.
 And HW issue is proven on 100 % by rare absence of ENERGYON bit when 
 cable is plugged in.
 Sometimes, it is required to make 2-20 tests to catch this issue.
 
 The configuration of CPU pins, responsible for the MDIO interface, was 
 checked carefully by oscilloscope and they are fine (no spikes, no 
 garbage, good shape of edges).
 
   * The lan87xx_read_status() was improved to acquire ENERGYON bit. Its previous
   algorythm still not reliable on 100 % and sometimes skip cable plugging.

   Signed-off-by: Igor Plyatov plya...@gmail.com
   ---
   drivers/net/phy/smsc.c | 15 ---
   1 file changed, 12 insertions(+), 3 deletions(-)

   diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
   index c0f6479..8559ff1 100644
   --- a/drivers/net/phy/smsc.c
   +++ b/drivers/net/phy/smsc.c
   @@ -104,6 +104,7 @@ static int lan911x_config_init(struct phy_device *phydev)
   static int lan87xx_read_status(struct phy_device *phydev)
   {
   int err = genphy_read_status(phydev);
   + int i;

   if (!phydev->link) {
   /* Disable EDPD to wake up PHY */
   @@ -116,8 +117,16 @@ static int lan87xx_read_status(struct phy_device *phydev)
   if (rc < 0)
   return rc;

   - /* Sleep 64 ms to allow ~5 link test pulses to be sent */
   - msleep(64);
   + /* Wait max 640 ms to detect energy */
 
  Why 640ms and not e.g. 650ms?
 
  I'm no PHY expert, but this looks like an ugly workaround.
 
 
 Such a value was adopted after many trial and probes. It allows to 
 detect cable plugging on 100 %.
 Ugly or not, but it works and reliable.
 
  Maybe it would be better to avoid this power saving mode at all, when it is
  not reliable, but this are just my 2cts. :-)
 
 
 Power saving mode allow to save around 220 mW of energy consumed from 
 power supply, when Ethernet cable is not plugged in.
 This is a good value for embedded devices.
 Better to keep power save mode on.

Ok, I was not aware, that this is so much.

 
  Anyway, I guess you should also update the explanation on top of the function
  to reflect your new approach.
 
 
 I propose following comment for the lan87xx_read_status():
 /*
   * The LAN87xx suffers from rare absence of the ENERGYON-bit when Ethernet cable
   * plugs in while LAN87xx is in Energy Detect Power-Down mode. This leads to
   * unstable detection of plugging in Ethernet cable.
   * This workaround disables Energy Detect Power-Down mode and waiting for
   * response on link pulses to detect presence of plugged Ethernet cable.
   * The Energy Detect Power-Down mode enabled again in the end of procedure to
   * save approximately 220 mW of power if cable is unplugged.
   */

Nice. Only one nitpick: ... _is_ enabled again...

 
   + for (i = 0; i < 64; i++) {
   + /* Sleep to allow link test pulses to be sent */
   + msleep(10);
   + rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
   + if (rc < 0)
   + return rc;
   + if (rc & MII_LAN83C185_ENERGYON)
   + break;
   + };

   /* Re-enable EDPD */
   rc = phy_read(phydev, MII_LAN83C185_CTRL_STATUS);
   @@ -191,7 +200,7 @@ static struct phy_driver smsc_phy_driver[] = {

   /* basic functions */
   .config_aneg = genphy_config_aneg,
 
   - 

Re: [PATCH v2] net: phy: workaround for buggy cable detection by LAN8700 after cable plugging

2015-08-14 Thread Igor Plyatov

Dear Michael,


The LAN8700, LAN8710, LAN8720 is a product of the SMSC company.
Microchip acquired SMSC in August 2012.

The LAN8700 is a legacy product for Microchip and they will not update
anything about it. So, even if Microchip know about HW bug, then there
is no chance to have Errata sheet or any new documents about LAN8700.
Long time ago, I worked on a custom device with a PHY of the same family.
Errata sheet existed but was only available by signing a NDA. So I simply
wondered whether this changed since SMSC is now Microchip or if they keep
it still so covered...


The Microchip web-site does not contain Errata sheet for LAN87xx devices.
While it contains many Errata sheets for PIC and dsPIC devices.
So, situation is same as many years ago.



I propose following comment for the lan87xx_read_status():
/*
   * The LAN87xx suffers from rare absence of the ENERGYON-bit when Ethernet cable
   * plugs in while LAN87xx is in Energy Detect Power-Down mode. This leads to
   * unstable detection of plugging in Ethernet cable.
   * This workaround disables Energy Detect Power-Down mode and waiting for
   * response on link pulses to detect presence of plugged Ethernet cable.
   * The Energy Detect Power-Down mode enabled again in the end of procedure to
   * save approximately 220 mW of power if cable is unplugged.
   */
Nice. Only one nitpick: ... _is_ enabled again...



Changed in [PATCH v3].

Best wishes.
--
Igor Plyatov


Re: [PATCH net-next 04/11] udp: Handle VRF device in sendmsg

2015-08-14 Thread Tom Herbert
On Thu, Aug 13, 2015 at 1:59 PM, David Ahern d...@cumulusnetworks.com wrote:
 For unconnected UDP sockets using a VRF device lookup source address
 based on VRF table. This allows the UDP header to be properly setup
 before showing up at the VRF device via the dst.

 Signed-off-by: Shrijeet Mukherjee s...@cumulusnetworks.com
 Signed-off-by: David Ahern d...@cumulusnetworks.com
 ---
  net/ipv4/udp.c | 22 +-
  1 file changed, 21 insertions(+), 1 deletion(-)

 diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
 index 83aa604f9273..7af5052e3b1f 100644
 --- a/net/ipv4/udp.c
 +++ b/net/ipv4/udp.c
 @@ -1013,11 +1013,31 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)

 if (!rt) {
 struct net *net = sock_net(sk);
 +   __u8 flow_flags = inet_sk_flowi_flags(sk);

 fl4 = &fl4_stack;
 +
 +   /* unconnected socket. If output device is enslaved to a VRF
 +* device lookup source address from VRF table. This mimics
 +* behavior of ip_route_connect{_init}.
 +*/
 +   if (netif_index_is_vrf(net, ipc.oif)) {
 +   flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
 +  RT_SCOPE_UNIVERSE, sk->sk_protocol,
 +  (flow_flags | FLOWI_FLAG_VRFSRC),
 +  faddr, saddr, dport,
 +  inet->inet_sport);
 +
 +   rt = ip_route_output_flow(net, fl4, sk);
 +   if (!IS_ERR(rt)) {
 +   saddr = fl4->saddr;
 +   ip_rt_put(rt);
 +   }
 +   }
 +

I really don't like this. It seems like you're putting device specific
code in a critical L4 data path function. Also, does ipv6/udp.c need
to be updated similarly? Why can't VRF be abstracted out in routing
lookups?

Tom

 flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
RT_SCOPE_UNIVERSE, sk->sk_protocol,
 -  inet_sk_flowi_flags(sk),
 +  flow_flags,
faddr, saddr, dport, inet->inet_sport);

 security_sk_classify_flow(sk, flowi4_to_flowi(fl4));
 --
 2.3.2 (Apple Git-55)



[PATCH] usbnet: Fix two races between usbnet_stop() and the BH

2015-08-14 Thread Eugene Shatokhin
Both races may happen when a device (e.g. YOTA 4G LTE Modem) is
unplugged while the system is downloading a large file from the Net.

Hardware breakpoints and Kprobes with delays were used to confirm that
the races do actually happen.

1. The first race is on skb_queue ('next' pointer) between usbnet_stop()
and rx_complete(), which, in turn, calls usbnet_bh().

Here is a part of the call stack with the code where the changes to the
queue happen. The line numbers are for the kernel 4.1.0:

*0 __skb_unlink (skbuff.h:1517)
    prev->next = next;
*1 defer_bh (usbnet.c:430)
    spin_lock_irqsave(&list->lock, flags);
    old_state = entry->state;
    entry->state = state;
    __skb_unlink(skb, list);
    spin_unlock(&list->lock);
    spin_lock(&dev->done.lock);
    __skb_queue_tail(&dev->done, skb);
    if (dev->done.qlen == 1)
        tasklet_schedule(&dev->bh);
    spin_unlock_irqrestore(&dev->done.lock, flags);
*2 rx_complete (usbnet.c:640)
    state = defer_bh(dev, skb, &dev->rxq, state);

At the same time, the following code repeatedly checks if the queue is
empty and reads these values concurrently with the above changes:

*0  usbnet_terminate_urbs (usbnet.c:765)
    /* maybe wait for deletions to finish. */
    while (!skb_queue_empty(&dev->rxq)
        && !skb_queue_empty(&dev->txq)
        && !skb_queue_empty(&dev->done)) {
            schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS));
            set_current_state(TASK_UNINTERRUPTIBLE);
            netif_dbg(dev, ifdown, dev->net,
                      "waited for %d urb completions\n", temp);
    }
*1  usbnet_stop (usbnet.c:806)
    if (!(info->flags & FLAG_AVOID_UNLINK_URBS))
        usbnet_terminate_urbs(dev);

As a result, it is possible, for example, that the skb is removed from
dev->rxq by __skb_unlink() before the check
!skb_queue_empty(&dev->rxq) in usbnet_terminate_urbs() is made. It is
also possible in this case that the skb is added to dev->done queue
after !skb_queue_empty(&dev->done) is checked. So
usbnet_terminate_urbs() may stop waiting and return while dev->done
queue still has an item.

Locking in defer_bh() and usbnet_terminate_urbs() was revisited to avoid
this race.
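
For readers following along, here is a minimal userspace sketch of the
ordering issue. It is a toy model and not usbnet code: struct toy_queue,
item_lost_from_view() and both move_*() helpers are made up for
illustration, and the "lock" is just a flag. It only shows that with the
unnested ordering there is a moment where the list lock is free and the
item is in neither queue, while with the nested ordering there is not.

```c
#include <stdbool.h>

/* Toy queue: a length plus a flag standing in for its spinlock. */
struct toy_queue {
	int qlen;
	bool locked;
};

/* True if a waiter that takes the list lock right now would find the
 * item in neither queue.
 */
static bool item_lost_from_view(const struct toy_queue *list,
				const struct toy_queue *done)
{
	return !list->locked && list->qlen + done->qlen == 0;
}

/* Old ordering: the list lock is dropped before the done lock is taken. */
static bool move_unnested(struct toy_queue *list, struct toy_queue *done)
{
	bool window = false;

	list->locked = true;
	list->qlen--;			/* stands in for __skb_unlink() */
	list->locked = false;
	window |= item_lost_from_view(list, done);
	done->locked = true;
	done->qlen++;			/* stands in for __skb_queue_tail() */
	done->locked = false;
	window |= item_lost_from_view(list, done);
	return window;
}

/* New ordering: the done lock nests inside the list lock. */
static bool move_nested(struct toy_queue *list, struct toy_queue *done)
{
	bool window = false;

	list->locked = true;
	list->qlen--;
	window |= item_lost_from_view(list, done);
	done->locked = true;
	done->qlen++;
	done->locked = false;
	list->locked = false;
	window |= item_lost_from_view(list, done);
	return window;
}
```

Calling move_unnested() on a queue holding one item reports the window,
move_nested() does not; in both cases the item ends up on the done queue.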

2. The second race is on dev->flags.

dev->flags is set to 0 here:
*0  usbnet_stop (usbnet.c:816)
    /* deferred work (task, timer, softirq) must also stop.
     * can't flush_scheduled_work() until we drop rtnl (later),
     * else workers could deadlock; so make workers a NOP.
     */
    dev->flags = 0;
    del_timer_sync (&dev->delay);
    tasklet_kill (&dev->bh);

And here, the code clears EVENT_RX_KILL bit in dev->flags, which may
execute concurrently with the above operation:
*0 clear_bit (bitops.h:113, inlined)
*1 usbnet_bh (usbnet.c:1475)
    /* restart RX again after disabling due to high error rate */
    clear_bit(EVENT_RX_KILL, &dev->flags);

It seems, setting dev->flags to 0 is not necessarily atomic w.r.t.
clear_bit() and other bit operations with dev->flags. It is safer to
make it atomic and this way, make the race harmless.
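
A small userspace sketch of the idea, assuming C11 atomics in place of the
kernel bitops. The toy_*() helpers and TOY_EVENT_RX_KILL are hypothetical
names for illustration only; this is not necessarily how the patch itself
does it. The point is just that "clear everything" becomes a single atomic
read-modify-write, so it cannot interleave badly with a concurrent
single-bit clear.

```c
#include <stdatomic.h>

/* Illustrative bit number, not taken from the patch. */
#define TOY_EVENT_RX_KILL 10

/* Atomically set one bit in the flags word. */
static void toy_set_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_or(addr, 1UL << nr);
}

/* Atomically clear one bit in the flags word. */
static void toy_clear_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_and(addr, ~(1UL << nr));
}

/* Atomically clear all bits and return the previous value, so the
 * caller can still inspect bits that were set before the reset.
 */
static unsigned long toy_clear_all(atomic_ulong *addr)
{
	return atomic_exchange(addr, 0UL);
}
```

Returning the old value from toy_clear_all() also lets a caller test a bit
as it was before the reset, without a separate read.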

While at it, the checking of EVENT_NO_RUNTIME_PM bit of dev->flags in
usbnet_stop() was fixed too: the bit should be checked before dev->flags
is cleared.

Signed-off-by: Eugene Shatokhin eugene.shatok...@rosalab.ru
---
 drivers/net/usb/usbnet.c   | 49 --
 include/linux/usb/usbnet.h | 33 +++
 2 files changed, 54 insertions(+), 28 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 3c86b10..a53124c 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -428,12 +428,18 @@ static enum skb_state defer_bh(struct usbnet *dev, struct sk_buff *skb,
old_state = entry->state;
entry->state = state;
__skb_unlink(skb, list);
-   spin_unlock(&list->lock);
-   spin_lock(&dev->done.lock);
+
+   /* defer_bh() is never called with list == &dev->done.
+* spin_lock_nested() tells lockdep that it is OK to take
+* dev->done.lock here with list->lock held. *
+*/
+   spin_lock_nested(&dev->done.lock, SINGLE_DEPTH_NESTING);
+
__skb_queue_tail(&dev->done, skb);
if (dev->done.qlen == 1)
tasklet_schedule(&dev->bh);
-   spin_unlock_irqrestore(&dev->done.lock, flags);
+   spin_unlock(&dev->done.lock);
+   spin_unlock_irqrestore(&list->lock, flags);
return old_state;
 }
 
@@ -749,6 +755,20 @@ EXPORT_SYMBOL_GPL(usbnet_unlink_rx_urbs);
 
 /*-*/
 
+static void wait_skb_queue_empty(struct sk_buff_head *q)
+{
+   unsigned long flags;
+
+   spin_lock_irqsave(&q->lock, flags);
+   while (!skb_queue_empty(q)) {
+   spin_unlock_irqrestore(&q->lock, flags);
+   schedule_timeout(msecs_to_jiffies(UNLINK_TIMEOUT_MS));
+   set_current_state(TASK_UNINTERRUPTIBLE);
+   spin_lock_irqsave(&q->lock, flags);
+   }
+   spin_unlock_irqrestore(&q->lock, flags);
+}
+
 // precondition: never 

Re: [PATCH RFC net 0/3] ipv6: Fix potential deadlock when creating pcpu rt

2015-08-14 Thread Martin KaFai Lau
On Thu, Aug 13, 2015 at 05:29:09PM -0700, David Miller wrote:
 From: Martin KaFai Lau ka...@fb.com
 Date: Thu, 13 Aug 2015 00:58:00 -0700
 
  This patch series fixes a potential deadlock when creating a pcpu rt.
  It happens when dst_alloc() decided to run gc. Something like this:
  
  read_lock(&table->tb6_lock);
  ip6_rt_pcpu_alloc()
  => dst_alloc()
  => ip6_dst_gc()
  => write_lock(&table->tb6_lock); /* oops */
  
  Patch 1 and 2 are some prep works.
  Patch 3 is the fix.
  
  Original report: https://bugzilla.kernel.org/show_bug.cgi?id=102291
  
  Steinar, the patches can also be applied to 4.2-rc5 (I just tried).
  Can you help to test them? Thanks!
 
 This series looks fine to me.
Thanks. I will repost it with a minor change in one of the commit messages.


[PATCH v2 net 3/3] ipv6: Fix a potential deadlock when creating pcpu rt

2015-08-14 Thread Martin KaFai Lau
rt6_make_pcpu_route() is called under read_lock(&table->tb6_lock).
rt6_make_pcpu_route() calls ip6_rt_pcpu_alloc(rt) which then
calls dst_alloc().  dst_alloc() _may_ call ip6_dst_gc() which takes
the write_lock(&table->tb6_lock).  A visualized version:

read_lock(&table->tb6_lock);
rt6_make_pcpu_route();
=> ip6_rt_pcpu_alloc();
=> dst_alloc();
=> ip6_dst_gc();
=> write_lock(&table->tb6_lock); /* oops */

The fix is to do a read_unlock first before calling ip6_rt_pcpu_alloc().
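
To make the ordering concrete, here is a toy userspace model (a sketch under
stated assumptions, not the route.c change). struct toy_rwlock and all
toy_*() and alloc_*() helpers are invented for illustration; the toy write
lock reports failure where the real write_lock() would spin forever behind
the caller's own read lock.

```c
#include <stdbool.h>

/* Toy rwlock: reader count plus a writer flag, no real blocking. */
struct toy_rwlock {
	int readers;
	bool writer;
};

static void toy_read_lock(struct toy_rwlock *l)   { l->readers++; }
static void toy_read_unlock(struct toy_rwlock *l) { l->readers--; }

/* Fails instead of spinning when a reader or writer holds the lock. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	if (l->readers > 0 || l->writer)
		return false;
	l->writer = true;
	return true;
}

static void toy_write_unlock(struct toy_rwlock *l) { l->writer = false; }

/* Stand-in for an allocation that decides to run gc, which needs the
 * write lock on the same table.
 */
static bool toy_alloc_with_gc(struct toy_rwlock *l)
{
	if (!toy_write_trylock(l))
		return false;	/* the real code would deadlock here */
	toy_write_unlock(l);
	return true;
}

/* Problematic ordering: allocate while still holding the read lock. */
static bool alloc_under_read_lock(struct toy_rwlock *l)
{
	bool ok;

	toy_read_lock(l);
	ok = toy_alloc_with_gc(l);
	toy_read_unlock(l);
	return ok;
}

/* Fixed ordering: drop the read lock first, then allocate. */
static bool alloc_after_read_unlock(struct toy_rwlock *l)
{
	toy_read_lock(l);
	/* ... lookup done under the read lock ... */
	toy_read_unlock(l);
	return toy_alloc_with_gc(l);
}
```

In this model alloc_under_read_lock() fails and alloc_after_read_unlock()
succeeds. Anything that was looked up under the read lock has to be
revalidated afterwards, since the table may change once the lock is dropped.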

A reported stack:

[141625.537638] INFO: rcu_sched self-detected stall on CPU { 27}  (t=6 jiffies g=4159086 c=4159085 q=2139)
[141625.547469] Task dump for CPU 27:
[141625.550881] mtr R  running task0 22121  22081 0x0008
[141625.558069]   88103f363d98 8106e488 001b
[141625.565641]  81684900 88103f363db8 810702b0 0800
[141625.573220]  81684900 88103f363de8 8108df9f 88103f375a00
[141625.580803] Call Trace:
[141625.583345]  IRQ  [8106e488] sched_show_task+0xc1/0xc6
[141625.589650]  [810702b0] dump_cpu_task+0x35/0x39
[141625.595144]  [8108df9f] rcu_dump_cpu_stacks+0x6a/0x8c
[141625.601320]  [81090606] rcu_check_callbacks+0x1f6/0x5d4
[141625.607669]  [810940c8] update_process_times+0x2a/0x4f
[141625.613925]  [8109fbee] tick_sched_handle+0x32/0x3e
[141625.619923]  [8109fc2f] tick_sched_timer+0x35/0x5c
[141625.625830]  [81094a1f] __hrtimer_run_queues+0x8f/0x18d
[141625.632171]  [81094c9e] hrtimer_interrupt+0xa0/0x166
[141625.638258]  [8102bf2a] local_apic_timer_interrupt+0x4e/0x52
[141625.645036]  [8102c36f] smp_apic_timer_interrupt+0x39/0x4a
[141625.651643]  [8140b9e8] apic_timer_interrupt+0x68/0x70
[141625.657895]  EOI  [81346ee8] ? dst_destroy+0x7c/0xb5
[141625.664188]  [813d45b5] ? fib6_flush_trees+0x20/0x20
[141625.670272]  [81082b45] ? queue_write_lock_slowpath+0x60/0x6f
[141625.677140]  [8140aa33] _raw_write_lock_bh+0x23/0x25
[141625.683218]  [813d4553] __fib6_clean_all+0x40/0x82
[141625.689124]  [813d45b5] ? fib6_flush_trees+0x20/0x20
[141625.695207]  [813d6058] fib6_clean_all+0xe/0x10
[141625.700854]  [813d60d3] fib6_run_gc+0x79/0xc8
[141625.706329]  [813d0510] ip6_dst_gc+0x85/0xf9
[141625.711718]  [81346d68] dst_alloc+0x55/0x159
[141625.717105]  [813d09b5] __ip6_dst_alloc.isra.32+0x19/0x63
[141625.723620]  [813d1830] ip6_pol_route+0x36a/0x3e8
[141625.729441]  [813d18d6] ip6_pol_route_output+0x11/0x13
[141625.735700]  [813f02c8] fib6_rule_action+0xa7/0x1bf
[141625.741698]  [813d18c5] ? ip6_pol_route_input+0x17/0x17
[141625.748043]  [81357c48] fib_rules_lookup+0xb5/0x12a
[141625.754050]  [81141628] ? poll_select_copy_remaining+0xf9/0xf9
[141625.761002]  [813f0535] fib6_rule_lookup+0x37/0x5c
[141625.766914]  [813d18c5] ? ip6_pol_route_input+0x17/0x17
[141625.773260]  [813d008c] ip6_route_output+0x7a/0x82
[141625.779177]  [813c44c8] ip6_dst_lookup_tail+0x53/0x112
[141625.785437]  [813c45c3] ip6_dst_lookup_flow+0x2a/0x6b
[141625.791604]  [813ddaab] rawv6_sendmsg+0x407/0x9b6
[141625.797423]  [813d7914] ? do_ipv6_setsockopt.isra.8+0xd87/0xde2
[141625.804464]  [8139d4b4] inet_sendmsg+0x57/0x8e
[141625.810028]  [81329ba3] sock_sendmsg+0x2e/0x3c
[141625.815588]  [8132be57] SyS_sendto+0xfe/0x143
[141625.821063]  [813dd551] ? rawv6_setsockopt+0x5e/0x67
[141625.827146]  [8132c9f8] ? sock_common_setsockopt+0xf/0x11
[141625.833660]  [8132c08c] ? SyS_setsockopt+0x81/0xa2
[141625.839565]  [8140ac17] entry_SYSCALL_64_fastpath+0x12/0x6a

Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info")
Signed-off-by: Martin KaFai Lau ka...@fb.com
CC: Hannes Frederic Sowa han...@stressinduktion.org
Reported-by: Steinar H. Gunderson sgunder...@bigfoot.com
---
 net/ipv6/ip6_fib.c |  2 ++
 net/ipv6/route.c   | 44 +---
 2 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 55d1986..548c623 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -172,6 +172,8 @@ static void rt6_free_pcpu(struct rt6_info *non_pcpu_rt)
*ppcpu_rt = NULL;
}
}
+
+   non_pcpu_rt->rt6i_pcpu = NULL;
 }
 
 static void rt6_release(struct rt6_info *rt)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 0a82653..d155864 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1007,27 +1007,39 @@ static struct rt6_info *rt6_get_pcpu_route(struct rt6_info *rt)
 
 static struct rt6_info *rt6_make_pcpu_route(struct rt6_info *rt)
 {
+   struct fib6_table *table = rt->rt6i_table;
struct rt6_info *pcpu_rt, *prev, **p;
 
pcpu_rt = ip6_rt_pcpu_alloc(rt);
if (!pcpu_rt) {
   

[PATCH v2 net 1/3] ipv6: Remove un-used argument from ip6_dst_alloc()

2015-08-14 Thread Martin KaFai Lau
After 4b32b5ad31a6 (ipv6: Stop rt6_info from using inet_peer's metrics),
ip6_dst_alloc() does not need the 'table' argument.  This patch
cleans it up.

Signed-off-by: Martin KaFai Lau ka...@fb.com
CC: Hannes Frederic Sowa han...@stressinduktion.org
---
 net/ipv6/route.c | 21 +
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 9de4d2b..c95c319 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -318,8 +318,7 @@ static const struct rt6_info ip6_blk_hole_entry_template = {
 /* allocate dst with ip6_dst_ops */
 static struct rt6_info *__ip6_dst_alloc(struct net *net,
struct net_device *dev,
-   int flags,
-   struct fib6_table *table)
+   int flags)
 {
struct rt6_info *rt = dst_alloc(&net->ipv6.ip6_dst_ops, dev,
0, DST_OBSOLETE_FORCE_CHK, flags);
@@ -336,10 +335,9 @@ static struct rt6_info *__ip6_dst_alloc(struct net *net,
 
 static struct rt6_info *ip6_dst_alloc(struct net *net,
  struct net_device *dev,
- int flags,
- struct fib6_table *table)
+ int flags)
 {
-   struct rt6_info *rt = __ip6_dst_alloc(net, dev, flags, table);
+   struct rt6_info *rt = __ip6_dst_alloc(net, dev, flags);
 
if (rt) {
rt->rt6i_pcpu = alloc_percpu_gfp(struct rt6_info *, GFP_ATOMIC);
@@ -950,8 +948,7 @@ static struct rt6_info *ip6_rt_cache_alloc(struct rt6_info *ort,
if (ort->rt6i_flags & (RTF_CACHE | RTF_PCPU))
ort = (struct rt6_info *)ort->dst.from;
 
-   rt = __ip6_dst_alloc(dev_net(ort->dst.dev), ort->dst.dev,
-0, ort->rt6i_table);
+   rt = __ip6_dst_alloc(dev_net(ort->dst.dev), ort->dst.dev, 0);
 
if (!rt)
return NULL;
@@ -983,8 +980,7 @@ static struct rt6_info *ip6_rt_pcpu_alloc(struct rt6_info *rt)
struct rt6_info *pcpu_rt;
 
pcpu_rt = __ip6_dst_alloc(dev_net(rt->dst.dev),
- rt->dst.dev, rt->dst.flags,
- rt->rt6i_table);
+ rt->dst.dev, rt->dst.flags);
 
if (!pcpu_rt)
return NULL;
@@ -1555,7 +1551,7 @@ struct dst_entry *icmp6_dst_alloc(struct net_device *dev,
if (unlikely(!idev))
return ERR_PTR(-ENODEV);
 
-   rt = ip6_dst_alloc(net, dev, 0, NULL);
+   rt = ip6_dst_alloc(net, dev, 0);
if (unlikely(!rt)) {
in6_dev_put(idev);
dst = ERR_PTR(-ENOMEM);
@@ -1742,7 +1738,8 @@ int ip6_route_add(struct fib6_config *cfg)
if (!table)
goto out;
 
-   rt = ip6_dst_alloc(net, NULL, (cfg->fc_flags & RTF_ADDRCONF) ? 0 : DST_NOCOUNT, table);
+   rt = ip6_dst_alloc(net, NULL,
+  (cfg->fc_flags & RTF_ADDRCONF) ? 0 : DST_NOCOUNT);
 
if (!rt) {
err = -ENOMEM;
@@ -2399,7 +2396,7 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev,
 {
struct net *net = dev_net(idev->dev);
struct rt6_info *rt = ip6_dst_alloc(net, net->loopback_dev,
-   DST_NOCOUNT, NULL);
+   DST_NOCOUNT);
if (!rt)
return ERR_PTR(-ENOMEM);
 
-- 
1.8.1



Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-14 Thread Sander Eikelenboom

On 2015-08-13 00:41, Eric Dumazet wrote:

On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:


Thanks for the reminder, but luckily i was aware of that,
seen enough of your replies asking for patches to be resubmitted
against the other tree ;)
Kernel with patch is currently running so fingers crossed.


Thanks for testing. I am definitely interested knowing your results.


Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is
breaking things
(have to test if a revert helps) i get this in some guests:

NMI watchdog: BUG: soft lockup - CPU#0 stuck for 506s! [swapper/0:0]
[ 6620.282805] Modules linked in:
[ 6620.282805] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6-20150814-linus-doflr-apicrevert+ #1
[ 6620.282805] task: 8221a580 ti: 8220 task.ti: 8220
[ 6620.282805] RIP: e030:[8100122a]  [8100122a] xen_hypercall_xen_version+0xa/0x20
[ 6620.282805] RSP: e02b:88000fc03d48  EFLAGS: 0246
[ 6620.282805] RAX: 00040006 RBX: 0200 RCX: 8100122a
[ 6620.282805] RDX: 0001 RSI: deadbeef RDI: deadbeef
[ 6620.282805] RBP: 88000fc03d60 R08: 88000fc03ee0 R09: 00ee
[ 6620.282805] R10: 8220a0c0 R11: 0246 R12:
[ 6620.282805] R13: 0001 R14: 880003b53054 R15: 0005
[ 6620.282805] FS:  7fec747ad800() GS:88000fc0() knlGS:
[ 6620.282805] CS:  e033 DS:  ES:  CR0: 8005003b
[ 6620.282805] CR2: 7ffcb7a7a6d8 CR3: 03164000 CR4: 0660
[ 6620.282805] Stack:
[ 6620.282805]  0068 0007 81008dbd 88000fc03dd8
[ 6620.282805]  81009592 0068 8220a0c0 00ee
[ 6620.282805]  88000fc03ee0 0200 0200 0001
[ 6620.282805] Call Trace:
[ 6620.282805]  IRQ
[ 6620.282805]  [81008dbd] ? xen_force_evtchn_callback+0xd/0x10
[ 6620.282805]  [81009592] check_events+0x12/0x20
[ 6620.282805]  [8100957f] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 6620.282805]  [81af79a5] ? _raw_spin_unlock_irqrestore+0x25/0x30
[ 6620.282805]  [8110ed43] try_to_del_timer_sync+0x43/0x60
[ 6620.282805]  [8110eda7] del_timer_sync+0x47/0x60
[ 6620.282805]  [81a2b698] inet_csk_reqsk_queue_drop+0x118/0x1f0
[ 6620.282805]  [81a2b8c6] reqsk_timer_handler+0x156/0x260
[ 6620.282805]  [81a2b770] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805]  [8110f3c7] call_timer_fn.isra.27+0x17/0x80
[ 6620.282805]  [81a2b770] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805]  [8110f55d] run_timer_softirq+0x12d/0x200
[ 6620.282805]  [810ca6c3] __do_softirq+0x103/0x210
[ 6620.282805]  [810ca9cb] irq_exit+0x4b/0xa0
[ 6620.282805]  [814f05d4] xen_evtchn_do_upcall+0x34/0x50
[ 6620.282805]  [81af932e] xen_do_hypervisor_callback+0x1e/0x40
[ 6620.282805]  EOI
[ 6620.282805]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805]  [81008d60] ? xen_safe_halt+0x10/0x20
[ 6620.282805]  [810188d3] ? default_idle+0x13/0x20
[ 6620.282805]  [81018e1a] ? arch_cpu_idle+0xa/0x10
[ 6620.282805]  [810f8e7e] ? default_idle_call+0x2e/0x50
[ 6620.282805]  [810f9112] ? cpu_startup_entry+0x272/0x2e0
[ 6620.282805]  [81ae7967] ? rest_init+0x77/0x80
[ 6620.282805]  [82312f58] ? start_kernel+0x43b/0x448
[ 6620.282805]  [823124ef] ? x86_64_start_reservations+0x2a/0x2c
[ 6620.282805]  [82316008] ? xen_start_kernel+0x550/0x55c
[ 6620.282805] Code: cc 51 41 53 b8 10 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 11 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc




[PATCH] rhashtable-test: extend to test concurrency

2015-08-14 Thread Phil Sutter
After having tested insertion, lookup, table walk and removal, spawn a
number of threads running operations on the same rhashtable. Each of
them will:

1) insert its own set of objects,
2) lookup every successfully inserted object and finally
3) remove objects in several rounds until all of them have been removed,
   making sure the remaining ones are still found after each round.

This should put a good amount of load onto the system and due to
synchronising thread startup via two semaphores also extensive
concurrent table access.

The default number of ten threads returned within half a second on my
local VM with two cores. Running 200 threads took about four seconds. If
slow systems suffer too much from this though, the default could be
lowered or even set to zero so this extended test does not run at all by
default.
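
As a rough illustration of steps 1) to 3), here is a plain userspace sketch
of the key layout and the stepwise removal. It assumes a bool array in
place of the rhashtable; make_key() and remove_in_rounds() are made-up
helper names and do not appear in the patch.

```c
#include <stdbool.h>

/* Per-thread key: thread id in the upper bits, object index in the
 * lower 16 bits, so the key sets of different threads never overlap
 * as long as index < 65536.
 */
static int make_key(int thread_id, int index)
{
	return (thread_id << 16) | index;
}

/* Remove the entries still present in rounds with stride 10, 9, ..., 1.
 * Returns the total number of removals; every slot of present[] is
 * false once the final stride-1 round has run.
 */
static int remove_in_rounds(bool *present, int entries)
{
	int step, i, removed = 0;

	for (step = 10; step > 0; step--) {
		for (i = 0; i < entries; i += step) {
			if (!present[i])
				continue;	/* gone, or never inserted */
			present[i] = false;	/* table removal in a real test */
			removed++;
		}
		/* a real test would re-check the remaining entries here */
	}
	return removed;
}
```

Each entry is removed exactly once, and the decreasing stride means that
later rounds touch an ever denser subset of what is left.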

Signed-off-by: Phil Sutter p...@nwl.cc
---
 lib/test_rhashtable.c | 155 +-
 1 file changed, 154 insertions(+), 1 deletion(-)

diff --git a/lib/test_rhashtable.c b/lib/test_rhashtable.c
index 9af7cef..a26d76f 100644
--- a/lib/test_rhashtable.c
+++ b/lib/test_rhashtable.c
@@ -16,9 +16,11 @@
 #include <linux/init.h>
 #include <linux/jhash.h>
 #include <linux/kernel.h>
+#include <linux/kthread.h>
 #include <linux/module.h>
 #include <linux/rcupdate.h>
 #include <linux/rhashtable.h>
+#include <linux/semaphore.h>
 #include <linux/slab.h>
 #include <linux/sched.h>
 
@@ -45,11 +47,21 @@ static int size = 8;
 module_param(size, int, 0);
 MODULE_PARM_DESC(size, "Initial size hint of table (default: 8)");
 
+static int tcount = 10;
+module_param(tcount, int, 0);
+MODULE_PARM_DESC(tcount, "Number of threads to spawn (default: 10)");
+
 struct test_obj {
int value;
struct rhash_head   node;
 };
 
+struct thread_data {
+   int id;
+   struct task_struct *task;
+   struct test_obj *objs;
+};
+
 static struct test_obj array[MAX_ENTRIES];
 
 static struct rhashtable_params test_rht_params = {
@@ -60,6 +72,9 @@ static struct rhashtable_params test_rht_params = {
.nulls_base = (3U << RHT_BASE_SHIFT),
 };
 
+static struct semaphore prestart_sem;
+static struct semaphore startup_sem = __SEMAPHORE_INITIALIZER(startup_sem, 0);
+
 static int __init test_rht_lookup(struct rhashtable *ht)
 {
unsigned int i;
@@ -200,10 +215,97 @@ static s64 __init test_rhashtable(struct rhashtable *ht)
 
 static struct rhashtable ht;
 
+static int thread_lookup_test(struct thread_data *tdata)
+{
+   int i, err = 0;
+
+   for (i = 0; i < entries; i++) {
+   struct test_obj *obj;
+   int key = (tdata->id << 16) | i;
+
+   obj = rhashtable_lookup_fast(&ht, &key, test_rht_params);
+   if (obj && (tdata->objs[i].value == TEST_INSERT_FAIL)) {
+   pr_err("  found unexpected object %d\n", key);
+   err++;
+   } else if (!obj && (tdata->objs[i].value != TEST_INSERT_FAIL)) {
+   pr_err("  object %d not found!\n", key);
+   err++;
+   } else if (obj && (obj->value != key)) {
+   pr_err("  wrong object returned (got %d, expected %d)\n",
+  obj->value, key);
+   err++;
+   }
+   }
+   return err;
+}
+
+static int threadfunc(void *data)
+{
+   int i, step, err = 0, insert_fails = 0;
+   struct thread_data *tdata = data;
+
+   up(&prestart_sem);
+   if (down_interruptible(&startup_sem))
+   pr_err("  thread[%d]: down_interruptible failed\n", tdata->id);
+
+   for (i = 0; i < entries; i++) {
+   tdata->objs[i].value = (tdata->id << 16) | i;
+   err = rhashtable_insert_fast(&ht, &tdata->objs[i].node,
+test_rht_params);
+   if (err == -ENOMEM || err == -EBUSY) {
+   tdata->objs[i].value = TEST_INSERT_FAIL;
+   insert_fails++;
+   } else if (err) {
+   pr_err("  thread[%d]: rhashtable_insert_fast failed\n",
+  tdata->id);
+   goto out;
+   }
+   }
+   if (insert_fails)
+   pr_info("  thread[%d]: %d insert failures\n",
+   tdata->id, insert_fails);
+
+   err = thread_lookup_test(tdata);
+   if (err) {
+   pr_err("  thread[%d]: rhashtable_lookup_test failed\n",
+  tdata->id);
+   goto out;
+   }
+
+   for (step = 10; step > 0; step--) {
+   for (i = 0; i < entries; i += step) {
+   if (tdata->objs[i].value == TEST_INSERT_FAIL)
+   continue;
+   err = rhashtable_remove_fast(&ht, &tdata->objs[i].node,
+test_rht_params);
+   if (err) {
+   pr_err(  

Re: [PATCH net-next 1/3] lwt: Add support to redirect dst.input

2015-08-14 Thread Tom Herbert
I will send out a v2 shortly; this breaks compilation when
CONFIG_LWTUNNEL is not defined.


On Thu, Aug 13, 2015 at 9:54 AM, Tom Herbert t...@herbertland.com wrote:
 This patch adds the capability to redirect dst input in the same way
 that dst output is redirected by LWT.

 Also, save the original dst.input and dst.output when setting up
 lwtunnel redirection. These can be called by the client as a
 pass-through.

 Signed-off-by: Tom Herbert t...@herbertland.com
 ---
  include/net/lwtunnel.h | 25 ++-
  net/core/lwtunnel.c| 55 
 ++
  net/ipv4/route.c   |  8 +++-
  net/ipv6/route.c   |  8 +++-
  4 files changed, 93 insertions(+), 3 deletions(-)

 diff --git a/include/net/lwtunnel.h b/include/net/lwtunnel.h
 index 33bd309..3db87d7 100644
 --- a/include/net/lwtunnel.h
 +++ b/include/net/lwtunnel.h
 @@ -11,12 +11,15 @@
  #define LWTUNNEL_HASH_SIZE   (1 << LWTUNNEL_HASH_BITS)

  /* lw tunnel state flags */
 -#define LWTUNNEL_STATE_OUTPUT_REDIRECT 0x1
 +#define LWTUNNEL_STATE_OUTPUT_REDIRECT BIT(0)
 +#define LWTUNNEL_STATE_INPUT_REDIRECT  BIT(1)

  struct lwtunnel_state {
 __u16   type;
 __u16   flags;
 atomic_t refcnt;
 +   int (*orig_output)(struct sock *sk, struct sk_buff *skb);
 +   int (*orig_input)(struct sk_buff *);
 int len;
 __u8 data[0];
  };
 @@ -25,6 +28,7 @@ struct lwtunnel_encap_ops {
 int (*build_state)(struct net_device *dev, struct nlattr *encap,
struct lwtunnel_state **ts);
 int (*output)(struct sock *sk, struct sk_buff *skb);
 +   int (*input)(struct sk_buff *skb);
 int (*fill_encap)(struct sk_buff *skb,
   struct lwtunnel_state *lwtstate);
 int (*get_encap_size)(struct lwtunnel_state *lwtstate);
 @@ -58,6 +62,13 @@ static inline bool lwtunnel_output_redirect(struct lwtunnel_state *lwtstate)
 return false;
  }

 +static inline bool lwtunnel_input_redirect(struct lwtunnel_state *lwtstate)
 +{
 +   if (lwtstate && (lwtstate->flags & LWTUNNEL_STATE_INPUT_REDIRECT))
 +   return true;
 +
 +   return false;
 +}
  int lwtunnel_encap_add_ops(const struct lwtunnel_encap_ops *op,
unsigned int num);
  int lwtunnel_encap_del_ops(const struct lwtunnel_encap_ops *op,
 @@ -72,6 +83,8 @@ struct lwtunnel_state *lwtunnel_state_alloc(int hdr_len);
  int lwtunnel_cmp_encap(struct lwtunnel_state *a, struct lwtunnel_state *b);
  int lwtunnel_output(struct sock *sk, struct sk_buff *skb);
  int lwtunnel_output6(struct sock *sk, struct sk_buff *skb);
 +int lwtunnel_input(struct sk_buff *skb);
 +int lwtunnel_input6(struct sk_buff *skb);

  #else

 @@ -142,6 +155,16 @@ static inline int lwtunnel_output6(struct sock *sk, struct sk_buff *skb)
 return -EOPNOTSUPP;
  }

 +static inline int lwtunnel_input(struct sock *sk, struct sk_buff *skb)
 +{
 +   return -EOPNOTSUPP;
 +}
 +
 +static inline int lwtunnel_input6(struct sock *sk, struct sk_buff *skb)
 +{
 +   return -EOPNOTSUPP;
 +}
 +
  #endif

  #endif /* __NET_LWTUNNEL_H */
 diff --git a/net/core/lwtunnel.c b/net/core/lwtunnel.c
 index 5d6d8e3..3331585 100644
 --- a/net/core/lwtunnel.c
 +++ b/net/core/lwtunnel.c
 @@ -241,3 +241,58 @@ int lwtunnel_output(struct sock *sk, struct sk_buff *skb)
 return __lwtunnel_output(sk, skb, lwtstate);
  }
  EXPORT_SYMBOL(lwtunnel_output);
 +
 +int __lwtunnel_input(struct sk_buff *skb,
 +struct lwtunnel_state *lwtstate)
 +{
 +   const struct lwtunnel_encap_ops *ops;
 +   int ret = -EINVAL;
 +
 +   if (!lwtstate)
 +   goto drop;
 +
 +   if (lwtstate->type == LWTUNNEL_ENCAP_NONE ||
 +   lwtstate->type > LWTUNNEL_ENCAP_MAX)
 +   return 0;
 +
 +   ret = -EOPNOTSUPP;
 +   rcu_read_lock();
 +   ops = rcu_dereference(lwtun_encaps[lwtstate->type]);
 +   if (likely(ops && ops->input))
 +   ret = ops->input(skb);
 +   rcu_read_unlock();
 +
 +   if (ret == -EOPNOTSUPP)
 +   goto drop;
 +
 +   return ret;
 +
 +drop:
 +   kfree_skb(skb);
 +
 +   return ret;
 +}
 +
 +int lwtunnel_input6(struct sk_buff *skb)
 +{
 +   struct rt6_info *rt = (struct rt6_info *)skb_dst(skb);
 +   struct lwtunnel_state *lwtstate = NULL;
 +
 +   if (rt)
 +   lwtstate = rt->rt6i_lwtstate;
 +
 +   return __lwtunnel_input(skb, lwtstate);
 +}
 +EXPORT_SYMBOL(lwtunnel_input6);
 +
 +int lwtunnel_input(struct sk_buff *skb)
 +{
 +   struct rtable *rt = (struct rtable *)skb_dst(skb);
 +   struct lwtunnel_state *lwtstate = NULL;
 +
 +   if (rt)
 +   lwtstate = rt->rt_lwtstate;
 +
 +   return __lwtunnel_input(skb, lwtstate);
 +}
 +EXPORT_SYMBOL(lwtunnel_input);
 diff --git a/net/ipv4/route.c b/net/ipv4/route.c
 index 18fd7c9..051d834 100644
 --- 

Re: [BUG net-next] infamous dev refcnt leak... again.

2015-08-14 Thread Eric Dumazet
On Fri, 2015-08-14 at 17:19 -0600, David Ahern wrote:
 On 8/14/15 5:14 PM, Eric Dumazet wrote:
  On Fri, 2015-08-14 at 14:14 -0700, Eric Dumazet wrote:
  While rebooting host running latest net-next
 
unregister_netdevice: waiting for eth0 to become free. Usage count = 4
 
  Oh well...
 
 
  It looks like David Ahern recent changes uncover a bug ?
 
  Not clear which commit is at fault.
 
  Maybe 3bfd847203c6d89532f836ad3f5b4ff4ced26dd9 ?
 
  Somehow a down device can be found.
 
 Can you elaborate on what you are doing to see the refcnt leak? I have 
 not seen that at all. I have to leave for soccer carpool in 45 minutes 
 or so, but can take a look this weekend.


I simply reboot my host. The eth0 device cannot be dismantled and blocks the
reboot; I have to reset the host.

I get the issue every time.

I confirm reverting 3bfd847203c6d89532f836ad3f5b4ff4ced26dd9
removes the issue for me.




Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-14 Thread Eric Dumazet
On Sat, 2015-08-15 at 00:09 +0200, Sander Eikelenboom wrote:
 On 2015-08-13 00:41, Eric Dumazet wrote:
  On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:
  
  Thanks for the reminder, but luckily i was aware of that,
  seen enough of your replies asking for patches to be resubmitted
  against the other tree ;)
  Kernel with patch is currently running so fingers crossed.
  
  Thanks for testing. I am definitely interested knowing your results.
 
 Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is 
 breaking things
 (have to test if a revert helps) i get this in some guests:


Yes, this was fixed by :
http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af




Re: [BUG net-next] infamous dev refcnt leak... again.

2015-08-14 Thread Eric Dumazet
On Fri, 2015-08-14 at 14:14 -0700, Eric Dumazet wrote:
 While rebooting host running latest net-next
 
  unregister_netdevice: waiting for eth0 to become free. Usage count = 4
 
 Oh well...
 

It looks like David Ahern's recent changes uncover a bug?

Not clear which commit is at fault.

Maybe 3bfd847203c6d89532f836ad3f5b4ff4ced26dd9 ?

Somehow a down device can be found.

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index b7f1d20..675a3b6 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -725,10 +725,14 @@ static int fib_check_nh(struct fib_config *cfg, struct fib_info *fi,
nh->nh_dev = dev = FIB_RES_DEV(res);
if (!dev)
goto out;
-   dev_hold(dev);
if (!netif_carrier_ok(dev))
nh->nh_flags |= RTNH_F_LINKDOWN;
-   err = (dev->flags & IFF_UP) ? 0 : -ENETDOWN;
+   if (dev->flags & IFF_UP) {
+   err = 0;
+   dev_hold(dev);
+   } else {
+   err = -ENETDOWN;
+   }
} else {
struct in_device *in_dev;
 




RE: [Intel-wired-lan] [PATCH v2] e1000e: Modify tx/rx configurations to avoid null pointer dereferences in e1000_open

2015-08-14 Thread Brown, Aaron F
 From: Intel-wired-lan [mailto:intel-wired-lan-boun...@lists.osuosl.org] On
 Behalf Of Jia-Ju Bai
 Sent: Wednesday, August 05, 2015 3:16 AM
 To: Kirsher, Jeffrey T; Brandeburg, Jesse
 Cc: netdev@vger.kernel.org; Jia-Ju Bai; intel-wired-...@lists.osuosl.org;
 linux-ker...@vger.kernel.org
 Subject: [Intel-wired-lan] [PATCH v2] e1000e: Modify tx/rx configurations
 to avoid null pointer dereferences in e1000_open
 
 When e1000e_setup_rx_resources fails in e1000_open,
 e1000e_free_tx_resources in the err_setup_rx segment is executed.
 The writel(0, tx_ring->head) statement in e1000_clean_tx_ring
 in e1000e_free_tx_resources will cause a null pointer dereference (crash),
 because tx_ring->head is only assigned in e1000_configure_tx
 in e1000_configure, which runs after e1000e_setup_rx_resources.
 
 This patch moves head/tail register writing to e1000_configure_tx/rx,
 which can fix this problem. It is inspired by igb_configure_tx_ring
 in the igb driver.
 
 Special thanks to Alexander Duyck for his valuable suggestion.
 
 Signed-off-by: Jia-Ju Bai baijiaju1...@163.com
 ---
  drivers/net/ethernet/intel/e1000e/netdev.c |   24 ++++++++++++------------
  1 file changed, 12 insertions(+), 12 deletions(-)

Tested-by: Aaron Brown aaron.f.br...@intel.com


Re: [B.A.T.M.A.N.] [PATCH 03/10] batman-adv: Make DAT capability changes atomic

2015-08-14 Thread Antonio Quartulli


On 11/08/15 21:36, Sergei Shtylyov wrote:
   /* check if orig node candidate is running DAT */
 -if (!(candidate->capabilities & BATADV_ORIG_CAPA_HAS_DAT))
 +if (!(test_bit(BATADV_ORIG_CAPA_HAS_DAT, &candidate->capabilities)))
 
() around the test_bit() call not needed.

Thanks for the hint, Sergei, even though I don't remember seeing any
complaint from checkpatch.pl about this.

I'll resend the pull request with the fixed patches.

Cheers,


-- 
Antonio Quartulli





pull request: batman-adv 20150814

2015-08-14 Thread Antonio Quartulli
Hi David,

this is our first batch intended for net-next/linux-4.3 (resent after
fixing the parenthesis as reported by Sergei).

Here you have all those non-critical fixes/changes that we couldn't
merge into the net tree as it was already too late in the release
cycle.

This is a summary of what each patch does:

- patch 1 by Sven Eckelmann is changing the way the GW metric is
computed so that the resulting operation does not make use of
divisions and also does not lead to any data type promotion. This is
a requirement for patch 2;
- patch 2 by Ruben Wisniewski is changing the type of the variable used
in the same GW metric computation as patch 1 to uint64_t so that potential
integer overflows are prevented. Thanks to Sven's patch above no 64bit
division will be involved;
- patches 3, 4, 5 and 6 by Linus Lüssing are converting plain bitwise
operations on capability bits to set/clear/test_bit() in order to ensure
their atomicity and prevent potential race conditions;
- patch 7, also by Linus, is making the multicast TVLV parsing routine
thread-safe in order to prevent potential race conditions upon reception
of two OGMs from the same originator at the same time;
- patch 8 by Marek Lindner prevents potential double deletions
of TT Request objects from its lists which would lead to a kernel crash;
- patch 9 by Simon Wunderlich is ensuring that no enqueued packet is leaked
when an interface is deactivated;
- patch 10 by Linus Lüssing is setting the network header in the skb struct
right after a packet was delivered to the batman virtual interface so that
subsequent calls to ip/ipv6_hdr() do not crash.


Please pull or let me know of any problem!

Thanks a lot David,
Antonio



The following changes since commit 07a51cd3794960548627a27aae68c1446341db32:

  vxlan: fix fdb_dump index calculation (2015-08-10 21:15:18 -0700)

are available in the git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batman-adv-for-davem

for you to fetch changes up to 53cf037bf846417fd92dc92ddf97267f69b110f4:

  batman-adv: Fix potentially broken skb network header access (2015-08-14 22:52:10 +0200)


Included changes:
- avoid integer overflow in GW selection routine
- prevent race condition by making capability bit changes atomic (use
  clear/set/test_bit)
- fix synchronization issue in mcast tvlv handler
- fix crash on double list removal of TT Request objects
- fix leak by purging packets enqueued for sending upon iface removal
- ensure network header pointer is set in skb


Linus Lüssing (6):
  batman-adv: Make DAT capability changes atomic
  batman-adv: Make NC capability changes atomic
  batman-adv: Make TT capability changes atomic
  batman-adv: Make MCAST capability changes atomic
  batman-adv: Fix potential synchronization issues in mcast tvlv handler
  batman-adv: Fix potentially broken skb network header access

Marek Lindner (1):
  batman-adv: protect tt request from double deletion

Ruben Wisniewski (1):
  batman-adv: Avoid u32 overflow during gateway select

Simon Wunderlich (1):
  batman-adv: remove broadcast packets scheduled for purged outgoing if

Sven Eckelmann (1):
  batman-adv: Replace gw_reselect divisor with simple shift

 net/batman-adv/distributed-arp-table.c |  7 +--
 net/batman-adv/gateway_client.c|  8 +---
 net/batman-adv/multicast.c | 81 +-
 net/batman-adv/network-coding.c|  7 +--
 net/batman-adv/originator.c|  5 +++
 net/batman-adv/send.c  |  3 +-
 net/batman-adv/soft-interface.c|  7 ++-
 net/batman-adv/translation-table.c | 17 ---
 net/batman-adv/types.h | 15 ---
 9 files changed, 102 insertions(+), 48 deletions(-)


[PATCH 04/10] batman-adv: Make NC capability changes atomic

2015-08-14 Thread Antonio Quartulli
From: Linus Lüssing linus.luess...@c0d3.blue

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: 3f4841ffb336 (batman-adv: tvlv - add network coding container)
Signed-off-by: Linus Lüssing linus.luess...@c0d3.blue
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/network-coding.c | 7 ---
 net/batman-adv/types.h  | 2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index f0a50f3..4660401 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -19,6 +19,7 @@
 #include "main.h"
 
 #include <linux/atomic.h>
+#include <linux/bitops.h>
 #include <linux/byteorder/generic.h>
 #include <linux/compiler.h>
 #include <linux/debugfs.h>
@@ -134,9 +135,9 @@ static void batadv_nc_tvlv_ogm_handler_v1(struct batadv_priv *bat_priv,
  uint16_t tvlv_value_len)
 {
if (flags & BATADV_TVLV_HANDLER_OGM_CIFNOTFND)
-   orig->capabilities &= ~BATADV_ORIG_CAPA_HAS_NC;
+   clear_bit(BATADV_ORIG_CAPA_HAS_NC, &orig->capabilities);
else
-   orig->capabilities |= BATADV_ORIG_CAPA_HAS_NC;
+   set_bit(BATADV_ORIG_CAPA_HAS_NC, &orig->capabilities);
 }
 
 /**
@@ -894,7 +895,7 @@ void batadv_nc_update_nc_node(struct batadv_priv *bat_priv,
goto out;
 
/* check if orig node is network coding enabled */
-   if (!(orig_node->capabilities & BATADV_ORIG_CAPA_HAS_NC))
+   if (!test_bit(BATADV_ORIG_CAPA_HAS_NC, &orig_node->capabilities))
goto out;
 
/* accept ogms from 'good' neighbors and single hop neighbors */
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 29fd625..ed4aec5 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -314,7 +314,7 @@ struct batadv_orig_node {
  */
 enum batadv_orig_capabilities {
BATADV_ORIG_CAPA_HAS_DAT,
-   BATADV_ORIG_CAPA_HAS_NC = BIT(1),
+   BATADV_ORIG_CAPA_HAS_NC,
BATADV_ORIG_CAPA_HAS_TT = BIT(2),
BATADV_ORIG_CAPA_HAS_MCAST = BIT(3),
 };
-- 
2.5.0
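
The difference between a plain |= / &= update and an atomic bit
operation can be sketched in userspace with C11 atomics. This is only
an illustration, not the kernel implementation: capa_set(),
capa_clear(), capa_test() and the CAPA_* names are hypothetical
stand-ins for set_bit(), clear_bit(), test_bit() and the
BATADV_ORIG_CAPA_* values. It also shows why the enum has to hold bit
numbers rather than BIT(n) masks once the bitops helpers are used: the
helpers take an index. The sketch assumes every capability has already
been converted to an index.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers, not masks: the helpers below take an index. */
enum capa_bit {
	CAPA_HAS_DAT,		/* 0 */
	CAPA_HAS_NC,		/* 1 */
	CAPA_HAS_TT,		/* 2 */
	CAPA_HAS_MCAST,		/* 3 */
	CAPA_NUM_BITS
};

/* Stand-in for set_bit(): a single atomic read-modify-write. */
static void capa_set(enum capa_bit nr, atomic_ulong *word)
{
	assert(nr < CAPA_NUM_BITS);
	atomic_fetch_or(word, 1UL << nr);
}

/* Stand-in for clear_bit(): other bits cannot be lost in between. */
static void capa_clear(enum capa_bit nr, atomic_ulong *word)
{
	assert(nr < CAPA_NUM_BITS);
	atomic_fetch_and(word, ~(1UL << nr));
}

/* Stand-in for test_bit(): read the word once and check one bit. */
static bool capa_test(enum capa_bit nr, atomic_ulong *word)
{
	assert(nr < CAPA_NUM_BITS);
	return (atomic_load(word) >> nr) & 1UL;
}
```

With a non-atomic "word |= mask", two handlers can both read the old
value and the later store silently drops the other handler's change.
The fetch-or/fetch-and form leaves no such window.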



[BUG net-next] infamous dev refcnt leak... again.

2015-08-14 Thread Eric Dumazet
While rebooting host running latest net-next

 unregister_netdevice: waiting for eth0 to become free. Usage count = 4

Oh well...




[PATCH 10/10] batman-adv: Fix potentially broken skb network header access

2015-08-14 Thread Antonio Quartulli
From: Linus Lüssing linus.luess...@c0d3.blue

The two commits noted below added calls to ip_hdr() and ipv6_hdr(). They
need a correctly set skb network header.

Unfortunately we cannot rely on the device drivers to set it for us.
Therefore set it at the beginning of the corresponding ndo_start_xmit
handler.

Fixes: 1d8ab8d3c176 (batman-adv: Modified forwarding behaviour for multicast packets)
Fixes: ab49886e3da7 (batman-adv: Add IPv4 link-local/IPv6-ll-all-nodes multicast support)
Signed-off-by: Linus Lüssing linus.luess...@c0d3.blue
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/soft-interface.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index c002961..926292d 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -202,6 +202,7 @@ static int batadv_interface_tx(struct sk_buff *skb,
int gw_mode;
enum batadv_forw_mode forw_mode;
struct batadv_orig_node *mcast_single_orig = NULL;
+   int network_offset = ETH_HLEN;
 
if (atomic_read(&bat_priv->mesh_state) != BATADV_MESH_ACTIVE)
goto dropped;
@@ -214,14 +215,18 @@ static int batadv_interface_tx(struct sk_buff *skb,
case ETH_P_8021Q:
vhdr = vlan_eth_hdr(skb);
 
-   if (vhdr->h_vlan_encapsulated_proto != ethertype)
+   if (vhdr->h_vlan_encapsulated_proto != ethertype) {
+   network_offset += VLAN_HLEN;
break;
+   }
 
/* fall through */
case ETH_P_BATMAN:
goto dropped;
}
 
+   skb_set_network_header(skb, network_offset);
+
if (batadv_bla_tx(bat_priv, skb, vid))
goto dropped;
 
-- 
2.5.0
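
The offset passed to skb_set_network_header() in this patch can be
sketched as a small pure function. This is a userspace illustration
only: network_offset_for() is a hypothetical helper, protocol values
are taken in host byte order, and the switch with fall-through from the
patch is rewritten as plain if statements. The constants are the usual
Ethernet values (14-byte header, 4-byte 802.1Q tag).

```c
#include <stdint.h>

#define ETH_HLEN	14	/* Ethernet header length in bytes */
#define VLAN_HLEN	4	/* extra bytes added by one 802.1Q tag */
#define ETH_P_8021Q	0x8100	/* 802.1Q VLAN tagged frame */
#define ETH_P_BATMAN	0x4305	/* batman-adv ethertype */

/*
 * Return the offset of the network header from the start of the frame,
 * or -1 if the frame would be dropped because it already carries
 * batman-adv traffic (directly or inside a VLAN tag).
 */
static int network_offset_for(uint16_t proto, uint16_t vlan_encap_proto)
{
	if (proto == ETH_P_BATMAN)
		return -1;

	if (proto == ETH_P_8021Q) {
		if (vlan_encap_proto == ETH_P_BATMAN)
			return -1;
		return ETH_HLEN + VLAN_HLEN;
	}

	return ETH_HLEN;
}
```

An untagged IPv4 frame yields 14 and a VLAN-tagged one yields 18, which
is where ip_hdr() and ipv6_hdr() then expect to find the IP header.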



[PATCH net-next] ipv6: trivial whitespace fix

2015-08-14 Thread Ian Morris
Change brace placement to be in line with coding standards

Signed-off-by: Ian Morris i...@chirality.org.uk
---
 net/ipv6/udp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index e51fc3e..0aba654 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1496,7 +1496,8 @@ int __net_init udp6_proc_init(struct net *net)
return udp_proc_register(net, &udp6_seq_afinfo);
 }
 
-void udp6_proc_exit(struct net *net) {
+void udp6_proc_exit(struct net *net)
+{
udp_proc_unregister(net, &udp6_seq_afinfo);
 }
 #endif /* CONFIG_PROC_FS */
-- 
1.9.1



Re: [PATCH 1/6] net/bonding: enable LRO if one device supports it

2015-08-14 Thread Jarod Wilson

On 2015-08-14 2:56 AM, Michal Kubecek wrote:

On Thu, Aug 13, 2015 at 02:02:55PM -0400, Jarod Wilson wrote:

Currently, all bonding devices come up, and claim to have LRO support,
which ethtool will let you toggle on and off, even if none of the
underlying hardware devices actually support it. While the bonding driver
takes precautions for slaves that don't support all features, this is at
least a little bit misleading to users.

If we add NETIF_F_LRO to the NETIF_F_ONE_FOR_ALL flags in
netdev_features.h, then netdev_increment_features() will only enable LRO
if 1) it's listed in the device's feature mask and 2) there's actually a
slave present that supports the feature.

Note that this is going to require some follow-up patches, as not all LRO
capable device drivers are currently properly reporting LRO support in
their vlan_features, which is where the bonding driver picks up
device-specific features.

CC: David S. Miller da...@davemloft.net
CC: Jiri Pirko j...@resnulli.us
CC: Tom Herbert therb...@google.com
CC: Scott Feldman sfel...@gmail.com
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson ja...@redhat.com
---
  include/linux/netdev_features.h | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 9672781..6440bf1 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -159,7 +159,8 @@ enum {
   */
  #define NETIF_F_ONE_FOR_ALL   (NETIF_F_GSO_SOFTWARE | NETIF_F_GSO_ROBUST | \
 NETIF_F_SG | NETIF_F_HIGHDMA | \
-NETIF_F_FRAGLIST | NETIF_F_VLAN_CHALLENGED)
+NETIF_F_FRAGLIST | NETIF_F_VLAN_CHALLENGED | \
+NETIF_F_LRO)

  /*
   * If one device doesn't support one of these features, then disable it
--


I don't think this is going to work the way you expect. Assume we have a
non-LRO eth1 and LRO capable eth2. If we enslave eth1 first, bond will
lose NETIF_F_LRO so that while enslaving eth2, bond_enslave() does run

if (!(bond_dev->features & NETIF_F_LRO))
dev_disable_lro(slave_dev);

and disable LRO on eth2 even before computing the bond features so that
in the end, all three interfaces end up with disabled LRO. If you add
the slaves in the opposite order, you end up with eth2 and bond having
LRO enabled. IMHO features should not depend on the order in which
slaves are added into the bond.


Crap, you're right. Hadn't tried inverting the order of added devices, 
as it didn't occur to me that it would make a difference.



You would need to remove the code quoted above to make things work the
way you want (or move it after the call to bond_compute_features() which
is effectively the same). But then the result would be even worse:
adding a LRO-capable slave to a bond having dev_disable_lro() called on
it would not disable LRO on that slave, possibly (or rather likely)
causing communication breakage.

I believe NETIF_F_LRO in its original sense should be only considered
for physical devices; even if it's not explicitly said in the commit
message, the logic behind fbe168ba91f7 ("net: generic dev_disable_lro()
stacked device handling") is that for stacked devices like bond or team,
NETIF_F_LRO means "allow slaves to use LRO if they can and want" while
its absence means "disable LRO on all slaves". If you wanted NETIF_F_LRO
for a bond to mean "there is at least one LRO capable slave", you would
need a new flag for the "LRO should be disabled for all lower devices"
state. I don't think it's worth the effort.


Yeah, my thinking was that it should mean there's at least one LRO
capable slave. If we just leave things the way they are though, I think
it's confusing on the user side -- it was one of our QE people who
reported confusion about being able to toggle LRO on a bond when none of
the slaves supported it. And there's also the inconsistency among devices
that support LRO in their vlan_features. So I think *something* should
still be done here to make things clearer and more consistent, but I'll
have to ponder that next week, since it's beyond quitting time on Friday
already. :)


Oh, last thought: the comment above #define NETIF_F_ONE_FOR_ALL is 
partly to blame for my not thinking harder and trying inverted ordering 
of slave additions:


/*
 * If one device supports one of these features, then enable them
 * for all in netdev_increment_features.
 */

This clearly seems to fall down in the lro case. :)

--
Jarod Wilson
ja...@redhat.com
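
The ordering problem discussed in this thread can be modelled in a few
lines of userspace C. This is a deliberately simplified sketch, not the
bonding driver: struct dev, struct bond, bond_compute_lro() and
bond_enslave_model() are hypothetical, a single bool stands in for
NETIF_F_LRO, and the bond is assumed to start out advertising LRO. The
model only reproduces the two steps described in the thread: the
dev_disable_lro() check at enslave time, followed by a "one for all"
recomputation of the bond feature.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLAVES 4

struct dev {
	bool lro;			/* stand-in for NETIF_F_LRO */
};

struct bond {
	struct dev self;
	struct dev *slaves[MAX_SLAVES];
	size_t nslaves;
};

/* "One for all": the bond keeps LRO only if some slave has it. */
static void bond_compute_lro(struct bond *b)
{
	b->self.lro = false;
	for (size_t i = 0; i < b->nslaves; i++)
		if (b->slaves[i]->lro)
			b->self.lro = true;
}

/* Same order of operations as described in the thread, nothing more. */
static void bond_enslave_model(struct bond *b, struct dev *slave)
{
	if (b->nslaves >= MAX_SLAVES)
		return;

	if (!b->self.lro)
		slave->lro = false;	/* models dev_disable_lro(slave_dev) */

	b->slaves[b->nslaves++] = slave;
	bond_compute_lro(b);
}
```

Enslaving the non-LRO device first leaves all three interfaces without
LRO, while the opposite order leaves LRO enabled on the bond and on the
capable slave, which is the order dependence objected to in the review.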


Re: [BUG net-next] infamous dev refcnt leak... again.

2015-08-14 Thread Eric Dumazet
On Fri, 2015-08-14 at 16:31 -0700, Eric Dumazet wrote:

 
 I simply reboot my host. The eth0 device cannot be dismantled and blocks the
 reboot, so I have to reset the host.
 
 I get the issue every time.
 
 I confirm reverting 3bfd847203c6d89532f836ad3f5b4ff4ced26dd9
 removes the issue for me.
 

Also, netif_index_is_vrf() is supposed to be called under RCU,
but that is not the case in net/ipv4/udp.c and ip_route_connect_init().




Re: [PATCH 1/5] net: add Hisilicon Network Subsystem support (config and documents)

2015-08-14 Thread Arnd Bergmann
On Friday 14 August 2015 18:30:18 Kenneth Lee wrote:
 diff --git a/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
 new file mode 100644
 index 000..5ab6969
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/net/hisilicon-hns-nic.txt
 @@ -0,0 +1,14 @@
 +Hisilicon Network Subsystem NIC controller
 +
 +Required properties:
 +- compatible: hisilicon,hns-nic
 +- ae-name: accelerator name who provide this interface
 +- ae-opts: options (string) to the accelerator. e.g. the index interface
 +
 +Example:
 +
 +   ethernet@0{
 +   compatible = "hisilicon,hns-nic";
 +   ae-name = "soc0-n4";
 +   ae-opts = "0";
 +   };
 

These properties look very unconventional. What are the valid strings
for ae-name and ae-opts? It looks like the latter is just a number,
so why not use an integer property?

Arnd


[PATCH 01/10] batman-adv: Replace gw_reselect divisor with simple shift

2015-08-14 Thread Antonio Quartulli
From: Sven Eckelmann s...@narfation.org

The gw_factor is divided by BATADV_TQ_LOCAL_WINDOW_SIZE ** 2 * 64. But the
rest of the calculation has nothing to do with the tq window size and
therefore the calculation is just (tmp_gw_factor / (64 ** 3)).

Replace it with a simple shift to avoid a costly 64-bit divide when the
max_gw_factor is changed from u32 to u64. This type change is necessary
to avoid an overflow bug.

Signed-off-by: Sven Eckelmann s...@narfation.org
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/gateway_client.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/net/batman-adv/gateway_client.c b/net/batman-adv/gateway_client.c
index bb015862..e1e1f31 100644
--- a/net/batman-adv/gateway_client.c
+++ b/net/batman-adv/gateway_client.c
@@ -154,14 +154,10 @@ batadv_gw_get_best_gw_node(struct batadv_priv *bat_priv)
struct batadv_neigh_ifinfo *router_ifinfo;
struct batadv_gw_node *gw_node, *curr_gw = NULL;
uint32_t max_gw_factor = 0, tmp_gw_factor = 0;
-   uint32_t gw_divisor;
uint8_t max_tq = 0;
uint8_t tq_avg;
struct batadv_orig_node *orig_node;
 
-   gw_divisor = BATADV_TQ_LOCAL_WINDOW_SIZE * BATADV_TQ_LOCAL_WINDOW_SIZE;
-   gw_divisor *= 64;
-
rcu_read_lock();
hlist_for_each_entry_rcu(gw_node, &bat_priv->gw.list, list) {
if (gw_node->deleted)
@@ -187,7 +183,7 @@ batadv_gw_get_best_gw_node(struct batadv_priv *bat_priv)
tmp_gw_factor = tq_avg * tq_avg;
tmp_gw_factor *= gw_node->bandwidth_down;
tmp_gw_factor *= 100 * 100;
-   tmp_gw_factor /= gw_divisor;
+   tmp_gw_factor >>= 18;
 
if ((tmp_gw_factor > max_gw_factor) ||
((tmp_gw_factor == max_gw_factor) &&
-- 
2.5.0
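
That the shift is equivalent to the removed division can be checked
with a short userspace sketch. It assumes, as the commit message
implies, that BATADV_TQ_LOCAL_WINDOW_SIZE is 64, so the old divisor is
64 * 64 * 64 = 262144 = 1 << 18. The function names are hypothetical
and exist only for this comparison.

```c
#include <stdint.h>

/* Assumed value of BATADV_TQ_LOCAL_WINDOW_SIZE. */
#define TQ_LOCAL_WINDOW_SIZE 64

/* Old form: build the divisor at run time, then divide. */
static uint64_t gw_factor_div(uint64_t factor)
{
	uint64_t gw_divisor = TQ_LOCAL_WINDOW_SIZE * TQ_LOCAL_WINDOW_SIZE;

	gw_divisor *= 64;
	return factor / gw_divisor;
}

/* New form: 64 * 64 * 64 == 1 << 18, so shift right by 18 bits. */
static uint64_t gw_factor_shift(uint64_t factor)
{
	return factor >> 18;
}
```

For unsigned values, dividing by a power of two and shifting right by
its exponent give the same result, and the shift avoids a 64-bit divide
on 32-bit architectures.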



[PATCH 02/10] batman-adv: Avoid u32 overflow during gateway select

2015-08-14 Thread Antonio Quartulli
From: Ruben Wisniewski ru...@freifunk-nrw.de

The gateway selection based on fast connections is using a single value
calculated from the average tq (0-255) and the download bandwidth (in
100Kibit). The formula for the first step (tq ** 2 * 10000 * bandwidth)
tends to overflow a u32 with low bandwidth settings like 50 [100KiBit]
and a tq value of over 92.

Changing this to a 64 bit unsigned integer allows to support a
bandwidth_down with up to ~2.8e10 [100KiBit] and a perfect tq of 255. This
is ~6.6 times higher than the maximum possible value of the gateway
announcement TVLV.

This problem only affects the non-default gw_sel_class 1.

Signed-off-by: Ruben Wisniewsi ru...@vfn-nrw.de
[s...@narfation.org: rewritten commit message]
Signed-off-by: Sven Eckelmann s...@narfation.org
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/gateway_client.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/gateway_client.c b/net/batman-adv/gateway_client.c
index e1e1f31..4ac24d8 100644
--- a/net/batman-adv/gateway_client.c
+++ b/net/batman-adv/gateway_client.c
@@ -153,7 +153,7 @@ batadv_gw_get_best_gw_node(struct batadv_priv *bat_priv)
struct batadv_neigh_node *router;
struct batadv_neigh_ifinfo *router_ifinfo;
struct batadv_gw_node *gw_node, *curr_gw = NULL;
-   uint32_t max_gw_factor = 0, tmp_gw_factor = 0;
+   uint64_t max_gw_factor = 0, tmp_gw_factor = 0;
uint8_t max_tq = 0;
uint8_t tq_avg;
struct batadv_orig_node *orig_node;
-- 
2.5.0
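
The overflow threshold quoted in the commit message can be reproduced
in userspace. This is a sketch of the first step of the calculation
only, with hypothetical function names; the factor 10000 corresponds to
the "100 * 100" multiplication in batadv_gw_get_best_gw_node().

```c
#include <stdint.h>

/* First step of the metric in 32 bits: wraps modulo 2^32. */
static uint32_t gw_step_u32(uint8_t tq_avg, uint32_t bandwidth_down)
{
	uint32_t factor = (uint32_t)tq_avg * tq_avg;

	factor *= bandwidth_down;
	factor *= 100 * 100;
	return factor;
}

/* Same computation carried out in 64 bits. */
static uint64_t gw_step_u64(uint8_t tq_avg, uint32_t bandwidth_down)
{
	uint64_t factor = (uint64_t)tq_avg * tq_avg;

	factor *= bandwidth_down;
	factor *= 100 * 100;
	return factor;
}
```

With bandwidth_down = 50, a tq of 92 gives 4232000000, which still fits
in 32 bits, while a tq of 93 gives 4324500000, which exceeds 2^32 - 1
and wraps to a much smaller number. A better link can therefore score
lower than a worse one in the 32-bit version.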



[PATCH net-next] ipv6: remove unnecessary include

2015-08-14 Thread Ian Morris
printk.h does not need to be explicitly included as we include kernel.h,
which already includes it.

Signed-off-by: Ian Morris i...@chirality.org.uk
---
 net/ipv6/ip6_offload.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 08b6204..1cb2dc7 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -12,7 +12,6 @@
 #include <linux/socket.h>
 #include <linux/netdevice.h>
 #include <linux/skbuff.h>
-#include <linux/printk.h>
 
 #include <net/protocol.h>
 #include <net/ipv6.h>
-- 
1.9.1



Re: Linux 4.2-rc6 regression: RIP: e030:[ffffffff8110fb18] [ffffffff8110fb18] detach_if_pending+0x18/0x80

2015-08-14 Thread Sander Eikelenboom

On 2015-08-15 00:09, Sander Eikelenboom wrote:

On 2015-08-13 00:41, Eric Dumazet wrote:

On Wed, 2015-08-12 at 23:46 +0200, Sander Eikelenboom wrote:


Thanks for the reminder, but luckily i was aware of that,
seen enough of your replies asking for patches to be resubmitted
against the other tree ;)
Kernel with patch is currently running so fingers crossed.


Thanks for testing. I am definitely interested knowing your results.


Hmm it seems now commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af is
breaking things
(have to test if a revert helps) i get this in some guests:


Should have done that before, because it wasn't in yet .. and likely to 
fix the issue,

also pulled and compiling now.

--
Sander




NMI watchdog: BUG: soft lockup - CPU#0 stuck for 506s! [swapper/0:0]
[ 6620.282805] Modules linked in:
[ 6620.282805] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6-20150814-linus-doflr-apicrevert+ #1
[ 6620.282805] task: 8221a580 ti: 8220 task.ti: 8220
[ 6620.282805] RIP: e030:[8100122a]  [8100122a] xen_hypercall_xen_version+0xa/0x20
[ 6620.282805] RSP: e02b:88000fc03d48  EFLAGS: 0246
[ 6620.282805] RAX: 00040006 RBX: 0200 RCX: 8100122a
[ 6620.282805] RDX: 0001 RSI: deadbeef RDI: deadbeef
[ 6620.282805] RBP: 88000fc03d60 R08: 88000fc03ee0 R09: 00ee
[ 6620.282805] R10: 8220a0c0 R11: 0246 R12:
[ 6620.282805] R13: 0001 R14: 880003b53054 R15: 0005
[ 6620.282805] FS:  7fec747ad800() GS:88000fc0() knlGS:
[ 6620.282805] CS:  e033 DS:  ES:  CR0: 8005003b
[ 6620.282805] CR2: 7ffcb7a7a6d8 CR3: 03164000 CR4: 0660
[ 6620.282805] Stack:
[ 6620.282805]  0068 0007 81008dbd 88000fc03dd8
[ 6620.282805]  81009592 0068 8220a0c0 00ee
[ 6620.282805]  88000fc03ee0 0200 0200 0001
[ 6620.282805] Call Trace:
[ 6620.282805]  <IRQ>
[ 6620.282805]  [81008dbd] ? xen_force_evtchn_callback+0xd/0x10
[ 6620.282805]  [81009592] check_events+0x12/0x20
[ 6620.282805]  [8100957f] ? xen_restore_fl_direct_reloc+0x4/0x4
[ 6620.282805]  [81af79a5] ? _raw_spin_unlock_irqrestore+0x25/0x30
[ 6620.282805]  [8110ed43] try_to_del_timer_sync+0x43/0x60
[ 6620.282805]  [8110eda7] del_timer_sync+0x47/0x60
[ 6620.282805]  [81a2b698] inet_csk_reqsk_queue_drop+0x118/0x1f0
[ 6620.282805]  [81a2b8c6] reqsk_timer_handler+0x156/0x260
[ 6620.282805]  [81a2b770] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805]  [8110f3c7] call_timer_fn.isra.27+0x17/0x80
[ 6620.282805]  [81a2b770] ? inet_csk_reqsk_queue_drop+0x1f0/0x1f0
[ 6620.282805]  [8110f55d] run_timer_softirq+0x12d/0x200
[ 6620.282805]  [810ca6c3] __do_softirq+0x103/0x210
[ 6620.282805]  [810ca9cb] irq_exit+0x4b/0xa0
[ 6620.282805]  [814f05d4] xen_evtchn_do_upcall+0x34/0x50
[ 6620.282805]  [81af932e] xen_do_hypervisor_callback+0x1e/0x40
[ 6620.282805]  <EOI>
[ 6620.282805]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805]  [810013aa] ? xen_hypercall_sched_op+0xa/0x20
[ 6620.282805]  [81008d60] ? xen_safe_halt+0x10/0x20
[ 6620.282805]  [810188d3] ? default_idle+0x13/0x20
[ 6620.282805]  [81018e1a] ? arch_cpu_idle+0xa/0x10
[ 6620.282805]  [810f8e7e] ? default_idle_call+0x2e/0x50
[ 6620.282805]  [810f9112] ? cpu_startup_entry+0x272/0x2e0
[ 6620.282805]  [81ae7967] ? rest_init+0x77/0x80
[ 6620.282805]  [82312f58] ? start_kernel+0x43b/0x448
[ 6620.282805]  [823124ef] ? x86_64_start_reservations+0x2a/0x2c
[ 6620.282805]  [82316008] ? xen_start_kernel+0x550/0x55c
[ 6620.282805] Code: cc 51 41 53 b8 10 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 11 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc



Re: [PATCH net] be2net: avoid vxlan offloading on multichannel configs

2015-08-14 Thread Ajit Kumar Khaparde

 On Aug 14, 2015, at 3:30 PM, Ivan Vecera ivec...@redhat.com wrote:
 
 VxLAN offloading is not functional if the NIC is running in multichannel
 mode (UMC, FLEX-10, VNIC...). Enabling this additionally kills whole
 connectivity through the NIC and the device needs to be down and up to
 restore it. The firmware should take care about it and does not allow
 the conversion of interface to tunnel type (be_cmd_manage_iface) or should
 support VxLAN offloading if multichannel config is enabled.
 I have tested this on the latest available firmware (10.6.144.21).
 
 Result:
 [root@sm-04 ~]# ip link set enp5s0f0 up
 [root@sm-04 ~]# ip addr add 172.30.10.50/24 dev enp5s0f0
 [root@sm-04 ~]# ping -c 3 172.30.10.254
 PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
 64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.317 ms
 64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.187 ms
 64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.188 ms
 
 --- 172.30.10.254 ping statistics ---
 3 packets transmitted, 3 received, 0% packet loss, time 2000ms
 rtt min/avg/max/mdev = 0.187/0.230/0.317/0.063 ms
 [root@sm-04 ~]# ip link add link enp5s0f0 vxlan10 type vxlan id 10 remote 172.30.10.60 dstport 4789
 [root@sm-04 ~]# ip link set vxlan10 up
 [ 7900.442811] be2net :05:00.0: Enabled VxLAN offloads for UDP port 4789
 [ 7900.455722] be2net :05:00.1: Enabled VxLAN offloads for UDP port 4789
 [ 7900.468635] be2net :05:00.2: Enabled VxLAN offloads for UDP port 4789
 [ 7900.481553] be2net :05:00.3: Enabled VxLAN offloads for UDP port 4789
 [root@sm-04 ~]# ping -c 3 172.30.10.254
 PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
 
 --- 172.30.10.254 ping statistics ---
 3 packets transmitted, 0 received, 100% packet loss, time 1999ms
 
 [root@sm-04 ~]# ip link set vxlan10 down
 [ 7959.434093] be2net :05:00.0: Disabled VxLAN offloads for UDP port 4789
 [ 7959.444792] be2net :05:00.1: Disabled VxLAN offloads for UDP port 4789
 [ 7959.455592] be2net :05:00.2: Disabled VxLAN offloads for UDP port 4789
 [ 7959.466416] be2net :05:00.3: Disabled VxLAN offloads for UDP port 4789
 [root@sm-04 ~]# ip link del vxlan10
 [root@sm-04 ~]# ping -c 3 172.30.10.254
 PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
 
 --- 172.30.10.254 ping statistics ---
 3 packets transmitted, 0 received, 100% packet loss, time 1999ms
 
 [root@sm-04 ~]# ip link set enp5s0f0 down
 [root@sm-04 ~]# ip link set enp5s0f0 up
 [ 8071.019003] be2net :05:00.0 enp5s0f0: Link is Up
 [root@sm-04 ~]# ping -c 3 172.30.10.254
 PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
 64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.318 ms
 64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.196 ms
 64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.194 ms
 
 --- 172.30.10.254 ping statistics ---
 3 packets transmitted, 3 received, 0% packet loss, time 2000ms
 rtt min/avg/max/mdev = 0.194/0.236/0.318/0.057 ms
 
 Cc: Sathya Perla sathya.pe...@avagotech.com
 Cc: Ajit Khaparde ajit.khapa...@avagotech.com
 Cc: Padmanabh Ratnakar padmanabh.ratna...@avagotech.com
 Cc: Sriharsha Basavapatna sriharsha.basavapa...@avagotech.com
 Signed-off-by: Ivan Vecera ivec...@redhat.com
Acked-by: Ajit Khaparde ajit.khapa...@avagotech.com

 
 ---
 drivers/net/ethernet/emulex/benet/be_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
 index c28e3bf..6ca693b 100644
 --- a/drivers/net/ethernet/emulex/benet/be_main.c
 +++ b/drivers/net/ethernet/emulex/benet/be_main.c
 @@ -5174,7 +5174,7 @@ static void be_add_vxlan_port(struct net_device *netdev, sa_family_t sa_family,
   struct device *dev = &adapter->pdev->dev;
   int status;
 
 - if (lancer_chip(adapter) || BEx_chip(adapter))
 + if (lancer_chip(adapter) || BEx_chip(adapter) || be_is_mc(adapter))
   return;
 
   if (adapter->flags & BE_FLAGS_VXLAN_OFFLOADS) {
 @@ -5221,7 +5221,7 @@ static void be_del_vxlan_port(struct net_device *netdev, sa_family_t sa_family,
 {
   struct be_adapter *adapter = netdev_priv(netdev);
 
 - if (lancer_chip(adapter) || BEx_chip(adapter))
 + if (lancer_chip(adapter) || BEx_chip(adapter) || be_is_mc(adapter))
   return;
 
   if (adapter->vxlan_port != port)
 -- 
 2.4.6
 



[PATCH 09/10] batman-adv: remove broadcast packets scheduled for purged outgoing if

2015-08-14 Thread Antonio Quartulli
From: Simon Wunderlich si...@open-mesh.com

When an interface is purged, the broadcast packets scheduled for this
interface should get purged as well.

Signed-off-by: Simon Wunderlich si...@open-mesh.com
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/send.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 0a01992..191076e 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -616,7 +616,8 @@ batadv_purge_outstanding_packets(struct batadv_priv *bat_priv,
 * we delete only packets belonging to the given interface
 */
if ((hard_iface) &&
-   (forw_packet->if_incoming != hard_iface))
+   (forw_packet->if_incoming != hard_iface) &&
+   (forw_packet->if_outgoing != hard_iface))
continue;
 
spin_unlock_bh(&bat_priv->forw_bcast_list_lock);
-- 
2.5.0
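
The changed condition can be read as a predicate that decides whether a
queued broadcast packet is purged. The following userspace sketch uses
hypothetical minimal types and a hypothetical should_purge() helper. It
mirrors only the if statement in the patch, not the surrounding list
handling or locking.

```c
#include <stdbool.h>
#include <stddef.h>

struct hard_iface {
	int id;
};

struct forw_packet {
	const struct hard_iface *if_incoming;
	const struct hard_iface *if_outgoing;
};

/*
 * Return true if the packet should be purged. A NULL hard_iface means
 * "purge everything"; otherwise the packet is kept only when neither
 * its incoming nor its outgoing interface is the one being purged.
 */
static bool should_purge(const struct forw_packet *pkt,
			 const struct hard_iface *hard_iface)
{
	if (hard_iface &&
	    pkt->if_incoming != hard_iface &&
	    pkt->if_outgoing != hard_iface)
		return false;	/* the "continue" branch in the patch */

	return true;
}
```

Before the patch only if_incoming was compared, so a packet scheduled
to leave through the purged interface but received on another one
stayed in the queue.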



Re: [PATCH 3/5] net: add Hisilicon Network Subsystem MDIO support

2015-08-14 Thread Arnd Bergmann
On Friday 14 August 2015 18:30:20 Kenneth Lee wrote:

 +#define MDIO_BASE_ADDR 0x403C

Does not belong in here (and is not used)

 +#define MDIO_COMMAND_REG   0x0
 +#define MDIO_ADDR_REG  0x4
 +#define MDIO_WDATA_REG 0x8
 +#define MDIO_RDATA_REG 0xc
 +#define MDIO_STA_REG   0x10

These look suspiciously similar to definitions from
drivers/net/ethernet/hisilicon/hip04_mdio.c.

Could the hardware be related? If so, please try to share
the common parts.

 +static inline void mdio_write_reg(void *base, u32 reg, u32 value)
 +{
 +   u8 __iomem *reg_addr = ACCESS_ONCE(base);
 +
 +   writel(value, reg_addr + reg);
 +}
 +
 +#define MDIO_WRITE_REG(a, reg, value) \
 +   mdio_write_reg((a)->vbase, (reg), (value))
 

Something seems wrong here: why do you have an ACCESS_ONCE() on a
local variable? Doesn't this just make the code less efficient
without providing lockless access to shared variables?

The types are inconsistent here, you should get a warning from
running this through 'make C=1' because of the missing __iomem
annotation of the pointer.

Also, why both a macro and an inline function? Just use an inline
function.

Arnd
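
For illustration, the "just use an inline function" suggestion could look roughly like the sketch below. It is a userspace stand-in, not driver code: reg_write32() is a hypothetical replacement for writel(), and the volatile pointer only approximates what the __iomem annotation plus writel() provide in the kernel. There is no ACCESS_ONCE() on the local base pointer and no wrapper macro.

```c
#include <stdint.h>

#define MDIO_WDATA_REG 0x8   /* register offset quoted in the review above */

/* Hypothetical stand-in for the kernel's writel(): one 32-bit store. */
static inline void reg_write32(volatile uint32_t *addr, uint32_t value)
{
    *addr = value;
}

/* Single typed helper: the base pointer carries the "device memory"
 * qualifier (volatile here, __iomem in the kernel) in its type. */
static inline void mdio_write_reg(volatile uint8_t *base, uint32_t reg,
                                  uint32_t value)
{
    reg_write32((volatile uint32_t *)(base + reg), value);
}
```

In a real driver the base parameter would be declared void __iomem * so that sparse ('make C=1') can check the annotation end to end.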


[PATCH 05/10] batman-adv: Make TT capability changes atomic

2015-08-14 Thread Antonio Quartulli
From: Linus Lüssing linus.luess...@c0d3.blue

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: e17931d1a61d (batman-adv: introduce capability initialization bitfield)
Signed-off-by: Linus Lüssing linus.luess...@c0d3.blue
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/translation-table.c | 8 +---
 net/batman-adv/types.h | 4 ++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/batman-adv/translation-table.c 
b/net/batman-adv/translation-table.c
index b482495..1573489 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -19,6 +19,7 @@
 #include "main.h"
 
 #include <linux/atomic.h>
+#include <linux/bitops.h>
 #include <linux/bug.h>
 #include <linux/byteorder/generic.h>
 #include <linux/compiler.h>
@@ -1862,7 +1863,7 @@ void batadv_tt_global_del_orig(struct batadv_priv 
*bat_priv,
}
spin_unlock_bh(list_lock);
}
-   orig_node->capa_initialized &= ~BATADV_ORIG_CAPA_HAS_TT;
+   clear_bit(BATADV_ORIG_CAPA_HAS_TT, &orig_node->capa_initialized);
 }
 
 static bool batadv_tt_global_to_purge(struct batadv_tt_global_entry *tt_global,
@@ -2821,7 +2822,7 @@ static void _batadv_tt_update_changes(struct batadv_priv 
*bat_priv,
return;
}
}
-   orig_node->capa_initialized |= BATADV_ORIG_CAPA_HAS_TT;
+   set_bit(BATADV_ORIG_CAPA_HAS_TT, &orig_node->capa_initialized);
 }
 
 static void batadv_tt_fill_gtable(struct batadv_priv *bat_priv,
@@ -3321,7 +3322,8 @@ static void batadv_tt_update_orig(struct batadv_priv 
*bat_priv,
bool has_tt_init;
 
tt_vlan = (struct batadv_tvlv_tt_vlan_data *)tt_buff;
-   has_tt_init = orig_node->capa_initialized & BATADV_ORIG_CAPA_HAS_TT;
+   has_tt_init = test_bit(BATADV_ORIG_CAPA_HAS_TT,
+  &orig_node->capa_initialized);
 
/* orig table not initialised AND first diff is in the OGM OR the ttvn
 * increased by one - we can apply the attached changes
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index ed4aec5..6f801ef 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -274,7 +274,7 @@ struct batadv_orig_node {
struct hlist_node mcast_want_all_ipv6_node;
 #endif
unsigned long capabilities;
-   uint8_t capa_initialized;
+   unsigned long capa_initialized;
atomic_t last_ttvn;
unsigned char *tt_buff;
int16_t tt_buff_len;
@@ -315,7 +315,7 @@ struct batadv_orig_node {
 enum batadv_orig_capabilities {
BATADV_ORIG_CAPA_HAS_DAT,
BATADV_ORIG_CAPA_HAS_NC,
-   BATADV_ORIG_CAPA_HAS_TT = BIT(2),
+   BATADV_ORIG_CAPA_HAS_TT,
BATADV_ORIG_CAPA_HAS_MCAST = BIT(3),
 };
 
-- 
2.5.0
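
The lost-update problem described in the commit message can be modelled in userspace. The sketch below is an assumption-laden analogue, not the kernel implementation: it uses C11 <stdatomic.h> read-modify-write operations in place of the kernel's set_bit()/clear_bit()/test_bit(), and the capa_* names are hypothetical. Each helper takes a bit number, as the kernel bitops do.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers (0, 1, 2, 3), mirroring the enum after this series. */
enum capa_bit { CAPA_HAS_DAT, CAPA_HAS_NC, CAPA_HAS_TT, CAPA_HAS_MCAST };

/* Atomic OR: a concurrent update to another bit cannot be lost. */
static inline void capa_set(atomic_ulong *word, unsigned int nr)
{
    atomic_fetch_or(word, 1UL << nr);
}

/* Atomic AND with the inverted mask clears exactly one bit. */
static inline void capa_clear(atomic_ulong *word, unsigned int nr)
{
    atomic_fetch_and(word, ~(1UL << nr));
}

static inline bool capa_test(atomic_ulong *word, unsigned int nr)
{
    return (atomic_load(word) >> nr) & 1UL;
}
```

With a plain "word |= mask", two handlers can both load the old value and the later store overwrites the earlier one; the atomic read-modify-write avoids that window.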



[PATCH 07/10] batman-adv: Fix potential synchronization issues in mcast tvlv handler

2015-08-14 Thread Antonio Quartulli
From: Linus Lüssing linus.luess...@c0d3.blue

So far the mcast tvlv handler did not anticipate the processing of
multiple incoming OGMs from the same originator at the same time. This
can lead to various issues:

* Broken refcounting: For instance two mcast handlers might both assume
  that an originator just got multicast capabilities and will together
  wrongly decrease mcast.num_disabled by two, potentially leading to
  an integer underflow.

* Potential kernel panic on hlist_del_rcu(): Two mcast handlers might
  one after another try to do an
  hlist_del_rcu(orig-mcast_want_all_*_node). The second one will
  cause memory corruption / crashes.
  (Reported by: Sven Eckelmann s...@narfation.org)

Right in the beginning the code path makes assumptions about the current
multicast related state of an originator and bases all updates on that. The
easiest and least error-prone way to fix the issues in this case is to
serialize multiple mcast handler invocations with a spinlock.

Fixes: 60432d756cf0 (batman-adv: Announce new capability via multicast TVLV)
Signed-off-by: Linus Lüssing linus.luess...@c0d3.blue
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/multicast.c  | 63 +++--
 net/batman-adv/originator.c |  5 
 net/batman-adv/types.h  |  3 +++
 3 files changed, 58 insertions(+), 13 deletions(-)

diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c
index 8f1ec21..68a9554 100644
--- a/net/batman-adv/multicast.c
+++ b/net/batman-adv/multicast.c
@@ -20,6 +20,7 @@
 
 #include <linux/atomic.h>
 #include <linux/bitops.h>
+#include <linux/bug.h>
 #include <linux/byteorder/generic.h>
 #include <linux/errno.h>
 #include <linux/etherdevice.h>
@@ -589,19 +590,26 @@ batadv_mcast_forw_mode(struct batadv_priv *bat_priv, 
struct sk_buff *skb,
  *
  * If the BATADV_MCAST_WANT_ALL_UNSNOOPABLES flag of this originator,
  * orig, has toggled then this method updates counter and list accordingly.
+ *
+ * Caller needs to hold orig->mcast_handler_lock.
  */
 static void batadv_mcast_want_unsnoop_update(struct batadv_priv *bat_priv,
 struct batadv_orig_node *orig,
 uint8_t mcast_flags)
 {
+   struct hlist_node *node = &orig->mcast_want_all_unsnoopables_node;
+   struct hlist_head *head = &bat_priv->mcast.want_all_unsnoopables_list;
+
/* switched from flag unset to set */
if (mcast_flags & BATADV_MCAST_WANT_ALL_UNSNOOPABLES &&
!(orig->mcast_flags & BATADV_MCAST_WANT_ALL_UNSNOOPABLES)) {
atomic_inc(&bat_priv->mcast.num_want_all_unsnoopables);
 
spin_lock_bh(&bat_priv->mcast.want_lists_lock);
-   hlist_add_head_rcu(&orig->mcast_want_all_unsnoopables_node,
-  &bat_priv->mcast.want_all_unsnoopables_list);
+   /* flag checks above + mcast_handler_lock prevents this */
+   WARN_ON(!hlist_unhashed(node));
+
+   hlist_add_head_rcu(node, head);
spin_unlock_bh(&bat_priv->mcast.want_lists_lock);
/* switched from flag set to unset */
} else if (!(mcast_flags & BATADV_MCAST_WANT_ALL_UNSNOOPABLES) &&
@@ -609,7 +617,10 @@ static void batadv_mcast_want_unsnoop_update(struct 
batadv_priv *bat_priv,
atomic_dec(&bat_priv->mcast.num_want_all_unsnoopables);
 
spin_lock_bh(&bat_priv->mcast.want_lists_lock);
-   hlist_del_rcu(&orig->mcast_want_all_unsnoopables_node);
+   /* flag checks above + mcast_handler_lock prevents this */
+   WARN_ON(hlist_unhashed(node));
+
+   hlist_del_init_rcu(node);
spin_unlock_bh(&bat_priv->mcast.want_lists_lock);
}
 }
@@ -622,19 +633,26 @@ static void batadv_mcast_want_unsnoop_update(struct 
batadv_priv *bat_priv,
  *
  * If the BATADV_MCAST_WANT_ALL_IPV4 flag of this originator, orig, has
  * toggled then this method updates counter and list accordingly.
+ *
+ * Caller needs to hold orig->mcast_handler_lock.
  */
 static void batadv_mcast_want_ipv4_update(struct batadv_priv *bat_priv,
  struct batadv_orig_node *orig,
  uint8_t mcast_flags)
 {
+   struct hlist_node *node = &orig->mcast_want_all_ipv4_node;
+   struct hlist_head *head = &bat_priv->mcast.want_all_ipv4_list;
+
/* switched from flag unset to set */
if (mcast_flags & BATADV_MCAST_WANT_ALL_IPV4 &&
!(orig->mcast_flags & BATADV_MCAST_WANT_ALL_IPV4)) {
atomic_inc(&bat_priv->mcast.num_want_all_ipv4);
 
spin_lock_bh(&bat_priv->mcast.want_lists_lock);
-   hlist_add_head_rcu(&orig->mcast_want_all_ipv4_node,
-  &bat_priv->mcast.want_all_ipv4_list);
+   /* flag checks above + mcast_handler_lock prevents this */
+   

[PATCH 08/10] batman-adv: protect tt request from double deletion

2015-08-14 Thread Antonio Quartulli
From: Marek Lindner mareklind...@neomailbox.ch

The list_del() calls were changed to list_del_init() to prevent
an accidental double deletion in batadv_tt_req_node_new().

Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/translation-table.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/batman-adv/translation-table.c 
b/net/batman-adv/translation-table.c
index 1573489..cd35bb8 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -2196,7 +2196,7 @@ static void batadv_tt_req_list_free(struct batadv_priv 
*bat_priv)
spin_lock_bh(&bat_priv->tt.req_list_lock);
 
list_for_each_entry_safe(node, safe, &bat_priv->tt.req_list, list) {
-   list_del(&node->list);
+   list_del_init(&node->list);
kfree(node);
}
 
@@ -2232,7 +2232,7 @@ static void batadv_tt_req_purge(struct batadv_priv 
*bat_priv)
list_for_each_entry_safe(node, safe, &bat_priv->tt.req_list, list) {
if (batadv_has_timed_out(node->issued_at,
 BATADV_TT_REQUEST_TIMEOUT)) {
-   list_del(&node->list);
+   list_del_init(&node->list);
kfree(node);
}
}
@@ -2514,7 +2514,8 @@ out:
batadv_hardif_free_ref(primary_if);
if (ret && tt_req_node) {
spin_lock_bh(&bat_priv->tt.req_list_lock);
-   list_del(&tt_req_node->list);
+   /* list_del_init() verifies tt_req_node still is in the list */
+   list_del_init(&tt_req_node->list);
spin_unlock_bh(&bat_priv->tt.req_list_lock);
kfree(tt_req_node);
}
@@ -2951,7 +2952,7 @@ static void batadv_handle_tt_response(struct batadv_priv 
*bat_priv,
list_for_each_entry_safe(node, safe, &bat_priv->tt.req_list, list) {
if (!batadv_compare_eth(node->addr, resp_src))
continue;
-   list_del(&node->list);
+   list_del_init(&node->list);
kfree(node);
}
 
-- 
2.5.0
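
Why re-initialising a node on removal makes a second removal harmless can be shown with a small model of a circular doubly linked list. This is a simplified sketch, not the kernel's <linux/list.h>; the list_node type and the list_* helper names are hypothetical.

```c
#include <stdbool.h>

struct list_node {
    struct list_node *next, *prev;
};

/* An initialised node points at itself, i.e. it is an empty list. */
static void list_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

static bool list_is_empty(const struct list_node *n)
{
    return n->next == n;
}

static void list_add_head(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink and re-initialise. A second call only rewrites the node's own
 * self-pointers and never touches its former neighbours. */
static void list_unlink_init(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}
```

A plain unlink that leaves stale (or poisoned) next/prev pointers behind would instead write through those pointers on the second call, which is the accidental double deletion the patch guards against.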



[PATCH 06/10] batman-adv: Make MCAST capability changes atomic

2015-08-14 Thread Antonio Quartulli
From: Linus Lüssing linus.luess...@c0d3.blue

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: 60432d756cf0 (batman-adv: Announce new capability via multicast TVLV)
Signed-off-by: Linus Lüssing linus.luess...@c0d3.blue
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/multicast.c | 18 ++
 net/batman-adv/types.h |  2 +-
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c
index 7aa480b..8f1ec21 100644
--- a/net/batman-adv/multicast.c
+++ b/net/batman-adv/multicast.c
@@ -19,6 +19,7 @@
 #include "main.h"
 
 #include <linux/atomic.h>
+#include <linux/bitops.h>
 #include <linux/byteorder/generic.h>
 #include <linux/errno.h>
 #include <linux/etherdevice.h>
@@ -697,29 +698,30 @@ static void batadv_mcast_tvlv_ogm_handler_v1(struct 
batadv_priv *bat_priv,
uint8_t mcast_flags = BATADV_NO_FLAGS;
bool orig_initialized;
 
-   orig_initialized = orig->capa_initialized & BATADV_ORIG_CAPA_HAS_MCAST;
+   orig_initialized = test_bit(BATADV_ORIG_CAPA_HAS_MCAST,
+   &orig->capa_initialized);
 
/* If mcast support is turned on decrease the disabled mcast node
 * counter only if we had increased it for this node before. If this
 * is a completely new orig_node no need to decrease the counter.
 */
if (orig_mcast_enabled &&
-   !(orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST)) {
+   !test_bit(BATADV_ORIG_CAPA_HAS_MCAST, &orig->capabilities)) {
if (orig_initialized)
atomic_dec(&bat_priv->mcast.num_disabled);
-   orig->capabilities |= BATADV_ORIG_CAPA_HAS_MCAST;
+   set_bit(BATADV_ORIG_CAPA_HAS_MCAST, &orig->capabilities);
/* If mcast support is being switched off or if this is an initial
 * OGM without mcast support then increase the disabled mcast
 * node counter.
 */
} else if (!orig_mcast_enabled &&
-  (orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST ||
+  (test_bit(BATADV_ORIG_CAPA_HAS_MCAST, &orig->capabilities) ||
!orig_initialized)) {
atomic_inc(&bat_priv->mcast.num_disabled);
-   orig->capabilities &= ~BATADV_ORIG_CAPA_HAS_MCAST;
+   clear_bit(BATADV_ORIG_CAPA_HAS_MCAST, &orig->capabilities);
}
 
-   orig->capa_initialized |= BATADV_ORIG_CAPA_HAS_MCAST;
+   set_bit(BATADV_ORIG_CAPA_HAS_MCAST, &orig->capa_initialized);
 
if (orig_mcast_enabled && tvlv_value &&
(tvlv_value_len >= sizeof(mcast_flags)))
@@ -763,8 +765,8 @@ void batadv_mcast_purge_orig(struct batadv_orig_node *orig)
 {
struct batadv_priv *bat_priv = orig->bat_priv;
 
-   if (!(orig->capabilities & BATADV_ORIG_CAPA_HAS_MCAST) &&
-   orig->capa_initialized & BATADV_ORIG_CAPA_HAS_MCAST)
+   if (!test_bit(BATADV_ORIG_CAPA_HAS_MCAST, &orig->capabilities) &&
+   test_bit(BATADV_ORIG_CAPA_HAS_MCAST, &orig->capa_initialized))
atomic_dec(&bat_priv->mcast.num_disabled);
 
batadv_mcast_want_unsnoop_update(bat_priv, orig, BATADV_NO_FLAGS);
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 6f801ef..1eeed18 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -316,7 +316,7 @@ enum batadv_orig_capabilities {
BATADV_ORIG_CAPA_HAS_DAT,
BATADV_ORIG_CAPA_HAS_NC,
BATADV_ORIG_CAPA_HAS_TT,
-   BATADV_ORIG_CAPA_HAS_MCAST = BIT(3),
+   BATADV_ORIG_CAPA_HAS_MCAST,
 };
 
 /**
-- 
2.5.0



[PATCH 03/10] batman-adv: Make DAT capability changes atomic

2015-08-14 Thread Antonio Quartulli
From: Linus Lüssing linus.luess...@c0d3.blue

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: 17cf0ea455f1 (batman-adv: tvlv - add distributed arp table container)
Signed-off-by: Linus Lüssing linus.luess...@c0d3.blue
Signed-off-by: Marek Lindner mareklind...@neomailbox.ch
Signed-off-by: Antonio Quartulli anto...@meshcoding.com
---
 net/batman-adv/distributed-arp-table.c | 7 ---
 net/batman-adv/types.h | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/batman-adv/distributed-arp-table.c 
b/net/batman-adv/distributed-arp-table.c
index fb54e6a..1cfba20 100644
--- a/net/batman-adv/distributed-arp-table.c
+++ b/net/batman-adv/distributed-arp-table.c
@@ -19,6 +19,7 @@
 #include "main.h"
 
 #include <linux/atomic.h>
+#include <linux/bitops.h>
 #include <linux/byteorder/generic.h>
 #include <linux/errno.h>
 #include <linux/etherdevice.h>
@@ -453,7 +454,7 @@ static bool batadv_is_orig_node_eligible(struct 
batadv_dat_candidate *res,
int j;
 
/* check if orig node candidate is running DAT */
-   if (!(candidate->capabilities & BATADV_ORIG_CAPA_HAS_DAT))
+   if (!test_bit(BATADV_ORIG_CAPA_HAS_DAT, &candidate->capabilities))
goto out;
 
/* Check if this node has already been selected... */
@@ -713,9 +714,9 @@ static void batadv_dat_tvlv_ogm_handler_v1(struct 
batadv_priv *bat_priv,
   uint16_t tvlv_value_len)
 {
if (flags & BATADV_TVLV_HANDLER_OGM_CIFNOTFND)
-   orig->capabilities &= ~BATADV_ORIG_CAPA_HAS_DAT;
+   clear_bit(BATADV_ORIG_CAPA_HAS_DAT, &orig->capabilities);
else
-   orig->capabilities |= BATADV_ORIG_CAPA_HAS_DAT;
+   set_bit(BATADV_ORIG_CAPA_HAS_DAT, &orig->capabilities);
 }
 
 /**
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 67d6348..29fd625 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -273,7 +273,7 @@ struct batadv_orig_node {
struct hlist_node mcast_want_all_ipv4_node;
struct hlist_node mcast_want_all_ipv6_node;
 #endif
-   uint8_t capabilities;
+   unsigned long capabilities;
uint8_t capa_initialized;
atomic_t last_ttvn;
unsigned char *tt_buff;
@@ -313,7 +313,7 @@ struct batadv_orig_node {
  *  (= orig node announces a tvlv of type BATADV_TVLV_MCAST)
  */
 enum batadv_orig_capabilities {
-   BATADV_ORIG_CAPA_HAS_DAT = BIT(0),
+   BATADV_ORIG_CAPA_HAS_DAT,
BATADV_ORIG_CAPA_HAS_NC = BIT(1),
BATADV_ORIG_CAPA_HAS_TT = BIT(2),
BATADV_ORIG_CAPA_HAS_MCAST = BIT(3),
-- 
2.5.0
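
The enum change from BIT(0) to a plain enumerator matters because the bitops take a bit number rather than a mask. The sketch below illustrates the difference under stated assumptions: set_bit_nr() is a hypothetical, non-atomic stand-in for the kernel's set_bit(), and the MASK_*/NR_* names are invented for the example.

```c
#define BIT(nr) (1UL << (nr))

/* Old style: enumerators are masks (1, 2, 4, 8). */
enum capa_mask {
    MASK_DAT   = BIT(0),
    MASK_NC    = BIT(1),
    MASK_TT    = BIT(2),
    MASK_MCAST = BIT(3),
};

/* New style: enumerators are bit numbers (0, 1, 2, 3). */
enum capa_nr { NR_DAT, NR_NC, NR_TT, NR_MCAST };

/* Non-atomic model of set_bit(): 'nr' is a bit index, not a mask. */
static unsigned long set_bit_nr(unsigned long word, unsigned int nr)
{
    return word | BIT(nr);
}
```

Passing NR_MCAST sets bit 3 as intended. Passing MASK_MCAST (value 8) would set bit 8, which did not fit in the old uint8_t fields; that is presumably why each patch in the series converts the field type and the enumerator together with the call sites.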



Global protection fault in netfilter code

2015-08-14 Thread Trond Myklebust
Hi,

When doing NFS stress tests in a VM with a recent kernel (yesterday's
commit 7ddab73346a1 "Merge branch 'fixes' of
git://ftp.arm.linux.org.uk/~rmk/linux-arm"), I've been seeing the
following General Protection Fault, apparently in the nf_conntrack
code:


PID: 358TASK: 88003630cb80  CPU: 0   COMMAND: kworker/0:1H
 #0 [88013a603680] die at 81007608
 #1 [88013a6036b0] do_general_protection at 8100407a
 #2 [88013a6036e0] general_protection at 817888c8
[exception RIP: detach_if_pending+103]
RIP: 81101b37  RSP: 88013a603798  RFLAGS: 00010086
RAX: dead00200200  RBX: 8800b9771bd8  RCX: 000f
RDX: 88013a60e818  RSI: 88013a60d980  RDI: 0046
RBP: 88013a6037b8   R8:    R9: 0001
R10: 88013a60d998  R11: 0001  R12: 8800b9771bd8
R13: 88013a60d980  R14:   R15: 0001
ORIG_RAX:   CS: 0010  SS: 0018
 #3 [88013a6037c0] mod_timer_pending at 81101fd2
 #4 [88013a603820] __nf_ct_refresh_acct at a055891b [nf_conntrack]
 #5 [88013a603850] tcp_packet at a056232e [nf_conntrack]
 #6 [88013a603970] nf_conntrack_in at a055b70a [nf_conntrack]
 #7 [88013a603a40] ipv4_conntrack_in at a0576326 [nf_conntrack_ipv4]
 #8 [88013a603a50] nf_iterate at 81688dad
 #9 [88013a603aa0] nf_hook_slow at 81688e42
#10 [88013a603af0] ip_rcv at 81695ca3
#11 [88013a603b60] __netif_receive_skb_core at 8164d688
#12 [88013a603c00] __netif_receive_skb at 8164e108
#13 [88013a603c20] netif_receive_skb_internal at 8164f7f6
#14 [88013a603c60] napi_gro_complete at 8164fbf7
#15 [88013a603cb0] dev_gro_receive at 81650508
#16 [88013a603d20] napi_gro_receive at 81650a6b
#17 [88013a603d50] e1000_clean_rx_irq at a00308db [e1000]
#18 [88013a603e00] e1000_clean at a0030f3d [e1000]
#19 [88013a603ec0] net_rx_action at 8164ffda
#20 [88013a603f40] __do_softirq at 81087a18
#21 [88013a603fb0] do_softirq_own_stack at 8178875c
--- IRQ stack ---
#22 [880035fabaf0] do_softirq_own_stack at 8178875c
[exception RIP: unknown or invalid address]
RIP: 88007d7a6108  RSP: 88007d7a60d0  RFLAGS: a02827e5
RAX: 810868a9  RBX: 880035fabb38  RCX: 88007d7a6108
RDX: 880136e56a00  RSI: 880035fabb78  RDI: 81786209
RBP: 810d662d   R8: 880035fabb58   R9: fe00
R10: 0046  R11: 810867e5  R12: 0046
R13: 810ce565  R14: 880035fabb18  R15: 
ORIG_RAX: 8800a4408840  CS: 880035fabbc8  SS: 0001
WARNING: possibly bogus exception frame
#23 [880035fabbd0] nfs41_wake_and_assign_slot at a06c7a9d [nfsv4]
#24 [880035fabbe0] nfs41_sequence_done at a069c7c0 [nfsv4]
#25 [880035fabc30] nfs4_sequence_done at a069caaf [nfsv4]
#26 [880035fabc40] nfs4_read_done at a06a2c0e [nfsv4]
#27 [880035fabc60] nfs_readpage_done at a0654736 [nfs]
#28 [880035fabc90] nfs_pgio_result at a0653414 [nfs]
#29 [880035fabcc0] rpc_exit_task at a027f10c [sunrpc]
#30 [880035fabce0] __rpc_execute at a02820dd [sunrpc]
#31 [880035fabd60] rpc_async_schedule at a0282725 [sunrpc]
#32 [880035fabd70] process_one_work at 8109fe89
#33 [880035fabdf0] worker_thread at 810a04ae
#34 [880035fabe60] kthread at 810a6aef
#35 [880035fabf50] ret_from_fork at 81786f5f

I do not see that in vanilla Linux-4.1, so it seems to be a 4.2 cycle
thing.

Is anyone else seeing this, and is it being looked at by the netfilter
folks?

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com




Re: [BUG net-next] infamous dev refcnt leak... again.

2015-08-14 Thread David Ahern

On 8/14/15 5:14 PM, Eric Dumazet wrote:

On Fri, 2015-08-14 at 14:14 -0700, Eric Dumazet wrote:

While rebooting host running latest net-next

  unregister_netdevice: waiting for eth0 to become free. Usage count = 4

Oh well...



It looks like David Ahern's recent changes uncovered a bug?

Not clear which commit is at fault.

Maybe 3bfd847203c6d89532f836ad3f5b4ff4ced26dd9 ?

Somehow a down device can be found.


Can you elaborate on what you are doing to see the refcnt leak? I have 
not seen that at all. I have to leave for soccer carpool in 45 minutes 
or so, but can take a look this weekend.


David



diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index b7f1d20..675a3b6 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -725,10 +725,14 @@ static int fib_check_nh(struct fib_config *cfg, struct 
fib_info *fi,
nh->nh_dev = dev = FIB_RES_DEV(res);
if (!dev)
goto out;
-   dev_hold(dev);
if (!netif_carrier_ok(dev))
nh->nh_flags |= RTNH_F_LINKDOWN;
-   err = (dev->flags & IFF_UP) ? 0 : -ENETDOWN;
+   if (dev->flags & IFF_UP) {
+   err = 0;
+   dev_hold(dev);
+   } else {
+   err = -ENETDOWN;
+   }
} else {
struct in_device *in_dev;







[PATCH net-next v2 1/4] packet: add classic BPF fanout mode

2015-08-14 Thread Willem de Bruijn
From: Willem de Bruijn will...@google.com

Add fanout mode PACKET_FANOUT_CBPF that accepts a classic BPF program
to select a socket.

This avoids having to keep adding special case fanout modes. One
example use case is application layer load balancing. The QUIC
protocol, for instance, encodes a connection ID in UDP payload.

Also add socket option SOL_PACKET/PACKET_FANOUT_DATA that updates data
associated with the socket group. Fanout mode PACKET_FANOUT_CBPF is the
only user so far.

Signed-off-by: Willem de Bruijn will...@google.com
---
 include/uapi/linux/if_packet.h |  2 +
 net/packet/af_packet.c | 99 +-
 net/packet/internal.h  |  5 ++-
 3 files changed, 104 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
index d3d715f8c..a4bb16f 100644
--- a/include/uapi/linux/if_packet.h
+++ b/include/uapi/linux/if_packet.h
@@ -55,6 +55,7 @@ struct sockaddr_ll {
 #define PACKET_TX_HAS_OFF  19
 #define PACKET_QDISC_BYPASS    20
 #define PACKET_ROLLOVER_STATS  21
+#define PACKET_FANOUT_DATA 22
 
 #define PACKET_FANOUT_HASH 0
 #define PACKET_FANOUT_LB   1
@@ -62,6 +63,7 @@ struct sockaddr_ll {
 #define PACKET_FANOUT_ROLLOVER 3
 #define PACKET_FANOUT_RND  4
 #define PACKET_FANOUT_QM   5
+#define PACKET_FANOUT_CBPF 6
 #define PACKET_FANOUT_FLAG_ROLLOVER    0x1000
 #define PACKET_FANOUT_FLAG_DEFRAG  0x8000
 
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index b5afe53..8869d07 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -92,6 +92,7 @@
 #ifdef CONFIG_INET
 #include <net/inet_common.h>
 #endif
+#include <linux/bpf.h>
 
 #include "internal.h"
 
@@ -1410,6 +1411,22 @@ static unsigned int fanout_demux_qm(struct packet_fanout 
*f,
return skb_get_queue_mapping(skb) % num;
 }
 
+static unsigned int fanout_demux_bpf(struct packet_fanout *f,
+struct sk_buff *skb,
+unsigned int num)
+{
+   struct bpf_prog *prog;
+   unsigned int ret = 0;
+
+   rcu_read_lock();
+   prog = rcu_dereference(f->bpf_prog);
+   if (prog)
+   ret = BPF_PROG_RUN(prog, skb) % num;
+   rcu_read_unlock();
+
+   return ret;
+}
+
 static bool fanout_has_flag(struct packet_fanout *f, u16 flag)
 {
return f->flags & (flag >> 8);
@@ -1454,6 +1471,9 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct 
net_device *dev,
case PACKET_FANOUT_ROLLOVER:
idx = fanout_demux_rollover(f, skb, 0, false, num);
break;
+   case PACKET_FANOUT_CBPF:
+   idx = fanout_demux_bpf(f, skb, num);
+   break;
}
 
if (fanout_has_flag(f, PACKET_FANOUT_FLAG_ROLLOVER))
@@ -1502,6 +1522,74 @@ static bool match_fanout_group(struct packet_type 
*ptype, struct sock *sk)
return false;
 }
 
+static void fanout_init_data(struct packet_fanout *f)
+{
+   switch (f->type) {
+   case PACKET_FANOUT_LB:
+   atomic_set(&f->rr_cur, 0);
+   break;
+   case PACKET_FANOUT_CBPF:
+   RCU_INIT_POINTER(f->bpf_prog, NULL);
+   break;
+   }
+}
+
+static void __fanout_set_data_bpf(struct packet_fanout *f, struct bpf_prog 
*new)
+{
+   struct bpf_prog *old;
+
+   spin_lock(&f->lock);
+   old = rcu_dereference_protected(f->bpf_prog, lockdep_is_held(&f->lock));
+   rcu_assign_pointer(f->bpf_prog, new);
+   spin_unlock(&f->lock);
+
+   if (old) {
+   synchronize_net();
+   bpf_prog_destroy(old);
+   }
+}
+
+static int fanout_set_data_cbpf(struct packet_sock *po, char __user *data,
+   unsigned int len)
+{
+   struct bpf_prog *new;
+   struct sock_fprog fprog;
+   int ret;
+
+   if (sock_flag(&po->sk, SOCK_FILTER_LOCKED))
+   return -EPERM;
+   if (len != sizeof(fprog))
+   return -EINVAL;
+   if (copy_from_user(&fprog, data, len))
+   return -EFAULT;
+
+   ret = bpf_prog_create_from_user(&new, &fprog, NULL);
+   if (ret)
+   return ret;
+
+   __fanout_set_data_bpf(po->fanout, new);
+   return 0;
+}
+
+static int fanout_set_data(struct packet_sock *po, char __user *data,
+  unsigned int len)
+{
+   switch (po->fanout->type) {
+   case PACKET_FANOUT_CBPF:
+   return fanout_set_data_cbpf(po, data, len);
+   default:
+   return -EINVAL;
+   };
+}
+
+static void fanout_release_data(struct packet_fanout *f)
+{
+   switch (f->type) {
+   case PACKET_FANOUT_CBPF:
+   __fanout_set_data_bpf(f, NULL);
+   };
+}
+
 static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
 {
struct packet_sock *po = pkt_sk(sk);
@@ -1519,6 +1607,7 @@ static int 

[PATCH net-next v2 2/4] packet: add extended BPF fanout mode

2015-08-14 Thread Willem de Bruijn
From: Willem de Bruijn will...@google.com

Add fanout mode PACKET_FANOUT_EBPF that accepts an extended BPF
program to select a socket.

Update the internal eBPF program by passing to socket option
SOL_PACKET/PACKET_FANOUT_DATA a file descriptor returned by bpf().

Signed-off-by: Willem de Bruijn will...@google.com
---
 include/uapi/linux/if_packet.h |  1 +
 net/packet/af_packet.c | 31 +++
 2 files changed, 32 insertions(+)

diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
index a4bb16f..9e7edfd 100644
--- a/include/uapi/linux/if_packet.h
+++ b/include/uapi/linux/if_packet.h
@@ -64,6 +64,7 @@ struct sockaddr_ll {
 #define PACKET_FANOUT_RND  4
 #define PACKET_FANOUT_QM   5
 #define PACKET_FANOUT_CBPF 6
+#define PACKET_FANOUT_EBPF 7
 #define PACKET_FANOUT_FLAG_ROLLOVER    0x1000
 #define PACKET_FANOUT_FLAG_DEFRAG  0x8000
 
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 8869d07..7b8e39a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1472,6 +1472,7 @@ static int packet_rcv_fanout(struct sk_buff *skb, struct 
net_device *dev,
idx = fanout_demux_rollover(f, skb, 0, false, num);
break;
case PACKET_FANOUT_CBPF:
+   case PACKET_FANOUT_EBPF:
idx = fanout_demux_bpf(f, skb, num);
break;
}
@@ -1529,6 +1530,7 @@ static void fanout_init_data(struct packet_fanout *f)
atomic_set(&f->rr_cur, 0);
break;
case PACKET_FANOUT_CBPF:
+   case PACKET_FANOUT_EBPF:
RCU_INIT_POINTER(f->bpf_prog, NULL);
break;
}
@@ -1571,12 +1573,39 @@ static int fanout_set_data_cbpf(struct packet_sock *po, 
char __user *data,
return 0;
 }
 
+static int fanout_set_data_ebpf(struct packet_sock *po, char __user *data,
+   unsigned int len)
+{
+   struct bpf_prog *new;
+   u32 fd;
+
+   if (sock_flag(&po->sk, SOCK_FILTER_LOCKED))
+   return -EPERM;
+   if (len != sizeof(fd))
+   return -EINVAL;
+   if (copy_from_user(&fd, data, len))
+   return -EFAULT;
+
+   new = bpf_prog_get(fd);
+   if (IS_ERR(new))
+   return PTR_ERR(new);
+   if (new->type != BPF_PROG_TYPE_SOCKET_FILTER) {
+   bpf_prog_put(new);
+   return -EINVAL;
+   }
+
+   __fanout_set_data_bpf(po->fanout, new);
+   return 0;
+}
+
 static int fanout_set_data(struct packet_sock *po, char __user *data,
   unsigned int len)
 {
switch (po->fanout->type) {
case PACKET_FANOUT_CBPF:
return fanout_set_data_cbpf(po, data, len);
+   case PACKET_FANOUT_EBPF:
+   return fanout_set_data_ebpf(po, data, len);
default:
return -EINVAL;
};
@@ -1586,6 +1615,7 @@ static void fanout_release_data(struct packet_fanout *f)
 {
switch (f->type) {
case PACKET_FANOUT_CBPF:
+   case PACKET_FANOUT_EBPF:
__fanout_set_data_bpf(f, NULL);
};
 }
@@ -1608,6 +1638,7 @@ static int fanout_add(struct sock *sk, u16 id, u16 
type_flags)
case PACKET_FANOUT_RND:
case PACKET_FANOUT_QM:
case PACKET_FANOUT_CBPF:
+   case PACKET_FANOUT_EBPF:
break;
default:
return -EINVAL;
-- 
2.5.0.276.gf5e568e
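
From userspace, the classic BPF mode from patch 1/4 would be driven roughly as sketched below. This is an illustration under assumptions, not code from the series: the option values 22 and 6 are copied from the if_packet.h hunks above as fallbacks for older headers, while STEER_OFF, steer_prog, steer_fprog() and fanout_join_cbpf() are hypothetical names. The two-instruction program returns one packet byte; per fanout_demux_bpf() the kernel then takes that value modulo the number of sockets in the group.

```c
#include <linux/filter.h>     /* struct sock_filter, struct sock_fprog, BPF_* */
#include <linux/if_packet.h>  /* PACKET_FANOUT */
#include <sys/socket.h>       /* setsockopt(), SOL_PACKET */

#ifndef PACKET_FANOUT_DATA
#define PACKET_FANOUT_DATA 22  /* value from the patch above */
#endif
#ifndef PACKET_FANOUT_CBPF
#define PACKET_FANOUT_CBPF 6   /* value from the patch above */
#endif

#define STEER_OFF 80           /* hypothetical offset of the steering byte */

static struct sock_filter steer_prog[] = {
    BPF_STMT(BPF_LD | BPF_B | BPF_ABS, STEER_OFF), /* A = pkt[STEER_OFF] */
    BPF_STMT(BPF_RET | BPF_A, 0),                  /* return A */
};

/* Wrap the instruction array in the struct the socket option expects. */
static struct sock_fprog steer_fprog(void)
{
    struct sock_fprog fprog = {
        .len = sizeof(steer_prog) / sizeof(steer_prog[0]),
        .filter = steer_prog,
    };
    return fprog;
}

/* Join fanout group 'group_id' in CBPF mode, then install the program.
 * Returns 0 on success, -1 with errno set on failure. */
static int fanout_join_cbpf(int fd, unsigned short group_id)
{
    int arg = group_id | (PACKET_FANOUT_CBPF << 16);
    struct sock_fprog fprog = steer_fprog();

    if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg)))
        return -1;
    return setsockopt(fd, SOL_PACKET, PACKET_FANOUT_DATA,
                      &fprog, sizeof(fprog));
}
```

For the extended BPF mode in this patch, the PACKET_FANOUT_DATA payload would instead be a 32-bit file descriptor returned by bpf(), as fanout_set_data_ebpf() above expects.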



[PATCH net-next v2 3/4] selftests/net: test classic bpf fanout mode

2015-08-14 Thread Willem de Bruijn
From: Willem de Bruijn will...@google.com

Test PACKET_FANOUT_CBPF by inserting a cBPF program that selects a
socket by payload. Requires modifying the test program to send
packets with multiple payloads.

Also fix a bug in testing the return value of mmap().

Signed-off-by: Willem de Bruijn will...@google.com
---
 tools/testing/selftests/net/psock_fanout.c | 16 
 tools/testing/selftests/net/psock_lib.h| 29 +
 2 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/net/psock_fanout.c 
b/tools/testing/selftests/net/psock_fanout.c
index 08c2a36..baf46a2 100644
--- a/tools/testing/selftests/net/psock_fanout.c
+++ b/tools/testing/selftests/net/psock_fanout.c
@@ -19,6 +19,7 @@
  *   - PACKET_FANOUT_LB
  *   - PACKET_FANOUT_CPU
  *   - PACKET_FANOUT_ROLLOVER
+ *   - PACKET_FANOUT_CBPF
  *
  * Todo:
  * - functionality: PACKET_FANOUT_FLAG_DEFRAG
@@ -115,8 +116,8 @@ static char *sock_fanout_open_ring(int fd)
 
ring = mmap(0, req.tp_block_size * req.tp_block_nr,
PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
-   if (!ring) {
-   fprintf(stderr, "packetsock ring mmap\n");
+   if (ring == MAP_FAILED) {
+   perror("packetsock ring mmap");
exit(1);
}
 
@@ -209,6 +210,7 @@ static int test_datapath(uint16_t typeflags, int port_off,
 {
const int expect0[] = { 0, 0 };
char *rings[2];
+   uint8_t type = typeflags & 0xFF;
int fds[2], fds_udp[2][2], ret;
 
fprintf(stderr, "test: datapath 0x%hx\n", typeflags);
@@ -219,6 +221,9 @@ static int test_datapath(uint16_t typeflags, int port_off,
fprintf(stderr, "ERROR: failed open\n");
exit(1);
}
+   if (type == PACKET_FANOUT_CBPF)
+   sock_setfilter(fds[0], SOL_PACKET, PACKET_FANOUT_DATA);
+
rings[0] = sock_fanout_open_ring(fds[0]);
rings[1] = sock_fanout_open_ring(fds[1]);
pair_udp_open(fds_udp[0], PORT_BASE);
@@ -227,11 +232,11 @@ static int test_datapath(uint16_t typeflags, int port_off,
 
/* Send data, but not enough to overflow a queue */
pair_udp_send(fds_udp[0], 15);
-   pair_udp_send(fds_udp[1], 5);
+   pair_udp_send_char(fds_udp[1], 5, DATA_CHAR_1);
ret = sock_fanout_read(fds, rings, expect1);
 
/* Send more data, overflow the queue */
-   pair_udp_send(fds_udp[0], 15);
+   pair_udp_send_char(fds_udp[0], 15, DATA_CHAR_1);
/* TODO: ensure consistent order between expect1 and expect2 */
ret |= sock_fanout_read(fds, rings, expect2);
 
@@ -275,6 +280,7 @@ int main(int argc, char **argv)
const int expect_rb[2][2]   = { { 15, 5 },  { 20, 15 } };
const int expect_cpu0[2][2] = { { 20, 0 },  { 20, 0 } };
const int expect_cpu1[2][2] = { { 0, 20 },  { 0, 20 } };
+   const int expect_bpf[2][2]  = { { 15, 5 },  { 15, 20 } };
int port_off = 2, tries = 5, ret;
 
test_control_single();
@@ -295,6 +301,8 @@ int main(int argc, char **argv)
 port_off, expect_lb[0], expect_lb[1]);
ret |= test_datapath(PACKET_FANOUT_ROLLOVER,
 port_off, expect_rb[0], expect_rb[1]);
+   ret |= test_datapath(PACKET_FANOUT_CBPF,
+port_off, expect_bpf[0], expect_bpf[1]);
 
set_cpuaffinity(0);
ret |= test_datapath(PACKET_FANOUT_CPU, port_off,
diff --git a/tools/testing/selftests/net/psock_lib.h b/tools/testing/selftests/net/psock_lib.h
index 37da54a..24bc7ec 100644
--- a/tools/testing/selftests/net/psock_lib.h
+++ b/tools/testing/selftests/net/psock_lib.h
@@ -30,6 +30,7 @@
 
 #define DATA_LEN   100
 #define DATA_CHAR  'a'
+#define DATA_CHAR_1   'b'
 
 #define PORT_BASE  8000
 
@@ -37,29 +38,36 @@
 # define __maybe_unused  __attribute__ ((__unused__))
 #endif
 
-static __maybe_unused void pair_udp_setfilter(int fd)
+static __maybe_unused void sock_setfilter(int fd, int lvl, int optnum)
 {
struct sock_filter bpf_filter[] = {
{ 0x80, 0, 0, 0x0000 },  /* LD  pktlen*/
-   { 0x35, 0, 5, DATA_LEN   },  /* JGE DATA_LEN  [f goto nomatch]*/
+   { 0x35, 0, 4, DATA_LEN   },  /* JGE DATA_LEN  [f goto nomatch]*/
{ 0x30, 0, 0, 0x0050 },  /* LD  ip[80]*/
-   { 0x15, 0, 3, DATA_CHAR  },  /* JEQ DATA_CHAR [f goto nomatch]*/
-   { 0x30, 0, 0, 0x0051 },  /* LD  ip[81]*/
-   { 0x15, 0, 1, DATA_CHAR  },  /* JEQ DATA_CHAR [f goto nomatch]*/
+   { 0x15, 1, 0, DATA_CHAR  },  /* JEQ DATA_CHAR   [t goto match]*/
+   { 0x15, 0, 1, DATA_CHAR_1},  /* JEQ DATA_CHAR_1 [t goto match]*/
{ 0x06, 0, 0, 0x0060 },  /* RET match */

[PATCH net-next v2 4/4] selftests/net: test extended BPF fanout mode

2015-08-14 Thread Willem de Bruijn
From: Willem de Bruijn will...@google.com

Test PACKET_FANOUT_EBPF by inserting a program into the kernel
with bpf(), then attaching it to the fanout group. Observe the same
payload-based distribution as in the PACKET_FANOUT_CBPF test.

Signed-off-by: Willem de Bruijn will...@google.com
---
 tools/testing/selftests/net/psock_fanout.c | 53 ++
 1 file changed, 53 insertions(+)

diff --git a/tools/testing/selftests/net/psock_fanout.c b/tools/testing/selftests/net/psock_fanout.c
index baf46a2..4124593 100644
--- a/tools/testing/selftests/net/psock_fanout.c
+++ b/tools/testing/selftests/net/psock_fanout.c
@@ -20,6 +20,7 @@
  *   - PACKET_FANOUT_CPU
  *   - PACKET_FANOUT_ROLLOVER
  *   - PACKET_FANOUT_CBPF
+ *   - PACKET_FANOUT_EBPF
  *
  * Todo:
  * - functionality: PACKET_FANOUT_FLAG_DEFRAG
@@ -45,7 +46,9 @@
 #include <arpa/inet.h>
 #include <errno.h>
 #include <fcntl.h>
+#include <linux/unistd.h>  /* for __NR_bpf */
 #include <linux/filter.h>
+#include <linux/bpf.h>
 #include <linux/if_packet.h>
 #include <net/ethernet.h>
 #include <netinet/ip.h>
@@ -92,6 +95,51 @@ static int sock_fanout_open(uint16_t typeflags, int num_packets)
return fd;
 }
 
+static void sock_fanout_set_ebpf(int fd)
+{
+   const int len_off = __builtin_offsetof(struct __sk_buff, len);
+   struct bpf_insn prog[] = {
+   { BPF_ALU64 | BPF_MOV | BPF_X,   6, 1, 0, 0 },
+   { BPF_LDX   | BPF_W   | BPF_MEM, 0, 6, len_off, 0 },
+   { BPF_JMP   | BPF_JGE | BPF_K,   0, 0, 1, DATA_LEN },
+   { BPF_JMP   | BPF_JA  | BPF_K,   0, 0, 4, 0 },
+   { BPF_LD| BPF_B   | BPF_ABS, 0, 0, 0, 0x50 },
+   { BPF_JMP   | BPF_JEQ | BPF_K,   0, 0, 2, DATA_CHAR },
+   { BPF_JMP   | BPF_JEQ | BPF_K,   0, 0, 1, DATA_CHAR_1 },
+   { BPF_ALU   | BPF_MOV | BPF_K,   0, 0, 0, 0 },
+   { BPF_JMP   | BPF_EXIT,  0, 0, 0, 0 }
+   };
+   char log_buf[512];
+   union bpf_attr attr;
+   int pfd;
+
+   memset(&attr, 0, sizeof(attr));
+   attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+   attr.insns = (unsigned long) prog;
+   attr.insn_cnt = sizeof(prog) / sizeof(prog[0]);
+   attr.license = (unsigned long) "GPL";
+   attr.log_buf = (unsigned long) log_buf,
+   attr.log_size = sizeof(log_buf),
+   attr.log_level = 1,
+
+   pfd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
+   if (pfd < 0) {
+   perror("bpf");
+   fprintf(stderr, "bpf verifier:\n%s\n", log_buf);
+   exit(1);
+   }
+
+   if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT_DATA, &pfd, sizeof(pfd))) {
+   perror("fanout data ebpf");
+   exit(1);
+   }
+
+   if (close(pfd)) {
+   perror("close ebpf");
+   exit(1);
+   }
+}
+
 static char *sock_fanout_open_ring(int fd)
 {
struct tpacket_req req = {
@@ -223,6 +271,8 @@ static int test_datapath(uint16_t typeflags, int port_off,
}
if (type == PACKET_FANOUT_CBPF)
sock_setfilter(fds[0], SOL_PACKET, PACKET_FANOUT_DATA);
+   else if (type == PACKET_FANOUT_EBPF)
+   sock_fanout_set_ebpf(fds[0]);
 
rings[0] = sock_fanout_open_ring(fds[0]);
rings[1] = sock_fanout_open_ring(fds[1]);
@@ -301,8 +351,11 @@ int main(int argc, char **argv)
 port_off, expect_lb[0], expect_lb[1]);
ret |= test_datapath(PACKET_FANOUT_ROLLOVER,
 port_off, expect_rb[0], expect_rb[1]);
+
ret |= test_datapath(PACKET_FANOUT_CBPF,
 port_off, expect_bpf[0], expect_bpf[1]);
+   ret |= test_datapath(PACKET_FANOUT_EBPF,
+port_off, expect_bpf[0], expect_bpf[1]);
 
set_cpuaffinity(0);
ret |= test_datapath(PACKET_FANOUT_CPU, port_off,
-- 
2.5.0.276.gf5e568e
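
For anyone decoding the instruction array in sock_fanout_set_ebpf() by
hand: below is a plain-C reading of the nine instructions, as I follow
the jump offsets. It is only a user-space sketch, not kernel code. The
function names are made up, pkt/len stand in for the packet data and
__sk_buff->len, and the final "value modulo number of sockets" step is an
assumption about how the fanout group uses the return value (that logic
lives in the kernel patches, not in this one).

```c
#include <stddef.h>
#include <stdint.h>

#define DATA_LEN	100
#define DATA_CHAR	'a'
#define DATA_CHAR_1	'b'

/* Hypothetical mirror of the eBPF program; returns what r0 holds at EXIT. */
uint32_t ebpf_fanout_value(const uint8_t *pkt, uint32_t len)
{
	uint32_t r0 = len;		/* LDX: r0 = skb->len */

	if (r0 < DATA_LEN)		/* JGE not taken, JA +4 lands on EXIT */
		return r0;		/* r0 still holds the length */

	r0 = pkt[0x50];			/* LD_ABS: one byte at offset 0x50 */
	if (r0 == DATA_CHAR || r0 == DATA_CHAR_1)
		return r0;		/* either JEQ jumps straight to EXIT */

	return 0;			/* MOV r0 = 0, then EXIT */
}

/* Assumption: the group picks socket index (value % number of sockets). */
unsigned int fanout_index(uint32_t value, unsigned int num_socks)
{
	return value % num_socks;
}
```

Under that assumption, payloads of 'a' (97) and 'b' (98) end up on
different sockets of a two-socket group, which is the payload-based
distribution the test is looking for.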



[PATCH net-next v2 0/4] packet: add cBPF and eBPF fanout modes

2015-08-14 Thread Willem de Bruijn
From: Willem de Bruijn will...@google.com

Allow programmable fanout modes. Support both classical BPF programs
passed directly and extended BPF programs passed by file descriptor.

One use case is packet steering by deep packet inspection, for
instance for packet steering by application layer header fields.

Separate the configuration of the fanout mode and the configuration
of the program, to allow dynamic updates to the latter at runtime.

Changes
  v1 -> v2:
- follow SO_LOCK_FILTER semantics on filter updates
- only accept eBPF programs of type BPF_PROG_TYPE_SOCKET_FILTER
- rename PACKET_FANOUT_BPF to PACKET_FANOUT_CBPF to match
  man 2 bpf usage: classic vs. extended BPF.

Willem de Bruijn (4):
  packet: add classic BPF fanout mode
  packet: add extended BPF fanout mode
  selftests/net: test classic bpf fanout mode
  selftests/net: test extended BPF fanout mode

 include/uapi/linux/if_packet.h |   3 +
 net/packet/af_packet.c | 130 -
 net/packet/internal.h  |   5 +-
 tools/testing/selftests/net/psock_fanout.c |  69 ++-
 tools/testing/selftests/net/psock_lib.h|  29 +--
 5 files changed, 222 insertions(+), 14 deletions(-)

-- 
2.5.0.276.gf5e568e
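
A rough user-space sketch of the two-step configuration described in this
cover letter, in case it helps reviewers: first join the fanout group in
PACKET_FANOUT_CBPF mode, then hand over the classic BPF program through
PACKET_FANOUT_DATA. The helper names fanout_arg() and fanout_join_cbpf()
are invented for this example; the option names are the ones used in the
selftest, and the group-id/mode packing follows packet(7). It assumes
uapi headers that already carry PACKET_FANOUT_CBPF and PACKET_FANOUT_DATA.

```c
#include <stdint.h>
#include <sys/socket.h>
#include <linux/filter.h>
#include <linux/if_packet.h>

/* PACKET_FANOUT value: group id in the low 16 bits, mode in the upper 16. */
int fanout_arg(uint16_t group_id, uint16_t mode)
{
	return (int)(group_id | ((uint32_t)mode << 16));
}

/*
 * Hypothetical helper: join fd to a cBPF fanout group, then attach prog.
 * The two setsockopt() calls are separate on purpose, so the program can
 * be replaced later without touching the mode.
 * Returns 0 on success, -1 with errno set on failure.
 */
int fanout_join_cbpf(int fd, uint16_t group_id, struct sock_fprog *prog)
{
	int arg = fanout_arg(group_id, PACKET_FANOUT_CBPF);

	if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg)))
		return -1;
	if (setsockopt(fd, SOL_PACKET, PACKET_FANOUT_DATA, prog, sizeof(*prog)))
		return -1;
	return 0;
}
```

Calling fanout_join_cbpf() on a real packet socket needs CAP_NET_RAW; a
later program update would repeat only the PACKET_FANOUT_DATA call.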



Re: [PATCH net-next 04/11] udp: Handle VRF device in sendmsg

2015-08-14 Thread David Ahern

On 8/14/15 9:16 PM, Tom Herbert wrote:

At least collect this code into one (static inline) function to better
minimize the code churn in udp. If this is general functionality that
can be used by other drivers then abstract it out as such. Also, if
the VRF driver is not configured it seems like this code should be
compiled out. As it stands now "if (netif_index_is_vrf(net, ipc.oif))
{" adds a conditional to every call of udp_sendmsg whether or not we
are using VRF :-(.


Sure. I wanted to make sure all of the VRF related changes compiled out 
when the VRF driver is not enabled. This one slipped by me. I'll send a 
patch next week along with a couple of others per Eric D's comments.


David


Re: [PATCH net-next 2/4] packet: add eBPF fanout mode

2015-08-14 Thread Willem de Bruijn
On Fri, Aug 14, 2015 at 3:46 PM, Daniel Borkmann dan...@iogearbox.net wrote:
 On 08/14/2015 09:27 PM, Willem de Bruijn wrote:
 ...

 Btw, in case someone sets sock_flag(sk, SOCK_FILTER_LOCKED),
 perhaps we should also apply it on fanout?


 Good point. With classic bpf, packet access control is fully
 enforced in per-socket filters, but playing with load balancing
 filters could allow an adversary to infer some information
 about the dropped packets*. With eBPF and maps, access
 is even more direct. Let's support locking of fanout filters in
 place.


 Right, a process could share a map between the fanout lb filter
 and actual sk filter, i.e. to look up how much actually passed
 through on the later sk level filter, and use that information
 in addition for its lb decisions.

 I intend to test the existing socket flag. No need to add a
 separate flag for the fanout group, as far as I can see.


 Agreed, should be okay.

Great. Thanks for the suggestion, Daniel! I'll send a v2 with the
three suggested changes in a minute.


 Thanks Willem!

 (*) I noticed that a similar unintended effect also causes the
 PACKET_FANOUT_LB selftest to be flaky: filters on the
 sockets ensure that the test only reads expected packets.
 But, all traffic makes it through packet_rcv_fanout. Packets
 that are later dropped by sk_filter have already incremented
 rr_cur. Worst case, with 2 sockets and each accepted packet
 interleaved with a dropped packet, all packets are queued on
 only one socket. Test flakiness is fixed, e.g., by running in a
 private network namespace. The implementation behavior
 may be unexpected in other, production, environments.
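
The worst case in that footnote can be reproduced with a toy model. The
sketch below is user-space C with made-up names, not the af_packet code:
it only assumes what the footnote says, namely that the round-robin
counter advances for every packet that reaches the fanout group, and that
the per-socket filter drops packets afterwards.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy round-robin fanout: accepted[i] says whether packet i passes the
 * per-socket filter. queued[] receives the accepted count per socket.
 */
void rr_fanout_sim(const bool *accepted, size_t n_pkts,
		   unsigned int *queued, unsigned int n_socks)
{
	unsigned int rr_cur = 0;
	size_t i;

	for (i = 0; i < n_socks; i++)
		queued[i] = 0;

	for (i = 0; i < n_pkts; i++) {
		/* socket is chosen before any filtering happens */
		unsigned int idx = rr_cur++ % n_socks;

		/* the filter runs later and may still drop the packet */
		if (accepted[i])
			queued[idx]++;
	}
}
```

With two sockets and an input that alternates accepted and dropped
packets, every accepted packet lands on socket 0 and socket 1 stays empty.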


Re: [PATCHv1 net-next 0/5] netlink: mmap: kernel panic and some issues

2015-08-14 Thread Ken-ichirou MATSUZAWA
 Hi,

Thank you for taking the time to understand this, even though
one of the samples is wrong. The correct one is:

rx only mmaped nflog sample:

https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/365c8a106840368f313a3791958da9be0f5fbed0/rxring-nflog.c

 Currently, what happens is that the shared info accesses whatever
 memory is there in the mmaped region. So when you already do an
 skb_clone() you should already get into trouble right there f.e. when
 we test for orphaning frags etc (if at the right offset in the mmap
 buffer, the tx_flags member would contain a SKBTX_DEV_ZEROCOPY bit).

I'm also not sure whether an skb that has no shared info can be released
by kfree_skb when the next frame is valid, i.e. when the current
skb->end (where shared info would be) points to the next frame's
nm_status, say NL_MMAP_STATUS_SKIP, which is then treated as shared info.

 Ken-ichirou, have you observed this issue only in relation to nlmon?

Yes,

 if taps are indeed the only ones affected, it might probably not be
 worth adding that much complexity for a fix itself, but to keep it simple
 instead. I don't know if there are any real users of netlink mmap, but

You mean that mmaped skbs cannot be monitored by nlmon for now?
I'll follow your suggestion; it's tough for me to fix this issue.

 It seems you have some other, separate fixes in your series, so you might
 want to submit them separately against the net tree, instead?

I'll follow you too.
Thank you, I appreciate it.

  include/linux/netlink.h  |  4 
  net/netlink/af_netlink.c | 12 +++-
  2 files changed, 11 insertions(+), 5 deletions(-)
 
 diff --git a/include/linux/netlink.h b/include/linux/netlink.h
 index 9120edb..42cdcd8 100644
 --- a/include/linux/netlink.h
 +++ b/include/linux/netlink.h
 @@ -35,6 +35,10 @@ struct netlink_skb_parms {
  #define NETLINK_CB(skb)  (*(struct netlink_skb_parms*)((skb)->cb))
  #define NETLINK_CREDS(skb)   (NETLINK_CB((skb)).creds)
 
 +static inline bool netlink_skb_is_mmaped(const struct sk_buff *skb)
 +{
 + return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED;
 +}
 
  extern void netlink_table_grab(void);
  extern void netlink_table_ungrab(void);
 diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
 index 67d2104..4307446 100644
 --- a/net/netlink/af_netlink.c
 +++ b/net/netlink/af_netlink.c
 @@ -238,6 +238,13 @@ static void __netlink_deliver_tap(struct sk_buff *skb)
 
  static void netlink_deliver_tap(struct sk_buff *skb)
  {
 + /* Netlink mmaped skbs must not access shared info, and thus
 +  * are not allowed to be cloned. For now, just don't allow
 +  * them to get inspected by taps.
 +  */
 + if (netlink_skb_is_mmaped(skb))
 + return;
 +
   rcu_read_lock();
 
   if (unlikely(!list_empty(&netlink_tap_all)))
 @@ -278,11 +285,6 @@ static void netlink_rcv_wake(struct sock *sk)
  }
 
  #ifdef CONFIG_NETLINK_MMAP
 -static bool netlink_skb_is_mmaped(const struct sk_buff *skb)
 -{
 - return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED;
 -}
 -
  static bool netlink_rx_is_mmaped(struct sock *sk)
  {
   return nlk_sk(sk)->rx_ring.pg_vec != NULL;
 -- 
 1.9.3


Re: [PATCH net-next 04/11] udp: Handle VRF device in sendmsg

2015-08-14 Thread Tom Herbert
On Fri, Aug 14, 2015 at 10:58 AM, Shrijeet Mukherjee
s...@cumulusnetworks.com wrote:


 On Fri, Aug 14, 2015 at 9:27 AM, Tom Herbert t...@herbertland.com wrote:

 On Thu, Aug 13, 2015 at 1:59 PM, David Ahern d...@cumulusnetworks.com
 wrote:
  For unconnected UDP sockets using a VRF device lookup source address
  based on VRF table. This allows the UDP header to be properly setup
  before showing up at the VRF device via the dst.
 
  Signed-off-by: Shrijeet Mukherjee s...@cumulusnetworks.com
  Signed-off-by: David Ahern d...@cumulusnetworks.com
  ---
   net/ipv4/udp.c | 22 +-
   1 file changed, 21 insertions(+), 1 deletion(-)
 
  diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
  index 83aa604f9273..7af5052e3b1f 100644
  --- a/net/ipv4/udp.c
  +++ b/net/ipv4/udp.c
  @@ -1013,11 +1013,31 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
  if (!rt) {
  struct net *net = sock_net(sk);
  +   __u8 flow_flags = inet_sk_flowi_flags(sk);
 
  fl4 = &fl4_stack;
  +
  +   /* unconnected socket. If output device is enslaved to a VRF
  +* device lookup source address from VRF table. This mimics
  +* behavior of ip_route_connect{_init}.
  +*/
  +   if (netif_index_is_vrf(net, ipc.oif)) {
  +   flowi4_init_output(fl4, ipc.oif, sk->sk_mark, tos,
  +  RT_SCOPE_UNIVERSE, sk->sk_protocol,
  +  (flow_flags | FLOWI_FLAG_VRFSRC),
  +  faddr, saddr, dport,
  +  inet->inet_sport);
  +
  +   rt = ip_route_output_flow(net, fl4, sk);
  +   if (!IS_ERR(rt)) {
  +   saddr = fl4->saddr;
  +   ip_rt_put(rt);
  +   }
  +   }
  +

 I really don't like this. It seems like you're putting device specific
 code in a critical L4 data path function. Also, does ipv6/udp.c need
 be updated similarly? Why can't VRF be abstracted out in routing
 lookups?


 Tom,

 Did not have a better way to make this work. The point of the VRF driver was
 to be completely transparent for anything other than routing lookups. Modifying
 the header in the driver means that fragmentation etc. will have trouble.

 So this code really just makes the saddr evaluation before we enter the udp
 code path and is similar to what the tcp side does. If you have a suggestion
 on a different and hopefully consistent way to do this for tcp and ipv6, that
 would be preferable.

At least collect this code into one (static inline) function to better
minimize the code churn in udp. If this is general functionality that
can be used by other drivers then abstract it out as such. Also, if
the VRF driver is not configured it seems like this code should be
compiled out. As it stands now "if (netif_index_is_vrf(net, ipc.oif))
{" adds a conditional to every call of udp_sendmsg whether or not we
are using VRF :-(.

Tom
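
To make the request above concrete, here is the usual shape of such a
helper, as a sketch only. Every name in it is hypothetical:
CONFIG_DEMO_VRF stands in for the driver's Kconfig symbol,
demo_index_is_vrf() for the real lookup, and demo_pick_saddr() for the
code that would move out of udp_sendmsg(). Addresses are plain ints to
keep the example self-contained.

```c
#include <stdbool.h>

#ifdef CONFIG_DEMO_VRF
/* Real lookup, provided by the driver when it is built. */
bool demo_index_is_vrf(int ifindex);
#else
/* Stub: a constant result lets the compiler drop the caller's branch. */
static inline bool demo_index_is_vrf(int ifindex)
{
	(void)ifindex;
	return false;
}
#endif

/* One helper keeps the VRF special case out of the sendmsg body. */
static inline int demo_pick_saddr(int oif, int saddr, int vrf_saddr)
{
	if (demo_index_is_vrf(oif))
		return vrf_saddr;
	return saddr;
}
```

With the symbol undefined, demo_pick_saddr() reduces to returning saddr,
so the fast path carries no extra conditional.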


RE: [PATCH net] igb: Fix oops caused by missing queue pairing

2015-08-14 Thread Brown, Aaron F
 From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
 On Behalf Of Shota Suzuki
 Sent: Tuesday, June 30, 2015 5:26 PM
 To: Kirsher, Jeffrey T; Brandeburg, Jesse; Nelson, Shannon; Wyborny,
 Carolyn; Skidmore, Donald C; Vick, Matthew; Ronciak, John; Williams, Mitch
 A; intel-wired-...@lists.osuosl.org; netdev@vger.kernel.org; linux-
 ker...@vger.kernel.org
 Cc: Shota Suzuki
 Subject: [PATCH net] igb: Fix oops caused by missing queue pairing
 
 When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
 set if adapter->rss_queues exceeds half of max_rss_queues in
 igb_init_queue_configuration().
 On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
 queues exceeds half of max_combined in igb_set_channels() when changing
 the number of queues by ethtool -L.
 In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
 of adapter->msix_entries[], an overflow can occur in
 igb_set_interrupt_capability(), which in turn leads to an oops.
 
 Fix this problem as follows:
  - When changing the number of queues by ethtool -L, set
IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
  - When increasing the size of q_vector, reallocate it appropriately.
(With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)
 
 Another possible way to fix this problem is to cap the queues at its
 initial number, which is the number of the initial online cpus. But this
 is not the optimal way because we cannot increase queues when another
 cpu becomes online.
 
 Note that before commit cd14ef54d25b ("igb: Change to use statically
 allocated array for MSIx entries"), this problem did not cause oops
 but just made the number of queues become 1 because of entering msi_only
 mode in igb_set_interrupt_capability().
 
 Fixes: 907b7835799f ("igb: Add ethtool support to configure number of
 channels")
 Signed-off-by: Shota Suzuki suzuki_shota...@lab.ntt.co.jp
 ---
 Although we might be able to additionally unset IGB_FLAG_QUEUE_PAIRS
 when it is not needed, this patch doesn't change existing behaviour
 because such a change is not a bug fix.
 
  drivers/net/ethernet/intel/igb/igb.h |  1 +
  drivers/net/ethernet/intel/igb/igb_ethtool.c |  5 -
  drivers/net/ethernet/intel/igb/igb_main.c| 16 ++--
  3 files changed, 19 insertions(+), 3 deletions(-)

Tested-by: Aaron Brown aaron.f.br...@intel.com
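
For reference, the arithmetic behind the overflow in the quoted
description can be modelled in a few lines. This is a stand-alone sketch,
not driver code: the function names are made up, and the vector count
(one per Rx queue, one more per Tx queue unless Rx/Tx are paired, plus
one for link status) is my assumption about how numvecs is built. Only
MAX_MSIX_ENTRIES = 10 and the "more than half" pairing rule come from the
description itself.

```c
#include <stdbool.h>

#define MAX_MSIX_ENTRIES 10	/* size of msix_entries[] per the description */

/* Pairing rule from the description: more than half of the maximum. */
bool needs_queue_pairs(unsigned int queues, unsigned int max_queues)
{
	return queues > max_queues / 2;
}

/* Assumed vector count for a given queue count and pairing state. */
unsigned int msix_vectors(unsigned int queues, bool paired)
{
	unsigned int numvecs = queues;	/* one vector per Rx queue */

	if (!paired)
		numvecs += queues;	/* separate vector per Tx queue */
	return numvecs + 1;		/* plus one for link status */
}
```

Under these assumptions, eight unpaired queues need 17 vectors, which
does not fit in ten entries, while the same eight queues paired need nine.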

