[PATCH 1/1] l2tp: Fix the connect status check in pppol2tp_getname

2016-08-18 Thread fgao
From: Gao Feng 

sk->sk_state is a set of bit flags, so the connect status must be checked
with a bitwise test instead of a value comparison.

Signed-off-by: Gao Feng 
---
 net/l2tp/l2tp_ppp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index d9560aa..232cb92 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -856,7 +856,7 @@ static int pppol2tp_getname(struct socket *sock, struct 
sockaddr *uaddr,
error = -ENOTCONN;
if (sk == NULL)
goto end;
-   if (sk->sk_state != PPPOX_CONNECTED)
+   if (!(sk->sk_state & PPPOX_CONNECTED))
goto end;
 
error = -EBADF;
-- 
1.9.1
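
To illustrate the difference the patch above relies on, here is a minimal
standalone sketch. The flag values and helper names are assumptions made up
for illustration; they are not copied from the kernel headers.

```c
#include <stdbool.h>

/* Hypothetical state flags, one bit each (illustrative values only). */
enum {
	EX_CONNECTED = 1 << 0,
	EX_BOUND     = 1 << 1,
};

/* Value comparison: true only when CONNECTED is the sole bit set. */
bool ex_connected_by_value(unsigned int state)
{
	return state == EX_CONNECTED;
}

/* Bit test: true whenever the CONNECTED bit is set, whatever else is. */
bool ex_connected_by_bit(unsigned int state)
{
	return (state & EX_CONNECTED) != 0;
}
```

With a state of EX_CONNECTED | EX_BOUND the value comparison reports "not
connected" while the bit test reports "connected".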




[PATCH net-next 0/2] qed*: Fix ethtool issues relating to link

2016-08-18 Thread Yuval Mintz
This series addresses two issues that were introduced when adding
support for ethtool's link_ksettings - the first fixes a
regression and the second fixes incorrect functionality in the submission.

Although these are fixes, as the feature currently exists only in
'net-next' I'm aiming them for it.

Dave, please consider applying this series to 'net-next'.

Cheers,
Yuval

Yuval Mintz (2):
  qed*: Fix pause setting
  qed: Fix forcing high speeds

 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 6 +++---
 include/linux/qed/qed_if.h  | 8 +---
 2 files changed, 8 insertions(+), 6 deletions(-)

-- 
1.9.3



[PATCH net-next 1/2] qed*: Fix pause setting

2016-08-18 Thread Yuval Mintz
When moving to ethtool's link_ksettings, qed started
supplying its own bitmask of speed/capabilities, but qede
is still checking for the SUPPORTED value to determine whether
it supports pause.

Fixes: 054c67d1c82a ("qed*: Add support for ethtool link_ksettings callbacks")
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 2 +-
 include/linux/qed/qed_if.h  | 8 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c 
b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index f0ab2fd..ef521e5 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -672,7 +672,7 @@ static int qede_set_pauseparam(struct net_device *dev,
memset(&params, 0, sizeof(params));
params.override_flags |= QED_LINK_OVERRIDE_PAUSE_CONFIG;
if (epause->autoneg) {
-   if (!(current_link.supported_caps & SUPPORTED_Autoneg)) {
+   if (!(current_link.supported_caps & QED_LM_Autoneg_BIT)) {
DP_INFO(edev, "autoneg not supported\n");
return -EINVAL;
}
diff --git a/include/linux/qed/qed_if.h b/include/linux/qed/qed_if.h
index 864265f..433e33a 100644
--- a/include/linux/qed/qed_if.h
+++ b/include/linux/qed/qed_if.h
@@ -314,9 +314,11 @@ struct qed_link_params {
 struct qed_link_output {
bool link_up;
 
-   u32 supported_caps; /* In SUPPORTED defs */
-   u32 advertised_caps;/* In ADVERTISED defs */
-   u32 lp_caps;/* In ADVERTISED defs */
+   /* In QED_LM_* defs */
+   u32 supported_caps;
+   u32 advertised_caps;
+   u32 lp_caps;
+
u32 speed;  /* In Mb/s */
u8  duplex; /* In DUPLEX defs */
u8  port;   /* In PORT defs */
-- 
1.9.3



[PATCH net-next 2/2] qede: Fix forcing high speeds

2016-08-18 Thread Yuval Mintz
While '0xdead' and '0xbeef' are "great" values, we should
use the correct SPEED_* values instead.

Fixes: 054c67d1c82a ("qed*: Add support for ethtool link_ksettings callbacks")
Signed-off-by: Yuval Mintz 
---
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c 
b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index ef521e5..c1a4c98 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -414,7 +414,7 @@ static int qede_set_link_ksettings(struct net_device *dev,
}
params.adv_speeds = QED_LM_40000baseLR4_Full_BIT;
break;
-   case 0xdead:
+   case SPEED_50000:
if (!(current_link.supported_caps &
  QED_LM_50000baseKR2_Full_BIT)) {
DP_INFO(edev, "50G speed not supported\n");
@@ -422,7 +422,7 @@ static int qede_set_link_ksettings(struct net_device *dev,
}
params.adv_speeds = QED_LM_50000baseKR2_Full_BIT;
break;
-   case 0xbeef:
+   case SPEED_100000:
if (!(current_link.supported_caps &
  QED_LM_100000baseKR4_Full_BIT)) {
DP_INFO(edev, "100G speed not supported\n");
-- 
1.9.3
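
As a rough sketch of the pattern this patch touches: ethtool expresses link
speed in Mb/s, so a forced speed has to be matched against real speed values
before it is translated into a capability bit. The capability bits and the
function name below are hypothetical and only serve the illustration.

```c
#include <stdint.h>

/* Speeds in Mb/s, as ethtool expresses them. */
#define EX_SPEED_50000   50000
#define EX_SPEED_100000  100000

/* Hypothetical capability bits (illustrative positions only). */
#define EX_CAP_50G   (1u << 0)
#define EX_CAP_100G  (1u << 1)

/*
 * Return the capability bit to advertise for a forced speed, or 0 if
 * the speed is unknown or the link does not support it.
 */
uint32_t ex_speed_to_adv(uint32_t speed, uint32_t supported_caps)
{
	uint32_t cap;

	switch (speed) {
	case EX_SPEED_50000:
		cap = EX_CAP_50G;
		break;
	case EX_SPEED_100000:
		cap = EX_CAP_100G;
		break;
	default:
		/* A placeholder label such as 0xdead never matches a real request. */
		return 0;
	}
	return (supported_caps & cap) ? cap : 0;
}
```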



[net-next 4/4] igb: fix error code in igb_add_ethtool_nfc_entry()

2016-08-18 Thread Jeff Kirsher
From: Gangfeng Huang 

The error "rmgr: Cannot insert RX class rule: Operation not supported" is
more meaningful than "rmgr: Cannot insert RX class rule: Unknown error 524".

Signed-off-by: Gangfeng Huang 
Tested-by: Aaron Brown 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/igb/igb_ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c 
b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index ef5408b..0c33eca 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2851,7 +2851,7 @@ static int igb_add_ethtool_nfc_entry(struct igb_adapter 
*adapter,
int err = 0;
 
if (!(netdev->hw_features & NETIF_F_NTUPLE))
-   return -ENOTSUPP;
+   return -EOPNOTSUPP;
 
/* Don't allow programming if the action is a queue greater than
 * the number of online Rx queues.
-- 
2.7.4
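
For readers wondering where "Unknown error 524" comes from: 524 is a
kernel-internal error number with no userspace errno name, so strerror() has
no text for it, whereas EOPNOTSUPP is a regular errno. A minimal userspace
sketch follows; the helper name and the EX_ constant are made up here, and the
exact strerror() text depends on the C library.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Kernel-internal value quoted in the message above; not a userspace errno. */
#define EX_KERNEL_ENOTSUPP 524

/* Format "<prefix>: <strerror text>" into buf, the way a tool might print it. */
void ex_describe(char *buf, size_t len, const char *prefix, int err)
{
	snprintf(buf, len, "%s: %s", prefix, strerror(err));
}
```

Calling ex_describe() with EOPNOTSUPP and with EX_KERNEL_ENOTSUPP shows the
two messages side by side on a given system.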



[net-next 2/4] igb: support RX flow classification by ethertype

2016-08-18 Thread Jeff Kirsher
From: Gangfeng Huang 

This patch allows RX network flow classification to insert and remove
ethertype filters via ethtool.

Example:
Add an ethertype filter:
$ ethtool -N eth0 flow-type ether proto 0x88F8 action 2

Show all filters:
$ ethtool -n eth0
4 RX rings available
Total 1 rules

Filter: 15
Flow Type: Raw Ethernet
Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
Ethertype: 0x88F8 mask: 0x0
Action: Direct to queue 2

Delete the filter by location:
$ ethtool -N eth0 delete 15

Signed-off-by: Ruhao Gao 
Signed-off-by: Gangfeng Huang 
Tested-by: Aaron Brown 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/igb/e1000_82575.h |  5 ++
 drivers/net/ethernet/intel/igb/igb.h | 21 ++--
 drivers/net/ethernet/intel/igb/igb_ethtool.c | 77 +++-
 drivers/net/ethernet/intel/igb/igb_ptp.c |  4 +-
 4 files changed, 101 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_82575.h 
b/drivers/net/ethernet/intel/igb/e1000_82575.h
index 199ff98..acf0605 100644
--- a/drivers/net/ethernet/intel/igb/e1000_82575.h
+++ b/drivers/net/ethernet/intel/igb/e1000_82575.h
@@ -188,6 +188,11 @@ struct e1000_adv_tx_context_desc {
 /* ETQF register bit definitions */
 #define E1000_ETQF_FILTER_ENABLE   BIT(26)
 #define E1000_ETQF_1588 BIT(30)
+#define E1000_ETQF_IMM_INT BIT(29)
+#define E1000_ETQF_QUEUE_ENABLE BIT(31)
+#define E1000_ETQF_QUEUE_SHIFT 16
+#define E1000_ETQF_QUEUE_MASK  0x00070000
+#define E1000_ETQF_ETYPE_MASK  0x0000FFFF
 
 /* FTQF register bit definitions */
 #define E1000_FTQF_VF_BP   0x8000
diff --git a/drivers/net/ethernet/intel/igb/igb.h 
b/drivers/net/ethernet/intel/igb/igb.h
index 37f82ca..5c50758 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -350,13 +350,24 @@ struct hwmon_buff {
};
 #endif
 
+/* The number of L2 ether-type filter registers, Index 3 is reserved
+ * for PTP 1588 timestamp
+ */
+#define MAX_ETYPE_FILTER   (4 - 1)
+/* ETQF filter list: one static filter per filter consumer. This is
+ * to avoid filter collisions later. Add new filters here!!
+ *
+ * Current filters:Filter 3
+ */
+#define IGB_ETQF_FILTER_1588   3
+
 #define IGB_N_EXTTS 2
 #define IGB_N_PEROUT   2
 #define IGB_N_SDP  4
 #define IGB_RETA_SIZE  128
 
 enum igb_filter_match_flags {
-   IGB_FILTER_FLAG_NONE = 0x0,
+   IGB_FILTER_FLAG_ETHER_TYPE = 0x1,
 };
 
 #define IGB_MAX_RXNFC_FILTERS 16
@@ -364,14 +375,17 @@ enum igb_filter_match_flags {
 /* RX network flow classification data structure */
 struct igb_nfc_input {
/* Byte layout in order, all values with MSB first:
-   * match_flags - 1 byte
-   */
+* match_flags - 1 byte
+* etype - 2 bytes
+*/
u8 match_flags;
+   __be16 etype;
 };
 
 struct igb_nfc_filter {
struct hlist_node nfc_node;
struct igb_nfc_input filter;
+   u16 etype_reg_index;
u16 sw_idx;
u16 action;
 };
@@ -500,6 +514,7 @@ struct igb_adapter {
unsigned int nfc_filter_count;
/* lock for RX network flow classification filter */
spinlock_t nfc_lock;
+   bool etype_bitmap[MAX_ETYPE_FILTER];
 };
 
 /* flags controlling PTP/1588 function */
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c 
b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 2599826..00e3387 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2431,6 +2431,7 @@ static int igb_get_ts_info(struct net_device *dev,
}
 }
 
+#define ETHER_TYPE_FULL_MASK ((__force __be16)~0)
 static int igb_get_ethtool_nfc_entry(struct igb_adapter *adapter,
 struct ethtool_rxnfc *cmd)
 {
@@ -2448,6 +2449,13 @@ static int igb_get_ethtool_nfc_entry(struct igb_adapter 
*adapter,
if (!rule || fsp->location != rule->sw_idx)
return -EINVAL;
 
+   if (rule->filter.match_flags & IGB_FILTER_FLAG_ETHER_TYPE) {
+   fsp->flow_type = ETHER_FLOW;
+   fsp->ring_cookie = rule->action;
+   fsp->h_u.ether_spec.h_proto = rule->filter.etype;
+   fsp->m_u.ether_spec.h_proto = ETHER_TYPE_FULL_MASK;
+   return 0;
+   }
return -EINVAL;
 }
 
@@ -2650,13 +2658,75 @@ static int igb_set_rss_hash_opt(struct igb_adapter 
*adapter,
return 0;
 }
 
+static int igb_rxnfc_write_etype_filter(struct igb_adapter *adapter,
+   struct igb_nfc_filter *input)
+{
+   struct e1000_hw *hw = &adapter->hw;
+   u8 i;
+   u32 etqf;
+   u16 etype;
+
+   /* find an empty etype filter register */
+   for (i = 0; i < 
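
The quoted patch is cut off at this point in the archive. As a rough idea of
what "find an empty etype filter register" can look like, here is a
standalone sketch of claiming a slot from a small in-use table. It is not the
driver's code; the EX_ names and both helpers are hypothetical.

```c
#include <stdbool.h>

#define EX_MAX_ETYPE_FILTER (4 - 1)	/* index 3 stays reserved, as in the patch */

/* Claim the first unused slot and return its index, or -1 if all are taken. */
int ex_claim_etype_slot(bool used[EX_MAX_ETYPE_FILTER])
{
	int i;

	for (i = 0; i < EX_MAX_ETYPE_FILTER; i++) {
		if (!used[i]) {
			used[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release a previously claimed slot; out-of-range indexes are ignored. */
void ex_release_etype_slot(bool used[EX_MAX_ETYPE_FILTER], int idx)
{
	if (idx >= 0 && idx < EX_MAX_ETYPE_FILTER)
		used[idx] = false;
}
```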

[net-next 1/4] igb: add support of RX network flow classification

2016-08-18 Thread Jeff Kirsher
From: Gangfeng Huang 

This patch allows RX network flow classification to insert and remove
Rx filters via ethtool. The ethtool interface has its own rules
manager.

Show all filters:
$ ethtool -n eth0
4 RX rings available
Total 2 rules

Signed-off-by: Ruhao Gao 
Signed-off-by: Gangfeng Huang 
Tested-by: Aaron Brown 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/igb/igb.h |  32 +
 drivers/net/ethernet/intel/igb/igb_ethtool.c | 193 +++
 drivers/net/ethernet/intel/igb/igb_main.c|  45 +++
 3 files changed, 270 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb.h 
b/drivers/net/ethernet/intel/igb/igb.h
index 5387b3a..37f82ca 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -355,6 +355,27 @@ struct hwmon_buff {
 #define IGB_N_SDP  4
 #define IGB_RETA_SIZE  128
 
+enum igb_filter_match_flags {
+   IGB_FILTER_FLAG_NONE = 0x0,
+};
+
+#define IGB_MAX_RXNFC_FILTERS 16
+
+/* RX network flow classification data structure */
+struct igb_nfc_input {
+   /* Byte layout in order, all values with MSB first:
+   * match_flags - 1 byte
+   */
+   u8 match_flags;
+};
+
+struct igb_nfc_filter {
+   struct hlist_node nfc_node;
+   struct igb_nfc_input filter;
+   u16 sw_idx;
+   u16 action;
+};
+
 /* board specific private data structure */
 struct igb_adapter {
unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)];
@@ -473,6 +494,12 @@ struct igb_adapter {
int copper_tries;
struct e1000_info ei;
u16 eee_advert;
+
+   /* RX network flow classification support */
+   struct hlist_head nfc_filter_list;
+   unsigned int nfc_filter_count;
+   /* lock for RX network flow classification filter */
+   spinlock_t nfc_lock;
 };
 
 /* flags controlling PTP/1588 function */
@@ -599,4 +626,9 @@ static inline struct netdev_queue *txring_txq(const struct 
igb_ring *tx_ring)
return netdev_get_tx_queue(tx_ring->netdev, tx_ring->queue_index);
 }
 
+int igb_add_filter(struct igb_adapter *adapter,
+  struct igb_nfc_filter *input);
+int igb_erase_filter(struct igb_adapter *adapter,
+struct igb_nfc_filter *input);
+
 #endif /* _IGB_H_ */
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c 
b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 64e91c5..2599826 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2431,6 +2431,48 @@ static int igb_get_ts_info(struct net_device *dev,
}
 }
 
+static int igb_get_ethtool_nfc_entry(struct igb_adapter *adapter,
+struct ethtool_rxnfc *cmd)
+{
+   struct ethtool_rx_flow_spec *fsp = &cmd->fs;
+   struct igb_nfc_filter *rule = NULL;
+
+   /* report total rule count */
+   cmd->data = IGB_MAX_RXNFC_FILTERS;
+
+   hlist_for_each_entry(rule, &adapter->nfc_filter_list, nfc_node) {
+   if (fsp->location <= rule->sw_idx)
+   break;
+   }
+
+   if (!rule || fsp->location != rule->sw_idx)
+   return -EINVAL;
+
+   return -EINVAL;
+}
+
+static int igb_get_ethtool_nfc_all(struct igb_adapter *adapter,
+  struct ethtool_rxnfc *cmd,
+  u32 *rule_locs)
+{
+   struct igb_nfc_filter *rule;
+   int cnt = 0;
+
+   /* report total rule count */
+   cmd->data = IGB_MAX_RXNFC_FILTERS;
+
+   hlist_for_each_entry(rule, &adapter->nfc_filter_list, nfc_node) {
+   if (cnt == cmd->rule_cnt)
+   return -EMSGSIZE;
+   rule_locs[cnt] = rule->sw_idx;
+   cnt++;
+   }
+
+   cmd->rule_cnt = cnt;
+
+   return 0;
+}
+
 static int igb_get_rss_hash_opts(struct igb_adapter *adapter,
 struct ethtool_rxnfc *cmd)
 {
@@ -2484,6 +2526,16 @@ static int igb_get_rxnfc(struct net_device *dev, struct 
ethtool_rxnfc *cmd,
cmd->data = adapter->num_rx_queues;
ret = 0;
break;
+   case ETHTOOL_GRXCLSRLCNT:
+   cmd->rule_cnt = adapter->nfc_filter_count;
+   ret = 0;
+   break;
+   case ETHTOOL_GRXCLSRULE:
+   ret = igb_get_ethtool_nfc_entry(adapter, cmd);
+   break;
+   case ETHTOOL_GRXCLSRLALL:
+   ret = igb_get_ethtool_nfc_all(adapter, cmd, rule_locs);
+   break;
case ETHTOOL_GRXFH:
ret = igb_get_rss_hash_opts(adapter, cmd);
break;
@@ -2598,6 +2650,142 @@ static int igb_set_rss_hash_opt(struct igb_adapter 
*adapter,
return 0;
 }
 
+int igb_add_filter(struct igb_adapter *adapter, struct igb_nfc_filter *input)
+{
+   return -EINVAL;
+}
+
+int 

[net-next 3/4] igb: support RX flow classification by VLAN priority

2016-08-18 Thread Jeff Kirsher
From: Gangfeng Huang 

This patch allows RX network flow classification to insert and remove
VLAN priority filters via ethtool.

Example:
Add a VLAN priority filter:
$ ethtool -N eth0 flow-type ether vlan 0x6000 vlan-mask 0x1FFF action 2 loc 1

Show all filters:
$ ethtool -n eth0
4 RX rings available
Total 1 rules

Filter: 1
Flow Type: Raw Ethernet
Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
Ethertype: 0x0 mask: 0x
VLAN EtherType: 0x0 mask: 0x
VLAN: 0x6000 mask: 0x1fff
User-defined: 0x0 mask: 0x
Action: Direct to queue 2

Delete the filter by location:
$ ethtool -N eth0 delete 1

Signed-off-by: Ruhao Gao 
Signed-off-by: Gangfeng Huang 
Tested-by: Aaron Brown 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/igb/e1000_defines.h |  4 ++
 drivers/net/ethernet/intel/igb/e1000_regs.h|  1 +
 drivers/net/ethernet/intel/igb/igb.h   |  3 +
 drivers/net/ethernet/intel/igb/igb_ethtool.c   | 91 --
 4 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h 
b/drivers/net/ethernet/intel/igb/e1000_defines.h
index 2997c44..2688180 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -1024,4 +1024,8 @@
 #define E1000_RTTBCNRC_RF_INT_MASK \
(E1000_RTTBCNRC_RF_DEC_MASK << E1000_RTTBCNRC_RF_INT_SHIFT)
 
+#define E1000_VLAPQF_QUEUE_SEL(_n, q_idx) (q_idx << ((_n) * 4))
+#define E1000_VLAPQF_P_VALID(_n)   (0x1 << (3 + (_n) * 4))
+#define E1000_VLAPQF_QUEUE_MASK 0x03
+
 #endif
diff --git a/drivers/net/ethernet/intel/igb/e1000_regs.h 
b/drivers/net/ethernet/intel/igb/e1000_regs.h
index 21d9d02..d84afdd 100644
--- a/drivers/net/ethernet/intel/igb/e1000_regs.h
+++ b/drivers/net/ethernet/intel/igb/e1000_regs.h
@@ -309,6 +309,7 @@
(0x054E0 + ((_i - 16) * 8)))
 #define E1000_RAH(_i)  (((_i) <= 15) ? (0x05404 + ((_i) * 8)) : \
(0x054E4 + ((_i - 16) * 8)))
+#define E1000_VLAPQF   0x055B0  /* VLAN Priority Queue Filter VLAPQF */
 #define E1000_IP4AT_REG(_i) (0x05840 + ((_i) * 8))
 #define E1000_IP6AT_REG(_i) (0x05880 + ((_i) * 4))
 #define E1000_WUPM_REG(_i)  (0x05A00 + ((_i) * 4))
diff --git a/drivers/net/ethernet/intel/igb/igb.h 
b/drivers/net/ethernet/intel/igb/igb.h
index 5c50758..03fbe4b 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -368,6 +368,7 @@ struct hwmon_buff {
 
 enum igb_filter_match_flags {
IGB_FILTER_FLAG_ETHER_TYPE = 0x1,
+   IGB_FILTER_FLAG_VLAN_TCI   = 0x2,
 };
 
 #define IGB_MAX_RXNFC_FILTERS 16
@@ -377,9 +378,11 @@ struct igb_nfc_input {
/* Byte layout in order, all values with MSB first:
 * match_flags - 1 byte
 * etype - 2 bytes
+* vlan_tci - 2 bytes
 */
u8 match_flags;
__be16 etype;
+   __be16 vlan_tci;
 };
 
 struct igb_nfc_filter {
diff --git a/drivers/net/ethernet/intel/igb/igb_ethtool.c 
b/drivers/net/ethernet/intel/igb/igb_ethtool.c
index 00e3387..ef5408b 100644
--- a/drivers/net/ethernet/intel/igb/igb_ethtool.c
+++ b/drivers/net/ethernet/intel/igb/igb_ethtool.c
@@ -2449,11 +2449,18 @@ static int igb_get_ethtool_nfc_entry(struct igb_adapter 
*adapter,
if (!rule || fsp->location != rule->sw_idx)
return -EINVAL;
 
-   if (rule->filter.match_flags & IGB_FILTER_FLAG_ETHER_TYPE) {
+   if (rule->filter.match_flags) {
fsp->flow_type = ETHER_FLOW;
fsp->ring_cookie = rule->action;
-   fsp->h_u.ether_spec.h_proto = rule->filter.etype;
-   fsp->m_u.ether_spec.h_proto = ETHER_TYPE_FULL_MASK;
+   if (rule->filter.match_flags & IGB_FILTER_FLAG_ETHER_TYPE) {
+   fsp->h_u.ether_spec.h_proto = rule->filter.etype;
+   fsp->m_u.ether_spec.h_proto = ETHER_TYPE_FULL_MASK;
+   }
+   if (rule->filter.match_flags & IGB_FILTER_FLAG_VLAN_TCI) {
+   fsp->flow_type |= FLOW_EXT;
+   fsp->h_ext.vlan_tci = rule->filter.vlan_tci;
+   fsp->m_ext.vlan_tci = htons(VLAN_PRIO_MASK);
+   }
return 0;
}
return -EINVAL;
@@ -2697,12 +2704,46 @@ static int igb_rxnfc_write_etype_filter(struct 
igb_adapter *adapter,
return 0;
 }
 
+int igb_rxnfc_write_vlan_prio_filter(struct igb_adapter *adapter,
+struct igb_nfc_filter *input)
+{
+   struct e1000_hw *hw = &adapter->hw;
+   u8 vlan_priority;
+   u16 queue_index;
+   u32 vlapqf;
+
+   vlapqf = rd32(E1000_VLAPQF);
+   
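
The quoted patch is cut off here. To make the register layout implied by the
new macros concrete, the sketch below works through the commit message's own
example (vlan 0x6000, action 2). It assumes the VLAN priority occupies the
top three bits of the TCI; the EX_ names and ex_vlapqf_set() are hypothetical
and not the driver's function.

```c
#include <assert.h>
#include <stdint.h>

#define EX_VLAN_PRIO_SHIFT 13		/* priority is the top 3 bits of the TCI */
#define EX_VLAN_PRIO_MASK  0xe000u

/* Same shape as the macros added by the patch: 4 bits per priority level. */
#define EX_VLAPQF_QUEUE_SEL(n, q)  ((uint32_t)(q) << ((n) * 4))
#define EX_VLAPQF_P_VALID(n)       ((uint32_t)0x1 << (3 + (n) * 4))
#define EX_VLAPQF_QUEUE_MASK       0x03u

/* Return reg with the entry for the TCI's priority pointing at queue. */
uint32_t ex_vlapqf_set(uint32_t reg, uint16_t tci, unsigned int queue)
{
	unsigned int prio = (tci & EX_VLAN_PRIO_MASK) >> EX_VLAN_PRIO_SHIFT;

	assert(queue <= EX_VLAPQF_QUEUE_MASK);

	/* Clear the old queue selection for this priority, then set the new one. */
	reg &= ~EX_VLAPQF_QUEUE_SEL(prio, EX_VLAPQF_QUEUE_MASK);
	reg |= EX_VLAPQF_QUEUE_SEL(prio, queue);
	reg |= EX_VLAPQF_P_VALID(prio);
	return reg;
}
```

For TCI 0x6000 the priority is 3, so queue 2 lands in bits 12-13 and the
valid flag in bit 15, giving 0xA000 when starting from an empty register.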

[net-next 0/4][pull request] 1GbE Intel Wired LAN Driver Updates 2016-08-18

2016-08-18 Thread Jeff Kirsher
This series contains updates to igb only.

Gangfeng Huang provides all the changes in the series to update the
igb driver to support advanced receive side filters that direct receive
packets by flows to different hardware queues. This enables a tight
control on routing a flow in the platform.  First patch allows for
receive network flow classification to insert and remove receive filters
by ethtool.  Second and third patches add the ability to insert and
remove ethertype and VLAN priority filters by ethtool.

Last patch just fixes an error message to return "Not supported" versus
"Unknown error 524".

The following are changes since commit e951f145d1724769546efe2e04f6cb2b7037d7ea:
  Merge branch 'tipc-next'
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue 1GbE

Gangfeng Huang (4):
  igb: add support of RX network flow classification
  igb: support RX flow classification by ethertype
  igb: support RX flow classification by VLAN priority
  igb: fix error code in igb_add_ethtool_nfc_entry()

 drivers/net/ethernet/intel/igb/e1000_82575.h   |   5 +
 drivers/net/ethernet/intel/igb/e1000_defines.h |   4 +
 drivers/net/ethernet/intel/igb/e1000_regs.h|   1 +
 drivers/net/ethernet/intel/igb/igb.h   |  50 
 drivers/net/ethernet/intel/igb/igb_ethtool.c   | 345 +
 drivers/net/ethernet/intel/igb/igb_main.c  |  45 
 drivers/net/ethernet/intel/igb/igb_ptp.c   |   4 +-
 7 files changed, 452 insertions(+), 2 deletions(-)

-- 
2.7.4



Re: [PATCH] smc91c92_cs : add a spinlock to avoid race condition

2016-08-18 Thread Vaishali Thakkar


On Friday 19 August 2016 09:37 AM, David Miller wrote:
> From: Pavel Andrianov 
> Date: Tue, 16 Aug 2016 16:39:06 +0300
> 
>> smc_reset may be executed in parallel with timer function media_check.
>> To avoid data race in smc_set_xcvr a spinlock was added.
>>
>> Found by Linux Driver Verification project (linuxtesting.org).
>>
>> Signed-off-by: Pavel Andrianov 
> 
> This is not sufficient.
> 
> You have to block basically the entire function, because both
> smc_reset and media_check program the bank selection so could
> corrupt each other's register accesses.

Hmm, but the function also uses udelay. Would it still be fine to
hold a spinlock across the whole function?

> 

-- 
Vaishali


Re: [PATCH net-next 0/3] qed*: Add support for additional statistics.

2016-08-18 Thread David Miller
From: Sudarsana Reddy Kalluru 
Date: Tue, 16 Aug 2016 10:51:00 -0400

> The patch series adds qed/qede support for new statistics.
> Patch (1) adds a couple of statistics for "ethtool -S" display.
> Patch (2) adds support for per-queue statistics to ethtool display.
> Patch (3) adds qed support for NCSI statistics.
> 
> Please consider applying this to 'net-next' branch.

Series applied, thanks.


[PATCH net-next 0/5] net: dsa: bcm_sf2: Platform conversion

2016-08-18 Thread Florian Fainelli
Hi all,

This patch series converts the bcm_sf2 driver from a traditional DSA driver
into a platform_device driver and makes it use the new DSA binding that Andrew
introduced in the latest merge window.

Prior attempts used to coerce the code in net/dsa/dsa2.c to accept the old
binding, while really there is only one broken single user out there: bcm_sf2,
so instead, just assume the new DT binding is deployed and use it accordingly.

Florian Fainelli (5):
  net: dsa: Export suspend/resume functions
  Documentation: dt: bindings: Update Broadcom 7445 switch document
  net: dsa: bcm_sf2: Make it a real platform device driver
  net: dsa: bcm_sf2: Use device managed helpers
  net: dsa: bcm_sf2: Remove probing through old DSA binding

 .../bindings/net/brcm,bcm7445-switch-v4.0.txt  |  45 -
 drivers/net/dsa/bcm_sf2.c  | 221 -
 include/net/dsa.h  |  14 ++
 net/dsa/dsa.c  |   7 +-
 4 files changed, 189 insertions(+), 98 deletions(-)

-- 
2.7.4



Re: [PATCH] fib_trie: Fix the description of pos and bits

2016-08-18 Thread Alexander Duyck
On Wed, Aug 17, 2016 at 9:33 PM, Xunlei Pang  wrote:
> 1) Fix one typo: s/tn/tp/
> 2) Fix the description about the "u" bits.
>
> Signed-off-by: Xunlei Pang 
> ---
>  net/ipv4/fib_trie.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index d07fc07..eb7c5d1 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -249,7 +249,7 @@ static inline unsigned long get_index(t_key key, struct 
> key_vector *kv)
>   * index into the parent's child array. That is, they will be used to find
>   * 'n' among tp's children.
>   *
> - * The bits from (n->pos + n->bits) to (tn->pos - 1) - "S" - are skipped bits
> + * The bits from (n->pos + n->bits) to (tp->pos - 1) - "S" - are skipped bits
>   * for the node n.
>   *
>   * All the bits we have seen so far are significant to the node n. The rest
> @@ -258,7 +258,7 @@ static inline unsigned long get_index(t_key key, struct 
> key_vector *kv)
>   * The bits from (n->pos) to (n->pos + n->bits - 1) - "C" - are the index 
> into
>   * n's child array, and will of course be different for each child.
>   *
> - * The rest of the bits, from 0 to (n->pos + n->bits), are completely unknown
> + * The rest of the bits, from 0 to (n->pos -1) - "u" - are completely unknown
>   * at this point.
>   */
>

These comment fixes all look correct.

Acked-by: Alexander Duyck 
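
As a small illustration of the bit ranges named in the corrected comment,
the sketch below extracts the "C" bits (child index) and builds a mask for
the "u" bits of a 32-bit key. It is not the kernel's get_index(); the
function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* "C" bits: the child index, taken from bit positions pos .. pos+bits-1. */
uint32_t ex_child_index(uint32_t key, unsigned int pos, unsigned int bits)
{
	assert(pos + bits <= 32);
	return (uint32_t)(((uint64_t)key >> pos) & ((1ULL << bits) - 1));
}

/* "u" bits: everything below pos, i.e. positions 0 .. pos-1. */
uint32_t ex_unknown_mask(unsigned int pos)
{
	assert(pos <= 32);
	return (uint32_t)((1ULL << pos) - 1);
}
```

With pos = 8 and bits = 4, the child index comes from bits 8..11 and the
unknown mask covers bits 0..7, which matches the "0 to (n->pos - 1)" wording.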


Re: [PATCH next-next 0/2] qdisc-hashtable fixes

2016-08-18 Thread David Miller
From: Jiri Kosina 
Date: Tue, 16 Aug 2016 23:52:08 +0200 (CEST)

> The following two patches fix all the issues that have been reported 
> against the conversion of qdisc linked list to hashtable (currently in 
> net-next) so far.
> 
> First patch adjusts handling of singleton qdiscs to the new semantics, and 
> is rather straightforward.
> 
> The second patch, which fixes "cosmetic" issue of duplicate entries in the 
> qdisc dump for ingress qdiscs, is a little bit more hairy; I personally 
> would love to see all the already existing "if (ingress)"-like hacks go 
> away (by, let's say, introducing a general TCQ_F_? flag), but that's way 
> out of scope of this patchset (but already on my todo).
> 
> Thanks a lot to Daniel Borkmann and David Ahern for reporting the issues 
> and testing the patches promptly.

Series applied, thanks Jiri.


Re: [RFC PATCH] net: diag: support SOCK_DESTROY for UDP sockets

2016-08-18 Thread Eric Dumazet
On Thu, 2016-08-18 at 19:29 -0700, David Ahern wrote:
> This implements SOCK_DESTROY for UDP sockets similar to what was done
> for TCP with commit c1e64e298b8ca ("net: diag: Support destroying TCP
> sockets.") A process with a UDP socket targeted for destroy is awakened
> and recvmsg fails with ECONNABORTED giving the process notice that
> it should close the socket.
> 
> Signed-off-by: David Ahern 
> ---
>  include/net/udp.h   |  8 
>  net/ipv4/udp.c  | 58 
> +
>  net/ipv4/udp_diag.c | 55 ++
>  net/ipv6/udp.c  | 45 +
>  4 files changed, 166 insertions(+)
> 
> diff --git a/include/net/udp.h b/include/net/udp.h
> index 8894d7144189..5c3443634b4d 100644
> --- a/include/net/udp.h
> +++ b/include/net/udp.h
> @@ -251,6 +251,7 @@ int udp_get_port(struct sock *sk, unsigned short snum,
>int (*saddr_cmp)(const struct sock *,
> const struct sock *));
>  void udp_err(struct sk_buff *, u32);
> +int udp_abort(struct sock *sk, int err);
>  int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
>  int udp_push_pending_frames(struct sock *sk);
>  void udp_flush_pending_frames(struct sock *sk);
> @@ -274,6 +275,9 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 
> saddr, __be16 sport,
>  struct udp_table *tbl, struct sk_buff *skb);
>  struct sock *udp4_lib_lookup_skb(struct sk_buff *skb,
>__be16 sport, __be16 dport);
> +struct sock *udp4_lib_lookup_full(struct net *net, __be32 saddr,
> +   __be16 sport, __be32 daddr, __be16 dport,
> +   int dif, struct udp_table *table);
>  struct sock *udp6_lib_lookup(struct net *net,
>const struct in6_addr *saddr, __be16 sport,
>const struct in6_addr *daddr, __be16 dport,
> @@ -286,6 +290,10 @@ struct sock *__udp6_lib_lookup(struct net *net,
>  struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
>__be16 sport, __be16 dport);
>  
> +struct sock *udp6_lib_lookup_full(struct net *net,
> +   const struct in6_addr *saddr, __be16 sport,
> +   const struct in6_addr *daddr, __be16 dport,
> +   int dif, struct udp_table *table);
>  /*
>   *   SNMP statistics for UDP and UDP-Lite
>   */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index e61f7cd65d08..e5b72a7165e1 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -594,6 +594,47 @@ struct sock *udp4_lib_lookup(struct net *net, __be32 
> saddr, __be16 sport,
>  EXPORT_SYMBOL_GPL(udp4_lib_lookup);
>  #endif
>  
> +/* caller should hold rcu lock */
> +struct sock *udp4_lib_lookup_full(struct net *net, __be32 saddr,
> +   __be16 sport, __be32 daddr, __be16 dport,
> +   int dif, struct udp_table *table)
> +{
> + struct sock *sk;
> + int i;
> +
> + for (i = 0; i <= table->mask; i++) {
> + struct udp_hslot *hslot = &table->hash[i];
> +
> + sk_for_each_rcu(sk, &hslot->head) {
> + struct inet_sock *inet;
> +
> + if (!net_eq(sock_net(sk), net) ||
> + ipv6_only_sock(sk))
> + continue;
> +
> + if (dif && sk->sk_bound_dev_if != dif)
> + continue;
> +
> + if (sk->sk_family != PF_INET)
> + continue;
> +
> + if (udp_sk(sk)->udp_port_hash != ntohs(sport))
> + continue;
> +
> + inet = inet_sk(sk);
> + if (inet->inet_rcv_saddr != saddr  ||
> + inet->inet_daddr != daddr  ||
> + inet->inet_dport != dport)
> + continue;
> +
> + return sk;
> + }
> + }
> +
> + return NULL;
> +}


Why are you iterating the whole table ?

Normally, udp_hashfn(net, ntohs(dport), udptable->mask) slot should be
enough to find all sockets bound to dport.
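
The point of the review comment can be shown with a toy hash table: if
sockets are stored in the slot selected by their port, a lookup only has to
walk that one chain. This is a simplified sketch, not the UDP code; all
names are hypothetical and the hash is a plain "port & mask", whereas the
real hash function also takes the network namespace into account.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_SLOTS 8	/* power of two, so that mask = EX_SLOTS - 1 */

struct ex_sock {
	uint16_t port;		/* host byte order */
	struct ex_sock *next;	/* chain within one slot */
};

struct ex_table {
	struct ex_sock *slot[EX_SLOTS];
	unsigned int mask;
};

/* Simplified stand-in for a port hash. */
static unsigned int ex_hashfn(uint16_t port, unsigned int mask)
{
	return port & mask;
}

void ex_insert(struct ex_table *t, struct ex_sock *sk)
{
	unsigned int i = ex_hashfn(sk->port, t->mask);

	sk->next = t->slot[i];
	t->slot[i] = sk;
}

/* Walk only the one chain that can contain the port. */
struct ex_sock *ex_lookup(const struct ex_table *t, uint16_t port)
{
	struct ex_sock *sk;

	for (sk = t->slot[ex_hashfn(port, t->mask)]; sk; sk = sk->next)
		if (sk->port == port)
			return sk;
	return NULL;
}
```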





Re: [PATCH] net: thunderx: Fix OOPs with ethtool --register-dump

2016-08-18 Thread David Miller
From: David Daney 
Date: Tue, 16 Aug 2016 12:28:17 -0700

> From: David Daney 
> 
> The ethtool_ops .get_regs function attempts to read the nonexistent
> register NIC_QSET_SQ_0_7_CNM_CHG, which produces a "bus error" type
> OOPs.
> 
> Fix by not attempting to read, and removing the definition of,
> NIC_QSET_SQ_0_7_CNM_CHG.  A zero is written into the register dump to
> keep the layout unchanged.
> 
> Signed-off-by: David Daney 

Applied.


Re: [PATCH net-next 0/2] tipc: bearer and link improvements

2016-08-18 Thread David Miller
From: Jon Maloy 
Date: Tue, 16 Aug 2016 11:53:49 -0400

> The first commit makes it possible to set and check the 'blocked' state
> of a bearer from the generic bearer layer. The second commit is a small
> improvement to the link congestion mechanism.

Applied, thanks.


Re: [PATCH net] qede: Fix Tx timeout due to xmit_more

2016-08-18 Thread David Miller
From: Yuval Mintz 
Date: Tue, 16 Aug 2016 18:40:18 +0300

> Driver uses netif_tx_queue_stopped() to make sure the xmit_more
> indication will be honored, but that only checks for DRV_XOFF.
> 
> At the same time, it's possible that during transmission the DQL will
> close the transmission queue with STACK_XOFF indication.
> In re-configuration flows, when the threshold is relatively low, it's
> possible that the device has no pending transmissions, and during
> transmission the driver would miss doorbelling the HW.
> Since there are no pending transmissions, there will never be a Tx
> completion [and thus the DQL would not remove the STACK_XOFF indication],
> eventually causing the Tx queue to timeout.
> 
> While we're at it - also doorbell in case driver has to close the
> transmission queue on its own [although this one is less important -
> if the ring is full, we're bound to receive completion eventually,
> which means the doorbell would only be postponed and not indefinitely
> blocked].
> 
> Fixes: 312e06761c99 ("qede: Utilize xmit_more")
> Signed-off-by: Yuval Mintz 

Applied, thanks.

> Do notice that we have pending patch intended for 'net-next'
> ("qede: Add support for per-queue stats") that might cause a merge
> conflict with this one - although it should be trivial to resolve.

Ok.
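
The decision described in the fix can be sketched as a small predicate:
ring the doorbell when no further packets are promised, or when the queue
has been stopped for any reason, not only by the driver. The flag names
below echo the two reasons mentioned in the message but are hypothetical
constants, and the functions are illustrations rather than driver code.

```c
#include <stdbool.h>

/* Hypothetical queue-stop reasons, modelled on the two named in the message. */
#define EX_DRV_XOFF    (1u << 0)	/* stopped by the driver */
#define EX_STACK_XOFF  (1u << 1)	/* stopped by the stack (dynamic queue limits) */

/* Checking only the driver bit misses a queue stopped by the stack. */
bool ex_must_ring_drv_only(bool xmit_more, unsigned int qstate)
{
	return !xmit_more || (qstate & EX_DRV_XOFF);
}

/* Ring if no more packets are promised or the queue is stopped for any reason. */
bool ex_must_ring(bool xmit_more, unsigned int qstate)
{
	return !xmit_more || (qstate & (EX_DRV_XOFF | EX_STACK_XOFF));
}
```

With xmit_more set and only EX_STACK_XOFF raised, the first predicate skips
the doorbell while the second one rings it, which is the case the patch
description is concerned with.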


Re: [PATCH] samples/bpf: Add tunnel set/get tests.

2016-08-18 Thread David Miller
From: William Tu 
Date: Tue, 16 Aug 2016 07:03:01 -0700

> The patch creates sample code exercising bpf_skb_{set,get}_tunnel_key,
> and bpf_skb_{set,get}_tunnel_opt for GRE, VXLAN, and GENEVE.  A native
> tunnel device is created in a namespace to interact with a lwtunnel
> device out of the namespace, with metadata enabled.  The bpf_skb_set_*
> program is attached to tc egress and bpf_skb_get_* is attached to egress
> qdisc.  A ping between two tunnels is used to verify correctness and
> the result of bpf_skb_get_* printed by bpf_trace_printk.
> 
> Signed-off-by: William Tu 

Please respin, this doesn't apply cleanly to net-next.


Re: [PATCH 0/2] ipv6: fix stuck RA-derived route in container after migration with criu

2016-08-18 Thread David Miller

The indentation of both of your patches is incorrect.

You must indent each basic block by a full TAB character.

You must also line up multi-line function calls, such that
the second and subsequent lines start precisely at the first
column after the opening parenthesis of the function call
itself.  You must use the appropriate number of TAB and
SPC characters necessary to achieve this.
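
A short example of the layout described above, using made-up functions.
Block bodies are indented with one TAB per level, and continuation lines of
a wrapped call or declaration use TABs followed by spaces so that they start
in the column right after the opening parenthesis.

```c
/* Hypothetical helper with enough parameters to force a line wrap. */
int ex_sum4(int first_value, int second_value, int third_value,
	    int fourth_value)
{
	return first_value + second_value + third_value + fourth_value;
}

int ex_caller(void)
{
	int total;

	/* The continuation line starts in the column right after the '('. */
	total = ex_sum4(1000000, 2000000, 3000000,
			4000000);
	if (total > 0) {
		/* Each nested block is indented by one more TAB. */
		total -= 1;
	}
	return total;
}
```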



Re: [PATCH] smc91c92_cs : add a spinlock to avoid race condition

2016-08-18 Thread David Miller
From: Pavel Andrianov 
Date: Tue, 16 Aug 2016 16:39:06 +0300

> smc_reset may be executed in parallel with timer function media_check.
> To avoid data race in smc_set_xcvr a spinlock was added.
> 
> Found by Linux Driver Verification project (linuxtesting.org).
> 
> Signed-off-by: Pavel Andrianov 

This is not sufficient.

You have to block basically the entire function, because both
smc_reset and media_check program the bank selection so could
corrupt each other's register accesses.


Re: [PATCH] net: atm: remove redundant null pointer check on dev->name

2016-08-18 Thread David Miller
From: Colin King 
Date: Tue, 16 Aug 2016 12:08:42 +0100

> From: Colin Ian King 
> 
> dev->name is a char array of IFNAMSIZ elements, hence can never be
> null, so the null pointer check is redundant. Remove it.
> 
> Signed-off-by: Colin Ian King 

Applied.


Re: [PATCHv3] net: ethernet: macb: Add support for rx_clk

2016-08-18 Thread David Miller
From: Shubhrajyoti Datta 
Date: Tue, 16 Aug 2016 10:14:50 +0530

> Some of the platforms like zynqmp ultrascale+ have a
> separate clock gate for the rx clock. Add an optional
> rx_clk so that the clock can be enabled.
> 
> Signed-off-by: Shubhrajyoti Datta 

Applied to net-next.


Re: [net-next 00/15][pull request] 40GbE Intel Wired LAN Driver Updates 2016-08-18

2016-08-18 Thread David Miller
From: Jeff Kirsher 
Date: Thu, 18 Aug 2016 14:57:21 -0700

> This series contains updates to i40e and i40evf only.

Pulled, thanks Jeff.


RE: [PATCH v6 3/3] net: phy: Add gmiitorgmii converter support

2016-08-18 Thread Appana Durga Kedareswara Rao
Hi Andrew,

Thanks for the review...

> 
> > +static int xgmiitorgmii_read_status(struct phy_device *phydev) {
> > +   struct gmii2rgmii *priv = phydev->priv;
> > +   u16 val = 0;
> > +
> > +   priv->phy_drv->read_status(phydev);
> 
> This can return an error, in which case phydev->speed should not be trusted.

Will fix...

> 
> I've not thought locking all the way through yet. I don't think you need a
> lock here, but i need to think about it.

Ok...

> 
> > +
> > +   val = mdiobus_read(phydev->mdio.bus, priv->addr,
> > +XILINX_GMII2RGMII_REG);
> 
> You should check for an error here.
> 
> > +   val &= XILINX_GMII2RGMII_SPEED_MASK;
> > +
> > +   if (phydev->speed == SPEED_1000)
> > +   val |= BMCR_SPEED1000;
> > +   else if (phydev->speed == SPEED_100)
> > +   val |= BMCR_SPEED100;
> > +   else
> > +   val |= BMCR_SPEED10;
> 
> What happens if for example the PHY is an aquantia and has negotiated
> SPEED_2500? Some Marvell PHYs can also do odd speeds, like 200Mbps.
> You probably want to return an error, rather than silently have things go
> wrong.

Will fix...

> 
> > +
> > +   mdiobus_write(phydev->mdio.bus, priv->addr, XILINX_GMII2RGMII_REG,
> > +val);
> 
> This can also return an error.

Will fix...

> 
> > +   return 0;
> > +}
> > +
> > +int xgmiitorgmii_probe(struct mdio_device *mdiodev) {
> > +   struct device *dev = &mdiodev->dev;
> > +   struct device_node *np = dev->of_node, *phy_node;
> > +   struct gmii2rgmii *priv;
> > +
> > +   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> > +   if (!priv)
> > +   return -ENOMEM;
> > +
> > +   phy_node = of_parse_phandle(np, "phy-handle", 0);
> > +   if (IS_ERR(phy_node)) {
> > +   dev_err(dev, "Couldn't parse phy-handle\n");
> > +   return -ENODEV;
> > +   }
> > +
> > +   priv->phy_dev = of_phy_find_device(phy_node);
> > +   if (!priv->phy_dev) {
> > +   dev_info(dev, "Couldn't find phydev\n");
> > +   return -EPROBE_DEFER;
> > +   }
> > +
> > +   priv->addr = mdiodev->addr;
> > +   priv->phy_drv = priv->phy_dev->drv;
> > +   memcpy(&priv->conv_phy_drv, priv->phy_dev->drv,
> > +  sizeof(struct phy_driver));
> > +   priv->conv_phy_drv.read_status = xgmiitorgmii_read_status;
> > +   priv->phy_dev->priv = priv;
> > +   priv->phy_dev->drv = &priv->conv_phy_drv;
> 
> So from this point onward, the phy driver depends on the memory allocated by
> this driver. If this driver goes away, freeing its memory, the next call to
> read_status() is going to have a problem.
> 
> Also, i think this assignment should take the phy lock, just to be safe.

Ok will fix

> 
> There also needs to be some thought into what happens if the phy driver is
> unloaded. Should this driver take a reference on the phy driver to prevent
> that?

Ok will increment the reference count of the external phy device...


Regards,
Kedar.

> 
>Andrew
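
The review points above (check the read for an error, reject speeds the
converter cannot express) could be sketched roughly as follows. This is
a userspace sketch only, not the driver's eventual fix: the helper names
are hypothetical, the BMCR_* and SPEED_* values are redefined locally to
mirror the kernel's MII and ethtool constants, and the register mask is
passed in because its value is not shown in the quoted patch.

```c
#include <errno.h>

/* Local copies mirroring the kernel's BMCR_* and SPEED_* constants. */
#define BMCR_SPEED1000	0x0040
#define BMCR_SPEED100	0x2000
#define BMCR_SPEED10	0x0000

#define SPEED_10	10
#define SPEED_100	100
#define SPEED_1000	1000

/* Hypothetical helper: BMCR speed bits, or -EINVAL for other speeds. */
static int gmii2rgmii_speed_bits(int speed)
{
	switch (speed) {
	case SPEED_1000:
		return BMCR_SPEED1000;
	case SPEED_100:
		return BMCR_SPEED100;
	case SPEED_10:
		return BMCR_SPEED10;
	default:
		/* e.g. 2500 or 200 Mbps: refuse instead of guessing */
		return -EINVAL;
	}
}

/*
 * Hypothetical helper: reg_val is the result of a register read kept in
 * an int, so a negative errno from the bus read is not lost. Returns
 * the new register value, or a negative errno.
 */
static int gmii2rgmii_update(int reg_val, int mask, int speed)
{
	int bits;

	if (reg_val < 0)
		return reg_val;		/* propagate the read error */

	bits = gmii2rgmii_speed_bits(speed);
	if (bits < 0)
		return bits;		/* unsupported speed */

	return (reg_val & mask) | bits;
}
```

Keeping the read result in an int rather than a u16 is what allows the
negative error code to be seen at all.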


[PATCH] netfilter: fix spelling mistake: "delimitter" -> "delimiter"

2016-08-18 Thread Colin King
From: Colin Ian King 

trivial fix to spelling mistake in pr_debug message

Signed-off-by: Colin Ian King 
---
 net/netfilter/nf_conntrack_ftp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conntrack_ftp.c b/net/netfilter/nf_conntrack_ftp.c
index 4314700..b6934b5 100644
--- a/net/netfilter/nf_conntrack_ftp.c
+++ b/net/netfilter/nf_conntrack_ftp.c
@@ -237,7 +237,7 @@ static int try_eprt(const char *data, size_t dlen, struct nf_conntrack_man *cmd,
}
delim = data[0];
if (isdigit(delim) || delim < 33 || delim > 126 || data[2] != delim) {
-   pr_debug("try_eprt: invalid delimitter.\n");
+   pr_debug("try_eprt: invalid delimiter.\n");
return 0;
}
 
-- 
2.8.1



Re: [PATCH iproute2 1/1] tc classifiers: Modernize tcindex classifier

2016-08-18 Thread Stephen Hemminger
On Wed, 17 Aug 2016 07:16:53 -0400
Jamal Hadi Salim  wrote:

> From: Jamal Hadi Salim 
> 
> Signed-off-by: Jamal Hadi Salim 
> ---

This is ok for the parsing of config, but you are still using print_police
for display.


[net-next 03/15] i40e/i40evf: Reset VLAN filter count when resetting

2016-08-18 Thread Jeff Kirsher
From: Catherine Sullivan 

When we do a reset, all the VLAN filters get added again. Therefore we also
want to reset the VLAN count to 0 or we quickly run out of filters.

Change-ID: I459f26851e22204dc8b828ad87cde8170119
Signed-off-by: Catherine Sullivan 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 6fcbf76..0fa050a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -992,6 +992,7 @@ complete_reset:
set_bit(I40E_VF_STAT_ACTIVE, &vf->vf_states);
clear_bit(I40E_VF_STAT_DISABLED, &vf->vf_states);
i40e_notify_client_of_vf_reset(pf, abs_vf_id);
+   vf->num_vlan = 0;
}
/* tell the VF the reset is done */
wr32(hw, I40E_VFGEN_RSTAT1(vf->vf_id), I40E_VFR_VFACTIVE);
-- 
2.7.4



Re: [PATCH] mwifiex: propagate error if IRQ request fails in mwifiex_sdio_of()

2016-08-18 Thread Arend van Spriel


On 18-08-16 21:29, Javier Martinez Canillas wrote:
> Hello Arend,
> 
> Thanks a lot for your feedback.
> 
> On 08/18/2016 03:14 PM, Arend van Spriel wrote:
>> On 18-08-16 16:17, Javier Martinez Canillas wrote:
>>> If request_irq() fails in mwifiex_sdio_probe_of(), only an error message
>>> is printed but the actual error is not propagated to the caller function.
>>
>> Hmm. The caller function, ie. mwifiex_sdio_probe(), does not seem to care.
>>
> 
> Hmm, I'm not so sure about that. It's checking the mwifiex_sdio_probe_of()
> return value.

Ok. I looked at 4.7 sources on lxr [1].

> If the IRQ request failing is not an error, then at the very least the call
> to disable_irq() should be avoided if request_irq() fails, and the message
> should be changed from dev_err() to dev_dbg() or dev_info().

agree.

>> The device may still function without this wake interrupt.
>>
> 
> That's correct, the binding says that the "interrupts" property in the child
> node is optional since it is just a wakeup IRQ. Now the question is whether it
> should be an error if the IRQ is defined but fails to be requested.

Clearly it indicates an error in the DT specification so behavior is not
as expected. Personally I would indeed consider it an error, but I was
just indicating that it might have been done like this intentionally.

Regards,
Arend

[1]
http://lxr.free-electrons.com/source/drivers/net/wireless/marvell/mwifiex/sdio.c#L192

>> Regards,
>> Arend
>>
> 
> Best regards,
> 


Re: [PATCH iproute2] ip: report IFLA_GSO_MAX_SIZE and IFLA_GSO_MAX_SEGS

2016-08-18 Thread Stephen Hemminger
On Wed, 17 Aug 2016 14:39:21 -0700
Eric Dumazet  wrote:

> From: Eric Dumazet 
> 
> kernel support for these attributes was added in linux-4.6
> 
> Signed-off-by: Eric Dumazet 


What about setting them?


[PATCH net] net: ipv4: fix sparse error in fib_good_nh()

2016-08-18 Thread Eric Dumazet
From: Eric Dumazet 

Fixes following sparse errors :

net/ipv4/fib_semantics.c:1579:61: warning: incorrect type in argument 2
(different base types)
net/ipv4/fib_semantics.c:1579:61:expected unsigned int [unsigned]
[usertype] key
net/ipv4/fib_semantics.c:1579:61:got restricted __be32 const
[usertype] nh_gw


Fixes: a6db4494d218c ("net: ipv4: Consider failed nexthops in multipath routes")
Signed-off-by: Eric Dumazet 
---
 net/ipv4/fib_semantics.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 539fa264e67d..8066ccc48a17 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1576,7 +1576,8 @@ static bool fib_good_nh(const struct fib_nh *nh)
 
rcu_read_lock_bh();
 
-   n = __ipv4_neigh_lookup_noref(nh->nh_dev, nh->nh_gw);
+   n = __ipv4_neigh_lookup_noref(nh->nh_dev,
+ (__force u32)nh->nh_gw);
if (n)
state = n->nud_state;
 




Re: [PATCH net-next] dsa: mv88e6xxx: Timeout based on iterations

2016-08-18 Thread Andrew Lunn
> >  static int mv88e6xxx_wait(struct mv88e6xxx_chip *chip, int addr, int reg,
> >   u16 mask)
> >  {
> > -   unsigned long timeout = jiffies + HZ / 10;
> > +   int i;
> >  
> > -   while (time_before(jiffies, timeout)) {
> > +   for (i = 0; i < 16; i++) {
> > u16 val;
> > int err;
> >  
> 
> Since we remove the elapsed time here, can we use mv88e6xxx_wait in
> mv88e6xxx_update? It'd be good to have a consistent wait routine
> everywhere.

Hi Vivien

Yes, it looks like that should work. I will add a second patch to do
this.

Andrew
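
For readers following the thread, the iteration-bounded wait being
discussed can be sketched in plain C as below. This is an illustration
under stated assumptions, not the driver code: the accessor type, the
fake register and the function names are hypothetical, and the short
sleep a real driver would perform between polls is omitted.

```c
#include <errno.h>
#include <stdint.h>

#define WAIT_ITERATIONS 16	/* same bound as the quoted loop */

/* Hypothetical accessor: returns 0 and fills *val, or a negative errno. */
typedef int (*reg_read_fn)(void *ctx, int addr, int reg, uint16_t *val);

/* Poll until all bits in mask are clear, giving up after a fixed count. */
static int wait_bits_clear(reg_read_fn read, void *ctx, int addr, int reg,
			   uint16_t mask)
{
	int i;

	for (i = 0; i < WAIT_ITERATIONS; i++) {
		uint16_t val;
		int err;

		err = read(ctx, addr, reg, &val);
		if (err)
			return err;
		if (!(val & mask))
			return 0;
		/* a real driver would sleep briefly before retrying */
	}

	return -ETIMEDOUT;
}

/* Fake register for demonstration: reports busy for busy_reads polls. */
struct fake_reg {
	int busy_reads;
};

static int fake_read(void *ctx, int addr, int reg, uint16_t *val)
{
	struct fake_reg *f = ctx;

	(void)addr;
	(void)reg;
	if (f->busy_reads > 0) {
		f->busy_reads--;
		*val = 0x8000;
	} else {
		*val = 0;
	}
	return 0;
}
```

Bounding the loop by iterations rather than elapsed jiffies means a
preempted caller still gets its full number of polls before giving up.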


[PATCH net-next 5/5] net: dsa: bcm_sf2: Remove probing through old DSA binding

2016-08-18 Thread Florian Fainelli
Remove our dsa_switch_driver::drv_probe callback to prevent probing
through the old DSA binding, not that this could happen anymore now that
we have moved the matching compatible string from net/dsa/dsa.c to
drivers/net/dsa/bcm_sf2.c, so this is essentially dead code.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c | 15 ---
 1 file changed, 15 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 9f8e007b01d6..8e6fe13dbec3 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -136,20 +136,6 @@ static int bcm_sf2_sw_get_sset_count(struct dsa_switch *ds)
return BCM_SF2_STATS_SIZE;
 }
 
-static const char *bcm_sf2_sw_drv_probe(struct device *dsa_dev,
-   struct device *host_dev, int sw_addr,
-   void **_priv)
-{
-   struct bcm_sf2_priv *priv;
-
-   priv = devm_kzalloc(dsa_dev, sizeof(*priv), GFP_KERNEL);
-   if (!priv)
-   return NULL;
-   *_priv = priv;
-
-   return "Broadcom Starfighter 2";
-}
-
 static void bcm_sf2_imp_vlan_setup(struct dsa_switch *ds, int cpu_port)
 {
struct bcm_sf2_priv *priv = ds_to_priv(ds);
@@ -1592,7 +1578,6 @@ static int bcm_sf2_sw_setup(struct dsa_switch *ds)
 
 static struct dsa_switch_driver bcm_sf2_switch_driver = {
.tag_protocol   = DSA_TAG_PROTO_BRCM,
-   .probe  = bcm_sf2_sw_drv_probe,
.setup  = bcm_sf2_sw_setup,
.set_addr   = bcm_sf2_sw_set_addr,
.get_phy_flags  = bcm_sf2_sw_get_phy_flags,
-- 
2.7.4



[net-next 15/15] i40e: Expose 'trust' flag to userspace via ndo_get_vf_config.

2016-08-18 Thread Jeff Kirsher
From: Sridhar Samudrala 

This enables
ip -d l
to indicate if trust is on or off for VFs.

Signed-off-by: Sridhar Samudrala 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 68e3482..da34235 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -3000,6 +3000,7 @@ int i40e_ndo_get_vf_config(struct net_device *netdev,
else
ivi->linkstate = IFLA_VF_LINK_STATE_DISABLE;
ivi->spoofchk = vf->spoofchk;
+   ivi->trusted = vf->trusted;
ret = 0;
 
 error_param:
-- 
2.7.4



[net-next 11/15] i40e: fix memory leak

2016-08-18 Thread Jeff Kirsher
From: Mitch Williams 

When we allocate memory, we must free it. It's simple courtesy.

Change-ID: Id007294096fb53344f1a8b9a0f78eddf9853c5d6
Signed-off-by: Mitch Williams 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 5ea659c..68e3482 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2318,6 +2318,7 @@ err:
/* send the response back to the VF */
aq_ret = i40e_vc_send_msg_to_vf(vf, I40E_VIRTCHNL_OP_GET_RSS_HENA_CAPS,
aq_ret, (u8 *)vrh, len);
+   kfree(vrh);
return aq_ret;
 }
 
-- 
2.7.4



[net-next 10/15] i40e: fix lookup table when RSS disabled/enabled

2016-08-18 Thread Jeff Kirsher
From: Alan Brady 

This patch fixes the bug which causes RSS to continue to work
after being disabled.  After disabling RSS, traffic would continue
to be assigned to different queues instead of falling back to a
single queue. Without this patch, attempting to disable RSS would
not work as expected. This patch fixes the bug by clearing the
lookup table used by RSS such that all traffic is assigned to a
single queue.  This patch also addresses the issue of reinstating
 the lookup table should RSS then be re-enabled.

Change-ID: Ib20c7c6a7e9f1f772bb787370f8a8c664796b141
Signed-off-by: Alan Brady 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 8c68ac1..704cd7d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -8691,6 +8691,28 @@ bool i40e_set_ntuple(struct i40e_pf *pf, 
netdev_features_t features)
 }
 
 /**
+ * i40e_clear_rss_lut - clear the rx hash lookup table
+ * @vsi: the VSI being configured
+ **/
+static void i40e_clear_rss_lut(struct i40e_vsi *vsi)
+{
+   struct i40e_pf *pf = vsi->back;
+   struct i40e_hw *hw = &pf->hw;
+   u16 vf_id = vsi->vf_id;
+   u8 i;
+
+   if (vsi->type == I40E_VSI_MAIN) {
+   for (i = 0; i <= I40E_PFQF_HLUT_MAX_INDEX; i++)
+   wr32(hw, I40E_PFQF_HLUT(i), 0);
+   } else if (vsi->type == I40E_VSI_SRIOV) {
+   for (i = 0; i <= I40E_VFQF_HLUT_MAX_INDEX; i++)
+   i40e_write_rx_ctl(hw, I40E_VFQF_HLUT1(i, vf_id), 0);
+   } else {
+   dev_err(&pf->pdev->dev, "Cannot set RSS LUT - invalid VSI type\n");
+   }
+}
+
+/**
  * i40e_set_features - set the netdev feature flags
  * @netdev: ptr to the netdev being adjusted
  * @features: the feature set that the stack is suggesting
@@ -8703,6 +8725,12 @@ static int i40e_set_features(struct net_device *netdev,
struct i40e_pf *pf = vsi->back;
bool need_reset;
 
+   if (features & NETIF_F_RXHASH && !(netdev->features & NETIF_F_RXHASH))
+   i40e_pf_config_rss(pf);
+   else if (!(features & NETIF_F_RXHASH) &&
+netdev->features & NETIF_F_RXHASH)
+   i40e_clear_rss_lut(vsi);
+
if (features & NETIF_F_HW_VLAN_CTAG_RX)
i40e_vlan_stripping_enable(vsi);
else
-- 
2.7.4
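
The enable/disable detection in the i40e_set_features() hunk above
compares the requested feature mask with the current one. A minimal
standalone sketch of that comparison follows; the enum and function
names are hypothetical and the flag value used in any caller is an
arbitrary stand-in, not the real NETIF_F_RXHASH bit.

```c
#include <stdint.h>

enum feature_change {
	FEATURE_UNCHANGED,
	FEATURE_ENABLED,	/* was off, now requested on */
	FEATURE_DISABLED,	/* was on, now requested off */
};

/* Hypothetical helper: classify how one feature bit changes. */
static enum feature_change feature_transition(uint64_t current,
					      uint64_t requested,
					      uint64_t flag)
{
	int was_on = (current & flag) != 0;
	int want_on = (requested & flag) != 0;

	if (want_on && !was_on)
		return FEATURE_ENABLED;
	if (!want_on && was_on)
		return FEATURE_DISABLED;
	return FEATURE_UNCHANGED;
}
```

In the patch, the "enabled" case reconfigures RSS and the "disabled"
case clears the lookup table; nothing is done when the bit is unchanged.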



Re: [PATCH net] net: ipv4: fix sparse error in fib_good_nh()

2016-08-18 Thread David Ahern
On 8/18/16 11:19 AM, Eric Dumazet wrote:
> From: Eric Dumazet 
> 
> Fixes following sparse errors :
> 
> net/ipv4/fib_semantics.c:1579:61: warning: incorrect type in argument 2
> (different base types)
> net/ipv4/fib_semantics.c:1579:61:expected unsigned int [unsigned]
> [usertype] key
> net/ipv4/fib_semantics.c:1579:61:got restricted __be32 const
> [usertype] nh_gw
> 
> 
> Fixes: a6db4494d218c ("net: ipv4: Consider failed nexthops in multipath routes")
> Signed-off-by: Eric Dumazet 
> ---
>  net/ipv4/fib_semantics.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)


Acked-by: David Ahern 

Thanks, Eric.
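
For context on the warning being fixed: sparse treats byte-order types
such as __be32 as distinct "bitwise" types, and a cast annotated with
__force tells it the conversion is intentional. The sketch below shows
the idea in userspace. The macro and type names (my_bitwise, my_force,
be32_t) and the function are hypothetical stand-ins for the kernel's
__bitwise, __force and __be32; the annotations only take effect when the
file is run through sparse, which defines __CHECKER__.

```c
#include <stdint.h>

#ifdef __CHECKER__
#define my_bitwise	__attribute__((bitwise))
#define my_force	__attribute__((force))
#else
/* A normal compiler sees plain integers and plain casts. */
#define my_bitwise
#define my_force
#endif

/* Stand-in for __be32: a 32-bit value in network byte order. */
typedef uint32_t my_bitwise be32_t;

/*
 * The raw bits are used only as a lookup key, so no byte swap is
 * wanted; the forced cast documents that and silences the checker.
 */
static uint32_t lookup_key_from_be32(be32_t addr)
{
	return (my_force uint32_t)addr;
}
```

The cast changes nothing at run time; it only records that the address
is deliberately used as an opaque key.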



Re: [PATCH net-next 1/2] net: phy: Sort Makefile and Kconfig

2016-08-18 Thread Florian Fainelli
On 08/18/2016 02:56 PM, Andrew Lunn wrote:
> Sort the files to reduce merge conflicts and to make it easier to find
> drivers by name. Also separate the MDIO bus drivers from the PHY
> drivers, again to help find what you need.
> 
> Signed-off-by: Andrew Lunn 
> ---
>  drivers/net/phy/Kconfig  | 427 
> ---
>  drivers/net/phy/Makefile |  76 +
>  2 files changed, 253 insertions(+), 250 deletions(-)

I trust the diffstat which looks stable here, thanks for doing this!

Acked-by: Florian Fainelli 
-- 
Florian


[net-next 08/15] i40e: Force register writes to mitigate sync issues with iwarp VF driver

2016-08-18 Thread Jeff Kirsher
From: Avinash Dayanand 

This patch is a fix for the bug i.e. unable to create iwarp device
in VF. This is a sync issue and the iwarp device open is called even
before the PCI register writes are done.

Forcing the PCI register writes to happen just before it exits the
function.

Change-ID: I60c6a2c709da89e845f2764cc50ce8b7373c8c44
Signed-off-by: Avinash Dayanand 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_client.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_client.c 
b/drivers/net/ethernet/intel/i40e/i40e_client.c
index 8726269..90b435c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_client.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_client.c
@@ -793,7 +793,8 @@ static int i40e_client_setup_qvlist(struct i40e_info *ldev,
wr32(hw, I40E_PFINT_AEQCTL, reg);
}
}
-
+   /* Mitigate sync problems with iwarp VF driver */
+   i40e_flush(hw);
return 0;
 err:
kfree(ldev->qvlist_info);
-- 
2.7.4



Re: [PATCH iproute2] ip: report IFLA_GSO_MAX_SIZE and IFLA_GSO_MAX_SEGS

2016-08-18 Thread Eric Dumazet
On Thu, 2016-08-18 at 14:52 -0700, Stephen Hemminger wrote:
> On Wed, 17 Aug 2016 14:39:21 -0700
> Eric Dumazet  wrote:
> 
> > From: Eric Dumazet 
> > 
> > kernel support for these attributes was added in linux-4.6
> > 
> > Signed-off-by: Eric Dumazet 
> 
> 
> What about setting them?

This is not something the kernel does at this moment.

Drivers usually set a value, depending on hardware or software
constraints.





[RFC PATCH] net: diag: support SOCK_DESTROY for UDP sockets

2016-08-18 Thread David Ahern
This implements SOCK_DESTROY for UDP sockets similar to what was done
for TCP with commit c1e64e298b8ca ("net: diag: Support destroying TCP
sockets.") A process with a UDP socket targeted for destroy is awakened
and recvmsg fails with ECONNABORTED giving the process notice that
it should close the socket.

Signed-off-by: David Ahern 
---
 include/net/udp.h   |  8 
 net/ipv4/udp.c  | 58 +
 net/ipv4/udp_diag.c | 55 ++
 net/ipv6/udp.c  | 45 +
 4 files changed, 166 insertions(+)

diff --git a/include/net/udp.h b/include/net/udp.h
index 8894d7144189..5c3443634b4d 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -251,6 +251,7 @@ int udp_get_port(struct sock *sk, unsigned short snum,
 int (*saddr_cmp)(const struct sock *,
  const struct sock *));
 void udp_err(struct sk_buff *, u32);
+int udp_abort(struct sock *sk, int err);
 int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
 int udp_push_pending_frames(struct sock *sk);
 void udp_flush_pending_frames(struct sock *sk);
@@ -274,6 +275,9 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 
saddr, __be16 sport,
   struct udp_table *tbl, struct sk_buff *skb);
 struct sock *udp4_lib_lookup_skb(struct sk_buff *skb,
 __be16 sport, __be16 dport);
+struct sock *udp4_lib_lookup_full(struct net *net, __be32 saddr,
+ __be16 sport, __be32 daddr, __be16 dport,
+ int dif, struct udp_table *table);
 struct sock *udp6_lib_lookup(struct net *net,
 const struct in6_addr *saddr, __be16 sport,
 const struct in6_addr *daddr, __be16 dport,
@@ -286,6 +290,10 @@ struct sock *__udp6_lib_lookup(struct net *net,
 struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
 __be16 sport, __be16 dport);
 
+struct sock *udp6_lib_lookup_full(struct net *net,
+ const struct in6_addr *saddr, __be16 sport,
+ const struct in6_addr *daddr, __be16 dport,
+ int dif, struct udp_table *table);
 /*
  * SNMP statistics for UDP and UDP-Lite
  */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e61f7cd65d08..e5b72a7165e1 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -594,6 +594,47 @@ struct sock *udp4_lib_lookup(struct net *net, __be32 
saddr, __be16 sport,
 EXPORT_SYMBOL_GPL(udp4_lib_lookup);
 #endif
 
+/* caller should hold rcu lock */
+struct sock *udp4_lib_lookup_full(struct net *net, __be32 saddr,
+ __be16 sport, __be32 daddr, __be16 dport,
+ int dif, struct udp_table *table)
+{
+   struct sock *sk;
+   int i;
+
+   for (i = 0; i <= table->mask; i++) {
+   struct udp_hslot *hslot = &table->hash[i];
+
+   sk_for_each_rcu(sk, &hslot->head) {
+   struct inet_sock *inet;
+
+   if (!net_eq(sock_net(sk), net) ||
+   ipv6_only_sock(sk))
+   continue;
+
+   if (dif && sk->sk_bound_dev_if != dif)
+   continue;
+
+   if (sk->sk_family != PF_INET)
+   continue;
+
+   if (udp_sk(sk)->udp_port_hash != ntohs(sport))
+   continue;
+
+   inet = inet_sk(sk);
+   if (inet->inet_rcv_saddr != saddr  ||
+   inet->inet_daddr != daddr  ||
+   inet->inet_dport != dport)
+   continue;
+
+   return sk;
+   }
+   }
+
+   return NULL;
+}
+EXPORT_SYMBOL(udp4_lib_lookup_full);
+
 static inline bool __udp_is_mcast_sock(struct net *net, struct sock *sk,
   __be16 loc_port, __be32 loc_addr,
   __be16 rmt_port, __be32 rmt_addr,
@@ -2192,6 +2233,22 @@ unsigned int udp_poll(struct file *file, struct socket 
*sock, poll_table *wait)
 }
 EXPORT_SYMBOL(udp_poll);
 
+int udp_abort(struct sock *sk, int err)
+{
+   lock_sock(sk);
+
+   sk->sk_err = err;
+   sk->sk_error_report(sk);
+   udp_disconnect(sk, 0);
+
+   release_sock(sk);
+
+   sock_put(sk);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(udp_abort);
+
 struct proto udp_prot = {
.name  = "UDP",
.owner = THIS_MODULE,
@@ -2223,6 +2280,7 @@ struct proto udp_prot = {
.compat_getsockopt = compat_udp_getsockopt,
 #endif
.clear_sk  = sk_prot_clear_portaddr_nulls,
+   .diag_destroy  = 

[PATCH net] sit: ipip6_valid_ip_proto() is static

2016-08-18 Thread Eric Dumazet
From: Eric Dumazet 

Fixes this sparse error :
net/ipv6/sit.c:1129:6: warning: symbol 'ipip6_valid_ip_proto' was not
declared. Should it be static?


Fixes: 49dbe7ae2168b ("sit: support MPLS over IPv4")
Signed-off-by: Eric Dumazet 
---
 net/ipv6/sit.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 182b6a9be29d..d6da9febb12d 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -1126,7 +1126,7 @@ static int ipip6_tunnel_update_6rd(struct ip_tunnel *t,
 }
 #endif
 
-bool ipip6_valid_ip_proto(u8 ipproto)
+static bool ipip6_valid_ip_proto(u8 ipproto)
 {
return ipproto == IPPROTO_IPV6 ||
ipproto == IPPROTO_IPIP ||




[net-next 06/15] i40e: Fix static analysis tool warning

2016-08-18 Thread Jeff Kirsher
From: Dave Ertman 

This patch fixes a problem where a static analysis tool generates
a warning for "INVARIANT_CONDITION: Expression 'enabled_tc' used
in the condition always yields the same result."

Without this patch, the driver will not pass the static analysis
tool checks without generating warnings.

This patch fixes the problem by eliminating the irrelevant check
and redundant assignment for the value of enabled_tc.

Change-ID: Ia7d44cb050f507df7de333e96369d322e08bf408
Signed-off-by: Dave Ertman 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9f9d24f..8c68ac1 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -4617,7 +4617,7 @@ static u8 i40e_dcb_get_enabled_tc(struct i40e_dcbx_config 
*dcbcfg)
 static u8 i40e_pf_get_num_tc(struct i40e_pf *pf)
 {
struct i40e_hw *hw = &pf->hw;
-   u8 i, enabled_tc;
+   u8 i, enabled_tc = 1;
u8 num_tc = 0;
struct i40e_dcbx_config *dcbcfg = &hw->local_dcbx_config;
 
@@ -4635,8 +4635,6 @@ static u8 i40e_pf_get_num_tc(struct i40e_pf *pf)
else
return 1; /* Only TC0 */
 
-   /* At least have TC0 */
-   enabled_tc = (enabled_tc ? enabled_tc : 0x1);
for (i = 0; i < I40E_MAX_TRAFFIC_CLASS; i++) {
if (enabled_tc & BIT(i))
num_tc++;
-- 
2.7.4



Re: [PATCH] mwifiex: propagate error if IRQ request fails in mwifiex_sdio_of()

2016-08-18 Thread Javier Martinez Canillas
Hello Arend,

On 08/18/2016 03:49 PM, Arend van Spriel wrote:
> 
> 
> On 18-08-16 21:29, Javier Martinez Canillas wrote:
>> Hello Arend,
>>
>> Thanks a lot for your feedback.
>>
>> On 08/18/2016 03:14 PM, Arend van Spriel wrote:
>>> On 18-08-16 16:17, Javier Martinez Canillas wrote:
>>>> If request_irq() fails in mwifiex_sdio_probe_of(), only an error message
>>>> is printed but the actual error is not propagated to the caller function.
>>>
>>> Hmm. The caller function, ie. mwifiex_sdio_probe(), does not seem to care.
>>>
>>
>> Hmm, I'm not so sure about that. It's checking the mwifiex_sdio_probe_of()
>> return value.
> 
> Ok. I looked at 4.7 sources on lxr [1].
>

Oh, right. That was fixed quite recently indeed.
 
>> If the IRQ request failing is not an error, then at the very least the call
>> to disable_irq() should be avoided if request_irq() fails, and the message
>> should be changed from dev_err() to dev_dbg() or dev_info().
> 
> agree.
> 
>>> The device may still function without this wake interrupt.
>>>
>>
>> That's correct, the binding says that the "interrupts" property in the child
>> node is optional since is just a wakeup IRQ. Now the question is if should
>> be an error if the IRQ is defined but fails to be requested.
> 
> Clearly it indicates an error in the DT specification so behavior is not
> as expected. Personally I would indeed consider it an error, but I was
> just indicating that it might have been done like this intentionally.
>

Yes, it might have been done intentionally indeed but I don't think that is
the case since the driver lacked error checking and propagation in many
places. But if someone thinks it's better to not honor the DT and at
least have the driver working without the wake up capability, then I'm
happy to respin the patch and change the print log level to info/debug.
 
> Regards,
> Arend
> 

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America


Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection

2016-08-18 Thread Florian Westphal
Eric Dumazet  wrote:
> On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> > commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> > added the main infrastructure that is needed for per-connection
> > randomization, in particular writing/reading the on-wire tcp header
> > format takes the offset into account so rest of stack can use normal
> > tcp_time_stamp (jiffies).
> > 
> > So only two items are left:
> >  - add a tsoffset for request sockets
> >  - extend the tcp isn generator to also return another 32bit number
> >  in addition to the ISN.
> > 
> > Re-use of ISN generator also means timestamps are still monotonically
> > increasing for same connection quadruple.
> 
> I like the idea, but the implementation looks a bit complex.
> 
> Instead of initializing tsoffset to 0, we could simply use
> 
> jhash(src_addr, dst_addr, boot_time_rnd)
> 
> This way, even syncookies would be handled, and we do not need to
> increase tcp_request_sock size.

True, however I think it would be fairly easy to discover
boot_time_rnd given a few outputs, as jhash is not a cryptographic hash, no?

If that's not a concern I can just use jhash (not taking ports
into account doesn't seem to be a problem).

Alternatively (if tcp_request_sock increase/complexity is a problem)
I could either call the isn generator again, or add an extra function
for it (again using md5), I did not do this because I was afraid
it would be too expensive to do two md5 calculations.

Thanks for reviewing!

For cookies I had planned to just extend the cookie sha1 similar
to isn generator here, alternatives welcome.


[PATCH net] udp: include addrconf.h

2016-08-18 Thread Eric Dumazet
From: Eric Dumazet 

Include ipv4_rcv_saddr_equal() definition to avoid this sparse error :

net/ipv4/udp.c:362:5: warning: symbol 'ipv4_rcv_saddr_equal' was not
declared. Should it be static?

Signed-off-by: Eric Dumazet 
---
 net/ipv4/udp.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e61f7cd65d08..8f5f7f6026f7 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -114,6 +114,7 @@
 #include 
 #include "udp_impl.h"
 #include 
+#include 
 
 struct udp_table udp_table __read_mostly;
 EXPORT_SYMBOL(udp_table);




Re: [PATCH net-next 2/2] net: phy: Kconfig: Remove redundant "Support for"

2016-08-18 Thread Florian Fainelli
On 08/18/2016 02:56 PM, Andrew Lunn wrote:
> Remove the redundant "Support for" and "Drivers for" from the Kconfig
> short description. This makes the manufacturer much more prominent in
> the list and makes the shortcut keys useful.
> 
> Signed-off-by: Andrew Lunn 

Acked-by: Florian Fainelli 
-- 
Florian


Re: [PATCH iproute2] ip route: restore_handler should check tb[RTA_PREFSRC] for local networks

2016-08-18 Thread Stephen Hemminger
On Sun,  7 Aug 2016 17:12:30 +0800
Xin Long  wrote:

> Prior to this patch, If one route entry's RTA_PREFSRC and RTA_GATEWAY
> both were NULL, it was supposed to be restored ONLY as a local address.
> 
> But as it didn't check tb[RTA_PREFSRC] when restoring local networks,
> rtattr_cmp would return a success if it was NULL, this route entry would
> be restored again as a local network.
> 
> This patch is to add tb[RTA_PREFSRC] check when restoring local networks.
> 
> Fixes: 74af8dd9620e ("ip route: restore route entries in correct order")
> Signed-off-by: Xin Long 

Applied


Re: [PATCH] mwifiex: propagate error if IRQ request fails in mwifiex_sdio_of()

2016-08-18 Thread Arend van Spriel
On 18-08-16 16:17, Javier Martinez Canillas wrote:
> If request_irq() fails in mwifiex_sdio_probe_of(), only an error message
> is printed but the actual error is not propagated to the caller function.

Hmm. The caller function, ie. mwifiex_sdio_probe(), does not seem to care.

The device may still function without this wake interrupt.

Regards,
Arend

> Signed-off-by: Javier Martinez Canillas 
> ---
> 
>  drivers/net/wireless/marvell/mwifiex/sdio.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c 
> b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index d3e1561ca075..00727936ad6e 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -125,6 +125,7 @@ static int mwifiex_sdio_probe_of(struct device *dev, 
> struct sdio_mmc_card *card)
>   dev_err(dev,
>   "Failed to request irq_wifi %d (%d)\n",
>   cfg->irq_wifi, ret);
> + return ret;
>   }
>   disable_irq(cfg->irq_wifi);
>   }
> 


Re: [PATCH net] sctp: fix a success return may hide an error

2016-08-18 Thread 'Marcelo Ricardo Leitner'
On Wed, Aug 17, 2016 at 09:01:38AM +, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 16 August 2016 18:25
> ...
> > > That doesn't seem a good idea.
> > > You don't want to abort the association if there is a transient
> > > memory allocation failure.
> > > You also can't drop data chunks.
> > 
> > From a system-wise POV, this behavior - to free the new asoc in case of
> > transient memory allocation failure - doesn't seem bad to me.
> > That's what will have to happen if any allocation before it failed and
> > also it helps the system to reduce the stress a little bit. I don't see
> > any inconsistency/problems here because we are not dropping a single
> > random chunk but instead we are actually refusing to initialize a new
> > asoc in such conditions.
> 
> Failing a new association should be ok, whether purists will like
> connect() failing ENOMEM is another matter.
> 

Good point.

> > Nevertheless, I agree that letting the application see ENOMEM errors when
> > the data actually got queued and is being fully handled, as in, it will
> > be retransmitted later, is not wise, as the application probably
> > won't be able to distinguish from ENOMEMs that it should retry or not.
> 
> I think an application would be justified in thinking that an ENOMEM return
> meant that the system call had no effect.
> 

Yep

> For send() even ENOMEM is really wrong, it should be treated as 'flow control'
> and either block or return EAGAIN/EWOULDBLOCK.

Agreed.

> Getting POLLOUT set is left as an exercise to the reader :-)
> 

:-)

> ...
> > Well, it may be, but we are trying to improve it.  Please continue
> > discussing the fixes so we can keep improving it. :)
> 
> Indeed, we have customers who use sctp (for M3UA).
> We don't do anything 'complicated', but do end up sending a lot of short
> data chunks.
> 
>   David
> 
> 
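
For readers following the thread, the "treat it as flow control" point above
can be sketched from the application side. This is only a user-space
illustration under stated assumptions: send_with_backpressure() is a
hypothetical helper, it is written for a byte-stream socket (SCTP message
semantics differ), and it says nothing about what the kernel SCTP code does
internally. EAGAIN/EWOULDBLOCK means "wait for POLLOUT and retry"; any other
error, including ENOMEM, is passed up unchanged.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: send the whole buffer without blocking in send().
 * Returns 0 on success or a negative errno value on a hard error.
 */
static int send_with_backpressure(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	assert(buf != NULL || len == 0);

	while (len > 0) {
		ssize_t n = send(fd, p, len, MSG_DONTWAIT);

		if (n >= 0) {
			/* Partial progress: advance and keep going. */
			p += n;
			len -= (size_t)n;
			continue;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK) {
			/* Flow control: wait until the socket is writable. */
			struct pollfd pfd = { .fd = fd, .events = POLLOUT };

			if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
				return -errno;
			continue;
		}
		/* Anything else (ENOMEM included) is reported to the caller. */
		return -errno;
	}
	return 0;
}
```

With this shape the caller only ever sees errors that mean the data was not
accepted, which is the property being asked for in the discussion.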


Re: [PATCH net-next 2/3] net: mpls: Fixups for GSO

2016-08-18 Thread David Ahern
On 8/18/16 8:37 AM, Alexander Duyck wrote:
> Thought I would go through and do a second pass since it sounds like
> the inner_mac_header idea isn't going to fly.  If we can't push this
> as an L2 encapsulation there are few tweaks we probably need in order
> to make this work as an L3.  I have included comments inline below.
> 
> Also I haven't worked with MPLS much before.  Is there a simple way to
> setup an MPLS tunnel between two hosts connected back to back so that
> I could try testing a few things related to this patch?

Here are the commands that I use for VMs - copy and paste. It is an adaptation of 
Lennert's namespace script. VM id's are local to my host. Network addresses are 
10.100.1.x/24 and 2100:1::x/120 on eth1 of the respective node. Includes MPLS 
encap, IP-IP encap and none to compare performances.

VM2
===
modprobe mpls_router
modprobe mpls_gso
modprobe mpls_iptunnel

sysctl -w net.mpls.platform_labels=1000
ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.3
ip -6 route add 3000:1::1/128 encap mpls 101 via inet6 2100:1::3

ip tunnel add tun0 mode ipip remote 10.100.1.3
ip link set dev tun0 up
ip route add 10.10.10.11/32 dev tun0

ip route add 10.10.10.12/32 via inet 10.100.1.3
ip -6 route add 3000:1::3/128 via inet6 2100:1::3


VM3
===
modprobe mpls_router
modprobe mpls_gso
modprobe mpls_iptunnel

sysctl -w net.mpls.conf.eth1.input=1
sysctl -w net.mpls.platform_labels=1000
ip -f mpls route add 100 via inet 10.100.2.4
ip -f mpls route add 101 via inet6 2100:2::4

ip tunnel add tun0 mode ipip remote 10.100.1.2
ip link set dev tun0 up
ip ro add 10.10.10.11/32 via 10.100.2.4

ip ro add 10.10.10.12/32 via 10.100.2.4
ip -6 route add 3000:1::3/128 via inet6 2100:2::4


VM4
===
ip addr add 10.10.10.10/32 dev lo
ip addr add 10.10.10.11/32 dev lo
ip addr add 10.10.10.12/32 dev lo

ip -6 addr add 3000:1::1/128 dev lo
ip -6 addr add 3000:1::2/128 dev lo
ip -6 addr add 3000:1::3/128 dev lo

netserver


Go back to VM2:

ping -c 1 10.10.10.10
ping -c 1 10.10.10.11
ping -c 1 10.10.10.12

netperf -c -C -H 10.10.10.10  -l 10 -t TCP_STREAM
netperf -c -C -H 10.10.10.11  -l 10 -t TCP_STREAM
netperf -c -C -H 10.10.10.12  -l 10 -t TCP_STREAM


I'll take a look at your other comments today.



Re: [ethtool PATCH v2 4/4] ethtool: Enhancing link mode bits to support 25G/50G/100G

2016-08-18 Thread John W. Linville
Ping?

On Tue, Aug 02, 2016 at 03:49:57PM -0400, John W. Linville wrote:
> Are these revised patches still on their way?
> 
> John
> 
> On Sun, Jul 24, 2016 at 01:57:14AM -0700, Vidya Sagar Ravipati wrote:
> > Yuval,
> > I will try to resubmit the patches this week with updated comments
> > 
> > Thanks
> > Vidya Sagar
> > 
> > On Sat, Jul 23, 2016 at 10:57 PM, Yuval Mintz 
> > wrote:
> > 
> > > > Enhancing link mode bits to support 25G/50G/100G for supported and
> > > > advertised speed mode bits
> > > >
> > > > Signed-off-by: Vidya Sagar Ravipati 
> > > > ---
> > > >  ethtool.c | 27 +++
> > > >  1 file changed, 27 insertions(+)
> > >
> > > Hi Vidya,
> > >
> > > Are you re-trying your series anytime soon?
> > >
> > > If not, can we simply push this [and ethtool-copy.h], as those are needed
> > > for querying/setting the recently added new speeds.
> > >
> > >
> 
> -- 
> John W. Linville        Someday the world will need a hero, and you
> linvi...@tuxdriver.com  might be all we have.  Be ready.
> 

-- 
John W. Linville        Someday the world will need a hero, and you
linvi...@tuxdriver.com  might be all we have.  Be ready.


Re: [PATCH 0/2] wireless: Use complete() instead complete_all()

2016-08-18 Thread Luis R. Rodriguez
On Thu, Aug 18, 2016 at 03:12:04PM +0200, Daniel Wagner wrote:
> This series ignores all complete_all() usages in the firmware loading
> path. They will hopefully be addressed by Luis' sysdata patches [0].
> That leaves a couple of complete_all() calls.

I had not considered this as a gain, but glad to know the sysdata series
could help with RT as well, thanks for the clarification.

 Luis


[PATCH net-next 1/3] dsa: mv88e6xxx: Timeout based on iterations, not time

2016-08-18 Thread Andrew Lunn
The mv88e6xxx driver times out operations on the switch by
looping until an elapsed wall clock time is reached. However, if
usleep_range() sleeps much longer than expected, it could time out with
an error without actually checking to see if the device has completed
the operation. So replace the elapsed time with a fixed upper bound on
the number of loops.

Testing on various switches has shown that switches take either 0 or
1 iteration, so a maximum of 16 iterations is a safe limit.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index a230fcba5b64..ac8e9af4879f 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -309,9 +309,9 @@ static int mv88e6xxx_serdes_write(struct mv88e6xxx_chip 
*chip, int reg, u16 val)
 static int mv88e6xxx_wait(struct mv88e6xxx_chip *chip, int addr, int reg,
  u16 mask)
 {
-   unsigned long timeout = jiffies + HZ / 10;
+   int i;
 
-   while (time_before(jiffies, timeout)) {
+   for (i = 0; i < 16; i++) {
u16 val;
int err;
 
@@ -375,7 +375,7 @@ static int _mv88e6xxx_reg_write(struct mv88e6xxx_chip 
*chip, int addr,
 static int mv88e6xxx_ppu_disable(struct mv88e6xxx_chip *chip)
 {
int ret;
-   unsigned long timeout;
+   int i;
 
ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_CONTROL);
if (ret < 0)
@@ -386,8 +386,7 @@ static int mv88e6xxx_ppu_disable(struct mv88e6xxx_chip 
*chip)
if (ret)
return ret;
 
-   timeout = jiffies + 1 * HZ;
-   while (time_before(jiffies, timeout)) {
+   for (i = 0; i < 16; i++) {
ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_STATUS);
if (ret < 0)
return ret;
@@ -403,8 +402,7 @@ static int mv88e6xxx_ppu_disable(struct mv88e6xxx_chip 
*chip)
 
 static int mv88e6xxx_ppu_enable(struct mv88e6xxx_chip *chip)
 {
-   int ret, err;
-   unsigned long timeout;
+   int ret, err, i;
 
ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_CONTROL);
if (ret < 0)
@@ -415,8 +413,7 @@ static int mv88e6xxx_ppu_enable(struct mv88e6xxx_chip *chip)
if (err)
return err;
 
-   timeout = jiffies + 1 * HZ;
-   while (time_before(jiffies, timeout)) {
+   for (i = 0; i < 16; i++) {
ret = _mv88e6xxx_reg_read(chip, REG_GLOBAL, GLOBAL_STATUS);
if (ret < 0)
return ret;
-- 
2.8.1
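
The reasoning in the commit message above can be illustrated outside the
kernel. The following is a minimal user-space sketch, not the driver code:
wait_bit_clear(), read_reg_fn and struct fake_dev are hypothetical names, and
the "busy bit" behaviour of the fake device is an assumption made for the
example. The point it shows is that every loop iteration reads the register
before anything else is decided, so a long sleep can never turn into a
timeout without a check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define WAIT_MAX_ITERATIONS 16	/* same bound as chosen in the patch */

/* Hypothetical register accessor; returns 0 or a negative errno value. */
typedef int (*read_reg_fn)(void *ctx, uint16_t *val);

/*
 * Poll until all bits in mask are clear, at most WAIT_MAX_ITERATIONS times.
 * pause_fn (optional) stands in for the sleep between polls.
 */
static int wait_bit_clear(read_reg_fn read_reg, void *ctx, uint16_t mask,
			  void (*pause_fn)(void))
{
	assert(read_reg != NULL);

	for (int i = 0; i < WAIT_MAX_ITERATIONS; i++) {
		uint16_t val;
		int err = read_reg(ctx, &val);

		if (err)
			return err;
		if (!(val & mask))
			return 0;
		if (pause_fn)
			pause_fn();
	}
	return -ETIMEDOUT;
}

/* Fake device: reports busy (bit 15 set) for the first busy_reads reads. */
struct fake_dev {
	int busy_reads;
	int reads;
};

static int fake_read(void *ctx, uint16_t *val)
{
	struct fake_dev *d = ctx;

	d->reads++;
	*val = (d->reads <= d->busy_reads) ? 0x8000 : 0;
	return 0;
}
```

A device that is busy for one read completes on the second poll, while a
device that never clears the bit is read exactly 16 times before -ETIMEDOUT
is returned, regardless of how long each pause takes.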



[PATCH net-next 0/2] PHY Kconfig and Makefile cleanup

2016-08-18 Thread Andrew Lunn
The Ethernet PHY directory has slowly been getting more entries.
Split the entries in the Makefile and Kconfig into MDIO bus drivers
and PHYs. Within these two groups, sort them. This should reduce merge
conflicts and aid finding what one searches for.

The Kconfig text contains redundant "Driver for" and "Support for"
which add little value, make the vendor less obvious, and defeat the
shortcut key in the menu. Remove such text.

Andrew Lunn (2):
  net: phy: Sort Makefile and Kconfig
  net: phy: Kconfig: Remove redundant "Support for"

 drivers/net/phy/Kconfig  | 439 ---
 drivers/net/phy/Makefile |  76 
 2 files changed, 259 insertions(+), 256 deletions(-)

-- 
2.8.1



[PATCH net-next 0/3] Fix mv88e6xxx wait function

2016-08-18 Thread Andrew Lunn
The mv88e6xxx wait function can be upset if the system has lots of
other things to do and a sleep takes a lot longer than expected. Fix
this by using a fixed number of iterations, rather than a fixed
wall clock time.

With that change made, it is possible to consolidate another
wait function.

A wait actually timing out should not happen and when it does, it
means something serious is wrong. Make sure an error is logged,
since not all callers will log an error.

Andrew Lunn (3):
  dsa: mv88e6xxx: Timeout based on iterations, not time
  dsa: mv88e6xxx: Use mv88e6xx_wait in mv88e6xxx_update()
  dsa: mv88e6xxx: Make mv88e6xxx_wait() timeout verbose

 drivers/net/dsa/mv88e6xxx/chip.c | 32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

-- 
2.8.1



[PATCH net-next 3/3] dsa: mv88e6xxx: Make mv88e6xxx_wait() timeout verbose

2016-08-18 Thread Andrew Lunn
When mv88e6xxx_wait() returns a timeout, something bad has
happened. Make sure it is noticed by logging an error.

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 8c846bce4edf..014b52bd72f1 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -325,6 +325,7 @@ static int mv88e6xxx_wait(struct mv88e6xxx_chip *chip, int 
addr, int reg,
usleep_range(1000, 2000);
}
 
+   dev_err(chip->dev, "Timeout while waiting for switch\n");
return -ETIMEDOUT;
 }
 
-- 
2.8.1



Re: [RFC PATCH 3/5] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands

2016-08-18 Thread Daniel Mack
On 08/17/2016 08:10 PM, Alexei Starovoitov wrote:
> On Wed, Aug 17, 2016 at 09:16:02AM -0700, Eric Dumazet wrote:
>> On Wed, 2016-08-17 at 16:00 +0200, Daniel Mack wrote:
>>
>>> +   progp = is_ingress ? >bpf_ingress : >bpf_egress;
>>> +
>>> +   rcu_read_lock();
>>> +   old_prog = rcu_dereference(*progp);
>>> +   rcu_assign_pointer(*progp, prog);
>>> +
>>> +   if (old_prog)
>>> +   bpf_prog_put(old_prog);
>>> +
>>> +   rcu_read_unlock();
>>
>>
>> This is a bogus locking strategy.
> 
> yep. this rcu_lock/unlock won't solve the race between parallel
> bpf_prog_attach calls.
> please use xchg() similar to bpf_fd_array_map_update_elem()
> Then prog pointer will be swapped atomically and bpf_prog_put()
> will free it via call_rcu.
> The reader side in sk_filter_cgroup_bpf() looks correct.

Thanks! I reworked all the bits I got comments on, and fixed some other
details as well. I'll wait some days to see what else shakes out of this
thread, and then post again.

FWIW, the current code can be found here:

  https://github.com/zonque/linux/commits/cg-bpf-syscall


Thanks,
Daniel
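
For readers following the thread, the xchg() suggestion quoted above can be
sketched in user space with C11 atomics. This is an analogy only, under
stated assumptions: struct prog, prog_put() and prog_attach() are
hypothetical stand-ins, atomic_exchange() plays the role of the kernel's
xchg(), and the RCU-deferred free mentioned in the quote is not modelled
(the reference count is simply decremented). What it shows is that each
caller gets back exactly the pointer it displaced, so two parallel attach
calls can never both release the same old program.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted program object. */
struct prog {
	atomic_int refcnt;
};

/* Drop one reference; a real implementation would free at zero. */
static void prog_put(struct prog *p)
{
	if (p)
		atomic_fetch_sub(&p->refcnt, 1);
}

/*
 * Install new_prog in the slot and release whatever was there before.
 * The swap is a single atomic operation, so the "old" pointer seen here
 * is owned by this caller alone.
 */
static void prog_attach(_Atomic(struct prog *) *slot, struct prog *new_prog)
{
	struct prog *old;

	assert(slot != NULL);

	old = atomic_exchange(slot, new_prog);
	prog_put(old);
}
```

Compared with a separate read followed by an assignment, there is no window
in which two writers can observe the same old pointer.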



Re: [PATCH v1 1/1] l2tp: Use existing macros instead of literal number

2016-08-18 Thread Guillaume Nault
On Thu, Aug 18, 2016 at 03:05:19PM +0800, f...@ikuai8.com wrote:
> From: Gao Feng 
> 
> 1. Use PPP_ALLSTATIONS/PPP_UI instead of literal 0xff/0x03;
> 2. Use one static const global fixed_ppphdr instead of two same
> static variable ppph in two different functions;
> 3. Use SEND_SHUTDOWN instead of literal 2;
> 
> Signed-off-by: Gao Feng 
> ---
>  v1: Initial patch
No need to send 'v1' for the initial series.

> --- a/net/l2tp/l2tp_ppp.c
> +++ b/net/l2tp/l2tp_ppp.c
> @@ -138,6 +138,8 @@ static const struct ppp_channel_ops pppol2tp_chan_ops = {
>  
>  static const struct proto_ops pppol2tp_ops;
>  
> +static const unsigned char fixed_ppphdr[2] = {PPP_ALLSTATIONS, PPP_UI};
> +
>  /* Helpers to obtain tunnel/session contexts from sockets.
>   */
>  static inline struct l2tp_session *pppol2tp_sock_to_session(struct sock *sk)
> @@ -174,11 +176,11 @@ static int pppol2tp_recv_payload_hook(struct sk_buff 
> *skb)
>* Note that skb->data[] isn't dereferenced from a u16 ptr here since
>* the field may be unaligned.
>*/
> - if (!pskb_may_pull(skb, 2))
> + if (!pskb_may_pull(skb, sizeof(fixed_ppphdr)))
>   return 1;
>  
> - if ((skb->data[0] == 0xff) && (skb->data[1] == 0x03))
> - skb_pull(skb, 2);
> + if ((PPP_ADDRESS(skb->data) == PPP_ALLSTATIONS) && 
> (PPP_CONTROL(skb->data) == PPP_UI))
> + skb_pull(skb, sizeof(fixed_ppphdr));
>
Sorry, but I find the original code clearer. It's important to be
explicit about what's done with the sk_buff. Hiding skb->data[x] behind
macros certainly doesn't help.

Same thing for the use of sizeof(fixed_ppphdr) in pskb_may_pull(). The
size of fixed_ppphdr isn't used afterwards, so it's unclear why its size
was pulled. 2 was not a magic number here, it was directly related to
the operations done on the skb (i.e. accessing skb->data[0] and
skb->data[1]). So pskb_may_pull(skb, 2) makes perfect sense.

OTOH, replacing 0xff and 0x03 with PPP_ALLSTATIONS and PPP_UI is fine.

> @@ -312,7 +313,7 @@ static int pppol2tp_sendmsg(struct socket *sock, struct 
> msghdr *m,
>   error = -ENOMEM;
>   skb = sock_wmalloc(sk, NET_SKB_PAD + sizeof(struct iphdr) +
>  uhlen + session->hdr_len +
> -sizeof(ppph) + total_len,
> +sizeof(fixed_ppphdr) + total_len,
>  0, GFP_KERNEL);
>   if (!skb)
>   goto error_put_sess_tun;
> @@ -325,9 +326,9 @@ static int pppol2tp_sendmsg(struct socket *sock, struct 
> msghdr *m,
>   skb_reserve(skb, uhlen);
>  
>   /* Add PPP header */
> - skb->data[0] = ppph[0];
> - skb->data[1] = ppph[1];
> - skb_put(skb, 2);
> + PPP_ADDRESS(skb->data) = fixed_ppphdr[0];
> + PPP_CONTROL(skb->data) = fixed_ppphdr[1];
> + skb_put(skb, sizeof(fixed_ppphdr));
>
Same here. What about
+ skb->data[0] = PPP_ALLSTATIONS;
+ skb->data[1] = PPP_UI;
+ skb_put(skb, 2);
and removing ppph entirely?

>   /* Copy user data into skb */
>   error = memcpy_from_msg(skb_put(skb, total_len), m, total_len);
> @@ -369,7 +370,6 @@ error:
>   */
>  static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
>  {
> - static const u8 ppph[2] = { 0xff, 0x03 };
>   struct sock *sk = (struct sock *) chan->private;
>   struct sock *sk_tun;
>   struct l2tp_session *session;
> @@ -398,14 +398,14 @@ static int pppol2tp_xmit(struct ppp_channel *chan, 
> struct sk_buff *skb)
>  sizeof(struct iphdr) + /* IP header */
>  uhlen +  /* UDP header (if L2TP_ENCAPTYPE_UDP) */
>  session->hdr_len +   /* L2TP header */
> -sizeof(ppph);/* PPP header */
> +sizeof(fixed_ppphdr); /* PPP header */
>   if (skb_cow_head(skb, headroom))
>   goto abort_put_sess_tun;
>  
>   /* Setup PPP header */
> - __skb_push(skb, sizeof(ppph));
> - skb->data[0] = ppph[0];
> - skb->data[1] = ppph[1];
> + __skb_push(skb, sizeof(fixed_ppphdr));
> + skb->data[0] = fixed_ppphdr[0];
> + skb->data[1] = fixed_ppphdr[1];
> 
Same as for pppol2tp_sendmsg().

> @@ -440,7 +440,7 @@ static void pppol2tp_session_close(struct l2tp_session 
> *session)
>   BUG_ON(session->magic != L2TP_SESSION_MAGIC);
>  
>   if (sock) {
> - inet_shutdown(sock, 2);
> + inet_shutdown(sock, SEND_SHUTDOWN);
> 
Ok.
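
The style suggested in the review above (named constants for the byte
values, explicit indices and an explicit length of 2 for the buffer
accesses) can be sketched on a plain byte buffer. This is a user-space
illustration, not the l2tp code: ppp_put_header() and ppp_header_len() are
hypothetical helpers, and the two constants are redefined locally with the
0xff/0x03 values quoted in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the values the patch replaces (0xff and 0x03). */
#define PPP_ALLSTATIONS	0xff
#define PPP_UI		0x03

/* Write the two-byte address/control header; returns bytes written. */
static size_t ppp_put_header(uint8_t *data)
{
	assert(data != NULL);

	data[0] = PPP_ALLSTATIONS;
	data[1] = PPP_UI;
	return 2;
}

/*
 * Return 2 if the buffer starts with the address/control header (the
 * number of bytes a receiver would skip), or 0 if it does not.
 */
static size_t ppp_header_len(const uint8_t *data, size_t len)
{
	assert(data != NULL || len == 0);

	if (len < 2)
		return 0;
	if (data[0] == PPP_ALLSTATIONS && data[1] == PPP_UI)
		return 2;
	return 0;
}
```

The literal 2 stays next to the data[0]/data[1] accesses it describes, which
is the readability argument made in the review.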


[PATCH net-next 2/2] net: phy: Kconfig: Remove redundant "Support for"

2016-08-18 Thread Andrew Lunn
Remove the redundant "Support for" and "Drivers for" from the Kconfig
short description. This makes the manufacturer much more prominent in
the list and makes the shortcut keys useful.

Signed-off-by: Andrew Lunn 
---
 drivers/net/phy/Kconfig | 70 -
 1 file changed, 35 insertions(+), 35 deletions(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index 786e8b0b2d06..1c3e07c3d0b8 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -35,7 +35,7 @@ config MDIO_BCM_UNIMAC
  Starfighter 2 switches.
 
 config MDIO_BITBANG
-   tristate "Support for bitbanged MDIO buses"
+   tristate "Bitbanged MDIO buses"
help
  This module implements the MDIO bus protocol in software,
  for use by low level drivers that export the ability to
@@ -53,7 +53,7 @@ config MDIO_BUS_MUX
  device specific drivers.
 
 config MDIO_BUS_MUX_BCM_IPROC
-   tristate "Support for iProc based MDIO bus multiplexers"
+   tristate "Broadcom iProc based MDIO bus multiplexers"
depends on OF && OF_MDIO && (ARCH_BCM_IPROC || COMPILE_TEST)
select MDIO_BUS_MUX
default ARCH_BCM_IPROC
@@ -64,7 +64,7 @@ config MDIO_BUS_MUX_BCM_IPROC
  external and selection logic lies inside the same multiplexer.
 
 config MDIO_BUS_MUX_GPIO
-   tristate "Support for GPIO controlled MDIO bus multiplexers"
+   tristate "GPIO controlled MDIO bus multiplexers"
depends on OF_GPIO && OF_MDIO
select MDIO_BUS_MUX
help
@@ -74,7 +74,7 @@ config MDIO_BUS_MUX_GPIO
  selection is under the control of GPIO lines.
 
 config MDIO_BUS_MUX_MMIOREG
-   tristate "Support for MMIO device-controlled MDIO bus multiplexers"
+   tristate "MMIO device-controlled MDIO bus multiplexers"
depends on OF_MDIO && HAS_IOMEM
select MDIO_BUS_MUX
help
@@ -90,7 +90,7 @@ config MDIO_CAVIUM
tristate
 
 config MDIO_GPIO
-   tristate "Support for GPIO lib-based bitbanged MDIO buses"
+   tristate "GPIO lib-based bitbanged MDIO buses"
depends on MDIO_BITBANG && GPIOLIB
---help---
  Supports GPIO lib-based MDIO busses.
@@ -113,7 +113,7 @@ config MDIO_MOXART
   interface units of the MOXA ART SoC
 
 config MDIO_OCTEON
-   tristate "Support for MDIO buses on Octeon and some ThunderX SOCs"
+   tristate "Octeon and some ThunderX SOCs MDIO buses"
depends on 64BIT
depends on HAS_IOMEM
select MDIO_CAVIUM
@@ -131,7 +131,7 @@ config MDIO_SUN4I
  A12, A10s, etc.)
 
 config MDIO_THUNDER
-   tristate "Support for MDIO buses on ThunderX SOCs"
+   tristate "ThunderX SOCs MDIO buses"
depends on 64BIT
depends on PCI
select MDIO_CAVIUM
@@ -149,41 +149,41 @@ config MDIO_XGENE
 comment "MII PHY device drivers"
 
 config AMD_PHY
-   tristate "Drivers for the AMD PHYs"
+   tristate "AMD PHYs"
---help---
  Currently supports the am79c874
 
 config AQUANTIA_PHY
-tristate "Drivers for the Aquantia PHYs"
+tristate "Aquantia PHYs"
 ---help---
   Currently supports the Aquantia AQ1202, AQ2104, AQR105, AQR405
 
 config AT803X_PHY
-   tristate "Drivers for Atheros AT803X PHYs"
+   tristate "AT803X PHYs"
---help---
  Currently supports the AT8030 and AT8035 model
 
 config BCM63XX_PHY
-   tristate "Drivers for Broadcom 63xx SOCs internal PHY"
+   tristate "Broadcom 63xx SOCs internal PHY"
depends on BCM63XX
select BCM_NET_PHYLIB
---help---
  Currently supports the 6348 and 6358 PHYs.
 
 config BCM7XXX_PHY
-   tristate "Drivers for Broadcom 7xxx SOCs internal PHYs"
+   tristate "Broadcom 7xxx SOCs internal PHYs"
select BCM_NET_PHYLIB
---help---
  Currently supports the BCM7366, BCM7439, BCM7445, and
  40nm and 65nm generation of BCM7xxx Set Top Box SoCs.
 
 config BCM87XX_PHY
-   tristate "Driver for Broadcom BCM8706 and BCM8727 PHYs"
+   tristate "Broadcom BCM8706 and BCM8727 PHYs"
help
  Currently supports the BCM8706 and BCM8727 10G Ethernet PHYs.
 
 config BCM_CYGNUS_PHY
-   tristate "Drivers for Broadcom Cygnus SoC internal PHY"
+   tristate "Broadcom Cygnus SoC internal PHY"
depends on ARCH_BCM_CYGNUS || COMPILE_TEST
depends on MDIO_BCM_IPROC
select BCM_NET_PHYLIB
@@ -199,34 +199,34 @@ config BCM_NET_PHYLIB
tristate
 
 config BROADCOM_PHY
-   tristate "Drivers for Broadcom PHYs"
+   tristate "Broadcom PHYs"
select BCM_NET_PHYLIB
---help---
  Currently supports the BCM5411, BCM5421, BCM5461, BCM54616S, BCM5464,
  BCM5481 and BCM5482 PHYs.
 
 config CICADA_PHY
-   tristate "Drivers for the Cicada PHYs"
+   tristate "Cicada PHYs"
---help---
  Currently supports the cis8204
 
 config DAVICOM_PHY

[PATCH net-next 1/2] net: phy: Sort Makefile and Kconfig

2016-08-18 Thread Andrew Lunn
Sort the files to reduce merge conflicts and to make it easier to find
drivers by name. Also separate the MDIO bus drivers from the PHY
drivers, again to help find what you need.

Signed-off-by: Andrew Lunn 
---
 drivers/net/phy/Kconfig  | 427 ---
 drivers/net/phy/Makefile |  76 +
 2 files changed, 253 insertions(+), 250 deletions(-)

diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig
index d66133bf3eb5..786e8b0b2d06 100644
--- a/drivers/net/phy/Kconfig
+++ b/drivers/net/phy/Kconfig
@@ -15,85 +15,153 @@ if PHYLIB
 config SWPHY
bool
 
-comment "MII PHY device drivers"
+comment "MDIO bus device drivers"
 
-config AQUANTIA_PHY
-tristate "Drivers for the Aquantia PHYs"
----help---
-  Currently supports the Aquantia AQ1202, AQ2104, AQR105, AQR405
-
-config AT803X_PHY
-   tristate "Drivers for Atheros AT803X PHYs"
-   ---help---
- Currently supports the AT8030 and AT8035 model
+config MDIO_BCM_IPROC
+   tristate "Broadcom iProc MDIO bus controller"
+   depends on ARCH_BCM_IPROC || COMPILE_TEST
+   depends on HAS_IOMEM && OF_MDIO
+   help
+ This module provides a driver for the MDIO busses found in the
+ Broadcom iProc SoC's.
 
-config AMD_PHY
-   tristate "Drivers for the AMD PHYs"
-   ---help---
- Currently supports the am79c874
+config MDIO_BCM_UNIMAC
+   tristate "Broadcom UniMAC MDIO bus controller"
+   depends on HAS_IOMEM
+   help
+ This module provides a driver for the Broadcom UniMAC MDIO busses.
+ This hardware can be found in the Broadcom GENET Ethernet MAC
+ controllers as well as some Broadcom Ethernet switches such as the
+ Starfighter 2 switches.
 
-config MARVELL_PHY
-   tristate "Drivers for Marvell PHYs"
-   ---help---
- Currently has a driver for the 88E1011S
-   
-config DAVICOM_PHY
-   tristate "Drivers for Davicom PHYs"
-   ---help---
- Currently supports dm9161e and dm9131
+config MDIO_BITBANG
+   tristate "Support for bitbanged MDIO buses"
+   help
+ This module implements the MDIO bus protocol in software,
+ for use by low level drivers that export the ability to
+ drive the relevant pins.
 
-config QSEMI_PHY
-   tristate "Drivers for Quality Semiconductor PHYs"
-   ---help---
- Currently supports the qs6612
+ If in doubt, say N.
 
-config LXT_PHY
-   tristate "Drivers for the Intel LXT PHYs"
-   ---help---
- Currently supports the lxt970, lxt971
+config MDIO_BUS_MUX
+   tristate
+   depends on OF_MDIO
+   help
+ This module provides a driver framework for MDIO bus
+ multiplexers which connect one of several child MDIO busses
+ to a parent bus.  Switching between child busses is done by
+ device specific drivers.
 
-config CICADA_PHY
-   tristate "Drivers for the Cicada PHYs"
-   ---help---
- Currently supports the cis8204
+config MDIO_BUS_MUX_BCM_IPROC
+   tristate "Support for iProc based MDIO bus multiplexers"
+   depends on OF && OF_MDIO && (ARCH_BCM_IPROC || COMPILE_TEST)
+   select MDIO_BUS_MUX
+   default ARCH_BCM_IPROC
+   help
+ This module provides a driver for MDIO bus multiplexers found in
+ iProc based Broadcom SoCs. This multiplexer connects one of several
+ child MDIO bus to a parent bus. Buses could be internal as well as
+ external and selection logic lies inside the same multiplexer.
 
-config VITESSE_PHY
-tristate "Drivers for the Vitesse PHYs"
----help---
-  Currently supports the vsc8244
+config MDIO_BUS_MUX_GPIO
+   tristate "Support for GPIO controlled MDIO bus multiplexers"
+   depends on OF_GPIO && OF_MDIO
+   select MDIO_BUS_MUX
+   help
+ This module provides a driver for MDIO bus multiplexers that
+ are controlled via GPIO lines.  The multiplexer connects one of
+ several child MDIO busses to a parent bus.  Child bus
+ selection is under the control of GPIO lines.
 
-config TERANETICS_PHY
-tristate "Drivers for the Teranetics PHYs"
----help---
-  Currently supports the Teranetics TN2020
+config MDIO_BUS_MUX_MMIOREG
+   tristate "Support for MMIO device-controlled MDIO bus multiplexers"
+   depends on OF_MDIO && HAS_IOMEM
+   select MDIO_BUS_MUX
+   help
+ This module provides a driver for MDIO bus multiplexers that
+ are controlled via a simple memory-mapped device, like an FPGA.
+ The multiplexer connects one of several child MDIO busses to a
+ parent bus.  Child bus selection is under the control of one of
+ the FPGA's registers.
 
-config SMSC_PHY
-   tristate "Drivers for SMSC PHYs"
-   ---help---
- Currently supports the LAN83C185, LAN8187 and LAN8700 PHYs
+  

[PATCH net-next 2/3] dsa: mv88e6xxx: Use mv88e6xx_wait in mv88e6xxx_update()

2016-08-18 Thread Andrew Lunn
Now that mv88e6xxx_wait() iterates on a counter rather than a fixed time
interval, it implements the same mechanism as mv88e6xxx_update() uses.
So use it in mv88e6xxx_update().

Signed-off-by: Andrew Lunn 
---
 drivers/net/dsa/mv88e6xxx/chip.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index ac8e9af4879f..8c846bce4edf 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -333,20 +333,12 @@ static int mv88e6xxx_update(struct mv88e6xxx_chip *chip, 
int addr, int reg,
u16 update)
 {
u16 val;
-   int i, err;
+   int err;
 
/* Wait until the previous operation is completed */
-   for (i = 0; i < 16; ++i) {
-   err = mv88e6xxx_read(chip, addr, reg, &val);
-   if (err)
-   return err;
-
-   if (!(val & BIT(15)))
-   break;
-   }
-
-   if (i == 16)
-   return -ETIMEDOUT;
+   err = mv88e6xxx_wait(chip, addr, reg, BIT(15));
+   if (err)
+   return err;
 
/* Set the Update bit to trigger a write operation */
val = BIT(15) | update;
-- 
2.8.1



Re: [PATCH 0/6] Netfilter fixes for net

2016-08-18 Thread David Miller
From: Pablo Neira Ayuso 
Date: Thu, 18 Aug 2016 19:29:02 +0200

> The following patchset contains Netfilter updates for your net tree,
> they are:
 ...
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thanks a lot Pablo.


[net-next 13/15] i40e/i40evf-Bump version from 1.6.11 to 1.6.12

2016-08-18 Thread Jeff Kirsher
From: Bimmy Pujari 

Signed-off-by: Bimmy Pujari 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 704cd7d..798c4e2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -41,7 +41,7 @@ static const char i40e_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 6
-#define DRV_VERSION_BUILD 11
+#define DRV_VERSION_BUILD 12
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 __stringify(DRV_VERSION_MINOR) "." \
 __stringify(DRV_VERSION_BUILD)DRV_KERN
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index d1d4a65..ba046d1 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -38,7 +38,7 @@ static const char i40evf_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 6
-#define DRV_VERSION_BUILD 11
+#define DRV_VERSION_BUILD 12
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
 __stringify(DRV_VERSION_MINOR) "." \
 __stringify(DRV_VERSION_BUILD) \
-- 
2.7.4



[net-next 09/15] i40e: Don't notify client of VF reset during VF creation

2016-08-18 Thread Jeff Kirsher
From: Avinash Dayanand 

VF goes through reset path during VF creation which happens to also
have notification of VF reset to client. Adding conditional check to
avoid wrongly notifying VF reset during VF creation.

Also changing the call order of VF enable, calling it after VF creation
rather than before.

Change-ID: I96eabd99deae746a2f0fc465194c886f196178ce
Signed-off-by: Avinash Dayanand 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 0fa050a..5ea659c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -991,7 +991,9 @@ complete_reset:
i40e_enable_vf_mappings(vf);
set_bit(I40E_VF_STAT_ACTIVE, &vf->vf_states);
clear_bit(I40E_VF_STAT_DISABLED, &vf->vf_states);
-   i40e_notify_client_of_vf_reset(pf, abs_vf_id);
+   /* Do not notify the client during VF init */
+   if (vf->pf->num_alloc_vfs)
+   i40e_notify_client_of_vf_reset(pf, abs_vf_id);
vf->num_vlan = 0;
}
/* tell the VF the reset is done */
@@ -1090,7 +1092,6 @@ int i40e_alloc_vfs(struct i40e_pf *pf, u16 num_alloc_vfs)
goto err_iov;
}
}
-   i40e_notify_client_of_vf_enable(pf, num_alloc_vfs);
/* allocate memory */
vfs = kcalloc(num_alloc_vfs, sizeof(struct i40e_vf), GFP_KERNEL);
if (!vfs) {
@@ -1114,6 +1115,8 @@ int i40e_alloc_vfs(struct i40e_pf *pf, u16 num_alloc_vfs)
}
pf->num_alloc_vfs = num_alloc_vfs;
 
+   i40e_notify_client_of_vf_enable(pf, num_alloc_vfs);
+
 err_alloc:
if (ret)
i40e_free_vfs(pf);
-- 
2.7.4



Re: [PATCH 2/2] net: sched: avoid duplicates in qdisc dump

2016-08-18 Thread Cong Wang
On Tue, Aug 16, 2016 at 2:53 PM, Jiri Kosina  wrote:
> From: Jiri Kosina 
>
> tc_dump_qdisc() performs dumping of the per-device qdiscs in two phases;
> first, the "standard" dev->qdisc is being dumped. Second, if there is/are
> ingress queue(s), they are being dumped as well.
>
> After conversion of netdevice's qdisc linked-list into hashtable, these
> two sets are no longer disjoint sets/lists, but are both
> "reachable" directly from netdevice's hashtable. As a consequence, the
> "full-depth" dump of the ingress qdiscs results in immediately hitting the
> netdevice hashtable again, and duplicating the dump that has already been
> performed for dev->qdisc.
> What in fact needs to be dumped in case of ingress queue is "just" the
> top-level ingress qdisc, as everything else has been dumped already.

Doesn't this mean we can now just remove the ingress case from
tc_dump_qdisc() and simply iterate the whole hash table?


[PATCH net 02/10] net/mlx5e: Optimization for MTU change

2016-08-18 Thread Saeed Mahameed
From: Tariq Toukan 

Avoid unnecessary interface down/up operations upon an MTU change
when it does not affect the rings configuration.

Fixes: 461017cb006a ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
Signed-off-by: Tariq Toukan 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 0fc3a2b..65258b2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2569,6 +2569,7 @@ static int mlx5e_change_mtu(struct net_device *netdev, 
int new_mtu)
u16 max_mtu;
u16 min_mtu;
int err = 0;
+   bool reset;
 
mlx5_query_port_max_mtu(mdev, &max_mtu, 1);
 
@@ -2584,14 +2585,18 @@ static int mlx5e_change_mtu(struct net_device *netdev, 
int new_mtu)
 
mutex_lock(&priv->state_lock);
 
+   reset = !priv->params.lro_en &&
+   (priv->params.rq_wq_type !=
+MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ);
+
was_opened = test_bit(MLX5E_STATE_OPENED, &priv->state);
-   if (was_opened)
+   if (was_opened && reset)
mlx5e_close_locked(netdev);
 
netdev->mtu = new_mtu;
mlx5e_set_dev_port_mtu(netdev);
 
-   if (was_opened)
+   if (was_opened && reset)
err = mlx5e_open_locked(netdev);
 
mutex_unlock(&priv->state_lock);
-- 
2.7.4



[PATCH net 07/10] net/mlx5e: Retrieve the switchdev id from the firmware only once

2016-08-18 Thread Saeed Mahameed
From: Hadar Hen Zion 

Avoid firmware command execution each time the switchdev HW ID attr get
call is made. We do that by reading the ID (PF NIC MAC) only once at
load time and store it on the representor structure.

Signed-off-by: Hadar Hen Zion 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 5 +
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c  | 5 ++---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 1 +
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 65258b2..03d944c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3387,6 +3387,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
queue_work(priv->wq, &priv->set_rx_mode_work);
 
if (MLX5_CAP_GEN(mdev, vport_group_manager)) {
+   mlx5_query_nic_vport_mac_address(mdev, 0, rep.hw_id);
rep.load = mlx5e_nic_rep_load;
rep.unload = mlx5e_nic_rep_unload;
rep.vport = 0;
@@ -3505,16 +3506,20 @@ static void mlx5e_register_vport_rep(struct 
mlx5_core_dev *mdev)
struct mlx5_eswitch *esw = mdev->priv.eswitch;
int total_vfs = MLX5_TOTAL_VPORTS(mdev);
int vport;
+   u8 mac[ETH_ALEN];
 
if (!MLX5_CAP_GEN(mdev, vport_group_manager))
return;
 
+   mlx5_query_nic_vport_mac_address(mdev, 0, mac);
+
for (vport = 1; vport < total_vfs; vport++) {
struct mlx5_eswitch_rep rep;
 
rep.load = mlx5e_vport_rep_load;
rep.unload = mlx5e_vport_rep_unload;
rep.vport = vport;
+   ether_addr_copy(rep.hw_id, mac);
mlx5_eswitch_register_vport_rep(esw, &rep);
}
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 1c7d8b8..134de4a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -135,17 +135,16 @@ static const struct ethtool_ops mlx5e_rep_ethtool_ops = {
 int mlx5e_attr_get(struct net_device *dev, struct switchdev_attr *attr)
 {
struct mlx5e_priv *priv = netdev_priv(dev);
+   struct mlx5_eswitch_rep *rep = priv->ppriv;
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
-   u8 mac[ETH_ALEN];
 
if (esw->mode == SRIOV_NONE)
return -EOPNOTSUPP;
 
switch (attr->id) {
case SWITCHDEV_ATTR_ID_PORT_PARENT_ID:
-   mlx5_query_nic_vport_mac_address(priv->mdev, 0, mac);
attr->u.ppid.id_len = ETH_ALEN;
-   memcpy(&attr->u.ppid.id, &mac, ETH_ALEN);
+   ether_addr_copy(attr->u.ppid.id, rep->hw_id);
break;
default:
return -EOPNOTSUPP;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index c0b0560..a961409 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -174,6 +174,7 @@ struct mlx5_eswitch_rep {
void  *priv_data;
struct list_head   vport_sqs_list;
bool   valid;
+   u8 hw_id[ETH_ALEN];
 };
 
 struct mlx5_esw_offload {
-- 
2.7.4



[PATCH net 10/10] net/mlx5: E-Switch, Avoid ACLs in the offloads mode

2016-08-18 Thread Saeed Mahameed
From: Or Gerlitz 

When we are in the switchdev/offloads mode, HW matching is done as
dictated by the offloaded rules and hence we don't need to enable
the ACLs mechanism used by the legacy mode.

Signed-off-by: Or Gerlitz 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index f6d6677..8b78f15 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1451,7 +1451,8 @@ static void esw_enable_vport(struct mlx5_eswitch *esw, 
int vport_num,
 
esw_debug(esw->dev, "Enabling VPORT(%d)\n", vport_num);
 
-   if (vport_num) { /* Only VFs need ACLs for VST and spoofchk filtering */
+   /* Only VFs need ACLs for VST and spoofchk filtering */
+   if (vport_num && esw->mode == SRIOV_LEGACY) {
esw_vport_ingress_config(esw, vport);
esw_vport_egress_config(esw, vport);
}
@@ -1502,7 +1503,7 @@ static void esw_disable_vport(struct mlx5_eswitch *esw, 
int vport_num)
 */
esw_vport_change_handle_locked(vport);
vport->enabled_events = 0;
-   if (vport_num) {
+   if (vport_num && esw->mode == SRIOV_LEGACY) {
esw_vport_disable_egress_acl(esw, vport);
esw_vport_disable_ingress_acl(esw, vport);
}
@@ -1767,7 +1768,7 @@ int mlx5_eswitch_set_vport_mac(struct mlx5_eswitch *esw,
   vport, err);
 
mutex_lock(&esw->state_lock);
-   if (evport->enabled)
+   if (evport->enabled && esw->mode == SRIOV_LEGACY)
err = esw_vport_ingress_config(esw, evport);
mutex_unlock(&esw->state_lock);
return err;
@@ -1839,7 +1840,7 @@ int mlx5_eswitch_set_vport_vlan(struct mlx5_eswitch *esw,
mutex_lock(&esw->state_lock);
evport->vlan = vlan;
evport->qos = qos;
-   if (evport->enabled) {
+   if (evport->enabled && esw->mode == SRIOV_LEGACY) {
err = esw_vport_ingress_config(esw, evport);
if (err)
goto out;
@@ -1868,10 +1869,11 @@ int mlx5_eswitch_set_vport_spoofchk(struct mlx5_eswitch 
*esw,
mutex_lock(&esw->state_lock);
pschk = evport->spoofchk;
evport->spoofchk = spoofchk;
-   if (evport->enabled)
+   if (evport->enabled && esw->mode == SRIOV_LEGACY) {
err = esw_vport_ingress_config(esw, evport);
-   if (err)
-   evport->spoofchk = pschk;
+   if (err)
+   evport->spoofchk = pschk;
+   }
mutex_unlock(&esw->state_lock);
 
return err;
-- 
2.7.4



Re: [PATCH] [v7] net: emac: emac gigabit ethernet controller driver

2016-08-18 Thread Timur Tabi

Florian Fainelli wrote:

> There is a helper function to obtain the platform device associated with a
> device_node: of_find_device_by_node.


Thank you, this is exactly what I needed.

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


[PATCH net 09/10] net/mlx5: E-Switch, Set the send-to-vport rules in the correct table

2016-08-18 Thread Saeed Mahameed
From: Or Gerlitz 

While adding actual offloading support to the new switchdev mode, we didn't
change the setup of the send-to-vport rules to put them in the slow path
table, fix that.

Fixes: 1033665e63b6 ('net/mlx5: E-Switch, Use two priorities for SRIOV offloads mode')
Signed-off-by: Or Gerlitz 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 1a3ccbf..3dc83a9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -113,7 +113,7 @@ mlx5_eswitch_add_send_to_vport_rule(struct mlx5_eswitch 
*esw, int vport, u32 sqn
dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
dest.vport_num = vport;
 
-   flow_rule = mlx5_add_flow_rule(esw->fdb_table.fdb, spec,
+   flow_rule = mlx5_add_flow_rule(esw->fdb_table.offloads.fdb, spec,
   MLX5_FLOW_CONTEXT_ACTION_FWD_DEST,
   0, &dest);
if (IS_ERR(flow_rule))
-- 
2.7.4



Re: [PATCH] bpf: fix size of copy_to_user in percpu map.

2016-08-18 Thread William Tu
Hi Alexei and Daniel,

I got feedback from the Fusion BIOS/chipset team. In short, the value
'possible' includes empty CPU sockets. To verify, I tested on a
physical Xeon machine with 2 CPU sockets, one of them empty. I got
'possible' = 0-239, while the number of 'cpu*' dirs = 12. As a result,
extra bytes are copied from kernel to userspace.

As for Fusion, there is a configuration option in *.vmx. If we disable
CPU hot plug with "vcpu.hotadd=FALSE", then 'possible' == the number of
'cpu*' dirs, which is the configuration in ESX. If "vcpu.hotadd=TRUE",
then 'possible' is larger than the number of 'cpu*' dirs, allowing
users to add more vcpus.
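
To make the size mismatch concrete, here is a minimal user-space sketch,
not code from the patch. It counts the CPUs named in a list such as the
content of /sys/devices/system/cpu/possible and computes how many bytes
one per-CPU map value would occupy, assuming each per-CPU slot is the
value size rounded up to 8 bytes. The helper names count_cpu_list() and
percpu_value_bytes() are hypothetical.

```c
#include <stdlib.h>

/*
 * Count the CPUs named in a kernel-style CPU list such as "0-239" or
 * "0-2,4,6-7". A trailing newline is tolerated. Returns -1 on
 * malformed input.
 */
static int count_cpu_list(const char *list)
{
	const char *p = list;
	int count = 0;

	while (*p && *p != '\n') {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p || lo < 0)
			return -1;
		p = end;
		if (*p == '-') {
			/* Range "lo-hi" */
			p++;
			hi = strtol(p, &end, 10);
			if (end == p || hi < lo)
				return -1;
			p = end;
		}
		count += (int)(hi - lo + 1);
		if (*p == ',')
			p++;
		else if (*p && *p != '\n')
			return -1;
	}
	return count;
}

/*
 * Bytes needed for one per-CPU value across ncpus CPUs, assuming
 * (as an illustration) that each slot is value_size rounded up to 8.
 */
static size_t percpu_value_bytes(size_t value_size, int ncpus)
{
	size_t rounded = (value_size + 7) & ~(size_t)7;

	return rounded * (size_t)ncpus;
}
```

With the numbers above, a buffer sized for the 12 'cpu*' dirs is far
smaller than one sized for the 240 CPUs in 'possible', which is where
the extra copied bytes come from.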

Regards,
William


On Fri, Aug 12, 2016 at 4:08 PM, Alexei Starovoitov
 wrote:
> On Fri, Aug 12, 2016 at 09:58:51AM -0700, William Tu wrote:
>> Hi,
>>
>> I've tested on ESXi version 5.5 and it seems OK.
>> - VM1: Ubuntu 14.04, kernel 3.19 ---> OK 3 cpu dirs, possible = 0-2
>> - VM2: Centos7, kernel 3.10 ---> OK 8 cpu dirs, possible = 0-7
>>
>> I tried another MacBook with Fusion, same issue happens, the cpu[0-9]
>> dirs are not equal to /sys/devices/system/cpu/possible
>
> great. thanks for testing. I think the issue is closed and
> hopefully you can follow up with fusion guys ;)
>


[PATCH net 08/10] net/mlx5: E-Switch, Return the correct devlink e-switch mode

2016-08-18 Thread Saeed Mahameed
From: Or Gerlitz 

Since mlx5 has also the NONE e-switch mode, we must translate from mlx5
mode to devlink mode on the devlink eswitch mode get call, do that.

While here, remove the mlx5_ prefix from the static function helpers
that deal with the mode to comply with the rest of the code.

Fixes: c930a3ad7453 ('net/mlx5e: Add devlink based SRIOV mode change')
Signed-off-by: Or Gerlitz 
Signed-off-by: Saeed Mahameed 
---
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 24 +-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index a357e8e..1a3ccbf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -535,7 +535,7 @@ void esw_offloads_cleanup(struct mlx5_eswitch *esw, int 
nvports)
esw_destroy_offloads_fdb_table(esw);
 }
 
-static int mlx5_esw_mode_from_devlink(u16 mode, u16 *mlx5_mode)
+static int esw_mode_from_devlink(u16 mode, u16 *mlx5_mode)
 {
switch (mode) {
case DEVLINK_ESWITCH_MODE_LEGACY:
@@ -551,6 +551,22 @@ static int mlx5_esw_mode_from_devlink(u16 mode, u16 
*mlx5_mode)
return 0;
 }
 
+static int esw_mode_to_devlink(u16 mlx5_mode, u16 *mode)
+{
+   switch (mlx5_mode) {
+   case SRIOV_LEGACY:
+   *mode = DEVLINK_ESWITCH_MODE_LEGACY;
+   break;
+   case SRIOV_OFFLOADS:
+   *mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, u16 mode)
 {
struct mlx5_core_dev *dev;
@@ -566,7 +582,7 @@ int mlx5_devlink_eswitch_mode_set(struct devlink *devlink, 
u16 mode)
if (cur_mlx5_mode == SRIOV_NONE)
return -EOPNOTSUPP;
 
-   if (mlx5_esw_mode_from_devlink(mode, &mlx5_mode))
+   if (esw_mode_from_devlink(mode, &mlx5_mode))
return -EINVAL;
 
if (cur_mlx5_mode == mlx5_mode)
@@ -592,9 +608,7 @@ int mlx5_devlink_eswitch_mode_get(struct devlink *devlink, 
u16 *mode)
if (dev->priv.eswitch->mode == SRIOV_NONE)
return -EOPNOTSUPP;
 
-   *mode = dev->priv.eswitch->mode;
-
-   return 0;
+   return esw_mode_to_devlink(dev->priv.eswitch->mode, mode);
 }
 
 void mlx5_eswitch_register_vport_rep(struct mlx5_eswitch *esw,
-- 
2.7.4



Re: [PATCH v2 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-18 Thread Guillaume Nault
On Thu, Aug 18, 2016 at 08:58:31AM +0800, Feng Gao wrote:
> On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault  wrote:
> > On Tue, Aug 16, 2016 at 07:33:38PM +0800, f...@ikuai8.com wrote:
> >> From: Gao Feng 
> >>
> >> PPP channel holds one spinlock before send frame. But the skb may
> >> select the same PPP channel with wrong route policy. As a result,
> >> the skb reaches the same channel path. It tries to get the same
> >> spinlock which is held before. Bang, the deadlock comes out.
> >>
> > Unless I misunderstood the problem you're trying to solve, this patch
> > doesn't really help: deadlock still occurs if the same IP is used for
> > L2TP and PPP's peer address.
> >
> 
> The deadlock happens because the same cpu try to hold the spinlock
> which is already held before by itself.
> Now the PPP_CHANNEL_LOCK_BH sets the lock owner after hold lock, then
> when the same cpu
> invokes PPP_CHANNEL_LOCK_BH again. The cl.owner equals current cpu id,
> so it only increases
> the lock_cnt without trying to hold the lock again.
> So it avoids the deadlock.
> 
I'm sorry but, again, it just _moves_ the deadlock down to L2TP. The
kernel still oopses because l2tp_xmit_skb() is now called recursively
while holding its tunnel socket.

> >> Now add one lock owner to avoid it like xmit_lock_owner of
> >> netdev_queue. Check the lock owner before try to get the spinlock.
> >> If the current cpu is already the owner, needn't lock again. When
> >> PPP channel holds the spinlock at the first time, it sets owner
> >> with current CPU ID.
> >>
> > I think you should forbid lock recursion entirely, and drop the packet
> > if the owner tries to re-acquire the channel lock. Otherwise you just
> > move the deadlock down the stack (l2tp_xmit_skb() can't be called
> > recursively).
> 
> The reason that fix it in ppp is that there are other layer on the ppp module.
> We resolve it in ppp module, it could avoid all similar potential issues.
>
Not sure I understand your point here.
The xmit path of PPP and its sub-layers hasn't been designed to be
reentrant. Allowing recursive sends thus requires reviewing the full
path.

Beyond the L2TP issue discussed above, just consider the locking
dependencies used in PPP: ppp->wlock has to be held before
channel->downl. Sending a packet directly on a channel will lock
channel->downl. If this packet is routed back to the parent unit then
ppp_xmit_process() will lock ppp->wlock, effectively leading to lock
inversion and potential deadlock.

So we have two options: adapt the whole xmit path to handle recursive
sends or forbid recursion entirely. Unfortunately, neither of these
options looks easy to achieve:

  * Making PPP xmit path reentrant will be hard and error prone because
of all the locking dependencies. Looks like simplifying PPP's
locking scheme will be required first.

  * I can't see any way to reliably prevent settings where a packet sent
on a given channel would be routed back to the parent unit.


OTOH, we can try to limit the impact of recursive sends for simple
cases:

  * Following your approach, either adapt the lower layers
(like l2tp_xmit_skb() for L2TP), or drop the packet when
cl.owner == smp_processor_id(). This is very limited in scope and
doesn't address issues like locking inversions. But it may let the
system survive long enough for the PPP to time out.

  * In the lower layer, check where the packet is going to be enqueued
and drop it if it's the parent device. That should reliably handle
simple and common cases. However this requires to update at least
L2TP and PPTP and to get a way to access the parent device. Also,
it doesn't prevent recursion with stacked interfaces.
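
The "drop the packet when cl.owner == smp_processor_id()" idea can be
sketched in user space as follows. This is a single-threaded simulation
under stated assumptions, not the ppp_generic code: struct chan_lock,
chan_lock_try(), chan_unlock() and chan_xmit() are hypothetical names,
and the integer cpu argument stands in for smp_processor_id().

```c
#include <stdbool.h>

#define NO_OWNER (-1)

/* Lock that remembers which CPU currently holds it. */
struct chan_lock {
	bool held;
	int owner;
};

struct chan {
	struct chan_lock lock;
	int sent;     /* frames that entered the xmit path */
	int dropped;  /* frames dropped because of recursion */
};

/*
 * Take the lock for `cpu`. Returns false when `cpu` already owns it,
 * i.e. the xmit path was re-entered. Real code would spin while
 * another CPU holds the lock; this simulation has only one caller.
 */
static bool chan_lock_try(struct chan_lock *cl, int cpu)
{
	if (cl->held && cl->owner == cpu)
		return false;
	cl->held = true;
	cl->owner = cpu;
	return true;
}

static void chan_unlock(struct chan_lock *cl)
{
	cl->owner = NO_OWNER;
	cl->held = false;
}

/*
 * Send one frame. `loops` > 0 simulates a route that feeds the frame
 * back into the same channel that many times.
 */
static void chan_xmit(struct chan *ch, int cpu, int loops)
{
	if (!chan_lock_try(&ch->lock, cpu)) {
		ch->dropped++;   /* recursion detected: drop, don't deadlock */
		return;
	}
	ch->sent++;
	if (loops > 0)
		chan_xmit(ch, cpu, loops - 1);
	chan_unlock(&ch->lock);
}
```

As noted above, this only covers the simple case: it does nothing about
the ppp->wlock / channel->downl ordering.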


[PATCH net 03/10] net/mlx5: Fix pci error recovery flow

2016-08-18 Thread Saeed Mahameed
From: Mohamad Haj Yahia 

When PCI error is detected we should save the state of the pci prior to
disabling it.

Also when receiving pci slot reset call we need to verify that the
device is responsive.

Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver')
Signed-off-by: Mohamad Haj Yahia 

Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 59 +-
 1 file changed, 29 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 4f491d4..2385bae 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1420,36 +1420,12 @@ static pci_ers_result_t mlx5_pci_err_detected(struct 
pci_dev *pdev,
dev_info(&pdev->dev, "%s was called\n", __func__);
mlx5_enter_error_state(dev);
mlx5_unload_one(dev, priv);
+   pci_save_state(pdev);
mlx5_pci_disable_device(dev);
return state == pci_channel_io_perm_failure ?
PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_NEED_RESET;
 }
 
-static pci_ers_result_t mlx5_pci_slot_reset(struct pci_dev *pdev)
-{
-   struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
-   int err = 0;
-
-   dev_info(&pdev->dev, "%s was called\n", __func__);
-
-   err = mlx5_pci_enable_device(dev);
-   if (err) {
-   dev_err(>dev, "%s: mlx5_pci_enable_device failed with 
error code: %d\n"
-   , __func__, err);
-   return PCI_ERS_RESULT_DISCONNECT;
-   }
-   pci_set_master(pdev);
-   pci_set_power_state(pdev, PCI_D0);
-   pci_restore_state(pdev);
-
-   return err ? PCI_ERS_RESULT_DISCONNECT : PCI_ERS_RESULT_RECOVERED;
-}
-
-void mlx5_disable_device(struct mlx5_core_dev *dev)
-{
-   mlx5_pci_err_detected(dev->pdev, 0);
-}
-
 /* wait for the device to show vital signs by waiting
  * for the health counter to start counting.
  */
@@ -1477,21 +1453,44 @@ static int wait_vital(struct pci_dev *pdev)
return -ETIMEDOUT;
 }
 
-static void mlx5_pci_resume(struct pci_dev *pdev)
+static pci_ers_result_t mlx5_pci_slot_reset(struct pci_dev *pdev)
 {
struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
-   struct mlx5_priv *priv = &dev->priv;
int err;
 
dev_info(&pdev->dev, "%s was called\n", __func__);
 
-   pci_save_state(pdev);
-   err = wait_vital(pdev);
+   err = mlx5_pci_enable_device(dev);
if (err) {
+   dev_err(>dev, "%s: mlx5_pci_enable_device failed with 
error code: %d\n"
+   , __func__, err);
+   return PCI_ERS_RESULT_DISCONNECT;
+   }
+
+   pci_set_master(pdev);
+   pci_restore_state(pdev);
+
+   if (wait_vital(pdev)) {
dev_err(&pdev->dev, "%s: wait_vital timed out\n", __func__);
-   return;
+   return PCI_ERS_RESULT_DISCONNECT;
}
 
+   return PCI_ERS_RESULT_RECOVERED;
+}
+
+void mlx5_disable_device(struct mlx5_core_dev *dev)
+{
+   mlx5_pci_err_detected(dev->pdev, 0);
+}
+
+static void mlx5_pci_resume(struct pci_dev *pdev)
+{
+   struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+   struct mlx5_priv *priv = &dev->priv;
+   int err;
+
+   dev_info(&pdev->dev, "%s was called\n", __func__);
+
err = mlx5_load_one(dev, priv);
if (err)
dev_err(>dev, "%s: mlx5_load_one failed with error code: 
%d\n"
-- 
2.7.4



[PATCH net 04/10] net/mlx5: Added missing check of msg length in verifying its signature

2016-08-18 Thread Saeed Mahameed
From: Paul Blakey 

Set and verify signature calculates the signature for each of the
mailbox nodes, even for those that are unused (from cache). Added
a missing length check to set and verify only those which are used.

While here, also moved the setting of msg's nodes token to where we
already go over them. This saves a pass because checksum is disabled,
and the only useful thing remaining that set signature does is setting
the token.
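
As an illustration of the two ideas in this message, here is a minimal
stand-alone sketch, not the driver code. It shows an XOR-8 signature
chosen so that XOR-ing a whole block yields 0xff, and the ceiling
division that bounds how many chained blocks a message of a given
length uses. It assumes the signature lives in the last byte of the
block; set_sig(), verify_sig() and blocks_in_use() are hypothetical
names.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR of len bytes starting at buf + offset. */
static uint8_t xor8_buf(const void *buf, size_t offset, size_t len)
{
	const uint8_t *ptr = buf;
	uint8_t sum = 0;

	for (size_t i = offset; i < offset + len; i++)
		sum ^= ptr[i];
	return sum;
}

/* Store the signature in the last byte so the whole block XORs to 0xff. */
static void set_sig(uint8_t *block, size_t size)
{
	block[size - 1] = (uint8_t)~xor8_buf(block, 0, size - 1);
}

/* Returns 0 when the block XORs to 0xff, -1 otherwise. */
static int verify_sig(const uint8_t *block, size_t size)
{
	return xor8_buf(block, 0, size) == 0xff ? 0 : -1;
}

/*
 * Number of chained data blocks a message occupies: the part of the
 * message beyond the inline area, divided by the block size and
 * rounded up. Only these blocks need signing or verifying.
 */
static size_t blocks_in_use(size_t msg_len, size_t inline_len,
			    size_t block_size)
{
	size_t inl = msg_len < inline_len ? msg_len : inline_len;
	size_t blen = msg_len - inl;

	return (blen + block_size - 1) / block_size;
}
```

Walking the mailbox chain for at most blocks_in_use() nodes, rather
than until the list ends, is what skips the unused cached nodes.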

Fixes: e126ba97dba9 ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Paul Blakey 

Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 85 +--
 1 file changed, 54 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c 
b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index d6e2a1c..c2ec01a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -143,13 +143,14 @@ static struct mlx5_cmd_layout *get_inst(struct mlx5_cmd 
*cmd, int idx)
return cmd->cmd_buf + (idx << cmd->log_stride);
 }
 
-static u8 xor8_buf(void *buf, int len)
+static u8 xor8_buf(void *buf, size_t offset, int len)
 {
u8 *ptr = buf;
u8 sum = 0;
int i;
+   int end = len + offset;
 
-   for (i = 0; i < len; i++)
+   for (i = offset; i < end; i++)
sum ^= ptr[i];
 
return sum;
@@ -157,41 +158,49 @@ static u8 xor8_buf(void *buf, int len)
 
 static int verify_block_sig(struct mlx5_cmd_prot_block *block)
 {
-   if (xor8_buf(block->rsvd0, sizeof(*block) - sizeof(block->data) - 1) != 
0xff)
+   size_t rsvd0_off = offsetof(struct mlx5_cmd_prot_block, rsvd0);
+   int xor_len = sizeof(*block) - sizeof(block->data) - 1;
+
+   if (xor8_buf(block, rsvd0_off, xor_len) != 0xff)
return -EINVAL;
 
-   if (xor8_buf(block, sizeof(*block)) != 0xff)
+   if (xor8_buf(block, 0, sizeof(*block)) != 0xff)
return -EINVAL;
 
return 0;
 }
 
-static void calc_block_sig(struct mlx5_cmd_prot_block *block, u8 token,
-  int csum)
+static void calc_block_sig(struct mlx5_cmd_prot_block *block)
 {
-   block->token = token;
-   if (csum) {
-   block->ctrl_sig = ~xor8_buf(block->rsvd0, sizeof(*block) -
-   sizeof(block->data) - 2);
-   block->sig = ~xor8_buf(block, sizeof(*block) - 1);
-   }
+   int ctrl_xor_len = sizeof(*block) - sizeof(block->data) - 2;
+   size_t rsvd0_off = offsetof(struct mlx5_cmd_prot_block, rsvd0);
+
+   block->ctrl_sig = ~xor8_buf(block, rsvd0_off, ctrl_xor_len);
+   block->sig = ~xor8_buf(block, 0, sizeof(*block) - 1);
 }
 
-static void calc_chain_sig(struct mlx5_cmd_msg *msg, u8 token, int csum)
+static void calc_chain_sig(struct mlx5_cmd_msg *msg)
 {
struct mlx5_cmd_mailbox *next = msg->next;
-
-   while (next) {
-   calc_block_sig(next->buf, token, csum);
+   int size = msg->len;
+   int blen = size - min_t(int, sizeof(msg->first.data), size);
+   int n = (blen + MLX5_CMD_DATA_BLOCK_SIZE - 1)
+   / MLX5_CMD_DATA_BLOCK_SIZE;
+   int i = 0;
+
+   for (i = 0; i < n && next; i++)  {
+   calc_block_sig(next->buf);
next = next->next;
}
 }
 
 static void set_signature(struct mlx5_cmd_work_ent *ent, int csum)
 {
-   ent->lay->sig = ~xor8_buf(ent->lay, sizeof(*ent->lay));
-   calc_chain_sig(ent->in, ent->token, csum);
-   calc_chain_sig(ent->out, ent->token, csum);
+   ent->lay->sig = ~xor8_buf(ent->lay, 0,  sizeof(*ent->lay));
+   if (csum) {
+   calc_chain_sig(ent->in);
+   calc_chain_sig(ent->out);
+   }
 }
 
 static void poll_timeout(struct mlx5_cmd_work_ent *ent)
@@ -222,12 +231,17 @@ static int verify_signature(struct mlx5_cmd_work_ent *ent)
struct mlx5_cmd_mailbox *next = ent->out->next;
int err;
u8 sig;
+   int size = ent->out->len;
+   int blen = size - min_t(int, sizeof(ent->out->first.data), size);
+   int n = (blen + MLX5_CMD_DATA_BLOCK_SIZE - 1)
+   / MLX5_CMD_DATA_BLOCK_SIZE;
+   int i = 0;
 
-   sig = xor8_buf(ent->lay, sizeof(*ent->lay));
+   sig = xor8_buf(ent->lay, 0, sizeof(*ent->lay));
if (sig != 0xff)
return -EINVAL;
 
-   while (next) {
+   for (i = 0; i < n && next; i++) {
err = verify_block_sig(next->buf);
if (err)
return err;
@@ -656,7 +670,6 @@ static void cmd_work_handler(struct work_struct *work)
spin_unlock_irqrestore(&cmd->alloc_lock, flags);
}
 
-   ent->token = alloc_token(cmd);
cmd->ent_arr[ent->idx] = ent;
lay = get_inst(cmd, ent->idx);
ent->lay = lay;
@@ -766,7 +779,8 @@ static u8 

[PATCH net 01/10] net/mlx5e: Set port MTU on netdev creation rather on open

2016-08-18 Thread Saeed Mahameed
Port mtu shouldn't be written to hardware on every single interface
open.
Here we set it only when needed, on change_mtu and netdevice creation.

Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 870bea3..0fc3a2b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1826,10 +1826,6 @@ int mlx5e_open_locked(struct net_device *netdev)
netif_set_real_num_tx_queues(netdev, num_txqs);
netif_set_real_num_rx_queues(netdev, priv->params.num_channels);
 
-   err = mlx5e_set_dev_port_mtu(netdev);
-   if (err)
-   goto err_clear_state_opened_flag;
-
err = mlx5e_open_channels(priv);
if (err) {
netdev_err(netdev, "%s: mlx5e_open_channels failed, %d\n",
@@ -2593,6 +2589,7 @@ static int mlx5e_change_mtu(struct net_device *netdev, 
int new_mtu)
mlx5e_close_locked(netdev);
 
netdev->mtu = new_mtu;
+   mlx5e_set_dev_port_mtu(netdev);
 
if (was_opened)
err = mlx5e_open_locked(netdev);
@@ -3463,6 +3460,8 @@ void *mlx5e_create_netdev(struct mlx5_core_dev *mdev,
 
mlx5e_init_l2_addr(priv);
 
+   mlx5e_set_dev_port_mtu(netdev);
+
err = register_netdev(netdev);
if (err) {
mlx5_core_err(mdev, "register_netdev failed, %d\n", err);
-- 
2.7.4



[PATCH net 05/10] net/mlx5: Update last-use statistics for flow rules

2016-08-18 Thread Saeed Mahameed
From: Amir Vadai 

Set the lastuse statistic when the number of packets has changed since
the last query. This was wrongly dropped when bulk counter reading was
added.
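
The update rule can be sketched outside the driver as follows. This is
a simplified stand-alone version, with a caller-supplied timestamp in
place of jiffies; struct fc_cache and fc_cache_update() are
hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cached view of one flow counter. */
struct fc_cache {
	uint64_t packets;
	uint64_t bytes;
	uint64_t lastuse;  /* time of the last observed traffic */
};

/*
 * Fold a fresh hardware reading into the cache. The cache, including
 * lastuse, is touched only when the packet count moved since the
 * previous query. Returns true if the cache was updated.
 */
static bool fc_cache_update(struct fc_cache *c, uint64_t packets,
			    uint64_t bytes, uint64_t now)
{
	if (c->packets == packets)
		return false;

	c->packets = packets;
	c->bytes = bytes;
	c->lastuse = now;
	return true;
}
```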

Fixes: a351a1b03bf1 ('net/mlx5: Introduce bulk reading of flow counters')
Signed-off-by: Amir Vadai 
Reported-by: Paul Blakey 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c 
b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index c2877e9..3a9195b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -126,12 +126,21 @@ static struct rb_node *mlx5_fc_stats_query(struct 
mlx5_core_dev *dev,
for (node = >node; node; node = rb_next(node)) {
struct mlx5_fc *counter = rb_entry(node, struct mlx5_fc, node);
struct mlx5_fc_cache *c = &counter->cache;
+   u64 packets;
+   u64 bytes;
 
if (counter->id > last_id)
break;
 
mlx5_cmd_fc_bulk_get(dev, b,
-counter->id, &c->packets, &c->bytes);
+counter->id, &packets, &bytes);
+
+   if (c->packets == packets)
+   continue;
+
+   c->packets = packets;
+   c->bytes = bytes;
+   c->lastuse = jiffies;
}
 
 out:
-- 
2.7.4



[net-next 04/15] i40e: reset RX csum error stat with other pf stats

2016-08-18 Thread Jeff Kirsher
From: Catherine Sullivan 

When we are resetting the pf stats we should also reset the RX csum
error stat.

Change-ID: I7af5ee0ec81a10f6deee1a7b8c2082ea068ef620
Signed-off-by: Catherine Sullivan 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index c6ac7a6..9f9d24f 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -527,6 +527,7 @@ void i40e_pf_reset_stats(struct i40e_pf *pf)
pf->veb[i]->stat_offsets_loaded = false;
}
}
+   pf->hw_csum_rx_error = 0;
 }
 
 /**
-- 
2.7.4



[PATCH net 00/10] Mellanox 100G mlx5 fixes 2016-08-16

2016-08-18 Thread Saeed Mahameed
Hi Dave,

This series includes some bug fixes for the mlx5e driver.

From Saeed and Tariq, Optimize MTU change to not reset when it is not required.

From Paul, Command interface message length check to speed up firmware
command preparation.

From Mohamad, Save pci state when pci error is detected.

From Amir, Flow counters "lastuse" update fix.

From Hadar, Use correct flow dissector key on flower offloading.
Plus a small optimization for switchdev hardware id query.

From Or, three patches to address some E-Switch offloads issues.

For -stable of 4.6.y and 4.7.y:
net/mlx5e: Use correct flow dissector key on flower offloading
net/mlx5: Fix pci error recovery flow
net/mlx5: Added missing check of msg length in verifying its signature

Thanks,
Saeed.

Amir Vadai (1):
  net/mlx5: Update last-use statistics for flow rules

Hadar Hen Zion (2):
  net/mlx5e: Use correct flow dissector key on flower offloading
  net/mlx5e: Retrieve the switchdev id from the firmware only once

Mohamad Haj Yahia (1):
  net/mlx5: Fix pci error recovery flow

Or Gerlitz (3):
  net/mlx5: E-Switch, Return the correct devlink e-switch mode
  net/mlx5: E-Switch, Set the send-to-vport rules in the correct table
  net/mlx5: E-Switch, Avoid ACLs in the offloads mode

Paul Blakey (1):
  net/mlx5: Added missing check of msg length in verifying its signature

Saeed Mahameed (1):
  net/mlx5e: Set port MTU on netdev creation rather on open

Tariq Toukan (1):
  net/mlx5e: Optimization for MTU change

 drivers/net/ethernet/mellanox/mlx5/core/cmd.c  | 85 ++
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  | 21 --
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  5 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c|  2 +-
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  | 16 ++--
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h  |  1 +
 .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 26 +--
 .../net/ethernet/mellanox/mlx5/core/fs_counters.c  | 11 ++-
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 59 ---
 9 files changed, 141 insertions(+), 85 deletions(-)

-- 
2.7.4



Re: [PATCH v2 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-18 Thread Feng Gao
Hi Guillaume,

Thanks for your detailed analysis.
I now think that simply dropping the packet and printing an error log
is a better solution than my original one, which supported reentrancy.
This solution will not bring any side effects.

I will send an update based on this new solution.


Regards
Feng


On Fri, Aug 19, 2016 at 12:13 AM, Guillaume Nault  wrote:
> On Thu, Aug 18, 2016 at 08:58:31AM +0800, Feng Gao wrote:
>> On Thu, Aug 18, 2016 at 1:42 AM, Guillaume Nault  
>> wrote:
>> > On Tue, Aug 16, 2016 at 07:33:38PM +0800, f...@ikuai8.com wrote:
>> >> From: Gao Feng 
>> >>
>> >> PPP channel holds one spinlock before send frame. But the skb may
>> >> select the same PPP channel with wrong route policy. As a result,
>> >> the skb reaches the same channel path. It tries to get the same
>> >> spinlock which is held before. Bang, the deadlock comes out.
>> >>
>> > Unless I misunderstood the problem you're trying to solve, this patch
>> > doesn't really help: deadlock still occurs if the same IP is used for
>> > L2TP and PPP's peer address.
>> >
>>
>> The deadlock happens because the same cpu try to hold the spinlock
>> which is already held before by itself.
>> Now the PPP_CHANNEL_LOCK_BH sets the lock owner after hold lock, then
>> when the same cpu
>> invokes PPP_CHANNEL_LOCK_BH again. The cl.owner equals current cpu id,
>> so it only increases
>> the lock_cnt without trying to hold the lock again.
>> So it avoids the deadlock.
>>
> I'm sorry but, again, it just _moves_ the deadlock down to L2TP. The
> kernel still oops because, now, l2tp_xmit_skb() is called recursively
> while holding its tunnel socket.
>
>> >> Now add one lock owner to avoid it like xmit_lock_owner of
>> >> netdev_queue. Check the lock owner before try to get the spinlock.
>> >> If the current cpu is already the owner, needn't lock again. When
>> >> PPP channel holds the spinlock at the first time, it sets owner
>> >> with current CPU ID.
>> >>
>> > I think you should forbid lock recursion entirely, and drop the packet
>> > if the owner tries to re-acquire the channel lock. Otherwise you just
>> > move the deadlock down the stack (l2tp_xmit_skb() can't be called
>> > recursively).
>>
>> The reason that fix it in ppp is that there are other layer on the ppp 
>> module.
>> We resolve it in ppp module, it could avoid all similar potential issues.
>>
> Not sure if I understand your point here.
> The xmit path of PPP and its sub-layers hasn't been designed to be
> reentrant. Allowing recursive sends thus requires reviewing the full
> path.
>
> Beyond the L2TP issue discussed above, just consider the locking
> dependencies used in PPP: ppp->wlock has to be held before
> channel->downl. Sending a packet directly on a channel will lock
> channel->downl. If this packet is routed back to the parent unit then
> ppp_xmit_process() will lock ppp->wlock, effectively leading to lock
> inversion and potential deadlock.
>
> So we have two options: adapt the whole xmit path to handle recursive
> sends or forbid recursion entirely. Unfortunately neither of these
> options looks easy to achieve:
>
>   * Making PPP xmit path reentrant will be hard and error prone because
> of all the locking dependencies. Looks like simplifying PPP's
> locking scheme will be required first.
>
>   * I can't see any way to reliably prevent settings where a packet sent
> on a given channel would be routed back to the parent unit.
>
>
> OTOH, we can try to limit the impact of recursive sends for simple
> cases:
>
>   * Following your approach, either adapt the lower layers
> (like l2tp_xmit_skb() for L2TP), or drop the packet when
> cl.owner == smp_processor_id(). This is very limited in scope and
> doesn't address issues like locking inversions. But it may let the
> system survive long enough for the PPP to time out.
>
>   * In the lower layer, check where the packet is going to be enqueued
> and drop it if it's the parent device. That should reliably handle
> simple and common cases. However, this requires updating at least
> L2TP and PPTP and to get a way to access the parent device. Also,
> it doesn't prevent recursion with stacked interfaces.
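
For illustration only, the "drop when cl.owner == smp_processor_id()"
option above could look roughly like the user-space sketch below. This is
not kernel code: struct chan_lock, chan_trylock_norecurse() and chan_xmit()
are hypothetical names, a plain bool stands in for the spinlock, and the
"cpu" argument stands in for smp_processor_id(). It only shows the owner
check; it does nothing about the lock inversion issue mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a channel lock that remembers its owner. */
#define NO_OWNER (-1)

struct chan_lock {
	bool locked;	/* stands in for the real spinlock */
	int owner;	/* id of the CPU holding the lock, or NO_OWNER */
};

static void chan_lock_init(struct chan_lock *cl)
{
	cl->locked = false;
	cl->owner = NO_OWNER;
}

/* Returns true if the lock was taken, false if 'cpu' already owns it,
 * i.e. a recursive send was detected and the caller should drop.
 */
static bool chan_trylock_norecurse(struct chan_lock *cl, int cpu)
{
	if (cl->owner == cpu)
		return false;
	/* A real spinlock would spin here; this model is single-threaded. */
	assert(!cl->locked);
	cl->locked = true;
	cl->owner = cpu;
	return true;
}

static void chan_unlock(struct chan_lock *cl, int cpu)
{
	assert(cl->locked && cl->owner == cpu);
	cl->owner = NO_OWNER;
	cl->locked = false;
}

/* Models a send that is routed back to the same channel 'loops_left'
 * times. Returns the number of packets dropped.
 */
static int chan_xmit(struct chan_lock *cl, int cpu, int loops_left)
{
	int dropped = 0;

	if (!chan_trylock_norecurse(cl, cpu))
		return 1;	/* drop instead of deadlocking */
	if (loops_left > 0)
		dropped = chan_xmit(cl, cpu, loops_left - 1);
	chan_unlock(cl, cpu);
	return dropped;
}
```

With this model, a send that loops back to its own channel is dropped at
the first re-entry and the outer send still releases the lock normally.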


[PATCH net 06/10] net/mlx5e: Use correct flow dissector key on flower offloading

2016-08-18 Thread Saeed Mahameed
From: Hadar Hen Zion 

The wrong key is used when extracting the address type field set by
the flower offload code. We have to use the control key and not the
basic key, fix that.

Fixes: e3a2b7ed018e ('net/mlx5e: Support offload cls_flower with drop action')
Signed-off-by: Hadar Hen Zion 
Signed-off-by: Saeed Mahameed 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index dc8b1cb..22cfc4a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -170,7 +170,7 @@ static int parse_cls_flower(struct mlx5e_priv *priv, struct 
mlx5_flow_spec *spec
if (dissector_uses_key(f->dissector, FLOW_DISSECTOR_KEY_CONTROL)) {
struct flow_dissector_key_control *key =
skb_flow_dissector_target(f->dissector,
- FLOW_DISSECTOR_KEY_BASIC,
+ FLOW_DISSECTOR_KEY_CONTROL,
  f->key);
addr_type = key->addr_type;
}
-- 
2.7.4



[net-next 02/15] i40e: Fix a bug where a client close can be called before an open is complete

2016-08-18 Thread Jeff Kirsher
From: Anjali Singhai Jain 

The client->open call in this path was not protected with the
client instance mutex, and hence the client->close can get initiated
before the open completes.

Change-Id: I0ed60c38868dd3f44966b6ed49a063d0e5b7edf5
Signed-off-by: Anjali Singhai Jain 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40e/i40e_client.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_client.c 
b/drivers/net/ethernet/intel/i40e/i40e_client.c
index e6b9263..8726269 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_client.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_client.c
@@ -541,6 +541,7 @@ void i40e_client_subtask(struct i40e_pf *pf)
 client->name, pf->hw.pf_id,
 pf->hw.bus.device, pf->hw.bus.func);
 
+   mutex_lock(&i40e_client_instance_mutex);
/* Send an Open request to the client */
atomic_inc(&cdev->ref_cnt);
if (client->ops && client->ops->open)
@@ -554,6 +555,7 @@ void i40e_client_subtask(struct i40e_pf *pf)
atomic_dec(&cdev->ref_cnt);
continue;
}
+   mutex_unlock(&i40e_client_instance_mutex);
}
mutex_unlock(&i40e_client_mutex);
 }
-- 
2.7.4



Re: [PATCH net-next] xen-netback: create a debugfs node for hash information

2016-08-18 Thread Wei Liu
On Wed, Aug 17, 2016 at 04:13:29PM +0100, Paul Durrant wrote:
> It is useful to be able to see the hash configuration when running tests.
> This patch adds a debugfs node for that purpose.
> 
> Signed-off-by: Paul Durrant 

Acked-by: Wei Liu 


Re: [PATCH 1/1] ppp: Fix one deadlock issue of PPP when send frame

2016-08-18 Thread Feng Gao
Hi Philp,

Yes, I agree with you.
Simply dropping the packet is better than supporting a recursive lock.

I will send a new patch later.

Regards
Feng


On Fri, Aug 19, 2016 at 12:48 AM, Philp Prindeville
 wrote:
>
>
> On 08/18/2016 09:05 AM, Feng Gao wrote:
>>
>> On Thu, Aug 18, 2016 at 10:11 PM, Philp Prindeville
>>   wrote:
>>>
>>> >Feng,
>>> >
>>> >If the CPU can already be holding the lock, that implies re-entrancy.
>>> >What's to stop the first flow of code which acquired the lock from
>>> > releasing
>>> >it again before the 2nd flow is done?  Is the 2nd flow running at a
>>> > higher
>>> >priority or with interrupts disabled?
>>
>> No preemption happened. It is caused by a wrong route policy in
>> l2tp.
>> For example, cpu0 takes the spinlock of channel1, then channel1
>> is selected again after routing. As a result, cpu0 tries to take the
>> same spinlock again.
>>
>> The call flow is like this.
>> ppp_write->ppp_channel_push->start_xmit->select inappropriate route
>>  -> dev_hard_start_xmit->ppp_start_xmit->ppp_xmit_process->
>> ppp_push. Now ppp_push tries to get the same spinlock which is held
>> in ppp_channel_push.
>>
>> Regards
>> Feng
>>
>
> If we're detecting (through the fact that the lock has already been
> acquired) that the wrong route is being applied, why don't we just punt the
> packet instead?
>
> -Philip
>


Re: [PATCH] mwifiex: propagate error if IRQ request fails in mwifiex_sdio_of()

2016-08-18 Thread Javier Martinez Canillas
Hello Arend,

Thanks a lot for your feedback.

On 08/18/2016 03:14 PM, Arend van Spriel wrote:
> On 18-08-16 16:17, Javier Martinez Canillas wrote:
>> If request_irq() fails in mwifiex_sdio_probe_of(), only an error message
>> is printed but the actual error is not propagated to the caller function.
> 
> Hmm. The caller function, ie. mwifiex_sdio_probe(), does not seem to care.
>

Hmm, I'm not so sure about that. It's checking the mwifiex_sdio_probe_of()
return value.

If the IRQ request failing is not an error, then at the very least the call
to disable_irq() should be avoided if request_irq() fails, and the message
should be changed from dev_err() to dev_dbg() or dev_info().
 
> The device may still function without this wake interrupt.
>

That's correct, the binding says that the "interrupts" property in the child
node is optional since it is just a wakeup IRQ. Now the question is whether it
should be an error if the IRQ is defined but fails to be requested.

> Regards,
> Arend
> 

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America


[PATCH net v2] sctp: linearize early if it's not GSO

2016-08-18 Thread Marcelo Ricardo Leitner
Because otherwise when crc computation is still needed it's way more
expensive than on a linear buffer to the point that it affects
performance.

It's so expensive that netperf test gives a perf output as below:

Overhead  Command    Shared Object     Symbol
  18,62%  netserver  [kernel.vmlinux]  [k] crc32_generic_shift
   2,57%  netserver  [kernel.vmlinux]  [k] __pskb_pull_tail
   1,94%  netserver  [kernel.vmlinux]  [k] fib_table_lookup
   1,90%  netserver  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
   1,66%  swapper    [kernel.vmlinux]  [k] intel_idle
   1,63%  netserver  [kernel.vmlinux]  [k] _raw_spin_lock
   1,59%  netserver  [sctp]            [k] sctp_packet_transmit
   1,55%  netserver  [kernel.vmlinux]  [k] memcpy_erms
   1,42%  netserver  [sctp]            [k] sctp_rcv

# netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  12000    10.00      3016.42   2.88     3.78     1.874   2.462

After patch:
Overhead  Command Shared Object  Symbol
   2,75%  netserver   [kernel.vmlinux]   [k] memcpy_erms
   2,63%  netserver   [kernel.vmlinux]   [k] copy_user_enhanced_fast_string
   2,39%  netserver   [kernel.vmlinux]   [k] fib_table_lookup
   2,04%  netserver   [kernel.vmlinux]   [k] __pskb_pull_tail
   1,91%  netserver   [kernel.vmlinux]   [k] _raw_spin_lock
   1,91%  netserver   [sctp] [k] sctp_packet_transmit
   1,72%  netserver   [mlx4_en]  [k] mlx4_en_process_rx_cq
   1,68%  netserver   [sctp] [k] sctp_rcv

# netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  12000    10.00      3681.77   3.83     3.46     2.045   1.849

Fixes: 3acb50c18d8d ("sctp: delay as much as possible skb_linearize")
Signed-off-by: Marcelo Ricardo Leitner 
---
v1->v2:
 - Fixed changelog referencing other kernel, spotted by Daniel Borkmann.

Please consider this for stable, as v4.7 is affected. Thanks!

 net/sctp/input.c   | 11 +++
 net/sctp/inqueue.c | 13 -
 2 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/net/sctp/input.c b/net/sctp/input.c
index 
c182db7d691ff44a52923fb36c9170e49c141c04..69444d32ecda6cd1a4924911172feba89c5ae976
 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -119,7 +119,13 @@ int sctp_rcv(struct sk_buff *skb)
   skb_transport_offset(skb))
goto discard_it;
 
-   if (!pskb_may_pull(skb, sizeof(struct sctphdr)))
+   /* If the packet is fragmented and we need to do crc checking,
+* it's better to just linearize it otherwise crc computing
+* takes longer.
+*/
+   if ((!(skb_shinfo(skb)->gso_type & SKB_GSO_SCTP) &&
+skb_linearize(skb)) ||
+   !pskb_may_pull(skb, sizeof(struct sctphdr)))
goto discard_it;
 
/* Pull up the IP header. */
@@ -1177,9 +1183,6 @@ static struct sctp_association 
*__sctp_rcv_lookup_harder(struct net *net,
if ((skb_shinfo(skb)->gso_type & SKB_GSO_SCTP) == SKB_GSO_SCTP)
return NULL;
 
-   if (skb_linearize(skb))
-   return NULL;
-
ch = (sctp_chunkhdr_t *) skb->data;
 
/* The code below will attempt to walk the chunk and extract
diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c
index 
c30ddb0f31907f57c5ce85b00dbe04260ca1cb2e..6437aa97cfd79f14c633499c2b131389204c435b
 100644
--- a/net/sctp/inqueue.c
+++ b/net/sctp/inqueue.c
@@ -170,19 +170,6 @@ next_chunk:
 
chunk = list_entry(entry, struct sctp_chunk, list);
 
-   /* Linearize if it's not GSO */
-   if ((skb_shinfo(chunk->skb)->gso_type & SKB_GSO_SCTP) != 
SKB_GSO_SCTP &&
-   skb_is_nonlinear(chunk->skb)) {
-   if (skb_linearize(chunk->skb)) {
-   __SCTP_INC_STATS(dev_net(chunk->skb->dev), 
SCTP_MIB_IN_PKT_DISCARDS);
-   sctp_chunk_free(chunk);
-   goto next_chunk;
-   }
-
-   /* Update sctp_hdr as it probably changed */
-   

Re: [PATCH v1 1/1] pppoe: l2tp: the PPPOX_CONNECTED should be used with bit operation

2016-08-18 Thread Feng Gao
inline.

On Fri, Aug 19, 2016 at 1:44 AM, Guillaume Nault  wrote:
> On Thu, Aug 18, 2016 at 09:59:03AM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> There are some codes in pppoe and l2tp which use the PPPOX_CONNECTED
>> as the value including assignment and condition check.
>> They should keep consistent with other codes.
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v1: Initial Patch
>>
>>  drivers/net/ppp/pppoe.c | 2 +-
>>  net/l2tp/l2tp_ppp.c | 4 ++--
>>  2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
>> index 4ddae81..684b773 100644
>> --- a/drivers/net/ppp/pppoe.c
>> +++ b/drivers/net/ppp/pppoe.c
>> @@ -697,7 +697,7 @@ static int pppoe_connect(struct socket *sock, struct 
>> sockaddr *uservaddr,
>>   goto err_put;
>>   }
>>
>> - sk->sk_state = PPPOX_CONNECTED;
>> + sk->sk_state |= PPPOX_CONNECTED;
>>
> Using plain assignment makes it clear for the reader that other flags
> are unset. I see no reason for changing this.

I see, which is why I did not modify the PPPOX_DEAD assignment.
But I am afraid there may be cases where the PPPOX_BOUND flag is set
before PPPOX_CONNECTED. Then the plain assignment of PPPOX_CONNECTED
would clear the PPPOX_BOUND flag.

>
>> diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
>> index d9560aa..3984385 100644
>> --- a/net/l2tp/l2tp_ppp.c
>> +++ b/net/l2tp/l2tp_ppp.c
>> @@ -774,7 +774,7 @@ static int pppol2tp_connect(struct socket *sock, struct 
>> sockaddr *uservaddr,
>>  out_no_ppp:
>>   /* This is how we get the session context from the socket. */
>>   sk->sk_user_data = session;
>> - sk->sk_state = PPPOX_CONNECTED;
>> + sk->sk_state |= PPPOX_CONNECTED;
>>
> Same here.
>
>> @@ -856,7 +856,7 @@ static int pppol2tp_getname(struct socket *sock, struct 
>> sockaddr *uaddr,
>>   error = -ENOTCONN;
>>   if (sk == NULL)
>>   goto end;
>> - if (sk->sk_state != PPPOX_CONNECTED)
>> + if (!(sk->sk_state & PPPOX_CONNECTED))
>>
> Looks like it was a bug. This one is worth a separate patch.

OK, I will send a separate patch for this bug.

Regards
Feng
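
As a small illustration of why the check in pppol2tp_getname() matters,
here is a stand-alone sketch comparing the value test with the bit test.
The enum values are assumed to mirror the PPPOX_* state flags (I am
quoting them from memory, so treat the numbers as an assumption), and the
two helper functions are hypothetical names, not kernel functions.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's PPPOX_* state flags. */
enum {
	PPPOX_NONE	= 0,
	PPPOX_CONNECTED	= 1,
	PPPOX_BOUND	= 2,
};

/* Old style: only true when CONNECTED is the sole flag set. */
static bool connected_by_value(unsigned int state)
{
	return state == PPPOX_CONNECTED;
}

/* New style: true whenever the CONNECTED bit is set, whatever else is. */
static bool connected_by_bit(unsigned int state)
{
	return (state & PPPOX_CONNECTED) != 0;
}
```

For a socket whose state is PPPOX_CONNECTED | PPPOX_BOUND, the value test
reports "not connected" while the bit test reports "connected", which is
the case the getname fix addresses.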


Re: [PATCH] [v7] net: emac: emac gigabit ethernet controller driver

2016-08-18 Thread Florian Fainelli
On August 17, 2016 9:19:23 PM PDT, Timur Tabi  wrote:
>Florian Fainelli wrote:
>> The larger issue is that the emac_sgmii node in the form you posted
>> is going to be backed by a platform device in Linux while you want a
>> PHY device with a reg property that describes a MDIO address
>> (#address-cells = 1, #size-cells = 0).
>
>But how do I get the platform device for the emac_sgmii node?  If I 
>create an of_device_id ID for it, like this:
>
>static const struct of_device_id emac_dt_match[] = {
>   {
>   .compatible = "qcom,fsm9900-emac",
>   },
>   {}
>};
>
>static const struct of_device_id emac_sgmii_dt_match[] = {
>   {
>   .compatible = "qcom,fsm9900-emac-sgmii",
>   },
>   {}
>};
>
>Then the probe function will be called for qcom,fsm9900-emac-sgmii 
>separately from qcom,fsm9900-emac, which just confuses things. So I 
>can't create emac_sgmii_dt_match.
>
>I know this is standard DT stuff, and I used to do a lot of work on DT 
>so maybe I should know this already.  But it seems to me that I need to
>
>manually create the platform_device for qcom,fsm9900-emac-sgmii.

There is a helper function to obtain the platform device associated with a 
device_node: of_find_device_by_node. AFAICT probe ordering is based on DT 
ordering, but if there was a dependency order to solve you could use 
EPROBE_DEFER until the emac driver finds its dependencies.

>
>> IIRC the amd xgbe driver mainline had a similar design but still
>> implemented a PHY device anyway although it may not have been using
>> Device Tree. It should still be possible to implement a PHY driver
>> that you manually register and bind to its device_node pointer such
>> that of_phy_find_device and friends still work. You would do this
>> from the emac_sgmii platform device driver and parent devices in a
>> way that satisfy the PHY device driver lifecycle as well.
>>
>> Hope this helps.
>
>It doesn't, sorry.  The emac_sgmii is really just another register
>block 
>for the driver to program.  Creating another PHY driver for it doesn't 
>really make sense.  It's not on an MDIO bus.

First of all, a PHY does not have to be on an MDIO bus, it just happens to be the 
most common thing, and second, this is a PHY of some kind as far as the 
Ethernet MAC is concerned and this is the abstraction you want at the MAC 
driver level so you can plug arbitrary external PHYs or internal PHYs 
irrespective of their backing transport/bus layer.

As long as your DT representation is sane and accurate, however this ends up 
being implemented matters less. Whether the PHY device model suits you here, 
you need to tell the emac0 node that it is bound to the emac_sgmii node somehow 
since there is programming of this block required for operating and for that a 
phandle property of some kind is required, phy-handle is a standard one which 
would help with the driver acceptance IMHO.


-- 
Florian


Re: [PATCH v1 1/1] l2tp: Use existing macros instead of literal number

2016-08-18 Thread Feng Gao
On Fri, Aug 19, 2016 at 2:41 AM, Guillaume Nault  wrote:
> On Thu, Aug 18, 2016 at 03:05:19PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> 1. Use PPP_ALLSTATIONS/PPP_UI instead of literal 0xff/0x03;
>> 2. Use one static const global fixed_ppphdr instead of two same
>> static variable ppph in two different functions;
>> 3. Use SEND_SHUTDOWN instead of literal 2;
>>
>> Signed-off-by: Gao Feng 
>> ---
>>  v1: Initial patch
> No need to send 'v1' for the initial series.

OK, I get it.

>
>> --- a/net/l2tp/l2tp_ppp.c
>> +++ b/net/l2tp/l2tp_ppp.c
>> @@ -138,6 +138,8 @@ static const struct ppp_channel_ops pppol2tp_chan_ops = {
>>
>>  static const struct proto_ops pppol2tp_ops;
>>
>> +static const unsigned char fixed_ppphdr[2] = {PPP_ALLSTATIONS, PPP_UI};
>> +
>>  /* Helpers to obtain tunnel/session contexts from sockets.
>>   */
>>  static inline struct l2tp_session *pppol2tp_sock_to_session(struct sock *sk)
>> @@ -174,11 +176,11 @@ static int pppol2tp_recv_payload_hook(struct sk_buff 
>> *skb)
>>* Note that skb->data[] isn't dereferenced from a u16 ptr here since
>>* the field may be unaligned.
>>*/
>> - if (!pskb_may_pull(skb, 2))
>> + if (!pskb_may_pull(skb, sizeof(fixed_ppphdr)))
>>   return 1;
>>
>> - if ((skb->data[0] == 0xff) && (skb->data[1] == 0x03))
>> - skb_pull(skb, 2);
>> + if ((PPP_ADDRESS(skb->data) == PPP_ALLSTATIONS) && 
>> (PPP_CONTROL(skb->data) == PPP_UI))
>> + skb_pull(skb, sizeof(fixed_ppphdr));
>>
> Sorry, but I find the original code clearer. It's important to be
> explicit about what's done with the sk_buff. Hiding skb->data[x] behind
> macros certainly doesn't help.
>
> Same thing for the use of sizeof(fixed_ppphdr) in pskb_may_pull(). The
> size of fixed_ppphdr isn't used afterwards, so it's unclear why its size
> was pulled. 2 was not a magic number here, it was directly related with
> the operations done on the skb (i.e. accessing skb->data[0] and
> skb->data[1]). So pskb_may_pull(skb, 2) makes perfect sense.

I agree now, thanks to the detailed explanation.
Thanks.

>
> OTOH, replacing 0xff and 0x03 with PPP_ALLSTATIONS and PPP_UI is fine.

OK. get it.

>
>> @@ -312,7 +313,7 @@ static int pppol2tp_sendmsg(struct socket *sock, struct 
>> msghdr *m,
>>   error = -ENOMEM;
>>   skb = sock_wmalloc(sk, NET_SKB_PAD + sizeof(struct iphdr) +
>>  uhlen + session->hdr_len +
>> -sizeof(ppph) + total_len,
>> +sizeof(fixed_ppphdr) + total_len,
>>  0, GFP_KERNEL);
>>   if (!skb)
>>   goto error_put_sess_tun;
>> @@ -325,9 +326,9 @@ static int pppol2tp_sendmsg(struct socket *sock, struct 
>> msghdr *m,
>>   skb_reserve(skb, uhlen);
>>
>>   /* Add PPP header */
>> - skb->data[0] = ppph[0];
>> - skb->data[1] = ppph[1];
>> - skb_put(skb, 2);
>> + PPP_ADDRESS(skb->data) = fixed_ppphdr[0];
>> + PPP_CONTROL(skb->data) = fixed_ppphdr[1];
>> + skb_put(skb, sizeof(fixed_ppphdr));
>>
> Same here. What about
> + skb->data[0] = PPP_ALLSTATIONS;
> + skb->data[1] = PPP_UI;
> + skb_put(skb, 2);
> and removing ppph entirely?

Agree with you.

>
>>   /* Copy user data into skb */
>>   error = memcpy_from_msg(skb_put(skb, total_len), m, total_len);
>> @@ -369,7 +370,6 @@ error:
>>   */
>>  static int pppol2tp_xmit(struct ppp_channel *chan, struct sk_buff *skb)
>>  {
>> - static const u8 ppph[2] = { 0xff, 0x03 };
>>   struct sock *sk = (struct sock *) chan->private;
>>   struct sock *sk_tun;
>>   struct l2tp_session *session;
>> @@ -398,14 +398,14 @@ static int pppol2tp_xmit(struct ppp_channel *chan, 
>> struct sk_buff *skb)
>>  sizeof(struct iphdr) + /* IP header */
>>  uhlen +  /* UDP header (if L2TP_ENCAPTYPE_UDP) 
>> */
>>  session->hdr_len +   /* L2TP header */
>> -sizeof(ppph);/* PPP header */
>> +sizeof(fixed_ppphdr); /* PPP header */
>>   if (skb_cow_head(skb, headroom))
>>   goto abort_put_sess_tun;
>>
>>   /* Setup PPP header */
>> - __skb_push(skb, sizeof(ppph));
>> - skb->data[0] = ppph[0];
>> - skb->data[1] = ppph[1];
>> + __skb_push(skb, sizeof(fixed_ppphdr));
>> + skb->data[0] = fixed_ppphdr[0];
>> + skb->data[1] = fixed_ppphdr[1];
>>
> Same as for pppol2tp_sendmsg().
>
>> @@ -440,7 +440,7 @@ static void pppol2tp_session_close(struct l2tp_session 
>> *session)
>>   BUG_ON(session->magic != L2TP_SESSION_MAGIC);
>>
>>   if (sock) {
>> - inet_shutdown(sock, 2);
>> + inet_shutdown(sock, SEND_SHUTDOWN);
>>
> Ok.

Regards
Feng
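
To make the outcome of the review concrete, here is a user-space sketch
of the style Guillaume suggests: named constants for the two header
bytes, but explicit buf[0]/buf[1] accesses and a literal 2 for the
length. It works on a plain byte buffer instead of an sk_buff, and
ppp_hdr_skip() and ppp_hdr_put() are hypothetical helper names, not
functions from l2tp_ppp.c.

```c
#include <stdbool.h>
#include <stddef.h>

#define PPP_ALLSTATIONS	0xff	/* All-Stations broadcast address */
#define PPP_UI		0x03	/* Unnumbered Information */

/* Returns how many bytes to skip at the start of buf: 2 if it begins
 * with the PPP address/control bytes, 0 otherwise.
 */
static size_t ppp_hdr_skip(const unsigned char *buf, size_t len)
{
	if (len < 2)
		return 0;
	if (buf[0] == PPP_ALLSTATIONS && buf[1] == PPP_UI)
		return 2;
	return 0;
}

/* Writes the 2-byte PPP header; returns false if there is no room. */
static bool ppp_hdr_put(unsigned char *buf, size_t len)
{
	if (len < 2)
		return false;
	buf[0] = PPP_ALLSTATIONS;
	buf[1] = PPP_UI;
	return true;
}
```

Keeping the literal 2 next to the buf[0]/buf[1] accesses makes it obvious
which bytes the length covers, which is the point made in the review.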


[PATCH net-next 3/5] net: dsa: bcm_sf2: Make it a real platform device driver

2016-08-18 Thread Florian Fainelli
The Broadcom Starfighter 2 switch driver should be a proper platform
driver, now that the DSA code has been updated to allow that, register a
switch device, feed it with the proper configuration data coming from
Device Tree and register our switch device with DSA.

The bulk of the changes consist in moving what bcm_sf2_sw_setup() did
into the platform driver probe function.

Signed-off-by: Florian Fainelli 
---
 drivers/net/dsa/bcm_sf2.c | 178 ++
 net/dsa/dsa.c |   1 -
 2 files changed, 118 insertions(+), 61 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index b2b838724a9b..fe1cc92f72a8 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1571,23 +1571,84 @@ static int bcm_sf2_sw_vlan_dump(struct dsa_switch *ds, 
int port,
 
 static int bcm_sf2_sw_setup(struct dsa_switch *ds)
 {
-   const char *reg_names[BCM_SF2_REGS_NUM] = BCM_SF2_REGS_NAME;
struct bcm_sf2_priv *priv = ds_to_priv(ds);
-   struct device_node *dn;
-   void __iomem **base;
unsigned int port;
+
+   /* Enable all valid ports and disable those unused */
+   for (port = 0; port < priv->hw_params.num_ports; port++) {
+   /* IMP port receives special treatment */
+   if ((1 << port) & ds->enabled_port_mask)
+   bcm_sf2_port_setup(ds, port, NULL);
+   else if (dsa_is_cpu_port(ds, port))
+   bcm_sf2_imp_setup(ds, port);
+   else
+   bcm_sf2_port_disable(ds, port, NULL);
+   }
+
+   bcm_sf2_sw_configure_vlan(ds);
+
+   return 0;
+}
+
+static struct dsa_switch_driver bcm_sf2_switch_driver = {
+   .tag_protocol   = DSA_TAG_PROTO_BRCM,
+   .probe  = bcm_sf2_sw_drv_probe,
+   .setup  = bcm_sf2_sw_setup,
+   .set_addr   = bcm_sf2_sw_set_addr,
+   .get_phy_flags  = bcm_sf2_sw_get_phy_flags,
+   .get_strings= bcm_sf2_sw_get_strings,
+   .get_ethtool_stats  = bcm_sf2_sw_get_ethtool_stats,
+   .get_sset_count = bcm_sf2_sw_get_sset_count,
+   .adjust_link= bcm_sf2_sw_adjust_link,
+   .fixed_link_update  = bcm_sf2_sw_fixed_link_update,
+   .suspend= bcm_sf2_sw_suspend,
+   .resume = bcm_sf2_sw_resume,
+   .get_wol= bcm_sf2_sw_get_wol,
+   .set_wol= bcm_sf2_sw_set_wol,
+   .port_enable= bcm_sf2_port_setup,
+   .port_disable   = bcm_sf2_port_disable,
+   .get_eee= bcm_sf2_sw_get_eee,
+   .set_eee= bcm_sf2_sw_set_eee,
+   .port_bridge_join   = bcm_sf2_sw_br_join,
+   .port_bridge_leave  = bcm_sf2_sw_br_leave,
+   .port_stp_state_set = bcm_sf2_sw_br_set_stp_state,
+   .port_fdb_prepare   = bcm_sf2_sw_fdb_prepare,
+   .port_fdb_add   = bcm_sf2_sw_fdb_add,
+   .port_fdb_del   = bcm_sf2_sw_fdb_del,
+   .port_fdb_dump  = bcm_sf2_sw_fdb_dump,
+   .port_vlan_filtering= bcm_sf2_sw_vlan_filtering,
+   .port_vlan_prepare  = bcm_sf2_sw_vlan_prepare,
+   .port_vlan_add  = bcm_sf2_sw_vlan_add,
+   .port_vlan_del  = bcm_sf2_sw_vlan_del,
+   .port_vlan_dump = bcm_sf2_sw_vlan_dump,
+};
+
+static int bcm_sf2_sw_probe(struct platform_device *pdev)
+{
+   const char *reg_names[BCM_SF2_REGS_NUM] = BCM_SF2_REGS_NAME;
+   struct device_node *dn = pdev->dev.of_node;
+   struct bcm_sf2_priv *priv;
+   struct dsa_switch *ds;
+   void __iomem **base;
unsigned int i;
u32 reg, rev;
int ret;
 
+   ds = devm_kzalloc(&pdev->dev, sizeof(*ds) + sizeof(*priv), GFP_KERNEL);
+   if (!ds)
+   return -ENOMEM;
+
+   priv = (struct bcm_sf2_priv *)(ds + 1);
+   ds->priv = priv;
+   ds->dev = &pdev->dev;
+   ds->drv = &bcm_sf2_switch_driver;
+
+   dev_set_drvdata(&pdev->dev, ds);
+
spin_lock_init(&priv->indir_lock);
mutex_init(&priv->stats_mutex);
 
-   /* All the interesting properties are at the parent device_node
-* level
-*/
-   dn = ds->cd->of_node->parent;
-   bcm_sf2_identify_ports(priv, ds->cd->of_node);
+   bcm_sf2_identify_ports(priv, dn->child);
 
priv->irq0 = irq_of_parse_and_map(dn, 0);
priv->irq1 = irq_of_parse_and_map(dn, 1);
@@ -1649,19 +1710,6 @@ static int bcm_sf2_sw_setup(struct dsa_switch *ds)
 &priv->hw_params.num_gphy))
priv->hw_params.num_gphy = 1;
 
-   /* Enable all valid ports and disable those unused */
-   for (port = 0; port < priv->hw_params.num_ports; port++) {
-   /* IMP port receives special treatment */
-   if ((1 << port) & ds->enabled_port_mask)
-   bcm_sf2_port_setup(ds, port, 

Re: [PATCH net-next 0/3] Fix mv88e6xxx wait function

2016-08-18 Thread Vivien Didelot
Andrew Lunn  writes:

> The mv88e6xxx wait function can be upset if the system has lots of
> other things to do and a sleep takes a lot longer than expected. Fix
> this by using a fixed number of iterations, rather than a fixed
> wall-clock time.
>
> With that change made, it is possible to consolidate another
> wait function.
>
> A wait actually timing out should not happen and when it does, it
> means something serious is wrong. Make sure an error is logged,
> since not all callers will log an error.
>
> Andrew Lunn (3):
>   dsa: mv88e6xxx: Timeout based on iterations, not time
>   dsa: mv88e6xxx: Use mv88e6xx_wait in mv88e6xxx_update()
>   dsa: mv88e6xxx: Make mv88e6xxx_wait() timeout verbose
>
>  drivers/net/dsa/mv88e6xxx/chip.c | 32 +++-
>  1 file changed, 11 insertions(+), 21 deletions(-)

Reviewed-by: Vivien Didelot 

David, this series supersedes http://patchwork.ozlabs.org/patch/660270/.

Thanks,

Vivien
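
For readers following along, the idea of bounding the wait by iterations
instead of elapsed time can be sketched in user space as below. This is
not the mv88e6xxx code: wait_iterations(), ready_fn and ready_after() are
hypothetical names, and the sleep between polls is left as a comment.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callback: returns true once the busy bit has cleared. */
typedef bool (*ready_fn)(void *ctx);

/* Poll at most max_iter times. Returns 0 on success, -ETIMEDOUT if the
 * condition never became true. A long sleep between polls cannot cause
 * a false timeout, because only the poll count is bounded.
 */
static int wait_iterations(ready_fn ready, void *ctx, int max_iter)
{
	int i;

	for (i = 0; i < max_iter; i++) {
		if (ready(ctx))
			return 0;
		/* a real driver would sleep here between polls */
	}
	return -ETIMEDOUT;
}

/* Example condition: becomes ready after *ctx unsuccessful polls. */
static bool ready_after(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With a time-based deadline, one oversleeping poll can exhaust the budget
before the hardware has been checked a second time; with a poll count the
hardware always gets max_iter chances.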


[PATCH] ibmvnic: Handle backing device failover and reinitialization

2016-08-18 Thread Thomas Falcon
An upcoming feature of IBM VNIC protocol is the ability to configure
redundant backing devices for a VNIC client. In case of a failure
on the current backing device, the driver will receive a signal
from the hypervisor indicating that a failover will occur. The driver
will then wait for a message from the backing device before 
establishing a new connection.

Signed-off-by: Thomas Falcon 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 34 --
 drivers/net/ethernet/ibm/ibmvnic.h |  2 ++
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 88f3c85..b942108 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -203,7 +203,8 @@ static void free_long_term_buff(struct ibmvnic_adapter 
*adapter,
struct device *dev = &adapter->vdev->dev;
 
dma_free_coherent(dev, ltb->size, ltb->buff, ltb->addr);
-   send_request_unmap(adapter, ltb->map_id);
+   if (!adapter->failover)
+   send_request_unmap(adapter, ltb->map_id);
 }
 
 static int alloc_rx_pool(struct ibmvnic_adapter *adapter,
@@ -522,7 +523,8 @@ static int ibmvnic_close(struct net_device *netdev)
for (i = 0; i < adapter->req_rx_queues; i++)
napi_disable(&adapter->napi[i]);
 
-   netif_tx_stop_all_queues(netdev);
+   if (!adapter->failover)
+   netif_tx_stop_all_queues(netdev);
 
if (adapter->bounce_buffer) {
if (!dma_mapping_error(dev, adapter->bounce_buffer_dma)) {
@@ -3280,6 +3282,10 @@ static void ibmvnic_handle_crq(union ibmvnic_crq *crq,
rc = ibmvnic_send_crq_init(adapter);
if (rc)
dev_err(dev, "Error sending init rc=%ld\n", rc);
+   } else if (gen_crq->cmd == IBMVNIC_DEVICE_FAILOVER) {
+   dev_info(dev, "Backing device failover detected\n");
+   netif_carrier_off(netdev);
+   adapter->failover = true;
} else {
/* The adapter lost the connection */
dev_err(dev, "Virtual Adapter failed (rc=%d)\n",
@@ -3615,8 +3621,18 @@ static void handle_crq_init_rsp(struct work_struct *work)
struct device *dev = &adapter->vdev->dev;
struct net_device *netdev = adapter->netdev;
unsigned long timeout = msecs_to_jiffies(3);
+   bool restart = false;
int rc;
 
+   if (adapter->failover) {
+   release_sub_crqs(adapter);
+   if (netif_running(netdev)) {
+   netif_tx_disable(netdev);
+   ibmvnic_close(netdev);
+   restart = true;
+   }
+   }
+
send_version_xchg(adapter);
reinit_completion(&adapter->init_done);
if (!wait_for_completion_timeout(&adapter->init_done, timeout)) {
@@ -3645,6 +3661,17 @@ static void handle_crq_init_rsp(struct work_struct *work)
 
netdev->real_num_tx_queues = adapter->req_tx_queues;
 
+   if (adapter->failover) {
+   adapter->failover = false;
+   if (restart) {
+   rc = ibmvnic_open(netdev);
+   if (rc)
+   goto restart_failed;
+   }
+   netif_carrier_on(netdev);
+   return;
+   }
+
rc = register_netdev(netdev);
if (rc) {
dev_err(dev,
@@ -3655,6 +3682,8 @@ static void handle_crq_init_rsp(struct work_struct *work)
 
return;
 
+restart_failed:
+   dev_err(dev, "Failed to restart ibmvnic, rc=%d\n", rc);
 register_failed:
release_sub_crqs(adapter);
 task_failed:
@@ -3692,6 +3721,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const 
struct vio_device_id *id)
dev_set_drvdata(&dev->dev, netdev);
adapter->vdev = dev;
adapter->netdev = netdev;
+   adapter->failover = false;
 
ether_addr_copy(adapter->mac_addr, mac_addr_p);
ether_addr_copy(netdev->dev_addr, adapter->mac_addr);
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h 
b/drivers/net/ethernet/ibm/ibmvnic.h
index e82898f..bfc84c7 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -830,6 +830,7 @@ enum ibmvfc_crq_format {
IBMVNIC_CRQ_INIT = 0x01,
IBMVNIC_CRQ_INIT_COMPLETE= 0x02,
IBMVNIC_PARTITION_MIGRATED   = 0x06,
+   IBMVNIC_DEVICE_FAILOVER  = 0x08,
 };
 
 struct ibmvnic_crq_queue {
@@ -1047,4 +1048,5 @@ struct ibmvnic_adapter {
u8 map_id;
 
struct work_struct vnic_crq_init;
+   bool failover;
 };
-- 
1.8.3.1



Re: [RFC 1/3] tcp: randomize tcp timestamp offsets for each connection

2016-08-18 Thread Eric Dumazet
On Thu, 2016-08-18 at 14:48 +0200, Florian Westphal wrote:
> commit ceaa1fef65a7c2e ("tcp: adding a per-socket timestamp offset")
> added the main infrastructure that is needed for per-connection
> randomization, in particular writing/reading the on-wire tcp header
> format takes the offset into account so rest of stack can use normal
> tcp_time_stamp (jiffies).
> 
> So only two items are left:
>  - add a tsoffset for request sockets
>  - extend the tcp isn generator to also return another 32bit number
>  in addition to the ISN.
> 
> Re-use of ISN generator also means timestamps are still monotonically
> increasing for same connection quadruple.

I like the idea, but the implementation looks a bit complex.

Instead of initializing tsoffset to 0, we could simply use

jhash(src_addr, dst_addr, boot_time_rnd)

This way, even syncookies would be handled, and we do not need to
increase tcp_request_sock size.
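
A rough user-space sketch of the suggestion above: derive the offset from
the addresses and a boot-time secret, so it can be recomputed at any time
without storing it in the request sock. Note that mix32() below is a
generic 32-bit mixer I am using as a stand-in; it is NOT the kernel's
jhash, and ts_offset() and ts_on_wire() are hypothetical names.

```c
#include <stdint.h>

/* Stand-in mixer (not jhash, not cryptographically strong). */
static uint32_t mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* Offset depends only on the address pair and the boot-time secret,
 * so no per-connection state is needed to recompute it.
 */
static uint32_t ts_offset(uint32_t saddr, uint32_t daddr,
			  uint32_t boot_secret)
{
	uint32_t h = mix32(saddr ^ boot_secret);

	return mix32(h ^ daddr);
}

/* Timestamp as it would appear on the wire: local clock plus offset. */
static uint32_t ts_on_wire(uint32_t clock, uint32_t offset)
{
	return clock + offset;
}
```

Because the offset is constant for a given address pair and boot, the
on-wire timestamps keep increasing with the local clock, and the same
value can be recomputed when a syncookie ACK comes back.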







[RFC PATCH] net/irda: release sock when waiting in accept()

2016-08-18 Thread Vegard Nossum
I've been seeing these recently:

INFO: task trinity-c3:14933 blocked for more than 120 seconds.
  Not tainted 4.8.0-rc1+ #135
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
trinity-c3  D 88010c16fc88 0 14933  1 0x00080004
 88010c16fc88 3b9aca00  0296
 776cdf88 88011a520ae0 88011a520b08 88011a520198
 867d7f00 88011942c080 880116841580 88010c168000
Call Trace:
 [] schedule+0x77/0x230
 [] __lock_sock+0x129/0x250
 [] ? __sk_destruct+0x450/0x450
 [] ? wake_bit_function+0x2e0/0x2e0
 [] lock_sock_nested+0xeb/0x120
 [] irda_setsockopt+0x65/0xb40
 [] SyS_setsockopt+0x139/0x230
 [] ? SyS_recv+0x20/0x20
 [] ? trace_event_raw_event_sys_enter+0xb90/0xb90
 [] ? __this_cpu_preempt_check+0x13/0x20
 [] ? __context_tracking_exit.part.3+0x30/0x1b0
 [] ? SyS_recv+0x20/0x20
 [] do_syscall_64+0x1b3/0x4b0
 [] entry_SYSCALL64_slow_path+0x25/0x25

Showing all locks held in the system:
2 locks held by khungtaskd/563:
 #0:  (rcu_read_lock){..}, at: [] watchdog+0x106/0x910
 #1:  (tasklist_lock){..}, at: [] 
debug_show_all_locks+0x74/0x360
1 lock held by trinity-c0/19280:
 #0:  (sk_lock-AF_IRDA){..}, at: [] 
irda_accept+0x176/0x10f0
1 lock held by trinity-c0/12865:
 #0:  (sk_lock-AF_IRDA){..}, at: [] 
irda_accept+0x176/0x10f0

The problem seems to be that irda_accept() goes to sleep after locking
the socket, which means that others trying to get the lock will be
"blocked for more than 120 seconds" like above.

There are unfortunately other places in the irda code that seem to be
doing the same thing: irda_connect(), irda_sendmsg(), and
irda_getsockopt() as far as I can tell at a glance. I'll start with
this patch to see if we're going in the right direction -- it does fix
the trinity problem for me, although I haven't tested any real IrDA
workloads.

Signed-off-by: Vegard Nossum 
---
 net/irda/af_irda.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 8d2f7c9..334836a 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -871,6 +871,8 @@ static int irda_accept(struct socket *sock, struct socket 
*newsock, int flags)
 * Jean II
 */
while (1) {
+   DEFINE_WAIT(wait);
+
skb = skb_dequeue(&sk->sk_receive_queue);
if (skb)
break;
@@ -880,10 +882,17 @@ static int irda_accept(struct socket *sock, struct socket 
*newsock, int flags)
if (flags & O_NONBLOCK)
goto out;
 
-   err = wait_event_interruptible(*(sk_sleep(sk)),
-   skb_peek(&sk->sk_receive_queue));
-   if (err)
+   if (signal_pending(current)) {
+   err = -EINTR;
goto out;
+   }
+
+   prepare_to_wait_exclusive(sk_sleep(sk), &wait,
+   TASK_INTERRUPTIBLE);
+   release_sock(sk);
+   schedule();
+   lock_sock(sk);
+   finish_wait(sk_sleep(sk), &wait);
}
 
newsk = newsock->sk;
-- 
1.9.1



[PATCH 2/6] netfilter: nfnetlink_log: add "nf-logger-3-1" module alias name

2016-08-18 Thread Pablo Neira Ayuso
From: Liping Zhang 

Without this alias, if nfnetlink_log.ko is not loaded, we cannot add
rules that log packets to userspace for the arp family, for example:

  # nft add rule arp filter input log group 0
  <cmdline>:1:1-37: Error: Could not process rule: No such file or
  directory
  add rule arp filter input log group 0
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nfnetlink_log.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index cbcfdfb..6577db5 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -1147,6 +1147,7 @@ MODULE_ALIAS_NFNL_SUBSYS(NFNL_SUBSYS_ULOG);
 MODULE_ALIAS_NF_LOGGER(AF_INET, 1);
 MODULE_ALIAS_NF_LOGGER(AF_INET6, 1);
 MODULE_ALIAS_NF_LOGGER(AF_BRIDGE, 1);
+MODULE_ALIAS_NF_LOGGER(3, 1); /* NFPROTO_ARP */
 
 module_init(nfnetlink_log_init);
 module_exit(nfnetlink_log_fini);
-- 
2.1.4
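
The literal 3 in the patch above is presumably needed because the alias string is built by preprocessor stringification, and the preprocessor cannot see the value of an enum constant. The userspace sketch below illustrates that effect. All `DEMO_*` names, `NF_LOGGER_ALIAS` and `alias_matches` are hypothetical and only modeled on the kernel macro; the sketch assumes the address-family constants are macros while the ARP protocol constant is an enum.

```c
#include <string.h>

/* Two-level stringification: the extra level lets macro arguments
 * expand before '#' is applied. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

/* Modeled on MODULE_ALIAS_NF_LOGGER(); here it only builds the string. */
#define NF_LOGGER_ALIAS(family, type) \
	"nf-logger-" STRINGIFY(family) "-" STRINGIFY(type)

/* Illustrative definitions: a macro constant versus an enum constant. */
#define DEMO_AF_INET 2
enum { DEMO_NFPROTO_ARP = 3 };

/* The macro constant expands to its number before stringification. */
static const char alias_inet[] = NF_LOGGER_ALIAS(DEMO_AF_INET, 1);
/* The enum constant is stringified by name, not by value. */
static const char alias_arp_enum[] = NF_LOGGER_ALIAS(DEMO_NFPROTO_ARP, 1);
/* Using the literal yields the alias named in the patch subject. */
static const char alias_arp_lit[] = NF_LOGGER_ALIAS(3, 1);

/* Toy model of an exact-match alias lookup by the module loader. */
static int alias_matches(const char *alias, const char *requested)
{
	return strcmp(alias, requested) == 0;
}
```

In this model a request for "nf-logger-3-1" matches only the alias built from the literal, which is consistent with the comment `/* NFPROTO_ARP */` next to the `3` in the patch.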



[PATCH 0/6] Netfilter fixes for net

2016-08-18 Thread Pablo Neira Ayuso
Hi David,

The following patchset contains Netfilter updates for your net tree,
they are:

1) Dump only the conntrack entries that belong to this namespace via
   the /proc file. This is fallout from the conversion to a single
   conntrack table for all netns, patch from Liping Zhang.

2) Missing MODULE_ALIAS_NF_LOGGER() for the ARP family that prevents
   module autoloading, also from Liping Zhang.

3) Report the overquota event to the right net namespace, again from Liping.

4) Fix tproxy listener sk refcount that leads to crash, from
   Eric Dumazet.

5) Fix racy refcounting on object deletion from nfnetlink and rule
   removal both for nfacct and cttimeout, from Liping Zhang.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!



The following changes since commit a1560dd7a47f983419760aa7f6a481e3b910b54b:

  Merge branch 'mediatek-fixes' (2016-08-15 23:02:45 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to b75911b66ad508a3c3f006ce37d9f9ebee34da43:

  netfilter: cttimeout: fix use after free error when delete netns (2016-08-18 
15:17:00 +0200)


Eric Dumazet (1):
  netfilter: tproxy: properly refcount tcp listeners

Liping Zhang (5):
  netfilter: conntrack: do not dump other netns's conntrack entries via proc
  netfilter: nfnetlink_log: add "nf-logger-3-1" module alias name
  netfilter: nfnetlink_acct: report overquota to the right netns
  netfilter: nfnetlink_acct: fix race between nfacct del and xt_nfacct 
destroy
  netfilter: cttimeout: fix use after free error when delete netns

 include/linux/netfilter/nfnetlink_acct.h |  4 ++--
 net/netfilter/nf_conntrack_standalone.c  |  4 
 net/netfilter/nfnetlink_acct.c   | 17 +
 net/netfilter/nfnetlink_cttimeout.c  | 16 ++--
 net/netfilter/nfnetlink_log.c|  1 +
 net/netfilter/xt_TPROXY.c|  4 
 net/netfilter/xt_nfacct.c|  2 +-
 7 files changed, 31 insertions(+), 17 deletions(-)


[PATCH 3/6] netfilter: nfnetlink_acct: report overquota to the right netns

2016-08-18 Thread Pablo Neira Ayuso
From: Liping Zhang 

We should report the over quota message to the right net namespace
instead of the init netns.

Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 include/linux/netfilter/nfnetlink_acct.h | 4 ++--
 net/netfilter/nfnetlink_acct.c   | 9 +
 net/netfilter/xt_nfacct.c| 2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/netfilter/nfnetlink_acct.h 
b/include/linux/netfilter/nfnetlink_acct.h
index 80ca889..664da00 100644
--- a/include/linux/netfilter/nfnetlink_acct.h
+++ b/include/linux/netfilter/nfnetlink_acct.h
@@ -15,6 +15,6 @@ struct nf_acct;
 struct nf_acct *nfnl_acct_find_get(struct net *net, const char *filter_name);
 void nfnl_acct_put(struct nf_acct *acct);
 void nfnl_acct_update(const struct sk_buff *skb, struct nf_acct *nfacct);
-extern int nfnl_acct_overquota(const struct sk_buff *skb,
- struct nf_acct *nfacct);
+int nfnl_acct_overquota(struct net *net, const struct sk_buff *skb,
+   struct nf_acct *nfacct);
 #endif /* _NFNL_ACCT_H */
diff --git a/net/netfilter/nfnetlink_acct.c b/net/netfilter/nfnetlink_acct.c
index 1b4de4b..796605b 100644
--- a/net/netfilter/nfnetlink_acct.c
+++ b/net/netfilter/nfnetlink_acct.c
@@ -443,7 +443,7 @@ void nfnl_acct_update(const struct sk_buff *skb, struct 
nf_acct *nfacct)
 }
 EXPORT_SYMBOL_GPL(nfnl_acct_update);
 
-static void nfnl_overquota_report(struct nf_acct *nfacct)
+static void nfnl_overquota_report(struct net *net, struct nf_acct *nfacct)
 {
int ret;
struct sk_buff *skb;
@@ -458,11 +458,12 @@ static void nfnl_overquota_report(struct nf_acct *nfacct)
kfree_skb(skb);
return;
}
-   netlink_broadcast(init_net.nfnl, skb, 0, NFNLGRP_ACCT_QUOTA,
+   netlink_broadcast(net->nfnl, skb, 0, NFNLGRP_ACCT_QUOTA,
  GFP_ATOMIC);
 }
 
-int nfnl_acct_overquota(const struct sk_buff *skb, struct nf_acct *nfacct)
+int nfnl_acct_overquota(struct net *net, const struct sk_buff *skb,
+   struct nf_acct *nfacct)
 {
u64 now;
u64 *quota;
@@ -480,7 +481,7 @@ int nfnl_acct_overquota(const struct sk_buff *skb, struct 
nf_acct *nfacct)
 
if (now >= *quota &&
!test_and_set_bit(NFACCT_OVERQUOTA_BIT, &nfacct->flags)) {
-   nfnl_overquota_report(nfacct);
+   nfnl_overquota_report(net, nfacct);
}
 
return ret;
diff --git a/net/netfilter/xt_nfacct.c b/net/netfilter/xt_nfacct.c
index 3048a7e..cf32759 100644
--- a/net/netfilter/xt_nfacct.c
+++ b/net/netfilter/xt_nfacct.c
@@ -26,7 +26,7 @@ static bool nfacct_mt(const struct sk_buff *skb, struct 
xt_action_param *par)
 
nfnl_acct_update(skb, info->nfacct);
 
-   overquota = nfnl_acct_overquota(skb, info->nfacct);
+   overquota = nfnl_acct_overquota(par->net, skb, info->nfacct);
 
return overquota == NFACCT_UNDERQUOTA ? false : true;
 }
-- 
2.1.4
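
The idea of the patch above, passing the caller's namespace down to the report function instead of using a global one, can be modeled in userspace roughly as follows. This is a toy sketch under stated assumptions: `struct demo_net`, `struct demo_acct` and the `demo_*` functions are hypothetical stand-ins, and C11 `atomic_fetch_or` is used in place of the kernel's test_and_set_bit().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct net: just counts reports delivered to it. */
struct demo_net {
	int reports;
};

struct demo_acct {
	uint64_t pkts;
	uint64_t quota;
	atomic_uint flags;	/* bit 0: over-quota already reported */
};

#define DEMO_OVERQUOTA_BIT 0x1u

/* Deliver the report to the caller's namespace, not to a global one. */
static void demo_overquota_report(struct demo_net *net)
{
	net->reports++;
}

/* Returns 1 when the quota is reached, 0 otherwise. */
static int demo_acct_overquota(struct demo_net *net, struct demo_acct *acct)
{
	assert(net != NULL && acct != NULL);

	if (acct->pkts < acct->quota)
		return 0;

	/* atomic_fetch_or returns the old flags: report only on the
	 * 0 -> 1 transition, mirroring !test_and_set_bit() in the patch. */
	if (!(atomic_fetch_or(&acct->flags, DEMO_OVERQUOTA_BIT) &
	      DEMO_OVERQUOTA_BIT))
		demo_overquota_report(net);

	return 1;
}
```

In this model the report lands in whichever namespace the caller passes in, and repeated over-quota checks on the same object report only once.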



[net-next 05/15] i40evf: add missing rtnl_lock() around i40evf_set_interrupt_capability

2016-08-18 Thread Jeff Kirsher
From: Jacob Keller 

The function calls netif_set_real_num_(tx|rx)_queues, both of which
must be called only with the rtnl lock held. Unfortunately,
i40evf_init_task did not take rtnl_lock as required. This patch
adds the missing locking.

Change-ID: Ib72a21c3ce22b71a226b16f9bbe0f5f8cc3e849b
Signed-off-by: Jacob Keller 
Tested-by: Andrew Bowers 
Signed-off-by: Jeff Kirsher 
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 600fb9c..d1d4a65 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1420,7 +1420,9 @@ int i40evf_init_interrupt_scheme(struct i40evf_adapter 
*adapter)
 {
int err;
 
+   rtnl_lock();
err = i40evf_set_interrupt_capability(adapter);
+   rtnl_unlock();
if (err) {
dev_err(&adapter->pdev->dev,
"Unable to setup interrupt capabilities\n");
-- 
2.7.4


