Re: [PATCH 1/2] sctp: SCTP_SOCKOPT_PEELOFF return socket pointer for kernel users

2015-07-13 Thread Neil Horman
On Fri, Jul 10, 2015 at 06:21:14PM -0700, David Miller wrote:
> From: Marcelo Ricardo Leitner marcelo.leit...@gmail.com
> Date: Thu,  9 Jul 2015 11:15:19 -0300
>
> > SCTP has this operation to peel off associations from a given socket and
> > create a new socket using this association. We currently have two ways
> > to use this operation:
> > - via getsockopt(), on which it will also create and return a file
> >   descriptor for this new socket
> > - via sctp_do_peeloff(), which is for kernel only
> >
> > The caveat with using sctp_do_peeloff() directly is that it creates a
> > dependency to SCTP module, while all other operations are handled via
> > kernel_{socket,sendmsg,getsockopt...}() interface. This causes the
> > kernel to load SCTP module even when it's not directly used.
> >
> > This patch then updates SCTP_SOCKOPT_PEELOFF so that for kernel users of
> > this protocol it will not allocate a file descriptor but instead just
> > return the socket pointer directly.
> >
> > If called by a user application it will work as before.
> >
> > Signed-off-by: Marcelo Ricardo Leitner marcelo.leit...@gmail.com
>
> I do not like this at all.
>
> Socket option implementations should not change their behavior or what
> datastructures they consume or return just because the socket happens
> to be a kernel socket.
>
But in this case it's necessary, as the kernel here can't allocate an fd, due
to serious leakage (see commit 2f2d76cc3e938389feee671b46252dde6880b3b7).
Initially Marcelo had created duplicate code paths, one to return an fd, one to
return a file struct.  If you would rather go in that direction, I'm sure he can
propose it again, but that seems less correct to me than this solution.

> I'm not applying this series, sorry.
>
> Also, your patch series lacked an initial PATCH 0/N posting, so you
> could at least spend the time to discuss this patch series at a high
> level and explain your overall motivations.
>
That was in the initial posting.  It should have been reposted, but if you're
interested:
http://marc.info/?l=linux-sctp&m=143449456219518&w=2

Regards
Neil

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[net-next 02/16] i40e/i40evf: Add stats to track FD ATR and SB dynamic enable state

2015-07-13 Thread Jeff Kirsher
From: Anjali Singhai Jain anjali.sing...@intel.com

Since the driver can dynamically enable/disable FD ATR and SB features,
these stats help keep track of the current state and along with
fd_flush count provide a means to debug what could be going on
with the flow director filters. This will take away the need for
being verbose in our debug logs with respect to FD.

Change-ID: I29224f750fe6602391043655d18996570720377d
Signed-off-by: Anjali Singhai Jain anjali.sing...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c |  2 ++
 drivers/net/ethernet/intel/i40e/i40e_main.c| 12 
 drivers/net/ethernet/intel/i40e/i40e_type.h|  2 ++
 drivers/net/ethernet/intel/i40evf/i40e_type.h  |  2 ++
 4 files changed, 18 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c 
b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 9a68c65..0b68f61 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -148,7 +148,9 @@ static struct i40e_stats i40e_gstrings_stats[] = {
I40E_PF_STAT(fdir_flush_cnt, fd_flush_cnt),
I40E_PF_STAT(fdir_atr_match, stats.fd_atr_match),
I40E_PF_STAT(fdir_atr_tunnel_match, stats.fd_atr_tunnel_match),
+   I40E_PF_STAT(fdir_atr_status, stats.fd_atr_status),
I40E_PF_STAT(fdir_sb_match, stats.fd_sb_match),
+   I40E_PF_STAT(fdir_sb_status, stats.fd_sb_status),
 
/* LPI stats */
I40E_PF_STAT(tx_lpi_status, stats.tx_lpi_status),
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b44eb35..b5fc654 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1123,6 +1123,18 @@ static void i40e_update_pf_stats(struct i40e_pf *pf)
 			   pf->stat_offsets_loaded,
 			   &osd->rx_lpi_count, &nsd->rx_lpi_count);
 
+	if (pf->flags & I40E_FLAG_FD_SB_ENABLED &&
+	    !(pf->auto_disable_flags & I40E_FLAG_FD_SB_ENABLED))
+		nsd->fd_sb_status = true;
+	else
+		nsd->fd_sb_status = false;
+
+	if (pf->flags & I40E_FLAG_FD_ATR_ENABLED &&
+	    !(pf->auto_disable_flags & I40E_FLAG_FD_ATR_ENABLED))
+		nsd->fd_atr_status = true;
+	else
+		nsd->fd_atr_status = false;
+
 	pf->stat_offsets_loaded = true;
 }
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h 
b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 9a5a75b..350c5ee 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -1134,6 +1134,8 @@ struct i40e_hw_port_stats {
u64 fd_atr_match;
u64 fd_sb_match;
u64 fd_atr_tunnel_match;
+   u32 fd_atr_status;
+   u32 fd_sb_status;
/* EEE LPI */
u32 tx_lpi_status;
u32 rx_lpi_status;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_type.h 
b/drivers/net/ethernet/intel/i40evf/i40e_type.h
index c463ec4..068813d 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_type.h
@@ -1109,6 +1109,8 @@ struct i40e_hw_port_stats {
u64 fd_atr_match;
u64 fd_sb_match;
u64 fd_atr_tunnel_match;
+   u32 fd_atr_status;
+   u32 fd_sb_status;
/* EEE LPI */
u32 tx_lpi_status;
u32 rx_lpi_status;
-- 
2.4.3
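
The status computation in the i40e_update_pf_stats() hunk above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the DEMO_* flag values and the helper name are hypothetical and do not come from the driver. It shows that a feature is reported as active only when it is configured on and has not been auto-disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; not the driver's real values. */
#define DEMO_FLAG_FD_SB_ENABLED  (1u << 0)
#define DEMO_FLAG_FD_ATR_ENABLED (1u << 1)

/* A feature counts as active only if it is set in flags and is not set
 * in auto_disable_flags (i.e. the driver has not switched it off). */
static bool demo_fd_feature_active(uint32_t flags, uint32_t auto_disable_flags,
                                   uint32_t feature)
{
    assert(feature != 0); /* caller must name a feature bit */
    return (flags & feature) && !(auto_disable_flags & feature);
}
```

With this model, a feature that is enabled but currently auto-disabled reports false, which is the case the new ethtool stats are meant to expose.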



[net-next 04/16] i40e/i40evf: improve Tx performance with a small tweak

2015-07-13 Thread Jeff Kirsher
From: Jesse Brandeburg jesse.brandeb...@intel.com

Add a prefetch for the next Tx descriptor to be used when we know
there are more coming.

Change-ID: Ibb9acab11d508eec2db7da795df74debc16eeacb
Signed-off-by: Jesse Brandeburg jesse.brandeb...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c   | 2 ++
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 9a4f2bc..1fe230d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2616,6 +2616,8 @@ static inline void i40e_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
 	    netif_xmit_stopped(netdev_get_tx_queue(tx_ring->netdev,
 						   tx_ring->queue_index)))
 		writel(i, tx_ring->tail);
+	else
+		prefetchw(tx_desc + 1);
 
 	return;
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c 
b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 395f32f..0f0e185 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -1841,6 +1841,8 @@ static inline void i40evf_tx_map(struct i40e_ring *tx_ring, struct sk_buff *skb,
 	    netif_xmit_stopped(netdev_get_tx_queue(tx_ring->netdev,
 						   tx_ring->queue_index)))
 		writel(i, tx_ring->tail);
+	else
+		prefetchw(tx_desc + 1);
 
 	return;
 
-- 
2.4.3
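
The idea behind the prefetch can be sketched outside the kernel as follows. This is a rough userspace analogue, not the driver code: struct demo_ring, demo_tx_finish() and the simulated tail counter are hypothetical, and __builtin_prefetch() (a GCC/Clang builtin) stands in for the kernel's prefetchw(). When the tail write is skipped because more frames are expected, the next descriptor is hinted into cache for writing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_LEN 8u

struct demo_desc {
    uint64_t addr;
    uint64_t cmd;
};

struct demo_ring {
    struct demo_desc desc[DEMO_RING_LEN];
    unsigned int tail;        /* stands in for the hardware tail register */
    unsigned int tail_writes; /* counts simulated doorbell writes */
};

/* Called after a frame has been mapped; `next` is the next descriptor
 * index to be used. */
static void demo_tx_finish(struct demo_ring *ring, unsigned int next,
                           bool more_coming)
{
    assert(next < DEMO_RING_LEN);
    if (!more_coming) {
        /* No further frames queued: notify the (simulated) hardware. */
        ring->tail = next;
        ring->tail_writes++;
    } else {
        /* More frames are coming: hint the next descriptor for write. */
        __builtin_prefetch(&ring->desc[next], 1);
    }
}
```

The prefetch is only a hint, so the sketch behaves identically with or without it; any benefit would have to be measured on real hardware.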



[net-next 01/16] i40e: Implement ndo_features_check()

2015-07-13 Thread Jeff Kirsher
From: Joe Stringer joestrin...@nicira.com

i40e supports UDP tunnel headers up to 80 bytes in length, so
this adds a check to ensure that it doesn't try to offload
packets that exceed that.

Signed-off-by: Joe Stringer joestrin...@nicira.com
Signed-off-by: Jesse Gross je...@nicira.com
Acked-by: Jesse Brandeburg jesse.brandeb...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 48a52b3..b44eb35 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -8099,6 +8099,25 @@ static int i40e_ndo_bridge_getlink(struct sk_buff *skb, 
u32 pid, u32 seq,
 }
 #endif /* HAVE_BRIDGE_ATTRIBS */
 
+#define I40E_MAX_TUNNEL_HDR_LEN 80
+/**
+ * i40e_features_check - Validate encapsulated packet conforms to limits
+ * @skb: skb buff
+ * @netdev: This physical port's netdev
+ * @features: Offload features that the stack believes apply
+ **/
+static netdev_features_t i40e_features_check(struct sk_buff *skb,
+					     struct net_device *dev,
+					     netdev_features_t features)
+{
+	if (skb->encapsulation &&
+	    (skb_inner_mac_header(skb) - skb_transport_header(skb) >
+	     I40E_MAX_TUNNEL_HDR_LEN))
+		return features & ~(NETIF_F_ALL_CSUM | NETIF_F_GSO_MASK);
+
+	return features;
+}
+
 static const struct net_device_ops i40e_netdev_ops = {
.ndo_open   = i40e_open,
.ndo_stop   = i40e_close,
@@ -8133,6 +8152,7 @@ static const struct net_device_ops i40e_netdev_ops = {
 #endif
 	.ndo_get_phys_port_id	= i40e_get_phys_port_id,
 	.ndo_fdb_add		= i40e_ndo_fdb_add,
+	.ndo_features_check	= i40e_features_check,
 #ifdef HAVE_BRIDGE_ATTRIBS
 	.ndo_bridge_getlink	= i40e_ndo_bridge_getlink,
 	.ndo_bridge_setlink	= i40e_ndo_bridge_setlink,
-- 
2.4.3
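
For readers without a kernel tree at hand, the length check can be modelled in plain C. This is a minimal sketch under assumptions: struct demo_pkt, the DEMO_F_* feature bits and the offset fields are hypothetical stand-ins for sk_buff, NETIF_F_* and the skb header helpers. It shows that only packets whose tunnel header exceeds the 80-byte limit lose checksum and GSO offload.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_TUNNEL_HDR_LEN 80

/* Hypothetical feature bits standing in for NETIF_F_* flags. */
#define DEMO_F_CSUM (1u << 0)
#define DEMO_F_GSO  (1u << 1)
#define DEMO_F_SG   (1u << 2)

struct demo_pkt {
    bool encapsulation;   /* packet carries a tunnel header */
    size_t transport_off; /* offset of the outer transport header */
    size_t inner_mac_off; /* offset of the inner MAC header */
};

static uint32_t demo_features_check(const struct demo_pkt *pkt,
                                    uint32_t features)
{
    assert(!pkt->encapsulation || pkt->inner_mac_off >= pkt->transport_off);
    if (pkt->encapsulation &&
        pkt->inner_mac_off - pkt->transport_off > DEMO_MAX_TUNNEL_HDR_LEN)
        return features & ~(DEMO_F_CSUM | DEMO_F_GSO);

    return features;
}
```

A tunnel header of exactly 80 bytes keeps all features, because the comparison is strictly greater-than.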



[net-next 05/16] i40evf: Allow for an abundance of vectors

2015-07-13 Thread Jeff Kirsher
From: Mitch Williams mitch.a.willi...@intel.com

The driver currently only maps TX and RX queues to a single MSI-X vector
per queue pair if there are exactly enough vectors for this.
Unfortunately, if we have too many vectors it will fail and allocate
queues to vectors in a suboptimal manner. Change the condition check to
allow for excess vectors. In this case, the extras just won't be used.

Change-ID: I23e1e2955c64739c86612db88a25583e6a7e0b17
Signed-off-by: Mitch Williams mitch.a.willi...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 4ab4ebb..94eff4a 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -406,7 +406,7 @@ static int i40evf_map_rings_to_vectors(struct i40evf_adapter *adapter)
 	/* The ideal configuration...
 	 * We have enough vectors to map one per queue.
 	 */
-	if (q_vectors == (rxr_remaining * 2)) {
+	if (q_vectors >= (rxr_remaining * 2)) {
 		for (; rxr_idx < rxr_remaining; v_start++, rxr_idx++)
 			i40evf_map_vector_to_rxq(adapter, v_start, rxr_idx);
 
-- 
2.4.3
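
The effect of relaxing the comparison can be illustrated with a small standalone sketch. The helper names below are hypothetical and the mapping itself is omitted; the sketch only models the condition, assuming as in the hunk above that the ideal layout needs two vectors per queue pair.

```c
#include <assert.h>
#include <stdbool.h>

/* True when there are at least enough vectors for the ideal layout.
 * The old check used == here, so a surplus of vectors was rejected. */
static bool demo_can_map_one_to_one(int q_vectors, int queue_pairs)
{
    assert(q_vectors >= 0 && queue_pairs >= 0);
    return q_vectors >= queue_pairs * 2;
}

/* Number of vectors left idle when the ideal layout is used. */
static int demo_unused_vectors(int q_vectors, int queue_pairs)
{
    return demo_can_map_one_to_one(q_vectors, queue_pairs)
        ? q_vectors - queue_pairs * 2 : 0;
}
```

With 10 vectors and 4 queue pairs the old equality test would have fallen through to the suboptimal path; here the ideal layout is chosen and 2 vectors stay unused.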



Re: [PATCH net-next v2] ipv6: Do not iterate over all interfaces when finding source address on specific interface.

2015-07-13 Thread Erik Kline
On 13 July 2015 at 15:32, YOSHIFUJI Hideaki
hideaki.yoshif...@miraclelinux.com wrote:
> Hi,
>
> Erik Kline wrote:
> > Hmm, when I run a UML linux with this patch (which, I'm ashamed to
> > say, I failed to do before) I get these kinds of errors:
> >
> > unregister_netdevice: waiting for TAPdevice to become free.
> > Usage count = 1
> > unregister_netdevice: waiting for TAPdevice to become free.
> > Usage count = 1
> >
> > Perhaps they're unrelated... I'm still investigating.
>
> Would you test attached patch please?

That does look logically correct, so +1 to it regardless, but it does
not seem to have fixed the issue I'm seeing.

I still haven't produced the smallest possible demo test program.


Re: [PATCH] netlink: enable skb header refcounting before sending first broadcast

2015-07-13 Thread Konstantin Khlebnikov

On 13.07.2015 10:23, Herbert Xu wrote:

> On Fri, Jul 10, 2015 at 02:51:41PM +0300, Konstantin Khlebnikov wrote:
> > This fixes race between non-atomic updates of adjacent bit-fields:
> > skb->cloned could be lost because netlink broadcast clones skb after
> > sending it to the first listener who sets skb->peeked at the same skb.
> > As a result atomic refcounting of skb header stays disabled and
> > skb_release_data() frees it twice. Race leads to double-free in kmalloc-xxx.
> >
> > Signed-off-by: Konstantin Khlebnikov khlebni...@yandex-team.ru
> > Fixes: b19372273164 ("net: reorganize sk_buff for faster __copy_skb_header()")
> > ---
> >  net/netlink/af_netlink.c |    6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> > index dea925388a5b..921e0d8dfe3a 100644
> > --- a/net/netlink/af_netlink.c
> > +++ b/net/netlink/af_netlink.c
> > @@ -2028,6 +2028,12 @@ int netlink_broadcast_filtered(struct sock *ssk, struct sk_buff *skb, u32 portid
> >  	info.tx_filter = filter;
> >  	info.tx_data = filter_data;
> >
> > +	/* Enable atomic refcounting in skb_release_data() before first send:
> > +	 * non-atomic set of that bit-field in __skb_clone() could race with
> > +	 * __skb_recv_datagram() which touches the same set of bit-fields.
> > +	 */
> > +	skb->cloned = 1;
> > +
> >  	/* While we sleep in clone, do not allow to change socket list */
> >
> >  	netlink_lock_table();
>
> Your effort in finding this bug is wonderful.  However I think
> the fix is a bit dirty.
>
> The real issue here is that the recv path no longer handles shared
> skbs.  So either we need to fix the recv path to not touch skbs
> without cloning them, or we need to get rid of the use of shared
> skbs in netlink.


I don't think that recv path should care about shared skb -- skb can be
delivered into only one socket anyway.


Less dirty fix for that: do not send original skb.
That adds one extra clone but makes code much cleaner.


--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1957,17 +1957,16 @@ static void do_one_broadcast(struct sock *sk,
 	}

 	sock_hold(sk);
-	if (p->skb2 == NULL) {
-		if (skb_shared(p->skb)) {
-			p->skb2 = skb_clone(p->skb, p->allocation);
-		} else {
-			p->skb2 = skb_get(p->skb);
-			/*
-			 * skb ownership may have been set when
-			 * delivered to a previous socket.
-			 */
-			skb_orphan(p->skb2);
-		}
+	if (p->skb2 == NULL || skb_shared(p->skb2)) {
+		kfree_skb(p->skb2);
+		p->skb2 = skb_clone(p->skb, p->allocation);
+	} else {
+		skb_get(p->skb2);
+		/*
+		 * skb ownership may have been set when
+		 * delivered to a previous socket.
+		 */
+		skb_orphan(p->skb2);
 	}
 	if (p->skb2 == NULL) {
 		netlink_overrun(sk);
@@ -1997,7 +1996,6 @@ static void do_one_broadcast(struct sock *sk,
 	} else {
 		p->congested |= val;
 		p->delivered = 1;
-		p->skb2 = NULL;
 	}
 out:
 	sock_put(sk);
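
For anyone following along, the lost update described in the original patch can be replayed deterministically in userspace. This is only an illustrative sketch: the DEMO_* bit values and demo_replay_race() are hypothetical, and a single byte stands in for the storage unit shared by the adjacent bit-fields. Each "writer" does a plain load, modify, store, and the stores are interleaved in the problematic order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical positions of two adjacent bit-fields in one byte. */
#define DEMO_CLONED 0x01u
#define DEMO_PEEKED 0x02u

/* Replays one bad interleaving of two non-atomic read-modify-write
 * sequences on the same byte and returns the final value. */
static uint8_t demo_replay_race(uint8_t flags)
{
    uint8_t recv_copy, clone_copy;

    assert((flags & ~(DEMO_CLONED | DEMO_PEEKED)) == 0);

    recv_copy = flags;                            /* receiver loads the byte */
    clone_copy = flags;                           /* cloner loads the byte */
    flags = (uint8_t)(clone_copy | DEMO_CLONED);  /* cloner stores cloned=1 */
    flags = (uint8_t)(recv_copy | DEMO_PEEKED);   /* receiver stores stale copy */
    return flags;
}
```

Starting from 0, the result has only the peeked bit set, so the cloned bit is lost. Starting with the cloned bit already set (as the first patch does before the first send) both bits survive, because the stale copy already contains it.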





> In fact it looks I introduced the bug way back in
>
> commit a59322be07c964e916d15be3df473fb7ba20c41e
> Author: Herbert Xu herb...@gondor.apana.org.au
> Date:   Wed Dec 5 01:53:40 2007 -0800
>
>     [UDP]: Only increment counter on first peek/recv
>
> I will try to mend this error :)
>
> Cheers,




--
Konstantin


RE: [PATCH] bnx2x: Update to FW version 7.12.30

2015-07-13 Thread Yuval Mintz
> > > The new FW will allow us to utilize some new features in our driver,
> > > mainly adding vlan filtering offload and vxlan offload support.
> > >
> > > In addition, this fixes several issues:
> > > 1. Packets from a VF with pvid configured which were sent with a
> > >    different vlan were transmitted instead of being discarded.
> > >
> > > 2. FCoE traffic might not recover after a failure while there's traffic
> > >    to another function.
> > >
> > > Signed-off-by: Yuval Mintz yuval.mi...@qlogic.com
> >
> > Hi, any news about this one?
> > Thanks, Yuval
>
> Any updates? I've sent this 3-weeks ago and haven't seen any reply.

Apparently the destination E-mail has changed and I was unaware.
Is anyone here? ;-)


[net-next 03/16] i40e/i40evf: Update Flex-10 related device/function capabilities

2015-07-13 Thread Jeff Kirsher
From: Pawel Orlowski pawel.orlow...@intel.com

The Flex10 device/function capability has been upgraded to include
information needed to support Flex-10 configurations. This patch adds new
fields to the i40e_hw_capabilities structure and updates
i40e_parse_discover_capabilities functions to extract them from the AQ
response. Naming convention has changed to use flex10 mode instead of
existing mfp_mode_1.

Change-ID: I305dd66985a30293acb3fb14fa43ca6b79ea
Signed-off-by: Pawel Orlowski pawel.orlow...@intel.com
Signed-off-by: Akeem G Abodunrin akeem.g.abodun...@intel.com
Signed-off-by: Shannon Nelson shannon.nel...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_common.c | 24 +++-
 drivers/net/ethernet/intel/i40e/i40e_main.c   |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_type.h   | 12 +++-
 drivers/net/ethernet/intel/i40evf/i40e_type.h | 12 +++-
 4 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c 
b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 0bae22d..0703222 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -2391,7 +2391,7 @@ i40e_aq_erase_nvm_exit:
 #define I40E_DEV_FUNC_CAP_MSIX_VF	0x44
 #define I40E_DEV_FUNC_CAP_FLOW_DIRECTOR	0x45
 #define I40E_DEV_FUNC_CAP_IEEE_1588	0x46
-#define I40E_DEV_FUNC_CAP_MFP_MODE_1	0xF1
+#define I40E_DEV_FUNC_CAP_FLEX10	0xF1
 #define I40E_DEV_FUNC_CAP_CEM		0xF2
 #define I40E_DEV_FUNC_CAP_IWARP		0x51
 #define I40E_DEV_FUNC_CAP_LED		0x61
@@ -2416,6 +2416,7 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
 	u32 valid_functions, num_functions;
 	u32 number, logical_id, phys_id;
 	struct i40e_hw_capabilities *p;
+	u8 major_rev;
 	u32 i = 0;
 	u16 id;
 
@@ -2433,6 +2434,7 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
 		number = le32_to_cpu(cap->number);
 		logical_id = le32_to_cpu(cap->logical_id);
 		phys_id = le32_to_cpu(cap->phys_id);
+		major_rev = cap->major_rev;
 
 		switch (id) {
 		case I40E_DEV_FUNC_CAP_SWITCH_MODE:
@@ -2507,9 +2509,21 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
 		case I40E_DEV_FUNC_CAP_MSIX_VF:
 			p->num_msix_vectors_vf = number;
 			break;
-		case I40E_DEV_FUNC_CAP_MFP_MODE_1:
-			if (number == 1)
-				p->mfp_mode_1 = true;
+		case I40E_DEV_FUNC_CAP_FLEX10:
+			if (major_rev == 1) {
+				if (number == 1) {
+					p->flex10_enable = true;
+					p->flex10_capable = true;
+				}
+			} else {
+				/* Capability revision >= 2 */
+				if (number & 1)
+					p->flex10_enable = true;
+				if (number & 2)
+					p->flex10_capable = true;
+			}
+			p->flex10_mode = logical_id;
+			p->flex10_status = phys_id;
 			break;
 		case I40E_DEV_FUNC_CAP_CEM:
 			if (number == 1)
@@ -2557,7 +2571,7 @@ static void i40e_parse_discover_capabilities(struct i40e_hw *hw, void *buff,
 	/* Software override ensuring FCoE is disabled if npar or mfp
 	 * mode because it is not supported in these modes.
 	 */
-	if (p->npar_enable || p->flex10_enable)
+	if (p->npar_enable || p->flex10_enable)
 		p->fcoe = false;
 
 	/* count the enabled ports (aka the not disabled ports) */
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index b5fc654..ed6fc52 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7685,7 +7685,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
 	}
 
 	/* MFP mode enabled */
-	if (pf->hw.func_caps.npar_enable || pf->hw.func_caps.mfp_mode_1) {
+	if (pf->hw.func_caps.npar_enable || pf->hw.func_caps.flex10_enable) {
 		pf->flags |= I40E_FLAG_MFP_ENABLED;
 		dev_info(&pf->pdev->dev, "MFP mode Enabled\n");
 		if (i40e_get_npar_bw_setting(pf))
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h 
b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 350c5ee..220371e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -213,7 +213,17 @@ struct 
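
The revision-dependent decoding in the i40e_parse_discover_capabilities() hunk can be tried out with a small standalone model. This is a sketch under assumptions: struct demo_flex10 and demo_parse_flex10() are hypothetical names, and only the enable/capable decoding is modelled, not the mode and status fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_flex10 {
    bool enable;
    bool capable;
};

/* Revision 1 reports a single on/off value; later revisions pack
 * "enabled" into bit 0 and "capable" into bit 1 of number. */
static struct demo_flex10 demo_parse_flex10(uint8_t major_rev, uint32_t number)
{
    struct demo_flex10 result = { false, false };

    assert(major_rev >= 1);
    if (major_rev == 1) {
        if (number == 1) {
            result.enable = true;
            result.capable = true;
        }
    } else {
        if (number & 1)
            result.enable = true;
        if (number & 2)
            result.capable = true;
    }
    return result;
}
```

Note that the same number can decode differently by revision: a value of 2 means nothing under revision 1 but "capable, not enabled" under revision 2.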

[net-next 06/16] i40e: ignore duplicate port VLAN requests

2015-07-13 Thread Jeff Kirsher
From: Mitch Williams mitch.a.willi...@intel.com

If user attempts to set a port VLAN on a VF that already has the same
port VLAN configured, the driver will go through a completely
unnecessary flurry of filter removals and filter adds. Just check for
this condition and return success instead of doing a bunch of busywork.

Change-ID: Ia1a9e83e6ed48b3f4658bc20dfc6af0cf525d54a
Signed-off-by: Mitch Williams mitch.a.willi...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 23f95cd..433e803 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2088,6 +2088,10 @@ int i40e_ndo_set_vf_port_vlan(struct net_device *netdev,
 		goto error_pvid;
 	}
 
+	if (vsi->info.pvid == (vlan_id | (qos << I40E_VLAN_PRIORITY_SHIFT)))
+		/* duplicate request, so just return success */
+		goto error_pvid;
+
 	if (vsi->info.pvid == 0 && i40e_is_vsi_in_vlan(vsi)) {
 		dev_err(&pf->pdev->dev,
 			"VF %d has already configured VLAN filters and the administrator is requesting a port VLAN override.\nPlease unload and reload the VF driver for this change to take effect.\n",
-- 
2.4.3
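
The duplicate check compares the stored PVID against the value that would be built from the requested VLAN ID and QoS. A minimal userspace sketch of that encoding follows; the helper names are hypothetical, and the shift of 12 is an assumption based on the usual 802.1Q layout (priority in the top three bits) rather than something stated in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed priority shift (802.1Q PCP position); hypothetical name. */
#define DEMO_VLAN_PRIORITY_SHIFT 12

static uint16_t demo_encode_pvid(uint16_t vlan_id, uint8_t qos)
{
    assert(vlan_id < 4096 && qos < 8);
    return (uint16_t)(vlan_id | (qos << DEMO_VLAN_PRIORITY_SHIFT));
}

/* True when the request would leave the port VLAN unchanged. */
static bool demo_is_duplicate_pvid(uint16_t current_pvid, uint16_t vlan_id,
                                   uint8_t qos)
{
    return current_pvid == demo_encode_pvid(vlan_id, qos);
}
```

Because QoS is part of the encoded value, a request with the same VLAN ID but a different priority is not treated as a duplicate.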



[net-next 12/16] i40evf: don't delete all the filters

2015-07-13 Thread Jeff Kirsher
From: Mitch Williams mitch.a.willi...@intel.com

Due to an inverted conditional, the driver was marking all of its MAC
filters for deletion every time set_rx_mode was called. Depending upon
the timing of the calls to set_rx_mode and the processing of the admin
queue, the driver would (accidentally) end up with a varying number of
functional filters.

Correct this logic so that MAC filters are added and removed correctly.
Add a check for the driver's hardware MAC address so that this filter
doesn't get removed incorrectly.

Change-ID: Ib3e7c4a5b53df6835f164fe44cb778cb71f8aff8
Signed-off-by: Mitch Williams mitch.a.willi...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 94eff4a..07f6052 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -892,8 +892,10 @@ static void i40evf_set_rx_mode(struct net_device *netdev)
 					break;
 				}
 			}
+			if (ether_addr_equal(f->macaddr, adapter->hw.mac.addr))
+				found = true;
 		}
-		if (found) {
+		if (!found) {
 			f->remove = true;
 			adapter->aq_required |= I40EVF_FLAG_AQ_DEL_MAC_FILTER;
 		}
-- 
2.4.3
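
The corrected logic can be sketched in userspace as follows. This is an illustration only, assuming a flat array of filters and a flat list of wanted addresses; struct demo_mac, struct demo_filter and demo_mark_stale() are hypothetical, and the real driver walks kernel lists and distinguishes unicast from multicast. A filter is marked for removal only when it is neither wanted nor the hardware MAC address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_mac {
    unsigned char b[6];
};

struct demo_filter {
    struct demo_mac mac;
    bool remove;
};

static bool demo_mac_equal(const struct demo_mac *a, const struct demo_mac *b)
{
    return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/* Marks every filter that is not in `wanted` and is not the hardware
 * MAC; returns how many filters were marked. */
static size_t demo_mark_stale(struct demo_filter *filters, size_t nfilters,
                              const struct demo_mac *wanted, size_t nwanted,
                              const struct demo_mac *hw_mac)
{
    size_t i, j, marked = 0;

    assert(filters != NULL && hw_mac != NULL);
    for (i = 0; i < nfilters; i++) {
        /* The hardware MAC filter always counts as found. */
        bool found = demo_mac_equal(&filters[i].mac, hw_mac);

        for (j = 0; !found && j < nwanted; j++)
            found = demo_mac_equal(&filters[i].mac, &wanted[j]);
        if (!found) { /* the inverted version tested `found` here */
            filters[i].remove = true;
            marked++;
        }
    }
    return marked;
}
```

With the inverted test, every filter that should have been kept would have been marked instead, which matches the varying filter counts described in the commit message.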



[net-next 00/16][pull request] Intel Wired LAN Driver Updates 2015-07-13

2015-07-13 Thread Jeff Kirsher
This series contains updates to i40e and i40evf only.

Joe Stringer and Jesse Gross add a ndo_features_check function to ensure
that the i40e driver does not try to offload packets whose UDP tunnel
headers exceed 80 bytes in length.

Anjali adds stats to track the current flow director ATR and SB state
which, along with the flow director flush count, reduce the need for
verbose debug logs with respect to flow director.  Also refines an error
message to avoid confusion, so that it indicates what may have really
happened when the init_shared_code() call fails.

Pawel adds new fields to the capabilities structures to handle Flex-10
device/function capabilities which is needed to support Flex-10 configs.

Jesse improves the transmit performance by adding a prefetch for the
next transmit descriptor to be used when we know there are more coming.

Mitch modifies the i40evf driver to allow an abundance of vectors.
Currently the driver only maps transmit and receive queues to a single
MSI-X vector per queue pair if there are exactly enough vectors for this;
if there are too many vectors, it fails and allocates queues to vectors
in a suboptimal manner.  So change the condition check to allow for an
excess number of vectors; the extras are simply not used.  Also update the
driver to just return success if the user attempts to set a port VLAN on
a VF that already has the same port VLAN configured, instead of going
through unnecessary filter removals and adds.  Fix the MAC filters for VFs,
which were being programmed with 0 for the VLAN value when there was no
VLAN assigned.  Instead, we must use -1 to indicate that no VLAN is in
use.  Fix the VF disable code, which was not properly cleaning up the VF
and would leave the VF in an indeterminate state; fix this by
notifying the VF and then calling the normal VF reset routine.  Fix the
logic in the driver so that MAC filters are added and removed correctly,
and add a check for the driver's hardware MAC address so that this
filter does not get removed incorrectly.

Carolyn removes incorrect #ifdef's which should not have been added in
the first place and with the #ifdef's removed, make the necessary
changes in the driver to resolve compile errors.

Greg updates the admin queue command header defines.

The following are changes since commit 14fe22e334623e451b5592193415c644005461ea:
  Revert "ipv4: use skb coalescing in defragmentation"
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue master

Anjali Singhai Jain (2):
  i40e/i40evf: Add stats to track FD ATR and SB dynamic enable state
  i40e: Refine an error message to avoid confusion

Carolyn Wyborny (1):
  i40e: Remove incorrect #ifdef's

Catherine Sullivan (1):
  i40e/i40evf: Bump version to 1.3.6 for i40e and 1.3.2 for i40evf

Faisal Latif (1):
  i40e/i40evf: Add support for pre-allocated pages for PD

Greg Rose (1):
  i40e/i40evf: Update the admin queue command header

Jesse Brandeburg (1):
  i40e/i40evf: improve Tx performance with a small tweak

Joe Stringer (1):
  i40e: Implement ndo_features_check()

Mitch Williams (7):
  i40evf: Allow for an abundance of vectors
  i40e: ignore duplicate port VLAN requests
  i40e: correctly program filters for VFs
  i40e: do a proper reset when disabling a VF
  i40e: un-disable VF after reset
  i40evf: don't delete all the filters
  i40evf: add MAC address filter in open, not init

Pawel Orlowski (1):
  i40e/i40evf: Update Flex-10 related device/function capabilities

 drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h  | 24 +-
 drivers/net/ethernet/intel/i40e/i40e_common.c  | 24 +++---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c |  2 +
 drivers/net/ethernet/intel/i40e/i40e_hmc.c | 30 -
 drivers/net/ethernet/intel/i40e/i40e_hmc.h |  4 +-
 drivers/net/ethernet/intel/i40e/i40e_lan_hmc.c |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c| 51 --
 drivers/net/ethernet/intel/i40e/i40e_txrx.c|  2 +
 drivers/net/ethernet/intel/i40e/i40e_type.h| 14 +-
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 23 +-
 .../net/ethernet/intel/i40evf/i40e_adminq_cmd.h| 18 +++-
 drivers/net/ethernet/intel/i40evf/i40e_hmc.h   |  4 +-
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c  |  2 +
 drivers/net/ethernet/intel/i40evf/i40e_type.h  | 14 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c| 20 +++--
 15 files changed, 155 insertions(+), 79 deletions(-)

-- 
2.4.3



[net-next 10/16] i40e: do a proper reset when disabling a VF

2015-07-13 Thread Jeff Kirsher
From: Mitch Williams mitch.a.willi...@intel.com

The VF disable code was just whanging on the reset bit without properly
cleaning up the VF, which would leave the VF in an indeterminate state
from which it could not recover. Fix this by notifying the VF and then
by calling the normal VF reset routine.

Change-ID: I862b9dfa919368773cbdc212b805b520db2f7430
Signed-off-by: Mitch Williams mitch.a.willi...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 4070a22..55b19f5 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -160,13 +160,8 @@ void i40e_vc_notify_vf_reset(struct i40e_vf *vf)
  **/
 static inline void i40e_vc_disable_vf(struct i40e_pf *pf, struct i40e_vf *vf)
 {
-	struct i40e_hw *hw = &pf->hw;
-	u32 reg;
-
-	reg = rd32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id));
-	reg |= I40E_VPGEN_VFRTRIG_VFSWR_MASK;
-	wr32(hw, I40E_VPGEN_VFRTRIG(vf->vf_id), reg);
-	i40e_flush(hw);
+	i40e_vc_notify_vf_reset(vf);
+	i40e_reset_vf(vf, false);
 }
 
 /**
-- 
2.4.3



[net-next 14/16] i40e/i40evf: Add support for pre-allocated pages for PD

2015-07-13 Thread Jeff Kirsher
From: Faisal Latif faisal.la...@intel.com

The i40e_add_pd_table_entry() routine is being modified to handle both
cases where a backing page is passed and where backing page is allocated
in i40e_add_pd_table_entry().

For PBLE resource management, it is more efficient for the PBLE code to
manage its own backing pages. For a VF, PBLE backing page addresses will
be sent to the PF driver for the PBLE resource.

The i40e_remove_pd_bp() is also modified to not free pre-allocated pages and
free only ones which were allocated in i40e_add_pd_table_entry().
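
The ownership rule described above (free only what the routine itself allocated) can be sketched in userspace like this. It is a simplified model under assumptions: struct demo_page, struct demo_pd_entry and both helpers are hypothetical, calloc()/free() stand in for the DMA allocation routines, and reference counting and descriptor programming are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_page {
    void *va;
};

struct demo_pd_entry {
    struct demo_page bp;
    bool rsrc_pg; /* true when the page was supplied by the caller */
    bool valid;
};

/* Uses rsrc_pg when given, otherwise allocates a 4K page itself. */
static int demo_add_entry(struct demo_pd_entry *entry,
                          struct demo_page *rsrc_pg)
{
    struct demo_page mem;
    struct demo_page *page = &mem;

    assert(entry != NULL);
    if (entry->valid)
        return 0;
    if (rsrc_pg) {
        entry->rsrc_pg = true;
        page = rsrc_pg;
    } else {
        mem.va = calloc(1, 4096);
        if (!mem.va)
            return -1;
        entry->rsrc_pg = false;
    }
    entry->bp = *page;
    entry->valid = true;
    return 0;
}

/* Frees the backing page only if this code allocated it. */
static void demo_remove_entry(struct demo_pd_entry *entry)
{
    assert(entry != NULL);
    if (!entry->valid)
        return;
    if (!entry->rsrc_pg)
        free(entry->bp.va);
    entry->bp.va = NULL;
    entry->valid = false;
}
```

Recording ownership in the entry at add time keeps the removal path simple: it does not need to know who the caller was, only whether the page was pre-allocated.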

Change-ID: Ie673f0403f22979e9406f5a94048dceb91bcf9a8
Signed-off-by: Faisal Latif faisal.la...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_hmc.c | 30 +-
 drivers/net/ethernet/intel/i40e/i40e_hmc.h |  4 +++-
 drivers/net/ethernet/intel/i40e/i40e_lan_hmc.c |  2 +-
 drivers/net/ethernet/intel/i40evf/i40e_hmc.h   |  4 +++-
 4 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_hmc.c 
b/drivers/net/ethernet/intel/i40e/i40e_hmc.c
index 9b987cc..b89856a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_hmc.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_hmc.c
@@ -116,6 +116,7 @@ exit:
  * @hw: pointer to our HW structure
  * @hmc_info: pointer to the HMC configuration information structure
  * @pd_index: which page descriptor index to manipulate
+ * @rsrc_pg: if not NULL, use preallocated page instead of allocating new one.
  *
  * This function:
  * 1. Initializes the pd entry
@@ -129,12 +130,14 @@ exit:
  **/
 i40e_status i40e_add_pd_table_entry(struct i40e_hw *hw,
  struct i40e_hmc_info *hmc_info,
- u32 pd_index)
+ u32 pd_index,
+ struct i40e_dma_mem *rsrc_pg)
 {
i40e_status ret_code = 0;
struct i40e_hmc_pd_table *pd_table;
struct i40e_hmc_pd_entry *pd_entry;
struct i40e_dma_mem mem;
+   struct i40e_dma_mem *page = &mem;
u32 sd_idx, rel_pd_idx;
u64 *pd_addr;
u64 page_desc;
@@ -155,18 +158,24 @@ i40e_status i40e_add_pd_table_entry(struct i40e_hw *hw,
pd_table = &hmc_info->sd_table.sd_entry[sd_idx].u.pd_table;
pd_entry = &pd_table->pd_entry[rel_pd_idx];
if (!pd_entry->valid) {
-   /* allocate a 4K backing page */
-   ret_code = i40e_allocate_dma_mem(hw, &mem, i40e_mem_bp,
-I40E_HMC_PAGED_BP_SIZE,
-I40E_HMC_PD_BP_BUF_ALIGNMENT);
-   if (ret_code)
-   goto exit;
+   if (rsrc_pg) {
+   pd_entry->rsrc_pg = true;
+   page = rsrc_pg;
+   } else {
+   /* allocate a 4K backing page */
+   ret_code = i40e_allocate_dma_mem(hw, page, i40e_mem_bp,
+   I40E_HMC_PAGED_BP_SIZE,
+   I40E_HMC_PD_BP_BUF_ALIGNMENT);
+   if (ret_code)
+   goto exit;
+   pd_entry->rsrc_pg = false;
+   }
 
-   pd_entry->bp.addr = mem;
+   pd_entry->bp.addr = *page;
pd_entry->bp.sd_pd_index = pd_index;
pd_entry->bp.entry_type = I40E_SD_TYPE_PAGED;
/* Set page address and valid bit */
-   page_desc = mem.pa | 0x1;
+   page_desc = page->pa | 0x1;
 
pd_addr = (u64 *)pd_table->pd_page_addr.va;
pd_addr += rel_pd_idx;
@@ -240,7 +249,8 @@ i40e_status i40e_remove_pd_bp(struct i40e_hw *hw,
I40E_INVALIDATE_PF_HMC_PD(hw, sd_idx, idx);
 
/* free memory here */
-   ret_code = i40e_free_dma_mem(hw, &(pd_entry->bp.addr));
+   if (!pd_entry->rsrc_pg)
+   ret_code = i40e_free_dma_mem(hw, &pd_entry->bp.addr);
if (ret_code)
goto exit;
if (!pd_table-ref_cnt)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_hmc.h 
b/drivers/net/ethernet/intel/i40e/i40e_hmc.h
index 732a026..386416b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_hmc.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_hmc.h
@@ -62,6 +62,7 @@ struct i40e_hmc_bp {
 struct i40e_hmc_pd_entry {
struct i40e_hmc_bp bp;
u32 sd_index;
+   bool rsrc_pg;
bool valid;
 };
 
@@ -218,7 +219,8 @@ i40e_status i40e_add_sd_table_entry(struct i40e_hw *hw,
 
 i40e_status i40e_add_pd_table_entry(struct i40e_hw *hw,
  struct i40e_hmc_info *hmc_info,
- u32 pd_index);
+ u32 pd_index,
+ 

[net-next 11/16] i40e: un-disable VF after reset

2015-07-13 Thread Jeff Kirsher
From: Mitch Williams mitch.a.willi...@intel.com

When a VF is disabled, there is no way for it to recover until either
the PF driver is reloaded or SR-IOV is disabled and enabled. To correct
this, enable the VF after a successful reset.

Change-ID: I9e0788476c4d53d5407961b503febdfff2b8a7c6
Signed-off-by: Mitch Williams mitch.a.willi...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c 
b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index 55b19f5..fdd7f5e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -832,6 +832,7 @@ complete_reset:
i40e_alloc_vf_res(vf);
i40e_enable_vf_mappings(vf);
set_bit(I40E_VF_STAT_ACTIVE, &vf->vf_states);
+   clear_bit(I40E_VF_STAT_DISABLED, &vf->vf_states);
 
/* tell the VF the reset is done */
wr32(hw, I40E_VFGEN_RSTAT1(vf->vf_id), I40E_VFR_VFACTIVE);
-- 
2.4.3



[net-next 08/16] i40e/i40evf: Update the admin queue command header

2015-07-13 Thread Jeff Kirsher
From: Greg Rose gregory.v.r...@intel.com

Make the necessary updates to i40e_adminq_cmd.h.

Change-ID: Ib031c86cc6cab78e5aa44c64d8ce5474be8d7e42
Signed-off-by: Greg Rose gregory.v.r...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h  | 24 --
 .../net/ethernet/intel/i40evf/i40e_adminq_cmd.h| 18 +++-
 2 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h 
b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
index 929e3d7..9101f5c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h
@@ -34,7 +34,7 @@
  */
 
 #define I40E_FW_API_VERSION_MAJOR  0x0001
-#define I40E_FW_API_VERSION_MINOR  0x0002
+#define I40E_FW_API_VERSION_MINOR  0x0004
 
 struct i40e_aq_desc {
__le16 flags;
@@ -132,12 +132,7 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_list_func_capabilities = 0x000A,
i40e_aqc_opc_list_dev_capabilities  = 0x000B,
 
-   i40e_aqc_opc_set_cppm_configuration = 0x0103,
-   i40e_aqc_opc_set_arp_proxy_entry= 0x0104,
-   i40e_aqc_opc_set_ns_proxy_entry = 0x0105,
-
/* LAA */
-   i40e_aqc_opc_mng_laa= 0x0106,   /* AQ obsolete */
i40e_aqc_opc_mac_address_read   = 0x0107,
i40e_aqc_opc_mac_address_write  = 0x0108,
 
@@ -262,7 +257,6 @@ enum i40e_admin_queue_opc {
/* Tunnel commands */
i40e_aqc_opc_add_udp_tunnel = 0x0B00,
i40e_aqc_opc_del_udp_tunnel = 0x0B01,
-   i40e_aqc_opc_tunnel_key_structure   = 0x0B10,
 
/* Async Events */
i40e_aqc_opc_event_lan_overflow = 0x1001,
@@ -274,8 +268,6 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_oem_ocbb_initialize= 0xFE03,
 
/* debug commands */
-   i40e_aqc_opc_debug_get_deviceid = 0xFF00,
-   i40e_aqc_opc_debug_set_mode = 0xFF01,
i40e_aqc_opc_debug_read_reg = 0xFF03,
i40e_aqc_opc_debug_write_reg= 0xFF04,
i40e_aqc_opc_debug_modify_reg   = 0xFF07,
@@ -509,7 +501,8 @@ struct i40e_aqc_mac_address_read {
 #define I40E_AQC_SAN_ADDR_VALID0x20
 #define I40E_AQC_PORT_ADDR_VALID   0x40
 #define I40E_AQC_WOL_ADDR_VALID0x80
-#define I40E_AQC_ADDR_VALID_MASK   0xf0
+#define I40E_AQC_MC_MAG_EN_VALID   0x100
+#define I40E_AQC_ADDR_VALID_MASK   0x1F0
u8  reserved[6];
__le32  addr_high;
__le32  addr_low;
@@ -532,7 +525,9 @@ struct i40e_aqc_mac_address_write {
 #define I40E_AQC_WRITE_TYPE_LAA_ONLY   0x
 #define I40E_AQC_WRITE_TYPE_LAA_WOL0x4000
 #define I40E_AQC_WRITE_TYPE_PORT   0x8000
-#define I40E_AQC_WRITE_TYPE_MASK   0xc000
+#define I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG  0xC000
+#define I40E_AQC_WRITE_TYPE_MASK   0xC000
+
__le16  mac_sah;
__le32  mac_sal;
u8  reserved[8];
@@ -1068,6 +1063,7 @@ struct i40e_aqc_set_vsi_promiscuous_modes {
__le16  seid;
 #define I40E_AQC_VSI_PROM_CMD_SEID_MASK0x3FF
__le16  vlan_tag;
+#define I40E_AQC_SET_VSI_VLAN_MASK 0x0FFF
 #define I40E_AQC_SET_VSI_VLAN_VALID0x8000
u8  reserved[8];
 };
@@ -2064,6 +2060,12 @@ I40E_CHECK_CMD_LENGTH(i40e_aqc_lldp_start);
 #define I40E_AQC_CEE_PFC_STATUS_MASK   (0x7 << I40E_AQC_CEE_PFC_STATUS_SHIFT)
 #define I40E_AQC_CEE_APP_STATUS_SHIFT  0x8
 #define I40E_AQC_CEE_APP_STATUS_MASK   (0x7 << I40E_AQC_CEE_APP_STATUS_SHIFT)
+#define I40E_AQC_CEE_FCOE_STATUS_SHIFT 0x8
+#define I40E_AQC_CEE_FCOE_STATUS_MASK  (0x7 << I40E_AQC_CEE_FCOE_STATUS_SHIFT)
+#define I40E_AQC_CEE_ISCSI_STATUS_SHIFT 0xA
+#define I40E_AQC_CEE_ISCSI_STATUS_MASK (0x7 << I40E_AQC_CEE_ISCSI_STATUS_SHIFT)
+#define I40E_AQC_CEE_FIP_STATUS_SHIFT  0x10
+#define I40E_AQC_CEE_FIP_STATUS_MASK   (0x7 << I40E_AQC_CEE_FIP_STATUS_SHIFT)
 struct i40e_aqc_get_cee_dcb_cfg_v1_resp {
u8  reserved1;
u8  oper_num_tc;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h 
b/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h
index e715bcc..d5bd6f0 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq_cmd.h
@@ -34,7 +34,7 @@
  */
 
 #define I40E_FW_API_VERSION_MAJOR  0x0001
-#define I40E_FW_API_VERSION_MINOR  0x0002
+#define I40E_FW_API_VERSION_MINOR  0x0004
 #define I40E_FW_API_VERSION_A0_MINOR  0x
 
 struct i40e_aq_desc {
@@ -133,12 +133,7 @@ enum i40e_admin_queue_opc {
i40e_aqc_opc_list_func_capabilities = 0x000A,
i40e_aqc_opc_list_dev_capabilities  = 0x000B,
 
-   i40e_aqc_opc_set_cppm_configuration = 0x0103,
-   i40e_aqc_opc_set_arp_proxy_entry= 0x0104,
-   i40e_aqc_opc_set_ns_proxy_entry   

[net-next 16/16] i40e/i40evf: Bump version to 1.3.6 for i40e and 1.3.2 for i40evf

2015-07-13 Thread Jeff Kirsher
From: Catherine Sullivan catherine.sulli...@intel.com

Bump.

Change-ID: I84573d9fa51effc5b29bf5b8c74e3cc8b2673f48
Signed-off-by: Catherine Sullivan catherine.sulli...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 9ec6fa2..6ce9086 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -39,7 +39,7 @@ static const char i40e_driver_string[] =
 
 #define DRV_VERSION_MAJOR 1
 #define DRV_VERSION_MINOR 3
-#define DRV_VERSION_BUILD 4
+#define DRV_VERSION_BUILD 6
 #define DRV_VERSION __stringify(DRV_VERSION_MAJOR) "." \
  __stringify(DRV_VERSION_MINOR) "." \
 __stringify(DRV_VERSION_BUILD)DRV_KERN
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 526cc8d..ec1eaa5 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -34,7 +34,7 @@ char i40evf_driver_name[] = "i40evf";
 static const char i40evf_driver_string[] =
"Intel(R) XL710/X710 Virtual Function Network Driver";
 
-#define DRV_VERSION "1.2.25"
+#define DRV_VERSION "1.3.2"
 const char i40evf_driver_version[] = DRV_VERSION;
 static const char i40evf_copyright[] =
"Copyright (c) 2013 - 2014 Intel Corporation.";
-- 
2.4.3



[net-next 13/16] i40evf: add MAC address filter in open, not init

2015-07-13 Thread Jeff Kirsher
From: Mitch Williams mitch.a.willi...@intel.com

During close, all of the MAC filters are cleared, so the driver would be
unable to receive unicast packets after being closed and reopened.

Add the adapter's hardware MAC address filter in open, not init. This
ensures that the correct filter is present each time.

Change-ID: I51a11e9c1200139dab6f66a5353bd38c7d26f875
Signed-off-by: Mitch Williams mitch.a.willi...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 07f6052..526cc8d 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -1858,6 +1858,7 @@ static int i40evf_open(struct net_device *netdev)
if (err)
goto err_req_irq;
 
+   i40evf_add_filter(adapter, adapter->hw.mac.addr);
i40evf_configure(adapter);
 
err = i40evf_up_complete(adapter);
@@ -1998,7 +1999,6 @@ static void i40evf_init_task(struct work_struct *work)
  struct i40evf_adapter,
  init_task.work);
struct net_device *netdev = adapter->netdev;
-   struct i40evf_mac_filter *f;
struct i40e_hw *hw = &adapter->hw;
struct pci_dev *pdev = adapter->pdev;
int i, err, bufsz;
@@ -2132,16 +2132,6 @@ static void i40evf_init_task(struct work_struct *work)
ether_addr_copy(netdev->dev_addr, adapter->hw.mac.addr);
ether_addr_copy(netdev->perm_addr, adapter->hw.mac.addr);
 
-   f = kzalloc(sizeof(*f), GFP_ATOMIC);
-   if (!f)
-   goto err_sw_init;
-
-   ether_addr_copy(f->macaddr, adapter->hw.mac.addr);
-   f->add = true;
-   adapter->aq_required |= I40EVF_FLAG_AQ_ADD_MAC_FILTER;
-
-   list_add(&f->list, &adapter->mac_filter_list);
-
init_timer(&adapter->watchdog_timer);
adapter->watchdog_timer.function = i40evf_watchdog_timer;
adapter->watchdog_timer.data = (unsigned long)adapter;
-- 
2.4.3



[PATCH] NET: AX.25: Stop heartbeat timer on disconnect.

2015-07-13 Thread Ralf Baechle
This may result in a kernel panic.  The bug has always existed but
somehow we've run out of luck now and it bites.

Signed-off-by: Richard Stearn rich...@rns-stearn.demon.co.uk
Cc: sta...@vger.kernel.org  # all branches
Signed-off-by: Ralf Baechle r...@linux-mips.org
---
 net/ax25/ax25_subr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ax25/ax25_subr.c b/net/ax25/ax25_subr.c
index 1997538..3b78e84 100644
--- a/net/ax25/ax25_subr.c
+++ b/net/ax25/ax25_subr.c
@@ -264,6 +264,7 @@ void ax25_disconnect(ax25_cb *ax25, int reason)
 {
ax25_clear_queues(ax25);
 
+   ax25_stop_heartbeat(ax25);
ax25_stop_t1timer(ax25);
ax25_stop_t2timer(ax25);
ax25_stop_t3timer(ax25);


[PATCH] Revert "net: fec: Ensure clocks are enabled while using mdio bus"

2015-07-13 Thread Fabio Estevam
This reverts commit 6c3e921b18edca290099adfddde8a50236bf2d80.

commit 6c3e921b18ed ("net: fec: Ensure clocks are enabled while using mdio
bus") prevents the kernel from booting on mx6 boards, so let's revert it.

Reported-by: Tyler Baker tyler.ba...@linaro.org
Signed-off-by: Fabio Estevam fabio.este...@freescale.com
---
 drivers/net/ethernet/freescale/fec_main.c | 88 +--
 1 file changed, 13 insertions(+), 75 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec_main.c 
b/drivers/net/ethernet/freescale/fec_main.c
index 42e20e5..1f89c59 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -24,7 +24,6 @@
 #include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/string.h>
-#include <linux/pm_runtime.h>
 #include <linux/ptrace.h>
 #include <linux/errno.h>
 #include <linux/ioport.h>
@@ -78,7 +77,6 @@ static void fec_enet_itr_coal_init(struct net_device *ndev);
 #define FEC_ENET_RAEM_V0x8
 #define FEC_ENET_RAFL_V0x8
 #define FEC_ENET_OPD_V 0xFFF0
-#define FEC_MDIO_PM_TIMEOUT  100 /* ms */
 
 static struct platform_device_id fec_devtype[] = {
{
@@ -1769,13 +1767,7 @@ static void fec_enet_adjust_link(struct net_device *ndev)
 static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
 {
struct fec_enet_private *fep = bus->priv;
-   struct device *dev = &fep->pdev->dev;
unsigned long time_left;
-   int ret = 0;
-
-   ret = pm_runtime_get_sync(dev);
-   if (IS_ERR_VALUE(ret))
-   return ret;
 
fep->mii_timeout = 0;
init_completion(&fep->mdio_done);
@@ -1791,30 +1783,18 @@ static int fec_enet_mdio_read(struct mii_bus *bus, int mii_id, int regnum)
if (time_left == 0) {
fep->mii_timeout = 1;
netdev_err(fep->netdev, "MDIO read timeout\n");
-   ret = -ETIMEDOUT;
-   goto out;
+   return -ETIMEDOUT;
}
 
-   ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
-
-out:
-   pm_runtime_mark_last_busy(dev);
-   pm_runtime_put_autosuspend(dev);
-
-   return ret;
+   /* return value */
+   return FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
 }
 
 static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
   u16 value)
 {
struct fec_enet_private *fep = bus->priv;
-   struct device *dev = &fep->pdev->dev;
unsigned long time_left;
-   int ret = 0;
-
-   ret = pm_runtime_get_sync(dev);
-   if (IS_ERR_VALUE(ret))
-   return ret;
 
fep->mii_timeout = 0;
init_completion(&fep->mdio_done);
@@ -1831,13 +1811,10 @@ static int fec_enet_mdio_write(struct mii_bus *bus, int mii_id, int regnum,
if (time_left == 0) {
fep->mii_timeout = 1;
netdev_err(fep->netdev, "MDIO write timeout\n");
-   ret  = -ETIMEDOUT;
+   return -ETIMEDOUT;
}
 
-   pm_runtime_mark_last_busy(dev);
-   pm_runtime_put_autosuspend(dev);
-
-   return ret;
+   return 0;
 }
 
 static int fec_enet_clk_enable(struct net_device *ndev, bool enable)
@@ -1849,6 +1826,9 @@ static int fec_enet_clk_enable(struct net_device *ndev, 
bool enable)
ret = clk_prepare_enable(fep-clk_ahb);
if (ret)
return ret;
+   ret = clk_prepare_enable(fep-clk_ipg);
+   if (ret)
+   goto failed_clk_ipg;
if (fep-clk_enet_out) {
ret = clk_prepare_enable(fep-clk_enet_out);
if (ret)
@@ -1872,6 +1852,7 @@ static int fec_enet_clk_enable(struct net_device *ndev, 
bool enable)
}
} else {
clk_disable_unprepare(fep-clk_ahb);
+   clk_disable_unprepare(fep-clk_ipg);
if (fep-clk_enet_out)
clk_disable_unprepare(fep-clk_enet_out);
if (fep-clk_ptp) {
@@ -1893,6 +1874,8 @@ failed_clk_ptp:
if (fep-clk_enet_out)
clk_disable_unprepare(fep-clk_enet_out);
 failed_clk_enet_out:
+   clk_disable_unprepare(fep-clk_ipg);
+failed_clk_ipg:
clk_disable_unprepare(fep-clk_ahb);
 
return ret;
@@ -2864,14 +2847,10 @@ fec_enet_open(struct net_device *ndev)
struct fec_enet_private *fep = netdev_priv(ndev);
int ret;
 
-   ret = pm_runtime_get_sync(fep-pdev-dev);
-   if (IS_ERR_VALUE(ret))
-   return ret;
-
pinctrl_pm_select_default_state(fep-pdev-dev);
ret = fec_enet_clk_enable(ndev, true);
if (ret)
-   goto clk_enable;
+   return ret;
 
/* I should reset the ring buffers here, but I don't yet know
 * a simple way to do that.
@@ -2902,9 +2881,6 @@ err_enet_mii_probe:
fec_enet_free_buffers(ndev);
 err_enet_alloc:
fec_enet_clk_enable(ndev, false);

net: Fix skb csum races when peeking

2015-07-13 Thread Herbert Xu
On Mon, Jul 13, 2015 at 04:31:00PM +0800, Herbert Xu wrote:
> On Mon, Jul 13, 2015 at 10:28:19AM +0200, Eric Dumazet wrote:
> >
> > Except that udp checksum are checked outside of spinlock protection.
>
> Good point.  I wonder when this got broken.  I'll do some digging.

OK looks like I can claim credit for this bug too :)

commit fb286bb2990a107009dbf25f6ffebeb7df77f9be
Author: Herbert Xu herb...@gondor.apana.org.au
Date:   Thu Nov 10 13:01:24 2005 -0800

[NET]: Detect hardware rx checksum faults correctly

Although others have made the hole bigger more recently.

PS we seem to no longer use the hardware checksum in case of
CHECKSUM_COMPLETE, I wonder why that is?

---8---
When we calculate the checksum on the recv path, we store the
result in the skb as an optimisation in case we need the checksum
again down the line.

This is in fact bogus for the MSG_PEEK case as this is done without
any locking.  So multiple threads can peek and then store the result
to the same skb, potentially resulting in bogus skb states.

This patch fixes this by only storing the result if the skb is not
shared.  This preserves the optimisations for the few cases where
it can be done safely due to locking or other reasons, e.g., SIOCINQ.

Signed-off-by: Herbert Xu herb...@gondor.apana.org.au

diff --git a/net/core/datagram.c b/net/core/datagram.c
index b80fb91..4967262 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -622,7 +657,8 @@ __sum16 __skb_checksum_complete_head(struct sk_buff *skb, int len)
!skb->csum_complete_sw)
netdev_rx_csum_fault(skb->dev);
}
-   skb->csum_valid = !sum;
+   if (!skb_shared(skb))
+   skb->csum_valid = !sum;
return sum;
 }
 EXPORT_SYMBOL(__skb_checksum_complete_head);
@@ -642,11 +678,13 @@ __sum16 __skb_checksum_complete(struct sk_buff *skb)
netdev_rx_csum_fault(skb->dev);
}
 
-   /* Save full packet checksum */
-   skb->csum = csum;
-   skb->ip_summed = CHECKSUM_COMPLETE;
-   skb->csum_complete_sw = 1;
-   skb->csum_valid = !sum;
+   if (!skb_shared(skb)) {
+   /* Save full packet checksum */
+   skb->csum = csum;
+   skb->ip_summed = CHECKSUM_COMPLETE;
+   skb->csum_complete_sw = 1;
+   skb->csum_valid = !sum;
+   }
 
return sum;
 }
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2 0/2] net: enable inband link state negotiation only when explicitly requested

2015-07-13 Thread Sebastien Rannou
Hi Stas,

On Fri, 10 Jul 2015, Stas Sergeev wrote:

> Those who were affected by the change, please send your Tested-by,
> Thanks!

I also confirm that this version of the patch solves the issue:

Tested-by: Sebastien Rannou m...@sbrk.org

-- 
Sébastien

linux-4.2-rc2/drivers/net/ethernet/brocade/bna/bfa_ioc.c:2843: out of bounds string access ?

2015-07-13 Thread David Binderman
Hello there,

[linux-4.2-rc2/drivers/net/ethernet/brocade/bna/bfa_ioc.c:2843]: (error) Buffer 
is accessed out of bounds.

Source code is

    memcpy(manufacturer, BFA_MFG_NAME, BFA_ADAPTER_MFG_NAME_LEN);

and

#define BFA_MFG_NAME "QLogic"

and

$ fgrep BFA_ADAPTER_MFG_NAME_LEN `find 
linux-4.2-rc2/drivers/net/ethernet/brocade/ -name \*.h -print`
linux-4.2-rc2/drivers/net/ethernet/brocade/bna/bfa_defs.h:    
BFA_ADAPTER_MFG_NAME_LEN    = 8,   /*! manufacturer name length */
$

so the code attempts to read eight bytes from a seven byte string. Suggest code 
rework.
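
One possible shape for such a rework, as a sketch only: `MFG_NAME`, `MFG_NAME_LEN` and `copy_mfg_name()` are hypothetical names that mirror the report, not the driver's code. Forcing NUL termination on truncation is an assumption made for this example. The idea is to let strncpy() stop reading at the end of the source string and zero-fill the rest of the field.

```c
#include <assert.h>
#include <string.h>

#define MFG_NAME	"QLogic"	/* 6 characters plus the terminating NUL */
#define MFG_NAME_LEN	8		/* size of the destination field */

/*
 * Copy a manufacturer name into a fixed-size field without reading past the
 * end of the source string. strncpy() stops reading at the NUL and pads the
 * rest of the destination with zero bytes.
 */
static void copy_mfg_name(char dst[MFG_NAME_LEN], const char *src)
{
	assert(dst != NULL && src != NULL);
	strncpy(dst, src, MFG_NAME_LEN);
	dst[MFG_NAME_LEN - 1] = '\0';	/* assumption: always terminate */
}
```

With a 7-byte source and an 8-byte field, the last byte of the field is padding written by strncpy() rather than a byte read from beyond the string.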

Regards

David Binderman



[PATCH net] bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave

2015-07-13 Thread Nikolay Aleksandrov
From: Satish Ashok sas...@cumulusnetworks.com

A report with INCLUDE/Change_to_include and empty source list should be
treated as a leave, specified by RFC 3376, section 3.1:
If the requested filter mode is INCLUDE *and* the requested source
 list is empty, then the entry corresponding to the requested
 interface and multicast address is deleted if present.  If no such
 entry is present, the request is ignored.
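
The rule quoted above reduces to a small predicate. The sketch below is illustrative only: the record type values follow RFC 3376, section 4.2.12, while `enum grec_action` and `classify_record()` are hypothetical names and the source count is assumed to be in host byte order already.

```c
#include <assert.h>
#include <stdint.h>

/* Group record types from RFC 3376, section 4.2.12. */
enum grec_type {
	MODE_IS_INCLUDE = 1,
	MODE_IS_EXCLUDE = 2,
	CHANGE_TO_INCLUDE = 3,
	CHANGE_TO_EXCLUDE = 4,
	ALLOW_NEW_SOURCES = 5,
	BLOCK_OLD_SOURCES = 6,
};

enum grec_action { GREC_JOIN, GREC_LEAVE };	/* hypothetical names */

/* INCLUDE with an empty source list means "leave"; anything else is a join. */
static enum grec_action classify_record(int type, uint16_t nsrcs)
{
	assert(type >= MODE_IS_INCLUDE && type <= BLOCK_OLD_SOURCES);
	if ((type == CHANGE_TO_INCLUDE || type == MODE_IS_INCLUDE) && nsrcs == 0)
		return GREC_LEAVE;
	return GREC_JOIN;
}
```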

Signed-off-by: Satish Ashok sas...@cumulusnetworks.com
Signed-off-by: Nikolay Aleksandrov niko...@cumulusnetworks.com
---
 net/bridge/br_multicast.c | 37 ++---
 1 file changed, 30 insertions(+), 7 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 742a6c27d7a2..79db489cdade 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -39,6 +39,16 @@ static void br_multicast_start_querier(struct net_bridge *br,
   struct bridge_mcast_own_query *query);
 static void br_multicast_add_router(struct net_bridge *br,
struct net_bridge_port *port);
+static void br_ip4_multicast_leave_group(struct net_bridge *br,
+struct net_bridge_port *port,
+__be32 group,
+__u16 vid);
+#if IS_ENABLED(CONFIG_IPV6)
+static void br_ip6_multicast_leave_group(struct net_bridge *br,
+struct net_bridge_port *port,
+const struct in6_addr *group,
+__u16 vid);
+#endif
 unsigned int br_mdb_rehash_seq;
 
 static inline int br_ip_equal(const struct br_ip *a, const struct br_ip *b)
@@ -1010,9 +1020,15 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br,
continue;
}
 
-   err = br_ip4_multicast_add_group(br, port, group, vid);
-   if (err)
-   break;
+   if ((type == IGMPV3_CHANGE_TO_INCLUDE ||
+type == IGMPV3_MODE_IS_INCLUDE) &&
+   ntohs(grec->grec_nsrcs) == 0) {
+   br_ip4_multicast_leave_group(br, port, group, vid);
+   } else {
+   err = br_ip4_multicast_add_group(br, port, group, vid);
+   if (err)
+   break;
+   }
}
 
return err;
@@ -1071,10 +1087,17 @@ static int br_ip6_multicast_mld2_report(struct net_bridge *br,
continue;
}
 
-   err = br_ip6_multicast_add_group(br, port, &grec->grec_mca,
-vid);
-   if (err)
-   break;
+   if ((grec->grec_type == MLD2_CHANGE_TO_INCLUDE ||
+grec->grec_type == MLD2_MODE_IS_INCLUDE) &&
+   ntohs(*nsrcs) == 0) {
+   br_ip6_multicast_leave_group(br, port, &grec->grec_mca,
+vid);
+   } else {
+   err = br_ip6_multicast_add_group(br, port,
+&grec->grec_mca, vid);
+   if (err)
+   break;
+   }
}
 
return err;
-- 
1.9.3



Re: [PATCH net-next v2] ipv6: Do not iterate over all interfaces when finding source address on specific interface.

2015-07-13 Thread Hajime Tazaki

Yoshifuji-san,

At Mon, 13 Jul 2015 17:38:48 +0900,
Erik Kline wrote:
>
> On 13 July 2015 at 15:32, YOSHIFUJI Hideaki
> hideaki.yoshif...@miraclelinux.com wrote:
> > Hi,
> >
> > Erik Kline wrote:
> >> Hmm, when I run a UML linux with this patch (which, I'm ashamed to
> >> say, I failed to do before) I get these kinds of errors:
> >>
> >> unregister_netdevice: waiting for TAPdevice to become free.
> >> Usage count = 1
> >> unregister_netdevice: waiting for TAPdevice to become free.
> >> Usage count = 1
> >>
> >> Perhaps they're unrelated... I'm still investigating.
> >
> > Would you test attached patch please?
>
> That does look logically correct, so +1 to it regardless, but it does
> not seem to have fixed the issue I'm seeing.
>
> I still haven't produced the smallest possible demo test program.

Sorry to jump in, but there is a side effect with this
patch: my tcp and dccp tests (ipv6) fail.

Because the newly added function (__ipv6_dev_get_saddr) swaps its local
'hiscore' with 'score' in some cases but never passes the updated
'hiscore' back, the caller (ipv6_dev_get_saddr) can't fill in an
appropriate saddr in the end.
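
The effect can be reproduced outside the kernel with a toy version of the two-slot scan. Everything here is a hypothetical illustration: `struct score` and `scan()` are made-up names, and a plain integer rank stands in for the address scoring. The optional `best` out-parameter plays the role that `in_hiscore` plays in the diff.

```c
#include <assert.h>
#include <stddef.h>

struct score { int rank; };

/*
 * Scan candidates using two slots: "score" is scratch space, "hiscore" holds
 * the best so far. Swapping the two local pointers keeps the best entry
 * without copying, but the caller only learns which slot won if told so.
 */
static void scan(const int *ranks, size_t n, struct score scores[2],
		 struct score **best)
{
	struct score *score = &scores[0], *hiscore = &scores[1];
	size_t i;

	assert(ranks != NULL && scores != NULL);
	for (i = 0; i < n; i++) {
		score->rank = ranks[i];
		if (score->rank > hiscore->rank) {
			struct score *tmp = hiscore;

			hiscore = score;
			score = tmp;
			if (best)
				*best = hiscore;	/* report the winning slot */
		}
	}
}
```

Called with `best` set to NULL, a caller that keeps its own pointer to `scores[1]` still sees the old entry after a swap; passing the address of that pointer lets the callee update it.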

I don't know if this is a good patch but the following diff
makes my test happy.

-- Hajime

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 4ab74d5..c4e9416 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1363,7 +1363,8 @@ static void __ipv6_dev_get_saddr(struct net *net,
 unsigned int prefs,
 const struct in6_addr *saddr,
 struct inet6_dev *idev,
-struct ipv6_saddr_score *scores)
+struct ipv6_saddr_score *scores,
+struct ipv6_saddr_score **in_hiscore)
 {
struct ipv6_saddr_score *score = &scores[0], *hiscore = &scores[1];
 
@@ -1424,6 +1425,7 @@ static void __ipv6_dev_get_saddr(struct net *net,
in6_ifa_hold(score->ifa);
 
swap(hiscore, score);
+   *in_hiscore = hiscore;
 
/* restore our iterator */
score->ifa = hiscore->ifa;
@@ -1480,13 +1482,15 @@ int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev,
}
 
if (use_oif_addr) {
-   __ipv6_dev_get_saddr(net, dst, prefs, saddr, idev, scores);
+   __ipv6_dev_get_saddr(net, dst, prefs, saddr, idev,
+scores, &hiscore);
} else {
for_each_netdev_rcu(net, dev) {
idev = __in6_dev_get(dev);
if (!idev)
continue;
-   __ipv6_dev_get_saddr(net, dst, prefs, saddr, idev, scores);
+   __ipv6_dev_get_saddr(net, dst, prefs, saddr, idev,
+scores, &hiscore);
}
}
rcu_read_unlock();


Re: [PATCH] KVM: Add Kconfig option to signal cross-endian guests

2015-07-13 Thread Greg Kurz
On Thu,  9 Jul 2015 09:49:05 +0200
Thomas Huth th...@redhat.com wrote:

> The option for supporting cross-endianness legacy guests in
> the vhost and tun code should only be available on systems
> that support cross-endian guests.
>
> Signed-off-by: Thomas Huth th...@redhat.com

Acked-by: Greg Kurz gk...@linux.vnet.ibm.com



[PATCH v2] NET: AX.25: Stop heartbeat timer on disconnect.

2015-07-13 Thread Ralf Baechle
From: Richard Stearn rich...@rns-stearn.demon.co.uk

This may result in a kernel panic.  The bug has always existed but
somehow we've run out of luck now and it bites.

Signed-off-by: Richard Stearn rich...@rns-stearn.demon.co.uk
Cc: sta...@vger.kernel.org  # all branches
Signed-off-by: Ralf Baechle r...@linux-mips.org
---
v2: Correctly attribute the patch to Richard Stearn in the From: line

 net/ax25/ax25_subr.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ax25/ax25_subr.c b/net/ax25/ax25_subr.c
index 1997538..3b78e84 100644
--- a/net/ax25/ax25_subr.c
+++ b/net/ax25/ax25_subr.c
@@ -264,6 +264,7 @@ void ax25_disconnect(ax25_cb *ax25, int reason)
 {
ax25_clear_queues(ax25);
 
+   ax25_stop_heartbeat(ax25);
ax25_stop_t1timer(ax25);
ax25_stop_t2timer(ax25);
ax25_stop_t3timer(ax25);

- End forwarded message -

  Ralf


[net-next 07/16] i40e: Remove incorrect #ifdef's

2015-07-13 Thread Jeff Kirsher
From: Carolyn Wyborny carolyn.wybo...@intel.com

This patch removes some #ifdef's that should not be there. They
were stopping code that is needed from being compiled in.

With these #ifdef's removed, changes are needed in the driver
to fix some compile errors: adding missing parameters to
the definition of ndo_bridge_setlink and a ndo_dflt_bridge_getlink call.

Change-ID: I5516614e1bc50b6bca0647cef971bc96161ba2de
Signed-off-by: Carolyn Wyborny carolyn.wybo...@intel.com
Signed-off-by: Catherine Sullivan catherine.sulli...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 12 ++--
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c 
b/drivers/net/ethernet/intel/i40e/i40e_main.c
index ed6fc52..c7f2a0a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -7993,7 +7993,6 @@ static int i40e_ndo_fdb_add(struct ndmsg *ndm, struct 
nlattr *tb[],
return err;
 }
 
-#ifdef HAVE_BRIDGE_ATTRIBS
 /**
  * i40e_ndo_bridge_setlink - Set the hardware bridge mode
  * @dev: the netdev being configured
@@ -8007,7 +8006,8 @@ static int i40e_ndo_fdb_add(struct ndmsg *ndm, struct 
nlattr *tb[],
  * bridge mode enabled.
  **/
 static int i40e_ndo_bridge_setlink(struct net_device *dev,
-  struct nlmsghdr *nlh)
+  struct nlmsghdr *nlh,
+  u16 flags)
 {
struct i40e_netdev_priv *np = netdev_priv(dev);
struct i40e_vsi *vsi = np-vsi;
@@ -8078,14 +8078,9 @@ static int i40e_ndo_bridge_setlink(struct net_device 
*dev,
  * Return the mode in which the hardware bridge is operating in
  * i.e VEB or VEPA.
  **/
-#ifdef HAVE_BRIDGE_FILTER
 static int i40e_ndo_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
   struct net_device *dev,
   u32 filter_mask, int nlflags)
-#else
-static int i40e_ndo_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq,
-  struct net_device *dev, int nlflags)
-#endif /* HAVE_BRIDGE_FILTER */
 {
struct i40e_netdev_priv *np = netdev_priv(dev);
struct i40e_vsi *vsi = np-vsi;
@@ -8109,7 +8104,6 @@ static int i40e_ndo_bridge_getlink(struct sk_buff *skb, 
u32 pid, u32 seq,
return ndo_dflt_bridge_getlink(skb, pid, seq, dev, veb-bridge_mode,
   nlflags, 0, 0, filter_mask, NULL);
 }
-#endif /* HAVE_BRIDGE_ATTRIBS */
 
 #define I40E_MAX_TUNNEL_HDR_LEN 80
 /**
@@ -8165,10 +8159,8 @@ static const struct net_device_ops i40e_netdev_ops = {
.ndo_get_phys_port_id   = i40e_get_phys_port_id,
.ndo_fdb_add= i40e_ndo_fdb_add,
.ndo_features_check = i40e_features_check,
-#ifdef HAVE_BRIDGE_ATTRIBS
.ndo_bridge_getlink = i40e_ndo_bridge_getlink,
.ndo_bridge_setlink = i40e_ndo_bridge_setlink,
-#endif /* HAVE_BRIDGE_ATTRIBS */
 };
 
 /**
-- 
2.4.3



linux-4.2-rc2/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:1993: possible bad error checking ?

2015-07-13 Thread David Binderman
Hello there,

[linux-4.2-rc2/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:1993]: (style) 
Checking if unsigned variable 'entry' is less than zero.

Source code is

    entry = priv->hw->mode->jumbo_frm(priv, skb, csum_insertion);
    if (unlikely(entry < 0))
        goto dma_map_err;

but

    unsigned int entry;

So the error checking from the function call looks broken to me.

If jumbo_frm() returns a plain signed integer, I suggest checking that
value for an error *before* assigning it to an unsigned integer.
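
A minimal standalone sketch of that suggestion follows. It is not the
stmmac code: fake_jumbo_frm(), store_unchecked() and store_checked() are
hypothetical names, and the return convention (ring index on success,
negative errno on failure) is an assumption made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for a callback such as jumbo_frm(): returns a
 * ring index on success or a negative errno on failure (assumption). */
int fake_jumbo_frm(int fail)
{
	return fail ? -ENOMEM : 3;
}

/* Broken pattern: storing the result in an unsigned variable turns a
 * negative errno into a large positive value, so a later "entry < 0"
 * test can never be true. */
unsigned int store_unchecked(int fail)
{
	unsigned int entry = (unsigned int)fake_jumbo_frm(fail);

	return entry;
}

/* Suggested pattern: test the signed return value first, and only then
 * assign it to the unsigned index. Returns 0 or a negative errno. */
int store_checked(int fail, unsigned int *entry)
{
	int ret = fake_jumbo_frm(fail);

	if (ret < 0)
		return ret;
	*entry = (unsigned int)ret;
	return 0;
}

/* Small self-check showing the difference between the two patterns. */
void demo_unsigned_check(void)
{
	unsigned int entry = 0;

	assert(store_unchecked(1) > (unsigned int)INT_MAX); /* error is lost */
	assert(store_checked(1, &entry) == -ENOMEM);        /* error is seen */
	assert(store_checked(0, &entry) == 0 && entry == 3);
}
```

Calling demo_unsigned_check() exercises both paths; the design choice is
simply to keep the value signed until the error case has been ruled out.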

Regards

David Binderman



Re: [PATCH v2 0/2] net: enable inband link state negotiation only when explicitly requested

2015-07-13 Thread Stas Sergeev
13.07.2015 12:54, Sebastien Rannou wrote:
 Hi Stas,
 
 On Fri, 10 Jul 2015, Stas Sergeev wrote:
 
 Those who were affected by the change, please send your Tested-by,
 Thanks!
 
 I also confirm that this version of the patch solves the issue:
 
 Tested-by: Sebastien Rannou m...@sbrk.org
Thanks Sebastien!
Unfortunately, there will be v3 in a few days.
Perhaps you should not rush with the tests until the
things are settled, or who knows how many iterations
you'll have to test...


Re: [PATCH 4/6] net: ieee802154: Remove redundant spi driver bus initialization

2015-07-13 Thread Marcel Holtmann
Hi Antonio,

 In ancient times it was necessary to manually initialize the bus
 field of an spi_driver to spi_bus_type. These days this is done in
 spi_register_driver(), so we can drop the manual assignment.
 
 Signed-off-by: Antonio Borneo borneo.anto...@gmail.com
 To: Alan Ott a...@signal11.us
 To: Alexander Aring alex.ar...@gmail.com
 To: Varka Bhadram varkabhad...@gmail.com
 To: linux-w...@vger.kernel.org
 To: netdev@vger.kernel.org
 Cc: linux-ker...@vger.kernel.org
 ---
 drivers/net/ieee802154/cc2520.c   | 1 -
 drivers/net/ieee802154/mrf24j40.c | 1 -
 2 files changed, 2 deletions(-)

patch has been applied to bluetooth-next tree.

Regards

Marcel



Re: [RFC PATCH 0/2] net: macb: Add mdio driver for accessing multiple phy devices

2015-07-13 Thread punnaiah choudary kalluri
On Tue, Jul 14, 2015 at 12:13 AM, Florian Fainelli f.faine...@gmail.com wrote:
 On 12/07/15 21:48, Punnaiah Choudary Kalluri wrote:
 This patch is to add support for the design that has multiple ethernet
 mac controllers and single mdio bus connected to multiple phy devices.
 i.e mdio lines are connected to any of the ethernet mac controller and
 all the phy devices will be accessed using the phy maintenance interface
 in that mac controller.

   ______         _____
  |      |       |PHY0 |
  | MAC0 |-------|     |
  |______|   |   |_____|
             |
   ______    |    _____
  |      |   |   |     |
  | MAC1 |   |___|PHY1 |
  |______|       |     |

 So, i come up with two implementations for addressing the above 
 configuration.

 Implementation 1:
  Have separate driver for mdio bus
  Create a DT node for all the PHY devices connected to the mdio bus
  This driver will share the register space of the mac controller that has
  mdio bus connected.

 That is the best design implementation, MDIO in itself is a sub-piece of
 your Ethernet MAC controller the fact that it is within the Ethernet MAC
 core is just coincidental, but there is no reason why it could not be
 taken apart and made a separate block in itself.

Thanks Florian for suggesting this.
No idea on why the mdio block was not made a separate block.

regards,
Punnaiah



 Implementation 2:
  Add new property has-mdio and it should be 1 for the mac that has mdio bus
  connected.
  Create the mdio bus only when the has-mdio property is 1

 Please review the two implementations and suggest which one is better to 
 proceed
 further. In my opinion implementation 1 will be the ideal one.

 Agreed.


 Currently i have tested the patches with single mac and single phy
 configuration. I need to take care of few more cases before releasing the 
 final patch
 but before that i would like to have your opinion on the above 
 implementations
 and finalize one implementation. so that i can enhance it further.

 Punnaiah Choudary Kalluri (1):
   net: macb: Add mdio driver for accessing multiple phy devices
   net: macb: Add support for single mac managing more than one phy


  drivers/net/ethernet/cadence/Makefile|2 +-
  drivers/net/ethernet/cadence/macb.c  |   93 +-
  drivers/net/ethernet/cadence/macb.h  |3 +-
  drivers/net/ethernet/cadence/macb_mdio.c |  204 
 ++
  4 files changed, 211 insertions(+), 91 deletions(-)
  create mode 100644 drivers/net/ethernet/cadence/macb_mdio.c



 --
 Florian


Re: [PATCH net] bridge: mdb: fix double add notification

2015-07-13 Thread Cong Wang
On Mon, Jul 13, 2015 at 6:36 AM, Nikolay Aleksandrov
niko...@cumulusnetworks.com wrote:
 Since the mdb add/del code was introduced there have been 2 br_mdb_notify
 calls when doing br_mdb_add() resulting in 2 notifications on each add.

 Example:
  Command: bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
  Before patch:
  root@debian:~# bridge monitor all
  [MDB]dev br0 port eth1 grp 239.0.0.1 permanent
  [MDB]dev br0 port eth1 grp 239.0.0.1 permanent

  After patch:
  root@debian:~# bridge monitor all
  [MDB]dev br0 port eth1 grp 239.0.0.1 permanent

 Signed-off-by: Nikolay Aleksandrov niko...@cumulusnetworks.com
 Fixes: cfd567543590 (bridge: add support of adding and deleting mdb entries)
 ---
  net/bridge/br_mdb.c | 1 -
  1 file changed, 1 deletion(-)

 diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
 index c11cf2611db0..1198a3dbad95 100644
 --- a/net/bridge/br_mdb.c
 +++ b/net/bridge/br_mdb.c
 @@ -351,7 +351,6 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
         if (state == MDB_TEMPORARY)
                 mod_timer(&p->timer, now + br->multicast_membership_interval);

 -       br_mdb_notify(br->dev, port, group, RTM_NEWMDB);
         return 0;
  }

Looks good to me.

And probably we can convert existing __br_mdb_notify() to using
non-atomic allocation too, but that is for net-next.


[PATCH] e1000e: Move e1000e_disable_aspm_locked() inside CONFIG_PM

2015-07-13 Thread Michael Ellerman
e1000e_disable_aspm_locked() is only used in __e1000_resume() which is
inside CONFIG_PM. So when CONFIG_PM=n we get a defined but not used
warning for e1000e_disable_aspm_locked().

Move it inside the existing CONFIG_PM block to avoid the warning.

Signed-off-by: Michael Ellerman m...@ellerman.id.au
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
b/drivers/net/ethernet/intel/e1000e/netdev.c
index 89d788d8f263..f1d7fe2ea183 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6439,6 +6439,7 @@ static void e1000e_disable_aspm(struct pci_dev *pdev, u16 
state)
__e1000e_disable_aspm(pdev, state, 0);
 }
 
+#ifdef CONFIG_PM
 /**
  * e1000e_disable_aspm_locked   Disable ASPM states.
  * @pdev: pointer to PCI device struct
@@ -6452,7 +6453,6 @@ static void e1000e_disable_aspm_locked(struct pci_dev 
*pdev, u16 state)
__e1000e_disable_aspm(pdev, state, 1);
 }
 
-#ifdef CONFIG_PM
 static int __e1000_resume(struct pci_dev *pdev)
 {
struct net_device *netdev = pci_get_drvdata(pdev);
-- 
2.1.0
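
The reordering in the patch above can be illustrated with a small
standalone sketch. DEMO_CONFIG_PM and all demo_* names are hypothetical
stand-ins (not e1000e symbols): a static helper whose only caller sits
inside a preprocessor guard is placed inside the same guard, so a build
without the guard never sees a defined-but-unused function.

```c
#include <assert.h>

/* Hypothetical stand-in for CONFIG_PM; remove this define to model a
 * CONFIG_PM=n build. */
#define DEMO_CONFIG_PM 1

int demo_aspm_disabled;

#ifdef DEMO_CONFIG_PM
/* Static helper whose only caller is the resume path below. Keeping it
 * inside the same guard means it is never compiled without a caller. */
static void demo_disable_aspm_locked(void)
{
	demo_aspm_disabled = 1;
}

int demo_resume(void)
{
	demo_disable_aspm_locked();
	return 0;
}
#endif /* DEMO_CONFIG_PM */

/* Self-check for the guarded build. */
void demo_pm_selfcheck(void)
{
#ifdef DEMO_CONFIG_PM
	assert(demo_resume() == 0);
	assert(demo_aspm_disabled == 1);
#endif
}
```

With the define removed, both the helper and its caller disappear
together, which is the property the patch is after.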



[rhashtable] WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:301 __debug_object_init()

2015-07-13 Thread Fengguang Wu
-netfront: refactor skb slot counting
git bisect  bad 829a3ada9cc7d4c30fa61f8033403fb6c8f8092a  # 09:38  0-  1  geneve: Simplify locking.
git bisect good a4c9ea5e8fec680134d22aa99b54d1cd8c226ebd  # 09:42 22+ 12  geneve: Add Geneve GRO support
git bisect good 255047b0dca31e6b8ce254481a0b65d559d2ebb8  # 09:46 20+  0  Bluetooth: Add timing information to SMP test case runs
git bisect good 354f473ee2c5d01c1cf90f747f95218ee3e73e95  # 09:52 22+  0  ath9k: fix typo
git bisect good d312da293f787e1b19c57acb58e8c1b171c4a04a  # 09:59 22+  0  ixgbe: convert to CYCLECOUNTER_MASK macro.
git bisect good b8e1943e9f754219bcfb40bac4a605b5348acb25  # 10:03 22+  8  rhashtable: Factor out bucket_tail() function
git bisect  bad f89bd6f87a53ce5a7d60662429591ebac2745c10  # 10:08  0- 22  rhashtable: Supports for nulls marker
git bisect good 113948d841e8d78039e5dbbb5248f5b73e99eafa  # 10:12 22+ 13  spinlock: Add spin_lock_bh_nested()
git bisect  bad 97defe1ecf868b8127f8e62395499d6a06e4c4b1  # 10:16  0- 22  rhashtable: Per bucket locks & deferred expansion/shrinking
# first bad commit: [97defe1ecf868b8127f8e62395499d6a06e4c4b1] rhashtable: Per bucket locks & deferred expansion/shrinking
git bisect good 113948d841e8d78039e5dbbb5248f5b73e99eafa  # 10:19 66+ 27  spinlock: Add spin_lock_bh_nested()
# extra tests with DEBUG_INFO
git bisect  bad 97defe1ecf868b8127f8e62395499d6a06e4c4b1  # 10:25  0- 66  rhashtable: Per bucket locks & deferred expansion/shrinking
# extra tests on HEAD of linux-devel/devel-spot-201507122014
git bisect good 3afd2c3f65a385c405a084d80431c84b103cb6df  # 10:28 66+ 49  0day head guard for 'devel-spot-201507122014'
# extra tests on tree/branch linus/master
git bisect good f760b87f8f12eb262f14603e65042996fe03720e  # 10:33 66+  0  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
# extra tests on tree/branch linus/master
git bisect good f760b87f8f12eb262f14603e65042996fe03720e  # 10:33 66+  0  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
# extra tests on tree/branch next/master
git bisect good 2eb62d762a2112579f259903e62ba18d16c51f66  # 10:36 66+ 20  Add linux-next specific files for 20150713


This script may reproduce the error.


#!/bin/bash

kernel=$1
initrd=yocto-minimal-x86_64.cgz

wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd

kvm=(
qemu-system-x86_64
-enable-kvm
-cpu Haswell,+smep,+smap
-kernel $kernel
-initrd $initrd
-m 256
-smp 1
-device e1000,netdev=net0
-netdev user,id=net0
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null 
)

append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
systemd.log_level=err
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"


---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/lkp  Intel Corporation
early console in setup code
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.19.0-rc2-00323-g97defe1 (kbuild@lkp-ib03) (gcc 
version 4.9.2 (Debian 4.9.2-10) ) #1 SMP Tue Jul 14 10:14:59 CST 2015
[0.00] Command line: hung_task_panic=1 earlyprintk=ttyS0,115200 
systemd.log_level=err debug apic=debug sysrq_always_enabled 
rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 
nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 
console=ttyS0,115200 console=tty0 vga=normal  root=/dev/ram0 rw 
link=/kbuild-tests/run-queue/kvm/x86_64-randconfig-a0-07122340/linux-devel:devel-spot-201507122014:97defe1ecf868b8127f8e62395499d6a06e4c4b1:bisect-linux-1/.vmlinuz-97defe1ecf868b8127f8e62395499d6a06e4c4b1-20150714101515-19-ivb41
 branch=linux-devel/devel-spot-201507122014 
BOOT_IMAGE=/pkg/linux/x86_64-randconfig-a0-07122340/gcc-4.9/97defe1ecf868b8127f8e62395499d6a06e4c4b1/vmlinuz-3.19.0-rc2-00323-g97defe1
 drbd.minor_count=8
[0.00] KERNEL supported cpus:
[0.00]   AMD AuthenticAMD
[0.00]   Centaur CentaurHauls
[0.00] CPU: vendor_id 'GenuineIntel' unknown, using generic init.
[0.00] CPU: Your system may be unstable.
[0.00] e820: BIOS-provided physical RAM map

Re: mmap()ed AF_NETLINK: lockdep and sleep-in-atomic warnings

2015-07-13 Thread Cong Wang
On Mon, Jul 13, 2015 at 6:18 AM, Kirill A. Shutemov
kir...@shutemov.name wrote:
 Hi,

 This simple test case triggers a few locking asserts in the kernel:

 #define _GNU_SOURCE
 #include <stdlib.h>
 #include <stdio.h>
 #include <string.h>
 #include <sys/mman.h>
 #include <sys/socket.h>
 #include <sys/types.h>
 #include <linux/netlink.h>

 #define SOL_NETLINK 270

 int main(int argc, char **argv)
 {
         unsigned int block_size = 16 * 4096;
         struct nl_mmap_req req = {
                 .nm_block_size  = block_size,
                 .nm_block_nr    = 64,
                 .nm_frame_size  = 16384,
                 .nm_frame_nr    = 64 * block_size / 16384,
         };
         unsigned int ring_size;
         int fd;

         fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
         if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
                 exit(1);
         if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
                 exit(1);

         ring_size = req.nm_block_nr * req.nm_block_size;
         mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
         return 0;
 }

 +++ exited with 0 +++
 [2.500126] BUG: sleeping function called from invalid context at 
 /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
 [2.501328] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
 [2.501997] 3 locks held by init/1:
 [2.502380]  #0:  (reboot_mutex){+.+...}, at: [81080959] 
 SyS_reboot+0xa9/0x220
 [2.503328]  #1:  ((reboot_notifier_list).rwsem){.+.+..}, at: 
 [8107f379] __blocking_notifier_call_chain+0x39/0x70
 [2.504659]  #2:  (rcu_callback){..}, at: [810d32e0] 
 rcu_do_batch.isra.49+0x160/0x10c0
 [2.505724] Preemption disabled at:[8145365f] __delay+0xf/0x20
 [2.506443]
 [2.506612] CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-9-gbddf4c4818e0 
 #253
 [2.507378] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
 Debian-1.8.2-1 04/01/2014
 [2.508386]  88017b3d8000 88027bc03c38 81929ceb 
 0102
 [2.509233]   88027bc03c68 81085a9d 
 0002
 [2.510057]  81ca2a20 0268  
 88027bc03c98
 [2.510882] Call Trace:
 [2.511146]  IRQ  [81929ceb] dump_stack+0x4f/0x7b
 [2.511763]  [81085a9d] ___might_sleep+0x16d/0x270
 [2.512476]  [81085bed] __might_sleep+0x4d/0x90
 [2.513071]  [8192e96f] mutex_lock_nested+0x2f/0x430
 [2.513683]  [81932fed] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [2.514385]  [81464143] ? __this_cpu_preempt_check+0x13/0x20
 [2.515066]  [8182fc3d] netlink_set_ring+0x1ed/0x350
 [2.515694]  [8182e000] ? netlink_undo_bind+0x70/0x70
 [2.516411]  [8182fe20] netlink_sock_destruct+0x80/0x150
 [2.517070]  [817e484d] __sk_free+0x1d/0x160
 [2.517607]  [817e49a9] sk_free+0x19/0x20
 [2.518118]  [8182e020] deferred_put_nlk_sk+0x20/0x30
 [2.518735]  [810d391c] rcu_do_batch.isra.49+0x79c/0x10c0

Caused by:

commit 21e4902aea80ef35afc00ee8d2abdea4f519b7f7
Author: Thomas Graf tg...@suug.ch
Date:   Fri Jan 2 23:00:22 2015 +0100

netlink: Lockless lookup with RCU grace period in socket release

Defers the release of the socket reference using call_rcu() to
allow using an RCU read-side protected call to rhashtable_lookup()

This restores behaviour and performance gains as previously
introduced by e341694 (netlink: Convert netlink_lookup() to use
RCU protected hash table) without the side effect of severely
delayed socket destruction.

Signed-off-by: Thomas Graf tg...@suug.ch
Signed-off-by: David S. Miller da...@davemloft.net


We can't take a mutex in an RCU callback; perhaps we could
defer the mmap ring cleanup to a workqueue.
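
A rough userspace model of that idea is sketched below. It is not netlink
code: every name in it is hypothetical, it is single-threaded and has no
locking, and in the kernel the deferral itself would presumably go
through the workqueue API (for example schedule_work()). The point is
only the shape: the callback that must not sleep just queues an item, and
the cleanup that may sleep runs later from another context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work item: one deferred cleanup request. */
struct work_item {
	void (*fn)(struct work_item *w);
	struct work_item *next;
};

static struct work_item *pending;	/* simple LIFO of deferred work */

/* Safe to call from a context that must not sleep: it only links the
 * item into the pending list. */
void defer_work(struct work_item *w)
{
	w->next = pending;
	pending = w;
}

/* Runs later, in a context that is allowed to sleep. Returns the number
 * of items that were processed. */
int run_pending_work(void)
{
	int n = 0;

	while (pending != NULL) {
		struct work_item *w = pending;

		pending = w->next;
		w->fn(w);
		n++;
	}
	return n;
}

struct demo_sock {
	struct work_item work;	/* first member, so the cast below is valid */
	int ring_freed;
};

/* Stands in for the mutex-protected ring teardown. */
static void demo_ring_cleanup(struct work_item *w)
{
	struct demo_sock *sk = (struct demo_sock *)w;

	sk->ring_freed = 1;
}

/* Stands in for the RCU callback: it does no sleeping work itself. */
void demo_rcu_callback(struct demo_sock *sk)
{
	sk->work.fn = demo_ring_cleanup;
	defer_work(&sk->work);
}

/* Self-check: cleanup happens only when the deferred work is run. */
void demo_defer_selfcheck(void)
{
	struct demo_sock sk = { { NULL, NULL }, 0 };

	demo_rcu_callback(&sk);
	assert(sk.ring_freed == 0);
	assert(run_pending_work() == 1);
	assert(sk.ring_freed == 1);
}
```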


add some more infomation RE: Issue with active-backup mode bond and bridge

2015-07-13 Thread pengyi Peng(Yi)
I tested this on kernel 3.0.93. The issue is reproducible.

Step 1. Create an active-backup mode bond with two NICs and make sure the IP
address is configured on the bond.
Step 2. Create a bridge with the brctl command.
Step 3. Add the bond to the bridge and move the IP address to the bridge device.
Step 4. Run tcpdump -i bond to confirm that packets cross the bond.
Step 5. Use ifconfig ethX down to take the active slave down, then check
whether any gratuitous ARPs are sent.

-Original Message-
From: pengyi Peng(Yi) 
Sent: Thursday, July 02, 2015 11:05 AM
To: 'netdev@vger.kernel.org'
Cc: Lichunhe; Zhangwei (FF)
Subject: Issue with active-backup mode bond and bridge

I find that the kernel does not seem to handle the combination of the bonding
and bridge modules well. I have a physical host with two NICs that are bonded
together (active-backup mode). Each NIC is connected to a separate L2 switch,
and the two L2 switches are connected to an L3 switch.

If the host only has the bond device, when I manually make the active slave 
down, bonding will issue one or more gratuitous ARPs on the newly active slave. 
One gratuitous ARP is issued for the bonding master interface, provided that 
the interface has at least one IP address configured. 

However, if there is a bridge named br0 and the bond device is added to br0,
the IP address of the bond moves to the br0 device. First, I bring both NICs
up. This time, when I take the active slave down, I can't capture the
gratuitous ARP on the bond device with tcpdump. This can break connectivity
to the host: because no ARP packet is sent out of the host, the L3 switch may
keep sending packets from outside to the old L2 switch, which is connected to
what is now the backup NIC. These packets get no response.

I read the kernel code.
When the active slave is changed to the specified one, in the
bond_change_active_slave function, bond sends the NETDEV_NOTIFY_PEERS event:
        netdev_bonding_change(bond->dev, NETDEV_BONDING_FAILOVER);
        if (should_notify_peers)
                netdev_bonding_change(bond->dev, NETDEV_NOTIFY_PEERS);

  
In the inetdev_event function, if the event is NETDEV_NOTIFY_PEERS, it calls
inetdev_send_gratuitous_arp to send a gratuitous ARP:
        case NETDEV_NOTIFY_PEERS:
                /* Send gratuitous ARP to notify of link change */
                inetdev_send_gratuitous_arp(dev, in_dev);
                break;

But when the bond is in the bridge, the code does not switch dev to the bridge
device, and there is no IP address on the bond device, so no gratuitous ARP
is sent.

My question is: why does the latest kernel (4.1) still not handle this
condition?




[PATCH] ipv6: Fix finding best source address in ipv6_dev_get_saddr().

2015-07-13 Thread YOSHIFUJI Hideaki/吉藤英明
Commit 9131f3de2 (ipv6: Do not iterate over all interfaces when
finding source address on specific interface.) did not properly
update best source address available.  Plus, it introduced
possible NULL pointer dereference.

Bug was reported by Erik Kline e...@google.com.
Based on patch proposed by Hajime Tazaki thehaj...@gmail.com.

Fixes: 9131f3de24db4dc12199aede7d931e6703e97f3b (ipv6: Do not
iterate over all interfaces when finding source address
on specific interface.)
Signed-off-by: YOSHIFUJI Hideaki hideaki.yoshif...@miraclelinux.com
---
 net/ipv6/addrconf.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 4ab74d5..4c9a024 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1358,14 +1358,15 @@ out:
return ret;
 }
 
-static void __ipv6_dev_get_saddr(struct net *net,
-struct ipv6_saddr_dst *dst,
-unsigned int prefs,
-const struct in6_addr *saddr,
-struct inet6_dev *idev,
-struct ipv6_saddr_score *scores)
+static int __ipv6_dev_get_saddr(struct net *net,
+   struct ipv6_saddr_dst *dst,
+   unsigned int prefs,
+   const struct in6_addr *saddr,
+   struct inet6_dev *idev,
+   struct ipv6_saddr_score *scores,
+   int hiscore_idx)
 {
-   struct ipv6_saddr_score *score = &scores[0], *hiscore = &scores[1];
+   struct ipv6_saddr_score *score = &scores[1 - hiscore_idx], *hiscore = &scores[hiscore_idx];
 
read_lock_bh(&idev->lock);
list_for_each_entry(score->ifa, &idev->addr_list, if_list) {
@@ -1424,6 +1425,7 @@ static void __ipv6_dev_get_saddr(struct net *net,
in6_ifa_hold(score->ifa);
 
swap(hiscore, score);
+   hiscore_idx = 1 - hiscore_idx;
 
/* restore our iterator */
score->ifa = hiscore->ifa;
@@ -1434,18 +1436,20 @@ static void __ipv6_dev_get_saddr(struct net *net,
}
 out:
read_unlock_bh(&idev->lock);
+   return hiscore_idx;
 }
 
 int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev,
   const struct in6_addr *daddr, unsigned int prefs,
   struct in6_addr *saddr)
 {
-   struct ipv6_saddr_score scores[2], *hiscore = &scores[1];
+   struct ipv6_saddr_score scores[2], *hiscore;
struct ipv6_saddr_dst dst;
struct inet6_dev *idev;
struct net_device *dev;
int dst_type;
bool use_oif_addr = false;
+   int hiscore_idx = 0;
 
dst_type = __ipv6_addr_type(daddr);
dst.addr = daddr;
@@ -1454,8 +1458,8 @@ int ipv6_dev_get_saddr(struct net *net, const struct 
net_device *dst_dev,
dst.label = ipv6_addr_label(net, daddr, dst_type, dst.ifindex);
dst.prefs = prefs;
 
-   hiscore->rule = -1;
-   hiscore->ifa = NULL;
+   scores[hiscore_idx].rule = -1;
+   scores[hiscore_idx].ifa = NULL;
 
rcu_read_lock();
 
@@ -1480,17 +1484,19 @@ int ipv6_dev_get_saddr(struct net *net, const struct 
net_device *dst_dev,
}
 
if (use_oif_addr) {
-   __ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev, scores);
+   if (idev)
+   hiscore_idx = __ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev, scores, hiscore_idx);
} else {
for_each_netdev_rcu(net, dev) {
idev = __in6_dev_get(dev);
if (!idev)
continue;
-   __ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev, scores);
+   hiscore_idx = __ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev, scores, hiscore_idx);
}
}
rcu_read_unlock();
 
+   hiscore = &scores[hiscore_idx];
if (!hiscore->ifa)
return -EADDRNOTAVAIL;
return -EADDRNOTAVAIL;
 
-- 
1.9.1
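
The index-passing scheme in the diff above can be modelled in a few lines
of standalone C. This is a simplified sketch, not the addrconf code:
struct score, scan_dev() and the integer "rule" values are hypothetical.
It shows why the helper returns the winning slot index: swapping the
local score/hiscore pointers is invisible to the caller, so the caller
needs the index to know which of the two slots holds the best candidate
across several calls.

```c
#include <assert.h>

/* Hypothetical, heavily simplified score: higher rule value wins. */
struct score {
	int rule;
	int addr;
};

/* Scan one "device" (an array of candidate rule values). The best score
 * so far lives in scores[hiscore_idx]; the other slot is scratch space.
 * Returns the index of the slot that holds the best score afterwards. */
int scan_dev(const int *cands, int n, struct score scores[2], int hiscore_idx)
{
	struct score *score = &scores[1 - hiscore_idx];
	struct score *hiscore = &scores[hiscore_idx];
	int i;

	assert(hiscore_idx == 0 || hiscore_idx == 1);

	for (i = 0; i < n; i++) {
		struct score *tmp;

		score->rule = cands[i];
		score->addr = i;
		if (score->rule <= hiscore->rule)
			continue;

		/* New best: swap the roles locally and record which slot
		 * won, so the caller can find it after we return. */
		tmp = hiscore;
		hiscore = score;
		score = tmp;
		hiscore_idx = 1 - hiscore_idx;
	}
	return hiscore_idx;
}
```

A caller initialises scores[0].rule to -1, starts with index 0, and feeds
the returned index back into the next call, mirroring the loop over
devices in the patch.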



[PATCH net-next 4/4 v2] gianfar: Add paged allocation and Rx S/G

2015-07-13 Thread Claudiu Manoil
The eTSEC h/w is capable of scatter/gather on the receive side
too if MAXFRM > MRBLR, when the allowed maximum Rx frame size
is set to be greater than the maximum Rx buffer size (MRBLR).
It's about time the driver makes use of this h/w capability,
by supporting fixed buffer sizes and Rx S/G.

The buffer size given to eTSEC for reception is fixed to
1536B (must be multiple of 64), which is the same default
buffer size as before, used to accommodate standard MTU
(1500B) size frames.  As before, eTSEC can receive frames of
up to 9600B.  Individual Rx buffers are mapped to page halves
(page size for eTSEC systems is 4KB).  The skb is built around
the first buffer of a frame (using build_skb()).  In case the
frame spans multiple buffers, the trailing buffers are added
as Rx fragments to the skb.  The last buffer in frame is marked
by the L status flag.  A mechanism is in place to reuse the pages
owned by the driver (for Rx) for subsequent receptions.

Supporting fixed size buffers allows the implementation of Rx S/G,
which in turn removes the memory pressure issues the driver had
before when MTU was set for jumbo frame reception.
Also, in most cases, the Rx path becomes faster due to Rx page
reuse, since the overhead of allocating new Rx buffers is removed
from the fast path.
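
As a rough illustration of the page-half scheme described above (a
standalone sketch, not the driver code: all DEMO_* constants and demo_*
names are hypothetical, and FCB and alignment padding are ignored), a
buffer can alternate between the two halves of a 4KB page by toggling its
offset, and the number of 1536B buffers a frame spans is a simple
ceiling division:

```c
#include <assert.h>

#define DEMO_PAGE_SIZE	4096u			/* page size from the text above */
#define DEMO_RXB_SIZE	1536u			/* buffer size given to hardware */
#define DEMO_HALF_PAGE	(DEMO_PAGE_SIZE / 2)	/* one Rx buffer per page half */

/* Hypothetical per-buffer bookkeeping: which half of the page is in use. */
struct demo_rx_buff {
	unsigned int page_offset;
};

/* Switch the buffer to the other half of its page so the page can be
 * reused for the next reception. */
void demo_flip_half(struct demo_rx_buff *rxb)
{
	rxb->page_offset ^= DEMO_HALF_PAGE;
}

/* Number of fixed-size buffers a frame of len bytes spans: the first one
 * would back the skb head, the rest would become fragments. */
unsigned int demo_bufs_for_frame(unsigned int len)
{
	assert(len > 0);
	return (len + DEMO_RXB_SIZE - 1) / DEMO_RXB_SIZE;
}
```

Under these assumptions a standard 1500B frame fits in one buffer, while
a 9600B jumbo frame spans seven.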

Signed-off-by: Claudiu Manoil claudiu.man...@freescale.com
---
v2: use lstatus as u32 consistently

 drivers/net/ethernet/freescale/gianfar.c | 320 ++-
 drivers/net/ethernet/freescale/gianfar.h |  31 ++-
 drivers/net/ethernet/freescale/gianfar_ethtool.c |   1 -
 3 files changed, 208 insertions(+), 144 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c 
b/drivers/net/ethernet/freescale/gianfar.c
index 7654d5e..648ca85 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -109,7 +109,7 @@
 
 #define TX_TIMEOUT  (1*HZ)
 
-const char gfar_driver_version[] = "1.3";
+const char gfar_driver_version[] = "2.0";
 
 static int gfar_enet_open(struct net_device *dev);
 static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev);
@@ -207,6 +207,7 @@ static void gfar_init_bds(struct net_device *ndev)
 
rx_queue->next_to_clean = 0;
rx_queue->next_to_use = 0;
+   rx_queue->next_to_alloc = 0;
 
/* make sure next_to_clean != next_to_use after this
 * by leaving at least 1 unused descriptor
@@ -222,7 +223,7 @@ static int gfar_alloc_skb_resources(struct net_device *ndev)
 {
void *vaddr;
dma_addr_t addr;
-   int i, j, k;
+   int i, j;
struct gfar_private *priv = netdev_priv(ndev);
struct device *dev = priv->dev;
struct gfar_priv_tx_q *tx_queue = NULL;
@@ -262,6 +263,7 @@ static int gfar_alloc_skb_resources(struct net_device *ndev)
rx_queue->rx_bd_base = vaddr;
rx_queue->rx_bd_dma_base = addr;
rx_queue->ndev = ndev;
+   rx_queue->dev = dev;
addr  += sizeof(struct rxbd8) * rx_queue->rx_ring_size;
vaddr += sizeof(struct rxbd8) * rx_queue->rx_ring_size;
}
@@ -276,21 +278,17 @@ static int gfar_alloc_skb_resources(struct net_device *ndev)
if (!tx_queue->tx_skbuff)
goto cleanup;
 
-   for (k = 0; k < tx_queue->tx_ring_size; k++)
-   tx_queue->tx_skbuff[k] = NULL;
+   for (j = 0; j < tx_queue->tx_ring_size; j++)
+   tx_queue->tx_skbuff[j] = NULL;
}
 
for (i = 0; i < priv->num_rx_queues; i++) {
rx_queue = priv->rx_queue[i];
-   rx_queue->rx_skbuff =
-   kmalloc_array(rx_queue->rx_ring_size,
- sizeof(*rx_queue->rx_skbuff),
- GFP_KERNEL);
-   if (!rx_queue->rx_skbuff)
+   rx_queue->rx_buff = kcalloc(rx_queue->rx_ring_size,
+   sizeof(*rx_queue->rx_buff),
+   GFP_KERNEL);
+   if (!rx_queue->rx_buff)
goto cleanup;
-
-   for (j = 0; j < rx_queue->rx_ring_size; j++)
-   rx_queue->rx_skbuff[j] = NULL;
}
 
gfar_init_bds(ndev);
@@ -335,10 +333,8 @@ static void gfar_init_rqprm(struct gfar_private *priv)
}
 }
 
-static void gfar_rx_buff_size_config(struct gfar_private *priv)
+static void gfar_rx_offload_en(struct gfar_private *priv)
 {
-   int frame_size = priv->ndev->mtu + ETH_HLEN + ETH_FCS_LEN;
-
/* set this when rx hw offload (TOE) functions are being used */
priv->uses_rxfcb = 0;
 
@@ -347,16 +343,6 @@ static void gfar_rx_buff_size_config(struct gfar_private *priv)
 
if (priv->hwts_rx_en)
priv->uses_rxfcb = 1;
-
-   if (priv->uses_rxfcb)
-   frame_size += GMAC_FCB_LEN;
-
-   frame_size 

[PATCH net-next 3/4 v2] gianfar: Use ndev, more Rx path cleanup

2015-07-13 Thread Claudiu Manoil
Use ndev instead of dev, as the rx queue back pointer
to a net_device struct, to avoid name clashing with a
struct device reference.  This prepares the addition of a
struct device back pointer to the rx queue structure.

Remove duplicated rxq registration in the process.
Move napi_gro_receive() outside gfar_process_frame().

Signed-off-by: Claudiu Manoil claudiu.man...@freescale.com
---
v2: merge lstatus as u32

 drivers/net/ethernet/freescale/gianfar.c | 54 ++--
 drivers/net/ethernet/freescale/gianfar.h |  4 +--
 2 files changed, 26 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c 
b/drivers/net/ethernet/freescale/gianfar.c
index c839e76..7654d5e 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -141,8 +141,7 @@ static void gfar_netpoll(struct net_device *dev);
 #endif
 int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit);
 static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue);
-static void gfar_process_frame(struct net_device *dev, struct sk_buff *skb,
-  struct napi_struct *napi);
+static void gfar_process_frame(struct net_device *ndev, struct sk_buff *skb);
 static void gfar_halt_nodisable(struct gfar_private *priv);
 static void gfar_clear_exact_match(struct net_device *dev);
 static void gfar_set_mac_for_addr(struct net_device *dev, int num,
@@ -262,7 +261,7 @@ static int gfar_alloc_skb_resources(struct net_device *ndev)
rx_queue = priv->rx_queue[i];
rx_queue->rx_bd_base = vaddr;
rx_queue->rx_bd_dma_base = addr;
-   rx_queue->dev = ndev;
+   rx_queue->ndev = ndev;
addr  += sizeof(struct rxbd8) * rx_queue->rx_ring_size;
vaddr += sizeof(struct rxbd8) * rx_queue->rx_ring_size;
}
@@ -593,7 +592,7 @@ static int gfar_alloc_rx_queues(struct gfar_private *priv)
 
priv->rx_queue[i]->rx_skbuff = NULL;
priv->rx_queue[i]->qindex = i;
-   priv->rx_queue[i]->dev = priv->ndev;
+   priv->rx_queue[i]->ndev = priv->ndev;
}
return 0;
 }
@@ -1913,7 +1912,7 @@ static void free_skb_tx_queue(struct gfar_priv_tx_q 
*tx_queue)
 static void free_skb_rx_queue(struct gfar_priv_rx_q *rx_queue)
 {
struct rxbd8 *rxbdp;
-   struct gfar_private *priv = netdev_priv(rx_queue->dev);
+   struct gfar_private *priv = netdev_priv(rx_queue->ndev);
int i;
 
rxbdp = rx_queue-rx_bd_base;
@@ -2709,17 +2708,17 @@ static struct sk_buff *gfar_new_skb(struct net_device 
*ndev,
 
 static void gfar_rx_alloc_err(struct gfar_priv_rx_q *rx_queue)
 {
-   struct gfar_private *priv = netdev_priv(rx_queue->dev);
+   struct gfar_private *priv = netdev_priv(rx_queue->ndev);
struct gfar_extra_stats *estats = &priv->extra_stats;
 
-   netdev_err(rx_queue->dev, "Can't alloc RX buffers\n");
+   netdev_err(rx_queue->ndev, "Can't alloc RX buffers\n");
atomic64_inc(&estats->rx_alloc_err);
 }
 
 static void gfar_alloc_rx_buffs(struct gfar_priv_rx_q *rx_queue,
int alloc_cnt)
 {
-   struct net_device *ndev = rx_queue->dev;
+   struct net_device *ndev = rx_queue->ndev;
struct rxbd8 *bdp, *base;
dma_addr_t bufaddr;
int i;
@@ -2756,10 +2755,10 @@ static void gfar_alloc_rx_buffs(struct gfar_priv_rx_q *rx_queue,
rx_queue->next_to_use = i;
 }
 
-static void count_errors(u32 lstatus, struct net_device *dev)
+static void count_errors(u32 lstatus, struct net_device *ndev)
 {
-   struct gfar_private *priv = netdev_priv(dev);
-   struct net_device_stats *stats = &dev->stats;
+   struct gfar_private *priv = netdev_priv(ndev);
+   struct net_device_stats *stats = &ndev->stats;
struct gfar_extra_stats *estats = &priv->extra_stats;
 
/* If the packet was truncated, none of the other errors matter */
@@ -2854,10 +2853,9 @@ static inline void gfar_rx_checksum(struct sk_buff *skb, struct rxfcb *fcb)
 }
 
 /* gfar_process_frame() -- handle one incoming packet if skb isn't NULL. */
-static void gfar_process_frame(struct net_device *dev, struct sk_buff *skb,
-			       struct napi_struct *napi)
+static void gfar_process_frame(struct net_device *ndev, struct sk_buff *skb)
 {
-	struct gfar_private *priv = netdev_priv(dev);
+	struct gfar_private *priv = netdev_priv(ndev);
 	struct rxfcb *fcb = NULL;
 
 	/* fcb is at the beginning if exists */
@@ -2866,10 +2864,8 @@ static void gfar_process_frame(struct net_device *dev, struct sk_buff *skb,
 	/* Remove the FCB from the skb
 	 * Remove the padded bytes, if there are any
 	 */
-	if (priv->uses_rxfcb) {
-		skb_record_rx_queue(skb, fcb->rq);
+	if (priv->uses_rxfcb)
 		skb_pull(skb, GMAC_FCB_LEN);
-	}
 
 	/* Get receive timestamp from the skb */
if 

[PATCH net-next 2/4 v2] gianfar: Fix and cleanup rxbd status handling

2015-07-13 Thread Claudiu Manoil
There are several (long-standing) problems with how the status
field of the rx buffer descriptor (rxbd) is currently handled on
the error path:
- too many unnecessary 16bit reads of the two halves of the rxbd
status field (32bit), also resulting in overuse of endianness
conversion macros;
- "bdp->status = RXBD_LARGE" makes no sense, since the "large"
flag is read only (only eTSEC can write it), and trying to clear
the other status bits is also error prone in this context
(most of the rx status bits are read only anyway).

This is fixed with a single 32bit read of the status field,
and then the appropriate 16bit shifting is applied to access
the various status bits or the rx frame length. Also corrected
the use of the RXBD_LARGE flag.
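
The single-read approach described above can be sketched in user-space C as follows. This is only an illustration, not the driver code: the flag values are assumed for the example (the authoritative definitions are in gianfar.h), and the helper names rxbd_has_flag(), rxbd_length() and rxbd_mark_large() are hypothetical. The check for an already-set error bit that the patch performs is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the authoritative ones are in gianfar.h. */
#define RXBD_EMPTY      0x8000u
#define RXBD_LAST       0x0800u
#define RXBD_LARGE      0x0080u
#define BD_LENGTH_MASK  0x0000ffffu
#define BD_LFLAG(flags) ((uint32_t)(flags) << 16)

/* Status flags sit in the upper 16 bits of the 32-bit word. */
static int rxbd_has_flag(uint32_t lstatus, uint32_t flag)
{
	return (lstatus & BD_LFLAG(flag)) != 0;
}

/* The frame length sits in the lower 16 bits. */
static uint32_t rxbd_length(uint32_t lstatus)
{
	return lstatus & BD_LENGTH_MASK;
}

/* Flag an oversized frame in the local copy only, never in the descriptor. */
static uint32_t rxbd_mark_large(uint32_t lstatus, uint32_t rx_buffer_size)
{
	assert(rx_buffer_size <= BD_LENGTH_MASK);
	if (rxbd_length(lstatus) > rx_buffer_size)
		lstatus |= BD_LFLAG(RXBD_LARGE);
	return lstatus;
}
```

Working on a local copy of the word means the descriptor memory is read once per frame, and the read-only hardware bits are never written back.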

Additional fix:
rx_over_errors stat is incremented instead of rx_crc_errors
in case of RXBD_OVERRUN occurrence.

Signed-off-by: Claudiu Manoil claudiu.man...@freescale.com
---
v2: lstatus is u32, not unsigned long

 drivers/net/ethernet/freescale/gianfar.c | 34 +---
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index b35bf3d..c839e76 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2756,14 +2756,14 @@ static void gfar_alloc_rx_buffs(struct gfar_priv_rx_q *rx_queue,
 	rx_queue->next_to_use = i;
 }
 
-static inline void count_errors(unsigned short status, struct net_device *dev)
+static void count_errors(u32 lstatus, struct net_device *dev)
 {
 	struct gfar_private *priv = netdev_priv(dev);
 	struct net_device_stats *stats = &dev->stats;
 	struct gfar_extra_stats *estats = &priv->extra_stats;
 
 	/* If the packet was truncated, none of the other errors matter */
-	if (status & RXBD_TRUNCATED) {
+	if (lstatus & BD_LFLAG(RXBD_TRUNCATED)) {
 		stats->rx_length_errors++;
 
 		atomic64_inc(&estats->rx_trunc);
@@ -2771,25 +2771,25 @@ static inline void count_errors(unsigned short status, struct net_device *dev)
 		return;
 	}
 	/* Count the errors, if there were any */
-	if (status & (RXBD_LARGE | RXBD_SHORT)) {
+	if (lstatus & BD_LFLAG(RXBD_LARGE | RXBD_SHORT)) {
 		stats->rx_length_errors++;
 
-		if (status & RXBD_LARGE)
+		if (lstatus & BD_LFLAG(RXBD_LARGE))
 			atomic64_inc(&estats->rx_large);
 		else
 			atomic64_inc(&estats->rx_short);
 	}
-	if (status & RXBD_NONOCTET) {
+	if (lstatus & BD_LFLAG(RXBD_NONOCTET)) {
 		stats->rx_frame_errors++;
 		atomic64_inc(&estats->rx_nonoctet);
 	}
-	if (status & RXBD_CRCERR) {
+	if (lstatus & BD_LFLAG(RXBD_CRCERR)) {
 		atomic64_inc(&estats->rx_crcerr);
 		stats->rx_crc_errors++;
 	}
-	if (status & RXBD_OVERRUN) {
+	if (lstatus & BD_LFLAG(RXBD_OVERRUN)) {
 		atomic64_inc(&estats->rx_overrun);
-		stats->rx_crc_errors++;
+		stats->rx_over_errors++;
 	}
 }
 
@@ -2921,6 +2921,7 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit)
 	i = rx_queue->next_to_clean;
 
 	while (rx_work_limit--) {
+		u32 lstatus;
 
 		if (cleaned_cnt >= GFAR_RX_BUFF_ALLOC) {
 			gfar_alloc_rx_buffs(rx_queue, cleaned_cnt);
@@ -2928,7 +2929,8 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit)
 		}
 
 		bdp = &rx_queue->rx_bd_base[i];
-		if (be16_to_cpu(bdp->status) & RXBD_EMPTY)
+		lstatus = be32_to_cpu(bdp->lstatus);
+		if (lstatus & BD_LFLAG(RXBD_EMPTY))
 			break;
 
 		/* order rx buffer descriptor reads */
@@ -2940,13 +2942,13 @@ int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit)
 		dma_unmap_single(priv->dev, be32_to_cpu(bdp->bufPtr),
 				 priv->rx_buffer_size, DMA_FROM_DEVICE);
 
-		if (unlikely(!(be16_to_cpu(bdp->status) & RXBD_ERR) &&
-			     be16_to_cpu(bdp->length) > priv->rx_buffer_size))
-			bdp->status = cpu_to_be16(RXBD_LARGE);
+		if (unlikely(!(lstatus & BD_LFLAG(RXBD_ERR)) &&
+			     (lstatus & BD_LENGTH_MASK) > priv->rx_buffer_size))
+			lstatus |= BD_LFLAG(RXBD_LARGE);
 
-		if (unlikely(!(be16_to_cpu(bdp->status) & RXBD_LAST) ||
-			     be16_to_cpu(bdp->status) & RXBD_ERR)) {
-			count_errors(be16_to_cpu(bdp->status), dev);
+		if (unlikely(!(lstatus & BD_LFLAG(RXBD_LAST)) ||
+			     (lstatus & BD_LFLAG(RXBD_ERR)))) {
+			count_errors(lstatus, dev);
 
 			/* discard faulty buffer */
 

[PATCH net-next 0/4 v2] gianfar: Add Rx S/G

2015-07-13 Thread Claudiu Manoil
Hi David,
This patch-set introduces scatter/gather support
on the Rx side, addressing Rx path performance
issues in the driver.
Thanks.

As an example, two boards connected back-to-back
were used to measure the throughput, running the
same kernel 4.1, before and after applying these
patches.
The netperf UDP_STREAM results below show that the
bottleneck lies on the Rx side BEFORE applying the
patches, and that the Rx throughput is even lower
with a larger MTU.  AFTER applying the patches the
Rx bottleneck is gone (Rx throughput matches the
Tx one) and the RX throughput is not influenced by
MTU size any longer (as expected).
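
As a cross-check of how the throughput column in the netperf output relates to the message counts, here is a small sketch in C. It assumes throughput is simply messages times message size times 8 bits over the elapsed time; udp_mbps() is a hypothetical helper for this note, not part of netperf.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in 10^6 bits/s for 'messages' datagrams of 'msg_bytes' each. */
static double udp_mbps(uint64_t messages, unsigned int msg_bytes, double secs)
{
	assert(secs > 0.0);
	return (double)messages * msg_bytes * 8.0 / secs / 1e6;
}
```

With the first MTU 1500 run reported in this message (512-byte messages, 150 s), 20119124 sent messages give roughly 549.4 and 14057349 received messages give roughly 383.9, which matches the Tx and Rx figures of the BEFORE case.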


BEFORE:

1) MTU 1500 (default)

root@p1010rdb-pb:~# netperf -l 150 -cC -H 192.85.1.1 -p 12867 -t UDP_STREAM -- -m 512
MIGRATED UDP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 192.85.1.1 () port 0 AF_INET
Socket Message  Elapsed  Messages                       CPU     Service
Size   Size     Time         Okay Errors  Throughput    Util    Demand
bytes  bytes    secs            #      #  10^6bits/sec  % SS    us/KB

163840     512   150.00  20119124      0         549.4  100.00  14.911
163840           150.00  14057349                383.9  100.00  14.911

root@p1010rdb-pb:~# netperf -l 150 -cC -H 192.85.1.1 -p 12867 -t UDP_STREAM -- -m 64
MIGRATED UDP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 192.85.1.1 () port 0 AF_INET
Socket Message  Elapsed  Messages                       CPU     Service
Size   Size     Time         Okay Errors  Throughput    Util    Demand
bytes  bytes    secs            #      #  10^6bits/sec  % SS    us/KB

163840      64   150.00  23654013      0          80.7  100.00  101.463
163840           150.00  15875288                 54.2  100.00  101.463

2) MTU 8000

root@p1010rdb-pb:~# netperf -l 150 -cC -H 192.85.1.1 -p 12867 -t UDP_STREAM -- -m 512
MIGRATED UDP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 192.85.1.1 () port 0 AF_INET
Socket Message  Elapsed  Messages                       CPU     Service
Size   Size     Time         Okay Errors  Throughput    Util    Demand
bytes  bytes    secs            #      #  10^6bits/sec  % SS    us/KB

163840     512   150.00  20067232      0         548.0  100.00  14.950
163840           150.00   6113498                166.9   99.95  14.942

root@p1010rdb-pb:~# netperf -l 150 -cC -H 192.85.1.1 -p 12867 -t UDP_STREAM -- -m 64
MIGRATED UDP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 192.85.1.1 () port 0 AF_INET
Socket Message  Elapsed  Messages                       CPU     Service
Size   Size     Time         Okay Errors  Throughput    Util    Demand
bytes  bytes    secs            #      #  10^6bits/sec  % SS    us/KB

163840      64   150.00  23621279      0          80.6  100.00  101.604
163840           150.00   5868602                 20.0   99.96  101.563


AFTER:
(both MTU 1500 and MTU 8000)

root@p1010rdb-pb:~# netperf -l 150 -cC -H 192.85.1.1 -p 12867 -t UDP_STREAM -- -m 512
MIGRATED UDP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 192.85.1.1 () port 0 AF_INET
Socket Message  Elapsed  Messages                       CPU     Service
Size   Size     Time         Okay Errors  Throughput    Util    Demand
bytes  bytes    secs            #      #  10^6bits/sec  % SS    us/KB

163840     512   150.00  19914969      0         543.8  100.00  15.064
163840           150.00  19914969                543.8   99.35  14.966

root@p1010rdb-pb:~# netperf -l 150 -cC -H 192.85.1.1 -p 12867 -t UDP_STREAM -- -m 64
MIGRATED UDP STREAM TEST from 0.0.0.0 () port 0 AF_INET to 192.85.1.1 () port 0 AF_INET
Socket Message  Elapsed  Messages                       CPU     Service
Size   Size     Time         Okay Errors  Throughput    Util    Demand
bytes  bytes    secs            #      #  10^6bits/sec  % SS    us/KB

163840      64   150.00  23433989      0          80.0  100.00  102.416
163840           150.00  23433989                 80.0   99.62  102.023




Claudiu Manoil (4):
  gianfar: Bundle Rx allocation, cleanup
  gianfar: Fix and cleanup rxbd status handling
  gianfar: Use ndev, more Rx path cleanup
  gianfar: Add paged allocation and Rx S/G

 drivers/net/ethernet/freescale/gianfar.c | 496 +--
 drivers/net/ethernet/freescale/gianfar.h |  72 ++--
 drivers/net/ethernet/freescale/gianfar_ethtool.c |   4 +-
 3 files changed, 331 insertions(+), 241 deletions(-)

-- 
1.7.11.7

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net-next v2] ipv6: Do not iterate over all interfaces when finding source address on specific interface.

2015-07-13 Thread YOSHIFUJI Hideaki
Hi,

Hajime Tazaki wrote:
 
 Yoshifuji-san,
 
 At Mon, 13 Jul 2015 17:38:48 +0900,
 Erik Kline wrote:

 On 13 July 2015 at 15:32, YOSHIFUJI Hideaki
 hideaki.yoshif...@miraclelinux.com wrote:
 Hi,

 Erik Kline wrote:
 Hmm, when I run a UML linux with this patch (which, I'm ashamed to
 say, I failed to do before) I get these kinds of errors:

 unregister_netdevice: waiting for TAPdevice to become free.
 Usage count = 1
 unregister_netdevice: waiting for TAPdevice to become free.
 Usage count = 1

 Perhaps they're unrelated... I'm still investigating.

 Would you test attached patch please?

 That does look logically correct, so +1 to it regardless, but it does
 not seem to have fixed the issue I'm seeing.

 I still haven't produced the smallest possible demo test program.
 
 sorry to jump in, but there is a side effect of this
 patch: my tcp and dccp tests (ipv6) fail with it.
 
 because the newly added function (__ipv6_dev_get_saddr) doesn't
 update the caller's 'hiscore' variable (it swaps it with 'score'
 in some cases), the caller (ipv6_dev_get_saddr) can't fill in an
 appropriate saddr in the end.
 
 I don't know if this is a good patch but the following diff
 makes my test happy.

We should update score as well...

 
 -- Hajime
 
 diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
 index 4ab74d5..c4e9416 100644
 --- a/net/ipv6/addrconf.c
 +++ b/net/ipv6/addrconf.c
 @@ -1363,7 +1363,8 @@ static void __ipv6_dev_get_saddr(struct net *net,
  				 unsigned int prefs,
  				 const struct in6_addr *saddr,
  				 struct inet6_dev *idev,
 -				 struct ipv6_saddr_score *scores)
 +				 struct ipv6_saddr_score *scores,
 +				 struct ipv6_saddr_score **in_hiscore)
  {
  	struct ipv6_saddr_score *score = &scores[0], *hiscore = &scores[1];
  
 @@ -1424,6 +1425,7 @@ static void __ipv6_dev_get_saddr(struct net *net,
  			in6_ifa_hold(score->ifa);
  
  			swap(hiscore, score);
 +			*in_hiscore = hiscore;
  
  			/* restore our iterator */
  			score->ifa = hiscore->ifa;
 @@ -1480,13 +1482,15 @@ int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev,
  	}
  
  	if (use_oif_addr) {
 -		__ipv6_dev_get_saddr(net, dst, prefs, saddr, idev, scores);
 +		__ipv6_dev_get_saddr(net, dst, prefs, saddr, idev,
 +				     scores, &hiscore);
  	} else {
  		for_each_netdev_rcu(net, dev) {
  			idev = __in6_dev_get(dev);
  			if (!idev)
  				continue;
 -			__ipv6_dev_get_saddr(net, dst, prefs, saddr, idev, scores);
 +			__ipv6_dev_get_saddr(net, dst, prefs, saddr, idev,
 +					     scores, &hiscore);
  		}
  	}
   rcu_read_unlock();
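
The underlying C pitfall can be shown with a stripped-down model. This is a sketch only, not the addrconf code: struct score, pick_best_local() and pick_best_out() are hypothetical names, and the real scoring logic is reduced to a single integer comparison.

```c
#include <assert.h>
#include <stddef.h>

struct score { int value; };

/* Broken variant: the swap only changes the callee's local pointers. */
static void pick_best_local(struct score *scores, int candidate)
{
	struct score *score = &scores[0], *hiscore = &scores[1];

	assert(scores != NULL);
	score->value = candidate;
	if (score->value > hiscore->value) {
		struct score *tmp = hiscore;

		hiscore = score;
		score = tmp;
	}
	(void)score;
	(void)hiscore;	/* the new winner is lost on return */
}

/* Fixed variant: the winner is reported through an out-parameter. */
static void pick_best_out(struct score *scores, int candidate,
			  struct score **in_hiscore)
{
	struct score *score = &scores[0], *hiscore = &scores[1];

	assert(scores != NULL && in_hiscore != NULL);
	score->value = candidate;
	if (score->value > hiscore->value) {
		struct score *tmp = hiscore;

		hiscore = score;
		score = tmp;
		*in_hiscore = hiscore;
	}
}
```

A caller that keeps its own pointer to scores[1] still sees the old best entry after pick_best_local(), whereas after pick_best_out() its pointer follows the swap.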
 

-- 
Hideaki Yoshifuji hideaki.yoshif...@miraclelinux.com
Technical Division, MIRACLE LINUX CORPORATION


Re: [PATCH net v4] rtnl/bond: don't send rtnl msg for unregistered iface

2015-07-13 Thread Kristian Evensen
Hello,

I have a quick question about this patch.

On Wed, May 13, 2015 at 2:19 PM, Nicolas Dichtel
nicolas.dich...@6wind.com wrote:
 diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
 index 837d30b5ffed..7b25f1ef3d75 100644
 --- a/net/core/rtnetlink.c
 +++ b/net/core/rtnetlink.c
 @@ -2415,6 +2415,9 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
  {
  	struct sk_buff *skb;

 +	if (dev->reg_state != NETREG_REGISTERED)
 +		return;
 +
 +

Is this check correct, and is it placed at the correct location? The
reason I am asking is as follows. In rollback_registered_many(),
dev->reg_state is set to NETREG_UNREGISTERING for devices that will be
unregistered. When rtmsg_ifinfo_build_skb(RTM_DELLINK, ...) is called in
the following loop in rollback_registered_many(), this comparison will
always be true and no DELLINK event is generated.

This change led to some applications I have not behaving as expected
due to missing DELLINK when network devices are removed. I also see no
DELLINK with ip mon link. Removing the check restores the old behavior
(DELLINK events are generated). My machine is running 3.18.18, which
includes this fix.
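
To make the ordering problem concrete, here is a toy user-space model of it. It is a sketch under my reading of the code above, not kernel code: the MODEL_* names, struct model_dev and both functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

enum model_reg_state {
	MODEL_UNINITIALIZED,
	MODEL_REGISTERED,
	MODEL_UNREGISTERING,
};

enum model_msg { MODEL_NEWLINK, MODEL_DELLINK };

struct model_dev { enum model_reg_state reg_state; };

/* Returns true when a notification would be sent for this device. */
static bool model_rtmsg_ifinfo(enum model_msg type, const struct model_dev *dev)
{
	(void)type;
	assert(dev != 0);
	if (dev->reg_state != MODEL_REGISTERED)
		return false;	/* the early return added by the patch */
	return true;
}

/* Teardown as described: the state changes first, DELLINK is attempted second. */
static bool model_unregister(struct model_dev *dev)
{
	dev->reg_state = MODEL_UNREGISTERING;
	return model_rtmsg_ifinfo(MODEL_DELLINK, dev);
}
```

In this model a registered device still produces NEWLINK-style events, but model_unregister() always returns false, which is the missing DELLINK I am seeing.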

Thanks in advance for any help,
Kristian


Re: net: Fix skb csum races when peeking

2015-07-13 Thread Herbert Xu
On Mon, Jul 13, 2015 at 08:01:42PM +0800, Herbert Xu wrote:
 
 PS we seem to no longer use the hardware checksum in case of
 CHECKSUM_COMPLETE, I wonder why that is?

Nevermind, it's still there.  I was just looking in the wrong place.
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [PATCH v2] add stealth mode

2015-07-13 Thread Austin S Hemmelgarn

On 2015-07-12 19:13, Matteo Croce wrote:

2015-07-08 15:32 GMT+02:00 Austin S Hemmelgarn ahferro...@gmail.com:

On 2015-07-06 15:44, Matteo Croce wrote:
Just to name a few that I know of off the top of my head:
1. IP packets with any protocol number not supported by your current kernel
(these return a special ICMP message).


Right, I'll handle them


2. SCTP INIT and COOKIE_ECHO chunks when you have SCTP enabled in the
kernel.


Well, I've never played with SCTP before
It should still be checked, as should DCCP and RDS (those are the only 
other Layer 4 protocols that I have ever actually seen people try to 
scan hosts with besides TCP/UDP/SCTP).  SCTP itself is not hugely 
prevalent outside of some clustering uses, but it is still seen on the 
internet sometimes (for example, Gentoo has optional patches for OpenSSH 
to use SCTP).



3. Theoretically, some IGMP messages.
4. NDP messages.
5. ARP queries looking for the machine's IP addresses.


Yes I know, but it's unlikely to receive these packets from WAN, right?
My flag is intended to be used mostly on WAN interfaces,
machines in LAN should be easily discoverable IMHO.
In theory it's unlikely, but if you use any kind of IPv4 multicast on 
the WAN you will get IGMP (and MLD for IPv6 multicast).  You may also 
get some NDP queries if you are using IPv6 and your WAN is itself 
behind a NAT router (and yes, there are ISPs who do that).



6. Certain odd flag combinations on single TCP packets (check the
documentation for Nmap for more info regarding these), which I believe
(although I may be reading the code wrong) you aren't accounting for.


I've tried many TCP flags combination with hping3, NUL, SYN/ACK, ACK,
SYN/FIN, etc.
They don't get any response when the flag is set

How about FIN/ACK and FIN/PSH/URG?



7. DAD queries.


Never looked at these packets; are they a subset of NDP?
Kind of, it's an ICMPv6 extension for detecting whether a SLAAC-configured 
address is already in use.  Most distros have support for it enabled by 
default.

8. ICMP address mask queries (which you also don't appear to account for).


It's deprecated and actually it doesn't get any response already
Just because it's deprecated doesn't mean you shouldn't account for it, 
although it does appear to get dropped by default by the kernel.


You should also test how different combinations of sysctls under 
/proc/sys/net affect this (there are for example already sysctls for 
ignoring certain types of ICMP packets).






[PATCH net] bridge: mdb: fix double add notification

2015-07-13 Thread Nikolay Aleksandrov
Since the mdb add/del code was introduced there have been 2 br_mdb_notify
calls when doing br_mdb_add() resulting in 2 notifications on each add.

Example:
 Command: bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
 Before patch:
 root@debian:~# bridge monitor all
 [MDB]dev br0 port eth1 grp 239.0.0.1 permanent
 [MDB]dev br0 port eth1 grp 239.0.0.1 permanent

 After patch:
 root@debian:~# bridge monitor all
 [MDB]dev br0 port eth1 grp 239.0.0.1 permanent
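
The double notification can be modelled in a few lines of user-space C. This is a simplified sketch, not the bridge code: model_notify(), model_add_group() and model_mdb_add() are hypothetical stand-ins for the notify call, the group helper and the add path.

```c
#include <assert.h>

static int notifications;

/* Stand-in for sending one RTM_NEWMDB event. */
static void model_notify(void)
{
	notifications++;
}

/* Stand-in for the group helper; before the patch it also notified. */
static int model_add_group(int notify_in_helper)
{
	/* ... insert the port group entry ... */
	if (notify_in_helper)
		model_notify();
	return 0;
}

/* Stand-in for the add path, which notifies once on success. */
static int model_mdb_add(int notify_in_helper)
{
	int err = model_add_group(notify_in_helper);

	if (err)
		return err;
	model_notify();
	return 0;
}
```

With the helper notifying as well, one add yields two events; with the call removed from the helper, one add yields one event.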

Signed-off-by: Nikolay Aleksandrov niko...@cumulusnetworks.com
Fixes: cfd567543590 ("bridge: add support of adding and deleting mdb entries")
---
 net/bridge/br_mdb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index c11cf2611db0..1198a3dbad95 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -351,7 +351,6 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
 	if (state == MDB_TEMPORARY)
 		mod_timer(&p->timer, now + br->multicast_membership_interval);
 
-	br_mdb_notify(br->dev, port, group, RTM_NEWMDB);
 	return 0;
 }
 
-- 
1.9.1



Re: [PATCH 1/2] sctp: SCTP_SOCKOPT_PEELOFF return socket pointer for kernel users

2015-07-13 Thread Marcelo Ricardo Leitner

On 13-07-2015 07:39, Neil Horman wrote:

On Fri, Jul 10, 2015 at 06:21:14PM -0700, David Miller wrote:

From: Marcelo Ricardo Leitner marcelo.leit...@gmail.com
Date: Thu,  9 Jul 2015 11:15:19 -0300


SCTP has this operation to peel off associations from a given socket and
create a new socket using this association. We currently have two ways
to use this operation:
- via getsockopt(), on which it will also create and return a file
   descriptor for this new socket
- via sctp_do_peeloff(), which is for kernel only

The caveat with using sctp_do_peeloff() directly is that it creates a
dependency to SCTP module, while all other operations are handled via
kernel_{socket,sendmsg,getsockopt...}() interface. This causes the
kernel to load SCTP module even when it's not directly used

This patch then updates SCTP_SOCKOPT_PEELOFF so that for kernel users of
this protocol it will not allocate a file descriptor but instead just
return the socket pointer directly.

If called by an user application it will work as before.

Signed-off-by: Marcelo Ricardo Leitner marcelo.leit...@gmail.com


I do not like this at all.

Socket option implementations should not change their behavior or what
datastructures they consume or return just because the socket happens
to be a kernel socket.


But in this case it's necessary, as the kernel here can't allocate an fd, due
to serious leakage (see commit 2f2d76cc3e938389feee671b46252dde6880b3b7).
Initially Marcelo had created duplicate code paths, one to return an fd, one to
return a file struct.  If you would rather go in that direction, I'm sure he can
propose it again, but that seems less correct to me than this solution.


Yes.

dlm is the only user of this option within kernel today and it causes 
serious problems, as Neil just referenced. Another good result of this 
implementation is that we are preventing such leakage from happening 
again in the future.



I'm not applying this series, sorry.

Also, your patch series lacked an initial PATCH 0/N posting, so you
could at least spend the time to discuss this patch series at a high
level and explain your overall motivations.


That was in the initial posting.  It should have been reposted, but if you're
interested:
http://marc.info/?l=linux-sctp&m=143449456219518&w=2


My bad. Won't happen again.

Thanks,
Marcelo



Re: [PATCH] nf: IDLETIMER: fix lockdep warning

2015-07-13 Thread Dmitry Torokhov
On Mon, Jul 13, 2015 at 6:20 AM, Pablo Neira Ayuso pa...@netfilter.org wrote:
 On Thu, Jul 09, 2015 at 05:15:01PM -0700, Dmitry Torokhov wrote:
 Dynamically allocated sysfs attributes should be initialized with
 sysfs_attr_init() otherwise lockdep will be angry with us:

 [   45.468653] BUG: key ffc030fad4e0 not in .data!
 [   45.468655] [ cut here ]
 [   45.468666] WARNING: CPU: 0 PID: 1176 at 
 /mnt/host/source/src/third_party/kernel/v3.18/kernel/locking/lockdep.c:2991 
 lockdep_init_map+0x12c/0x490()
 [   45.468672] DEBUG_LOCKS_WARN_ON(1)
 [   45.468672] CPU: 0 PID: 1176 Comm: iptables Tainted: G U  W 3.18.0 #43
 [   45.468674] Hardware name: XXX
 [   45.468675] Call trace:
 [   45.468680] [ffc0002072b4] dump_backtrace+0x0/0x10c
 [   45.468683] [ffc0002073d0] show_stack+0x10/0x1c
 [   45.468688] [ffc000a86cd4] dump_stack+0x74/0x94
 [   45.468692] [ffc000217ae0] warn_slowpath_common+0x84/0xb0
 [   45.468694] [ffc000217b84] warn_slowpath_fmt+0x4c/0x58
 [   45.468697] [ffc0002530a4] lockdep_init_map+0x128/0x490
 [   45.468701] [ffc000367ef0] __kernfs_create_file+0x80/0xe4
 [   45.468704] [ffc00036862c] sysfs_add_file_mode_ns+0x104/0x170
 [   45.468706] [ffc00036870c] sysfs_create_file_ns+0x58/0x64
 [   45.468711] [ffc000930430] idletimer_tg_checkentry+0x14c/0x324
 [   45.468714] [ffc00092a728] xt_check_target+0x170/0x198
 [   45.468717] [ffc000993efc] check_target+0x58/0x6c
 [   45.468720] [ffc000994c64] translate_table+0x30c/0x424
 [   45.468723] [ffc00099529c] do_ipt_set_ctl+0x144/0x1d0
 [   45.468728] [ffc0009079f0] nf_setsockopt+0x50/0x60
 [   45.468732] [ffc000946870] ip_setsockopt+0x8c/0xb4
 [   45.468735] [ffc0009661c0] raw_setsockopt+0x10/0x50
 [   45.468739] [ffc0008c1550] sock_common_setsockopt+0x14/0x20
 [   45.468742] [ffc0008bd190] SyS_setsockopt+0x88/0xb8
 [   45.468744] ---[ end trace 41d156354d18c039 ]---

 Applied, thanks.

 One question:

 Change-Id: I1da5cd96fc8e1e1e4209e81eba1165a42d4d45e9

 BTW, does this gerrit change ID provide any public information? Thanks.

Argh, I am sorry, I forgot to clean this out when mailing the patch.
In this particular case you can find the change in AOSP gerrit at
https://android-review.googlesource.com but without such context this
change-id is of course useless.

Thanks,
Dmitry


mmap()ed AF_NETLINK: lockdep and sleep-in-atomic warnings

2015-07-13 Thread Kirill A. Shutemov
Hi,

This simple test-case triggers a few locking asserts in the kernel:

#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <linux/netlink.h>

#define SOL_NETLINK 270

int main(int argc, char **argv)
{
	unsigned int block_size = 16 * 4096;
	struct nl_mmap_req req = {
		.nm_block_size	= block_size,
		.nm_block_nr	= 64,
		.nm_frame_size	= 16384,
		.nm_frame_nr	= 64 * block_size / 16384,
	};
	unsigned int ring_size;
	int fd;

	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
	if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &req, sizeof(req)) < 0)
		exit(1);
	if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &req, sizeof(req)) < 0)
		exit(1);

	ring_size = req.nm_block_nr * req.nm_block_size;
	mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	return 0;
}

+++ exited with 0 +++
[2.500126] BUG: sleeping function called from invalid context at 
/home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
[2.501328] in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
[2.501997] 3 locks held by init/1:
[2.502380]  #0:  (reboot_mutex){+.+...}, at: [81080959] 
SyS_reboot+0xa9/0x220
[2.503328]  #1:  ((reboot_notifier_list).rwsem){.+.+..}, at: 
[8107f379] __blocking_notifier_call_chain+0x39/0x70
[2.504659]  #2:  (rcu_callback){..}, at: [810d32e0] 
rcu_do_batch.isra.49+0x160/0x10c0
[2.505724] Preemption disabled at:[8145365f] __delay+0xf/0x20
[2.506443] 
[2.506612] CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-9-gbddf4c4818e0 
#253
[2.507378] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
Debian-1.8.2-1 04/01/2014
[2.508386]  88017b3d8000 88027bc03c38 81929ceb 
0102
[2.509233]   88027bc03c68 81085a9d 
0002
[2.510057]  81ca2a20 0268  
88027bc03c98
[2.510882] Call Trace:
[2.511146]  IRQ  [81929ceb] dump_stack+0x4f/0x7b
[2.511763]  [81085a9d] ___might_sleep+0x16d/0x270
[2.512476]  [81085bed] __might_sleep+0x4d/0x90
[2.513071]  [8192e96f] mutex_lock_nested+0x2f/0x430
[2.513683]  [81932fed] ? _raw_spin_unlock_irqrestore+0x5d/0x80
[2.514385]  [81464143] ? __this_cpu_preempt_check+0x13/0x20
[2.515066]  [8182fc3d] netlink_set_ring+0x1ed/0x350
[2.515694]  [8182e000] ? netlink_undo_bind+0x70/0x70
[2.516411]  [8182fe20] netlink_sock_destruct+0x80/0x150
[2.517070]  [817e484d] __sk_free+0x1d/0x160
[2.517607]  [817e49a9] sk_free+0x19/0x20
[2.518118]  [8182e020] deferred_put_nlk_sk+0x20/0x30
[2.518735]  [810d391c] rcu_do_batch.isra.49+0x79c/0x10c0
[2.519386]  [810d32e0] ? rcu_do_batch.isra.49+0x160/0x10c0
[2.520101]  [810d787b] rcu_process_callbacks+0xdb/0x6d0
[2.520790]  [8105dd52] __do_softirq+0x152/0x630
[2.521370]  [8105e3be] irq_exit+0x8e/0xb0
[2.521895]  [81936366] smp_apic_timer_interrupt+0x46/0x60
[2.522558]  [8145365f] ? __delay+0xf/0x20
[2.523079]  [81934a00] apic_timer_interrupt+0x70/0x80
[2.523705]  EOI  [8145365f] ? __delay+0xf/0x20
[2.524366]  [810b26ab] ? in_lock_functions+0x1b/0x20
[2.524995]  [8108ab81] get_parent_ip+0x11/0x50
[2.525562]  [8108ad1f] preempt_count_sub+0x9f/0xf0
[2.526179]  [81453778] delay_tsc+0x68/0xc0
[2.526706]  [8145365f] __delay+0xf/0x20
[2.527207]  [8145369a] __const_udelay+0x2a/0x30
[2.527781]  [8172d05a] md_notify_reboot+0xea/0x100
[2.528489]  [8107f379] ? __blocking_notifier_call_chain+0x39/0x70
[2.529236]  [8107efc6] notifier_call_chain+0x66/0x90
[2.529856]  [8107f391] __blocking_notifier_call_chain+0x51/0x70
[2.530570]  [810ae8c6] ? __lock_acquire+0x606/0xf50
[2.531172]  [8107f3c6] blocking_notifier_call_chain+0x16/0x20
[2.531869]  [8108061d] kernel_restart_prepare+0x1d/0x40
[2.532593]  [810806e6] kernel_restart+0x16/0x60
[2.533183]  [81080a07] SyS_reboot+0x157/0x220
[2.533738]  [81010778] ? __restore_xstate_sig+0xf8/0x720
[2.534390]  [81464127] ? debug_smp_processor_id+0x17/0x20
[2.535051]  [810a836e] ? put_lock_stats.isra.19+0xe/0x30
[2.535707]  [81933040] ? _raw_spin_unlock_irq+0x30/0x60
[2.536446]  [8108ad2b] ? preempt_count_sub+0xab/0xf0
[2.537112]  [81933daa] ? syscall_return+0x11/0x54
[2.537709]  [81464143] ? __this_cpu_preempt_check+0x13/0x20
[2.538399]  

[PATCH net-next 1/4 v2] gianfar: Bundle Rx allocation, cleanup

2015-07-13 Thread Claudiu Manoil
Use a more common consumer/producer index design to improve
rx buffer allocation.  Instead of allocating a single new buffer
(skb) on each iteration, bundle the allocation of several rx
buffers at a time.  This also opens the path for further memory
optimizations.

Remove useless check of rxq->rfbptr, since this patch touches
rx pause frame handling code as well.  rxq->rfbptr is always
initialized as part of Rx BD ring init.
Remove redundant (and misleading) 'amount_pull' parameter.
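
For readers unfamiliar with the index scheme, a minimal user-space sketch follows. It is an illustration under simplifying assumptions, not the driver code: struct rx_ring and the ring_*() helpers are hypothetical, buffer allocation is elided, and every posted buffer is treated as already received.

```c
#include <assert.h>

struct rx_ring {
	unsigned int size;		/* number of descriptors */
	unsigned int next_to_use;	/* producer index */
	unsigned int next_to_clean;	/* consumer index */
};

/* Descriptors that may be refilled; one is always kept unused so that
 * equal indices can only mean "nothing to clean".
 */
static unsigned int ring_unused(const struct rx_ring *r)
{
	if (r->next_to_clean > r->next_to_use)
		return r->next_to_clean - r->next_to_use - 1;
	return r->size + r->next_to_clean - r->next_to_use - 1;
}

/* Refill 'count' descriptors in one go (buffer allocation elided). */
static void ring_refill(struct rx_ring *r, unsigned int count)
{
	assert(count <= ring_unused(r));
	r->next_to_use = (r->next_to_use + count) % r->size;
}

/* Consume one descriptor; returns 0 when nothing is pending. */
static int ring_clean_one(struct rx_ring *r)
{
	if (r->next_to_clean == r->next_to_use)
		return 0;
	r->next_to_clean = (r->next_to_clean + 1) % r->size;
	return 1;
}
```

Refilling only once several descriptors have been cleaned amortizes the allocation cost over a batch instead of paying it on every received frame.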

Signed-off-by: Claudiu Manoil claudiu.man...@freescale.com
---
v2: none

 drivers/net/ethernet/freescale/gianfar.c | 201 ---
 drivers/net/ethernet/freescale/gianfar.h |  39 +++--
 drivers/net/ethernet/freescale/gianfar_ethtool.c |   3 +
 3 files changed, 136 insertions(+), 107 deletions(-)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index ff87502..b35bf3d 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -116,8 +116,8 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev);
 static void gfar_reset_task(struct work_struct *work);
 static void gfar_timeout(struct net_device *dev);
 static int gfar_close(struct net_device *dev);
-static struct sk_buff *gfar_new_skb(struct net_device *dev,
-   dma_addr_t *bufaddr);
+static void gfar_alloc_rx_buffs(struct gfar_priv_rx_q *rx_queue,
+   int alloc_cnt);
 static int gfar_set_mac_address(struct net_device *dev);
 static int gfar_change_mtu(struct net_device *dev, int new_mtu);
 static irqreturn_t gfar_error(int irq, void *dev_id);
@@ -142,7 +142,7 @@ static void gfar_netpoll(struct net_device *dev);
 int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue, int rx_work_limit);
 static void gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue);
 static void gfar_process_frame(struct net_device *dev, struct sk_buff *skb,
-  int amount_pull, struct napi_struct *napi);
+  struct napi_struct *napi);
 static void gfar_halt_nodisable(struct gfar_private *priv);
 static void gfar_clear_exact_match(struct net_device *dev);
 static void gfar_set_mac_for_addr(struct net_device *dev, int num,
@@ -169,17 +169,15 @@ static void gfar_init_rxbdp(struct gfar_priv_rx_q *rx_queue, struct rxbd8 *bdp,
 	bdp->lstatus = cpu_to_be32(lstatus);
 }
 
-static int gfar_init_bds(struct net_device *ndev)
+static void gfar_init_bds(struct net_device *ndev)
 {
 	struct gfar_private *priv = netdev_priv(ndev);
 	struct gfar __iomem *regs = priv->gfargrp[0].regs;
 	struct gfar_priv_tx_q *tx_queue = NULL;
 	struct gfar_priv_rx_q *rx_queue = NULL;
 	struct txbd8 *txbdp;
-	struct rxbd8 *rxbdp;
 	u32 __iomem *rfbptr;
 	int i, j;
-	dma_addr_t bufaddr;
 
 	for (i = 0; i < priv->num_tx_queues; i++) {
 		tx_queue = priv->tx_queue[i];
@@ -207,33 +205,18 @@ static int gfar_init_bds(struct net_device *ndev)
 	rfbptr = &regs->rfbptr0;
 	for (i = 0; i < priv->num_rx_queues; i++) {
 		rx_queue = priv->rx_queue[i];
-		rx_queue->cur_rx = rx_queue->rx_bd_base;
-		rx_queue->skb_currx = 0;
-		rxbdp = rx_queue->rx_bd_base;
-
-		for (j = 0; j < rx_queue->rx_ring_size; j++) {
-			struct sk_buff *skb = rx_queue->rx_skbuff[j];
 
-			if (skb) {
-				bufaddr = be32_to_cpu(rxbdp->bufPtr);
-			} else {
-				skb = gfar_new_skb(ndev, &bufaddr);
-				if (!skb) {
-					netdev_err(ndev, "Can't allocate RX buffers\n");
-					return -ENOMEM;
-				}
-				rx_queue->rx_skbuff[j] = skb;
-			}
+		rx_queue->next_to_clean = 0;
+		rx_queue->next_to_use = 0;
 
-			gfar_init_rxbdp(rx_queue, rxbdp, bufaddr);
-			rxbdp++;
-		}
+		/* make sure next_to_clean != next_to_use after this
+		 * by leaving at least 1 unused descriptor
+		 */
+		gfar_alloc_rx_buffs(rx_queue, gfar_rxbd_unused(rx_queue));
 
 		rx_queue->rfbptr = rfbptr;
 		rfbptr += 2;
 	}
-
-	return 0;
 }
 
 static int gfar_alloc_skb_resources(struct net_device *ndev)
@@ -311,8 +294,7 @@ static int gfar_alloc_skb_resources(struct net_device *ndev)
 			rx_queue->rx_skbuff[j] = NULL;
 	}
 
-	if (gfar_init_bds(ndev))
-		goto cleanup;
+	gfar_init_bds(ndev);
 
 	return 0;
 
@@ -1639,10 +1621,7 @@ static int gfar_restore(struct device *dev)
 		return 0;
 	}
 
-   if (gfar_init_bds(ndev)) {
-

Re: [PATCH net-next] tc: fix tc actions in case of shared skb

2015-07-13 Thread Jamal Hadi Salim

On 07/10/15 20:10, Alexei Starovoitov wrote:

TC actions need to check for very unlikely event skb->users != 1,
otherwise subsequent pskb_may_pull/pskb_expand_head will crash.
When skb_shared() just drop the packet, since in the middle of actions
it's too late to call skb_share_check(), since classifiers/actions assume
the same skb pointer.



Alexei,
To add to what Dave said - are the rules specified here:
Documentation/networking/tc-actions-env-rules.txt
insufficient?

cheers,
jamal


Re: [PATCH] nf: IDLETIMER: fix lockdep warning

2015-07-13 Thread Pablo Neira Ayuso
On Thu, Jul 09, 2015 at 05:15:01PM -0700, Dmitry Torokhov wrote:
 Dynamically allocated sysfs attributes should be initialized with
 sysfs_attr_init() otherwise lockdep will be angry with us:
 
 [   45.468653] BUG: key ffc030fad4e0 not in .data!
 [   45.468655] [ cut here ]
 [   45.468666] WARNING: CPU: 0 PID: 1176 at 
 /mnt/host/source/src/third_party/kernel/v3.18/kernel/locking/lockdep.c:2991 
 lockdep_init_map+0x12c/0x490()
 [   45.468672] DEBUG_LOCKS_WARN_ON(1)
 [   45.468672] CPU: 0 PID: 1176 Comm: iptables Tainted: G U  W 3.18.0 #43
 [   45.468674] Hardware name: XXX
 [   45.468675] Call trace:
 [   45.468680] [ffc0002072b4] dump_backtrace+0x0/0x10c
 [   45.468683] [ffc0002073d0] show_stack+0x10/0x1c
 [   45.468688] [ffc000a86cd4] dump_stack+0x74/0x94
 [   45.468692] [ffc000217ae0] warn_slowpath_common+0x84/0xb0
 [   45.468694] [ffc000217b84] warn_slowpath_fmt+0x4c/0x58
 [   45.468697] [ffc0002530a4] lockdep_init_map+0x128/0x490
 [   45.468701] [ffc000367ef0] __kernfs_create_file+0x80/0xe4
 [   45.468704] [ffc00036862c] sysfs_add_file_mode_ns+0x104/0x170
 [   45.468706] [ffc00036870c] sysfs_create_file_ns+0x58/0x64
 [   45.468711] [ffc000930430] idletimer_tg_checkentry+0x14c/0x324
 [   45.468714] [ffc00092a728] xt_check_target+0x170/0x198
 [   45.468717] [ffc000993efc] check_target+0x58/0x6c
 [   45.468720] [ffc000994c64] translate_table+0x30c/0x424
 [   45.468723] [ffc00099529c] do_ipt_set_ctl+0x144/0x1d0
 [   45.468728] [ffc0009079f0] nf_setsockopt+0x50/0x60
 [   45.468732] [ffc000946870] ip_setsockopt+0x8c/0xb4
 [   45.468735] [ffc0009661c0] raw_setsockopt+0x10/0x50
 [   45.468739] [ffc0008c1550] sock_common_setsockopt+0x14/0x20
 [   45.468742] [ffc0008bd190] SyS_setsockopt+0x88/0xb8
 [   45.468744] ---[ end trace 41d156354d18c039 ]---

Applied, thanks.

One question:

 Change-Id: I1da5cd96fc8e1e1e4209e81eba1165a42d4d45e9

BTW, does this gerrit change ID provide any public information? Thanks.


Re: [net-next 12/16] i40evf: don't delete all the filters

2015-07-13 Thread Sergei Shtylyov

Hello.

On 7/13/2015 12:08 PM, Jeff Kirsher wrote:


From: Mitch Williams mitch.a.willi...@intel.com



Due to an inverted conditional, the driver was marking all of its MAC
filters for deletion every time set_rx_mode was called. Depending upon
the timing of the calls to set_rx_mode and the processing of the admin
queue, the driver would (accidentally) end up with a varying number of
functional filters.



Correct this logic so that MAC filters are added and removed correctly.
Add a check for the driver's hardware MAC address so that this filter
doesn't get removed incorrectly.



Change-ID: Ib3e7c4a5b53df6835f164fe44cb778cb71f8aff8
Signed-off-by: Mitch Williams mitch.a.willi...@intel.com
Tested-by: Jim Young james.m.yo...@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirs...@intel.com
---
  drivers/net/ethernet/intel/i40evf/i40evf_main.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)



diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c 
b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index 94eff4a..07f6052 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -892,8 +892,10 @@ static void i40evf_set_rx_mode(struct net_device *netdev)
break;
}
}
+   if (ether_addr_equal(f->macaddr, adapter->hw.mac.addr))
+   found = true;


   This line is indented too much.

[...]

WBR, Sergei



Re: [rhashtable] WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:301 __debug_object_init()

2015-07-13 Thread Fengguang Wu
 22+ 
  0  Merge branch 'for-linus' of 
 git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
 git bisect  bad fd3e646c87ab3f2ba98aa25394581af27cc78dc5  # 09:27  0- 
 22  net: act_bpf: fix size mismatch on filter preparation
 git bisect  bad e84448d52190413400663736067f826f28a04ad6  # 09:32  0- 
 22  xen-netfront: refactor skb slot counting
 git bisect  bad 829a3ada9cc7d4c30fa61f8033403fb6c8f8092a  # 09:38  0- 
  1  geneve: Simplify locking.
 git bisect good a4c9ea5e8fec680134d22aa99b54d1cd8c226ebd  # 09:42 22+ 
 12  geneve: Add Geneve GRO support
 git bisect good 255047b0dca31e6b8ce254481a0b65d559d2ebb8  # 09:46 20+ 
  0  Bluetooth: Add timing information to SMP test case runs
 git bisect good 354f473ee2c5d01c1cf90f747f95218ee3e73e95  # 09:52 22+ 
  0  ath9k: fix typo
 git bisect good d312da293f787e1b19c57acb58e8c1b171c4a04a  # 09:59 22+ 
  0  ixgbe: convert to CYCLECOUNTER_MASK macro.
 git bisect good b8e1943e9f754219bcfb40bac4a605b5348acb25  # 10:03 22+ 
  8  rhashtable: Factor out bucket_tail() function
 git bisect  bad f89bd6f87a53ce5a7d60662429591ebac2745c10  # 10:08  0- 
 22  rhashtable: Supports for nulls marker
 git bisect good 113948d841e8d78039e5dbbb5248f5b73e99eafa  # 10:12 22+ 
 13  spinlock: Add spin_lock_bh_nested()
 git bisect  bad 97defe1ecf868b8127f8e62395499d6a06e4c4b1  # 10:16  0- 
 22  rhashtable: Per bucket locks & deferred expansion/shrinking
 # first bad commit: [97defe1ecf868b8127f8e62395499d6a06e4c4b1] rhashtable: 
 Per bucket locks & deferred expansion/shrinking
 git bisect good 113948d841e8d78039e5dbbb5248f5b73e99eafa  # 10:19 66+ 
 27  spinlock: Add spin_lock_bh_nested()
 # extra tests with DEBUG_INFO
 git bisect  bad 97defe1ecf868b8127f8e62395499d6a06e4c4b1  # 10:25  0- 
 66  rhashtable: Per bucket locks & deferred expansion/shrinking
 # extra tests on HEAD of linux-devel/devel-spot-201507122014
 git bisect good 3afd2c3f65a385c405a084d80431c84b103cb6df  # 10:28 66+ 
 49  0day head guard for 'devel-spot-201507122014'
 # extra tests on tree/branch linus/master
 git bisect good f760b87f8f12eb262f14603e65042996fe03720e  # 10:33 66+ 
  0  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
 # extra tests on tree/branch linus/master
 git bisect good f760b87f8f12eb262f14603e65042996fe03720e  # 10:33 66+ 
  0  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
 # extra tests on tree/branch next/master
 git bisect good 2eb62d762a2112579f259903e62ba18d16c51f66  # 10:36 66+ 
 20  Add linux-next specific files for 20150713
 
 
 This script may reproduce the error.
 
 
 #!/bin/bash
 
 kernel=$1
 initrd=yocto-minimal-x86_64.cgz
 
 wget --no-clobber 
 https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
 
 kvm=(
   qemu-system-x86_64
   -enable-kvm
   -cpu Haswell,+smep,+smap
   -kernel $kernel
   -initrd $initrd
   -m 256
   -smp 1
   -device e1000,netdev=net0
   -netdev user,id=net0
   -boot order=nc
   -no-reboot
   -watchdog i6300esb
   -rtc base=localtime
   -serial stdio
   -display none
   -monitor null 
 )
 
 append=(
   hung_task_panic=1
   earlyprintk=ttyS0,115200
   systemd.log_level=err
   debug
   apic=debug
   sysrq_always_enabled
   rcupdate.rcu_cpu_stall_timeout=100
   panic=-1
   softlockup_panic=1
   nmi_watchdog=panic
   oops=panic
   load_ramdisk=2
   prompt_ramdisk=0
   console=ttyS0,115200
   console=tty0
   vga=normal
   root=/dev/ram0
   rw
   drbd.minor_count=8
 )
 
 "${kvm[@]}" --append "${append[*]}"
 
 
 ---
 0-DAY kernel test infrastructureOpen Source Technology Center
 https://lists.01.org/pipermail/lkp  Intel Corporation

 early console in setup code
 [0.00] Initializing cgroup subsys cpuset
 [0.00] Initializing cgroup subsys cpu
 [0.00] Linux version 3.19.0-rc2-00323-g97defe1 (kbuild@lkp-ib03) (gcc 
 version 4.9.2 (Debian 4.9.2-10) ) #1 SMP Tue Jul 14 10:14:59 CST 2015
 [0.00] Command line: hung_task_panic=1 earlyprintk=ttyS0,115200 
 systemd.log_level=err debug apic=debug sysrq_always_enabled 
 rcupdate.rcu_cpu_stall_timeout=100 panic=-1 softlockup_panic=1 
 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 
 console=ttyS0,115200 console=tty0 vga=normal  root=/dev/ram0 rw 
 link=/kbuild-tests/run-queue/kvm/x86_64-randconfig-a0-07122340/linux-devel:devel-spot-201507122014:97defe1ecf868b8127f8e62395499d6a06e4c4b1:bisect-linux-1/.vmlinuz-97defe1ecf868b8127f8e62395499d6a06e4c4b1-20150714101515-19-ivb41
  branch=linux-devel/devel-spot-201507122014 
 BOOT_IMAGE=/pkg/linux/x86_64-randconfig-a0-07122340/gcc-4.9

RE: [PATCH v3] Add support for driver cross-timestamp to PTP_SYS_OFFSET ioctl

2015-07-13 Thread Hall, Christopher S
I am assuming the patch is rejected at this point.  I will re-submit later as 
soon as I am able to post a full end to end solution.

Chris

 -Original Message-
 From: Richard Cochran [mailto:richardcoch...@gmail.com]
 Sent: Thursday, July 09, 2015 7:58 AM
 To: Hall, Christopher S
 Cc: t...@linutronix.de; john.stu...@linaro.org; Ronciak, John; linux-
 ker...@vger.kernel.org; netdev@vger.kernel.org
 Subject: Re: [PATCH v3] Add support for driver cross-timestamp to
 PTP_SYS_OFFSET ioctl
 
 On Wed, Jul 08, 2015 at 01:46:41PM -0700, Christopher Hall wrote:
  This patch allows system and device time (cross-timestamp) to be
  performed by the driver. Currently, the cross-timestamping is performed
  in the PTP_SYS_OFFSET ioctl.  The PTP clock driver reads gettimeofday()
  and the gettime64() callback provided by the driver. The cross-timestamp
  is best effort where the latency between the capture of system time
  (getnstimeofday()) and the device time (driver callback) may be
  significant.
 
 The interface looks okay to me.  Now all we need is a user of it...
 
 Acked-by: Richard Cochran richardcoch...@gmail.com


Re: [PATCH 4/6] net: ieee802154: Remove redundant spi driver bus initialization

2015-07-13 Thread Alexander Aring
On Tue, Jun 23, 2015 at 10:52:52PM +0800, Antonio Borneo wrote:
 In ancient times it was necessary to manually initialize the bus
 field of an spi_driver to spi_bus_type. These days this is done in
 spi_register_driver(), so we can drop the manual assignment.
 

Marcel,

I don't see this patch in any linux-next, net-next, bluetooth-next tree.
Could you please apply this patch with the acks by Alan and Varka?

- Alex


[PATCH V4 1/2] pci: Add dev_flags bit to access VPD through function 0

2015-07-13 Thread Mark D Rustad
From: Mark Rustad mark.d.rus...@intel.com

Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
function 0 to provide VPD access on other functions. This is for
hardware devices that provide copies of the same VPD capability
registers in multiple functions. Because the kernel expects that
each function has its own registers, both the locking and the state
tracking are affected by VPD accesses to different functions.

On such devices for example, if a VPD write is performed on function
0, *any* later attempt to read VPD from any other function of that
device will hang. This has to do with how the kernel tracks the
expected value of the F bit per function.

Concurrent accesses to different functions of the same device can
not only hang but also corrupt both read and write VPD data.

When hangs occur, typically the error message:

vpd r/w failed.  This is likely a firmware bug on this device.

will be seen.

Never set this bit on function 0 or there will be an infinite recursion.

Signed-off-by: Mark Rustad mark.d.rus...@intel.com
---
Changes in V2:
- Corrected spelling in log message
- Added checks to see that the referenced function 0 is reasonable
Changes in V3:
- Don't leak a device reference
- Check that function 0 has VPD
- Make a helper for the function 0 checks
- Do multifunction check in the quirk
Changes in V4:
- Provide a much more detailed explanation in the commit log
---
 drivers/pci/access.c |   61 +-
 include/linux/pci.h  |2 ++
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index d9b64a175990..b965c12168b7 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -439,6 +439,56 @@ static const struct pci_vpd_ops pci_vpd_pci22_ops = {
 	.release = pci_vpd_pci22_release,
 };
 
+static ssize_t pci_vpd_f0_read(struct pci_dev *dev, loff_t pos, size_t count,
+			       void *arg)
+{
+	struct pci_dev *tdev = pci_get_slot(dev->bus, PCI_SLOT(dev->devfn));
+	ssize_t ret;
+
+	if (!tdev)
+		return -ENODEV;
+
+	ret = pci_read_vpd(tdev, pos, count, arg);
+	pci_dev_put(tdev);
+	return ret;
+}
+
+static ssize_t pci_vpd_f0_write(struct pci_dev *dev, loff_t pos, size_t count,
+				const void *arg)
+{
+	struct pci_dev *tdev = pci_get_slot(dev->bus, PCI_SLOT(dev->devfn));
+	ssize_t ret;
+
+	if (!tdev)
+		return -ENODEV;
+
+	ret = pci_write_vpd(tdev, pos, count, arg);
+	pci_dev_put(tdev);
+	return ret;
+}
+
+static const struct pci_vpd_ops pci_vpd_f0_ops = {
+	.read = pci_vpd_f0_read,
+	.write = pci_vpd_f0_write,
+	.release = pci_vpd_pci22_release,
+};
+
+static int pci_vpd_f0_dev_check(struct pci_dev *dev)
+{
+	struct pci_dev *tdev = pci_get_slot(dev->bus, PCI_SLOT(dev->devfn));
+	int ret = 0;
+
+	if (!tdev)
+		return -ENODEV;
+	if (!tdev->vpd || !tdev->multifunction ||
+	    dev->class != tdev->class || dev->vendor != tdev->vendor ||
+	    dev->device != tdev->device)
+		ret = -ENODEV;
+
+	pci_dev_put(tdev);
+	return ret;
+}
+
 int pci_vpd_pci22_init(struct pci_dev *dev)
 {
 	struct pci_vpd_pci22 *vpd;
@@ -447,12 +497,21 @@ int pci_vpd_pci22_init(struct pci_dev *dev)
 	cap = pci_find_capability(dev, PCI_CAP_ID_VPD);
 	if (!cap)
 		return -ENODEV;
+	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0) {
+		int ret = pci_vpd_f0_dev_check(dev);
+
+		if (ret)
+			return ret;
+	}
 	vpd = kzalloc(sizeof(*vpd), GFP_ATOMIC);
 	if (!vpd)
 		return -ENOMEM;
 
 	vpd->base.len = PCI_VPD_PCI22_SIZE;
-	vpd->base.ops = &pci_vpd_pci22_ops;
+	if (dev->dev_flags & PCI_DEV_FLAGS_VPD_REF_F0)
+		vpd->base.ops = &pci_vpd_f0_ops;
+	else
+		vpd->base.ops = &pci_vpd_pci22_ops;
 	mutex_init(&vpd->lock);
 	vpd->cap = cap;
 	vpd->busy = false;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8a0321a8fb59..8edb125db13a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -180,6 +180,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_NO_BUS_RESET = (__force pci_dev_flags_t) (1 << 6),
 	/* Do not use PM reset even if device advertises NoSoftRst- */
 	PCI_DEV_FLAGS_NO_PM_RESET = (__force pci_dev_flags_t) (1 << 7),
+	/* Get VPD from function 0 VPD */
+	PCI_DEV_FLAGS_VPD_REF_F0 = (__force pci_dev_flags_t) (1 << 8),
 };
 
 enum pci_irq_reroute_variant {



[PATCH V4 2/2] pci: Add VPD quirk for Intel Ethernet devices

2015-07-13 Thread Mark D Rustad
From: Mark Rustad mark.d.rus...@intel.com

This quirk sets the PCI_DEV_FLAGS_VPD_REF_F0 flag on all Intel
Ethernet device functions other than function 0.

Signed-off-by: Mark Rustad mark.d.rus...@intel.com
---
Changes in V3:
- Added a multifunction device check
---
 drivers/pci/quirks.c |9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e9fd0e90fa3b..08c04e4f5ab2 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1894,6 +1894,15 @@ static void quirk_netmos(struct pci_dev *dev)
 DECLARE_PCI_FIXUP_CLASS_HEADER(PCI_VENDOR_ID_NETMOS, PCI_ANY_ID,
 PCI_CLASS_COMMUNICATION_SERIAL, 8, quirk_netmos);
 
+static void quirk_f0_vpd_link(struct pci_dev *dev)
+{
+   if (!dev->multifunction || !PCI_FUNC(dev->devfn))
+   return;
+   dev-dev_flags |= PCI_DEV_FLAGS_VPD_REF_F0;
+}
+DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
+ PCI_CLASS_NETWORK_ETHERNET, 8, quirk_f0_vpd_link);
+
 static void quirk_e100_interrupt(struct pci_dev *dev)
 {
u16 command, pmcsr;



[Patch net] fq_codel: fix a use-after-free

2015-07-13 Thread Cong Wang
Fixes: 25331d6ce42b (net: sched: implement qstat helper routines)
Cc: John Fastabend john.fastab...@gmail.com
Signed-off-by: Cong Wang xiyou.wangc...@gmail.com
Signed-off-by: Cong Wang cw...@twopensource.com
---
 net/sched/sch_fq_codel.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index d75993f..06e7c84 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -155,10 +155,10 @@ static unsigned int fq_codel_drop(struct Qdisc *sch)
 	skb = dequeue_head(flow);
 	len = qdisc_pkt_len(skb);
 	q->backlogs[idx] -= len;
-	kfree_skb(skb);
 	sch->q.qlen--;
 	qdisc_qstats_drop(sch);
 	qdisc_qstats_backlog_dec(sch, skb);
+	kfree_skb(skb);
 	flow->dropped++;
 	return idx;
 }
-- 
1.8.3.1
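
The reordering in this patch matters because qdisc_qstats_backlog_dec(sch, skb) still reads the packet length from the skb, so the skb has to stay alive until that call has returned. Below is a minimal userspace sketch of the same ordering rule. `struct pkt`, `struct stats` and the helper names are hypothetical stand-ins for the kernel structures, chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct pkt { unsigned int len; };
struct stats { unsigned int qlen, backlog, drops; };

/* Allocate a packet of the given length. */
static struct pkt *pkt_alloc(unsigned int len)
{
	struct pkt *p = malloc(sizeof(*p));

	assert(p != NULL);
	p->len = len;
	return p;
}

/* Account for a packet being queued. */
static void stats_enqueue(struct stats *s, const struct pkt *p)
{
	s->qlen++;
	s->backlog += p->len;
}

/* Drop a packet: update every counter that reads from it, then free it last. */
static unsigned int drop_pkt(struct stats *s, struct pkt *p)
{
	unsigned int len = p->len;

	s->qlen--;
	s->drops++;
	s->backlog -= p->len;	/* still reads p, so p must be alive here */
	free(p);		/* nothing may touch p after this line */
	return len;
}
```

Moving the `free(p)` call above the backlog update would reproduce the use-after-free pattern that the patch removes.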



Re: [PATCH v2 18/22] fjes: unshare_watch_task

2015-07-13 Thread Yasuaki Ishimatsu

Hi Izumi-san,

On Wed, 24 Jun 2015 11:55:50 +0900
Taku Izumi izumi.t...@jp.fujitsu.com wrote:

 This patch adds unshare_watch_task.
 Shared buffer's status can be changed into unshared.
 This task is used to monitor shared buffer's status.
 
 Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
 ---
  drivers/net/fjes/fjes.h  |   3 +
  drivers/net/fjes/fjes_main.c | 130 
 +++
  2 files changed, 133 insertions(+)
 
 diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
 index d31d4c3..57feee8 100644
 --- a/drivers/net/fjes/fjes.h
 +++ b/drivers/net/fjes/fjes.h
 @@ -59,6 +59,9 @@ struct fjes_adapter {
   struct work_struct tx_stall_task;
   struct work_struct raise_intr_rxdata_task;
  
 + struct work_struct unshare_watch_task;
 + unsigned long unshare_watch_bitmask;
 +
   struct delayed_work interrupt_watch_task;
   bool interrupt_watch_enable;
  
 diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
 index 1ddb9d3..69a238c 100644
 --- a/drivers/net/fjes/fjes_main.c
 +++ b/drivers/net/fjes/fjes_main.c
 @@ -73,6 +73,7 @@ static int fjes_remove(struct platform_device *);
  static int fjes_sw_init(struct fjes_adapter *);
  static void fjes_netdev_setup(struct net_device *);
  static void fjes_irq_watch_task(struct work_struct *);
 +static void fjes_watch_unshare_task(struct work_struct *);
  static void fjes_rx_irq(struct fjes_adapter *, int);
  static int fjes_poll(struct napi_struct *, int);
  
 @@ -312,6 +313,8 @@ static int fjes_close(struct net_device *netdev)
   fjes_free_irq(adapter);
  
   cancel_delayed_work_sync(&adapter->interrupt_watch_task);
 + cancel_work_sync(&adapter->unshare_watch_task);
 + adapter->unshare_watch_bitmask = 0;
   cancel_work_sync(&adapter->raise_intr_rxdata_task);
   cancel_work_sync(&adapter->tx_stall_task);
  
 @@ -1032,6 +1035,8 @@ static int fjes_probe(struct platform_device *plat_dev)
   INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
   INIT_WORK(&adapter->raise_intr_rxdata_task,
 fjes_raise_intr_rxdata_task);
 + INIT_WORK(&adapter->unshare_watch_task, fjes_watch_unshare_task);
 + adapter->unshare_watch_bitmask = 0;
  
   INIT_DELAYED_WORK(&adapter->interrupt_watch_task, fjes_irq_watch_task);
   adapter->interrupt_watch_enable = false;
 @@ -1077,6 +1082,7 @@ static int fjes_remove(struct platform_device *plat_dev)
   struct fjes_hw *hw = &adapter->hw;
  
   cancel_delayed_work_sync(&adapter->interrupt_watch_task);
 + cancel_work_sync(&adapter->unshare_watch_task);
   cancel_work_sync(&adapter->raise_intr_rxdata_task);
   cancel_work_sync(&adapter->tx_stall_task);
   if (adapter->control_wq)
 @@ -1136,6 +1142,130 @@ static void fjes_irq_watch_task(struct work_struct 
 *work)
   }
  }
  
 +static void fjes_watch_unshare_task(struct work_struct *work)
 +{
 + struct fjes_adapter *adapter =
 + container_of(work, struct fjes_adapter, unshare_watch_task);
 +
 + struct fjes_hw *hw = &adapter->hw;
 + struct net_device *netdev = adapter->netdev;
 + int epidx;
 + int max_epid, my_epid;
 + unsigned long unshare_watch_bitmask;
 + int wait_time = 0;
 + int is_shared;
 + int stop_req, stop_req_done;
 + int unshare_watch, unshare_reserve;
 + int ret;
 +
 + my_epid = hw->my_epid;
 + max_epid = hw->max_epid;
 +
 + unshare_watch_bitmask = adapter->unshare_watch_bitmask;
 + adapter->unshare_watch_bitmask = 0;
 +
 + while ((unshare_watch_bitmask || hw->txrx_stop_req_bit) &&
 +(wait_time < 3000)) {
 + for (epidx = 0; epidx < hw->max_epid; epidx++) {
 + if (epidx == hw->my_epid)
 + continue;
 +
 + is_shared =
 + fjes_hw_epid_is_shared(hw->hw_info.share, epidx);
 +
 + stop_req =
 + test_bit(epidx, &hw->txrx_stop_req_bit);
 +
 + stop_req_done =
 + hw->ep_shm_info[epidx].rx.info->v1i.rx_status &
 + FJES_RX_STOP_REQ_DONE;
 +
 + unshare_watch =
 + test_bit(epidx, &unshare_watch_bitmask);
 +
 + unshare_reserve =
 + test_bit(epidx,
 +  &hw->hw_info.buffer_unshare_reserve_bit);
 +
 + if ((!stop_req ||
 +  (is_shared && (!is_shared || !stop_req_done))) &&
 + (is_shared || !unshare_watch || !unshare_reserve))
 + continue;
 +

 + mutex_lock(&hw->hw_info.lock);
 + ret = fjes_hw_unregister_buff_addr(hw, epidx);
 + switch (ret) {
 + case 0:
 + break;
 + case -ENOMSG:
 + case -EBUSY:
 + default:
 + if 

Re: [PATCH net-next] ebpf: remove self-assignment in interpreter's tail call

2015-07-13 Thread Alexei Starovoitov

On 7/13/15 11:49 AM, Daniel Borkmann wrote:

ARG1 = BPF_R1 as it stands, evaluates to regs[BPF_REG_1] = regs[BPF_REG_1]
and thus has no effect. Add a comment instead, explaining what happens and
why it's okay to just remove it. Since from user space side, a tail call is
invoked as a pseudo helper function via bpf_tail_call_proto, the verifier
checks the arguments just like with any other helper function and makes
sure that the first argument (regs[BPF_REG_1])'s type is ARG_PTR_TO_CTX.

Signed-off-by: Daniel Borkmann dan...@iogearbox.net


Thanks!
Acked-by: Alexei Starovoitov a...@plumgrid.com
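
The self-assignment described in the commit message can be reproduced outside the kernel. The sketch below assumes, as the message states, that both macros resolve to the same slot of the register array. The names `REG_1`, `REG_ARG1`, `R1` and `ARG1` are hypothetical and only imitate the interpreter's naming.

```c
#include <assert.h>
#include <stdint.h>

enum { REG_0, REG_1, REG_2, NUM_REGS };	/* hypothetical register indices */
#define REG_ARG1 REG_1	/* the first-argument register is an alias of REG_1 */

#define R1   regs[REG_1]
#define ARG1 regs[REG_ARG1]

/* Returns 1 if the register file is unchanged by "ARG1 = R1". */
static int self_assign_is_noop(uint64_t ctx)
{
	uint64_t regs[NUM_REGS] = { 0, ctx, 42 };
	uint64_t before[NUM_REGS] = { regs[0], regs[1], regs[2] };
	int i;

	assert(&ARG1 == &R1);	/* both macros name the same storage */

	ARG1 = R1;		/* expands to regs[REG_1] = regs[REG_1] */

	for (i = 0; i < NUM_REGS; i++)
		if (regs[i] != before[i])
			return 0;
	return 1;
}
```

Because source and destination are the same array element, the statement cannot change any register, which is why replacing it with a comment is safe.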



Re: [PATCH 1/2] sctp: SCTP_SOCKOPT_PEELOFF return socket pointer for kernel users

2015-07-13 Thread David Miller
From: Neil Horman nhor...@tuxdriver.com
Date: Mon, 13 Jul 2015 06:39:11 -0400

 Initially Marcelo had created duplicate code paths, one to return an
 fd, one to return a file struct.  If you would rather go in that
 direction, I'm sure he can propose it again, but that seems less
 correct to me than this solution.

That's much better.


[PATCH] net/bonding: Add function bond_remove_proc_entry at __bond_release_one

2015-07-13 Thread clsoto
From: Carol L Soto cls...@linux.vnet.ibm.com

Add function bond_remove_proc_entry at __bond_release_one to avoid stack 
trace at rmmod bonding.

[68830.202239] remove_proc_entry: removing non-empty directory
'net/bonding', leaking at least 'bond0'
[68830.202257] [ cut here ]
[68830.202260] WARNING: at fs/proc/generic.c:562
[68830.202412] NIP [c02abf6c] .remove_proc_entry+0x1fc/0x240
[68830.202416] LR [c02abf68] .remove_proc_entry+0x1f8/0x240
[68830.202419] PACATMSCRATCH [80009032]
[68830.202421] Call Trace:
[68830.202424] [c00179277940] [c02abf68] 
.remove_proc_entry+0x1f8/0x240 (unreliable)
[68830.202434] [c001792779f0] [d53229a4] 
.bond_destroy_proc_dir+0x34/0x54 [bonding]
[68830.202440] [c00179277a70] [d53130e0] 
.bond_net_exit+0x90/0x120 [bonding]
[68830.202445] [c00179277b10] [c059944c] 
.ops_exit_list.isra.0+0x6c/0xd0
[68830.202450] [c00179277ba0] [c0599774] 
.unregister_pernet_operations+0x94/0x100
[68830.202454] [c00179277c40] [c0599814] 
.unregister_pernet_subsys+0x34/0x60
[68830.202460] [c00179277cc0] [d5323758] 
.bonding_exit+0x48/0x2328 [bonding]
[68830.202466] [c00179277d30] [c010dcc4] 
.SyS_delete_module+0x1f4/0x340
[68830.202471] [c00179277e30] [c0009e7c] 
syscall_exit+0x0/0x7c
[68830.202491] ---[ end trace 9bd1d810219c9875 ]---

Signed-off-by: Carol L Soto cls...@linux.vnet.ibm.com
---
 drivers/net/bonding/bond_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 19eb990..ace105a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1870,6 +1870,8 @@ static int __bond_release_one(struct net_device *bond_dev,
dev_set_mac_address(slave_dev, addr);
}
 
+   bond_remove_proc_entry(bond);
+
 	dev_set_mtu(slave_dev, slave->original_mtu);
 
 	slave_dev->priv_flags &= ~IFF_BONDING;
-- 
1.8.3.1



Re: [PATCH 1/2] sctp: SCTP_SOCKOPT_PEELOFF return socket pointer for kernel users

2015-07-13 Thread Marcelo Ricardo Leitner

On 13-07-2015 15:59, David Miller wrote:

From: Neil Horman nhor...@tuxdriver.com
Date: Mon, 13 Jul 2015 06:39:11 -0400


Initially Marcelo had created duplicate code paths, one to return an
fd, one to return a file struct.  If you would rather go in that
direction, I'm sure he can propose it again, but that seems less
correct to me than this solution.


That's much better.


I'm not sure what you mean. Is the new option better or the 
history/description?


  Marcelo



Re: Fighting out-of-order reception with RPS?

2015-07-13 Thread Tom Herbert
On Sun, Jul 12, 2015 at 12:15 PM, Oliver Hartkopp
socket...@hartkopp.net wrote:
 Hello Eric,

 On 07/11/2015 06:35 AM, Eric Dumazet wrote:
 On Fri, 2015-07-10 at 22:36 +0200, Oliver Hartkopp wrote:

 Hm. Doesn't sound like a good solution when there's a difference between 
 NAPI
 and non-NAPI drivers in matters of OOO, right?

 Isn't OOO a problem for you ? Then you either have to :

 1) Use a single CPU to handle IRQ from the device
 2) Use NAPI


 See below ...

 What about checking in netif_rx() if the non-NAPI driver has set a hash (aka
 the driver is OOO sensitive)?
 And if so we could automatically set rps_cpus for this interface in a way 
 that
 all CPUs are enabled to take skbs following the hash.

 Wow, netif_rx() is packet processing fast path, certainly not the place
 to add controlling path decisions.

 My only requirement is to be able to pick CAN frames (contained in skbs) from
 the socket in the same order they have been received.

 Please convert your driver to NAPI. You might then even benefit from
 GRO.

 Just some remarks about CAN and CAN frames as you suggest GRO which is
 completely pointless for CAN.

 CAN frames have a 11 or 29 bit CAN Identifier (no MAC but content addressing)
 and 0 to 64 bytes of payload. Therefore the MTU for CAN interfaces is 16 or 72
 byte (see struct can(fd)_frame). Each skbuff contains a single CAN frame.

 There are CAN controllers which have a FIFO for up to 32 CAN frames, e.g.
 flexcan.c which also implements NAPI. Others (e.g. sja1000.c) don't have any
 FIFO and the reading of the CAN frame from the memory mapped registers needs
 to be processed in the irq context instantly. So 'fast path' netif_rx() is
 reasonable, right?

 So why is it not possible to pass netif_rx() skbs from a specific CAN network
 interface to whatever queue where they are processed in order?

 E.g. with

 skb_set_hash(skb, dev->ifindex, PKT_HASH_TYPE_L2);

 and

 echo f > /sys/class/net/can0/queues/rx-0/rps_cpus

 I get properly ordered CAN frames - even with netif_rx() processed skbs. I
 just want to have this stuff to be enabled by default for CAN interfaces to
 kill the OOO frame issue.

If you really must process the CAN FIFO in the hard interrupt then
create a private sk_buff queue. In the interrupt, dequeue from the FIFO and
enqueue on the sk_buff queue. Then schedule NAPI, and when that runs,
process the sk_buff queue, calling netif_receive_skb for each
enqueued skb. Pretty simple actually :-)
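
The scheme described above can be sketched in userspace as a single FIFO that the "interrupt" side fills and the "poll" side drains in arrival order. Everything here is an assumption made for illustration: `struct frame`, `queue_put` and `queue_poll` are hypothetical names, the depth of 32 simply echoes the FIFO size mentioned earlier in the thread, and the locking a real driver needs between the two contexts is left out.

```c
#include <assert.h>
#include <stddef.h>

#define QLEN 32	/* hypothetical queue depth */

struct frame {
	unsigned int can_id;
	unsigned int seq;	/* arrival order, used to check ordering */
};

struct frame_queue {
	struct frame slots[QLEN];
	size_t head, tail, count;
};

/* "Hard IRQ" side: append one frame; returns 0 on success, -1 if full. */
static int queue_put(struct frame_queue *q, struct frame f)
{
	assert(q != NULL);
	if (q->count == QLEN)
		return -1;
	q->slots[q->tail] = f;
	q->tail = (q->tail + 1) % QLEN;
	q->count++;
	return 0;
}

/* "Poll" side: move up to 'budget' frames to 'out', oldest first. */
static size_t queue_poll(struct frame_queue *q, struct frame *out,
			 size_t budget)
{
	size_t done = 0;

	assert(q != NULL && out != NULL);
	while (done < budget && q->count > 0) {
		out[done++] = q->slots[q->head];
		q->head = (q->head + 1) % QLEN;
		q->count--;
	}
	return done;
}
```

Since one consumer drains one queue, frames leave in the order they arrived, which is the ordering property asked for in this thread.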

 Regards,
 Oliver



[PATCH net] tcp: don't use F-RTO on non-recurring timeouts

2015-07-13 Thread Yuchung Cheng
Currently F-RTO may repeatedly send new data packets on non-recurring
timeouts in CA_Loss mode. This is a bug because F-RTO (RFC5682)
should only be used on either new recovery or recurring timeouts.

This exacerbates the recovery progress during frequent timeout 
repair, because we prioritize sending new data packets instead of
repairing the holes when the bandwidth is already scarce.

Fix it by correcting the test of a new recovery episode.

Signed-off-by: Yuchung Cheng ych...@google.com
Signed-off-by: Neal Cardwell ncardw...@google.com
---
 net/ipv4/tcp_input.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1578fc2..0cef1af 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1920,14 +1920,13 @@ void tcp_enter_loss(struct sock *sk)
 	const struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
-	bool new_recovery = false;
+	bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery;
 	bool is_reneg;			/* is receiver reneging on SACKs? */
 
 	/* Reduce ssthresh if it has not yet been made inside this window. */
 	if (icsk->icsk_ca_state <= TCP_CA_Disorder ||
 	    !after(tp->high_seq, tp->snd_una) ||
 	    (icsk->icsk_ca_state == TCP_CA_Loss && !icsk->icsk_retransmits)) {
-		new_recovery = true;
 		tp->prior_ssthresh = tcp_current_ssthresh(sk);
 		tp->snd_ssthresh = icsk->icsk_ca_ops->ssthresh(sk);
 		tcp_ca_event(sk, CA_EVENT_LOSS);
-- 
2.4.3.573.g4eafbef
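
The effect of the one-line change can be checked by reducing both tests to plain predicates. This is a sketch, not the kernel code. The enum follows the numeric order of the congestion-avoidance states, the function names are hypothetical, and the "old" predicate is a reading of the condition in the diff, with its three inputs passed in as arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Congestion-avoidance states in their numeric order. */
enum ca_state { CA_OPEN, CA_DISORDER, CA_CWR, CA_RECOVERY, CA_LOSS };

/* Before the patch: new_recovery followed the ssthresh-reduction test. */
static bool new_recovery_old(enum ca_state st, bool high_seq_after_una,
			     unsigned int retransmits)
{
	assert(st <= CA_LOSS);
	return st <= CA_DISORDER || !high_seq_after_una ||
	       (st == CA_LOSS && retransmits == 0);
}

/* After the patch: only states below Recovery start a new episode. */
static bool new_recovery_new(enum ca_state st)
{
	assert(st <= CA_LOSS);
	return st < CA_RECOVERY;
}
```

For a non-recurring timeout while already in the Loss state (`retransmits == 0`), the old predicate yields true and the new one yields false, which is the case the commit message describes.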



RE: [PATCH] bnx2:Make various functions to have a return type of void in the file bnx2.c

2015-07-13 Thread Sony Chacko
 Sony,
 I also sent this patch and was wondering if I can get a reply on it.
 From 4a607447562bec161fd947caae5eb02c2365c58a Mon Sep
 17 00:00:00 2001
 From: Nicholas Krause xerofo...@gmail.com
 Date: Wed, 8 Jul 2015 08:29:07 -0400
 Subject: [PATCH] bnx2i:Fix backwards locking scenario in the
 function  bnx2i_cleanup_task
 
 This fixes the backwards locking scenario for unlocking the
 bottom half spinlock before calling the
 wait_for_completion_timeout on the structure pointer
 bnx2i_conn's member cmd_cleanup_cmpl for the critical region
 of this function to lock the spin_lock bottom half before
 unlocking it after the call to this function in order to have actual
 protection for the function bnx2i_cleanup_task's critical region.
 
 Signed-off-by: Nicholas Krause xerofo...@gmail.com
 ---
  drivers/scsi/bnx2i/bnx2i_iscsi.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c
 b/drivers/scsi/bnx2i/bnx2i_iscsi.c
 index 7289437..619a26f 100644
 --- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
 +++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
 @@ -1172,12 +1172,12 @@ static void
 bnx2i_cleanup_task(struct iscsi_task *task)
   if (task->state == ISCSI_TASK_ABRT_TMF) {
   bnx2i_send_cmd_cleanup_req(hba, task->dd_data);
 
 - spin_unlock_bh(&conn->session->back_lock);
 - spin_unlock_bh(&conn->session->frwd_lock);
 + spin_lock_bh(&conn->session->back_lock);
 + spin_lock_bh(&conn->session->frwd_lock);
   wait_for_completion_timeout(&bnx2i_conn->cmd_cleanup_cmpl,
   msecs_to_jiffies(ISCSI_CMD_CLEANUP_TIMEOUT));
 - spin_lock_bh(&conn->session->frwd_lock);
 - spin_lock_bh(&conn->session->back_lock);
 + spin_unlock_bh(&conn->session->frwd_lock);
 + spin_unlock_bh(&conn->session->back_lock);
   }
   bnx2i_iscsi_unmap_sg_list(task-dd_data);
  }
 --
 2.1.4
 I am assuming it's wrong, but you never know.
 Nick

Nick,

I have added the QLogic iSCSI engineer to the mailing list to
review and ACK the patch. I will also follow up with the
iSCSI team.

Thanks,
Sony


[PATCH V4 0/2] pci: Provide a flag to access VPD through function 0

2015-07-13 Thread Mark D Rustad
Many multi-function devices provide shared registers in extended
config space for accessing VPD. The behavior of these registers
means that the state must be tracked and access locked correctly
for accesses not to hang or worse. One way to meet these needs is
to always perform the accesses through function 0, thereby using
the state tracking and mutex that already exists.

To provide this behavior, add a dev_flags bit to indicate that this
should be done. This bit can then be set for any non-zero function
that needs to redirect such VPD access to function 0. Do not set
this bit on the zero function or there will be an infinite recursion.

The second patch uses this new flag to invoke this behavior on all
multi-function Intel Ethernet devices.

Any hardware that shares VPD registers with multiple functions has
been suffering these problems forever. The hangs result in the log
message:

vpd r/w failed.  This is likely a firmware bug on this device.

Both read and write data corruption are also possible during
overlapping accesses in addition to hangs.
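
The redirection described above can be sketched in plain C. This is only an
illustration under stated assumptions: struct demo_pci_dev, the flag
DEMO_DEV_FLAGS_VPD_REF_F0 and both helpers are hypothetical names, not the
identifiers used by the patches. It shows why the flag must be ignored on
function 0 itself: otherwise the lookup would refer back to itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag and struct names for illustration; not the kernel's. */
#define DEMO_DEV_FLAGS_VPD_REF_F0 (1u << 0)

struct demo_pci_dev {
    unsigned int func;           /* PCI function number */
    unsigned int dev_flags;      /* quirk bits */
    struct demo_pci_dev *func0;  /* function 0 of the same device, or NULL */
    unsigned int vpd_accesses;   /* accesses served by this function's VPD state */
};

/* Pick the function whose VPD state and lock should serve the access. */
static struct demo_pci_dev *demo_vpd_owner(struct demo_pci_dev *dev)
{
    /* Never redirect function 0 to itself; that is the recursion case. */
    if ((dev->dev_flags & DEMO_DEV_FLAGS_VPD_REF_F0) &&
        dev->func != 0 && dev->func0 != NULL)
        return dev->func0;
    return dev;
}

/* Record one VPD access and return the function number that served it. */
static unsigned int demo_vpd_read(struct demo_pci_dev *dev)
{
    struct demo_pci_dev *owner = demo_vpd_owner(dev);

    owner->vpd_accesses++;
    return owner->func;
}
```

With the flag set on function 1, a call such as demo_vpd_read(&f1) is
accounted to function 0, so all functions share one set of tracked state.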

Signed-off-by: Mark Rustad mark.d.rus...@intel.com

---
Changes in V2:
- Corrected a spelling error in a log message
- Added checks to see that the referenced function 0 is reasonable
Changes in V3:
- Don't leak a device reference
- Check that function 0 has VPD
- Make a helper for the function 0 checks
- Moved a multifunction check to the quirk patch
Changes in V4:
- Provide a more extensive commit log for patch 1

---

Mark Rustad (2):
  pci: Add dev_flags bit to access VPD through function 0
  pci: Add VPD quirk for Intel Ethernet devices


 drivers/pci/access.c |   61 +-
 drivers/pci/quirks.c |9 +++
 include/linux/pci.h  |2 ++
 3 files changed, 71 insertions(+), 1 deletion(-)

-- 
Mark Rustad, Network Division, Intel Corporation


[PATCH net-next] hv_netvsc: Add close of RNDIS filter into change mtu call

2015-07-13 Thread Haiyang Zhang
The current change-MTU call only stops tx before removing the RNDIS filter.
If the ring buffer is not empty, rndis_filter_device_remove() may
hang while removing the buffers.

This patch closes the RNDIS filter before removing it, and adds a
gradual waiting loop that runs until the ring is empty. This fixes the
change_mtu hang seen under heavy traffic.
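
The gradual waiting loop can be sketched outside the kernel as follows. The
constants (10 ms initial delay, doubling while under 1000 ms, 20 retries)
follow the patch; the names demo_wait_for_drain and demo_countdown_probe,
and the callback-based probe, are assumptions made for this illustration
only.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_drain_result {
    bool drained;           /* true if the probe eventually reported 0 */
    unsigned int polls;     /* number of times the probe was called */
    unsigned int slept_ms;  /* total delay requested between polls */
};

/*
 * Poll pending() until it returns 0 or the retry budget is used up,
 * doubling the delay between polls while it is below 1000 ms.
 * sleep_ms may be NULL, in which case delays are only accounted.
 */
static struct demo_drain_result
demo_wait_for_drain(unsigned int (*pending)(void *ctx), void *ctx,
                    void (*sleep_ms)(unsigned int ms))
{
    struct demo_drain_result res = { false, 0, 0 };
    unsigned int msec = 10, retry = 0, retry_max = 20;
    unsigned int avail;

    for (;;) {
        avail = pending(ctx);
        res.polls++;

        retry++;
        if (retry > retry_max || avail == 0)
            break;

        if (sleep_ms)
            sleep_ms(msec);
        res.slept_ms += msec;

        if (msec < 1000)
            msec *= 2;
    }

    res.drained = (avail == 0);
    return res;
}

/* Example probe: reports pending data for the first *ctx polls. */
static unsigned int demo_countdown_probe(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining == 0)
        return 0;
    (*remaining)--;
    return 1;
}
```

A ring that never drains is given up on after 21 polls, which is the case
the patch reports as -ETIMEDOUT.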

Signed-off-by: Haiyang Zhang haiya...@microsoft.com
Reviewed-by: K. Y. Srinivasan k...@microsoft.com
---
 drivers/net/hyperv/netvsc_drv.c |   58 +++
 1 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index b855ba9..7b36d5f 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -106,7 +106,7 @@ static int netvsc_open(struct net_device *net)
        return ret;
    }
 
-   netif_tx_start_all_queues(net);
+   netif_tx_wake_all_queues(net);
 
    nvdev = hv_get_drvdata(device_obj);
    rdev = nvdev->extension;
@@ -120,15 +120,56 @@ static int netvsc_close(struct net_device *net)
 {
    struct net_device_context *net_device_ctx = netdev_priv(net);
    struct hv_device *device_obj = net_device_ctx->device_ctx;
+   struct netvsc_device *nvdev = hv_get_drvdata(device_obj);
    int ret;
+   u32 aread, awrite, i, msec = 10, retry = 0, retry_max = 20;
+   struct vmbus_channel *chn;
 
    netif_tx_disable(net);
 
    /* Make sure netvsc_set_multicast_list doesn't re-enable filter! */
    cancel_work_sync(&net_device_ctx->work);
    ret = rndis_filter_close(device_obj);
-   if (ret != 0)
+   if (ret != 0) {
        netdev_err(net, "unable to close device (ret %d).\n", ret);
+       return ret;
+   }
+
+   /* Ensure pending bytes in ring are read */
+   while (true) {
+       aread = 0;
+       for (i = 0; i < nvdev->num_chn; i++) {
+           chn = nvdev->chn_table[i];
+           if (!chn)
+               continue;
+
+           hv_get_ringbuffer_availbytes(&chn->inbound, &aread,
+                                        &awrite);
+
+           if (aread)
+               break;
+
+           hv_get_ringbuffer_availbytes(&chn->outbound, &aread,
+                                        &awrite);
+
+           if (aread)
+               break;
+       }
+
+       retry++;
+       if (retry > retry_max || aread == 0)
+           break;
+
+       msleep(msec);
+
+       if (msec < 1000)
+           msec *= 2;
+   }
+
+   if (aread) {
+       netdev_err(net, "Ring buffer not empty after closing rndis\n");
+       ret = -ETIMEDOUT;
+   }
 
    return ret;
 }
@@ -736,6 +777,7 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
    struct netvsc_device *nvdev = hv_get_drvdata(hdev);
    struct netvsc_device_info device_info;
    int limit = ETH_DATA_LEN;
+   int ret = 0;
 
    if (nvdev == NULL || nvdev->destroy)
        return -ENODEV;
@@ -746,9 +788,11 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
    if (mtu < NETVSC_MTU_MIN || mtu > limit)
        return -EINVAL;
 
+   ret = netvsc_close(ndev);
+   if (ret)
+       goto out;
+
    nvdev->start_remove = true;
-   cancel_work_sync(&ndevctx->work);
-   netif_tx_disable(ndev);
    rndis_filter_device_remove(hdev);
 
    ndev->mtu = mtu;
@@ -758,9 +802,11 @@ static int netvsc_change_mtu(struct net_device *ndev, int mtu)
    device_info.ring_size = ring_size;
    device_info.max_num_vrss_chns = max_num_vrss_chns;
    rndis_filter_device_add(hdev, &device_info);
-   netif_tx_wake_all_queues(ndev);
 
-   return 0;
+out:
+   netvsc_open(ndev);
+
+   return ret;
 }
 static struct rtnl_link_stats64 *netvsc_get_stats64(struct net_device *net,
-- 
1.7.4.1



[PATCH net-next] ebpf: remove self-assignment in interpreter's tail call

2015-07-13 Thread Daniel Borkmann
ARG1 = BPF_R1, as it stands, evaluates to regs[BPF_REG_1] = regs[BPF_REG_1]
and thus has no effect. Add a comment instead, explaining what happens and
why it's okay to just remove it. From the user space side, a tail call is
invoked as a pseudo helper function via bpf_tail_call_proto, so the verifier
checks the arguments just like with any other helper function and makes
sure that the type of the first argument (regs[BPF_REG_1]) is ARG_PTR_TO_CTX.
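
The no-op can be reproduced in a standalone program. The macro names below
mirror the ones mentioned in this message, but the register enum, the array
size and demo_self_assign_is_noop are assumptions made for this sketch; it
is not the interpreter source.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified register numbering; only the first few registers are modeled. */
enum { BPF_REG_0, BPF_REG_1, BPF_REG_2, DEMO_MAX_REG };

#define BPF_REG_ARG1 BPF_REG_1          /* first argument lives in R1 */
#define ARG1         regs[BPF_REG_ARG1]
#define BPF_R1       regs[BPF_REG_1]

/* Returns 1 if ARG1 and BPF_R1 name the same slot and the copy changes nothing. */
static int demo_self_assign_is_noop(uint64_t ctx)
{
    uint64_t regs[DEMO_MAX_REG] = { 0 };
    uint64_t before;

    BPF_R1 = ctx;
    before = ARG1;
    ARG1 = BPF_R1;  /* expands to regs[1] = regs[1] */

    return &ARG1 == &BPF_R1 && ARG1 == before;
}
```

Because both macros expand to the same array element, removing the
assignment cannot change what the next program sees in its first argument.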

Signed-off-by: Daniel Borkmann dan...@iogearbox.net
---
 kernel/bpf/core.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index c5bedc8..bf38f5e 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -453,7 +453,11 @@ select_insn:
if (unlikely(!prog))
goto out;
 
-   ARG1 = BPF_R1;
+   /* ARG1 at this point is guaranteed to point to CTX from
+* the verifier side due to the fact that the tail call is
+* handled like a helper, that is, bpf_tail_call_proto,
+* where arg1_type is ARG_PTR_TO_CTX.
+*/
insn = prog->insnsi;
goto select_insn;
 out:
-- 
1.9.3



Re: [PATCH v1 08/12] IB/cma: Add net_dev and private data checks to RDMA CM

2015-07-13 Thread Jason Gunthorpe
On Mon, Jun 22, 2015 at 03:42:37PM +0300, Haggai Eran wrote:
 + switch (ib_event-event) {
 + case IB_CM_REQ_RECEIVED:
 + req-device = req_param-listen_id-device;
 + req-port   = req_param-port;
 + req-local_gid  = req_param-primary_path-sgid;
 + req-service_id = req_param-primary_path-service_id;
 + req-pkey   = be16_to_cpu(req_param-primary_path-pkey);

I feel pretty strongly that we should be using the pkey from the work
completion, not the pkey in the message.

The reason: if someone is using the pkey like a VLAN, and expects a
container to never receive packets outside its assigned pkey, then we
need to check each and every packet for the correct pkey before
associating it with that container.
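
A per-packet check of that kind might look like the sketch below. It
assumes the usual InfiniBand P_Key layout (low 15 bits are the key, the top
bit marks full membership, and two limited members do not match each
other). struct demo_container, demo_pkey_match and demo_may_deliver are
hypothetical names, and wc_pkey stands for the P_Key value resolved from
the work completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PKEY_MEMBERSHIP_BIT 0x8000u  /* set for full members */
#define DEMO_PKEY_BASE_MASK      0x7fffu  /* the 15-bit key itself */

/* Two P_Keys match if the base keys agree and at least one is a full member. */
static bool demo_pkey_match(uint16_t a, uint16_t b)
{
    if ((a & DEMO_PKEY_BASE_MASK) != (b & DEMO_PKEY_BASE_MASK))
        return false;
    return ((a | b) & DEMO_PKEY_MEMBERSHIP_BIT) != 0;
}

/* Hypothetical per-container state: the single P_Key it was assigned. */
struct demo_container {
    uint16_t assigned_pkey;
};

/* Decide from the packet's own P_Key, not from a field inside the message. */
static bool demo_may_deliver(const struct demo_container *c, uint16_t wc_pkey)
{
    return demo_pkey_match(c->assigned_pkey, wc_pkey);
}
```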

When doing the namespace patches you should probably also look at
CM GMPs other than just the REQ, at how the paths are set up, and
consider what to do with the pkey. I'd probably suggest that the pkey
be forced throughout the entire process to ensure it always
matches the IP device. At least for containers that is the right
thing. I probably wouldn't turn it on for the root namespace, though.

Jason


Re: Linux 4.2 build error in net/netfilter/ipset/ip_set_hash_netnet.c

2015-07-13 Thread Cong Wang
On Mon, Jul 13, 2015 at 9:13 AM, Akemi Yagi amy...@gmail.com wrote:
 On Sun, 05 Jul 2015 08:35:20 -0700, Guenter Roeck wrote:

 On Sat, Jul 04, 2015 at 12:44:36AM -0700, Vinson Lee wrote:
 Hi.

 With the latest Linux 4.2-rc1, I am hitting this build error with GCC
 4.4.7 on CentOS 6.

   CC  net/netfilter/ipset/ip_set_hash_netnet.o
 net/netfilter/ipset/ip_set_hash_netnet.c: In function
 ‘hash_netnet4_uadt’:
 net/netfilter/ipset/ip_set_hash_netnet.c:163: error: unknown field
 ‘cidr’ specified in initializer
 net/netfilter/ipset/ip_set_hash_netnet.c:163: warning: missing braces
 around initializer net/netfilter/ipset/ip_set_hash_netnet.c:163:
 warning: (near initialization for ‘e.anonymous.ip’)
 net/netfilter/ipset/ip_set_hash_netnet.c: In function
 ‘hash_netnet6_uadt’:
 net/netfilter/ipset/ip_set_hash_netnet.c:388: error: unknown field
 ‘cidr’ specified in initializer
 net/netfilter/ipset/ip_set_hash_netnet.c:388: warning: missing braces
 around initializer net/netfilter/ipset/ip_set_hash_netnet.c:388:
 warning: (near initialization for ‘e.ip[0]’)

 Previously fixed with commit 1a869205c75cb (netfilter: ipset: The
 unnamed union initialization may lead to compilation error),
 reintroduced with commit aff227581ed1a (netfilter: ipset: Check CIDR
 value only when attribute is given).

 Guenter

 I wonder what can be done to get this issue fixed. This problem was seen
 in 4.2-rc1 and now in 4.2-rc2 on RHEL-6.6.


Just revert the initializer piece.


request for -stable: route: Use ipv4_mtu instead of raw rt_pmtu

2015-07-13 Thread Timo Teras
Hi,

Can you queue the following for the active older -stable trees up to 3.18:

commit 3cdaa5be9e81 ipv4: Don't increase PMTU with Datagram Too Big message
commit cb6ccf09d6b9 route: Use ipv4_mtu instead of raw rt_pmtu

Commit 3cdaa5be9e81 made it to 3.19.y and was later fixed up by the
conversion to ipv4_mtu() in the second referenced commit.

Together, these patches also fix another case that is not so
obvious: the original route having an MTU set on it. Previously
that MTU was ignored, but using ipv4_mtu() as the first check also checks
RTAX_MTU in the metrics. This fixes the nasty issue that PMTU can cause
packets to be sent that are larger than what was explicitly configured via
a static route mtu.
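
The effect can be sketched with a simplified model. Everything here is an
assumption for illustration: struct demo_route, demo_effective_mtu and
demo_accept_pmtu are hypothetical, and the lookup order (learned PMTU, then
the route's MTU metric, then the device MTU) is a reduced version of what
ipv4_mtu() is described as doing above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced view of a route for this sketch. */
struct demo_route {
    unsigned int pmtu;         /* learned path MTU, 0 if none */
    bool pmtu_expired;         /* learned value no longer valid */
    unsigned int metric_mtu;   /* MTU configured on the route, 0 if none */
    unsigned int dev_mtu;      /* MTU of the outgoing device */
};

/* Effective MTU: learned PMTU first, then the route metric, then the device. */
static unsigned int demo_effective_mtu(const struct demo_route *rt)
{
    unsigned int mtu = rt->pmtu;

    if (mtu == 0 || rt->pmtu_expired)
        mtu = rt->metric_mtu;
    if (mtu != 0)
        return mtu;
    return rt->dev_mtu;
}

/* A reported MTU is only accepted if it does not exceed the effective MTU. */
static bool demo_accept_pmtu(const struct demo_route *rt, unsigned int reported)
{
    return reported <= demo_effective_mtu(rt);
}
```

With a static route mtu of 1400 on a 1500-byte device, a "too big" report of
1450 is rejected in this model, whereas comparing against the raw (unset)
learned value alone would not have taken the 1400 into account.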

Thanks,
Timo


Re: [PATCH v1 05/12] IB/cm: Share listening CM IDs

2015-07-13 Thread Jason Gunthorpe
On Mon, Jun 22, 2015 at 03:42:34PM +0300, Haggai Eran wrote:
   spin_lock_irq(&cm.lock);
 + if (--cm_id_priv->listen_sharecount > 0) {
 + /* The id is still shared. */
 + atomic_dec(&cm_id_priv->refcount);

Nit: It looks very strange for this not to be cm_deref_id. Is it OK as is
because we are sure refcount cannot reach 0 here?

 @@ -958,8 +988,10 @@ int ib_cm_listen(struct ib_cm_id *cm_id, __be64 
 service_id, __be64 service_mask,
   }
  
   cm_id->state = IB_CM_LISTEN;
 + ++cm_id_priv->listen_sharecount;

 - spin_lock_irqsave(&cm.lock, flags);
 + if (lock)
 + spin_lock_irqsave(&cm.lock, flags);

Hmm, I'd like to see listen_sharecount consistently locked, so it
should be manipulated only while cm.lock is held.

   if (service_id == IB_CM_ASSIGN_SERVICE_ID) {
   cm_id->service_id = cpu_to_be64(cm.listen_service_id++);
   cm_id->service_mask = ~cpu_to_be64(0);
 @@ -968,18 +1000,98 @@ int ib_cm_listen(struct ib_cm_id *cm_id, __be64 service_id, __be64 service_mask,
   cm_id->service_mask = service_mask;
   }
   cur_cm_id_priv = cm_insert_listen(cm_id_priv);
 - spin_unlock_irqrestore(&cm.lock, flags);
 + if (lock)
 + spin_unlock_irqrestore(&cm.lock, flags);
  
   if (cur_cm_id_priv) {
   cm_id->state = IB_CM_IDLE;
 + --cm_id_priv->listen_sharecount;

Ditto

Otherwise I don't see any other mechanical problems with this. Sean
said he was happy with the idea right?

Reviewed-By: Jason Gunthorpe jguntho...@obsidianresearch.com

Jason


Re: [PATCH v1 11/12] IB/cma: Share ib_cm_ids between rdma_cm_ids

2015-07-13 Thread Jason Gunthorpe
On Mon, Jun 22, 2015 at 03:42:40PM +0300, Haggai Eran wrote:
 Use ib_cm_id_create_and_listen to create listening IB CM IDs or share
  ^^^
Is that the wrong name? ib_cm_insert_listen perhaps?

I think I've looked at the details in this series that I was concerned
about. Sean should OK the rest of the changes to the CM code, but
nothing much stood out to me.

Jason


Re: [PATCH] ipv6: Fix finding best source address in ipv6_dev_get_saddr().

2015-07-13 Thread Hajime Tazaki

At Mon, 13 Jul 2015 23:28:10 +0900,
YOSHIFUJI Hideaki/吉藤英明 wrote:
 
 Commit 9131f3de2 (ipv6: Do not iterate over all interfaces when
 finding source address on specific interface.) did not properly
 update best source address available.  Plus, it introduced
 possible NULL pointer dereference.
 
 Bug was reported by Erik Kline e...@google.com.
 Based on patch proposed by Hajime Tazaki thehaj...@gmail.com.
 
 Fixes: 9131f3de24db4dc12199aede7d931e6703e97f3b (ipv6: Do not
   iterate over all interfaces when finding source address
   on specific interface.)
 Signed-off-by: YOSHIFUJI Hideaki hideaki.yoshif...@miraclelinux.com

All of my tests passed with the patch applied on top of 14fe22e (Revert
"ipv4: use skb coalescing in defragmentation").

Thanks for the prompt fix!

Acked-by: Hajime Tazaki thehaj...@gmail.com



[PATCH net-next] net: Build IPv6 into kernel by default

2015-07-13 Thread Tom Herbert
This patch makes the default to build IPv6 into the kernel. IPv6
now has significant traction and any remaining vestiges of IPv6
not being provided parity with IPv4 should be swept away. IPv6 is now
core to the Internet and kernel.

Points on IPv6 adoption:

- Per Google statistics, IPv6 usage has reached 7% on the Internet
  and continues to exhibit an exponential growth rate
  https://www.google.com/intl/en/ipv6/statistics.html
- Just a few days ago ARIN officially depleted its IPv4 pool
- IPv6 only data centers are being successfully built
  (e.g. at Facebook)

This patch changes the Kconfig entry for IPV6. The default for CONFIG_IPV6
is set to y and the help text has been updated to reflect the maturity of
IPv6.

Impact:

Under some circumstances building a module into the kernel might have a
performance advantage. In my testing, I did notice a very slight
improvement.

This will obviously increase the size of the kernel image. In my
configuration I see:

IPv6 as module:

   text    data     bss      dec    hex filename
9436490 1879600  913408 12229498 ba9b7a vmlinux

IPv6 built into kernel:

   text    data     bss      dec    hex filename
9703666 1899288  933888 12536842 bf4c0a vmlinux

Which increases text size by ~270K (2.8% increase in size for me). If
image size is an issue, presumably for a device which does not do IP
networking (IMO we should be discouraging IPv4-only devices), IPV6 can
be disabled or still built as a module.
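
The ~270K and 2.8% figures can be re-derived from the two size(1) outputs
quoted above. The sketch below is only a worked check of that arithmetic;
struct demo_size and the helper names are made up for the illustration, and
"smaller" and "larger" simply refer to the two vmlinux images by text size.

```c
#include <assert.h>

/* One row of size(1) output: section sizes in bytes. */
struct demo_size {
    unsigned long text, data, bss;
};

/* The "dec" column: sum of the three sections. */
static unsigned long demo_total(const struct demo_size *s)
{
    return s->text + s->data + s->bss;
}

/* Growth of the text section between the two images, in bytes. */
static unsigned long demo_text_growth(const struct demo_size *smaller,
                                      const struct demo_size *larger)
{
    return larger->text - smaller->text;
}

/* The same growth as a percentage of the smaller image's text size. */
static double demo_text_growth_pct(const struct demo_size *smaller,
                                   const struct demo_size *larger)
{
    return 100.0 * (double)demo_text_growth(smaller, larger) /
           (double)smaller->text;
}
```

Feeding in the quoted numbers gives a text difference of 267176 bytes, which
is about 2.83% of the smaller image's text size.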

Acked-by: YOSHIFUJI Hideaki yoshf...@linux-ipv6.org
Signed-off-by: Tom Herbert t...@herbertland.com
---
 net/ipv6/Kconfig | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 438a73a..643f613 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -5,16 +5,15 @@
 #   IPv6 as module will cause a CRASH if you try to unload it
 menuconfig IPV6
tristate The IPv6 protocol
-   default m
+   default y
---help---
- This is complemental support for the IP version 6.
- You will still be able to do traditional IPv4 networking as well.
+ Support for IP version 6 (IPv6).
 
  For general information about IPv6, see
  https://en.wikipedia.org/wiki/IPv6.
- For Linux IPv6 development information, see 
http://www.linux-ipv6.org.
- For specific information about IPv6 under Linux, read the HOWTO at
- http://www.bieringer.de/linux/IPv6/.
+ For specific information about IPv6 under Linux, see
+ Documentation/networking/ipv6.txt and read the HOWTO at
+ http://www.tldp.org/HOWTO/Linux+IPv6-HOWTO/
 
  To compile this protocol support as a module, choose M here: the 
  module will be called ipv6.
-- 
1.8.1



Re: [PATCH] ipv6: Fix finding best source address in ipv6_dev_get_saddr().

2015-07-13 Thread Tom Herbert
I am testing this patch, which may be a little simpler. Also, idev needs
to be checked after __in6_dev_get().
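
The idea of handing the score and hiscore pointers to the helper by address,
so that a swap made while scanning one device is still in effect when the
next device is scanned, can be shown in a small standalone sketch. struct
demo_score, demo_scan and the integer "rule" ranking are simplifications
assumed for this example; they are not the addrconf data structures.

```c
#include <assert.h>

/* Simplified score: a higher rule value means a better source address. */
struct demo_score {
    int rule;
    int addr_id;
};

/*
 * Scan one device's candidates. The scratch slot (*in_score) is filled for
 * each candidate and swapped with the best slot (*in_hiscore) whenever the
 * candidate wins. Both pointers are written back so the caller's view stays
 * in sync across calls.
 */
static void demo_scan(struct demo_score **in_score,
                      struct demo_score **in_hiscore,
                      const int *rules, int n, int base_id)
{
    struct demo_score *score = *in_score, *hiscore = *in_hiscore;
    int i;

    for (i = 0; i < n; i++) {
        score->rule = rules[i];
        score->addr_id = base_id + i;

        if (score->rule > hiscore->rule) {
            struct demo_score *tmp = hiscore;

            hiscore = score;
            score = tmp;
        }
    }

    *in_hiscore = hiscore;
    *in_score = score;
}
```

If the pointers were passed by value instead, the caller would keep treating
the original hiscore slot as the best one even after the helper had reused
it as scratch space, which is the kind of stale result being fixed here.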

Tom

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 4ab74d5..d631ac3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1363,9 +1363,10 @@ static void __ipv6_dev_get_saddr(struct net *net,
                                 unsigned int prefs,
                                 const struct in6_addr *saddr,
                                 struct inet6_dev *idev,
-                                struct ipv6_saddr_score *scores)
+                                struct ipv6_saddr_score **in_score,
+                                struct ipv6_saddr_score **in_hiscore)
 {
-   struct ipv6_saddr_score *score = &scores[0], *hiscore = &scores[1];
+   struct ipv6_saddr_score *score = *in_score, *hiscore = *in_hiscore;
 
    read_lock_bh(&idev->lock);
    list_for_each_entry(score->ifa, &idev->addr_list, if_list) {
@@ -1434,13 +1435,16 @@ static void __ipv6_dev_get_saddr(struct net *net,
    }
 out:
    read_unlock_bh(&idev->lock);
+   *in_hiscore = hiscore;
+   *in_score = score;
 }
 
 int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev,
                        const struct in6_addr *daddr, unsigned int prefs,
                        struct in6_addr *saddr)
 {
-   struct ipv6_saddr_score scores[2], *hiscore = &scores[1];
+   struct ipv6_saddr_score scores[2];
+   struct ipv6_saddr_score *score = &scores[0], *hiscore = &scores[1];
    struct ipv6_saddr_dst dst;
    struct inet6_dev *idev;
    struct net_device *dev;
@@ -1475,18 +1479,19 @@ int ipv6_dev_get_saddr(struct net *net, const struct net_device *dst_dev,
        if ((dst_type & IPV6_ADDR_MULTICAST) ||
            dst.scope <= IPV6_ADDR_SCOPE_LINKLOCAL) {
            idev = __in6_dev_get(dst_dev);
-           use_oif_addr = true;
+           if (idev)
+               use_oif_addr = true;
        }
    }
    if (use_oif_addr) {
-       __ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev, scores);
+       __ipv6_dev_get_saddr(net, &dst, prefs, saddr, idev,
+                            &score, &hiscore);
    } else {
        for_each_netdev_rcu(net, dev) {
            idev = __in6_dev_get(dev);
            if (!idev)
                continue;
-           __ipv6_dev_get_saddr(net, &dst, prefs, saddr,
-                                idev, scores);
+           __ipv6_dev_get_saddr(net, &dst, prefs, saddr,
+                                idev, &score, &hiscore);
        }
    }
    rcu_read_unlock();

On Mon, Jul 13, 2015 at 7:28 AM, YOSHIFUJI Hideaki/吉藤英明
hideaki.yoshif...@miraclelinux.com wrote:
 Commit 9131f3de2 (ipv6: Do not iterate over all interfaces when
 finding source address on specific interface.) did not properly
 update best source address available.  Plus, it introduced
 possible NULL pointer dereference.

 Bug was reported by Erik Kline e...@google.com.
 Based on patch proposed by Hajime Tazaki thehaj...@gmail.com.

 Fixes: 9131f3de24db4dc12199aede7d931e6703e97f3b (ipv6: Do not
 iterate over all interfaces when finding source address
 on specific interface.)
 Signed-off-by: YOSHIFUJI Hideaki hideaki.yoshif...@miraclelinux.com
 ---
  net/ipv6/addrconf.c | 30 ++
  1 file changed, 18 insertions(+), 12 deletions(-)

 diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
 index 4ab74d5..4c9a024 100644
 --- a/net/ipv6/addrconf.c
 +++ b/net/ipv6/addrconf.c
 @@ -1358,14 +1358,15 @@ out:
 return ret;
  }

 -static void __ipv6_dev_get_saddr(struct net *net,
 -struct ipv6_saddr_dst *dst,
 -unsigned int prefs,
 -const struct in6_addr *saddr,
 -struct inet6_dev *idev,
 -struct ipv6_saddr_score *scores)
 +static int __ipv6_dev_get_saddr(struct net *net,
 +   struct ipv6_saddr_dst *dst,
 +   unsigned int prefs,
 +   const struct in6_addr *saddr,
 +   struct inet6_dev *idev,
 +   struct ipv6_saddr_score *scores,
 +   int hiscore_idx)
  {
 -   struct ipv6_saddr_score *score = &scores[0], *hiscore = &scores[1];
 +   struct ipv6_saddr_score *score = &scores[1 - hiscore_idx], *hiscore = &scores[hiscore_idx];

 read_lock_bh(&idev->lock);
 list_for_each_entry(score->ifa, &idev->addr_list, if_list) {
 @@ -1424,6 +1425,7 @@ static void __ipv6_dev_get_saddr(struct net *net,
 in6_ifa_hold(score->ifa);

 swap(hiscore, score);
 +   hiscore_idx = 1 - hiscore_idx;

 /* restore our iterator */
 

Re: [PATCH v2 15/22] fjes: net_device_ops.ndo_vlan_rx_add/kill_vid

2015-07-13 Thread Yasuaki Ishimatsu
Hi Izumi-san,

On Wed, 24 Jun 2015 11:55:47 +0900
Taku Izumi izumi.t...@jp.fujitsu.com wrote:

 This patch adds net_device_ops.ndo_vlan_rx_add_vid and
 net_device_ops.ndo_vlan_rx_kill_vid callback.
 
 Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
 ---
  drivers/net/fjes/fjes_hw.c   | 27 +++
  drivers/net/fjes/fjes_hw.h   |  2 ++
  drivers/net/fjes/fjes_main.c | 40 
  3 files changed, 69 insertions(+)
 
 diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
 index 5e3f847..8363e22 100644
 --- a/drivers/net/fjes/fjes_hw.c
 +++ b/drivers/net/fjes/fjes_hw.c
 @@ -827,6 +827,33 @@ bool fjes_hw_check_vlan_id(struct epbuf_handler *epbh, 
 u16 vlan_id)
   return ret;
  }
  
 +bool fjes_hw_set_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
 +{
 + union ep_buffer_info *info = epbh->info;
 + int i;
 +
 + for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++) {
 + if (info->v1i.vlan_id[i] == 0) {
 + info->v1i.vlan_id[i] = vlan_id;
 + return true;
 + }
 + }
 + return false;
 +}
 +
 +void fjes_hw_del_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
 +{
 + union ep_buffer_info *info = epbh->info;
 + int i;
 +

 + if (0 != vlan_id) {

How about using the following if statement so that you can remove one
level of indentation?

    if (vlan_id == 0)
        return;

 + for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++) {
 + if (vlan_id == info->v1i.vlan_id[i])
 + info->v1i.vlan_id[i] = 0;
 + }
 + }
 +}
 +
  bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *epbh)
  {
   union ep_buffer_info *info = epbh->info;
 diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
 index ea30aeb..afad03e 100644
 --- a/drivers/net/fjes/fjes_hw.h
 +++ b/drivers/net/fjes/fjes_hw.h
 @@ -321,6 +321,8 @@ int fjes_hw_epid_is_shared(struct fjes_device_shared_info 
 *, int);
  bool fjes_hw_check_epbuf_version(struct epbuf_handler *, u32);
  bool fjes_hw_check_mtu(struct epbuf_handler *, u32);
  bool fjes_hw_check_vlan_id(struct epbuf_handler *, u16);
 +bool fjes_hw_set_vlan_id(struct epbuf_handler *, u16);
 +void fjes_hw_del_vlan_id(struct epbuf_handler *, u16);
  bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *);
  void *fjes_hw_epbuf_rx_curpkt_get_addr(struct epbuf_handler *, size_t *);
  void fjes_hw_epbuf_rx_curpkt_drop(struct epbuf_handler *);
 diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
 index e2e69e0..bb4c8e4 100644
 --- a/drivers/net/fjes/fjes_main.c
 +++ b/drivers/net/fjes/fjes_main.c
 @@ -58,6 +58,8 @@ static irqreturn_t fjes_intr(int, void*);
  static struct rtnl_link_stats64 *
  fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
  static int fjes_change_mtu(struct net_device *, int);
 +static int fjes_vlan_rx_add_vid(struct net_device *, __be16 proto, u16);
 +static int fjes_vlan_rx_kill_vid(struct net_device *, __be16 proto, u16);
  static void fjes_tx_retry(struct net_device *);
  
  static int fjes_acpi_add(struct acpi_device *);
 @@ -229,6 +231,8 @@ static const struct net_device_ops fjes_netdev_ops = {
   .ndo_get_stats64= fjes_get_stats64,
   .ndo_change_mtu = fjes_change_mtu,
   .ndo_tx_timeout = fjes_tx_retry,
 + .ndo_vlan_rx_add_vid= fjes_vlan_rx_add_vid,
 + .ndo_vlan_rx_kill_vid = fjes_vlan_rx_kill_vid,
  };
  
  /* fjes_open - Called when a network interface is made active */
 @@ -757,6 +761,42 @@ static int fjes_change_mtu(struct net_device *netdev, 
 int new_mtu)
   return -EINVAL;
  }
  
 +static int fjes_vlan_rx_add_vid(struct net_device *netdev,
 + __be16 proto, u16 vid)
 +{
 + struct fjes_adapter *adapter = netdev_priv(netdev);
 + bool ret = true;
 + int epid;
 +
 + for (epid = 0; epid < adapter->hw.max_epid; epid++) {
 + if (epid == adapter->hw.my_epid)
 + continue;
 +
 + if (!fjes_hw_check_vlan_id(
 + &adapter->hw.ep_shm_info[epid].tx, vid))
 + ret = fjes_hw_set_vlan_id(
 + &adapter->hw.ep_shm_info[epid].tx, vid);
 + }
 +
 + return ret ? 0 : -ENOSPC;
 +}
 +

 +static int fjes_vlan_rx_kill_vid(struct net_device *netdev,
 +  __be16 proto, u16 vid)

The function always returns 0. So how about defining the function
as void?

Thanks,
Yasuaki Ishimatsu

 +{
 + struct fjes_adapter *adapter = netdev_priv(netdev);
 + int epid;
 +
 + for (epid = 0; epid < adapter->hw.max_epid; epid++) {
 + if (epid == adapter->hw.my_epid)
 + continue;
 +
 + fjes_hw_del_vlan_id(&adapter->hw.ep_shm_info[epid].tx, vid);
 + }
 +
 + return 0;
 +}
 +
  static irqreturn_t fjes_intr(int irq, void *data)
  {
   struct fjes_adapter *adapter = data;
 -- 
 

Re: [PATCH] nf: IDLETIMER: fix lockdep warning

2015-07-13 Thread Pablo Neira Ayuso
On Mon, Jul 13, 2015 at 08:02:36AM -0700, Dmitry Torokhov wrote:
 On Mon, Jul 13, 2015 at 6:20 AM, Pablo Neira Ayuso pa...@netfilter.org 
 wrote:
  On Thu, Jul 09, 2015 at 05:15:01PM -0700, Dmitry Torokhov wrote:
  Dynamically allocated sysfs attributes should be initialized with
  sysfs_attr_init() otherwise lockdep will be angry with us:
 
  [   45.468653] BUG: key ffc030fad4e0 not in .data!
  [   45.468655] [ cut here ]
  [   45.468666] WARNING: CPU: 0 PID: 1176 at 
  /mnt/host/source/src/third_party/kernel/v3.18/kernel/locking/lockdep.c:2991
   lockdep_init_map+0x12c/0x490()
  [   45.468672] DEBUG_LOCKS_WARN_ON(1)
  [   45.468672] CPU: 0 PID: 1176 Comm: iptables Tainted: G U  W 3.18.0 
  #43
  [   45.468674] Hardware name: XXX
  [   45.468675] Call trace:
  [   45.468680] [ffc0002072b4] dump_backtrace+0x0/0x10c
  [   45.468683] [ffc0002073d0] show_stack+0x10/0x1c
  [   45.468688] [ffc000a86cd4] dump_stack+0x74/0x94
  [   45.468692] [ffc000217ae0] warn_slowpath_common+0x84/0xb0
  [   45.468694] [ffc000217b84] warn_slowpath_fmt+0x4c/0x58
  [   45.468697] [ffc0002530a4] lockdep_init_map+0x128/0x490
  [   45.468701] [ffc000367ef0] __kernfs_create_file+0x80/0xe4
  [   45.468704] [ffc00036862c] sysfs_add_file_mode_ns+0x104/0x170
  [   45.468706] [ffc00036870c] sysfs_create_file_ns+0x58/0x64
  [   45.468711] [ffc000930430] idletimer_tg_checkentry+0x14c/0x324
  [   45.468714] [ffc00092a728] xt_check_target+0x170/0x198
  [   45.468717] [ffc000993efc] check_target+0x58/0x6c
  [   45.468720] [ffc000994c64] translate_table+0x30c/0x424
  [   45.468723] [ffc00099529c] do_ipt_set_ctl+0x144/0x1d0
  [   45.468728] [ffc0009079f0] nf_setsockopt+0x50/0x60
  [   45.468732] [ffc000946870] ip_setsockopt+0x8c/0xb4
  [   45.468735] [ffc0009661c0] raw_setsockopt+0x10/0x50
  [   45.468739] [ffc0008c1550] sock_common_setsockopt+0x14/0x20
  [   45.468742] [ffc0008bd190] SyS_setsockopt+0x88/0xb8
  [   45.468744] ---[ end trace 41d156354d18c039 ]---
 
  Applied, thanks.
 
  One question:
 
  Change-Id: I1da5cd96fc8e1e1e4209e81eba1165a42d4d45e9
 
  BTW, does this gerrit change ID provide any public information? Thanks.
 
 Argh, I am sorry, I forgot to clean this out when mailing the patch.
 In this particular case you can find the change in AOSP gerrit at
 https://android-review.googlesource.com but without such context this
 change-id is of course useless.

No problem, I'll remove it. Thanks Dmitry.


Re: Linux 4.2 build error in net/netfilter/ipset/ip_set_hash_netnet.c

2015-07-13 Thread Akemi Yagi
On Sun, 05 Jul 2015 08:35:20 -0700, Guenter Roeck wrote:

 On Sat, Jul 04, 2015 at 12:44:36AM -0700, Vinson Lee wrote:
 Hi.
 
 With the latest Linux 4.2-rc1, I am hitting this build error with GCC
 4.4.7 on CentOS 6.
 
   CC  net/netfilter/ipset/ip_set_hash_netnet.o
 net/netfilter/ipset/ip_set_hash_netnet.c: In function
 ‘hash_netnet4_uadt’:
 net/netfilter/ipset/ip_set_hash_netnet.c:163: error: unknown field
 ‘cidr’ specified in initializer
 net/netfilter/ipset/ip_set_hash_netnet.c:163: warning: missing braces
 around initializer net/netfilter/ipset/ip_set_hash_netnet.c:163:
 warning: (near initialization for ‘e.anonymous.ip’)
 net/netfilter/ipset/ip_set_hash_netnet.c: In function
 ‘hash_netnet6_uadt’:
 net/netfilter/ipset/ip_set_hash_netnet.c:388: error: unknown field
 ‘cidr’ specified in initializer
 net/netfilter/ipset/ip_set_hash_netnet.c:388: warning: missing braces
 around initializer net/netfilter/ipset/ip_set_hash_netnet.c:388:
 warning: (near initialization for ‘e.ip[0]’)
 
 Previously fixed with commit 1a869205c75cb (netfilter: ipset: The
 unnamed union initialization may lead to compilation error),
 reintroduced with commit aff227581ed1a (netfilter: ipset: Check CIDR
 value only when attribute is given).
 
 Guenter

I wonder what can be done to get this issue fixed. This problem was seen 
in 4.2-rc1 and now in 4.2-rc2 on RHEL-6.6.

$ gcc --version
gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11)

Please advise.

Akemi



Re: [PATCH v2 10/22] fjes: tx_stall_task

2015-07-13 Thread Yasuaki Ishimatsu
Hi Izumi-san,

On Wed, 24 Jun 2015 11:55:42 +0900
Taku Izumi izumi.t...@jp.fujitsu.com wrote:

 This patch adds tx_stall_task.
 When the receiver's buffer is full, the sender stops
 its tx queue. This task is used to monitor the
 receiver's status, and when the receiver's buffer
 is available, it resumes the tx queue.
 
 Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
 ---
  drivers/net/fjes/fjes.h  |  2 ++
  drivers/net/fjes/fjes_main.c | 63 
 
  2 files changed, 65 insertions(+)
 
 diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
 index 8e9899e..b04ea9d 100644
 --- a/drivers/net/fjes/fjes.h
 +++ b/drivers/net/fjes/fjes.h
 @@ -30,6 +30,7 @@
  #define FJES_MAX_QUEUES  1
  #define FJES_TX_RETRY_INTERVAL   (20 * HZ)
  #define FJES_TX_RETRY_TIMEOUT(100)
 +#define FJES_TX_TX_STALL_TIMEOUT (FJES_TX_RETRY_INTERVAL / 2)
  #define FJES_OPEN_ZONE_UPDATE_WAIT   (300) /* msec */
  
  /* board specific private data structure */
 @@ -52,6 +53,7 @@ struct fjes_adapter {
  
   struct workqueue_struct *txrx_wq;
  
 + struct work_struct tx_stall_task;
   struct work_struct raise_intr_rxdata_task;
  
   struct fjes_hw hw;
 diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
 index 735aa5e..f4c2445 100644
 --- a/drivers/net/fjes/fjes_main.c
 +++ b/drivers/net/fjes/fjes_main.c
 @@ -53,6 +53,7 @@ static int fjes_setup_resources(struct fjes_adapter *);
  static void fjes_free_resources(struct fjes_adapter *);
  static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *);
  static void fjes_raise_intr_rxdata_task(struct work_struct *);
 +static void fjes_tx_stall_task(struct work_struct *);
  static irqreturn_t fjes_intr(int, void*);
  
  static int fjes_acpi_add(struct acpi_device *);
 @@ -281,6 +282,7 @@ static int fjes_close(struct net_device *netdev)
   fjes_free_irq(adapter);
  
   cancel_work_sync(&adapter->raise_intr_rxdata_task);
 + cancel_work_sync(&adapter->tx_stall_task);
  
   fjes_hw_wait_epstop(hw);
  
 @@ -410,6 +412,61 @@ static void fjes_free_resources(struct fjes_adapter 
 *adapter)
   }
  }
  
 +static void fjes_tx_stall_task(struct work_struct *work)
 +{
 + struct fjes_adapter *adapter = container_of(work,
 + struct fjes_adapter, tx_stall_task);
 + struct fjes_hw *hw = &adapter->hw;
 + struct net_device *netdev = adapter->netdev;
 + enum ep_partner_status pstatus;
 + int epid;
 + int max_epid, my_epid;
 + union ep_buffer_info *info;
 + int all_queue_available;
 + int i;
 + int sendable;
 +
 + if (((long)jiffies -
 + (long)(netdev->trans_start)) > FJES_TX_TX_STALL_TIMEOUT) {
 + netif_wake_queue(netdev);
 + return;
 + }
 +
 + my_epid = hw->my_epid;
 + max_epid = hw->max_epid;
 +

 + for (i = 0; i < 5; i++) {

Why do you loop 5 times?

Thanks,
Yasuaki Ishimatsu

 + all_queue_available = 1;
 +
 + for (epid = 0; epid < max_epid; epid++) {
 + if (my_epid == epid)
 + continue;
 +
 + pstatus = fjes_hw_get_partner_ep_status(hw, epid);
 + sendable = (pstatus == EP_PARTNER_SHARED);
 + if (!sendable)
 + continue;
 +
 + info = adapter->hw.ep_shm_info[epid].tx.info;
 +
 + if (EP_RING_FULL(info->v1i.head, info->v1i.tail,
 +  info->v1i.count_max)) {
 + all_queue_available = 0;
 + break;
 + }
 + }
 +
 + if (all_queue_available) {
 + netif_wake_queue(netdev);
 + return;
 + }
 + }
 +
 + usleep_range(50, 100);
 +
 + queue_work(adapter->txrx_wq, &adapter->tx_stall_task);
 +}
 +
  static void fjes_raise_intr_rxdata_task(struct work_struct *work)
  {
   struct fjes_adapter *adapter = container_of(work,
 @@ -606,6 +663,10 @@ fjes_xmit_frame(struct sk_buff *skb, struct net_device 
 *netdev)
   netdev->trans_start = jiffies;
   netif_tx_stop_queue(cur_queue);
  
 + if (!work_pending(&adapter->tx_stall_task))
 + queue_work(adapter->txrx_wq,
 + &adapter->tx_stall_task);
 +
   ret = NETDEV_TX_BUSY;
   }
   } else {
 @@ -690,6 +751,7 @@ static int fjes_probe(struct platform_device *plat_dev)
  
   adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
  
 + INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
   INIT_WORK(&adapter->raise_intr_rxdata_task,
 fjes_raise_intr_rxdata_task);
  
 @@ -734,6 

[PATCH] net: qlcnic: Deletion of unnecessary memset

2015-07-13 Thread Christophe JAILLET
There is no need to memset memory allocated with vzalloc.

Signed-off-by: Christophe JAILLET christophe.jail...@wanadoo.fr
---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c 
b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index 2f6cc42..7dbab3c 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -2403,7 +2403,6 @@ int qlcnic_alloc_tx_rings(struct qlcnic_adapter *adapter,
qlcnic_free_tx_rings(adapter);
return -ENOMEM;
}
-   memset(cmd_buf_arr, 0, TX_BUFF_RINGSIZE(tx_ring));
tx_ring->cmd_buf_arr = cmd_buf_arr;
spin_lock_init(&tx_ring->tx_clean_lock);
}
-- 
2.1.4
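
The same redundancy can be shown in user space, where calloc() plays the role of vzalloc(): both return zero-filled memory, so a following memset() to zero does no useful work. What follows is only a minimal sketch under that analogy; `struct cmd_buf`, `alloc_cmd_buf_arr()` and `cmd_buf_arr_is_zeroed()` are hypothetical names for illustration and are not taken from the qlcnic driver.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-descriptor command buffer. */
struct cmd_buf {
	void *skb;
	unsigned int frag_count;
};

/*
 * calloc(), like vzalloc() in the kernel, hands back zero-filled
 * memory, so no memset() is needed after a successful allocation.
 */
struct cmd_buf *alloc_cmd_buf_arr(size_t num_desc)
{
	assert(num_desc > 0);
	return calloc(num_desc, sizeof(struct cmd_buf));
}

/* Returns 1 if every byte of the array is zero, 0 otherwise. */
int cmd_buf_arr_is_zeroed(const struct cmd_buf *arr, size_t num_desc)
{
	const unsigned char *p = (const unsigned char *)arr;
	size_t i;

	for (i = 0; i < num_desc * sizeof(*arr); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Checking the freshly allocated array with `cmd_buf_arr_is_zeroed()` should succeed without any explicit clearing, which is the reason the memset() in the patch can go.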



virtio-net TSO Lockup

2015-07-13 Thread Brian Rak
We've been encountering an issue in the virtio-net driver that causes it 
to become unresponsive after a period of high load.  This issue goes 
away if we disable TSO on the interface.


Once this issue has been triggered, the interface can still receive 
traffic, but will not transmit anything.


Specifically:
* Initially the machine will still try to respond to packets (I say try, 
because I see the packets in tcpdump, but the counters shown by 'ip -s 
-d link show eth1' do not increment.  I also do not see the packets make 
it to the upstream network interface)
* After a little while (1-2 minutes), I stop seeing the response packets 
in tcpdump.  (In this case I'm looking for ARP request/replies, so the 
requests still come in, but the responses do not go out.  This is not 
limited to just ARP, the interface will not respond at all)
* If I leave a ping running while the interface is broken, eventually I 
start seeing 'ping: sendmsg: No buffer space available'


I've reproduced this on a few Ubuntu kernel builds (3.13.0-53-generic 
and 4.0.7-040007-generic), and a few CentOS kernels 
(2.6.32-504.16.2.el6.x86_64, 4.1.1-1.el6.elrepo.x86_64) so I do not 
believe this to be distribution specific.


If I restart the machine (just issuing a server level 'reboot' command, 
not restarting qemu itself), the adapter starts working properly again.


Interestingly, these machines have two virtio NICs, and this only seems 
to occur for one of them (by this, I mean eth0 always works, and eth1 
always breaks.  If I remove eth0 from the machine, eth1 still breaks). 
On the host level, the broken one is a macvtap interface, while the 
working one is a tap device.   We've seen this in the past with a 
different interface type (the qemu multicast NIC type), so I do not 
believe this is really relevant.  If I switch the machines to using 
emulated e1000 nics, I can no longer reproduce the issue.


Reproduction is fairly easy: with two machines, run `nc -lk 1818 | pv > 
/dev/null` on one and `cat /dev/zero | pv | nc 10.99.0.100 1818` on the 
other (the machine sending traffic will break within a minute or two).  I can 
easily provide access to machines where the problem manifests, if that 
would be helpful.


I'm not really sure where to go from here.  Tracking down a bug in the 
virtio driver is a bit above my skill level.



Re: [PATCH] net/bonding: Add function bond_remove_proc_entry at __bond_release_one

2015-07-13 Thread Nikolay Aleksandrov
On 07/13/2015 11:05 PM, Nikolay Aleksandrov wrote:
 On 07/13/2015 08:57 PM, cls...@linux.vnet.ibm.com wrote:
 From: Carol L Soto cls...@linux.vnet.ibm.com

 Add function bond_remove_proc_entry at __bond_release_one to avoid stack 
 trace at rmmod bonding.

 [68830.202239] remove_proc_entry: removing non-empty directory
 'net/bonding', leaking at least 'bond0'
 [68830.202257] [ cut here ]
 [68830.202260] WARNING: at fs/proc/generic.c:562
 [68830.202412] NIP [c02abf6c] .remove_proc_entry+0x1fc/0x240
 [68830.202416] LR [c02abf68] .remove_proc_entry+0x1f8/0x240
 [68830.202419] PACATMSCRATCH [80009032]
 [68830.202421] Call Trace:
 [68830.202424] [c00179277940] [c02abf68] 
 .remove_proc_entry+0x1f8/0x240 (unreliable)
 [68830.202434] [c001792779f0] [d53229a4] 
 .bond_destroy_proc_dir+0x34/0x54 [bonding]
 [68830.202440] [c00179277a70] [d53130e0] 
 .bond_net_exit+0x90/0x120 [bonding]
 [68830.202445] [c00179277b10] [c059944c] 
 .ops_exit_list.isra.0+0x6c/0xd0
 [68830.202450] [c00179277ba0] [c0599774] 
 .unregister_pernet_operations+0x94/0x100
 [68830.202454] [c00179277c40] [c0599814] 
 .unregister_pernet_subsys+0x34/0x60
 [68830.202460] [c00179277cc0] [d5323758] 
 .bonding_exit+0x48/0x2328 [bonding]
 [68830.202466] [c00179277d30] [c010dcc4] 
 .SyS_delete_module+0x1f4/0x340
 [68830.202471] [c00179277e30] [c0009e7c] 
 syscall_exit+0x0/0x7c
 [68830.202491] ---[ end trace 9bd1d810219c9875 ]---

 Signed-off-by: Carol L Soto cls...@linux.vnet.ibm.com
 ---
  drivers/net/bonding/bond_main.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/drivers/net/bonding/bond_main.c 
 b/drivers/net/bonding/bond_main.c
 index 19eb990..ace105a 100644
 --- a/drivers/net/bonding/bond_main.c
 +++ b/drivers/net/bonding/bond_main.c
 @@ -1870,6 +1870,8 @@ static int __bond_release_one(struct net_device 
 *bond_dev,
  dev_set_mac_address(slave_dev, addr);
  }
  
 +bond_remove_proc_entry(bond);
 +
  dev_set_mtu(slave_dev, slave->original_mtu);
  
  slave_dev->priv_flags &= ~IFF_BONDING;

 
 This is incorrect, it tries to remove the bond entry on every slave release
 so if we have a bonding device with >= 2 slaves and release one of them then
 the whole bond device entry will be removed from /proc/net/bonding.

 You can hit this case only if you had created a bonding device while doing the
 rmmod bonding (it's an old race condition which was fixed long time ago, but
 the procfs was apparently missed) and only after the notifier has been
 unregistered but before the sysfs has been removed.
 
Scratch this part, it should be triggered in a different way.
Could you provide a way to reproduce ?

 Since the bonding netdevice notifier is handling the procfs 
 creation/destruction
 we could try moving the unregister after the pernet destruction which should
 help avoid such problems. Could you try the following patch:
 
 
 diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
 index 19eb990d398c..d515ee38b77f 100644
 --- a/drivers/net/bonding/bond_main.c
 +++ b/drivers/net/bonding/bond_main.c
 @@ -4682,12 +4682,10 @@ err_link:
  
  static void __exit bonding_exit(void)
  {
 - unregister_netdevice_notifier(&bond_netdev_notifier);
 -
   bond_destroy_debugfs();
 -
   bond_netlink_fini();
   unregister_pernet_subsys(&bond_net_ops);
 + unregister_netdevice_notifier(&bond_netdev_notifier);
  
  #ifdef CONFIG_NET_POLL_CONTROLLER
   /* Make sure we don't have an imbalance on our netpoll blocking */
 


Re: [PATCH 1/2] sctp: SCTP_SOCKOPT_PEELOFF return socket pointer for kernel users

2015-07-13 Thread David Miller
From: Marcelo Ricardo Leitner marcelo.leit...@gmail.com
Date: Mon, 13 Jul 2015 16:05:27 -0300

 On 13-07-2015 15:59, David Miller wrote:
 From: Neil Horman nhor...@tuxdriver.com
 Date: Mon, 13 Jul 2015 06:39:11 -0400

 Initially Marcelo had created duplicate code paths, one to return an
 fd, one to return a file struct.  If you would rather go in that
 direction, I'm sure he can propose it again, but that seems less
 correct to me than this solution.

 That's much better.
 
 I'm not sure what you mean. Is the new option better or the
 history/description?

I mean that adding an explicit function for these internal kernel
users to call is better.


Re: [PATCH net-next] tc: fix tc actions in case of shared skb

2015-07-13 Thread Alexei Starovoitov

On 7/11/15 9:29 PM, David Miller wrote:

From: Alexei Starovoitov a...@plumgrid.com
Date: Fri, 10 Jul 2015 17:10:11 -0700


TC actions need to check for very unlikely event skb->users != 1,
otherwise subsequent pskb_may_pull/pskb_expand_head will crash.
When skb_shared() just drop the packet, since in the middle of actions
it's too late to call skb_share_check(), since classifiers/actions assume
the same skb pointer.

Signed-off-by: Alexei Starovoitov a...@plumgrid.com


I think whatever creates this skb->users != 1 situation should be fixed,
they should clone the packet.


In all normal cases skb->users == 1, but pktgen is using trick:
atomic_add(burst, &skb->users);
so when testing something like:
tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
  action vlan push id 2 action drop

it will crash:
[   31.999519] kernel BUG at ../net/core/skbuff.c:1130!
[   31.999519] invalid opcode:  [#1] PREEMPT SMP
[   31.999519] Modules linked in: act_gact act_vlan cls_u32 sch_ingress 
veth pktgen

[   31.999519] CPU: 0 PID: 339 Comm: kpktgend_0 Not tainted 4.1.0+ #730
[   31.999519] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
[   31.999519] Call Trace:

[   31.999519]  [8160eea7] skb_vlan_push+0x1d7/0x200
[   31.999519]  [a0017108] tcf_vlan+0x108/0x110 [act_vlan]
[   31.999519]  [81650d26] tcf_action_exec+0x46/0x80
[   31.999519]  [a001f4fe] u32_classify+0x30e/0x740 [cls_u32]
[   31.999519]  [810bcc6f] ? __lock_acquire+0xbcf/0x1e80
[   31.999519]  [810bcc6f] ? __lock_acquire+0xbcf/0x1e80
[   31.999519]  [8161f392] ? __netif_receive_skb_core+0x1b2/0xce0
[   31.999519]  [8164c0c3] tc_classify_compat+0xa3/0xb0
[   31.999519]  [8164ca03] tc_classify+0x33/0x90
[   31.999519]  [8161f674] __netif_receive_skb_core+0x494/0xce0
[   31.999519]  [8161f274] ? __netif_receive_skb_core+0x94/0xce0
[   31.999519]  [810bf10d] ? trace_hardirqs_on_caller+0xad/0x1d0
[   31.999519]  [8161fee1] __netif_receive_skb+0x21/0x70
[   31.999519]  [81620b43] netif_receive_skb_internal+0x23/0x1c0
[   31.999519]  [816219a9] netif_receive_skb_sk+0x49/0x1e0
[   31.999519]  [a0006e8d] pktgen_thread_worker+0x111d/0x1fa0 
[pktgen]



In fact, it would really help enormously if you could explain in detail
how this situation can actually arise.  Especially since I do not consider
it acceptable to drop the packet in this situation.


It's not pretty to drop, but it's better than crash.
I don't think we can get rid of 'skb->users += burst' trick, since
that's where all performance comes from (for both TX and RX testing).

So the only cheap way I see to avoid crash is to do this
if (unlikely(skb_shared(skb)))
check in actions that call pskb_expand_head.

In all normal scenarios it won't be triggered and pktgen tests
won't be crashing.
Yes. pktgen numbers will be a bit meaningless, since act_vlan will be
dropping instead of adding vlan, so users cannot make any performance
conclusions, but still better than crash.


the rules specified here:
Documentation/networking/tc-actions-env-rules.txt
insufficient?


Jamal,
that doc definitely needs updating. :)
It says:
If you munge any packet thou shalt call pskb_expand_head in the case
someone else is referencing the skb. After that you own the skb.
that's incorrect. If somebody 'referencing' skb via skb->users > 1
it's too late to call pskb_expand_head. As you can see in the
crash trace above.
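
The check proposed above can be sketched in user space as follows. This is only an illustration of the "drop when shared" idea, not kernel code: `struct pkt`, `pkt_shared()`, `act_vlan_push()` and the `ACT_OK`/`ACT_SHOT` results are hypothetical stand-ins for `struct sk_buff`, `skb_shared()`, the vlan action and its return codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action verdicts: continue processing or drop. */
enum act_result { ACT_OK = 0, ACT_SHOT = 1 };

/* Hypothetical packet; 'users' plays the role of skb->users. */
struct pkt {
	int users;   /* reference count held on this packet */
	int vlan_id; /* 0 means "no tag pushed yet" */
};

/* A packet is shared when anyone besides the caller holds a reference. */
int pkt_shared(const struct pkt *p)
{
	return p->users != 1;
}

/*
 * Modify the packet only when we are the sole owner. A shared packet
 * is dropped instead, because changing it in place would corrupt the
 * view of the other reference holders.
 */
enum act_result act_vlan_push(struct pkt *p, int vlan_id)
{
	assert(p != NULL);

	if (pkt_shared(p))
		return ACT_SHOT;

	p->vlan_id = vlan_id;
	return ACT_OK;
}
```

With `users == 1` the tag is pushed; with a burst-style count such as `users == 4` the packet is left untouched and reported as dropped, which matches the trade-off described above: meaningless pktgen numbers rather than a crash.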



Re: [PATCH net-next] net: Build IPv6 into kernel by default

2015-07-13 Thread David Miller
From: Tom Herbert t...@herbertland.com
Date: Mon, 13 Jul 2015 08:48:00 -0700

 This patch makes the default to build IPv6 into the kernel. IPv6
 now has significant traction and any remaining vestiges of IPv6
 not being provided parity with IPv4 should be swept away. IPv6 is now
 core to the Internet and kernel.
 
 Points on IPv6 adoption:
 ...
 Acked-by: YOSHIFUJI Hideaki yoshf...@linux-ipv6.org
 Signed-off-by: Tom Herbert t...@herbertland.com

Applied, thanks.


Re: [PATCH net-next] ebpf: remove self-assignment in interpreter's tail call

2015-07-13 Thread David Miller
From: Daniel Borkmann dan...@iogearbox.net
Date: Mon, 13 Jul 2015 20:49:32 +0200

 ARG1 = BPF_R1 as it stands, evaluates to regs[BPF_REG_1] = regs[BPF_REG_1]
 and thus has no effect. Add a comment instead, explaining what happens and
 why it's okay to just remove it. Since from user space side, a tail call is
 invoked as a pseudo helper function via bpf_tail_call_proto, the verifier
 checks the arguments just like with any other helper function and makes
 sure that the first argument (regs[BPF_REG_1])'s type is ARG_PTR_TO_CTX.
 
 Signed-off-by: Daniel Borkmann dan...@iogearbox.net

Applied, thanks Daniel.


Re: [PATCH 1/2] sctp: SCTP_SOCKOPT_PEELOFF return socket pointer for kernel users

2015-07-13 Thread Marcelo Ricardo Leitner

On 13-07-2015 16:58, David Miller wrote:

From: Marcelo Ricardo Leitner marcelo.leit...@gmail.com
Date: Mon, 13 Jul 2015 16:05:27 -0300


On 13-07-2015 15:59, David Miller wrote:

From: Neil Horman nhor...@tuxdriver.com
Date: Mon, 13 Jul 2015 06:39:11 -0400


Initially Marcelo had created duplicate code paths, one to return an
fd, one to return a file struct.  If you would rather go in that
direction, I'm sure he can propose it again, but that seems less
correct to me than this solution.


That's much better.


I'm not sure what you mean. Is the new option better or the
history/description?


I mean that adding an explicit function for these internal kernel
users to call is better.


Okay. I'll try to minimize that code duplication then.

Thanks
Marcelo



Re: [PATCH] Logically DeadCode

2015-07-13 Thread Rafał Miłecki
On 3 July 2015 at 06:52, Rahul Jain rahul.j...@samsung.com wrote:
 From 0c34030166a150d6d9f1ab52e7bb40a5440a68c2 Mon Sep 17 00:00:00 2001
 From: Rahul Jain rahul.j...@samsung.com
 Date: Fri, 3 Jul 2015 10:19:12 +0530
 Subject: [PATCH] Logically DeadCode

You didn't use any prefix for the commit message, it's unclear
(Logically DeadCode what?), no description, you touch two code places
at once.

Please fix above problems and resend.


Re: [PATCH] Revert net: fec: Ensure clocks are enabled while using mdio bus

2015-07-13 Thread David Miller
From: Fabio Estevam fabio.este...@freescale.com
Date: Mon, 13 Jul 2015 08:13:52 -0300

 This reverts commit 6c3e921b18edca290099adfddde8a50236bf2d80.
 
 commit 6c3e921b18ed (net: fec: Ensure clocks are enabled while using mdio
  bus) prevents the kernel from booting on mx6 boards, so let's revert it.
 
 Reported-by: Tyler Baker tyler.ba...@linaro.org
 Signed-off-by: Fabio Estevam fabio.este...@freescale.com

Andrew, please review.


Re: [PATCH net-next] tc: fix tc actions in case of shared skb

2015-07-13 Thread Alexei Starovoitov

On 7/13/15 1:04 PM, David Miller wrote:

From: Alexei Starovoitov a...@plumgrid.com
Date: Mon, 13 Jul 2015 12:47:42 -0700


In all normal cases skb->users == 1, but pktgen is using trick:
atomic_add(burst, &skb->users);
so when testing something like:


You can want pktgen rx (which is the only buggy case as far as I can
see, TX is fine) to run fast, but you must do so by abiding by the
appropriate SKB sharing rules.

You can't do an optimization in pktgen for RX processing that works
some of the time.  We have shared SKB rules for a reason.

And I don't want to have to explain to someone in the future why that
drop check is there, and have to tell them because pktgen is broken
and we decided to add a hack here rather than make pktgen send
properly formed SKBs into the RX path

Ok?


in general all makes sense, but it is both RX and TX.
Without burst hack we cannot achieve line rate TX.
atomic_add(burst, &pkt_dev->skb->users);
xmit_more:
ret = netdev_start_xmit(pkt_dev->skb, odev, txq, --burst > 0);

in pktgen we check that driver can work with users > 1 via:
pkt_dev->odev->priv_flags & IFF_TX_SKB_SHARING

so real hw drivers are mostly ready for users > 1, it's only
a few tc actions that struggle a bit.
We cannot check tc actions from pktgen, since they can be added
dynamically.
So I see three options:
1 get rid of burst hack for both RX and TX in pktgen (kills performance)
2 add unlikely(skb_shared) check to few tc actions
3 do nothing

I think 2 isn't that bad after all if properly documented with
"because pktgen is doing this hack for performance"?

I'm fine with 3 too, since the whole pktgen business is for root
and for kernel hackers who are supposed to know what they're doing.



Re: [PATCH net-next] tc: fix tc actions in case of shared skb

2015-07-13 Thread David Miller
From: Alexei Starovoitov a...@plumgrid.com
Date: Mon, 13 Jul 2015 12:47:42 -0700

 In all normal cases skb->users == 1, but pktgen is using trick:
 atomic_add(burst, &skb->users);
 so when testing something like:

You can want pktgen rx (which is the only buggy case as far as I can
see, TX is fine) to run fast, but you must do so by abiding by the
appropriate SKB sharing rules.

You can't do an optimization in pktgen for RX processing that works
some of the time.  We have shared SKB rules for a reason.

And I don't want to have to explain to someone in the future why that
drop check is there, and have to tell them because pktgen is broken
and we decided to add a hack here rather than make pktgen send
properly formed SKBs into the RX path

Ok?


Re: [RFC PATCH 0/2] net: macb: Add mdio driver for accessing multiple phy devices

2015-07-13 Thread Florian Fainelli
On 12/07/15 21:48, Punnaiah Choudary Kalluri wrote:
 This patch is to add support for the design that has multiple ethernet
 mac controllers and single mdio bus connected to multiple phy devices.
 i.e mdio lines are connected to any of the ethernet mac controller and
 all the phy devices will be accessed using the phy maintenance interface
 in that mac controller.
 
  ______         _____
 |      |       |PHY0 |
 | MAC0 |-------|     |
 |______|   |   |_____|
            |
  ______    |    _____
 |      |   |   |     |
 | MAC1 |   |___|PHY1 |
 |______|       |     |
 
 So, i come up with two implementations for addressing the above configuration.
 
 Implementation 1:
  Have separate driver for mdio bus
  Create a DT node for all the PHY devices connected to the mdio bus
  This driver will share the register space of the mac controller that has
  mdio bus connected.

That is the best design. MDIO is a sub-piece of your Ethernet MAC
controller; the fact that it sits within the Ethernet MAC core is
coincidental, and there is no reason why it could not be taken apart
and made a separate block in itself.

 
 Implementation 2:
  Add new property has-mdio and it should be 1 for the mac that has mdio bus
  connected.
  Create the mdio bus only when the has-mdio property is 1
 
 Please review the two implementations and suggest which one is better to 
 proceed
 further. In my opinion implementation 1 will be the ideal one.

Agreed.

 
 Currently i have tested the patches with single mac and single phy
 configuration. I need to take care of few more cases before releasing the 
 final patch
 but before that i would like to have your opinion on the above implementations
 and finalize one implementation. so that i can enhance it further.
 
 Punnaiah Choudary Kalluri (1):
   net: macb: Add mdio driver for accessing multiple phy devices
   net: macb: Add support for single mac managing more than one phy
 
 
  drivers/net/ethernet/cadence/Makefile|2 +-
  drivers/net/ethernet/cadence/macb.c  |   93 +-
  drivers/net/ethernet/cadence/macb.h  |3 +-
  drivers/net/ethernet/cadence/macb_mdio.c |  204 
 ++
  4 files changed, 211 insertions(+), 91 deletions(-)
  create mode 100644 drivers/net/ethernet/cadence/macb_mdio.c
 


-- 
Florian


Re: [PATCH net-next] tc: fix tc actions in case of shared skb

2015-07-13 Thread Daniel Borkmann

On 07/13/2015 10:17 PM, Alexei Starovoitov wrote:
...

We cannot check tc actions from pktgen, since they can be added
dynamically.
So I see three options:
1 get rid of burst hack for both RX and TX in pktgen (kills performance)
2 add unlikely(skb_shared) check to few tc actions
3 do nothing

I think 2 isn't that bad after all if properly documented with
"because pktgen is doing this hack for performance"?

I'm fine with 3 too, since the whole pktgen business is for root
and for kernel hackers who are supposed to know what they're doing.


Hmm, one thing for option 3 could be that we add a modinfo tag
experimental, so that on loading of pktgen module, we trigger
(like in case of staging) ...

  add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);

... and add a pr_warn() to the user, it may be more visible/clear
than the Packet Generator (USE WITH CAUTION) Kconfig title? ;)

It'd be a pity that we'd need the extra atomic read only for the
pktgen case. :/ With regards to option 2, you could hide that behind
a static inline helper wrapped in IS_ENABLED(CONFIG_NET_PKTGEN), but
that is a very ugly workaround/hack as well (and distros might
even ship it nevertheless). I wouldn't be surprised if there are
other usage combinations with pktgen that would crash your box. :/

Best,
Daniel


Re: [PATCH] net/bonding: Add function bond_remove_proc_entry at __bond_release_one

2015-07-13 Thread Nikolay Aleksandrov
On 07/13/2015 08:57 PM, cls...@linux.vnet.ibm.com wrote:
 From: Carol L Soto cls...@linux.vnet.ibm.com
 
 Add function bond_remove_proc_entry at __bond_release_one to avoid stack 
 trace at rmmod bonding.
 
 [68830.202239] remove_proc_entry: removing non-empty directory
 'net/bonding', leaking at least 'bond0'
 [68830.202257] [ cut here ]
 [68830.202260] WARNING: at fs/proc/generic.c:562
 [68830.202412] NIP [c02abf6c] .remove_proc_entry+0x1fc/0x240
 [68830.202416] LR [c02abf68] .remove_proc_entry+0x1f8/0x240
 [68830.202419] PACATMSCRATCH [80009032]
 [68830.202421] Call Trace:
 [68830.202424] [c00179277940] [c02abf68] 
 .remove_proc_entry+0x1f8/0x240 (unreliable)
 [68830.202434] [c001792779f0] [d53229a4] 
 .bond_destroy_proc_dir+0x34/0x54 [bonding]
 [68830.202440] [c00179277a70] [d53130e0] 
 .bond_net_exit+0x90/0x120 [bonding]
 [68830.202445] [c00179277b10] [c059944c] 
 .ops_exit_list.isra.0+0x6c/0xd0
 [68830.202450] [c00179277ba0] [c0599774] 
 .unregister_pernet_operations+0x94/0x100
 [68830.202454] [c00179277c40] [c0599814] 
 .unregister_pernet_subsys+0x34/0x60
 [68830.202460] [c00179277cc0] [d5323758] 
 .bonding_exit+0x48/0x2328 [bonding]
 [68830.202466] [c00179277d30] [c010dcc4] 
 .SyS_delete_module+0x1f4/0x340
 [68830.202471] [c00179277e30] [c0009e7c] 
 syscall_exit+0x0/0x7c
 [68830.202491] ---[ end trace 9bd1d810219c9875 ]---
 
 Signed-off-by: Carol L Soto cls...@linux.vnet.ibm.com
 ---
  drivers/net/bonding/bond_main.c | 2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
 index 19eb990..ace105a 100644
 --- a/drivers/net/bonding/bond_main.c
 +++ b/drivers/net/bonding/bond_main.c
 @@ -1870,6 +1870,8 @@ static int __bond_release_one(struct net_device 
 *bond_dev,
   dev_set_mac_address(slave_dev, addr);
   }
  
 + bond_remove_proc_entry(bond);
 +
   dev_set_mtu(slave_dev, slave->original_mtu);
  
   slave_dev->priv_flags &= ~IFF_BONDING;
 

This is incorrect, it tries to remove the bond entry on every slave release
so if we have a bonding device with >= 2 slaves and release one of them then
the whole bond device entry will be removed from /proc/net/bonding.
You can hit this case only if you had created a bonding device while doing the
rmmod bonding (it's an old race condition which was fixed long time ago, but
the procfs was apparently missed) and only after the notifier has been
unregistered but before the sysfs has been removed.

Since the bonding netdevice notifier is handling the procfs creation/destruction
we could try moving the unregister after the pernet destruction which should
help avoid such problems. Could you try the following patch:


diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 19eb990d398c..d515ee38b77f 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4682,12 +4682,10 @@ err_link:
 
 static void __exit bonding_exit(void)
 {
-   unregister_netdevice_notifier(&bond_netdev_notifier);
-
    bond_destroy_debugfs();
-
    bond_netlink_fini();
    unregister_pernet_subsys(&bond_net_ops);
+   unregister_netdevice_notifier(&bond_netdev_notifier);
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
/* Make sure we don't have an imbalance on our netpoll blocking */


Re: [RFC net-next 18/22] openvswitch: Make tunnel set action attach a metadata dst

2015-07-13 Thread Joe Stringer
Hi Thomas,

On 10 July 2015 at 07:19, Thomas Graf tg...@suug.ch wrote:
 diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
 index ecfa530..05fe46b 100644
 --- a/net/openvswitch/flow_netlink.c
 +++ b/net/openvswitch/flow_netlink.c
 @@ -1548,11 +1548,45 @@ static struct sw_flow_actions 
 *nla_alloc_flow_actions(int size, bool log)
 return sfa;
  }

 +static void ovs_nla_free_set_action(const struct nlattr *a)
 +{
 +   const struct nlattr *ovs_key = nla_data(a);
 +   struct ovs_tunnel_info *ovs_tun;
 +
 +   switch (nla_type(ovs_key)) {
 +   case OVS_KEY_ATTR_TUNNEL_INFO:
 +   ovs_tun = nla_data(ovs_key);
 +   dst_release((struct dst_entry *)ovs_tun->tun_dst);
 +   break;
 +   }
 +}
 +
 +void ovs_nla_free_flow_actions(struct sw_flow_actions *sf_acts)
 +{
 +   const struct nlattr *a;
 +   int rem;
 +
 +   nla_for_each_attr(a, sf_acts->actions, sf_acts->actions_len, rem) {
 +   switch (nla_type(a)) {
 +   case OVS_ACTION_ATTR_SET:
 +   ovs_nla_free_set_action(a);
 +   break;
 +   }
 +   }
 +
 +   kfree(sf_acts);
 +}

It doesn't look like flow_free() is using this new function to
properly free the actions. Also, some of the error cases that hit this
code have sf_acts=NULL.
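
To make the NULL concern concrete, here is a minimal user-space sketch of a free routine that tolerates a NULL argument before walking the actions. It is an analogy only: `struct flow_actions`, `struct tun_dst`, `flow_actions_alloc()`, `flow_actions_free()` and `tun_dst_release()` are hypothetical names, and a plain array replaces the netlink attribute stream used by the real code.

```c
#include <assert.h>
#include <stdlib.h>

enum action_type { ACTION_OUTPUT, ACTION_SET_TUNNEL };

/* Hypothetical reference-counted tunnel destination. */
struct tun_dst {
	int refcnt;
};

struct action {
	enum action_type type;
	struct tun_dst *dst; /* only meaningful for ACTION_SET_TUNNEL */
};

/* Hypothetical action list: a count followed by the actions. */
struct flow_actions {
	size_t n;
	struct action acts[];
};

struct flow_actions *flow_actions_alloc(size_t n)
{
	struct flow_actions *fa;

	fa = calloc(1, sizeof(*fa) + n * sizeof(struct action));
	if (fa)
		fa->n = n;
	return fa;
}

/* Drop one reference on a tunnel destination. */
void tun_dst_release(struct tun_dst *dst)
{
	assert(dst->refcnt > 0);
	dst->refcnt--;
}

/*
 * Release per-action resources, then the list itself. The early
 * return makes the function safe on error paths that never managed
 * to allocate the list.
 */
void flow_actions_free(struct flow_actions *fa)
{
	size_t i;

	if (!fa)
		return;

	for (i = 0; i < fa->n; i++)
		if (fa->acts[i].type == ACTION_SET_TUNNEL && fa->acts[i].dst)
			tun_dst_release(fa->acts[i].dst);

	free(fa);
}
```

Whether the guard belongs inside the free function or at each call site is a design choice; placing it inside keeps every error path short, at the cost of hiding callers that pass NULL by mistake.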