date:20151026

Re: [PATCH net] ipv6: no CHECKSUM_PARTIAL on skbs with extension headers and recalc checksum during fragmentation

2015-10-26 Thread Hannes Frederic Sowa

On Sun, Oct 25, 2015, at 14:32, Tom Herbert wrote:
> > Anyway, currently it is easy to generate broken checksums on the wire
> > and I would like to solve that for net; we can certainly improve that in
> > net-next.
> >
> Hannes,
> 
> The IPv4 fragment code is very similar to IPv6 in that both will
> perform skb_checksum_help only in the slow_path, so it seems like
> skb_checksum_help should be called earlier before the fragments are
> generated in both cases. If we're not correctly setting checksum for
> packets that are fragmented that is an issue with the stack.

We already concluded that drivers do have this problem and not the stack
above ip6_fragment. The places I am aware of I fixed in this patch. Also
IPv4 to me seems unaffected, albeit one can certainly clean up the logic
in net-next.

Do you want to move the skb_checksum_help() check to the front of
ip_fragment in ipv4 now too?

My patch fixed the part above ip6_fragment (in ip6_append_data) and made
sure we don't send out packets with wrong checksums if we get to
ip6_fragment directly.
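
To illustrate why the checksum has to be settled before fragmentation: the transport checksum covers the whole datagram, so it must be computed over the complete payload before that payload is cut into pieces. The following is only a userspace sketch of the RFC 1071 ones' complement sum, not the kernel's skb_checksum_help(); all function names here are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator down to a 16-bit ones' complement sum. */
static uint16_t sketch_csum_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Add a buffer, read as big-endian 16-bit words, to a running sum.
 * A trailing odd byte is padded with zero. Assumes buffers short
 * enough (at most 64 KiB) that the 32-bit accumulator cannot overflow.
 */
static uint32_t sketch_csum_add(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)buf[i] << 8 | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Final checksum value: complement of the folded sum. */
static uint16_t sketch_csum_finish(uint32_t sum)
{
	return (uint16_t)~sketch_csum_fold(sum);
}
```

Summing the pieces in order gives the same value as one pass over the whole buffer as long as every split falls on an even offset, but the result is still a single value that belongs in the transport header, which travels only in the first fragment.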

Bye,
Hannes
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH stable<3.19] net: handle null iovec pointer in skb_copy_and_csum_datagram_iovec()

2015-10-26 Thread Luis Henriques

On Fri, Oct 23, 2015 at 11:39:46AM +0200, Michal Kubecek wrote:
> On Fri, Oct 23, 2015 at 11:22:19AM +0200, Sabrina Dubroca wrote:
> > Hello Michal,
> > 
> > 2015-10-23, 10:46:09 +0200, Michal Kubecek wrote:
> > > Mainline commit 89c22d8c3b27 ("net: Fix skb csum races when peeking")
> > > backport into pre-3.19 stable kernels introduces a regression causing
> > > > null pointer dereference in skb_copy_and_csum_datagram_iovec().
> > > 
> > > This commit only sets CHECKSUM_UNNECESSARY for non-shared skb, allowing
> > > udp_recvmsg() to take the "else" branch of if (skb_csum_unnecessary(skb))
> > > when called with null iovec (and len=0, e.g. when peeking for datagram
> > > size first). The problem is that unlike skb_copy_and_csum_datagram_msg()
> > > called in this path since 3.19, skb_copy_and_csum_datagram_iovec() does
> > > not handle null iov parameter and always dereferences iov->iov_len. This
> > > is especially harmful when udp_recvmsg() is called in kernel context,
> > > e.g. from kernel nfsd.
> > > 
> > > Band-aid skb_copy_and_csum_datagram_iovec() by testing iov for null and
> > > only checking the checksum in this case.
> > > 
> > > Signed-off-by: Michal Kubecek 
> > > ---
> > 
> > I ran into this problem too and that was my initial solution to this
> > problem as well, but actually, we need a more complete fix, like the
> > one I submitted a few days ago:
> > 
> > http://patchwork.ozlabs.org/patch/530642/
> > 
> > With your solution, userspace can still receive bogus EFAULT, or the
> > kernel ends up writing data to an unwanted memory location.
> 
> I must admit I wondered why skb_copy_and_csum_datagram_iovec() doesn't
> get (and check) read length and why it cannot overfill the buffer. But
> then I saw the comment "Caller _must_ check that skb will fit to this
> iovec", stopped thinking and assumed it's OK. I guess I should be less
> trusting... :-(
> 
> Thank you for the warning.
> 
> Michal Kubecek

I can confirm that we had this issue reported in our kernels too
(https://bugs.launchpad.net/bugs/1508510).  I'll queue Sabrina's patch for
the next 3.16 stable kernel release.  Thanks a lot!
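
For readers following the thread, the problem can be sketched in userspace as below. This is not the kernel function and not Sabrina's patch; the struct, the function name and the 8-bit additive checksum are all invented for the example. It only shows the two points raised above: a zero-length request must not touch the (possibly null) iovec, and the copy length must be checked against the destination.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a single-segment iovec. */
struct sketch_iovec {
	void *base;
	size_t len;
};

/*
 * Verify a toy 8-bit additive checksum over the datagram and copy
 * 'want' bytes of it into 'iov'.
 * Returns 0 on success, -EINVAL on checksum mismatch and -EFAULT if
 * the request cannot be satisfied safely.
 */
static int sketch_copy_and_csum(const uint8_t *data, size_t data_len,
				uint8_t expected,
				const struct sketch_iovec *iov, size_t want)
{
	uint8_t sum = 0;
	size_t i;

	for (i = 0; i < data_len; i++)
		sum += data[i];
	if (sum != expected)
		return -EINVAL;

	/* Peeking for the size only: nothing to copy, iov may be NULL. */
	if (want == 0)
		return 0;

	/* Never dereference a null iovec or write past its end. */
	if (!iov || !iov->base || iov->len < want || want > data_len)
		return -EFAULT;

	memcpy(iov->base, data, want);
	return 0;
}
```

The explicit length check is the part that the null test alone does not cover.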


Cheers,
--
Luís

RE: [PATCH v7] can: xilinx: Convert to runtime_pm

2015-10-26 Thread Appana Durga Kedareswara Rao

Hi Marc,

> -Original Message-
> From: Marc Kleine-Budde [mailto:m...@pengutronix.de]
> Sent: Monday, October 26, 2015 1:54 AM
> To: Appana Durga Kedareswara Rao; Anirudha Sarangi; w...@grandegger.com;
> Michal Simek; Soren Brinkmann
> Cc: linux-...@vger.kernel.org; netdev@vger.kernel.org; linux-arm-
> ker...@lists.infradead.org; linux-ker...@vger.kernel.org; Appana Durga
> Kedareswara Rao
> Subject: Re: [PATCH v7] can: xilinx: Convert to runtime_pm
> 
> On 10/23/2015 07:23 AM, Kedareswara rao Appana wrote:
> > Instead of enabling/disabling clocks at several locations in the
> > driver, use the runtime_pm framework. This consolidates the actions
> > for runtime PM in the appropriate callbacks and makes the driver more
> > readable and maintainable.
> >
> > Signed-off-by: Kedareswara rao Appana 
> > ---
> > Changes for v7:
> >   - Removed the unnecessary clk_prepare/clk_unprepare calls
> >     from the probe and remove functions, as suggested by Soren.
> > Changes for v6:
> >  - Updated the driver with review comments as suggested by Marc.
> > Changes for v5:
> >  - Updated with the review comments.
> >    Updated the remove function to use runtime_pm.
> > Changes for v4:
> >  - Updated with the review comments.
> > Changes for v3:
> >   - Converted the driver to use runtime_pm.
> > Changes for v2:
> >   - Removed the struct platform_device* from suspend/resume
> >     as suggested by Lothar.
> >
> >  drivers/net/can/xilinx_can.c | 177
> > +--
> >  1 file changed, 102 insertions(+), 75 deletions(-)
> >
> > diff --git a/drivers/net/can/xilinx_can.c
> > b/drivers/net/can/xilinx_can.c index fc55e8e..fcb584f 100644
> > --- a/drivers/net/can/xilinx_can.c
> > +++ b/drivers/net/can/xilinx_can.c
> > @@ -32,6 +32,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #define DRIVER_NAME"xilinx_can"
> >
> > @@ -138,7 +139,7 @@ struct xcan_priv {
> > u32 (*read_reg)(const struct xcan_priv *priv, enum xcan_reg reg);
> > void (*write_reg)(const struct xcan_priv *priv, enum xcan_reg reg,
> > u32 val);
> > -   struct net_device *dev;
> > +   struct device *dev;
> > void __iomem *reg_base;
> > unsigned long irq_flags;
> > struct clk *bus_clk;
> > @@ -843,6 +844,13 @@ static int xcan_open(struct net_device *ndev)
> > struct xcan_priv *priv = netdev_priv(ndev);
> > int ret;
> >
> > +   ret = pm_runtime_get_sync(priv->dev);
> > +   if (ret < 0) {
> > +   netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n",
> > +   __func__, ret);
> > +   return ret;
> > +   }
> > +
> > ret = request_irq(ndev->irq, xcan_interrupt, priv->irq_flags,
> > ndev->name, ndev);
> > if (ret < 0) {
> > @@ -850,29 +858,17 @@ static int xcan_open(struct net_device *ndev)
> > goto err;
> > }
> >
> > -   ret = clk_prepare_enable(priv->can_clk);
> > -   if (ret) {
> > -   netdev_err(ndev, "unable to enable device clock\n");
> > -   goto err_irq;
> > -   }
> > -
> > -   ret = clk_prepare_enable(priv->bus_clk);
> > -   if (ret) {
> > -   netdev_err(ndev, "unable to enable bus clock\n");
> > -   goto err_can_clk;
> > -   }
> > -
> > /* Set chip into reset mode */
> > ret = set_reset_mode(ndev);
> > if (ret < 0) {
> > netdev_err(ndev, "mode resetting failed!\n");
> > -   goto err_bus_clk;
> > +   goto err_irq;
> > }
> >
> > /* Common open */
> > ret = open_candev(ndev);
> > if (ret)
> > -   goto err_bus_clk;
> > +   goto err_irq;
> >
> > ret = xcan_chip_start(ndev);
> > if (ret < 0) {
> > @@ -888,13 +884,11 @@ static int xcan_open(struct net_device *ndev)
> >
> >  err_candev:
> > close_candev(ndev);
> > -err_bus_clk:
> > -   clk_disable_unprepare(priv->bus_clk);
> > -err_can_clk:
> > -   clk_disable_unprepare(priv->can_clk);
> >  err_irq:
> > free_irq(ndev->irq, ndev);
> >  err:
> > +   pm_runtime_put(priv->dev);
> > +
> > return ret;
> >  }
> >
> > @@ -911,12 +905,11 @@ static int xcan_close(struct net_device *ndev)
> > netif_stop_queue(ndev);
> > napi_disable(&priv->napi);
> > xcan_chip_stop(ndev);
> > -   clk_disable_unprepare(priv->bus_clk);
> > -   clk_disable_unprepare(priv->can_clk);
> > free_irq(ndev->irq, ndev);
> > close_candev(ndev);
> >
> > can_led_event(ndev, CAN_LED_EVENT_STOP);
> > +   pm_runtime_put(priv->dev);
> >
> > return 0;
> >  }
> > @@ -935,27 +928,20 @@ static int xcan_get_berr_counter(const struct
> net_device *ndev,
> > struct xcan_priv *priv = netdev_priv(ndev);
> > int ret;
> >
> > -   ret = clk_prepare_enable(priv->can_clk);
> > -   if (ret)
> > -   goto err;
> > -
> > -   ret = clk_prepare_enable(priv->bus_clk);
> > -   if (ret)
> > -   goto err_clk;
> > +   ret = pm_runtime_get_sync(priv->dev);
> > +   if (ret < 0) {
> > +   netdev_err(ndev, "%s:

[-next] WARNING at iwl_mvm_time_event_send_add+0x72/0x1b6

2015-10-26 Thread Sergey Senozhatsky

Hi,

linux-next 20151022


wlp2s0: aborting authentication with 00:04:96:61:cd:e0 by local choice (Reason: 
3=DEAUTH_LEAVING)
[ cut here ]
WARNING: CPU: 0 PID: 1006 at drivers/net/wireless/iwlwifi/mvm/time-event.c:513 
iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]()
Modules linked in: mousedev arc4 nls_iso8859_1 nls_cp437 vfat fat serio_raw 
psmouse atkbd coretemp hwmon i915 libps2 iwlmvm i2c_algo_bit mac80211 
drm_kms_helper cfbfillrect intel_powerclamp syscopyarea cfbimgblt sysfillrect 
sysimgblt crc32c_intel fb_sys_fops cfbcopyarea iwlwifi drm r8
CPU: 0 PID: 1006 Comm: iwconfig Not tainted 
4.3.0-rc6-next-20151022-dbg-2-g4041783-dirty #260
  8800c69479c8 811dd4ad 
 8800c6947a00 8103db4e a04fd261 88041c7cdfc8
 88041cc87a20 88041c7ceb28 8800c6947aac 8800c6947a10
Call Trace:
 [] dump_stack+0x4b/0x63
 [] warn_slowpath_common+0x99/0xb2
 [] ? iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]
 [] warn_slowpath_null+0x1a/0x1c
 [] iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]
 [] ? __lock_is_held+0x3c/0x57
 [] iwl_mvm_protect_session+0x150/0x219 [iwlmvm]
 [] ? iwl_mvm_protect_session+0x150/0x219 [iwlmvm]
 [] ? iwl_mvm_ref_sync+0x37/0x10c [iwlmvm]
 [] iwl_mvm_mac_mgd_prepare_tx+0xa4/0xc2 [iwlmvm]
 [] ? iwl_mvm_mac_mgd_prepare_tx+0xa4/0xc2 [iwlmvm]
 [] ieee80211_mgd_deauth+0x14f/0x3b0 [mac80211]
 [] ? __lock_is_held+0x3c/0x57
 [] ieee80211_deauth+0x18/0x1a [mac80211]
 [] cfg80211_mlme_deauth+0x13c/0x28e [cfg80211]
 [] cfg80211_disconnect+0xb5/0x2f7 [cfg80211]
 [] cfg80211_mgd_wext_siwfreq+0xed/0x160 [cfg80211]
 [] ? cfg80211_wext_freq+0x5f/0x5f [cfg80211]
 [] cfg80211_wext_siwfreq+0x76/0xf6 [cfg80211]
 [] ioctl_standard_call+0x66/0x376
 [] wext_handle_ioctl+0x102/0x16d
 [] dev_ioctl+0x6bb/0x6de
 [] ? handle_mm_fault+0xefc/0x13f9
 [] sock_ioctl+0x230/0x23c
 [] ? sock_ioctl+0x230/0x23c
 [] do_vfs_ioctl+0x458/0x4dc
 [] ? retint_user+0x18/0x20
 [] ? __fget_light+0x4d/0x71
 [] SyS_ioctl+0x43/0x61
 [] entry_SYSCALL_64_fastpath+0x12/0x6f
---[ end trace 6a44e7f1588bdae7 ]---


-ss

Re: [PATCH v2] net: tso: add support for IPv6

2015-10-26 Thread Toshiaki Makita



On 2015/10/26 17:13, Toshiaki Makita wrote:
> On 2015/10/26 16:47, Grumbach, Emmanuel wrote:
>> On 10/26/2015 06:03 AM, Toshiaki Makita wrote:
>>> On 2015/10/26 5:02, Emmanuel Grumbach wrote:
>>>> Adding IPv6 for the TSO helper API is trivial:
>>>> * Don't play with the id (which doesn't exist in IPv6)
>>>> * Correctly update the payload_len (don't include the
>>>>   length of the IP header itself)
>>> ...
>>>>    memcpy(hdr, skb->data, hdr_len);
>>>> -  iph = (struct iphdr *)(hdr + mac_hdr_len);
>>>> -  iph->id = htons(tso->ip_id);
>>>> -  iph->tot_len = htons(size + hdr_len - mac_hdr_len);
>>>> +  if (skb->protocol == htons(ETH_P_IP)) {
>>>
>>> I guess this should be vlan_get_protocol(skb).
>>
>> I truly don't know. I guess we could have VLANs, but I'd need to check
>> what the packet would look like after it exits mac80211.
> 
> I don't know much about mac80211.
> 
> What I see is that mvneta has TSO in vlan_features and it uses
> tso_build_hdr(). When vlan device is used, we cannot access network
> protocol by skb->protocol without HW vlan acceleration.
> So it looks like this change corrupts TSO functionality on mvneta.
> 
>> If we need that, I'll likely do this check once in tso_start() and add a
>> variable to struct tso_t.
> 
> I'm not sure if an additional variable is needed.
> At least, skb_network_offset()/ip_hdr() should correctly handle (skip)
> vlan headers.

Ah, sorry, I misread your suggestion.
An additional variable would make sense to me.

Toshiaki Makita


Re: [PATCH net] macvtap: unbreak receiving of gro skb with frag list

2015-10-26 Thread Michael S. Tsirkin

On Mon, Oct 26, 2015 at 02:53:38PM +0800, Jason Wang wrote:
> 
> 
> On 10/26/2015 02:09 PM, Michael S. Tsirkin wrote:
> > On Mon, Oct 26, 2015 at 11:15:57AM +0800, Jason Wang wrote:
> >>
> >> On 10/23/2015 09:37 PM, Michael S. Tsirkin wrote:
> >>> On Fri, Oct 23, 2015 at 12:57:05AM -0400, Jason Wang wrote:
> >>>> We don't have fraglist support in TAP_FEATURES. This will lead to
> >>>> software segmentation of gro skb with frag list. Fix this by adding
> >>>> frag list support to TAP_FEATURES.
> >>>>
> >>>> With this patch, single-session netperf receive performance is restored
> >>>> from about 5Gb/s to about 12Gb/s on mlx4.
> >>>>
> >>>> Fixes a567dd6252 ("macvtap: simplify usage of tap_features")
> >>>> Cc: Vlad Yasevich 
> >>>> Cc: Michael S. Tsirkin 
> >>>> Signed-off-by: Jason Wang 
> >>> Thanks!
> >>> Does this mean we should look at re-adding NETIF_F_FRAGLIST
> >>> to virtio-net as well?
> >> Not sure I get the point, but probably not. This is for receiving and
> >> skb_copy_datagram_iter() can deal with frag list.
> >
> > Point is:
> > - bridge within guest
> > - assigned device creating gro skbs with frag list bridged to virtio
> 
> I see, but this problem does not look specific to virtio. Most cards do
> not support frag list.

These will be slower when used with a bridge then, won't they?

-- 
MST

[PATCH net-next v8 04/10] qed: Add slowpath L2 support

2015-10-26 Thread Yuval Mintz

From: Manish Chopra 

This patch adds support to qed for configuring various L2 elements,
such as channels and basic filtering conditions.
It also enhances the public API so that qede can later use this
functionality.

Signed-off-by: Manish Chopra 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/qed/qed_dev.c |  114 ++
 drivers/net/ethernet/qlogic/qed/qed_dev_api.h |   58 +
 drivers/net/ethernet/qlogic/qed/qed_hsi.h |  294 +
 drivers/net/ethernet/qlogic/qed/qed_l2.c  | 1605 +
 drivers/net/ethernet/qlogic/qed/qed_main.c|   10 +
 drivers/net/ethernet/qlogic/qed/qed_mcp.c |   16 +
 drivers/net/ethernet/qlogic/qed/qed_mcp.h |   13 +
 drivers/net/ethernet/qlogic/qed/qed_sp.h  |   27 +
 drivers/net/ethernet/qlogic/qed/qed_spq.c |   29 +
 include/linux/qed/qed_eth_if.h|  120 ++
 10 files changed, 2286 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c 
b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 3243cb4..3d1bdbf 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -799,6 +799,60 @@ int qed_hw_stop(struct qed_dev *cdev)
return rc;
 }
 
+void qed_hw_stop_fastpath(struct qed_dev *cdev)
+{
+   int i, j;
+
+   for_each_hwfn(cdev, j) {
+   struct qed_hwfn *p_hwfn = &cdev->hwfns[j];
+   struct qed_ptt *p_ptt   = p_hwfn->p_main_ptt;
+
+   DP_VERBOSE(p_hwfn,
+  NETIF_MSG_IFDOWN,
+  "Shutting down the fastpath\n");
+
+   qed_wr(p_hwfn, p_ptt,
+  NIG_REG_RX_LLH_BRB_GATE_DNTFWD_PERPF, 0x1);
+
+   qed_wr(p_hwfn, p_ptt, PRS_REG_SEARCH_TCP, 0x0);
+   qed_wr(p_hwfn, p_ptt, PRS_REG_SEARCH_UDP, 0x0);
+   qed_wr(p_hwfn, p_ptt, PRS_REG_SEARCH_FCOE, 0x0);
+   qed_wr(p_hwfn, p_ptt, PRS_REG_SEARCH_ROCE, 0x0);
+   qed_wr(p_hwfn, p_ptt, PRS_REG_SEARCH_OPENFLOW, 0x0);
+
+   qed_wr(p_hwfn, p_ptt, TM_REG_PF_ENABLE_CONN, 0x0);
+   qed_wr(p_hwfn, p_ptt, TM_REG_PF_ENABLE_TASK, 0x0);
+   for (i = 0; i < QED_HW_STOP_RETRY_LIMIT; i++) {
+   if ((!qed_rd(p_hwfn, p_ptt,
+TM_REG_PF_SCAN_ACTIVE_CONN)) &&
+   (!qed_rd(p_hwfn, p_ptt,
+TM_REG_PF_SCAN_ACTIVE_TASK)))
+   break;
+
+   usleep_range(1000, 2000);
+   }
+   if (i == QED_HW_STOP_RETRY_LIMIT)
+   DP_NOTICE(p_hwfn,
+ "Timers linear scans are not over [Connection 
%02x Tasks %02x]\n",
+ (u8)qed_rd(p_hwfn, p_ptt,
+TM_REG_PF_SCAN_ACTIVE_CONN),
+ (u8)qed_rd(p_hwfn, p_ptt,
+TM_REG_PF_SCAN_ACTIVE_TASK));
+
+   qed_int_igu_init_pure_rt(p_hwfn, p_ptt, false, false);
+
+   /* Need to wait 1ms to guarantee SBs are cleared */
+   usleep_range(1000, 2000);
+   }
+}
+
+void qed_hw_start_fastpath(struct qed_hwfn *p_hwfn)
+{
+   /* Re-open incoming traffic */
+   qed_wr(p_hwfn, p_hwfn->p_main_ptt,
+  NIG_REG_RX_LLH_BRB_GATE_DNTFWD_PERPF, 0x0);
+}
+
 static int qed_reg_assert(struct qed_hwfn *hwfn,
  struct qed_ptt *ptt, u32 reg,
  bool expected)
@@ -1337,3 +1391,63 @@ void qed_chain_free(struct qed_dev *cdev,
  p_chain->p_virt_addr,
  p_chain->p_phys_addr);
 }
+
+int qed_fw_l2_queue(struct qed_hwfn *p_hwfn,
+   u16 src_id, u16 *dst_id)
+{
+   if (src_id >= RESC_NUM(p_hwfn, QED_L2_QUEUE)) {
+   u16 min, max;
+
+   min = (u16)RESC_START(p_hwfn, QED_L2_QUEUE);
+   max = min + RESC_NUM(p_hwfn, QED_L2_QUEUE);
+   DP_NOTICE(p_hwfn,
+ "l2_queue id [%d] is not valid, available indices [%d 
- %d]\n",
+ src_id, min, max);
+
+   return -EINVAL;
+   }
+
+   *dst_id = RESC_START(p_hwfn, QED_L2_QUEUE) + src_id;
+
+   return 0;
+}
+
+int qed_fw_vport(struct qed_hwfn *p_hwfn,
+u8 src_id, u8 *dst_id)
+{
+   if (src_id >= RESC_NUM(p_hwfn, QED_VPORT)) {
+   u8 min, max;
+
+   min = (u8)RESC_START(p_hwfn, QED_VPORT);
+   max = min + RESC_NUM(p_hwfn, QED_VPORT);
+   DP_NOTICE(p_hwfn,
+ "vport id [%d] is not valid, available indices [%d - 
%d]\n",
+ src_id, min, max);
+
+   return -EINVAL;
+   }
+
+

[PATCH net-next v8 03/10] qede: Add basic Network driver

2015-10-26 Thread Yuval Mintz

The Qlogic Everest Driver for Ethernet is the Ethernet specific module for
QL4xxx ethernet products by Qlogic.

This patch adds a very minimal PCI driver, one that doesn't yet register
a network device, but one that does interact with qed and does a basic
initialization of the HW.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/Kconfig  |   5 +
 drivers/net/ethernet/qlogic/Makefile |   1 +
 drivers/net/ethernet/qlogic/qede/Makefile|   3 +
 drivers/net/ethernet/qlogic/qede/qede.h  |  73 ++
 drivers/net/ethernet/qlogic/qede/qede_main.c | 354 +++
 5 files changed, 436 insertions(+)
 create mode 100644 drivers/net/ethernet/qlogic/qede/Makefile
 create mode 100644 drivers/net/ethernet/qlogic/qede/qede.h
 create mode 100644 drivers/net/ethernet/qlogic/qede/qede_main.c

diff --git a/drivers/net/ethernet/qlogic/Kconfig 
b/drivers/net/ethernet/qlogic/Kconfig
index 58c3fb3..30a6f24 100644
--- a/drivers/net/ethernet/qlogic/Kconfig
+++ b/drivers/net/ethernet/qlogic/Kconfig
@@ -97,4 +97,9 @@ config QED
---help---
  This enables the support for ...
 
+config QEDE
+   tristate "QLogic QED 25/40/100Gb Ethernet NIC"
+   depends on QED
+   ---help---
+ This enables the support for ...
 endif # NET_VENDOR_QLOGIC
diff --git a/drivers/net/ethernet/qlogic/Makefile 
b/drivers/net/ethernet/qlogic/Makefile
index 7600138..cee90e0 100644
--- a/drivers/net/ethernet/qlogic/Makefile
+++ b/drivers/net/ethernet/qlogic/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_QLCNIC) += qlcnic/
 obj-$(CONFIG_QLGE) += qlge/
 obj-$(CONFIG_NETXEN_NIC) += netxen/
 obj-$(CONFIG_QED) += qed/
+obj-$(CONFIG_QEDE)+= qede/
diff --git a/drivers/net/ethernet/qlogic/qede/Makefile 
b/drivers/net/ethernet/qlogic/qede/Makefile
new file mode 100644
index 000..bedfe9f
--- /dev/null
+++ b/drivers/net/ethernet/qlogic/qede/Makefile
@@ -0,0 +1,3 @@
+obj-$(CONFIG_QEDE) := qede.o
+
+qede-y := qede_main.o
diff --git a/drivers/net/ethernet/qlogic/qede/qede.h 
b/drivers/net/ethernet/qlogic/qede/qede.h
new file mode 100644
index 000..7e2bcfa
--- /dev/null
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -0,0 +1,73 @@
+/* QLogic qede NIC Driver
+* Copyright (c) 2015 QLogic Corporation
+*
+* This software is available under the terms of the GNU General Public License
+* (GPL) Version 2, available from the file COPYING in the main directory of
+* this source tree.
+*/
+
+#ifndef _QEDE_H_
+#define _QEDE_H_
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define QEDE_MAJOR_VERSION 8
+#define QEDE_MINOR_VERSION 4
+#define QEDE_REVISION_VERSION  0
+#define QEDE_ENGINEERING_VERSION   0
+#define DRV_MODULE_VERSION __stringify(QEDE_MAJOR_VERSION) "." \
+   __stringify(QEDE_MINOR_VERSION) "." \
+   __stringify(QEDE_REVISION_VERSION) "."  \
+   __stringify(QEDE_ENGINEERING_VERSION)
+
+#define QEDE_ETH_INTERFACE_VERSION 300
+
+#define DRV_MODULE_SYM qede
+
+struct qede_dev {
+   struct qed_dev  *cdev;
+   struct net_device   *ndev;
+   struct pci_dev  *pdev;
+
+   u32 dp_module;
+   u8  dp_level;
+
+   const struct qed_eth_ops*ops;
+
+   struct qed_dev_eth_info dev_info;
+#define QEDE_MAX_RSS_CNT(edev) ((edev)->dev_info.num_queues)
+#define QEDE_MAX_TSS_CNT(edev) ((edev)->dev_info.num_queues * \
+(edev)->dev_info.num_tc)
+
+   u16 num_rss;
+   u8  num_tc;
+#define QEDE_RSS_CNT(edev) ((edev)->num_rss)
+#define QEDE_TSS_CNT(edev) ((edev)->num_rss *  \
+(edev)->num_tc)
+#define QEDE_TSS_IDX(edev, txqidx) ((txqidx) % (edev)->num_rss)
+#define QEDE_TC_IDX(edev, txqidx)  ((txqidx) / (edev)->num_rss)
+
+   struct qed_int_info int_info;
+   unsigned char   primary_mac[ETH_ALEN];
+
+   /* Smaller private variant of the RTNL lock */
+   struct mutexqede_lock;
+   u32 state; /* Protected by qede_lock */
+};
+
+/* Debug print definitions */
+#define DP_NAME(edev) ((edev)->ndev->name)
+
+#endif /* _QEDE_H_ */
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c 
b/drivers/net/ethernet/qlogic/qede/qede_main.c
new file mode 100644
index 000..02ed6db
--- /dev/null
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -0,0 +1,354 @@
+/* QLogic qede NIC Driver
+* Copyright (c) 2015 QLogic Corporation
+*
+* This software is available under the terms of the GNU General Public License
+* (GPL) Version 2, available from

Re: [Xen-devel] [PATCH net-next 0/8] xen-netback/core: packet hashing

2015-10-26 Thread David Vrabel

On 24/10/15 12:55, David Miller wrote:
> From: Paul Durrant 
> Date: Wed, 21 Oct 2015 11:36:17 +0100
> 
>> This series adds xen-netback support for hash negotiation with a frontend
>> driver, and an implementation of toeplitz hashing as the initial negotiable
>> algorithm.
> 
> Ping, I want to see some review from some other xen networking folks.

There's been some review of the front/back protocol (on a different
thread) and some significant changes have been suggested.

David

Re: [Xen-devel] [PATCH net-next 0/8] xen-netback/core: packet hashing

2015-10-26 Thread David Miller

From: David Vrabel 
Date: Mon, 26 Oct 2015 10:38:50 +

> On 24/10/15 12:55, David Miller wrote:
>> From: Paul Durrant 
>> Date: Wed, 21 Oct 2015 11:36:17 +0100
>> 
>>> This series adds xen-netback support for hash negotiation with a frontend
>>> driver, and an implementation of toeplitz hashing as the initial negotiable
>>> algorithm.
>> 
>> Ping, I want to see some review from some other xen networking folks.
> 
> There's been some review of the front/back protocol (on a different
> thread) and some significant changes have been suggested.

Ok, I'll mark this series as "changes requested" then, thanks.

Re: [RFC PATCH net-next] net/core: initial support for stacked dev feature toggles

2015-10-26 Thread Michal Kubecek

On Fri, Oct 23, 2015 at 10:51:09PM -0700, Alexander Duyck wrote:
> On 10/23/2015 08:40 PM, Jarod Wilson wrote:
> >
> >+static netdev_features_t netdev_sync_upper_features(struct net_device 
> >*lower,
> >+struct net_device *upper, netdev_features_t features)
> >+{
> >+netdev_features_t want = upper->wanted_features & lower->hw_features;
> >+
> >+if (!(upper->wanted_features & NETIF_F_LRO)
> >+&& (features & NETIF_F_LRO)) {
> >+netdev_info(lower, "Dropping LRO, upper dev %s has it off.\n",
> >+   upper->name);
> >+features &= ~NETIF_F_LRO;
> >+} else if ((want & NETIF_F_LRO) && !(features & NETIF_F_LRO)) {
> >+netdev_info(lower, "Keeping LRO, upper dev %s has it on.\n",
> >+   upper->name);
> >+features |= NETIF_F_LRO;
> >+}
> >+
> >+return features;
> >+}
> >+
> 
> I'd say to drop the second half of this statement.  LRO is a feature
> that should be enabled explicitly per interface.  If someone enables
> LRO on the master they may only want it on one interface.  The fact
> is there are some implementations of LRO that work better than
> others so you want to give the end user the option to mix and match.

Agreed. IMHO it makes sense to allow setups with LRO disabled on some
slaves and enabled on others.

Also, the logic seems to only consider the 1 upper : N lower scheme
(bond, team) but we also have N upper : 1 lower setups (vlan, macvlan).
For these, there is no way to propagate both 0 and 1 down as this would
result in a conflict.

> >+static void netdev_sync_lower_features(struct net_device *upper,
> >+struct net_device *lower, netdev_features_t features)
> >+{
> >+netdev_features_t want = features & lower->hw_features;
> >+
> >+if (!(features & NETIF_F_LRO) && (lower->features & NETIF_F_LRO)) {
> >+netdev_info(upper, "Disabling LRO on lower dev %s.\n",
> >+   lower->name);
> >+upper->wanted_features &= ~NETIF_F_LRO;
> >+lower->wanted_features &= ~NETIF_F_LRO;
> >+netdev_update_features(lower);
> >+if (unlikely(lower->features & NETIF_F_LRO))
> >+netdev_WARN(upper, "failed to disable LRO on %s!\n",
> >+lower->name);
> >+} else if ((want & NETIF_F_LRO) && !(lower->features & NETIF_F_LRO)) {
> >+netdev_info(upper, "Enabling LRO on lower dev %s.\n",
> >+   lower->name);
> >+upper->wanted_features |= NETIF_F_LRO;
> >+lower->wanted_features |= NETIF_F_LRO;
> >+netdev_update_features(lower);
> >+if (unlikely(!(lower->features & NETIF_F_LRO)))
> >+netdev_WARN(upper, "failed to enable LRO on %s!\n",
> >+lower->name);
> >+}
> >+}
> >+
> 
> Same thing here.  If a lower dev has it disabled then leave it
> disabled.  I believe your goal is to make it so that
> dev_disable_lro() can shut down LRO when it is making packets in the
> data-path unusable.

This is already the case since commit fbe168ba91f7 ("net: generic
dev_disable_lro() stacked device handling"). That commit makes sure
dev_disable_lro() is propagated down the stack and also makes sure new
slaves added to a bond/team with LRO disabled have it disabled too.

What it does not do is propagate the disabling of LRO down the stack when
LRO is disabled in ways that do not call dev_disable_lro() (e.g. via
ethtool). I'm not sure if this should be done or not; both options have
their pros and cons. However, I believe enabling LRO shouldn't be
propagated down.
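
To make the asymmetry concrete, here is a small userspace sketch of one of the options discussed above: clearing LRO is pushed from the upper device to the lower one, while setting it is not. The struct, the flag and the function name are invented for the example; this is not the netdev_features_t API and not what any posted patch does.

```c
#include <stdint.h>

/* Hypothetical feature bit standing in for NETIF_F_LRO. */
#define SKETCH_F_LRO (1u << 0)

/* Hypothetical, heavily reduced device model. */
struct sketch_dev {
	uint32_t hw_features;	/* what the hardware could do */
	uint32_t features;	/* what is currently enabled */
};

/*
 * Sync LRO from an upper device to one lower device.
 * Disable propagates down; enable never does, so the user stays free
 * to run LRO on only some of the lower devices.
 */
static void sketch_sync_lower_lro(const struct sketch_dev *upper,
				  struct sketch_dev *lower)
{
	if (!(upper->features & SKETCH_F_LRO))
		lower->features &= ~SKETCH_F_LRO;
	/* LRO on in the upper device: leave the lower device untouched. */
}
```

With N upper devices on one lower device, a rule like this never produces a conflict in the enable direction, because it never forces the bit on.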

 Michal Kubecek


Re: [PATCH net-next 2/3] bpf: introduce bpf_perf_event_output() helper

2015-10-26 Thread Alexei Starovoitov


On 10/25/15 6:46 PM, Wangnan (F) wrote:

> Can we (or have we already) set up some rules for licensing? Which part
> should be GPL? Who has the responsibility to decide it?


In my mind the rules were set long ago. See my other email.


[PATCH] net: fsl: expands dependencies of NET_VENDOR_FREESCALE

2015-10-26 Thread shh.xie

From: Shaohui Xie 

Freescale has some ARMv8-based SoCs, and the generic ARCH_LAYERSCAPE
symbol is used to cover them. Add ARCH_LAYERSCAPE to the dependencies
of NET_VENDOR_FREESCALE to support networking on those SoCs.

The patch to add ARCH_LAYERSCAPE can be viewed at:
http://www.spinics.net/lists/arm-kernel/msg455583.html

Signed-off-by: Shaohui Xie 
---
 drivers/net/ethernet/freescale/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/Kconfig 
b/drivers/net/ethernet/freescale/Kconfig
index ff76d4e..bee32a9 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -7,7 +7,8 @@ config NET_VENDOR_FREESCALE
default y
depends on FSL_SOC || QUICC_ENGINE || CPM1 || CPM2 || PPC_MPC512x || \
   M523x || M527x || M5272 || M528x || M520x || M532x || \
-  ARCH_MXC || ARCH_MXS || (PPC_MPC52xx && PPC_BESTCOMM)
+  ARCH_MXC || ARCH_MXS || (PPC_MPC52xx && PPC_BESTCOMM) || \
+  ARCH_LAYERSCAPE
---help---
  If you have a network (Ethernet) card belonging to this class, say Y.
 
-- 
2.1.0.27.g96db324


Re: [PATCH v2] net: tso: add support for IPv6

2015-10-26 Thread Grumbach, Emmanuel



On 10/26/2015 10:14 AM, Toshiaki Makita wrote:
> On 2015/10/26 16:47, Grumbach, Emmanuel wrote:
>> On 10/26/2015 06:03 AM, Toshiaki Makita wrote:
>>> On 2015/10/26 5:02, Emmanuel Grumbach wrote:
>>>> Adding IPv6 for the TSO helper API is trivial:
>>>> * Don't play with the id (which doesn't exist in IPv6)
>>>> * Correctly update the payload_len (don't include the
>>>>   length of the IP header itself)
>>> ...
>>>>    memcpy(hdr, skb->data, hdr_len);
>>>> -  iph = (struct iphdr *)(hdr + mac_hdr_len);
>>>> -  iph->id = htons(tso->ip_id);
>>>> -  iph->tot_len = htons(size + hdr_len - mac_hdr_len);
>>>> +  if (skb->protocol == htons(ETH_P_IP)) {
>>>
>>> I guess this should be vlan_get_protocol(skb).
>>
>> I truly don't know. I guess we could have VLANs, but I'd need to check
>> what the packet would look like after it exits mac80211.
> 
> I don't know much about mac80211.
> 
> What I see is that mvneta has TSO in vlan_features and it uses
> tso_build_hdr(). When vlan device is used, we cannot access network
> protocol by skb->protocol without HW vlan acceleration.
> So it looks like this change corrupts TSO functionality on mvneta.

Convincing enough :)

> 
>> If we need that, I'll likely do this check once in tso_start() and add a
>> variable to struct tso_t.
> 
> I'm not sure if an additional variable is needed.
> At least, skb_network_offset()/ip_hdr() should correctly handle (skip)
> vlan headers.
> 

Adding a bool to tso_t is more an optimisation than something needed for
correctness. It avoids calling vlan_get_protocol(skb) for each MSS.
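
As a rough userspace illustration of both points (the VLAN tag hides the network protocol, and the lookup only needs to happen once per skb), consider the sketch below. It parses a raw Ethernet frame instead of an skb; the constants mirror the well-known EtherType values, but every identifier is invented for the example and none of this is the kernel's vlan_get_protocol().

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_P_IP		0x0800
#define SKETCH_ETH_P_IPV6	0x86DD
#define SKETCH_ETH_P_8021Q	0x8100
#define SKETCH_ETH_P_8021AD	0x88A8

/*
 * Return the network-layer EtherType of a raw Ethernet frame,
 * skipping any number of VLAN tags. Returns 0 if the frame is too
 * short to tell.
 */
static uint16_t sketch_l3_proto(const uint8_t *frame, size_t len)
{
	size_t off = 12;	/* destination MAC + source MAC */
	uint16_t proto;

	for (;;) {
		if (off + 2 > len)
			return 0;
		proto = (uint16_t)(frame[off] << 8 | frame[off + 1]);
		if (proto != SKETCH_ETH_P_8021Q && proto != SKETCH_ETH_P_8021AD)
			return proto;
		off += 4;	/* skip TPID (2 bytes) + TCI (2 bytes) */
	}
}

/* Hypothetical per-packet state, reduced to the cached flag. */
struct sketch_tso {
	int ipv6;
};

/* Look the protocol up once; later per-segment code just reads the flag. */
static void sketch_tso_start(struct sketch_tso *tso,
			     const uint8_t *frame, size_t len)
{
	tso->ipv6 = sketch_l3_proto(frame, len) == SKETCH_ETH_P_IPV6;
}
```

Reading the two bytes at offset 12 directly, which is roughly what checking skb->protocol amounts to without HW VLAN acceleration, would yield 0x8100 for a tagged frame and send an IPv6 packet down the wrong branch.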

> Toshiaki Makita
> 
> 

[PATCH v3] net: tso: add support for IPv6

2015-10-26 Thread Emmanuel Grumbach

Adding IPv6 for the TSO helper API is trivial:
* Don't play with the id (which doesn't exist in IPv6)
* Correctly update the payload_len (don't include the
  length of the IP header itself)

Signed-off-by: Emmanuel Grumbach 
---
v3: use vlan_get_protocol and call it once in tso_start
store the result in tso_t
---
 include/net/tso.h |  1 +
 net/core/tso.c| 18 +-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/net/tso.h b/include/net/tso.h
index 47e5444..b7be852 100644
--- a/include/net/tso.h
+++ b/include/net/tso.h
@@ -8,6 +8,7 @@ struct tso_t {
void *data;
size_t size;
u16 ip_id;
+   bool ipv6;
u32 tcp_seq;
 };
 
diff --git a/net/core/tso.c b/net/core/tso.c
index 630b30b..5dca7ce 100644
--- a/net/core/tso.c
+++ b/net/core/tso.c
@@ -1,4 +1,5 @@
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -14,18 +15,24 @@ EXPORT_SYMBOL(tso_count_descs);
 void tso_build_hdr(struct sk_buff *skb, char *hdr, struct tso_t *tso,
   int size, bool is_last)
 {
-   struct iphdr *iph;
struct tcphdr *tcph;
int hdr_len = skb_transport_offset(skb) + tcp_hdrlen(skb);
int mac_hdr_len = skb_network_offset(skb);
 
memcpy(hdr, skb->data, hdr_len);
-   iph = (struct iphdr *)(hdr + mac_hdr_len);
-   iph->id = htons(tso->ip_id);
-   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   if (!tso->ipv6) {
+   struct iphdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->id = htons(tso->ip_id);
+   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
+   tso->ip_id++;
+   } else {
+   struct ipv6hdr *iph = (void *)(hdr + mac_hdr_len);
+
+   iph->payload_len = htons(size + tcp_hdrlen(skb));
+   }
tcph = (struct tcphdr *)(hdr + skb_transport_offset(skb));
 put_unaligned_be32(tso->tcp_seq, &tcph->seq);
-   tso->ip_id++;
 
if (!is_last) {
/* Clear all special flags for not last packet */
@@ -61,6 +68,7 @@ void tso_start(struct sk_buff *skb, struct tso_t *tso)
tso->ip_id = ntohs(ip_hdr(skb)->id);
tso->tcp_seq = ntohl(tcp_hdr(skb)->seq);
tso->next_frag_idx = 0;
+   tso->ipv6 = vlan_get_protocol(skb) == htons(ETH_P_IPV6);
 
/* Build first data */
tso->size = skb_headlen(skb) - hdr_len;
-- 
2.1.4


Re: [PATCH net] macvtap: unbreak receiving of gro skb with frag list

2015-10-26 Thread Michael S. Tsirkin

On Mon, Oct 26, 2015 at 11:15:57AM +0800, Jason Wang wrote:
> 
> 
> On 10/23/2015 09:37 PM, Michael S. Tsirkin wrote:
> > On Fri, Oct 23, 2015 at 12:57:05AM -0400, Jason Wang wrote:
> >> We don't have fraglist support in TAP_FEATURES. This will lead
> >> software segmentation of gro skb with frag list. Fixes by having
> >> frag list support in TAP_FEATURES.
> >>
> >> With this patch single session of netperf receiving were restored from
> >> about 5Gb/s to about 12Gb/s on mlx4.
> >>
> >> Fixes a567dd6252 ("macvtap: simplify usage of tap_features")
> >> Cc: Vlad Yasevich 
> >> Cc: Michael S. Tsirkin 
> >> Signed-off-by: Jason Wang 
> > Thanks!
> > Does this mean we should look at re-adding NETIF_F_FRAGLIST
> > to virtio-net as well?
> 
> Not sure I get the point, but probably not. This is for receiving and
> skb_copy_datagram_iter() can deal with frag list.


Point is:
- bridge within guest
- assigned device creating gro skbs with frag list bridged to virtio

> >
> >> ---
> >>  drivers/net/macvtap.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> >> index 248478c..197c939 100644
> >> --- a/drivers/net/macvtap.c
> >> +++ b/drivers/net/macvtap.c
> >> @@ -137,7 +137,7 @@ static const struct proto_ops macvtap_socket_ops;
> >>  #define TUN_OFFLOADS (NETIF_F_HW_CSUM | NETIF_F_TSO_ECN | NETIF_F_TSO | \
> >>  NETIF_F_TSO6 | NETIF_F_UFO)
> >>  #define RX_OFFLOADS (NETIF_F_GRO | NETIF_F_LRO)
> >> -#define TAP_FEATURES (NETIF_F_GSO | NETIF_F_SG)
> >> +#define TAP_FEATURES (NETIF_F_GSO | NETIF_F_SG | NETIF_F_FRAGLIST)
> >>  
> >>  static struct macvlan_dev *macvtap_get_vlan_rcu(const struct net_device 
> >> *dev)
> >>  {
> >> -- 
> >> 1.8.3.1

Re: [PATCH net] macvtap: unbreak receiving of gro skb with frag list

2015-10-26 Thread Jason Wang



On 10/26/2015 02:09 PM, Michael S. Tsirkin wrote:
> On Mon, Oct 26, 2015 at 11:15:57AM +0800, Jason Wang wrote:
>>
>> On 10/23/2015 09:37 PM, Michael S. Tsirkin wrote:
>>> On Fri, Oct 23, 2015 at 12:57:05AM -0400, Jason Wang wrote:
>>>> We don't have fraglist support in TAP_FEATURES. This will lead
>>>> software segmentation of gro skb with frag list. Fixes by having
>>>> frag list support in TAP_FEATURES.
>>>>
>>>> With this patch single session of netperf receiving were restored from
>>>> about 5Gb/s to about 12Gb/s on mlx4.
>>>>
>>>> Fixes a567dd6252 ("macvtap: simplify usage of tap_features")
>>>> Cc: Vlad Yasevich 
>>>> Cc: Michael S. Tsirkin 
>>>> Signed-off-by: Jason Wang 
>>> Thanks!
>>> Does this mean we should look at re-adding NETIF_F_FRAGLIST
>>> to virtio-net as well?
>> Not sure I get the point, but probably not. This is for receiving and
>> skb_copy_datagram_iter() can deal with frag list.
>
> Point is:
> - bridge within guest
> - assigned device creating gro skbs with frag list bridged to virtio

I see, but this problem does not look specific to virtio. Most cards do
not support frag list.


Re: [PATCH v2] net: tso: add support for IPv6

2015-10-26 Thread Toshiaki Makita

On 2015/10/26 16:47, Grumbach, Emmanuel wrote:
> On 10/26/2015 06:03 AM, Toshiaki Makita wrote:
>> On 2015/10/26 5:02, Emmanuel Grumbach wrote:
>>> Adding IPv6 for the TSO helper API is trivial:
>>> * Don't play with the id (which doesn't exist in IPv6)
>>> * Correctly update the payload_len (don't include the
>>>   length of the IP header itself)
>> ...
>>> memcpy(hdr, skb->data, hdr_len);
>>> -   iph = (struct iphdr *)(hdr + mac_hdr_len);
>>> -   iph->id = htons(tso->ip_id);
>>> -   iph->tot_len = htons(size + hdr_len - mac_hdr_len);
>>> +   if (skb->protocol == htons(ETH_P_IP)) {
>>
>> I guess this should be vlan_get_protocol(skb).
> 
> I truly don't know. I guess we could have VLANs, but I'd need to check
> how the packet would look like after it exits mac80211.

I don't know much about mac80211.

What I see is that mvneta has TSO in vlan_features and it uses
tso_build_hdr(). When vlan device is used, we cannot access network
protocol by skb->protocol without HW vlan acceleration.
So it looks like this change corrupts TSO functionality on mvneta.

> If we need that, I'll likely do this check once in tso_start() and add a
> variable to struct tso_t.

I'm not sure if an additional variable is needed.
At least, skb_network_offset()/ip_hdr() should correctly handle (skip)
vlan headers.
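
The VLAN point can be illustrated with a userspace sketch. This is not
the kernel's vlan_get_protocol(); frame_l3_proto is a made-up helper
that walks 802.1Q/802.1ad tags in a raw Ethernet frame. It shows why
the outer EtherType alone is not enough when the tag is still part of
the packet data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14
#define VLAN_HLEN    4
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Read a big-endian 16-bit value byte by byte. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Return the first non-VLAN EtherType, or 0 if the frame is too short. */
static uint16_t frame_l3_proto(const uint8_t *frame, size_t len)
{
	size_t off = ETH_HLEN - 2; /* offset of the outer EtherType */
	uint16_t proto;

	assert(frame != NULL);
	if (len < ETH_HLEN)
		return 0;

	proto = read_be16(frame + off);
	while (proto == ETH_P_8021Q || proto == ETH_P_8021AD) {
		off += VLAN_HLEN; /* skip the tag, land on the inner type */
		if (len < off + 2)
			return 0;
		proto = read_be16(frame + off);
	}
	return proto;
}
```

For a tagged frame the outer type reads 0x8100, so a plain comparison
against ETH_P_IP or ETH_P_IPV6 matches neither branch.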

Toshiaki Makita


Re: [linuxwifi] [-next] WARNING at iwl_mvm_time_event_send_add+0x72/0x1b6

2015-10-26 Thread Grumbach, Emmanuel

Hi,

On 10/26/2015 08:41 AM, Sergey Senozhatsky wrote:
> Hi,
> 
> linux-next 20151022
> 
> 

Can it be reproduced reliably?
Seems like a bad race between the end of session protection for the
authentication and the start of the session protection for the deauth.
I think I found the hole in the locks in there, but it is going to be
tricky to solve.

> wlp2s0: aborting authentication with 00:04:96:61:cd:e0 by local choice 
> (Reason: 3=DEAUTH_LEAVING)
> [ cut here ]
> WARNING: CPU: 0 PID: 1006 at 
> drivers/net/wireless/iwlwifi/mvm/time-event.c:513 
> iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]()
> Modules linked in: mousedev arc4 nls_iso8859_1 nls_cp437 vfat fat serio_raw 
> psmouse atkbd coretemp hwmon i915 libps2 iwlmvm i2c_algo_bit mac80211 
> drm_kms_helper cfbfillrect intel_powerclamp syscopyarea cfbimgblt sysfillrect 
> sysimgblt crc32c_intel fb_sys_fops cfbcopyarea iwlwifi drm r8
> CPU: 0 PID: 1006 Comm: iwconfig Not tainted 
> 4.3.0-rc6-next-20151022-dbg-2-g4041783-dirty #260
>   8800c69479c8 811dd4ad 
>  8800c6947a00 8103db4e a04fd261 88041c7cdfc8
>  88041cc87a20 88041c7ceb28 8800c6947aac 8800c6947a10
> Call Trace:
>  [] dump_stack+0x4b/0x63
>  [] warn_slowpath_common+0x99/0xb2
>  [] ? iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]
>  [] warn_slowpath_null+0x1a/0x1c
>  [] iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]
>  [] ? __lock_is_held+0x3c/0x57
>  [] iwl_mvm_protect_session+0x150/0x219 [iwlmvm]
>  [] ? iwl_mvm_protect_session+0x150/0x219 [iwlmvm]
>  [] ? iwl_mvm_ref_sync+0x37/0x10c [iwlmvm]
>  [] iwl_mvm_mac_mgd_prepare_tx+0xa4/0xc2 [iwlmvm]
>  [] ? iwl_mvm_mac_mgd_prepare_tx+0xa4/0xc2 [iwlmvm]
>  [] ieee80211_mgd_deauth+0x14f/0x3b0 [mac80211]
>  [] ? __lock_is_held+0x3c/0x57
>  [] ieee80211_deauth+0x18/0x1a [mac80211]
>  [] cfg80211_mlme_deauth+0x13c/0x28e [cfg80211]
>  [] cfg80211_disconnect+0xb5/0x2f7 [cfg80211]
>  [] cfg80211_mgd_wext_siwfreq+0xed/0x160 [cfg80211]
>  [] ? cfg80211_wext_freq+0x5f/0x5f [cfg80211]
>  [] cfg80211_wext_siwfreq+0x76/0xf6 [cfg80211]
>  [] ioctl_standard_call+0x66/0x376
>  [] wext_handle_ioctl+0x102/0x16d
>  [] dev_ioctl+0x6bb/0x6de
>  [] ? handle_mm_fault+0xefc/0x13f9
>  [] sock_ioctl+0x230/0x23c
>  [] ? sock_ioctl+0x230/0x23c
>  [] do_vfs_ioctl+0x458/0x4dc
>  [] ? retint_user+0x18/0x20
>  [] ? __fget_light+0x4d/0x71
>  [] SyS_ioctl+0x43/0x61
>  [] entry_SYSCALL_64_fastpath+0x12/0x6f
> ---[ end trace 6a44e7f1588bdae7 ]---
> 
> 
>   -ss
> -
> linuxw...@eclists.intel.com
> https://eclists.intel.com/sympa/info/linuxwifi
> Unsubscribe by sending email to sy...@eclists.intel.com with subject 
> "Unsubscribe linuxwifi"
> 

[PATCH net-next v8 00/10] Add new drivers: qed & qede

2015-10-26 Thread Yuval Mintz

From: Ariel Elior 

This series implements the driver set for Qlogic's new QL4xxx series.
These are 10/20/25/40/50/100 Gig capable converged nics, supporting
ethernet (obviously), iscsi, fcoe, roce and iwarp protocols.

The overall driver design includes a common module ('qed') and protocol
specific dependent modules for ethernet ('qede'), fcoe ('qedf'),
iscsi ('qedi') and roce ('qedr').
The common module contains all of the common logic, e.g. initialization,
cleanup, infrastructure for interrupt handling, link management, slowpath
etc. as well as protocol agnostic features, and supplying an abstraction
layer for other modules.
The protocol specific modules can be compiled and operated independently
of each other, with the exception of the rdma modules which are dependent
on the ethernet module, in accordance with the kernel rdma stack design.

This series only adds the core and ethernet modules, with basic L2
capabilities. Future series will add the rest of the modules and enhance
the L2 functionality.

This patch series consists of the following patches:
qed:  Add module with basic common support
qed:  Add basic L2 interface
qede: Add basic Network driver
qed:  Add slowpath L2 support
qede: Add basic network device support
qede: Add classification configuration
qed:  Add link support
qede: Add support for link
qed:  Add statistics support
qede: Add basic ethtool support

This project is a team effort, thanks go to Yuval Mintz, Dmitry Kravkov,
Michal Kalderon, Tomer Tayar, Manish Chopra, Sudarsana Kalluru,
Rajesh Borundia, Sony Chacko, Artum Zolotushko, Harish Patil, Rasesh Mody,
Sergey Ukhterov and Elad Manela, as well as former team members,
Eilon Greenstein and Shmulik Ravid.

Changes from previous version:
-

From Version 7:
  - Various small fixes according to Dave's suggestions; Largest change
    [code-wise] - don't use tabs for indenting function arguments.

From Version 6:
  - Reduced the number of arguments for functions with exceptionally
    high number of parameters.

From Version 5:
  - Style change and fixes [mostly in 1, 4 and 7].
    Thanks go to Francois Romieu, a mere mortal. ;-)

From Version 4:
  - Drop dependency for x86_64.

From Version 3:
  - Limit support of initial submission to x86_64.
  - Fix endian problems appearing via sparse [although no BE support yet].
  - Fix small issues suggested by the kbuild test robot.

From Version 2:
  - Removed U64_{HI,LO}; Using {upper,lower}_32_bits instead.
  - Use regular napi weight definition.
  - [We still use the __le variants for variables, since we didn't get
     a reply regarding the change into non-user API types].

From Version 1:
  - Removed private license file; Instead revised comments at source headers.

Thanks,
Ariel Elior

Re: Linux 4.2.4

2015-10-26 Thread Gerhard Wiesinger


On 25.10.2015 22:53, Jozsef Kadlecsik wrote:

On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:


Any further ideas?

Does it crash without counters? That could narrow down where to look for.




Hello Jozsef,

it doesn't crash if I don't use the counters so far. So there must be a 
bug with the counters.


Any idea for the root cause?

Thnx.

Ciao,
Gerhard


Re: [PATCH v2] net: tso: add support for IPv6

2015-10-26 Thread Grumbach, Emmanuel



On 10/26/2015 06:03 AM, Toshiaki Makita wrote:
> On 2015/10/26 5:02, Emmanuel Grumbach wrote:
>> Adding IPv6 for the TSO helper API is trivial:
>> * Don't play with the id (which doesn't exist in IPv6)
>> * Correctly update the payload_len (don't include the
>>   length of the IP header itself)
> ...
>>  memcpy(hdr, skb->data, hdr_len);
>> -iph = (struct iphdr *)(hdr + mac_hdr_len);
>> -iph->id = htons(tso->ip_id);
>> -iph->tot_len = htons(size + hdr_len - mac_hdr_len);
>> +if (skb->protocol == htons(ETH_P_IP)) {
> 
> I guess this should be vlan_get_protocol(skb).

I truly don't know. I guess we could have VLANs, but I'd need to check
how the packet would look like after it exits mac80211.
If we need that, I'll likely do this check once in tso_start() and add a
variable to struct tso_t.

> 
>> +struct iphdr *iph = (void *)(hdr + mac_hdr_len);
>> +
>> +iph->id = htons(tso->ip_id);
>> +iph->tot_len = htons(size + hdr_len - mac_hdr_len);
>> +tso->ip_id++;
>> +} else if (skb->protocol == htons(ETH_P_IPV6)) {
> 
> Likewise.
> 
> Toshiaki Makita
> 
> 

Re: [linuxwifi] [-next] WARNING at iwl_mvm_time_event_send_add+0x72/0x1b6

2015-10-26 Thread Grumbach, Emmanuel



On 10/26/2015 10:30 AM, Sergey Senozhatsky wrote:
> On (10/26/15 07:51), Grumbach, Emmanuel wrote:
>>> On 10/26/2015 08:41 AM, Sergey Senozhatsky wrote:
>>>> Hi,
>>>>
>>>> linux-next 20151022
>>>>
>>>>
>>>
>>> Can be reproduced reliably?
>>> Seems like a bad race between the end of session protection for the
>>> authentication and the start of the session protection for the deauth.
>>> I think I found the hole in the locks in there, but it is going to be
>>> tricky to solve.
>>
>> Not sure if I found the race. Can you please send the complete log?
>> If you have timestamps, it'd greatly help...
>> dmesg output should do.
>>
> 
> Hi,
> 
> not really sure if I can reproduce this one easily. seen once.
> 

I see... This log suggests you have 2 entities trying to
control the wlan interface: as soon as we are associated, someone kicks
us out.
What I *think* is happening here is that the session protection for the
authentication is launched (command sent to the firmware), but not
started yet (at least the driver hasn't received the notification from
the firmware) until someone (who?) wants to deauth.
At that stage, te_data->running is false, and te_data->id is valid.
To send the deauth frame (why are we even sending it since we are not
authenticated?), another session protection is started, and here you hit
the warning.
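
The sequence described above can be written down as a toy state
machine. This is only a model of the explanation in this mail, not the
iwlmvm code: te_model, TE_ID_INVALID and the helper names are invented
here, and the real driver has locking and timing checks that are left
out. The id becomes valid when the command is sent, while running only
becomes true when the firmware notification arrives, so a second
request issued in between finds a pending event.

```c
#include <assert.h>
#include <stdbool.h>

#define TE_ID_INVALID (-1) /* hypothetical sentinel, not the driver's constant */

struct te_model {
	int id;       /* valid once the add command has been sent */
	bool running; /* set only when the firmware's start notification arrives */
};

static void te_init(struct te_model *te)
{
	te->id = TE_ID_INVALID;
	te->running = false;
}

/* Returns 0 on success, -1 where the driver would WARN: event already pending. */
static int te_send_add(struct te_model *te, int id)
{
	assert(id != TE_ID_INVALID);
	if (te->id != TE_ID_INVALID)
		return -1;
	te->id = id;
	return 0;
}

/* Firmware notification: the time event has actually started. */
static void te_fw_started(struct te_model *te)
{
	te->running = true;
}

/* A protection request that only looks at 'running' to decide on reuse. */
static int te_protect(struct te_model *te, int id)
{
	if (te->running)
		return 0;           /* existing protection is reused */
	return te_send_add(te, id); /* otherwise schedule a new event */
}
```

Calling te_protect() twice without te_fw_started() in between is the
window in question: the second call returns -1.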

This scenario is highly improbable and besides the WARNING (which is an
issue, I admit), there is no undesirable behavior.
I lean towards leaving the code as is for now instead of playing with
locks and making the code nasty. I think I do want to leave the WARNING in
place despite the fact that we see it *can* happen. But this is *so*
rare, that I prefer to have it so that it can catch other (real?) issues.

The interesting point here is why mac80211 tries to deauth when we are
not authenticated yet. I think I've seen a patch for that, but I'd need
to check.

> ---
> 
> Oct 26 15:20:51  dhclient[399]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 7
> Oct 26 15:20:58  dhclient[399]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 17
> Oct 26 15:21:09  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 7
> Oct 26 15:21:09  kernel: wlp2s0: authenticate with 00:04:96:69:0d:80
> Oct 26 15:21:09  kernel: wlp2s0: send auth to 00:04:96:69:0d:80 (try 1/3)
> Oct 26 15:21:09  kernel: wlp2s0: authenticated
> Oct 26 15:21:09  kernel: wlp2s0: associate with 00:04:96:69:0d:80 (try 1/3)
> Oct 26 15:21:09  kernel: wlp2s0: RX AssocResp from 00:04:96:69:0d:80 
> (capab=0x11 status=0 aid=24)
> Oct 26 15:21:09  kernel: wlp2s0: associated
> Oct 26 15:21:09  kernel: IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes 
> ready
> Oct 26 15:21:09  kernel: wlp2s0: deauthenticating from 00:04:96:69:0d:80 by 
> local choice (Reason: 3=DEAUTH_LEAVING)
> Oct 26 15:21:12  kernel: wlp2s0: authenticate with 00:04:96:61:e9:f0
> Oct 26 15:21:12  kernel: wlp2s0: send auth to 00:04:96:61:e9:f0 (try 1/3)
> Oct 26 15:21:12  kernel: wlp2s0: authenticated
> Oct 26 15:21:12  kernel: wlp2s0: associate with 00:04:96:61:e9:f0 (try 1/3)
> Oct 26 15:21:12  kernel: wlp2s0: RX AssocResp from 00:04:96:61:e9:f0 
> (capab=0x11 status=0 aid=16)
> Oct 26 15:21:12  kernel: wlp2s0: associated
> Oct 26 15:21:12  kernel: wlp2s0: deauthenticating from 00:04:96:61:e9:f0 by 
> local choice (Reason: 3=DEAUTH_LEAVING)
> Oct 26 15:21:16  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 10
> Oct 26 15:21:22  kernel: wlp2s0: authenticate with 00:04:96:69:0d:80
> Oct 26 15:21:22  kernel: wlp2s0: send auth to 00:04:96:69:0d:80 (try 1/3)
> Oct 26 15:21:22  kernel: wlp2s0: authenticated
> Oct 26 15:21:22  kernel: wlp2s0: associate with 00:04:96:69:0d:80 (try 1/3)
> Oct 26 15:21:22  kernel: wlp2s0: RX AssocResp from 00:04:96:69:0d:80 
> (capab=0x11 status=0 aid=25)
> Oct 26 15:21:22  kernel: wlp2s0: associated
> Oct 26 15:21:22  kernel: wlp2s0: deauthenticating from 00:04:96:69:0d:80 by 
> local choice (Reason: 3=DEAUTH_LEAVING)
> Oct 26 15:21:26  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 9
> Oct 26 15:21:35  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 12
> Oct 26 15:21:47  kernel: wlp2s0: authenticate with 00:04:96:69:0d:80
> Oct 26 15:21:47  kernel: wlp2s0: send auth to 00:04:96:69:0d:80 (try 1/3)
> Oct 26 15:21:47  kernel: wlp2s0: authenticated
> Oct 26 15:21:47  kernel: wlp2s0: associate with 00:04:96:69:0d:80 (try 1/3)
> Oct 26 15:21:47  kernel: wlp2s0: RX AssocResp from 00:04:96:69:0d:80 
> (capab=0x11 status=0 aid=25)
> Oct 26 15:21:47  kernel: wlp2s0: associated
> Oct 26 15:21:47  kernel: wlp2s0: deauthenticating from 00:04:96:69:0d:80 by 
> local choice (Reason: 3=DEAUTH_LEAVING)
> Oct 26 15:21:47  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 18
> Oct 26 15:22:05  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 
> port 67 interval 5
> Oct 26 15:22:10

Re: Linux 4.2.4

2015-10-26 Thread Gerhard Wiesinger


On 26.10.2015 09:58, Jozsef Kadlecsik wrote:

On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:


Also any idea regarding the second issue? Or do you think it has the
same root cause?

Looking at your RedHat bugzilla report, the "nf_conntrack: table full,
dropping packet" and "Alignment trap: not handling instruction" are two
unrelated issues and the second one is triggered by the unaligned counter
extension access in ipset, I'm investigating. I can't think of any reason
how those issues could be related to each other.


Yes, they are unrelated.
Issue 1: nf_conntrack: table full, dropping packet => Fixed with 4.2.4
Issue 2: Alignment trap: not handling instruction => Happens when ipset 
counters are enabled


Please keep in mind it happens with IPv6 commands.

Currently 4.2.4 without ipset counters runs well.
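
For readers unfamiliar with alignment traps: on some architectures a
64-bit load or store through a pointer that is not suitably aligned
faults. The sketch below shows the general, portable way to access
such a value in userspace C. It says nothing about how ipset actually
lays out its counter extension; the helper names are made up for this
example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 64-bit counter from an address that may be unaligned. */
static uint64_t load_u64_unaligned(const void *p)
{
	uint64_t v;

	assert(p != NULL);
	memcpy(&v, p, sizeof(v)); /* byte-wise copy, no alignment requirement */
	return v;
}

/* Store a 64-bit counter to an address that may be unaligned. */
static void store_u64_unaligned(void *p, uint64_t v)
{
	assert(p != NULL);
	memcpy(p, &v, sizeof(v));
}
```

Dereferencing a cast pointer such as *(uint64_t *)(buf + 3) is the
pattern that can trap; going through memcpy avoids it.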

Ciao,
Gerhard


[PATCH v8] can: xilinx: Convert to runtime_pm

2015-10-26 Thread Kedareswara rao Appana

Instead of enabling/disabling clocks at several locations in the driver,
use the runtime_pm framework. This consolidates the actions for runtime PM
in the appropriate callbacks and makes the driver more readable and maintainable.
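
The idea can be sketched with a small userspace model. It is a
simplification under assumptions: dev_model and the model_* helpers are
invented names, and the real pm_runtime_put() may defer the suspend
rather than run it immediately. What it shows is the shape of the
change: callers only take and drop a usage reference, and the clock
handling lives in one resume/suspend pair.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device: two clocks plus a usage counter standing in for runtime PM state. */
struct dev_model {
	int usage;
	bool bus_clk_on;
	bool can_clk_on;
};

/* Single place where the clocks are switched on. */
static void model_runtime_resume(struct dev_model *d)
{
	d->bus_clk_on = true;
	d->can_clk_on = true;
}

/* Single place where the clocks are switched off. */
static void model_runtime_suspend(struct dev_model *d)
{
	d->can_clk_on = false;
	d->bus_clk_on = false;
}

/* Like a synchronous "get": the first user resumes the device. */
static void model_pm_get(struct dev_model *d)
{
	if (d->usage++ == 0)
		model_runtime_resume(d);
}

/* Like a "put": the last user lets the device suspend. */
static void model_pm_put(struct dev_model *d)
{
	assert(d->usage > 0);
	if (--d->usage == 0)
		model_runtime_suspend(d);
}
```

Open, close and the error-counter read then each reduce to one get and
one put around the register accesses.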

Signed-off-by: Kedareswara rao Appana 
---
Changes for v8:
  - Remove pm_runtime_irq_safe() API call from the probe as clk_prepare_enable
    call can be called from the atomic context as suggested by Marc.
Changes for v7:
  - Removed the unnecessary clk_prepare/clk_unprepare calls
    from the probe and remove as suggested by Soren.
Changes for v6:
 - Updated the driver with review comments as suggested by Marc.
Changes for v5:
 - Updated with the review comments.
   Updated the remove function to use runtime_pm.
Changes for v4:
 - Updated with the review comments.
Changes for v3:
  - Converted the driver to use runtime_pm.
Changes for v2:
  - Removed the struct platform_device* from suspend/resume
    as suggested by Lothar

 drivers/net/can/xilinx_can.c | 176 +--
 1 file changed, 101 insertions(+), 75 deletions(-)

diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index fc55e8e..ad38065 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define DRIVER_NAME"xilinx_can"
 
@@ -138,7 +139,7 @@ struct xcan_priv {
u32 (*read_reg)(const struct xcan_priv *priv, enum xcan_reg reg);
void (*write_reg)(const struct xcan_priv *priv, enum xcan_reg reg,
u32 val);
-   struct net_device *dev;
+   struct device *dev;
void __iomem *reg_base;
unsigned long irq_flags;
struct clk *bus_clk;
@@ -843,6 +844,13 @@ static int xcan_open(struct net_device *ndev)
struct xcan_priv *priv = netdev_priv(ndev);
int ret;
 
+   ret = pm_runtime_get_sync(priv->dev);
+   if (ret < 0) {
+   netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n",
+   __func__, ret);
+   return ret;
+   }
+
ret = request_irq(ndev->irq, xcan_interrupt, priv->irq_flags,
ndev->name, ndev);
if (ret < 0) {
@@ -850,29 +858,17 @@ static int xcan_open(struct net_device *ndev)
goto err;
}
 
-   ret = clk_prepare_enable(priv->can_clk);
-   if (ret) {
-   netdev_err(ndev, "unable to enable device clock\n");
-   goto err_irq;
-   }
-
-   ret = clk_prepare_enable(priv->bus_clk);
-   if (ret) {
-   netdev_err(ndev, "unable to enable bus clock\n");
-   goto err_can_clk;
-   }
-
/* Set chip into reset mode */
ret = set_reset_mode(ndev);
if (ret < 0) {
netdev_err(ndev, "mode resetting failed!\n");
-   goto err_bus_clk;
+   goto err_irq;
}
 
/* Common open */
ret = open_candev(ndev);
if (ret)
-   goto err_bus_clk;
+   goto err_irq;
 
ret = xcan_chip_start(ndev);
if (ret < 0) {
@@ -888,13 +884,11 @@ static int xcan_open(struct net_device *ndev)
 
 err_candev:
close_candev(ndev);
-err_bus_clk:
-   clk_disable_unprepare(priv->bus_clk);
-err_can_clk:
-   clk_disable_unprepare(priv->can_clk);
 err_irq:
free_irq(ndev->irq, ndev);
 err:
+   pm_runtime_put(priv->dev);
+
return ret;
 }
 
@@ -911,12 +905,11 @@ static int xcan_close(struct net_device *ndev)
netif_stop_queue(ndev);
 napi_disable(&priv->napi);
xcan_chip_stop(ndev);
-   clk_disable_unprepare(priv->bus_clk);
-   clk_disable_unprepare(priv->can_clk);
free_irq(ndev->irq, ndev);
close_candev(ndev);
 
can_led_event(ndev, CAN_LED_EVENT_STOP);
+   pm_runtime_put(priv->dev);
 
return 0;
 }
@@ -935,27 +928,20 @@ static int xcan_get_berr_counter(const struct net_device 
*ndev,
struct xcan_priv *priv = netdev_priv(ndev);
int ret;
 
-   ret = clk_prepare_enable(priv->can_clk);
-   if (ret)
-   goto err;
-
-   ret = clk_prepare_enable(priv->bus_clk);
-   if (ret)
-   goto err_clk;
+   ret = pm_runtime_get_sync(priv->dev);
+   if (ret < 0) {
+   netdev_err(ndev, "%s: pm_runtime_get failed(%d)\n",
+   __func__, ret);
+   return ret;
+   }
 
bec->txerr = priv->read_reg(priv, XCAN_ECR_OFFSET) & XCAN_ECR_TEC_MASK;
bec->rxerr = ((priv->read_reg(priv, XCAN_ECR_OFFSET) &
XCAN_ECR_REC_MASK) >> XCAN_ESR_REC_SHIFT);
 
-   clk_disable_unprepare(priv->bus_clk);
-   clk_disable_unprepare(priv->can_clk);
+   pm_runtime_put(priv->dev);
 
return 0;
-
-err_clk:
-   clk_disable_unprepare(priv->can_clk);
-err:
-   return ret;
 }
 
 
@@ -968,15 +954,45 @@ static const

[PATCHv2 net-next 2/6] arcnet: com20020: add enable and disable device on open/close

2015-10-26 Thread Michael Grzeschik

This patch changes the driver to properly work with the linux netif
interface. The controller gets enabled on open and disabled on close.
Therefore it removes every bogus start of the xceiver. It only gets
enabled on com20020_open and disabled on com20020_close.
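
The open/close behaviour described above can be modelled in a few
lines of userspace C. This is a sketch, not the driver: card_model and
the model_* functions are invented, and MODEL_TXEN is an assumed bit
position chosen for illustration. The shadow config value is built
without the transmit-enable bit at probe time, and the bit is only
toggled from open and close.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_TXEN 0x20u /* assumed bit position, for illustration only */

struct card_model {
	uint8_t config; /* shadow copy of the config register */
	uint8_t hw_reg; /* what was last written to the "hardware" */
};

static void model_write_config(struct card_model *c)
{
	c->hw_reg = c->config;
}

/* Probe: program timeout and backplane bits, transmitter stays off. */
static void model_probe(struct card_model *c, uint8_t timeout,
			uint8_t backplane)
{
	assert(timeout < 4 && backplane < 2);
	c->config = (uint8_t)((timeout << 3) | (backplane << 2));
	model_write_config(c);
}

/* Open: enable the transmitter. */
static void model_open(struct card_model *c)
{
	c->config |= MODEL_TXEN;
	model_write_config(c);
}

/* Close: disable the transmitter again, leaving the other bits alone. */
static void model_close(struct card_model *c)
{
	c->config &= (uint8_t)~MODEL_TXEN;
	model_write_config(c);
}
```

Because only the one bit is changed, the register returns to its
probe-time value after a close.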

Signed-off-by: Michael Grzeschik 
---
v1 -> v2: kbuild test robot: declared com20020_netdev_{open,close} static

 drivers/net/arcnet/com20020.c | 39 +--
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/drivers/net/arcnet/com20020.c b/drivers/net/arcnet/com20020.c
index c82f323..13d9ad4 100644
--- a/drivers/net/arcnet/com20020.c
+++ b/drivers/net/arcnet/com20020.c
@@ -118,7 +118,7 @@ int com20020_check(struct net_device *dev)
arcnet_outb(STARTIOcmd, ioaddr, COM20020_REG_W_COMMAND);
}
 
-   lp->config = TXENcfg | (lp->timeout << 3) | (lp->backplane << 2) | 
SUB_NODE;
+   lp->config = (lp->timeout << 3) | (lp->backplane << 2) | SUB_NODE;
/* set node ID to 0x42 (but transmitter is disabled, so it's okay) */
arcnet_outb(lp->config, ioaddr, COM20020_REG_W_CONFIG);
arcnet_outb(0x42, ioaddr, COM20020_REG_W_XREG);
@@ -131,11 +131,6 @@ int com20020_check(struct net_device *dev)
}
arc_printk(D_INIT_REASONS, dev, "status after reset: %X\n", status);
 
-   /* Enable TX */
-   lp->config |= TXENcfg;
-   arcnet_outb(lp->config, ioaddr, COM20020_REG_W_CONFIG);
-   arcnet_outb(arcnet_inb(ioaddr, 8), ioaddr, COM20020_REG_W_XREG);
-
arcnet_outb(CFLAGScmd | RESETclear | CONFIGclear,
ioaddr, COM20020_REG_W_COMMAND);
status = arcnet_inb(ioaddr, COM20020_REG_R_STATUS);
@@ -169,9 +164,33 @@ static int com20020_set_hwaddr(struct net_device *dev, 
void *addr)
return 0;
 }
 
+static int com20020_netdev_open(struct net_device *dev)
+{
+   int ioaddr = dev->base_addr;
+   struct arcnet_local *lp = netdev_priv(dev);
+
+   lp->config |= TXENcfg;
+   arcnet_outb(lp->config, ioaddr, COM20020_REG_W_CONFIG);
+
+   return arcnet_open(dev);
+}
+
+static int com20020_netdev_close(struct net_device *dev)
+{
+   int ioaddr = dev->base_addr;
+   struct arcnet_local *lp = netdev_priv(dev);
+
+   arcnet_close(dev);
+
+   /* disable transmitter */
+   lp->config &= ~TXENcfg;
+   arcnet_outb(lp->config, ioaddr, COM20020_REG_W_CONFIG);
+   return 0;
+}
+
 const struct net_device_ops com20020_netdev_ops = {
-   .ndo_open   = arcnet_open,
-   .ndo_stop   = arcnet_close,
+   .ndo_open   = com20020_netdev_open,
+   .ndo_stop   = com20020_netdev_close,
.ndo_start_xmit = arcnet_send_packet,
.ndo_tx_timeout = arcnet_timeout,
.ndo_set_mac_address = com20020_set_hwaddr,
@@ -215,7 +234,7 @@ int com20020_found(struct net_device *dev, int shared)
arcnet_outb(STARTIOcmd, ioaddr, COM20020_REG_W_COMMAND);
}
 
-   lp->config = TXENcfg | (lp->timeout << 3) | (lp->backplane << 2) | 
SUB_NODE;
+   lp->config = (lp->timeout << 3) | (lp->backplane << 2) | SUB_NODE;
/* Default 0x38 + register: Node ID */
arcnet_outb(lp->config, ioaddr, COM20020_REG_W_CONFIG);
arcnet_outb(dev->dev_addr[0], ioaddr, COM20020_REG_W_XREG);
@@ -274,7 +293,7 @@ static int com20020_reset(struct net_device *dev, int 
really_reset)
   dev->name, arcnet_inb(ioaddr, COM20020_REG_R_STATUS));
 
arc_printk(D_DEBUG, dev, "%s: %d: %s\n", __FILE__, __LINE__, __func__);
-   lp->config = TXENcfg | (lp->timeout << 3) | (lp->backplane << 2);
+   lp->config |= (lp->timeout << 3) | (lp->backplane << 2);
/* power-up defaults */
arcnet_outb(lp->config, ioaddr, COM20020_REG_W_CONFIG);
arc_printk(D_DEBUG, dev, "%s: %d: %s\n", __FILE__, __LINE__, __func__);
-- 
2.6.1


[PATCHv2 net-next 1/6] arcnet: move dev_free_skb to its only user

2015-10-26 Thread Michael Grzeschik

The call for dev_kfree_skb is done only once. This patch
moves its call to its only user and removes the obsolete
condition variable.

Signed-off-by: Michael Grzeschik 
---
 drivers/net/arcnet/arcnet.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
index e41dd36..542e2b4 100644
--- a/drivers/net/arcnet/arcnet.c
+++ b/drivers/net/arcnet/arcnet.c
@@ -515,7 +515,7 @@ netdev_tx_t arcnet_send_packet(struct sk_buff *skb,
struct ArcProto *proto;
int txbuf;
unsigned long flags;
-   int freeskb, retval;
+   int retval;
 
arc_printk(D_DURING, dev,
   "transmit requested (status=%Xh, txbufs=%d/%d, len=%d, 
protocol %x)\n",
@@ -554,15 +554,13 @@ netdev_tx_t arcnet_send_packet(struct sk_buff *skb,
 *  the package later - forget about it now
 */
dev->stats.tx_bytes += skb->len;
-   freeskb = 1;
+   dev_kfree_skb(skb);
} else {
/* do it the 'split' way */
lp->outgoing.proto = proto;
lp->outgoing.skb = skb;
lp->outgoing.pkt = pkt;
 
-   freeskb = 0;
-
if (proto->continue_tx &&
proto->continue_tx(dev, txbuf)) {
arc_printk(D_NORMAL, dev,
@@ -574,7 +572,6 @@ netdev_tx_t arcnet_send_packet(struct sk_buff *skb,
lp->next_tx = txbuf;
} else {
retval = NETDEV_TX_BUSY;
-   freeskb = 0;
}
 
arc_printk(D_DEBUG, dev, "%s: %d: %s, status: %x\n",
@@ -589,9 +586,6 @@ netdev_tx_t arcnet_send_packet(struct sk_buff *skb,
   __FILE__, __LINE__, __func__, lp->hw.status(dev));
 
 spin_unlock_irqrestore(&lp->lock, flags);
-   if (freeskb)
-   dev_kfree_skb(skb);
-
return retval;  /* no need to try again */
 }
 EXPORT_SYMBOL(arcnet_send_packet);
-- 
2.6.1


[PATCHv2 net-next 4/6] arcnet: com20020-pci: add rotary index support

2015-10-26 Thread Michael Grzeschik

The EAE PLX-PCI card has a special rotary encoder
to configure the address of every card individually.
We take this information for the initial setup of
the card's dev_id.
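
As a worked example of the computation in the diff below, here is a
userspace sketch. model_dev_id is a made-up helper, and it assumes the
dev_id starts at 0 for cards other than the MA1. The switch position
is taken from the upper nibble of the register byte and XORed into the
base value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * raw:    byte read from the rotary register; the upper nibble holds
 *         the switch position
 * is_ma1: true for the "EAE PLX-PCI MA1" variant, which starts from 0xc
 */
static unsigned int model_dev_id(uint8_t raw, bool is_ma1)
{
	unsigned int dev_id = is_ma1 ? 0xc : 0x0;

	dev_id ^= (unsigned int)(raw >> 4);
	assert(dev_id <= 0xf); /* both operands are 4-bit values */
	return dev_id;
}
```

For a register value of 0x50 this gives 5 on a non-MA1 card and
0xc ^ 0x5 = 9 on an MA1 card.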

Signed-off-by: Michael Grzeschik 
---
v1 -> v2: kbuild test robot: fixed type of misc variable to resource_size_t

 drivers/net/arcnet/com20020-pci.c | 33 +
 drivers/net/arcnet/com20020.h |  4 
 2 files changed, 37 insertions(+)

diff --git a/drivers/net/arcnet/com20020-pci.c 
b/drivers/net/arcnet/com20020-pci.c
index e3b7c14e..637a611 100644
--- a/drivers/net/arcnet/com20020-pci.c
+++ b/drivers/net/arcnet/com20020-pci.c
@@ -68,6 +68,7 @@ static int com20020pci_probe(struct pci_dev *pdev,
 const struct pci_device_id *id)
 {
struct com20020_pci_card_info *ci;
+   struct com20020_pci_channel_map *mm;
struct net_device *dev;
struct arcnet_local *lp;
struct com20020_priv *priv;
@@ -84,9 +85,22 @@ static int com20020pci_probe(struct pci_dev *pdev,
 
ci = (struct com20020_pci_card_info *)id->driver_data;
priv->ci = ci;
+   mm = &ci->misc_map;
 
INIT_LIST_HEAD(&priv->list_dev);
 
+   if (mm->size) {
+   ioaddr = pci_resource_start(pdev, mm->bar) + mm->offset;
+   r = devm_request_region(&pdev->dev, ioaddr, mm->size,
+   "com20020-pci");
+   if (!r) {
+   pr_err("IO region %xh-%xh already allocated.\n",
+  ioaddr, ioaddr + mm->size - 1);
+   return -EBUSY;
+   }
+   priv->misc = ioaddr;
+   }
+
for (i = 0; i < ci->devcount; i++) {
struct com20020_pci_channel_map *cm = &ci->chan_map_tbl[i];
struct com20020_dev *card;
@@ -132,6 +146,13 @@ static int com20020pci_probe(struct pci_dev *pdev,
lp->timeout = timeout;
lp->hw.owner = THIS_MODULE;
 
+   /* Get the dev_id from the PLX rotary coder */
+   if (!strncmp(ci->name, "EAE PLX-PCI MA1", 15))
+   dev->dev_id = 0xc;
+   dev->dev_id ^= inb(priv->misc + ci->rotary) >> 4;
+
+   snprintf(dev->name, sizeof(dev->name), "arc%d-%d", dev->dev_id, i);
+
if (arcnet_inb(ioaddr, COM20020_REG_R_STATUS) == 0xFF) {
pr_err("IO address %Xh is empty!\n", ioaddr);
ret = -EIO;
@@ -235,6 +256,12 @@ static struct com20020_pci_card_info card_info_eae_arc1 = {
.size = 0x08,
},
},
+   .misc_map = {
+   .bar = 2,
+   .offset = 0x10,
+   .size = 0x04,
+   },
+   .rotary = 0x0,
.flags = ARC_CAN_10MBIT,
 };
 
@@ -252,6 +279,12 @@ static struct com20020_pci_card_info card_info_eae_ma1 = {
.size = 0x08,
}
},
+   .misc_map = {
+   .bar = 2,
+   .offset = 0x10,
+   .size = 0x04,
+   },
+   .rotary = 0x0,
.flags = ARC_CAN_10MBIT,
 };
 
diff --git a/drivers/net/arcnet/com20020.h b/drivers/net/arcnet/com20020.h
index 22a460f..f2ed2ef 100644
--- a/drivers/net/arcnet/com20020.h
+++ b/drivers/net/arcnet/com20020.h
@@ -47,6 +47,9 @@ struct com20020_pci_card_info {
int devcount;
 
struct com20020_pci_channel_map chan_map_tbl[PLX_PCI_MAX_CARDS];
+   struct com20020_pci_channel_map misc_map;
+
+   int rotary;
 
unsigned int flags;
 };
@@ -54,6 +57,7 @@ struct com20020_pci_card_info {
 struct com20020_priv {
struct com20020_pci_card_info *ci;
struct list_head list_dev;
+   resource_size_t misc;
 };
 
 struct com20020_dev {
-- 
2.6.1
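
Note: the dev_id derivation in this patch can be tried outside the kernel. The following is a minimal userspace sketch, not driver code: rotary_to_dev_id() is a hypothetical helper, and the register byte is passed in by the caller instead of being read with inb(). It assumes dev_id starts at zero for cards other than the MA1, which the patch does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: derive a device id from the raw rotary register
 * byte, following the same steps as the probe code in the patch.
 */
static unsigned int rotary_to_dev_id(const char *card_name, uint8_t reg)
{
	unsigned int dev_id = 0;	/* assumption: zero-initialised */

	assert(card_name != NULL);

	/* The MA1 variant starts from a fixed base value of 0xc. */
	if (!strncmp(card_name, "EAE PLX-PCI MA1", 15))
		dev_id = 0xc;

	/* The upper nibble of the register holds the rotary position. */
	dev_id ^= reg >> 4;

	return dev_id;
}
```

Under these assumptions a register value of 0x30 would give dev_id 3 on an ARC1 card and 0xf on an MA1 card.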


[GIT PULL v2] ARCNET: code simplification and features

2015-10-26 Thread Michael Grzeschik

The following changes since commit d1611c3aba11ffa281bdd027aace52f5a370b8c5:

  bnxt_en: Fix compile warnings when CONFIG_INET is not set. (2015-10-25 22:36:15 -0700)

are available in the git repository at:

  git://git.pengutronix.de/git/mgr/linux.git tags/arcnet-for-4.4-rc1

for you to fetch changes up to 59fbcbc61e1f0fd9acdf3efb09faca0320049718:

  arcnet: add netif_carrier_on/off for reconnect (2015-10-26 09:10:56 +0100)


This series includes code simplification. The main changes are the correct
xceiver handling (enable/disable) of the com20020 cards. The driver now handles
link status change detection. The EAE PCI-ARCNET cards now make use of the
rotary encoded subdevice indexing and got support for led triggers on transmit
and reconnection events.


Michael Grzeschik (6):
  arcnet: move dev_free_skb to its only user
  arcnet: com20020: add enable and disable device on open/close
  arcnet: com20020-pci: set dev_port to the subdevice index
  arcnet: com20020-pci: add rotary index support
  arcnet: com20020-pci: add led trigger support
  arcnet: add netif_carrier_on/off for reconnect

 drivers/net/arcnet/arcdevice.h|  21 
 drivers/net/arcnet/arcnet.c   | 107 +++---
 drivers/net/arcnet/com20020-pci.c | 107 ++
 drivers/net/arcnet/com20020.c |  39 ++
 drivers/net/arcnet/com20020.h |  14 +
 include/linux/leds.h  |   7 +++
 6 files changed, 277 insertions(+), 18 deletions(-)

-- 
2.6.1


[PATCHv2 net-next 5/6] arcnet: com20020-pci: add led trigger support

2015-10-26 Thread Michael Grzeschik

The EAE PLX-PCI card has special leds on the main io pci resource
bar. This patch adds support to trigger the conflict and data leds with
the packages.

Signed-off-by: Michael Grzeschik 
---
v1 -> v2: kbuild test robot: added dummy inline functions for led_trigger_blink{,_oneshot}

 drivers/net/arcnet/arcdevice.h| 19 ++
 drivers/net/arcnet/arcnet.c   | 72 ++
 drivers/net/arcnet/com20020-pci.c | 73 +++
 drivers/net/arcnet/com20020.h | 10 ++
 include/linux/leds.h  |  7 
 5 files changed, 181 insertions(+)

diff --git a/drivers/net/arcnet/arcdevice.h b/drivers/net/arcnet/arcdevice.h
index d7fdea1..2edc0c0 100644
--- a/drivers/net/arcnet/arcdevice.h
+++ b/drivers/net/arcnet/arcdevice.h
@@ -237,6 +237,8 @@ struct Outgoing {
numsegs;/* number of segments */
 };
 
+#define ARCNET_LED_NAME_SZ (IFNAMSIZ + 6)
+
 struct arcnet_local {
uint8_t config, /* current value of CONFIG register */
timeout,/* Extended timeout for COM20020 */
@@ -260,6 +262,11 @@ struct arcnet_local {
/* On preemtive and SMB a lock is needed */
spinlock_t lock;
 
+   struct led_trigger *tx_led_trig;
+   char tx_led_trig_name[ARCNET_LED_NAME_SZ];
+   struct led_trigger *recon_led_trig;
+   char recon_led_trig_name[ARCNET_LED_NAME_SZ];
+
/*
 * Buffer management: an ARCnet card has 4 x 512-byte buffers, each of
 * which can be used for either sending or receiving.  The new dynamic
@@ -309,6 +316,8 @@ struct arcnet_local {
int (*reset)(struct net_device *dev, int really_reset);
void (*open)(struct net_device *dev);
void (*close)(struct net_device *dev);
+   void (*datatrigger) (struct net_device * dev, int enable);
+   void (*recontrigger) (struct net_device * dev, int enable);
 
void (*copy_to_card)(struct net_device *dev, int bufnum,
 int offset, void *buf, int count);
@@ -319,6 +328,16 @@ struct arcnet_local {
void __iomem *mem_start;/* pointer to ioremap'ed MMIO */
 };
 
+enum arcnet_led_event {
+   ARCNET_LED_EVENT_RECON,
+   ARCNET_LED_EVENT_OPEN,
+   ARCNET_LED_EVENT_STOP,
+   ARCNET_LED_EVENT_TX,
+};
+
+void arcnet_led_event(struct net_device *netdev, enum arcnet_led_event event);
+void devm_arcnet_led_init(struct net_device *netdev, int index, int subid);
+
 #if ARCNET_DEBUG_MAX & D_SKB
 void arcnet_dump_skb(struct net_device *dev, struct sk_buff *skb, char *desc);
 #else
diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
index 542e2b4..4242522 100644
--- a/drivers/net/arcnet/arcnet.c
+++ b/drivers/net/arcnet/arcnet.c
@@ -52,6 +52,8 @@
 #include 
 #include 
 
+#include 
+
 #include "arcdevice.h"
 #include "com9026.h"
 
@@ -189,6 +191,71 @@ static void arcnet_dump_packet(struct net_device *dev, int bufnum,
 
 #endif
 
+/* Trigger a LED event in response to a ARCNET device event */
+void arcnet_led_event(struct net_device *dev, enum arcnet_led_event event)
+{
+   struct arcnet_local *lp = netdev_priv(dev);
+   unsigned long led_delay = 350;
+   unsigned long tx_delay = 50;
+
+   switch (event) {
+   case ARCNET_LED_EVENT_RECON:
+   led_trigger_blink_oneshot(lp->recon_led_trig,
+ &led_delay, &led_delay, 0);
+   break;
+   case ARCNET_LED_EVENT_OPEN:
+   led_trigger_event(lp->tx_led_trig, LED_OFF);
+   led_trigger_event(lp->recon_led_trig, LED_OFF);
+   break;
+   case ARCNET_LED_EVENT_STOP:
+   led_trigger_event(lp->tx_led_trig, LED_OFF);
+   led_trigger_event(lp->recon_led_trig, LED_OFF);
+   break;
+   case ARCNET_LED_EVENT_TX:
+   led_trigger_blink_oneshot(lp->tx_led_trig,
+ &tx_delay, &tx_delay, 0);
+   break;
+   }
+}
+EXPORT_SYMBOL_GPL(arcnet_led_event);
+
+static void arcnet_led_release(struct device *gendev, void *res)
+{
+   struct arcnet_local *lp = netdev_priv(to_net_dev(gendev));
+
+   led_trigger_unregister_simple(lp->tx_led_trig);
+   led_trigger_unregister_simple(lp->recon_led_trig);
+}
+
+/* Register ARCNET LED triggers for a arcnet device
+ *
+ * This is normally called from a driver's probe function
+ */
+void devm_arcnet_led_init(struct net_device *netdev, int index, int subid)
+{
+   struct arcnet_local *lp = netdev_priv(netdev);
+   void *res;
+
+   res = devres_alloc(arcnet_led_release, 0, GFP_KERNEL);
+   if (!res) {
+   netdev_err(netdev, "cannot register LED triggers\n");
+   return;
+   }
+
+   snprintf(lp->tx_led_trig_name, sizeof(lp->tx_led_trig_name),
+"arc%d-%d-tx", index, subid);
+

[PATCHv2 net-next 3/6] arcnet: com20020-pci: set dev_port to the subdevice index

2015-10-26 Thread Michael Grzeschik

This patch sets the dev_port according to the index of
the card. This can be used by udev to name the ports
in userspace.

Signed-off-by: Michael Grzeschik 
---
 drivers/net/arcnet/com20020-pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/arcnet/com20020-pci.c b/drivers/net/arcnet/com20020-pci.c
index a12bf83..e3b7c14e 100644
--- a/drivers/net/arcnet/com20020-pci.c
+++ b/drivers/net/arcnet/com20020-pci.c
@@ -96,6 +96,7 @@ static int com20020pci_probe(struct pci_dev *pdev,
ret = -ENOMEM;
goto out_port;
}
+   dev->dev_port = i;
 
dev->netdev_ops = &com20020_netdev_ops;
 
-- 
2.6.1


[PATCHv2 net-next 6/6] arcnet: add netif_carrier_on/off for reconnect

2015-10-26 Thread Michael Grzeschik

The arcnet device has no interrupt to detect if the link has changed
from disconnected to connected. This patch adds a timer to toggle the
link detection. The timer will get retriggered as long as the
reconnection interrupts occur. If the recon interrupts hold off
for >1s we consider the connection stable again.

Signed-off-by: Michael Grzeschik 
---
 drivers/net/arcnet/arcdevice.h |  2 ++
 drivers/net/arcnet/arcnet.c| 25 +
 2 files changed, 27 insertions(+)

diff --git a/drivers/net/arcnet/arcdevice.h b/drivers/net/arcnet/arcdevice.h
index 2edc0c0..20bfb9b 100644
--- a/drivers/net/arcnet/arcdevice.h
+++ b/drivers/net/arcnet/arcdevice.h
@@ -267,6 +267,8 @@ struct arcnet_local {
struct led_trigger *recon_led_trig;
char recon_led_trig_name[ARCNET_LED_NAME_SZ];
 
+   struct timer_list   timer;
+
/*
 * Buffer management: an ARCnet card has 4 x 512-byte buffers, each of
 * which can be used for either sending or receiving.  The new dynamic
diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
index 4242522..6ea963e 100644
--- a/drivers/net/arcnet/arcnet.c
+++ b/drivers/net/arcnet/arcnet.c
@@ -381,6 +381,16 @@ static void arcdev_setup(struct net_device *dev)
dev->flags = IFF_BROADCAST;
 }
 
+static void arcnet_timer(unsigned long data)
+{
+   struct net_device *dev = (struct net_device *)data;
+
+   if (!netif_carrier_ok(dev)) {
+   netif_carrier_on(dev);
+   netdev_info(dev, "link up\n");
+   }
+}
+
 struct net_device *alloc_arcdev(const char *name)
 {
struct net_device *dev;
@@ -392,6 +402,9 @@ struct net_device *alloc_arcdev(const char *name)
struct arcnet_local *lp = netdev_priv(dev);
 
spin_lock_init(&lp->lock);
+   init_timer(&lp->timer);
+   lp->timer.data = (unsigned long) dev;
+   lp->timer.function = arcnet_timer;
}
 
return dev;
@@ -490,7 +503,9 @@ int arcnet_open(struct net_device *dev)
lp->hw.intmask(dev, lp->intmask);
arc_printk(D_DEBUG, dev, "%s: %d: %s\n", __FILE__, __LINE__, __func__);
 
+   netif_carrier_off(dev);
netif_start_queue(dev);
+   mod_timer(&lp->timer, jiffies + msecs_to_jiffies(1000));
 
arcnet_led_event(dev, ARCNET_LED_EVENT_OPEN);
return 0;
@@ -507,7 +522,10 @@ int arcnet_close(struct net_device *dev)
struct arcnet_local *lp = netdev_priv(dev);
 
arcnet_led_event(dev, ARCNET_LED_EVENT_STOP);
+   del_timer_sync(&lp->timer);
+
netif_stop_queue(dev);
+   netif_carrier_off(dev);
 
/* flush TX and disable RX */
lp->hw.intmask(dev, 0);
@@ -908,6 +926,12 @@ irqreturn_t arcnet_interrupt(int irq, void *dev_id)
 
arc_printk(D_RECON, dev, "Network reconfiguration detected (status=%Xh)\n",
   status);
+   if (netif_carrier_ok(dev)) {
+   netif_carrier_off(dev);
+   netdev_info(dev, "link down\n");
+   }
+   mod_timer(&lp->timer, jiffies + msecs_to_jiffies(1000));
+
arcnet_led_event(dev, ARCNET_LED_EVENT_RECON);
/* MYRECON bit is at bit 7 of diagstatus */
if (diagstatus & 0x80)
@@ -959,6 +983,7 @@ irqreturn_t arcnet_interrupt(int irq, void *dev_id)
lp->num_recons = lp->network_down = 0;
 
arc_printk(D_DURING, dev, "not recon: clearing counters anyway.\n");
+   netif_carrier_on(dev);
}
 
if (didsomething)
-- 
2.6.1


[PATCH net-next v8 07/10] qed: Add link support

2015-10-26 Thread Yuval Mintz

Physical link is handled by the management firmware.
This patch lays the infrastructure for attention handling in the driver,
as link change notifications arrive via async. attentions,
as well as the handling of such notifications.

This patch also extends the API with the protocol drivers by adding
registered callbacks which the protocol driver passes to qed in order
to be notified of async. events originating from the FW/HW.

Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/qed/qed.h  |  20 ++
 drivers/net/ethernet/qlogic/qed/qed_dev.c  | 106 -
 drivers/net/ethernet/qlogic/qed/qed_int.c  | 336 -
 drivers/net/ethernet/qlogic/qed/qed_l2.c   |   9 +
 drivers/net/ethernet/qlogic/qed/qed_main.c | 211 ++
 drivers/net/ethernet/qlogic/qed/qed_mcp.c  | 295 +
 drivers/net/ethernet/qlogic/qed/qed_mcp.h  | 126 ++-
 include/linux/qed/qed_eth_if.h |   4 +
 8 files changed, 1102 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed.h b/drivers/net/ethernet/qlogic/qed/qed.h
index e03371d..ca6cc8a 100644
--- a/drivers/net/ethernet/qlogic/qed/qed.h
+++ b/drivers/net/ethernet/qlogic/qed/qed.h
@@ -108,6 +108,18 @@ enum QED_FEATURE {
QED_MAX_FEATURES,
 };
 
+enum QED_PORT_MODE {
+   QED_PORT_MODE_DE_2X40G,
+   QED_PORT_MODE_DE_2X50G,
+   QED_PORT_MODE_DE_1X100G,
+   QED_PORT_MODE_DE_4X10G_F,
+   QED_PORT_MODE_DE_4X10G_E,
+   QED_PORT_MODE_DE_4X20G,
+   QED_PORT_MODE_DE_1X40G,
+   QED_PORT_MODE_DE_2X25G,
+   QED_PORT_MODE_DE_1X25G
+};
+
 struct qed_hw_info {
/* PCI personality */
enum qed_pci_personality personality;
@@ -404,6 +416,13 @@ struct qed_dev {
u8  protocol;
 #define IS_QED_ETH_IF(cdev) ((cdev)->protocol == QED_PROTOCOL_ETH)
 
+   /* Callbacks to protocol driver */
+   union {
+   struct qed_common_cb_ops *common;
+   struct qed_eth_cb_ops   *eth;
+   } protocol_ops;
+   void *ops_cookie;
+
const struct firmware   *firmware;
 };
 
@@ -453,6 +472,7 @@ static inline u8 qed_concrete_to_sw_fid(struct qed_dev *cdev,
 /* Prototypes */
 int qed_fill_dev_info(struct qed_dev *cdev,
  struct qed_dev_info *dev_info);
+void qed_link_update(struct qed_hwfn *hwfn);
 u32 qed_unzip_data(struct qed_hwfn *p_hwfn,
   u32 input_len, u8 *input_buf,
   u32 max_size, u8 *unzip_buf);
diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 3d1bdbf..7fd3d78 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -1039,8 +1039,9 @@ static void qed_hw_get_resc(struct qed_hwfn *p_hwfn)
 static int qed_hw_get_nvm_info(struct qed_hwfn *p_hwfn,
   struct qed_ptt *p_ptt)
 {
-   u32 nvm_cfg1_offset, mf_mode, addr, generic_cont0, nvm_cfg_addr;
-   u32 val;
+   u32 nvm_cfg1_offset, mf_mode, addr, generic_cont0, core_cfg;
+   u32 port_cfg_addr, link_temp, val, nvm_cfg_addr;
+   struct qed_mcp_link_params *link;
 
/* Read global nvm_cfg address */
nvm_cfg_addr = qed_rd(p_hwfn, p_ptt, MISC_REG_GEN_PURP_CR0);
@@ -1060,6 +1061,48 @@ static int qed_hw_get_nvm_info(struct qed_hwfn *p_hwfn,
   offsetof(struct nvm_cfg1_glob, pci_id);
p_hwfn->hw_info.vendor_id = qed_rd(p_hwfn, p_ptt, addr) &
NVM_CFG1_GLOB_VENDOR_ID_MASK;
+
+   addr = MCP_REG_SCRATCH + nvm_cfg1_offset +
+  offsetof(struct nvm_cfg1, glob) +
+  offsetof(struct nvm_cfg1_glob, core_cfg);
+
+   core_cfg = qed_rd(p_hwfn, p_ptt, addr);
+
+   switch ((core_cfg & NVM_CFG1_GLOB_NETWORK_PORT_MODE_MASK) >>
+   NVM_CFG1_GLOB_NETWORK_PORT_MODE_OFFSET) {
+   case NVM_CFG1_GLOB_NETWORK_PORT_MODE_DE_2X40G:
+   p_hwfn->hw_info.port_mode = QED_PORT_MODE_DE_2X40G;
+   break;
+   case NVM_CFG1_GLOB_NETWORK_PORT_MODE_DE_2X50G:
+   p_hwfn->hw_info.port_mode = QED_PORT_MODE_DE_2X50G;
+   break;
+   case NVM_CFG1_GLOB_NETWORK_PORT_MODE_DE_1X100G:
+   p_hwfn->hw_info.port_mode = QED_PORT_MODE_DE_1X100G;
+   break;
+   case NVM_CFG1_GLOB_NETWORK_PORT_MODE_DE_4X10G_F:
+   p_hwfn->hw_info.port_mode = QED_PORT_MODE_DE_4X10G_F;
+   break;
+   case NVM_CFG1_GLOB_NETWORK_PORT_MODE_DE_4X10G_E:
+   p_hwfn->hw_info.port_mode = QED_PORT_MODE_DE_4X10G_E;
+   break;
+   case NVM_CFG1_GLOB_NETWORK_PORT_MODE_DE_4X20G:
+   p_hwfn->hw_info.port_mode = QED_PORT_MODE_DE_4X20G;
+   break;
+   case NVM_CFG1_GLOB_NETWORK_PORT_MODE_DE_1X40G:
+

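Note: the switch in qed_hw_get_nvm_info() selects on a bit field of core_cfg using the usual mask-then-shift idiom. A rough standalone sketch of that idiom follows. PORT_MODE_MASK and PORT_MODE_OFFSET are made-up values for illustration only; the real NVM_CFG1_GLOB_NETWORK_PORT_MODE_* definitions are in driver headers not quoted here, and port_mode_field() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout, chosen only to make the example concrete. */
#define PORT_MODE_MASK   0x00000ff0u
#define PORT_MODE_OFFSET 4

/* Isolate the field with the mask, then shift it down to bit 0. */
static uint32_t port_mode_field(uint32_t core_cfg)
{
	return (core_cfg & PORT_MODE_MASK) >> PORT_MODE_OFFSET;
}
```

The result is what the case labels are compared against, so bits outside the mask do not influence which port mode is chosen.
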
[PATCH net-next v8 05/10] qede: Add basic network device support

2015-10-26 Thread Yuval Mintz

This patch includes the basic Rx/Tx support for the driver [although
carrier will still never be turned on].
Following this patch the driver registers a network device, initializes
it and prepares it for traffic.

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/qede/qede.h  |  128 ++
 drivers/net/ethernet/qlogic/qede/qede_main.c | 1807 ++
 2 files changed, 1935 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index 7e2bcfa..424ef4a 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -51,6 +51,7 @@ struct qede_dev {
 #define QEDE_MAX_TSS_CNT(edev) ((edev)->dev_info.num_queues * \
 (edev)->dev_info.num_tc)
 
+   struct qede_fastpath *fp_array;
u16 num_rss;
u8  num_tc;
 #define QEDE_RSS_CNT(edev) ((edev)->num_rss)
@@ -58,6 +59,9 @@ struct qede_dev {
 (edev)->num_tc)
 #define QEDE_TSS_IDX(edev, txqidx) ((txqidx) % (edev)->num_rss)
 #define QEDE_TC_IDX(edev, txqidx)  ((txqidx) / (edev)->num_rss)
+#define QEDE_TX_QUEUE(edev, txqidx)\
+   (&(edev)->fp_array[QEDE_TSS_IDX((edev), (txqidx))].txqs[QEDE_TC_IDX( \
+   (edev), (txqidx))])
 
struct qed_int_info int_info;
unsigned char   primary_mac[ETH_ALEN];
@@ -65,9 +69,133 @@ struct qede_dev {
/* Smaller private varaiant of the RTNL lock */
struct mutex qede_lock;
u32 state; /* Protected by qede_lock */
+   u16 rx_buf_size;
+   /* L2 header size + 2*VLANs (8 bytes) + LLC SNAP (8 bytes) */
+#define ETH_OVERHEAD   (ETH_HLEN + 8 + 8)
+   /* Max supported alignment is 256 (8 shift)
+* minimal alignment shift 6 is optimal for 57xxx HW performance
+*/
+#define QEDE_RX_ALIGN_SHIFT max(6, min(8, L1_CACHE_SHIFT))
+   /* We assume skb_build() uses sizeof(struct skb_shared_info) bytes
+* at the end of skb->data, to avoid wasting a full cache line.
+* This reduces memory use (skb->truesize).
+*/
+#define QEDE_FW_RX_ALIGN_END   \
+   max_t(u64, 1UL << QEDE_RX_ALIGN_SHIFT,  \
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
+
+   struct qed_update_vport_rss_params  rss_params;
+   u16 q_num_rx_buffers; /* Must be a power of two */
+   u16 q_num_tx_buffers; /* Must be a power of two */
+};
+
+enum QEDE_STATE {
+   QEDE_STATE_CLOSED,
+   QEDE_STATE_OPEN,
+};
+
+#define HILO_U64(hi, lo)   ((((u64)(hi)) << 32) + (lo))
+
+#define MAX_NUM_TC  8
+#define MAX_NUM_PRI 8
+
+/* The driver supports the new build_skb() API:
+ * RX ring buffer contains pointer to kmalloc() data only,
+ * skb are built only after the frame was DMA-ed.
+ */
+struct sw_rx_data {
+   u8 *data;
+
+   DEFINE_DMA_UNMAP_ADDR(mapping);
+};
+
+struct qede_rx_queue {
+   __le16  *hw_cons_ptr;
+   struct sw_rx_data   *sw_rx_ring;
+   u16 sw_rx_cons;
+   u16 sw_rx_prod;
+   struct qed_chain rx_bd_ring;
+   struct qed_chain rx_comp_ring;
+   void __iomem *hw_rxq_prod_addr;
+
+   int rx_buf_size;
+
+   u16 num_rx_buffers;
+   u16 rxq_id;
+
+   u64 rx_hw_errors;
+   u64 rx_alloc_errors;
+};
+
+union db_prod {
+   struct eth_db_data data;
+   u32 raw;
+};
+
+struct sw_tx_bd {
+   struct sk_buff *skb;
+   u8 flags;
+/* Set on the first BD descriptor when there is a split BD */
+#define QEDE_TSO_SPLIT_BD  BIT(0)
+};
+
+struct qede_tx_queue {
+   int index; /* Queue index */
+   __le16  *hw_cons_ptr;
+   struct sw_tx_bd *sw_tx_ring;
+   u16 sw_tx_cons;
+   u16 sw_tx_prod;
+   struct qed_chain tx_pbl;
+   void __iomem *doorbell_addr;
+   union db_prod   tx_db;
+
+   u16 num_tx_buffers;
+};
+
+#define BD_UNMAP_ADDR(bd)  HILO_U64(le32_to_cpu((bd)->addr.hi), \
+le32_to_cpu((bd)->addr.lo))
+#define BD_SET_UNMAP_ADDR_LEN(bd, maddr, len)  \
+   do {\

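Note: the index arithmetic behind QEDE_TSS_IDX, QEDE_TC_IDX and HILO_U64 in this patch can be checked in plain C. This is a rough userspace sketch under assumptions: txq_locate() and hilo_u64() are hypothetical names, not driver functions, and num_rss is passed explicitly instead of being read from struct qede_dev.

```c
#include <assert.h>
#include <stdint.h>

/* Position of a Tx queue: which fastpath (RSS index) and which traffic class. */
struct txq_pos {
	unsigned int tss_idx;
	unsigned int tc_idx;
};

/* Same split as QEDE_TSS_IDX (modulo) and QEDE_TC_IDX (division). */
static struct txq_pos txq_locate(unsigned int txqidx, unsigned int num_rss)
{
	struct txq_pos pos;

	assert(num_rss > 0);
	pos.tss_idx = txqidx % num_rss;
	pos.tc_idx = txqidx / num_rss;
	return pos;
}

/* Same composition as HILO_U64: two 32-bit halves into one 64-bit value. */
static uint64_t hilo_u64(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) + lo;
}
```

With four RSS queues, flat Tx index 5 would land on fastpath 1, traffic class 1, which is the element QEDE_TX_QUEUE would pick under the same layout.
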
[PATCH net-next v8 10/10] qede: Add basic ethtool support

2015-10-26 Thread Yuval Mintz

From: Sudarsana Kalluru 

This adds basic ethtool operations to the qede driver, adding support for:
 - Statistics gathering [ethtool -S]
 - Setting of debug level [ethtool -s  msglvl]
 - Getting basic information [ethtool, ethtool -i]

In addition it adds the ability to change the MTU.

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/qede/Makefile   |   2 +-
 drivers/net/ethernet/qlogic/qede/qede.h |  74 +
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 385 
 drivers/net/ethernet/qlogic/qede/qede_main.c| 137 -
 4 files changed, 596 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/qlogic/qede/qede_ethtool.c

diff --git a/drivers/net/ethernet/qlogic/qede/Makefile b/drivers/net/ethernet/qlogic/qede/Makefile
index bedfe9f..06ff90d 100644
--- a/drivers/net/ethernet/qlogic/qede/Makefile
+++ b/drivers/net/ethernet/qlogic/qede/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_QEDE) := qede.o
 
-qede-y := qede_main.o
+qede-y := qede_main.o qede_ethtool.o
diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index 7947942..ea00d5f 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -36,6 +36,70 @@
 
 #define DRV_MODULE_SYM qede
 
+struct qede_stats {
+   u64 no_buff_discards;
+   u64 rx_ucast_bytes;
+   u64 rx_mcast_bytes;
+   u64 rx_bcast_bytes;
+   u64 rx_ucast_pkts;
+   u64 rx_mcast_pkts;
+   u64 rx_bcast_pkts;
+   u64 mftag_filter_discards;
+   u64 mac_filter_discards;
+   u64 tx_ucast_bytes;
+   u64 tx_mcast_bytes;
+   u64 tx_bcast_bytes;
+   u64 tx_ucast_pkts;
+   u64 tx_mcast_pkts;
+   u64 tx_bcast_pkts;
+   u64 tx_err_drop_pkts;
+   u64 coalesced_pkts;
+   u64 coalesced_events;
+   u64 coalesced_aborts_num;
+   u64 non_coalesced_pkts;
+   u64 coalesced_bytes;
+
+   /* port */
+   u64 rx_64_byte_packets;
+   u64 rx_127_byte_packets;
+   u64 rx_255_byte_packets;
+   u64 rx_511_byte_packets;
+   u64 rx_1023_byte_packets;
+   u64 rx_1518_byte_packets;
+   u64 rx_1522_byte_packets;
+   u64 rx_2047_byte_packets;
+   u64 rx_4095_byte_packets;
+   u64 rx_9216_byte_packets;
+   u64 rx_16383_byte_packets;
+   u64 rx_crc_errors;
+   u64 rx_mac_crtl_frames;
+   u64 rx_pause_frames;
+   u64 rx_pfc_frames;
+   u64 rx_align_errors;
+   u64 rx_carrier_errors;
+   u64 rx_oversize_packets;
+   u64 rx_jabbers;
+   u64 rx_undersize_packets;
+   u64 rx_fragments;
+   u64 tx_64_byte_packets;
+   u64 tx_65_to_127_byte_packets;
+   u64 tx_128_to_255_byte_packets;
+   u64 tx_256_to_511_byte_packets;
+   u64 tx_512_to_1023_byte_packets;
+   u64 tx_1024_to_1518_byte_packets;
+   u64 tx_1519_to_2047_byte_packets;
+   u64 tx_2048_to_4095_byte_packets;
+   u64 tx_4096_to_9216_byte_packets;
+   u64 tx_9217_to_16383_byte_packets;
+   u64 tx_pause_frames;
+   u64 tx_pfc_frames;
+   u64 tx_lpi_entry_count;
+   u64 tx_total_collisions;
+   u64 brb_truncates;
+   u64 brb_discards;
+   u64 tx_mac_ctrl_frames;
+};
+
 struct qede_dev {
struct qed_dev  *cdev;
struct net_device   *ndev;
@@ -84,6 +148,7 @@ struct qede_dev {
max_t(u64, 1UL << QEDE_RX_ALIGN_SHIFT,  \
  SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
 
+   struct qede_stats   stats;
struct qed_update_vport_rss_params  rss_params;
u16 q_num_rx_buffers; /* Must be a power of two */
u16 q_num_tx_buffers; /* Must be a power of two */
@@ -194,6 +259,15 @@ union qede_reload_args {
u16 mtu;
 };
 
+void qede_config_debug(uint debug, u32 *p_dp_module, u8 *p_dp_level);
+void qede_set_ethtool_ops(struct net_device *netdev);
+void qede_reload(struct qede_dev *edev,
+void (*func)(struct qede_dev *edev,
+ union qede_reload_args *args),
+union qede_reload_args *args);
+int qede_change_mtu(struct net_device *dev, int new_mtu);
+void qede_fill_by_demand_stats(struct qede_dev *edev);
+
 #define RX_RING_SIZE_POW   13
 #define RX_RING_SIZE   BIT(RX_RING_SIZE_POW)
 #define NUM_RX_BDS_MAX (RX_RING_SIZE - 1)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
new file mode 100644
index 0000000..3a36247
--- /dev/null
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -0,0 +1,385 @@
+/* QLogic qede NIC Driver
+* Copyright (c) 2015 QLogic Corporation
+*
+* This software is available under the terms of the GNU General

[PATCH net-next v8 06/10] qede: classification configuration

2015-10-26 Thread Yuval Mintz

From: Sudarsana Kalluru 

Add the ability to configure basic classification in driver by
implementing ndo_set_mac_address() and ndo_set_rx_mode().

Signed-off-by: Sudarsana Kalluru 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/qede/qede.h  |  10 ++
 drivers/net/ethernet/qlogic/qede/qede_main.c | 241 +++
 2 files changed, 251 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index 424ef4a..7947942 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -87,6 +87,9 @@ struct qede_dev {
struct qed_update_vport_rss_params  rss_params;
u16 q_num_rx_buffers; /* Must be a power of two */
u16 q_num_tx_buffers; /* Must be a power of two */
+
+   struct delayed_work sp_task;
+   unsigned long   sp_flags;
 };
 
 enum QEDE_STATE {
@@ -184,6 +187,13 @@ struct qede_fastpath {
 
#define QEDE_CSUM_ERROR BIT(0)
#define QEDE_CSUM_UNNECESSARY  BIT(1)
+
+#define QEDE_SP_RX_MODE 1
+
+union qede_reload_args {
+   u16 mtu;
+};
+
 #define RX_RING_SIZE_POW   13
 #define RX_RING_SIZE   BIT(RX_RING_SIZE_POW)
 #define NUM_RX_BDS_MAX (RX_RING_SIZE - 1)
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index daba118..0351204 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -1030,10 +1030,31 @@ static irqreturn_t qede_msix_fp_int(int irq, void *fp_cookie)
 
 static int qede_open(struct net_device *ndev);
 static int qede_close(struct net_device *ndev);
+static int qede_set_mac_addr(struct net_device *ndev, void *p);
+static void qede_set_rx_mode(struct net_device *ndev);
+static void qede_config_rx_mode(struct net_device *ndev);
+
+static int qede_set_ucast_rx_mac(struct qede_dev *edev,
+enum qed_filter_xcast_params_type opcode,
+unsigned char mac[ETH_ALEN])
+{
+   struct qed_filter_params filter_cmd;
+
+   memset(&filter_cmd, 0, sizeof(filter_cmd));
+   filter_cmd.type = QED_FILTER_TYPE_UCAST;
+   filter_cmd.filter.ucast.type = opcode;
+   filter_cmd.filter.ucast.mac_valid = 1;
+   ether_addr_copy(filter_cmd.filter.ucast.mac, mac);
+
+   return edev->ops->filter_config(edev->cdev, &filter_cmd);
+}
+
 static const struct net_device_ops qede_netdev_ops = {
.ndo_open = qede_open,
.ndo_stop = qede_close,
.ndo_start_xmit = qede_start_xmit,
+   .ndo_set_rx_mode = qede_set_rx_mode,
+   .ndo_set_mac_address = qede_set_mac_addr,
.ndo_validate_addr = eth_validate_addr,
 };
 
@@ -1198,6 +1219,20 @@ err:
return -ENOMEM;
 }
 
+static void qede_sp_task(struct work_struct *work)
+{
+   struct qede_dev *edev = container_of(work, struct qede_dev,
+sp_task.work);
+   mutex_lock(&edev->qede_lock);
+
+   if (edev->state == QEDE_STATE_OPEN) {
+   if (test_and_clear_bit(QEDE_SP_RX_MODE, &edev->sp_flags))
+   qede_config_rx_mode(edev->ndev);
+   }
+
+   mutex_unlock(&edev->qede_lock);
+}
+
 static void qede_update_pf_params(struct qed_dev *cdev)
 {
struct qed_pf_params pf_params;
@@ -1269,6 +1304,9 @@ static int __qede_probe(struct pci_dev *pdev, u32 dp_module, u8 dp_level,
 
edev->ops->common->set_id(cdev, edev->ndev->name, DRV_MODULE_VERSION);
 
+   INIT_DELAYED_WORK(&edev->sp_task, qede_sp_task);
+   mutex_init(&edev->qede_lock);
+
DP_INFO(edev, "Ending successfully qede probe\n");
 
return 0;
@@ -1306,6 +1344,7 @@ static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode)
 
DP_INFO(edev, "Starting qede_remove\n");
 
+   cancel_delayed_work_sync(&edev->sp_task);
unregister_netdev(ndev);
 
edev->ops->common->set_power_state(cdev, PCI_D0);
@@ -2036,6 +2075,24 @@ static int qede_start_queues(struct qede_dev *edev)
return 0;
 }
 
+static int qede_set_mcast_rx_mac(struct qede_dev *edev,
+enum qed_filter_xcast_params_type opcode,
+unsigned char *mac, int num_macs)
+{
+   struct qed_filter_params filter_cmd;
+   int i;
+
+   memset(&filter_cmd, 0, sizeof(filter_cmd));
+   filter_cmd.type = QED_FILTER_TYPE_MCAST;
+   filter_cmd.filter.mcast.type = opcode;
+   filter_cmd.filter.mcast.num = num_macs;
+
+   for (i = 0; i < num_macs; i++, mac += ETH_ALEN)
+   ether_addr_copy(filter_cmd.filter.mcast.mac[i], mac);
+
+   return edev->ops->filter_config(edev->cdev, &filter_cmd);
+}
+
 enum qede_unload_mode {

[PATCH net-next v8 02/10] qed: Add basic L2 interface

2015-10-26 Thread Yuval Mintz

This patch adds a public API for a network driver to work on top of QED.
The interface itself is very minimal - it's mostly infrastructure, as the
only content it has after this patch is a query for HW-based information
required for the creation of a network interface [i.e., no actual
protocol-specific configurations are supported].

Signed-off-by: Manish Chopra 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/qed/Makefile  |   2 +-
 drivers/net/ethernet/qlogic/qed/qed.h |  14 ++
 drivers/net/ethernet/qlogic/qed/qed_dev.c |  62 +++
 drivers/net/ethernet/qlogic/qed/qed_hsi.h |   1 +
 drivers/net/ethernet/qlogic/qed/qed_l2.c  |  87 ++
 include/linux/qed/eth_common.h| 279 ++
 include/linux/qed/qed_eth_if.h|  38 
 7 files changed, 482 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/qlogic/qed/qed_l2.c
 create mode 100644 include/linux/qed/eth_common.h
 create mode 100644 include/linux/qed/qed_eth_if.h

diff --git a/drivers/net/ethernet/qlogic/qed/Makefile b/drivers/net/ethernet/qlogic/qed/Makefile
index 6969b5c..5c2fd57 100644
--- a/drivers/net/ethernet/qlogic/qed/Makefile
+++ b/drivers/net/ethernet/qlogic/qed/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_QED) := qed.o
 
 qed-y := qed_cxt.o qed_dev.o qed_hw.o qed_init_fw_funcs.o qed_init_ops.o \
-qed_int.o qed_main.o qed_mcp.o qed_sp_commands.o qed_spq.o
+qed_int.o qed_main.o qed_mcp.o qed_sp_commands.o qed_spq.o qed_l2.o
diff --git a/drivers/net/ethernet/qlogic/qed/qed.h b/drivers/net/ethernet/qlogic/qed/qed.h
index a63ef31..e03371d 100644
--- a/drivers/net/ethernet/qlogic/qed/qed.h
+++ b/drivers/net/ethernet/qlogic/qed/qed.h
@@ -25,6 +25,7 @@
 #include 
 #include "qed_hsi.h"
 
+extern const struct qed_common_ops qed_common_ops_pass;
 #define DRV_MODULE_VERSION "8.4.0.0"
 
 #define MAX_HWFNS_PER_DEVICE    (4)
@@ -91,13 +92,22 @@ struct qed_qm_iids {
 
 enum QED_RESOURCES {
QED_SB,
+   QED_L2_QUEUE,
QED_VPORT,
+   QED_RSS_ENG,
QED_PQ,
QED_RL,
+   QED_MAC,
+   QED_VLAN,
QED_ILT,
QED_MAX_RESC,
 };
 
+enum QED_FEATURE {
+   QED_PF_L2_QUE,
+   QED_MAX_FEATURES,
+};
+
 struct qed_hw_info {
/* PCI personality */
enum qed_pci_personality personality;
@@ -105,6 +115,7 @@ struct qed_hw_info {
/* Resource Allocation scheme results */
u32 resc_start[QED_MAX_RESC];
u32 resc_num[QED_MAX_RESC];
+   u32 feat_num[QED_MAX_FEATURES];
 
 #define RESC_START(_p_hwfn, resc) ((_p_hwfn)->hw_info.resc_start[resc])
 #define RESC_NUM(_p_hwfn, resc) ((_p_hwfn)->hw_info.resc_num[resc])
@@ -266,6 +277,9 @@ struct qed_hwfn {
 
struct qed_mcp_info *mcp_info;
 
+   struct qed_hw_cid_data  *p_tx_cids;
+   struct qed_hw_cid_data  *p_rx_cids;
+
struct qed_dmae_info dmae_info;
 
/* QM init */
diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 5b84522..3243cb4 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -92,6 +92,15 @@ void qed_resc_free(struct qed_dev *cdev)
for_each_hwfn(cdev, i) {
struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
 
+   kfree(p_hwfn->p_tx_cids);
+   p_hwfn->p_tx_cids = NULL;
+   kfree(p_hwfn->p_rx_cids);
+   p_hwfn->p_rx_cids = NULL;
+   }
+
+   for_each_hwfn(cdev, i) {
+   struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
+
qed_cxt_mngr_free(p_hwfn);
qed_qm_info_free(p_hwfn);
qed_spq_free(p_hwfn);
@@ -202,6 +211,29 @@ int qed_resc_alloc(struct qed_dev *cdev)
if (!cdev->fw_data)
return -ENOMEM;
 
+   /* Allocate Memory for the Queue->CID mapping */
+   for_each_hwfn(cdev, i) {
+   struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
+   int tx_size = sizeof(struct qed_hw_cid_data) *
+RESC_NUM(p_hwfn, QED_L2_QUEUE);
+   int rx_size = sizeof(struct qed_hw_cid_data) *
+RESC_NUM(p_hwfn, QED_L2_QUEUE);
+
+   p_hwfn->p_tx_cids = kzalloc(tx_size, GFP_KERNEL);
+   if (!p_hwfn->p_tx_cids) {
+   DP_NOTICE(p_hwfn,
+ "Failed to allocate memory for Tx Cids\n");
+   goto alloc_err;
+   }
+
+   p_hwfn->p_rx_cids = kzalloc(rx_size, GFP_KERNEL);
+   if (!p_hwfn->p_rx_cids) {
+   DP_NOTICE(p_hwfn,
+ "Failed to allocate memory for Rx Cids\n");
+   goto

[PATCH net-next v8 09/10] qed: Add statistics support

2015-10-26 Thread Yuval Mintz

From: Manish Chopra 

Device statistics can be gathered on-demand. This adds the qed support for
reading the statistics [both function and port] from the device, and adds
to the public API a method for requesting the current statistics.

Signed-off-by: Manish Chopra 
Signed-off-by: Yuval Mintz 
Signed-off-by: Ariel Elior 
---
 drivers/net/ethernet/qlogic/qed/qed.h         |  14 ++
 drivers/net/ethernet/qlogic/qed/qed_dev.c     | 244 +-
 drivers/net/ethernet/qlogic/qed/qed_dev_api.h |   3 +
 drivers/net/ethernet/qlogic/qed/qed_hsi.h     |  30 
 drivers/net/ethernet/qlogic/qed/qed_l2.c      |   3 +
 include/linux/qed/qed_eth_if.h                |   3 +
 6 files changed, 296 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed.h b/drivers/net/ethernet/qlogic/qed/qed.h
index ca6cc8a..ac17d86 100644
--- a/drivers/net/ethernet/qlogic/qed/qed.h
+++ b/drivers/net/ethernet/qlogic/qed/qed.h
@@ -212,7 +212,20 @@ struct qed_qm_info {
u32 pf_rl;
 };
 
+struct storm_stats {
+   u32 address;
+   u32 len;
+};
+
+struct qed_storm_stats {
+   struct storm_stats mstats;
+   struct storm_stats pstats;
+   struct storm_stats tstats;
+   struct storm_stats ustats;
+};
+
 struct qed_fw_data {
+   struct fw_ver_info  *fw_ver_info;
const u8*modes_tree_buf;
union init_op   *init_ops;
const u32   *arr_data;
@@ -296,6 +309,7 @@ struct qed_hwfn {
 
/* QM init */
struct qed_qm_info  qm_info;
+   struct qed_storm_stats  storm_stats;
 
/* Buffer for unzipping firmware data */
void*unzip_buf;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_dev.c b/drivers/net/ethernet/qlogic/qed/qed_dev.c
index 7fd3d78..b9b7b7e 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_dev.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_dev.c
@@ -649,8 +649,10 @@ int qed_hw_init(struct qed_dev *cdev,
bool allow_npar_tx_switch,
const u8 *bin_fw_data)
 {
-   u32 load_code, param;
+   struct qed_storm_stats *p_stat;
+   u32 load_code, param, *p_address;
int rc, mfw_rc, i;
+   u8 fw_vport = 0;
 
rc = qed_init_fw_data(cdev, bin_fw_data);
if (rc != 0)
@@ -659,6 +661,10 @@ int qed_hw_init(struct qed_dev *cdev,
for_each_hwfn(cdev, i) {
struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
 
+   rc = qed_fw_vport(p_hwfn, 0, &fw_vport);
+   if (rc != 0)
+   return rc;
+
/* Enable DMAE in PXP */
rc = qed_change_pci_hwfn(p_hwfn, p_hwfn->p_main_ptt, true);
 
@@ -722,6 +728,25 @@ int qed_hw_init(struct qed_dev *cdev,
}
 
p_hwfn->hw_init_done = true;
+
+   /* init PF stats */
+   p_stat = &p_hwfn->storm_stats;
+   p_stat->mstats.address = BAR0_MAP_REG_MSDM_RAM +
+MSTORM_QUEUE_STAT_OFFSET(fw_vport);
+   p_stat->mstats.len = sizeof(struct eth_mstorm_per_queue_stat);
+
+   p_stat->ustats.address = BAR0_MAP_REG_USDM_RAM +
+USTORM_QUEUE_STAT_OFFSET(fw_vport);
+   p_stat->ustats.len = sizeof(struct eth_ustorm_per_queue_stat);
+
+   p_stat->pstats.address = BAR0_MAP_REG_PSDM_RAM +
+PSTORM_QUEUE_STAT_OFFSET(fw_vport);
+   p_stat->pstats.len = sizeof(struct eth_pstorm_per_queue_stat);
+
+   p_address = &p_stat->tstats.address;
+   *p_address = BAR0_MAP_REG_TSDM_RAM +
+TSTORM_PORT_STAT_OFFSET(MFW_PORT(p_hwfn));
+   p_stat->tstats.len = sizeof(struct tstorm_per_port_stat);
}
 
return 0;
@@ -1494,6 +1519,223 @@ void qed_chain_free(struct qed_dev *cdev,
  p_chain->p_phys_addr);
 }
 
+static void __qed_get_vport_stats(struct qed_dev *cdev,
+ struct qed_eth_stats  *stats)
+{
+   int i, j;
+
+   memset(stats, 0, sizeof(*stats));
+
+   for_each_hwfn(cdev, i) {
+   struct qed_hwfn *p_hwfn = &cdev->hwfns[i];
+   struct eth_mstorm_per_queue_stat mstats;
+   struct eth_ustorm_per_queue_stat ustats;
+   struct eth_pstorm_per_queue_stat pstats;
+   struct tstorm_per_port_stat tstats;
+   struct port_stats port_stats;
+   struct qed_ptt *p_ptt = qed_ptt_acquire(p_hwfn);
+
+   if (!p_ptt) {
+   DP_ERR(p_hwfn, "Failed to acquire ptt\n");
+   continue;
+   }
+
+   memset(&mstats, 0, sizeof(mstats));
+   qed_memcpy_from(p_hwfn, p_ptt, &mstats,
+

Re: [linuxwifi] [-next] WARNING at iwl_mvm_time_event_send_add+0x72/0x1b6

2015-10-26 Thread Grumbach, Emmanuel



On 10/26/2015 09:23 AM, Grumbach, Emmanuel wrote:
> Hi,
> 
> On 10/26/2015 08:41 AM, Sergey Senozhatsky wrote:
>> Hi,
>>
>> linux-next 20151022
>>
>>
> 
> Can it be reproduced reliably?
> Seems like a bad race between the end of session protection for the
> authentication and the start of the session protection for the deauth.
> I think I found the hole in the locks in there, but it is going to be
> tricky to solve.

Not sure if I found the race. Can you please send the complete log?
If you have timestamps, that would help greatly...
dmesg output should do.

> 
>> wlp2s0: aborting authentication with 00:04:96:61:cd:e0 by local choice 
>> (Reason: 3=DEAUTH_LEAVING)
>> [ cut here ]
>> WARNING: CPU: 0 PID: 1006 at 
>> drivers/net/wireless/iwlwifi/mvm/time-event.c:513 
>> iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]()
>> Modules linked in: mousedev arc4 nls_iso8859_1 nls_cp437 vfat fat serio_raw 
>> psmouse atkbd coretemp hwmon i915 libps2 iwlmvm i2c_algo_bit mac80211 
>> drm_kms_helper cfbfillrect intel_powerclamp syscopyarea cfbimgblt 
>> sysfillrect sysimgblt crc32c_intel fb_sys_fops cfbcopyarea iwlwifi drm r8
>> CPU: 0 PID: 1006 Comm: iwconfig Not tainted 
>> 4.3.0-rc6-next-20151022-dbg-2-g4041783-dirty #260
>>   8800c69479c8 811dd4ad 
>>  8800c6947a00 8103db4e a04fd261 88041c7cdfc8
>>  88041cc87a20 88041c7ceb28 8800c6947aac 8800c6947a10
>> Call Trace:
>>  [] dump_stack+0x4b/0x63
>>  [] warn_slowpath_common+0x99/0xb2
>>  [] ? iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]
>>  [] warn_slowpath_null+0x1a/0x1c
>>  [] iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]
>>  [] ? __lock_is_held+0x3c/0x57
>>  [] iwl_mvm_protect_session+0x150/0x219 [iwlmvm]
>>  [] ? iwl_mvm_protect_session+0x150/0x219 [iwlmvm]
>>  [] ? iwl_mvm_ref_sync+0x37/0x10c [iwlmvm]
>>  [] iwl_mvm_mac_mgd_prepare_tx+0xa4/0xc2 [iwlmvm]
>>  [] ? iwl_mvm_mac_mgd_prepare_tx+0xa4/0xc2 [iwlmvm]
>>  [] ieee80211_mgd_deauth+0x14f/0x3b0 [mac80211]
>>  [] ? __lock_is_held+0x3c/0x57
>>  [] ieee80211_deauth+0x18/0x1a [mac80211]
>>  [] cfg80211_mlme_deauth+0x13c/0x28e [cfg80211]
>>  [] cfg80211_disconnect+0xb5/0x2f7 [cfg80211]
>>  [] cfg80211_mgd_wext_siwfreq+0xed/0x160 [cfg80211]
>>  [] ? cfg80211_wext_freq+0x5f/0x5f [cfg80211]
>>  [] cfg80211_wext_siwfreq+0x76/0xf6 [cfg80211]
>>  [] ioctl_standard_call+0x66/0x376
>>  [] wext_handle_ioctl+0x102/0x16d
>>  [] dev_ioctl+0x6bb/0x6de
>>  [] ? handle_mm_fault+0xefc/0x13f9
>>  [] sock_ioctl+0x230/0x23c
>>  [] ? sock_ioctl+0x230/0x23c
>>  [] do_vfs_ioctl+0x458/0x4dc
>>  [] ? retint_user+0x18/0x20
>>  [] ? __fget_light+0x4d/0x71
>>  [] SyS_ioctl+0x43/0x61
>>  [] entry_SYSCALL_64_fastpath+0x12/0x6f
>> ---[ end trace 6a44e7f1588bdae7 ]---
>>
>>
>>  -ss
>> -
>> linuxw...@eclists.intel.com
>> https://eclists.intel.com/sympa/info/linuxwifi
>> Unsubscribe by sending email to sy...@eclists.intel.com with subject 
>> "Unsubscribe linuxwifi"
>>
> 

Re: [linuxwifi] [-next] WARNING at iwl_mvm_time_event_send_add+0x72/0x1b6

2015-10-26 Thread Sergey Senozhatsky

On (10/26/15 07:51), Grumbach, Emmanuel wrote:
> > On 10/26/2015 08:41 AM, Sergey Senozhatsky wrote:
> >> Hi,
> >>
> >> linux-next 20151022
> >>
> >>
> > 
> > Can it be reproduced reliably?
> > Seems like a bad race between the end of session protection for the
> > authentication and the start of the session protection for the deauth.
> > I think I found the hole in the locks in there, but it is going to be
> > tricky to solve.
> 
> Not sure if I found the race. Can you please send the complete log?
> If you have timestamps, that would help greatly...
> dmesg output should do.
> 

Hi,

Not really sure I can reproduce this one easily; I have seen it only once.

---

Oct 26 15:20:51  dhclient[399]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 7
Oct 26 15:20:58  dhclient[399]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 17
Oct 26 15:21:09  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 7
Oct 26 15:21:09  kernel: wlp2s0: authenticate with 00:04:96:69:0d:80
Oct 26 15:21:09  kernel: wlp2s0: send auth to 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:21:09  kernel: wlp2s0: authenticated
Oct 26 15:21:09  kernel: wlp2s0: associate with 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:21:09  kernel: wlp2s0: RX AssocResp from 00:04:96:69:0d:80 
(capab=0x11 status=0 aid=24)
Oct 26 15:21:09  kernel: wlp2s0: associated
Oct 26 15:21:09  kernel: IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes 
ready
Oct 26 15:21:09  kernel: wlp2s0: deauthenticating from 00:04:96:69:0d:80 by 
local choice (Reason: 3=DEAUTH_LEAVING)
Oct 26 15:21:12  kernel: wlp2s0: authenticate with 00:04:96:61:e9:f0
Oct 26 15:21:12  kernel: wlp2s0: send auth to 00:04:96:61:e9:f0 (try 1/3)
Oct 26 15:21:12  kernel: wlp2s0: authenticated
Oct 26 15:21:12  kernel: wlp2s0: associate with 00:04:96:61:e9:f0 (try 1/3)
Oct 26 15:21:12  kernel: wlp2s0: RX AssocResp from 00:04:96:61:e9:f0 
(capab=0x11 status=0 aid=16)
Oct 26 15:21:12  kernel: wlp2s0: associated
Oct 26 15:21:12  kernel: wlp2s0: deauthenticating from 00:04:96:61:e9:f0 by 
local choice (Reason: 3=DEAUTH_LEAVING)
Oct 26 15:21:16  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 10
Oct 26 15:21:22  kernel: wlp2s0: authenticate with 00:04:96:69:0d:80
Oct 26 15:21:22  kernel: wlp2s0: send auth to 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:21:22  kernel: wlp2s0: authenticated
Oct 26 15:21:22  kernel: wlp2s0: associate with 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:21:22  kernel: wlp2s0: RX AssocResp from 00:04:96:69:0d:80 
(capab=0x11 status=0 aid=25)
Oct 26 15:21:22  kernel: wlp2s0: associated
Oct 26 15:21:22  kernel: wlp2s0: deauthenticating from 00:04:96:69:0d:80 by 
local choice (Reason: 3=DEAUTH_LEAVING)
Oct 26 15:21:26  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 9
Oct 26 15:21:35  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 12
Oct 26 15:21:47  kernel: wlp2s0: authenticate with 00:04:96:69:0d:80
Oct 26 15:21:47  kernel: wlp2s0: send auth to 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:21:47  kernel: wlp2s0: authenticated
Oct 26 15:21:47  kernel: wlp2s0: associate with 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:21:47  kernel: wlp2s0: RX AssocResp from 00:04:96:69:0d:80 
(capab=0x11 status=0 aid=25)
Oct 26 15:21:47  kernel: wlp2s0: associated
Oct 26 15:21:47  kernel: wlp2s0: deauthenticating from 00:04:96:69:0d:80 by 
local choice (Reason: 3=DEAUTH_LEAVING)
Oct 26 15:21:47  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 18
Oct 26 15:22:05  dhclient[539]: DHCPDISCOVER on wlp2s0 to 255.255.255.255 port 
67 interval 5
Oct 26 15:22:10  dhclient[539]: No DHCPOFFERS received.
Oct 26 15:22:10  dhclient[539]: No working leases in persistent database - 
sleeping.
Oct 26 15:22:34  kernel: wlp2s0: authenticate with 00:04:96:69:0d:80
Oct 26 15:22:34  kernel: wlp2s0: send auth to 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:22:34  kernel: wlp2s0: authenticated
Oct 26 15:22:34  kernel: wlp2s0: associate with 00:04:96:69:0d:80 (try 1/3)
Oct 26 15:22:34  kernel: wlp2s0: RX AssocResp from 00:04:96:69:0d:80 
(capab=0x11 status=0 aid=29)
Oct 26 15:22:34  kernel: wlp2s0: associated
Oct 26 15:22:34  kernel: wlp2s0: deauthenticating from 00:04:96:69:0d:80 by 
local choice (Reason: 3=DEAUTH_LEAVING)
Oct 26 15:22:34  kernel: wlp2s0: authenticate with 00:04:96:61:cd:e0
Oct 26 15:22:34  kernel: wlp2s0: send auth to 00:04:96:61:cd:e0 (try 1/3)
Oct 26 15:22:34  kernel: wlp2s0: aborting authentication with 00:04:96:61:cd:e0 
by local choice (Reason: 3=DEAUTH_LEAVING)
Oct 26 15:22:34  kernel: [ cut here ]
Oct 26 15:22:34  kernel: WARNING: CPU: 0 PID: 1006 at 
drivers/net/wireless/iwlwifi/mvm/time-event.c:513 
iwl_mvm_time_event_send_add+0x72/0x1b6 [iwlmvm]()
Oct 26 15:22:34  kernel: Modules linked in: mousedev arc4 nls_iso8859_1 
nls_cp437 vfat fat serio_raw psmouse atkbd coretemp hwmon i915 libps2 iwlmvm 
i2c_algo_bit mac80211 drm_kms_helper cfbfillrect intel_powerclamp syscopyarea 
cfbimgblt

Re: Linux 4.2.4

2015-10-26 Thread Jozsef Kadlecsik

On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:

> On 25.10.2015 20:46, Jozsef Kadlecsik wrote:
> > Hi,
> > 
> > On Sun, 25 Oct 2015, Gerhard Wiesinger wrote:
> > 
> > > On 25.10.2015 10:46, Willy Tarreau wrote:
> > > > ipset *triggered* the problem. The whole stack dump would tell more.
> > > OK, find the stack traces in the bug report:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1272645
> > > 
> > > Kernel 4.1.10 triggered also a kernel dump when playing with ipset
> > > commands
> > > and IPv6, details in the bug report  
> > It seems to me it is an architecture-specific alignment issue. I don't
> > have a Cortex-A7 ARM hardware and qemu doesn't seem to support it either,
> > so I'm unable to reproduce it (ipset passes all my tests on my hardware,
> > including more complex ones than what breaks here). My first wild guess is
> > that the dynamic array of the element structure is not aligned properly.
> > Could you give a try to the next patch?
> > 
> > > diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
> > index afe905c..1cf357d 100644
> > --- a/net/netfilter/ipset/ip_set_hash_gen.h
> > +++ b/net/netfilter/ipset/ip_set_hash_gen.h
> > @@ -1211,6 +1211,9 @@ static const struct ip_set_type_variant mtype_variant
> > = {
> > .same_set = mtype_same_set,
> >   };
> >   +#define IP_SET_BASE_ALIGN(dtype) \
> > +   ALIGN(sizeof(struct dtype), __alignof__(struct dtype))
> > +
> >   #ifdef IP_SET_EMIT_CREATE
> >   static int
> >   IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
> > @@ -1319,12 +1322,12 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct
> > ip_set *set,
> >   #endif
> > > set->variant = &IPSET_TOKEN(HTYPE, 4_variant);
> > > set->dsize = ip_set_elem_len(set, tb,
> > > -   sizeof(struct IPSET_TOKEN(HTYPE, 4_elem)));
> > > +   IP_SET_BASE_ALIGN(IPSET_TOKEN(HTYPE, 4_elem)));
> >   #ifndef IP_SET_PROTO_UNDEF
> > } else {
> > > set->variant = &IPSET_TOKEN(HTYPE, 6_variant);
> > > set->dsize = ip_set_elem_len(set, tb,
> > > -   sizeof(struct IPSET_TOKEN(HTYPE, 6_elem)));
> > > +   IP_SET_BASE_ALIGN(IPSET_TOKEN(HTYPE, 6_elem)));
> > }
> >   #endif
> > if (tb[IPSET_ATTR_TIMEOUT]) {
> > 
> > > If that does not solve it, then could you help to narrow down the issue?
> > > Does the bug still appear if you remove the counter extension of the set?
> > 
> 
> Patch applied well, compiling ...
> 
> Interesting that it didn't happen before. The device has been in production
> for more than 2 months without any issue.

You mean the device was stable with the earlier kernels, but starting with 
4.2.3 (and back to 4.1.10) you have got problems, don't you?
 
> Also any idea regarding the second issue? Or do you think it has the 
> same root cause?

Looking at your RedHat bugzilla report, the "nf_conntrack: table full, 
dropping packet" and "Alignment trap: not handling instruction" are two 
unrelated issues. The second one is triggered by the unaligned counter 
extension access in ipset, which I'm investigating. I can't think of any 
way those issues could be related to each other.

> Greetings from Vienna, Austria :-)

Quite near to my place :-) 

> BTW: You can get the Banana Pi R1 for example at:
> http://www.aliexpress.com/item/BPI-R1-Set-1-R1-Board-Clear-Case-5dB-Antenna-Power-Adapter-Banana-PI-R1-Smart/32362127917.html
> I can really recommend it as a router. Power consumption is as low as 3W.
> Price is also IMHO very good.

Cool mini gear, indeed!

Best regards,
Jozsef
-
E-mail  : kad...@blackhole.kfki.hu, kadlecsik.joz...@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
  H-1525 Budapest 114, POB. 49, Hungary
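
The alignment concern in this thread can be illustrated outside the kernel.
The following is a minimal userspace sketch, not the ipset code: it assumes
an element whose base data is followed by an extension that needs stricter
alignment (for example 64-bit counters), and shows how the extension offset
is rounded up to that alignment. The name ext_offset and the sizes used are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Round base_len up to the next multiple of ext_align, which must be a
 * power of two. This is the same arithmetic the kernel's ALIGN() macro
 * performs; the function name is made up for this sketch.
 */
size_t ext_offset(size_t base_len, size_t ext_align)
{
    assert(ext_align != 0 && (ext_align & (ext_align - 1)) == 0);
    return (base_len + ext_align - 1) & ~(ext_align - 1);
}
```

With a 12-byte base element and an extension that needs 8-byte alignment,
ext_offset(12, 8) gives 16. Placing the extension directly at offset 12
would leave it only 4-byte aligned, which some ARM load/store instructions
do not tolerate.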

Re: [PATCH net-next] ipv6: recreate ipv6 link-local addresses when increasing MTU over IPV6_MIN_MTU

2015-10-26 Thread Jay Vosburgh

Hannes Frederic Sowa  wrote:

>Hello Alex,
>
>On Mon, Oct 26, 2015, at 16:52, Alexander Duyck wrote:
>> Seems like this code isn't quite correct.  You are calling ipv6_add_dev 
>> for slave devices, and if I understand things correctly I don't believe 
>> that was happening before and may be an unintended side effect.
>
>Ah, btw., autoconf and ipv6 operation on IFF_SLAVE devices is actually
>desired nowadays and I don't think we can change this. See also:
>

IPv6 addrconf on IFF_SLAVE devices was disabled for bonding
slaves in commit c2edacf80e15 because it caused issues with snooping
switches.

This is also referenced in

https://bugzilla.redhat.com/show_bug.cgi?id=236750

Won't re-enabling autoconf on IFF_SLAVE devices cause that issue
to return?

-J

---
-Jay Vosburgh, jay.vosbu...@canonical.com

Re: [PATCH net-next] ipv6: recreate ipv6 link-local addresses when increasing MTU over IPV6_MIN_MTU

2015-10-26 Thread Hannes Frederic Sowa

Hi,

On Mon, Oct 26, 2015, at 20:16, Jay Vosburgh wrote:
> Hannes Frederic Sowa  wrote:
> 
> >Hello Alex,
> >
> >On Mon, Oct 26, 2015, at 16:52, Alexander Duyck wrote:
> >> Seems like this code isn't quite correct.  You are calling ipv6_add_dev 
> >> for slave devices, and if I understand things correctly I don't believe 
> >> that was happening before and may be an unintended side effect.
> >
> >Ah, btw., autoconf and ipv6 operation on IFF_SLAVE devices is actually
> >desired nowadays and I don't think we can change this. See also:
> >
> 
>   IPv6 addrconf on IFF_SLAVE devices was disabled for bonding
> slaves in commit c2edacf80e15 because it caused issues with snooping
> switches.
> 
>   This is also referenced in
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=236750
> 
>   Won't re-enabling autoconf on IFF_SLAVE devices cause that issue
> to return?

Neither patch enables autoconf on IFF_SLAVE devices. Sorry for being
imprecise. The patch I referred to changed the behavior to depend on
whether the device has a master device.

@Alex, I will take your patch and submit it with the necessary guards to
not enable ipv6 again if we forcefully disable ipv6 and later on shrink
and increase the MTU again. I will do so in your name. Thanks again for
the patch!

Bye,
Hannes

Re: [PATCH net] ipv6: no CHECKSUM_PARTIAL on skbs with extension headers and recalc checksum during fragmentation

2015-10-26 Thread Hannes Frederic Sowa

On Mon, Oct 26, 2015, at 20:39, Tom Herbert wrote:
> On Mon, Oct 26, 2015 at 11:44 AM, Hannes Frederic Sowa
>  wrote:
> >
> >
> > On Mon, Oct 26, 2015, at 15:19, Tom Herbert wrote:
> >> > We already concluded that drivers do have this problem and not the stack
> >> > above ip6_fragment. The places I am aware of I fixed in this patch. Also
> >> > IPv4 to me seems unaffected, albeit one can certainly clean up the logic
> >> > in net-next.
> >> >
> >> I don't understand why checksum for IP fragments is a driver problem.
> >> When fragments are sent to driver they should never have
> >> CHECKSUM_PARTIAL set (or maybe that is what you are seeing?).
> >
> > Because either the drivers or the hardware does not correctly iterate
> > over the extension headers to fetch the final nexthdr field which is
> > used to compute the checksum. This is different from IPv4.
> >
> > I can only guess e.g. from the e1000e driver:
> >
> > case cpu_to_be16(ETH_P_IPV6):
> > /* XXX not handling all IPV6 headers */
> > if (ipv6_hdr(skb)->nexthdr == IPPROTO_TCP)
> > cmd_len |= E1000_TXD_CMD_TCP;
> > break;
> >
> Yes, but in the case of a fragment that code should never be hit since
> ip_summed shouldn't be CHECKSUM_PARTIAL for a fragment (maybe after
> the fix in ip_output). For other cases of extension headers the e1000e
> is broken since it apparently does call skb_checksum_help for
> protocols it doesn't understand (the /* XXX not handling all IPV6
> headers */ comment is worrisome!)

Agreed! I am testing with a WARN_ON_ONCE in ip6_fragment to see whether I
can hit another path where we would have to call skb_checksum_help. I will
review IPv4 tomorrow to see whether we need corresponding changes there.

Bye,
Hannes
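
The e1000e snippet quoted above looks only at the first nexthdr value. A
minimal sketch of what walking the extension header chain involves is given
below. It assumes a contiguous buffer starting at the IPv6 fixed header;
final_nexthdr and the NH_* names are made up for this illustration, and this
is not the kernel's own helper. Header lengths follow the IPv6 rules: most
extension headers carry a length in 8-octet units excluding the first 8
octets, the Fragment header is a fixed 8 octets, and AH counts 4-octet units
minus 2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    NH_HOPOPTS = 0, NH_ROUTING = 43, NH_FRAGMENT = 44,
    NH_AH = 51, NH_DSTOPTS = 60
};

/*
 * Return the next-header value of the first header that is not a known
 * extension header, and store its offset in *l4_off.
 * Return -1 if the buffer is too short.
 */
int final_nexthdr(const uint8_t *pkt, size_t len, size_t *l4_off)
{
    size_t off = 40;            /* size of the IPv6 fixed header */
    uint8_t nh;

    assert(pkt && l4_off);
    if (len < 40)
        return -1;
    nh = pkt[6];                /* Next Header field of the fixed header */

    for (;;) {
        size_t hlen;

        switch (nh) {
        case NH_HOPOPTS:
        case NH_ROUTING:
        case NH_DSTOPTS:
            if (off + 2 > len)
                return -1;
            hlen = ((size_t)pkt[off + 1] + 1) * 8;
            break;
        case NH_FRAGMENT:
            hlen = 8;
            break;
        case NH_AH:
            if (off + 2 > len)
                return -1;
            hlen = ((size_t)pkt[off + 1] + 2) * 4;
            break;
        default:
            *l4_off = off;
            return nh;
        }
        if (off + hlen > len)
            return -1;
        nh = pkt[off];          /* first octet is always Next Header */
        off += hlen;
    }
}
```

For a packet carrying hop-by-hop and destination options before TCP, a check
of the fixed header's nexthdr alone sees 0, while the walk above ends at 6.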

Re: ip_no_pmtu_disc and UDP

2015-10-26 Thread Vincent Li

OK, I observed that if I increase the UDP client packet size above the
local interface MTU of 1500, the client fragments the packet first and
then sends it out. If the UDP client packet size is below the local
interface MTU of 1500, the DF bit is set when ip_no_pmtu_disc is set
to 0. Is this expected behavior?
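
For reference, the sysctl only provides the default; a socket can choose its
own behavior with the IP_MTU_DISCOVER option described in ip(7). There,
IP_PMTUDISC_WANT is documented as fragmenting a datagram if needed according
to the path MTU and setting the DF flag otherwise, which would be consistent
with the observation above. A minimal sketch, assuming a Linux system; the
helper names set_udp_df and get_pmtudisc are made up for this example:

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Force DF on (IP_PMTUDISC_DO) or off (IP_PMTUDISC_DONT) for one socket. */
int set_udp_df(int fd, int want_df)
{
    int val = want_df ? IP_PMTUDISC_DO : IP_PMTUDISC_DONT;

    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, sizeof(val));
}

/* Read back the current per-socket setting, or -1 on error. */
int get_pmtudisc(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &val, &len) < 0)
        return -1;
    return val;
}
```

Setting the option explicitly in the test client takes the sysctl out of the
picture, which may help when comparing captures on the client and the router.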



On Mon, Oct 26, 2015 at 3:35 PM, Vincent Li  wrote:
> I tested again and I do see the DF bit now, which is weird. I am going
> to do more tests; sorry for the noise.
>
> On Mon, Oct 26, 2015 at 3:12 PM, Hannes Frederic Sowa
>  wrote:
>> Hello,
>>
>> On Mon, Oct 26, 2015, at 23:00, Vincent Li wrote:
>>> The UDP packet size is about 768. Here is what the packet path looks like:
>>>
>>> client: eth0 mtu 1500 ip 10.3.72.69
>>> router: eth0 mtu 1500 ip 10.3.72.1, eth1.1102 mtu 567 ip 10.2.72.139
>>> server: eth0 mtu 1500 ip 10.2.72.99
>>>
>>> UDP client test script:
>>>
>>> [...]
>>>
>>> So if I echo 0, 1, 2, and 3 in turn to
>>> /proc/sys/net/ipv4/ip_no_pmtu_disc, I expect to see the DF bit set or
>>> cleared by the client, and the tcpdump on the router's eth0 interface
>>> should show that. Instead, the DF bit is never set by the client. Am I
>>> misunderstanding something?
>>
>> This is strange...
>>
>> Can you please capture traffic on eth0 on the client?
>>
>> For outgoing packets only zero or non-zero matter. A '0' definitely
>> generates a UDP packet with a DF bit on my side, anything else a frame
>> with DF bit cleared. I just verified this on net-next with your script.
>> It also does not cause any setsockopts but uses the default.
>>
>>> for example:
>>>
>>>  two concurrent tcpdump on router eth0 (mtu 1500) and eth1.1102 (mtu
>>> 576) interface:
>>>
>>> 1 #tcpdump -nn -i eth0 -v udp and host 10.3.72.69 &
>>>
>>> 14:51:11.946143 IP (tos 0x0, ttl 64, id 7193, offset 0, flags [none],
>>> proto UDP (17), length 796)
>>> 10.3.72.69.43748 > 10.2.72.99.: UDP, length 768
>>>
>>
>> As I said, I cannot reproduce that. :( Please test on eth0 directly so
>> we can be sure the packet does not get mangled.
>>
>> Can you also show me the output of
>> ip route get 10.2.72.139
>> on the client after you maybe already received a icmp pkt-too-big
>> packet?
>>
>> Thanks,
>> Hannes
>>
>>
>>
>>

Re: [PATCH RFT v2] sh_eth: fix kernel oops in skb_put()

2015-10-26 Thread Yasushi SHOJI

Hi Sergei,

Thank you for your patch!

On Sun, 25 Oct 2015 07:42:33 +0900,
Sergei Shtylyov wrote:
> 
> In a low memory situation the following kernel oops occurs:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0050
> pgd = 8490c000
> [0050] *pgd=4651e831, *pte=, *ppte=
> Internal error: Oops: 17 [#1] PREEMPT ARM
> Modules linked in:
> CPU: 0Not tainted  (3.4-at16 #9)
> PC is at skb_put+0x10/0x98
> LR is at sh_eth_poll+0x2c8/0xa10
> pc : [<8035f780>]lr : [<8028bf50>]psr: 6113
> sp : 84eb1a90  ip : 84eb1ac8  fp : 84eb1ac4
> r10: 003f  r9 : 05ea  r8 : 
> r7 :   r6 : 940453b0  r5 : 0003  r4 : 9381b180
> r3 :   r2 :   r1 : 05ea  r0 : 
> Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 10c53c7d  Table: 4248c059  DAC: 0015
> Process klogd (pid: 2046, stack limit = 0x84eb02e8)
> [...]
> 
> This is because netdev_alloc_skb() fails and 'mdp->rx_skbuff[entry]' is left
> NULL but sh_eth_rx() later uses it without checking. Add such check...
> 
> Reported-by: Yasushi SHOJI 
> Signed-off-by: Sergei Shtylyov 
> 
> ---
> This patch is against DaveM's 'net.git' repo.
> 
>  drivers/net/ethernet/renesas/sh_eth.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> Index: net/drivers/net/ethernet/renesas/sh_eth.c
> ===
> --- net.orig/drivers/net/ethernet/renesas/sh_eth.c
> +++ net/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1481,6 +1481,7 @@ static int sh_eth_rx(struct net_device *
>   if (mdp->cd->shift_rd0)
>   desc_status >>= 16;
>  
> + skb = mdp->rx_skbuff[entry];
>   if (desc_status & (RD_RFS1 | RD_RFS2 | RD_RFS3 | RD_RFS4 |
>  RD_RFS5 | RD_RFS6 | RD_RFS10)) {
>   ndev->stats.rx_errors++;
> @@ -1496,12 +1497,11 @@ static int sh_eth_rx(struct net_device *
>   ndev->stats.rx_missed_errors++;
>   if (desc_status & RD_RFS10)
>   ndev->stats.rx_over_errors++;
> - } else {
> + } else  if (skb) {
>   if (!mdp->cd->hw_swap)
>   sh_eth_soft_swap(
>   phys_to_virt(ALIGN(rxdesc->addr, 4)),
>   pkt_len + 2);
> - skb = mdp->rx_skbuff[entry];
>   mdp->rx_skbuff[entry] = NULL;
>   if (mdp->cd->rpadir)
>   skb_reserve(skb, NET_IP_ALIGN);
> 

This certainly prevents the bad access; however, something odd is
going on.  Once I hit a low memory situation with this patch, network
throughput and response times become very bad.

telnet, ping, and wget take a long time.

PING 172.16.2.13 (172.16.2.13) 56(84) bytes of data.
64 bytes from 172.16.2.13: icmp_seq=5 ttl=64 time=0.223 ms
64 bytes from 172.16.2.13: icmp_seq=6 ttl=64 time=0.195 ms
64 bytes from 172.16.2.13: icmp_seq=7 ttl=64 time=0.203 ms
64 bytes from 172.16.2.13: icmp_seq=8 ttl=64 time=0.219 ms
64 bytes from 172.16.2.13: icmp_seq=9 ttl=64 time=0.165 ms
64 bytes from 172.16.2.13: icmp_seq=10 ttl=64 time=0.171 ms
64 bytes from 172.16.2.13: icmp_seq=1 ttl=64 time=9023 ms
64 bytes from 172.16.2.13: icmp_seq=2 ttl=64 time=8022 ms
64 bytes from 172.16.2.13: icmp_seq=3 ttl=64 time=7014 ms
64 bytes from 172.16.2.13: icmp_seq=4 ttl=64 time=6006 ms

I'll investigate it.
-- 
  yashi
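
The pattern the patch relies on can be shown in isolation. This is a
minimal userspace sketch under stated assumptions, not the sh_eth code: the
ring, buffer type, and function names are all hypothetical. A slot whose
refill failed is left NULL, and the receive loop skips and counts such slots
instead of dereferencing them.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_RING_SIZE 4

struct demo_buf {
    int len;
};

struct demo_ring {
    struct demo_buf *slot[DEMO_RING_SIZE]; /* NULL if refill failed */
    unsigned int skipped;                  /* slots with no buffer */
};

/* Hand every populated slot to the caller; return how many were taken. */
int demo_rx(struct demo_ring *ring, struct demo_buf **out)
{
    int i, n = 0;

    assert(ring && out);
    for (i = 0; i < DEMO_RING_SIZE; i++) {
        struct demo_buf *buf = ring->slot[i];

        if (!buf) {
            /* Nothing was allocated here; do not touch it. */
            ring->skipped++;
            continue;
        }
        ring->slot[i] = NULL;   /* ownership moves to the caller */
        out[n++] = buf;
    }
    return n;
}
```

Counting the skipped slots is only a suggestion for debugging: it makes it
visible how often the receive path runs without a buffer, which may help
when looking into the slowdown described above.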

Re: [PATCH net-next] net: dsa: bcm_sf2: Unhardcode port numbers

2015-10-26 Thread David Miller

From: Florian Fainelli 
Date: Fri, 23 Oct 2015 12:11:08 -0700

> While the current driver mostly supports BCM7445 which has a hardcoded
> location for its MoCA port on port 7 and port 0 for its internal PHY,
> this is not necessarily true for all other chips out there such as
> BCM3390 for instance.
> 
> Walk the list of ports from Device Tree, get their port number ("reg"
> property), and then parse the "phy-mode" property and initialize two
> internal variables: moca_port and a bitmask of internal PHYs. Since we
> use interrupts for the MoCA port, we introduce two helper functions to
> enable/disable interrupts and do this at the appropriate bank (INTRL2_0
> or INTRL2_1).
> 
> Signed-off-by: Florian Fainelli 

Applied.
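
The approach in the commit message (walk the ports, read each port number
and PHY mode, record the MoCA port and a bitmask of internal PHYs) can be
sketched in plain C. This is an illustration only: the types and names
below are hypothetical, and the table of ports stands in for what the
driver reads from Device Tree.

```c
#include <assert.h>
#include <stddef.h>

enum demo_phy_mode { DEMO_PHY_INTERNAL, DEMO_PHY_MOCA, DEMO_PHY_OTHER };

struct demo_port {
    unsigned int reg;           /* port number ("reg" property) */
    enum demo_phy_mode mode;    /* parsed "phy-mode" property */
};

struct demo_switch {
    int moca_port;              /* -1 if no MoCA port was found */
    unsigned int int_phy_mask;  /* bit N set: port N has an internal PHY */
};

void demo_identify_ports(struct demo_switch *sw,
                         const struct demo_port *ports, size_t n)
{
    size_t i;

    assert(sw && ports);
    sw->moca_port = -1;
    sw->int_phy_mask = 0;

    for (i = 0; i < n; i++) {
        if (ports[i].reg >= 32)
            continue;           /* does not fit in the bitmask */

        switch (ports[i].mode) {
        case DEMO_PHY_INTERNAL:
            sw->int_phy_mask |= 1u << ports[i].reg;
            break;
        case DEMO_PHY_MOCA:
            sw->moca_port = (int)ports[i].reg;
            break;
        default:
            break;
        }
    }
}
```

With this shape, code that previously tested for a fixed port number would
compare against moca_port or test a bit in int_phy_mask instead.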

Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-26 Thread Yuchung Cheng

On Mon, Oct 26, 2015 at 2:35 PM, Andreas Petlund  wrote:
>
>
> > On 26 Oct 2015, at 15:50, Neal Cardwell  wrote:
> >
> > On Fri, Oct 23, 2015 at 4:50 PM, Bendik Rønning Opstad
> >  wrote:
> >> @@ -2409,6 +2412,15 @@ static int do_tcp_setsockopt(struct sock *sk, int 
> >> level,
> > ...
> >> +   case TCP_RDB:
> >> +   if (val < 0 || val > 1) {
> >> +   err = -EINVAL;
> >> +   } else {
> >> +   tp->rdb = val;
> >> +   tp->nonagle = val;
> >
> > The semantics of the tp->nonagle bits are already a bit complex. My
> > sense is that having a setsockopt of TCP_RDB transparently modify the
> > nagle behavior is going to add more extra complexity and unanticipated
> > behavior than is warranted given the slight possible gain in
> > convenience to the app writer. What about a model where the
> > application user just needs to remember to call
> > setsockopt(TCP_NODELAY) if they want the TCP_RDB behavior to be
> > sensible? I see your nice tests at
> >
> >   
> > https://github.com/bendikro/packetdrill/commit/9916b6c53e33dd04329d29b7d8baf703b2c2ac1b
> >
> > are already doing that. And my sense is that likewise most
> > well-engineered "thin stream" apps will already be using
> > setsockopt(TCP_NODELAY). Is that workable?
>
> We have been discussing this a bit back and forth. Your suggestion would be 
> the right thing to keep the nagle semantics less complex and to educate 
> developers in the intrinsics of the transport.
>
> We ended up choosing to implicitly disable nagle since it
> 1) is incompatible with the logic of RDB.
> 2) leaving it up to the developer to read the documentation and register the 
> line saying that "failing to set TCP_NODELAY will void the RDB latency gain" 
> will increase the chance of misconfigurations leading to deployment with no 
> effect.
>
> The hope was to help both the well-engineered thin-stream apps and the ones 
> deployed by developers with less detailed knowledge of the transport.
but would RDB be voided if this developer turns on RDB then turns on
Nagle later?

>
> -Andreas
>
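
To make the question above concrete, here is a small userspace model of the two socket options. It is only a sketch: struct model_sock and the model_* helpers are hypothetical names, not kernel code, and the only behaviour taken from the patch is that setting TCP_RDB also writes tp->nonagle.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative model only: this is not the kernel's struct tcp_sock. */
struct model_sock {
	bool rdb;      /* state of the proposed TCP_RDB option */
	bool nonagle;  /* true while Nagle is disabled (TCP_NODELAY set) */
};

/* Mirrors the quoted hunk: enabling RDB also overwrites nonagle. */
static int model_set_rdb(struct model_sock *sk, int val)
{
	if (val < 0 || val > 1)
		return -EINVAL;
	sk->rdb = val;
	sk->nonagle = val;
	return 0;
}

/* A later TCP_NODELAY call changes nonagle without looking at rdb. */
static void model_set_nodelay(struct model_sock *sk, int val)
{
	sk->nonagle = (val != 0);
}

/* RDB's latency gain is only expected while Nagle stays disabled. */
static bool model_rdb_effective(const struct model_sock *sk)
{
	return sk->rdb && sk->nonagle;
}
```

In this model, calling model_set_rdb(&sk, 1) and then model_set_nodelay(&sk, 0) leaves rdb set while model_rdb_effective() reports false, which is the case being asked about.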

Re: [PATCH net] ipv6: no CHECKSUM_PARTIAL on skbs with extension headers and recalc checksum during fragmentation

2015-10-26 Thread Tom Herbert

On Mon, Oct 26, 2015 at 11:44 AM, Hannes Frederic Sowa
 wrote:
>
>
> On Mon, Oct 26, 2015, at 15:19, Tom Herbert wrote:
>> > We already concluded that drivers do have this problem and not the stack
>> > above ip6_fragment. The places I am aware of I fixed in this patch. Also
>> > IPv4 to me seems unaffected, albeit one can certainly clean up the logic
>> > in net-next.
>> >
>> I don't understand why checksum for IP fragments is a driver problem.
>> When fragments are sent to driver they should never have
>> CHECKSUM_PARTIAL set (or maybe that is what you are seeing?).
>
> Because either the drivers or the hardware does not correctly iterate
> over the extension headers to fetch the final nexthdr field which is
> used to compute the checksum. This is different from IPv4.
>
> I can only guess e.g. from the e1000e driver:
>
> case cpu_to_be16(ETH_P_IPV6):
> /* XXX not handling all IPV6 headers */
> if (ipv6_hdr(skb)->nexthdr == IPPROTO_TCP)
> cmd_len |= E1000_TXD_CMD_TCP;
> break;
>
Yes, but in the case of a fragment that code should never be hit since
ip_summed shouldn't be CHECKSUM_PARTIAL for a fragment (maybe after
the fix in ip_output). For other cases of extension headers the e1000e
is broken since it apparently does not call skb_checksum_help for
protocols it doesn't understand (the /* XXX not handling all IPV6
headers */ comment is worrisome!)

Tom
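
For reference, this is roughly what "iterating over the extension headers to fetch the final nexthdr" means. The sketch below is plain userspace C over a byte buffer, not driver or stack code; final_nexthdr() and the EXT_* names are made up for illustration, and it ignores the fact that non-first fragments carry no upper-layer header.

```c
#include <stddef.h>
#include <stdint.h>

/* IANA protocol numbers of the IPv6 extension headers handled here. */
enum {
	EXT_HOPOPTS  = 0,
	EXT_ROUTING  = 43,
	EXT_FRAGMENT = 44,
	EXT_AUTH     = 51,
	EXT_DSTOPTS  = 60,
};

/*
 * Walk the extension header chain that follows the fixed IPv6 header.
 * 'nexthdr' is the Next Header value from the fixed header, 'p' points
 * just past that header and 'len' is the number of bytes available.
 * Returns the upper-layer protocol number, or -1 if the chain is truncated.
 */
static int final_nexthdr(uint8_t nexthdr, const uint8_t *p, size_t len)
{
	size_t off = 0;

	for (;;) {
		size_t hdrlen;

		switch (nexthdr) {
		case EXT_HOPOPTS:
		case EXT_ROUTING:
		case EXT_DSTOPTS:
			if (len - off < 2)
				return -1;
			/* length field counts 8-byte units beyond the first */
			hdrlen = ((size_t)p[off + 1] + 1) * 8;
			break;
		case EXT_FRAGMENT:
			if (len - off < 2)
				return -1;
			hdrlen = 8;	/* fixed size */
			break;
		case EXT_AUTH:
			if (len - off < 2)
				return -1;
			/* AH length field counts 4-byte units minus two */
			hdrlen = ((size_t)p[off + 1] + 2) * 4;
			break;
		default:
			return nexthdr;	/* not an extension header: done */
		}
		if (hdrlen > len - off)
			return -1;
		nexthdr = p[off];	/* first byte is always Next Header */
		off += hdrlen;
	}
}
```

A check like the quoted e1000e one only looks at the first nexthdr value, so any packet with an extension header in front of TCP takes the default branch there.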

[PATCH v7 3/3] geneve: add IPv6 bits to geneve_fill_metadata_dst

2015-10-26 Thread John W. Linville

Signed-off-by: John W. Linville 
---
v7 -- initial version (numbered to match earlier patches in series)

 drivers/net/geneve.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 44e724508c55..be532d7b879d 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1006,16 +1006,30 @@ static int geneve_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
struct geneve_dev *geneve = netdev_priv(dev);
struct rtable *rt;
struct flowi4 fl4;
+#if IS_ENABLED(CONFIG_IPV6)
+   struct dst_entry *dst;
+   struct flowi6 fl6;
+#endif
 
-   if (ip_tunnel_info_af(info) != AF_INET)
-   return -EINVAL;
+   if (ip_tunnel_info_af(info) == AF_INET) {
+   rt = geneve_get_v4_rt(skb, dev, &fl4, info);
+   if (IS_ERR(rt))
+   return PTR_ERR(rt);
 
-   rt = geneve_get_v4_rt(skb, dev, &fl4, info);
-   if (IS_ERR(rt))
-   return PTR_ERR(rt);
+   ip_rt_put(rt);
+   info->key.u.ipv4.src = fl4.saddr;
+#if IS_ENABLED(CONFIG_IPV6)
+   } else if (ip_tunnel_info_af(info) == AF_INET6) {
+   dst = geneve_get_v6_dst(skb, dev, &fl6, info);
+   if (IS_ERR(dst))
+   return PTR_ERR(dst);
+
+   dst_release(dst);
+   info->key.u.ipv6.src = fl6.saddr;
+#endif
+   } else
+   return -EINVAL;
 
-   ip_rt_put(rt);
-   info->key.u.ipv4.src = fl4.saddr;
info->key.tp_src = udp_flow_src_port(geneve->net, skb,
 1, USHRT_MAX, true);
info->key.tp_dst = geneve->dst_port;
-- 
2.4.3


Re: [PATCH v3 net-next] bpf: fix bpf_perf_event_read() helper

2015-10-26 Thread Alexei Starovoitov


On 10/26/15 5:54 AM, Wangnan (F) wrote:
> On 2015/10/26 20:32, Peter Zijlstra wrote:
>> On Sun, Oct 25, 2015 at 09:23:36AM -0700, Alexei Starovoitov wrote:
>>> bpf_perf_event_read() muxes of -EINVAL into return value, but it's non
>>> ambiguous to the program whether it got an error or real counter value.
>>
>> How can that be, the (u64)-EINVAL value is a valid counter value..
>> unlikely maybe, but still quite possible.
>
> In our real usecase we simply treat return value larger than
> 0x7fff
> as error result. We can make it even larger, for example, to
> 0x.

Either do the above, or write the program so that the index is always
valid; then you don't need to check for errors.

> Resulting values can be pre-processed by a script to filter potential
> error results out, so it is not a very big problem for our real usecases.
>
> For a better interface, I suggest
>
>   u64 bpf_perf_event_read(bool *perror);
>
> which still returns counter value through its return value but puts the
> error code on the stack. Then a BPF program can pass NULL to the function
> if it doesn't want to deal with the error code.

no. we're not going to introduce another interface for this.
The current one is fine. Don't pass incorrect index and you won't see
einval. Returning ints or bools via stack is much slower.
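
To illustrate the ambiguity Peter points at, here is a small userspace model, not the BPF helper itself. The names read_counter() and looks_like_error() are hypothetical, and the 4095 threshold is an assumption borrowed from the kernel's MAX_ERRNO convention rather than anything stated in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_COUNTERS 4

/* Stand-in for the perf event array; the values are made up by the caller. */
static uint64_t counters[NR_COUNTERS];

/* In-band error reporting: a bad index comes back as (u64)-EINVAL. */
static uint64_t read_counter(unsigned int idx)
{
	if (idx >= NR_COUNTERS)
		return (uint64_t)(-EINVAL);
	return counters[idx];
}

/* Caller-side check in the spirit of "treat very large values as errors". */
static bool looks_like_error(uint64_t val)
{
	return val >= (uint64_t)(-4095);
}
```

A valid index holding an ordinary count passes the check, an out-of-range index is flagged, and a counter that genuinely holds (u64)-EINVAL is flagged as well. That last case is the unlikely but possible collision; a program that never passes a bad index does not need the check at all.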


[PATCHv2 net 0/2] ipv4: fix problems from the RTNH_F_LINKDOWN introduction

2015-10-26 Thread Julian Anastasov

Fix two problems from the change that introduced RTNH_F_LINKDOWN
flag. The first patch deals with the removal of local route on
DOWN event. The second patch makes sure the RTNH_F_LINKDOWN
flag is properly updated on UP event because the DOWN event
sets it in all cases.

v1->v2:
- forgot to add ifconfig dummy0 down in the test case
- split to 2 patches

Julian Anastasov (2):
  ipv4: fix to not remove local route on link down
  ipv4: update RTNH_F_LINKDOWN flag on UP event

 include/net/ip_fib.h |  2 +-
 net/ipv4/fib_frontend.c  | 13 +++--
 net/ipv4/fib_semantics.c | 18 +++---
 3 files changed, 23 insertions(+), 10 deletions(-)

-- 
1.9.3


[PATCHv2 net 1/2] ipv4: fix to not remove local route on link down

2015-10-26 Thread Julian Anastasov

When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
we should not delete the local routes if the local address
is still present. The confusion comes from the fact that both
fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
constant. Fix it by bringing back the 'force' argument.

Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up
ifconfig dummy0 down
ip route list table local | grep dummy | grep host
local 192.168.168.1 dev dummy0  proto kernel  scope host  src 192.168.168.1

Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov 
---
 include/net/ip_fib.h |  2 +-
 net/ipv4/fib_frontend.c  | 13 +++--
 net/ipv4/fib_semantics.c | 11 ---
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 727d6e9..654aec1 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -317,7 +317,7 @@ void fib_flush_external(struct net *net);
 
 /* Exported by fib_semantics.c */
 int ip_fib_check_default(__be32 gw, struct net_device *dev);
-int fib_sync_down_dev(struct net_device *dev, unsigned long event);
+int fib_sync_down_dev(struct net_device *dev, unsigned long event, int force);
 int fib_sync_down_addr(struct net *net, __be32 local);
 int fib_sync_up(struct net_device *dev, unsigned int nh_flags);
 void fib_select_multipath(struct fib_result *res);
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 690bcbc..4826a22 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -1110,9 +1110,10 @@ static void nl_fib_lookup_exit(struct net *net)
net->ipv4.fibnl = NULL;
 }
 
-static void fib_disable_ip(struct net_device *dev, unsigned long event)
+static void fib_disable_ip(struct net_device *dev, unsigned long event,
+  int force)
 {
-   if (fib_sync_down_dev(dev, event))
+   if (fib_sync_down_dev(dev, event, force))
fib_flush(dev_net(dev));
rt_cache_flush(dev_net(dev));
arp_ifdown(dev);
@@ -1140,7 +1141,7 @@ static int fib_inetaddr_event(struct notifier_block *this, unsigned long event,
/* Last address was deleted from this interface.
 * Disable IP.
 */
-   fib_disable_ip(dev, event);
+   fib_disable_ip(dev, event, 1);
} else {
rt_cache_flush(dev_net(dev));
}
@@ -1157,7 +1158,7 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
unsigned int flags;
 
if (event == NETDEV_UNREGISTER) {
-   fib_disable_ip(dev, event);
+   fib_disable_ip(dev, event, 2);
rt_flush_dev(dev);
return NOTIFY_DONE;
}
@@ -1178,14 +1179,14 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo
rt_cache_flush(net);
break;
case NETDEV_DOWN:
-   fib_disable_ip(dev, event);
+   fib_disable_ip(dev, event, 0);
break;
case NETDEV_CHANGE:
flags = dev_get_flags(dev);
if (flags & (IFF_RUNNING | IFF_LOWER_UP))
fib_sync_up(dev, RTNH_F_LINKDOWN);
else
-   fib_sync_down_dev(dev, event);
+   fib_sync_down_dev(dev, event, 0);
/* fall through */
case NETDEV_CHANGEMTU:
rt_cache_flush(net);
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 064bd3c..f493eff 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1281,7 +1281,13 @@ int fib_sync_down_addr(struct net *net, __be32 local)
return ret;
 }
 
-int fib_sync_down_dev(struct net_device *dev, unsigned long event)
+/* Event              force  Flags           Description
+ * NETDEV_CHANGE      0      LINKDOWN        Carrier OFF, not for scope host
+ * NETDEV_DOWN        0      LINKDOWN|DEAD   Link down, not for scope host
+ * NETDEV_DOWN        1      LINKDOWN|DEAD   Last address removed
+ * NETDEV_UNREGISTER  2      LINKDOWN|DEAD   Device removed
+ */
+int fib_sync_down_dev(struct net_device *dev, unsigned long event, int force)
 {
int ret = 0;
int scope = RT_SCOPE_NOWHERE;
@@ -1290,8 +1296,7 @@ int fib_sync_down_dev(struct net_device *dev, unsigned long event)
struct hlist_head *head = &fib_info_devhash[hash];
struct fib_nh *nh;
 
-   if (event == NETDEV_UNREGISTER ||
-   event == NETDEV_DOWN)
+   if (force)
scope = -1;
 
hlist_for_each_entry(nh, head, nh_hash) {
-- 
1.9.3
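
The effect of 'force' can be sketched outside the kernel as follows. This is a model under assumptions, not the patched function: it assumes the loop in fib_sync_down_dev() skips nexthops whose nh_scope equals 'scope' (that loop is not part of the hunks above), and that local routes carry nh_scope RT_SCOPE_NOWHERE. The MODEL_* constants and function names are illustrative.

```c
#include <stdbool.h>

/* Illustrative copies of the scope values; real code uses rtnetlink.h. */
#define MODEL_RT_SCOPE_LINK    253
#define MODEL_RT_SCOPE_NOWHERE 255

/* The scope that is left alone: none when forced, NOWHERE otherwise. */
static int skipped_scope(int force)
{
	return force ? -1 : MODEL_RT_SCOPE_NOWHERE;
}

/* Would a nexthop with this scope be marked dead/linkdown? */
static bool nexthop_marked(int nh_scope, int force)
{
	return nh_scope != skipped_scope(force);
}
```

With force 0 (plain NETDEV_DOWN or NETDEV_CHANGE) a NOWHERE-scoped nexthop is skipped, so the local route survives; with force 1 or 2 no scope is skipped.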


Re: ip_no_pmtu_disc and UDP

2015-10-26 Thread Vincent Li

The UDP packet size is about 768 bytes. Here is the packet path:

client                          router                              server
(eth0 mtu 1500 ip 10.3.72.69)   (eth0 mtu 1500 ip 10.3.72.1,        (eth0 mtu 1500 ip 10.2.72.99)
                                 eth1.1102 mtu 576 ip 10.2.72.139)


UDP client test script:

#!/usr/bin/perl

use strict;
use warnings;
use IO::Socket::INET;

my $socket = IO::Socket::INET->new(
  PeerPort  => ,
  PeerAddr  => '10.2.72.99',
  Proto => 'udp',
  )
  or die "Can't bind : $@\n";



$| = 1;

my $data = 
"012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567012345670123456701234567";

$socket->send($data);


sleep(10);

$socket->close();

I was expecting that echoing 0, 1, 2 and 3 in turn to
/proc/sys/net/ipv4/ip_no_pmtu_disc would set or clear the DF bit on
packets from the client, and that this would show up in tcpdump on the
router's eth0 interface. Instead, the DF bit is never set by the
client. Am I misunderstanding something?


for example:

 two concurrent tcpdump on router eth0 (mtu 1500) and eth1.1102 (mtu
576) interface:

1 #tcpdump -nn -i eth0 -v udp and host 10.3.72.69 &

14:51:11.946143 IP (tos 0x0, ttl 64, id 7193, offset 0, flags [none],
proto UDP (17), length 796)
10.3.72.69.43748 > 10.2.72.99.: UDP, length 768


2# tcpdump -nn -i eth1.1102 -v udp and host 10.3.72.69 &

14:51:11.946164 IP (tos 0x0, ttl 63, id 7193, offset 0, flags [+],
proto UDP (17), length 572)
14:51:11.946176 IP (tos 0x0, ttl 63, id 7193, offset 552, flags
[none], proto UDP (17), length 244)
10.3.72.69.43748 > 10.2.72.99.: UDP, length 768
10.3.72.69 > 10.2.72.99: udp
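
The lengths in the two captures are consistent with plain IPv4 fragmentation at an MTU of 576. A small sketch of the arithmetic, assuming a 20-byte IP header without options and an MTU large enough to hold the header; frag_total_len() is a hypothetical helper, not a kernel function:

```c
#include <stddef.h>

#define IPV4_HDR_LEN 20	/* assumes no IP options */
#define UDP_HDR_LEN   8

/*
 * Total length (IP header included) of fragment number 'index' when a
 * UDP datagram with 'udp_payload' bytes of data crosses a link with the
 * given MTU. Returns 0 if there is no such fragment.
 */
static size_t frag_total_len(size_t udp_payload, size_t mtu, size_t index)
{
	size_t ip_payload = udp_payload + UDP_HDR_LEN;
	/* every fragment but the last carries a multiple of 8 bytes */
	size_t per_frag = ((mtu - IPV4_HDR_LEN) / 8) * 8;
	size_t start = index * per_frag;
	size_t chunk;

	if (start >= ip_payload)
		return 0;
	chunk = ip_payload - start;
	if (chunk > per_frag)
		chunk = per_frag;
	return chunk + IPV4_HDR_LEN;
}
```

For 768 bytes of UDP data this gives 796 at MTU 1500 (one packet), and 572 followed by 244 at MTU 576, with the second fragment starting at offset 552, matching the tcpdump output.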

As you can see, the router fragments the UDP packet and does not send
an ICMP fragmentation-needed message. The only reason I can think of is
that the DF bit is not set on the original UDP packet.

client is on kernel 4.3.0-rc7+, router is on kernel  3.13.0-rc3
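
For comparison, the per-socket route mentioned in ip(7) looks like this in C. It is a minimal sketch for Linux only; force_df() is a made-up helper name, while IP_MTU_DISCOVER and IP_PMTUDISC_DO are the documented socket option and value.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Ask Linux to set DF on every datagram sent from this IPv4 socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int force_df(int fd)
{
	int mode = IP_PMTUDISC_DO;

	return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER,
			  &mode, sizeof(mode));
}
```

With this set on the sending socket, a datagram larger than the path MTU is expected to fail or trigger an ICMP fragmentation-needed reply instead of being fragmented on the way.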

On Fri, Oct 23, 2015 at 3:34 PM, Hannes Frederic Sowa
 wrote:
> Hello,
>
> On Fri, Oct 23, 2015, at 18:45, Vincent Li wrote:
>> It looks ip_no_pmtu_disc setting does not affect UDP IP packet DF bit
>> setting, is that intended behavior? echo 0, 1, 2, 3 respectively to
>> ip_no_pmtu_disc, UDP IP packet always have DF bit cleared, unless use
>> IP_PMTUDISC_DO on IP_MTU_DISCOVER as ip man page says.
>
> Which size do the UDP packets have and what is your MTU? inet_create
> also creates udp sockets and thus the setting does have effect.
>
>>
>> in inet_create, seems to prove that.
>>
>>if (net->ipv4.sysctl_ip_no_pmtu_disc)
>> inet->pmtudisc = IP_PMTUDISC_DONT;
>> else
>> inet->pmtudisc = IP_PMTUDISC_WANT;
>>
>> so I am wondering why UDP is excluded by ip_no_pmtu_disc, why in
>> inet_create, not assign each individual ip_no_pmtu_disc setting to
>> inet->pmtudisc but only check true and assign IP_PMTUDISC_DONT or
>> IP_PMTUDISC_WANT only.
>
> ip_no_pmtu_disc sysctl != IP_MTU_DISCOVER setsockopt. Also we cannot
> change this as it would disrupt communication easily relying on this
> established behavior.
>
> See Documentation/ip-sysctl.txt:
>
> ip_no_pmtu_disc - INTEGER
> Disable Path MTU Discovery. If enabled in mode 1 and a
> fragmentation-required ICMP is received, the PMTU to this
> destination will be set to min_pmtu (see below). You will need
> to raise min_pmtu to the smallest interface MTU on your system
> manually if you want to avoid locally generated fragments.
>
> In mode 2 incoming Path MTU Discovery messages will be
> discarded. Outgoing frames are handled the same as in mode 1,
> implicitly setting IP_PMTUDISC_DONT on every created socket.
>
> Mode 3 is a hardend pmtu discover mode. The kernel will only
> accept fragmentation-needed errors if the underlying protocol
> can verify them besides a plain socket lookup. Current
> protocols for which pmtu events will be honored are TCP, SCTP
> and DCCP as they verify e.g. the sequence number or the
>

[PATCHv2 net 2/2] ipv4: update RTNH_F_LINKDOWN flag on UP event

2015-10-26 Thread Julian Anastasov

When nexthop is part of multipath route we should clear the
LINKDOWN flag when link goes UP or when first address is added.
This is needed because we always set LINKDOWN flag when DEAD flag
was set but now on UP the nexthop is not dead anymore. Examples when
LINKDOWN bit can be forgotten when no NETDEV_CHANGE is delivered:

- link goes down (LINKDOWN is set), then link goes UP and device
shows carrier OK but LINKDOWN remains set

- last address is deleted (LINKDOWN is set), then address is
added and device shows carrier OK but LINKDOWN remains set

Steps to reproduce:
modprobe dummy
ifconfig dummy0 192.168.168.1 up

here add a multipath route where one nexthop is for dummy0:

ip route add 1.2.3.4 nexthop dev dummy0 nexthop dev SOME_OTHER_DEVICE
ifconfig dummy0 down
ifconfig dummy0 up

now ip route shows nexthop that is not dead. Now set the sysctl var:

echo 1 > /proc/sys/net/ipv4/conf/dummy0/ignore_routes_with_linkdown

now ip route will show a dead nexthop because the forgotten
RTNH_F_LINKDOWN is propagated as RTNH_F_DEAD.

Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
Signed-off-by: Julian Anastasov 
---
 net/ipv4/fib_semantics.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f493eff..f657418 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1445,6 +1445,13 @@ int fib_sync_up(struct net_device *dev, unsigned int nh_flags)
if (!(dev->flags & IFF_UP))
return 0;
 
+   if (nh_flags & RTNH_F_DEAD) {
+   unsigned int flags = dev_get_flags(dev);
+
+   if (flags & (IFF_RUNNING | IFF_LOWER_UP))
+   nh_flags |= RTNH_F_LINKDOWN;
+   }
+
prev_fi = NULL;
hash = fib_devindex_hashfn(dev->ifindex);
head = &fib_info_devhash[hash];
-- 
1.9.3
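
The added hunk boils down to a pure function of two flag words, which can be modelled in userspace like this. The MODEL_* constants are illustrative copies of the flag bits and flags_to_clear() is a hypothetical name; the real code uses the kernel headers and dev_get_flags().

```c
/* Illustrative copies of the flag bits; real code uses the kernel headers. */
#define MODEL_RTNH_F_DEAD      0x01u
#define MODEL_RTNH_F_LINKDOWN  0x10u
#define MODEL_IFF_RUNNING      0x40u
#define MODEL_IFF_LOWER_UP     0x10000u

/*
 * Nexthop flags that fib_sync_up() should clear: when DEAD is being
 * cleared and the device reports carrier, LINKDOWN is cleared too.
 */
static unsigned int flags_to_clear(unsigned int nh_flags,
				   unsigned int dev_flags)
{
	if ((nh_flags & MODEL_RTNH_F_DEAD) &&
	    (dev_flags & (MODEL_IFF_RUNNING | MODEL_IFF_LOWER_UP)))
		nh_flags |= MODEL_RTNH_F_LINKDOWN;
	return nh_flags;
}
```

If the device comes up without carrier, only DEAD is cleared and LINKDOWN stays until a later NETDEV_CHANGE.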


Re: [PATCH v4 14/15] net: wireless: ath: Remove unneeded variable ret returning 0

2015-10-26 Thread punit vara

On Fri, Oct 23, 2015 at 12:26 AM, Sergei Shtylyov
 wrote:
> On 10/22/2015 09:47 PM, Punit Vara wrote:
>
>> Remove black line suggested by Sergei
>
>
>Such kind of comments should be under the --- tear line.
>
>>
>> This patch is to the ath5k/eeprom.c that fixes up warning caught by
>> coccicheck:
>>
>> Unneeded variable: "ret". Return "0" on line 980
>>
>> Remove unneeded variable ret created to return zero.
>>
>> Signed-off-by: Punit Vara 
>
> [...]
>
> MBR, Sergei
>
Thank you, Sergei, for the review.

I did not know you had replied to this mail because it ended up in
another folder of my mailbox. I will send this patch again as you
suggested. Will my other patches, which are already correct, be added
to the wireless tree, or do I have to resend everything?

Re: [PATCH net-next] ipv6: recreate ipv6 link-local addresses when increasing MTU over IPV6_MIN_MTU

2015-10-26 Thread Alexander Duyck


On 10/26/2015 01:45 PM, Hannes Frederic Sowa wrote:
> Hi,
>
> On Mon, Oct 26, 2015, at 20:16, Jay Vosburgh wrote:
>> Hannes Frederic Sowa  wrote:
>>
>>> Hello Alex,
>>>
>>> On Mon, Oct 26, 2015, at 16:52, Alexander Duyck wrote:
>>>> Seems like this code isn't quite correct.  You are calling ipv6_add_dev
>>>> for slave devices, and if I understand things correctly I don't believe
>>>> that was happening before and may be an unintended side effect.
>>>
>>> Ah, btw., autoconf and ipv6 operation on IFF_SLAVE devices is actually
>>> desired nowadays and don't think we can change this. See also:
>>
>> IPv6 addrconf on IFF_SLAVE devices was disabled for bonding
>> slaves in commit c2edacf80e15 because it caused issues with snooping
>> switches.
>>
>> This is also referenced in
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=236750
>>
>> Won't re-enabling autoconf on IFF_SLAVE devices cause that issue
>> to return?
>
> Both patches don't enable autoconf on IFF_SLAVE devices. Sorry for being
> imprecise. The referred patch was changing the behavior to whether the
> device had a master device.

Yes, the IFF_SLAVE comment on my part was an error in interpretation of
the code.

> @Alex, I will take your patch and submit it with the necessary guards to
> not enable ipv6 again if we forcefully disable ipv6 and later on shrink
> and increase the MTU again. I will do so in your name. Thanks again for
> the patch!

No problem.  If you want to you can take over authorship of the patch
and just leave my signed-off-by on there.  I'm good either way.

- Alex




[PATCH v3 2/2] net: ethernet: add driver for Aurora VLSI NB8800 Ethernet controller

2015-10-26 Thread Mans Rullgard

This adds a driver for the Aurora VLSI NB8800 Ethernet controller.
It is an almost complete rewrite of a driver originally found in
a Sigma Designs 2.6.22 tree.

Signed-off-by: Mans Rullgard 
---
Changes:
- remove check for wake on lan irq as it is never requested
- prettify mac address setting
- use ethtool_op_get_link()
- use hardware statistics counters
- check for dma mapping errors
- drop bogus netdev_mc_count(dev) > 64 check
- move request_irq to ndo_open callback
- drop tx_queue_len override
- remove batched tx descriptor cleanup
- use bool type as appropriate
- set the IC_THRESHOLD register correctly according to documentation
- set the RX_ITR and TX_ITR registers to better values improving performance
- move phy_connect to ndo_open callback
- get phy information from devicetree
- move dma buffer allocation to ndo_open callback
- anything I forgot to mention
---
 drivers/net/ethernet/Kconfig |1 +
 drivers/net/ethernet/Makefile|1 +
 drivers/net/ethernet/aurora/Kconfig  |   20 +
 drivers/net/ethernet/aurora/Makefile |1 +
 drivers/net/ethernet/aurora/nb8800.c | 1118 ++
 drivers/net/ethernet/aurora/nb8800.h |  229 +++
 6 files changed, 1370 insertions(+)
 create mode 100644 drivers/net/ethernet/aurora/Kconfig
 create mode 100644 drivers/net/ethernet/aurora/Makefile
 create mode 100644 drivers/net/ethernet/aurora/nb8800.c
 create mode 100644 drivers/net/ethernet/aurora/nb8800.h

diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
index 05aa759..8310163 100644
--- a/drivers/net/ethernet/Kconfig
+++ b/drivers/net/ethernet/Kconfig
@@ -29,6 +29,7 @@ source "drivers/net/ethernet/apm/Kconfig"
 source "drivers/net/ethernet/apple/Kconfig"
 source "drivers/net/ethernet/arc/Kconfig"
 source "drivers/net/ethernet/atheros/Kconfig"
+source "drivers/net/ethernet/aurora/Kconfig"
 source "drivers/net/ethernet/cadence/Kconfig"
 source "drivers/net/ethernet/adi/Kconfig"
 source "drivers/net/ethernet/broadcom/Kconfig"
diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
index ddfc808..b435fb0 100644
--- a/drivers/net/ethernet/Makefile
+++ b/drivers/net/ethernet/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_NET_XGENE) += apm/
 obj-$(CONFIG_NET_VENDOR_APPLE) += apple/
 obj-$(CONFIG_NET_VENDOR_ARC) += arc/
 obj-$(CONFIG_NET_VENDOR_ATHEROS) += atheros/
+obj-$(CONFIG_NET_VENDOR_AURORA) += aurora/
 obj-$(CONFIG_NET_CADENCE) += cadence/
 obj-$(CONFIG_NET_BFIN) += adi/
 obj-$(CONFIG_NET_VENDOR_BROADCOM) += broadcom/
diff --git a/drivers/net/ethernet/aurora/Kconfig b/drivers/net/ethernet/aurora/Kconfig
new file mode 100644
index 000..a3c7106
--- /dev/null
+++ b/drivers/net/ethernet/aurora/Kconfig
@@ -0,0 +1,20 @@
+config NET_VENDOR_AURORA
+   bool "Aurora VLSI devices"
+   help
+ If you have a network (Ethernet) device belonging to this class,
+ say Y.
+
+ Note that the answer to this question doesn't directly affect the
+ kernel: saying N will just cause the configurator to skip all
+ questions about Aurora devices. If you say Y, you will be asked
+ for your specific device in the following questions.
+
+if NET_VENDOR_AURORA
+
+config AURORA_NB8800
+   tristate "Aurora AU-NB8800 support"
+   select PHYLIB
+   help
+Support for the AU-NB8800 gigabit Ethernet controller.
+
+endif
diff --git a/drivers/net/ethernet/aurora/Makefile b/drivers/net/ethernet/aurora/Makefile
new file mode 100644
index 000..6cb528a
--- /dev/null
+++ b/drivers/net/ethernet/aurora/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_AURORA_NB8800) += nb8800.o
diff --git a/drivers/net/ethernet/aurora/nb8800.c b/drivers/net/ethernet/aurora/nb8800.c
new file mode 100644
index 000..b546b67
--- /dev/null
+++ b/drivers/net/ethernet/aurora/nb8800.c
@@ -0,0 +1,1118 @@
+/*
+ * Copyright (C) 2015 Mans Rullgard 
+ *
+ * Mostly rewritten, based on driver from Sigma Designs.  Original
+ * copyright notice below.
+ *
+ *
+ * Driver for tangox SMP864x/SMP865x/SMP867x/SMP868x builtin Ethernet Mac.
+ *
+ * Copyright (C) 2005 Maxime Bizon 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "nb8800.h"
+
+static inline u8 nb8800_readb(struct nb8800_priv *priv, int reg)
+{
+   return readb(priv->base + reg);
+}
+

[PATCH v7 1/3] geneve: implement support for IPv6-based tunnels

2015-10-26 Thread John W. Linville

NOTE: Link-local IPv6 addresses for remote endpoints are not supported,
since the driver currently has no capacity for binding a geneve
interface to a specific link.

Signed-off-by: John W. Linville 
Reviewed-by: Jesse Gross 
---
v7:
- rebase on top of commit fc4099f17240 ("openvswitch: Fix egress tunnel info.")
- revise error handling in ipv6 tx path to match ipv4 tx path (as above)
- Added Reviewed-by from Jesse based on v6 review -- changes above are minor

v6:
- fix a typo (missing {}'s)

v5:
- wrap declaration of sock6 in geneve_dev with IS_ENABLED(CONFIG_IPV6)
- remove superfluous '!!' when assigning geneve->collect_md to bool
- use skb_scrub_packet in IPv4 tx path as well 
- check for NULL ip_tunnel_info pointer in geneve[6]_xmit_skb
- use ipv6_addr_equal for comparing IPv6 addresses
- more use of IS_ENABLED(CONFIG_IPV6) for preserving build integrity
- reject link-local ipv6 address for remote tunnel endpoint

v4:
- treat mode field of ip_tunnel_info as flags
- add a missing IS_ENABLED(CONFIG_IPV6) to geneve_rx
- remove unneeded flags field in geneve_dev
- NULL-check parameter for __geneve_sock_release
- check remote socket family for AF_UNSPEC in geneve_configure
- rename geneve_get_{rt,dst} as geneve_get_{v4_rt,v6_dst}
- refactor some error handling in the xmit paths

v3:
- declare geneve_remote_unspec as static

v2:
- do not require remote address for tx on metadata tunnels
- pass correct sockaddr family to udp_tun_rx_dst in geneve_rx
- accommodate both ipv4 and ipv6 sockets open on same tunnel
- move declaration of geneve_get_dst for aesthetic purposes
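
As a userspace analogue of the address handling described in the changelog, the sketch below stores either family in one union (mirroring union geneve_addr), starts from AF_UNSPEC, and rejects link-local IPv6 remotes. It is not the driver's netlink parsing; union tunnel_addr and parse_remote() are hypothetical names.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Userspace analogue of the patch's union geneve_addr. */
union tunnel_addr {
	struct sockaddr     sa;
	struct sockaddr_in  sin;
	struct sockaddr_in6 sin6;
};

/*
 * Parse a remote endpoint given as text. Returns 0 on success, or -1 if
 * the text is not an IP address or is an IPv6 link-local address, in
 * which case the family is left as AF_UNSPEC.
 */
static int parse_remote(const char *text, union tunnel_addr *out)
{
	memset(out, 0, sizeof(*out));
	out->sa.sa_family = AF_UNSPEC;

	if (inet_pton(AF_INET, text, &out->sin.sin_addr) == 1) {
		out->sa.sa_family = AF_INET;
		return 0;
	}
	if (inet_pton(AF_INET6, text, &out->sin6.sin6_addr) == 1) {
		/* no way to bind to a link, so fe80::/10 is unusable */
		if (IN6_IS_ADDR_LINKLOCAL(&out->sin6.sin6_addr))
			return -1;
		out->sa.sa_family = AF_INET6;
		return 0;
	}
	return -1;
}
```

Callers can then branch on out->sa.sa_family, much as the patch selects between the IPv4 and IPv6 paths.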

 drivers/net/geneve.c | 473 +++
 include/uapi/linux/if_link.h |   1 +
 2 files changed, 395 insertions(+), 79 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 445071c163cb..393b0bddf7cf 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -46,16 +46,27 @@ struct geneve_net {
 
 static int geneve_net_id;
 
+union geneve_addr {
+   struct sockaddr_in sin;
+   struct sockaddr_in6 sin6;
+   struct sockaddr sa;
+};
+
+static union geneve_addr geneve_remote_unspec = { .sa.sa_family = AF_UNSPEC, };
+
 /* Pseudo network device */
 struct geneve_dev {
struct hlist_node  hlist;   /* vni hash table */
struct net *net;/* netns for packet i/o */
struct net_device  *dev;/* netdev for geneve tunnel */
-   struct geneve_sock *sock;   /* socket used for geneve tunnel */
+   struct geneve_sock *sock4;  /* IPv4 socket used for geneve tunnel */
+#if IS_ENABLED(CONFIG_IPV6)
+   struct geneve_sock *sock6;  /* IPv6 socket used for geneve tunnel */
+#endif
u8 vni[3];  /* virtual network ID for tunnel */
u8 ttl; /* TTL override */
u8 tos; /* TOS override */
-   struct sockaddr_in remote;  /* IPv4 address for link partner */
+   union geneve_addr  remote;  /* IP address for link partner */
struct list_head   next;/* geneve's per namespace list */
__be16 dst_port;
bool   collect_md;
@@ -103,11 +114,31 @@ static struct geneve_dev *geneve_lookup(struct geneve_sock *gs,
vni_list_head = &gs->vni_list[hash];
hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
-   addr == geneve->remote.sin_addr.s_addr)
+   addr == geneve->remote.sin.sin_addr.s_addr)
+   return geneve;
+   }
+   return NULL;
+}
+
+#if IS_ENABLED(CONFIG_IPV6)
+static struct geneve_dev *geneve6_lookup(struct geneve_sock *gs,
+struct in6_addr addr6, u8 vni[])
+{
+   struct hlist_head *vni_list_head;
+   struct geneve_dev *geneve;
+   __u32 hash;
+
+   /* Find the device for this VNI */
+   hash = geneve_net_vni_hash(vni);
+   vni_list_head = &gs->vni_list[hash];
+   hlist_for_each_entry_rcu(geneve, vni_list_head, hlist) {
+   if (!memcmp(vni, geneve->vni, sizeof(geneve->vni)) &&
+   ipv6_addr_equal(&addr6, &geneve->remote.sin6.sin6_addr))
return geneve;
}
return NULL;
 }
+#endif
 
 static inline struct genevehdr *geneve_hdr(const struct sk_buff *skb)
 {
@@ -121,24 +152,49 @@ static void geneve_rx(struct geneve_sock *gs, struct sk_buff *skb)
struct metadata_dst *tun_dst = NULL;
struct geneve_dev *geneve = NULL;
struct pcpu_sw_netstats *stats;
-   struct iphdr *iph;
-   u8 *vni;
+   struct iphdr *iph = NULL;
__be32 addr;
-   int err;
+   static u8 zero_vni[3];
+   u8 *vni;
+   int err = 0;
+   sa_family_t sa_family;
+#if IS_ENABLED(CONFIG_IPV6)
+   struct ipv6hdr *ip6h = NULL;
+   struct in6_addr addr6;
+

Re: [PATCH RFC net-next 2/2] tcp: Add Redundant Data Bundling (RDB)

2015-10-26 Thread Andreas Petlund

> On 26 Oct 2015, at 15:50, Neal Cardwell  wrote:
> 
> On Fri, Oct 23, 2015 at 4:50 PM, Bendik Rønning Opstad
>  wrote:
>> @@ -2409,6 +2412,15 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> ...
>> +   case TCP_RDB:
>> +   if (val < 0 || val > 1) {
>> +   err = -EINVAL;
>> +   } else {
>> +   tp->rdb = val;
>> +   tp->nonagle = val;
> 
> The semantics of the tp->nonagle bits are already a bit complex. My
> sense is that having a setsockopt of TCP_RDB transparently modify the
> nagle behavior is going to add more extra complexity and unanticipated
> behavior than is warranted given the slight possible gain in
> convenience to the app writer. What about a model where the
> application user just needs to remember to call
> setsockopt(TCP_NODELAY) if they want the TCP_RDB behavior to be
> sensible? I see your nice tests at
> 
>   https://github.com/bendikro/packetdrill/commit/9916b6c53e33dd04329d29b7d8baf703b2c2ac1b
> 
> are already doing that. And my sense is that likewise most
> well-engineered "thin stream" apps will already be using
> setsockopt(TCP_NODELAY). Is that workable?

We have been discussing this a bit back and forth. Your suggestion would be the 
right thing to keep the nagle semantics less complex and to educate developers 
in the intrinsics of the transport.

We ended up choosing to implicitly disable nagle since it 
1) is incompatible with the logic of RDB.
2) leaving it up to the developer to read the documentation and register the 
line saying that "failing to set TCP_NODELAY will void the RDB latency gain" 
will increase the chance of misconfigurations leading to deployment with no 
effect.

The hope was to help both the well-engineered thin-stream apps and the ones 
deployed by developers with less detailed knowledge of the transport.

-Andreas
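
For readers following along, the explicit call Neal refers to is the standard one below. This is a minimal userspace sketch; disable_nagle() is just an illustrative wrapper name, and the proposed TCP_RDB option is deliberately not used since it is not part of any released kernel header.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Disable Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int disable_nagle(int fd)
{
	int one = 1;

	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```

Under Neal's model an application would make this call itself next to enabling RDB; under the RFC patch as posted, the second step would happen implicitly.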


Re: [PATCH net-next] ipv6: recreate ipv6 link-local addresses when increasing MTU over IPV6_MIN_MTU

2015-10-26 Thread Hannes Frederic Sowa

Hi Alex,

On Mon, Oct 26, 2015, at 18:07, Alexander Duyck wrote:
> Might be a bit longer.  I just realized that I think there is another 
> bug here where you are going through the NETDEV_UP path even though the 
> interface isn't up.  I'll run through some testing this morning to work 
> out the kinks.

When you wrote this, I noticed that if someone removes the LL addresses
to disable the interface and raises the MTU again, we would also start
adding link-local addresses again. We probably need to save the last
state of disable_ipv6 somewhere in the parent interface. :(

Maybe there is an easier solution for that.

Thanks for your patch, it looks cleaner!

Bye,
Hannes

[PATCH v7 2/3] geneve: handle ipv6 priority like ipv4 tos

2015-10-26 Thread John W. Linville

Other callers of udp_tunnel6_xmit_skb just pass 0 for the prio
argument.  Jesse Gross suggested that prio is really
the same as IPv4's tos and should be handled the same, so this is my
interpretation of that suggestion.

Signed-off-by: John W. Linville 
Reported-by: Jesse Gross 
Reviewed-by: Jesse Gross 
---
v7 -- same as previous revisions

 drivers/net/geneve.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 393b0bddf7cf..44e724508c55 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -763,6 +763,7 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff 
*skb,
struct geneve_dev *geneve = netdev_priv(dev);
struct geneve_sock *gs6 = geneve->sock6;
struct dst_entry *dst = NULL;
+   __u8 prio;
 
memset(fl6, 0, sizeof(*fl6));
fl6->flowi6_mark = skb->mark;
@@ -771,7 +772,16 @@ static struct dst_entry *geneve_get_v6_dst(struct sk_buff 
*skb,
if (info) {
fl6->daddr = info->key.u.ipv6.dst;
fl6->saddr = info->key.u.ipv6.src;
+   fl6->flowi6_tos = RT_TOS(info->key.tos);
} else {
+   prio = geneve->tos;
+   if (prio == 1) {
+   const struct iphdr *iip = ip_hdr(skb);
+
+   prio = ip_tunnel_get_dsfield(iip, skb);
+   }
+
+   fl6->flowi6_tos = RT_TOS(prio);
fl6->daddr = geneve->remote.sin6.sin6_addr;
}
 
@@ -897,9 +907,10 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, 
struct net_device *dev,
struct geneve_dev *geneve = netdev_priv(dev);
struct geneve_sock *gs6 = geneve->sock6;
struct dst_entry *dst = NULL;
+   const struct iphdr *iip; /* interior IP header */
int err = -EINVAL;
struct flowi6 fl6;
-   __u8 ttl;
+   __u8 prio, ttl;
__be16 sport;
bool udp_csum;
bool xnet = !net_eq(geneve->net, dev_net(geneve->dev));
@@ -920,6 +931,8 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, 
struct net_device *dev,
sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
skb_reset_mac_header(skb);
 
+   iip = ip_hdr(skb);
+
if (info) {
const struct ip_tunnel_key *key = &info->key;
u8 *opts = NULL;
@@ -936,6 +949,7 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, 
struct net_device *dev,
if (unlikely(err))
goto err;
 
+   prio = ip_tunnel_ecn_encap(key->tos, iip, skb);
ttl = key->ttl;
} else {
udp_csum = false;
@@ -944,13 +958,14 @@ static netdev_tx_t geneve6_xmit_skb(struct sk_buff *skb, 
struct net_device *dev,
if (unlikely(err))
goto err;
 
+   prio = ip_tunnel_ecn_encap(fl6.flowi6_tos, iip, skb);
ttl = geneve->ttl;
if (!ttl && ipv6_addr_is_multicast())
ttl = 1;
ttl = ttl ? : ip6_dst_hoplimit(dst);
}
err = udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
-  , , 0, ttl,
+  , , prio, ttl,
   sport, geneve->dst_port, !udp_csum);
 
iptunnel_xmit_stats(err, >stats, dev->tstats);
-- 
2.4.3
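
For readers following the prio handling, here is a small userspace
sketch of the two steps the patch performs: picking the outer DS field
(a configured tos of 1 meaning "inherit from the inner header") and
folding the inner ECN bits into it. This is not the kernel code;
pick_tos() and ecn_encap() are hypothetical names, and the ECN rule is
modeled on my understanding of the kernel's INET_ECN_encapsulate()
helper, where an inner CE mark becomes ECT(0) on the outer header.

```c
#include <stdint.h>

#define ECN_MASK  0x03u	/* low two bits of the TOS / traffic class byte */
#define ECN_ECT_0 0x02u
#define ECN_CE    0x03u

/* Choose the outer DS field: a configured value of 1 means
 * "inherit from the inner packet", as in the `prio == 1` check. */
static uint8_t pick_tos(uint8_t configured_tos, uint8_t inner_dsfield)
{
	return configured_tos == 1 ? inner_dsfield : configured_tos;
}

/* Keep the outer DSCP bits and copy the inner ECN bits, except that
 * an inner CE mark is carried as ECT(0) on the outer header. */
static uint8_t ecn_encap(uint8_t outer_tos, uint8_t inner_dsfield)
{
	uint8_t inner_ecn = inner_dsfield & ECN_MASK;
	uint8_t outer = outer_tos & (uint8_t)~ECN_MASK;

	if (inner_ecn == ECN_CE)
		inner_ecn = ECN_ECT_0;
	return outer | inner_ecn;
}
```

With these helpers, the non-metadata path corresponds roughly to
ecn_encap(pick_tos(geneve_tos, inner), inner), ignoring the RT_TOS()
masking the patch applies to the flow key.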


Re: [PATCH net-next] net: dsa: bcm_sf2: Implement FDB operations

2015-10-26 Thread Florian Fainelli

On 23/10/15 15:44, Vivien Didelot wrote:
> On Oct. Friday 23 (43) 01:20 PM, Florian Fainelli wrote:
>> On 23/10/15 12:28, Vivien Didelot wrote:
>>> On Oct. Friday 23 (43) 11:38 AM, Florian Fainelli wrote:
>>>> +static int bcm_sf2_sw_fdb_del(struct dsa_switch *ds, int port,
>>>> +const struct switchdev_obj_port_fdb *fdb)
>>>> +{
>>>> +  struct bcm_sf2_priv *priv = ds_to_priv(ds);
>>>> +
>>>> +  return bcm_sf2_arl_op(priv, 0, port, fdb->addr, fdb->vid, false);
>>>> +}
>>>
>>> I'm wondering if you are populating the FDB of the invalid VLAN 0 here.
>>>
>>> Does your ARL consider that fdb->vid == 0 means "this port's FDB" and
>>> not "FDB of VLAN 0"?
>>
>> (please trim your replies)
> 
> Noted, thanks.
> 
>> I do not think this matters right now, since 802.1q is not currently
>> enabled/supported in the driver, but maybe I am trivializing this?
> 
> I meant that, when you issue the following command:
> 
> bridge fdb add de:ea:be:ef:12:34 dev swp3
> 
> you will get vid == 0 in bcm_sf2_sw_fdb_add(). So I was wondering if
> your code was populating the FDB of the VLAN 0, instead of the chosen
> FDB associated with the given port.

It is populating the FDB for the given port right now, based on the
current switch configuration done by the driver. But thanks for noting
that, as this may need changing in the future.
-- 
Florian

Re: [PATCH net] ipv4: fix problems from the RTNH_F_LINKDOWN introduction

2015-10-26 Thread Andy Gospodarek

On Sat, Oct 24, 2015 at 09:20:00PM +0300, Julian Anastasov wrote:
> When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
> we should not delete the local routes if the local address
> is still present. The confusion comes from the fact that both
> fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
> constant. Fix it by returning back the variable 'force'.
> 
> Steps to reproduce:
> modprobe dummy
> ifconfig dummy0 192.168.168.1 up
> ip route list table local | grep dummy | grep host
> local 192.168.168.1 dev dummy0  proto kernel  scope host  src 192.168.168.1
I tested this before and after your patch and I don't see a different
output.  Was I supposed to see something different?

> Second fix
I would prefer you move these two fixes into 2 separate patches as it
isn't totally clear which hunks fix each of these issues.

> is for fib_sync_up: when nexthop is part of multipath
> route we should clear the LINKDOWN flag when link goes UP
> or when first address is added. This is needed because we always
> set LINKDOWN flag when DEAD flag is set but now the nexthop
> is not dead anymore. Examples when LINKDOWN bit can be forgotten:
> 
> - link goes down (LINKDOWN is set), then link goes UP and device
> shows carrier OK but LINKDOWN remains set
> 
> - last address is deleted (LINKDOWN is set), then address is
> added and device shows carrier OK but LINKDOWN remains set

Are you seeing this with iproute2 (or other tools) or are you just
seeing this by monitoring netlink messages/looking at a netlink cache
you have built inside an application?

I have seen a problem similar to what you have reported with netlink
caches and have a fix I can give you if you would like to try it.  It is
a slightly larger structural change, but it appears to cover a
few more cases than this fix does.

> 
> Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
> Signed-off-by: Julian Anastasov 
> ---
>  include/net/ip_fib.h |  2 +-
>  net/ipv4/fib_frontend.c  | 13 +++--
>  net/ipv4/fib_semantics.c | 18 +++---
>  3 files changed, 23 insertions(+), 10 deletions(-)
> 
> diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
> index 727d6e9..654aec1 100644
> --- a/include/net/ip_fib.h
> +++ b/include/net/ip_fib.h
> @@ -317,7 +317,7 @@ void fib_flush_external(struct net *net);
>  
>  /* Exported by fib_semantics.c */
>  int ip_fib_check_default(__be32 gw, struct net_device *dev);
> -int fib_sync_down_dev(struct net_device *dev, unsigned long event);
> +int fib_sync_down_dev(struct net_device *dev, unsigned long event, int 
> force);
>  int fib_sync_down_addr(struct net *net, __be32 local);
>  int fib_sync_up(struct net_device *dev, unsigned int nh_flags);
>  void fib_select_multipath(struct fib_result *res);
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 690bcbc..4826a22 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -1110,9 +1110,10 @@ static void nl_fib_lookup_exit(struct net *net)
>   net->ipv4.fibnl = NULL;
>  }
>  
> -static void fib_disable_ip(struct net_device *dev, unsigned long event)
> +static void fib_disable_ip(struct net_device *dev, unsigned long event,
> +int force)
>  {
> - if (fib_sync_down_dev(dev, event))
> + if (fib_sync_down_dev(dev, event, force))
>   fib_flush(dev_net(dev));
>   rt_cache_flush(dev_net(dev));
>   arp_ifdown(dev);
> @@ -1140,7 +1141,7 @@ static int fib_inetaddr_event(struct notifier_block 
> *this, unsigned long event,
>   /* Last address was deleted from this interface.
>* Disable IP.
>*/
> - fib_disable_ip(dev, event);
> + fib_disable_ip(dev, event, 1);
>   } else {
>   rt_cache_flush(dev_net(dev));
>   }
> @@ -1157,7 +1158,7 @@ static int fib_netdev_event(struct notifier_block 
> *this, unsigned long event, vo
>   unsigned int flags;
>  
>   if (event == NETDEV_UNREGISTER) {
> - fib_disable_ip(dev, event);
> + fib_disable_ip(dev, event, 2);
>   rt_flush_dev(dev);
>   return NOTIFY_DONE;
>   }
> @@ -1178,14 +1179,14 @@ static int fib_netdev_event(struct notifier_block 
> *this, unsigned long event, vo
>   rt_cache_flush(net);
>   break;
>   case NETDEV_DOWN:
> - fib_disable_ip(dev, event);
> + fib_disable_ip(dev, event, 0);
>   break;
>   case NETDEV_CHANGE:
>   flags = dev_get_flags(dev);
>   if (flags & (IFF_RUNNING | IFF_LOWER_UP))
>   fib_sync_up(dev, RTNH_F_LINKDOWN);
>   else
> - fib_sync_down_dev(dev, event);
> + fib_sync_down_dev(dev, event, 0);
>   /* fall through */
>   case NETDEV_CHANGEMTU:
>

Re: [PATCH net-next] net: dsa: bcm_sf2: Implement FDB operations

2015-10-26 Thread Vivien Didelot

On Oct. Friday 23 (43) 11:38 AM, Florian Fainelli wrote:
> Add support for the FDB add, delete, and dump operations. The add and
> delete operations are implemented using directed ARL operations using
> the specified MAC address and consist in a read operation, write and
> readback operation.
> 
> The dump operation consists in using the ARL search and software
> filtering entries which are not for the desired port.
> 
> Signed-off-by: Florian Fainelli 

Reviewed-by: Vivien Didelot 

Re: [PATCH net-next 0/4] Automatic adjustment of max frame size

2015-10-26 Thread Keller, Jacob E

On Mon, 2015-10-26 at 09:56 +0900, Toshiaki Makita wrote:
> On 2015/10/24 17:50, Toshiaki Makita wrote:
> > David,
> > 
> > I found my patch set is marked with Changes Requested, but I
> > haven't
> > seen any feedback.
> > 
> > Could you give me your feedback?
> 
> Somehow the mail from LD Linux CI Server did not reach netdev mailing
> list so I could not have seen it from gmail...
> 
> Toshiaki Makita
> 

The ND Linux Bot only checks patches sent to the Intel mailing list
intel-wired-...@lists.osuosl.org, and isn't subscribed to netdev.

I am not sure why its mail failed to reach netdev, unless netdev blocks
non-subscribers from sending mail (it probably does).

The checkpatch output here can likely be ignored; note that it shows up
as a warning. You can fix the lines if you feel there is a reasonable way
to shorten them. In this case, I probably wouldn't unless you want to
perform the VLAN and ETH_FRAME_LEN calculation once somewhere else...

Generally, being close to 80 columns (81, 82, etc.) is probably OK, as
long as there isn't an obviously nicer way to shorten the line.

Regards,
Jake

Re: [PATCH v3 1/2] devicetree: add binding for Aurora VLSI NB8800 Ethernet controller

2015-10-26 Thread Måns Rullgård

Forgot to CC netdev.

Mans Rullgard writes:

> This adds a binding for the Aurora VLSI NB8800 Ethernet controller
> using the "aurora,nb8800" compatible string.  When used in Sigma
> Designs chips a few additional features are available.  These variants
> are indicated by a "sigma,-ethernet" compatible string.
>
> Signed-off-by: Mans Rullgard 
> ---
> Changes:
> - added phy child node
> ---
>  .../devicetree/bindings/net/aurora,nb8800.txt  | 37 
> ++
>  1 file changed, 37 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/aurora,nb8800.txt
>
> diff --git a/Documentation/devicetree/bindings/net/aurora,nb8800.txt 
> b/Documentation/devicetree/bindings/net/aurora,nb8800.txt
> new file mode 100644
> index 000..df12ff1
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/aurora,nb8800.txt
> @@ -0,0 +1,37 @@
> +* Aurora VLSI AU-NB8800 Ethernet controller
> +
> +Required properties:
> +- compatible: Should be "sigma,-ethernet", "aurora,nb8800"
> +- reg: Should be MMIO address space of the device
> +- interrupts: Should contain the interrupt specifier for the device
> +- interrupt-parent: Should be a phandle for the interrupt controller
> +- clocks: Should be a phandle for the clock for the device
> +- #address-cells: Should be <1>
> +- #size-cells: Should be <0>
> +
> +Common properties described in ethernet.txt:
> +- local-mac-address
> +- mac-address
> +- phy-handle
> +- phy-mode
> +
> +The attached PHY should be specified in a child node as per phy.txt.
> +
> +Example:
> +
> +ethernet@26000 {
> + compatible = "sigma,smp8642-ethernet", "aurora,nb8800";
> + reg = <0x26000 0x800>;
> + interrupts = <38>;
> + clocks = <_clk>;
> + max-speed = <1000>;
> + phy-connection-type = "rgmii";
> + phy-handle = <_phy>;
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + eth0_phy: ethernet-phy@1 {
> + compatible = "ethernet-phy-ieee802.3-c22";
> + reg = <1>;
> + };
> +};
> -- 
> 2.6.2
>

-- 
Måns Rullgård
m...@mansr.com

[PATCH net] amd-xgbe: Fix race between access of desc and desc index

2015-10-26 Thread Tom Lendacky

During Tx cleanup it's still possible for the descriptor data to be
read ahead of the descriptor index. A memory barrier is required between
the read of the descriptor index and the start of the Tx cleanup loop.
This allows a change to a lighter-weight barrier in the Tx transmit
routine just before updating the current descriptor index.

Since the memory barrier does result in extra overhead on arm64, keep
the previous change to not chase the current descriptor value. This
prevents the execution of the barrier for each loop performed.

Suggested-by: Alexander Duyck 
Signed-off-by: Tom Lendacky 
---
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c |2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c |4 
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c 
b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
index e9ab8b9..f672dba 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -1595,7 +1595,7 @@ static void xgbe_dev_xmit(struct xgbe_channel *channel)
  packet->rdesc_count, 1);
 
/* Make sure ownership is written to the descriptor */
-   wmb();
+   smp_wmb();
 
ring->cur = cur_index + 1;
if (!packet->skb->xmit_more ||
diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c 
b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index d2b77d9..dde0486 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1816,6 +1816,10 @@ static int xgbe_tx_poll(struct xgbe_channel *channel)
return 0;
 
cur = ring->cur;
+
+   /* Be sure we get ring->cur before accessing descriptor data */
+   smp_rmb();
+
txq = netdev_get_tx_queue(netdev, channel->queue_index);
 
while ((processed < XGBE_TX_DESC_MAX_PROC) &&
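
The ordering requirement described in the commit message can be sketched
in portable C11 as a single-producer, single-consumer ring. This is only
an analogy under stated assumptions, not driver code: struct ring,
ring_xmit() and ring_clean() are hypothetical, the release fence stands
in for smp_wmb() and the acquire fence for smp_rmb(), and the ring-full
check is omitted for brevity.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u

struct ring {
	uint32_t desc[RING_SIZE];	/* descriptor payloads */
	atomic_uint cur;		/* next slot the producer will fill */
	unsigned int dirty;		/* next slot the consumer will clean */
};

/* Producer: fill the descriptor, then publish the new index. */
static void ring_xmit(struct ring *r, uint32_t data)
{
	unsigned int cur = atomic_load_explicit(&r->cur, memory_order_relaxed);

	r->desc[cur % RING_SIZE] = data;
	/* Descriptor write must be visible before the index update. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->cur, cur + 1, memory_order_relaxed);
}

/* Consumer: snapshot the index once, then read descriptors up to it.
 * Returns the number of descriptors copied into out[]. */
static unsigned int ring_clean(struct ring *r, uint32_t *out, unsigned int max)
{
	unsigned int cur = atomic_load_explicit(&r->cur, memory_order_relaxed);
	unsigned int n = 0;

	/* Index read must complete before any descriptor read. */
	atomic_thread_fence(memory_order_acquire);

	while (n < max && r->dirty != cur) {
		out[n++] = r->desc[r->dirty % RING_SIZE];
		r->dirty++;
	}
	return n;
}
```

Reading the index once before the loop mirrors the patch's choice not to
chase the current descriptor value, so the fence is paid once per
cleanup pass rather than once per descriptor.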


Re: ip_no_pmtu_disc and UDP

2015-10-26 Thread Vincent Li

I tested again and I do see the DF bit now, which is weird. I am going
to do more testing; sorry for the noise.

On Mon, Oct 26, 2015 at 3:12 PM, Hannes Frederic Sowa
 wrote:
> Hello,
>
> On Mon, Oct 26, 2015, at 23:00, Vincent Li wrote:
>> the UDP packet size is about 768; here is what the packet path looks like:
>>
>> client: eth0 mtu 1500 ip 10.3.72.69
>> router: eth0 mtu 1500 ip 10.3.72.1, eth1.1102 mtu 567 ip 10.2.72.139
>> server: eth0 mtu 1500 ip 10.2.72.99
>>
>> UDP client test script:
>>
>> [...]
>>
>> so I was expecting that if I echo 0, 1, 2, and 3 in turn to
>> /proc/sys/net/ipv4/ip_no_pmtu_disc, I would see the DF bit set or
>> unset on packets from the client, and that this would show up in
>> tcpdump on the router's eth0 interface. Instead, the DF bit is never
>> set on the client. Am I misunderstanding something?
>
> This is strange...
>
> Can you please capture traffic on eth0 on the client?
>
> For outgoing packets only zero or non-zero matter. A '0' definitely
> generates a UDP packet with a DF bit on my side, anything else a frame
> with DF bit cleared. I just verified this on net-next with your script.
> It also does not cause any setsockopts but uses the default.
>
>> for example:
>>
>>  two concurrent tcpdump on router eth0 (mtu 1500) and eth1.1102 (mtu
>> 576) interface:
>>
>> 1 #tcpdump -nn -i eth0 -v udp and host 10.3.72.69 &
>>
>> 14:51:11.946143 IP (tos 0x0, ttl 64, id 7193, offset 0, flags [none],
>> proto UDP (17), length 796)
>> 10.3.72.69.43748 > 10.2.72.99.: UDP, length 768
>>
>
> As I said, I cannot reproduce that. :( Please test on eth0 directly so
> we can be sure the packet does not get mangled.
>
> Can you also show me the output of
> ip route get 10.2.72.139
> on the client, after you have possibly already received an ICMP
> pkt-too-big packet?
>
> Thanks,
> Hannes
>
>
>
>
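
When debugging this kind of problem it can help to take the sysctl out
of the picture and force the behaviour per socket. The sketch below
assumes Linux and uses the IP_MTU_DISCOVER socket option; as far as I
know, IP_PMTUDISC_DO requests DF on every datagram and IP_PMTUDISC_DONT
requests DF cleared, while sockets that never set the option follow the
ip_no_pmtu_disc default. The helper names are made up for illustration.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: force a per-socket PMTU discovery mode
 * (for example IP_PMTUDISC_DO or IP_PMTUDISC_DONT).
 * Returns 0 on success, -1 on error. */
static int set_pmtudisc(int fd, int mode)
{
	return setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER,
			  &mode, sizeof(mode));
}

/* Hypothetical helper: read back the mode in effect for the socket,
 * or -1 on error. */
static int get_pmtudisc(int fd)
{
	int mode = -1;
	socklen_t len = sizeof(mode);

	if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
		return -1;
	return mode;
}
```

Calling get_pmtudisc() on a fresh UDP socket is also a quick way to see
which default the current sysctl value produced.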

[PATCH] ixgbe: check Master Disable bit after setting

2015-10-26 Thread dan . streetman

From: Dan Streetman 

Spec section 8.2.4.1.1 notes that after setting the PCIe Master Disable
bit, it must be read to verify it was set before polling the Master Enable
status bit.

This adds the check to verify the Master Disable bit was set.

This also corrects the spec section number reference - the Master Disable
section is 5.2.4.3.2, not 5.2.5.3.2.

Signed-off-by: Dan Streetman 
Signed-off-by: Dan Streetman 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c 
b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index 3f56a80..abfada7 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -2453,6 +2453,16 @@ static s32 ixgbe_disable_pcie_master(struct ixgbe_hw *hw)
/* Always set this bit to ensure any future transactions are blocked */
IXGBE_WRITE_REG(hw, IXGBE_CTRL, IXGBE_CTRL_GIO_DIS);
 
+   /* Spec sec 8.2.4.1.1, Master Disable bit :
+* "After doing any change to this bit the host must read that
+*  the bit has been modified as expected before reading
+*  STATUS.PCIe Master Enable Status bit."
+*/
+   if (!(IXGBE_READ_REG(hw, IXGBE_CTRL) & IXGBE_CTRL_GIO_DIS)) {
+   hw_err(hw, "GIO Master Disable bit didn't set\n");
+   goto gio_dis_fail;
+   }
+
/* Exit if master requests are blocked */
if (!(IXGBE_READ_REG(hw, IXGBE_STATUS) & IXGBE_STATUS_GIO) ||
ixgbe_removed(hw->hw_addr))
@@ -2467,13 +2477,14 @@ static s32 ixgbe_disable_pcie_master(struct ixgbe_hw 
*hw)
 
/*
 * Two consecutive resets are required via CTRL.RST per datasheet
-* 5.2.5.3.2 Master Disable.  We set a flag to inform the reset routine
+* 5.2.4.3.2 Master Disable.  We set a flag to inform the reset routine
 * of this need.  The first reset prevents new master requests from
 * being issued by our device.  We then must wait 1usec or more for any
 * remaining completions from the PCIe bus to trickle in, and then reset
 * again to clear out any effects they may have had on our device.
 */
hw_dbg(hw, "GIO Master Disable bit didn't clear - requesting resets\n");
+gio_dis_fail:
hw->mac.flags |= IXGBE_FLAGS_DOUBLE_RESET_REQUIRED;
 
/*
-- 
2.5.0
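
The control flow the patch adds (write the bit, read it back, and only
then look at the status bit) can be sketched against a simulated device.
Everything here is hypothetical scaffolding: struct fake_hw, the bit
values, and the result enum exist only for the illustration, and the
real driver polls the status bit with a timeout instead of checking it
once.

```c
#include <stdint.h>

#define CTRL_GIO_DIS 0x00000004u	/* illustrative bit positions */
#define STATUS_GIO   0x00080000u

/* Simulated device: ctrl_write_sticks == 0 models a write that is lost. */
struct fake_hw {
	uint32_t ctrl;
	uint32_t status;
	int ctrl_write_sticks;
};

enum disable_result {
	DISABLE_OK,		/* bit set and no master requests pending */
	DISABLE_BIT_NOT_SET,	/* read-back shows the write did not land */
	DISABLE_STILL_BUSY	/* bit set but requests still pending */
};

static void write_ctrl(struct fake_hw *hw, uint32_t bits)
{
	if (hw->ctrl_write_sticks)
		hw->ctrl |= bits;
}

static enum disable_result disable_master(struct fake_hw *hw)
{
	write_ctrl(hw, CTRL_GIO_DIS);

	/* Verify the control bit before trusting the status bit. */
	if (!(hw->ctrl & CTRL_GIO_DIS))
		return DISABLE_BIT_NOT_SET;

	if (!(hw->status & STATUS_GIO))
		return DISABLE_OK;

	return DISABLE_STILL_BUSY;
}
```

In the patch, both failure outcomes end up at the same place: the
double-reset flag is set so the reset routine knows to reset twice.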


[PATCH] bpf: sample: define aarch64 specific registers

2015-10-26 Thread Yang Shi

Define aarch64 specific registers for building bpf samples correctly.

Signed-off-by: Yang Shi 
---
 samples/bpf/bpf_helpers.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/samples/bpf/bpf_helpers.h b/samples/bpf/bpf_helpers.h
index 3a44d3a..af44e56 100644
--- a/samples/bpf/bpf_helpers.h
+++ b/samples/bpf/bpf_helpers.h
@@ -86,5 +86,17 @@ static int (*bpf_l4_csum_replace)(void *ctx, int off, int 
from, int to, int flag
 #define PT_REGS_RC(x) ((x)->gprs[2])
 #define PT_REGS_SP(x) ((x)->gprs[15])
 
+#elif defined(__aarch64__)
+
+#define PT_REGS_PARM1(x) ((x)->regs[0])
+#define PT_REGS_PARM2(x) ((x)->regs[1])
+#define PT_REGS_PARM3(x) ((x)->regs[2])
+#define PT_REGS_PARM4(x) ((x)->regs[3])
+#define PT_REGS_PARM5(x) ((x)->regs[4])
+#define PT_REGS_RET(x) ((x)->regs[30])
+#define PT_REGS_FP(x) ((x)->regs[29]) /* Works only with CONFIG_FRAME_POINTER 
*/
+#define PT_REGS_RC(x) ((x)->regs[0])
+#define PT_REGS_SP(x) ((x)->sp)
+
 #endif
 #endif
-- 
2.0.2
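
To see what the new macros resolve to, here is a userspace sketch with a
mock register structure. The mock layout (31 general-purpose registers
followed by sp) is an assumption made for the example; the macro bodies
are copied from the patch and follow the AArch64 procedure call
convention, where x0..x4 carry the first five arguments, x0 the return
value, x29 the frame pointer, and x30 the return address.

```c
#include <stdint.h>

/* Mock of the register layout the macros assume (not the kernel's
 * real struct pt_regs definition). */
struct mock_pt_regs {
	uint64_t regs[31];	/* x0..x30 */
	uint64_t sp;
	uint64_t pc;
};

#define PT_REGS_PARM1(x) ((x)->regs[0])
#define PT_REGS_PARM2(x) ((x)->regs[1])
#define PT_REGS_PARM3(x) ((x)->regs[2])
#define PT_REGS_PARM4(x) ((x)->regs[3])
#define PT_REGS_PARM5(x) ((x)->regs[4])
#define PT_REGS_RET(x) ((x)->regs[30])	/* link register */
#define PT_REGS_FP(x) ((x)->regs[29])	/* frame pointer */
#define PT_REGS_RC(x) ((x)->regs[0])	/* return value shares x0 */
#define PT_REGS_SP(x) ((x)->sp)
```

Note that PT_REGS_PARM1 and PT_REGS_RC name the same register, so a
probe at function entry sees the first argument there and a return
probe sees the return value.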


Re: [PATCH 1/1] commit c6825c0976fa7893692e0e43b09740b419b23c09 upstream.

2015-10-26 Thread Pablo Neira Ayuso

Hi,

On Mon, Oct 26, 2015 at 11:55:39AM -0700, Ani Sinha wrote:
> netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get

Please, no need to Cc everyone here. Please, submit your Netfilter
patches to netfilter-de...@vger.kernel.org.

Moreover, it would be great if the subject included something
descriptive about what you need. For this I'd suggest:

[PATCH -stable 3.4,backport] netfilter: nf_conntrack: fix RCU race in 
nf_conntrack_find_get

I'm including Neal P. Murphy; he said he would help test these
backports. Getting a Tested-by: tag usually speeds things up too.

The burden is usually huge here; the easier you make it for us, the
better. Then we can review and, if there are no major concerns, I can
submit this to -stable.

Let me know if you have any other questions,
Thanks.

[PATCH v6 14/15] net: wireless: ath: Remove unneeded variable ret returning 0

2015-10-26 Thread Punit Vara

This patch to wcn36xx/main.c fixes a warning caught by
coccicheck:

Unneeded variable: "ret". Return "0" on line 980

Remove unneeded variable ret created to return zero.

Signed-off-by: Punit Vara 
---
Remove empty line suggested by Sergei

 drivers/net/wireless/ath/wcn36xx/main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/wcn36xx/main.c 
b/drivers/net/wireless/ath/wcn36xx/main.c
index 900e72a..94bcc08 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -935,8 +935,6 @@ static const struct ieee80211_ops wcn36xx_ops = {
 
 static int wcn36xx_init_ieee80211(struct wcn36xx *wcn)
 {
-   int ret = 0;
-
static const u32 cipher_suites[] = {
WLAN_CIPHER_SUITE_WEP40,
WLAN_CIPHER_SUITE_WEP104,
@@ -977,7 +975,7 @@ static int wcn36xx_init_ieee80211(struct wcn36xx *wcn)
wcn->hw->sta_data_size = sizeof(struct wcn36xx_sta);
wcn->hw->vif_data_size = sizeof(struct wcn36xx_vif);
 
-   return ret;
+   return 0;
 }
 
 static int wcn36xx_platform_get_resources(struct wcn36xx *wcn,
-- 
2.5.3


Re: [PATCH net] ipv4: fix problems from the RTNH_F_LINKDOWN introduction

2015-10-26 Thread Julian Anastasov


Hello,

On Mon, 26 Oct 2015, Andy Gospodarek wrote:

> On Sat, Oct 24, 2015 at 09:20:00PM +0300, Julian Anastasov wrote:
> > When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
> > we should not delete the local routes if the local address
> > is still present. The confusion comes from the fact that both
> > fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
> > constant. Fix it by returning back the variable 'force'.
> > 
> > Steps to reproduce:
> > modprobe dummy
> > ifconfig dummy0 192.168.168.1 up
> > ip route list table local | grep dummy | grep host
> > local 192.168.168.1 dev dummy0  proto kernel  scope host  src 192.168.168.1
> I tested this before and after your patch and I don't see a different
> output.  Was I supposed to see something different?

Sorry, the test is missing one command. I'll split the patch and add
the missing "ifconfig dummy0 down" command. It was lost because I had
problems putting '#' before the commands: a leading '#' is treated as
a comment.

> > Second fix
> I would prefer you move these two fixes into 2 separate patches as it
> isn't totally clear which hunks fix each of these issues.

Preparing patchset...

> Are you seeing this with iproute2 (or other tools) or are you just
> seeing this by monitoring netlink messages/looking at a netlink cache
> you have built inside an application?

ifconfig and ip route.

> I have seen a problem similar to what you have reported with netlink
> caches and have a fix I can give you if you would like to try it.  It is
> a slightly larger structural change, but it appears to cover a
> few more cases than this fix does.

No, I'm focusing just on this problem.

Regards

--
Julian Anastasov 

[PATCH net-next v2 2/2] Documentation: dts: xgene: Add TX/RX delay field

2015-10-26 Thread Iyappan Subramanian

Signed-off-by: Iyappan Subramanian 
---
 Documentation/devicetree/bindings/net/apm-xgene-enet.txt | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt 
b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt
index f55aa28..078060a 100644
--- a/Documentation/devicetree/bindings/net/apm-xgene-enet.txt
+++ b/Documentation/devicetree/bindings/net/apm-xgene-enet.txt
@@ -37,6 +37,14 @@ Required properties for ethernet interfaces that have 
external PHY:
 
 Optional properties:
 - status: Should be "ok" or "disabled" for enabled/disabled. Default is "ok".
+- tx-delay: Delay value for RGMII bridge TX clock.
+   Valid values are between 0 to 7, that maps to
+   417, 717, 1020, 1321, 1611, 1913, 2215, 2514 ps
+   Default value is 4, which corresponds to 1611 ps
+- rx-delay: Delay value for RGMII bridge RX clock.
+   Valid values are between 0 to 7, that maps to
+   273, 589, 899, 1222, 1480, 1806, 2147, 2464 ps
+   Default value is 2, which corresponds to 899 ps
 
 Example:
menetclk: menetclk {
@@ -72,5 +80,7 @@ Example:
 
 /* Board-specific peripheral configurations */
  {
+   tx-delay = <4>;
+   rx-delay = <2>;
 status = "ok";
 };
-- 
1.9.1
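
The index-to-delay mapping documented in this binding is easy to get
wrong by one step, so here it is as a lookup table. The picosecond
values and defaults are taken from the binding text above; the table
and function names are hypothetical and do not appear in the driver.

```c
/* Delay in picoseconds for each tx-delay / rx-delay index (0..7),
 * as listed in the binding document. */
static const unsigned int tx_delay_ps[8] = {
	417, 717, 1020, 1321, 1611, 1913, 2215, 2514
};
static const unsigned int rx_delay_ps[8] = {
	273, 589, 899, 1222, 1480, 1806, 2147, 2464
};

#define TX_DELAY_DEFAULT 4	/* 1611 ps */
#define RX_DELAY_DEFAULT 2	/* 899 ps */

/* Return the delay in ps for a property value, or -1 if out of range. */
static int delay_to_ps(const unsigned int table[8], unsigned int idx)
{
	if (idx > 7)
		return -1;
	return (int)table[idx];
}
```

For example, a board that needs roughly 2 ns of TX clock delay would
pick tx-delay = <5> (1913 ps) or <6> (2215 ps) from this table.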


[PATCH net-next v2 0/2] drivers: xgene: Add support RGMII TX/RX delay configuration

2015-10-26 Thread Iyappan Subramanian

The X-Gene RGMII ethernet controller has an RGMII bridge that converts
the RGMII signals {RX_CLK, RX_CTL, RX_DATA[3:0]} from the PHY to the
GMII signals {RX_DV, RX_ER, RX_DATA[7:0]} and vice versa.  This RGMII
bridge can internally delay the input RX_CLK and the output TX_CLK
using configuration registers.  This helps maintain the CLK-CTL delay
relationship under various operating conditions.

This patch set adds support for RGMII TX/RX delay configuration.

Signed-off-by: Iyappan Subramanian 
---

Iyappan Subramanian (2):
  drivers: net: xgene: Add support RGMII TX/RX delay configuration
  Documentation: dts: xgene: Add TX/RX delay field

 .../devicetree/bindings/net/apm-xgene-enet.txt | 10 +
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c |  8 +++-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h |  1 +
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c   | 49 ++
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h   |  2 +
 5 files changed, 69 insertions(+), 1 deletion(-)

-- 
1.9.1


[PATCH net-next v2 1/2] drivers: net: xgene: Add support RGMII TX/RX delay configuration

2015-10-26 Thread Iyappan Subramanian

Add RGMII TX/RX delay configuration support. The RGMII standard requires
a 2 ns delay to help the RGMII bridge receiver sample data correctly. If
the default value does not center the data sample properly, the TX/RX
delay parameters can be used to adjust it.

Signed-off-by: Iyappan Subramanian 
---
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.c   |  8 +++-
 drivers/net/ethernet/apm/xgene/xgene_enet_hw.h   |  1 +
 drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 49 
 drivers/net/ethernet/apm/xgene/xgene_enet_main.h |  2 +
 4 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
index 652f218..33850a0 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c
@@ -461,6 +461,7 @@ static void xgene_gmac_reset(struct xgene_enet_pdata *pdata)
 
 static void xgene_gmac_init(struct xgene_enet_pdata *pdata)
 {
+   struct device *dev = &pdata->pdev->dev;
u32 value, mc2;
u32 intf_ctl, rgmii;
u32 icm0, icm2;
@@ -490,7 +491,12 @@ static void xgene_gmac_init(struct xgene_enet_pdata *pdata)
default:
ENET_INTERFACE_MODE2_SET(, 2);
intf_ctl |= ENET_GHD_MODE;
-   CFG_TXCLK_MUXSEL0_SET(, 4);
+
+   if (dev->of_node) {
+   CFG_TXCLK_MUXSEL0_SET(, pdata->tx_delay);
+   CFG_RXCLK_MUXSEL0_SET(, pdata->rx_delay);
+   }
+
xgene_enet_rd_csr(pdata, DEBUG_REG_ADDR, );
value |= CFG_BYPASS_UNISEC_TX | CFG_BYPASS_UNISEC_RX;
xgene_enet_wr_csr(pdata, DEBUG_REG_ADDR, value);
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h 
b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
index ff05bbc..6dee73c 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h
@@ -144,6 +144,7 @@ enum xgene_enet_rm {
 #define CFG_BYPASS_UNISEC_RX   BIT(1)
 #define CFG_CLE_BYPASS_EN0 BIT(31)
 #define CFG_TXCLK_MUXSEL0_SET(dst, val)xgene_set_bits(dst, val, 29, 3)
+#define CFG_RXCLK_MUXSEL0_SET(dst, val)xgene_set_bits(dst, val, 26, 3)
 
 #define CFG_CLE_IP_PROTOCOL0_SET(dst, val) xgene_set_bits(dst, val, 16, 2)
 #define CFG_CLE_DSTQID0_SET(dst, val)  xgene_set_bits(dst, val, 0, 12)
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c 
b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 6b1846d..ce10687 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -1118,6 +1118,47 @@ static int xgene_get_port_id_dt(struct device *dev, 
struct xgene_enet_pdata *pda
return ret;
 }
 
+static int xgene_get_tx_delay(struct xgene_enet_pdata *pdata)
+{
+   struct device *dev = &pdata->pdev->dev;
+   int delay, ret;
+
+   ret = of_property_read_u32(dev->of_node, "tx-delay", &delay);
+   if (ret) {
+   pdata->tx_delay = 4;
+   return 0;
+   }
+
+   if (delay < 0 || delay > 7) {
+   dev_err(dev, "Invalid tx-delay specified\n");
+   return -EINVAL;
+   }
+
+   pdata->tx_delay = delay;
+
+   return 0;
+}
+
+static int xgene_get_rx_delay(struct xgene_enet_pdata *pdata)
+{
+   struct device *dev = &pdata->pdev->dev;
+   int delay, ret;
+
+   ret = of_property_read_u32(dev->of_node, "rx-delay", &delay);
+   if (ret) {
+   pdata->rx_delay = 2;
+   return 0;
+   }
+
+   if (delay < 0 || delay > 7) {
+   dev_err(dev, "Invalid rx-delay specified\n");
+   return -EINVAL;
+   }
+
+   pdata->rx_delay = delay;
+
+   return 0;
+}
 
 static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 {
@@ -1194,6 +1235,14 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
return -ENODEV;
}
 
+   ret = xgene_get_tx_delay(pdata);
+   if (ret)
+   return ret;
+
+   ret = xgene_get_rx_delay(pdata);
+   if (ret)
+   return ret;
+
ret = platform_get_irq(pdev, 0);
if (ret <= 0) {
dev_err(dev, "Unable to get ENET Rx IRQ\n");
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
index ff89a5d..a6e56b8 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h
@@ -184,6 +184,8 @@ struct xgene_enet_pdata {
u8 bp_bufnum;
u16 ring_num;
u32 mss;
+   u8 tx_delay;
+   u8 rx_delay;
 };
 
 struct xgene_indirect_ctl {
-- 
1.9.1
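
As a rough illustration of what the CFG_TXCLK_MUXSEL0_SET and
CFG_RXCLK_MUXSEL0_SET macros in this patch do to the RGMII register value,
here is a minimal user-space sketch. It assumes that
xgene_set_bits(dst, val, start, len) replaces the len-bit field starting at
bit start; set_bits() and apply_rgmii_delays() are hypothetical names used
only for this illustration, not driver functions.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions and width taken from the two macros in the patch. */
#define TXCLK_MUXSEL0_SHIFT 29
#define RXCLK_MUXSEL0_SHIFT 26
#define MUXSEL0_WIDTH       3

/*
 * Hypothetical stand-in for xgene_set_bits(): replace the len-bit field
 * that starts at bit `start` of *dst with val (assumed semantics).
 */
static void set_bits(uint32_t *dst, uint32_t val, unsigned int start,
		     unsigned int len)
{
	uint32_t mask;

	assert(len > 0 && len < 32 && start + len <= 32);
	mask = ((UINT32_C(1) << len) - 1) << start;
	*dst = (*dst & ~mask) | ((val << start) & mask);
}

/*
 * Apply both delays to an RGMII register image. Returns 0 on success or
 * -1 if a delay does not fit in the 3-bit field (same 0..7 range the
 * patch enforces when reading "tx-delay" and "rx-delay").
 */
int apply_rgmii_delays(uint32_t *rgmii, uint32_t tx_delay, uint32_t rx_delay)
{
	if (tx_delay > 7 || rx_delay > 7)
		return -1;

	set_bits(rgmii, tx_delay, TXCLK_MUXSEL0_SHIFT, MUXSEL0_WIDTH);
	set_bits(rgmii, rx_delay, RXCLK_MUXSEL0_SHIFT, MUXSEL0_WIDTH);
	return 0;
}
```

With the defaults from the patch (tx-delay 4, rx-delay 2) the two fields
occupy bits 31:29 and 28:26 and do not overlap, so the order of the two
updates does not matter.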


seccomp c/r patch

2015-10-26 Thread Tycho Andersen

Hi all,

Here is a patch that we'd like to go via net-next, as it depends on previous
changes in that tree.

Thanks,

Tycho


Re: ip_no_pmtu_disc and UDP

2015-10-26 Thread Hannes Frederic Sowa

On Mon, Oct 26, 2015, at 23:53, Vincent Li wrote:
> ok, I observed  if i increase the UDP client packet size > local
> interface  MTU 1500, the client will fragment the packet first and
> then send it out, if the UDP client packet size < local interface MTU
> 1500, the DF bit will be set when ip_no_pmtu_disc set to 0, is this
> expected behavior ?

Yes, it is.

Bye,
Hannes

Re: [BUG] bnx2x_config_vlan_mac called a NULL function pointer

2015-10-26 Thread Otto Sabart

Thank you Ariel, just let me know if you find anything.


On 23. Oct (Friday) v 08:16:01 + 2015, Ariel Elior wrote:
> Looking into it...
> 
> > Hello netdev,
> > I probably found a bug in kernel-4.3.0-0.rc5 (bnx2x driver). So I opened
> > new bug report in our bugzilla [0]. Michal Schmidt told me the best way
> > to solve an upstream bug is to contact you directly to netdev list.. so
> > here I am :).
> > 
> > Can somebody take a look at it?
> > 
> > [0] https://bugzilla.redhat.com/show_bug.cgi?id=1273894
> > 

[PATCH] ixgbe: Wait for 1ms, not 1us, after RST

2015-10-26 Thread dan . streetman

From: Dan Streetman 

The driver currently waits 1us after issuing a RST, but the spec
requires it to wait 1ms.

Signed-off-by: Dan Streetman 
Signed-off-by: Dan Streetman 
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
index 4e75843..147bc65 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_x540.c
@@ -113,7 +113,12 @@ mac_reset_top:
 
/* Poll for reset bit to self-clear indicating reset is complete */
for (i = 0; i < 10; i++) {
-   udelay(1);
+   /* sec 8.2.4.1.1 :
+* programmers must wait approximately 1 ms after setting before
+* attempting to check if the bit has cleared or to access (read
+* or write) any other device register.
+*/
+   mdelay(1);
ctrl = IXGBE_READ_REG(hw, IXGBE_CTRL);
if (!(ctrl & IXGBE_CTRL_RST_MASK))
break;
-- 
2.5.0


Re: [PATCH v8] seccomp, ptrace: add support for dumping seccomp filters

2015-10-26 Thread Daniel Borkmann


Hi Tycho,

On 10/27/2015 01:04 AM, Tycho Andersen wrote:
> On Mon, Oct 26, 2015 at 04:07:01PM +0900, Kees Cook wrote:
>> On Mon, Oct 26, 2015 at 3:46 PM, Kees Cook  wrote:
>>> Cool, thanks. I'll get this into my tree after kernel summit. Thanks
>>> for suffering through all this Tycho!
>>
>> Actually, since this depends on changes in net, could this get pulled
>> in from that direction?
>>
>> Acked-by: Kees Cook
>
> Can we get the attached patch into net-next?


You need to make a fresh, formal submission of your patch to netdev,
not as an attachment (otherwise patchwork cannot properly pick it up).

Also, indicate the right tree in the subject as: [PATCH net-next] ...

Thanks,
Daniel

Re: [PATCH net-next 0/4] Automatic adjustment of max frame size

2015-10-26 Thread David Miller

From: "Keller, Jacob E" 
Date: Mon, 26 Oct 2015 20:50:45 +

> I am not sure why it failed to mail to it, unless netdev blocks non
> subscribers from sending mail (it probably does)

It absolutely does not.

Re: [PATCH v8] seccomp, ptrace: add support for dumping seccomp filters

2015-10-26 Thread Tycho Andersen

Hi David,

On Mon, Oct 26, 2015 at 04:07:01PM +0900, Kees Cook wrote:
> On Mon, Oct 26, 2015 at 3:46 PM, Kees Cook  wrote:
> > Cool, thanks. I'll get this into my tree after kernel summit. Thanks
> > for suffering through all this Tycho!
> 
> Actually, since this depends on changes in net, could this get pulled
> in from that direction?
> 
> Acked-by: Kees Cook 

Can we get the attached patch into net-next?

Thanks,

Tycho
From 5d9be66e4f48e0882a5546376380147f2f711bec Mon Sep 17 00:00:00 2001
From: Tycho Andersen 
Date: Fri, 2 Oct 2015 18:49:43 -0600
Subject: [PATCH] seccomp, ptrace: add support for dumping seccomp filters

This patch adds support for dumping a process' (classic BPF) seccomp
filters via ptrace.

PTRACE_SECCOMP_GET_FILTER allows the tracer to dump the user's classic BPF
seccomp filters. addr should be an integer which represents the ith seccomp
filter (0 is the most recently installed filter). data should be a struct
sock_filter * with enough room for the ith filter, or NULL, in which case
the filter is not saved. The return value for this command is the number of
BPF instructions the program represents, or negative in the case of errors.
Command specific errors are ENOENT: which indicates that there is no ith
filter in this seccomp tree, and EMEDIUMTYPE, which indicates that the ith
filter was not installed as a classic BPF filter.

A caveat with this approach is that there is no way to get explicitly at
the hierarchy of seccomp filters, and users need to memcmp() filters to
decide which are inherited. This means that a task which installs two of
the same filter can potentially confuse users of this interface.
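
The calling convention described above can be sketched from the tracer's
side as follows. This is a minimal illustration, not code from the patch:
it assumes the tracee is already attached and stopped, uses the 0x420c
request value from the patch when the libc headers do not provide
PTRACE_SECCOMP_GET_FILTER, and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <linux/filter.h>	/* struct sock_filter */
#include <stddef.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>

#ifndef PTRACE_SECCOMP_GET_FILTER
#define PTRACE_SECCOMP_GET_FILTER 0x420c	/* value from the patch */
#endif

/*
 * Fetch the index-th classic BPF seccomp filter of an attached, stopped
 * tracee (index 0 is the most recently installed filter). On success,
 * returns a malloc'd instruction array and stores its length in *len.
 * On failure, returns NULL with errno set by ptrace() or calloc().
 */
struct sock_filter *get_seccomp_filter(pid_t pid, unsigned long index,
				       long *len)
{
	struct sock_filter *insns;
	long n;
	int saved;

	assert(len != NULL);

	/* data == NULL: the filter is not copied, only its length returned. */
	n = ptrace(PTRACE_SECCOMP_GET_FILTER, pid, (void *)index, NULL);
	if (n <= 0)
		return NULL;

	insns = calloc((size_t)n, sizeof(*insns));
	if (!insns)
		return NULL;

	/* Second call: copy the instructions into the buffer. */
	n = ptrace(PTRACE_SECCOMP_GET_FILTER, pid, (void *)index, insns);
	if (n < 0) {
		saved = errno;
		free(insns);
		errno = saved;
		return NULL;
	}

	*len = n;
	return insns;
}

/*
 * When walking indices 0, 1, 2, ..., ENOENT means the end of the chain
 * was reached; EMEDIUMTYPE means this entry exists but is not classic
 * BPF, so a caller may want to skip it rather than stop.
 */
int no_more_filters(int err)
{
	return err == ENOENT;
}
```

A caller would typically loop over increasing indices and stop when
no_more_filters(errno) is true.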

v2: * make save_orig const
    * check that the orig_prog exists (not necessary right now, but when
      grows eBPF support it will be)
    * s/n/filter_off and make it an unsigned long to match ptrace
    * count "down" the tree instead of "up" when passing a filter offset

v3: * don't take the current task's lock for inspecting its seccomp mode
* use a 0x42** constant for the ptrace command value

v4: * don't copy to userspace while holding spinlocks

v5: * add another condition to WARN_ON

v6: * rebase on net-next

Signed-off-by: Tycho Andersen 
Acked-by: Kees Cook 
CC: Will Drewry 
Reviewed-by: Oleg Nesterov 
CC: Andy Lutomirski 
CC: Pavel Emelyanov 
CC: Serge E. Hallyn 
CC: Alexei Starovoitov 
CC: Daniel Borkmann 
---
 include/linux/seccomp.h | 11 +++
 include/uapi/linux/ptrace.h |  2 ++
 kernel/ptrace.c |  5 +++
 kernel/seccomp.c| 76 -
 4 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
index f426503..2296e6b 100644
--- a/include/linux/seccomp.h
+++ b/include/linux/seccomp.h
@@ -95,4 +95,15 @@ static inline void get_seccomp_filter(struct task_struct *tsk)
 	return;
 }
 #endif /* CONFIG_SECCOMP_FILTER */
+
+#if defined(CONFIG_SECCOMP_FILTER) && defined(CONFIG_CHECKPOINT_RESTORE)
+extern long seccomp_get_filter(struct task_struct *task,
+			   unsigned long filter_off, void __user *data);
+#else
+static inline long seccomp_get_filter(struct task_struct *task,
+  unsigned long n, void __user *data)
+{
+	return -EINVAL;
+}
+#endif /* CONFIG_SECCOMP_FILTER && CONFIG_CHECKPOINT_RESTORE */
 #endif /* _LINUX_SECCOMP_H */
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index a7a6979..fb81065 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -64,6 +64,8 @@ struct ptrace_peeksiginfo_args {
 #define PTRACE_GETSIGMASK	0x420a
 #define PTRACE_SETSIGMASK	0x420b
 
+#define PTRACE_SECCOMP_GET_FILTER	0x420c
+
 /* Read signals from a shared (process wide) queue */
 #define PTRACE_PEEKSIGINFO_SHARED	(1 << 0)
 
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 787320d..b760bae 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -1016,6 +1016,11 @@ int ptrace_request(struct task_struct *child, long request,
 		break;
 	}
 #endif
+
+	case PTRACE_SECCOMP_GET_FILTER:
+		ret = seccomp_get_filter(child, addr, datavp);
+		break;
+
 	default:
 		break;
 	}
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 06858a7..580ac2d 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -347,6 +347,7 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 {
 	struct seccomp_filter *sfilter;
 	int ret;
+	const bool save_orig = config_enabled(CONFIG_CHECKPOINT_RESTORE);
 
 	if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
 		return ERR_PTR(-EINVAL);
@@ -370,7 +371,7 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
 		return ERR_PTR(-ENOMEM);
 
 	ret =

Re: [PATCH 0/4] net: mitigating kmem_cache slowpath for network stack in NAPI context

2015-10-26 Thread David Miller

From: Jesper Dangaard Brouer 
Date: Fri, 23 Oct 2015 14:46:01 +0200

> It has been a long road. Back in July 2014 I realized that the network
> stack was hitting the kmem_cache/SLUB slowpath when freeing SKBs, but
> had no solution.  In Dec 2014 I had implemented a solution called
> qmempool[1], that showed it was possible to improve this, but got
> rejected due to being a cache on top of kmem_cache.  In July 2015
> improvements to kmem_cache were proposed, and recently Oct 2015 my
> kmem_cache (SLAB+SLUB) patches for bulk alloc and free have been
> accepted into the AKPM quilt tree.
> 
> This patchset is the first real use-case for kmem_cache bulk alloc and
> free. It is joint work with Alexander Duyck while still at Red Hat.
> 
> Using bulk free to avoid the SLUB slowpath shows the full potential.
> In this patchset it is realized in NAPI/softirq context.  1. During
> DMA TX completion bulk free is optimal and does not introduce any
> added latency. 2. bulk free of SKBs delay free'ed due to IRQ context
> in net_tx_action softirq completion queue.
> 
> Using bulk alloc is showing minor improvements for SLUB(+0.9%), but a
> very slight slowdown for SLAB(-0.1%).
> 
> [1] http://thread.gmane.org/gmane.linux.network/342347/focus=126138
> 
> 
> This patchset is based on net-next (commit 26440c835), BUT I've
> applied several patches from AKPMs MM-tree.
> 
> Cherrypick some commits from MMOTM tree on branch/tag mmotm-2015-10-06-16-30
> from git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> (Below commit IDs are obviously not stable)

Logically I'm fine with this series, but as you mention there are
dependencies that need to hit upstream before I can merge any of
this stuff into my tree.

I also think that patch #4 is a net-win, and also will expose the
bulking code to more testing since it will be used more often.

Re: [PATCH 0/2] sh_eth: RX buffer alignment fixes

2015-10-26 Thread David Miller

From: Sergei Shtylyov 
Date: Sat, 24 Oct 2015 00:44:27 +0300

>    Here's a set of 2 patches against DaveM's 'net.git' repo which are the
> fixes to the RX buffer size calculation.
> 
> [1/2] sh_eth: fix RX buffer size alignment
> [2/2] sh_eth: fix RX buffer size calculation

Series applied, thanks.

Re: ip_no_pmtu_disc and UDP

2015-10-26 Thread Hannes Frederic Sowa

Hello,

On Mon, Oct 26, 2015, at 23:00, Vincent Li wrote:
> the UDP packet size is about 768, here is what the packet path looks like:
>
> client (eth0 mtu 1500 ip 10.3.72.69)
>   -> router (eth0 mtu 1500 ip 10.3.72.1,
>              eth1.1102 mtu 567 ip 10.2.72.139)
>   -> server (eth0 mtu 1500 ip 10.2.72.99)
> 
> UDP client test script:
> 
> [...]
> 
> so I was hoping that if I echo 0, 1, 2, 3 respectively to
> /proc/sys/net/ipv4/ip_no_pmtu_disc, I would see the DF bit
> set/unset by the client, and that this would show up in tcpdump on the
> router eth0 interface, but instead the DF bit is never set by the client.
> am I misunderstanding something?

This is strange...

Can you please capture traffic on eth0 on the client?

For outgoing packets only zero or non-zero matter. A '0' definitely
generates a UDP packet with a DF bit on my side, anything else a frame
with DF bit cleared. I just verified this on net-next with your script.
It also does not cause any setsockopts but uses the default.

> for example:
> 
>  two concurrent tcpdump on router eth0 (mtu 1500) and eth1.1102 (mtu
> 576) interface:
> 
> 1 #tcpdump -nn -i eth0 -v udp and host 10.3.72.69 &
> 
> 14:51:11.946143 IP (tos 0x0, ttl 64, id 7193, offset 0, flags [none],
> proto UDP (17), length 796)
> 10.3.72.69.43748 > 10.2.72.99.: UDP, length 768
> 

As I said, I cannot reproduce that. :( Please test on eth0 directly so
we can be sure the packet does not get mangled.

Can you also show me the output of
ip route get 10.2.72.139
on the client after you maybe already received a icmp pkt-too-big
packet?

Thanks,
Hannes
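
For comparison with the sysctl default discussed above, the per-socket
override can be sketched as below. This is only an illustration of the
IP_MTU_DISCOVER socket option documented in ip(7); the test script in this
thread does not use it, and udp_socket_with_pmtudisc() and get_pmtudisc()
are hypothetical helper names.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket with an explicit path MTU discovery mode, e.g.
 * IP_PMTUDISC_DONT (never set DF), IP_PMTUDISC_WANT (set DF unless the
 * datagram has to be fragmented locally) or IP_PMTUDISC_DO (always set
 * DF). Returns the socket descriptor, or -1 on error.
 */
int udp_socket_with_pmtudisc(int mode)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;

	if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER,
		       &mode, sizeof(mode)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Read back the mode currently in effect on the socket, or -1 on error. */
int get_pmtudisc(int fd)
{
	int mode;
	socklen_t len = sizeof(mode);

	if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
		return -1;
	return mode;
}
```

Calling get_pmtudisc() on a freshly created socket, before any setsockopt,
is one way to check which default the ip_no_pmtu_disc setting produced on
the client.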


[PATCH net-next 0/2] mpls: multipath improvements

2015-10-26 Thread Robert Shearman

Two improvements to the recently added mpls multipath support. The
first is a fix for the missing initialisation of the nexthop address
length for the v4 and v6 explicit null label routes, and the second is
to reduce the amount of memory used by mpls routes by changing the way
the via addresses are stored.

Robert Shearman (2):
  mpls: fix forwarding using v4/v6 explicit null
  mpls: reduce memory usage of routes

 net/mpls/af_mpls.c  | 123 +---
 net/mpls/internal.h |  26 ++-
 2 files changed, 112 insertions(+), 37 deletions(-)

-- 
2.1.4


[PATCH net-next 2/2] mpls: reduce memory usage of routes

2015-10-26 Thread Robert Shearman

Nexthops for MPLS routes have a via address field sized for the
largest via address that is expected, which is 32 bytes. This means
that in the most common case of having ipv4 via addresses, 28 bytes of
memory more than required are used per nexthop. In the other common
case of an ipv6 nexthop, 16 bytes more than required are
used. With large numbers of MPLS routes this extra memory usage could
start to become significant.

To avoid allocating memory for a maximum length via address when not
all of it is required, and to allow for ease of iterating over
nexthops, the via addresses are now stored in the same memory block
as the route and nexthops, in an array after the end of the array of
nexthops. New accessors are provided to retrieve a pointer to the via
address.

To allow for O(1) access without having to store a pointer or offset
per nh, the via address for each nexthop is sized according to the
maximum via address for any nexthop in the route, which is stored in a
new route field, rt_max_alen, but this is in an existing hole in
struct mpls_route so it doesn't increase the size of the
structure. Each via address is ensured to be aligned to VIA_ALEN_ALIGN
to account for architectures that don't allow unaligned accesses.
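
The layout described above can be modelled in user space roughly as
follows. This is a sketch under stated assumptions, not the kernel code:
struct nh and struct route are simplified stand-ins for struct mpls_nh and
struct mpls_route, VIA_ALIGN is assumed to be sizeof(unsigned long), and
calloc() stands in for kzalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment unit; the patch calls this VIA_ALEN_ALIGN. */
#define VIA_ALIGN sizeof(unsigned long)
#define ALIGN_UP(x, a) \
	(((size_t)(x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Simplified stand-ins for struct mpls_nh and struct mpls_route. */
struct nh {
	int ifindex;
	uint8_t via_alen;
};

struct route {
	uint8_t nhn;		/* number of nexthops */
	uint8_t max_alen;	/* aligned stride of one via slot */
	struct nh nh[];		/* nexthops, followed by the via array */
};

/* Byte offset of the via array: just past the nexthops, aligned up. */
size_t via_array_offset(unsigned int nhn)
{
	return ALIGN_UP(offsetof(struct route, nh) + nhn * sizeof(struct nh),
			VIA_ALIGN);
}

/* One allocation holds the route, its nexthops and all via addresses. */
struct route *route_alloc(unsigned int nhn, uint8_t max_alen)
{
	size_t stride = ALIGN_UP(max_alen, VIA_ALIGN);
	struct route *rt;

	assert(nhn <= UINT8_MAX && stride <= UINT8_MAX);

	rt = calloc(1, via_array_offset(nhn) + nhn * stride);
	if (rt) {
		rt->nhn = (uint8_t)nhn;
		rt->max_alen = (uint8_t)stride;
	}
	return rt;
}

/* O(1) lookup: slot index equals the nexthop's index in rt->nh[]. */
uint8_t *nh_via(struct route *rt, const struct nh *nh)
{
	size_t index = (size_t)(nh - rt->nh);

	return (uint8_t *)rt + via_array_offset(rt->nhn) +
	       index * rt->max_alen;
}
```

Because every slot has the same stride, no per-nexthop pointer or offset
is needed; the cost is that a route mixing short and long via addresses
pays the longest length for every nexthop.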

Signed-off-by: Robert Shearman 
---
 net/mpls/af_mpls.c  | 121 +---
 net/mpls/internal.h |  28 ++--
 2 files changed, 111 insertions(+), 38 deletions(-)

diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 1c58662db4b2..c70d750148b6 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -57,6 +57,20 @@ bool mpls_output_possible(const struct net_device *dev)
 }
 EXPORT_SYMBOL_GPL(mpls_output_possible);
 
+static u8 *__mpls_nh_via(struct mpls_route *rt, struct mpls_nh *nh)
+{
+   u8 *nh0_via = PTR_ALIGN((u8 *)&rt->rt_nh[rt->rt_nhn], VIA_ALEN_ALIGN);
+   int nh_index = nh - rt->rt_nh;
+
+   return nh0_via + rt->rt_max_alen * nh_index;
+}
+
+static const u8 *mpls_nh_via(const struct mpls_route *rt,
+const struct mpls_nh *nh)
+{
+   return __mpls_nh_via((struct mpls_route *)rt, (struct mpls_nh *)nh);
+}
+
 static unsigned int mpls_nh_header_size(const struct mpls_nh *nh)
 {
/* The size of the layer 2.5 labels to be added for this route */
@@ -303,7 +317,7 @@ static int mpls_forward(struct sk_buff *skb, struct net_device *dev,
}
}
 
-   err = neigh_xmit(nh->nh_via_table, out_dev, nh->nh_via, skb);
+   err = neigh_xmit(nh->nh_via_table, out_dev, mpls_nh_via(rt, nh), skb);
if (err)
net_dbg_ratelimited("%s: packet transmission failed: %d\n",
__func__, err);
@@ -340,14 +354,19 @@ struct mpls_route_config {
int rc_mp_len;
 };
 
-static struct mpls_route *mpls_rt_alloc(int num_nh)
+static struct mpls_route *mpls_rt_alloc(int num_nh, u8 max_alen)
 {
+   u8 max_alen_aligned = ALIGN(max_alen, VIA_ALEN_ALIGN);
struct mpls_route *rt;
 
-   rt = kzalloc(sizeof(*rt) + (num_nh * sizeof(struct mpls_nh)),
+   rt = kzalloc(ALIGN(sizeof(*rt) + num_nh * sizeof(*rt->rt_nh),
+  VIA_ALEN_ALIGN) +
+num_nh * max_alen_aligned,
 GFP_KERNEL);
-   if (rt)
+   if (rt) {
rt->rt_nhn = num_nh;
+   rt->rt_max_alen = max_alen_aligned;
+   }
 
return rt;
 }
@@ -408,7 +427,8 @@ static unsigned find_free_label(struct net *net)
 }
 
 #if IS_ENABLED(CONFIG_INET)
-static struct net_device *inet_fib_lookup_dev(struct net *net, void *addr)
+static struct net_device *inet_fib_lookup_dev(struct net *net,
+ const void *addr)
 {
struct net_device *dev;
struct rtable *rt;
@@ -427,14 +447,16 @@ static struct net_device *inet_fib_lookup_dev(struct net *net, void *addr)
return dev;
 }
 #else
-static struct net_device *inet_fib_lookup_dev(struct net *net, void *addr)
+static struct net_device *inet_fib_lookup_dev(struct net *net,
+ const void *addr)
 {
return ERR_PTR(-EAFNOSUPPORT);
 }
 #endif
 
 #if IS_ENABLED(CONFIG_IPV6)
-static struct net_device *inet6_fib_lookup_dev(struct net *net, void *addr)
+static struct net_device *inet6_fib_lookup_dev(struct net *net,
+  const void *addr)
 {
struct net_device *dev;
struct dst_entry *dst;
@@ -457,13 +479,15 @@ static struct net_device *inet6_fib_lookup_dev(struct net *net, void *addr)
return dev;
 }
 #else
-static struct net_device *inet6_fib_lookup_dev(struct net *net, void *addr)
+static struct net_device *inet6_fib_lookup_dev(struct net *net,
+  const void *addr)
 {
return ERR_PTR(-EAFNOSUPPORT);
 }
 #endif
 
 static struct net_device

[PATCH net-next 1/2] mpls: fix forwarding using v4/v6 explicit null

2015-10-26 Thread Robert Shearman

Fill in the via address length for the predefined IPv4 and IPv6
explicit-null label routes.

Fixes: f8efb73c97e2 ("mpls: multipath route support")
Signed-off-by: Robert Shearman 
---
 net/mpls/af_mpls.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index cc972e30355b..1c58662db4b2 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -1345,6 +1345,7 @@ static int resize_platform_label_table(struct net *net, size_t limit)
rt0->rt_protocol = RTPROT_KERNEL;
rt0->rt_payload_type = MPT_IPV4;
rt0->rt_nh->nh_via_table = NEIGH_LINK_TABLE;
+   rt0->rt_nh->nh_via_alen = lo->addr_len;
memcpy(rt0->rt_nh->nh_via, lo->dev_addr, lo->addr_len);
}
if (limit > MPLS_LABEL_IPV6NULL) {
@@ -1356,6 +1357,7 @@ static int resize_platform_label_table(struct net *net, size_t limit)
rt2->rt_protocol = RTPROT_KERNEL;
rt2->rt_payload_type = MPT_IPV6;
rt2->rt_nh->nh_via_table = NEIGH_LINK_TABLE;
+   rt2->rt_nh->nh_via_alen = lo->addr_len;
memcpy(rt2->rt_nh->nh_via, lo->dev_addr, lo->addr_len);
}
 
-- 
2.1.4


Re: [PATCH net-next] net: dsa: bcm_sf2: Implement FDB operations

2015-10-26 Thread David Miller

From: Florian Fainelli 
Date: Fri, 23 Oct 2015 11:38:07 -0700

> Add support for the FDB add, delete, and dump operations. The add and
> delete operations are implemented using directed ARL operations using
> the specified MAC address and consist in a read operation, write and
> readback operation.
> 
> The dump operation consists in using the ARL search and software
> filtering entries which are not for the desired port.
> 
> Signed-off-by: Florian Fainelli 

Applied, thanks.

Re: [PATCH 2/2] net: add driver for Netronome NFP4000/NFP6000 NIC VFs

2015-10-26 Thread David Miller

From: Jakub Kicinski 
Date: Fri, 23 Oct 2015 19:58:11 +0100

> +struct nfp_net_tx_buf {
> + struct sk_buff *skb;
> + dma_addr_t dma_addr;
> + short int fidx;
> + u16 pkt_cnt;
> + u32 real_len;
> +};

This packs very poorly, and has a lot of padding holes.  Better ordering
would be:

struct nfp_net_tx_buf {
struct sk_buff *skb;
dma_addr_t dma_addr;
u32 real_len;
short int fidx;
u16 pkt_cnt;
};

You really should audit the most core datastructures in this driver
for the same problem.
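
One way to audit a structure for padding is to compare its sizeof with the
sum of its member sizes, as in the sketch below (tools such as pahole
report the same information from debug info). The typedef for dma_addr_t
and the struct and function names are assumptions made for this
user-space illustration; whether the two orderings differ depends on the
target ABI and on how dma_addr_t is configured, so no output is quoted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: DMA addresses are 64 bits wide. */
typedef uint64_t dma_addr_t;
struct sk_buff;			/* opaque here, only used via pointer */

/* Member order as posted in the driver patch. */
struct tx_buf_posted {
	struct sk_buff *skb;
	dma_addr_t dma_addr;
	short int fidx;
	uint16_t pkt_cnt;
	uint32_t real_len;
};

/* Member order suggested in the review: widest members first. */
struct tx_buf_suggested {
	struct sk_buff *skb;
	dma_addr_t dma_addr;
	uint32_t real_len;
	short int fidx;
	uint16_t pkt_cnt;
};

/* Bytes occupied by the members themselves, identical for both orders. */
#define TX_BUF_MEMBER_BYTES \
	(sizeof(struct sk_buff *) + sizeof(dma_addr_t) + \
	 sizeof(uint32_t) + sizeof(short int) + sizeof(uint16_t))

/* Padding (holes plus tail padding) for a struct of the given size. */
size_t padding_bytes(size_t struct_size)
{
	assert(struct_size >= TX_BUF_MEMBER_BYTES);
	return struct_size - TX_BUF_MEMBER_BYTES;
}
```

Printing padding_bytes(sizeof(struct tx_buf_posted)) and
padding_bytes(sizeof(struct tx_buf_suggested)) on each target of interest
shows how much each ordering wastes there.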

Re: [PATCH net-next V18 3/3] 802.1AD: Flow handling, actions, vlan parsing and netlink attributes

2015-10-26 Thread Pravin Shelar

On Sun, Oct 25, 2015 at 5:11 PM, Thomas F Herbert
 wrote:
> Add support for 802.1ad including the ability to push and pop double
> tagged vlans. Add support for 802.1ad to netlink parsing and flow
> conversion. Uses double nested encap attributes to represent double
> tagged vlan. Inner TPID encoded along with ctci in nested attributes. Outer
> TPID is also encoded in the flow key.
>
> Signed-off-by: Thomas F Herbert 
This patch does not apply on current master due to conflicts related
to the net-branch merge.

> ---
>  net/openvswitch/actions.c  |   6 +-
>  net/openvswitch/flow.c |  76 
>  net/openvswitch/flow.h |   8 +-
>  net/openvswitch/flow_netlink.c | 199 
> +
>  net/openvswitch/vport-netdev.c |   4 +-
>  5 files changed, 252 insertions(+), 41 deletions(-)
>
> diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> index c8db44a..ed19e2b 100644
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
> @@ -302,24 +302,68 @@ static bool icmp6hdr_ok(struct sk_buff *skb)
>   sizeof(struct icmp6hdr));
>  }
>
> -static int parse_vlan(struct sk_buff *skb, struct sw_flow_key *key)
> +/* Parse vlan tag from vlan header.
> + * Returns ERROR on memory error.
> + * Returns 0 if it encounters a non-vlan or incomplete packet.
> + * Returns 1 after successfully parsing vlan tag.
> + */
> +
> +static int parse_vlan_tag(struct sk_buff *skb, struct vlan_head *vlan)
>  {
> -   struct qtag_prefix {
> -   __be16 eth_type; /* ETH_P_8021Q */
> -   __be16 tci;
> -   };
> -   struct qtag_prefix *qp;
> +   struct vlan_head *qp = (struct vlan_head *)skb->data;
> +
> +   if (likely(!eth_type_vlan(qp->tpid)))
> +   return 0;
>
> -   if (unlikely(skb->len < sizeof(struct qtag_prefix) + sizeof(__be16)))
> +   if (unlikely(skb->len < sizeof(struct vlan_head) + sizeof(__be16)))
> return 0;
Why do we need extra sizeof(__be16) bytes here?

>
> -   if (unlikely(!pskb_may_pull(skb, sizeof(struct qtag_prefix) +
> -sizeof(__be16))))
> +   if (unlikely(!pskb_may_pull(skb, sizeof(struct vlan_head) +
> +sizeof(__be16))))
> return -ENOMEM;
>
> -   qp = (struct qtag_prefix *) skb->data;
> -   key->eth.tci = qp->tci | htons(VLAN_TAG_PRESENT);
> -   __skb_pull(skb, sizeof(struct qtag_prefix));
> +   vlan->tci = qp->tci | htons(VLAN_TAG_PRESENT);
> +   vlan->tpid = qp->tpid;
> +
> +   __skb_pull(skb, sizeof(struct vlan_head));
> +   return 1;
> +}
> +
...

> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index c92d6a2..7e90f8c 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
...

> +
> +static int parse_vlan_from_nlattrs(const struct nlattr **nla,
> +  struct sw_flow_match *match,
> +  u64 *key_attrs, bool *ie_valid,
> +  const struct nlattr **a, bool is_mask,
> +  bool log)
> +{
> +   int err;
> +   const struct nlattr *encap;
> +   u64 v_attrs = 0;
> +
> +   if (!is_mask) {
> +   err = __parse_vlan_from_nlattrs(nla, match, key_attrs,
> +   false, a, is_mask, log);
> +   if (err)
> +   return err;
> +
> +   /* Another encap attribute here indicates
> +* the presence of a double tagged vlan.
> +*/
> +   encap = a[OVS_KEY_ATTR_ENCAP];
> +
> +   err = parse_flow_nlattrs(encap, a, &v_attrs, log);
> +   if (err)
> +   return err;
> +
> +   if ((v_attrs & (1 << OVS_KEY_ATTR_ETHERTYPE)) &&
> +   eth_type_vlan(nla_get_be16(a[OVS_KEY_ATTR_ETHERTYPE]))) {
> +   if (!((v_attrs & (1 << OVS_KEY_ATTR_VLAN)) &&
> + (v_attrs & (1 << OVS_KEY_ATTR_ENCAP)))) {
> +   OVS_NLERR(log, "Invalid Inner VLAN frame");
> +   return -EINVAL;
> +   }
> +   *ie_valid = true;
> +   err = __parse_vlan_from_nlattrs(, match, 
> _attrs,
> +   true, a, is_mask, 
> log);
> +   if (err)
> +   return err;
> +   *key_attrs |= v_attrs;
> +   }
> +   } else {
> +   err = __parse_vlan_from_nlattrs(nla, match, key_attrs,
> +   false, a, is_mask, log);
> +   if (err)
> +   return err;
> +
> +   encap =

Re: [PATCHv2 net 2/2] ipv4: update RTNH_F_LINKDOWN flag on UP event

2015-10-26 Thread Andy Gospodarek

On Mon, Oct 26, 2015 at 11:59:13PM +0200, Julian Anastasov wrote:
> When nexthop is part of multipath route we should clear the
> LINKDOWN flag when link goes UP or when first address is added.
> This is needed because we always set LINKDOWN flag when DEAD flag
> was set but now on UP the nexthop is not dead anymore. Examples when
> LINKDOWN bit can be forgotten when no NETDEV_CHANGE is delivered:
> 
> - link goes down (LINKDOWN is set), then link goes UP and device
> shows carrier OK but LINKDOWN remains set
> 
> - last address is deleted (LINKDOWN is set), then address is
> added and device shows carrier OK but LINKDOWN remains set
> 
> Steps to reproduce:
> modprobe dummy
> ifconfig dummy0 192.168.168.1 up
> 
> here add a multipath route where one nexthop is for dummy0:
> 
> ip route add 1.2.3.4 nexthop dummy0 nexthop SOME_OTHER_DEVICE
> ifconfig dummy0 down
> ifconfig dummy0 up
> 
> now ip route shows nexthop that is not dead. Now set the sysctl var:
> 
> echo 1 > /proc/sys/net/ipv4/conf/dummy0/ignore_routes_with_linkdown
> 
> now ip route will show a dead nexthop because the forgotten
> RTNH_F_LINKDOWN is propagated as RTNH_F_DEAD.

I tested this patch and I now see that your reported problem is a result
of dummy never taking carrier down.  There was a presumption that
carrier notification would go down when hardware went down (or when the
logical device backing the hardware went down), but this is clearly not
always the case.

> Fixes: 8a3d03166f19 ("net: track link-status of ipv4 nexthops")
> Signed-off-by: Julian Anastasov 
> ---
>  net/ipv4/fib_semantics.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index f493eff..f657418 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -1445,6 +1445,13 @@ int fib_sync_up(struct net_device *dev, unsigned int nh_flags)
>   if (!(dev->flags & IFF_UP))
>   return 0;
>  
> + if (nh_flags & RTNH_F_DEAD) {
> + unsigned int flags = dev_get_flags(dev);
> +
> + if (flags & (IFF_RUNNING | IFF_LOWER_UP))
> + nh_flags |= RTNH_F_LINKDOWN;
> + }
> +
>   prev_fi = NULL;
>   hash = fib_devindex_hashfn(dev->ifindex);
>   head = &fib_info_devhash[hash];

Logically this patch makes sense, but I feel as though there may be a
slightly better option.  Possibly this:

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 42778d9..7eb7c40 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1376,7 +1376,8 @@ int fib_sync_down_dev(struct net_device *dev, unsigned long event)
nexthop_nh->nh_flags |= RTNH_F_DEAD;
/* fall through */
case NETDEV_CHANGE:
-   nexthop_nh->nh_flags |= RTNH_F_LINKDOWN;
+   if (!netif_carrier_ok(dev))
+   nexthop_nh->nh_flags |= RTNH_F_LINKDOWN;
break;
}
dead++;
@@ -1396,7 +1397,8 @@ int fib_sync_down_dev(struct net_device *dev, unsigned long event)
fi->fib_flags |= RTNH_F_DEAD;
/* fall through */
case NETDEV_CHANGE:
-   fi->fib_flags |= RTNH_F_LINKDOWN;
+   if (!netif_carrier_ok(dev))
+   fi->fib_flags |= RTNH_F_LINKDOWN;
break;
}
ret++;
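
The effect of the netif_carrier_ok() check suggested above can be modelled
in user space as follows. This is a simplified sketch, not kernel code:
the flag values, event names and function names are made up for the
model, and sync_up() only mimics the step where the flags passed to
fib_sync_up() are cleared from the nexthop.

```c
#include <assert.h>
#include <stdbool.h>

/* Model-only flag values and events; not the kernel's definitions. */
enum { NH_DEAD = 1 << 0, NH_LINKDOWN = 1 << 1 };
enum event { EV_DOWN, EV_CHANGE };

/* Existing behaviour: both events set LINKDOWN unconditionally. */
unsigned int sync_down_unconditional(unsigned int flags, enum event ev)
{
	switch (ev) {
	case EV_DOWN:
		flags |= NH_DEAD;
		/* fall through */
	case EV_CHANGE:
		flags |= NH_LINKDOWN;
		break;
	}
	return flags;
}

/* Suggested behaviour: LINKDOWN only when the carrier is really off. */
unsigned int sync_down_checked(unsigned int flags, enum event ev,
			       bool carrier_ok)
{
	switch (ev) {
	case EV_DOWN:
		flags |= NH_DEAD;
		/* fall through */
	case EV_CHANGE:
		if (!carrier_ok)
			flags |= NH_LINKDOWN;
		break;
	}
	return flags;
}

/* On the UP event only the flags named by the caller are cleared. */
unsigned int sync_up(unsigned int flags, unsigned int clear)
{
	return flags & ~clear;
}
```

For a device like dummy, whose carrier stays on across down/up, the
unconditional variant leaves NH_LINKDOWN behind after sync_up(flags,
NH_DEAD), while the checked variant never sets it in the first place.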


Re: [PATCH 0/8] add missing of_node_put

2015-10-26 Thread David Miller

From: Julia Lawall 
Date: Sun, 25 Oct 2015 14:56:59 +0100

> The various for_each device_node iterators perform an of_node_get on each
> iteration, so a break out of the loop requires an of_node_put.
> 
> The complete semantic patch that fixes this problem is
> (http://coccinelle.lip6.fr):
 ...

Series applied, thanks a lot Julia.
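
The reference-counting rule behind this series can be illustrated with a
small user-space mock. Everything here is hypothetical (struct node,
node_get(), node_put(), for_each_node()); it only mimics the contract
described in the cover letter, where the iterator holds a reference on the
current node and drops it when advancing.

```c
#include <assert.h>
#include <stddef.h>

/* Mock refcounted node; stands in for struct device_node. */
struct node {
	int refcount;
	struct node *next;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/* Advance: take a reference on the next node, drop the previous one. */
static struct node *next_node(struct node *head, struct node *prev)
{
	struct node *next = prev ? prev->next : head;

	node_get(next);
	node_put(prev);
	return next;
}

#define for_each_node(head, n) \
	for (n = next_node(head, NULL); n; n = next_node(head, n))

/* Buggy search: breaking out keeps the iterator's reference on n. */
int find_leaky(struct node *head, const struct node *wanted)
{
	struct node *n;
	int idx = 0, found = -1;

	for_each_node(head, n) {
		if (n == wanted) {
			found = idx;
			break;		/* reference on n is leaked */
		}
		idx++;
	}
	return found;
}

/* Fixed search: drop the reference before leaving the loop early. */
int find_balanced(struct node *head, const struct node *wanted)
{
	struct node *n;
	int idx = 0, found = -1;

	for_each_node(head, n) {
		if (n == wanted) {
			found = idx;
			node_put(n);	/* balances the iterator's get */
			break;
		}
		idx++;
	}
	return found;
}
```

If the node is still needed after the loop, the put is instead deferred
until the caller is done with it.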

Re: [PATCH v1 1/3] virtio-net: Using single MSIX IRQ for TX/RX Q pair

2015-10-26 Thread Jason Wang



On 10/27/2015 01:52 AM, Ravi Kerur wrote:
> Ported earlier patch from Jason Wang (dated 12/26/2014).
>
> This patch tries to reduce the number of MSIX irqs required for
> virtio-net by sharing a MSIX irq for each TX/RX queue pair through
> channels. If the transport supports channels, about half of the MSIX irqs
> are saved.
>
> Signed-off-by: Ravi Kerur 
> ---
>  drivers/net/virtio_net.c | 29 ++++++++++++++++++++++++++++-
>  1 file changed, 28 insertions(+), 1 deletion(-)

Thanks for the patches. Some minor comments:

- If there are no big changes to the code, better keep my sign-offs :)
- Rusty does not like the name "channels", so better rename it to
"virtqueue groups".
- The build bot reports some compile issues; these need to be fixed in
the next version.
- The order of patches in this series is reversed: patch 1/3 should be
3/3. It is also better to have a cover letter describing the motivation
and the changes since the last series. (You can do this through
git format-patch --cover-letter.)
- Michael's comment about the unnecessary wakeup of the tx queue needs to
be addressed; otherwise, we may get unnecessary tx interrupts.
- Some benchmarks are needed to make sure there's no performance regression.

> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index d8838ded..d705cce 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -72,6 +72,9 @@ struct send_queue {
>  
>   /* Name of the send queue: output.$index */
>   char name[40];
> +
> + /* Name of the channel, shared with irq. */
> + char channel_name[40];
>  };
>  
>  /* Internal representation of a receive virtqueue */
> @@ -1529,6 +1532,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   int ret = -ENOMEM;
>   int i, total_vqs;
>   const char **names;
> + const char **channel_names;
> + unsigned *channels;
>  
>   /* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by
>* possible N-1 RX/TX queue pairs used in multiqueue mode, followed by
> @@ -1548,6 +1553,17 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   if (!names)
>   goto err_names;
>  
> + channel_names = kmalloc_array(vi->max_queue_pairs,
> +   sizeof(*channel_names),
> +   GFP_KERNEL);
> + if (!channel_names)
> + goto err_channel_names;
> +
> + channels = kmalloc_array(total_vqs, sizeof(*channels),
> +  GFP_KERNEL);
> + if (!channels)
> + goto err_channels;
> +
>   /* Parameters for control virtqueue, if any */
>   if (vi->has_cvq) {
>   callbacks[total_vqs - 1] = NULL;
> @@ -1562,10 +1578,15 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   sprintf(vi->sq[i].name, "output.%d", i);
>   names[rxq2vq(i)] = vi->rq[i].name;
>   names[txq2vq(i)] = vi->sq[i].name;
> + sprintf(vi->sq[i].channel_name, "txrx.%d", i);
> + channel_names[i] = vi->sq[i].channel_name;
> + channels[rxq2vq(i)] = i;
> + channels[txq2vq(i)] = i;
>   }
>  
>   ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
> -  names);
> +  names, channels, channel_names,
> +  vi->max_queue_pairs);
>   if (ret)
>   goto err_find;
>  
> @@ -1580,6 +1601,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   vi->sq[i].vq = vqs[txq2vq(i)];
>   }
>  
> + kfree(channels);
> + kfree(channel_names);
>   kfree(names);
>   kfree(callbacks);
>   kfree(vqs);
> @@ -1587,6 +1610,10 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
>   return 0;
>  
>  err_find:
> + kfree(channels);
> +err_channels:
> + kfree(channel_names);
> +err_channel_names:
>   kfree(names);
>  err_names:
>   kfree(callbacks);
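
To make the queue-to-channel mapping in the quoted hunk concrete, here is a small userspace sketch. rxq2vq() and txq2vq() use the same index arithmetic as the driver helpers of those names; fill_channels() is a hypothetical helper written for this sketch, and how a transport turns a channel into an MSI-X vector is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Same index arithmetic as the driver's rxq2vq()/txq2vq() helpers. */
static int rxq2vq(int rxq) { return rxq * 2; }
static int txq2vq(int txq) { return txq * 2 + 1; }

/*
 * Fill channels[] so that both virtqueues of pair i map to channel i.
 * channels must have room for 2 * pairs entries. Returns the number of
 * channels, i.e. how many vectors the data queues would need when each
 * RX/TX pair shares one.
 */
int fill_channels(unsigned int *channels, int pairs)
{
	assert(channels != NULL && pairs >= 0);

	for (int i = 0; i < pairs; i++) {
		channels[rxq2vq(i)] = (unsigned int)i;
		channels[txq2vq(i)] = (unsigned int)i;
	}
	return pairs;
}
```

With 4 queue pairs there are 8 data virtqueues but only 4 channels, which is where the "about half" figure in the patch description comes from.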


Re: [PATCH v3] net: tso: add support for IPv6

2015-10-26 Thread David Miller

From: Emmanuel Grumbach 
Date: Mon, 26 Oct 2015 10:31:29 +0200

> Adding IPv6 for the TSO helper API is trivial:
> * Don't play with the id (which doesn't exist in IPv6)
> * Correctly update the payload_len (don't include the
>   length of the IP header itself)
> 
> Signed-off-by: Emmanuel Grumbach 

Applied to net-next, thanks.
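
The length difference mentioned in the patch description can be shown with a little arithmetic. This is a sketch only: the function names are made up for illustration, values are in host byte order, and the IPv6 case assumes no extension headers.

```c
#include <assert.h>
#include <stdint.h>

/* IPv4: tot_len covers the IP header, the TCP header and the payload. */
uint16_t seg_ipv4_tot_len(uint16_t ip_hdr_len, uint16_t tcp_hdr_len,
			  uint16_t payload)
{
	uint32_t len = (uint32_t)ip_hdr_len + tcp_hdr_len + payload;

	assert(len <= UINT16_MAX);
	return (uint16_t)len;
}

/* IPv6: payload_len counts only what follows the fixed IPv6 header. */
uint16_t seg_ipv6_payload_len(uint16_t tcp_hdr_len, uint16_t payload)
{
	uint32_t len = (uint32_t)tcp_hdr_len + payload;

	assert(len <= UINT16_MAX);
	return (uint16_t)len;
}
```

For a segment carrying 1448 payload bytes behind a 20-byte TCP header, the IPv4 field (with a 20-byte IP header) is 1488, while the IPv6 field is 1468 because the IPv6 header itself is not counted.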

Re: [PATCH net] ipv4: fix problems from the RTNH_F_LINKDOWN introduction

2015-10-26 Thread Andy Gospodarek

On Mon, Oct 26, 2015 at 11:15:57PM +0200, Julian Anastasov wrote:
> 
>   Hello,
> 
> On Mon, 26 Oct 2015, Andy Gospodarek wrote:
> 
> > On Sat, Oct 24, 2015 at 09:20:00PM +0300, Julian Anastasov wrote:
> > > When fib_netdev_event calls fib_disable_ip on NETDEV_DOWN event
> > > we should not delete the local routes if the local address
> > > is still present. The confusion comes from the fact that both
> > > fib_netdev_event and fib_inetaddr_event use the NETDEV_DOWN
> > > constant. Fix it by returning back the variable 'force'.
> > > 
> > > Steps to reproduce:
> > > modprobe dummy
> > > ifconfig dummy0 192.168.168.1 up
> > > ip route list table local | grep dummy | grep host
> > > local 192.168.168.1 dev dummy0  proto kernel  scope host  src 192.168.168.1
> > I tested this before and after your patch and I don't see a different
> > output.  Was I supposed to see something different?
> 
>   Sorry, the test is missing one command. I'll
> split the patch and add the missing ifconfig dummy0 down
> command. It was lost because I had problems adding '#' before
> the commands, '#' being the comment character anyway.
> 
> > > Second fix
> > I would prefer you move these two fixes into 2 separate patches as it
> > isn't totally clear which hunks fix each of these issues.
> 
>   Preparing patchset...
> 
> > Are you seeing this with iproute2 (or other tools) or are you just
> > seeing this by monitoring netlink messages/looking at a netlink cache
> > you have built inside an application?
> 
>   ifconfig and ip route.
> 
> > I have seen a problem similar to what you have reported with netlink
> > caches and have a fix I can give you if you would like to try it.  It is
> > a slightly larger structural change, but it appears to cover a
> > few more cases than this fix does.
> 
>   No, I'm focusing just on this problem.
> 
> Regards
> 

Thanks for the update.  I'll test and report back in the v2 thread.



[PATCH net-next] bridge: set is_local and is_static before fdb entry is added to the fdb hashtable

2015-10-26 Thread Roopa Prabhu

From: Roopa Prabhu 

Problem Description:
We can add fdbs pointing to the bridge with a NULL ->dst, but this has a
few race conditions. br_fdb_insert() first creates the fdb and only
sets "is_local" to 1 after the fdb has been published/linked. If a
packet arrives for that fdb in this window, it may see the entry as
non-local and either do a NULL pointer dereference in br_forward() or
attach the fdb to the port where the packet arrived; br_fdb_insert()
will later make it local, leaving a wrong fdb entry.

Call chain br_handle_frame_finish() -> br_forward():
For br_handle_frame_finish() to call br_forward(), the dst must not be
local, i.e. skb != NULL. Whenever the dst is found to be local, skb is
set to NULL so that it is not forwarded. The problem is that the
forwarding path runs under RCU only, so it can see the entry before
"is_local" is set to 1 and actually try to dereference NULL.

The main issue is that if someone sends a packet to the switch while
it is adding the entry that points to the bridge device, it may
dereference a NULL pointer. This fix is needed now that we can add fdbs
pointing to the bridge. The window is a problem for br_fdb_update() as
well: while someone is adding a bridge fdb, but before it has
is_local == 1, the entry might get moved to a port if its address shows
up as a source mac, and only then get its "is_local" set to 1.

This patch changes fdb_create() to take is_local and is_static as
arguments, so that these values are set in the fdb entry before it is
added to the hash. It also adds a NULL check for the port in
br_forward().

Reported-by: Nikolay Aleksandrov 
Signed-off-by: Roopa Prabhu 
---
 net/bridge/br_fdb.c     | 15 ++++++++-------
 net/bridge/br_forward.c |  2 +-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index c88bd8e..35a1c7e 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -495,7 +495,9 @@ static struct net_bridge_fdb_entry *fdb_find_rcu(struct hlist_head *head,
 static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
 					       struct net_bridge_port *source,
 					       const unsigned char *addr,
-					       __u16 vid)
+					       __u16 vid,
+					       unsigned char is_local,
+					       unsigned char is_static)
 {
 	struct net_bridge_fdb_entry *fdb;
 
@@ -504,8 +506,8 @@ static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
 		memcpy(fdb->addr.addr, addr, ETH_ALEN);
 		fdb->dst = source;
 		fdb->vlan_id = vid;
-		fdb->is_local = 0;
-		fdb->is_static = 0;
+		fdb->is_local = is_local;
+		fdb->is_static = is_static;
 		fdb->added_by_user = 0;
 		fdb->added_by_external_learn = 0;
 		fdb->updated = fdb->used = jiffies;
@@ -536,11 +538,10 @@ static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 		fdb_delete(br, fdb);
 	}
 
-	fdb = fdb_create(head, source, addr, vid);
+	fdb = fdb_create(head, source, addr, vid, 1, 1);
 	if (!fdb)
 		return -ENOMEM;
 
-	fdb->is_local = fdb->is_static = 1;
 	fdb_add_hw_addr(br, addr);
 	fdb_notify(br, fdb, RTM_NEWNEIGH);
 	return 0;
@@ -597,7 +598,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 	} else {
 		spin_lock(&br->hash_lock);
 		if (likely(!fdb_find(head, addr, vid))) {
-			fdb = fdb_create(head, source, addr, vid);
+			fdb = fdb_create(head, source, addr, vid, 0, 0);
 			if (fdb) {
 				if (unlikely(added_by_user))
 					fdb->added_by_user = 1;
@@ -774,7 +775,7 @@ static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
 		if (!(flags & NLM_F_CREATE))
 			return -ENOENT;
 
-		fdb = fdb_create(head, source, addr, vid);
+		fdb = fdb_create(head, source, addr, vid, 0, 0);
 		if (!fdb)
 			return -ENOMEM;
 
diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index a9d424e..fcdb86d 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -141,7 +141,7 @@ EXPORT_SYMBOL_GPL(br_deliver);
 /* called with rcu_read_lock */
 void br_forward(const struct net_bridge_port *to, struct sk_buff *skb, struct sk_buff *skb0)
 {
-	if (should_deliver(to, skb)) {
+	if (to && should_deliver(to, skb)) {
 		if (skb0)
 			deliver_clone(to, skb, __br_forward);
 		else
--
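
The ordering the patch enforces (set every field, then publish) can be sketched in userspace with C11 atomics. This is a model under stated assumptions, not the bridge code: the toy_* names are hypothetical, a single atomic pointer stands in for the hash bucket, and release/acquire ordering stands in for the kernel's RCU publication primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fdb {
	const void *dst;	/* NULL models an entry pointing at the bridge */
	bool is_local;
	bool is_static;
};

/* Single-slot stand-in for the hash table; readers load it without a lock. */
static _Atomic(struct toy_fdb *) toy_slot;

/* Set every field first, then publish the entry with release ordering. */
void toy_fdb_publish(struct toy_fdb *fdb, const void *dst,
		     bool is_local, bool is_static)
{
	assert(fdb != NULL);
	fdb->dst = dst;
	fdb->is_local = is_local;
	fdb->is_static = is_static;
	atomic_store_explicit(&toy_slot, fdb, memory_order_release);
}

/*
 * Reader side: true if a frame matching the entry may be forwarded to a
 * port. A reader that observes the pointer also observes the fields written
 * before the release store, so it never sees a half-initialised entry.
 */
bool toy_may_forward(void)
{
	struct toy_fdb *fdb = atomic_load_explicit(&toy_slot,
						   memory_order_acquire);

	if (!fdb || fdb->is_local)
		return false;
	return fdb->dst != NULL;	/* mirrors the added "to &&" check */
}
```

Had is_local been written after the store to toy_slot, a concurrent reader could see a non-local entry with a NULL dst, which is the window described in the commit message.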

Re: [PATCH v3 net-next] bpf: fix bpf_perf_event_read() helper

2015-10-26 Thread David Miller

From: Alexei Starovoitov 
Date: Thu, 22 Oct 2015 17:10:14 -0700

> Fix safety checks for bpf_perf_event_read():
> - only non-inherited events can be added to perf_event_array map
>   (do this check statically at map insertion time)
> - dynamically check that event is local and !pmu->count
> Otherwise buggy bpf program can cause kernel splat.
> 
> Also fix error path after perf_event_attrs()
> and remove redundant 'extern'.
> 
> Fixes: 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get 
> the selected hardware PMU conuter")
> Signed-off-by: Alexei Starovoitov 

Applied, although my tendency is to agree with the sentiment that you must
respect the entire universe of valid 64-bit counter values.  I do not buy
the arguments about values overlapping error codes being unlikely or not
worth worrying about.

Just FYI...