date:20161109

Re: Virtio_net support vxlan encapsulation package TSO offload discuss

2016-11-09 Thread Zhangming (James, Euler)

On 2016-11-09 15:14, Jason Wang wrote:
>On 2016-11-08 19:58, Zhangming (James, Euler) wrote:
>> On 2016-11-08 19:17, Jason Wang wrote:
>>
>>> On 2016-11-08 19:13, Jason Wang wrote:
 Cc Michael

 On 2016-11-08 16:34, Zhangming (James, Euler) wrote:
> In the container scenario, OVS is installed in the virtual machine, and 
> all the containers connected to OVS communicate through 
> VXLAN encapsulation.
>
> Currently, virtio_net does not support TSO offload for 
> VXLAN-encapsulated packets. In this situation performance is poor, 
> and the sender is the bottleneck.
>
> I googled this scenario but didn't find any information. Will 
> virtio_net support TSO offload for VXLAN-encapsulated packets later?
>
 Yes and for both sender and receiver.

> My idea is for virtio_net to enable encapsulated TSO offload and pass 
> the encapsulation info to TUN; TUN then parses the info and builds the 
> skb with the encapsulation info.
>
> OVS or the kernel on the host should be modified to support this. With 
> this method, TCP performance is more than 2x what it was before.
>
> Any advice and suggestions for this idea, or new ideas, will be 
> greatly appreciated!
>
> Best regards,
>
> James zhang
>
 Sounds very good. We may also need feature bits
 (VIRTIO_NET_F_GUEST|HOST_GSO_X) for this.

 This is in fact one of the items on the networking todo list (see 
 http://www.linux-kvm.org/page/NetworkingTodo). While at it, we'd 
 better support not only VXLAN but also other tunnels.
>>> Cc Vlad who is working on extending virtio-net headers.
>>>
 We can start with the spec work, or if you've already had some bits 
 you can post them as RFC for early review.

 Thanks
>> Below is my demo code.
>> virtio_net.c
>> In static int virtnet_probe(struct virtio_device *vdev), add the code below:
>>  /* Avoid GSO segmentation; this should be negotiated later,
>>   * because the demo reuses num_buffers. */
>>  if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
>>      virtio_has_feature(vdev, VIRTIO_F_VERSION_1)) {
>>  dev->hw_enc_features |= NETIF_F_TSO;
>>  dev->hw_enc_features |= NETIF_F_ALL_CSUM;
>>  dev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL;
>>  dev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
>>  dev->hw_enc_features |= NETIF_F_GSO_TUNNEL_REMCSUM;
>>
>>  dev->features |= NETIF_F_GSO_UDP_TUNNEL;
>>  dev->features |= NETIF_F_GSO_UDP_TUNNEL_CSUM;
>>  dev->features |= NETIF_F_GSO_TUNNEL_REMCSUM;
>>  }
>>
>> In static int xmit_skb(struct send_queue *sq, struct sk_buff *skb), add 
>> the two pieces of code below:
>>
>>  if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL)
>>          hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL;
>>  if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL_CSUM)
>>          hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL_CSUM;
>>  if (skb_shinfo(skb)->gso_type & SKB_GSO_TUNNEL_REMCSUM)
>>          hdr->hdr.gso_type |= VIRTIO_NET_HDR_GSO_TUNNEL_REMCSUM;
>>
>>  if (skb->encapsulation && skb_is_gso(skb)) {
>>          inner_mac_len = skb_inner_network_header(skb) -
>>                          skb_inner_mac_header(skb);
>>          tnl_len = skb_inner_mac_header(skb) - skb_mac_header(skb);
>>          if (!(inner_mac_len >> DATA_LEN_SHIFT) &&
>>              !(tnl_len >> DATA_LEN_SHIFT)) {
>>                  hdr->hdr.flags |= VIRTIO_NET_HDR_F_ENCAPSULATION;
>>                  /* num_buffers is reused for simplicity; a dedicated
>>                   * member should be added later. */
>>                  hdr->num_buffers = (__virtio16)((inner_mac_len <<
>>                                      DATA_LEN_SHIFT) | tnl_len);
>>          } else
>>                  hdr->num_buffers = 0;
>>  }
>>
>> tun.c
>>  /* Read the header with the negotiated length. */
>>  if (memcpy_fromiovecend((void *)&hdr, iv, offset, tun->vnet_hdr_sz))
>>          return -EFAULT;
>>
>>  /* Set tunnel GSO info. */
>>  if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL)
>>          skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL;
>>  if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL_CSUM)
>>          skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_TUNNEL_CSUM;
>>  if (hdr.gso_type & VIRTIO_NET_HDR_GSO_TUNNEL_REMCSUM)
>>          skb_shinfo(skb)->gso_type |= SKB_GSO_TUNNEL_REMCSUM;
>>
>>  /* Read tunnel info from the header and set it on the built skb. */
>>  if (hdr.flags & VIRTIO_NET_HDR_F_ENCAPSULATION) {
>>  tnl_len = tun16_to_cpu(tun, hdr.num_buffers) & 
>>
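The demo above packs two header lengths into the 16-bit num_buffers field. As a rough userspace sketch of that packing and of the matching unpacking on the TUN side: the value of DATA_LEN_SHIFT is never shown in the demo, so 8 is an assumption here, and encap_pack/encap_unpack are hypothetical names, not functions from virtio_net or tun.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the demo never shows DATA_LEN_SHIFT; 8 splits 16 bits evenly. */
#define DATA_LEN_SHIFT 8
#define DATA_LEN_MASK  ((1u << DATA_LEN_SHIFT) - 1)

/*
 * Pack the inner MAC header length and the tunnel header length into one
 * 16-bit value. Returns false when either length does not fit, mirroring
 * the demo's fallback of leaving the field at 0.
 */
static bool encap_pack(unsigned int inner_mac_len, unsigned int tnl_len,
                       uint16_t *out)
{
    assert(out);
    if ((inner_mac_len >> DATA_LEN_SHIFT) || (tnl_len >> DATA_LEN_SHIFT)) {
        *out = 0;
        return false;
    }
    *out = (uint16_t)((inner_mac_len << DATA_LEN_SHIFT) | tnl_len);
    return true;
}

/* Reverse of encap_pack(): low bits are tnl_len, high bits inner_mac_len. */
static void encap_unpack(uint16_t packed, unsigned int *inner_mac_len,
                         unsigned int *tnl_len)
{
    assert(inner_mac_len && tnl_len);
    *tnl_len = packed & DATA_LEN_MASK;
    *inner_mac_len = packed >> DATA_LEN_SHIFT;
}
```

With this assumed shift, each length is limited to 255 bytes, which is one reason a dedicated header member (as the demo's own comment suggests) would be preferable to reusing num_buffers.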

Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN

2016-11-09 Thread Neftin, Sasha

On 11/9/2016 11:41 PM, Tyler Baicar wrote:
> Move IRQ free code so that it will happen regardless of the
> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
> if the __E1000_DOWN bit is cleared. This is not sufficient because
> it is possible for __E1000_DOWN to be set without releasing the IRQ.
> In such a situation, we will hit a kernel bug later in e1000_remove
> because the IRQ still has action since it was never freed. A
> secondary bus reset can cause this case to happen.
> 
> Signed-off-by: Tyler Baicar 
> ---
>  drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c 
> b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 7017281..36cfcb0 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>  
>   if (!test_bit(__E1000_DOWN, &adapter->state)) {
>   e1000e_down(adapter, true);
> - e1000_free_irq(adapter);
>  
>   /* Link status message must follow this format */
>   pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>   }   
>  
> + e1000_free_irq(adapter);
> +
>   napi_disable(&adapter->napi);
>  
>   e1000e_free_tx_resources(adapter->tx_ring);
> 
I would not recommend applying this change. It touches the driver state
machine, and we are concerned about synchronization problems and other
issues.
We need to keep e1000_free_irq inside the block guarded by the 'test_bit'
check.
Another point: before executing the secondary bus reset, does your software
back up the PCIe configuration space properly?

Sasha
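
The behavioural difference under discussion can be illustrated with a small userspace model. This is only a sketch: struct model_adapter and both functions are hypothetical stand-ins, not driver code, and the model deliberately ignores the synchronization concerns raised above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter; not the driver's real struct. */
struct model_adapter {
    bool down;        /* models the __E1000_DOWN bit */
    bool irq_held;    /* true while the IRQ is still requested */
};

/* Close as it behaves before the patch: IRQ freed only if DOWN was clear. */
static void model_close_old(struct model_adapter *a)
{
    assert(a);
    if (!a->down) {
        a->down = true;
        a->irq_held = false;
    }
}

/* Close as proposed in the patch: IRQ freed regardless of the DOWN bit. */
static void model_close_new(struct model_adapter *a)
{
    assert(a);
    if (!a->down)
        a->down = true;
    a->irq_held = false;
}
```

Starting from the state the patch description attributes to a secondary bus reset (down set, IRQ still held), the old path leaves the IRQ held while the new path releases it.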

Re: [v16, 0/7] Fix eSDHC host version register bug

2016-11-09 Thread Scott Wood

On Thu, 2016-11-10 at 04:11 +, Y.B. Lu wrote:
> > 
> > -Original Message-
> > From: Y.B. Lu
> > Sent: Thursday, November 10, 2016 12:06 PM
> > To: 'Scott Wood'; Ulf Hansson
> > Cc: linux-mmc; Arnd Bergmann; linuxppc-...@lists.ozlabs.org;
> > devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-
> > ker...@vger.kernel.org; linux-clk; io...@lists.linux-foundation.org;
> > netdev@vger.kernel.org; Greg Kroah-Hartman; Mark Rutland; Rob Herring;
> > Russell King; Jochen Friedrich; Joerg Roedel; Claudiu Manoil; Bhupesh
> > Sharma; Qiang Zhao; Kumar Gala; Leo Li; X.B. Xie; M.H. Lian
> > Subject: RE: [v16, 0/7] Fix eSDHC host version register bug
> > 
> > > 
> > > -Original Message-
> > > From: linux-mmc-ow...@vger.kernel.org [mailto:linux-mmc-
> > > ow...@vger.kernel.org] On Behalf Of Scott Wood
> > > Sent: Thursday, November 10, 2016 11:55 AM
> > > To: Ulf Hansson; Y.B. Lu
> > > Cc: linux-mmc; Arnd Bergmann; linuxppc-...@lists.ozlabs.org;
> > > devicet...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> > > linux- ker...@vger.kernel.org; linux-clk;
> > > io...@lists.linux-foundation.org; netdev@vger.kernel.org; Greg
> > > Kroah-Hartman; Mark Rutland; Rob Herring; Russell King; Jochen
> > > Friedrich; Joerg Roedel; Claudiu Manoil; Bhupesh Sharma; Qiang Zhao;
> > > Kumar Gala; Leo Li; X.B. Xie; M.H. Lian
> > > Subject: Re: [v16, 0/7] Fix eSDHC host version register bug
> > > 
> > > On Wed, 2016-11-09 at 19:27 +0100, Ulf Hansson wrote:
> > > > 
> > > > - i2c-list
> > > > 
> > > > On 9 November 2016 at 04:14, Yangbo Lu  wrote:
> > > > > 
> > > > > 
> > > > > This patchset is used to fix a host version register bug in the
> > > > > T4240-
> > > > > R1.0-R2.0
> > > > > eSDHC controller. To match the SoC version and revision, 15
> > > > > previous version patchsets had tried many methods but all of them
> > > > > were rejected by reviewers.
> > > > > Such as
> > > > > - dts compatible method
> > > > > - syscon method
> > > > > - ifdef PPC method
> > > > > > - GUTS driver getting SVR method
> > > > > > Arnd suggested a soc_device_match method in v10, and this is the
> > > > > > only available method left now. This v11 patchset introduces the
> > > > > > soc_device_match interface in soc driver.
> > > > > 
> > > > > The first four patches of Yangbo are to add the GUTS driver. This
> > > > > is used to register a soc device which contain soc version and
> > > > > revision information.
> > > > > The other three patches introduce the soc_device_match method in
> > > > > soc driver and apply it on esdhc driver to fix this bug.
> > > > > 
> > > > > ---
> > > > > > Changes for v15:
> > > > > > - Dropped patch 'dt: bindings: update Freescale DCFG compatible'
> > > > > >   since the work had been done by below patch on ShawnGuo's
> > > > > >   linux tree.
> > > > > >   'dt-bindings: fsl: add LS1043A/LS1046A/LS2080A compatible for
> > > > > >    SCFG and DCFG'
> > > > > > - Fixed error code issue in guts driver
> > > > > > Changes for v16:
> > > > > > - Dropped patch 'powerpc/fsl: move mpc85xx.h to include/linux/fsl'
> > > > > > - Added a bug-fix patch from Geert
> > > > > ---
> > > > > 
> > > > > Arnd Bergmann (1):
> > > > >   base: soc: introduce soc_device_match() interface
> > > > > 
> > > > > Geert Uytterhoeven (1):
> > > > >   base: soc: Check for NULL SoC device attributes
> > > > > 
> > > > > Yangbo Lu (5):
> > > > >   ARM64: dts: ls2080a: add device configuration node
> > > > >   dt: bindings: move guts devicetree doc out of powerpc directory
> > > > >   soc: fsl: add GUTS driver for QorIQ platforms
> > > > >   MAINTAINERS: add entry for Freescale SoC drivers
> > > > >   mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0
> > > > > 
> > > > >  .../bindings/{powerpc => soc}/fsl/guts.txt |   3 +
> > > > >  MAINTAINERS|  11 +-
> > > > >  arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi |   6 +
> > > > >  drivers/base/Kconfig   |   1 +
> > > > >  drivers/base/soc.c |  70 ++
> > > > >  drivers/mmc/host/Kconfig   |   1 +
> > > > >  drivers/mmc/host/sdhci-of-esdhc.c  |  20 ++
> > > > >  drivers/soc/Kconfig|   3 +-
> > > > >  drivers/soc/fsl/Kconfig|  18 ++
> > > > >  drivers/soc/fsl/Makefile   |   1 +
> > > > >  drivers/soc/fsl/guts.c | 236
> > > > > +
> > > > >  include/linux/fsl/guts.h   | 125
> > > > > ++-
> > > > >  include/linux/sys_soc.h|   3 +
> > > > >  13 files changed, 447 insertions(+), 51 deletions(-)
> > > > >  rename Documentation/devicetree/bindings/{powerpc =>
> > > > > soc}/fsl/guts.txt
> > >

Re: [PATCH] usbnet: prevent device rpm suspend in usbnet_probe function

2016-11-09 Thread Kai-Heng Feng

Hi,

On Wed, Nov 9, 2016 at 8:32 PM, Bjørn Mork  wrote:
> Oliver Neukum  writes:
>
>> On Tue, 2016-11-08 at 13:44 -0500, Alan Stern wrote:
>>
>>> These problems could very well be caused by running at SuperSpeed
>>> (USB-3) instead of high speed (USB-2).

Yes, it's running at SuperSpeed, on a Kabylake laptop.

It does not have this issue on a Broadwell laptop, also running at SuperSpeed.

>>>
>>> Is there any way to test what happens when the device is attached to
>>> the computer by a USB-2 cable?  That would prevent it from operating at
>>> SuperSpeed.

I recall old Intel PCH can change the USB host from XHCI to EHCI,
newer PCH does not have this option.

Is there a way to force XHCI run at HighSpeed?

>>>
>>> The main point, however, is that the proposed patch doesn't seem to
>>> address the true problem, which is that the device gets suspended
>>> between probes.  The patch only tries to prevent it from being
>>> suspended during a probe -- which is already prevented by the USB core.
>>
>> But why doesn't it fail during normal operation?
>>
>> I suspect that its firmware requires the altsetting
>>
>> /* should we change control altsetting on a NCM/MBIM function? */
>> if (cdc_ncm_select_altsetting(intf) == CDC_NCM_COMM_ALTSETTING_MBIM) 
>> {
>> data_altsetting = CDC_NCM_DATA_ALTSETTING_MBIM;
>> ret = cdc_mbim_set_ctrlalt(dev, intf, 
>> CDC_NCM_COMM_ALTSETTING_MBIM);
>>
>> to be set before it accepts a suspension.
>
> Could be, but I don't think so.  The above code is effectively a noop
> unless the function is a combined NCM/MBIM function.  Something I've
> never seen on a Sierra Wireless device (ignoring the infamous EM7345,
> which really is an Intel device).
>
> This is a typical example of a Sierra Wireless modem configured for
> MBIM:
>
> P:  Vendor=1199 ProdID=9079 Rev= 0.06
> S:  Manufacturer=Sierra Wireless, Incorporated
> S:  Product=Sierra Wireless EM7455 Qualcomm Snapdragon X7 LTE-A
> S:  SerialNumber=LF615126xxx
> C:* #Ifs= 2 Cfg#= 1 Atr=a0 MxPwr=500mA
> A:  FirstIf#=12 IfCount= 2 Cls=02(comm.) Sub=0e Prot=00
> I:* If#=12 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=(none)
> E:  Ad=82(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
> I:* If#=13 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=02 Driver=(none)
> I:  If#=13 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=(none)
> E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
>
>
> The control interface of plain MBIM functions will always have a single
> altsetting, like the example above. So cdc_ncm_select_altsetting(intf)
> returns "0", while CDC_NCM_COMM_ALTSETTING_MBIM is "1".
>
>
> Just for reference, using the Intel^H^H^H^H^HEM7345 as example, this is
> what a combined NCM/MBIM function looks like:
>
>
> P:  Vendor=1199 ProdID=a001 Rev=17.29
> S:  Manufacturer=Sierra Wireless Inc.
> S:  Product=Sierra Wireless EM7345 4G LTE
> S:  SerialNumber=013937000xx
> C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=100mA
> A:  FirstIf#= 0 IfCount= 2 Cls=02(comm.) Sub=0d Prot=00
> A:  FirstIf#= 2 IfCount= 2 Cls=02(comm.) Sub=02 Prot=01
> I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0d Prot=00 Driver=cdc_mbim
> E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=1ms
> I:* If#= 0 Alt= 1 #EPs= 1 Cls=02(comm.) Sub=0e Prot=00 Driver=cdc_mbim
> E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=1ms
> I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_mbim
> I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_mbim
> E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> I:* If#= 1 Alt= 2 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
> E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> I:* If#= 2 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=02 Prot=01 Driver=(none)
> E:  Ad=83(I) Atr=03(Int.) MxPS=  64 Ivl=1ms
> I:* If#= 3 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=(none)
> E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
> E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
>
>
> And this is what the code you quote is trying to deal with.  Note the
> different subclass of altsetting 0 and 1. This is incredibly ugly.
>
> FWIW, the modem in question cannot be an EM7345. That modem does not
> have the static interface numbering oddity.  Another sign that it isn't
> a true Sierra device.

Yes, this modem is an EM7445.

>
>
>
>
> Bjørn
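
The altsetting argument above can be sketched as a small userspace model. The struct and function names here are hypothetical and the logic is a simplification, not the cdc_ncm driver's actual code; the subclass values 0x0d (NCM) and 0x0e (MBIM) are the ones shown in the descriptor listings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SUBCLASS_NCM  0x0d   /* "Sub=0d" in the listings above */
#define SUBCLASS_MBIM 0x0e   /* "Sub=0e" in the listings above */
#define COMM_ALTSETTING_NCM  0
#define COMM_ALTSETTING_MBIM 1

/* Hypothetical model of a control interface: one subclass per altsetting. */
struct model_ctrl_intf {
    const uint8_t *alt_subclass;
    size_t num_altsetting;
};

/*
 * Simplified selection: only a two-altsetting (combined NCM/MBIM) function
 * whose altsetting 1 is MBIM gets switched; everything else stays on 0.
 */
static int model_select_altsetting(const struct model_ctrl_intf *intf)
{
    assert(intf && intf->num_altsetting >= 1);
    if (intf->num_altsetting == 2 &&
        intf->alt_subclass[COMM_ALTSETTING_MBIM] == SUBCLASS_MBIM)
        return COMM_ALTSETTING_MBIM;
    return COMM_ALTSETTING_NCM;
}
```

For the plain MBIM control interface (a single altsetting with Sub=0e) the model returns 0, so a comparison against the MBIM altsetting value 1 is false and the quoted branch is skipped; for the combined function (Sub=0d, then Sub=0e) it returns 1.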

Re: [v16, 0/7] Fix eSDHC host version register bug

2016-11-09 Thread Scott Wood

On Wed, 2016-11-09 at 19:27 +0100, Ulf Hansson wrote:
> - i2c-list
> 
> On 9 November 2016 at 04:14, Yangbo Lu  wrote:
> > 
> > This patchset is used to fix a host version register bug in the T4240-
> > R1.0-R2.0
> > eSDHC controller. To match the SoC version and revision, 15 previous
> > version
> > patchsets had tried many methods but all of them were rejected by
> > reviewers.
> > Such as
> > - dts compatible method
> > - syscon method
> > - ifdef PPC method
> > - GUTS driver getting SVR method
> > Arnd suggested a soc_device_match method in v10, and this is the only
> > available
> > method left now. This v11 patchset introduces the soc_device_match
> > interface in
> > soc driver.
> > 
> > The first four patches of Yangbo are to add the GUTS driver. This is used
> > to
> > register a soc device which contain soc version and revision information.
> > The other three patches introduce the soc_device_match method in soc
> > driver
> > and apply it on esdhc driver to fix this bug.
> > 
> > ---
> > Changes for v15:
> > - Dropped patch 'dt: bindings: update Freescale DCFG compatible'
> >   since the work had been done by below patch on ShawnGuo's linux
> > tree.
> >   'dt-bindings: fsl: add LS1043A/LS1046A/LS2080A compatible for
> > SCFG
> >    and DCFG'
> > - Fixed error code issue in guts driver
> > Changes for v16:
> > - Dropped patch 'powerpc/fsl: move mpc85xx.h to include/linux/fsl'
> > - Added a bug-fix patch from Geert
> > ---
> > 
> > Arnd Bergmann (1):
> >   base: soc: introduce soc_device_match() interface
> > 
> > Geert Uytterhoeven (1):
> >   base: soc: Check for NULL SoC device attributes
> > 
> > Yangbo Lu (5):
> >   ARM64: dts: ls2080a: add device configuration node
> >   dt: bindings: move guts devicetree doc out of powerpc directory
> >   soc: fsl: add GUTS driver for QorIQ platforms
> >   MAINTAINERS: add entry for Freescale SoC drivers
> >   mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0
> > 
> >  .../bindings/{powerpc => soc}/fsl/guts.txt |   3 +
> >  MAINTAINERS|  11 +-
> >  arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi |   6 +
> >  drivers/base/Kconfig   |   1 +
> >  drivers/base/soc.c |  70 ++
> >  drivers/mmc/host/Kconfig   |   1 +
> >  drivers/mmc/host/sdhci-of-esdhc.c  |  20 ++
> >  drivers/soc/Kconfig|   3 +-
> >  drivers/soc/fsl/Kconfig|  18 ++
> >  drivers/soc/fsl/Makefile   |   1 +
> >  drivers/soc/fsl/guts.c | 236
> > +
> >  include/linux/fsl/guts.h   | 125 ++-
> >  include/linux/sys_soc.h|   3 +
> >  13 files changed, 447 insertions(+), 51 deletions(-)
> >  rename Documentation/devicetree/bindings/{powerpc => soc}/fsl/guts.txt
> > (91%)
> >  create mode 100644 drivers/soc/fsl/Kconfig
> >  create mode 100644 drivers/soc/fsl/guts.c
> > 
> > --
> > 2.1.0.27.g96db324
> > 
> Thanks, applied on my mmc tree for next!
> 
> I noticed that some DT compatibles weren't documented, according to
> checkpatch. Please fix that asap!

They are documented, in fsl/guts.txt (the file moved in patch 2/7):
>  - compatible : Should define the compatible device type for
>    global-utilities.
>    Possible compatibles:
> "fsl,qoriq-device-config-1.0"
> "fsl,qoriq-device-config-2.0"
> "fsl,<chip>-device-config"
> "fsl,<chip>-guts"

Checkpatch doesn't understand compatibles defined in such a way.

-Scott

Re: [PATCH net-next] tcp: remove unaligned accesses from tcp_get_info()

2016-11-09 Thread David Miller

From: Eric Dumazet 
Date: Wed, 09 Nov 2016 11:24:22 -0800

> From: Eric Dumazet 
> 
> After commit 6ed46d1247a5 ("sock_diag: align nlattr properly when
> needed"), tcp_get_info() gets 64bit aligned memory, so we can avoid
> the unaligned helpers.
> 
> Suggested-by: David Miller 
> Signed-off-by: Eric Dumazet 

Nice, applied.

Thanks!
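
For readers unfamiliar with the distinction, here is a minimal userspace sketch of the two kinds of store. The function names are hypothetical and this is not the kernel's put_unaligned implementation; it only shows why a guaranteed 64-bit alignment makes the byte-wise helper unnecessary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable store that is safe at any alignment. */
static void model_put_unaligned_u64(uint64_t val, void *p)
{
    assert(p);
    memcpy(p, &val, sizeof(val));
}

/* Direct store; only valid when p is known to be 8-byte aligned. */
static void model_put_aligned_u64(uint64_t val, uint64_t *p)
{
    assert(((uintptr_t)p % sizeof(uint64_t)) == 0);
    *p = val;
}
```

Once the destination buffer is known to be 64-bit aligned, the plain assignment in the second function is sufficient.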

[PATCH] r8152: Fix error path in open function

2016-11-09 Thread Guenter Roeck

If usb_submit_urb() called from the open function fails, the following
crash may be observed.

r8152 8-1:1.0 eth0: intr_urb submit failed: -19
...
r8152 8-1:1.0 eth0: v1.08.3
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b6b7b
pgd = ffc0e7305000
[6b6b6b6b6b6b6b7b] *pgd=, *pud=
Internal error: Oops: 9604 [#1] PREEMPT SMP
...
PC is at notifier_chain_register+0x2c/0x58
LR is at blocking_notifier_chain_register+0x54/0x70
...
Call trace:
[] notifier_chain_register+0x2c/0x58
[] blocking_notifier_chain_register+0x54/0x70
[] register_pm_notifier+0x24/0x2c
[] rtl8152_open+0x3dc/0x3f8 [r8152]
[] __dev_open+0xac/0x104
[] __dev_change_flags+0xb0/0x148
[] dev_change_flags+0x34/0x70
[] do_setlink+0x2c8/0x888
[] rtnl_newlink+0x328/0x644
[] rtnetlink_rcv_msg+0x1a8/0x1d4
[] netlink_rcv_skb+0x68/0xd0
[] rtnetlink_rcv+0x2c/0x3c
[] netlink_unicast+0x16c/0x234
[] netlink_sendmsg+0x340/0x364
[] sock_sendmsg+0x48/0x60
[] SyS_sendto+0xe0/0x120
[] SyS_send+0x40/0x4c
[] el0_svc_naked+0x24/0x28

Clean up error handling to avoid registering the notifier if the open
function is going to fail.

Signed-off-by: Guenter Roeck 
---
 drivers/net/usb/r8152.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 44d439f50961..677922039548 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -3266,10 +3266,8 @@ static int rtl8152_open(struct net_device *netdev)
goto out;
 
res = usb_autopm_get_interface(tp->intf);
-   if (res < 0) {
-   free_all_mem(tp);
-   goto out;
-   }
+   if (res < 0)
+   goto out_free;
 
mutex_lock(&tp->control);
 
@@ -3285,10 +3283,9 @@ static int rtl8152_open(struct net_device *netdev)
netif_device_detach(tp->netdev);
netif_warn(tp, ifup, netdev, "intr_urb submit failed: %d\n",
   res);
-   free_all_mem(tp);
-   } else {
-   napi_enable(&tp->napi);
+   goto out_unlock;
}
+   napi_enable(&tp->napi);
 
mutex_unlock(&tp->control);
 
@@ -3297,7 +3294,13 @@ static int rtl8152_open(struct net_device *netdev)
tp->pm_notifier.notifier_call = rtl_notifier;
register_pm_notifier(&tp->pm_notifier);
 #endif
+   return 0;
 
+out_unlock:
+   mutex_unlock(&tp->control);
+   usb_autopm_put_interface(tp->intf);
+out_free:
+   free_all_mem(tp);
 out:
return res;
 }
-- 
2.5.0
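
The cleanup ordering the patch introduces can be sketched as a userspace model of the goto-unwind pattern. Everything here is hypothetical: struct model_dev and its flags stand in for the driver's resources, fail_step simulates a failure, and dropping the autopm reference on the success path is assumed from context that the hunks do not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state flags standing in for the driver's resources. */
struct model_dev {
    bool mem_allocated;
    bool pm_ref_held;
    bool locked;
    bool notifier_registered;
};

/*
 * fail_step: 0 = no failure, 1 = autopm get fails, 2 = URB submit fails.
 * Returns 0 on success or a negative value on failure.
 */
static int model_open(struct model_dev *d, int fail_step)
{
    int res;

    assert(d);
    d->mem_allocated = true;

    res = (fail_step == 1) ? -1 : 0;
    if (res < 0)
        goto out_free;
    d->pm_ref_held = true;

    d->locked = true;
    res = (fail_step == 2) ? -19 : 0;
    if (res < 0)
        goto out_unlock;

    d->locked = false;
    d->pm_ref_held = false;          /* assumed: reference dropped on success */
    d->notifier_registered = true;   /* only reached when open succeeds */
    return 0;

out_unlock:
    d->locked = false;
    d->pm_ref_held = false;
out_free:
    d->mem_allocated = false;
    return res;
}
```

Each label releases what was acquired before the corresponding failure point, in reverse order, and the notifier registration sits after the last point that can fail.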

Re: [PATCH net-next v2 6/7] vxlan: simplify vxlan xmit

2016-11-09 Thread Pravin Shelar

On Wed, Nov 9, 2016 at 8:59 AM, Jiri Benc  wrote:
> On Sat,  5 Nov 2016 11:45:56 -0700, Pravin B Shelar wrote:
>> @@ -2006,11 +2004,34 @@ static void vxlan_xmit_one(struct sk_buff *skb, 
>> struct net_device *dev,
>>   info = skb_tunnel_info(skb);
>>
>>   if (rdst) {
>> + dst = &rdst->remote_ip;
>> + if (vxlan_addr_any(dst)) {
>> + if (did_rsc) {
>> + /* short-circuited back to local bridge */
>> + vxlan_encap_bypass(skb, vxlan, vxlan);
>> + return;
>> + }
>> + goto drop;
>> + }
>> +
>>   dst_port = rdst->remote_port ? rdst->remote_port : 
>> vxlan->cfg.dst_port;
>>   vni = rdst->remote_vni;
>> - dst = &rdst->remote_ip;
>>   src = &vxlan->cfg.saddr;
>>   dst_cache = &rdst->dst_cache;
>> + md->gbp = skb->mark;
>> + ttl = vxlan->cfg.ttl;
>> + if (!ttl && vxlan_addr_multicast(dst))
>> + ttl = 1;
>> +
>> + tos = vxlan->cfg.tos;
>> + if (tos == 1)
>> + tos = ip_tunnel_get_dsfield(old_iph, skb);
>
> Uninitialized old_iph.
>
It is initialized at the beginning of this function.

> Besides, you can't do this, having TOS, TTL, etc. specified is
> perfectly legal for lwtunnel interfaces, too.
>

TOS and TTL are initialized for LWT in the else block, so I do not see
any change compared to the current implementation.

Can you elaborate on your concerns?
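
For reference, the TTL and TOS defaults in the quoted hunk can be sketched in isolation. This is a userspace model with hypothetical names (struct model_cfg, model_resolve_ttl_tos); it covers only the non-LWT branch shown in the quote and does not settle the lwtunnel question raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the device configuration used by the quoted hunk. */
struct model_cfg {
    uint8_t ttl;
    uint8_t tos;
};

/*
 * Resolve the outer TTL and TOS the way the quoted hunk does:
 * an unset TTL becomes 1 for multicast destinations, and a TOS of 1
 * means "inherit the DS field of the inner header".
 */
static void model_resolve_ttl_tos(const struct model_cfg *cfg,
                                  bool dst_is_multicast, uint8_t inner_dsfield,
                                  uint8_t *ttl, uint8_t *tos)
{
    assert(cfg && ttl && tos);

    *ttl = cfg->ttl;
    if (!*ttl && dst_is_multicast)
        *ttl = 1;

    *tos = cfg->tos;
    if (*tos == 1)
        *tos = inner_dsfield;
}
```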

Re: [PATCH net-next v2 4/7] vxlan: improve vxlan route lookup checks.

2016-11-09 Thread Pravin Shelar

On Wed, Nov 9, 2016 at 8:41 AM, Jiri Benc  wrote:
> On Sat,  5 Nov 2016 11:45:54 -0700, Pravin B Shelar wrote:
>> Move route sanity check to respective vxlan[4/6]_get_route functions.
>> This allows us to perform all sanity checks before caching the dst so
>> that we can avoid these checks on subsequent packets.
>> This gives more accurate metadata information for packets from
>> fill_metadata_dst().
>
> The description is misleading, it applies only to one vxlan lwt use case
> (openvswitch). For other use cases, the patch has no effect.
>
Why would it not help in the non-OVS vxlan egress path? It avoids the
check (an if condition) for a device loop.

> I found the current handling of route lookup results irritating, too.
> The reason I did not change this while doing vxlan cleanup some time
> ago was that I assumed we should not increase dev stats from
> vxlan_fill_metadata_dst. Isn't that so?
>

That's right. I will fix it.

Re: [PATCH net-next v2 5/7] vxlan: simplify RTF_LOCAL handling.

2016-11-09 Thread Pravin Shelar

On Wed, Nov 9, 2016 at 8:53 AM, Jiri Benc  wrote:
> On Sat,  5 Nov 2016 11:45:55 -0700, Pravin B Shelar wrote:
>> +static int check_route_rtf_local(struct sk_buff *skb, struct net_device 
>> *dev,
>> +  struct vxlan_dev *vxlan, union vxlan_addr 
>> *daddr,
>> +  __be32 dst_port, __be32 vni, struct dst_entry 
>> *dst,
>> +  u32 rt_flags)
>
> It's not just checking, it's also bypassing encapsulation if the check
> is successful. Would be good to use a name that suggests this effect,
> e.g. encap_bypass_if_local (I know, not a nice name) or something.
>

I am fine with this name. I will change the patch.

Re: [PATCH net-next v2 3/7] vxlan: avoid checking socket multiple times.

2016-11-09 Thread Pravin Shelar

On Wed, Nov 9, 2016 at 8:34 AM, Jiri Benc  wrote:
> On Sat,  5 Nov 2016 11:45:53 -0700, Pravin B Shelar wrote:
>> @@ -2070,11 +2072,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, 
>> struct net_device *dev,
>>   struct dst_entry *ndst;
>>   u32 rt6i_flags;
>>
>> - if (!sock6)
>> - goto drop;
>>   sk = sock6->sock->sk;
>
> I take back that the rest of the patch looks good. This will panic if
> an IPv6 packet is routed (through encap route) to an IPv4-only
> interface.
>

Actually this is fixed in a later patch, but I will fix this patch too.

Thanks for the review.

Re: [PATCH net-next v2 2/7] vxlan: simplify exception handling

2016-11-09 Thread Pravin Shelar

On Wed, Nov 9, 2016 at 8:10 AM, Jiri Benc  wrote:
> On Sat,  5 Nov 2016 11:45:52 -0700, Pravin B Shelar wrote:
>> @@ -2058,7 +2059,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
>> net_device *dev,
>>   err = vxlan_build_skb(skb, &rt->dst, sizeof(struct iphdr),
>> vni, md, flags, udp_sum);
>>   if (err < 0)
>> - goto xmit_tx_error;
>> + goto tx_error;
>
> Seems you're leaking rt here?
>
I have moved the dst error handling to vxlan_build_skb(), which is
releasing the dst entry.

>> @@ -2117,11 +2118,9 @@ static void vxlan_xmit_one(struct sk_buff *skb, 
>> struct net_device *dev,
>>   skb_scrub_packet(skb, xnet);
>>   err = vxlan_build_skb(skb, ndst, sizeof(struct ipv6hdr),
>> vni, md, flags, udp_sum);
>> - if (err < 0) {
>> - dst_release(ndst);
>> - dev->stats.tx_errors++;
>> - return;
>> - }
>> + if (err < 0)
>> + goto tx_error;
>
> And ndst here?
>
same as above.

Re: [PATCH] net: tcp response should set oif only if it is L3 master

2016-11-09 Thread David Miller

From: David Ahern 
Date: Wed,  9 Nov 2016 09:07:26 -0800

> Lorenzo noted an Android unit test failed due to e0d56fdd7342:
> "The expectation in the test was that the RST replying to a SYN sent to a
> closed port should be generated with oif=0. In other words it should not
> prefer the interface where the SYN came in on, but instead should follow
> whatever the routing table says it should do."
> 
> Revert the change to ip_send_unicast_reply and tcp_v6_send_response such
> that the oif in the flow is set to the skb_iif only if skb_iif is an L3
> master.
> 
> Fixes: e0d56fdd7342 ("net: l3mdev: remove redundant calls")
> Reported-by: Lorenzo Colitti 
> Signed-off-by: David Ahern 

Applied, thanks David.
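
The rule described in the commit message can be sketched as a tiny userspace model. Both functions are hypothetical: the predicate simply treats ifindex 10 as the only L3 master device, and nothing here is the kernel's actual helper or flow structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: in this model, ifindex 10 is the only L3 master. */
static bool model_is_l3_master(int ifindex)
{
    return ifindex == 10;
}

/* oif for a reply: the ingress ifindex only for an L3 master, otherwise 0. */
static int model_reply_oif(int skb_iif)
{
    assert(skb_iif >= 0);
    return model_is_l3_master(skb_iif) ? skb_iif : 0;
}
```

An oif of 0 leaves the choice of output interface to the routing table, which is the behaviour the quoted test expectation describes for an RST sent in reply to a SYN on a closed port.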

Re: [PATCH 00/17] pull request for net-next: batman-adv 2016-11-08 v2

2016-11-09 Thread David Miller

From: Simon Wunderlich 
Date: Wed,  9 Nov 2016 23:25:49 +0100

> this is an updated version of yesterday's pull request. Sven made changes
> according to Eric Dumazet's comments on Patch 13; everything else stayed the
> same.
> 
> Please pull or let me know of any problem!

Pulled, thanks Simon.

Re: [PATCH] net: ipv4: ip_send_unicast_reply should set oif only if it is L3 master

2016-11-09 Thread David Ahern

On 11/9/16 7:48 PM, David Miller wrote:
> From: David Ahern 
> Date: Tue,  8 Nov 2016 14:50:31 -0800
> 
>> Lorenzo noted an Android unit test failed due to commit e0d56fdd7342:
>>   "The expectation in the test was that the RST replying to a SYN sent to a
>>   closed port should be generated with oif=0. In other words it should not
>>   prefer the interface where the SYN came in on, but instead should follow
>>   whatever the routing table says it should do."
>>
>> Since this is a change in behavior, revert the change to
>> ip_send_unicast_reply such that the oif in the flow is set to the skb_iif
>> only if skb_iif is an L3 master.
>>
>> Fixes: e0d56fdd7342 ("net: l3mdev: remove redundant calls")
>> Reported-by: Lorenzo Colitti 
>> Signed-off-by: David Ahern 
> 
> David, I'm assuming that a new spin of this patch is coming.
> 

yes. posted this morning; Lorenzo tested and ack'ed an hour or so ago. Since it 
expanded to include IPv6, the subject line changed so it won't be readily 
apparent.

Re: [PATCH] net: ipv4: ip_send_unicast_reply should set oif only if it is L3 master

2016-11-09 Thread David Miller

From: David Ahern 
Date: Tue,  8 Nov 2016 14:50:31 -0800

> Lorenzo noted an Android unit test failed due to commit e0d56fdd7342:
>   "The expectation in the test was that the RST replying to a SYN sent to a
>   closed port should be generated with oif=0. In other words it should not
>   prefer the interface where the SYN came in on, but instead should follow
>   whatever the routing table says it should do."
> 
> Since this is a change in behavior, revert the change to
> ip_send_unicast_reply such that the oif in the flow is set to the skb_iif
> only if skb_iif is an L3 master.
> 
> Fixes: e0d56fdd7342 ("net: l3mdev: remove redundant calls")
> Reported-by: Lorenzo Colitti 
> Signed-off-by: David Ahern 

David, I'm assuming that a new spin of this patch is coming.

Re: [PATCH v5] Net Driver: Add Cypress GX3 VID=04b4 PID=3610.

2016-11-09 Thread David Miller

From: 
Date: Tue, 8 Nov 2016 16:08:01 -0600

> From: Allan Chou 
> 
> Add support for Cypress GX3 SuperSpeed to Gigabit Ethernet
> Bridge Controller (Vendor=04b4 ProdID=3610).
> 
> Patch verified on x64 linux kernel 4.7.4, 4.8.6, 4.9-rc4 systems
> with the Kensington SD4600P USB-C Universal Dock with Power,
> which uses the Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge
> Controller.
> 
> A similar patch was signed-off and tested-by Allan Chou
>  on 2015-12-01.
> 
> Allan verified his similar patch on x86 Linux kernel 4.1.6 system
> with Cypress GX3 SuperSpeed to Gigabit Ethernet Bridge Controller.
> 
> Tested-by: Allan Chou 
> Tested-by: Chris Roth 
> Tested-by: Artjom Simon 
> 
> Signed-off-by: Allan Chou 
> Signed-off-by: Chris Roth 

Applied.

Re: [PATCH net-next 0/3] PHC frequency fine tuning

2016-11-09 Thread David Miller

From: Richard Cochran 
Date: Tue,  8 Nov 2016 22:49:15 +0100

> This series expands the PTP Hardware Clock subsystem by adding a
> method that passes the frequency tuning word to the drivers
> without dropping the low order bits.  Keeping those bits is useful for
> drivers whose frequency resolution is higher than 1 ppb.

Series applied, thanks Richard.
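
To see why the low-order bits matter, here is a small sketch of a truncating conversion from a fine-grained tuning word to whole ppb. It assumes the tuning word is in parts per million with a 16-bit binary fraction (so 65536 means 1 ppm); the function name is hypothetical and this is not the kernel's conversion routine.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: the tuning word is in parts per million with a 16-bit binary
 * fraction, so 65536 means 1 ppm. Truncating conversion to whole ppb.
 */
static int64_t model_scaled_ppm_to_ppb(int64_t scaled_ppm)
{
    /* Guard against overflow in the multiplication below. */
    assert(scaled_ppm > INT64_MIN / 1000 && scaled_ppm < INT64_MAX / 1000);
    return scaled_ppm * 1000 / 65536;
}
```

Under that assumption one ppb corresponds to about 65.5 units of the tuning word, so many distinct tuning values collapse to the same ppb figure; a driver whose hardware resolves finer than 1 ppb loses that precision if it only ever sees ppb.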

Re: [PATCH net-next 1/2] bnxt_en: do not call napi_hash_add()

2016-11-09 Thread David Miller

From: Eric Dumazet 
Date: Tue, 08 Nov 2016 11:06:53 -0800

> From: Eric Dumazet 
> 
> This is automatically done from netif_napi_add(), and we want to not
> export napi_hash_add() anymore in the following patch.
> 
> Signed-off-by: Eric Dumazet 

Applied.

Re: [PATCH net-next 2/2] net: napi_hash_add() is no longer exported

2016-11-09 Thread David Miller

From: Eric Dumazet 
Date: Tue, 08 Nov 2016 11:07:28 -0800

> From: Eric Dumazet 
> 
> There are no more users except from net/core/dev.c
> napi_hash_add() can now be static.
> 
> Signed-off-by: Eric Dumazet 

Applied.

Re: [PATCH] bpf: Remove unused but set variables

2016-11-09 Thread David Miller

From: Tobias Klauser 
Date: Tue,  8 Nov 2016 16:40:28 +0100

> Remove the unused but set variables min_set and max_set in
> adjust_reg_min_max_vals to fix the following warning when building with
> 'W=1':
> 
>   kernel/bpf/verifier.c:1483:7: warning: variable ‘min_set’ set but not used 
> [-Wunused-but-set-variable]
> 
> There is no warning about max_set being unused, but since it is only
> used in the assignment of min_set it can be removed as well.
> 
> They were introduced in commit 484611357c19 ("bpf: allow access into map
> value arrays") but seem to have never been used.
> 
> Cc: Josef Bacik 
> Signed-off-by: Tobias Klauser 

Applied to net-next, thanks.

Re: [PATCH net-next] tc_act: Remove tcf_act macro

2016-11-09 Thread David Miller

From: Yotam Gigi 
Date: Tue,  8 Nov 2016 17:24:03 +0200

> tc_act macro addressed a non existing field, and was not used in the
> kernel source.
> 
> Signed-off-by: Yotam Gigi 
> Reviewed-by: Jiri Pirko 

Applied.

Re: [PATCH net-next v5 0/9] net: add support for IPv6 Segment Routing

2016-11-09 Thread David Miller


Series applied, but I wonder if using a Kconfig knob for the INLINE thing
is overkill.

Re: [PATCH] net: tcp response should set oif only if it is L3 master

2016-11-09 Thread Lorenzo Colitti

On Thu, Nov 10, 2016 at 2:07 AM, David Ahern  wrote:
> Revert the change to ip_send_unicast_reply and tcp_v6_send_response such
> that the oif in the flow is set to the skb_iif only if skb_iif is an L3
> master.

This fixes the IPv4 and IPv6 tests, thanks!

Tested-by: Lorenzo Colitti 
Acked-by: Lorenzo Colitti

Re: [PATCH 00/14] Netfilter fixes for net

2016-11-09 Thread David Miller

From: Pablo Neira Ayuso 
Date: Thu, 10 Nov 2016 01:23:33 +0100

> The following patchset contains a larger than usual batch of Netfilter
> fixes for your net tree. This series contains a mixture of old bugs and
> recently introduced bugs, they are:
 ...
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

Pulled, thanks Pablo.

Re: [net-next PATCH] amd-xgbe: use __maybe_unused to hide pm functions

2016-11-09 Thread David Miller

From: Arnd Bergmann 
Date: Tue,  8 Nov 2016 14:37:32 +0100

> The amd-xgbe ethernet driver hides its suspend/resume functions
> in #ifdef CONFIG_PM, but uses SIMPLE_DEV_PM_OPS() to make the
> reference conditional on CONFIG_PM_SLEEP, which results in a
> warning when PM_SLEEP is not set but PM is:
> 
> drivers/net/ethernet/amd/xgbe/xgbe-platform.c:553:12: error: 
> 'xgbe_platform_resume' defined but not used [-Werror=unused-function]
> drivers/net/ethernet/amd/xgbe/xgbe-platform.c:533:12: error: 
> 'xgbe_platform_suspend' defined but not used [-Werror=unused-function]
> 
> This removes the incorrect #ifdef and instead uses a __maybe_unused
> annotation to let the compiler know it can silently drop
> the function definition.
> 
> Fixes: bd8255d8ba35 ("amd-xgbe: Prepare for supporting PCI devices")
> Signed-off-by: Arnd Bergmann 
> ---
> I originally submitted this in March 2016, but the patch has not
> yet made it upstream, and the file contents have moved around so
> the old patch no longer applied so I'm resending the rebased version
> now.

By and large, drivers handle this by using a CONFIG_PM_SLEEP ifdef.

Unless you can make an extremely convincing argument why not to do
so here, I'd like you to handle it that way instead.

Thanks.
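
Illustration only, not part of either mail: a minimal userspace sketch of
the two styles under discussion. DEMO_PM_OPS, demo_maybe_unused and the
demo_* callbacks are hypothetical stand-ins for SIMPLE_DEV_PM_OPS(),
__maybe_unused and the driver's suspend/resume functions.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's __maybe_unused (assumption). */
#define demo_maybe_unused __attribute__((unused))

/* Hypothetical miniature of struct dev_pm_ops. */
struct demo_pm_ops {
	int (*suspend)(void);
	int (*resume)(void);
};

/* Like SIMPLE_DEV_PM_OPS(): callbacks are referenced only with PM_SLEEP. */
#ifdef CONFIG_PM_SLEEP
#define DEMO_PM_OPS(name, s, r) const struct demo_pm_ops name = { s, r }
#else
#define DEMO_PM_OPS(name, s, r) const struct demo_pm_ops name = { NULL, NULL }
#endif

/* Style A: guard the callbacks with the same symbol the macro uses. */
#ifdef CONFIG_PM_SLEEP
static int demo_suspend_a(void) { return 0; }
static int demo_resume_a(void) { return 0; }
#endif
DEMO_PM_OPS(demo_ops_a, demo_suspend_a, demo_resume_a);

/* Style B: always define them and let the compiler drop unused ones. */
static int demo_maybe_unused demo_suspend_b(void) { return 0; }
static int demo_maybe_unused demo_resume_b(void) { return 0; }
DEMO_PM_OPS(demo_ops_b, demo_suspend_b, demo_resume_b);
```

Built without CONFIG_PM_SLEEP, both variants compile without an
unused-function warning and both ops tables hold NULL callbacks.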

Re: [Xen PATCH] xen-netback: fix error handling output

2016-11-09 Thread David Miller

From: Arnd Bergmann 
Date: Tue,  8 Nov 2016 14:34:34 +0100

> The connect function prints an uninitialized error code after an
> earlier initialization was removed:
> 
> drivers/net/xen-netback/xenbus.c: In function 'connect':
> drivers/net/xen-netback/xenbus.c:938:3: error: 'err' may be used 
> uninitialized in this function [-Werror=maybe-uninitialized]
> 
> This prints it as -EINVAL instead, which seems to be the most
> appropriate error code. Before the patch that caused the warning,
> this would print a positive number returned by vsscanf() instead,
> which is also wrong. We probably don't need a backport though,
> as fixing the warning here should be sufficient.
> 
> Fixes: f95842e7a9f2 ("xen: make use of xenbus_read_unsigned() in xen-netback")
> Fixes: 8d3d53b3e433 ("xen-netback: Add support for multiple queues")
> Signed-off-by: Arnd Bergmann 

That first Fixes: commit mentioned is in neither of my trees, so I
assume it is in the Xen tree and thus this fix should get applied
there.

Re: [PATCH] net: mii: report 0 for unknown lp_advertising

2016-11-09 Thread David Miller

From: Arnd Bergmann 
Date: Tue,  8 Nov 2016 14:31:38 +0100

> The newly introduced mii_ethtool_get_link_ksettings function sets
> lp_advertising to an uninitialized value when BMCR_ANENABLE is not
> set:
> 
> drivers/net/mii.c: In function 'mii_ethtool_get_link_ksettings':
> drivers/net/mii.c:224:2: error: 'lp_advertising' may be used uninitialized in 
> this function [-Werror=maybe-uninitialized]
> 
> As documented in include/uapi/linux/ethtool.h, the value is
> expected to be zero when we don't know it, so let's initialize
> it to that.
> 
> Fixes: bc8ee596afe8 ("net: mii: add generic function to support ksetting 
> support")
> Signed-off-by: Arnd Bergmann 

Applied.
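
Illustration only: a small sketch of the "report 0 when unknown" rule
described in the patch. demo_lp_advertising() is a hypothetical helper,
and passing the raw LPA value through is a simplification; only the
0x1000 bit value of BMCR_ANENABLE is taken from linux/mii.h.

```c
#include <stdint.h>

#define DEMO_BMCR_ANENABLE 0x1000u /* same bit value as BMCR_ANENABLE */

/* Returns the link partner advertisement, or 0 when it is unknown. */
uint32_t demo_lp_advertising(uint16_t bmcr, uint16_t lpa)
{
	uint32_t lp_advertising = 0; /* defined default instead of garbage */

	if (bmcr & DEMO_BMCR_ANENABLE)
		lp_advertising = lpa; /* simplification: no bit translation */

	return lp_advertising;
}
```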

Re: [PATCH v3] xen-netback: prefer xenbus_scanf() over xenbus_gather()

2016-11-09 Thread David Miller

From: "Jan Beulich" 
Date: Tue, 08 Nov 2016 00:45:53 -0700

> For single items being collected this should be preferred as being more
> typesafe (as the compiler can check format string and to-be-written-to
> variable match) and more efficient (requiring one less parameter to be
> passed).
> 
> Signed-off-by: Jan Beulich 
> ---
> v3: For consistency with other code don't consider zero an error
> (utilizing that xenbus_scanf() at present won't return zero).
> v2: Avoid commit message to continue from subject.

Applied to net-next, thanks.

Re: [PATCH net-next] igmp: Document sysctl force_igmp_version

2016-11-09 Thread David Miller

From: Hangbin Liu 
Date: Mon,  7 Nov 2016 14:51:23 +0800

> There are some differences between force_igmp_version and force_mld_version.
> Add documentation to make users aware of them.
> 
> Signed-off-by: Hangbin Liu 

Applied, thank you.

Re: [Regression w/ patch] Restore network resistance to weird ICMP messages

2016-11-09 Thread David Miller

From: Vicente Jiménez 
Date: Mon, 7 Nov 2016 12:11:59 +0100

> From bfc9a00e6b78d8eb60e46dacd7d761669d29a573 Mon Sep 17 00:00:00 2001
> From: Vicente Jimenez Aguilar 
> Date: Mon, 31 Oct 2016 13:10:29 +0100
> Subject: [PATCH] ipv4: icmp: Fix pMTU handling for rarest case
> 
> Restore network resistance to weird ICMP fragmentation needed messages
> with next hop MTU equal to (or exceeding) dropped packet size
> 
> Fixes: 46517008e116 ("ipv4: Kill ip_rt_frag_needed().")
> Signed-off-by: Vicente Jimenez Aguilar 
> ---
>  net/ipv4/icmp.c | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
> index 38abe70..c0af1d2 100644
> --- a/net/ipv4/icmp.c
> +++ b/net/ipv4/icmp.c
> @@ -776,6 +776,7 @@ static bool icmp_unreach(struct sk_buff *skb)
>   struct icmphdr *icmph;
>   struct net *net;
>   u32 info = 0;
> + unsigned short old_mtu;
>  
>   net = dev_net(skb_dst(skb)->dev);
>  

Order local variable declarations from longest to shortest line
please.

> + if ( info >= old_mtu )

There should be no space after the '(' and before the ')' in this
conditional.

Re: [PATCH net,v2] Fixes: 5943634fc559 ("ipv4: Maintain redirect and PMTU info in struct rtable again.")

2016-11-09 Thread David Miller

From: Stephen Suryaputra Lin 
Date: Mon,  7 Nov 2016 17:48:39 -0500

> ICMP redirects behavior is different after the commit above. An email
> requesting the explanation on why the behavior needs to be different
> was sent earlier to netdev (https://patchwork.ozlabs.org/patch/687728/).
> Since there isn't a reply yet, I decided to prepare this formal patch.
> 
> In v2.6 kernel, it used to be that ip_rt_redirect() calls
> arp_bind_neighbour() which returns 0 and then the state of the neigh for
> the new_gw is checked. If the state isn't valid then the redirected
> route is deleted. This behavior is maintained up to v3.5.7 by
> check_peer_redirect() because rt->rt_gateway is assigned to
> peer->redirect_learned.a4 before calling ipv4_neigh_lookup().
> 
> After the commit, ipv4_neigh_lookup() is performed without the
> rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
> isn't zero, the function uses it as the key. The neigh is most likely valid
> since the old_gw is the one that sends the ICMP redirect message. Then the
> new_gw is assigned to fib_nh_exception. The problem is: the new_gw ARP may
> never get resolved and the traffic is blackholed.
> 
> Changes from v1:
>  - use __ipv4_neigh_lookup instead (per Eric Dumazet).
> 
> Signed-off-by: Stephen Suryaputra Lin 

The Fixes tag belongs in the commit message body, right before the
signoff(s).  And you need to therefore write an appropriate subject
line of the form:

[PATCH net,v2] $SUBSYSTEM: $DESCRIPTION

Re: [PATCH] rtnl: reset calcit fptr in rtnl_unregister()

2016-11-09 Thread David Miller

From: Mathias Krause 
Date: Mon,  7 Nov 2016 23:22:19 +0100

> To avoid having dangling function pointers left behind, reset calcit in
> rtnl_unregister(), too.
> 
> This is no issue so far, as only the rtnl core registers a netlink
> handler with a calcit hook which won't be unregistered, but may become
> one if new code makes use of the calcit hook.
> 
> Fixes: c7ac8679bec9 ("rtnetlink: Compute and store minimum ifinfo...")
> Cc: Jeff Kirsher 
> Cc: Greg Rose 
> Signed-off-by: Mathias Krause 

Applied, thanks.
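
Illustration only: a userspace sketch of the idea, assuming a handler
slot with three hooks. struct demo_link and the demo_* functions are
hypothetical; the point is that unregistering clears every pointer that
registering may have set.

```c
#include <stddef.h>

typedef int (*demo_fn)(void);

/* Hypothetical miniature of an rtnl handler slot. */
struct demo_link {
	demo_fn doit;
	demo_fn dumpit;
	demo_fn calcit;
};

#define DEMO_NR_MSGTYPES 4
struct demo_link demo_handlers[DEMO_NR_MSGTYPES];

int demo_noop(void) { return 0; }

void demo_register(unsigned int msgtype, demo_fn doit, demo_fn dumpit,
		   demo_fn calcit)
{
	if (msgtype >= DEMO_NR_MSGTYPES)
		return;
	demo_handlers[msgtype].doit = doit;
	demo_handlers[msgtype].dumpit = dumpit;
	demo_handlers[msgtype].calcit = calcit;
}

void demo_unregister(unsigned int msgtype)
{
	if (msgtype >= DEMO_NR_MSGTYPES)
		return;
	demo_handlers[msgtype].doit = NULL;
	demo_handlers[msgtype].dumpit = NULL;
	demo_handlers[msgtype].calcit = NULL; /* the hook that was left behind */
}
```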

Re: linux-next: manual merge of the net-next tree with the netfilter tree

2016-11-09 Thread Pablo Neira Ayuso

Hi David,

On Thu, Nov 10, 2016 at 10:56:33AM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> Today's linux-next merge of the net-next tree got a conflict in:
> 
>   net/netfilter/ipvs/ip_vs_ctl.c
> 
> between commit:
> 
>   8fbfef7f505b ("ipvs: use IPVS_CMD_ATTR_MAX for family.maxattr")
> 
> from the netfilter tree and commit:
> 
>   489111e5c25b ("genetlink: statically initialize families")
> 
> from the net-next tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.

I don't think I can address this conflict myself.

8fbfef7f505b is in my nf tree, while 489111e5c25b is in net-next. So
you will hit this conflict when you pull net into net-next.

So please keep this patch from Stephen on your radar to resolve the
conflict.

Or let me know if you come up with any way I can handle this from here
to reduce your burden. Thanks.

> diff --cc net/netfilter/ipvs/ip_vs_ctl.c
> index a6e44ef2ec9a,6b85ded4f91d..
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@@ -3872,10 -3865,20 +3865,20 @@@ static const struct genl_ops ip_vs_genl
>   },
>   };
>   
> + static struct genl_family ip_vs_genl_family __ro_after_init = {
> + .hdrsize= 0,
> + .name   = IPVS_GENL_NAME,
> + .version= IPVS_GENL_VERSION,
>  -.maxattr= IPVS_CMD_MAX,
> ++.maxattr= IPVS_CMD_ATTR_MAX,
> + .netnsok= true, /* Make ipvsadm to work on netns */
> + .module = THIS_MODULE,
> + .ops= ip_vs_genl_ops,
> + .n_ops  = ARRAY_SIZE(ip_vs_genl_ops),
> + };
> + 
>   static int __init ip_vs_genl_register(void)
>   {
> - return genl_register_family_with_ops(&ip_vs_genl_family,
> -  ip_vs_genl_ops);
> + return genl_register_family(&ip_vs_genl_family);
>   }
>   
>   static void ip_vs_genl_unregister(void)

[PATCH 03/14] netfilter: nf_tables: fix race when create new element in dynset

2016-11-09 Thread Pablo Neira Ayuso

From: Liping Zhang 

Packets may race when creating the new element in nft_hash_update:
        CPU0                    CPU1
  lookup_fast - fail      lookup_fast - fail
          new - ok                new - ok
       insert - ok             insert - fail(EEXIST)

So when the race happens, we reuse the existing element. Otherwise,
these *racing* packets will not be handled properly.

Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set 
updates")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_set_hash.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index 88d9fc8343e7..a3dface3e6e6 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -98,7 +98,7 @@ static bool nft_hash_update(struct nft_set *set, const u32 
*key,
const struct nft_set_ext **ext)
 {
struct nft_hash *priv = nft_set_priv(set);
-   struct nft_hash_elem *he;
+   struct nft_hash_elem *he, *prev;
struct nft_hash_cmp_arg arg = {
.genmask = NFT_GENMASK_ANY,
.set = set,
@@ -112,9 +112,18 @@ static bool nft_hash_update(struct nft_set *set, const u32 
*key,
he = new(set, expr, regs);
if (he == NULL)
goto err1;
-   if (rhashtable_lookup_insert_key(&priv->ht, &arg, &he->node,
-nft_hash_params))
+
+   prev = rhashtable_lookup_get_insert_key(&priv->ht, &arg, &he->node,
+   nft_hash_params);
+   if (IS_ERR(prev))
goto err2;
+
+   /* Another cpu may race to insert the element with the same key */
+   if (prev) {
+   nft_set_elem_destroy(set, he, true);
+   he = prev;
+   }
+
 out:
*ext = &he->ext;
return true;
-- 
2.1.4
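
Illustration only: a single-threaded userspace sketch of the
"insert, or get what is already there" contract the patch relies on.
The one-bucket list and all demo_* names are hypothetical, and the
ERR_PTR error case of the real helper is left out.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_elem {
	unsigned int key;
	int hits;
	struct demo_elem *next;
};

static struct demo_elem *demo_head; /* one-bucket "table" for the demo */

/* Returns NULL if 'new' was inserted, else the element already stored
 * under the same key. */
struct demo_elem *demo_lookup_get_insert(struct demo_elem *new)
{
	struct demo_elem *e;

	for (e = demo_head; e; e = e->next)
		if (e->key == new->key)
			return e;
	new->next = demo_head;
	demo_head = new;
	return NULL;
}

/* Update path: always ends up using the element that is in the table. */
struct demo_elem *demo_update(unsigned int key)
{
	struct demo_elem *he, *prev;

	he = calloc(1, sizeof(*he));
	if (!he)
		return NULL;
	he->key = key;

	prev = demo_lookup_get_insert(he);
	if (prev) {		/* lost the race: drop ours, reuse the winner */
		free(he);
		he = prev;
	}
	he->hits++;
	return he;
}
```

Calling demo_update() twice with the same key stands in for the two
racing CPUs: the second call frees its own element and updates the first.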

[PATCH 02/14] netfilter: nf_tables: fix leak when expr clone fail

2016-11-09 Thread Pablo Neira Ayuso

From: Liping Zhang 

When nft_expr_clone fails, a series of problems follows:

1. The module refcnt leaks: we call __module_get at the beginning but
   forget to put it back if ops->clone returns an error.
2. Memory leaks: if the clone fails, we just return NULL and forget
   to free the allocated element.
3. set->nelems becomes incorrect when set->size is specified. If the
   clone fails, we should decrease set->nelems.

This patch fixes these problems. Fortunately, a clone failure can
only happen on the counter expression when memory is exhausted.

Fixes: 086f332167d6 ("netfilter: nf_tables: add clone interface to expression 
operations")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables.h |  6 --
 net/netfilter/nf_tables_api.c | 11 ++-
 net/netfilter/nft_dynset.c| 16 ++--
 net/netfilter/nft_set_hash.c  |  4 ++--
 net/netfilter/nft_set_rbtree.c|  2 +-
 5 files changed, 23 insertions(+), 16 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h 
b/include/net/netfilter/nf_tables.h
index 5031e072567b..741dcded5b4f 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -542,7 +542,8 @@ void *nft_set_elem_init(const struct nft_set *set,
const struct nft_set_ext_tmpl *tmpl,
const u32 *key, const u32 *data,
u64 timeout, gfp_t gfp);
-void nft_set_elem_destroy(const struct nft_set *set, void *elem);
+void nft_set_elem_destroy(const struct nft_set *set, void *elem,
+ bool destroy_expr);
 
 /**
  * struct nft_set_gc_batch_head - nf_tables set garbage collection batch
@@ -693,7 +694,6 @@ static inline int nft_expr_clone(struct nft_expr *dst, 
struct nft_expr *src)
 {
int err;
 
-   __module_get(src->ops->type->owner);
if (src->ops->clone) {
dst->ops = src->ops;
err = src->ops->clone(dst, src);
@@ -702,6 +702,8 @@ static inline int nft_expr_clone(struct nft_expr *dst, 
struct nft_expr *src)
} else {
memcpy(dst, src, src->ops->size);
}
+
+   __module_get(src->ops->type->owner);
return 0;
 }
 
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 24db22257586..86e48aeb20be 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3452,14 +3452,15 @@ void *nft_set_elem_init(const struct nft_set *set,
return elem;
 }
 
-void nft_set_elem_destroy(const struct nft_set *set, void *elem)
+void nft_set_elem_destroy(const struct nft_set *set, void *elem,
+ bool destroy_expr)
 {
struct nft_set_ext *ext = nft_set_elem_ext(set, elem);
 
nft_data_uninit(nft_set_ext_key(ext), NFT_DATA_VALUE);
if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA))
nft_data_uninit(nft_set_ext_data(ext), set->dtype);
-   if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPR))
+   if (destroy_expr && nft_set_ext_exists(ext, NFT_SET_EXT_EXPR))
nf_tables_expr_destroy(NULL, nft_set_ext_expr(ext));
 
kfree(elem);
@@ -3812,7 +3813,7 @@ void nft_set_gc_batch_release(struct rcu_head *rcu)
 
gcb = container_of(rcu, struct nft_set_gc_batch, head.rcu);
for (i = 0; i < gcb->head.cnt; i++)
-   nft_set_elem_destroy(gcb->head.set, gcb->elems[i]);
+   nft_set_elem_destroy(gcb->head.set, gcb->elems[i], true);
kfree(gcb);
 }
 EXPORT_SYMBOL_GPL(nft_set_gc_batch_release);
@@ -4030,7 +4031,7 @@ static void nf_tables_commit_release(struct nft_trans 
*trans)
break;
case NFT_MSG_DELSETELEM:
nft_set_elem_destroy(nft_trans_elem_set(trans),
-nft_trans_elem(trans).priv);
+nft_trans_elem(trans).priv, true);
break;
}
kfree(trans);
@@ -4171,7 +4172,7 @@ static void nf_tables_abort_release(struct nft_trans 
*trans)
break;
case NFT_MSG_NEWSETELEM:
nft_set_elem_destroy(nft_trans_elem_set(trans),
-nft_trans_elem(trans).priv);
+nft_trans_elem(trans).priv, true);
break;
}
kfree(trans);
diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
index bfdb689664b0..31ca94793aa9 100644
--- a/net/netfilter/nft_dynset.c
+++ b/net/netfilter/nft_dynset.c
@@ -44,18 +44,22 @@ static void *nft_dynset_new(struct nft_set *set, const 
struct nft_expr *expr,
 &regs->data[priv->sreg_key],
 &regs->data[priv->sreg_data],
 timeout, GFP_ATOMIC);
-   if (elem == NULL) {
-   if (set->size)
-   atomic_dec(&set->nelems);
-
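
Illustration only: a userspace sketch of the reordering done in
nft_expr_clone(), where the reference is taken only after the clone has
succeeded. The refcnt field stands in for the module refcount, and all
demo_* names are hypothetical.

```c
#include <errno.h>
#include <string.h>

struct demo_expr;

struct demo_ops {
	int refcnt;	/* stands in for the module refcount */
	size_t size;
	int (*clone)(struct demo_expr *dst, const struct demo_expr *src);
};

struct demo_expr {
	struct demo_ops *ops;
	int data;
};

/* A clone callback that always fails, e.g. on memory exhaustion. */
int demo_clone_fail(struct demo_expr *dst, const struct demo_expr *src)
{
	(void)dst;
	(void)src;
	return -ENOMEM;
}

int demo_expr_clone(struct demo_expr *dst, const struct demo_expr *src)
{
	int err;

	if (src->ops->clone) {
		dst->ops = src->ops;
		err = src->ops->clone(dst, src);
		if (err < 0)
			return err;	/* nothing taken yet, nothing to undo */
	} else {
		memcpy(dst, src, src->ops->size);
	}

	src->ops->refcnt++;		/* reference only after success */
	return 0;
}
```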

[PATCH 01/14] netfilter: nft_dynset: fix panic if NFT_SET_HASH is not enabled

2016-11-09 Thread Pablo Neira Ayuso

From: Liping Zhang 

When CONFIG_NFT_SET_HASH is not enabled and I input the following rule:
"nft add rule filter output flow table test {ip daddr counter }", kernel
panic happened on my system:
 BUG: unable to handle kernel NULL pointer dereference at (null)
 IP: [<  (null)>]   (null)
 [...]
 Call Trace:
 [] ? nft_dynset_eval+0x56/0x100 [nf_tables]
 [] nft_do_chain+0xfb/0x4e0 [nf_tables]
 [] ? nf_conntrack_tuple_taken+0x61/0x210 [nf_conntrack]
 [] ? get_unique_tuple+0x136/0x560 [nf_nat]
 [] ? __nf_ct_ext_add_length+0x111/0x130 [nf_conntrack]
 [] ? nf_nat_setup_info+0x87/0x3b0 [nf_nat]
 [] ? ipt_do_table+0x327/0x610
 [] ? __nf_nat_alloc_null_binding+0x57/0x80 [nf_nat]
 [] nft_ipv4_output+0xaf/0xd0 [nf_tables_ipv4]
 [] nf_iterate+0x55/0x60
 [] nf_hook_slow+0x73/0xd0

This happens because ops->update is not implemented for the rbtree set
type. Keep it simple: in that case, report -EOPNOTSUPP to user space.

Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set 
updates")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_dynset.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
index 517f08767a3c..bfdb689664b0 100644
--- a/net/netfilter/nft_dynset.c
+++ b/net/netfilter/nft_dynset.c
@@ -139,6 +139,9 @@ static int nft_dynset_init(const struct nft_ctx *ctx,
return PTR_ERR(set);
}
 
+   if (set->ops->update == NULL)
+   return -EOPNOTSUPP;
+
if (set->flags & NFT_SET_CONSTANT)
return -EBUSY;
 
-- 
2.1.4
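
Illustration only: a userspace sketch of rejecting a backend that lacks
an optional operation at init time, rather than calling a NULL pointer
later in the packet path. The structs and demo_* names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

struct demo_set_ops {
	int (*update)(void);	/* optional: not every backend provides it */
};

struct demo_set {
	const struct demo_set_ops *ops;
};

int demo_hash_update(void) { return 0; }

/* Refuse the binding up front if the backend cannot do updates. */
int demo_dynset_init(const struct demo_set *set)
{
	if (set->ops->update == NULL)
		return -EOPNOTSUPP;
	return 0;
}
```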

[PATCH 04/14] netfilter: nf_conntrack_sip: extend request line validation

2016-11-09 Thread Pablo Neira Ayuso

From: Ulrich Weber 

Extend request line validation on SIP requests, so a fragmented TCP SIP
packet that begins with the contents of an Allow header, such as
 INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE
 Content-Length: 0

will not be interpreted as an INVITE request. Also, the Request-URI must
start with an alphabetic character.

Conforms to RFC 3261:
 Request-Line   =  Method SP Request-URI SP SIP-Version CRLF

Fixes: 30f33e6dee80 ("[NETFILTER]: nf_conntrack_sip: support method specific 
request/response handling")
Signed-off-by: Ulrich Weber 
Acked-by: Marco Angaroni 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_conntrack_sip.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index 621b81c7bddc..c3fc14e021ec 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1436,9 +1436,12 @@ static int process_sip_request(struct sk_buff *skb, 
unsigned int protoff,
handler = &sip_handlers[i];
if (handler->request == NULL)
continue;
-   if (*datalen < handler->len ||
+   if (*datalen < handler->len + 2 ||
strncasecmp(*dptr, handler->method, handler->len))
continue;
+   if ((*dptr)[handler->len] != ' ' ||
+   !isalpha((*dptr)[handler->len+1]))
+   continue;
 
if (ct_sip_get_header(ct, *dptr, 0, *datalen, SIP_HDR_CSEQ,
  &matchoff, &matchlen) <= 0) {
-- 
2.1.4
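
Illustration only: a standalone sketch of the stricter check, following
the Request-Line grammar quoted in the commit message. demo_is_request()
is a hypothetical helper, not the kernel function.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* True if 'data' starts with "<method> SP <alphabetic character>". */
bool demo_is_request(const char *data, size_t datalen, const char *method)
{
	size_t len = strlen(method);

	/* Need the method, one SP and at least one Request-URI character. */
	if (datalen < len + 2 || strncasecmp(data, method, len))
		return false;
	if (data[len] != ' ' || !isalpha((unsigned char)data[len + 1]))
		return false;
	return true;
}
```

A buffer starting with "INVITE sip:..." passes, while one starting with
"INVITE,NOTIFY,..." (the tail of an Allow header) does not.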

[PATCH 11/14] netfilter: connmark: ignore skbs with magic untracked conntrack objects

2016-11-09 Thread Pablo Neira Ayuso

From: Florian Westphal 

The (percpu) untracked conntrack entries can end up with nonzero connmarks.

The 'untracked' conntrack objects are merely a way to distinguish INVALID
(i.e. protocol connection tracker says payload doesn't meet some
requirements or packet was never seen by the connection tracking code)
from packets that are intentionally not tracked (some icmpv6 types such as
neigh solicitation, or by using 'iptables -j CT --notrack' option).

Untracked conntrack objects are an implementation detail; we might as well
use an invalid magic address instead to tell INVALID and UNTRACKED apart.

Check skb->nfct for untracked dummy and behave as if skb->nfct is NULL.

Reported-by: XU Tianwen 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/xt_connmark.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/xt_connmark.c b/net/netfilter/xt_connmark.c
index 69f78e96fdb4..b83e158e116a 100644
--- a/net/netfilter/xt_connmark.c
+++ b/net/netfilter/xt_connmark.c
@@ -44,7 +44,7 @@ connmark_tg(struct sk_buff *skb, const struct xt_action_param 
*par)
u_int32_t newmark;
 
ct = nf_ct_get(skb, &ctinfo);
-   if (ct == NULL)
+   if (ct == NULL || nf_ct_is_untracked(ct))
return XT_CONTINUE;
 
switch (info->mode) {
@@ -97,7 +97,7 @@ connmark_mt(const struct sk_buff *skb, struct xt_action_param 
*par)
const struct nf_conn *ct;
 
ct = nf_ct_get(skb, &ctinfo);
-   if (ct == NULL)
+   if (ct == NULL || nf_ct_is_untracked(ct))
return false;
 
return ((ct->mark & info->mask) == info->mark) ^ info->invert;
-- 
2.1.4

[PATCH 00/14] Netfilter fixes for net

2016-11-09 Thread Pablo Neira Ayuso

Hi David,

The following patchset contains a larger than usual batch of Netfilter
fixes for your net tree. This series contains a mixture of old bugs and
recently introduced bugs, they are:

1) Fix a crash when using nft_dynset with nft_set_rbtree, which doesn't
   support the set element updates from the packet path. From Liping
   Zhang.

2) Fix leak when nft_expr_clone() fails, from Liping Zhang.

3) Fix a race when inserting new elements to the set hash from the
   packet path, also from Liping.

4) Handle segmented TCP SIP packets properly, basically prevent the
   INVITE in the Allow header from creating bogus expectations by
   performing stricter SIP message parsing, from Ulrich Weber.

5) nft_parse_u32_check() should return signed integer for errors, from
   John Linville.

6) Fix excess allocation for connlabels, allocate 16 instead of
   32 bytes, from Florian Westphal.

7) Fix compilation breakage when building the ip_vs_sync code with
   CONFIG_OPTIMIZE_INLINING on x86, from Arnd Bergmann.

8) Destroy the new set if the transaction object cannot be allocated,
   also from Liping Zhang.

9) Use device to route duplicated packets via nft_dup only when set by
   the user, otherwise packets may not follow the right route, again
   from Liping.

10) Fix wrong maximum genetlink attribute definition in IPVS, from
WANG Cong.

11) Ignore untracked conntrack objects from xt_connmark, from Florian
Westphal.

12) Allow the use of conntrack helpers that are registered NFPROTO_UNSPEC
via CT target, otherwise we cannot use the h.245 helper, from
Florian.

13) Revisit garbage collection heuristic in the new workqueue-based
timer approach for conntrack to evict objects earlier, again from
Florian.

14) Fix crash in nf_tables when inserting an element into a verdict map,
from Liping Zhang.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!



The following changes since commit 67f0160fe34ec5391a428603b9832c9f99d8f3a1:

  MAINTAINERS: Update qlogic networking drivers (2016-10-26 23:29:12 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 58c78e104d937c1f560fb10ed9bb2dcde0db4fcf:

  netfilter: nf_tables: fix oops when inserting an element into a verdict map 
(2016-11-08 23:53:39 +0100)


Arnd Bergmann (1):
  netfilter: ip_vs_sync: fix bogus maybe-uninitialized warning

Florian Westphal (4):
  netfilter: conntrack: avoid excess memory allocation
  netfilter: connmark: ignore skbs with magic untracked conntrack objects
  netfilter: conntrack: fix CT target for UNSPEC helpers
  netfilter: conntrack: refine gc worker heuristics

John W. Linville (1):
  netfilter: nf_tables: fix type mismatch with error return from 
nft_parse_u32_check

Liping Zhang (6):
  netfilter: nft_dynset: fix panic if NFT_SET_HASH is not enabled
  netfilter: nf_tables: fix *leak* when expr clone fail
  netfilter: nf_tables: fix race when create new element in dynset
  netfilter: nf_tables: destroy the set if fail to add transaction
  netfilter: nft_dup: do not use sreg_dev if the user doesn't specify it
  netfilter: nf_tables: fix oops when inserting an element into a verdict 
map

Ulrich Weber (1):
  netfilter: nf_conntrack_sip: extend request line validation

WANG Cong (1):
  ipvs: use IPVS_CMD_ATTR_MAX for family.maxattr

 include/net/netfilter/nf_conntrack_labels.h |  3 +-
 include/net/netfilter/nf_tables.h   |  8 +++--
 net/ipv4/netfilter/nft_dup_ipv4.c   |  6 ++--
 net/ipv6/netfilter/nft_dup_ipv6.c   |  6 ++--
 net/netfilter/ipvs/ip_vs_ctl.c  |  2 +-
 net/netfilter/ipvs/ip_vs_sync.c |  7 +++--
 net/netfilter/nf_conntrack_core.c   | 49 -
 net/netfilter/nf_conntrack_helper.c | 11 +--
 net/netfilter/nf_conntrack_sip.c|  5 ++-
 net/netfilter/nf_tables_api.c   | 18 ++-
 net/netfilter/nft_dynset.c  | 19 +++
 net/netfilter/nft_set_hash.c| 19 ---
 net/netfilter/nft_set_rbtree.c  |  2 +-
 net/netfilter/xt_connmark.c |  4 +--
 14 files changed, 114 insertions(+), 45 deletions(-)

[PATCH 06/14] netfilter: conntrack: avoid excess memory allocation

2016-11-09 Thread Pablo Neira Ayuso

From: Florian Westphal 

This is now a fixed-size extension, so we don't need to pass a variable
alloc size.  This (harmless) error results in allocating 32 instead of
the needed 16 bytes for this extension as the size gets passed twice.

Fixes: 23014011ba420 ("netfilter: conntrack: support a fixed size of 128 
distinct labels")
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_conntrack_labels.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_labels.h 
b/include/net/netfilter/nf_conntrack_labels.h
index 498814626e28..1723a67c0b0a 100644
--- a/include/net/netfilter/nf_conntrack_labels.h
+++ b/include/net/netfilter/nf_conntrack_labels.h
@@ -30,8 +30,7 @@ static inline struct nf_conn_labels 
*nf_ct_labels_ext_add(struct nf_conn *ct)
if (net->ct.labels_used == 0)
return NULL;
 
-   return nf_ct_ext_add_length(ct, NF_CT_EXT_LABELS,
-   sizeof(struct nf_conn_labels), GFP_ATOMIC);
+   return nf_ct_ext_add(ct, NF_CT_EXT_LABELS, GFP_ATOMIC);
 #else
return NULL;
 #endif
-- 
2.1.4
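
Illustration only: the arithmetic behind "32 instead of 16 bytes",
assuming the allocation helper adds the caller-supplied length on top of
the extension type's own fixed length. That model and the demo_* names
are assumptions; 128 is the label count from the Fixes: subject.

```c
#include <stddef.h>

#define DEMO_LABEL_BITS 128u	/* fixed label count from the Fixes: subject */

/* Assumed model: total = fixed per-type length + caller's extra length. */
size_t demo_ext_alloc_len(size_t type_len, size_t var_len)
{
	return type_len + var_len;
}
```

With a 16 byte fixed length, passing another 16 as the variable part
yields 32, and passing nothing extra yields the needed 16.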

[PATCH 10/14] ipvs: use IPVS_CMD_ATTR_MAX for family.maxattr

2016-11-09 Thread Pablo Neira Ayuso

From: WANG Cong 

family.maxattr is the max index for policy[], the size of
ops[] is determined with ARRAY_SIZE().

Reported-by: Andrey Konovalov 
Tested-by: Andrey Konovalov 
Cc: Pablo Neira Ayuso 
Signed-off-by: Cong Wang 
Signed-off-by: Simon Horman 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/ipvs/ip_vs_ctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index c3c809b2e712..a6e44ef2ec9a 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -2845,7 +2845,7 @@ static struct genl_family ip_vs_genl_family = {
.hdrsize= 0,
.name   = IPVS_GENL_NAME,
.version= IPVS_GENL_VERSION,
-   .maxattr= IPVS_CMD_MAX,
+   .maxattr= IPVS_CMD_ATTR_MAX,
.netnsok= true, /* Make ipvsadm to work on netns */
 };
 
-- 
2.1.4
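
Illustration only: a sketch of why maxattr has to come from the attribute
enum. The command and attribute numbering below is invented for the demo
and does not reflect the IPVS values.

```c
#include <errno.h>

/* Hypothetical numbering; the two enums are unrelated to each other. */
enum { DEMO_CMD_UNSPEC, DEMO_CMD_NEW, DEMO_CMD_SET, DEMO_CMD_DEL,
       DEMO_CMD_GET, DEMO_CMD_LAST_ };
#define DEMO_CMD_MAX (DEMO_CMD_LAST_ - 1)

enum { DEMO_ATTR_UNSPEC, DEMO_ATTR_ADDR, DEMO_ATTR_PORT, DEMO_ATTR_LAST_ };
#define DEMO_ATTR_MAX (DEMO_ATTR_LAST_ - 1)

/* policy[] is indexed by attribute type: DEMO_ATTR_MAX + 1 slots. */
static const int demo_policy[DEMO_ATTR_MAX + 1] = { 0, 4, 2 };

/* Returns the expected payload length, or -EINVAL for an unknown type.
 * Passing DEMO_CMD_MAX as maxattr would allow reads past demo_policy[]. */
int demo_attr_len(int type, int maxattr)
{
	if (type < 0 || type > maxattr)
		return -EINVAL;
	return demo_policy[type];
}
```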

[PATCH 08/14] netfilter: nf_tables: destroy the set if fail to add transaction

2016-11-09 Thread Pablo Neira Ayuso

From: Liping Zhang 

When memory is exhausted, we fail to add the NFT_MSG_NEWSET
transaction. In that case, we should destroy the set before we free it.

Fixes: 958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure 
to handle sets")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_tables_api.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 365d31b86816..7d6a626b08f1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2956,12 +2956,14 @@ static int nf_tables_newset(struct net *net, struct 
sock *nlsk,
 
err = nft_trans_set_add(&ctx, NFT_MSG_NEWSET, set);
if (err < 0)
-   goto err2;
+   goto err3;
 
list_add_tail_rcu(&set->list, &table->sets);
table->use++;
return 0;
 
+err3:
+   ops->destroy(set);
 err2:
kfree(set);
 err1:
-- 
2.1.4
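
Illustration only: a userspace sketch of the goto unwinding pattern with
the extra label. All demo_* names are hypothetical, and trans_ok merely
simulates whether the transaction object could be allocated.

```c
#include <errno.h>
#include <stdlib.h>

int demo_destroy_calls;	/* counts backend teardowns, for the demo only */

struct demo_set {
	void *backend;
};

int demo_backend_init(struct demo_set *set)
{
	set->backend = malloc(16);
	return set->backend ? 0 : -ENOMEM;
}

void demo_backend_destroy(struct demo_set *set)
{
	free(set->backend);
	demo_destroy_calls++;
}

int demo_newset(struct demo_set **out, int trans_ok)
{
	struct demo_set *set;
	int err;

	set = calloc(1, sizeof(*set));
	if (!set)
		return -ENOMEM;

	err = demo_backend_init(set);
	if (err < 0)
		goto err2;	/* backend never came up: just free */

	if (!trans_ok) {
		err = -ENOMEM;
		goto err3;	/* backend is live: destroy it first */
	}

	*out = set;
	return 0;

err3:
	demo_backend_destroy(set);
err2:
	free(set);
	return err;
}
```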

[PATCH 05/14] netfilter: nf_tables: fix type mismatch with error return from nft_parse_u32_check

2016-11-09 Thread Pablo Neira Ayuso

From: "John W. Linville" 

Commit 36b701fae12ac ("netfilter: nf_tables: validate maximum value of
u32 netlink attributes") introduced nft_parse_u32_check with a return
value of "unsigned int", yet on error it returns "-ERANGE".

This patch corrects the mismatch by changing the return value to "int",
which happens to match the actual users of nft_parse_u32_check already.

Found by Coverity, CID 1373930.

Note that commit 21a9e0f1568ea ("netfilter: nft_exthdr: fix error
handling in nft_exthdr_init()") attempted to address the issue, but
did not address the return type of nft_parse_u32_check.

Signed-off-by: John W. Linville 
Cc: Laura Garcia Liebana 
Cc: Pablo Neira Ayuso 
Cc: Dan Carpenter 
Fixes: 36b701fae12ac ("netfilter: nf_tables: validate maximum value...")
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables.h | 2 +-
 net/netfilter/nf_tables_api.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h 
b/include/net/netfilter/nf_tables.h
index 741dcded5b4f..d79d1e9b9546 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -145,7 +145,7 @@ static inline enum nft_registers nft_type_to_reg(enum 
nft_data_types type)
return type == NFT_DATA_VERDICT ? NFT_REG_VERDICT : NFT_REG_1 * 
NFT_REG_SIZE / NFT_REG32_SIZE;
 }
 
-unsigned int nft_parse_u32_check(const struct nlattr *attr, int max, u32 
*dest);
+int nft_parse_u32_check(const struct nlattr *attr, int max, u32 *dest);
 unsigned int nft_parse_register(const struct nlattr *attr);
 int nft_dump_register(struct sk_buff *skb, unsigned int attr, unsigned int 
reg);
 
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 86e48aeb20be..365d31b86816 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4422,7 +4422,7 @@ static int nf_tables_check_loops(const struct nft_ctx 
*ctx,
  * Otherwise a 0 is returned and the attribute value is stored in the
  * destination variable.
  */
-unsigned int nft_parse_u32_check(const struct nlattr *attr, int max, u32 *dest)
+int nft_parse_u32_check(const struct nlattr *attr, int max, u32 *dest)
 {
u32 val;
 
-- 
2.1.4
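
Illustration only: a standalone sketch of the type mismatch. The demo_*
functions are hypothetical and only mirror the shape of the range check.

```c
#include <errno.h>
#include <stdint.h>

/* Before: the negative errno is converted to a large unsigned value. */
unsigned int demo_check_unsigned(uint32_t val, uint32_t max)
{
	if (val > max)
		return -ERANGE;
	return 0;
}

/* After: the declared type matches what is actually returned. */
int demo_check_signed(uint32_t val, uint32_t max)
{
	if (val > max)
		return -ERANGE;
	return 0;
}
```

The unsigned variant only behaves as intended when the caller stores the
result in an int, which is what the commit message says the users do.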

[PATCH 14/14] netfilter: nf_tables: fix oops when inserting an element into a verdict map

2016-11-09 Thread Pablo Neira Ayuso

From: Liping Zhang 

Dalegaard says:
 The following ruleset, when loaded with 'nft -f bad.txt'
 snip
 flush ruleset
 table ip inlinenat {
   map sourcemap {
 type ipv4_addr : verdict;
   }

   chain postrouting {
 ip saddr vmap @sourcemap accept
   }
 }
 add chain inlinenat test
 add element inlinenat sourcemap { 100.123.10.2 : jump test }
 snip

 results in a kernel oops:
 BUG: unable to handle kernel paging request at 1344
 IP: [] nf_tables_check_loops+0x114/0x1f0 [nf_tables]
 [...]
 Call Trace:
  [] ? nft_data_init+0x13e/0x1a0 [nf_tables]
  [] nft_validate_register_store+0x60/0xb0 [nf_tables]
  [] nft_add_set_elem+0x545/0x5e0 [nf_tables]
  [] ? nft_table_lookup+0x30/0x60 [nf_tables]
  [] ? nla_strcmp+0x40/0x50
  [] nf_tables_newsetelem+0x11e/0x210 [nf_tables]
  [] ? nla_validate+0x60/0x80
  [] nfnetlink_rcv+0x354/0x5a7 [nfnetlink]

This happens because we forget to fill the net pointer in bind_ctx, so
dereferencing it may crash the kernel.
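
The effect of the missing initializer can be sketched in plain C. In a designated initializer, any member that is not named is zero-initialized, so a pointer member silently becomes NULL. The struct and function names below are simplified, hypothetical stand-ins and not the kernel definitions.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types (hypothetical). */
struct net { int id; };

struct ctx {
	struct net *net;
	int afi;
};

/* Before the fix: .net is not named, so it is implicitly NULL. */
static struct ctx make_bind_ctx_buggy(const struct ctx *ctx)
{
	struct ctx bind_ctx = {
		.afi = ctx->afi,
	};
	return bind_ctx;
}

/* After the fix: the net pointer is copied from the caller's context. */
static struct ctx make_bind_ctx_fixed(const struct ctx *ctx)
{
	struct ctx bind_ctx = {
		.net = ctx->net,
		.afi = ctx->afi,
	};
	return bind_ctx;
}
```

Any later code that dereferences the net member of the first variant would fault, which is consistent with the oops reported above.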

Reported-by: Dalegaard 
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_tables_api.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 7d6a626b08f1..026581b04ea8 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3568,6 +3568,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct 
nft_set *set,
dreg = nft_type_to_reg(set->dtype);
list_for_each_entry(binding, &set->bindings, list) {
struct nft_ctx bind_ctx = {
+   .net= ctx->net,
.afi= ctx->afi,
.table  = ctx->table,
.chain  = (struct nft_chain *)binding->chain,
-- 
2.1.4

[PATCH 13/14] netfilter: conntrack: refine gc worker heuristics

2016-11-09 Thread Pablo Neira Ayuso

From: Florian Westphal 

Nicolas Dichtel says:
  After commit b87a2f9199ea ("netfilter: conntrack: add gc worker to
  remove timed-out entries"), netlink conntrack deletion events may be
  sent with a huge delay.

Nicolas further points at this line:

  goal = min(nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV, GC_MAX_BUCKETS);

and indeed, this isn't optimal at all.  Rationale here was to ensure that
we don't block other work items for too long, even if
nf_conntrack_htable_size is huge.  But in order to have some guarantee
about maximum time period where a scan of the full conntrack table
completes we should always use a fixed slice size, so that once every
N scans the full table has been examined at least once.

We also need to balance this vs. the case where the system is either idle
(i.e., conntrack table (almost) empty) or very busy (i.e. eviction happens
from packet path).

So, after some discussion with Nicolas:

1. want hard guarantee that we scan entire table at least once every X s
-> need to scan fraction of table (get rid of upper bound)

2. don't want to eat cycles on idle or very busy system
-> increase interval if we did not evict any entries

3. don't want to block other worker items for too long
-> make fraction really small, and prefer small scan interval instead

4. Want reasonable short time where we detect timed-out entry when
system went idle after a burst of traffic, while not doing scans
all the time.
-> Store next gc scan in worker, increasing delays when no eviction
happened and shrinking delay when we see timed out entries.

The old gc interval is turned into a max number, scans can now happen
every jiffy if stale entries are present.

Longest possible time period until an entry is evicted is now 2 minutes
in worst case (entry expires right after it was deemed 'not expired').
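
The heuristic described above can be modelled in user space. This is only a sketch: the corresponding hunk is cut off in this archive, so the branch structure, the halving step and the one-jiffy increments below are assumptions, as is HZ = 1000. The helper gc_full_scan_seconds() shows where the "2 minutes" figure plausibly comes from: 64 slices, each delayed by at most 2 seconds.

```c
#define HZ                 1000u	/* assumption: 1000 jiffies per second */
#define GC_MAX_BUCKETS_DIV 64u
#define GC_INTERVAL_MAX    (2 * HZ)
#define GC_MAX_EVICTS      256u

struct gc_model {
	unsigned long next_gc_run;	/* current idle back-off, in jiffies */
};

/* Returns the delay (in jiffies) until the next scan slice. */
static unsigned long gc_next_delay(struct gc_model *gc,
				   unsigned int scanned,
				   unsigned int expired_count)
{
	unsigned int ratio = scanned ? expired_count * 100 / scanned : 0;

	/* Lots of stale entries: rescan immediately and reset the back-off. */
	if (ratio >= 90 || expired_count == GC_MAX_EVICTS) {
		gc->next_gc_run = 0;
		return 0;
	}

	/* Some stale entries: shrink the back-off and rescan soon. */
	if (expired_count) {
		gc->next_gc_run /= 2;
		return 1;
	}

	/* Nothing evicted: back off gradually, up to the maximum interval. */
	if (gc->next_gc_run < GC_INTERVAL_MAX)
		gc->next_gc_run += 1;
	return gc->next_gc_run;
}

/* Worst case for one full table pass: every slice waits the maximum. */
static unsigned int gc_full_scan_seconds(void)
{
	return GC_MAX_BUCKETS_DIV * (GC_INTERVAL_MAX / HZ);
}
```

With these constants gc_full_scan_seconds() evaluates to 128, i.e. roughly two minutes.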

Reported-by: Nicolas Dichtel 
Signed-off-by: Florian Westphal 
Acked-by: Nicolas Dichtel 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_conntrack_core.c | 49 ---
 1 file changed, 41 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c 
b/net/netfilter/nf_conntrack_core.c
index df2f5a3901df..0f87e5d21be7 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -76,6 +76,7 @@ struct conntrack_gc_work {
struct delayed_work dwork;
u32 last_bucket;
bool			exiting;
+   long			next_gc_run;
 };
 
 static __read_mostly struct kmem_cache *nf_conntrack_cachep;
@@ -83,9 +84,11 @@ static __read_mostly spinlock_t nf_conntrack_locks_all_lock;
 static __read_mostly DEFINE_SPINLOCK(nf_conntrack_locks_all_lock);
 static __read_mostly bool nf_conntrack_locks_all;
 
+/* every gc cycle scans at most 1/GC_MAX_BUCKETS_DIV part of table */
 #define GC_MAX_BUCKETS_DIV 64u
-#define GC_MAX_BUCKETS 8192u
-#define GC_INTERVAL		(5 * HZ)
+/* upper bound of scan intervals */
+#define GC_INTERVAL_MAX		(2 * HZ)
+/* maximum conntracks to evict per gc run */
 #define GC_MAX_EVICTS  256u
 
 static struct conntrack_gc_work conntrack_gc_work;
@@ -936,13 +939,13 @@ static noinline int early_drop(struct net *net, unsigned 
int _hash)
 static void gc_worker(struct work_struct *work)
 {
unsigned int i, goal, buckets = 0, expired_count = 0;
-   unsigned long next_run = GC_INTERVAL;
-   unsigned int ratio, scanned = 0;
struct conntrack_gc_work *gc_work;
+   unsigned int ratio, scanned = 0;
+   unsigned long next_run;
 
gc_work = container_of(work, struct conntrack_gc_work, dwork.work);
 
-   goal = min(nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV, 
GC_MAX_BUCKETS);
+   goal = nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV;
i = gc_work->last_bucket;
 
do {
@@ -982,17 +985,47 @@ static void gc_worker(struct work_struct *work)
if (gc_work->exiting)
return;
 
+   /*
+* Eviction will normally happen from the packet path, and not
+* from this gc worker.
+*
+* This worker is only here to reap expired entries when system went
+* idle after a busy period.
+*
+* The heuristics below are supposed to balance conflicting goals:
+*
+* 1. Minimize time until we notice a stale entry
+* 2. Maximize scan intervals to not waste cycles
+*
+* Normally, expired_count will be 0, this increases the next_run time
+* to priorize 2) above.
+*
+* As soon as a timed-out entry is found, move towards 1) and increase
+* the scan frequency.
+* In case we have lots of evictions next scan is done immediately.
+*/
ratio = scanned ? expired_count * 100 / scanned : 0;
-   if (ratio >= 90 || expired_count ==

[PATCH 09/14] netfilter: nft_dup: do not use sreg_dev if the user doesn't specify it

2016-11-09 Thread Pablo Neira Ayuso

From: Liping Zhang 

The NFTA_DUP_SREG_DEV attribute is optional, so we should use it
in the routing lookup only when the user specifies it.
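
The selection logic can be sketched as a small stand-alone function. The name dup_pick_oif() is hypothetical; the sketch assumes, as the check in the patch implies, that a register index of 0 means the attribute was not given, and that -1 stands for "no output interface".

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the output interface index stored in the
 * given register, or -1 when no device register was configured.
 */
static int dup_pick_oif(const uint32_t *regs, unsigned int sreg_dev)
{
	return sreg_dev ? (int)regs[sreg_dev] : -1;
}
```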

Fixes: d877f07112f1 ("netfilter: nf_tables: add nft_dup expression")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/nft_dup_ipv4.c | 6 ++++--
 net/ipv6/netfilter/nft_dup_ipv6.c | 6 ++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/netfilter/nft_dup_ipv4.c 
b/net/ipv4/netfilter/nft_dup_ipv4.c
index bf855e64fc45..0c01a270bf9f 100644
--- a/net/ipv4/netfilter/nft_dup_ipv4.c
+++ b/net/ipv4/netfilter/nft_dup_ipv4.c
@@ -28,7 +28,7 @@ static void nft_dup_ipv4_eval(const struct nft_expr *expr,
struct in_addr gw = {
.s_addr = (__force __be32)regs->data[priv->sreg_addr],
};
-   int oif = regs->data[priv->sreg_dev];
+   int oif = priv->sreg_dev ? regs->data[priv->sreg_dev] : -1;
 
nf_dup_ipv4(pkt->net, pkt->skb, pkt->hook, &gw, oif);
 }
@@ -59,7 +59,9 @@ static int nft_dup_ipv4_dump(struct sk_buff *skb, const 
struct nft_expr *expr)
 {
struct nft_dup_ipv4 *priv = nft_expr_priv(expr);
 
-   if (nft_dump_register(skb, NFTA_DUP_SREG_ADDR, priv->sreg_addr) ||
+   if (nft_dump_register(skb, NFTA_DUP_SREG_ADDR, priv->sreg_addr))
+   goto nla_put_failure;
+   if (priv->sreg_dev &&
nft_dump_register(skb, NFTA_DUP_SREG_DEV, priv->sreg_dev))
goto nla_put_failure;
 
diff --git a/net/ipv6/netfilter/nft_dup_ipv6.c 
b/net/ipv6/netfilter/nft_dup_ipv6.c
index 8bfd470cbe72..831f86e1ec08 100644
--- a/net/ipv6/netfilter/nft_dup_ipv6.c
+++ b/net/ipv6/netfilter/nft_dup_ipv6.c
@@ -26,7 +26,7 @@ static void nft_dup_ipv6_eval(const struct nft_expr *expr,
 {
struct nft_dup_ipv6 *priv = nft_expr_priv(expr);
struct in6_addr *gw = (struct in6_addr *)&regs->data[priv->sreg_addr];
-   int oif = regs->data[priv->sreg_dev];
+   int oif = priv->sreg_dev ? regs->data[priv->sreg_dev] : -1;
 
nf_dup_ipv6(pkt->net, pkt->skb, pkt->hook, gw, oif);
 }
@@ -57,7 +57,9 @@ static int nft_dup_ipv6_dump(struct sk_buff *skb, const 
struct nft_expr *expr)
 {
struct nft_dup_ipv6 *priv = nft_expr_priv(expr);
 
-   if (nft_dump_register(skb, NFTA_DUP_SREG_ADDR, priv->sreg_addr) ||
+   if (nft_dump_register(skb, NFTA_DUP_SREG_ADDR, priv->sreg_addr))
+   goto nla_put_failure;
+   if (priv->sreg_dev &&
nft_dump_register(skb, NFTA_DUP_SREG_DEV, priv->sreg_dev))
goto nla_put_failure;
 
-- 
2.1.4

[PATCH 07/14] netfilter: ip_vs_sync: fix bogus maybe-uninitialized warning

2016-11-09 Thread Pablo Neira Ayuso

From: Arnd Bergmann 

Building the ip_vs_sync code with CONFIG_OPTIMIZE_INLINING on x86
confuses the compiler to the point where it produces a rather
dubious warning message:

net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘opt.init_seq’ may be used 
uninitialized in this function [-Werror=maybe-uninitialized]
  struct ip_vs_sync_conn_options opt;
 ^~~
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘opt.delta’ may be used 
uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘opt.previous_delta’ may be 
used uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘*((void *)+12).init_seq’ 
may be used uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘*((void *)+12).delta’ may 
be used uninitialized in this function [-Werror=maybe-uninitialized]
net/netfilter/ipvs/ip_vs_sync.c:1073:33: error: ‘*((void 
*)+12).previous_delta’ may be used uninitialized in this function 
[-Werror=maybe-uninitialized]

The problem appears to be a combination of a number of factors, including
the __builtin_bswap32 compiler builtin being slightly odd, having a large
amount of code inlined into a single function, and the way that some
functions only get partially inlined here.

I've spent way too much time trying to work out a way to improve the
code, but the best I've come up with is to add an explicit memset
right before the ip_vs_seq structure is first initialized here. When
the compiler works correctly, this has absolutely no effect, but in the
case that produces the warning, the warning disappears.

In the process of analysing this warning, I also noticed that
we use memcpy to copy the larger ip_vs_sync_conn_options structure
over two members of the ip_vs_conn structure. This works because
the layout is identical, but seems error-prone, so I'm changing
this in the process to directly copy the two members. This change
seemed to have no effect on the object code or the warning, but
it deals with the same data, so I kept the two changes together.
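
The member-wise copy can be illustrated with simplified stand-in structures (all type and function names below are hypothetical and much smaller than the real ip_vs ones). Assigning the two members directly does not depend on them being adjacent or on the source and destination layouts matching, which is what made the memcpy form fragile.

```c
#include <stddef.h>
#include <stdint.h>

struct seq { uint32_t init_seq, delta, previous_delta; };

/* Stand-in for the options structure carried in a sync message. */
struct sync_opts { struct seq in_seq, out_seq; };

/* Stand-in for the connection; the two seq members sit between others. */
struct conn {
	int flags;
	struct seq in_seq;
	struct seq out_seq;
	int state;
};

/* Copy the two members one by one instead of memcpy() over both. */
static void conn_apply_opts(struct conn *cp, const struct sync_opts *opt)
{
	if (opt) {
		cp->in_seq = opt->in_seq;
		cp->out_seq = opt->out_seq;
	}
}
```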

Signed-off-by: Arnd Bergmann 
Acked-by: Julian Anastasov 
Signed-off-by: Simon Horman 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/ipvs/ip_vs_sync.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index 1b07578bedf3..9350530c16c1 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -283,6 +283,7 @@ struct ip_vs_sync_buff {
  */
 static void ntoh_seq(struct ip_vs_seq *no, struct ip_vs_seq *ho)
 {
+   memset(ho, 0, sizeof(*ho));
ho->init_seq   = get_unaligned_be32(&no->init_seq);
ho->delta  = get_unaligned_be32(&no->delta);
ho->previous_delta = get_unaligned_be32(&no->previous_delta);
@@ -917,8 +918,10 @@ static void ip_vs_proc_conn(struct netns_ipvs *ipvs, 
struct ip_vs_conn_param *pa
kfree(param->pe_data);
}
 
-   if (opt)
-   memcpy(&cp->in_seq, opt, sizeof(*opt));
+   if (opt) {
+   cp->in_seq = opt->in_seq;
+   cp->out_seq = opt->out_seq;
+   }
atomic_set(&cp->in_pkts, sysctl_sync_threshold(ipvs));
cp->state = state;
cp->old_state = cp->state;
-- 
2.1.4

[PATCH 12/14] netfilter: conntrack: fix CT target for UNSPEC helpers

2016-11-09 Thread Pablo Neira Ayuso

From: Florian Westphal 

Thomas reports it's not possible to attach the H.245 helper:

iptables -t raw -A PREROUTING -p udp -j CT --helper H.245
iptables: No chain/target/match by that name.
xt_CT: No such helper "H.245"

This is because H.245 registers as NFPROTO_UNSPEC, but the CT target
passes NFPROTO_IPV4/IPV6 to nf_conntrack_helper_try_module_get.

We should treat UNSPEC as wildcard and ignore the l3num instead.
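
The wildcard matching can be sketched as a stand-alone lookup over a small table. The table contents, the constant names and helper_find() are made up for illustration; only the three-step comparison mirrors the patch.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative constants; the kernel's values live in its own headers. */
enum { PROTO_UNSPEC = 0, PROTO_IPV4 = 2, PROTO_IPV6 = 10 };
enum { L4_TCP = 6, L4_UDP = 17 };

struct helper {
	const char *name;
	unsigned short l3num;
	unsigned char protonum;
};

/* Made-up registrations: one wildcard helper, one IPv4-only helper. */
static const struct helper helpers[] = {
	{ "H.245", PROTO_UNSPEC, L4_UDP },
	{ "ftp",   PROTO_IPV4,   L4_TCP },
};

static const struct helper *helper_find(const char *name,
					unsigned short l3num,
					unsigned char protonum)
{
	size_t i;

	for (i = 0; i < sizeof(helpers) / sizeof(helpers[0]); i++) {
		const struct helper *h = &helpers[i];

		if (strcmp(h->name, name))
			continue;
		/* A helper registered as UNSPEC matches any l3 protocol. */
		if (h->l3num != PROTO_UNSPEC && h->l3num != l3num)
			continue;
		if (h->protonum == protonum)
			return h;
	}
	return NULL;
}
```

In this model, looking up "H.245" succeeds for both IPv4 and IPv6, while the IPv4-only entry is still rejected for IPv6.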

Reported-by: Thomas Woerner 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nf_conntrack_helper.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_conntrack_helper.c 
b/net/netfilter/nf_conntrack_helper.c
index 336e21559e01..7341adf7059d 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -138,9 +138,14 @@ __nf_conntrack_helper_find(const char *name, u16 l3num, u8 
protonum)
 
for (i = 0; i < nf_ct_helper_hsize; i++) {
hlist_for_each_entry_rcu(h, &nf_ct_helper_hash[i], hnode) {
-   if (!strcmp(h->name, name) &&
-   h->tuple.src.l3num == l3num &&
-   h->tuple.dst.protonum == protonum)
+   if (strcmp(h->name, name))
+   continue;
+
+   if (h->tuple.src.l3num != NFPROTO_UNSPEC &&
+   h->tuple.src.l3num != l3num)
+   continue;
+
+   if (h->tuple.dst.protonum == protonum)
return h;
}
}
-- 
2.1.4

[PATCH iproute2 -net-next] bpf: make tc's bpf loader generic and move into lib

2016-11-09 Thread Daniel Borkmann

This work moves the bpf loader into the iproute2 library and reworks
the tc specific parts into generic code. It's useful as we can then
more easily support new program types by just having the same ELF
loader backend. Joint work with Thomas Graf. I hacked up a rough start
of a test suite to make sure nothing breaks [1], and all looks good.

  [1] https://github.com/borkmann/clsact/blob/master/test_bpf.sh

Signed-off-by: Daniel Borkmann 
Signed-off-by: Thomas Graf 
---
 Makefile   |   13 +
 configure  |2 +-
 include/bpf_api.h  |   26 +-
 include/bpf_util.h |   95 +++
 lib/Makefile   |2 +-
 lib/bpf.c  | 2262 
 tc/Makefile|7 +-
 tc/e_bpf.c |2 +-
 tc/f_bpf.c |   58 +-
 tc/m_bpf.c |   47 +-
 tc/tc_bpf.c| 2010 --
 tc/tc_bpf.h|   82 --
 12 files changed, 2467 insertions(+), 2139 deletions(-)
 create mode 100644 include/bpf_util.h
 create mode 100644 lib/bpf.c
 delete mode 100644 tc/tc_bpf.c
 delete mode 100644 tc/tc_bpf.h

diff --git a/Makefile b/Makefile
index fa200dd..37b68ad 100644
--- a/Makefile
+++ b/Makefile
@@ -1,3 +1,8 @@
+# Include "Config" if already generated
+ifneq ($(wildcard Config),)
+include Config
+endif
+
 ifndef VERBOSE
 MAKEFLAGS += --no-print-directory
 endif
@@ -7,6 +12,7 @@ LIBDIR?=$(PREFIX)/lib
 SBINDIR?=/sbin
 CONFDIR?=/etc/iproute2
 DATADIR?=$(PREFIX)/share
+HDRDIR?=$(PREFIX)/include/iproute2
 DOCDIR?=$(DATADIR)/doc/iproute2
 MANDIR?=$(DATADIR)/man
 ARPDDIR?=/var/lib/arpd
@@ -51,6 +57,11 @@ SUBDIRS=lib ip tc bridge misc netem genl tipc devlink man
 LIBNETLINK=../lib/libnetlink.a ../lib/libutil.a
 LDLIBS += $(LIBNETLINK)
 
+ifeq ($(HAVE_ELF),y)
+CFLAGS += -DHAVE_ELF
+LDLIBS += -lelf
+endif
+
 all: Config
@set -e; \
for i in $(SUBDIRS); \
@@ -63,6 +74,7 @@ install: all
install -m 0755 -d $(DESTDIR)$(SBINDIR)
install -m 0755 -d $(DESTDIR)$(CONFDIR)
install -m 0755 -d $(DESTDIR)$(ARPDDIR)
+   install -m 0755 -d $(DESTDIR)$(HDRDIR)
install -m 0755 -d $(DESTDIR)$(DOCDIR)/examples
install -m 0755 -d $(DESTDIR)$(DOCDIR)/examples/diffserv
install -m 0644 README.iproute2+tc $(shell find examples -maxdepth 1 
-type f) \
@@ -73,6 +85,7 @@ install: all
install -m 0644 $(shell find etc/iproute2 -maxdepth 1 -type f) 
$(DESTDIR)$(CONFDIR)
install -m 0755 -d $(DESTDIR)$(BASH_COMPDIR)
install -m 0644 bash-completion/tc $(DESTDIR)$(BASH_COMPDIR)
+   install -m 0644 include/bpf_elf.h $(DESTDIR)$(HDRDIR)
 
 snapshot:
echo "static const char SNAPSHOT[] = \""`date +%y%m%d`"\";" \
diff --git a/configure b/configure
index c978da3..6c431c3 100755
--- a/configure
+++ b/configure
@@ -272,7 +272,7 @@ EOF
 
 if $CC -I$INCLUDE -o $TMPDIR/elftest $TMPDIR/elftest.c -lelf >/dev/null 
2>&1
 then
-   echo "TC_CONFIG_ELF:=y" >>Config
+   echo "HAVE_ELF:=y" >>Config
echo "yes"
 else
echo "no"
diff --git a/include/bpf_api.h b/include/bpf_api.h
index 1b250d2..7642623 100644
--- a/include/bpf_api.h
+++ b/include/bpf_api.h
@@ -107,9 +107,14 @@
 
 /** BPF helper functions for tc. Individual flags are in linux/bpf.h */
 
+#ifndef __BPF_FUNC
+# define __BPF_FUNC(NAME, ...) \
+   (* NAME)(__VA_ARGS__) __maybe_unused
+#endif
+
 #ifndef BPF_FUNC
 # define BPF_FUNC(NAME, ...)   \
-   (* NAME)(__VA_ARGS__) __maybe_unused = (void *) BPF_FUNC_##NAME
+   __BPF_FUNC(NAME, __VA_ARGS__) = (void *) BPF_FUNC_##NAME
 #endif
 
 /* Map access/manipulation */
@@ -147,10 +152,15 @@ static void BPF_FUNC(tail_call, struct __sk_buff *skb, 
void *map,
 
 /* System helpers */
 static uint32_t BPF_FUNC(get_smp_processor_id);
+static uint32_t BPF_FUNC(get_numa_node_id);
 
 /* Packet misc meta data */
 static uint32_t BPF_FUNC(get_cgroup_classid, struct __sk_buff *skb);
+static int BPF_FUNC(skb_under_cgroup, void *map, uint32_t index);
+
 static uint32_t BPF_FUNC(get_route_realm, struct __sk_buff *skb);
+static uint32_t BPF_FUNC(get_hash_recalc, struct __sk_buff *skb);
+static uint32_t BPF_FUNC(set_hash_invalid, struct __sk_buff *skb);
 
 /* Packet redirection */
 static int BPF_FUNC(redirect, int ifindex, uint32_t flags);
@@ -169,6 +179,20 @@ static int BPF_FUNC(l4_csum_replace, struct __sk_buff 
*skb, uint32_t off,
uint32_t from, uint32_t to, uint32_t flags);
 static int BPF_FUNC(csum_diff, const void *from, uint32_t from_size,
const void *to, uint32_t to_size, uint32_t seed);
+static int BPF_FUNC(csum_update, struct __sk_buff *skb, uint32_t wsum);
+
+static int BPF_FUNC(skb_change_type, struct __sk_buff *skb, uint32_t type);
+static int BPF_FUNC(skb_change_proto, struct __sk_buff *skb, uint32_t proto,
+   uint32_t flags);
+static int

[PATCH net] net: __skb_flow_dissect() must cap its return value

2016-11-09 Thread Eric Dumazet

From: Eric Dumazet 

After Tom's patch, the thoff field could point past the end of the buffer,
which could fool some callers.

If an skb was provided, skb->len should be the upper limit.
If not, hlen is supposed to be the upper limit.
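
The capping rule can be sketched outside the kernel as follows. struct pkt and cap_thoff() are hypothetical stand-ins; the comparison is done on 16-bit values, as a min_t(u16, ...) expression would do.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt { unsigned int len; };	/* hypothetical stand-in for an skb */

/*
 * Cap the transport header offset: the limit is the packet length when a
 * packet is given, otherwise the caller-provided header length.
 */
static uint16_t cap_thoff(int nhoff, const struct pkt *skb, int hlen)
{
	uint16_t limit = (uint16_t)(skb ? skb->len : (unsigned int)hlen);
	uint16_t off = (uint16_t)nhoff;

	return off < limit ? off : limit;
}
```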

Fixes: a6e544b0a88b ("flow_dissector: Jump to exit code in __skb_flow_dissect")
Signed-off-by: Eric Dumazet 
Reported-by: Yibin Yang 
Acked-by: Willem de Bruijn 
Acked-by: Alexei Starovoitov 
---
 net/core/flow_dissector.c |   11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 44e6ba9d3a6b..5a908c534bec 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -122,7 +122,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
struct flow_dissector_key_keyid *key_keyid;
bool skip_vlan = false;
u8 ip_proto = 0;
-   bool ret = false;
+   bool ret;
 
if (!data) {
data = skb->data;
@@ -549,12 +549,17 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 out_good:
ret = true;
 
-out_bad:
+   key_control->thoff = (u16)nhoff;
+out:
key_basic->n_proto = proto;
key_basic->ip_proto = ip_proto;
-   key_control->thoff = (u16)nhoff;
 
return ret;
+
+out_bad:
+   ret = false;
+   key_control->thoff = min_t(u16, nhoff, skb ? skb->len : hlen);
+   goto out;
 }
 EXPORT_SYMBOL(__skb_flow_dissect);

Re: [PATCH] vxlan: hide unused local variable

2016-11-09 Thread David Miller

From: Arnd Bergmann 
Date: Mon,  7 Nov 2016 22:09:07 +0100

> A bugfix introduced a harmless warning in v4.9-rc4:
> 
> drivers/net/vxlan.c: In function 'vxlan_group_used':
> drivers/net/vxlan.c:947:21: error: unused variable 'sock6' 
> [-Werror=unused-variable]
> 
> This hides the variable inside of the same #ifdef that is
> around its user. The extraneous initialization is removed
> at the same time, it was accidentally introduced in the
> same commit.
> 
> Fixes: c6fcc4fc5f8b ("vxlan: avoid using stale vxlan socket.")
> Signed-off-by: Arnd Bergmann 

Applied.

Re: [PATCH net-next v2 2/5] net: l2tp: only set L2TP_ATTR_UDP_CSUM if AF_INET

2016-11-09 Thread David Miller

From: Asbjoern Sloth Toennesen 
Date: Mon,  7 Nov 2016 20:39:25 +

> Only set L2TP_ATTR_UDP_CSUM in l2tp_nl_tunnel_send()
> when it's running over IPv4.
> 
> This prepares the code to also have IPv6 specific attributes.
> 
> Signed-off-by: Asbjoern Sloth Toennesen 

Applied.

linux-next: manual merge of the net-next tree with the netfilter tree

2016-11-09 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  net/netfilter/ipvs/ip_vs_ctl.c

between commit:

  8fbfef7f505b ("ipvs: use IPVS_CMD_ATTR_MAX for family.maxattr")

from the netfilter tree and commit:

  489111e5c25b ("genetlink: statically initialize families")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc net/netfilter/ipvs/ip_vs_ctl.c
index a6e44ef2ec9a,6b85ded4f91d..
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@@ -3872,10 -3865,20 +3865,20 @@@ static const struct genl_ops ip_vs_genl
},
  };
  
+ static struct genl_family ip_vs_genl_family __ro_after_init = {
+   .hdrsize= 0,
+   .name   = IPVS_GENL_NAME,
+   .version= IPVS_GENL_VERSION,
 -  .maxattr= IPVS_CMD_MAX,
++  .maxattr= IPVS_CMD_ATTR_MAX,
+   .netnsok= true, /* Make ipvsadm to work on netns */
+   .module = THIS_MODULE,
+   .ops= ip_vs_genl_ops,
+   .n_ops  = ARRAY_SIZE(ip_vs_genl_ops),
+ };
+ 
  static int __init ip_vs_genl_register(void)
  {
-   return genl_register_family_with_ops(&ip_vs_genl_family,
-ip_vs_genl_ops);
+   return genl_register_family(&ip_vs_genl_family);
  }
  
  static void ip_vs_genl_unregister(void)

Re: [PATCH net-next v2 1/5] net: l2tp: change L2TP_ATTR_UDP_ZERO_CSUM6_{RX,TX} attribute types

2016-11-09 Thread David Miller

From: Asbjoern Sloth Toennesen 
Date: Mon,  7 Nov 2016 20:39:24 +

> The attributes L2TP_ATTR_UDP_ZERO_CSUM6_RX and
> L2TP_ATTR_UDP_ZERO_CSUM6_TX are used as flags,
> but are defined as a u8 in a comment.
> 
> This patch redocuments them as flags.
> 
> Adding nla_policy entries would break API, so not doing that.
> 
> CC: Tom Herbert 
> Signed-off-by: Asbjoern Sloth Toennesen 

Applied.

Re: [PATCH net-next v2 4/5] net: l2tp: cleanup: remove redundant condition

2016-11-09 Thread David Miller

From: Asbjoern Sloth Toennesen 
Date: Mon,  7 Nov 2016 20:39:27 +

> These assignments follow this pattern:
> 
>   unsigned int foo:1;
>   struct nlattr *nla = info->attrs[bar];
> 
>   if (nla)
>   foo = nla_get_flag(nla); /* expands to: foo = !!nla */
> 
> This could be simplified to: if (nla) foo = 1;
> but let's just remove the condition and use the macro,
> 
>   foo = nla_get_flag(nla);
> 
> Signed-off-by: Asbjoern Sloth Toennesen 

Applied.
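
The pattern discussed in the quoted patch can be tried out in user space. struct attr, get_flag() and struct cfg below are hypothetical stand-ins for struct nlattr, nla_get_flag() and the l2tp config structures; the sketch only relies on the flag helper evaluating to !!nla, as the quoted comment states.

```c
#include <stddef.h>

struct attr { int type; };	/* hypothetical stand-in for struct nlattr */

/* A flag attribute carries no payload: present means 1, absent means 0. */
static int get_flag(const struct attr *nla)
{
	return !!nla;
}

struct cfg { unsigned int foo:1; };

/* Unconditional form: no "if (nla)" guard around the assignment. */
static void cfg_set(struct cfg *c, const struct attr *nla)
{
	c->foo = get_flag(nla);
}
```

Note that the guarded and unguarded forms only agree when foo starts out as zero: the unguarded form clears foo when the attribute is absent, whereas the guarded one leaves it untouched.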

Re: [PATCH net-next v2 3/5] net: l2tp: netlink: l2tp_nl_tunnel_send: set UDP6 checksum flags

2016-11-09 Thread David Miller

From: Asbjoern Sloth Toennesen 
Date: Mon,  7 Nov 2016 20:39:26 +

> This patch causes the proper attribute flags to be set,
> in the case that IPv6 UDP checksums are disabled, so that
> userspace, i.e. `ip l2tp show tunnel`, knows about it.
> 
> Signed-off-by: Asbjoern Sloth Toennesen 

Applied.

Re: [PATCH] net: ethernet: ti: davinci_cpdma: fix fixed prio cpdma ctlr configuration

2016-11-09 Thread Ivan Khoronzhuk




On 09.11.16 23:09, Grygorii Strashko wrote:
> On 11/08/2016 07:10 AM, Ivan Khoronzhuk wrote:
>> The dma ctlr is reseted to 0 while cpdma start, thus cpdma ctlr
>
> I assume this is because cpdma_ctlr_start() does soft reset. Is it correct?

Probably not. I've seen that this register doesn't hold any previous settings
(just trash) after cpdma_ctlr_stop(), actually after the last channel is
stopped (inside of cpdma_ctlr_stop()). Then cpdma_ctlr_start() just resets it
to 0.

>> cannot be configured after cpdma is stopped. So, restore content
>> of cpdma ctlr while off/on procedure.
>>
>> Based on net-next/master
>
> ^ remove it

Sure.

>> Signed-off-by: Ivan Khoronzhuk 
>> ---
>>  drivers/net/ethernet/ti/cpsw.c  |   6 +-
>>  drivers/net/ethernet/ti/davinci_cpdma.c | 103 +---
>>  drivers/net/ethernet/ti/davinci_cpdma.h |   2 +
>>  3 files changed, 58 insertions(+), 53 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
>> index b1ddf89..4d04b8e 100644
>> --- a/drivers/net/ethernet/ti/cpsw.c
>> +++ b/drivers/net/ethernet/ti/cpsw.c
>> @@ -1376,10 +1376,6 @@ static int cpsw_ndo_open(struct net_device *ndev)
>>   ALE_ALL_PORTS, ALE_ALL_PORTS, 0, 0);
>>
>> if (!cpsw_common_res_usage_state(cpsw)) {
>> -   /* setup tx dma to fixed prio and zero offset */
>> -   cpdma_control_set(cpsw->dma, CPDMA_TX_PRIO_FIXED, 1);
>> -   cpdma_control_set(cpsw->dma, CPDMA_RX_BUFFER_OFFSET, 0);
>> -
>> /* disable priority elevation */
>> __raw_writel(0, &cpsw->regs->ptype);
>>
>> @@ -2710,6 +2706,8 @@ static int cpsw_probe(struct platform_device *pdev)
>> dma_params.desc_align   = 16;
>> dma_params.has_ext_regs = true;
>> dma_params.desc_hw_addr = dma_params.desc_mem_phys;
>> +   dma_params.rxbuf_offset = 0;
>> +   dma_params.fixed_prio   = 1;
>
> Do we really need this new parameters? Do you have plans to use other values?

I'm OK with these being static (like most of the other fields in dma_params);
I see no reason to use other values, at least for now. But the issue is not
limited to these two parameters or to cpsw_ndo_open(). It also affects
cpsw_set_channels(), where a ctlr stop/start is present. Rather than copying
cpdma_control_set(cpsw->dma, CPDMA_TX_PRIO_FIXED, 1)... into every such place
in the eth drivers, it is better to let the cpdma control its own parameters.

The cpdma ctlr register holds a few more parameters (only two of them are set
in cpsw). Maybe there is a reason to save those as well. I actually noticed
this problem when playing with on/off channel shapers; unfortunately the cpdma
ctlr holds that info too, and it was lost across the off/on cycle (but I'm
going to restore it in chan_start()).

--
Regards,
Ivan Khoronzhuk

Re: [PATCH net-next v2 5/5] net: l2tp: fix negative assignment to unsigned int

2016-11-09 Thread David Miller

From: Asbjoern Sloth Toennesen 
Date: Mon,  7 Nov 2016 20:39:28 +

> recv_seq, send_seq and lns_mode are all defined as
> unsigned int foo:1;
> 
> Signed-off-by: Asbjoern Sloth Toennesen 

Applied.

Re: [PATCH net] ibmvnic: Start completion queue negotiation at server-provided optimum values

2016-11-09 Thread David Miller

From: John Allen 
Date: Mon, 7 Nov 2016 14:27:28 -0600

> Use the opt_* fields to determine the starting point for negotiating the
> number of tx/rx completion queues with the vnic server. These contain the
> number of queues that the vnic server estimates that it will be able to
> allocate. While renegotiation may still occur, using the opt_* fields will
> reduce the number of times this needs to happen and will prevent driver
> probe timeout on systems using large numbers of ibmvnic client devices per
> vnic port.
> 
> Signed-off-by: John Allen 

Applied, thanks.

Re: [PATCH v2] net: icmp_route_lookup should use rt dev to determine L3 domain

2016-11-09 Thread David Miller

From: David Ahern 
Date: Mon,  7 Nov 2016 12:03:09 -0800

> icmp_send is called in response to some event. The skb may not have
> the device set (skb->dev is NULL), but it is expected to have an rt.
> Update icmp_route_lookup to use the rt on the skb to determine L3
> domain.
> 
> Fixes: 613d09b30f8b ("net: Use VRF device index for lookups on TX")
> Signed-off-by: David Ahern 

Applied and queued up for -stable, thanks David.

linux-next: manual merge of the net-next tree with the net tree

2016-11-09 Thread Stephen Rothwell

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c

between commit:

  ee39fbc4447d ("net/mlx5: E-Switch, Set the actions for offloaded rules 
properly")

from the net tree and commit:

  66958ed906b8 ("net/mlx5: Support encap id when setting new steering entry")

from the net-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index d239f5d0ea36,50fe8e8861bb..
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@@ -57,14 -58,14 +58,15 @@@ mlx5_eswitch_add_offloaded_rule(struct 
if (esw->mode != SRIOV_OFFLOADS)
return ERR_PTR(-EOPNOTSUPP);
  
 -  flow_act.action = attr->action;
 +  /* per flow vlan pop/push is emulated, don't set that into the firmware 
*/
-   action = attr->action & ~(MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH | 
MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
++  flow_act.action = attr->action & ~(MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH | 
MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
  
-   if (action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
-   dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
-   dest.vport_num = attr->out_rep->vport;
-   action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
-   } else if (action & MLX5_FLOW_CONTEXT_ACTION_COUNT) {
+   if (flow_act.action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
+   dest[i].type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
+   dest[i].vport_num = attr->out_rep->vport;
+   i++;
+   }
+   if (flow_act.action & MLX5_FLOW_CONTEXT_ACTION_COUNT) {
counter = mlx5_fc_create(esw->dev, true);
if (IS_ERR(counter))
return ERR_CAST(counter);

Re: [PATCH net-next] net-gro: avoid reorders

2016-11-09 Thread David Miller

From: Eric Dumazet 
Date: Mon, 07 Nov 2016 11:12:27 -0800

> From: Eric Dumazet 
> 
> Receiving a GSO packet in dev_gro_receive() is not uncommon
> in stacked devices, or devices partially implementing LRO/GRO
> like bnx2x. GRO is implementing the aggregation the device
> was not able to do itself.
> 
> Current code causes reorders, like in following case :
> 
> For a given flow where sender sent 4 packets P1,P2,P3,P4
> 
> Receiver might receive P1 as a single packet, stored in GRO engine.
> 
> Then P2-P4 are received as a single GSO packet, immediately given to
> upper stack, while P1 is held in GRO engine.
> 
> This patch will make sure P1 is given to upper stack, then P2-P4
> immediately after.
> 
> Signed-off-by: Eric Dumazet 

Applied.
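
The ordering problem from the quoted description can be reproduced with a toy single-flow model. Everything below (struct gro_model, gro_receive(), the fixed-size arrays) is hypothetical and only mimics the behaviour described in the commit message, not the actual dev_gro_receive() code: a packet that arrives already aggregated first flushes whatever is held for the flow, then goes up itself.

```c
#define MAX_PKTS 8

struct gro_model {
	int held[MAX_PKTS];		/* packets waiting in the GRO engine */
	int nheld;
	int delivered[MAX_PKTS];	/* order seen by the upper stack */
	int ndelivered;
};

static void deliver(struct gro_model *g, int id)
{
	if (g->ndelivered < MAX_PKTS)
		g->delivered[g->ndelivered++] = id;
}

/* is_gso != 0 means the packet arrived already aggregated. */
static void gro_receive(struct gro_model *g, int id, int is_gso)
{
	int i;

	if (!is_gso) {
		/* Plain packet: hold it for possible aggregation. */
		if (g->nheld < MAX_PKTS)
			g->held[g->nheld++] = id;
		return;
	}

	/* Flush held packets first so the upper stack sees them in order. */
	for (i = 0; i < g->nheld; i++)
		deliver(g, g->held[i]);
	g->nheld = 0;
	deliver(g, id);
}
```

Feeding P1 as a plain packet and then P2-P4 as one aggregated packet yields the delivery order P1, P2-P4 in this model.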

Re: [PATCH 0/2] net: qcom/emac: ensure that pause frames are enabled

2016-11-09 Thread David Miller

From: Timur Tabi 
Date: Mon,  7 Nov 2016 10:51:39 -0600

> The qcom emac driver experiences significant packet loss (through frame
> check sequence errors) if flow control is not enabled and the phy is
> not configured to allow pause frames to pass through it.  Therefore, we
> need to enable flow control and force the phy to pass pause frames.

Series applied, thanks.

[PATCH net 1/2] bpf: Fix bpf_redirect to an ipip/ip6tnl dev

2016-11-09 Thread Martin KaFai Lau

If the bpf program calls bpf_redirect(dev, 0) and dev is
an ipip/ip6tnl, it currently includes the mac header.
e.g. If dev is ipip, the end result is IP-EthHdr-IP instead
of IP-IP.

The fix is to pull the mac header.  At ingress, skb_postpull_rcsum()
is not needed because the ethhdr should have been pulled once already
and then got pushed back just before calling the bpf_prog.
At egress, this patch calls skb_postpull_rcsum().

If bpf_redirect(dev, BPF_F_INGRESS) is called,
it also fails now because it calls dev_forward_skb() which
eventually calls eth_type_trans(skb, dev).  The eth_type_trans()
will set skb->type = PACKET_OTHERHOST because the mac address
does not match the redirecting dev->dev_addr.  The PACKET_OTHERHOST
will eventually cause ip_rcv() to error out.  To fix this,
____dev_forward_skb() is added.

Joint work with Daniel Borkmann.

Fixes: cfc7381b3002 ("ip_tunnel: add collect_md mode to IPIP tunnel")
Fixes: 8d79266bc48c ("ip6_tunnel: add collect_md mode to IPv6 tunnels")
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: Martin KaFai Lau 
---
 include/linux/netdevice.h | 15 +++
 net/core/dev.c| 17 +---
 net/core/filter.c | 68 +--
 3 files changed, 81 insertions(+), 19 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 91ee364..bf04a46 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3354,6 +3354,21 @@ int dev_forward_skb(struct net_device *dev, struct 
sk_buff *skb);
 bool is_skb_forwardable(const struct net_device *dev,
const struct sk_buff *skb);
 
+static __always_inline int ____dev_forward_skb(struct net_device *dev,
+  struct sk_buff *skb)
+{
+   if (skb_orphan_frags(skb, GFP_ATOMIC) ||
+   unlikely(!is_skb_forwardable(dev, skb))) {
+   atomic_long_inc(&dev->rx_dropped);
+   kfree_skb(skb);
+   return NET_RX_DROP;
+   }
+
+   skb_scrub_packet(skb, true);
+   skb->priority = 0;
+   return 0;
+}
+
 void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev);
 
 extern int netdev_budget;
diff --git a/net/core/dev.c b/net/core/dev.c
index eaad4c2..b28 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1766,19 +1766,14 @@ EXPORT_SYMBOL_GPL(is_skb_forwardable);
 
 int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb)
 {
-   if (skb_orphan_frags(skb, GFP_ATOMIC) ||
-   unlikely(!is_skb_forwardable(dev, skb))) {
-   atomic_long_inc(&dev->rx_dropped);
-   kfree_skb(skb);
-   return NET_RX_DROP;
-   }
+   int ret = ____dev_forward_skb(dev, skb);
 
-   skb_scrub_packet(skb, true);
-   skb->priority = 0;
-   skb->protocol = eth_type_trans(skb, dev);
-   skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
+   if (likely(!ret)) {
+   skb->protocol = eth_type_trans(skb, dev);
+   skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
+   }
 
-   return 0;
+   return ret;
 }
 EXPORT_SYMBOL_GPL(__dev_forward_skb);
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 00351cd..b391209 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1628,6 +1628,19 @@ static inline int __bpf_rx_skb(struct net_device *dev, 
struct sk_buff *skb)
return dev_forward_skb(dev, skb);
 }
 
+static inline int __bpf_rx_skb_no_mac(struct net_device *dev,
+ struct sk_buff *skb)
+{
+   int ret = ____dev_forward_skb(dev, skb);
+
+   if (likely(!ret)) {
+   skb->dev = dev;
+   ret = netif_rx(skb);
+   }
+
+   return ret;
+}
+
 static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
 {
int ret;
@@ -1647,6 +1660,51 @@ static inline int __bpf_tx_skb(struct net_device *dev, 
struct sk_buff *skb)
return ret;
 }
 
+static int __bpf_redirect_no_mac(struct sk_buff *skb, struct net_device *dev,
+u32 flags)
+{
+   /* skb->mac_len is not set on normal egress */
+   unsigned int mlen = skb->network_header - skb->mac_header;
+
+   __skb_pull(skb, mlen);
+
+   /* At ingress, the mac header has already been pulled once.
+* At egress, skb_postpull_rcsum has to be done in case that
+* the skb is originated from ingress (i.e. a forwarded skb)
+* to ensure that rcsum starts at net header.
+*/
+   if (!skb_at_tc_ingress(skb))
+   skb_postpull_rcsum(skb, skb_mac_header(skb), mlen);
+   skb_pop_mac_header(skb);
+   skb_reset_mac_len(skb);
+   return flags & BPF_F_INGRESS ?
+  __bpf_rx_skb_no_mac(dev, skb) : __bpf_tx_skb(dev, skb);
+}
+
+static int __bpf_redirect_common(struct sk_buff *skb, struct net_device *dev,
+

[PATCH net 0/2] bpf: Fix bpf_redirect to an ipip/ip6tnl dev

2016-11-09 Thread Martin KaFai Lau

Hi,

This patch set fixes a bug in bpf_redirect(dev, flags) when dev is an
ipip/ip6tnl.  The current problem is IP-EthHdr-IP is sent out instead of
IP-IP.

Patch 1 adds a dev->type test similar to dev_is_mac_header_xmit()
in act_mirred.c, which is only available in net-next.  We can consider
refactoring it once this patch is pulled into net-next from net.

Thanks,
-- Martin
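
The "IP-EthHdr-IP instead of IP-IP" problem comes down to advancing the
data pointer past the mac header before the packet is handed to the
tunnel device.  The following user-space sketch shows that arithmetic
only.  struct toy_pkt and toy_pull_mac() are hypothetical stand-ins, not
sk_buff or any kernel helper, and checksum adjustment is deliberately
left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet buffer; not struct sk_buff. */
struct toy_pkt {
	unsigned char buf[64];
	size_t data_off;	/* where packet data currently starts */
	size_t len;		/* bytes of packet data from data_off */
	size_t mac_off;		/* offset of the mac header */
	size_t net_off;		/* offset of the network header */
};

/* Pull the mac header so that data starts at the network header.
 * The length is derived from the header offsets rather than a cached
 * mac length.  Returns the number of bytes pulled.
 */
static size_t toy_pull_mac(struct toy_pkt *p)
{
	size_t mlen = p->net_off - p->mac_off;

	assert(p->data_off == p->mac_off);
	assert(mlen <= p->len);

	p->data_off += mlen;
	p->len -= mlen;
	return mlen;
}
```

For a 14-byte Ethernet header followed by a 20-byte IPv4 header,
toy_pull_mac() returns 14 and leaves data pointing at the first byte of
the IPv4 header, which is what an ipip device expects to encapsulate.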

[PATCH net 2/2] bpf: Add test for bpf_redirect to ipip/ip6tnl

2016-11-09 Thread Martin KaFai Lau

The test creates two netns, ns1 and ns2.  The host (the default netns)
has an ipip or ip6tnl dev configured for tunneling traffic to ns2.

ping VIPS from ns1 <> host <--tunnel--> ns2 (VIPs at loopback)

The test is to have ns1 pinging VIPs configured at the loopback
interface in ns2.

The VIPs are 10.10.1.102 and 2401:face::66 (which are configured
at lo@ns2). [Note: 0x66 => 102].

At ns1, the VIPs are routed _via_ the host.

At the host, bpf programs are installed at the veth to redirect packets
from a veth to the ipip/ip6tnl.  The test is configured in a way so
that both ingress and egress can be tested.

At ns2, the ipip/ip6tnl dev is configured with the local and remote address
specified.  The return path is routed to the dev ipip/ip6tnl.

During the egress test, the host also pings the VIPs locally to ensure
that bpf_redirect at egress also works for direct egress (i.e. not
forwarding from dev ve1 to ve2).

Acked-by: Alexei Starovoitov 
Signed-off-by: Martin KaFai Lau 
---
 samples/bpf/Makefile  |   4 +
 samples/bpf/tc_l2_redirect.sh | 173 
 samples/bpf/tc_l2_redirect_kern.c | 236 ++
 samples/bpf/tc_l2_redirect_user.c |  73 
 4 files changed, 486 insertions(+)
 create mode 100755 samples/bpf/tc_l2_redirect.sh
 create mode 100644 samples/bpf/tc_l2_redirect_kern.c
 create mode 100644 samples/bpf/tc_l2_redirect_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 12b7304..72c5867 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -27,6 +27,7 @@ hostprogs-y += xdp2
 hostprogs-y += test_current_task_under_cgroup
 hostprogs-y += trace_event
 hostprogs-y += sampleip
+hostprogs-y += tc_l2_redirect
 
 test_verifier-objs := test_verifier.o libbpf.o
 test_maps-objs := test_maps.o libbpf.o
@@ -56,6 +57,7 @@ test_current_task_under_cgroup-objs := bpf_load.o libbpf.o \
   test_current_task_under_cgroup_user.o
 trace_event-objs := bpf_load.o libbpf.o trace_event_user.o
 sampleip-objs := bpf_load.o libbpf.o sampleip_user.o
+tc_l2_redirect-objs := bpf_load.o libbpf.o tc_l2_redirect_user.o
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -72,6 +74,7 @@ always += test_probe_write_user_kern.o
 always += trace_output_kern.o
 always += tcbpf1_kern.o
 always += tcbpf2_kern.o
+always += tc_l2_redirect_kern.o
 always += lathist_kern.o
 always += offwaketime_kern.o
 always += spintest_kern.o
@@ -111,6 +114,7 @@ HOSTLOADLIBES_xdp2 += -lelf
 HOSTLOADLIBES_test_current_task_under_cgroup += -lelf
 HOSTLOADLIBES_trace_event += -lelf
 HOSTLOADLIBES_sampleip += -lelf
+HOSTLOADLIBES_tc_l2_redirect += -l elf
 
 # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on 
cmdline:
 #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc 
CLANG=~/git/llvm/build/bin/clang
diff --git a/samples/bpf/tc_l2_redirect.sh b/samples/bpf/tc_l2_redirect.sh
new file mode 100755
index 000..b0e1f09
--- /dev/null
+++ b/samples/bpf/tc_l2_redirect.sh
@@ -0,0 +1,173 @@
+#!/bin/bash
+
+[[ -z $TC ]] && TC='tc'
+[[ -z $IP ]] && IP='ip'
+
+REDIRECT_USER='./tc_l2_redirect'
+REDIRECT_BPF='./tc_l2_redirect_kern.o'
+
+RP_FILTER=$(< /proc/sys/net/ipv4/conf/all/rp_filter)
+IPV6_FORWARDING=$(< /proc/sys/net/ipv6/conf/all/forwarding)
+
+function config_common {
+   local tun_type=$1
+
+   $IP netns add ns1
+   $IP netns add ns2
+   $IP link add ve1 type veth peer name vens1
+   $IP link add ve2 type veth peer name vens2
+   $IP link set dev ve1 up
+   $IP link set dev ve2 up
+   $IP link set dev ve1 mtu 1500
+   $IP link set dev ve2 mtu 1500
+   $IP link set dev vens1 netns ns1
+   $IP link set dev vens2 netns ns2
+
+   $IP -n ns1 link set dev lo up
+   $IP -n ns1 link set dev vens1 up
+   $IP -n ns1 addr add 10.1.1.101/24 dev vens1
+   $IP -n ns1 addr add 2401:db01::65/64 dev vens1 nodad
+   $IP -n ns1 route add default via 10.1.1.1 dev vens1
+   $IP -n ns1 route add default via 2401:db01::1 dev vens1
+
+   $IP -n ns2 link set dev lo up
+   $IP -n ns2 link set dev vens2 up
+   $IP -n ns2 addr add 10.2.1.102/24 dev vens2
+   $IP -n ns2 addr add 2401:db02::66/64 dev vens2 nodad
+   $IP -n ns2 addr add 10.10.1.102 dev lo
+   $IP -n ns2 addr add 2401:face::66/64 dev lo nodad
+   $IP -n ns2 link add ipt2 type ipip local 10.2.1.102 remote 10.2.1.1
+   $IP -n ns2 link add ip6t2 type ip6tnl mode any local 2401:db02::66 
remote 2401:db02::1
+   $IP -n ns2 link set dev ipt2 up
+   $IP -n ns2 link set dev ip6t2 up
+   $IP netns exec ns2 $TC qdisc add dev vens2 clsact
+   $IP netns exec ns2 $TC filter add dev vens2 ingress bpf da obj 
$REDIRECT_BPF sec drop_non_tun_vip
+   if [[ $tun_type == "ipip" ]]; then
+   $IP -n ns2 route add 10.1.1.0/24 dev ipt2
+   $IP netns exec ns2 sysctl -q -w

Re: [PATCH] mwifiex: fix memory leak in mwifiex_save_hidden_ssid_channels()

2016-11-09 Thread Brian Norris

On Wed, Nov 09, 2016 at 11:37:28AM +0800, Ricky Liang wrote:
> kmemleak reports memory leak in mwifiex_save_hidden_ssid_channels():
> 
> unreferenced object 0xffc0a2914780 (size 192):
>   comm "ksdioirqd/mmc2", pid 2004, jiffies 4307182506 (age 820.684s)
>   hex dump (first 32 bytes):
> 00 06 47 49 4e 2d 32 67 01 03 c8 60 6c 03 01 40  ..GIN-2g...`l..@
> 07 10 54 57 20 34 04 1e 64 05 24 84 03 24 95 04  ..TW 4..d.$..$..
>   backtrace:
> [] create_object+0x164/0x2b4
> [] kmemleak_alloc+0x50/0x88
> [] __kmalloc_track_caller+0x1bc/0x264
> [] kmemdup+0x38/0x64
> [] mwifiex_fill_new_bss_desc+0x3c/0x130 [mwifiex]
> [] mwifiex_save_curr_bcn+0x4ec/0x640 [mwifiex]
> [] mwifiex_handle_event_ext_scan_report+0x1d4/0x268 
> [mwifiex]
> [] mwifiex_process_sta_event+0x378/0x898 [mwifiex]
> [] mwifiex_process_event+0x1a8/0x1e8 [mwifiex]
> [] mwifiex_main_process+0x258/0x534 [mwifiex]
> [] 0xffbffc258858
> [] process_sdio_pending_irqs+0xf8/0x160
> [] sdio_irq_thread+0x9c/0x1a4
> [] kthread+0xf4/0x100
> [] ret_from_fork+0xc/0x50
> [] 0x
> 
> Signed-off-by: Ricky Liang 
> ---
>  drivers/net/wireless/marvell/mwifiex/scan.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/wireless/marvell/mwifiex/scan.c 
> b/drivers/net/wireless/marvell/mwifiex/scan.c
> index 97c9765..98ce072 100644
> --- a/drivers/net/wireless/marvell/mwifiex/scan.c
> +++ b/drivers/net/wireless/marvell/mwifiex/scan.c
> @@ -1671,6 +1671,10 @@ static int mwifiex_save_hidden_ssid_channels(struct 
> mwifiex_private *priv,
>   }
>  
>  done:
> + /* beacon_ie buffer was allocated in function
> +  * mwifiex_fill_new_bss_desc(). Free it now.
> +  */
> + kfree(bss_desc->beacon_buf);

For a bit, I thought this was possibly a sort of double-free, since
mwifiex_fill_new_bss_desc() might actually fail to allocate
->beacon_buf, but kfree(NULL) is safe, so:

Reviewed-by: Brian Norris 

>   kfree(bss_desc);
>   return 0;
>  }
> -- 
> 2.6.6
>
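
The review comment above relies on kfree(NULL) being a no-op.  Standard C
gives free(NULL) the same guarantee, so the cleanup pattern can be
sketched in user space.  The toy_* names and the simulate_failure flag
are hypothetical and only mirror the shape of the driver code.  This is
not the mwifiex implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor that owns one heap buffer. */
struct toy_bss_desc {
	unsigned char *beacon_buf;
};

/* Copy src into a newly allocated buffer.  On failure beacon_buf is
 * left NULL and -1 is returned.
 */
static int toy_fill_desc(struct toy_bss_desc *d, const unsigned char *src,
			 size_t len, int simulate_failure)
{
	assert(d != NULL);

	d->beacon_buf = NULL;
	if (simulate_failure)
		return -1;

	d->beacon_buf = malloc(len);
	if (!d->beacon_buf)
		return -1;

	memcpy(d->beacon_buf, src, len);
	return 0;
}

/* Unconditional release: free(NULL) does nothing, so no NULL check is
 * needed even when toy_fill_desc() failed.
 */
static void toy_release_desc(struct toy_bss_desc *d)
{
	assert(d != NULL);

	free(d->beacon_buf);
	d->beacon_buf = NULL;
}
```

Calling toy_release_desc() after a failed toy_fill_desc() is harmless,
which is the same reasoning that makes the unconditional kfree() in the
patch safe.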

[PATCH net-next V5 6/9] liquidio CN23XX: device states

2016-11-09 Thread Raghu Vatsavayi

Fix resource leaks in the octeon_destroy_resources() teardown path by
introducing more device states.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 33 --
 .../net/ethernet/cavium/liquidio/octeon_device.c   |  6 +++-
 .../net/ethernet/cavium/liquidio/octeon_device.h   | 29 ++-
 drivers/net/ethernet/cavium/liquidio/octeon_droq.c | 13 +
 drivers/net/ethernet/cavium/liquidio/octeon_main.h |  8 --
 .../net/ethernet/cavium/liquidio/request_manager.c |  6 +++-
 6 files changed, 64 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c 
b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index 6e435db..eb46ffa 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -770,6 +770,7 @@ static void delete_glists(struct lio *lio)
}
 
kfree((void *)lio->glist);
+   kfree((void *)lio->glist_lock);
 }
 
 /**
@@ -1329,6 +1330,7 @@ static int liquidio_watchdog(void *param)
complete(&first_stage);
 
if (octeon_device_init(oct_dev)) {
+   complete(&hs->init);
liquidio_remove(pdev);
return -ENOMEM;
}
@@ -1353,7 +1355,15 @@ static int liquidio_watchdog(void *param)
oct_dev->watchdog_task = kthread_create(
liquidio_watchdog, oct_dev,
"liowd/%02hhx:%02hhx.%hhx", bus, device, function);
-   wake_up_process(oct_dev->watchdog_task);
+   if (!IS_ERR(oct_dev->watchdog_task)) {
+   wake_up_process(oct_dev->watchdog_task);
+   } else {
+   oct_dev->watchdog_task = NULL;
+   dev_err(&oct_dev->pci_dev->dev,
+   "failed to create kernel_thread\n");
+   liquidio_remove(pdev);
+   return -1;
+   }
}
}
 
@@ -1417,6 +1427,8 @@ static void octeon_destroy_resources(struct octeon_device 
*oct)
if (lio_wait_for_oq_pkts(oct))
dev_err(&oct->pci_dev->dev, "OQ had pending packets\n");
 
+   /* fallthrough */
+   case OCT_DEV_INTR_SET_DONE:
/* Disable interrupts  */
oct->fn_list.disable_interrupt(oct, OCTEON_ALL_INTR);
 
@@ -1443,6 +1455,8 @@ static void octeon_destroy_resources(struct octeon_device 
*oct)
pci_disable_msi(oct->pci_dev);
}
 
+   /* fallthrough */
+   case OCT_DEV_MSIX_ALLOC_VECTOR_DONE:
if (OCTEON_CN23XX_PF(oct))
octeon_free_ioq_vector(oct);
 
@@ -1508,10 +1522,13 @@ static void octeon_destroy_resources(struct 
octeon_device *oct)
octeon_unmap_pci_barx(oct, 1);
 
/* fallthrough */
-   case OCT_DEV_BEGIN_STATE:
+   case OCT_DEV_PCI_ENABLE_DONE:
+   pci_clear_master(oct->pci_dev);
/* Disable the device, releasing the PCI INT */
pci_disable_device(oct->pci_dev);
 
+   /* fallthrough */
+   case OCT_DEV_BEGIN_STATE:
/* Nothing to be done here either */
break;
}   /* end switch (oct->status) */
@@ -1781,6 +1798,7 @@ static int octeon_pci_os_setup(struct octeon_device *oct)
 
if (dma_set_mask_and_coherent(&oct->pci_dev->dev, DMA_BIT_MASK(64))) {
dev_err(>pci_dev->dev, "Unexpected DMA device 
capability\n");
+   pci_disable_device(oct->pci_dev);
return 1;
}
 
@@ -4434,6 +4452,8 @@ static int octeon_device_init(struct octeon_device 
*octeon_dev)
if (octeon_pci_os_setup(octeon_dev))
return 1;
 
+   atomic_set(&octeon_dev->status, OCT_DEV_PCI_ENABLE_DONE);
+
/* Identify the Octeon type and map the BAR address space. */
if (octeon_chip_specific_setup(octeon_dev)) {
dev_err(_dev->pci_dev->dev, "Chip specific setup 
failed\n");
@@ -4505,9 +4525,6 @@ static int octeon_device_init(struct octeon_device 
*octeon_dev)
if (octeon_setup_instr_queues(octeon_dev)) {
dev_err(&octeon_dev->pci_dev->dev,
"instruction queue initialization failed\n");
-   /* On error, release any previously allocated queues */
-   for (j = 0; j < octeon_dev->num_iqs; j++)
-   octeon_delete_instr_queue(octeon_dev, j);
return 1;
}
atomic_set(&octeon_dev->status, OCT_DEV_INSTR_QUEUE_INIT_DONE);
@@ -4523,9
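
The teardown switch in this patch depends on each device state falling
through to the cleanup of every earlier state, so that a failure at any
point in initialization undoes exactly what had been set up.  A minimal
sketch of that pattern follows.  The toy_* states and TOY_UNDO_* flags
are hypothetical and far fewer than the driver's real states.

```c
#include <assert.h>

/* Hypothetical initialization states, in the order they are reached. */
enum toy_state {
	TOY_BEGIN,
	TOY_PCI_ENABLE_DONE,
	TOY_BAR_MAPPED,
	TOY_INTR_SET_DONE,
};

/* Bits recording which cleanup steps ran. */
#define TOY_UNDO_PCI	0x1u
#define TOY_UNDO_BAR	0x2u
#define TOY_UNDO_INTR	0x4u

/* Undo everything up to and including the state that was reached.
 * Cases are listed from latest to earliest and fall through.
 */
static unsigned int toy_destroy(enum toy_state reached)
{
	unsigned int undone = 0;

	assert(reached <= TOY_INTR_SET_DONE);

	switch (reached) {
	case TOY_INTR_SET_DONE:
		undone |= TOY_UNDO_INTR;	/* disable interrupts */
		/* fallthrough */
	case TOY_BAR_MAPPED:
		undone |= TOY_UNDO_BAR;		/* unmap BARs */
		/* fallthrough */
	case TOY_PCI_ENABLE_DONE:
		undone |= TOY_UNDO_PCI;		/* disable the PCI device */
		/* fallthrough */
	case TOY_BEGIN:
		/* nothing to undo */
		break;
	}

	return undone;
}
```

Adding a state between two existing ones, as the patch does with
OCT_DEV_PCI_ENABLE_DONE, only requires a new case label at the matching
position in the switch.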

[PATCH net-next V5 9/9] liquidio CN23XX: fix for new check patch errors

2016-11-09 Thread Raghu Vatsavayi

The new checkpatch script reports some errors in the pre-existing
driver code. This patch fixes those errors.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 .../net/ethernet/cavium/liquidio/cn23xx_pf_regs.h  |  12 +--
 drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h |  12 +--
 .../net/ethernet/cavium/liquidio/cn68xx_device.c   |   2 +-
 drivers/net/ethernet/cavium/liquidio/lio_ethtool.c |   9 +-
 drivers/net/ethernet/cavium/liquidio/lio_main.c|  15 +--
 .../net/ethernet/cavium/liquidio/liquidio_common.h |  50 -
 .../net/ethernet/cavium/liquidio/octeon_console.c  | 113 ++---
 .../net/ethernet/cavium/liquidio/octeon_device.c   |  28 ++---
 .../net/ethernet/cavium/liquidio/octeon_device.h   |  25 +++--
 drivers/net/ethernet/cavium/liquidio/octeon_droq.c |  40 
 drivers/net/ethernet/cavium/liquidio/octeon_iq.h   |   3 +
 .../net/ethernet/cavium/liquidio/octeon_mem_ops.c  |   2 +-
 .../net/ethernet/cavium/liquidio/octeon_network.h  |   6 +-
 drivers/net/ethernet/cavium/liquidio/octeon_nic.h  |   2 +-
 .../net/ethernet/cavium/liquidio/request_manager.c |  16 ++-
 15 files changed, 155 insertions(+), 180 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_regs.h 
b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_regs.h
index 680a405..e6d4ad9 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_regs.h
+++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_regs.h
@@ -58,7 +58,7 @@
 
 #define CN23XX_CONFIG_SRIOV_BAR_START 0x19C
 #define CN23XX_CONFIG_SRIOV_BARX(i)\
-   (CN23XX_CONFIG_SRIOV_BAR_START + (i * 4))
+   (CN23XX_CONFIG_SRIOV_BAR_START + ((i) * 4))
 #define CN23XX_CONFIG_SRIOV_BAR_PF0x08
 #define CN23XX_CONFIG_SRIOV_BAR_64BIT 0x04
 #define CN23XX_CONFIG_SRIOV_BAR_IO0x01
@@ -508,7 +508,7 @@
 /* 4 Registers (64 - bit) */
 #defineCN23XX_SLI_S2M_PORT_CTL_START 0x23D80
 #defineCN23XX_SLI_S2M_PORTX_CTL(port)  \
-   (CN23XX_SLI_S2M_PORT_CTL_START + (port * 0x10))
+   (CN23XX_SLI_S2M_PORT_CTL_START + ((port) * 0x10))
 
 #defineCN23XX_SLI_MAC_NUMBER 0x20050
 
@@ -549,26 +549,26 @@
  * Provides DMA Engine Queue Enable
  */
 #defineCN23XX_DPI_DMA_ENG0_ENB0x0001df80ULL
-#defineCN23XX_DPI_DMA_ENG_ENB(eng) (CN23XX_DPI_DMA_ENG0_ENB + (eng * 8))
+#defineCN23XX_DPI_DMA_ENG_ENB(eng) (CN23XX_DPI_DMA_ENG0_ENB + ((eng) * 8))
 
 /* 8 register (64-bit) - DPI_DMA(0..7)_REQQ_CTL
  * Provides control bits for transaction on 8 Queues
  */
 #defineCN23XX_DPI_DMA_REQQ0_CTL   0x0001df000180ULL
 #defineCN23XX_DPI_DMA_REQQ_CTL(q_no)   \
-   (CN23XX_DPI_DMA_REQQ0_CTL + (q_no * 8))
+   (CN23XX_DPI_DMA_REQQ0_CTL + ((q_no) * 8))
 
 /* 6 register (64-bit) - DPI_ENG(0..5)_BUF
  * Provides DMA Engine FIFO (Queue) Size
  */
 #defineCN23XX_DPI_DMA_ENG0_BUF0x0001df000880ULL
 #defineCN23XX_DPI_DMA_ENG_BUF(eng)   \
-   (CN23XX_DPI_DMA_ENG0_BUF + (eng * 8))
+   (CN23XX_DPI_DMA_ENG0_BUF + ((eng) * 8))
 
 /* 4 Registers (64-bit) */
 #defineCN23XX_DPI_SLI_PRT_CFG_START   0x0001df000900ULL
 #defineCN23XX_DPI_SLI_PRTX_CFG(port)\
-   (CN23XX_DPI_SLI_PRT_CFG_START + (port * 0x8))
+   (CN23XX_DPI_SLI_PRT_CFG_START + ((port) * 0x8))
 
 /* Masks for DPI_DMA_CONTROL Register */
 #defineCN23XX_DPI_DMA_COMMIT_MODE BIT_ULL(58)
diff --git a/drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h 
b/drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h
index 23152c0..b248966 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h
+++ b/drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h
@@ -438,10 +438,10 @@
 #defineCN6XXX_SLI_S2M_PORT0_CTL  0x3D80
 #defineCN6XXX_SLI_S2M_PORT1_CTL  0x3D90
 #defineCN6XXX_SLI_S2M_PORTX_CTL(port)\
-   (CN6XXX_SLI_S2M_PORT0_CTL + (port * 0x10))
+   (CN6XXX_SLI_S2M_PORT0_CTL + ((port) * 0x10))
 
 #defineCN6XXX_SLI_INT_ENB64(port)\
-   (CN6XXX_SLI_INT_ENB64_PORT0 + (port * 0x10))
+   (CN6XXX_SLI_INT_ENB64_PORT0 + ((port) * 0x10))
 
 #defineCN6XXX_SLI_MAC_NUMBER 0x3E00
 
@@ -453,7 +453,7 @@
 #defineCN6XXX_PCI_BAR1_OFFSET  0x8
 
 #defineCN6XXX_BAR1_REG(idx, port) \
-   (CN6XXX_BAR1_INDEX_START + (port * CN6XXX_PEM_OFFSET) + \
+   (CN6XXX_BAR1_INDEX_START + ((port) * CN6XXX_PEM_OFFSET) + \
(CN6XXX_PCI_BAR1_OFFSET * (idx)))
 
 /* DPI #*/
@@ -471,17 +471,17 @@
 #defineCN6XXX_DPI_DMA_ENG0_ENB0x0001df80ULL

[PATCH net-next V5 3/9] liquidio CN23XX: Mailbox support

2016-11-09 Thread Raghu Vatsavayi

Adds support for mailbox communication between PF and VF.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/Makefile  |   1 +
 drivers/net/ethernet/cavium/liquidio/lio_core.c|  32 +++
 .../net/ethernet/cavium/liquidio/liquidio_common.h |   6 +-
 .../net/ethernet/cavium/liquidio/octeon_device.h   |   4 +
 .../net/ethernet/cavium/liquidio/octeon_mailbox.c  | 318 +
 .../net/ethernet/cavium/liquidio/octeon_mailbox.h  | 115 
 drivers/net/ethernet/cavium/liquidio/octeon_main.h |   2 +-
 7 files changed, 475 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
 create mode 100644 drivers/net/ethernet/cavium/liquidio/octeon_mailbox.h

diff --git a/drivers/net/ethernet/cavium/liquidio/Makefile 
b/drivers/net/ethernet/cavium/liquidio/Makefile
index 5a27b2a..14958de 100644
--- a/drivers/net/ethernet/cavium/liquidio/Makefile
+++ b/drivers/net/ethernet/cavium/liquidio/Makefile
@@ -11,6 +11,7 @@ liquidio-$(CONFIG_LIQUIDIO) += lio_ethtool.o \
cn66xx_device.o\
cn68xx_device.o\
cn23xx_pf_device.o \
+   octeon_mailbox.o   \
octeon_mem_ops.o   \
octeon_droq.o  \
octeon_nic.o
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_core.c 
b/drivers/net/ethernet/cavium/liquidio/lio_core.c
index 201eddb..e6026df 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_core.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_core.c
@@ -264,3 +264,35 @@ void liquidio_link_ctrl_cmd_completion(void *nctrl_ptr)
nctrl->ncmd.s.cmd);
}
 }
+
+void octeon_pf_changed_vf_macaddr(struct octeon_device *oct, u8 *mac)
+{
+   bool macaddr_changed = false;
+   struct net_device *netdev;
+   struct lio *lio;
+
+   rtnl_lock();
+
+   netdev = oct->props[0].netdev;
+   lio = GET_LIO(netdev);
+
+   lio->linfo.macaddr_is_admin_asgnd = true;
+
+   if (!ether_addr_equal(netdev->dev_addr, mac)) {
+   macaddr_changed = true;
+   ether_addr_copy(netdev->dev_addr, mac);
+   ether_addr_copy(((u8 *)&lio->linfo.hw_addr) + 2, mac);
+   call_netdevice_notifiers(NETDEV_CHANGEADDR, netdev);
+   }
+
+   rtnl_unlock();
+
+   if (macaddr_changed)
+   dev_info(&oct->pci_dev->dev,
+"PF changed VF's MAC address to 
%02hhx:%02hhx:%02hhx:%02hhx:%02hhx:%02hhx\n",
+mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
+
+   /* no need to notify the firmware of the macaddr change because
+* the PF did that already
+*/
+}
diff --git a/drivers/net/ethernet/cavium/liquidio/liquidio_common.h 
b/drivers/net/ethernet/cavium/liquidio/liquidio_common.h
index 0d990ac..caeff9a 100644
--- a/drivers/net/ethernet/cavium/liquidio/liquidio_common.h
+++ b/drivers/net/ethernet/cavium/liquidio/liquidio_common.h
@@ -731,13 +731,15 @@ struct oct_link_info {
 
 #ifdef __BIG_ENDIAN_BITFIELD
u64 gmxport:16;
-   u64 rsvd:32;
+   u64 macaddr_is_admin_asgnd:1;
+   u64 rsvd:31;
u64 num_txpciq:8;
u64 num_rxpciq:8;
 #else
u64 num_rxpciq:8;
u64 num_txpciq:8;
-   u64 rsvd:32;
+   u64 rsvd:31;
+   u64 macaddr_is_admin_asgnd:1;
u64 gmxport:16;
 #endif
 
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_device.h 
b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
index cfd12ec..77a6eb7 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_device.h
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
@@ -492,6 +492,9 @@ struct octeon_device {
 
int msix_on;
 
+   /** Mail Box details of each octeon queue. */
+   struct octeon_mbox  *mbox[MAX_POSSIBLE_VFS];
+
/** IOq information of it's corresponding MSI-X interrupt. */
struct octeon_ioq_vector*ioq_vector;
 
@@ -511,6 +514,7 @@ struct octeon_device {
 #define  OCTEON_CN6XXX(oct)   ((oct->chip_id == OCTEON_CN66XX) || \
   (oct->chip_id == OCTEON_CN68XX))
 #define  OCTEON_CN23XX_PF(oct)(oct->chip_id == OCTEON_CN23XX_PF_VID)
+#define  OCTEON_CN23XX_VF(oct)((oct)->chip_id == OCTEON_CN23XX_VF_VID)
 #define CHIP_FIELD(oct, TYPE, field) \
(((struct octeon_ ## TYPE  *)(oct->chip))->field)
 
diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c 
b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
new file mode 100644
index 000..5309384
--- /dev/null
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
@@ -0,0 +1,318 @@

[PATCH net-next V5 8/9] liquidio CN23XX: copyrights changes and alignment

2016-11-09 Thread Raghu Vatsavayi

Updated the copyright comments and fixed the alignment of some other
comments.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 .../ethernet/cavium/liquidio/cn23xx_pf_device.c| 53 ++
 .../ethernet/cavium/liquidio/cn23xx_pf_device.h| 39 +++-
 .../net/ethernet/cavium/liquidio/cn23xx_pf_regs.h  | 39 +++-
 .../net/ethernet/cavium/liquidio/cn66xx_device.c   | 36 +++
 .../net/ethernet/cavium/liquidio/cn66xx_device.h   | 37 +++
 drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h | 37 +++
 .../net/ethernet/cavium/liquidio/cn68xx_device.c   | 36 +++
 .../net/ethernet/cavium/liquidio/cn68xx_device.h   | 37 +++
 drivers/net/ethernet/cavium/liquidio/cn68xx_regs.h | 37 +++
 drivers/net/ethernet/cavium/liquidio/lio_core.c| 36 +++
 drivers/net/ethernet/cavium/liquidio/lio_ethtool.c | 42 -
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 36 +++
 .../net/ethernet/cavium/liquidio/liquidio_common.h | 37 +++
 .../net/ethernet/cavium/liquidio/liquidio_image.h  | 36 +++
 .../net/ethernet/cavium/liquidio/octeon_config.h   | 37 +++
 .../net/ethernet/cavium/liquidio/octeon_console.c  | 43 --
 .../net/ethernet/cavium/liquidio/octeon_device.c   | 36 +++
 .../net/ethernet/cavium/liquidio/octeon_device.h   | 45 --
 drivers/net/ethernet/cavium/liquidio/octeon_droq.c | 36 +++
 drivers/net/ethernet/cavium/liquidio/octeon_droq.h | 17 +++
 drivers/net/ethernet/cavium/liquidio/octeon_iq.h   | 21 -
 drivers/net/ethernet/cavium/liquidio/octeon_main.h | 19 +++-
 .../net/ethernet/cavium/liquidio/octeon_mem_ops.c  |  5 +-
 .../net/ethernet/cavium/liquidio/octeon_mem_ops.h  |  5 +-
 .../net/ethernet/cavium/liquidio/octeon_network.h  |  5 +-
 drivers/net/ethernet/cavium/liquidio/octeon_nic.c  |  5 +-
 drivers/net/ethernet/cavium/liquidio/octeon_nic.h  |  5 +-
 .../net/ethernet/cavium/liquidio/request_manager.c |  5 +-
 .../ethernet/cavium/liquidio/response_manager.c|  5 +-
 .../ethernet/cavium/liquidio/response_manager.h|  5 +-
 30 files changed, 352 insertions(+), 480 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c 
b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
index d01b00b..962dcbc 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
@@ -1,27 +1,21 @@
 /**
-* Author: Cavium, Inc.
-*
-* Contact: supp...@cavium.com
-*  Please include "LiquidIO" in the subject.
-*
-* Copyright (c) 2003-2015 Cavium, Inc.
-*
-* This file is free software; you can redistribute it and/or modify
-* it under the terms of the GNU General Public License, Version 2, as
-* published by the Free Software Foundation.
-*
-* This file is distributed in the hope that it will be useful, but
-* AS-IS and WITHOUT ANY WARRANTY; without even the implied warranty
-* of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, TITLE, or
-* NONINFRINGEMENT.  See the GNU General Public License for more
-* details.
-*
-* This file may also be available under a different license from Cavium.
-* Contact Cavium, Inc. for more information
-**/
-
+ * Author: Cavium, Inc.
+ *
+ * Contact: supp...@cavium.com
+ *  Please include "LiquidIO" in the subject.
+ *
+ * Copyright (c) 2003-2016 Cavium, Inc.
+ *
+ * This file is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, Version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This file is distributed in the hope that it will be useful, but
+ * AS-IS and WITHOUT ANY WARRANTY; without even the implied warranty
+ * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, TITLE, or
+ * NONINFRINGEMENT.  See the GNU General Public License for more details.
+ ***/
 #include 
-#include 
 #include 
 #include 
 #include "liquidio_common.h"
@@ -421,10 +415,10 @@ static int cn23xx_pf_setup_global_input_regs(struct 
octeon_device *oct)
return -1;
 
/** Set the MAC_NUM and PVF_NUM in IQ_PKT_CONTROL reg
-   * for all queues.Only PF can set these bits.
-   * bits 29:30 indicate the MAC num.
-   * bits 32:47 indicate the PVF num.
-   */
+* for all queues.Only PF can set these bits.
+* bits 29:30 indicate the MAC num.
+* bits 32:47 indicate the PVF num.
+*/
for (q_no = 0; q_no <

[PATCH net-next V5 2/9] liquidio CN23XX: sysfs VF config support

2016-11-09 Thread Raghu Vatsavayi

Adds sysfs based support for enabling or disabling VFs.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 106 +
 .../net/ethernet/cavium/liquidio/octeon_config.h   |   3 +
 .../net/ethernet/cavium/liquidio/octeon_device.h   |   8 ++
 3 files changed, 117 insertions(+)

diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c 
b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index 71d01a7..ed4f08e 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -180,6 +180,10 @@ struct octeon_device_priv {
unsigned long napi_mask;
 };
 
+#ifdef CONFIG_PCI_IOV
+static int liquidio_enable_sriov(struct pci_dev *dev, int num_vfs);
+#endif
+
 static int octeon_device_init(struct octeon_device *);
 static int liquidio_stop(struct net_device *netdev);
 static void liquidio_remove(struct pci_dev *pdev);
@@ -518,6 +522,9 @@ static int liquidio_resume(struct pci_dev *pdev 
__attribute__((unused)))
.suspend= liquidio_suspend,
.resume = liquidio_resume,
 #endif
+#ifdef CONFIG_PCI_IOV
+   .sriov_configure = liquidio_enable_sriov,
+#endif
 };
 
 /**
@@ -1472,6 +1479,10 @@ static void octeon_destroy_resources(struct 
octeon_device *oct)
continue;
octeon_delete_instr_queue(oct, i);
}
+#ifdef CONFIG_PCI_IOV
+   if (oct->sriov_info.sriov_enabled)
+   pci_disable_sriov(oct->pci_dev);
+#endif
/* fallthrough */
case OCT_DEV_SC_BUFF_POOL_INIT_DONE:
octeon_free_sc_buffer_pool(oct);
@@ -3990,6 +4001,101 @@ static int setup_nic_devices(struct octeon_device 
*octeon_dev)
return -ENODEV;
 }
 
+#ifdef CONFIG_PCI_IOV
+static int octeon_enable_sriov(struct octeon_device *oct)
+{
+   unsigned int num_vfs_alloced = oct->sriov_info.num_vfs_alloced;
+   struct pci_dev *vfdev;
+   int err;
+   u32 u;
+
+   if (OCTEON_CN23XX_PF(oct) && num_vfs_alloced) {
+   err = pci_enable_sriov(oct->pci_dev,
+  oct->sriov_info.num_vfs_alloced);
+   if (err) {
+   dev_err(&oct->pci_dev->dev,
+   "OCTEON: Failed to enable PCI sriov: %d\n",
+   err);
+   oct->sriov_info.num_vfs_alloced = 0;
+   return err;
+   }
+   oct->sriov_info.sriov_enabled = 1;
+
+   /* init lookup table that maps DPI ring number to VF pci_dev
+* struct pointer
+*/
+   u = 0;
+   vfdev = pci_get_device(PCI_VENDOR_ID_CAVIUM,
+  OCTEON_CN23XX_VF_VID, NULL);
+   while (vfdev) {
+   if (vfdev->is_virtfn &&
+   (vfdev->physfn == oct->pci_dev)) {
+   oct->sriov_info.dpiring_to_vfpcidev_lut[u] =
+   vfdev;
+   u += oct->sriov_info.rings_per_vf;
+   }
+   vfdev = pci_get_device(PCI_VENDOR_ID_CAVIUM,
+  OCTEON_CN23XX_VF_VID, vfdev);
+   }
+   }
+
+   return num_vfs_alloced;
+}
+
+static int lio_pci_sriov_disable(struct octeon_device *oct)
+{
+   int u;
+
+   if (pci_vfs_assigned(oct->pci_dev)) {
+   dev_err(&oct->pci_dev->dev, "VFs are still assigned to VMs.\n");
+   return -EPERM;
+   }
+
+   pci_disable_sriov(oct->pci_dev);
+
+   u = 0;
+   while (u < MAX_POSSIBLE_VFS) {
+   oct->sriov_info.dpiring_to_vfpcidev_lut[u] = NULL;
+   u += oct->sriov_info.rings_per_vf;
+   }
+
+   oct->sriov_info.num_vfs_alloced = 0;
+   dev_info(&oct->pci_dev->dev, "oct->pf_num:%d disabled VFs\n",
+oct->pf_num);
+
+   return 0;
+}
+
+static int liquidio_enable_sriov(struct pci_dev *dev, int num_vfs)
+{
+   struct octeon_device *oct = pci_get_drvdata(dev);
+   int ret = 0;
+
+   if ((num_vfs == oct->sriov_info.num_vfs_alloced) &&
+   (oct->sriov_info.sriov_enabled)) {
+   dev_info(>pci_dev->dev, "oct->pf_num:%d already enabled 
num_vfs:%d\n",
+oct->pf_num, num_vfs);
+   return 0;
+   }
+
+   if (!num_vfs) {
+   ret = lio_pci_sriov_disable(oct);
+   } else if (num_vfs > oct->sriov_info.max_vfs) {
+   dev_err(&oct->pci_dev->dev,
+   "OCTEON: Max allowed VFs:%d user requested:%d",
+

[PATCH net-next V5 7/9] liquidio CN23XX: code cleanup

2016-11-09 Thread Raghu Vatsavayi

Cleaned up unnecessary comments and added some minor macros.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 drivers/net/ethernet/cavium/liquidio/cn66xx_device.c   | 13 -
 drivers/net/ethernet/cavium/liquidio/cn66xx_device.h   |  4 ++--
 drivers/net/ethernet/cavium/liquidio/lio_ethtool.c | 14 --
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 16 +---
 drivers/net/ethernet/cavium/liquidio/liquidio_common.h |  2 --
 drivers/net/ethernet/cavium/liquidio/octeon_device.c   |  8 
 drivers/net/ethernet/cavium/liquidio/octeon_droq.c |  2 +-
 drivers/net/ethernet/cavium/liquidio/octeon_droq.h |  1 -
 drivers/net/ethernet/cavium/liquidio/octeon_iq.h   |  1 -
 drivers/net/ethernet/cavium/liquidio/octeon_main.h | 18 --
 drivers/net/ethernet/cavium/liquidio/request_manager.c |  7 ++-
 .../net/ethernet/cavium/liquidio/response_manager.c|  6 +-
 .../net/ethernet/cavium/liquidio/response_manager.h|  1 -
 13 files changed, 23 insertions(+), 70 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/cn66xx_device.c 
b/drivers/net/ethernet/cavium/liquidio/cn66xx_device.c
index e779af8..1ebc225 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn66xx_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/cn66xx_device.c
@@ -275,7 +275,6 @@ void lio_cn6xxx_setup_iq_regs(struct octeon_device *oct, 
u32 iq_no)
 {
struct octeon_instr_queue *iq = oct->instr_queue[iq_no];
 
-   /* Disable Packet-by-Packet mode; No Parse Mode or Skip length */
octeon_write_csr64(oct, CN6XXX_SLI_IQ_PKT_INSTR_HDR64(iq_no), 0);
 
/* Write the start of the input queue's ring and its size  */
@@ -378,7 +377,7 @@ void lio_cn6xxx_disable_io_queues(struct octeon_device *oct)
 
/* Reset the doorbell register for each Input queue. */
for (i = 0; i < MAX_OCTEON_INSTR_QUEUES(oct); i++) {
-   if (!(oct->io_qmask.iq & (1ULL << i)))
+   if (!(oct->io_qmask.iq & BIT_ULL(i)))
continue;
octeon_write_csr(oct, CN6XXX_SLI_IQ_DOORBELL(i), 0x);
d32 = octeon_read_csr(oct, CN6XXX_SLI_IQ_DOORBELL(i));
@@ -400,9 +399,8 @@ void lio_cn6xxx_disable_io_queues(struct octeon_device *oct)
;
 
/* Reset the doorbell register for each Output queue. */
-   /* for (i = 0; i < oct->num_oqs; i++) { */
for (i = 0; i < MAX_OCTEON_OUTPUT_QUEUES(oct); i++) {
-   if (!(oct->io_qmask.oq & (1ULL << i)))
+   if (!(oct->io_qmask.oq & BIT_ULL(i)))
continue;
octeon_write_csr(oct, CN6XXX_SLI_OQ_PKTS_CREDIT(i), 0x);
d32 = octeon_read_csr(oct, CN6XXX_SLI_OQ_PKTS_CREDIT(i));
@@ -537,15 +535,14 @@ static int lio_cn6xxx_process_droq_intr_regs(struct 
octeon_device *oct)
 
oct->droq_intr = 0;
 
-   /* for (oq_no = 0; oq_no < oct->num_oqs; oq_no++) { */
for (oq_no = 0; oq_no < MAX_OCTEON_OUTPUT_QUEUES(oct); oq_no++) {
-   if (!(droq_mask & (1ULL << oq_no)))
+   if (!(droq_mask & BIT_ULL(oq_no)))
continue;
 
droq = oct->droq[oq_no];
pkt_count = octeon_droq_check_hw_for_pkts(droq);
if (pkt_count) {
-   oct->droq_intr |= (1ULL << oq_no);
+   oct->droq_intr |= BIT_ULL(oq_no);
if (droq->ops.poll_mode) {
u32 value;
u32 reg;
@@ -721,8 +718,6 @@ int lio_setup_cn66xx_octeon_device(struct octeon_device 
*oct)
 int lio_validate_cn6xxx_config_info(struct octeon_device *oct,
struct octeon_config *conf6xxx)
 {
-   /* int total_instrs = 0; */
-
if (CFG_GET_IQ_MAX_Q(conf6xxx) > CN6XXX_MAX_INPUT_QUEUES) {
dev_err(&oct->pci_dev->dev, "%s: Num IQ (%d) exceeds Max (%d)\n",
__func__, CFG_GET_IQ_MAX_Q(conf6xxx),
diff --git a/drivers/net/ethernet/cavium/liquidio/cn66xx_device.h 
b/drivers/net/ethernet/cavium/liquidio/cn66xx_device.h
index a40a913..32fbbb2 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn66xx_device.h
+++ b/drivers/net/ethernet/cavium/liquidio/cn66xx_device.h
@@ -96,8 +96,8 @@ void lio_cn6xxx_setup_reg_address(struct octeon_device *oct, 
void *chip,
  struct octeon_reg_list *reg_list);
 u32 lio_cn6xxx_coprocessor_clock(struct octeon_device *oct);
 u32 lio_cn6xxx_get_oq_ticks(struct octeon_device *oct, u32 time_intr_in_us);
-int lio_setup_cn66xx_octeon_device(struct octeon_device *);
+int lio_setup_cn66xx_octeon_device(struct octeon_device *oct);
 int

Re: [PATCH 2/2] [nf-next] netfilter: fix NF_REPEAT handling

2016-11-09 Thread Pablo Neira Ayuso

On Tue, Nov 08, 2016 at 02:28:19PM +0100, Arnd Bergmann wrote:
> gcc correctly identified a theoretical uninitialized variable use:
> 
> net/netfilter/nf_conntrack_core.c: In function 'nf_conntrack_in':
> net/netfilter/nf_conntrack_core.c:1125:14: error: 'l4proto' may be used 
> uninitialized in this function [-Werror=maybe-uninitialized]
> 
> This could only happen when we 'goto out' before looking up l4proto,
> and then enter the retry, implying that l3proto->get_l4proto()
> returned NF_REPEAT. This does not currently get returned in any
> code path and probably won't ever happen, but is not good to
> rely on.
> 
> Moving the repeat handling up a little should have the same
> behavior as today but avoids the warning by making that case
> impossible to enter.
> 
> Fixes: 08733a0cb7de ("netfilter: handle NF_REPEAT from nf_conntrack_in()")
> Signed-off-by: Arnd Bergmann 
> ---
> The patch causing this is currently only in nf-next, and not yet
> in net-next.
> ---
>  net/netfilter/nf_conntrack_core.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/net/netfilter/nf_conntrack_core.c 
> b/net/netfilter/nf_conntrack_core.c
> index de4b8a75f30b..610c9de0ce18 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -1337,6 +1337,8 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned 
> int hooknum,
>   NF_CT_STAT_INC_ATOMIC(net, invalid);
>   if (ret == -NF_DROP)
>   NF_CT_STAT_INC_ATOMIC(net, drop);
> + if (ret == -NF_REPEAT && tmpl)
> + goto repeat;

This is my fault, I'm going to mangle this patch since 08733a0cb7de
really broke the NF_REPEAT handling. We should unconditionally jump
back to repeat if we get NF_REPEAT, no matter if the template is set
or not. I'll include a side note on this mangling.

>   ret = -ret;
>   goto out;
>   }
> @@ -1349,10 +1351,7 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned 
> int hooknum,
>* closed/aborted connection. We have to go back and create a
>* fresh conntrack.
>*/

I'm going to move the comment above on top of the NF_REPEAT check, so
it stays around as context.

BTW, the revamped patch looks like the one attached.

Thanks a lot for addressing this fallout.
diff --git a/net/netfilter/nf_conntrack_core.c 
b/net/netfilter/nf_conntrack_core.c
index de4b8a75f30b..e9ffe33dc0ca 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1337,6 +1337,12 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned 
int hooknum,
NF_CT_STAT_INC_ATOMIC(net, invalid);
if (ret == -NF_DROP)
NF_CT_STAT_INC_ATOMIC(net, drop);
+   /* Special case: TCP tracker reports an attempt to reopen a
+* closed/aborted connection. We have to go back and create a
+* fresh conntrack.
+*/
+   if (ret == -NF_REPEAT)
+   goto repeat;
ret = -ret;
goto out;
}
@@ -1344,16 +1350,8 @@ nf_conntrack_in(struct net *net, u_int8_t pf, unsigned 
int hooknum,
if (set_reply && !test_and_set_bit(IPS_SEEN_REPLY_BIT, &ct->status))
nf_conntrack_event_cache(IPCT_REPLY, ct);
 out:
-   if (tmpl) {
-   /* Special case: TCP tracker reports an attempt to reopen a
-* closed/aborted connection. We have to go back and create a
-* fresh conntrack.
-*/
-   if (ret == NF_REPEAT)
-   goto repeat;
-   else
-   nf_ct_put(tmpl);
-   }
+   if (tmpl)
+   nf_ct_put(tmpl);
 
return ret;
 }
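
The control flow of the revamped patch can be modelled outside the kernel. The following is only a sketch under stated assumptions: `model_track()`, `model_conntrack_in()`, `struct model_state` and the `M_*` constants are hypothetical stand-ins, not netfilter symbols. It shows the two properties discussed above: the retry happens whether or not a template is set, and the template is released once, after any retries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical verdict values for this model only. */
#define M_ACCEPT 1
#define M_REPEAT 4

struct model_state {
	int attempts;	/* how often the tracker was invoked */
	int tmpl_puts;	/* how often the template was released */
};

/* Hypothetical tracker: asks for one retry, then accepts. */
static int model_track(struct model_state *st)
{
	st->attempts++;
	return st->attempts == 1 ? -M_REPEAT : M_ACCEPT;
}

static int model_conntrack_in(struct model_state *st, bool have_tmpl)
{
	int ret;

repeat:
	ret = model_track(st);
	if (ret <= 0) {
		/* Retry unconditionally, before the template is touched. */
		if (ret == -M_REPEAT)
			goto repeat;
		ret = -ret;
		goto out;
	}
out:
	/* The template, if any, is released once on the way out. */
	if (have_tmpl)
		st->tmpl_puts++;
	return ret;
}
```

With a zero-initialised `struct model_state`, one call makes two tracker attempts and returns `M_ACCEPT`, with or without a template.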

[PATCH net-next V5 0/9] liquidio CN23XX VF support

2016-11-09 Thread Raghu Vatsavayi

Dave,

Following is the V5 patch series for adding VF support on
CN23XX devices. This version addressed:
1) Your concern for ordering of local variable declarations
   from longest to shortest line.
2) Removed module parameters max_vfs, num_queues_per_{p,v}f.
3) Minor changes for fixing new checkpatch script related 
   errors on pre-existing driver.
4) Fixed compilation issues when CONFIG_PCI_IOV/CONFIG_PCI_ATS
   options are disabled.

I will post the remaining VF patches soon after this patch series is
applied. Please apply the patches in the following order, as some of
them depend on earlier patches.

Thanks.

Raghu Vatsavayi (9):
  liquidio CN23XX: HW config for VF support
  liquidio CN23XX: sysfs VF config support
  liquidio CN23XX: Mailbox support
  liquidio CN23XX: mailbox interrupt processing
  liquidio CN23XX: VF related operations
  liquidio CN23XX: device states
  liquidio CN23XX: code cleanup
  liquidio CN23XX: copyrights changes and alignment
  liquidio CN23XX: fix for new check patch errors

 drivers/net/ethernet/cavium/liquidio/Makefile  |   1 +
 .../ethernet/cavium/liquidio/cn23xx_pf_device.c| 322 +---
 .../ethernet/cavium/liquidio/cn23xx_pf_device.h|  44 +--
 .../net/ethernet/cavium/liquidio/cn23xx_pf_regs.h  |  51 ++-
 .../net/ethernet/cavium/liquidio/cn66xx_device.c   |  49 +--
 .../net/ethernet/cavium/liquidio/cn66xx_device.h   |  41 +-
 drivers/net/ethernet/cavium/liquidio/cn66xx_regs.h |  49 ++-
 .../net/ethernet/cavium/liquidio/cn68xx_device.c   |  38 +-
 .../net/ethernet/cavium/liquidio/cn68xx_device.h   |  37 +-
 drivers/net/ethernet/cavium/liquidio/cn68xx_regs.h |  37 +-
 drivers/net/ethernet/cavium/liquidio/lio_core.c|  68 +++-
 drivers/net/ethernet/cavium/liquidio/lio_ethtool.c |  65 ++--
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 429 ++---
 .../net/ethernet/cavium/liquidio/liquidio_common.h | 100 +++--
 .../net/ethernet/cavium/liquidio/liquidio_image.h  |  36 +-
 .../net/ethernet/cavium/liquidio/octeon_config.h   |  46 ++-
 .../net/ethernet/cavium/liquidio/octeon_console.c  | 156 
 .../net/ethernet/cavium/liquidio/octeon_device.c   |  79 ++--
 .../net/ethernet/cavium/liquidio/octeon_device.h   | 138 ---
 drivers/net/ethernet/cavium/liquidio/octeon_droq.c |  91 +++--
 drivers/net/ethernet/cavium/liquidio/octeon_droq.h |  18 +-
 drivers/net/ethernet/cavium/liquidio/octeon_iq.h   |  25 +-
 .../net/ethernet/cavium/liquidio/octeon_mailbox.c  | 318 +++
 .../net/ethernet/cavium/liquidio/octeon_mailbox.h  | 115 ++
 drivers/net/ethernet/cavium/liquidio/octeon_main.h |  47 +--
 .../net/ethernet/cavium/liquidio/octeon_mem_ops.c  |   7 +-
 .../net/ethernet/cavium/liquidio/octeon_mem_ops.h  |   5 +-
 .../net/ethernet/cavium/liquidio/octeon_network.h  |  11 +-
 drivers/net/ethernet/cavium/liquidio/octeon_nic.c  |   5 +-
 drivers/net/ethernet/cavium/liquidio/octeon_nic.h  |   7 +-
 .../net/ethernet/cavium/liquidio/request_manager.c |  34 +-
 .../ethernet/cavium/liquidio/response_manager.c|  11 +-
 .../ethernet/cavium/liquidio/response_manager.h|   6 +-
 33 files changed, 1688 insertions(+), 798 deletions(-)
 create mode 100644 drivers/net/ethernet/cavium/liquidio/octeon_mailbox.c
 create mode 100644 drivers/net/ethernet/cavium/liquidio/octeon_mailbox.h

-- 
1.8.3.1

[PATCH net-next V5 1/9] liquidio CN23XX: HW config for VF support

2016-11-09 Thread Raghu Vatsavayi

Adds support for configuring HW for creating VFs.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 .../ethernet/cavium/liquidio/cn23xx_pf_device.c| 90 ++
 .../net/ethernet/cavium/liquidio/octeon_config.h   |  6 ++
 .../net/ethernet/cavium/liquidio/octeon_device.h   | 12 ++-
 3 files changed, 74 insertions(+), 34 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c 
b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
index 380a641..832d710 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
@@ -40,11 +40,6 @@
  */
 #define CN23XX_INPUT_JABBER 64600
 
-#define LIOLUT_RING_DISTRIBUTION 9
-const int liolut_num_vfs_to_rings_per_vf[LIOLUT_RING_DISTRIBUTION] = {
-   0, 8, 4, 2, 2, 2, 1, 1, 1
-};
-
 void cn23xx_dump_pf_initialized_regs(struct octeon_device *oct)
 {
int i = 0;
@@ -309,9 +304,10 @@ u32 cn23xx_pf_get_oq_ticks(struct octeon_device *oct, u32 
time_intr_in_us)
 
 static void cn23xx_setup_global_mac_regs(struct octeon_device *oct)
 {
-   u64 reg_val;
u16 mac_no = oct->pcie_port;
u16 pf_num = oct->pf_num;
+   u64 reg_val;
+   u64 temp;
 
/* programming SRN and TRS for each MAC(0..3)  */
 
@@ -333,6 +329,14 @@ static void cn23xx_setup_global_mac_regs(struct 
octeon_device *oct)
/* setting TRS <23:16> */
reg_val = reg_val |
  (oct->sriov_info.trs << CN23XX_PKT_MAC_CTL_RINFO_TRS_BIT_POS);
+   /* setting RPVF <39:32> */
+   temp = oct->sriov_info.rings_per_vf & 0xff;
+   reg_val |= (temp << CN23XX_PKT_MAC_CTL_RINFO_RPVF_BIT_POS);
+
+   /* setting NVFS <55:48> */
+   temp = oct->sriov_info.max_vfs & 0xff;
+   reg_val |= (temp << CN23XX_PKT_MAC_CTL_RINFO_NVFS_BIT_POS);
+
/* write these settings to MAC register */
octeon_write_csr64(oct, CN23XX_SLI_PKT_MAC_RINFO64(mac_no, pf_num),
   reg_val);
@@ -399,11 +403,12 @@ static int cn23xx_reset_io_queues(struct octeon_device 
*oct)
 
 static int cn23xx_pf_setup_global_input_regs(struct octeon_device *oct)
 {
+   struct octeon_cn23xx_pf *cn23xx = (struct octeon_cn23xx_pf *)oct->chip;
+   struct octeon_instr_queue *iq;
+   u64 intr_threshold, reg_val;
u32 q_no, ern, srn;
u64 pf_num;
-   u64 intr_threshold, reg_val;
-   struct octeon_instr_queue *iq;
-   struct octeon_cn23xx_pf *cn23xx = (struct octeon_cn23xx_pf *)oct->chip;
+   u64 vf_num;
 
pf_num = oct->pf_num;
 
@@ -420,6 +425,16 @@ static int cn23xx_pf_setup_global_input_regs(struct 
octeon_device *oct)
*/
for (q_no = 0; q_no < ern; q_no++) {
reg_val = oct->pcie_port << CN23XX_PKT_INPUT_CTL_MAC_NUM_POS;
+
+   /* for VF assigned queues. */
+   if (q_no < oct->sriov_info.pf_srn) {
+   vf_num = q_no / oct->sriov_info.rings_per_vf;
+   vf_num += 1; /* VF1, VF2, */
+   } else {
+   vf_num = 0;
+   }
+
+   reg_val |= vf_num << CN23XX_PKT_INPUT_CTL_VF_NUM_POS;
reg_val |= pf_num << CN23XX_PKT_INPUT_CTL_PF_NUM_POS;
 
octeon_write_csr64(oct, CN23XX_SLI_IQ_PKT_CONTROL64(q_no),
@@ -1048,50 +1063,59 @@ static void cn23xx_setup_reg_address(struct 
octeon_device *oct)
 
 static int cn23xx_sriov_config(struct octeon_device *oct)
 {
-   u32 total_rings;
struct octeon_cn23xx_pf *cn23xx = (struct octeon_cn23xx_pf *)oct->chip;
-   /* num_vfs is already filled for us */
+   u32 max_rings, total_rings, max_vfs, rings_per_vf;
u32 pf_srn, num_pf_rings;
+   u32 max_possible_vfs;
 
cn23xx->conf =
-   (struct octeon_config *)oct_get_config_info(oct, LIO_23XX);
+   (struct octeon_config *)oct_get_config_info(oct, LIO_23XX);
switch (oct->rev_id) {
case OCTEON_CN23XX_REV_1_0:
-   total_rings = CN23XX_MAX_RINGS_PER_PF_PASS_1_0;
+   max_rings = CN23XX_MAX_RINGS_PER_PF_PASS_1_0;
+   max_possible_vfs = CN23XX_MAX_VFS_PER_PF_PASS_1_0;
break;
case OCTEON_CN23XX_REV_1_1:
-   total_rings = CN23XX_MAX_RINGS_PER_PF_PASS_1_1;
+   max_rings = CN23XX_MAX_RINGS_PER_PF_PASS_1_1;
+   max_possible_vfs = CN23XX_MAX_VFS_PER_PF_PASS_1_1;
break;
default:
-   total_rings = CN23XX_MAX_RINGS_PER_PF;
+   max_rings = CN23XX_MAX_RINGS_PER_PF;
+   max_possible_vfs = CN23XX_MAX_VFS_PER_PF;
break;
}
-   if (!oct->sriov_info.num_pf_rings) {
-   if (total_rings >

[PATCH net-next V5 5/9] liquidio CN23XX: VF related operations

2016-11-09 Thread Raghu Vatsavayi

Adds support for VF-related operations such as MAC address, VLAN
and link changes.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 .../ethernet/cavium/liquidio/cn23xx_pf_device.c|  22 +++
 .../ethernet/cavium/liquidio/cn23xx_pf_device.h|   5 +
 drivers/net/ethernet/cavium/liquidio/lio_main.c| 211 +
 .../net/ethernet/cavium/liquidio/liquidio_common.h |   5 +
 .../net/ethernet/cavium/liquidio/octeon_device.h   |   8 +
 5 files changed, 251 insertions(+)

diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c 
b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
index ffc94ac..d01b00b 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "liquidio_common.h"
 #include "octeon_droq.h"
 #include "octeon_iq.h"
@@ -1416,3 +1417,24 @@ int cn23xx_fw_loaded(struct octeon_device *oct)
val = octeon_read_csr64(oct, CN23XX_SLI_SCRATCH1);
return (val >> 1) & 1ULL;
 }
+
+void cn23xx_tell_vf_its_macaddr_changed(struct octeon_device *oct, int vfidx,
+   u8 *mac)
+{
+   if (oct->sriov_info.vf_drv_loaded_mask & BIT_ULL(vfidx)) {
+   struct octeon_mbox_cmd mbox_cmd;
+
+   mbox_cmd.msg.u64 = 0;
+   mbox_cmd.msg.s.type = OCTEON_MBOX_REQUEST;
+   mbox_cmd.msg.s.resp_needed = 0;
+   mbox_cmd.msg.s.cmd = OCTEON_PF_CHANGED_VF_MACADDR;
+   mbox_cmd.msg.s.len = 1;
+   mbox_cmd.recv_len = 0;
+   mbox_cmd.recv_status = 0;
+   mbox_cmd.fn = NULL;
+   mbox_cmd.fn_arg = 0;
+   ether_addr_copy(mbox_cmd.msg.s.params, mac);
+   mbox_cmd.q_no = vfidx * oct->sriov_info.rings_per_vf;
+   octeon_mbox_write(oct, &mbox_cmd);
+   }
+}
diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.h 
b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.h
index 21b5c90..cee346a 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.h
+++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.h
@@ -29,6 +29,8 @@
 
 #include "cn23xx_pf_regs.h"
 
+#define LIO_CMD_WAIT_TM 100
+
 /* Register address and configuration for a CN23XX devices.
  * If device specific changes need to be made then add a struct to include
  * device specific fields as shown in the commented section
@@ -56,4 +58,7 @@ int validate_cn23xx_pf_config_info(struct octeon_device *oct,
 void cn23xx_dump_pf_initialized_regs(struct octeon_device *oct);
 
 int cn23xx_fw_loaded(struct octeon_device *oct);
+
+void cn23xx_tell_vf_its_macaddr_changed(struct octeon_device *oct, int vfidx,
+   u8 *mac);
 #endif
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c 
b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index f776808..6e435db 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -3573,6 +3573,151 @@ static void liquidio_del_vxlan_port(struct net_device 
*netdev,
OCTNET_CMD_VXLAN_PORT_DEL);
 }
 
+static int __liquidio_set_vf_mac(struct net_device *netdev, int vfidx,
+u8 *mac, bool is_admin_assigned)
+{
+   struct lio *lio = GET_LIO(netdev);
+   struct octeon_device *oct = lio->oct_dev;
+   struct octnic_ctrl_pkt nctrl;
+
+   if (!is_valid_ether_addr(mac))
+   return -EINVAL;
+
+   if (vfidx < 0 || vfidx >= oct->sriov_info.max_vfs)
+   return -EINVAL;
+
+   memset(&nctrl, 0, sizeof(struct octnic_ctrl_pkt));
+
+   nctrl.ncmd.u64 = 0;
+   nctrl.ncmd.s.cmd = OCTNET_CMD_CHANGE_MACADDR;
+   /* vfidx is 0 based, but vf_num (param1) is 1 based */
+   nctrl.ncmd.s.param1 = vfidx + 1;
+   nctrl.ncmd.s.param2 = (is_admin_assigned ? 1 : 0);
+   nctrl.ncmd.s.more = 1;
+   nctrl.iq_no = lio->linfo.txpciq[0].s.q_no;
+   nctrl.cb_fn = 0;
+   nctrl.wait_time = LIO_CMD_WAIT_TM;
+
+   nctrl.udd[0] = 0;
+   /* The MAC Address is presented in network byte order. */
+   ether_addr_copy((u8 *)&nctrl.udd[0] + 2, mac);
+
+   oct->sriov_info.vf_macaddr[vfidx] = nctrl.udd[0];
+
+   octnet_send_nic_ctrl_pkt(oct, &nctrl);
+
+   return 0;
+}
+
+static int liquidio_set_vf_mac(struct net_device *netdev, int vfidx, u8 *mac)
+{
+   struct lio *lio = GET_LIO(netdev);
+   struct octeon_device *oct = lio->oct_dev;
+   int retval;
+
+   retval = __liquidio_set_vf_mac(netdev, vfidx, mac, true);
+   if (!retval)
+   cn23xx_tell_vf_its_macaddr_changed(oct, vfidx, mac);
+
+   return retval;
+}
+

[PATCH net-next V5 4/9] liquidio CN23XX: mailbox interrupt processing

2016-11-09 Thread Raghu Vatsavayi

Adds support for mailbox interrupt processing of various
commands.

Signed-off-by: Raghu Vatsavayi 
Signed-off-by: Derek Chickles 
Signed-off-by: Satanand Burla 
Signed-off-by: Felix Manlunas 
---
 .../ethernet/cavium/liquidio/cn23xx_pf_device.c| 157 +
 drivers/net/ethernet/cavium/liquidio/lio_main.c|  12 ++
 .../net/ethernet/cavium/liquidio/octeon_device.c   |   1 +
 .../net/ethernet/cavium/liquidio/octeon_device.h   |  21 ++-
 4 files changed, 184 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c 
b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
index 832d710..ffc94ac 100644
--- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
+++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c
@@ -30,6 +30,7 @@
 #include "octeon_device.h"
 #include "cn23xx_pf_device.h"
 #include "octeon_main.h"
+#include "octeon_mailbox.h"
 
 #define RESET_NOTDONE 0
 #define RESET_DONE 1
@@ -677,6 +678,118 @@ static void cn23xx_setup_oq_regs(struct octeon_device 
*oct, u32 oq_no)
}
 }
 
+static void cn23xx_pf_mbox_thread(struct work_struct *work)
+{
+   struct cavium_wk *wk = (struct cavium_wk *)work;
+   struct octeon_mbox *mbox = (struct octeon_mbox *)wk->ctxptr;
+   struct octeon_device *oct = mbox->oct_dev;
+   u64 mbox_int_val, val64;
+   u32 q_no, i;
+
+   if (oct->rev_id < OCTEON_CN23XX_REV_1_1) {
+   /*read and clear by writing 1*/
+   mbox_int_val = readq(mbox->mbox_int_reg);
+   writeq(mbox_int_val, mbox->mbox_int_reg);
+
+   for (i = 0; i < oct->sriov_info.num_vfs_alloced; i++) {
+   q_no = i * oct->sriov_info.rings_per_vf;
+
+   val64 = readq(oct->mbox[q_no]->mbox_write_reg);
+
+   if (val64 && (val64 != OCTEON_PFVFACK)) {
+   if (octeon_mbox_read(oct->mbox[q_no]))
+   octeon_mbox_process_message(
+   oct->mbox[q_no]);
+   }
+   }
+
+   schedule_delayed_work(&wk->work, msecs_to_jiffies(10));
+   } else {
+   octeon_mbox_process_message(mbox);
+   }
+}
+
+static int cn23xx_setup_pf_mbox(struct octeon_device *oct)
+{
+   struct octeon_mbox *mbox = NULL;
+   u16 mac_no = oct->pcie_port;
+   u16 pf_num = oct->pf_num;
+   u32 q_no, i;
+
+   if (!oct->sriov_info.max_vfs)
+   return 0;
+
+   for (i = 0; i < oct->sriov_info.max_vfs; i++) {
+   q_no = i * oct->sriov_info.rings_per_vf;
+
+   mbox = vmalloc(sizeof(*mbox));
+   if (!mbox)
+   goto free_mbox;
+
+   memset(mbox, 0, sizeof(struct octeon_mbox));
+
+   spin_lock_init(&mbox->lock);
+
+   mbox->oct_dev = oct;
+
+   mbox->q_no = q_no;
+
+   mbox->state = OCTEON_MBOX_STATE_IDLE;
+
+   /* PF mbox interrupt reg */
+   mbox->mbox_int_reg = (u8 *)oct->mmio[0].hw_addr +
+CN23XX_SLI_MAC_PF_MBOX_INT(mac_no, pf_num);
+
+   /* PF writes into SIG0 reg */
+   mbox->mbox_write_reg = (u8 *)oct->mmio[0].hw_addr +
+  CN23XX_SLI_PKT_PF_VF_MBOX_SIG(q_no, 0);
+
+   /* PF reads from SIG1 reg */
+   mbox->mbox_read_reg = (u8 *)oct->mmio[0].hw_addr +
+ CN23XX_SLI_PKT_PF_VF_MBOX_SIG(q_no, 1);
+
+   /*Mail Box Thread creation*/
+   INIT_DELAYED_WORK(&mbox->mbox_poll_wk.work,
+ cn23xx_pf_mbox_thread);
+   mbox->mbox_poll_wk.ctxptr = (void *)mbox;
+
+   oct->mbox[q_no] = mbox;
+
+   writeq(OCTEON_PFVFSIG, mbox->mbox_read_reg);
+   }
+
+   if (oct->rev_id < OCTEON_CN23XX_REV_1_1)
+   schedule_delayed_work(&oct->mbox[0]->mbox_poll_wk.work,
+ msecs_to_jiffies(0));
+
+   return 0;
+
+free_mbox:
+   while (i) {
+   i--;
+   vfree(oct->mbox[i]);
+   }
+
+   return 1;
+}
+
+static int cn23xx_free_pf_mbox(struct octeon_device *oct)
+{
+   u32 q_no, i;
+
+   if (!oct->sriov_info.max_vfs)
+   return 0;
+
+   for (i = 0; i < oct->sriov_info.max_vfs; i++) {
+   q_no = i * oct->sriov_info.rings_per_vf;
+   cancel_delayed_work_sync(
+   &oct->mbox[q_no]->mbox_poll_wk.work);
+   vfree(oct->mbox[q_no]);
+   }
+
+   return 0;
+}
+
 static int cn23xx_enable_io_queues(struct octeon_device *oct)
 {
u64 reg_val;
@@ -871,6 +984,29 @@ static u64 cn23xx_pf_msix_interrupt_handler(void *dev)
return

Re: [PATCH 1/2] [net-next] udp: provide udp{4,6}_lib_lookup for nf_socket_ipv{4,6}

2016-11-09 Thread Pablo Neira Ayuso

On Tue, Nov 08, 2016 at 02:28:18PM +0100, Arnd Bergmann wrote:
> Since commit ca065d0cf80f ("udp: no longer use SLAB_DESTROY_BY_RCU")
> the udp6_lib_lookup and udp4_lib_lookup functions are only
> provided when it is actually possible to call them.
> 
> However, moving the callers now caused a link error:
> 
> net/built-in.o: In function `nf_sk_lookup_slow_v6':
> (.text+0x131a39): undefined reference to `udp6_lib_lookup'
> net/ipv4/netfilter/nf_socket_ipv4.o: In function `nf_sk_lookup_slow_v4':
> nf_socket_ipv4.c:(.text.nf_sk_lookup_slow_v4+0x114): undefined reference to 
> `udp4_lib_lookup'
> 
> This extends the #ifdef so we also provide the functions when
> CONFIG_NF_SOCKET_IPV4 or CONFIG_NF_SOCKET_IPV6, respectively
> are set.

Applied, thanks Arnd!

Re: [bug report] genetlink: fix error return code in genl_register_family()

2016-11-09 Thread Cong Wang

Cc'ing netdev

On Wed, Nov 9, 2016 at 1:56 AM, Dan Carpenter  wrote:
> Hello Wei Yongjun,
>
> The patch 22ca904ad70a: "genetlink: fix error return code in
> genl_register_family()" from Nov 1, 2016, leads to the following
> static checker warning:
>
> net/netlink/genetlink.c:365 genl_register_family()
> warn: unsigned 'family->id' is never less than zero.
>
> net/netlink/genetlink.c
>362
>363  family->id = idr_alloc(&genl_fam_idr, family,
>364 start, end + 1, GFP_KERNEL);
>365  if (family->id < 0) {
> ^^
> Doesn't work for unsigned int.


family->id should be signed int.
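
The effect behind the checker warning can be reproduced in plain C. The following is a minimal sketch, not the genetlink code: `fake_alloc()` is a hypothetical stand-in for an allocator that, like `idr_alloc()`, returns a non-negative id or a negative errno, and `struct family` is an illustrative type with a signed `id` as suggested above. `id_if_unsigned()` shows what an unsigned field would hold after a failure.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical allocator: returns an id >= 0, or a negative errno. */
static int fake_alloc(int should_fail)
{
	return should_fail ? -ENOSPC : 7;
}

/* Illustrative type: with a signed id the error check works. */
struct family {
	int id;
};

static int register_family(struct family *f, int should_fail)
{
	f->id = fake_alloc(should_fail);
	if (f->id < 0)
		return f->id;	/* the negative errno is still visible */
	return 0;
}

/* What an unsigned field would hold after a failed allocation:
 * the negative errno wraps to a large positive value, so a
 * "less than zero" test on it can never be true.
 */
static unsigned int id_if_unsigned(int should_fail)
{
	return (unsigned int)fake_alloc(should_fail);
}
```

Changing the field type is one way to keep the check meaningful; allocating into a signed local variable and assigning the field only on success would be another.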

[PATCH 01/17] batman-adv: Introduce missing headers for genetlink restructure

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

Fixes: 56989f6d8568 ("genetlink: mark families as __ro_after_init")
Fixes: 2ae0f17df1cd ("genetlink: use idr to track families")
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/netlink.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/batman-adv/netlink.c b/net/batman-adv/netlink.c
index 005012b..2171281 100644
--- a/net/batman-adv/netlink.c
+++ b/net/batman-adv/netlink.c
@@ -20,11 +20,14 @@
 
 #include 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.10.1

[PATCH 02/17] batman-adv: Mark batadv_netlink_ops as const

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

The genl_ops don't need to be written by anyone and can thus be moved into
a read-only memory range.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/netlink.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/netlink.c b/net/batman-adv/netlink.c
index 2171281..0627381 100644
--- a/net/batman-adv/netlink.c
+++ b/net/batman-adv/netlink.c
@@ -530,7 +530,7 @@ batadv_netlink_dump_hardifs(struct sk_buff *msg, struct 
netlink_callback *cb)
return msg->len;
 }
 
-static struct genl_ops batadv_netlink_ops[] = {
+static const struct genl_ops batadv_netlink_ops[] = {
{
.cmd = BATADV_CMD_GET_MESH_INFO,
.flags = GENL_ADMIN_PERM,
-- 
2.10.1

[PATCH 09/17] batman-adv: use consume_skb for non-dropped packets

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

kfree_skb assumes that an skb is dropped after a failure and notes that.
consume_skb should be used in non-failure situations. Such information is
important for dropmonitor netlink which tells how many packets were dropped
and where this drop happened.
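
The distinction can be illustrated with a small userspace model. This is only a sketch: `struct pkt`, `model_kfree()`, `model_consume()`, `pkt_release()` and the `drop_events` counter are hypothetical names, not kernel APIs. The point is the shape of the helper: the caller states whether the packet is going away because of a failure, and only that case is counted as a drop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical packet type for this model. */
struct pkt {
	int len;
};

/* What a drop monitor would count in this model. */
static unsigned long drop_events;

/* Models kfree_skb(): free the packet and record a drop. */
static void model_kfree(struct pkt *p)
{
	drop_events++;
	free(p);
}

/* Models consume_skb(): free the packet without recording a drop. */
static void model_consume(struct pkt *p)
{
	free(p);
}

/* The caller says why the packet is released. */
static void pkt_release(struct pkt *p, bool dropped)
{
	if (dropped)
		model_kfree(p);
	else
		model_consume(p);
}

static struct pkt *pkt_new(int len)
{
	struct pkt *p = malloc(sizeof(*p));

	if (p)
		p->len = len;
	return p;
}
```

Releasing a successfully processed packet with `dropped == false` leaves the counter untouched, so the drop statistics only reflect real failures.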

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/bat_iv_ogm.c | 13 -
 net/batman-adv/fragmentation.c  | 20 ++--
 net/batman-adv/network-coding.c | 24 +++-
 net/batman-adv/send.c   | 27 +++
 net/batman-adv/send.h   |  3 ++-
 net/batman-adv/soft-interface.c |  2 +-
 6 files changed, 59 insertions(+), 30 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 0b9be62..310f391 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -698,7 +698,7 @@ static void batadv_iv_ogm_aggregate_new(const unsigned char 
*packet_buff,
 
forw_packet_aggr->skb = netdev_alloc_skb_ip_align(NULL, skb_size);
if (!forw_packet_aggr->skb) {
-   batadv_forw_packet_free(forw_packet_aggr);
+   batadv_forw_packet_free(forw_packet_aggr, true);
return;
}
 
@@ -1611,7 +1611,7 @@ batadv_iv_ogm_process_per_outif(const struct sk_buff 
*skb, int ogm_offset,
if (hardif_neigh)
batadv_hardif_neigh_put(hardif_neigh);
 
-   kfree_skb(skb_priv);
+   consume_skb(skb_priv);
 }
 
 /**
@@ -1783,6 +1783,7 @@ static void 
batadv_iv_send_outstanding_bat_ogm_packet(struct work_struct *work)
struct delayed_work *delayed_work;
struct batadv_forw_packet *forw_packet;
struct batadv_priv *bat_priv;
+   bool dropped = false;
 
delayed_work = to_delayed_work(work);
forw_packet = container_of(delayed_work, struct batadv_forw_packet,
@@ -1792,8 +1793,10 @@ static void 
batadv_iv_send_outstanding_bat_ogm_packet(struct work_struct *work)
hlist_del(&forw_packet->list);
spin_unlock_bh(&bat_priv->forw_bat_list_lock);
 
-   if (atomic_read(&bat_priv->mesh_state) == BATADV_MESH_DEACTIVATING)
+   if (atomic_read(&bat_priv->mesh_state) == BATADV_MESH_DEACTIVATING) {
+   dropped = true;
goto out;
+   }
 
batadv_iv_ogm_emit(forw_packet);
 
@@ -1810,7 +1813,7 @@ static void 
batadv_iv_send_outstanding_bat_ogm_packet(struct work_struct *work)
batadv_iv_ogm_schedule(forw_packet->if_incoming);
 
 out:
-   batadv_forw_packet_free(forw_packet);
+   batadv_forw_packet_free(forw_packet, dropped);
 }
 
 static int batadv_iv_ogm_receive(struct sk_buff *skb,
@@ -1851,7 +1854,7 @@ static int batadv_iv_ogm_receive(struct sk_buff *skb,
ogm_packet = (struct batadv_ogm_packet *)packet_pos;
}
 
-   kfree_skb(skb);
+   consume_skb(skb);
return NET_RX_SUCCESS;
 }
 
diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index 2b967a3..a2e28a1 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -42,17 +42,23 @@
 /**
  * batadv_frag_clear_chain - delete entries in the fragment buffer chain
  * @head: head of chain with entries.
+ * @dropped: whether the chain is cleared because all fragments are dropped
  *
  * Free fragments in the passed hlist. Should be called with appropriate lock.
  */
-static void batadv_frag_clear_chain(struct hlist_head *head)
+static void batadv_frag_clear_chain(struct hlist_head *head, bool dropped)
 {
struct batadv_frag_list_entry *entry;
struct hlist_node *node;
 
hlist_for_each_entry_safe(entry, node, head, list) {
hlist_del(&entry->list);
-   kfree_skb(entry->skb);
+
+   if (dropped)
+   kfree_skb(entry->skb);
+   else
+   consume_skb(entry->skb);
+
kfree(entry);
}
 }
@@ -73,7 +79,7 @@ void batadv_frag_purge_orig(struct batadv_orig_node 
*orig_node,
spin_lock_bh(&chain->lock);
 
if (!check_cb || check_cb(chain)) {
-   batadv_frag_clear_chain(&chain->fragment_list);
+   batadv_frag_clear_chain(&chain->fragment_list, true);
chain->size = 0;
}
 
@@ -118,7 +124,7 @@ static bool batadv_frag_init_chain(struct 
batadv_frag_table_entry *chain,
return false;
 
if (!hlist_empty(&chain->fragment_list))
-   batadv_frag_clear_chain(&chain->fragment_list);
+   batadv_frag_clear_chain(&chain->fragment_list, true);
 
chain->size = 0;
chain->seqno = seqno;
@@ -220,7 +226,7 @@ static bool batadv_frag_insert_packet(struct 
batadv_orig_node *orig_node,
 * exceeds the maximum size of one merged packet. Don't allow
 * packets to have different total_size.
 */
-

[PATCH 08/17] batman-adv: Simple (re)broadcast avoidance

2016-11-09 Thread Simon Wunderlich

From: Linus Lüssing 

With this patch, (re)broadcasting on a specific interface is avoided:

* No neighbor: There is no need to broadcast on an interface if there
  is no node behind it.

* Single neighbor is source: If there is just one neighbor on an
  interface and if this neighbor is the one we actually got this
  broadcast packet from, then we do not need to echo it back.

* Single neighbor is originator: If there is just one neighbor on
  an interface and if this neighbor is the originator of this
  broadcast packet, then we do not need to echo it back.
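
The three rules above can be sketched as a small decision function. This is a standalone model under stated assumptions, not the batman-adv implementation: `struct iface_model`, `enum bcast_verdict`, `same_addr()` and `no_broadcast()` are hypothetical names, and the order in which the two single-neighbor checks run is an arbitrary choice here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ADDR_LEN 6

enum bcast_verdict {
	BCAST_OK = 0,		/* go ahead and (re)broadcast */
	BCAST_NORECIPIENT,	/* no neighbor behind the interface */
	BCAST_DUPFWD,		/* single neighbor is the one we got it from */
	BCAST_DUPORIG,		/* single neighbor originated the packet */
};

/* Hypothetical view of an outgoing interface. */
struct iface_model {
	unsigned int num_neigh;
	unsigned char single_neigh[ADDR_LEN];	/* valid if num_neigh == 1 */
};

/* NULL addresses never match, so callers may omit either one. */
static int same_addr(const unsigned char *a, const unsigned char *b)
{
	return a && b && memcmp(a, b, ADDR_LEN) == 0;
}

static enum bcast_verdict no_broadcast(const struct iface_model *ifc,
				       const unsigned char *orig_addr,
				       const unsigned char *fwd_addr)
{
	if (ifc->num_neigh == 0)
		return BCAST_NORECIPIENT;
	if (ifc->num_neigh > 1)
		return BCAST_OK;
	if (same_addr(ifc->single_neigh, fwd_addr))
		return BCAST_DUPFWD;
	if (same_addr(ifc->single_neigh, orig_addr))
		return BCAST_DUPORIG;
	return BCAST_OK;
}
```

Any non-zero verdict means the broadcast on that interface can be skipped; passing NULL for both addresses reduces the check to the "no neighbor" rule, which is all that applies to packets a node originates itself.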

Goodies for BATMAN V:

("Upgrade your BATMAN IV network to V now to get these for free!")

Thanks to the split of OGMv1 into two packet types, namely OGMv2 and ELP,
we can now apply the same optimizations stated above to OGMv2
packets, too.

Furthermore, with BATMAN V, rebroadcasts can be reduced in certain
multi interface cases, too, where BATMAN IV cannot. This is thanks to
the removal of the "secondary interface originator" concept in BATMAN V.

Signed-off-by: Linus Lüssing 
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/bat_v_ogm.c  | 56 +
 net/batman-adv/hard-interface.c | 52 ++
 net/batman-adv/hard-interface.h | 16 
 net/batman-adv/originator.c | 13 +++---
 net/batman-adv/routing.c|  2 +-
 net/batman-adv/send.c   | 55 +++-
 net/batman-adv/send.h   |  3 ++-
 net/batman-adv/soft-interface.c |  2 +-
 net/batman-adv/types.h  |  2 ++
 9 files changed, 193 insertions(+), 8 deletions(-)

diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 61ff5f8..9922ccd 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -140,6 +140,7 @@ static void batadv_v_ogm_send(struct work_struct *work)
unsigned char *ogm_buff, *pkt_buff;
int ogm_buff_len;
u16 tvlv_len = 0;
+   int ret;
 
bat_v = container_of(work, struct batadv_priv_bat_v, ogm_wq.work);
bat_priv = container_of(bat_v, struct batadv_priv, bat_v);
@@ -182,6 +183,31 @@ static void batadv_v_ogm_send(struct work_struct *work)
if (!kref_get_unless_zero(&hard_iface->refcount))
continue;
 
+   ret = batadv_hardif_no_broadcast(hard_iface, NULL, NULL);
+   if (ret) {
+   char *type;
+
+   switch (ret) {
+   case BATADV_HARDIF_BCAST_NORECIPIENT:
+   type = "no neighbor";
+   break;
+   case BATADV_HARDIF_BCAST_DUPFWD:
+   type = "single neighbor is source";
+   break;
+   case BATADV_HARDIF_BCAST_DUPORIG:
+   type = "single neighbor is originator";
+   break;
+   default:
+   type = "unknown";
+   }
+
+   batadv_dbg(BATADV_DBG_BATMAN, bat_priv, "OGM2 from ourselves on %s suppressed: %s\n",
+  hard_iface->net_dev->name, type);
+
+   batadv_hardif_put(hard_iface);
+   continue;
+   }
+
batadv_dbg(BATADV_DBG_BATMAN, bat_priv,
   "Sending own OGM2 packet (originator %pM, seqno %u, 
throughput %u, TTL %d) on interface %s [%pM]\n",
   ogm_packet->orig, ntohl(ogm_packet->seqno),
@@ -651,6 +677,7 @@ static void batadv_v_ogm_process(const struct sk_buff *skb, int ogm_offset,
struct batadv_hard_iface *hard_iface;
struct batadv_ogm2_packet *ogm_packet;
u32 ogm_throughput, link_throughput, path_throughput;
+   int ret;
 
ethhdr = eth_hdr(skb);
ogm_packet = (struct batadv_ogm2_packet *)(skb->data + ogm_offset);
@@ -716,6 +743,35 @@ static void batadv_v_ogm_process(const struct sk_buff *skb, int ogm_offset,
if (!kref_get_unless_zero(&hard_iface->refcount))
continue;
 
+   ret = batadv_hardif_no_broadcast(hard_iface,
+ogm_packet->orig,
+hardif_neigh->orig);
+
+   if (ret) {
+   char *type;
+
+   switch (ret) {
+   case BATADV_HARDIF_BCAST_NORECIPIENT:
+   type = "no neighbor";
+   break;
+   case BATADV_HARDIF_BCAST_DUPFWD:
+   type = "single neighbor is source";
+   break;
+   case

[PATCH 10/17] batman-adv: Count all non-success TX packets as dropped

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

A failure during the submission also causes dropped packets.
batadv_interface_tx should therefore also increase the DROPPED counter for
these returns.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/soft-interface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index 2f0304e..7b3494a 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -386,7 +386,7 @@ static int batadv_interface_tx(struct sk_buff *skb,
ret = batadv_send_skb_via_tt(bat_priv, skb, dst_hint,
 vid);
}
-   if (ret == NET_XMIT_DROP)
+   if (ret != NET_XMIT_SUCCESS)
goto dropped_freed;
}
 
-- 
2.10.1

[PATCH 06/17] batman-adv: Remove unused skb_reset_mac_header()

2016-11-09 Thread Simon Wunderlich

From: Linus Lüssing 

During broadcast queueing, the skb_reset_mac_header() sets the skb
to a place invalid for a MAC header, pointing right into the
batman-adv broadcast packet. Luckily, no one seems to actually use
eth_hdr(skb) afterwards until batadv_send_skb_packet() resets the
header to a valid position again.

Therefore, remove this unnecessary, weird skb_reset_mac_header()
call.

Signed-off-by: Linus Lüssing 
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/send.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index e1e9136..be3f6d7 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -586,8 +586,6 @@ int batadv_add_bcast_packet_to_list(struct batadv_priv *bat_priv,
bcast_packet = (struct batadv_bcast_packet *)newskb->data;
bcast_packet->ttl--;
 
-   skb_reset_mac_header(newskb);
-
forw_packet->skb = newskb;
 
INIT_DELAYED_WORK(&forw_packet->delayed_work,
-- 
2.10.1

[PATCH 05/17] batman-adv: Remove unnecessary lockdep in batadv_mcast_mla_list_free

2016-11-09 Thread Simon Wunderlich

From: Linus Lüssing 

batadv_mcast_mla_list_free() just frees some leftovers of a local feast
in batadv_mcast_mla_update(). No lockdep needed as it has nothing to do
with bat_priv->mcast.mla_list.

Signed-off-by: Linus Lüssing 
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/multicast.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c
index 13661f4..45757fa 100644
--- a/net/batman-adv/multicast.c
+++ b/net/batman-adv/multicast.c
@@ -231,19 +231,15 @@ static int batadv_mcast_mla_bridge_get(struct net_device *dev,
 
 /**
  * batadv_mcast_mla_list_free - free a list of multicast addresses
- * @bat_priv: the bat priv with all the soft interface information
  * @mcast_list: the list to free
  *
  * Removes and frees all items in the given mcast_list.
  */
-static void batadv_mcast_mla_list_free(struct batadv_priv *bat_priv,
-  struct hlist_head *mcast_list)
+static void batadv_mcast_mla_list_free(struct hlist_head *mcast_list)
 {
struct batadv_hw_addr *mcast_entry;
struct hlist_node *tmp;
 
-   lockdep_assert_held(&bat_priv->tt.commit_lock);
-
hlist_for_each_entry_safe(mcast_entry, tmp, mcast_list, list) {
hlist_del(&mcast_entry->list);
kfree(mcast_entry);
@@ -560,7 +556,7 @@ void batadv_mcast_mla_update(struct batadv_priv *bat_priv)
batadv_mcast_mla_tt_add(bat_priv, &mcast_list);
 
 out:
-   batadv_mcast_mla_list_free(bat_priv, &mcast_list);
+   batadv_mcast_mla_list_free(&mcast_list);
 }
 
 /**
-- 
2.10.1

[PATCH 04/17] batman-adv: Add wrapper for ARP reply creation

2016-11-09 Thread Simon Wunderlich

From: Linus Lüssing 

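Remove duplicate code by moving the ARP reply creation, the MAC header
reset and the optional VLAN tagging into a common helper,
batadv_dat_arp_create_reply().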

Signed-off-by: Linus Lüssing 
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/distributed-arp-table.c | 67 --
 1 file changed, 40 insertions(+), 27 deletions(-)

diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c
index cbb4f32..49576c5 100644
--- a/net/batman-adv/distributed-arp-table.c
+++ b/net/batman-adv/distributed-arp-table.c
@@ -949,6 +949,41 @@ static unsigned short batadv_dat_get_vid(struct sk_buff *skb, int *hdr_size)
 }
 
 /**
+ * batadv_dat_arp_create_reply - create an ARP Reply
+ * @bat_priv: the bat priv with all the soft interface information
+ * @ip_src: ARP sender IP
+ * @ip_dst: ARP target IP
+ * @hw_src: Ethernet source and ARP sender MAC
+ * @hw_dst: Ethernet destination and ARP target MAC
+ * @vid: VLAN identifier (optional, set to zero otherwise)
+ *
+ * Creates an ARP Reply from the given values, optionally encapsulated in a
+ * VLAN header.
+ *
+ * Return: An skb containing an ARP Reply.
+ */
+static struct sk_buff *
+batadv_dat_arp_create_reply(struct batadv_priv *bat_priv, __be32 ip_src,
+   __be32 ip_dst, u8 *hw_src, u8 *hw_dst,
+   unsigned short vid)
+{
+   struct sk_buff *skb;
+
+   skb = arp_create(ARPOP_REPLY, ETH_P_ARP, ip_dst, bat_priv->soft_iface,
+ip_src, hw_dst, hw_src, hw_dst);
+   if (!skb)
+   return NULL;
+
+   skb_reset_mac_header(skb);
+
+   if (vid & BATADV_VLAN_HAS_TAG)
+   skb = vlan_insert_tag(skb, htons(ETH_P_8021Q),
+ vid & VLAN_VID_MASK);
+
+   return skb;
+}
+
+/**
  * batadv_dat_snoop_outgoing_arp_request - snoop the ARP request and try to
  * answer using DAT
  * @bat_priv: the bat priv with all the soft interface information
@@ -1005,20 +1040,12 @@ bool batadv_dat_snoop_outgoing_arp_request(struct batadv_priv *bat_priv,
goto out;
}
 
-   skb_new = arp_create(ARPOP_REPLY, ETH_P_ARP, ip_src,
-bat_priv->soft_iface, ip_dst, hw_src,
-dat_entry->mac_addr, hw_src);
+   skb_new = batadv_dat_arp_create_reply(bat_priv, ip_dst, ip_src,
+ dat_entry->mac_addr,
+ hw_src, vid);
if (!skb_new)
goto out;
 
-   if (vid & BATADV_VLAN_HAS_TAG) {
-   skb_new = vlan_insert_tag(skb_new, htons(ETH_P_8021Q),
- vid & VLAN_VID_MASK);
-   if (!skb_new)
-   goto out;
-   }
-
-   skb_reset_mac_header(skb_new);
skb_new->protocol = eth_type_trans(skb_new,
   bat_priv->soft_iface);
bat_priv->stats.rx_packets++;
@@ -1081,25 +1108,11 @@ bool batadv_dat_snoop_incoming_arp_request(struct batadv_priv *bat_priv,
if (!dat_entry)
goto out;
 
-   skb_new = arp_create(ARPOP_REPLY, ETH_P_ARP, ip_src,
-bat_priv->soft_iface, ip_dst, hw_src,
-dat_entry->mac_addr, hw_src);
-
+   skb_new = batadv_dat_arp_create_reply(bat_priv, ip_dst, ip_src,
+ dat_entry->mac_addr, hw_src, vid);
if (!skb_new)
goto out;
 
-   /* the rest of the TX path assumes that the mac_header offset pointing
-* to the inner Ethernet header has been set, therefore reset it now.
-*/
-   skb_reset_mac_header(skb_new);
-
-   if (vid & BATADV_VLAN_HAS_TAG) {
-   skb_new = vlan_insert_tag(skb_new, htons(ETH_P_8021Q),
- vid & VLAN_VID_MASK);
-   if (!skb_new)
-   goto out;
-   }
-
/* To preserve backwards compatibility, the node has choose the outgoing
 * format based on the incoming request packet type. The assumption is
 * that a node not using the 4addr packet format doesn't support it.
-- 
2.10.1

[PATCH 12/17] batman-adv: Consume skb in batadv_send_skb_to_orig

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

Sending functions in Linux consume the supplied skbuff. Doing the same in
batadv_send_skb_to_orig avoids the hack of returning -1 (-EPERM) to signal
to the caller that it is responsible for cleaning up the skb.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/routing.c  | 11 ++-
 net/batman-adv/send.c | 39 ++-
 net/batman-adv/tp_meter.c |  6 --
 net/batman-adv/tvlv.c |  5 +
 4 files changed, 25 insertions(+), 36 deletions(-)

diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index a4cb157..4d2679a 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -262,9 +262,6 @@ static int batadv_recv_my_icmp_packet(struct batadv_priv *bat_priv,
icmph->ttl = BATADV_TTL;
 
res = batadv_send_skb_to_orig(skb, orig_node, NULL);
-   if (res == -1)
-   goto out;
-
ret = NET_RX_SUCCESS;
 
break;
@@ -325,8 +322,7 @@ static int batadv_recv_icmp_ttl_exceeded(struct batadv_priv *bat_priv,
icmp_packet->ttl = BATADV_TTL;
 
res = batadv_send_skb_to_orig(skb, orig_node, NULL);
-   if (res != -1)
-   ret = NET_RX_SUCCESS;
+   ret = NET_RX_SUCCESS;
 
 out:
if (primary_if)
@@ -413,8 +409,7 @@ int batadv_recv_icmp_packet(struct sk_buff *skb,
 
/* route it */
res = batadv_send_skb_to_orig(skb, orig_node, recv_if);
-   if (res != -1)
-   ret = NET_RX_SUCCESS;
+   ret = NET_RX_SUCCESS;
 
 out:
if (orig_node)
@@ -702,8 +697,6 @@ static int batadv_route_unicast_packet(struct sk_buff *skb,
 
len = skb->len;
res = batadv_send_skb_to_orig(skb, orig_node, recv_if);
-   if (res == -1)
-   goto out;
 
/* translate transmit result into receive result */
if (res == NET_XMIT_SUCCESS) {
diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 0f86293..b00aac7 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -165,11 +165,9 @@ int batadv_send_unicast_skb(struct sk_buff *skb,
  * host, NULL can be passed as recv_if and no interface alternating is
  * attempted.
  *
- * Return: -1 on failure (and the skb is not consumed), -EINPROGRESS if the
- * skb is buffered for later transmit or the NET_XMIT status returned by the
+ * Return: negative errno code on a failure, -EINPROGRESS if the skb is
+ * buffered for later transmit or the NET_XMIT status returned by the
  * lower routine if the packet has been passed down.
- *
- * If the returning value is not -1 the skb has been consumed.
  */
 int batadv_send_skb_to_orig(struct sk_buff *skb,
struct batadv_orig_node *orig_node,
@@ -177,12 +175,14 @@ int batadv_send_skb_to_orig(struct sk_buff *skb,
 {
struct batadv_priv *bat_priv = orig_node->bat_priv;
struct batadv_neigh_node *neigh_node;
-   int ret = -1;
+   int ret;
 
/* batadv_find_router() increases neigh_nodes refcount if found. */
neigh_node = batadv_find_router(bat_priv, orig_node, recv_if);
-   if (!neigh_node)
-   goto out;
+   if (!neigh_node) {
+   ret = -EINVAL;
+   goto free_skb;
+   }
 
/* Check if the skb is too large to send in one piece and fragment
 * it if needed.
@@ -191,8 +191,10 @@ int batadv_send_skb_to_orig(struct sk_buff *skb,
skb->len > neigh_node->if_incoming->net_dev->mtu) {
/* Fragment and send packet. */
ret = batadv_frag_send_packet(skb, orig_node, neigh_node);
+   /* skb was consumed */
+   skb = NULL;
 
-   goto out;
+   goto put_neigh_node;
}
 
/* try to network code the packet, if it is received on an interface
@@ -204,9 +206,13 @@ int batadv_send_skb_to_orig(struct sk_buff *skb,
else
ret = batadv_send_unicast_skb(skb, neigh_node);
 
-out:
-   if (neigh_node)
-   batadv_neigh_node_put(neigh_node);
+   /* skb was consumed */
+   skb = NULL;
+
+put_neigh_node:
+   batadv_neigh_node_put(neigh_node);
+free_skb:
+   kfree_skb(skb);
 
return ret;
 }
@@ -327,7 +333,7 @@ int batadv_send_skb_unicast(struct batadv_priv *bat_priv,
 {
struct batadv_unicast_packet *unicast_packet;
struct ethhdr *ethhdr;
-   int res, ret = NET_XMIT_DROP;
+   int ret = NET_XMIT_DROP;
 
if (!orig_node)
goto out;
@@ -364,13 +370,12 @@ int batadv_send_skb_unicast(struct batadv_priv *bat_priv,
if (batadv_tt_global_client_is_roaming(bat_priv, ethhdr->h_dest, vid))
unicast_packet->ttvn = unicast_packet->ttvn - 1;
 
-   res = batadv_send_skb_to_orig(skb, orig_node, NULL);
-   if (res != -1)
-   ret =

[PATCH 13/17] batman-adv: Consume skb in receive handlers

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

Receiving functions in Linux consume the supplied skbuff. Doing the same in
the batadv_rx_handler functions makes the behavior more similar to the rest
of the Linux network code.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/bat_iv_ogm.c |  22 --
 net/batman-adv/bat_v_elp.c  |  30 
 net/batman-adv/bat_v_ogm.c  |  15 ++--
 net/batman-adv/main.c   |  11 +--
 net/batman-adv/network-coding.c |  11 +--
 net/batman-adv/routing.c| 149 +++-
 6 files changed, 153 insertions(+), 85 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 310f391..bd39247 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -1823,17 +1823,18 @@ static int batadv_iv_ogm_receive(struct sk_buff *skb,
struct batadv_ogm_packet *ogm_packet;
u8 *packet_pos;
int ogm_offset;
-   bool ret;
+   bool res;
+   int ret = NET_RX_DROP;
 
-   ret = batadv_check_management_packet(skb, if_incoming, BATADV_OGM_HLEN);
-   if (!ret)
-   return NET_RX_DROP;
+   res = batadv_check_management_packet(skb, if_incoming, BATADV_OGM_HLEN);
+   if (!res)
+   goto free_skb;
 
/* did we receive a B.A.T.M.A.N. IV OGM packet on an interface
 * that does not have B.A.T.M.A.N. IV enabled ?
 */
if (bat_priv->algo_ops->iface.enable != batadv_iv_ogm_iface_enable)
-   return NET_RX_DROP;
+   goto free_skb;
 
batadv_inc_counter(bat_priv, BATADV_CNT_MGMT_RX);
batadv_add_counter(bat_priv, BATADV_CNT_MGMT_RX_BYTES,
@@ -1854,8 +1855,15 @@ static int batadv_iv_ogm_receive(struct sk_buff *skb,
ogm_packet = (struct batadv_ogm_packet *)packet_pos;
}
 
-   consume_skb(skb);
-   return NET_RX_SUCCESS;
+   ret = NET_RX_SUCCESS;
+
+free_skb:
+   if (ret == NET_RX_SUCCESS)
+   consume_skb(skb);
+   else
+   kfree_skb(skb);
+
+   return ret;
 }
 
 #ifdef CONFIG_BATMAN_ADV_DEBUGFS
diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index ee08540..54bdd41 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -492,20 +492,21 @@ int batadv_v_elp_packet_recv(struct sk_buff *skb,
struct batadv_elp_packet *elp_packet;
struct batadv_hard_iface *primary_if;
struct ethhdr *ethhdr = (struct ethhdr *)skb_mac_header(skb);
-   bool ret;
+   bool res;
+   int ret = NET_RX_DROP;
 
-   ret = batadv_check_management_packet(skb, if_incoming, BATADV_ELP_HLEN);
-   if (!ret)
-   return NET_RX_DROP;
+   res = batadv_check_management_packet(skb, if_incoming, BATADV_ELP_HLEN);
+   if (!res)
+   goto free_skb;
 
if (batadv_is_my_mac(bat_priv, ethhdr->h_source))
-   return NET_RX_DROP;
+   goto free_skb;
 
/* did we receive a B.A.T.M.A.N. V ELP packet on an interface
 * that does not have B.A.T.M.A.N. V ELP enabled ?
 */
if (strcmp(bat_priv->algo_ops->name, "BATMAN_V") != 0)
-   return NET_RX_DROP;
+   goto free_skb;
 
elp_packet = (struct batadv_elp_packet *)skb->data;
 
@@ -516,14 +517,19 @@ int batadv_v_elp_packet_recv(struct sk_buff *skb,
 
primary_if = batadv_primary_if_get_selected(bat_priv);
if (!primary_if)
-   goto out;
+   goto free_skb;
 
batadv_v_elp_neigh_update(bat_priv, ethhdr->h_source, if_incoming,
  elp_packet);
 
-out:
-   if (primary_if)
-   batadv_hardif_put(primary_if);
-   consume_skb(skb);
-   return NET_RX_SUCCESS;
+   ret = NET_RX_SUCCESS;
+   batadv_hardif_put(primary_if);
+
+free_skb:
+   if (ret == NET_RX_SUCCESS)
+   consume_skb(skb);
+   else
+   kfree_skb(skb);
+
+   return ret;
 }
diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 9922ccd..38b9aab 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -810,18 +810,18 @@ int batadv_v_ogm_packet_recv(struct sk_buff *skb,
 * B.A.T.M.A.N. V enabled ?
 */
if (strcmp(bat_priv->algo_ops->name, "BATMAN_V") != 0)
-   return NET_RX_DROP;
+   goto free_skb;
 
if (!batadv_check_management_packet(skb, if_incoming, BATADV_OGM2_HLEN))
-   return NET_RX_DROP;
+   goto free_skb;
 
if (batadv_is_my_mac(bat_priv, ethhdr->h_source))
-   return NET_RX_DROP;
+   goto free_skb;
 
ogm_packet = (struct batadv_ogm2_packet *)skb->data;
 
if (batadv_is_my_mac(bat_priv, ogm_packet->orig))
-   return NET_RX_DROP;
+   goto free_skb;

[PATCH 07/17] batman-adv: Use own timer for multicast TT and TVLV updates

2016-11-09 Thread Simon Wunderlich

From: Linus Lüssing 

Instead of latching onto the OGM period, this patch introduces a worker
dedicated to multicast TT and TVLV updates.

The reasoning is that, especially upon roaming, the translation table
should be updated promptly to minimize connectivity issues.

With BATMAN V, the idea is to greatly increase the OGM interval to
reduce overhead. Unfortunately, right now this could lead to
a bad user experience if multicast traffic is involved.

Therefore this patch introduces a fixed 500ms update interval for
multicast TT entries and the multicast TVLV.

Signed-off-by: Linus Lüssing 
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/main.h  |  1 +
 net/batman-adv/multicast.c | 62 ++
 net/batman-adv/multicast.h |  6 
 net/batman-adv/translation-table.c |  4 ---
 net/batman-adv/types.h |  4 ++-
 5 files changed, 60 insertions(+), 17 deletions(-)

diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index daddca9..a6cc804 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -48,6 +48,7 @@
 #define BATADV_TT_CLIENT_TEMP_TIMEOUT 60 /* in milliseconds */
 #define BATADV_TT_WORK_PERIOD 5000 /* 5 seconds */
 #define BATADV_ORIG_WORK_PERIOD 1000 /* 1 second */
+#define BATADV_MCAST_WORK_PERIOD 500 /* 0.5 seconds */
 #define BATADV_DAT_ENTRY_TIMEOUT (5 * 6) /* 5 mins in milliseconds */
 /* sliding packet range of received originator messages in sequence numbers
  * (should be a multiple of our word size)
diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c
index 45757fa..090a69f 100644
--- a/net/batman-adv/multicast.c
+++ b/net/batman-adv/multicast.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -48,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -60,6 +62,18 @@
 #include "translation-table.h"
 #include "tvlv.h"
 
+static void batadv_mcast_mla_update(struct work_struct *work);
+
+/**
+ * batadv_mcast_start_timer - schedule the multicast periodic worker
+ * @bat_priv: the bat priv with all the soft interface information
+ */
+static void batadv_mcast_start_timer(struct batadv_priv *bat_priv)
+{
+   queue_delayed_work(batadv_event_workqueue, &bat_priv->mcast.work,
+  msecs_to_jiffies(BATADV_MCAST_WORK_PERIOD));
+}
+
 /**
  * batadv_mcast_get_bridge - get the bridge on top of the softif if it exists
  * @soft_iface: netdev struct of the mesh interface
@@ -255,6 +269,8 @@ static void batadv_mcast_mla_list_free(struct hlist_head *mcast_list)
  * translation table except the ones listed in the given mcast_list.
  *
  * If mcast_list is NULL then all are retracted.
+ *
+ * Do not call outside of the mcast worker! (or cancel mcast worker first)
  */
 static void batadv_mcast_mla_tt_retract(struct batadv_priv *bat_priv,
struct hlist_head *mcast_list)
@@ -262,7 +278,7 @@ static void batadv_mcast_mla_tt_retract(struct batadv_priv *bat_priv,
struct batadv_hw_addr *mcast_entry;
struct hlist_node *tmp;
 
-   lockdep_assert_held(&bat_priv->tt.commit_lock);
+   WARN_ON(delayed_work_pending(&bat_priv->mcast.work));
 
hlist_for_each_entry_safe(mcast_entry, tmp, &bat_priv->mcast.mla_list,
  list) {
@@ -287,6 +303,8 @@ static void batadv_mcast_mla_tt_retract(struct batadv_priv *bat_priv,
  *
  * Adds multicast listener announcements from the given mcast_list to the
  * translation table if they have not been added yet.
+ *
+ * Do not call outside of the mcast worker! (or cancel mcast worker first)
  */
 static void batadv_mcast_mla_tt_add(struct batadv_priv *bat_priv,
struct hlist_head *mcast_list)
@@ -294,7 +312,7 @@ static void batadv_mcast_mla_tt_add(struct batadv_priv *bat_priv,
struct batadv_hw_addr *mcast_entry;
struct hlist_node *tmp;
 
-   lockdep_assert_held(&bat_priv->tt.commit_lock);
+   WARN_ON(delayed_work_pending(&bat_priv->mcast.work));
 
if (!mcast_list)
return;
@@ -528,13 +546,18 @@ static bool batadv_mcast_mla_tvlv_update(struct batadv_priv *bat_priv)
 }
 
 /**
- * batadv_mcast_mla_update - update the own MLAs
+ * __batadv_mcast_mla_update - update the own MLAs
  * @bat_priv: the bat priv with all the soft interface information
  *
  * Updates the own multicast listener announcements in the translation
  * table as well as the own, announced multicast tvlv container.
+ *
+ * Note that non-conflicting reads and writes to bat_priv->mcast.mla_list
+ * in batadv_mcast_mla_tt_retract() and batadv_mcast_mla_tt_add() are
+ * ensured by the non-parallel execution of the worker this function
+ * belongs to.
  */
-void batadv_mcast_mla_update(struct batadv_priv *bat_priv)
+static void

[PATCH 16/17] batman-adv: Disallow zero and mcast src address for mgmt frames

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

The routing check for management frames is validating the source mac
address in the outer ethernet header. It rejects every source mac address
which is a broadcast address. But it also has to reject the zero-mac
address and multicast mac addresses.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/routing.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index c02897b..4f034df 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -196,8 +196,8 @@ bool batadv_check_management_packet(struct sk_buff *skb,
if (!is_broadcast_ether_addr(ethhdr->h_dest))
return false;
 
-   /* packet with broadcast sender address */
-   if (is_broadcast_ether_addr(ethhdr->h_source))
+   /* packet with invalid sender address */
+   if (!is_valid_ether_addr(ethhdr->h_source))
return false;
 
/* create a copy of the skb, if needed, to modify it. */
-- 
2.10.1
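
For illustration only (not part of the patch): the difference between the
old and the new check can be sketched in userspace C. The helpers below
are hypothetical stand-ins that follow the documented semantics of the
kernel helpers in <linux/etherdevice.h> as I understand them: an address
is multicast if the least significant bit of its first octet is set,
broadcast is the all-ones special case of multicast, and a "valid"
address is neither multicast nor all-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Group (multicast) bit: least significant bit of the first octet. */
bool addr_is_multicast(const uint8_t *addr)
{
	return (addr[0] & 0x01) != 0;
}

/* ff:ff:ff:ff:ff:ff; note that this is also a multicast address. */
bool addr_is_broadcast(const uint8_t *addr)
{
	for (size_t i = 0; i < ETH_ALEN; i++)
		if (addr[i] != 0xff)
			return false;
	return true;
}

/* 00:00:00:00:00:00 */
bool addr_is_zero(const uint8_t *addr)
{
	for (size_t i = 0; i < ETH_ALEN; i++)
		if (addr[i] != 0x00)
			return false;
	return true;
}

/* Usable as a sender address: neither multicast/broadcast nor zero. */
bool addr_is_valid(const uint8_t *addr)
{
	assert(addr != NULL);
	return !addr_is_multicast(addr) && !addr_is_zero(addr);
}
```

A check based only on addr_is_broadcast() lets 00:00:00:00:00:00 and
group addresses such as 01:00:5e:00:00:01 through, while addr_is_valid()
rejects all three classes.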

[PATCH 11/17] batman-adv: Consume skb in batadv_frag_send_packet

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

Sending functions in Linux consume the supplied skbuff. Doing the same in
batadv_frag_send_packet avoids the hack of returning -1 (-EPERM) to signal
to the caller that it is responsible for cleaning up the skb.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/fragmentation.c | 50 --
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index a2e28a1..9c561e6 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -20,6 +20,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -441,8 +442,7 @@ static struct sk_buff *batadv_frag_create(struct sk_buff *skb,
  * @orig_node: final destination of the created fragments
  * @neigh_node: next-hop of the created fragments
  *
- * Return: the netdev tx status or -1 in case of error.
- * When -1 is returned the skb is not consumed.
+ * Return: the netdev tx status or a negative errno code on a failure
  */
 int batadv_frag_send_packet(struct sk_buff *skb,
struct batadv_orig_node *orig_node,
@@ -455,7 +455,7 @@ int batadv_frag_send_packet(struct sk_buff *skb,
unsigned int mtu = neigh_node->if_incoming->net_dev->mtu;
unsigned int header_size = sizeof(frag_header);
unsigned int max_fragment_size, max_packet_size;
-   int ret = -1;
+   int ret;
 
/* To avoid merge and refragmentation at next-hops we never send
 * fragments larger than BATADV_FRAG_MAX_FRAG_SIZE
@@ -465,13 +465,17 @@ int batadv_frag_send_packet(struct sk_buff *skb,
max_packet_size = max_fragment_size * BATADV_FRAG_MAX_FRAGMENTS;
 
/* Don't even try to fragment, if we need more than 16 fragments */
-   if (skb->len > max_packet_size)
-   goto out;
+   if (skb->len > max_packet_size) {
+   ret = -EAGAIN;
+   goto free_skb;
+   }
 
bat_priv = orig_node->bat_priv;
primary_if = batadv_primary_if_get_selected(bat_priv);
-   if (!primary_if)
-   goto out;
+   if (!primary_if) {
+   ret = -EINVAL;
+   goto put_primary_if;
+   }
 
/* Create one header to be copied to all fragments */
frag_header.packet_type = BATADV_UNICAST_FRAG;
@@ -496,34 +500,35 @@ int batadv_frag_send_packet(struct sk_buff *skb,
/* Eat and send fragments from the tail of skb */
while (skb->len > max_fragment_size) {
skb_fragment = batadv_frag_create(skb, &frag_header, mtu);
-   if (!skb_fragment)
-   goto out;
+   if (!skb_fragment) {
+   ret = -ENOMEM;
+   goto free_skb;
+   }
 
batadv_inc_counter(bat_priv, BATADV_CNT_FRAG_TX);
batadv_add_counter(bat_priv, BATADV_CNT_FRAG_TX_BYTES,
   skb_fragment->len + ETH_HLEN);
ret = batadv_send_unicast_skb(skb_fragment, neigh_node);
if (ret != NET_XMIT_SUCCESS) {
-   /* return -1 so that the caller can free the original
-* skb
-*/
-   ret = -1;
-   goto out;
+   ret = NET_XMIT_DROP;
+   goto free_skb;
}
 
frag_header.no++;
 
/* The initial check in this function should cover this case */
if (frag_header.no == BATADV_FRAG_MAX_FRAGMENTS - 1) {
-   ret = -1;
-   goto out;
+   ret = -EINVAL;
+   goto free_skb;
}
}
 
/* Make room for the fragment header. */
if (batadv_skb_head_push(skb, header_size) < 0 ||
-   pskb_expand_head(skb, header_size + ETH_HLEN, 0, GFP_ATOMIC) < 0)
-   goto out;
+   pskb_expand_head(skb, header_size + ETH_HLEN, 0, GFP_ATOMIC) < 0) {
+   ret = -ENOMEM;
+   goto free_skb;
+   }
 
memcpy(skb->data, &frag_header, header_size);
 
@@ -532,10 +537,13 @@ int batadv_frag_send_packet(struct sk_buff *skb,
batadv_add_counter(bat_priv, BATADV_CNT_FRAG_TX_BYTES,
   skb->len + ETH_HLEN);
ret = batadv_send_unicast_skb(skb, neigh_node);
+   /* skb was consumed */
+   skb = NULL;
 
-out:
-   if (primary_if)
-   batadv_hardif_put(primary_if);
+put_primary_if:
+   batadv_hardif_put(primary_if);
+free_skb:
+   kfree_skb(skb);
 
return ret;
 }
-- 
2.10.1
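
For illustration only (not part of the patch): the "always consume"
convention can be sketched in userspace C. A send function takes
ownership of the buffer on every path; once a lower layer has consumed
it, the local pointer is set to NULL so that the shared cleanup label
becomes a no-op, relying on the same "freeing NULL does nothing" property
that kfree_skb() has. struct buf, buf_alloc(), buf_free(), lower_send()
and send_consuming() are hypothetical names invented for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an skb. */
struct buf {
	size_t len;
	unsigned char *data;
};

/* Buffers currently allocated; lets callers verify that nothing leaks. */
static int live_bufs;

int buf_live_count(void)
{
	return live_bufs;
}

struct buf *buf_alloc(size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = calloc(len ? len : 1, 1);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->len = len;
	live_bufs++;
	return b;
}

/* Freeing a NULL buffer is a no-op, as with kfree_skb(). */
void buf_free(struct buf *b)
{
	if (!b)
		return;
	free(b->data);
	free(b);
	live_bufs--;
}

/* Stand-in for the lower-level transmit routine: always consumes b. */
static int lower_send(struct buf *b)
{
	buf_free(b);
	return 0;
}

/* Consumes b on every path; returns 0 or a negative errno code. */
int send_consuming(struct buf *b, size_t max_len, int have_route)
{
	int ret;

	assert(b != NULL);

	if (!have_route) {
		ret = -EINVAL;
		goto free_buf;
	}
	if (b->len > max_len) {
		ret = -EAGAIN;
		goto free_buf;
	}

	ret = lower_send(b);
	/* b was consumed by lower_send() */
	b = NULL;

free_buf:
	buf_free(b);
	return ret;
}
```

With this convention the caller never has to inspect the return value to
decide whether it still owns the buffer.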

[PATCH 03/17] batman-adv: Close two alignment holes in batadv_hard_iface

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/types.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 673a22e..c9db184 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -123,8 +123,8 @@ struct batadv_hard_iface_bat_v {
  * @list: list node for batadv_hardif_list
  * @if_num: identificator of the interface
  * @if_status: status of the interface for batman-adv
- * @net_dev: pointer to the net_device
  * @num_bcasts: number of payload re-broadcasts on this interface (ARQ)
+ * @net_dev: pointer to the net_device
  * @hardif_obj: kobject of the per interface sysfs "mesh" directory
  * @refcount: number of contexts the object is used
  * @batman_adv_ptype: packet type describing packets that should be processed by
@@ -141,8 +141,8 @@ struct batadv_hard_iface {
struct list_head list;
s16 if_num;
char if_status;
-   struct net_device *net_dev;
u8 num_bcasts;
+   struct net_device *net_dev;
struct kobject *hardif_obj;
struct kref refcount;
struct packet_type batman_adv_ptype;
-- 
2.10.1
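
For illustration only (not part of the patch): the effect of moving the
one-byte num_bcasts next to the other small members can be reproduced in
userspace C. The two structs below are hypothetical and contain only the
members touched by the diff plus one following pointer; the real
struct batadv_hard_iface has more fields, and exact sizes depend on the
ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout before the patch: a pointer sits between the small members. */
struct iface_before {
	int16_t if_num;
	char if_status;
	void *net_dev;
	uint8_t num_bcasts;
	void *hardif_obj;
};

/* Layout after the patch: the small members are grouped together. */
struct iface_after {
	int16_t if_num;
	char if_status;
	uint8_t num_bcasts;
	void *net_dev;
	void *hardif_obj;
};

/* Bytes actually needed by the members, without any padding. */
#define MEMBER_BYTES \
	(sizeof(int16_t) + sizeof(char) + sizeof(uint8_t) + 2 * sizeof(void *))

size_t padding_before(void)
{
	return sizeof(struct iface_before) - MEMBER_BYTES;
}

size_t padding_after(void)
{
	return sizeof(struct iface_after) - MEMBER_BYTES;
}

/* Bytes saved per object by the reordering. */
size_t padding_saved(void)
{
	assert(sizeof(struct iface_after) <= sizeof(struct iface_before));
	return sizeof(struct iface_before) - sizeof(struct iface_after);
}
```

On a typical LP64 system I would expect the first layout to need padding
both after if_status and after num_bcasts, and the second only once
before net_dev; a tool such as pahole shows the same information for the
real kernel structure.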

[PATCH 14/17] batman-adv: Remove dev_queue_xmit return code exception

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

No caller of batadv_send_skb_to_orig expects the result to be -1 (-EPERM)
anymore when the skbuff was not consumed. Callers instead expect that the
skbuff is always consumed. Such a return code filter is therefore no
longer needed.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/send.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index b00aac7..9ea272e 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -64,8 +64,11 @@ static void batadv_send_outstanding_bcast_packet(struct work_struct *work);
  * If neigh_node is NULL, then the packet is broadcasted using hard_iface,
  * otherwise it is sent as unicast to the given neighbor.
  *
- * Return: NET_TX_DROP in case of error or the result of dev_queue_xmit(skb)
- * otherwise
+ * Regardless of the return value, the skb is consumed.
+ *
+ * Return: A negative errno code is returned on a failure. A success does not
+ * guarantee the frame will be transmitted as it may be dropped due
+ * to congestion or traffic shaping.
  */
 int batadv_send_skb_packet(struct sk_buff *skb,
   struct batadv_hard_iface *hard_iface,
@@ -73,7 +76,6 @@ int batadv_send_skb_packet(struct sk_buff *skb,
 {
struct batadv_priv *bat_priv;
struct ethhdr *ethhdr;
-   int ret;
 
bat_priv = netdev_priv(hard_iface->soft_iface);
 
@@ -111,15 +113,8 @@ int batadv_send_skb_packet(struct sk_buff *skb,
/* dev_queue_xmit() returns a negative result on error.  However on
 * congestion and traffic shaping, it drops and returns NET_XMIT_DROP
 * (which is > 0). This will not be treated as an error.
-*
-* a negative value cannot be returned because it could be interepreted
-* as not consumed skb by callers of batadv_send_skb_to_orig.
 */
-   ret = dev_queue_xmit(skb);
-   if (ret < 0)
-   ret = NET_XMIT_DROP;
-
-   return ret;
+   return dev_queue_xmit(skb);
 send_skb_err:
kfree_skb(skb);
return NET_XMIT_DROP;
-- 
2.10.1
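
For illustration only (not part of the patch): with the filter removed, a
caller can see three kinds of results, namely a negative errno code,
NET_XMIT_SUCCESS, or a positive NET_XMIT_* status such as NET_XMIT_DROP.
The userspace sketch below shows one way to tell them apart. The two
NET_XMIT_* values mirror the kernel definitions as I know them;
enum tx_outcome, classify_tx() and counts_as_dropped() are hypothetical
names for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrors the kernel's values; redefined here for a userspace build. */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01

enum tx_outcome {
	TX_SENT,		/* handed to the device queue */
	TX_QUEUE_DROPPED,	/* dropped by congestion or traffic shaping */
	TX_ERROR,		/* negative errno code from the send path */
};

enum tx_outcome classify_tx(int ret)
{
	/* NET_XMIT_DROP is positive, so it never looks like an errno. */
	assert(NET_XMIT_DROP > 0);

	if (ret < 0)
		return TX_ERROR;
	if (ret == NET_XMIT_SUCCESS)
		return TX_SENT;
	return TX_QUEUE_DROPPED;
}

/* Every non-success result counts as a dropped packet in the statistics. */
bool counts_as_dropped(int ret)
{
	return ret != NET_XMIT_SUCCESS;
}
```

The second helper matches the comparison against NET_XMIT_SUCCESS that
patch 10/17 of this series introduces in batadv_interface_tx.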

[PATCH 17/17] batman-adv: Reject unicast packet with zero/mcast dst address

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

A unicast batman-adv packet cannot be transmitted to a multicast or zero
mac address. So reject incoming packets which still have these classes of
addresses as destination mac address in the outer ethernet header.

Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/routing.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index 4f034df..6713bdf 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -364,8 +364,8 @@ int batadv_recv_icmp_packet(struct sk_buff *skb,
 
ethhdr = eth_hdr(skb);
 
-   /* packet with unicast indication but broadcast recipient */
-   if (is_broadcast_ether_addr(ethhdr->h_dest))
+   /* packet with unicast indication but non-unicast recipient */
+   if (!is_valid_ether_addr(ethhdr->h_dest))
goto free_skb;
 
/* packet with broadcast/multicast sender address */
@@ -462,8 +462,8 @@ static int batadv_check_unicast_packet(struct batadv_priv *bat_priv,
 
ethhdr = eth_hdr(skb);
 
-   /* packet with unicast indication but broadcast recipient */
-   if (is_broadcast_ether_addr(ethhdr->h_dest))
+   /* packet with unicast indication but non-unicast recipient */
+   if (!is_valid_ether_addr(ethhdr->h_dest))
return -EBADR;
 
/* packet with broadcast/multicast sender address */
-- 
2.10.1
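
To illustrate which destination addresses the new check rejects, here is a minimal userspace sketch. It assumes the kernel's is_valid_ether_addr() semantics (neither multicast nor all-zero); the demo_ helpers are hypothetical stand-ins, not the kernel functions themselves.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ETH_ALEN 6

/* Multicast (and therefore broadcast) addresses have the lowest bit
 * of the first octet set.
 */
static bool demo_is_multicast(const uint8_t *addr)
{
	return (addr[0] & 0x01) != 0;
}

/* True only for 00:00:00:00:00:00. */
static bool demo_is_zero(const uint8_t *addr)
{
	for (int i = 0; i < DEMO_ETH_ALEN; i++) {
		if (addr[i] != 0)
			return false;
	}
	return true;
}

/* Stand-in for the check the patch switches to: a usable unicast
 * destination is neither multicast nor the zero address.
 */
static bool demo_is_valid_unicast(const uint8_t *addr)
{
	return !demo_is_multicast(addr) && !demo_is_zero(addr);
}
```

The previous check only caught ff:ff:ff:ff:ff:ff, so a destination such as 01:00:5e:00:00:01 or the zero address would have passed it.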

[PATCH 15/17] batman-adv: Disallow mcast src address for data frames

2016-11-09 Thread Simon Wunderlich

From: Sven Eckelmann 

The routing checks are validating the source mac address of the outer
ethernet header. They reject every source mac address which is a broadcast
address. But they also have to reject any multicast mac addresses.

Signed-off-by: Sven Eckelmann 
[s...@simonwunderlich.de: fix commit message typo]
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/routing.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index 105d4fc..c02897b 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -368,8 +368,8 @@ int batadv_recv_icmp_packet(struct sk_buff *skb,
 	if (is_broadcast_ether_addr(ethhdr->h_dest))
 		goto free_skb;
 
-	/* packet with broadcast sender address */
-	if (is_broadcast_ether_addr(ethhdr->h_source))
+	/* packet with broadcast/multicast sender address */
+	if (is_multicast_ether_addr(ethhdr->h_source))
 		goto free_skb;
 
 	/* not for me */
@@ -466,8 +466,8 @@ static int batadv_check_unicast_packet(struct batadv_priv *bat_priv,
 	if (is_broadcast_ether_addr(ethhdr->h_dest))
 		return -EBADR;
 
-	/* packet with broadcast sender address */
-	if (is_broadcast_ether_addr(ethhdr->h_source))
+	/* packet with broadcast/multicast sender address */
+	if (is_multicast_ether_addr(ethhdr->h_source))
 		return -EBADR;
 
 	/* not for me */
@@ -1159,8 +1159,8 @@ int batadv_recv_bcast_packet(struct sk_buff *skb,
 	if (!is_broadcast_ether_addr(ethhdr->h_dest))
 		goto free_skb;
 
-	/* packet with broadcast sender address */
-	if (is_broadcast_ether_addr(ethhdr->h_source))
+	/* packet with broadcast/multicast sender address */
+	if (is_multicast_ether_addr(ethhdr->h_source))
 		goto free_skb;
 
 	/* ignore broadcasts sent by myself */
-- 
2.10.1
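
Why the multicast test subsumes the broadcast test for the source address can be shown with a small userspace sketch. The demo_ functions below are hypothetical stand-ins that are assumed to follow the semantics of the kernel helpers named in the diff; they are not the kernel implementations.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Old source check: matches only ff:ff:ff:ff:ff:ff. */
static bool demo_src_is_broadcast(const uint8_t *addr)
{
	static const uint8_t all_ones[DEMO_ETH_ALEN] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff
	};

	return memcmp(addr, all_ones, DEMO_ETH_ALEN) == 0;
}

/* New source check: matches any address with the group bit set,
 * which includes the broadcast address as a special case.
 */
static bool demo_src_is_multicast(const uint8_t *addr)
{
	return (addr[0] & 0x01) != 0;
}
```

Every address the old check rejected is still rejected, and group addresses such as 33:33:00:00:00:01 are now rejected as well.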

[PATCH 00/17] pull request for net-next: batman-adv 2016-11-08 v2

2016-11-09 Thread Simon Wunderlich

Hi David,

this is an updated version of yesterday's pull request. Sven made changes
according to Eric Dumazet's comments on Patch 13; everything else stayed the
same.

Please pull or let me know of any problem!

Thank you,
  Simon

The following changes since commit a283ad5066cd63f595224c7476001cfc367fdf2e:

  Merge tag 'batadv-next-for-davem-20161027' of git://git.open-mesh.org/linux-merge (2016-10-29 16:26:50 -0400)

are available in the git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batadv-next-for-davem-20161108-v2

for you to fetch changes up to 93bbaab455f30fd43911e0881a02107a17150a62:

  batman-adv: Reject unicast packet with zero/mcast dst address (2016-11-08 19:02:36 +0100)


This feature and cleanup patchset includes the following changes:

 - netlink and code cleanups by Sven Eckelmann (3 patches)

 - Cleanup and minor fixes by Linus Luessing (3 patches)

 - Speed up multicast update intervals, by Linus Luessing

 - Avoid (re)broadcast in meshes for some easy cases,
   by Linus Luessing

 - Clean up tx return state handling, by Sven Eckelmann (6 patches)

 - Fix some special mac address handling cases, by Sven Eckelmann
   (3 patches)


Linus Lüssing (5):
  batman-adv: Add wrapper for ARP reply creation
  batman-adv: Remove unnecessary lockdep in batadv_mcast_mla_list_free
  batman-adv: Remove unused skb_reset_mac_header()
  batman-adv: Use own timer for multicast TT and TVLV updates
  batman-adv: Simple (re)broadcast avoidance

Sven Eckelmann (12):
  batman-adv: Introduce missing headers for genetlink restructure
  batman-adv: Mark batadv_netlink_ops as const
  batman-adv: Close two alignment holes in batadv_hard_iface
  batman-adv: use consume_skb for non-dropped packets
  batman-adv: Count all non-success TX packets as dropped
  batman-adv: Consume skb in batadv_frag_send_packet
  batman-adv: Consume skb in batadv_send_skb_to_orig
  batman-adv: Consume skb in receive handlers
  batman-adv: Remove dev_queue_xmit return code exception
  batman-adv: Disallow mcast src address for data frames
  batman-adv: Disallow zero and mcast src address for mgmt frames
  batman-adv: Reject unicast packet with zero/mcast dst address

 net/batman-adv/bat_iv_ogm.c|  33 --
 net/batman-adv/bat_v_elp.c |  30 +++---
 net/batman-adv/bat_v_ogm.c |  71 -
 net/batman-adv/distributed-arp-table.c |  67 +++-
 net/batman-adv/fragmentation.c |  70 -
 net/batman-adv/hard-interface.c|  52 ++
 net/batman-adv/hard-interface.h|  16 +++
 net/batman-adv/main.c  |  11 +-
 net/batman-adv/main.h  |   1 +
 net/batman-adv/multicast.c |  70 ++---
 net/batman-adv/multicast.h |   6 --
 net/batman-adv/netlink.c   |   5 +-
 net/batman-adv/network-coding.c|  35 ---
 net/batman-adv/originator.c|  13 ++-
 net/batman-adv/routing.c   | 180 -
 net/batman-adv/send.c  | 140 ++---
 net/batman-adv/send.h  |   6 +-
 net/batman-adv/soft-interface.c|   6 +-
 net/batman-adv/tp_meter.c  |   6 --
 net/batman-adv/translation-table.c |   4 -
 net/batman-adv/tvlv.c  |   5 +-
 net/batman-adv/types.h |  10 +-
 22 files changed, 582 insertions(+), 255 deletions(-)

Re: [PATCH 1/6] dt-bindings: mdio-mux: Add documentation for mdio mux for NSP SoC

2016-11-09 Thread Scott Branden


One change

On 16-11-09 01:33 AM, Yendapally Reddy Dhananjaya Reddy wrote:

Add documentation for mdio mux available in Broadcom NSP SoC

Signed-off-by: Yendapally Reddy Dhananjaya Reddy 
---
 .../devicetree/bindings/net/brcm,mdio-mux-nsp.txt  | 57 ++
 1 file changed, 57 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/brcm,mdio-mux-nsp.txt

diff --git a/Documentation/devicetree/bindings/net/brcm,mdio-mux-nsp.txt 
b/Documentation/devicetree/bindings/net/brcm,mdio-mux-nsp.txt
new file mode 100644
index 000..b749a2b
--- /dev/null
+++ b/Documentation/devicetree/bindings/net/brcm,mdio-mux-nsp.txt
@@ -0,0 +1,57 @@
+Properties for an MDIO bus multiplexer available in Broadcom NSP SoC.
+
+This MDIO bus multiplexer defines buses that could access the internal
+phys as well as external to SoCs. When child bus is selected, one needs
+to select the below properties to generate desired MDIO transaction on
+appropriate bus.
+
+Required properties in addition to the generic multiplexer properties:
+
+MDIO multiplexer node:
+- compatible: brcm,mdio-mux-iproc.

This should be brcm,mdio-mux-nsp


+- reg: Should contain registers location and length.
+- reg-names: Should contain the resource reg names.
+   - bus-ctrl: mdio bus control register address space required to
+ select the bus master. This property is not required for SoC's
+ that doesn't provide master selection.
+   - mgmt-ctrl: mdio management control register address space
+
+Sub-nodes:
+   Each bus master should be represented as a sub-node.
+
+Sub-nodes required properties:
+- reg: Bus master number. Should be 0x10 to access the external mdio devices.
+- address-cells: should be 1
+- size-cells: should be 0
+
+Every non-ethernet PHY requires a compatible property so that it could be
+probed based on this compatible string.
+
+Additional information regarding generic multiplexer properties can be found
+at- Documentation/devicetree/bindings/net/mdio-mux.txt
+
+example:
+
+   mdio_mux: mdio-mux@3f190 {
+   compatible = "brcm,mdio-mux-nsp";
+   reg = <0x3f190 0x4>,
+ <0x32000 0x4>;
+   reg-names = "bus-ctrl", "mgmt-ctrl";
+   mdio-parent-bus = <>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   mdio@0 {
+   reg = <0x0>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   usb3_phy: usb3-phy@10 {
+   compatible = "brcm,nsp-usb3-phy";
+   reg = <0x10>;
+   usb3-ctrl-syscon = <_ctrl>;
+   #phy-cells = <0>;
+   status = "disabled";
+   };
+   };
+   };
