question about bpf.test_progs.fail

2018-03-11 Thread lst

hi, I have a question need your help.

I get failure "libbpf: incorrect bpf_call opcode" when running below two 
cases on v4.16-rc3:

-
test_l4lb_all();
    const char *file2 = "./test_l4lb_noinline.o";

test_xdp_noinline();
-

and from the file test_libbpf.sh, it seems libbpf can't load noinline 
functions.

-
# TODO: fix libbpf to load noinline functions
# [warning] libbpf: incorrect bpf_call opcode
#libbpf_open_file test_l4lb_noinline.o
-

They all point to bpf_object__open(filename) at last.
Here(test_progs) test "test_l4lb_noinline.o" but test_libbpf.sh don't.

So, I guess there must be some setting (like certain kernel kconfig or 
compiling) to make the test work.


Can you tell me how can I make this test(test_progs) pass?

I shall appreciate a reply at your earliest convenience.

thanks,
Shaoting Lei





Re: [pci PATCH v4 1/4] pci-iov: Add support for unmanaged SR-IOV

2018-03-11 Thread Alex Williamson
On Thu, 08 Mar 2018 11:02:29 -0800
Alexander Duyck  wrote:

> From: Alexander Duyck 
> 
> This patch is meant to add some basic functionality to support for SR-IOV
> on devices when the VFs are not managed by some other entity in the device
> other than the kernel.
> 
> A new sysfs value called sriov_unmanaged_autoprobe has been added. This
> value is used as the drivers_autoprobe setting of the VFs when they are
> being managed by an external entity such as device firmware instead of
> being managed by the kernel.
> 
> One side effect of this change is that the sriov_drivers_autoprobe and
> sriov_unmanaged_autoprobe will only apply their updates when SR-IOV VFs
> are allocated. Attempts to update them when SR-IOV is in use will only
> update the local value and will not update sriov->autoprobe.
> 
> Signed-off-by: Alexander Duyck 
> ---


I still struggle to understand why we need this "unmanaged"
complication and how a user of the sysfs API is expected to have any
idea whether a PF is managed or unmanaged and why they should care.
Can't we just have a pci_simple_sriov_configure() helper and ignore
this unmanaged business?  Thanks,

Alex


Re: [PATCH 3/4 net-next] ibmvnic: Pad small packets to minimum MTU size

2018-03-11 Thread David Miller
From: Thomas Falcon 
Date: Fri,  9 Mar 2018 13:23:56 -0600

> + /* For some backing devices, mishandling of small packets
> +  * can result in a loss of connection or TX stall. Device
> +  * architects recommend that no packet should be smaller
> +  * than the minimum MTU value provided to the driver, so
> +  * pad any packets to that length
> +  */
> + if (skb->len < netdev->min_mtu) {
> + return skb_put_padto(skb, netdev->min_mtu);
> + }

Please do not use curly braces for a single statement
basic block.

Thank you.


Re: [PATCH net-next 00/12] fix some bugs for HNS3 driver

2018-03-11 Thread David Miller
From: Peng Li 
Date: Sat, 10 Mar 2018 11:29:21 +0800

> This patchset fixes some bugs for HNS3 driver:
> [Patch 1/12 - Patch 8/12] fix various bugs for PF driver.
> [Patch 9/12 - Patch 12/12] fix issues when change the us mac address of
> PF/VF device to an existent one in the mac_vlan table.

Series applied, thank you.


Re: [PATCH net v2] openvswitch: meter: fix the incorrect calculation of max delta_t

2018-03-11 Thread David Miller
From: zhangliping 
Date: Fri,  9 Mar 2018 10:08:50 +0800

> From: zhangliping 
> 
> Max delat_t should be the full_bucket/rate instead of the full_bucket.
> Also report EINVAL if the rate is zero.
> 
> Fixes: 96fbc13d7e77 ("openvswitch: Add meter infrastructure")
> Cc: Andy Zhou 
> Signed-off-by: zhangliping 
> ---
>  V2: report EINVAL if the rate is 0 to avoid divide by zero error.

Applied and queued up for -stable, thanks.


Re: [PATCH net v4 0/2] rhashtable: Fix rhltable duplicates insertion

2018-03-11 Thread David Miller
From: Or Gerlitz 
Date: Sun, 11 Mar 2018 11:16:44 +0200

> On 3/7/2018 6:23 PM, David Miller wrote:
>> From: Paul Blakey 
>> Date: Wed,  7 Mar 2018 16:00:11 +0200
>> 
>>> On our mlx5 driver fs_core.c, we use the rhltable interface to store
>>> flow groups. We noticed that sometimes we get a warning that flow group 
>>> isn't
>>> found at removal. This rare case was caused when a specific scenario 
>>> happened,
>>> insertion of a flow group with a similar match criteria (a duplicate),
>>> but only where the flow group rhash_head was second (or not first)
>>> on the relevant rhashtable bucket list.
> 
>> I already applied v3, sorry.  But I did add Herbert's ACKs.
> 
> Hi Dave,
> 
> In mlx5 we use this facility since 4.14, can you please push to
> -stable of at least 4.14 if not earlier?

Ok, queued up.


Re: [PATCH net] macvlan: filter out unsupported feature flags

2018-03-11 Thread David Miller
From: Shannon Nelson 
Date: Thu,  8 Mar 2018 16:17:23 -0800

> Adding a macvlan device on top of a lowerdev that supports
> the xfrm offloads fails with a new regression:
>   # ip link add link ens1f0 mv0 type macvlan
>   RTNETLINK answers: Operation not permitted
> 
> Tracing down the failure shows that the macvlan device inherits
> the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags
> from the lowerdev, but with no dev->xfrmdev_ops API filled
> in, it doesn't actually support xfrm.  When the request is
> made to add the new macvlan device, the XFRM listener for
> NETDEV_REGISTER calls xfrm_api_check() which fails the new
> registration because dev->xfrmdev_ops is NULL.
> 
> The macvlan creation succeeds when we filter out the ESP
> feature flags in macvlan_fix_features(), so let's filter them
> out like we're already filtering out ~NETIF_F_NETNS_LOCAL.
> When XFRM support is added in the future, we can add the flags
> into MACVLAN_FEATURES.
> 
> This same problem could crop up in the future with any other
> new feature flags, so let's filter out any flags that aren't
> defined as supported in macvlan.
> 
> Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
> Reported-by: Alexey Kodanev 
> Signed-off-by: Shannon Nelson 

Applied, thanks.


Re: [PATCH net-next 0/4] selftests: forwarding: Tweaks and a new test

2018-03-11 Thread David Miller
From: Ido Schimmel 
Date: Sun, 11 Mar 2018 09:57:21 +0200

> First patch adds a new test for VLAN-unaware bridges.
> 
> Next two patches make the tests fail in case they are missing interfaces
> or dependencies.
> 
> Last patch allows one to create the veth interfaces even without the
> optional configuration file.

Series applied, thank you.


Re: [PATCH net-next 1/3] net: ipv6: Introduce ip6_multipath_hash_policy()

2018-03-11 Thread David Ahern
On 3/11/18 12:45 AM, Ido Schimmel wrote:
> From: Petr Machata 
> 
> In order to abstract away access to the
> ipv6.sysctl.multipath_hash_policy variable, which is not available on
> systems compiled without IPv6 support, introduce a wrapper function
> ip6_multipath_hash_policy() that falls back to 0 on non-IPv6 systems.
> 
> Use this wrapper from mlxsw/spectrum_router instead of a direct
> reference.
> 
> Signed-off-by: Petr Machata 
> Signed-off-by: Ido Schimmel 
> ---
>  drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c |  2 +-
>  include/net/ipv6.h| 11 +++
>  2 files changed, 12 insertions(+), 1 deletion(-)
> 

Acked-by: David Ahern 

For consistency, would be good to use that inline everywhere the sysctl
is referenced.


Re: [PATCH net-next 4/4] selftests: forwarding: Allow creation of interfaces without a config file

2018-03-11 Thread David Ahern
On 3/11/18 12:57 AM, Ido Schimmel wrote:
> Some users want to be able to run the tests without a configuration file
> which is useful when one needs to test both virtual and physical
> interfaces on the same machine.
> 
> Move the defines that set the type of interface to create and whether to
> create it away from the optional configuration file to the library like
> the rest of the defines.
> 
> Signed-off-by: Ido Schimmel 
> Reviewed-by: Jiri Pirko 
> ---
>  tools/testing/selftests/net/forwarding/forwarding.config.sample | 9 -
>  tools/testing/selftests/net/forwarding/lib.sh   | 2 ++
>  2 files changed, 6 insertions(+), 5 deletions(-)
> 

Reviewed-by: David Ahern 




Re: [PATCH net-next 3/4] selftests: forwarding: Exit with error when missing interfaces

2018-03-11 Thread David Ahern
On 3/11/18 12:57 AM, Ido Schimmel wrote:
> Returning 0 gives a false sense of success when the required modules did
> not even manage to be initialized and register the required net devices.
> 
> Signed-off-by: Ido Schimmel 
> Reviewed-by: Jiri Pirko 
> ---
>  tools/testing/selftests/net/forwarding/lib.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: David Ahern 




Re: [PATCH net-next 2/4] selftests: forwarding: Exit with error when missing dependencies

2018-03-11 Thread David Ahern
On 3/11/18 12:57 AM, Ido Schimmel wrote:
> We already return an error when some dependencies (e.g., 'jq') are
> missing so lets be consistent and do that for all.
> 
> Signed-off-by: Ido Schimmel 
> Reviewed-by: Jiri Pirko 
> ---
>  tools/testing/selftests/net/forwarding/lib.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: David Ahern 




Re: [PATCH net-next 1/4] selftests: forwarding: Add a test for VLAN-unaware bridge

2018-03-11 Thread David Ahern
On 3/11/18 12:57 AM, Ido Schimmel wrote:
> Similar to the VLAN-aware bridge test, test the VLAN-unaware bridge and
> make sure that ping, FDB learning and flooding work as expected.
> 
> Signed-off-by: Ido Schimmel 
> Reviewed-by: Jiri Pirko 
> ---
>  .../net/forwarding/bridge_vlan_unaware.sh  | 86 
> ++
>  1 file changed, 86 insertions(+)
>  create mode 100755 
> tools/testing/selftests/net/forwarding/bridge_vlan_unaware.sh
> 

Reviewed-by: David Ahern 



Re: [RESEND PATCH] rsi: Remove stack VLA usage

2018-03-11 Thread Larry Finger

On 03/11/2018 08:43 PM, Tobin C. Harding wrote:

The kernel would like to have all stack VLA usage removed[1].  rsi uses
a VLA based on 'blksize'.  Elsewhere in the SDIO code maximum block size
is defined using a magic number.  We can use a pre-processor defined
constant and declare the array to maximum size.  We add a check before
accessing the array in case of programmer error.

[1]: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Tobin C. Harding 
---

RESEND: add wireless mailing list to CC's (requested by Kalle)

  drivers/net/wireless/rsi/rsi_91x_hal.c  | 13 +++--
  drivers/net/wireless/rsi/rsi_91x_sdio.c |  9 +++--
  2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_hal.c 
b/drivers/net/wireless/rsi/rsi_91x_hal.c
index 1176de646942..839ebdd602df 100644
--- a/drivers/net/wireless/rsi/rsi_91x_hal.c
+++ b/drivers/net/wireless/rsi/rsi_91x_hal.c
@@ -641,7 +641,7 @@ static int ping_pong_write(struct rsi_hw *adapter, u8 cmd, 
u8 *addr, u32 size)
u32 cmd_addr;
u16 cmd_resp, cmd_req;
u8 *str;
-   int status;
+   int status, ret;
  
  	if (cmd == PING_WRITE) {

cmd_addr = PING_BUFFER_ADDRESS;
@@ -655,12 +655,13 @@ static int ping_pong_write(struct rsi_hw *adapter, u8 
cmd, u8 *addr, u32 size)
str = "PONG_VALID";
}
  
-	status = hif_ops->load_data_master_write(adapter, cmd_addr, size,

+   ret = hif_ops->load_data_master_write(adapter, cmd_addr, size,
block_size, addr);
-   if (status) {
-   rsi_dbg(ERR_ZONE, "%s: Unable to write blk at addr %0x\n",
-   __func__, *addr);
-   return status;
+   if (ret) {
+   if (ret != -EINVAL)
+   rsi_dbg(ERR_ZONE, "%s: Unable to write blk at addr 
%0x\n",
+   __func__, *addr);
+   return ret;
}
  
  	status = bl_cmd(adapter, cmd_req, cmd_resp, str);

diff --git a/drivers/net/wireless/rsi/rsi_91x_sdio.c 
b/drivers/net/wireless/rsi/rsi_91x_sdio.c
index b0cf41195051..b766578b591a 100644
--- a/drivers/net/wireless/rsi/rsi_91x_sdio.c
+++ b/drivers/net/wireless/rsi/rsi_91x_sdio.c
@@ -20,6 +20,8 @@
  #include "rsi_common.h"
  #include "rsi_hal.h"
  
+#define RSI_MAX_BLOCK_SIZE 256

+
  /**
   * rsi_sdio_set_cmd52_arg() - This function prepares cmd 52 read/write arg.
   * @rw: Read/write
@@ -362,7 +364,7 @@ static int rsi_setblocklength(struct rsi_hw *adapter, u32 
length)
rsi_dbg(INIT_ZONE, "%s: Setting the block length\n", __func__);
  
  	status = sdio_set_block_size(dev->pfunction, length);

-   dev->pfunction->max_blksize = 256;
+   dev->pfunction->max_blksize = RSI_MAX_BLOCK_SIZE;
adapter->block_size = dev->pfunction->max_blksize;
  
  	rsi_dbg(INFO_ZONE,

@@ -567,9 +569,12 @@ static int rsi_sdio_load_data_master_write(struct rsi_hw 
*adapter,
  {
u32 num_blocks, offset, i;
u16 msb_address, lsb_address;
-   u8 temp_buf[block_size];
+   u8 temp_buf[RSI_MAX_BLOCK_SIZE];
int status;
  
+	if (block_size > RSI_MAX_BLOCK_SIZE)

+   return -EINVAL;
+
num_blocks = instructions_sz / block_size;
msb_address = base_address >> 16;


I am not giving this patch a negative review, but my solution to the same 
problem has been to change the on-stack array into a u8 pointer, use kmalloc() 
to assign the space, and then free that space at the end. That way large stack 
allocations are avoided, with a minimum of changes.


Larry

  





Re: [PATCH iproute2-next 0/3] ip: multicast commands JSON

2018-03-11 Thread David Ahern
On 3/8/18 7:02 PM, Stephen Hemminger wrote:
> From: Stephen Hemminger 
> 
> Some more JSON support and report better error if kernel
> is configured without multicast.
> 
> Stephen Hemminger (3):
>   ipmaddr: json and color support
>   ipmroute: convert to output JSON
>   ipmroute: better error message if no kernel mroute
> 

applied to iproute2-next



Re: [PATCH iproute2-next v4 0/4] iplink: Improve iplink_parse()

2018-03-11 Thread David Ahern
On 3/7/18 1:40 AM, Serhey Popovych wrote:
> This is main routine to parse ip-link(8) configuration parameters.
> 
> Move all code related to command line parsing and validation to it from
> iptables_modify(). As benefit we reduce number of arguments as well as
> checking for most of weired cases in single place to give benefit to
> iptables_parse() users.
> 
> See individual patch description message for more information.
> 


applied to iproute2-next.


[RESEND PATCH] rsi: Remove stack VLA usage

2018-03-11 Thread Tobin C. Harding
The kernel would like to have all stack VLA usage removed[1].  rsi uses
a VLA based on 'blksize'.  Elsewhere in the SDIO code maximum block size
is defined using a magic number.  We can use a pre-processor defined
constant and declare the array to maximum size.  We add a check before
accessing the array in case of programmer error.

[1]: https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Tobin C. Harding 
---

RESEND: add wireless mailing list to CC's (requested by Kalle)

 drivers/net/wireless/rsi/rsi_91x_hal.c  | 13 +++--
 drivers/net/wireless/rsi/rsi_91x_sdio.c |  9 +++--
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_hal.c 
b/drivers/net/wireless/rsi/rsi_91x_hal.c
index 1176de646942..839ebdd602df 100644
--- a/drivers/net/wireless/rsi/rsi_91x_hal.c
+++ b/drivers/net/wireless/rsi/rsi_91x_hal.c
@@ -641,7 +641,7 @@ static int ping_pong_write(struct rsi_hw *adapter, u8 cmd, 
u8 *addr, u32 size)
u32 cmd_addr;
u16 cmd_resp, cmd_req;
u8 *str;
-   int status;
+   int status, ret;
 
if (cmd == PING_WRITE) {
cmd_addr = PING_BUFFER_ADDRESS;
@@ -655,12 +655,13 @@ static int ping_pong_write(struct rsi_hw *adapter, u8 
cmd, u8 *addr, u32 size)
str = "PONG_VALID";
}
 
-   status = hif_ops->load_data_master_write(adapter, cmd_addr, size,
+   ret = hif_ops->load_data_master_write(adapter, cmd_addr, size,
block_size, addr);
-   if (status) {
-   rsi_dbg(ERR_ZONE, "%s: Unable to write blk at addr %0x\n",
-   __func__, *addr);
-   return status;
+   if (ret) {
+   if (ret != -EINVAL)
+   rsi_dbg(ERR_ZONE, "%s: Unable to write blk at addr 
%0x\n",
+   __func__, *addr);
+   return ret;
}
 
status = bl_cmd(adapter, cmd_req, cmd_resp, str);
diff --git a/drivers/net/wireless/rsi/rsi_91x_sdio.c 
b/drivers/net/wireless/rsi/rsi_91x_sdio.c
index b0cf41195051..b766578b591a 100644
--- a/drivers/net/wireless/rsi/rsi_91x_sdio.c
+++ b/drivers/net/wireless/rsi/rsi_91x_sdio.c
@@ -20,6 +20,8 @@
 #include "rsi_common.h"
 #include "rsi_hal.h"
 
+#define RSI_MAX_BLOCK_SIZE 256
+
 /**
  * rsi_sdio_set_cmd52_arg() - This function prepares cmd 52 read/write arg.
  * @rw: Read/write
@@ -362,7 +364,7 @@ static int rsi_setblocklength(struct rsi_hw *adapter, u32 
length)
rsi_dbg(INIT_ZONE, "%s: Setting the block length\n", __func__);
 
status = sdio_set_block_size(dev->pfunction, length);
-   dev->pfunction->max_blksize = 256;
+   dev->pfunction->max_blksize = RSI_MAX_BLOCK_SIZE;
adapter->block_size = dev->pfunction->max_blksize;
 
rsi_dbg(INFO_ZONE,
@@ -567,9 +569,12 @@ static int rsi_sdio_load_data_master_write(struct rsi_hw 
*adapter,
 {
u32 num_blocks, offset, i;
u16 msb_address, lsb_address;
-   u8 temp_buf[block_size];
+   u8 temp_buf[RSI_MAX_BLOCK_SIZE];
int status;
 
+   if (block_size > RSI_MAX_BLOCK_SIZE)
+   return -EINVAL;
+
num_blocks = instructions_sz / block_size;
msb_address = base_address >> 16;
 
-- 
2.7.4



Re: [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace

2018-03-11 Thread jiangyiwen
On 2018/3/10 4:41, Greg Kurz wrote:
> If it was interrupted by a signal, the 9p client may need to send some
> more requests to the server for cleanup before returning to userspace.
> 
> To avoid such a last minute request to be interrupted right away, the
> client memorizes if a signal is pending, clears TIF_SIGPENDING, handles
> the request and calls recalc_sigpending() before returning.
> 
> Unfortunately, if the transmission of this cleanup request fails for any
> reason, the transport returns an error and the client propagates it right
> away, without calling recalc_sigpending().
> 
> This ends up with -ERESTARTSYS from the initially interrupted request
> crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup
> request. The specific signal handling code, which is responsible for
> converting -ERESTARTSYS to -EINTR is not called, and userspace receives
> the confusing errno value:
> 
> open: Unknown error 512 (512)
> 
> This is really hard to hit in real life. I discovered the issue while
> working on hot-unplug of a virtio-9p-pci device with an instrumented
> QEMU allowing to control request completion.
> 
> Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy
> error path actually. Their code flow is a bit obscure and the best
> thing to do would probably be a full rewrite: to really ensure this
> situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can
> never happen.
> 
> But given the general lack of interest for the 9p code, I won't risk
> breaking more things. So this patch simply fixes the buggy paths in
> both functions with a trivial label+goto.
> 
> Thanks to Laurent Dufour for his help and suggestions on how to find
> the root cause and how to fix it.
> 

Looks good.

Reviewed-by: Yiwen Jiang 

> Signed-off-by: Greg Kurz 
> ---
>  net/9p/client.c |6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/9p/client.c b/net/9p/client.c
> index b433aff5ff13..e6cae8332e2e 100644
> --- a/net/9p/client.c
> +++ b/net/9p/client.c
> @@ -769,7 +769,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const 
> char *fmt, ...)
>   if (err < 0) {
>   if (err != -ERESTARTSYS && err != -EFAULT)
>   c->status = Disconnected;
> - goto reterr;
> + goto recalc_sigpending;
>   }
>  again:
>   /* Wait for the response */
> @@ -804,6 +804,7 @@ p9_client_rpc(struct p9_client *c, int8_t type, const 
> char *fmt, ...)
>   if (req->status == REQ_STATUS_RCVD)
>   err = 0;
>   }
> +recalc_sigpending:
>   if (sigpending) {
>   spin_lock_irqsave(>sighand->siglock, flags);
>   recalc_sigpending();
> @@ -867,7 +868,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client 
> *c, int8_t type,
>   if (err == -EIO)
>   c->status = Disconnected;
>   if (err != -ERESTARTSYS)
> - goto reterr;
> + goto recalc_sigpending;
>   }
>   if (req->status == REQ_STATUS_ERROR) {
>   p9_debug(P9_DEBUG_ERROR, "req_status error %d\n", req->t_err);
> @@ -885,6 +886,7 @@ static struct p9_req_t *p9_client_zc_rpc(struct p9_client 
> *c, int8_t type,
>   if (req->status == REQ_STATUS_RCVD)
>   err = 0;
>   }
> +recalc_sigpending:
>   if (sigpending) {
>   spin_lock_irqsave(>sighand->siglock, flags);
>   recalc_sigpending();
> 
> 
> .
> 




Re: [RFC] netfilter: cttimeout: remove VLA in ctnl_timeout_parse_policy

2018-03-11 Thread Gustavo A. R. Silva



On 03/11/2018 05:21 PM, Pablo Neira Ayuso wrote:

On Sun, Mar 11, 2018 at 05:12:09PM -0500, Gustavo A. R. Silva wrote:

Hi Pablo,

On 03/11/2018 05:04 PM, Pablo Neira Ayuso wrote:

On Tue, Mar 06, 2018 at 12:47:55PM -0600, Gustavo A. R. Silva wrote:

In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.


Looks good but...


Signed-off-by: Gustavo A. R. Silva 
---
   net/netfilter/nfnetlink_cttimeout.c | 12 ++--
   1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nfnetlink_cttimeout.c 
b/net/netfilter/nfnetlink_cttimeout.c
index 95b0470..a2f7d92 100644
--- a/net/netfilter/nfnetlink_cttimeout.c
+++ b/net/netfilter/nfnetlink_cttimeout.c
@@ -52,18 +52,26 @@ ctnl_timeout_parse_policy(void *timeouts,
  struct net *net, const struct nlattr *attr)
   {
int ret = 0;
+   struct nlattr **tb = NULL;


I think we don't need to initialize this, right?



We actually do have to initialized it because in the unlikely case that the
code block inside the 'if' below is not executed, then we will end up
freeing an uninitialized pointer.


I see, you're right indeed.

We can probably simplify this code, but just doing:

 if (!l4proto->ctnl_timeout.nlattr_to_obj))
 return 0;



I wonder if it is better to code this instead:

if (unlikely(!l4proto->ctnl_timeout.nlattr_to_obj)))
return 0;



 netlink attribute parsing here.

You could even remove the likely() thing, which doesn't make much
sense for control plane code.



Why is that?


I understand this is a larger change, but I think this function will
look better while we're removing VLA.

Would you mind having a look? I'd appreciate if so.



I can do that. No problem.

Thanks
--
Gustavo


Re: [RFC] netfilter: cttimeout: remove VLA in ctnl_timeout_parse_policy

2018-03-11 Thread Gustavo A. R. Silva

Hi Pablo,

On 03/11/2018 05:04 PM, Pablo Neira Ayuso wrote:

On Tue, Mar 06, 2018 at 12:47:55PM -0600, Gustavo A. R. Silva wrote:

In preparation to enabling -Wvla, remove VLA and replace it
with dynamic memory allocation.


Looks good but...


Signed-off-by: Gustavo A. R. Silva 
---
  net/netfilter/nfnetlink_cttimeout.c | 12 ++--
  1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/nfnetlink_cttimeout.c 
b/net/netfilter/nfnetlink_cttimeout.c
index 95b0470..a2f7d92 100644
--- a/net/netfilter/nfnetlink_cttimeout.c
+++ b/net/netfilter/nfnetlink_cttimeout.c
@@ -52,18 +52,26 @@ ctnl_timeout_parse_policy(void *timeouts,
  struct net *net, const struct nlattr *attr)
  {
int ret = 0;
+   struct nlattr **tb = NULL;


I think we don't need to initialize this, right?



We actually do have to initialized it because in the unlikely case that 
the code block inside the 'if' below is not executed, then we will end 
up freeing an uninitialized pointer.


Thanks
--
Gustavo

  
  	if (likely(l4proto->ctnl_timeout.nlattr_to_obj)) {

-   struct nlattr *tb[l4proto->ctnl_timeout.nlattr_max+1];
+   tb = kcalloc(l4proto->ctnl_timeout.nlattr_max + 1, sizeof(*tb),
+GFP_KERNEL);
+
+   if (!tb)
+   return -ENOMEM;
  
  		ret = nla_parse_nested(tb, l4proto->ctnl_timeout.nlattr_max,

   attr, l4proto->ctnl_timeout.nla_policy,
   NULL);
if (ret < 0)
-   return ret;
+   goto err;
  
  		ret = l4proto->ctnl_timeout.nlattr_to_obj(tb, net, timeouts);

}
+
+err:
+   kfree(tb);
return ret;
  }
  
--

2.7.4






Re: [PATCH 3/3] wlcore: Use common error handling code in wl1271_acx_sta_rate_policies()

2018-03-11 Thread Arend van Spriel

On 3/10/2018 10:33 PM, SF Markus Elfring wrote:

From: Markus Elfring 
Date: Sat, 10 Mar 2018 22:18:45 +0100

Add a jump target so that a bit of exception handling can be better reused
at the end of this function.

This issue was detected by using the Coccinelle software.


You call this an issue?


Signed-off-by: Markus Elfring 
---
  drivers/net/wireless/ti/wlcore/acx.c | 24 +++-
  1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/wireless/ti/wlcore/acx.c 
b/drivers/net/wireless/ti/wlcore/acx.c
index 1cc5bba670e1..7d37a417c756 100644
--- a/drivers/net/wireless/ti/wlcore/acx.c
+++ b/drivers/net/wireless/ti/wlcore/acx.c


[...]


ret = wl1271_cmd_configure(wl, ACX_RATE_POLICY, acx, sizeof(*acx));
-   if (ret < 0) {
-   wl1271_warning("Setting of rate policies failed: %d", ret);
-   goto out;
-   }
+   if (ret < 0)
+   goto report_failure;

-out:
+free_acx:
kfree(acx);
return ret;
+
+report_failure:
+   wl1271_warning("Setting of rate policies failed: %d", ret);
+   goto free_acx;


In my opinion you are introducing a new issue. I don't call this 
"common" in any way. It is leaning more towards "spaghetti code" [1]. 
Jumping over a label to return to it with another jump. They are not 
long jumps, but it sure does not make thing more readable. Always aim 
for simple top-to-bottom.


Regards,
Arend

[1] https://en.wikipedia.org/wiki/Spaghetti_code


  }

  int wl1271_acx_ap_rate_policy(struct wl1271 *wl, struct conf_tx_rate_class *c,





Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()

2018-03-11 Thread Tobin C. Harding
On Fri, Mar 09, 2018 at 01:10:30PM -0800, Linus Torvalds wrote:
> On Fri, Mar 9, 2018 at 12:05 PM, Kees Cook  wrote:
> > When max() is used in stack array size calculations from literal values
> > (e.g. "char foo[max(sizeof(struct1), sizeof(struct2))]", the compiler
> > thinks this is a dynamic calculation due to the single-eval logic, which
> > is not needed in the literal case. This change removes several accidental
> > stack VLAs from an x86 allmodconfig build:
> 
> Ok, looks good.
> 
> I just have a couple of questions about applying it.
> 
> In particular, if this will help people working on getting rid of
> VLA's in the short term, I can apply it directly. But if people who
> are looking at it (anybody else than Kees?) don't much care, then this
> might be a 4.17 thing or at least "random -mm queue"?

It's easy enough to work on the other VLA removals without basing on
this, no rush.


thanks,
Tobin.


[PATCH] net/mlx4_en: Fix a memory leak in case of error in 'mlx4_en_init_netdev()'

2018-03-11 Thread Christophe JAILLET
If 'kzalloc' fails, we must free some memory before returning.

Fixes: 67f8b1dcb9ee ("net/mlx4_en: Refactor the XDP forwarding rings scheme")
Signed-off-by: Christophe JAILLET 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c 
b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 8fc51bc29003..f9db018e858f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -3327,7 +3327,7 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int 
port,
if (!priv->tx_cq[t]) {
kfree(priv->tx_ring[t]);
err = -ENOMEM;
-   goto out;
+   goto err_free_tx;
}
}
priv->rx_ring_num = prof->rx_ring_num;
-- 
2.14.1


---
L'absence de virus dans ce courrier électronique a été vérifiée par le logiciel 
antivirus Avast.
https://www.avast.com/antivirus



Re: KASAN: use-after-free Read in get_work_pool

2018-03-11 Thread Tom Herbert
On Sun, Mar 11, 2018 at 2:34 PM, Eric Biggers  wrote:
> On Wed, Feb 14, 2018 at 02:45:05PM +0100, 'Dmitry Vyukov' via syzkaller-bugs 
> wrote:
>> On Wed, Dec 6, 2017 at 1:50 PM, Dmitry Vyukov  wrote:
>> > On Fri, Oct 27, 2017 at 11:18 PM, Cong Wang  
>> > wrote:
>> >> On Thu, Oct 26, 2017 at 11:00 PM, Dmitry Vyukov  
>> >> wrote:
>> >>> On Thu, Oct 26, 2017 at 7:58 PM, Tejun Heo  wrote:
>>  Hello,
>> 
>>  On Thu, Oct 26, 2017 at 09:35:44AM -0700, syzbot wrote:
>> > BUG: KASAN: use-after-free in __read_once_size
>> > include/linux/compiler.h:276 [inline]
>> > BUG: KASAN: use-after-free in atomic64_read
>> > arch/x86/include/asm/atomic64_64.h:21 [inline]
>> > BUG: KASAN: use-after-free in atomic_long_read
>> > include/asm-generic/atomic-long.h:44 [inline]
>> > BUG: KASAN: use-after-free in get_work_pool+0x1c2/0x1e0
>> > kernel/workqueue.c:709
>> > Read of size 8 at addr 8801cc58c378 by task syz-executor5/21326
>> >
>> > CPU: 1 PID: 21326 Comm: syz-executor5 Not tainted 4.13.0+ #43
>> > Hardware name: Google Google Compute Engine/Google Compute Engine,
>> > BIOS Google 01/01/2011
>> > Call Trace:
>> >  __dump_stack lib/dump_stack.c:16 [inline]
>> >  dump_stack+0x194/0x257 lib/dump_stack.c:52
>> >  print_address_description+0x73/0x250 mm/kasan/report.c:252
>> >  kasan_report_error mm/kasan/report.c:351 [inline]
>> >  kasan_report+0x24e/0x340 mm/kasan/report.c:409
>> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
>> >  __read_once_size include/linux/compiler.h:276 [inline]
>> >  atomic64_read arch/x86/include/asm/atomic64_64.h:21 [inline]
>> >  atomic_long_read include/asm-generic/atomic-long.h:44 [inline]
>> >  get_work_pool+0x1c2/0x1e0 kernel/workqueue.c:709
>> >  __queue_work+0x235/0x1150 kernel/workqueue.c:1401
>> >  queue_work_on+0x16a/0x1c0 kernel/workqueue.c:1486
>> >  queue_work include/linux/workqueue.h:489 [inline]
>> >  strp_check_rcv+0x25/0x30 net/strparser/strparser.c:553
>> >  kcm_attach net/kcm/kcmsock.c:1439 [inline]
>> >  kcm_attach_ioctl net/kcm/kcmsock.c:1460 [inline]
>> >  kcm_ioctl+0x826/0x1610 net/kcm/kcmsock.c:1695
>> >  sock_do_ioctl+0x65/0xb0 net/socket.c:961
>> >  sock_ioctl+0x2c2/0x440 net/socket.c:1058
>> >  vfs_ioctl fs/ioctl.c:45 [inline]
>> >  do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:685
>> >  SYSC_ioctl fs/ioctl.c:700 [inline]
>> >  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
>> >  entry_SYSCALL_64_fastpath+0x1f/0xbe
>> 
>>  Looks like kcm is trying to reuse a work item whose last workqueue has
>>  been destroyed without re-initing it.  A work item needs to be
>>  reinit'd.
>> >>>
>> >>> +kcm maintainers
>> >>
>> >> Can you try the fix below? There is no C reproducer so I can't verify it.
>> >
>> >
>> > Hi Cong,
>> >
>> > syzbot can now test proposed patches, see
>> > https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
>> > for details. Please give it a try.
>> >
>> >> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
>> >> index af4e76ac88ff..7816f44c576a 100644
>> >> --- a/net/kcm/kcmsock.c
>> >> +++ b/net/kcm/kcmsock.c
>> >> @@ -1433,11 +1433,12 @@ static int kcm_attach(struct socket *sock,
>> >> struct socket *csock,
>> >> KCM_STATS_INCR(mux->stats.psock_attach);
>> >> mux->psocks_cnt++;
>> >> psock_now_avail(psock);
>> >> -   spin_unlock_bh(>lock);
>> >>
>> >> /* Schedule RX work in case there are already bytes queued */
>> >> strp_check_rcv(>strp);
>> >>
>> >> +   spin_unlock_bh(>lock);
>> >> +
>> >> return 0;
>> >>  }
>>
>>
>> Hi Cong,
>>
>> Was this ever merged? Is it still necessary?
>>
>
> syzbot is no longer hitting this bug for some reason but it's still there.  
> Tom,
> it looks like you wrote the buggy code (it's yet another KCM bug, apparently);
> can you please look into it?
>
Yes. Thank you for the simple reproducer.

Tom

> I've put together a C reproducer that works on latest linux-next 
> (next-20180309,
> commit 61530b14b059d).  It works as an unprivileged user provided that KCM is
> enabled, and that KASAN is enabled so you see the use-after-free report:
>
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
>
> int main()
> {
> union bpf_attr prog = {
> .prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
> .insn_cnt = 2,
> .insns = (__u64)(__u64[]){ 0xb7, 0x95 },
> .license = (__u64)"",
> };
> int tcp_fd, bpf_fd, kcm_fd;
> struct sockaddr_in addr = {
> .sin_family = AF_INET,
> .sin_port = __constant_htons(3270),
> .sin_addr = { __constant_htonl(0x7f01) }
> };
>
> tcp_fd = socket(AF_INET, 

Re: [PATCH v2] net: netfilter: Replace printk() with appropriate pr_*() macro

2018-03-11 Thread Florian Westphal
Arushi Singhal  wrote:
> On Mon, Mar 12, 2018 at 2:17 AM, Pablo Neira Ayuso 
> wrote:
> 
> > Hi Joe,
> >
> > On Sun, Mar 11, 2018 at 12:52:41PM -0700, Joe Perches wrote:
> > > On Mon, 2018-03-12 at 01:11 +0530, Arushi Singhal wrote:
> > > > Using pr_() is more concise than
> > > > printk(KERN_).
> > > > Replace printks having a log level with the appropriate
> > > > pr_*() macros.
> > > >
> > > > Signed-off-by: Arushi Singhal 
> > > > ---
> > > > changes in v2
> > > > *in v1 printk() were replaced with netdev_*()
> > >
> > > >  net/netfilter/nf_conntrack_acct.c  | 2 +-
> > > >  net/netfilter/nf_conntrack_ecache.c| 2 +-
> > > >  net/netfilter/nf_conntrack_timestamp.c | 2 +-
> > > >  net/netfilter/nf_nat_core.c| 2 +-
> > > >  net/netfilter/nfnetlink_queue.c| 4 ++--
> > > >  5 files changed, 6 insertions(+), 6 deletions(-)
> > >
> > > None of these files have a #define for pr_fmt so this
> > > should be OK.
> >
> > I think Arushi could add pr_fmt in the same go, so we skip another
> > follow up patch for this. @Arushi: I suggested this in my previous
> > email, please have a look.
> >
> > Hello Pablo
> 
> Should I send two patches, one with the conversion of printk() to pr_() and
> another for defining pr_fmt().
> 
> Or
> 
> only one patch with all the changes?

Both in one, it reduces code churn.

With pr_* + pr_fmt, the module name will be prefixed automatically, see
e.g. commit e016c5e43db51875c2b541b59bd217494d213174 as an example.

> > This would also probably allows us to save the line break in the error
> > message, which IIRC is not a good practise either, eg.
> >
> > pr_warn("nf_queue: OOM "
> > "in mangle, dropping packet\n");
> >
> > > Perhaps coalesce the formats and remove the unnecessary periods too.

The above message should be removed, its useless (on oom allocator
already warns).


Re: [PATCH v2] net: netfilter: Replace printk() with appropriate pr_*() macro

2018-03-11 Thread Pablo Neira Ayuso
On Mon, Mar 12, 2018 at 03:56:15AM +0530, Arushi Singhal wrote:
> On Mon, Mar 12, 2018 at 2:17 AM, Pablo Neira Ayuso 
> wrote:
> 
> > Hi Joe,
> >
> > On Sun, Mar 11, 2018 at 12:52:41PM -0700, Joe Perches wrote:
> > > On Mon, 2018-03-12 at 01:11 +0530, Arushi Singhal wrote:
> > > > Using pr_() is more concise than
> > > > printk(KERN_).
> > > > Replace printks having a log level with the appropriate
> > > > pr_*() macros.
> > > >
> > > > Signed-off-by: Arushi Singhal 
> > > > ---
> > > > changes in v2
> > > > *in v1 printk() were replaced with netdev_*()
> > >
> > > >  net/netfilter/nf_conntrack_acct.c  | 2 +-
> > > >  net/netfilter/nf_conntrack_ecache.c| 2 +-
> > > >  net/netfilter/nf_conntrack_timestamp.c | 2 +-
> > > >  net/netfilter/nf_nat_core.c| 2 +-
> > > >  net/netfilter/nfnetlink_queue.c| 4 ++--
> > > >  5 files changed, 6 insertions(+), 6 deletions(-)
> > >
> > > None of these files have a #define for pr_fmt so this
> > > should be OK.
> >
> > I think Arushi could add pr_fmt in the same go, so we skip another
> > follow up patch for this. @Arushi: I suggested this in my previous
> > email, please have a look.
> >
> > Hello Pablo
> 
> Should I send two patches, one with the conversion of printk() to pr_() and
> another for defining pr_fmt().
> 
> Or
> 
> only one patch with all the changes?

I think adding pr_fmt and use pr_() belongs to the same logical
change, so one patch for this is fine.

Thanks Arushi.

P.S: Please, just send your patch netfilter-de...@vger.kernel.org
next time, no need to Cc every list. Thanks!


Re: [RFC] netfilter: cttimeout: remove VLA in ctnl_timeout_parse_policy

2018-03-11 Thread Pablo Neira Ayuso
On Sun, Mar 11, 2018 at 05:12:09PM -0500, Gustavo A. R. Silva wrote:
> Hi Pablo,
> 
> On 03/11/2018 05:04 PM, Pablo Neira Ayuso wrote:
> > On Tue, Mar 06, 2018 at 12:47:55PM -0600, Gustavo A. R. Silva wrote:
> > > In preparation to enabling -Wvla, remove VLA and replace it
> > > with dynamic memory allocation.
> > 
> > Looks good but...
> > 
> > > Signed-off-by: Gustavo A. R. Silva 
> > > ---
> > >   net/netfilter/nfnetlink_cttimeout.c | 12 ++--
> > >   1 file changed, 10 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/net/netfilter/nfnetlink_cttimeout.c 
> > > b/net/netfilter/nfnetlink_cttimeout.c
> > > index 95b0470..a2f7d92 100644
> > > --- a/net/netfilter/nfnetlink_cttimeout.c
> > > +++ b/net/netfilter/nfnetlink_cttimeout.c
> > > @@ -52,18 +52,26 @@ ctnl_timeout_parse_policy(void *timeouts,
> > > struct net *net, const struct nlattr *attr)
> > >   {
> > >   int ret = 0;
> > > + struct nlattr **tb = NULL;
> > 
> > I think we don't need to initialize this, right?
> > 
> 
> We actually do have to initialized it because in the unlikely case that the
> code block inside the 'if' below is not executed, then we will end up
> freeing an uninitialized pointer.

I see, you're right indeed.

We can probably simplify this code, but just doing:

if (!l4proto->ctnl_timeout.nlattr_to_obj))
return 0;

netlink attribute parsing here.

You could even remove the likely() thing, which doesn't make much
sense for control plane code.

I understand this is a larger change, but I think this function will
look better while we're removing VLA.

Would you mind having a look? I'd appreciate if so.

Thanks.


Re: [RFC] netfilter: cttimeout: remove VLA in ctnl_timeout_parse_policy

2018-03-11 Thread Pablo Neira Ayuso
On Tue, Mar 06, 2018 at 12:47:55PM -0600, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wvla, remove VLA and replace it
> with dynamic memory allocation.

Looks good but...

> Signed-off-by: Gustavo A. R. Silva 
> ---
>  net/netfilter/nfnetlink_cttimeout.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/nfnetlink_cttimeout.c 
> b/net/netfilter/nfnetlink_cttimeout.c
> index 95b0470..a2f7d92 100644
> --- a/net/netfilter/nfnetlink_cttimeout.c
> +++ b/net/netfilter/nfnetlink_cttimeout.c
> @@ -52,18 +52,26 @@ ctnl_timeout_parse_policy(void *timeouts,
> struct net *net, const struct nlattr *attr)
>  {
>   int ret = 0;
> + struct nlattr **tb = NULL;

I think we don't need to initialize this, right?

>  
>   if (likely(l4proto->ctnl_timeout.nlattr_to_obj)) {
> - struct nlattr *tb[l4proto->ctnl_timeout.nlattr_max+1];
> + tb = kcalloc(l4proto->ctnl_timeout.nlattr_max + 1, sizeof(*tb),
> +  GFP_KERNEL);
> +
> + if (!tb)
> + return -ENOMEM;
>  
>   ret = nla_parse_nested(tb, l4proto->ctnl_timeout.nlattr_max,
>  attr, l4proto->ctnl_timeout.nla_policy,
>  NULL);
>   if (ret < 0)
> - return ret;
> + goto err;
>  
>   ret = l4proto->ctnl_timeout.nlattr_to_obj(tb, net, timeouts);
>   }
> +
> +err:
> + kfree(tb);
>   return ret;
>  }
>  
> -- 
> 2.7.4
> 


Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-11 Thread Luis R. Rodriguez
On Sat, Mar 10, 2018 at 02:08:43PM +, Luis R. Rodriguez wrote:
> The alternative to this would be a simple equivalent of 
> try_then_request_module()
> for UMH modules: try_umhm_then_request_umh_module() or whatever. So just as I
> argued earlier over UMH limitations, this is not the end of the world for umh
> modules, and it doesn't mean you can't get *properly* add umh modules 
> upstream,
> it would *just mean* we'd be perpetuating today's (IMHO) horrible and loose
> semantics.

I was about to suggest that perhaps a try_umhm_then_request_umh_module() or
whatever should not be a macro -- but instead an actual routine, and we don't
export say the simple form to avoid non-deterministic uses of it from the
start... but the thing is *it'd have to be a macro* given that the *check* for
the module *has to be loose*, just as try_then_request_module()...

*Ugh* gross.

Another reason for me to want an actual deterministic clean proper solution
from the start.

  Luis


Re: KASAN: use-after-free Read in get_work_pool

2018-03-11 Thread Eric Biggers
On Wed, Feb 14, 2018 at 02:45:05PM +0100, 'Dmitry Vyukov' via syzkaller-bugs 
wrote:
> On Wed, Dec 6, 2017 at 1:50 PM, Dmitry Vyukov  wrote:
> > On Fri, Oct 27, 2017 at 11:18 PM, Cong Wang  
> > wrote:
> >> On Thu, Oct 26, 2017 at 11:00 PM, Dmitry Vyukov  wrote:
> >>> On Thu, Oct 26, 2017 at 7:58 PM, Tejun Heo  wrote:
>  Hello,
> 
>  On Thu, Oct 26, 2017 at 09:35:44AM -0700, syzbot wrote:
> > BUG: KASAN: use-after-free in __read_once_size
> > include/linux/compiler.h:276 [inline]
> > BUG: KASAN: use-after-free in atomic64_read
> > arch/x86/include/asm/atomic64_64.h:21 [inline]
> > BUG: KASAN: use-after-free in atomic_long_read
> > include/asm-generic/atomic-long.h:44 [inline]
> > BUG: KASAN: use-after-free in get_work_pool+0x1c2/0x1e0
> > kernel/workqueue.c:709
> > Read of size 8 at addr 8801cc58c378 by task syz-executor5/21326
> >
> > CPU: 1 PID: 21326 Comm: syz-executor5 Not tainted 4.13.0+ #43
> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > BIOS Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:16 [inline]
> >  dump_stack+0x194/0x257 lib/dump_stack.c:52
> >  print_address_description+0x73/0x250 mm/kasan/report.c:252
> >  kasan_report_error mm/kasan/report.c:351 [inline]
> >  kasan_report+0x24e/0x340 mm/kasan/report.c:409
> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:430
> >  __read_once_size include/linux/compiler.h:276 [inline]
> >  atomic64_read arch/x86/include/asm/atomic64_64.h:21 [inline]
> >  atomic_long_read include/asm-generic/atomic-long.h:44 [inline]
> >  get_work_pool+0x1c2/0x1e0 kernel/workqueue.c:709
> >  __queue_work+0x235/0x1150 kernel/workqueue.c:1401
> >  queue_work_on+0x16a/0x1c0 kernel/workqueue.c:1486
> >  queue_work include/linux/workqueue.h:489 [inline]
> >  strp_check_rcv+0x25/0x30 net/strparser/strparser.c:553
> >  kcm_attach net/kcm/kcmsock.c:1439 [inline]
> >  kcm_attach_ioctl net/kcm/kcmsock.c:1460 [inline]
> >  kcm_ioctl+0x826/0x1610 net/kcm/kcmsock.c:1695
> >  sock_do_ioctl+0x65/0xb0 net/socket.c:961
> >  sock_ioctl+0x2c2/0x440 net/socket.c:1058
> >  vfs_ioctl fs/ioctl.c:45 [inline]
> >  do_vfs_ioctl+0x1b1/0x1530 fs/ioctl.c:685
> >  SYSC_ioctl fs/ioctl.c:700 [inline]
> >  SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
> >  entry_SYSCALL_64_fastpath+0x1f/0xbe
> 
>  Looks like kcm is trying to reuse a work item whose last workqueue has
>  been destroyed without re-initing it.  A work item needs to be
>  reinit'd.
> >>>
> >>> +kcm maintainers
> >>
> >> Can you try the fix below? There is no C reproducer so I can't verify it.
> >
> >
> > Hi Cong,
> >
> > syzbot can now test proposed patches, see
> > https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
> > for details. Please give it a try.
> >
> >> diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
> >> index af4e76ac88ff..7816f44c576a 100644
> >> --- a/net/kcm/kcmsock.c
> >> +++ b/net/kcm/kcmsock.c
> >> @@ -1433,11 +1433,12 @@ static int kcm_attach(struct socket *sock,
> >> struct socket *csock,
> >> KCM_STATS_INCR(mux->stats.psock_attach);
> >> mux->psocks_cnt++;
> >> psock_now_avail(psock);
> >> -   spin_unlock_bh(>lock);
> >>
> >> /* Schedule RX work in case there are already bytes queued */
> >> strp_check_rcv(>strp);
> >>
> >> +   spin_unlock_bh(>lock);
> >> +
> >> return 0;
> >>  }
> 
> 
> Hi Cong,
> 
> Was this ever merged? Is it still necessary?
> 

syzbot is no longer hitting this bug for some reason but it's still there.  Tom,
it looks like you wrote the buggy code (it's yet another KCM bug, apparently);
can you please look into it?

I've put together a C reproducer that works on latest linux-next (next-20180309,
commit 61530b14b059d).  It works as an unprivileged user provided that KCM is
enabled, and that KASAN is enabled so you see the use-after-free report:

#include 
#include 
#include 
#include 
#include 
#include 
#include 

int main()
{
union bpf_attr prog = {
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.insn_cnt = 2,
.insns = (__u64)(__u64[]){ 0xb7, 0x95 },
.license = (__u64)"",
};
int tcp_fd, bpf_fd, kcm_fd;
struct sockaddr_in addr = {
.sin_family = AF_INET,
.sin_port = __constant_htons(3270),
.sin_addr = { __constant_htonl(0x7f01) }
};

tcp_fd = socket(AF_INET, SOCK_STREAM, 0);
bind(tcp_fd, (void *), sizeof(addr));
listen(tcp_fd, 1);
tcp_fd = socket(AF_INET, SOCK_STREAM, 0);
connect(tcp_fd, (void *), sizeof(addr));
bpf_fd = syscall(__NR_bpf, BPF_PROG_LOAD, , 48);
kcm_fd = socket(AF_KCM, 

[PATCH] net: llc: drop VLA in llc_sap_mcast()

2018-03-11 Thread Salvatore Mesoraca
Avoid a VLA[1] by using a real constant expression instead of a variable.
The compiler should be able to optimize the original code and avoid using
an actual VLA. Anyway this change is useful because it will avoid a false
positive with -Wvla, it might also help the compiler generating better
code.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca 
---
 net/llc/llc_sap.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/llc/llc_sap.c b/net/llc/llc_sap.c
index d90928f..a7f7b8f 100644
--- a/net/llc/llc_sap.c
+++ b/net/llc/llc_sap.c
@@ -394,8 +394,9 @@ static void llc_sap_mcast(struct llc_sap *sap,
  const struct llc_addr *laddr,
  struct sk_buff *skb)
 {
-   int i = 0, count = 256 / sizeof(struct sock *);
-   struct sock *sk, *stack[count];
+   int i = 0;
+   struct sock *sk;
+   struct sock *stack[256 / sizeof(struct sock *)];
struct llc_sock *llc;
struct hlist_head *dev_hb = llc_sk_dev_hash(sap, skb->dev->ifindex);
 
@@ -408,7 +409,7 @@ static void llc_sap_mcast(struct llc_sap *sap,
continue;
 
sock_hold(sk);
-   if (i < count)
+   if (i < ARRAY_SIZE(stack))
stack[i++] = sk;
else {
llc_do_mcast(sap, skb, stack, i);
-- 
1.9.1



[PATCH 2/2] net: rds: drop VLA in rds_walk_conn_path_info()

2018-03-11 Thread Salvatore Mesoraca
Avoid VLA[1] by using an already allocated buffer passed
by the caller.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca 
---
 net/rds/connection.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/rds/connection.c b/net/rds/connection.c
index f80792c..abef75d 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -578,9 +578,9 @@ static void rds_walk_conn_path_info(struct socket *sock, 
unsigned int len,
struct rds_info_iterator *iter,
struct rds_info_lengths *lens,
int (*visitor)(struct rds_conn_path *, void 
*),
+   u64 *buffer,
size_t item_len)
 {
-   u64  buffer[(item_len + 7) / 8];
struct hlist_head *head;
struct rds_connection *conn;
size_t i;
@@ -649,8 +649,11 @@ static void rds_conn_info(struct socket *sock, unsigned 
int len,
  struct rds_info_iterator *iter,
  struct rds_info_lengths *lens)
 {
+   u64 buffer[(sizeof(struct rds_info_connection) + 7) / 8];
+
rds_walk_conn_path_info(sock, len, iter, lens,
rds_conn_info_visitor,
+   buffer,
sizeof(struct rds_info_connection));
 }
 
-- 
1.9.1



[PATCH 1/2] net: rds: drop VLA in rds_for_each_conn_info()

2018-03-11 Thread Salvatore Mesoraca
Avoid VLA[1] by using an already allocated buffer passed
by the caller.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca 
---
 net/rds/connection.c | 2 +-
 net/rds/ib.c | 3 +++
 net/rds/rds.h| 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/rds/connection.c b/net/rds/connection.c
index 2da3176..f80792c 100644
--- a/net/rds/connection.c
+++ b/net/rds/connection.c
@@ -540,9 +540,9 @@ void rds_for_each_conn_info(struct socket *sock, unsigned 
int len,
  struct rds_info_iterator *iter,
  struct rds_info_lengths *lens,
  int (*visitor)(struct rds_connection *, void *),
+ u64 *buffer,
  size_t item_len)
 {
-   uint64_t buffer[(item_len + 7) / 8];
struct hlist_head *head;
struct rds_connection *conn;
size_t i;
diff --git a/net/rds/ib.c b/net/rds/ib.c
index 50a88f3..02deee2 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -321,8 +321,11 @@ static void rds_ib_ic_info(struct socket *sock, unsigned 
int len,
   struct rds_info_iterator *iter,
   struct rds_info_lengths *lens)
 {
+   u64 buffer[(sizeof(struct rds_info_rdma_connection) + 7) / 8];
+
rds_for_each_conn_info(sock, len, iter, lens,
rds_ib_conn_info_visitor,
+   buffer,
sizeof(struct rds_info_rdma_connection));
 }
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index 7301b9b..91ea08f 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -709,6 +709,7 @@ void rds_for_each_conn_info(struct socket *sock, unsigned 
int len,
  struct rds_info_iterator *iter,
  struct rds_info_lengths *lens,
  int (*visitor)(struct rds_connection *, void *),
+ u64 *buffer,
  size_t item_len);
 
 __printf(2, 3)
-- 
1.9.1



Re: [PATCH net-next] modules: allow modprobe load regular elf binaries

2018-03-11 Thread Luis R. Rodriguez
Also,

Alexei you never answered my questions out aliases with the umh modules.
Long term this important to consider.

  Luis


Re: [PATCH] rsi: Remove stack VLA usage

2018-03-11 Thread Tobin C. Harding
On Fri, Mar 09, 2018 at 12:37:06PM +0200, Kalle Valo wrote:
> "Tobin C. Harding"  writes:
> 
> > The kernel would like to have all stack VLA usage removed[1].  rsi uses
> > a VLA based on 'blksize'.  Elsewhere in the SDIO code maximum block size
> > is defined using a magic number.  We can use a pre-processor defined
> > constant and declare the array to maximum size.  We add a check before
> > accessing the array in case of programmer error.
> >
> > [1]: https://lkml.org/lkml/2018/3/7/621
> >
> > Signed-off-by: Tobin C. Harding 
> 
> Please CC linux-wireless list, otherwise patchwork won't see the patch
> and I can't apply it.

No worries Kelle can do, will repost as [RESEND]

thanks,
Tobin.


Re: [PATCH v2] net: netfilter: Replace printk() with appropriate pr_*() macro

2018-03-11 Thread Pablo Neira Ayuso
Hi Joe,

On Sun, Mar 11, 2018 at 12:52:41PM -0700, Joe Perches wrote:
> On Mon, 2018-03-12 at 01:11 +0530, Arushi Singhal wrote:
> > Using pr_() is more concise than
> > printk(KERN_).
> > Replace printks having a log level with the appropriate
> > pr_*() macros.
> > 
> > Signed-off-by: Arushi Singhal 
> > ---
> > changes in v2
> > *in v1 printk() were replaced with netdev_*()
> 
> >  net/netfilter/nf_conntrack_acct.c  | 2 +-
> >  net/netfilter/nf_conntrack_ecache.c| 2 +-
> >  net/netfilter/nf_conntrack_timestamp.c | 2 +-
> >  net/netfilter/nf_nat_core.c| 2 +-
> >  net/netfilter/nfnetlink_queue.c| 4 ++--
> >  5 files changed, 6 insertions(+), 6 deletions(-)
> 
> None of these files have a #define for pr_fmt so this
> should be OK.

I think Arushi could add pr_fmt in the same go, so we skip another
follow up patch for this. @Arushi: I suggested this in my previous
email, please have a look.

This would also probably allows us to save the line break in the error
message, which IIRC is not a good practise either, eg.

pr_warn("nf_queue: OOM "
"in mangle, dropping packet\n");

> Perhaps coalesce the formats and remove the unnecessary periods too.

Yes, removing periods seems fine.

Thanks.


Re: [PATCH bpf-next v8 01/11] fs,security: Add a security blob to nameidata

2018-03-11 Thread Mickaël Salaün

On 02/27/2018 02:23 AM, Al Viro wrote:
> On Tue, Feb 27, 2018 at 12:57:21AM +, Al Viro wrote:
>> On Tue, Feb 27, 2018 at 01:41:11AM +0100, Mickaël Salaün wrote:
>>> The function current_nameidata_security(struct inode *) can be used to
>>> retrieve a blob's pointer address tied to the inode being walk through.
>>> This enable to follow a path lookup and know where an inode access come
>>> from. This is needed for the Landlock LSM to be able to restrict access
>>> to file path.
>>>
>>> The LSM hook nameidata_free_security(struct inode *) is called before
>>> freeing the associated nameidata.
>>
>> NAK.  Not without well-defined semantics and "some Linux S uses that for
>> something, don't ask what" does not count.
> 
> Incidentally, pathwalk mechanics is subject to change at zero notice, so
> if you want something, you'd better
>   * have explicitly defined semantics
>   * explain what it is - on fsdevel
>   * not have it hidden behind the layers of opaque LSM dreck, pardon
> the redundance.
> 
> Again, pathwalk internals have changed in the past and may bloody well
> change again in the future.  There's a damn good reason why struct nameidata
> is _not_ visible outside of fs/namei.c, and quietly relying upon any
> implementation details is no-go.
> 

I thought this whole patch series would go to linux-fsdevel but only
this patch did. I'll CCed fsdevel for the next round. Meanwhile, the
cover letter is here: https://lkml.org/lkml/2018/2/26/1214
The code using current_nameidata_lookup(inode) is in the patch 07/11:
https://lkml.org/lkml/2018/2/26/1206

To sum up, I don't know any way to identify if a directory (execute)
access was directly requested by a process or inferred by the kernel
because of a path walk. This was not needed until now because the other
access control systems (either the DAC or access controls enforced by
inode-based LSM, i.e. SELinux and Smack) do not care about the file
hierarchy. Path-based access controls (i.e. AppArmor and Tomoyo)
directly use the notion of path to define a security policy (in the
kernel, not only in the user space configuration). Landlock can't rely
on xattrs (because of composed and unprivileged access control). Because
we can't know for sure from which path an inode come from (if any),
path-based LSM hooks do not help for some file system checks (e.g.
inode_permission). With Landlock, I try to find a way to identify a set
of inodes, from the user space point of view, which is most of the time
related to file hierarchies.

I needed a way to "follow" a path walk, with the minimum amount of code,
and if possible without touching the fs/namei.c . I saw that the
pathwalk mechanism has evolved over time. With this patch, I tried to
make a kernel object (nameidata) usable in some way by LSM, but only
through an inode (current_nameidata_lookup(inode)). The "only" guarantee
of this function should be to identify if an inode is tied to a path
walk. This enable to follow a path walk and know why an inode access is
requested.

I get your concern about the "instability" of the path walk mechanism.
However, I though that a path resolution should not change from the user
space point of view, like other Linux ABI. Anyway, all the current
inode-based access controls, including DAC, rely on this path walks
mechanism. This patch does not expose anything to user space, but only
through the API of Landlock, which is currently relying on path walk
resolutions, already visible to user space. Did I miss something? Do you
have another suggestion to tie an inode to a path walk?

Thanks,
 Mickaël



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v2] net: netfilter: Replace printk() with appropriate pr_*() macro

2018-03-11 Thread Joe Perches
On Mon, 2018-03-12 at 01:11 +0530, Arushi Singhal wrote:
> Using pr_() is more concise than
> printk(KERN_).
> Replace printks having a log level with the appropriate
> pr_*() macros.
> 
> Signed-off-by: Arushi Singhal 
> ---
> changes in v2
> *in v1 printk() were replaced with netdev_*()

>  net/netfilter/nf_conntrack_acct.c  | 2 +-
>  net/netfilter/nf_conntrack_ecache.c| 2 +-
>  net/netfilter/nf_conntrack_timestamp.c | 2 +-
>  net/netfilter/nf_nat_core.c| 2 +-
>  net/netfilter/nfnetlink_queue.c| 4 ++--
>  5 files changed, 6 insertions(+), 6 deletions(-)

None of these files have a #define for pr_fmt so this
should be OK.

Perhaps coalesce the formats and remove the unnecessary periods too.

> diff --git a/net/netfilter/nf_conntrack_acct.c 
> b/net/netfilter/nf_conntrack_acct.c
[]
> @@ -80,7 +80,7 @@ static int nf_conntrack_acct_init_sysctl(struct net *net)
>   net->ct.acct_sysctl_header = register_net_sysctl(net, "net/netfilter",
>table);
>   if (!net->ct.acct_sysctl_header) {
> - printk(KERN_ERR "nf_conntrack_acct: can't register to 
> sysctl.\n");
> + pr_err("nf_conntrack_acct: can't register to sysctl.\n");
>   goto out_register;
>   }
>   return 0;

etc...

> diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
[]
> @@ -834,8 +834,8 @@ nfqnl_mangle(void *data, int data_len, struct 
> nf_queue_entry *e, int diff)
>   nskb = skb_copy_expand(e->skb, skb_headroom(e->skb),
>  diff, GFP_ATOMIC);
>   if (!nskb) {
> - printk(KERN_WARNING "nf_queue: OOM "
> -   "in mangle, dropping packet\n");
> + pr_warn("nf_queue: OOM "
> + "in mangle, dropping packet\n");
>   return -ENOMEM;
>   }
>   kfree_skb(e->skb);


[PATCH v2] net: netfilter: Replace printk() with appropriate pr_*() macro

2018-03-11 Thread Arushi Singhal
Using pr_() is more concise than
printk(KERN_).
Replace printks having a log level with the appropriate
pr_*() macros.

Signed-off-by: Arushi Singhal 
---
changes in v2
*in v1 printk() were replaced with netdev_*()

 net/netfilter/nf_conntrack_acct.c  | 2 +-
 net/netfilter/nf_conntrack_ecache.c| 2 +-
 net/netfilter/nf_conntrack_timestamp.c | 2 +-
 net/netfilter/nf_nat_core.c| 2 +-
 net/netfilter/nfnetlink_queue.c| 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nf_conntrack_acct.c 
b/net/netfilter/nf_conntrack_acct.c
index 8669167..b1c3286 100644
--- a/net/netfilter/nf_conntrack_acct.c
+++ b/net/netfilter/nf_conntrack_acct.c
@@ -80,7 +80,7 @@ static int nf_conntrack_acct_init_sysctl(struct net *net)
net->ct.acct_sysctl_header = register_net_sysctl(net, "net/netfilter",
 table);
if (!net->ct.acct_sysctl_header) {
-   printk(KERN_ERR "nf_conntrack_acct: can't register to 
sysctl.\n");
+   pr_err("nf_conntrack_acct: can't register to sysctl.\n");
goto out_register;
}
return 0;
diff --git a/net/netfilter/nf_conntrack_ecache.c 
b/net/netfilter/nf_conntrack_ecache.c
index caac41a..21a3048 100644
--- a/net/netfilter/nf_conntrack_ecache.c
+++ b/net/netfilter/nf_conntrack_ecache.c
@@ -372,7 +372,7 @@ static int nf_conntrack_event_init_sysctl(struct net *net)
net->ct.event_sysctl_header =
register_net_sysctl(net, "net/netfilter", table);
if (!net->ct.event_sysctl_header) {
-   printk(KERN_ERR "nf_ct_event: can't register to sysctl.\n");
+   pr_err("nf_ct_event: can't register to sysctl.\n");
goto out_register;
}
return 0;
diff --git a/net/netfilter/nf_conntrack_timestamp.c 
b/net/netfilter/nf_conntrack_timestamp.c
index 4c4734b..f32cc86 100644
--- a/net/netfilter/nf_conntrack_timestamp.c
+++ b/net/netfilter/nf_conntrack_timestamp.c
@@ -58,7 +58,7 @@ static int nf_conntrack_tstamp_init_sysctl(struct net *net)
net->ct.tstamp_sysctl_header = register_net_sysctl(net, "net/netfilter",
   table);
if (!net->ct.tstamp_sysctl_header) {
-   printk(KERN_ERR "nf_ct_tstamp: can't register to sysctl.\n");
+   pr_err("nf_ct_tstamp: can't register to sysctl.\n");
goto out_register;
}
return 0;
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 6c38421..dcda5ac 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -814,7 +814,7 @@ static int __init nf_nat_init(void)
ret = nf_ct_extend_register(_extend);
if (ret < 0) {
nf_ct_free_hashtable(nf_nat_bysource, nf_nat_htable_size);
-   printk(KERN_ERR "nf_nat_core: Unable to register extension\n");
+   pr_err("nf_nat_core: Unable to register extension\n");
return ret;
}
 
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index 8bba231..f5ddab1 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -834,8 +834,8 @@ nfqnl_mangle(void *data, int data_len, struct 
nf_queue_entry *e, int diff)
nskb = skb_copy_expand(e->skb, skb_headroom(e->skb),
   diff, GFP_ATOMIC);
if (!nskb) {
-   printk(KERN_WARNING "nf_queue: OOM "
- "in mangle, dropping packet\n");
+   pr_warn("nf_queue: OOM "
+   "in mangle, dropping packet\n");
return -ENOMEM;
}
kfree_skb(e->skb);
-- 
2.7.4



[PATCH 0/1] net: avoid a kernel panic during sk_busy_loop

2018-03-11 Thread Josh Elsasser
Hi Dave,

I stumbled across a reproducible kernel panic while playing around with
busy_poll on a Linux 4.9.86 kernel. There's an unfortunate interaction
between init_dummy_netdev, which doesn't bother to fill in netdev_ops, and
sk_busy_loop, which assumes netdev_ops is a valid pointer.

To reproduce on the device under test (DUT), I did:

  $ ip addr show dev wlan0
  8: wlan0:  mtu 1500 qdisc mq [...]
  inet 172.16.122.6/23 brd 172.16.123.255 scope global wlan0
  $ sysctl -w net.core.busy_read=50
  $ nc -l 172.16.122.6 5001

Then transmitted some data to this socket from a second host:

  $ echo "foo" | nc 172.16.122.6 5001

The DUT immediately hits a kernel panic.

I've attached a patch that applies cleanly to the 4.9.87 stable release.
This fix isn't necessary for net/net-next (ndo_busy_poll was removed in
linux-4.11), but a further backport of this commit is likely required for
any stable releases older than linux-4.5.

I hope this is the right way to raise something like this. I couldn't find
a clear answer from the -stable and netdev howtos for bugs against features
that no longer exist in mainline.

Thanks,
Josh


[PATCH 1/1] net: check dev->reg_state before deref of napi netdev_ops

2018-03-11 Thread Josh Elsasser
init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.

Avoid this by ensuring that napi->dev is not a dummy device before
dereferencing napi dev's netdev_ops, preventing the following panic:

  BUG: unable to handle kernel NULL pointer dereference at 00c8
  IP: [] sk_busy_loop+0x92/0x2f0
  Call Trace:
   [] ? uart_write_room+0x74/0xf0
   [] sock_poll+0x99/0xa0
   [] do_sys_poll+0x2e2/0x520
   [] ? get_page_from_freelist+0x3bc/0xa30
   [] ? update_curr+0x62/0x140
   [] ? __slab_free+0xa1/0x2a0
   [] ? __slab_free+0xa1/0x2a0
   [] ? skb_free_head+0x21/0x30
   [] ? poll_initwait+0x50/0x50
   [] ? kmem_cache_free+0x1c6/0x1e0
   [] ? uart_write+0x124/0x1d0
   [] ? remove_wait_queue+0x4d/0x60
   [] ? __wake_up+0x44/0x50
   [] ? tty_write_unlock+0x31/0x40
   [] ? tty_ldisc_deref+0x16/0x20
   [] ? tty_write+0x1e0/0x2f0
   [] ? process_echoes+0x80/0x80
   [] ? __vfs_write+0x2b/0x130
   [] ? vfs_write+0x15a/0x1a0
   [] SyS_poll+0x75/0x100
   [] entry_SYSCALL_64_fastpath+0x24/0xcf

Commit 79e7fff47b7b ("net: remove support for per driver ndo_busy_poll()")
indirectly fixed this upstream in linux-4.11 by removing the offending
pointer usage. No other users of napi->dev touch its netdev_ops.

Fixes: 060212928670 ("net: add low latency socket poll")
Fixes: ce6aea93f751 ("net: network drivers no longer need to implement 
ndo_busy_poll()") - 4.9.y
Signed-off-by: Josh Elsasser 
---
 net/core/dev.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8898618bf341..d0f67d544587 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5042,7 +5042,10 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
goto out;
 
/* Note: ndo_busy_poll method is optional in linux-4.5 */
-   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   if (napi->dev->reg_state != NETREG_DUMMY)
+   busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+   else
+   busy_poll = NULL;
 
do {
rc = 0;
-- 
2.11.0



Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()

2018-03-11 Thread Linus Torvalds
On Sun, Mar 11, 2018 at 4:05 AM, Ingo Molnar  wrote:
>
> BTW., while I fully agree with everything you said, it's not entirely correct 
> to
> claim that if a C compiler can generate VLA code it is necessarily able to 
> parse
> and evaluate constant array sizes "just fine".
>
> Constant expressions are typically parsed very early on, at the preprocessing
> stage. They can be used with some preprocessor directives as well, such as 
> '#if'
> (with some further limitations on their syntax).

Yes. But constant simplification and CSE etc is just a very
fundamental part of a compiler, and anybody who actually implements
VLA's would have to do it anyway.

So no, a message like

  warning: Array declaration is not a C90 constant expression,
resulting in VLA code generation

would be moronic. Only some completely mindless broken shit would do
"oh, it's not a parse-time constant, so it will be variable". The two
just do not follow AT ALL.

So the message might be about _possibly_ resulting in VLA code
generation, but honestly, at that point you should just add the
warning when you actually generate the code to do the stack
allocation.

Because at that point, you know whether it's variable or not.

And trust me, it won't be variable for things like (2,3), or even for
our "max()" thing with other odd builtins. Not unless the compiler
doesn't really support VLA at all (maybe some bolted-on crazy thing
that just turns a VLA at the front-end time into just an alloca), or
the user has explicitly asked to disable some fundamental optimization
phase.

   Linus


Re: [PATCH][rds-next] rds: make functions rds_info_from_znotifier and rds_message_zcopy_from_user static

2018-03-11 Thread Sowmini Varadhan
On (03/11/18 18:03), Colin King wrote:
> From: Colin Ian King 
> 
> Functions rds_info_from_znotifier and rds_message_zcopy_from_user are
> local to the source and do not need to be in global scope, so make them
> static.

the rds_message_zcopy_from_user warning was already flagged by  kbuild-robot
last week.  See
   https://www.spinics.net/lists/netdev/msg488041.html


 



[PATCH][rds-next] rds: make functions rds_info_from_znotifier and rds_message_zcopy_from_user static

2018-03-11 Thread Colin King
From: Colin Ian King 

Functions rds_info_from_znotifier and rds_message_zcopy_from_user are
local to the source and do not need to be in global scope, so make them
static.

Cleans up sparse warnins:
net/rds/message.c:70:27: warning: symbol 'rds_info_from_znotifier' was
not declared. Should it be static?
net/rds/message.c:358:5: warning: symbol 'rds_message_zcopy_from_user'
was not declared. Should it be static?

Signed-off-by: Colin Ian King 
---
 net/rds/message.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/rds/message.c b/net/rds/message.c
index 9c41bdd9e444..7448fa192543 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -67,7 +67,8 @@ static inline bool rds_zcookie_add(struct rds_msg_zcopy_info 
*info, u32 cookie)
return true;
 }
 
-struct rds_msg_zcopy_info *rds_info_from_znotifier(struct rds_znotifier 
*znotif)
+static struct rds_msg_zcopy_info *
+rds_info_from_znotifier(struct rds_znotifier *znotif)
 {
return container_of(znotif, struct rds_msg_zcopy_info, znotif);
 }
@@ -355,7 +356,8 @@ struct rds_message *rds_message_map_pages(unsigned long 
*page_addrs, unsigned in
return rm;
 }
 
-int rds_message_zcopy_from_user(struct rds_message *rm, struct iov_iter *from)
+static int rds_message_zcopy_from_user(struct rds_message *rm,
+  struct iov_iter *from)
 {
struct scatterlist *sg;
int ret = 0;
-- 
2.15.1



[PATCH][next] lan743x: make functions lan743x_csr_read and lan743x_csr_read static

2018-03-11 Thread Colin King
From: Colin Ian King 

Functions lan743x_csr_read and lan743x_csr_read are local to the source
and do not need to be in global scope, so make them static.

Cleans up sparse warning:
drivers/net/ethernet/microchip/lan743x_main.c:56:5: warning: symbol
lan743x_csr_read' was not declared. Should it be static?
drivers/net/ethernet/microchip/lan743x_main.c:61:6: warning: symbol
'lan743x_csr_write' was not declared. Should it be static?

Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/microchip/lan743x_main.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c 
b/drivers/net/ethernet/microchip/lan743x_main.c
index 73ce3ee6a520..dd947e4dd3ce 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -53,12 +53,13 @@ static int lan743x_pci_init(struct lan743x_adapter *adapter,
return ret;
 }
 
-u32 lan743x_csr_read(struct lan743x_adapter *adapter, int offset)
+static u32 lan743x_csr_read(struct lan743x_adapter *adapter, int offset)
 {
return ioread32(>csr.csr_address[offset]);
 }
 
-void lan743x_csr_write(struct lan743x_adapter *adapter, int offset, u32 data)
+static void lan743x_csr_write(struct lan743x_adapter *adapter, int offset,
+ u32 data)
 {
iowrite32(data, >csr.csr_address[offset]);
 }
-- 
2.15.1



Re: [rds-devel] [PATCH][rds-next] rds: remove redundant variable 'sg_off'

2018-03-11 Thread Sowmini Varadhan
On (03/11/18 17:27), Colin King wrote:
> Variable sg_off is assigned a value but it is never read, hence it is
> redundant and can be removed.
> 

Acked-by: Sowmini Varadhan 




[PATCH][next] lan743x: remove some redundant variables and assignments

2018-03-11 Thread Colin King
From: Colin Ian King 

Function lan743x_phy_init assigns pointer 'netdev' but this is never read
and hence it can be removed. The return error code handling can also be
cleaned up to remove the variable 'ret'.

Function lan743x_phy_link_status_change assigns pointer 'phy' twice and
this is never read, so it also can be removed.

Finally, function lan743x_tx_napi_poll initializes pointer 'adapter'
and then re-assigns the same value into this pointer a little later on
so this second assignment is redundant and can be also removed.

Cleans up clang warnings:
drivers/net/ethernet/microchip/lan743x_main.c:951:2: warning: Value
stored to 'netdev' is never read
drivers/net/ethernet/microchip/lan743x_main.c:971:3: warning: Value
stored to 'phy' is never read
drivers/net/ethernet/microchip/lan743x_main.c:1583:26: warning: Value
stored to 'adapter' during its initialization is never read

Signed-off-by: Colin Ian King 
---
 drivers/net/ethernet/microchip/lan743x_main.c | 13 +
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_main.c 
b/drivers/net/ethernet/microchip/lan743x_main.c
index 04022ec5ae1e..73ce3ee6a520 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -945,15 +945,7 @@ static void lan743x_phy_update_flowcontrol(struct 
lan743x_adapter *adapter,
 
 static int lan743x_phy_init(struct lan743x_adapter *adapter)
 {
-   struct net_device *netdev;
-   int ret;
-
-   netdev = adapter->netdev;
-   ret = lan743x_phy_reset(adapter);
-   if (ret)
-   return ret;
-
-   return 0;
+   return lan743x_phy_reset(adapter);
 }
 
 static void lan743x_phy_link_status_change(struct net_device *netdev)
@@ -964,11 +956,9 @@ static void lan743x_phy_link_status_change(struct 
net_device *netdev)
phy_print_status(phydev);
if (phydev->state == PHY_RUNNING) {
struct ethtool_link_ksettings ksettings;
-   struct lan743x_phy *phy = NULL;
int remote_advertisement = 0;
int local_advertisement = 0;
 
-   phy = >phy;
memset(, 0, sizeof(ksettings));
phy_ethtool_get_link_ksettings(netdev, );
local_advertisement = phy_read(phydev, MII_ADVERTISE);
@@ -1586,7 +1576,6 @@ static int lan743x_tx_napi_poll(struct napi_struct *napi, 
int weight)
u32 ioc_bit = 0;
u32 int_sts = 0;
 
-   adapter = tx->adapter;
ioc_bit = DMAC_INT_BIT_TX_IOC_(tx->channel_number);
int_sts = lan743x_csr_read(adapter, DMAC_INT_STS);
if (tx->vector_flags & LAN743X_VECTOR_FLAG_SOURCE_STATUS_W2C)
-- 
2.15.1



[PATCH v2] ss: introduce switch to print exact value of data rates

2018-03-11 Thread Tomasz Torcz
   Introduce -X/--exact switch to disable human-friendly printing  of 
datarates. With the switch, data is not presented as MBps/Kbps.

Changes in v2:
- change variable name into to show_human_readable



[PATCH] ss: introduce switch to print exact value of data rates

2018-03-11 Thread Tomasz Torcz
  Introduce -X/--exact switch to disable human-friendly printing
 of datarates. With the switch, data is not presented as MBps/Kbps.

  Signed-off-by: Tomasz Torcz 
---
 misc/ss.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/misc/ss.c b/misc/ss.c
index e087bef7..61c917e4 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -95,6 +95,7 @@ int resolve_services = 1;
 int preferred_family = AF_UNSPEC;
 int show_options;
 int show_details;
+int show_human_readable = 1;
 int show_users;
 int show_mem;
 int show_tcpinfo;
@@ -2276,7 +2277,9 @@ static int proc_inet_split_line(char *line, char **loc, 
char **rem, char **data)
 
 static char *sprint_bw(char *buf, double bw)
 {
-   if (bw > 100.)
+   if (!show_human_readable)
+   sprintf(buf, "%.0f", bw);
+   else if (bw > 100.)
sprintf(buf, "%.1fM", bw / 100.);
else if (bw > 1000.)
sprintf(buf, "%.1fK", bw / 1000.);
@@ -4502,6 +4505,7 @@ static void _usage(FILE *dest)
 "   -s, --summary   show socket usage summary\n"
 "   -b, --bpf   show bpf filter socket information\n"
 "   -E, --eventscontinually display sockets as they are destroyed\n"
+"   -X, --exact show exact values, instead of human-readable\n"
 "   -Z, --context   display process SELinux security contexts\n"
 "   -z, --contexts  display process and socket SELinux security contexts\n"
 "   -N, --net   switch to the specified network namespace name\n"
@@ -4634,6 +4638,7 @@ static const struct option long_opts[] = {
{ "net", 1, 0, 'N' },
{ "kill", 0, 0, 'K' },
{ "no-header", 0, 0, 'H' },
+   { "exact", 0, 0, 'X' },
{ 0 }
 
 };
@@ -4650,7 +4655,7 @@ int main(int argc, char *argv[])
int screen_width = 80;
 
while ((ch = getopt_long(argc, argv,
-"dhaletuwxnro460spbEf:miA:D:F:vVzZN:KHS",
+"dhaletuwxXnro460spbEf:miA:D:F:vVzZN:KHS",
 long_opts, NULL)) != EOF) {
switch (ch) {
case 'n':
@@ -4701,6 +4706,9 @@ int main(int argc, char *argv[])
case 'x':
filter_af_set(_filter, AF_UNIX);
break;
+   case 'X':
+   show_human_readable = 0;
+   break;
case OPT_VSOCK:
filter_af_set(_filter, AF_VSOCK);
break;
-- 
2.14.3



[PATCH][rds-next] rds: remove redundant variable 'sg_off'

2018-03-11 Thread Colin King
From: Colin Ian King 

Variable sg_off is assigned a value but it is never read, hence it is
redundant and can be removed.

Cleans up clang warning:
net/rds/message.c:373:2: warning: Value stored to 'sg_off' is never read

Signed-off-by: Colin Ian King 
---
 net/rds/message.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/rds/message.c b/net/rds/message.c
index 90dcdcfe9f62..9c41bdd9e444 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -357,7 +357,6 @@ struct rds_message *rds_message_map_pages(unsigned long 
*page_addrs, unsigned in
 
 int rds_message_zcopy_from_user(struct rds_message *rm, struct iov_iter *from)
 {
-   unsigned long sg_off;
struct scatterlist *sg;
int ret = 0;
int length = iov_iter_count(from);
@@ -370,7 +369,6 @@ int rds_message_zcopy_from_user(struct rds_message *rm, 
struct iov_iter *from)
 * now allocate and copy in the data payload.
 */
sg = rm->data.op_sg;
-   sg_off = 0; /* Dear gcc, sg->page will be null from kzalloc. */
 
info = kzalloc(sizeof(*info), GFP_KERNEL);
if (!info)
-- 
2.15.1



Re: [PATCH] net/mlx4_en: fix potential use-after-free with dma_unmap_page

2018-03-11 Thread Tariq Toukan



On 06/03/2018 10:16 PM, Sarah Newman wrote:

On 03/06/2018 08:13 AM, Tariq Toukan wrote:


I have a general question about the process.
I don't totally get what branch this patch is targeted to.
It touches critical areas in datapath and should go through regression tests 
before it is accepted to any branch.



This one is against 4.9 -
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/log/?h=linux-4.9.y

I assume you'd be interested in applying a fix to all currently maintained 
stable branches between 3.6 to 4.11, per
https://www.kernel.org/category/releases.html .



Yes.


I know of at least one other person whose workaround has been to disable GRO, 
but I don't understand the networking code well enough to guarantee that
disabling GRO means the problem will never occur under all possible error 
conditions. If it's guaranteed that disabling GRO will definitely mitigate
the bug, probably it's better to stop supporting GRO for these versions of the 
driver instead of trying to implement some other fix.


I don't think we have a guarantee, it just happens to work.
Also, disabling GRO will cause dramatic performance degradation.
We should fix this, not WA.



--Sarah



[PATCH net-next] net: phy: set link state to down when creating the phy_device

2018-03-11 Thread Heiner Kallweit
Currently the link state is initialized to "up" when the phy_device is
being created. This is not consistent with the phy state being
initialized to PHY_DOWN.

Usually this doen't do any harm because the link state is updated
once the PHY reaches state PHY_AN. However e.g. if a LAN port isn't
used and the PHY remains down this inconsistency remains and calls
to functions like phy_print_status() give false results.
Therefore change the initialization to link being down.

Signed-off-by: Heiner Kallweit 
---
 drivers/net/phy/phy_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 478405e54..b28532332 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -374,7 +374,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, 
int addr, int phy_id,
dev->duplex = -1;
dev->pause = 0;
dev->asym_pause = 0;
-   dev->link = 1;
+   dev->link = 0;
dev->interface = PHY_INTERFACE_MODE_GMII;
 
dev->autoneg = AUTONEG_ENABLE;
-- 
2.16.2



Re: [PATCH v3] kernel.h: Skip single-eval logic on literals in min()/max()

2018-03-11 Thread Ingo Molnar

* Linus Torvalds  wrote:

> So an error message like
> 
>warning: ISO C90 requires array sizes to be constant-expressions
> 
> would be technically correct and useful from a portability angle. It
> tells you when you're doing something non-portable, and should be
> automatically enabled with "-ansi -pedantic", for example.
> 
> So what's misleading is actually the name of the warning and the
> message, not that it happens. The warning isn't about "variable
> length", it's literally about the rules for what a
> "constant-expression" is.
> 
> And in C, the expression (2,3) has a constant _value_ (namely 3), but
> it isn't a constant-expression as specified by the language.
> 
> Now, the thing is that once you actually do variable length arrays,
> those old front-end rules make no sense any more (outside of the "give
> portability warnings" thing).
> 
> Because once you do variable length arrays, you obviously _parse_
> everything just fine, and you're doing to evaluate much more complex
> expressions than some limited constant-expression rule.

BTW., while I fully agree with everything you said, it's not entirely correct 
to 
claim that if a C compiler can generate VLA code it is necessarily able to 
parse 
and evaluate constant array sizes "just fine".

Constant expressions are typically parsed very early on, at the preprocessing 
stage. They can be used with some preprocessor directives as well, such as 
'#if' 
(with some further limitations on their syntax).

If VLA support is implemented in a later stage, and results in heavy-handed 
code 
generation that will technically work for constant value expressions as well 
but 
results in suboptimal code, then a warning should probably be emitted - and it 
wouldn't be pedantic.

The existing warning is still very misleading:

  warning: ISO C90 forbids variable length array ‘array’ [-Wvla]

... and if my above theory is correct then I think a better warning would be 
something like:

  warning: Array declaration is not a C90 constant expression, resulting in VLA 
code generation

... and note that in this specific case it's not misleading to talk about VLAs 
in 
the warning text, because the array size, even if it's constant value, results 
in 
VLA code generation.

I don't know whether GCC has such a limitation, but a quick experiment with GCC 
7.2 suggests that a (2,3) array size expression results in a lot more code 
being 
generated than with a constant expression.

Thanks,

Ingo


Re: [PATCH net v4 0/2] rhashtable: Fix rhltable duplicates insertion

2018-03-11 Thread Or Gerlitz
On 3/7/2018 6:23 PM, David Miller wrote:
> From: Paul Blakey 
> Date: Wed,  7 Mar 2018 16:00:11 +0200
> 
>> On our mlx5 driver fs_core.c, we use the rhltable interface to store
>> flow groups. We noticed that sometimes we get a warning that flow group isn't
>> found at removal. This rare case was caused when a specific scenario 
>> happened,
>> insertion of a flow group with a similar match criteria (a duplicate),
>> but only where the flow group rhash_head was second (or not first)
>> on the relevant rhashtable bucket list.

> I already applied v3, sorry.  But I did add Herbert's ACKs.

Hi Dave,

In mlx5 we use this facility since 4.14, can you please push to -stable of at 
least 4.14 if not earlier?

Or.