RE: [PATCH v4 11/19] scsi: megaraid: Replace PCI pool old API

2017-03-01 Thread Sumit Saxena
>-Original Message-
>From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
>ow...@vger.kernel.org] On Behalf Of Romain Perier
>Sent: Wednesday, March 01, 2017 9:25 PM
>To: Dan Williams; Doug Ledford; Sean Hefty; Hal Rosenstock;
>jeffrey.t.kirs...@intel.com; David S. Miller; stas.yakov...@gmail.com; James E.J.
>Bottomley; Martin K. Petersen; Felipe Balbi; Greg Kroah-Hartman
>Cc: linux-r...@vger.kernel.org; netdev@vger.kernel.org; linux-
>u...@vger.kernel.org; linux-s...@vger.kernel.org; linux-ker...@vger.kernel.org;
>Romain Perier; Peter Senna Tschudin
>Subject: [PATCH v4 11/19] scsi: megaraid: Replace PCI pool old API
>
>The PCI pool API is deprecated. This commit replaces the old PCI pool API
>with the appropriate functions from the DMA pool API.
>
>Signed-off-by: Romain Perier 
>Reviewed-by: Peter Senna Tschudin 
>---
> drivers/scsi/megaraid/megaraid_mbox.c   | 33 +++
> drivers/scsi/megaraid/megaraid_mm.c | 32 +++---
> drivers/scsi/megaraid/megaraid_sas_base.c   | 29 +++--
> drivers/scsi/megaraid/megaraid_sas_fusion.c | 66 +
> 4 files changed, 77 insertions(+), 83 deletions(-)
>
>diff --git a/drivers/scsi/megaraid/megaraid_mbox.c
>b/drivers/scsi/megaraid/megaraid_mbox.c
>index f0987f2..7dfc2e2 100644
>--- a/drivers/scsi/megaraid/megaraid_mbox.c
>+++ b/drivers/scsi/megaraid/megaraid_mbox.c
>@@ -1153,8 +1153,8 @@ megaraid_mbox_setup_dma_pools(adapter_t
>*adapter)
>
>
>   // Allocate memory for 16-bytes aligned mailboxes
>-  raid_dev->mbox_pool_handle = pci_pool_create("megaraid mbox pool",
>-  adapter->pdev,
>+  raid_dev->mbox_pool_handle = dma_pool_create("megaraid mbox pool",
>+  &adapter->pdev->dev,
>   sizeof(mbox64_t) + 16,
>   16, 0);
>
>@@ -1164,7 +1164,7 @@ megaraid_mbox_setup_dma_pools(adapter_t
>*adapter)
>
>   mbox_pci_blk = raid_dev->mbox_pool;
>   for (i = 0; i < MBOX_MAX_SCSI_CMDS; i++) {
>-  mbox_pci_blk[i].vaddr = pci_pool_alloc(
>+  mbox_pci_blk[i].vaddr = dma_pool_alloc(
>   raid_dev->mbox_pool_handle,
>   GFP_KERNEL,
>   &mbox_pci_blk[i].dma_addr);
>@@ -1181,8 +1181,8 @@ megaraid_mbox_setup_dma_pools(adapter_t
>*adapter)
>* share common memory pool. Passthru structures piggyback on memory
>* allocted to extended passthru since passthru is smaller of the two
>*/
>-  raid_dev->epthru_pool_handle = pci_pool_create("megaraid mbox pthru",
>-  adapter->pdev, sizeof(mraid_epassthru_t), 128, 0);
>+  raid_dev->epthru_pool_handle = dma_pool_create("megaraid mbox pthru",
>+  &adapter->pdev->dev, sizeof(mraid_epassthru_t), 128, 0);
>
>   if (raid_dev->epthru_pool_handle == NULL) {
>   goto fail_setup_dma_pool;
>@@ -1190,7 +1190,7 @@ megaraid_mbox_setup_dma_pools(adapter_t
>*adapter)
>
>   epthru_pci_blk = raid_dev->epthru_pool;
>   for (i = 0; i < MBOX_MAX_SCSI_CMDS; i++) {
>-  epthru_pci_blk[i].vaddr = pci_pool_alloc(
>+  epthru_pci_blk[i].vaddr = dma_pool_alloc(
>   raid_dev->epthru_pool_handle,
>   GFP_KERNEL,
>   &epthru_pci_blk[i].dma_addr);
>@@ -1202,8 +1202,8 @@ megaraid_mbox_setup_dma_pools(adapter_t
>*adapter)
>
>   // Allocate memory for each scatter-gather list. Request for 512 bytes
>   // alignment for each sg list
>-  raid_dev->sg_pool_handle = pci_pool_create("megaraid mbox sg",
>-  adapter->pdev,
>+  raid_dev->sg_pool_handle = dma_pool_create("megaraid mbox sg",
>+  &adapter->pdev->dev,
>   sizeof(mbox_sgl64) * MBOX_MAX_SG_SIZE,
>   512, 0);
>
>@@ -1213,7 +1213,7 @@ megaraid_mbox_setup_dma_pools(adapter_t
>*adapter)
>
>   sg_pci_blk = raid_dev->sg_pool;
>   for (i = 0; i < MBOX_MAX_SCSI_CMDS; i++) {
>-  sg_pci_blk[i].vaddr = pci_pool_alloc(
>+  sg_pci_blk[i].vaddr = dma_pool_alloc(
>   raid_dev->sg_pool_handle,
>   GFP_KERNEL,
>   &sg_pci_blk[i].dma_addr);
>@@ -1249,29 +1249,26 @@ megaraid_mbox_teardown_dma_pools(adapter_t
>*adapter)
>
>   sg_pci_blk = raid_dev->sg_pool;
>   for (i = 0; i < MBOX_MAX_SCSI_CMDS && sg_pci_blk[i].vaddr; i++) {
>-  pci_pool_free(raid_dev->sg_pool_handle, sg_pci_blk[i].vaddr,
>+  dma_pool_free(raid_dev->sg_pool_handle, sg_pci_blk[i].vaddr,
>   sg_pci_blk[i].dma_addr);
>   }
>-  if (raid_dev->sg_pool_handle)
>-  pci_pool_destroy(raid_dev->sg_pool_handle);
>+ 

[PULL] vhost: cleanups and fixes

2017-03-01 Thread Michael S. Tsirkin
The following changes since commit c470abd4fde40ea6a0846a2beab642a578c0b8cd:

  Linux 4.10 (2017-02-19 14:34:00 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to c4baad50297d84bde1a7ad45e50c73adae4a2192:

  virtio-console: avoid DMA from stack (2017-03-02 01:35:06 +0200)


virtio, vhost: optimizations, fixes

Looks like a quiet cycle for vhost/virtio, just a couple of minor
tweaks. Most notable is automatic interrupt affinity for blk and scsi.
Hopefully other devices are not far behind.

Signed-off-by: Michael S. Tsirkin 


Christoph Hellwig (9):
  virtio_pci: remove struct virtio_pci_vq_info
  virtio_pci: use shared interrupts for virtqueues
  virtio_pci: don't duplicate the msix_enable flag in struct pci_dev
  virtio_pci: simplify MSI-X setup
  virtio: allow drivers to request IRQ affinity when creating VQs
  virtio: provide a method to get the IRQ affinity mask for a virtqueue
  blk-mq: provide a default queue mapping for virtio device
  virtio_blk: use virtio IRQ affinity
  virtio_scsi: use virtio IRQ affinity

Jason Wang (2):
  vhost: try avoiding avail index access when getting descriptor
  vhost: introduce O(1) vq metadata cache

Michael S. Tsirkin (1):
  virtio_mmio: expose header to userspace

Omar Sandoval (1):
  virtio-console: avoid DMA from stack

 block/Kconfig  |   5 +
 block/Makefile |   1 +
 block/blk-mq-virtio.c  |  54 +
 drivers/block/virtio_blk.c |  14 +-
 drivers/char/virtio_console.c  |  14 +-
 drivers/crypto/virtio/virtio_crypto_core.c |   2 +-
 drivers/gpu/drm/virtio/virtgpu_kms.c   |   2 +-
 drivers/misc/mic/vop/vop_main.c|   2 +-
 drivers/net/caif/caif_virtio.c |   3 +-
 drivers/net/virtio_net.c   |   2 +-
 drivers/remoteproc/remoteproc_virtio.c |   3 +-
 drivers/rpmsg/virtio_rpmsg_bus.c   |   2 +-
 drivers/s390/virtio/kvm_virtio.c   |   3 +-
 drivers/s390/virtio/virtio_ccw.c   |   3 +-
 drivers/scsi/virtio_scsi.c | 127 +-
 drivers/vhost/vhost.c  | 173 +
 drivers/vhost/vhost.h  |   8 +
 drivers/virtio/virtio_balloon.c|   3 +-
 drivers/virtio/virtio_input.c  |   3 +-
 drivers/virtio/virtio_mmio.c   |   5 +-
 drivers/virtio/virtio_pci_common.c | 376 -
 drivers/virtio/virtio_pci_common.h |  50 +---
 drivers/virtio/virtio_pci_legacy.c |   9 +-
 drivers/virtio/virtio_pci_modern.c |  17 +-
 include/linux/blk-mq-virtio.h  |  10 +
 include/linux/cpuhotplug.h |   1 -
 include/linux/virtio_config.h  |  12 +-
 include/uapi/linux/Kbuild  |   1 +
 include/{ => uapi}/linux/virtio_mmio.h |   0
 include/uapi/linux/virtio_pci.h|   2 +-
 net/vmw_vsock/virtio_transport.c   |   3 +-
 31 files changed, 456 insertions(+), 454 deletions(-)
 create mode 100644 block/blk-mq-virtio.c
 create mode 100644 include/linux/blk-mq-virtio.h
 rename include/{ => uapi}/linux/virtio_mmio.h (100%)


Re: [PATCH v4] net: don't call strlen() on the user buffer in packet_bind_spkt()

2017-03-01 Thread David Miller
From: Alexander Potapenko 
Date: Wed,  1 Mar 2017 12:57:20 +0100

> KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
> uninitialized memory in packet_bind_spkt():
 ...
> This happens because addr.sa_data copied from the userspace is not
> zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
> results in calling strlen() on the kernel copy of that non-terminated
> buffer.
> 
> Signed-off-by: Alexander Potapenko 
> ---
> Changes since v3:
>  - addressed comments by Eric Dumazet (avoid using constants,
>use memcpy() instead of strncpy())

Applied and queued up for -stable.
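
For readers following the thread, the bug class described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the applied patch: `SA_DATA_LEN` and `copy_fixed_field()` are hypothetical names, and the 14-byte size is chosen only to resemble a fixed-size address field. The idea is to copy the fixed-size field with memcpy() into a buffer that is one byte larger and terminate it explicitly, so no string function ever scans past the field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size of a fixed-width field that need not be NUL-terminated. */
#define SA_DATA_LEN 14

/*
 * Copy a fixed-size field into dst and terminate it explicitly.
 * dst must have room for SA_DATA_LEN + 1 bytes. Unlike strlcpy(),
 * memcpy() never calls strlen() on the source, so a source without
 * a terminating NUL is never read past its end.
 */
void copy_fixed_field(char dst[SA_DATA_LEN + 1], const char src[SA_DATA_LEN])
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, SA_DATA_LEN);
    dst[SA_DATA_LEN] = '\0';
}
```

After the copy, strlen(dst) is at most SA_DATA_LEN, whatever the source contained.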


Re: [PATCH v2] net: bridge: allow IPv6 when multicast flood is disabled

2017-03-01 Thread David Miller
From: Mike Manning 
Date: Wed, 1 Mar 2017 09:55:28 +

> Even with multicast flooding turned off, IPv6 ND should still work so
> that IPv6 connectivity is provided. Allow this by continuing to flood
> multicast traffic originated by us.
> 
> Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag")
> Cc: Nikolay Aleksandrov 
> Signed-off-by: Mike Manning 

Applied and queued up for -stable.


Re: net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Eric Dumazet
On Wed, Mar 1, 2017 at 9:25 PM, Cong Wang  wrote:
> On Wed, Mar 1, 2017 at 3:15 PM, Eric Dumazet  wrote:
>> On Wed, Mar 1, 2017 at 3:09 PM, Cong Wang  wrote:
>>
>>>
>>> But I doubt skb_orphan() is the solution here, shouldn't we just
>>> update sk->sk_wmem_alloc with skb->truesize changes?
>>
>> Is it worth it? Apart from syzkaller I mean...
>>
>> We started with something that had a real impact on real workloads.
>>
>> 158f323b9868b59967ad96957c4ca388161be321 net: adjust skb->truesize in
>> pskb_expand_head()
>>
>> Note that auditing the stack took me a while.
>
> I don't know how sk refcnt could work correctly without keeping
> sk_wmem_alloc correct. We certainly could just call skb_orphan()
> if we don't need skb->sk any more, probably like the frag case,
> but for this case, the neigh one, the skbs sitting in neigh->arp_queue
> are not going to be freed except in the failure case, therefore skb->sk
> should not be orphaned so early.


There is absolutely no issue in arp/nd case.
Many skbs can sit there and it is fine.
Same with skbs sitting a long time in a qdisc.

Of course we try to not call skb_orphan() unless really needed.

tcp_gso_segment() tries very hard to propagate skb ownership to the segments,
but even something apparently easy like that took some patches before
being done right.

(for details : 0d08c42cf9a71530fef5ebcfe368f38f2dd0476f "tcp: gso: fix
truesize tracking")

conntrack reasm is mostly used in forwarding workloads, where skb->sk
is already NULL.

Are you thinking of a real workload where skb->sk _needs_ to be kept
in ipv6 reasm ?


Re: net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Cong Wang
On Wed, Mar 1, 2017 at 3:15 PM, Eric Dumazet  wrote:
> On Wed, Mar 1, 2017 at 3:09 PM, Cong Wang  wrote:
>
>>
>> But I doubt skb_orphan() is the solution here, shouldn't we just
>> update sk->sk_wmem_alloc with skb->truesize changes?
>
> Is it worth it? Apart from syzkaller I mean...
>
> We started with something that had a real impact on real workloads.
>
> 158f323b9868b59967ad96957c4ca388161be321 net: adjust skb->truesize in
> pskb_expand_head()
>
> Note that auditing the stack took me a while.

I don't know how sk refcnt could work correctly without keeping
sk_wmem_alloc correct. We certainly could just call skb_orphan()
if we don't need skb->sk any more, probably like the frag case,
but for this case, the neigh one, the skbs sitting in neigh->arp_queue
are not going to be freed except in the failure case, therefore skb->sk
should not be orphaned so early.


Re: [PATCH v4] net: don't call strlen() on the user buffer in packet_bind_spkt()

2017-03-01 Thread Cong Wang
On Wed, Mar 1, 2017 at 3:57 AM, Alexander Potapenko  wrote:
> This happens because addr.sa_data copied from the userspace is not
> zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
> results in calling strlen() on the kernel copy of that non-terminated
> buffer.

Very similar to

commit b301f2538759933cf9ff1f7c4f968da72e3f0757
Author: Pablo Neira Ayuso 
Date:   Thu Mar 24 21:29:53 2016 +0100

netfilter: x_tables: enforce nul-terminated table name from
getsockopt GET_ENTRIES


Re: [Patch net v3] ipv6: check for ip6_null_entry in __ip6_del_rt_siblings()

2017-03-01 Thread David Ahern
On 3/1/17 4:42 PM, Martin KaFai Lau wrote:
> On Wed, Mar 01, 2017 at 04:07:38PM -0800, David Ahern wrote:
>> On 3/1/17 3:16 PM, Martin KaFai Lau wrote:
>>> [ An unrelated topic.  I wonder ip -6 r del xyz::/0 would delete
>>>   the gateway route...]
>>
>> a very related question ...
>>
>> ip -6 r del x::/0 comes down to the kernel as delete '::/0' (plen is 0,
>> so cfg->fc_dst is 0). It ends up on the null_entry from fib6_locate b/c
>> of the 0 prefix length, so it is another variant of 'ip -6 ro del ::/0'
> Agree on the plen == 0 part.
> 
> I actually meant if 'ip -6 r del xyz::/0' would delete any _default_
> _gateway_ also.  By looking at ip6_route_del() alone, it seems it only

Per rtm_to_fib6_config, plen of 0 means fc_dst is 0. Meaning xyz::/0 == ::/0
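
The point that a prefix length of 0 collapses any address to ::/0 can be shown with a generic prefix-masking sketch. This is not the kernel's rtm_to_fib6_config() code; `ipv6_mask_prefix()` is a hypothetical helper written only to show why no address bits survive when plen is 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Keep the first plen bits of a 128-bit address and zero the rest.
 * With plen == 0 nothing is copied, so the result is all zeros (::).
 */
void ipv6_mask_prefix(uint8_t out[16], const uint8_t addr[16], unsigned int plen)
{
    unsigned int full, rem;

    assert(plen <= 128);
    full = plen / 8;   /* whole bytes covered by the prefix */
    rem = plen % 8;    /* leftover bits in the next byte */

    memset(out, 0, 16);
    memcpy(out, addr, full);
    if (rem)
        out[full] = addr[full] & (uint8_t)(0xff << (8 - rem));
}
```

Under this sketch, masking xyz:: with plen 0 and masking :: with plen 0 give the same 16 zero bytes, which matches the observation that xyz::/0 == ::/0.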


Re: [Patch net v2] ipv6: ignore null_entry in inet6_rtm_getroute() too

2017-03-01 Thread David Ahern
On 3/1/17 8:48 PM, Cong Wang wrote:
> Like commit 1f17e2f2c8a8 ("net: ipv6: ignore null_entry on route dumps"),
> we need to ignore null entry in inet6_rtm_getroute() too.
> 
> Return -ENETUNREACH here to sync with IPv4 behavior, as suggested by David.
> 
> Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin down")
> Reported-by: Dmitry Vyukov 
> Cc: David Ahern 
> Signed-off-by: Cong Wang 
> ---
>  net/ipv6/route.c | 6 ++
>  1 file changed, 6 insertions(+)
> 


Acked-by: David Ahern 


[Patch net v2] ipv6: ignore null_entry in inet6_rtm_getroute() too

2017-03-01 Thread Cong Wang
Like commit 1f17e2f2c8a8 ("net: ipv6: ignore null_entry on route dumps"),
we need to ignore null entry in inet6_rtm_getroute() too.

Return -ENETUNREACH here to sync with IPv4 behavior, as suggested by David.

Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin down")
Reported-by: Dmitry Vyukov 
Cc: David Ahern 
Signed-off-by: Cong Wang 
---
 net/ipv6/route.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index f54f426..df757b2 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3627,6 +3627,12 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh)
rt = (struct rt6_info *)ip6_route_output(net, NULL, &fl6);
}
 
+   if (rt == net->ipv6.ip6_null_entry) {
+   err = rt->dst.error;
+   ip6_rt_put(rt);
+   goto errout;
+   }
+
skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
if (!skb) {
ip6_rt_put(rt);
-- 
2.5.5



[PATCH v1] qed: Fix copy of uninitialized memory

2017-03-01 Thread Robert Foss
In qed_ll2_start_ooo() the ll2_info variable is uninitialized and then
passed to qed_ll2_acquire_connection() where it is copied into a new
memory space.

This shouldn't cause any issue as long as none of the copied memory is
ever read.
But the potential for a bug being introduced by reading this memory
is real.

Detected by CoverityScan, CID#1399632 ("Uninitialized scalar variable")

Signed-off-by: Robert Foss 
---
 drivers/net/ethernet/qlogic/qed/qed_ll2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c 
b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
index 9a0b9af10a57..5fb34db377c8 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
@@ -968,7 +968,7 @@ static int qed_ll2_start_ooo(struct qed_dev *cdev,
 {
struct qed_hwfn *hwfn = QED_LEADING_HWFN(cdev);
u8 *handle = &hwfn->pf_params.iscsi_pf_params.ll2_ooo_queue_id;
-   struct qed_ll2_conn ll2_info;
+   struct qed_ll2_conn ll2_info = { 0 };
int rc;
 
ll2_info.conn_type = QED_LL2_TYPE_ISCSI_OOO;
-- 
2.11.0.453.g787f75f05
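
As a small userspace illustration of the one-line fix above: an aggregate initialized with `= { 0 }` has every named member set to zero, so a later wholesale copy of the struct never reads indeterminate members. The struct and field names below are hypothetical stand-ins, not the real qed_ll2_conn layout. Whether padding bytes are also zeroed is not something this sketch relies on; code that copies raw bytes may prefer memset().

```c
#include <assert.h>

/* Hypothetical stand-in for a connection descriptor. */
struct conn_info {
    int conn_type;
    int mtu;
    unsigned char gsi_enable;
};

/*
 * Build a descriptor by value. The "= { 0 }" initializer zeroes every
 * member that is not set explicitly afterwards.
 */
struct conn_info make_conn_info(int conn_type)
{
    struct conn_info info = { 0 };

    assert(conn_type >= 0); /* hypothetical precondition for the sketch */
    info.conn_type = conn_type;
    return info;
}
```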



[PATCH net] net: net_enable_timestamp() can be called from irq contexts

2017-03-01 Thread Eric Dumazet
From: Eric Dumazet 

It is now very clear that silly TCP listeners might play with
enabling/disabling timestamping while new children are added
to their accept queue.

Meaning net_enable_timestamp() can be called from BH context
while current state of the static key is not enabled.

Let's play it safe and allow all contexts.

The work queue is scheduled only under the problematic cases,
which are the static key enable/disable transition, to not slow down
critical paths.

This extends and improves what we did in commit 5fa8bbda38c6 ("net: use
a work queue to defer net_disable_timestamp() work")

Fixes: b90e5794c5bd ("net: dont call jump_label_dec from irq context")
Signed-off-by: Eric Dumazet 
Reported-by: Dmitry Vyukov 
---
 net/core/dev.c |   35 +++
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index e63bf61b19be029e30ac40443c0e2edb24de4a73..8637b2b71f3d4751366a2ca5ba46579e6a5fa953 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1698,27 +1698,54 @@ EXPORT_SYMBOL_GPL(net_dec_egress_queue);
 static struct static_key netstamp_needed __read_mostly;
 #ifdef HAVE_JUMP_LABEL
 static atomic_t netstamp_needed_deferred;
+static atomic_t netstamp_wanted;
 static void netstamp_clear(struct work_struct *work)
 {
int deferred = atomic_xchg(&netstamp_needed_deferred, 0);
+   int wanted;
 
-   while (deferred--)
-   static_key_slow_dec(&netstamp_needed);
+   wanted = atomic_add_return(deferred, &netstamp_wanted);
+   if (wanted > 0)
+   static_key_enable(&netstamp_needed);
+   else
+   static_key_disable(&netstamp_needed);
 }
 static DECLARE_WORK(netstamp_work, netstamp_clear);
 #endif
 
 void net_enable_timestamp(void)
 {
+#ifdef HAVE_JUMP_LABEL
+   int wanted;
+
+   while (1) {
+   wanted = atomic_read(&netstamp_wanted);
+   if (wanted <= 0)
+   break;
+   if (atomic_cmpxchg(&netstamp_wanted, wanted, wanted + 1) == wanted)
+   return;
+   }
+   atomic_inc(&netstamp_needed_deferred);
+   schedule_work(&netstamp_work);
+#else
static_key_slow_inc(&netstamp_needed);
+#endif
 }
 EXPORT_SYMBOL(net_enable_timestamp);
 
 void net_disable_timestamp(void)
 {
 #ifdef HAVE_JUMP_LABEL
-   /* net_disable_timestamp() can be called from non process context */
-   atomic_inc(&netstamp_needed_deferred);
+   int wanted;
+
+   while (1) {
+   wanted = atomic_read(&netstamp_wanted);
+   if (wanted <= 1)
+   break;
+   if (atomic_cmpxchg(&netstamp_wanted, wanted, wanted - 1) == wanted)
+   return;
+   }
+   atomic_dec(&netstamp_needed_deferred);
schedule_work(&netstamp_work);
 #else
static_key_slow_dec(&netstamp_needed);
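
The pattern in this patch (adjust a counter lock-free on the fast path, and defer only the 0 <-> 1 transitions to a context that may sleep) can be sketched with C11 atomics in userspace. This is an approximation under stated assumptions, not the kernel code: the names below are hypothetical, `flush_deferred()` stands in for the work-queue handler, and the static key itself is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int wanted;    /* hypothetical stand-in for netstamp_wanted */
static atomic_int deferred;  /* hypothetical stand-in for netstamp_needed_deferred */

/* Fast path: bump the count only if it is already positive. */
bool enable_fast_or_defer(void)
{
    int cur = atomic_load(&wanted);

    while (cur > 0) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&wanted, &cur, cur + 1))
            return true;
    }
    atomic_fetch_add(&deferred, 1);  /* 0 -> 1 transition: defer it */
    return false;
}

/* Fast path: drop the count only if it stays positive afterwards. */
bool disable_fast_or_defer(void)
{
    int cur = atomic_load(&wanted);

    while (cur > 1) {
        if (atomic_compare_exchange_weak(&wanted, &cur, cur - 1))
            return true;
    }
    atomic_fetch_sub(&deferred, 1);  /* 1 -> 0 transition: defer it */
    return false;
}

/* Deferred worker: fold the pending delta in and report the new count. */
int flush_deferred(void)
{
    int d = atomic_exchange(&deferred, 0);
    int now = atomic_fetch_add(&wanted, d) + d;

    assert(now >= 0);
    return now;
}
```

In the real patch the worker would enable or disable the static key depending on whether the resulting count is positive; here the return value of flush_deferred() plays that role.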




Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy

2017-03-01 Thread Andy Lutomirski
On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün  wrote:
>
> On 28/02/2017 21:01, Andy Lutomirski wrote:
>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün  wrote:
>>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>>> current process. As with a seccomp filter, the Landlock rule is enforced
>>> for all its future children. An inherited rule tree can be updated
>>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>>> process that creates a new rule)
>>
>> Can you clarify exactly what this type of update does?  Is it
>> something that should be supported by normal seccomp rules as well?
>
> There are two main structures involved here: struct landlock_node and
> struct landlock_rule, both defined in include/linux/landlock.h [02/10].
>
> Let's take an example with seccomp filter and then Landlock:
> * seccomp filter: Process P1 creates and applies a seccomp filter F1 to
> itself. Then it forks and creates a child P2, which inherits P1's
> filters, hence F1. Now, if P1 adds a new seccomp filter F2 to itself, P2
> *won't get it*. P2's filter list will still contain only F1, not
> F2. If P2 sets up and applies a new filter F3 to itself, its filter list
> will contain F1 and F3.
> * Landlock: Process P1 creates and applies a Landlock rule R1 to itself.
> Underneath, the kernel creates a new node N1 dedicated to P1, which
> contains all its rules. Then P1 forks and creates a child P2, which
> inherits P1's rules, hence R1. Underneath, P2 inherits N1. Now, if P1
> adds a new Landlock rule R2 to itself, P2 *will get it* as well (because
> R2 is part of N1). If P2 creates and applies a new rule R3 to itself,
> its rules will contain R1, R2 and R3. Underneath, the kernel creates a
> new node N2 for P2, which only contains R3 but inherits/links to N1.
>
> This design makes it possible for a process to add more constraints to
> its children on the fly. I think it is a good feature to have and a
> safer default inheritance mechanism, but it could be guarded by an
> option flag if we want both mechanisms to be available. The same design
> could be used by seccomp filter too.
>

Then let's do it right.

Currently each task has an array of seccomp filter layers.  When a
task forks, the child inherits the layers.  All the layers are
presently immutable.  With Landlock, a layer can logically be a
syscall filter layer or a Landlock layer.  This fits into the
existing model just fine.

If we want to have an interface to allow modification of an existing
layer, let's make it so that, when a layer is added, you have to
specify a flag to make the layer modifiable (by current, presumably,
although I can imagine other policies down the road).  Then have a
separate API that modifies a layer.

IOW, I think your patch is bad for three reasons, all fixable:

1. The default is wrong.  A layer should be immutable to avoid an easy
attack in which you try to sandbox *yourself* and then you just modify
the layer to weaken it.

2. The API that adds a layer should be different from the API that
modifies a layer.

3. The whole modification mechanism should be a separate patch to be
reviewed on its own merits.

> The current inheritance mechanism doesn't make it possible to add a rule
> only to the current process. The rule will be inherited by its children
> (starting from the children created after the first applied rule). An
> option flag NEW_RULE_HIERARCHY (or maybe another seccomp operation)
> could make it possible to create a new node for the current process, so
> that the rule is not inherited by the previous children.

I like my proposal above much better.  "Add a layer" and "change a
layer" should be different operations.

--Andy
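
The three points in this reply (immutable by default, a separate call to modify, opt-in at creation time) can be sketched as a tiny userspace data structure. Everything here is hypothetical and only meant to make the proposal concrete: `struct layer`, `LAYER_F_MODIFIABLE`, `layer_init()` and `layer_append_rule()` are illustrative names, not an existing seccomp or Landlock interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define LAYER_MAX_RULES 8
#define LAYER_F_MODIFIABLE 0x1u  /* hypothetical opt-in flag */

struct layer {
    bool modifiable;              /* fixed when the layer is added */
    size_t nr_rules;
    int rules[LAYER_MAX_RULES];   /* rule identifiers, for illustration */
};

/* "Add a layer": immutable unless the caller asks otherwise. */
void layer_init(struct layer *l, unsigned int flags, int first_rule)
{
    assert(l != NULL);
    l->modifiable = (flags & LAYER_F_MODIFIABLE) != 0;
    l->nr_rules = 1;
    l->rules[0] = first_rule;
}

/* "Change a layer": a separate, append-only operation. */
int layer_append_rule(struct layer *l, int rule)
{
    assert(l != NULL);
    if (!l->modifiable)
        return -EPERM;
    if (l->nr_rules == LAYER_MAX_RULES)
        return -ENOSPC;
    l->rules[l->nr_rules++] = rule;
    return 0;
}
```

With this shape, a layer created without the flag rejects every later modification, which addresses the self-sandboxing concern raised in point 1.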


Re: [Patch net v3] ipv6: check for ip6_null_entry in __ip6_del_rt_siblings()

2017-03-01 Thread Martin KaFai Lau
On Wed, Mar 01, 2017 at 04:07:38PM -0800, David Ahern wrote:
> On 3/1/17 3:16 PM, Martin KaFai Lau wrote:
> > [ An unrelated topic.  I wonder ip -6 r del xyz::/0 would delete
> >   the gateway route...]
>
> a very related question ...
>
> ip -6 r del x::/0 comes down to the kernel as delete '::/0' (plen is 0,
> so cfg->fc_dst is 0). It ends up on the null_entry from fib6_locate b/c
> of the 0 prefix length, so it is another variant of 'ip -6 ro del ::/0'
Agree on the plen == 0 part.

I actually meant if 'ip -6 r del xyz::/0' would delete any _default_
_gateway_ also.  By looking at ip6_route_del() alone, it seems it only
checks ipv6_addr_equal(&cfg->fc_gateway, &rt->rt6i_gateway) if
(cfg->fc_flags & RTF_GATEWAY) is true.  Your test case makes
me think about this possibility but it is not related
to what this patch is trying to fix.  I should have started
another thread for this :p


[PATCH] drivers: net: ethernet: remove incorrect __exit markups

2017-03-01 Thread Dmitry Torokhov
Even if the bus is not hot-pluggable, devices can be unbound from the
driver via sysfs, so we should not be using __exit annotations on
remove() methods. The only exception is drivers registered with
platform_driver_probe(), which specifically disables sysfs bind/unbind
attributes.

Signed-off-by: Dmitry Torokhov 
---
 drivers/net/ethernet/amd/declance.c| 30 +++---
 drivers/net/ethernet/broadcom/sb1250-mac.c |  4 ++--
 drivers/net/ethernet/faraday/ftgmac100.c   |  4 ++--
 drivers/net/ethernet/faraday/ftmac100.c|  4 ++--
 drivers/net/ethernet/seeq/sgiseeq.c|  4 ++--
 drivers/net/ethernet/sgi/meth.c|  4 ++--
 6 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/amd/declance.c 
b/drivers/net/ethernet/amd/declance.c
index 76e5fc7adff5..6c98901f1b89 100644
--- a/drivers/net/ethernet/amd/declance.c
+++ b/drivers/net/ethernet/amd/declance.c
@@ -1276,18 +1276,6 @@ static int dec_lance_probe(struct device *bdev, const 
int type)
return ret;
 }
 
-static void __exit dec_lance_remove(struct device *bdev)
-{
-   struct net_device *dev = dev_get_drvdata(bdev);
-   resource_size_t start, len;
-
-   unregister_netdev(dev);
-   start = to_tc_dev(bdev)->resource.start;
-   len = to_tc_dev(bdev)->resource.end - start + 1;
-   release_mem_region(start, len);
-   free_netdev(dev);
-}
-
 /* Find all the lance cards on the system and initialize them */
 static int __init dec_lance_platform_probe(void)
 {
@@ -1320,7 +1308,7 @@ static void __exit dec_lance_platform_remove(void)
 
 #ifdef CONFIG_TC
 static int dec_lance_tc_probe(struct device *dev);
-static int __exit dec_lance_tc_remove(struct device *dev);
+static int dec_lance_tc_remove(struct device *dev);
 
 static const struct tc_device_id dec_lance_tc_table[] = {
{ "DEC ", "PMAD-AA " },
@@ -1334,7 +1322,7 @@ static struct tc_driver dec_lance_tc_driver = {
.name   = "declance",
.bus= &tc_bus_type,
.probe  = dec_lance_tc_probe,
-   .remove = __exit_p(dec_lance_tc_remove),
+   .remove = dec_lance_tc_remove,
},
 };
 
@@ -1346,7 +1334,19 @@ static int dec_lance_tc_probe(struct device *dev)
 return status;
 }
 
-static int __exit dec_lance_tc_remove(struct device *dev)
+static void dec_lance_remove(struct device *bdev)
+{
+   struct net_device *dev = dev_get_drvdata(bdev);
+   resource_size_t start, len;
+
+   unregister_netdev(dev);
+   start = to_tc_dev(bdev)->resource.start;
+   len = to_tc_dev(bdev)->resource.end - start + 1;
+   release_mem_region(start, len);
+   free_netdev(dev);
+}
+
+static int dec_lance_tc_remove(struct device *dev)
 {
 put_device(dev);
 dec_lance_remove(dev);
diff --git a/drivers/net/ethernet/broadcom/sb1250-mac.c 
b/drivers/net/ethernet/broadcom/sb1250-mac.c
index 435a2e4739d1..f82ec1e506e2 100644
--- a/drivers/net/ethernet/broadcom/sb1250-mac.c
+++ b/drivers/net/ethernet/broadcom/sb1250-mac.c
@@ -2617,7 +2617,7 @@ static int sbmac_probe(struct platform_device *pldev)
return err;
 }
 
-static int __exit sbmac_remove(struct platform_device *pldev)
+static int sbmac_remove(struct platform_device *pldev)
 {
struct net_device *dev = platform_get_drvdata(pldev);
struct sbmac_softc *sc = netdev_priv(dev);
@@ -2634,7 +2634,7 @@ static int __exit sbmac_remove(struct platform_device 
*pldev)
 
 static struct platform_driver sbmac_driver = {
.probe = sbmac_probe,
-   .remove = __exit_p(sbmac_remove),
+   .remove = sbmac_remove,
.driver = {
.name = sbmac_string,
},
diff --git a/drivers/net/ethernet/faraday/ftgmac100.c 
b/drivers/net/ethernet/faraday/ftgmac100.c
index 262587240c86..928b0df2b8e0 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -1456,7 +1456,7 @@ static int ftgmac100_probe(struct platform_device *pdev)
return err;
 }
 
-static int __exit ftgmac100_remove(struct platform_device *pdev)
+static int ftgmac100_remove(struct platform_device *pdev)
 {
struct net_device *netdev;
struct ftgmac100 *priv;
@@ -1483,7 +1483,7 @@ MODULE_DEVICE_TABLE(of, ftgmac100_of_match);
 
 static struct platform_driver ftgmac100_driver = {
.probe  = ftgmac100_probe,
-   .remove = __exit_p(ftgmac100_remove),
+   .remove = ftgmac100_remove,
.driver = {
.name   = DRV_NAME,
.of_match_table = ftgmac100_of_match,
diff --git a/drivers/net/ethernet/faraday/ftmac100.c 
b/drivers/net/ethernet/faraday/ftmac100.c
index dce5f7b7f772..0f122a85a484 100644
--- a/drivers/net/ethernet/faraday/ftmac100.c
+++ b/drivers/net/ethernet/faraday/ftmac100.c
@@ -1154,7 +1154,7 @@ static int ftmac100_probe(struct platform_device *pdev)
return err;
 }
 
-static int __exit ftmac100_remove(struct 

net: sleeping function called from invalid context in net_enable_timestamp

2017-03-01 Thread Dmitry Vyukov
Hello,

I've got the following report while running syzkaller fuzzer on
e5d56efc97f8240d0b5d66c03949382b6d7e5570:

BUG: sleeping function called from invalid context at kernel/locking/mutex.c:745
in_atomic(): 1, irqs_disabled(): 0, pid: 23233, name: syz-executor5
INFO: lockdep is turned off.
CPU: 1 PID: 23233 Comm: syz-executor5 Not tainted 4.10.0+ #229
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 ___might_sleep+0x47e/0x670 kernel/sched/core.c:6196
 __might_sleep+0x95/0x1a0 kernel/sched/core.c:6151
 __mutex_lock_common kernel/locking/mutex.c:745 [inline]
 __mutex_lock+0x144/0x1730 kernel/locking/mutex.c:891
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:906
 jump_label_lock kernel/jump_label.c:26 [inline]
 static_key_slow_inc+0x21f/0x3c0 kernel/jump_label.c:127
 net_enable_timestamp+0x15/0x20 net/core/dev.c:1713
 sk_clone_lock+0xef0/0x12c0 net/core/sock.c:1588
 inet_csk_clone_lock+0x91/0x4f0 net/ipv4/inet_connection_sock.c:781
 tcp_create_openreq_child+0xab/0x1e80 net/ipv4/tcp_minisocks.c:436
 tcp_v6_syn_recv_sock+0x210/0x1fb0 net/ipv6/tcp_ipv6.c:1101
 tcp_get_cookie_sock+0x115/0x530 net/ipv4/syncookies.c:212
 cookie_v6_check+0x16f9/0x20d0 net/ipv6/syncookies.c:245
 tcp_v6_cookie_check net/ipv6/tcp_ipv6.c:987 [inline]
 tcp_v6_do_rcv+0xfc3/0x1420 net/ipv6/tcp_ipv6.c:1296
 tcp_v6_rcv+0x22d2/0x2da0 net/ipv6/tcp_ipv6.c:1485
 ip6_input_finish+0x45b/0x1700 net/ipv6/ip6_input.c:279
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ip6_input+0xdb/0x580 net/ipv6/ip6_input.c:322
 dst_input include/net/dst.h:492 [inline]
 ip6_rcv_finish+0x194/0x720 net/ipv6/ip6_input.c:69
 NF_HOOK include/linux/netfilter.h:257 [inline]
 ipv6_rcv+0x12df/0x2380 net/ipv6/ip6_input.c:203
 __netif_receive_skb_core+0x1fb3/0x33a0 net/core/dev.c:4179
 __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
 process_backlog+0x11e/0x730 net/core/dev.c:4837
 napi_poll net/core/dev.c:5171 [inline]
 net_rx_action+0xeb4/0x1580 net/core/dev.c:5236
 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
 do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
 
 do_softirq.part.21+0x2c0/0x300 kernel/softirq.c:328
 do_softirq kernel/softirq.c:176 [inline]
 __local_bh_enable_ip+0x24c/0x290 kernel/softirq.c:181
 local_bh_enable include/linux/bottom_half.h:31 [inline]
 rcu_read_unlock_bh include/linux/rcupdate.h:971 [inline]
 ip6_finish_output2+0xb85/0x2380 net/ipv6/ip6_output.c:124
 ip6_finish_output+0x2f9/0x950 net/ipv6/ip6_output.c:149
 NF_HOOK_COND include/linux/netfilter.h:246 [inline]
 ip6_output+0x1cb/0x8c0 net/ipv6/ip6_output.c:163
 ip6_xmit+0xc36/0x1e80 include/net/dst.h:486
 inet6_csk_xmit+0x320/0x5d0 net/ipv6/inet6_connection_sock.c:139
 tcp_transmit_skb+0x1ab4/0x3460 net/ipv4/tcp_output.c:1057
 tcp_write_xmit+0x6e6/0x50d0 net/ipv4/tcp_output.c:2260
 __tcp_push_pending_frames+0xfa/0x380 net/ipv4/tcp_output.c:2445
 tcp_push+0x4e8/0x770 net/ipv4/tcp.c:683
 tcp_sendmsg+0x1275/0x39a0 net/ipv4/tcp.c:1337
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 SYSC_sendto+0x660/0x810 net/socket.c:1685
 SyS_sendto+0x40/0x50 net/socket.c:1653
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458d9
RSP: 002b:7fdc14160b58 EFLAGS: 0282 ORIG_RAX: 002c
RAX: ffda RBX: 001f RCX: 004458d9
RDX: 0ee8 RSI: 20051000 RDI: 001f
RBP: 006e1b70 R08: 20018fe0 R09: 000e
R10: 0800 R11: 0282 R12: 007080a8
R13:  R14:  R15: 20e42000
BUG: scheduling while atomic: syz-executor5/23233/0x0106
INFO: lockdep is turned off.
Modules linked in:
Kernel panic - not syncing: scheduling while atomic

CPU: 1 PID: 23233 Comm: syz-executor5 Tainted: GW   4.10.0+ #229
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 panic+0x1fb/0x412 kernel/panic.c:179
 __schedule_bug+0x224/0x240 kernel/sched/core.c:3238
 schedule_debug kernel/sched/core.c:3255 [inline]
 __schedule+0x1332/0x2290 kernel/sched/core.c:3355
 schedule+0x108/0x440 kernel/sched/core.c:3479
 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3512
 mutex_optimistic_spin kernel/locking/mutex.c:579 [inline]
 __mutex_lock_common kernel/locking/mutex.c:757 [inline]
 __mutex_lock+0x112d/0x1730 kernel/locking/mutex.c:891
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:906
 jump_label_lock kernel/jump_label.c:26 [inline]
 static_key_slow_inc+0x21f/0x3c0 kernel/jump_label.c:127
 net_enable_timestamp+0x15/0x20 net/core/dev.c:1713
 sk_clone_lock+0xef0/0x12c0 net/core/sock.c:1588
 inet_csk_clone_lock+0x91/0x4f0 net/ipv4/inet_connection_sock.c:781
 tcp_create_openreq_child+0xab/0x1e80 

[PATCH net] tcp: fix potential double free issue for fastopen_req

2017-03-01 Thread Wei Wang
From: Wei Wang 

tp->fastopen_req could potentially be double freed if a malicious
user does the following:
1. Enable TCP_FASTOPEN_CONNECT sockopt and do a connect() on the socket.
2. Call connect() with AF_UNSPEC to disconnect the socket.
3. Make this socket a listening socket by calling listen().
4. Accept incoming connections and generate child sockets. All child
   sockets will get a copy of the pointer of fastopen_req.
5. Call close() on all sockets. fastopen_req will get freed multiple
   times.

Fixes: 19f6d3f3c842 ("net/tcp-fastopen: Add new API support")
Reported-by: Andrey Konovalov 
Signed-off-by: Wei Wang 
Signed-off-by: Eric Dumazet 
---
 net/ipv4/tcp.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index da385ae997a3..cf481282 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1110,9 +1110,14 @@ static int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg,
flags = (msg->msg_flags & MSG_DONTWAIT) ? O_NONBLOCK : 0;
err = __inet_stream_connect(sk->sk_socket, msg->msg_name,
msg->msg_namelen, flags, 1);
-   inet->defer_connect = 0;
-   *copied = tp->fastopen_req->copied;
-   tcp_free_fastopen_req(tp);
+   /* fastopen_req could already be freed in __inet_stream_connect
+* if the connection times out or gets rst
+*/
+   if (tp->fastopen_req) {
+   *copied = tp->fastopen_req->copied;
+   tcp_free_fastopen_req(tp);
+   inet->defer_connect = 0;
+   }
return err;
 }
 
@@ -2318,6 +2323,10 @@ int tcp_disconnect(struct sock *sk, int flags)
memset(&tp->rx_opt, 0, sizeof(tp->rx_opt));
__sk_dst_reset(sk);
 
+   /* Clean up fastopen related fields */
+   tcp_free_fastopen_req(tp);
+   inet->defer_connect = 0;
+
WARN_ON(inet->inet_num && !icsk->icsk_bind_hash);
 
sk->sk_error_report(sk);
-- 
2.12.0.rc1.440.g5b76565f74-goog
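
The sequence in the commit message can be modeled in user space. What follows is only a sketch with hypothetical toy_* names, not kernel code: it assumes connect() left a fastopen request attached to the socket, that a child socket starts as a plain copy of its parent, and that the free helper clears the pointer after freeing. It counts how many children would share the parent's request pointer with and without the cleanup that the patch adds to tcp_disconnect().

```c
#include <stdlib.h>

/* Toy stand-ins for the kernel objects; every name here is hypothetical. */
struct toy_fastopen_req {
    int copied;
};

struct toy_sock {
    struct toy_fastopen_req *fastopen_req;
    int defer_connect;
};

/* Free the request once and clear the pointer so a second call is a no-op. */
static void toy_free_fastopen_req(struct toy_sock *sk)
{
    free(sk->fastopen_req);
    sk->fastopen_req = NULL;
}

/* Disconnect; with_fix selects the cleanup added by the patch. */
static void toy_disconnect(struct toy_sock *sk, int with_fix)
{
    if (with_fix) {
        toy_free_fastopen_req(sk);
        sk->defer_connect = 0;
    }
}

/*
 * Walk the steps from the commit message and return how many child
 * sockets end up holding a copy of the parent's fastopen_req pointer.
 * A nonzero result means close() on parent and children would free
 * the same request more than once.
 */
int toy_count_shared_reqs(int with_fix, int nchildren)
{
    struct toy_sock parent = { NULL, 0 };
    int shared = 0;
    int i;

    /* Step 1: assume connect() left a fastopen request on the socket. */
    parent.fastopen_req = calloc(1, sizeof(*parent.fastopen_req));
    if (!parent.fastopen_req)
        return -1;

    /* Step 2: connect() with AF_UNSPEC disconnects the socket. */
    toy_disconnect(&parent, with_fix);

    /* Steps 3-4: listen(), then each accepted child copies the parent. */
    for (i = 0; i < nchildren; i++) {
        struct toy_sock child = parent;

        if (child.fastopen_req)
            shared++;
    }

    /* Release the request exactly once so the model itself stays clean. */
    toy_free_fastopen_req(&parent);
    return shared;
}
```

In this model, toy_count_shared_reqs(0, 3) reports three children sharing the pointer, while toy_count_shared_reqs(1, 3) reports none.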



Re: [Patch net v3] ipv6: check for ip6_null_entry in __ip6_del_rt_siblings()

2017-03-01 Thread David Ahern
On 3/1/17 3:16 PM, Martin KaFai Lau wrote:
> [ An unrelated topic.  I wonder ip -6 r del xyz::/0 would delete
>   the gateway route...]

a very related question ...

ip -6 r del x::/0 comes down to the kernel as delete '::/0' (plen is 0,
so cfg->fc_dst is 0). It ends up on the null_entry from fib6_locate b/c
of the 0 prefix length, so it is another variant of 'ip -6 ro del ::/0'
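
To illustrate why any 'x::/0' collapses to '::/0', here is a small user-space sketch of prefix masking. The function names are hypothetical, and the sketch makes no claim about where the masking happens (iproute2 or the kernel); it only shows that keeping zero leading bits of any address leaves the all-zero address.

```c
#include <stdint.h>
#include <string.h>

/* Keep the first plen bits of addr in out and zero the remaining bits. */
void toy_ipv6_prefix(uint8_t out[16], const uint8_t addr[16], unsigned int plen)
{
    unsigned int full, rem;

    if (plen > 128)
        plen = 128;
    full = plen / 8;   /* whole bytes covered by the prefix */
    rem = plen % 8;    /* leftover bits in the next byte */

    memset(out, 0, 16);
    memcpy(out, addr, full);
    if (rem)
        out[full] = addr[full] & (uint8_t)(0xffu << (8 - rem));
}

/* Return 1 if the address is all zeroes ("::"), 0 otherwise. */
int toy_is_unspecified(const uint8_t addr[16])
{
    int i;

    for (i = 0; i < 16; i++)
        if (addr[i])
            return 0;
    return 1;
}
```

With plen set to 0 the output is always '::', whatever address the user typed before the slash.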


Re: [PATCH net] ipv6: orphan skbs in reassembly unit

2017-03-01 Thread Joe Stringer
On 1 March 2017 at 14:45, Eric Dumazet  wrote:
> From: Eric Dumazet 
>
> Andrey reported a use-after-free in IPv6 stack.
>
> Issue here is that we free the socket while it still has skb
> in TX path and in some queues.
>
> It happens here because IPv6 reassembly unit messes skb->truesize,
> breaking skb_set_owner_w() badly.
>
> We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
> Always orphan skbs inside ip_defrag()")
>
> ==
> BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
> Read of size 8 at addr 880062da0060 by task a.out/4140
>
> page:ea00018b6800 count:1 mapcount:0 mapping:  (null)
> index:0x0 compound_mapcount: 0
> flags: 0x1008100(slab|head)
> raw: 01008100   000180130013
> raw: dead0100 dead0200 88006741f140 
> page dumped because: kasan: bad access detected
>
> CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:15
>  dump_stack+0x292/0x398 lib/dump_stack.c:51
>  describe_address mm/kasan/report.c:262
>  kasan_report_error+0x121/0x560 mm/kasan/report.c:370
>  kasan_report mm/kasan/report.c:392
>  __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
>  sock_flag ./arch/x86/include/asm/bitops.h:324
>  sock_wfree+0x118/0x120 net/core/sock.c:1631
>  skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
>  skb_release_all+0x15/0x60 net/core/skbuff.c:668
>  __kfree_skb+0x15/0x20 net/core/skbuff.c:684
>  kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
>  inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
>  inet_frag_put ./include/net/inet_frag.h:133
>  nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
>  ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
>  nf_hook_entry_hookfn ./include/linux/netfilter.h:102
>  nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
>  nf_hook ./include/linux/netfilter.h:212
>  __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
>  ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
>  ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
>  ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
>  rawv6_push_pending_frames net/ipv6/raw.c:613
>  rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
>  inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
>  sock_sendmsg_nosec net/socket.c:635
>  sock_sendmsg+0xca/0x110 net/socket.c:645
>  sock_write_iter+0x326/0x620 net/socket.c:848
>  new_sync_write fs/read_write.c:499
>  __vfs_write+0x483/0x760 fs/read_write.c:512
>  vfs_write+0x187/0x530 fs/read_write.c:560
>  SYSC_write fs/read_write.c:607
>  SyS_write+0xfb/0x230 fs/read_write.c:599
>  entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
> RIP: 0033:0x7ff26e6f5b79
> RSP: 002b:7ff268e0ed98 EFLAGS: 0206 ORIG_RAX: 0001
> RAX: ffda RBX: 7ff268e0f9c0 RCX: 7ff26e6f5b79
> RDX: 0010 RSI: 20f50fe1 RDI: 0003
> RBP: 7ff26ebc1220 R08:  R09: 
> R10:  R11: 0206 R12: 
> R13: 7ff268e0f9c0 R14: 7ff26efec040 R15: 0003
>
> The buggy address belongs to the object at 880062da
>  which belongs to the cache RAWv6 of size 1504
> The buggy address 880062da0060 is located 96 bytes inside
>  of 1504-byte region [880062da, 880062da05e0)
>
> Freed by task 4113:
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:502
>  set_track mm/kasan/kasan.c:514
>  kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
>  slab_free_hook mm/slub.c:1352
>  slab_free_freelist_hook mm/slub.c:1374
>  slab_free mm/slub.c:2951
>  kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
>  sk_prot_free net/core/sock.c:1377
>  __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
>  sk_destruct+0x47/0x80 net/core/sock.c:1460
>  __sk_free+0x57/0x230 net/core/sock.c:1468
>  sk_free+0x23/0x30 net/core/sock.c:1479
>  sock_put ./include/net/sock.h:1638
>  sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
>  rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
>  inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
>  inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
>  sock_release+0x8d/0x1e0 net/socket.c:599
>  sock_close+0x16/0x20 net/socket.c:1063
>  __fput+0x332/0x7f0 fs/file_table.c:208
>  fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x19b/0x270 kernel/task_work.c:116
>  exit_task_work ./include/linux/task_work.h:21
>  do_exit+0x186b/0x2800 kernel/exit.c:839
>  do_group_exit+0x149/0x420 kernel/exit.c:943
>  SYSC_exit_group kernel/exit.c:954
>  SyS_exit_group+0x1d/0x20 kernel/exit.c:952
>  entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
>
> Allocated by task 4115:
>  

Re: net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Eric Dumazet
On Wed, Mar 1, 2017 at 1:43 PM, Cong Wang  wrote:
>>
>> This one looks very similar to a previous one:
>> https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
>>
>> Both happen on raw v6 sockets.
>>
>> For me, it seems the sk refcnt is not correct, skb should still hold
>> a refcnt so it should not be freed before kfree_skb() in a timer
>> handler...
>
> More precisely, after this commit:
>
> commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
> Author: Eric Dumazet 
> Date:   Thu Jun 11 02:55:43 2009 -0700
>
> net: No more expensive sock_hold()/sock_put() on each tx
>
> we don't take (old) refcnt any more on TX path, sk_wmem_alloc
> is the new refcnt. ;)

So the bug is that skb->truesize is mangled by reassembly unit,
while skb->sk is tracking sk_wmem_alloc changes in order
to decide when it is safe to free sk.

This is why we need to call skb_orphan(), as we did for IPv4 in
8282f27449bf15548
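
The accounting problem can be sketched in user space. This is a simplified model with hypothetical toy_* names, not the kernel implementation: it assumes the socket counter starts at 1, that each owned skb adds its truesize, and that releasing an skb subtracts whatever truesize it has at that moment. The reassembly step shown (folding one fragment's truesize into another) is just one hypothetical way the value could be changed while the skbs are still charged to the socket.

```c
#include <stddef.h>

struct toy_sk {
    long wmem_alloc;   /* starts at 1: the socket's own reference */
    int freed;         /* set once the counter reaches zero */
};

struct toy_skb {
    struct toy_sk *sk;
    unsigned int truesize;
};

/* Drop n units; the socket is considered freed when the counter hits 0. */
static void toy_put(struct toy_sk *sk, long n)
{
    sk->wmem_alloc -= n;
    if (sk->wmem_alloc == 0)
        sk->freed = 1;
}

/* Charge the skb to the socket, in the spirit of skb_set_owner_w(). */
static void toy_set_owner_w(struct toy_skb *skb, struct toy_sk *sk)
{
    skb->sk = sk;
    sk->wmem_alloc += skb->truesize;
}

/*
 * Detach the skb from its socket, uncharging its current truesize.
 * Returns 1 if the socket had already been freed (a use-after-free
 * in the real kernel), 0 otherwise.
 */
static int toy_orphan(struct toy_skb *skb)
{
    struct toy_sk *sk = skb->sk;

    if (!sk)
        return 0;
    skb->sk = NULL;
    if (sk->freed)
        return 1;
    toy_put(sk, skb->truesize);
    return 0;
}

/* Returns how many times a freed socket was touched. */
int toy_reassembly_touches_freed_sk(int orphan_first)
{
    struct toy_sk sk = { 1, 0 };
    struct toy_skb a = { NULL, 100 };
    struct toy_skb b = { NULL, 100 };
    int uaf = 0;

    toy_set_owner_w(&a, &sk);      /* counter: 101 */
    toy_set_owner_w(&b, &sk);      /* counter: 201 */

    if (orphan_first) {            /* the skb_orphan() approach */
        uaf += toy_orphan(&a);
        uaf += toy_orphan(&b);     /* counter back to 1 */
    }

    a.truesize += b.truesize;      /* reassembly changes truesize */

    uaf += toy_orphan(&a);         /* skb a is released */
    toy_put(&sk, 1);               /* socket is closed */
    uaf += toy_orphan(&b);         /* skb b is released last */
    return uaf;
}
```

Without the early orphan, releasing skb a subtracts 200 instead of 100, so closing the socket drops the counter to zero while skb b still points at it. With the early orphan, the skbs are uncharged before truesize changes and the counter stays consistent.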


Re: net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Cong Wang
On Wed, Mar 1, 2017 at 1:54 PM, Eric Dumazet  wrote:
> On Wed, Mar 1, 2017 at 1:43 PM, Cong Wang  wrote:
>>>
>>> This one looks very similar to a previous one:
>>> https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
>>>
>>> Both happen on raw v6 sockets.
>>>
>>> For me, it seems the sk refcnt is not correct, skb should still hold
>>> a refcnt so it should not be freed before kfree_skb() in a timer
>>> handler...
>>
>> More precisely, after this commit:
>>
>> commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
>> Author: Eric Dumazet 
>> Date:   Thu Jun 11 02:55:43 2009 -0700
>>
>> net: No more expensive sock_hold()/sock_put() on each tx
>>
>> we don't take (old) refcnt any more on TX path, sk_wmem_alloc
>> is the new refcnt. ;)
>
> So the bug is that skb->truesize is mangled by reassembly unit,
> while skb->sk is tracking sk_wmem_alloc changes in order
> to decide when it is safe to free sk.

That is my suspicion as well, skb->truesize is updated somewhere
but sk->sk_wmem_alloc isn't, so leads to this bug.

>
> This is why we need to call skb_orphan(), as we did for IPv4 in
> 8282f27449bf15548


But I doubt skb_orphan() is the solution here, shouldn't we just
update sk->sk_wmem_alloc with skb->truesize changes?


Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy

2017-03-01 Thread Mickaël Salaün


On 01/03/2017 23:20, Andy Lutomirski wrote:
> On Wed, Mar 1, 2017 at 2:14 PM, Mickaël Salaün  wrote:
>>
>> On 28/02/2017 21:01, Andy Lutomirski wrote:
>>> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün  wrote:
>>>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>>>> current process. As with a seccomp filter, the Landlock rule is enforced
>>>> for all its future children. An inherited rule tree can be updated
>>>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>>>> process that creates a new rule)
>>>
>>> Can you clarify exactly what this type of update does?  Is it
>>> something that should be supported by normal seccomp rules as well?
>>
>> There are two main structures involved here: struct landlock_node and
>> struct landlock_rule, both defined in include/linux/landlock.h [02/10].
>>
>> Let's take an example with seccomp filter and then Landlock:
>> * seccomp filter: Process P1 creates and applies a seccomp filter F1 to
>> itself. Then it forks and creates a child P2, which inherits P1's
>> filters, hence F1. Now, if P1 adds a new seccomp filter F2 to itself, P2
>> *won't get it*. P2's filter list will still contain only F1, not
>> F2. If P2 sets up and applies a new filter F3 to itself, its filter list
>> will contain F1 and F3.
>> * Landlock: Process P1 creates and applies a Landlock rule R1 to itself.
>> Underneath the kernel creates a new node N1 dedicated to P1, which
>> contains all its rules. Then P1 forks and creates a child P2, which
>> inherits P1's rules, hence R1. Underneath P2 inherited N1. Now, if P1
>> adds a new Landlock rule R2 to itself, P2 *will get it* as well (because
>> R2 is part of N1). If P2 creates and applies a new rule R3 to itself,
>> its rules will contain R1, R2 and R3. Underneath the kernel created a
>> new node N2 for P2, which only contains R3 but inherits/links to N1.
>>
>> This design makes it possible for a process to add more constraints to
>> its children on the fly. I think it is a good feature to have and a
>> safer default inheritance mechanism, but it could be guarded by an
>> option flag if we want both mechanisms to be available. The same design
>> could be used by seccomp filter too.
>>
> 
> Then let's do it right.
> 
> Currently each task has an array of seccomp filter layers.  When a
> task forks, the child inherits the layers.  All the layers are
> presently immutable.  With Landlock, a layer can logically be a
> syscall filter layer or a Landlock layer.  This fits in to the
> existing model just fine.
> 
> If we want to have an interface to allow modification of an existing
> layer, let's make it so that, when a layer is added, you have to
> specify a flag to make the layer modifiable (by current, presumably,
> although I can imagine other policies down the road).  Then have a
> separate API that modifies a layer.
> 
> IOW, I think your patch is bad for three reasons, all fixable:
> 
> 1. The default is wrong.  A layer should be immutable to avoid an easy
> attack in which you try to sandbox *yourself* and then you just modify
> the layer to weaken it.

This is not possible; there is only one operation for now:
SECCOMP_ADD_LANDLOCK_RULE. You can only add more rules to the list (as
for seccomp filter). There is no way to weaken a sandbox. The question
is: how do we want to handle the rules *tree* (from the kernel point of
view)?

> 
> 2. The API that adds a layer should be different from the API that
> modifies a layer.

Right, but it doesn't apply now because we can only add rules.

> 
> 3. The whole modification mechanism should be a separate patch to be
> reviewed on its own merits.

For a rule *replacement*, sure!

> 
>> The current inheritance mechanism doesn't allow adding a rule to the
>> current process only. The rule will be inherited by its children
>> (starting from the children created after the first applied rule). An
>> option flag NEW_RULE_HIERARCHY (or maybe another seccomp operation)
>> could create a new node for the current process, so that the rule is
>> not inherited by the previous children.
> 
> I like my proposal above much better.  "Add a layer" and "change a
> layer" should be different operations.

I agree, but for now it's about how to handle immutable (but growing)
inherited rules.





Re: [PATCH net] tcp/dccp: block BH for SYN processing

2017-03-01 Thread David Miller
From: Eric Dumazet 
Date: Wed, 01 Mar 2017 08:39:49 -0800

> From: Eric Dumazet 
> 
> SYN processing really was meant to be handled from BH.
> 
> When I got rid of BH blocking while processing socket backlog
> in commit 5413d1babe8f ("net: do not block BH while processing socket
> backlog"), I forgot that a malicious user could transition to TCP_LISTEN
> from a state that allowed (SYN) packets to be parked in the socket
> backlog while socket is owned by the thread doing the listen() call.
> 
> Sure enough syzkaller found this and reported the bug ;)
 ...
> Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
> Signed-off-by: Eric Dumazet 
> Reported-by: Andrey Konovalov 

Applied and queued up for -stable, thanks.


Re: [Patch net v3] ipv6: check for ip6_null_entry in __ip6_del_rt_siblings()

2017-03-01 Thread David Ahern
On 3/1/17 3:16 PM, Martin KaFai Lau wrote:
> On Mon, Feb 27, 2017 at 04:14:04PM -0800, David Ahern wrote:
>> On 2/27/17 4:07 PM, Cong Wang wrote:
>>> Andrey reported a NULL pointer deref bug in ipv6_route_ioctl()
>>> -> ip6_route_del() -> __ip6_del_rt_siblings() code path. This is
>>> because ip6_null_entry is returned in this path since ip6_null_entry
>>> is kinda default for a ipv6 route table root node. Quote from
>>
>>
>> Missed this earlier. The issue here is an attempt to delete the NULL
>> route,
> You meant rt == NULL or rt->rt6i_table == NULL when rt == ip6_null_entry?

ip6_null_entry

> 
>> not that the null_entry is being returned as happens during a
>> route lookup. This will also hit the bug:
>> ip -6 ro del ::/0
> I also found the commit log a bit confusing.  By reading the message,
> my first thought was an ip6_null_entry is returned because a route cannot
> be found.  Thanks for this particular test case.  It seems fn is NULL
> here for all random routes except 'ip -6 r del xyz::/0' which happens
> to match ip6_null_entry.

yes, that was my point.


Re: net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Eric Dumazet
On Wed, Mar 1, 2017 at 3:09 PM, Cong Wang  wrote:

>
> But I doubt skb_orphan() is the solution here, shouldn't we just
> update sk->sk_wmem_alloc with skb->truesize changes?

Is it worth it? Apart from syzkaller, I mean...

We started with something that had a real impact on real workloads.

158f323b9868b59967ad96957c4ca388161be321 net: adjust skb->truesize in
pskb_expand_head()

Note that auditing the stack took me a while.


Re: [Patch net v3] ipv6: check for ip6_null_entry in __ip6_del_rt_siblings()

2017-03-01 Thread Martin KaFai Lau
On Mon, Feb 27, 2017 at 04:14:04PM -0800, David Ahern wrote:
> On 2/27/17 4:07 PM, Cong Wang wrote:
> > Andrey reported a NULL pointer deref bug in ipv6_route_ioctl()
> > -> ip6_route_del() -> __ip6_del_rt_siblings() code path. This is
> > because ip6_null_entry is returned in this path since ip6_null_entry
> > is kinda default for a ipv6 route table root node. Quote from
>
>
> Missed this earlier. The issue here is an attempt to delete the NULL
> route,
You meant rt == NULL or rt->rt6i_table == NULL when rt == ip6_null_entry?

> not that the null_entry is being returned as happens during a
> route lookup. This will also hit the bug:
> ip -6 ro del ::/0
I also found the commit log a bit confusing.  By reading the message,
my first thought was an ip6_null_entry is returned because a route cannot
be found.  Thanks for this particular test case.  It seems fn is NULL
here for all random routes except 'ip -6 r del xyz::/0' which happens
to match ip6_null_entry.

[ An unrelated topic.  I wonder ip -6 r del xyz::/0 would delete
  the gateway route...]

The patch LGTM.


Re: [Patch net v3] ipv6: check for ip6_null_entry in __ip6_del_rt_siblings()

2017-03-01 Thread David Ahern
On 2/27/17 4:07 PM, Cong Wang wrote:
> Andrey reported a NULL pointer deref bug in ipv6_route_ioctl()
> -> ip6_route_del() -> __ip6_del_rt_siblings() code path. This is
> because ip6_null_entry is returned in this path since ip6_null_entry
> is kinda default for a ipv6 route table root node. Quote from
> David Ahern:
> 
>  ip6_null_entry is the root of all ipv6 fib tables making it integrated
>  into the table ...
> 
> We should ignore any attempt of trying to delete it, like we do in
> __ip6_del_rt() path and several others.
> 
> Reported-by: Andrey Konovalov 
> Fixes: 0ae8133586ad ("net: ipv6: Allow shorthand delete of all nexthops in multipath route")
> Cc: David Ahern 
> Cc: Eric Dumazet 
> Signed-off-by: Cong Wang 
> ---
>  net/ipv6/route.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)


Acked-by: David Ahern 


Re: [PATCH] bpf: update the comment about the length of analysis

2017-03-01 Thread David Miller
From: Gary Lin 
Date: Wed,  1 Mar 2017 16:25:51 +0800

> Commit 07016151a446 ("bpf, verifier: further improve search
> pruning") increased the limit of processed instructions from
> 32k to 64k, but the comment still mentioned the 32k limit.
> This commit updates the comment to reflect the change.
> 
> Cc: Alexei Starovoitov 
> Cc: Daniel Borkmann 
> Signed-off-by: Gary Lin 

Applied, thanks.


Re: pull-request: mac80211 2017-02-28

2017-03-01 Thread David Miller
From: Johannes Berg 
Date: Tue, 28 Feb 2017 10:50:18 +0100

> First round of fixes - we actually have quite a few.
> 
> Please pull and let me know if there's any problem.

Pulled, thanks Johannes.

> I had another question - I have this average.h API change pending
> (you saw it before). It seems it might be a good time to get it in
> now, since no new users should be showing up. If I put it in only
> with the next merge window, new users might show up and break.

Ok, you can feel free to send this now for 'net' if you like.


Re: [PATCH net] bridge: Fix error path in nbp_vlan_init

2017-03-01 Thread David Miller
From: Yotam Gigi 
Date: Wed,  1 Mar 2017 16:50:45 +0200

> Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
> call fails, the vlan_hash won't be destroyed before it is initialized.
> 
> Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
> CC: Roopa Prabhu 
> Signed-off-by: Yotam Gigi 

Applied, thanks.


[PATCH net] ipv6: orphan skbs in reassembly unit

2017-03-01 Thread Eric Dumazet
From: Eric Dumazet 

Andrey reported a use-after-free in IPv6 stack.

Issue here is that we free the socket while it still has skb
in TX path and in some queues.

It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
Always orphan skbs inside ip_defrag()")

== 
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120 
Read of size 8 at addr 880062da0060 by task a.out/4140 

page:ea00018b6800 count:1 mapcount:0 mapping:  (null) 
index:0x0 compound_mapcount: 0 
flags: 0x1008100(slab|head) 
raw: 01008100   000180130013 
raw: dead0100 dead0200 88006741f140  
page dumped because: kasan: bad access detected 

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59 
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 
Call Trace: 
 __dump_stack lib/dump_stack.c:15 
 dump_stack+0x292/0x398 lib/dump_stack.c:51 
 describe_address mm/kasan/report.c:262 
 kasan_report_error+0x121/0x560 mm/kasan/report.c:370 
 kasan_report mm/kasan/report.c:392 
 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413 
 sock_flag ./arch/x86/include/asm/bitops.h:324 
 sock_wfree+0x118/0x120 net/core/sock.c:1631 
 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655 
 skb_release_all+0x15/0x60 net/core/skbuff.c:668 
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 
 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705 
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 
 inet_frag_put ./include/net/inet_frag.h:133 
 nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617 
 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 
 nf_hook_entry_hookfn ./include/linux/netfilter.h:102 
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 
 nf_hook ./include/linux/netfilter.h:212 
 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160 
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 
 rawv6_push_pending_frames net/ipv6/raw.c:613 
 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927 
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 
 sock_sendmsg_nosec net/socket.c:635 
 sock_sendmsg+0xca/0x110 net/socket.c:645 
 sock_write_iter+0x326/0x620 net/socket.c:848 
 new_sync_write fs/read_write.c:499 
 __vfs_write+0x483/0x760 fs/read_write.c:512 
 vfs_write+0x187/0x530 fs/read_write.c:560 
 SYSC_write fs/read_write.c:607 
 SyS_write+0xfb/0x230 fs/read_write.c:599 
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203 
RIP: 0033:0x7ff26e6f5b79 
RSP: 002b:7ff268e0ed98 EFLAGS: 0206 ORIG_RAX: 0001 
RAX: ffda RBX: 7ff268e0f9c0 RCX: 7ff26e6f5b79 
RDX: 0010 RSI: 20f50fe1 RDI: 0003 
RBP: 7ff26ebc1220 R08:  R09:  
R10:  R11: 0206 R12:  
R13: 7ff268e0f9c0 R14: 7ff26efec040 R15: 0003 

The buggy address belongs to the object at 880062da 
 which belongs to the cache RAWv6 of size 1504 
The buggy address 880062da0060 is located 96 bytes inside 
 of 1504-byte region [880062da, 880062da05e0) 

Freed by task 4113: 
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57 
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502 
 set_track mm/kasan/kasan.c:514 
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578 
 slab_free_hook mm/slub.c:1352 
 slab_free_freelist_hook mm/slub.c:1374 
 slab_free mm/slub.c:2951 
 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973 
 sk_prot_free net/core/sock.c:1377 
 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452 
 sk_destruct+0x47/0x80 net/core/sock.c:1460 
 __sk_free+0x57/0x230 net/core/sock.c:1468 
 sk_free+0x23/0x30 net/core/sock.c:1479 
 sock_put ./include/net/sock.h:1638 
 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782 
 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214 
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425 
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431 
 sock_release+0x8d/0x1e0 net/socket.c:599 
 sock_close+0x16/0x20 net/socket.c:1063 
 __fput+0x332/0x7f0 fs/file_table.c:208 
 fput+0x15/0x20 fs/file_table.c:244 
 task_work_run+0x19b/0x270 kernel/task_work.c:116 
 exit_task_work ./include/linux/task_work.h:21 
 do_exit+0x186b/0x2800 kernel/exit.c:839 
 do_group_exit+0x149/0x420 kernel/exit.c:943 
 SYSC_exit_group kernel/exit.c:954 
 SyS_exit_group+0x1d/0x20 kernel/exit.c:952 
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203 

Allocated by task 4115: 
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57 
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502 
 set_track mm/kasan/kasan.c:514 
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605 
 

Re: [PATCH v5 03/10] bpf: Define handle_fs and add a new helper bpf_handle_fs_get_mode()

2017-03-01 Thread Mickaël Salaün

On 01/03/2017 10:32, James Morris wrote:
> On Wed, 22 Feb 2017, Mickaël Salaün wrote:
> 
>> Add an eBPF function bpf_handle_fs_get_mode(handle_fs) to get the mode
>> of an abstract object wrapping either a file, a dentry, a path, or an
>> inode.
>>
>> Changes since v4:
>> * use a file abstraction (handle) to wrap inode, dentry, path and file
>>   structs
> 
> Good to see these abstractions.  As discussed at LPC, we need to ensure 
> that we don't couple the Landlock API too closely with the LSM API, as the 
> former is an ABI exposed to userland -- we don't want to lose the ability 
> to change LSM internally due to breaking Landlock policies.

Right, it is the case now, especially with the Landlock events.

> 
>> @@ -82,6 +87,8 @@ enum bpf_arg_type {
>>  
>>  ARG_PTR_TO_CTX, /* pointer to context */
>>  ARG_ANYTHING,   /* any (initialized) argument is ok */
>> +
>> +ARG_CONST_PTR_TO_HANDLE_FS, /* pointer to an abstract FS struct */
>>  };
> 
> Extraneous whitespace?

It is on purpose, following the same rules as used for this enum.

 Mickaël





Re: [PATCH v5 06/10] seccomp,landlock: Handle Landlock events per process hierarchy

2017-03-01 Thread Mickaël Salaün

On 28/02/2017 21:01, Andy Lutomirski wrote:
> On Tue, Feb 21, 2017 at 5:26 PM, Mickaël Salaün  wrote:
>> The seccomp(2) syscall can be used to apply a Landlock rule to the
>> current process. As with a seccomp filter, the Landlock rule is enforced
>> for all its future children. An inherited rule tree can be updated
>> (append-only) by the owner of inherited Landlock nodes (e.g. a parent
>> process that creates a new rule)
> 
> Can you clarify exactly what this type of update does?  Is it
> something that should be supported by normal seccomp rules as well?

There are two main structures involved here: struct landlock_node and
struct landlock_rule, both defined in include/linux/landlock.h [02/10].

Let's take an example with seccomp filter and then Landlock:
* seccomp filter: Process P1 creates and applies a seccomp filter F1 to
itself. Then it forks and creates a child P2, which inherits P1's
filters, hence F1. Now, if P1 adds a new seccomp filter F2 to itself, P2
*won't get it*. P2's filter list will still contain only F1, not
F2. If P2 sets up and applies a new filter F3 to itself, its filter list
will contain F1 and F3.
* Landlock: Process P1 creates and applies a Landlock rule R1 to itself.
Underneath the kernel creates a new node N1 dedicated to P1, which
contains all its rules. Then P1 forks and creates a child P2, which
inherits P1's rules, hence R1. Underneath P2 inherited N1. Now, if P1
adds a new Landlock rule R2 to itself, P2 *will get it* as well (because
R2 is part of N1). If P2 creates and applies a new rule R3 to itself,
its rules will contain R1, R2 and R3. Underneath the kernel created a
new node N2 for P2, which only contains R3 but inherits/links to N1.

This design makes it possible for a process to add more constraints to
its children on the fly. I think it is a good feature to have and a
safer default inheritance mechanism, but it could be guarded by an
option flag if we want both mechanisms to be available. The same design
could be used by seccomp filter too.
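
The two inheritance models above can be sketched in user space. All names below are hypothetical and the structures are deliberately much simpler than struct landlock_node and struct landlock_rule: a seccomp-like filter is an immutable list element, while a Landlock-like node holds a growable rule array that is shared by every task pointing at it.

```c
#include <stddef.h>

#define TOY_MAX_RULES 8

/* seccomp-like: each filter is immutable and links to the previous one. */
struct toy_filter {
    const char *name;
    const struct toy_filter *prev;
};

/* Landlock-like: a node owns a growable rule list and links to a parent. */
struct toy_node {
    const char *rules[TOY_MAX_RULES];
    int nrules;
    const struct toy_node *prev;
};

static int toy_filter_count(const struct toy_filter *head)
{
    int n = 0;

    for (; head; head = head->prev)
        n++;
    return n;
}

static int toy_node_add(struct toy_node *node, const char *rule)
{
    if (node->nrules >= TOY_MAX_RULES)
        return -1;
    node->rules[node->nrules++] = rule;
    return 0;
}

static int toy_rule_count(const struct toy_node *node)
{
    int n = 0;

    for (; node; node = node->prev)
        n += node->nrules;
    return n;
}

/* Number of filters seen by P2 after P1 adds F2 and P2 adds F3. */
int toy_seccomp_p2_filters(void)
{
    struct toy_filter f1 = { "F1", NULL };
    const struct toy_filter *p1 = &f1;
    const struct toy_filter *p2 = p1;      /* fork: P2 starts at F1 */
    struct toy_filter f2 = { "F2", p1 };   /* P1 adds F2 after the fork */
    struct toy_filter f3 = { "F3", p2 };   /* P2 adds F3 on top of F1 */

    p1 = &f2;                              /* P1 now sees F2, F1 */
    p2 = &f3;                              /* P2 sees F3, F1 but not F2 */
    (void)p1;
    return toy_filter_count(p2);
}

/* Number of rules seen by P2 after P1 adds R2 and P2 adds R3. */
int toy_landlock_p2_rules(void)
{
    struct toy_node n1 = { { NULL }, 0, NULL };
    struct toy_node n2 = { { NULL }, 0, NULL };
    const struct toy_node *p2;

    toy_node_add(&n1, "R1");   /* P1 applies R1, kernel creates N1 */
    p2 = &n1;                  /* fork: P2 shares N1 */
    toy_node_add(&n1, "R2");   /* P1 adds R2 to N1, visible to P2 */
    n2.prev = &n1;             /* P2 gets its own node N2 linked to N1 */
    toy_node_add(&n2, "R3");
    p2 = &n2;
    return toy_rule_count(p2);
}
```

In this model P2 ends up with two filters (F1 and F3) in the seccomp case and three rules (R1, R2 and R3) in the Landlock case, matching the description above.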


> 
>> +/**
>> + * landlock_run_prog - run Landlock program for a syscall
> 
> Unless this is actually specific to syscalls, s/for a syscall//, perhaps?

Right, not specific to syscall anymore.

> 
>> +   if (new_events->nodes[event_idx]->owner ==
>> +   _events->nodes[event_idx]) {
>> +   /* We are the owner, we can then update the node. */
>> +   add_landlock_rule(new_events, rule);
> 
> This is the part I don't get.  Adding a rule if you're the owner (BTW,
> why is ownership visible to userspace at all?) for just yourself and
> future children is very different from adding it so it applies to
> preexisting children too.

Node ownership is not (directly) visible to userspace.

The current inheritance mechanism doesn't allow adding a rule to the
current process only. The rule will be inherited by its children
(starting from the children created after the first applied rule). An
option flag NEW_RULE_HIERARCHY (or maybe another seccomp operation)
could create a new node for the current process, so that the rule is
not inherited by the previous children.


> 
> 
>> +   } else if (atomic_read(_events->usage) == 1) {
>> +   WARN_ON(new_events->nodes[event_idx]->owner);
>> +   /*
>> +* We can become the new owner if no other task uses it.
>> +* This avoids an unnecessary allocation.
>> +*/
>> +   new_events->nodes[event_idx]->owner =
>> +   _events->nodes[event_idx];
>> +   add_landlock_rule(new_events, rule);
>> +   } else {
>> +   /*
>> +* We are not the owner, we need to fork current_events
>> +* and then add a new node.
>> +*/
>> +   struct landlock_node *node;
>> +   size_t i;
>> +
>> +   node = kmalloc(sizeof(*node), GFP_KERNEL);
>> +   if (!node) {
>> +   new_events = ERR_PTR(-ENOMEM);
>> +   goto put_rule;
>> +   }
>> +   atomic_set(>usage, 1);
>> +   /* set the previous node after the new_events
>> +* allocation */
>> +   node->prev = NULL;
>> +   /* do not increment the previous node usage */
>> +   node->owner = _events->nodes[event_idx];
>> +   /* rule->prev is already NULL */
>> +   atomic_set(>usage, 1);
>> +   node->rule = rule;
>> +
>> +   new_events = new_raw_landlock_events();
>> +   if (IS_ERR(new_events)) {
>> +  

Re: [PATCH net] tcp/dccp: block BH for SYN processing

2017-03-01 Thread Soheil Hassas Yeganeh
On Wed, Mar 1, 2017 at 8:39 AM, Eric Dumazet  wrote:
> From: Eric Dumazet 
>
> SYN processing really was meant to be handled from BH.
>
> When I got rid of BH blocking while processing socket backlog
> in commit 5413d1babe8f ("net: do not block BH while processing socket
> backlog"), I forgot that a malicious user could transition to TCP_LISTEN
> from a state that allowed (SYN) packets to be parked in the socket
> backlog while socket is owned by the thread doing the listen() call.
>
> Sure enough syzkaller found this and reported the bug ;)
>
>
> =
> [ INFO: inconsistent lock state ]
> 4.10.0+ #60 Not tainted
> -
> inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
> syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
>  (&(>ehash_locks[i])->rlock){+.?...}, at:
> [] spin_lock include/linux/spinlock.h:299 [inline]
>  (&(>ehash_locks[i])->rlock){+.?...}, at:
> [] inet_ehash_insert+0x240/0xad0
> net/ipv4/inet_hashtables.c:407
> {IN-SOFTIRQ-W} state was registered at:
>   mark_irqflags kernel/locking/lockdep.c:2923 [inline]
>   __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
>   lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
>   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>   _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
>   spin_lock include/linux/spinlock.h:299 [inline]
>   inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
>   reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
>   inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
>   tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
>   tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
>   tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
>   tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
>   tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
>   ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
>   NF_HOOK include/linux/netfilter.h:257 [inline]
>   ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
>   dst_input include/net/dst.h:492 [inline]
>   ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
>   NF_HOOK include/linux/netfilter.h:257 [inline]
>   ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
>   __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
>   __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
>   netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
>   napi_skb_finish net/core/dev.c:4602 [inline]
>   napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
>   e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 
> [inline]
>   e1000_clean_rx_irq+0x5e0/0x1490
> drivers/net/ethernet/intel/e1000/e1000_main.c:4489
>   e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
>   napi_poll net/core/dev.c:5171 [inline]
>   net_rx_action+0xe70/0x1900 net/core/dev.c:5236
>   __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
>   invoke_softirq kernel/softirq.c:364 [inline]
>   irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>   exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>   do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
>   ret_from_intr+0x0/0x20
>   native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
>   arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>   default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
>   arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
>   default_idle_call+0x36/0x60 kernel/sched/idle.c:96
>   cpuidle_idle_call kernel/sched/idle.c:154 [inline]
>   do_idle+0x348/0x440 kernel/sched/idle.c:243
>   cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
>   start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
>   verify_cpu+0x0/0xfc
> irq event stamp: 1741
> hardirqs last  enabled at (1741): []
> __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
> [inline]
> hardirqs last  enabled at (1741): []
> _raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
> hardirqs last disabled at (1740): []
> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
> hardirqs last disabled at (1740): []
> _raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
> softirqs last  enabled at (1738): []
> __do_softirq+0x7cf/0xb7d kernel/softirq.c:310
> softirqs last disabled at (1571): []
> do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>CPU0
>
>   lock(&(>ehash_locks[i])->rlock);
>   
> lock(&(>ehash_locks[i])->rlock);
>
>  *** DEADLOCK ***
>
> 1 lock held by syz-executor0/5090:
>  #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
> include/net/sock.h:1460 [inline]
>  #0:  (sk_lock-AF_INET6){+.+.+.}, at: []
> sock_setsockopt+0x233/0x1e40 net/core/sock.c:683
>
> stack backtrace:
> CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
> Hardware name: QEMU Standard PC 

Re: net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Cong Wang
On Wed, Mar 1, 2017 at 1:24 PM, Cong Wang  wrote:
> On Wed, Mar 1, 2017 at 11:27 AM, Dmitry Vyukov  wrote:
>> Hello,
>>
>> I am seeing the following use-after-free report while running
>> syzkaller fuzzer on
>> linux-next/3e7350242c6f3d41d28e03418bd781cc1b7bad5f:
>>
>> ==
>> BUG: KASAN: use-after-free in constant_test_bit
>> arch/x86/include/asm/bitops.h:324 [inline] at addr 8801c56d5460
>> BUG: KASAN: use-after-free in sock_flag include/net/sock.h:789
>> [inline] at addr 8801c56d5460
>> BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
>> net/core/sock.c:1630 at addr 8801c56d5460
>> Read of size 8 by task syz-fuzzer/3261
>> CPU: 0 PID: 3261 Comm: syz-fuzzer Not tainted 4.10.0-next-20170224+ #1
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> Call Trace:
>>  
>>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>>  constant_test_bit arch/x86/include/asm/bitops.h:324 [inline]
>>  sock_flag include/net/sock.h:789 [inline]
>>  sock_wfree+0x118/0x120 net/core/sock.c:1630
>>  skb_release_head_state+0xfc/0x200 net/core/skbuff.c:654
>>  skb_release_all+0x15/0x60 net/core/skbuff.c:667
>>  __kfree_skb+0x15/0x20 net/core/skbuff.c:683
>>  kfree_skb+0x16e/0x4c0 net/core/skbuff.c:704
>>  ndisc_error_report+0xbb/0x190 net/ipv6/ndisc.c:683
>>  neigh_invalidate+0x23e/0x570 net/core/neighbour.c:848
>>  neigh_timer_handler+0x4e7/0x1140 net/core/neighbour.c:933
>>  call_timer_fn+0x241/0x820 kernel/time/timer.c:1266
>>  expire_timers kernel/time/timer.c:1305 [inline]
>>  __run_timers+0x960/0xcf0 kernel/time/timer.c:1599
>>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1612
>>  __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
>>  invoke_softirq kernel/softirq.c:364 [inline]
>>  irq_exit+0x1cc/0x200 kernel/softirq.c:405
>>  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707
>
> This one looks very similar to a previous one:
> https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ
>
> Both happen on raw v6 sockets.
>
> For me, it seems the sk refcnt is not correct, skb should still hold
> a refcnt so it should not be freed before kfree_skb() in a timer
> handler...

More precisely, after this commit:

commit 2b85a34e911bf483c27cfdd124aeb1605145dc80
Author: Eric Dumazet 
Date:   Thu Jun 11 02:55:43 2009 -0700

net: No more expensive sock_hold()/sock_put() on each tx

we don't take (old) refcnt any more on TX path, sk_wmem_alloc
is the new refcnt. ;)


[PATCH 1/2] dccp: Unlock sock before calling sk_free()

2017-03-01 Thread Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo 

The code where sk_clone() came from created a new socket and locked it,
but then, on the error path didn't unlock it.

This problem stayed there for a long while, till b0691c8ee7c2 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_lock()) were not audited and the
one in dccp_create_openreq_child() remained.

Now in the age of the syzkaller fuzzer, this was finally uncovered, as
reported by Dmitry:

  8< 

I've got the following report while running syzkaller fuzzer on
86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)")

  [ BUG: held lock freed! ]
  4.10.0+ #234 Not tainted
  -
  syz-executor6/6898 is freeing memory
  88006286cac0-88006286d3b7, with a lock still held there!
   (slock-AF_INET6){+.-...}, at: [] spin_lock
  include/linux/spinlock.h:299 [inline]
   (slock-AF_INET6){+.-...}, at: []
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
  5 locks held by syz-executor6/6898:
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
  include/net/sock.h:1460 [inline]
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: []
  inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
   #1:  (rcu_read_lock){..}, at: []
  inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
   #2:  (rcu_read_lock){..}, at: [] __skb_unlink
  include/linux/skbuff.h:1767 [inline]
   #2:  (rcu_read_lock){..}, at: [] __skb_dequeue
  include/linux/skbuff.h:1783 [inline]
   #2:  (rcu_read_lock){..}, at: []
  process_backlog+0x264/0x730 net/core/dev.c:4835
   #3:  (rcu_read_lock){..}, at: []
  ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
   #4:  (slock-AF_INET6){+.-...}, at: [] spin_lock
  include/linux/spinlock.h:299 [inline]
   #4:  (slock-AF_INET6){+.-...}, at: []
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504

Fix it just like was done by b0691c8ee7c2 ("net: Unlock sock before calling
sk_free()").

Reported-by: Dmitry Vyukov 
Cc: Cong Wang 
Cc: Eric Dumazet 
Cc: Gerrit Renker 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20170301153510.ge15...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 net/dccp/minisocks.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index 53eddf99e4f6..d20d948a98ed 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(const struct sock *sk,
/* It is still raw copy of parent, so invalidate
 * destructor and make plain sk_free() */
newsk->sk_destruct = NULL;
+   bh_unlock_sock(newsk);
sk_free(newsk);
return NULL;
}
-- 
2.9.3



[PATCH 2/2] net: Introduce sk_clone_lock() error path routine

2017-03-01 Thread Arnaldo Carvalho de Melo
From: Arnaldo Carvalho de Melo 

When handling problems in cloning a socket with the sk_clone_lock()
function we need to perform several steps that were open coded in it and
its callers, so introduce a routine to avoid this duplication:
sk_free_unlock_clone().

Cc: Cong Wang 
Cc: Dmitry Vyukov 
Cc: Eric Dumazet 
Cc: Gerrit Renker 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/n/net-ui6laqkotycunhtmqryl9...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 include/net/sock.h   |  1 +
 net/core/sock.c  | 16 +++-
 net/dccp/minisocks.c |  6 +-
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index c4f5e6fca17c..93d1160bcd32 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1520,6 +1520,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
 void sk_free(struct sock *sk);
 void sk_destruct(struct sock *sk);
 struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority);
+void sk_free_unlock_clone(struct sock *sk);
 
 struct sk_buff *sock_wmalloc(struct sock *sk, unsigned long size, int force,
 gfp_t priority);
diff --git a/net/core/sock.c b/net/core/sock.c
index 4eca27dc5c94..a3d9bb20f65d 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1540,11 +1540,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
is_charged = sk_filter_charge(newsk, filter);
 
if (unlikely(!is_charged || xfrm_sk_clone_policy(newsk, sk))) {
-   /* It is still raw copy of parent, so invalidate
-* destructor and make plain sk_free() */
-   newsk->sk_destruct = NULL;
-   bh_unlock_sock(newsk);
-   sk_free(newsk);
+   sk_free_unlock_clone(newsk);
newsk = NULL;
goto out;
}
@@ -1593,6 +1589,16 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 }
 EXPORT_SYMBOL_GPL(sk_clone_lock);
 
+void sk_free_unlock_clone(struct sock *sk)
+{
+   /* It is still raw copy of parent, so invalidate
+* destructor and make plain sk_free() */
+   sk->sk_destruct = NULL;
+   bh_unlock_sock(sk);
+   sk_free(sk);
+}
+EXPORT_SYMBOL_GPL(sk_free_unlock_clone);
+
 void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
 {
u32 max_segs = 1;
diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index d20d948a98ed..e267e6f4c9a5 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -119,11 +119,7 @@ struct sock *dccp_create_openreq_child(const struct sock *sk,
 * Activate features: initialise CCIDs, sequence windows etc.
 */
if (dccp_feat_activate_values(newsk, >dreq_featneg)) {
-   /* It is still raw copy of parent, so invalidate
-* destructor and make plain sk_free() */
-   newsk->sk_destruct = NULL;
-   bh_unlock_sock(newsk);
-   sk_free(newsk);
+   sk_free_unlock_clone(newsk);
return NULL;
}
dccp_init_xmit_timers(newsk);
-- 
2.9.3
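
For readers following the two patches above, here is a minimal user-space sketch of the error-path pattern they describe: a clone is returned locked, so a failure after cloning has to clear the destructor, unlock, and only then free. Everything here (struct fake_sock, fake_clone_lock(), fake_free_unlock_clone(), the counters) is a hypothetical stand-in chosen for illustration and is not the kernel API; the "lock" is a plain flag so that a free with the lock still held can be counted instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket; not the kernel's struct sock. */
struct fake_sock {
	bool locked;                            /* models the socket spinlock */
	void (*destruct)(struct fake_sock *sk); /* models sk_destruct */
};

static int held_lock_freed; /* frees performed with the lock still held */
static int destruct_calls;  /* destructor invocations */

static void fake_destruct(struct fake_sock *sk)
{
	(void)sk;
	destruct_calls++;
}

/* Models a plain free: complains if the lock is held, runs the destructor. */
static void fake_free(struct fake_sock *sk)
{
	if (sk->locked)
		held_lock_freed++;
	if (sk->destruct)
		sk->destruct(sk);
	free(sk);
}

/* Returns a locked raw copy of the parent, or NULL on allocation failure. */
static struct fake_sock *fake_clone_lock(const struct fake_sock *parent)
{
	struct fake_sock *newsk = malloc(sizeof(*newsk));

	if (!newsk)
		return NULL;
	*newsk = *parent;
	newsk->locked = true;
	return newsk;
}

/* Error-path helper: the clone is still a raw copy of the parent, so drop
 * the destructor, release the lock, and only then free it. */
static void fake_free_unlock_clone(struct fake_sock *sk)
{
	assert(sk != NULL);
	sk->destruct = NULL;
	sk->locked = false;
	fake_free(sk);
}

/* Caller that may fail after cloning; 'fail' forces the error path. */
static struct fake_sock *fake_create_child(const struct fake_sock *parent,
					   bool fail)
{
	struct fake_sock *newsk = fake_clone_lock(parent);

	if (!newsk)
		return NULL;
	if (fail) {
		fake_free_unlock_clone(newsk);
		return NULL;
	}
	return newsk; /* success: returned to the caller still locked */
}
```

With the helper in place, every caller's error path is a single call, which is the duplication the second patch sets out to remove.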



Re: net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Cong Wang
On Wed, Mar 1, 2017 at 11:27 AM, Dmitry Vyukov  wrote:
> Hello,
>
> I am seeing the following use-after-free report while running
> syzkaller fuzzer on
> linux-next/3e7350242c6f3d41d28e03418bd781cc1b7bad5f:
>
> ==
> BUG: KASAN: use-after-free in constant_test_bit
> arch/x86/include/asm/bitops.h:324 [inline] at addr 8801c56d5460
> BUG: KASAN: use-after-free in sock_flag include/net/sock.h:789
> [inline] at addr 8801c56d5460
> BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
> net/core/sock.c:1630 at addr 8801c56d5460
> Read of size 8 by task syz-fuzzer/3261
> CPU: 0 PID: 3261 Comm: syz-fuzzer Not tainted 4.10.0-next-20170224+ #1
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Call Trace:
>  
>  __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
>  constant_test_bit arch/x86/include/asm/bitops.h:324 [inline]
>  sock_flag include/net/sock.h:789 [inline]
>  sock_wfree+0x118/0x120 net/core/sock.c:1630
>  skb_release_head_state+0xfc/0x200 net/core/skbuff.c:654
>  skb_release_all+0x15/0x60 net/core/skbuff.c:667
>  __kfree_skb+0x15/0x20 net/core/skbuff.c:683
>  kfree_skb+0x16e/0x4c0 net/core/skbuff.c:704
>  ndisc_error_report+0xbb/0x190 net/ipv6/ndisc.c:683
>  neigh_invalidate+0x23e/0x570 net/core/neighbour.c:848
>  neigh_timer_handler+0x4e7/0x1140 net/core/neighbour.c:933
>  call_timer_fn+0x241/0x820 kernel/time/timer.c:1266
>  expire_timers kernel/time/timer.c:1305 [inline]
>  __run_timers+0x960/0xcf0 kernel/time/timer.c:1599
>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1612
>  __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
>  invoke_softirq kernel/softirq.c:364 [inline]
>  irq_exit+0x1cc/0x200 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707

This one looks very similar to a previous one:
https://groups.google.com/forum/#!topic/syzkaller/BhyN5OFd7sQ

Both happen on raw v6 sockets.

For me, it seems the sk refcnt is not correct, skb should still hold
a refcnt so it should not be freed before kfree_skb() in a timer
handler...



> RIP: 0033:0x46a7c3
> RSP: 002b:00c83e2d5180 EFLAGS: 0202 ORIG_RAX: ff10
> RAX:  RBX: 0046a7b0 RCX: 00c820471200
> RDX: 0020 RSI: 00c839e1bba0 RDI: 00c83e2d5190
> RBP: 0002 R08: 0002 R09: 0073
> R10: 00c839a31b03 R11: 00c839e1bbf8 R12: 
> R13:  R14: 0010 R15: 01263e90
>  
> Object at 8801c56d5400, in cache RAWv6 size: 1480
> Allocated:
> PID = 12540
>  kmem_cache_alloc+0x102/0x680 mm/slab.c:3568
>  sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1332
>  sk_alloc+0x8c/0x470 net/core/sock.c:1394
>  inet6_create+0x44d/0x1140 net/ipv6/af_inet6.c:183
>  __sock_create+0x4e4/0x870 net/socket.c:1197
>  sock_create net/socket.c:1237 [inline]
>  SYSC_socket net/socket.c:1267 [inline]
>  SyS_socket+0xf9/0x230 net/socket.c:1247
>  entry_SYSCALL_64_fastpath+0x1f/0xc2
> Freed:
> PID = 12572
>  kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:580
>  __cache_free mm/slab.c:3510 [inline]
>  kmem_cache_free+0x71/0x240 mm/slab.c:3770
>  sk_prot_free net/core/sock.c:1375 [inline]
>  __sk_destruct+0x487/0x6b0 net/core/sock.c:1450
>  sk_destruct+0x47/0x80 net/core/sock.c:1458
>  __sk_free+0x57/0x230 net/core/sock.c:1466
>  sk_free+0x23/0x30 net/core/sock.c:1477
>  sock_put include/net/sock.h:1644 [inline]
>  sk_common_release+0x3bf/0x5e0 net/core/sock.c:2781
>  rawv6_close+0x4c/0x80 net/ipv6/raw.c:1218
>  inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
>  inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432
>  sock_release+0x8d/0x1e0 net/socket.c:597
>  sock_close+0x16/0x20 net/socket.c:1061
>  __fput+0x332/0x7f0 fs/file_table.c:208
>  fput+0x15/0x20 fs/file_table.c:244
>  task_work_run+0x18a/0x260 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0x1956/0x2900 kernel/exit.c:873
>  do_group_exit+0x149/0x420 kernel/exit.c:977
>  get_signal+0x7e0/0x1820 kernel/signal.c:2313
>  do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807
>  exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:156
>  prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
>  syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
>  entry_SYSCALL_64_fastpath+0xc0/0xc2


Re: net: sleeping function called from invalid context in net_enable_timestamp

2017-03-01 Thread Eric Dumazet
On Wed, 2017-03-01 at 11:59 -0800, Eric Dumazet wrote:
> On Wed, Mar 1, 2017 at 11:51 AM, Dmitry Vyukov  wrote:
> >
> > Hello,
> >
> > I've got the following report while running syzkaller fuzzer on
> > e5d56efc97f8240d0b5d66c03949382b6d7e5570
> 
> 
> 
> Right, a listener is playing fool games.
> 
> We need to use a work queue for all net_enable_timestamp() invocations

Something like :

diff --git a/net/core/dev.c b/net/core/dev.c
index e63bf61b19be029e30ac40443c0e2edb24de4a73..10fac295f4d4dff983156e2cac22456db948b32b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1702,15 +1702,27 @@ static void netstamp_clear(struct work_struct *work)
 {
int deferred = atomic_xchg(&netstamp_needed_deferred, 0);
 
-   while (deferred--)
+   while (deferred < 0) {
+   deferred++;
static_key_slow_dec(&netstamp_needed);
+   }
+   while (deferred > 0) {
+   deferred--;
+   static_key_slow_inc(&netstamp_needed);
+   }
 }
 static DECLARE_WORK(netstamp_work, netstamp_clear);
 #endif
 
 void net_enable_timestamp(void)
 {
+#ifdef HAVE_JUMP_LABEL
+   /* net_enable_timestamp() can be called from non process context */
+   atomic_inc(&netstamp_needed_deferred);
+   schedule_work(&netstamp_work);
+#else
static_key_slow_inc(&netstamp_needed);
+#endif
 }
 EXPORT_SYMBOL(net_enable_timestamp);
 
@@ -1718,7 +1730,7 @@ void net_disable_timestamp(void)
 {
 #ifdef HAVE_JUMP_LABEL
/* net_disable_timestamp() can be called from non process context */
-   atomic_inc(&netstamp_needed_deferred);
+   atomic_dec(&netstamp_needed_deferred);
schedule_work(&netstamp_work);
 #else
static_key_slow_dec(&netstamp_needed);
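
The idea in the draft above can be sketched in user space as follows: enable and disable requests only adjust a signed atomic counter, and a later drain step (the work item in the draft) swaps the counter to zero and applies the net change. This is a rough illustration under stated assumptions: needed_deferred, key_count, request_enable(), request_disable() and drain_deferred() are hypothetical names, a plain int stands in for the static key, and C11 atomics stand in for the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Net number of pending enable (+1) and disable (-1) requests. */
static atomic_int needed_deferred;

/* Stand-in for the static key's enable count (hypothetical). */
static int key_count;

/* Safe from any context: only touches the atomic counter. */
static void request_enable(void)
{
	atomic_fetch_add(&needed_deferred, 1);
}

static void request_disable(void)
{
	atomic_fetch_sub(&needed_deferred, 1);
}

/* Runs later, in a context where the slow update is allowed. */
static void drain_deferred(void)
{
	/* Grab the pending net change and reset the counter in one step. */
	int deferred = atomic_exchange(&needed_deferred, 0);

	while (deferred < 0) {
		deferred++;
		key_count--; /* models static_key_slow_dec() */
	}
	while (deferred > 0) {
		deferred--;
		key_count++; /* models static_key_slow_inc() */
	}
	/* Holds as long as no drain sees more disables than prior enables. */
	assert(key_count >= 0);
}
```

Because the counter is signed, an enable and a disable that arrive between two drains cancel out and the key is not touched at all.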




[iproute PATCH v3 1/1] color: use "light" colors for dark background

2017-03-01 Thread Petr Vorel
COLORFGBG environment variable is used to detect dark background.

Idea and a bit of code is borrowed from Vim, thanks.

Signed-off-by: Petr Vorel 
---
Changes v2->v3: remove unnecessary cast.
---
 include/color.h |  1 +
 lib/color.c | 45 -
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/include/color.h b/include/color.h
index c1c29831..ba0b237e 100644
--- a/include/color.h
+++ b/include/color.h
@@ -12,6 +12,7 @@ enum color_attr {
 };
 
 void enable_color(void);
+void set_color_palette(void);
 int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...);
 enum color_attr ifa_family_color(__u8 ifa_family);
 enum color_attr oper_state_color(__u8 state);
diff --git a/lib/color.c b/lib/color.c
index 95596be2..810fb1fa 100644
--- a/lib/color.c
+++ b/lib/color.c
@@ -1,5 +1,7 @@
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -14,6 +16,13 @@ enum color {
C_MAGENTA,
C_CYAN,
C_WHITE,
+   C_BOLD_RED,
+   C_BOLD_GREEN,
+   C_BOLD_YELLOW,
+   C_BOLD_BLUE,
+   C_BOLD_MAGENTA,
+   C_BOLD_CYAN,
+   C_BOLD_WHITE,
C_CLEAR
 };
 
@@ -25,25 +34,59 @@ static const char * const color_codes[] = {
"\e[35m",
"\e[36m",
"\e[37m",
+   "\e[1;31m",
+   "\e[1;32m",
+   "\e[1;33m",
+   "\e[1;34m",
+   "\e[1;35m",
+   "\e[1;36m",
+   "\e[1;37m",
"\e[0m",
NULL,
 };
 
 static enum color attr_colors[] = {
+   /* light background */
C_CYAN,
C_YELLOW,
C_MAGENTA,
C_BLUE,
C_GREEN,
C_RED,
+   C_CLEAR,
+
+   /* dark background */
+   C_BOLD_CYAN,
+   C_BOLD_YELLOW,
+   C_BOLD_MAGENTA,
+   C_BOLD_BLUE,
+   C_BOLD_GREEN,
+   C_BOLD_RED,
C_CLEAR
 };
 
+static int is_dark_bg;
 static int color_is_enabled;
 
 void enable_color(void)
 {
color_is_enabled = 1;
+   set_color_palette();
+}
+
+void set_color_palette(void)
+{
+   char *p = getenv("COLORFGBG");
+
+   /*
+* COLORFGBG environment variable usually contains either two or three
+* values separated by semicolons; we want the last value in either case.
+* If this value is 0-6 or 8, background is dark.
+*/
+   if (p && (p = strrchr(p, ';')) != NULL
+   && ((p[1] >= '0' && p[1] <= '6') || p[1] == '8')
+   && p[2] == '\0')
+   is_dark_bg = 1;
 }
 
 int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...)
@@ -58,7 +101,7 @@ int color_fprintf(FILE *fp, enum color_attr attr, const char *fmt, ...)
goto end;
}
 
-   ret += fprintf(fp, "%s", color_codes[attr_colors[attr]]);
+   ret += fprintf(fp, "%s", color_codes[attr_colors[is_dark_bg ? attr + 7 : attr]]);
ret += vfprintf(fp, fmt, args);
ret += fprintf(fp, "%s", color_codes[C_CLEAR]);
 
-- 
2.11.0
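
To make the detection rule in the patch above easier to try out, here is a standalone sketch that takes the COLORFGBG value as a parameter instead of calling getenv(). The function names is_dark_background() and palette_index() are hypothetical and do not appear in the patch; the logic follows the comment in set_color_palette() (last semicolon-separated field, a single character that is 0-6 or 8) and the attr + 7 offset used in color_fprintf().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the last ';'-separated field of a COLORFGBG-style string
 * is a single character in 0-6 or 8, which the patch treats as a dark
 * background; returns 0 otherwise, including for NULL. */
static int is_dark_background(const char *colorfgbg)
{
	const char *p;

	if (colorfgbg == NULL)
		return 0;

	p = strrchr(colorfgbg, ';');
	if (p == NULL)
		return 0;

	/* p[2] is only read when p[1] is a digit, so a trailing ';' is safe. */
	return ((p[1] >= '0' && p[1] <= '6') || p[1] == '8') && p[2] == '\0';
}

/* Index into a table laid out as one block of light-background entries
 * followed by an equally sized block of dark-background entries. */
static int palette_index(int attr, int dark, int attrs_per_palette)
{
	assert(attr >= 0 && attr < attrs_per_palette);
	return dark ? attr + attrs_per_palette : attr;
}
```

Passing the block size as a parameter is only a design choice for the sketch; the patch hard-codes 7, which matches the six colors plus C_CLEAR in the light-background half of attr_colors[].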



Re: net: sleeping function called from invalid context in net_enable_timestamp

2017-03-01 Thread Eric Dumazet
On Wed, 2017-03-01 at 12:07 -0800, Eric Dumazet wrote:
> On Wed, 2017-03-01 at 11:59 -0800, Eric Dumazet wrote:
> > On Wed, Mar 1, 2017 at 11:51 AM, Dmitry Vyukov  wrote:
> > >
> > > Hello,
> > >
> > > I've got the following report while running syzkaller fuzzer on
> > > e5d56efc97f8240d0b5d66c03949382b6d7e5570
> > 
> > 
> > 
> > Right, a listener is playing fool games.
> > 
> > We need to use a work queue for all net_enable_timestamp() invocations
> 
> Something like :

We need something better, I will send a patch keeping good performance
for this jump label thing.




net: use-after-free in neigh_timer_handler/sock_wfree

2017-03-01 Thread Dmitry Vyukov
Hello,

I am seeing the following use-after-free report while running
syzkaller fuzzer on
linux-next/3e7350242c6f3d41d28e03418bd781cc1b7bad5f:

==
BUG: KASAN: use-after-free in constant_test_bit
arch/x86/include/asm/bitops.h:324 [inline] at addr 8801c56d5460
BUG: KASAN: use-after-free in sock_flag include/net/sock.h:789
[inline] at addr 8801c56d5460
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
net/core/sock.c:1630 at addr 8801c56d5460
Read of size 8 by task syz-fuzzer/3261
CPU: 0 PID: 3261 Comm: syz-fuzzer Not tainted 4.10.0-next-20170224+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Call Trace:
 
 __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
 constant_test_bit arch/x86/include/asm/bitops.h:324 [inline]
 sock_flag include/net/sock.h:789 [inline]
 sock_wfree+0x118/0x120 net/core/sock.c:1630
 skb_release_head_state+0xfc/0x200 net/core/skbuff.c:654
 skb_release_all+0x15/0x60 net/core/skbuff.c:667
 __kfree_skb+0x15/0x20 net/core/skbuff.c:683
 kfree_skb+0x16e/0x4c0 net/core/skbuff.c:704
 ndisc_error_report+0xbb/0x190 net/ipv6/ndisc.c:683
 neigh_invalidate+0x23e/0x570 net/core/neighbour.c:848
 neigh_timer_handler+0x4e7/0x1140 net/core/neighbour.c:933
 call_timer_fn+0x241/0x820 kernel/time/timer.c:1266
 expire_timers kernel/time/timer.c:1305 [inline]
 __run_timers+0x960/0xcf0 kernel/time/timer.c:1599
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1612
 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1cc/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:658 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707
RIP: 0033:0x46a7c3
RSP: 002b:00c83e2d5180 EFLAGS: 0202 ORIG_RAX: ff10
RAX:  RBX: 0046a7b0 RCX: 00c820471200
RDX: 0020 RSI: 00c839e1bba0 RDI: 00c83e2d5190
RBP: 0002 R08: 0002 R09: 0073
R10: 00c839a31b03 R11: 00c839e1bbf8 R12: 
R13:  R14: 0010 R15: 01263e90
 
Object at 8801c56d5400, in cache RAWv6 size: 1480
Allocated:
PID = 12540
 kmem_cache_alloc+0x102/0x680 mm/slab.c:3568
 sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1332
 sk_alloc+0x8c/0x470 net/core/sock.c:1394
 inet6_create+0x44d/0x1140 net/ipv6/af_inet6.c:183
 __sock_create+0x4e4/0x870 net/socket.c:1197
 sock_create net/socket.c:1237 [inline]
 SYSC_socket net/socket.c:1267 [inline]
 SyS_socket+0xf9/0x230 net/socket.c:1247
 entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 12572
 kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:580
 __cache_free mm/slab.c:3510 [inline]
 kmem_cache_free+0x71/0x240 mm/slab.c:3770
 sk_prot_free net/core/sock.c:1375 [inline]
 __sk_destruct+0x487/0x6b0 net/core/sock.c:1450
 sk_destruct+0x47/0x80 net/core/sock.c:1458
 __sk_free+0x57/0x230 net/core/sock.c:1466
 sk_free+0x23/0x30 net/core/sock.c:1477
 sock_put include/net/sock.h:1644 [inline]
 sk_common_release+0x3bf/0x5e0 net/core/sock.c:2781
 rawv6_close+0x4c/0x80 net/ipv6/raw.c:1218
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432
 sock_release+0x8d/0x1e0 net/socket.c:597
 sock_close+0x16/0x20 net/socket.c:1061
 __fput+0x332/0x7f0 fs/file_table.c:208
 fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x18a/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0x1956/0x2900 kernel/exit.c:873
 do_group_exit+0x149/0x420 kernel/exit.c:977
 get_signal+0x7e0/0x1820 kernel/signal.c:2313
 do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807
 exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:156
 prepare_exit_to_usermode arch/x86/entry/common.c:190 [inline]
 syscall_return_slowpath+0x4d3/0x570 arch/x86/entry/common.c:259
 entry_SYSCALL_64_fastpath+0xc0/0xc2


Re: net: sleeping function called from invalid context in net_enable_timestamp

2017-03-01 Thread Eric Dumazet
On Wed, Mar 1, 2017 at 11:51 AM, Dmitry Vyukov  wrote:
>
> Hello,
>
> I've got the following report while running syzkaller fuzzer on
> e5d56efc97f8240d0b5d66c03949382b6d7e5570



Right, a listener is playing fool games.

We need to use a work queue for all net_enable_timestamp() invocations


Re: [PATCH v4 00/19] Replace PCI pool by DMA pool API

2017-03-01 Thread Joe Perches
On Wed, 2017-03-01 at 16:55 +0100, Romain Perier wrote:
> support to warn about this old API in checkpath.pl

checkpatch

This part isn't true anymore, but it seems sensible enough, thanks.



net/sctp: use-after-free in sctp_association_put

2017-03-01 Thread Dmitry Vyukov
Hello,

I've got the following report while running syzkaller fuzzer on
linux-next/8813198236a044b76e251dcae937b180dd527999:

BUG: KASAN: use-after-free in sctp_association_destroy
net/sctp/associola.c:416 [inline] at addr 8801c0fa415c
BUG: KASAN: use-after-free in sctp_association_put+0x294/0x300
net/sctp/associola.c:881 at addr 8801c0fa415c
Read of size 1 by task syz-executor1/10956
CPU: 1 PID: 10956 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170213 #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Call Trace:
 
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:162
 print_address_description mm/kasan/report.c:200 [inline]
 kasan_report_error mm/kasan/report.c:289 [inline]
 kasan_report.part.2+0x1e5/0x4b0 mm/kasan/report.c:311
 kasan_report mm/kasan/report.c:329 [inline]
 __asan_report_load1_noabort+0x29/0x30 mm/kasan/report.c:329
 sctp_association_destroy net/sctp/associola.c:416 [inline]
 sctp_association_put+0x294/0x300 net/sctp/associola.c:881
 sctp_generate_timeout_event+0x115/0x360 net/sctp/sm_sideeffect.c:317
 sctp_generate_t1_init_event+0x1a/0x20 net/sctp/sm_sideeffect.c:329
 call_timer_fn+0x241/0x820 kernel/time/timer.c:1308
 expire_timers kernel/time/timer.c:1348 [inline]
 __run_timers+0x9e7/0xe90 kernel/time/timer.c:1642
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1655
 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x1cc/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:658 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707
RIP: 0010:arch_local_irq_enable arch/x86/include/asm/paravirt.h:788 [inline]
RIP: 0010:__raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
RIP: 0010:_raw_spin_unlock_irq+0x56/0x70 kernel/locking/spinlock.c:199
RSP: 0018:8801c280f178 EFLAGS: 0286 ORIG_RAX: ff10
RAX: dc00 RBX: 8801dbf24a00 RCX: 0006
RDX: 10a18d03 RSI: 8801d71c88e0 RDI: 850c6818
RBP: 8801c280f180 R08: 0002 R09: 
R10: 0006 R11:  R12: 8801c0f3a4c0
R13: 110038501e38 R14: 8801d71c80c0 R15: 8801d71c80c0
 
 finish_lock_switch kernel/sched/sched.h:1248 [inline]
 finish_task_switch+0x1c2/0x720 kernel/sched/core.c:2792
 context_switch kernel/sched/core.c:2928 [inline]
 __schedule+0x893/0x2290 kernel/sched/core.c:3468
 preempt_schedule_common+0x35/0x60 kernel/sched/core.c:3579
 _cond_resched+0x17/0x20 kernel/sched/core.c:4977
 slab_pre_alloc_hook mm/slab.h:427 [inline]
 slab_alloc mm/slab.c:3390 [inline]
 __do_kmalloc mm/slab.c:3730 [inline]
 __kmalloc_track_caller+0x26a/0x690 mm/slab.c:3747
 kstrdup+0x39/0x70 mm/util.c:54
 snd_timer_instance_new+0xfc/0x5d0 sound/core/timer.c:110
 snd_timer_open+0x878/0x1740 sound/core/timer.c:290
 snd_timer_user_tselect sound/core/timer.c:1621 [inline]
 __snd_timer_user_ioctl sound/core/timer.c:1901 [inline]
 snd_timer_user_ioctl+0x9b1/0x34a0 sound/core/timer.c:1931
 vfs_ioctl fs/ioctl.c:43 [inline]
 do_vfs_ioctl+0x1bf/0x1790 fs/ioctl.c:683
 SYSC_ioctl fs/ioctl.c:698 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:689
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x44fb59
RSP: 002b:7f0dc184db58 EFLAGS: 0212 ORIG_RAX: 0010
RAX: ffda RBX: 40345410 RCX: 0044fb59
RDX: 20001000 RSI: 40345410 RDI: 0005
RBP: 0005 R08:  R09: 
R10:  R11: 0212 R12: 00708000
R13: 00a5fc57 R14: 7f0dc184e9c0 R15: 
Object at 8801c0fa4140, in cache kmalloc-4096 size: 4096
Allocated:
PID = 10965
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:504
 set_track mm/kasan/kasan.c:516 [inline]
 kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:607
 kmem_cache_alloc_trace+0x10b/0x670 mm/slab.c:3634
 kmalloc include/linux/slab.h:490 [inline]
 kzalloc include/linux/slab.h:663 [inline]
 sctp_association_new+0x114/0x2120 net/sctp/associola.c:306
 sctp_sendmsg+0x1585/0x38f0 net/sctp/socket.c:1835
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
 sock_sendmsg_nosec net/socket.c:633 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:643
 ___sys_sendmsg+0x8fa/0x9f0 net/socket.c:1985
 __sys_sendmsg+0x138/0x300 net/socket.c:2019
 SYSC_sendmsg net/socket.c:2030 [inline]
 SyS_sendmsg+0x2d/0x50 net/socket.c:2026
 entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 10965
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:504
 set_track mm/kasan/kasan.c:516 [inline]
 kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:580
 __cache_free mm/slab.c:3510 [inline]
 kfree+0xd3/0x250 mm/slab.c:3827
 sctp_association_destroy 

Re: [PATCH net] bridge: Fix error path in nbp_vlan_init

2017-03-01 Thread Jiri Pirko
Wed, Mar 01, 2017 at 03:50:45PM CET, yot...@mellanox.com wrote:
>Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
>call fails, the vlan_hash isn't destroyed before it is initialized.
>
>Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
>CC: Roopa Prabhu 
>Signed-off-by: Yotam Gigi 

Reviewed-by: Jiri Pirko 


Re: [PATCH net] bridge: Fix error path in nbp_vlan_init

2017-03-01 Thread Roopa Prabhu
On 3/1/17, 6:50 AM, Yotam Gigi wrote:
> Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
> call fails, the vlan_hash isn't destroyed before it is initialized.
>
> Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
> CC: Roopa Prabhu 
> Signed-off-by: Yotam Gigi 
> ---
>  
Acked-by: Roopa Prabhu 

Thanks yotam.




Re: [iproute PATCH v2 1/1] color: use "light" colors for dark background

2017-03-01 Thread Stephen Hemminger
On Mon, 27 Feb 2017 10:55:27 +0100
Petr Vorel  wrote:

> +void set_color_palette(void)
> +{
> + char *p = getenv("COLORFGBG");
> +
> + /*
> +  * COLORFGBG environment variable usually contains either two or three
> +  * values separated by semicolons; we want the last value in either case.
> +  * If this value is 0-6 or 8, background is dark.
> +  */
> + if (p && (p = (char *)strrchr(p, ';')) != NULL

Cast here is unnecessary. strrchr is defined as:
   char *strrchr(const char *s, int c);


Re: [PATCH net] net: route: add missing nla_policy entry for RTA_MARK attribute

2017-03-01 Thread David Miller
From: Liping Zhang 
Date: Mon, 27 Feb 2017 20:59:39 +0800

> From: Liping Zhang 
> 
> This will add stricter validating for RTA_MARK attribute.
> 
> Signed-off-by: Liping Zhang 

Looks good, applied, thanks.


Re: [PATCH 1/2] net: sched: make default fifo qdiscs appear in the dump

2017-03-01 Thread David Miller
From: Jiri Kosina 
Date: Sat, 25 Feb 2017 22:29:09 +0100 (CET)

> @@ -1066,6 +1066,7 @@ hfsc_change_class(struct Qdisc *sch, u32 classid, u32 parentid,
> _qdisc_ops, classid);
>   if (cl->qdisc == NULL)
>   cl->qdisc = _qdisc;
> + qdisc_hash_add(cl->qdisc, true);
>   INIT_LIST_HEAD(>children);
>   cl->vt_tree = RB_ROOT;
>   cl->cf_tree = RB_ROOT;
> @@ -1425,6 +1426,7 @@ hfsc_init_qdisc(struct Qdisc *sch, struct nlattr *opt)
> sch->handle);
>   if (q->root.qdisc == NULL)
>   q->root.qdisc = _qdisc;
> + qdisc_hash_add(q->root.qdisc, true);
>   INIT_LIST_HEAD(>root.children);
>   q->root.vt_tree = RB_ROOT;
>   q->root.cf_tree = RB_ROOT;

I'm not so sure it is legal to potentially pass _qdisc into
qdisc_hash_add().


Re: [PATCH net/ipv6] net/ipv6: avoid possible dead locking on addr_gen_mode sysctl

2017-03-01 Thread David Miller
From: Felix Jia 
Date: Mon, 27 Feb 2017 12:41:23 +1300

> The addr_gen_mode variable can be accessed by both sysctl and netlink.
> Replace rtnl_lock() with rtnl_trylock() to protect the sysctl operation and
> avoid the possible deadlock.
> 
> Signed-off-by: Felix Jia 

Applied and queued up for -stable, thanks.


Re:Re: [drivers/net/vxlan]Why rcu_read_lock is not obtained before rculist travelling

2017-03-01 Thread Xiaobo Yan

Hi Cong,

Thanks very much for your reply.  It's indeed acquired by the upper layer.

Thanks

At 2017-03-01 05:29:17, "Cong Wang"  wrote:
>On Tue, Feb 28, 2017 at 6:03 AM, Xiaobo Yan  wrote:
>> But I don’t find any rcu_read_lock invoked before travelling fdb_head list.  
>> In vxlan_xmit and vxlan_snoop function, vxlan_find_mac function is called to 
>> search the vxlan_fdb of the dst_mac or src_mac. Then information in 
>> vxlan_fdb  is used for further process.  But as no rcu_read_lock is obtained 
>> before the list travelling, I am wondering if it is possible that vxlan_fdb 
>> is freed when it is being used.
>>
>
>In both RX and TX paths, rcu read lock is acquired by upper layer.
>Check __dev_queue_xmit() and process_backlog().


Re: [PATCH 2/2] iproute2: add support for invisible qdisc dumping

2017-03-01 Thread Stephen Hemminger
On Sat, 25 Feb 2017 22:29:17 +0100 (CET)
Jiri Kosina  wrote:

> From: Jiri Kosina 
> 
> Support the new TCA_DUMP_INVISIBLE netlink attribute that allows asking 
> kernel to perform 'full qdisc dump', as for historical reasons some of the 
> default qdiscs are being hidden by the kernel.
> 
> The command syntax is being extended by voluntary 'invisible' argument to
> 'tc qdisc show'.
> 
> Signed-off-by: Jiri Kosina 

Still waiting for TCA_DUMP_INVISIBLE to make it into net-next


Re: [PATCH net 0/2] VXLAN/geneve RCU fixes

2017-03-01 Thread David Miller
From: Jakub Kicinski 
Date: Fri, 24 Feb 2017 11:43:35 -0800

> VXLAN and GENEVE need to take RCU lock explicitly because TX path
> only has the _bh() flavour of RCU locked.  Making the reconfiguration
> path wait for both normal and _bh() RCU would be bigger hassle so
> just acquire the lock, as suggested by Pravin:
> 
> https://www.mail-archive.com/netdev@vger.kernel.org/msg155583.html

Series applied and queued up for -stable, thanks.


Re: [PATCH v4 net] net: solve a NAPI race

2017-03-01 Thread David Miller
From: Eric Dumazet 
Date: Tue, 28 Feb 2017 10:34:50 -0800

> From: Eric Dumazet 
> 
> While playing with mlx4 hardware timestamping of RX packets, I found
> that some packets were received by TCP stack with a ~200 ms delay...
> 
> Since the timestamp was provided by the NIC, and my probe was added
> in tcp_v4_rcv() while in BH handler, I was confident it was not
> a sender issue, or a drop in the network.
> 
> This would happen with a very low probability, but hurting RPC
> workloads.
> 
> A NAPI driver normally arms the IRQ after the napi_complete_done(),
> after NAPI_STATE_SCHED is cleared, so that the hard irq handler can grab
> it.
> 
> Problem is that if another point in the stack grabs NAPI_STATE_SCHED bit
> while IRQ are not disabled, we might have later an IRQ firing and
> finding this bit set, right before napi_complete_done() clears it.
> 
> This can happen with busy polling users, or if gro_flush_timeout is
> used. But some other uses of napi_schedule() in drivers can cause this
> as well.
 ...
> Signed-off-by: Eric Dumazet 

Applied, thanks Eric.


Re: Page allocator order-0 optimizations merged

2017-03-01 Thread Tariq Toukan


On 01/03/2017 3:48 PM, Jesper Dangaard Brouer wrote:

Hi NetDev community,

I just wanted to make net driver people aware that this MM commit[1] got
merged and is available in net-next.

  commit 374ad05ab64d ("mm, page_alloc: only use per-cpu allocator for irq-safe requests")
  [1] https://git.kernel.org/davem/net-next/c/374ad05ab64d696

It provides approx 14% speedup of order-0 page allocations.  I do know
most drivers do their own page-recycling.  Thus, this gain will only be
seen when this page recycling is insufficient, which Tariq was affected
by AFAIK.

Thanks Jesper, this is great news!
I will start perf testing this tomorrow.


We are also playing with a bulk page allocator facility[2], that I've
benchmarked[3][4].  While I'm seeing between 34%-46% improvements by
bulking, I believe we actually need to do better, before it reaches our
performance target for high-speed networking.

Very promising!
This fits perfectly in our Striding RQ feature (Multi-Packet WQE)
where we allocate fragmented buffers (of order-0 pages) of 256KB total.
Big like :)

Thanks,
Tariq

--Jesper

[2] http://lkml.kernel.org/r/20170109163518.6001-5-mgorman%40techsingularity.net
[3] http://lkml.kernel.org/r/20170116152518.5519dc1e%40redhat.com
[4] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench04_bulk.c


On Mon, 27 Feb 2017 12:25:03 -0800 a...@linux-foundation.org wrote:


The patch titled
  Subject: mm, page_alloc: only use per-cpu allocator for irq-safe requests
has been removed from the -mm tree.  Its filename was
  mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch

This patch was dropped because it was merged into mainline or a subsystem tree

--
From: Mel Gorman 
Subject: mm, page_alloc: only use per-cpu allocator for irq-safe requests

Many workloads that allocate pages are not handling an interrupt at a
time.  As allocation requests may be from IRQ context, it's necessary to
disable/enable IRQs for every page allocation.  This cost is the bulk of
the free path but also a significant percentage of the allocation path.

This patch alters the locking and checks such that only irq-safe
allocation requests use the per-cpu allocator.  All others acquire the
irq-safe zone->lock and allocate from the buddy allocator.  It relies on
disabling preemption to safely access the per-cpu structures.  It could be
slightly modified to avoid soft IRQs using it but it's not clear it's
worthwhile.

This modification may slow allocations from IRQ context slightly but the
main gain from the per-cpu allocator is that it scales better for
allocations from multiple contexts.  There is an implicit assumption that
intensive allocations from IRQ contexts on multiple CPUs from a single
NUMA node are rare and that the vast majority of scaling issues are
encountered in !IRQ contexts such as page faulting.  It's worth noting
that this patch is not required for a bulk page allocator but it
significantly reduces the overhead.

The following is results from a page allocator micro-benchmark.  Only
order-0 is interesting as higher orders do not use the per-cpu allocator

                                 4.10.0-rc2           4.10.0-rc2
                                    vanilla         irqsafe-v1r5
Amean    alloc-odr0-1      287.15 (  0.00%)     219.00 ( 23.73%)
Amean    alloc-odr0-2      221.23 (  0.00%)     183.23 ( 17.18%)
Amean    alloc-odr0-4      187.00 (  0.00%)     151.38 ( 19.05%)
Amean    alloc-odr0-8      167.54 (  0.00%)     132.77 ( 20.75%)
Amean    alloc-odr0-16     156.00 (  0.00%)     123.00 ( 21.15%)
Amean    alloc-odr0-32     149.00 (  0.00%)     118.31 ( 20.60%)
Amean    alloc-odr0-64     138.77 (  0.00%)     116.00 ( 16.41%)
Amean    alloc-odr0-128    145.00 (  0.00%)     118.00 ( 18.62%)
Amean    alloc-odr0-256    136.15 (  0.00%)     125.00 (  8.19%)
Amean    alloc-odr0-512    147.92 (  0.00%)     121.77 ( 17.68%)
Amean    alloc-odr0-1024   147.23 (  0.00%)     126.15 ( 14.32%)
Amean    alloc-odr0-2048   155.15 (  0.00%)     129.92 ( 16.26%)
Amean    alloc-odr0-4096   164.00 (  0.00%)     136.77 ( 16.60%)
Amean    alloc-odr0-8192   166.92 (  0.00%)     138.08 ( 17.28%)
Amean    alloc-odr0-16384  159.00 (  0.00%)     138.00 ( 13.21%)
Amean    free-odr0-1       165.00 (  0.00%)      89.00 ( 46.06%)
Amean    free-odr0-2       113.00 (  0.00%)      63.00 ( 44.25%)
Amean    free-odr0-4        99.00 (  0.00%)      54.00 ( 45.45%)
Amean    free-odr0-8        88.00 (  0.00%)      47.38 ( 46.15%)
Amean    free-odr0-16       83.00 (  0.00%)      46.00 ( 44.58%)
Amean

Re: [patch] net/mlx4: && vs & typo

2017-03-01 Thread David Miller
From: Dan Carpenter 
Date: Tue, 28 Feb 2017 15:02:15 +0300

> Bitwise & was obviously intended here.
> 
> Fixes: 745d8ae4622c ("net/mlx4: Spoofcheck and zero MAC can't coexist")
> Signed-off-by: Dan Carpenter 

Applied.
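
The patch body is not quoted in this reply, so the following is only a generic sketch of the class of bug named in the subject, with hypothetical function names and mask values. It shows why a logical && in a flag test behaves differently from the intended bitwise &.

```c
#include <assert.h>

/* Buggy form: logical AND is true whenever both operands are non-zero,
 * regardless of which bits are actually set in flags. */
static int has_flag_buggy(unsigned int flags, unsigned int mask)
{
	return flags && mask;
}

/* Intended form: bitwise AND tests only the bits selected by mask. */
static int has_flag_fixed(unsigned int flags, unsigned int mask)
{
	return (flags & mask) != 0;
}
```

With flags = 0x2 and mask = 0x1 the buggy form reports the flag as set even though bit 0 is clear, while the fixed form does not.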


Re: [patch net] mlxsw: spectrum_router: Avoid potential packets loss

2017-03-01 Thread David Miller
From: Jiri Pirko 
Date: Tue, 28 Feb 2017 08:55:40 +0100

> From: Ido Schimmel 
> 
> When the structure of the LPM tree changes (f.e., due to the addition of
> a new prefix), we unbind the old tree and then bind the new one. This
> may result in temporary packet loss.
> 
> Instead, overwrite the old binding with the new one.
> 
> Fixes: 6b75c4807db3 ("mlxsw: spectrum_router: Add virtual router management")
> Signed-off-by: Ido Schimmel 
> Signed-off-by: Jiri Pirko 

Applied.


Re: [PATCH] net: usb: asix_devices: fix missing return code check on call to asix_write_medium_mode

2017-03-01 Thread David Miller
From: Colin King 
Date: Tue, 28 Feb 2017 11:58:22 +

> From: Colin Ian King 
> 
> The call to asix_write_medium_mode is not updating the return code ret
> and yet ret is being checked for an error. Fix this by assigning ret to
> the return code from the call asix_write_medium_mode.
> 
> Detected by CoverityScan, CID#1357148 ("Logically Dead Code")
> 
> Signed-off-by: Colin Ian King 

Applied.
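
For readers unfamiliar with this kind of Coverity finding, here is a minimal sketch of the pattern described in the commit message. All names are hypothetical stand-ins; demo_write_mode() is not the asix driver function and its failure condition is invented for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a device write that can fail. */
static int demo_write_mode(int mode)
{
	return mode < 0 ? -EINVAL : 0;
}

/* Buggy: the return value is dropped, so the check below is dead code. */
static int demo_reset_buggy(int mode)
{
	int ret = 0;

	demo_write_mode(mode);
	if (ret < 0)
		return ret;
	return 0;
}

/* Fixed: assign the return value before checking it. */
static int demo_reset_fixed(int mode)
{
	int ret;

	ret = demo_write_mode(mode);
	if (ret < 0)
		return ret;
	return 0;
}
```

The buggy variant reports success even when the write fails; the fixed variant propagates the error to the caller.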


Re: [Patch net] ipv6: ignore null_entry in inet6_rtm_getroute() too

2017-03-01 Thread Cong Wang
On Tue, Feb 28, 2017 at 2:35 PM, David Ahern  wrote:
> On 2/28/17 11:48 AM, Cong Wang wrote:
>> On Tue, Feb 28, 2017 at 11:01 AM, David Ahern wrote:
>>> On 2/28/17 10:44 AM, Cong Wang wrote:
 Like commit 1f17e2f2c8a8 ("net: ipv6: ignore null_entry on route dumps"),
 we need to ignore null entry in inet6_rtm_getroute() too.

 Return -ENOENT here because we return the same errno when deleting
 the null entry.

 Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin 
 down")
 Reported-by: Dmitry Vyukov 
 Cc: David Ahern 
 Signed-off-by: Cong Wang 
 ---
  net/ipv6/route.c | 6 ++
  1 file changed, 6 insertions(+)

 diff --git a/net/ipv6/route.c b/net/ipv6/route.c
 index f54f426..25590d1 100644
 --- a/net/ipv6/route.c
 +++ b/net/ipv6/route.c
 @@ -3627,6 +3627,12 @@ static int inet6_rtm_getroute(struct sk_buff 
 *in_skb, struct nlmsghdr *nlh)
   rt = (struct rt6_info *)ip6_route_output(net, NULL, );
   }

 + if (rt == net->ipv6.ip6_null_entry) {
 + ip6_rt_put(rt);
 + err = -ENOENT;
 + goto errout;
 + }
 +
   skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
   if (!skb) {
   ip6_rt_put(rt);

>>>
>>> hold on. That test exposed something else, not just a getroute problem.
>>> I accidentally ran 'unshare -n; ip -6 ro ls' on my host machine instead
>>> of a VM, so took some time to recover. dumproute already covers the null
>>> route.
>
> My host was running a slightly older kernel (did not have the null_entry
> check in the dump route path for one).
>
> As for trapping null_entry on getroute, this changes user experience.
> Right now you always get a route response for IPv6 with the error set as
> rta_error. This patch changes that. I am fine with it since it makes
> IPv6 more like IPv4:
>
> # ip -6 ro get 2001:db8:12::1
> RTNETLINK answers: Network is unreachable
>

Yeah, I am not sure if we really want to "return" the null entry here,
since we ignore it in dump anyway. If we really want, an alternative
patch is probably something like:

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 25590d1..e60dc1c 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3322,7 +3322,7 @@ static int rt6_nexthop_info(struct sk_buff *skb,
struct rt6_info *rt,
 {
if (!netif_running(rt->dst.dev) || !netif_carrier_ok(rt->dst.dev)) {
*flags |= RTNH_F_LINKDOWN;
-   if (rt->rt6i_idev->cnf.ignore_routes_with_linkdown)
+   if (rt->rt6i_idev &&
rt->rt6i_idev->cnf.ignore_routes_with_linkdown)
*flags |= RTNH_F_DEAD;
}


> But, if we are going to do this then err should be set based on
> rt->dst.error (ENOENT is not the right error) and the commit message
> should state the change.
>

Makes sense, I will change the errno and update changelog.


Re: [PATCH net] sctp: call rcu_read_lock before checking for duplicate transport nodes

2017-03-01 Thread David Miller
From: Xin Long 
Date: Tue, 28 Feb 2017 12:41:29 +0800

> Commit cd2b70875058 ("sctp: check duplicate node before inserting a
> new transport") called rhltable_lookup() to check for the duplicate
> transport node in transport rhashtable.
> 
> But rhltable_lookup() doesn't call rcu_read_lock inside, it could cause
> a use-after-free issue if it tries to dereference the node that another
> cpu has freed it. Note that sock lock can not avoid this as it is per
> sock.
> 
> This patch is to fix it by calling rcu_read_lock before checking for
> duplicate transport nodes.
> 
> Fixes: cd2b70875058 ("sctp: check duplicate node before inserting a new 
> transport")
> Reported-by: Andrey Konovalov 
> Signed-off-by: Xin Long 

Applied.


Re: [PATCH 1/1] rds: ib: add the static type to the variables

2017-03-01 Thread David Miller
From: Zhu Yanjun 
Date: Tue, 28 Feb 2017 01:45:40 -0500

> The variables rds_ib_mr_1m_pool_size and rds_ib_mr_8k_pool_size
> are used only in the ib.c file. As such, the static type is
> added to limit them in this file.
> 
> Cc: Joe Jin 
> Cc: Junxiao Bi 
> Signed-off-by: Zhu Yanjun 

Applied.


Re: [PATCH] MAINTAINERS: Orphan usb/net/hso driver

2017-03-01 Thread David Miller
From: Baruch Siach 
Date: Tue, 28 Feb 2017 10:39:48 +0200

> The email address of Jan Dumon bounces, and there is no relevant information
> in the linked website.
> 
> Signed-off-by: Baruch Siach 

Applied.


Re: [PATCH net] rxrpc: Fix deadlock between call creation and sendmsg/recvmsg

2017-03-01 Thread David Miller
From: David Howells 
Date: Mon, 27 Feb 2017 15:43:06 +

> All the routines by which rxrpc is accessed from the outside are serialised
> by means of the socket lock (sendmsg, recvmsg, bind,
> rxrpc_kernel_begin_call(), ...) and this presents a problem:
 ...
> Fix this by:
 ...
> This patch has the nice bonus that calls on the same socket are now to some
> extent parallelisable.
> 
> 
> Note that we might want to move rxrpc_service_prealloc() calls out from the
> socket lock and give it its own lock, so that we don't hang progress in
> other calls because we're waiting for the allocator.
> 
> We probably also want to avoid calling rxrpc_notify_socket() from within
> the socket lock (rxrpc_accept_call()).
> 
> Signed-off-by: David Howells 
> Tested-by: Marc Dionne 

Applied, thanks David.


Re: [PATCH v2 net] net: solve a NAPI race

2017-03-01 Thread Eric Dumazet
On Wed, 2017-03-01 at 08:14 -0800, Alexander Duyck wrote:

> What build flags are you using?  With -Os or -O2 I have seen it
> convert the /b * c into a single shift.
> 


Because b & c are unsigned in our case.

I presume David tried signed integers, this is why gcc does that.





Re: net/ipv4: deadlock in ip_ra_control

2017-03-01 Thread Cong Wang
On Wed, Mar 1, 2017 at 2:44 AM, Dmitry Vyukov <dvyu...@google.com> wrote:
> Hello,
>
> I've got the following deadlock report while running syzkaller fuzzer
> on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:
>
> ==
> [ INFO: possible circular locking dependency detected ]
> 4.10.0-next-20170301+ #1 Not tainted
> ---
> syz-executor1/3394 is trying to acquire lock:
>  (sk_lock-AF_INET){+.+.+.}, at: [] lock_sock
> include/net/sock.h:1460 [inline]
>  (sk_lock-AF_INET){+.+.+.}, at: []
> do_ip_setsockopt.isra.12+0x21c/0x3540 net/ipv4/ip_sockglue.c:652
>
> but task is already holding lock:
>  (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
> net/core/rtnetlink.c:70
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (rtnl_mutex){+.+.+.}:
>validate_chain kernel/locking/lockdep.c:2265 [inline]
>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
>__mutex_lock_common kernel/locking/mutex.c:754 [inline]
>__mutex_lock+0x172/0x1730 kernel/locking/mutex.c:891
>mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:906
>rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
>mrtsock_destruct+0x86/0x2c0 net/ipv4/ipmr.c:1281
>ip_ra_control+0x459/0x600 net/ipv4/ip_sockglue.c:372
>do_ip_setsockopt.isra.12+0x1064/0x3540 net/ipv4/ip_sockglue.c:1161
>ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1264
>raw_setsockopt+0xb7/0xd0 net/ipv4/raw.c:839
>sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2725
>SYSC_setsockopt net/socket.c:1786 [inline]
>SyS_setsockopt+0x25c/0x390 net/socket.c:1765
>entry_SYSCALL_64_fastpath+0x1f/0xc2
>
> -> #0 (sk_lock-AF_INET){+.+.+.}:
>check_prev_add kernel/locking/lockdep.c:1828 [inline]
>check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1938
>validate_chain kernel/locking/lockdep.c:2265 [inline]
>__lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
>lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
>lock_sock_nested+0xcb/0x120 net/core/sock.c:2530
>lock_sock include/net/sock.h:1460 [inline]
>do_ip_setsockopt.isra.12+0x21c/0x3540 net/ipv4/ip_sockglue.c:652
>ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1264
>tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2721
>sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2725
>SYSC_setsockopt net/socket.c:1786 [inline]
>SyS_setsockopt+0x25c/0x390 net/socket.c:1765
>entry_SYSCALL_64_fastpath+0x1f/0xc2
>

Please try the attached patch (compile only).

Thanks.
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ebd953b..bda318a 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -591,6 +591,7 @@ static bool setsockopt_needs_rtnl(int optname)
case MCAST_LEAVE_GROUP:
case MCAST_LEAVE_SOURCE_GROUP:
case MCAST_UNBLOCK_SOURCE:
+   case IP_ROUTER_ALERT:
return true;
}
return false;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index beacd02..932321b 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1278,7 +1278,7 @@ static void mrtsock_destruct(struct sock *sk)
struct net *net = sock_net(sk);
struct mr_table *mrt;
 
-   rtnl_lock();
+   ASSERT_RTNL();
ipmr_for_each_table(mrt, net) {
if (sk == rtnl_dereference(mrt->mroute_sk)) {
IPV4_DEVCONF_ALL(net, MC_FORWARDING)--;
@@ -1289,7 +1289,6 @@ static void mrtsock_destruct(struct sock *sk)
mroute_clean_tables(mrt, false);
}
}
-   rtnl_unlock();
 }
 
 /* Socket options and virtual interface manipulation. The whole
@@ -1353,13 +1352,8 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, 
char __user *optval,
if (sk != rcu_access_pointer(mrt->mroute_sk)) {
ret = -EACCES;
} else {
-   /* We need to unlock here because mrtsock_destruct takes
-* care of rtnl itself and we can't change that due to
-* the IP_ROUTER_ALERT setsockopt which runs without it.
-*/
-   rtnl_unlock();
ret = ip_ra_control(sk, 0, NULL);
-   goto out;
+   goto out_unlock;
}
break;
case MRT_ADD_VIF:
@@ -1470,7 +1464,6 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, 
char __user *optval,
}
 out_unlock:
rtnl_unlock();
-out:
return ret;
 }
 


Re: [PATCH] bpf: update the comment about the length of analysis

2017-03-01 Thread Alexei Starovoitov
On Wed, Mar 01, 2017 at 04:25:51PM +0800, Gary Lin wrote:
> Commit 07016151a446 ("bpf, verifier: further improve search
> pruning") increased the limit of processed instructions from
> 32k to 64k, but the comment still mentioned the 32k limit.
> This commit updates the comment to reflect the change.
> 
> Cc: Alexei Starovoitov 
> Cc: Daniel Borkmann 
> Signed-off-by: Gary Lin 

Acked-by: Alexei Starovoitov 



commit a52ad514fdf3b8a57ca4322c92d2d8d5c6182485 net: deprecate eth_change_mtu, remove usage breaks bonding on my machine

2017-03-01 Thread Brad Campbell

G'day Jarod,

I have a pair of machines that are linked by a pair of quad port e1000 
cards with all 4 ports bonded. The network is configured with an mtu of 
9000.


Kernel 4.10 fails to bring these interfaces up as it fails when trying 
to set the mtu on the bond interface higher than 1500. A bisect between 
4.9 & 4.10 winds up identifying this commit as where it all goes wrong. 
If I modify the network config to not touch the mtu (ie leave it at 
1500) then it comes up ok.


I can individually configure each port with an mtu of 9000, so the e1000 
driver is ok, but there appears to be breakage in the bonding driver 
related to your mtu api changes.


I've just reverted to an older kernel, so it's no biggie. And as it's 
still a problem in the latest git head I assume nobody else has 
encountered it. I thought it worth reporting in case it triggers a quick 
lightbulb.


Regards,
Brad.


[PATCH net] bridge: Fix error path in nbp_vlan_init

2017-03-01 Thread Yotam Gigi
Fix error path order in nbp_vlan_init, so if switchdev_port_attr_set
call fails, the vlan_hash isn't destroyed before it is initialized.

Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
CC: Roopa Prabhu 
Signed-off-by: Yotam Gigi 
---
 net/bridge/br_vlan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 62e68c0..b838213 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -997,10 +997,10 @@ int nbp_vlan_init(struct net_bridge_port *p)
RCU_INIT_POINTER(p->vlgrp, NULL);
synchronize_rcu();
vlan_tunnel_deinit(vg);
-err_vlan_enabled:
 err_tunnel_init:
rhashtable_destroy(>vlan_hash);
 err_rhtbl:
+err_vlan_enabled:
kfree(vg);
 
goto out;
-- 
2.4.11
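
The label ordering fixed above can be modelled outside the kernel. The sketch below is a simplified userspace model with hypothetical names (demo_port, demo_port_init, and so on), not the bridge code itself. It assumes the same shape as the patched function: allocate, set an attribute, initialise a hash table, then initialise tunnels, with error labels unwinding in reverse order.

```c
#include <assert.h>

/* Hypothetical failure points; none of these names are kernel APIs. */
enum fail_point { FAIL_NONE, FAIL_ATTR_SET, FAIL_HASH_INIT, FAIL_TUNNEL_INIT };

struct demo_port {
	int group_allocated;
	int hash_inited;
	int bad_destroys;	/* destroy calls on a table that was never inited */
};

static void demo_hash_destroy(struct demo_port *p)
{
	if (!p->hash_inited)
		p->bad_destroys++;
	p->hash_inited = 0;
}

static int demo_port_init(struct demo_port *p, enum fail_point fail)
{
	p->group_allocated = 1;		/* step 1: allocate the group */

	if (fail == FAIL_ATTR_SET)	/* step 2: attribute set */
		goto err_attr_set;

	if (fail == FAIL_HASH_INIT)	/* step 3: hash table init */
		goto err_hash_init;
	p->hash_inited = 1;

	if (fail == FAIL_TUNNEL_INIT)	/* step 4: tunnel init */
		goto err_tunnel_init;

	return 0;

	/* Labels unwind in reverse order of the steps above. */
err_tunnel_init:
	demo_hash_destroy(p);
err_hash_init:
err_attr_set:
	p->group_allocated = 0;		/* undo step 1 */
	return -1;
}

/* Returns how many bad destroys happened for a given failure point. */
static int demo_bad_destroys(enum fail_point fail)
{
	struct demo_port p = { 0, 0, 0 };

	demo_port_init(&p, fail);
	return p.bad_destroys;
}
```

If the err_attr_set label were placed above demo_hash_destroy(), a failure at step 2 would destroy a table that was never initialised, which is the situation the patch avoids.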



[PATCH net] tcp/dccp: block BH for SYN processing

2017-03-01 Thread Eric Dumazet
From: Eric Dumazet 

SYN processing really was meant to be handled from BH.

When I got rid of BH blocking while processing socket backlog
in commit 5413d1babe8f ("net: do not block BH while processing socket
backlog"), I forgot that a malicious user could transition to TCP_LISTEN
from a state that allowed (SYN) packets to be parked in the socket
backlog while socket is owned by the thread doing the listen() call.

Sure enough syzkaller found this and reported the bug ;)


=
[ INFO: inconsistent lock state ]
4.10.0+ #60 Not tainted
-
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor0/5090 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(>ehash_locks[i])->rlock){+.?...}, at:
[] spin_lock include/linux/spinlock.h:299 [inline]
 (&(>ehash_locks[i])->rlock){+.?...}, at:
[] inet_ehash_insert+0x240/0xad0
net/ipv4/inet_hashtables.c:407
{IN-SOFTIRQ-W} state was registered at:
  mark_irqflags kernel/locking/lockdep.c:2923 [inline]
  __lock_acquire+0xbcf/0x3270 kernel/locking/lockdep.c:3295
  lock_acquire+0x241/0x580 kernel/locking/lockdep.c:3753
  __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
  _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151
  spin_lock include/linux/spinlock.h:299 [inline]
  inet_ehash_insert+0x240/0xad0 net/ipv4/inet_hashtables.c:407
  reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:753 [inline]
  inet_csk_reqsk_queue_hash_add+0x1b7/0x2a0 net/ipv4/inet_connection_sock.c:764
  tcp_conn_request+0x25cc/0x3310 net/ipv4/tcp_input.c:6399
  tcp_v4_conn_request+0x157/0x220 net/ipv4/tcp_ipv4.c:1262
  tcp_rcv_state_process+0x802/0x4130 net/ipv4/tcp_input.c:5889
  tcp_v4_do_rcv+0x56b/0x940 net/ipv4/tcp_ipv4.c:1433
  tcp_v4_rcv+0x2e12/0x3210 net/ipv4/tcp_ipv4.c:1711
  ip_local_deliver_finish+0x4ce/0xc40 net/ipv4/ip_input.c:216
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_local_deliver+0x1ce/0x710 net/ipv4/ip_input.c:257
  dst_input include/net/dst.h:492 [inline]
  ip_rcv_finish+0xb1d/0x2110 net/ipv4/ip_input.c:396
  NF_HOOK include/linux/netfilter.h:257 [inline]
  ip_rcv+0xd90/0x19c0 net/ipv4/ip_input.c:487
  __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4179
  __netif_receive_skb+0x2a/0x170 net/core/dev.c:4217
  netif_receive_skb_internal+0x1d6/0x430 net/core/dev.c:4245
  napi_skb_finish net/core/dev.c:4602 [inline]
  napi_gro_receive+0x4e6/0x680 net/core/dev.c:4636
  e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4033 [inline]
  e1000_clean_rx_irq+0x5e0/0x1490
drivers/net/ethernet/intel/e1000/e1000_main.c:4489
  e1000_clean+0xb9a/0x2910 drivers/net/ethernet/intel/e1000/e1000_main.c:3834
  napi_poll net/core/dev.c:5171 [inline]
  net_rx_action+0xe70/0x1900 net/core/dev.c:5236
  __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
  invoke_softirq kernel/softirq.c:364 [inline]
  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
  exiting_irq arch/x86/include/asm/apic.h:658 [inline]
  do_IRQ+0x81/0x1a0 arch/x86/kernel/irq.c:250
  ret_from_intr+0x0/0x20
  native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
  default_idle+0x8f/0x410 arch/x86/kernel/process.c:271
  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:262
  default_idle_call+0x36/0x60 kernel/sched/idle.c:96
  cpuidle_idle_call kernel/sched/idle.c:154 [inline]
  do_idle+0x348/0x440 kernel/sched/idle.c:243
  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:345
  start_secondary+0x344/0x440 arch/x86/kernel/smpboot.c:272
  verify_cpu+0x0/0xfc
irq event stamp: 1741
hardirqs last  enabled at (1741): []
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160
[inline]
hardirqs last  enabled at (1741): []
_raw_spin_unlock_irqrestore+0xf7/0x1a0 kernel/locking/spinlock.c:191
hardirqs last disabled at (1740): []
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
hardirqs last disabled at (1740): []
_raw_spin_lock_irqsave+0xa2/0x110 kernel/locking/spinlock.c:159
softirqs last  enabled at (1738): []
__do_softirq+0x7cf/0xb7d kernel/softirq.c:310
softirqs last disabled at (1571): []
do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902

other info that might help us debug this:
 Possible unsafe locking scenario:

   CPU0
   
  lock(&(>ehash_locks[i])->rlock);
  
lock(&(>ehash_locks[i])->rlock);

 *** DEADLOCK ***

1 lock held by syz-executor0/5090:
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
include/net/sock.h:1460 [inline]
 #0:  (sk_lock-AF_INET6){+.+.+.}, at: []
sock_setsockopt+0x233/0x1e40 net/core/sock.c:683

stack backtrace:
CPU: 1 PID: 5090 Comm: syz-executor0 Not tainted 4.10.0+ #60
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 print_usage_bug+0x3ef/0x450 kernel/locking/lockdep.c:2387
 valid_state kernel/locking/lockdep.c:2400 [inline]
 mark_lock_irq 

Re: [PATCH v2 net] net: solve a NAPI race

2017-03-01 Thread Alexander Duyck
On Wed, Mar 1, 2017 at 2:41 AM, David Laight  wrote:
> From: Alexander Duyck
>> Sent: 28 February 2017 17:20
> ...
>> You might want to consider just using a combination AND, divide,
>> multiply, and OR to avoid having to have any conditional branches
>> being added due to this code path.  Basically the logic would look
>> like:
>> new = val | NAPIF_STATE_SCHED;
>> new |= (val & NAPIF_STATE_SCHED) / NAPIF_STATE_SCHED * 
>> NAPIF_STATE_MISSED;
>>
>> In assembler that all ends up getting translated out to AND, SHL, OR.
>> You avoid the branching, or MOV/OR/TEST/CMOV type code you would end
>> up with otherwise.
>
> It is a shame gcc doesn't contain that optimisation.
> It also doesn't even make a good job of (a & b)/b * c since it
> always does a shr and a sal (gcc 4.7.3 and 5.4).

What build flags are you using?  With -Os or -O2 I have seen it
convert the /b * c into a single shift.

> Worthy of a #define or static inline.
> Something like:
> #define BIT_IF(v, a, b) ((b & (b-1)) ? (v & a)/a * b : a > b ? (v & a) / (a/b) : (v & a) * (b/a))
>
> David

Feel free to put together a patch.  I use this kind of thing in the
Intel drivers in multiple spots to shift stuff from TX_FLAGS into
descriptor flags.

- Alex
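
The AND/divide/multiply/OR idea discussed in this thread can be sketched as a standalone function. The DEMO_* bit positions are hypothetical and chosen only for illustration; no assumption is made about the real NAPIF_STATE_* values, and whether a given compiler reduces the divide and multiply to a single shift depends on its version and flags.

```c
#include <assert.h>

/* Hypothetical single-bit flags, for illustration only. */
#define DEMO_SCHED	(1UL << 0)
#define DEMO_MISSED	(1UL << 3)

/* Branchless: copy the old SCHED bit into the MISSED position. */
static unsigned long demo_sched_branchless(unsigned long val)
{
	unsigned long next = val | DEMO_SCHED;

	/* (val & SCHED) / SCHED is 0 or 1; the multiply moves it to MISSED. */
	next |= (val & DEMO_SCHED) / DEMO_SCHED * DEMO_MISSED;
	return next;
}

/* Reference version with an explicit conditional. */
static unsigned long demo_sched_branchy(unsigned long val)
{
	unsigned long next = val | DEMO_SCHED;

	if (val & DEMO_SCHED)
		next |= DEMO_MISSED;
	return next;
}
```

Both functions return the same value for every input; the operands are unsigned, which is the case Eric notes lets gcc turn the divide and multiply into shifts.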


[PATCH v4 10/19] scsi: lpfc: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the appropriate functions from the DMA pool API. It also updates
some comments accordingly.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/scsi/lpfc/lpfc.h   |  12 ++---
 drivers/scsi/lpfc/lpfc_init.c  |  16 +++
 drivers/scsi/lpfc/lpfc_mem.c   | 105 -
 drivers/scsi/lpfc/lpfc_nvme.c  |   6 +--
 drivers/scsi/lpfc/lpfc_nvmet.c |   4 +-
 drivers/scsi/lpfc/lpfc_scsi.c  |  12 ++---
 6 files changed, 76 insertions(+), 79 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h
index 0bba2e3..29492bc 100644
--- a/drivers/scsi/lpfc/lpfc.h
+++ b/drivers/scsi/lpfc/lpfc.h
@@ -934,12 +934,12 @@ struct lpfc_hba {
spinlock_t hbalock;
 
/* pci_mem_pools */
-   struct pci_pool *lpfc_sg_dma_buf_pool;
-   struct pci_pool *lpfc_mbuf_pool;
-   struct pci_pool *lpfc_hrb_pool; /* header receive buffer pool */
-   struct pci_pool *lpfc_drb_pool; /* data receive buffer pool */
-   struct pci_pool *lpfc_hbq_pool; /* SLI3 hbq buffer pool */
-   struct pci_pool *txrdy_payload_pool;
+   struct dma_pool *lpfc_sg_dma_buf_pool;
+   struct dma_pool *lpfc_mbuf_pool;
+   struct dma_pool *lpfc_hrb_pool; /* header receive buffer pool */
+   struct dma_pool *lpfc_drb_pool; /* data receive buffer pool */
+   struct dma_pool *lpfc_hbq_pool; /* SLI3 hbq buffer pool */
+   struct dma_pool *txrdy_payload_pool;
struct lpfc_dma_pool lpfc_mbuf_safety_pool;
 
mempool_t *mbox_mem_pool;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 0ee429d..b856457 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -3151,7 +3151,7 @@ lpfc_scsi_free(struct lpfc_hba *phba)
list_for_each_entry_safe(sb, sb_next, >lpfc_scsi_buf_list_put,
 list) {
list_del(>list);
-   pci_pool_free(phba->lpfc_sg_dma_buf_pool, sb->data,
+   dma_pool_free(phba->lpfc_sg_dma_buf_pool, sb->data,
  sb->dma_handle);
kfree(sb);
phba->total_scsi_bufs--;
@@ -3162,7 +3162,7 @@ lpfc_scsi_free(struct lpfc_hba *phba)
list_for_each_entry_safe(sb, sb_next, >lpfc_scsi_buf_list_get,
 list) {
list_del(>list);
-   pci_pool_free(phba->lpfc_sg_dma_buf_pool, sb->data,
+   dma_pool_free(phba->lpfc_sg_dma_buf_pool, sb->data,
  sb->dma_handle);
kfree(sb);
phba->total_scsi_bufs--;
@@ -3193,7 +3193,7 @@ lpfc_nvme_free(struct lpfc_hba *phba)
list_for_each_entry_safe(lpfc_ncmd, lpfc_ncmd_next,
 >lpfc_nvme_buf_list_put, list) {
list_del(_ncmd->list);
-   pci_pool_free(phba->lpfc_sg_dma_buf_pool, lpfc_ncmd->data,
+   dma_pool_free(phba->lpfc_sg_dma_buf_pool, lpfc_ncmd->data,
  lpfc_ncmd->dma_handle);
kfree(lpfc_ncmd);
phba->total_nvme_bufs--;
@@ -3204,7 +3204,7 @@ lpfc_nvme_free(struct lpfc_hba *phba)
list_for_each_entry_safe(lpfc_ncmd, lpfc_ncmd_next,
 >lpfc_nvme_buf_list_get, list) {
list_del(_ncmd->list);
-   pci_pool_free(phba->lpfc_sg_dma_buf_pool, lpfc_ncmd->data,
+   dma_pool_free(phba->lpfc_sg_dma_buf_pool, lpfc_ncmd->data,
  lpfc_ncmd->dma_handle);
kfree(lpfc_ncmd);
phba->total_nvme_bufs--;
@@ -3517,7 +3517,7 @@ lpfc_sli4_scsi_sgl_update(struct lpfc_hba *phba)
list_remove_head(_sgl_list, psb,
 struct lpfc_scsi_buf, list);
if (psb) {
-   pci_pool_free(phba->lpfc_sg_dma_buf_pool,
+   dma_pool_free(phba->lpfc_sg_dma_buf_pool,
  psb->data, psb->dma_handle);
kfree(psb);
}
@@ -3614,7 +3614,7 @@ lpfc_sli4_nvme_sgl_update(struct lpfc_hba *phba)
list_remove_head(_sgl_list, lpfc_ncmd,
 struct lpfc_nvme_buf, list);
if (lpfc_ncmd) {
-   pci_pool_free(phba->lpfc_sg_dma_buf_pool,
+   dma_pool_free(phba->lpfc_sg_dma_buf_pool,
  lpfc_ncmd->data,
  lpfc_ncmd->dma_handle);
kfree(lpfc_ncmd);
@@ -6629,8 +6629,8 @@ lpfc_create_shost(struct lpfc_hba *phba)
if (phba->nvmet_support) {
/* Only 

Dell Inspiron 5558/0VNM2T and suspend/resume problem with r8169

2017-03-01 Thread Diego Viola
My machine (a Dell Inspiron 5558 laptop) fails to resume from suspend
unless I rmmod r8169 first.

Another workaround is to do this before suspend:

echo 0 > /sys/power/pm_async

I've been reproducing the freeze like this:

$ i3lock && systemctl suspend

I have to repeat this at least 5 times before the freeze occurs,
but it is reliably reproducible.

If I don't invoke i3lock, I cannot get the freeze to happen, but it
seems to happen with other lockers also.

I have tried Alt+SysRq+r and tried to switch to another TTY, but the
machine is always unresponsive, which suggests a kernel panic.

I have had a similar issue to this about a year ago with the jme
driver and this was the fix:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/jme.c?id=ee50c130c82175eaa0820c96b6d3763928af2241

I haven't tried getting a kernel trace yet, but everything so far
seems to indicate that the problem is caused by r8169.

Any ideas, please?

Thanks,
Diego


[PATCH v4 12/19] scsi: mpt3sas: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/scsi/mpt3sas/mpt3sas_base.c | 73 +
 1 file changed, 34 insertions(+), 39 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 5b7aec5..5ae1c23 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -3200,9 +3200,8 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
}
 
if (ioc->sense) {
-   pci_pool_free(ioc->sense_dma_pool, ioc->sense, ioc->sense_dma);
-   if (ioc->sense_dma_pool)
-   pci_pool_destroy(ioc->sense_dma_pool);
+   dma_pool_free(ioc->sense_dma_pool, ioc->sense, ioc->sense_dma);
+   dma_pool_destroy(ioc->sense_dma_pool);
dexitprintk(ioc, pr_info(MPT3SAS_FMT
"sense_pool(0x%p): free\n",
ioc->name, ioc->sense));
@@ -3210,9 +3209,8 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
}
 
if (ioc->reply) {
-   pci_pool_free(ioc->reply_dma_pool, ioc->reply, ioc->reply_dma);
-   if (ioc->reply_dma_pool)
-   pci_pool_destroy(ioc->reply_dma_pool);
+   dma_pool_free(ioc->reply_dma_pool, ioc->reply, ioc->reply_dma);
+   dma_pool_destroy(ioc->reply_dma_pool);
dexitprintk(ioc, pr_info(MPT3SAS_FMT
"reply_pool(0x%p): free\n",
ioc->name, ioc->reply));
@@ -3220,10 +3218,9 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
}
 
if (ioc->reply_free) {
-   pci_pool_free(ioc->reply_free_dma_pool, ioc->reply_free,
+   dma_pool_free(ioc->reply_free_dma_pool, ioc->reply_free,
ioc->reply_free_dma);
-   if (ioc->reply_free_dma_pool)
-   pci_pool_destroy(ioc->reply_free_dma_pool);
+   dma_pool_destroy(ioc->reply_free_dma_pool);
dexitprintk(ioc, pr_info(MPT3SAS_FMT
"reply_free_pool(0x%p): free\n",
ioc->name, ioc->reply_free));
@@ -3234,7 +3231,7 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
do {
rps = &ioc->reply_post[i];
if (rps->reply_post_free) {
-   pci_pool_free(
+   dma_pool_free(
ioc->reply_post_free_dma_pool,
rps->reply_post_free,
rps->reply_post_free_dma);
@@ -3246,8 +3243,7 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
} while (ioc->rdpq_array_enable &&
   (++i < ioc->reply_queue_count));
 
-   if (ioc->reply_post_free_dma_pool)
-   pci_pool_destroy(ioc->reply_post_free_dma_pool);
+   dma_pool_destroy(ioc->reply_post_free_dma_pool);
kfree(ioc->reply_post);
}
 
@@ -3268,12 +3264,11 @@ _base_release_memory_pools(struct MPT3SAS_ADAPTER *ioc)
if (ioc->chain_lookup) {
for (i = 0; i < ioc->chain_depth; i++) {
if (ioc->chain_lookup[i].chain_buffer)
-   pci_pool_free(ioc->chain_dma_pool,
+   dma_pool_free(ioc->chain_dma_pool,
ioc->chain_lookup[i].chain_buffer,
ioc->chain_lookup[i].chain_buffer_dma);
}
-   if (ioc->chain_dma_pool)
-   pci_pool_destroy(ioc->chain_dma_pool);
+   dma_pool_destroy(ioc->chain_dma_pool);
free_pages((ulong)ioc->chain_lookup, ioc->chain_pages);
ioc->chain_lookup = NULL;
}
@@ -3448,23 +3443,23 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
ioc->name);
goto out;
}
-   ioc->reply_post_free_dma_pool = pci_pool_create("reply_post_free pool",
-   ioc->pdev, sz, 16, 0);
+   ioc->reply_post_free_dma_pool = dma_pool_create("reply_post_free pool",
+   &ioc->pdev->dev, sz, 16, 0);
if (!ioc->reply_post_free_dma_pool) {
pr_err(MPT3SAS_FMT
-"reply_post_free pool: pci_pool_create failed\n",
+"reply_post_free pool: dma_pool_create failed\n",
 ioc->name);
goto out;
}
i = 0;
do {
ioc->reply_post[i].reply_post_free =
-   pci_pool_alloc(ioc->reply_post_free_dma_pool,
+   

[PATCH v4 13/19] scsi: mvsas: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/scsi/mvsas/mv_init.c | 6 +++---
 drivers/scsi/mvsas/mv_sas.c  | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index 8280046..41d2276 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -125,8 +125,7 @@ static void mvs_free(struct mvs_info *mvi)
else
slot_nr = MVS_CHIP_SLOT_SZ;
 
-   if (mvi->dma_pool)
-   pci_pool_destroy(mvi->dma_pool);
+   dma_pool_destroy(mvi->dma_pool);
 
if (mvi->tx)
dma_free_coherent(mvi->dev,
@@ -296,7 +295,8 @@ static int mvs_alloc(struct mvs_info *mvi, struct Scsi_Host *shost)
goto err_out;
 
sprintf(pool_name, "%s%d", "mvs_dma_pool", mvi->id);
-   mvi->dma_pool = pci_pool_create(pool_name, mvi->pdev, MVS_SLOT_BUF_SZ, 16, 0);
+   mvi->dma_pool = dma_pool_create(pool_name, &mvi->pdev->dev,
+   MVS_SLOT_BUF_SZ, 16, 0);
if (!mvi->dma_pool) {
printk(KERN_DEBUG "failed to create dma pool %s.\n", pool_name);
goto err_out;
diff --git a/drivers/scsi/mvsas/mv_sas.c b/drivers/scsi/mvsas/mv_sas.c
index c7cc803..ee81d10 100644
--- a/drivers/scsi/mvsas/mv_sas.c
+++ b/drivers/scsi/mvsas/mv_sas.c
@@ -790,7 +790,7 @@ static int mvs_task_prep(struct sas_task *task, struct mvs_info *mvi, int is_tmf
slot->n_elem = n_elem;
slot->slot_tag = tag;
 
-   slot->buf = pci_pool_alloc(mvi->dma_pool, GFP_ATOMIC, &slot->buf_dma);
+   slot->buf = dma_pool_alloc(mvi->dma_pool, GFP_ATOMIC, &slot->buf_dma);
if (!slot->buf) {
rc = -ENOMEM;
goto err_out_tag;
@@ -840,7 +840,7 @@ static int mvs_task_prep(struct sas_task *task, struct mvs_info *mvi, int is_tmf
return rc;
 
 err_out_slot_buf:
-   pci_pool_free(mvi->dma_pool, slot->buf, slot->buf_dma);
+   dma_pool_free(mvi->dma_pool, slot->buf, slot->buf_dma);
 err_out_tag:
mvs_tag_free(mvi, tag);
 err_out:
@@ -918,7 +918,7 @@ static void mvs_slot_task_free(struct mvs_info *mvi, struct sas_task *task,
}
 
if (slot->buf) {
-   pci_pool_free(mvi->dma_pool, slot->buf, slot->buf_dma);
+   dma_pool_free(mvi->dma_pool, slot->buf, slot->buf_dma);
slot->buf = NULL;
}
list_del_init(&slot->entry);
-- 
2.9.3
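
Several hunks in this series drop the "if (pool)" guard in front of the
destroy call. That is safe only because dma_pool_destroy() treats a NULL
pool as a no-op. A minimal userspace sketch of that behaviour follows. It
assumes a stand-in struct dma_pool and a stub destroy function; both are
hypothetical and are not the kernel implementation, and pools_released and
teardown_demo() are illustrative names only.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel type; hypothetical, illustration only. */
struct dma_pool { int unused; };

static int pools_released;

/* Stub mirroring the kernel's NULL handling: a NULL pool is a no-op. */
static void dma_pool_destroy(struct dma_pool *pool)
{
	if (!pool)
		return;
	free(pool);
	pools_released++;
}

/* Teardown in the style of the patched mvs_free(): no NULL guard needed. */
static int teardown_demo(void)
{
	struct dma_pool *never_created = NULL;
	struct dma_pool *created = malloc(sizeof(*created));

	if (!created)
		return -1;
	dma_pool_destroy(never_created);	/* safe: returns immediately */
	dma_pool_destroy(created);		/* releases the pool */
	return pools_released;			/* number of pools actually freed */
}
```

Calling teardown_demo() from a small main() exercises both paths; only the
pool that was really allocated is counted as released.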



[PATCH v4 16/19] usb: gadget: net2280: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/usb/gadget/udc/net2280.c | 12 ++--
 drivers/usb/gadget/udc/net2280.h |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/usb/gadget/udc/net2280.c b/drivers/usb/gadget/udc/net2280.c
index 8550441..089081e 100644
--- a/drivers/usb/gadget/udc/net2280.c
+++ b/drivers/usb/gadget/udc/net2280.c
@@ -569,7 +569,7 @@ static struct usb_request
if (ep->dma) {
struct net2280_dma  *td;
 
-   td = pci_pool_alloc(ep->dev->requests, gfp_flags,
+   td = dma_pool_alloc(ep->dev->requests, gfp_flags,
&req->td_dma);
if (!td) {
kfree(req);
@@ -597,7 +597,7 @@ static void net2280_free_request(struct usb_ep *_ep, struct usb_request *_req)
req = container_of(_req, struct net2280_request, req);
WARN_ON(!list_empty(&req->queue));
if (req->td)
-   pci_pool_free(ep->dev->requests, req->td, req->td_dma);
+   dma_pool_free(ep->dev->requests, req->td, req->td_dma);
kfree(req);
 }
 
@@ -3578,10 +3578,10 @@ static void net2280_remove(struct pci_dev *pdev)
for (i = 1; i < 5; i++) {
if (!dev->ep[i].dummy)
continue;
-   pci_pool_free(dev->requests, dev->ep[i].dummy,
+   dma_pool_free(dev->requests, dev->ep[i].dummy,
dev->ep[i].td_dma);
}
-   pci_pool_destroy(dev->requests);
+   dma_pool_destroy(dev->requests);
}
if (dev->got_irq)
free_irq(pdev->irq, dev);
@@ -3723,7 +3723,7 @@ static int net2280_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
/* DMA setup */
/* NOTE:  we know only the 32 LSBs of dma addresses may be nonzero */
-   dev->requests = pci_pool_create("requests", pdev,
+   dev->requests = dma_pool_create("requests", &pdev->dev,
sizeof(struct net2280_dma),
0 /* no alignment requirements */,
0 /* or page-crossing issues */);
@@ -3735,7 +3735,7 @@ static int net2280_probe(struct pci_dev *pdev, const struct pci_device_id *id)
for (i = 1; i < 5; i++) {
struct net2280_dma  *td;
 
-   td = pci_pool_alloc(dev->requests, GFP_KERNEL,
+   td = dma_pool_alloc(dev->requests, GFP_KERNEL,
&dev->ep[i].td_dma);
if (!td) {
ep_dbg(dev, "can't get dummy %d\n", i);
diff --git a/drivers/usb/gadget/udc/net2280.h b/drivers/usb/gadget/udc/net2280.h
index 2736a95..1088c37 100644
--- a/drivers/usb/gadget/udc/net2280.h
+++ b/drivers/usb/gadget/udc/net2280.h
@@ -187,7 +187,7 @@ struct net2280 {
struct usb338x_ll_chi_regs  __iomem *ll_chicken_reg;
struct usb338x_pl_regs  __iomem *plregs;
 
-   struct pci_pool *requests;
+   struct dma_pool *requests;
/* statistics...*/
 };
 
-- 
2.9.3



[PATCH v4 19/19] PCI: Remove PCI pool macro functions

2017-03-01 Thread Romain Perier
Now that all the drivers use the DMA pool API, we can remove the
PCI pool macro functions.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 include/linux/pci.h | 9 -
 1 file changed, 9 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 282ed32..d206ba4 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1281,15 +1281,6 @@ int pci_set_vga_state(struct pci_dev *pdev, bool decode,
 #include 
 #include 
 
-#define pci_pool dma_pool
-#define pci_pool_create(name, pdev, size, align, allocation) \
-   dma_pool_create(name, &pdev->dev, size, align, allocation)
-#define pci_pool_destroy(pool) dma_pool_destroy(pool)
-#define pci_pool_alloc(pool, flags, handle) dma_pool_alloc(pool, flags, handle)
-#define pci_pool_zalloc(pool, flags, handle) \
-   dma_pool_zalloc(pool, flags, handle)
-#define pci_pool_free(pool, vaddr, addr) dma_pool_free(pool, vaddr, addr)
-
 struct msix_entry {
u32 vector; /* kernel uses to write allocated vector */
u16 entry;  /* driver uses to specify entry, OS writes */
-- 
2.9.3
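
For readers following the series, the removed macros were thin wrappers:
pci_pool_create() took a struct pci_dev pointer and handed the embedded
struct device to dma_pool_create(). The userspace sketch below illustrates
that mapping. It assumes stand-in types and stub functions (all hypothetical,
not the kernel implementation); pci_pool_compat_demo() is an illustrative
name only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for kernel types; hypothetical, illustration only. */
struct device { int id; };
struct pci_dev { struct device dev; };
struct dma_pool { const char *name; struct device *dev; size_t size; };

/* Stub with the same argument order as the kernel's dma_pool_create(). */
static struct dma_pool *dma_pool_create(const char *name, struct device *dev,
					size_t size, size_t align,
					size_t allocation)
{
	struct dma_pool *pool = malloc(sizeof(*pool));

	(void)align;
	(void)allocation;
	if (!pool)
		return NULL;
	pool->name = name;
	pool->dev = dev;
	pool->size = size;
	return pool;
}

/* Stub destroy; free(NULL) is a no-op, so a NULL pool is tolerated. */
static void dma_pool_destroy(struct dma_pool *pool)
{
	free(pool);
}

/* The compatibility wrappers that this patch removes from pci.h. */
#define pci_pool dma_pool
#define pci_pool_create(name, pdev, size, align, allocation) \
	dma_pool_create(name, &pdev->dev, size, align, allocation)
#define pci_pool_destroy(pool) dma_pool_destroy(pool)

/* Returns 0 when the old and new spellings reach the same struct device. */
static int pci_pool_compat_demo(void)
{
	struct pci_dev card = { .dev = { .id = 7 } };
	struct pci_dev *pdev = &card;
	struct pci_pool *old_style = pci_pool_create("demo", pdev, 64, 16, 0);
	struct dma_pool *new_style = dma_pool_create("demo", &pdev->dev, 64, 16, 0);
	int ok = old_style && new_style && old_style->dev == new_style->dev;

	pci_pool_destroy(old_style);
	dma_pool_destroy(new_style);
	return ok ? 0 : 1;
}
```

The only call whose arguments differ between the two spellings is the create
call, which is why the driver patches in this series change "pdev" to
"&pdev->dev" there and otherwise just rename functions.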



[PATCH v4 11/19] scsi: megaraid: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/scsi/megaraid/megaraid_mbox.c   | 33 +++
 drivers/scsi/megaraid/megaraid_mm.c | 32 +++---
 drivers/scsi/megaraid/megaraid_sas_base.c   | 29 +++--
 drivers/scsi/megaraid/megaraid_sas_fusion.c | 66 +
 4 files changed, 77 insertions(+), 83 deletions(-)

diff --git a/drivers/scsi/megaraid/megaraid_mbox.c b/drivers/scsi/megaraid/megaraid_mbox.c
index f0987f2..7dfc2e2 100644
--- a/drivers/scsi/megaraid/megaraid_mbox.c
+++ b/drivers/scsi/megaraid/megaraid_mbox.c
@@ -1153,8 +1153,8 @@ megaraid_mbox_setup_dma_pools(adapter_t *adapter)
 
 
// Allocate memory for 16-bytes aligned mailboxes
-   raid_dev->mbox_pool_handle = pci_pool_create("megaraid mbox pool",
-   adapter->pdev,
+   raid_dev->mbox_pool_handle = dma_pool_create("megaraid mbox pool",
+   &adapter->pdev->dev,
sizeof(mbox64_t) + 16,
16, 0);
 
@@ -1164,7 +1164,7 @@ megaraid_mbox_setup_dma_pools(adapter_t *adapter)
 
mbox_pci_blk = raid_dev->mbox_pool;
for (i = 0; i < MBOX_MAX_SCSI_CMDS; i++) {
-   mbox_pci_blk[i].vaddr = pci_pool_alloc(
+   mbox_pci_blk[i].vaddr = dma_pool_alloc(
raid_dev->mbox_pool_handle,
GFP_KERNEL,
&mbox_pci_blk[i].dma_addr);
@@ -1181,8 +1181,8 @@ megaraid_mbox_setup_dma_pools(adapter_t *adapter)
 * share common memory pool. Passthru structures piggyback on memory
 * allocted to extended passthru since passthru is smaller of the two
 */
-   raid_dev->epthru_pool_handle = pci_pool_create("megaraid mbox pthru",
-   adapter->pdev, sizeof(mraid_epassthru_t), 128, 0);
+   raid_dev->epthru_pool_handle = dma_pool_create("megaraid mbox pthru",
+   &adapter->pdev->dev, sizeof(mraid_epassthru_t), 128, 0);
 
if (raid_dev->epthru_pool_handle == NULL) {
goto fail_setup_dma_pool;
@@ -1190,7 +1190,7 @@ megaraid_mbox_setup_dma_pools(adapter_t *adapter)
 
epthru_pci_blk = raid_dev->epthru_pool;
for (i = 0; i < MBOX_MAX_SCSI_CMDS; i++) {
-   epthru_pci_blk[i].vaddr = pci_pool_alloc(
+   epthru_pci_blk[i].vaddr = dma_pool_alloc(
raid_dev->epthru_pool_handle,
GFP_KERNEL,
&epthru_pci_blk[i].dma_addr);
@@ -1202,8 +1202,8 @@ megaraid_mbox_setup_dma_pools(adapter_t *adapter)
 
// Allocate memory for each scatter-gather list. Request for 512 bytes
// alignment for each sg list
-   raid_dev->sg_pool_handle = pci_pool_create("megaraid mbox sg",
-   adapter->pdev,
+   raid_dev->sg_pool_handle = dma_pool_create("megaraid mbox sg",
+   &adapter->pdev->dev,
sizeof(mbox_sgl64) * MBOX_MAX_SG_SIZE,
512, 0);
 
@@ -1213,7 +1213,7 @@ megaraid_mbox_setup_dma_pools(adapter_t *adapter)
 
sg_pci_blk = raid_dev->sg_pool;
for (i = 0; i < MBOX_MAX_SCSI_CMDS; i++) {
-   sg_pci_blk[i].vaddr = pci_pool_alloc(
+   sg_pci_blk[i].vaddr = dma_pool_alloc(
raid_dev->sg_pool_handle,
GFP_KERNEL,
&sg_pci_blk[i].dma_addr);
@@ -1249,29 +1249,26 @@ megaraid_mbox_teardown_dma_pools(adapter_t *adapter)
 
sg_pci_blk = raid_dev->sg_pool;
for (i = 0; i < MBOX_MAX_SCSI_CMDS && sg_pci_blk[i].vaddr; i++) {
-   pci_pool_free(raid_dev->sg_pool_handle, sg_pci_blk[i].vaddr,
+   dma_pool_free(raid_dev->sg_pool_handle, sg_pci_blk[i].vaddr,
sg_pci_blk[i].dma_addr);
}
-   if (raid_dev->sg_pool_handle)
-   pci_pool_destroy(raid_dev->sg_pool_handle);
+   dma_pool_destroy(raid_dev->sg_pool_handle);
 
 
epthru_pci_blk = raid_dev->epthru_pool;
for (i = 0; i < MBOX_MAX_SCSI_CMDS && epthru_pci_blk[i].vaddr; i++) {
-   pci_pool_free(raid_dev->epthru_pool_handle,
+   dma_pool_free(raid_dev->epthru_pool_handle,
epthru_pci_blk[i].vaddr, epthru_pci_blk[i].dma_addr);
}
-   if (raid_dev->epthru_pool_handle)
-   pci_pool_destroy(raid_dev->epthru_pool_handle);

Re: Dell Inspiron 5558/0VNM2T and suspend/resume problem with r8169

2017-03-01 Thread Diego Viola
On Wed, Mar 1, 2017 at 12:44 PM, Diego Viola  wrote:
> My machine (a Dell Inspiron 5558 laptop) fails to resume from suspend
> unless I rmmod r8169 first.
>
> Another workaround is to do this before suspend:
>
> echo 0 > /sys/power/pm_async
>
> I've been reproducing the freeze like this:
>
> $ i3lock && systemctl suspend
>
> I have to repeat this at least 5 times before the freeze occurs,
> but it is reliably reproducible.
>
> If I don't invoke i3lock, I cannot get the freeze to happen, but it
> seems to happen with other lockers also.
>
> I have tried Alt+SysRq+r and tried to switch to another TTY, but the
> machine is always unresponsive, which suggests a kernel panic.
>
> I have had a similar issue to this about a year ago with the jme
> driver and this was the fix:
>
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/drivers/net/ethernet/jme.c?id=ee50c130c82175eaa0820c96b6d3763928af2241
>
> I haven't tried getting a kernel trace yet, but everything so far
> seems to indicate that the problem is caused by r8169.
>
> Any ideas, please?
>
> Thanks,
> Diego

Sorry, I forgot to mention, I'm on Arch Linux (x86_64), kernel 4.9.11-1-ARCH.

Diego


[PATCH v4 14/19] scsi: pmcraid: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/scsi/pmcraid.c | 10 +-
 drivers/scsi/pmcraid.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/pmcraid.c b/drivers/scsi/pmcraid.c
index 49e70a3..0f893c4 100644
--- a/drivers/scsi/pmcraid.c
+++ b/drivers/scsi/pmcraid.c
@@ -4699,13 +4699,13 @@ pmcraid_release_control_blocks(
return;
 
for (i = 0; i < max_index; i++) {
-   pci_pool_free(pinstance->control_pool,
+   dma_pool_free(pinstance->control_pool,
  pinstance->cmd_list[i]->ioa_cb,
  pinstance->cmd_list[i]->ioa_cb_bus_addr);
pinstance->cmd_list[i]->ioa_cb = NULL;
pinstance->cmd_list[i]->ioa_cb_bus_addr = 0;
}
-   pci_pool_destroy(pinstance->control_pool);
+   dma_pool_destroy(pinstance->control_pool);
pinstance->control_pool = NULL;
 }
 
@@ -4762,8 +4762,8 @@ static int pmcraid_allocate_control_blocks(struct pmcraid_instance *pinstance)
pinstance->host->unique_id);
 
pinstance->control_pool =
-   pci_pool_create(pinstance->ctl_pool_name,
-   pinstance->pdev,
+   dma_pool_create(pinstance->ctl_pool_name,
+   &pinstance->pdev->dev,
sizeof(struct pmcraid_control_block),
PMCRAID_IOARCB_ALIGNMENT, 0);
 
@@ -4772,7 +4772,7 @@ static int pmcraid_allocate_control_blocks(struct pmcraid_instance *pinstance)
 
for (i = 0; i < PMCRAID_MAX_CMD; i++) {
pinstance->cmd_list[i]->ioa_cb =
-   pci_pool_alloc(
+   dma_pool_alloc(
pinstance->control_pool,
GFP_KERNEL,
&(pinstance->cmd_list[i]->ioa_cb_bus_addr));
diff --git a/drivers/scsi/pmcraid.h b/drivers/scsi/pmcraid.h
index 568b18a..acf5a7b 100644
--- a/drivers/scsi/pmcraid.h
+++ b/drivers/scsi/pmcraid.h
@@ -755,7 +755,7 @@ struct pmcraid_instance {
 
/* structures related to command blocks */
struct kmem_cache *cmd_cachep;  /* cache for cmd blocks */
-   struct pci_pool *control_pool;  /* pool for control blocks */
+   struct dma_pool *control_pool;  /* pool for control blocks */
char   cmd_pool_name[64];   /* name of cmd cache */
char   ctl_pool_name[64];   /* name of control cache */
 
-- 
2.9.3



[PATCH v4 15/19] usb: gadget: amd5536udc: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/usb/gadget/udc/amd5536udc.c | 8 
 drivers/usb/gadget/udc/amd5536udc.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/usb/gadget/udc/amd5536udc.c b/drivers/usb/gadget/udc/amd5536udc.c
index ea03ca7..270876b 100644
--- a/drivers/usb/gadget/udc/amd5536udc.c
+++ b/drivers/usb/gadget/udc/amd5536udc.c
@@ -583,7 +583,7 @@ udc_alloc_request(struct usb_ep *usbep, gfp_t gfp)
 
if (ep->dma) {
/* ep0 in requests are allocated from data pool here */
-   dma_desc = pci_pool_alloc(ep->dev->data_requests, gfp,
+   dma_desc = dma_pool_alloc(ep->dev->data_requests, gfp,
&req->td_phys);
if (!dma_desc) {
kfree(req);
@@ -622,7 +622,7 @@ static int udc_free_dma_chain(struct udc *dev, struct udc_request *req)
td = phys_to_virt(td_last->next);
 
for (i = 1; i < req->chain_len; i++) {
-   pci_pool_free(dev->data_requests, td,
+   dma_pool_free(dev->data_requests, td,
  (dma_addr_t)td_last->next);
td_last = td;
td = phys_to_virt(td_last->next);
@@ -652,7 +652,7 @@ udc_free_request(struct usb_ep *usbep, struct usb_request *usbreq)
if (req->chain_len > 1)
udc_free_dma_chain(ep->dev, req);
 
-   pci_pool_free(ep->dev->data_requests, req->td_data,
+   dma_pool_free(ep->dev->data_requests, req->td_data,
req->td_phys);
}
kfree(req);
@@ -847,7 +847,7 @@ static int udc_create_dma_chain(
for (i = buf_len; i < bytes; i += buf_len) {
/* create or determine next desc. */
if (create_new_chain) {
-   td = pci_pool_alloc(ep->dev->data_requests,
+   td = dma_pool_alloc(ep->dev->data_requests,
gfp_flags, _addr);
if (!td)
return -ENOMEM;
diff --git a/drivers/usb/gadget/udc/amd5536udc.h b/drivers/usb/gadget/udc/amd5536udc.h
index 4638d70..85d5aa5 100644
--- a/drivers/usb/gadget/udc/amd5536udc.h
+++ b/drivers/usb/gadget/udc/amd5536udc.h
@@ -545,8 +545,8 @@ struct udc {
u32 __iomem *txfifo;
 
/* DMA desc pools */
-   struct pci_pool *data_requests;
-   struct pci_pool *stp_requests;
+   struct dma_pool *data_requests;
+   struct dma_pool *stp_requests;
 
/* device data */
unsigned long   phys_addr;
-- 
2.9.3



[PATCH v4 18/19] usb: host: Remove remaining pci_pool in comments

2017-03-01 Thread Romain Perier
This replaces the remaining occurrences of pci_pool with dma_pool in
comments, as dma_pool is the API now used for that purpose.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/usb/host/ehci-hcd.c | 2 +-
 drivers/usb/host/fotg210-hcd.c  | 2 +-
 drivers/usb/host/oxu210hp-hcd.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index ac2c4ea..6e834b83 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -597,7 +597,7 @@ static int ehci_run (struct usb_hcd *hcd)
/*
 * hcc_params controls whether ehci->regs->segment must (!!!)
 * be used; it constrains QH/ITD/SITD and QTD locations.
-* pci_pool consistent memory always uses segment zero.
+* dma_pool consistent memory always uses segment zero.
 * streaming mappings for I/O buffers, like pci_map_single(),
 * can return segments above 4GB, if the device allows.
 *
diff --git a/drivers/usb/host/fotg210-hcd.c b/drivers/usb/host/fotg210-hcd.c
index 1c5b34b..ced08dc 100644
--- a/drivers/usb/host/fotg210-hcd.c
+++ b/drivers/usb/host/fotg210-hcd.c
@@ -5047,7 +5047,7 @@ static int fotg210_run(struct usb_hcd *hcd)
/*
 * hcc_params controls whether fotg210->regs->segment must (!!!)
 * be used; it constrains QH/ITD/SITD and QTD locations.
-* pci_pool consistent memory always uses segment zero.
+* dma_pool consistent memory always uses segment zero.
 * streaming mappings for I/O buffers, like pci_map_single(),
 * can return segments above 4GB, if the device allows.
 *
diff --git a/drivers/usb/host/oxu210hp-hcd.c b/drivers/usb/host/oxu210hp-hcd.c
index bcf531c..ed20fb3 100644
--- a/drivers/usb/host/oxu210hp-hcd.c
+++ b/drivers/usb/host/oxu210hp-hcd.c
@@ -2708,7 +2708,7 @@ static int oxu_run(struct usb_hcd *hcd)
 
/* hcc_params controls whether oxu->regs->segment must (!!!)
 * be used; it constrains QH/ITD/SITD and QTD locations.
-* pci_pool consistent memory always uses segment zero.
+* dma_pool consistent memory always uses segment zero.
 * streaming mappings for I/O buffers, like pci_map_single(),
 * can return segments above 4GB, if the device allows.
 *
-- 
2.9.3



[PATCH v4 07/19] wireless: ipw2200: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/net/wireless/intel/ipw2x00/ipw2200.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/intel/ipw2x00/ipw2200.c b/drivers/net/wireless/intel/ipw2x00/ipw2200.c
index 5ef3c5c..93dfe47 100644
--- a/drivers/net/wireless/intel/ipw2x00/ipw2200.c
+++ b/drivers/net/wireless/intel/ipw2x00/ipw2200.c
@@ -3211,7 +3211,7 @@ static int ipw_load_firmware(struct ipw_priv *priv, u8 * data, size_t len)
struct fw_chunk *chunk;
int total_nr = 0;
int i;
-   struct pci_pool *pool;
+   struct dma_pool *pool;
void **virts;
dma_addr_t *phys;
 
@@ -3228,9 +3228,10 @@ static int ipw_load_firmware(struct ipw_priv *priv, u8 * data, size_t len)
kfree(virts);
return -ENOMEM;
}
-   pool = pci_pool_create("ipw2200", priv->pci_dev, CB_MAX_LENGTH, 0, 0);
+   pool = dma_pool_create("ipw2200", &priv->pci_dev->dev, CB_MAX_LENGTH, 0,
+  0);
if (!pool) {
-   IPW_ERROR("pci_pool_create failed\n");
+   IPW_ERROR("dma_pool_create failed\n");
kfree(phys);
kfree(virts);
return -ENOMEM;
@@ -3255,7 +3256,7 @@ static int ipw_load_firmware(struct ipw_priv *priv, u8 * data, size_t len)
 
nr = (chunk_len + CB_MAX_LENGTH - 1) / CB_MAX_LENGTH;
for (i = 0; i < nr; i++) {
-   virts[total_nr] = pci_pool_alloc(pool, GFP_KERNEL,
+   virts[total_nr] = dma_pool_alloc(pool, GFP_KERNEL,
 &phys[total_nr]);
if (!virts[total_nr]) {
ret = -ENOMEM;
@@ -3299,9 +3300,9 @@ static int ipw_load_firmware(struct ipw_priv *priv, u8 * data, size_t len)
}
  out:
for (i = 0; i < total_nr; i++)
-   pci_pool_free(pool, virts[i], phys[i]);
+   dma_pool_free(pool, virts[i], phys[i]);
 
-   pci_pool_destroy(pool);
+   dma_pool_destroy(pool);
kfree(phys);
kfree(virts);
 
-- 
2.9.3



[PATCH v4 09/19] scsi: csiostor: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions. It also renames
some variables and updates comments accordingly.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/scsi/csiostor/csio_hw.h   |  2 +-
 drivers/scsi/csiostor/csio_init.c | 11 ++-
 drivers/scsi/csiostor/csio_scsi.c |  6 +++---
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/csiostor/csio_hw.h b/drivers/scsi/csiostor/csio_hw.h
index 029bef8..55b04fc 100644
--- a/drivers/scsi/csiostor/csio_hw.h
+++ b/drivers/scsi/csiostor/csio_hw.h
@@ -465,7 +465,7 @@ struct csio_hw {
struct csio_pport   pport[CSIO_MAX_PPORTS]; /* Ports (XGMACs) */
struct csio_hw_params   params; /* Hw parameters */
 
-   struct pci_pool *scsi_pci_pool; /* PCI pool for SCSI */
+   struct dma_pool *scsi_dma_pool; /* DMA pool for SCSI */
mempool_t   *mb_mempool;/* Mailbox memory pool*/
mempool_t   *rnode_mempool; /* rnode memory pool */
 
diff --git a/drivers/scsi/csiostor/csio_init.c b/drivers/scsi/csiostor/csio_init.c
index dbe416f..292964c 100644
--- a/drivers/scsi/csiostor/csio_init.c
+++ b/drivers/scsi/csiostor/csio_init.c
@@ -485,9 +485,10 @@ csio_resource_alloc(struct csio_hw *hw)
if (!hw->rnode_mempool)
goto err_free_mb_mempool;
 
-   hw->scsi_pci_pool = pci_pool_create("csio_scsi_pci_pool", hw->pdev,
-   CSIO_SCSI_RSP_LEN, 8, 0);
-   if (!hw->scsi_pci_pool)
+   hw->scsi_dma_pool = dma_pool_create("csio_scsi_dma_pool",
+   &hw->pdev->dev, CSIO_SCSI_RSP_LEN,
+   8, 0);
+   if (!hw->scsi_dma_pool)
goto err_free_rn_pool;
 
return 0;
@@ -505,8 +506,8 @@ csio_resource_alloc(struct csio_hw *hw)
 static void
 csio_resource_free(struct csio_hw *hw)
 {
-   pci_pool_destroy(hw->scsi_pci_pool);
-   hw->scsi_pci_pool = NULL;
+   dma_pool_destroy(hw->scsi_dma_pool);
+   hw->scsi_dma_pool = NULL;
mempool_destroy(hw->rnode_mempool);
hw->rnode_mempool = NULL;
mempool_destroy(hw->mb_mempool);
diff --git a/drivers/scsi/csiostor/csio_scsi.c b/drivers/scsi/csiostor/csio_scsi.c
index a1ff75f..dab0d3f 100644
--- a/drivers/scsi/csiostor/csio_scsi.c
+++ b/drivers/scsi/csiostor/csio_scsi.c
@@ -2445,7 +2445,7 @@ csio_scsim_init(struct csio_scsim *scm, struct csio_hw *hw)
 
/* Allocate Dma buffers for Response Payload */
dma_buf = &ioreq->dma_buf;
-   dma_buf->vaddr = pci_pool_alloc(hw->scsi_pci_pool, GFP_KERNEL,
+   dma_buf->vaddr = dma_pool_alloc(hw->scsi_dma_pool, GFP_KERNEL,
&dma_buf->paddr);
if (!dma_buf->vaddr) {
csio_err(hw,
@@ -2485,7 +2485,7 @@ csio_scsim_init(struct csio_scsim *scm, struct csio_hw *hw)
ioreq = (struct csio_ioreq *)tmp;
 
dma_buf = &ioreq->dma_buf;
-   pci_pool_free(hw->scsi_pci_pool, dma_buf->vaddr,
+   dma_pool_free(hw->scsi_dma_pool, dma_buf->vaddr,
  dma_buf->paddr);
 
kfree(ioreq);
@@ -2516,7 +2516,7 @@ csio_scsim_exit(struct csio_scsim *scm)
ioreq = (struct csio_ioreq *)tmp;
 
dma_buf = &ioreq->dma_buf;
-   pci_pool_free(scm->hw->scsi_pci_pool, dma_buf->vaddr,
+   dma_pool_free(scm->hw->scsi_dma_pool, dma_buf->vaddr,
  dma_buf->paddr);
 
kfree(ioreq);
-- 
2.9.3



[PATCH v4 06/19] mlx5: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 11 ++-
 include/linux/mlx5/driver.h   |  2 +-
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index caa837e..6eef344 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1061,7 +1061,7 @@ static struct mlx5_cmd_mailbox *alloc_cmd_box(struct mlx5_core_dev *dev,
if (!mailbox)
return ERR_PTR(-ENOMEM);
 
-   mailbox->buf = pci_pool_zalloc(dev->cmd.pool, flags,
+   mailbox->buf = dma_pool_zalloc(dev->cmd.pool, flags,
   &mailbox->dma);
if (!mailbox->buf) {
mlx5_core_dbg(dev, "failed allocation\n");
@@ -1076,7 +1076,7 @@ static struct mlx5_cmd_mailbox *alloc_cmd_box(struct mlx5_core_dev *dev,
 static void free_cmd_box(struct mlx5_core_dev *dev,
 struct mlx5_cmd_mailbox *mailbox)
 {
-   pci_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
+   dma_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
kfree(mailbox);
 }
 
@@ -1696,7 +1696,8 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
return -EINVAL;
}
 
-   cmd->pool = pci_pool_create("mlx5_cmd", dev->pdev, size, align, 0);
+   cmd->pool = dma_pool_create("mlx5_cmd", &dev->pdev->dev, size, align,
+   0);
if (!cmd->pool)
return -ENOMEM;
 
@@ -1786,7 +1787,7 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
free_cmd_page(dev, cmd);
 
 err_free_pool:
-   pci_pool_destroy(cmd->pool);
+   dma_pool_destroy(cmd->pool);
 
return err;
 }
@@ -1800,6 +1801,6 @@ void mlx5_cmd_cleanup(struct mlx5_core_dev *dev)
destroy_workqueue(cmd->wq);
destroy_msg_cache(dev);
free_cmd_page(dev, cmd);
-   pci_pool_destroy(cmd->pool);
+   dma_pool_destroy(cmd->pool);
 }
 EXPORT_SYMBOL(mlx5_cmd_cleanup);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 2fcff6b..13a267c 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -284,7 +284,7 @@ struct mlx5_cmd {
struct semaphore pages_sem;
int mode;
struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
-   struct pci_pool *pool;
+   struct dma_pool *pool;
struct mlx5_cmd_debug dbg;
struct cmd_msg_cache cache[MLX5_NUM_COMMAND_CACHES];
int checksum_disabled;
-- 
2.9.3



[PATCH v4 04/19] net: e100: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the corresponding DMA pool API functions.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/net/ethernet/intel/e100.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/e100.c b/drivers/net/ethernet/intel/e100.c
index 2b7323d..d1002c2 100644
--- a/drivers/net/ethernet/intel/e100.c
+++ b/drivers/net/ethernet/intel/e100.c
@@ -607,7 +607,7 @@ struct nic {
struct mem *mem;
dma_addr_t dma_addr;
 
-   struct pci_pool *cbs_pool;
+   struct dma_pool *cbs_pool;
dma_addr_t cbs_dma_addr;
u8 adaptive_ifs;
u8 tx_threshold;
@@ -1892,7 +1892,7 @@ static void e100_clean_cbs(struct nic *nic)
nic->cb_to_clean = nic->cb_to_clean->next;
nic->cbs_avail++;
}
-   pci_pool_free(nic->cbs_pool, nic->cbs, nic->cbs_dma_addr);
+   dma_pool_free(nic->cbs_pool, nic->cbs, nic->cbs_dma_addr);
nic->cbs = NULL;
nic->cbs_avail = 0;
}
@@ -1910,7 +1910,7 @@ static int e100_alloc_cbs(struct nic *nic)
nic->cb_to_use = nic->cb_to_send = nic->cb_to_clean = NULL;
nic->cbs_avail = 0;
 
-   nic->cbs = pci_pool_alloc(nic->cbs_pool, GFP_KERNEL,
+   nic->cbs = dma_pool_alloc(nic->cbs_pool, GFP_KERNEL,
  &nic->cbs_dma_addr);
if (!nic->cbs)
return -ENOMEM;
@@ -2958,8 +2958,8 @@ static int e100_probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
netif_err(nic, probe, nic->netdev, "Cannot register net device, 
aborting\n");
goto err_out_free;
}
-   nic->cbs_pool = pci_pool_create(netdev->name,
-  nic->pdev,
+   nic->cbs_pool = dma_pool_create(netdev->name,
+  &nic->pdev->dev,
   nic->params.cbs.max * sizeof(struct cb),
   sizeof(u32),
   0);
@@ -2999,7 +2999,7 @@ static void e100_remove(struct pci_dev *pdev)
unregister_netdev(netdev);
e100_free(nic);
pci_iounmap(pdev, nic->csr);
-   pci_pool_destroy(nic->cbs_pool);
+   dma_pool_destroy(nic->cbs_pool);
free_netdev(netdev);
pci_release_regions(pdev);
pci_disable_device(pdev);
-- 
2.9.3



[PATCH v4 05/19] mlx4: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the appropriate functions from the DMA pool API.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/net/ethernet/mellanox/mlx4/cmd.c  | 10 +-
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c 
b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index e8c1051..fb69604 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2516,8 +2516,8 @@ int mlx4_cmd_init(struct mlx4_dev *dev)
}
 
if (!priv->cmd.pool) {
-   priv->cmd.pool = pci_pool_create("mlx4_cmd",
-dev->persist->pdev,
+   priv->cmd.pool = dma_pool_create("mlx4_cmd",
+&dev->persist->pdev->dev,
 MLX4_MAILBOX_SIZE,
 MLX4_MAILBOX_SIZE, 0);
if (!priv->cmd.pool)
@@ -2588,7 +2588,7 @@ void mlx4_cmd_cleanup(struct mlx4_dev *dev, int 
cleanup_mask)
struct mlx4_priv *priv = mlx4_priv(dev);
 
if (priv->cmd.pool && (cleanup_mask & MLX4_CMD_CLEANUP_POOL)) {
-   pci_pool_destroy(priv->cmd.pool);
+   dma_pool_destroy(priv->cmd.pool);
priv->cmd.pool = NULL;
}
 
@@ -2680,7 +2680,7 @@ struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct 
mlx4_dev *dev)
if (!mailbox)
return ERR_PTR(-ENOMEM);
 
-   mailbox->buf = pci_pool_zalloc(mlx4_priv(dev)->cmd.pool, GFP_KERNEL,
+   mailbox->buf = dma_pool_zalloc(mlx4_priv(dev)->cmd.pool, GFP_KERNEL,
   &mailbox->dma);
if (!mailbox->buf) {
kfree(mailbox);
@@ -2697,7 +2697,7 @@ void mlx4_free_cmd_mailbox(struct mlx4_dev *dev,
if (!mailbox)
return;
 
-   pci_pool_free(mlx4_priv(dev)->cmd.pool, mailbox->buf, mailbox->dma);
+   dma_pool_free(mlx4_priv(dev)->cmd.pool, mailbox->buf, mailbox->dma);
kfree(mailbox);
 }
 EXPORT_SYMBOL_GPL(mlx4_free_cmd_mailbox);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index b4f1bc5..69c8764 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -628,7 +628,7 @@ struct mlx4_mgm {
 };
 
 struct mlx4_cmd {
-   struct pci_pool*pool;
+   struct dma_pool*pool;
void __iomem   *hcr;
struct mutexslave_cmd_mutex;
struct semaphorepoll_sem;
-- 
2.9.3



[PATCH v4 00/19] Replace PCI pool by DMA pool API

2017-03-01 Thread Romain Perier
The current PCI pool API consists of simple macros that expand directly to
the corresponding DMA pool functions. The prototypes are almost the same
and the semantics are very similar. I propose to use the DMA pool
API directly and get rid of the old API.

This set of patches replaces the old API with the DMA pool API, adds
support for warning about the old API in checkpatch.pl, and removes the
defines.
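
For readers unfamiliar with the wrappers, here is a minimal userspace sketch
of the shape of the old macros. The struct definitions and the
dma_pool_create() stub are hypothetical stand-ins so the snippet compiles on
its own; they are not the kernel definitions. It only illustrates that the old
call took a struct pci_dev and forwarded the address of its embedded
struct device, which is why each conversion in this series changes the second
argument.

```c
#include <stddef.h>

/* Userspace stand-ins (assumptions), NOT the kernel definitions. */
struct device { int id; };
struct pci_dev { struct device dev; };
struct dma_pool { struct device *dev; size_t size; };

static struct dma_pool pool_storage;

/* Stub that only records its arguments; the real function allocates. */
static struct dma_pool *dma_pool_create(const char *name, struct device *dev,
					size_t size, size_t align,
					size_t boundary)
{
	(void)name; (void)align; (void)boundary;
	pool_storage.dev = dev;
	pool_storage.size = size;
	return &pool_storage;
}

/* Shape of the old wrappers: same call, but taking a pci_dev. */
#define pci_pool dma_pool
#define pci_pool_create(name, pdev, size, align, allocation) \
	dma_pool_create(name, &(pdev)->dev, size, align, allocation)
```

With these stand-ins, pci_pool_create("x", pdev, ...) and
dma_pool_create("x", &pdev->dev, ...) end up recording the same device
pointer, so the conversion is mechanical.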

Changes in v4:
- Rebased series onto next-20170301
- Removed patch 20/20: checks done by checkpatch.pl, no longer required.
  Thanks to Peter and Joe for their feedback.
- Added Reviewed-by tags

Changes in v3:
- Rebased series onto next-20170224
- Fix checkpatch.pl reports for patch 11/20 and patch 12/20
- Remove prefix RFC

Changes in v2:
- Introduced patch 18/20
- Fixed cosmetic changes: spaces before brace, lines over 80 characters
- Removed some of the check for NULL pointers before calling dma_pool_destroy
- Improved the regexp in checkpatch for pci_pool, thanks to Joe Perches
- Added Tested-by and Acked-by tags

Romain Perier (19):
  block: DAC960: Replace PCI pool old API
  dmaengine: pch_dma: Replace PCI pool old API
  IB/mthca: Replace PCI pool old API
  net: e100: Replace PCI pool old API
  mlx4: Replace PCI pool old API
  mlx5: Replace PCI pool old API
  wireless: ipw2200: Replace PCI pool old API
  scsi: be2iscsi: Replace PCI pool old API
  scsi: csiostor: Replace PCI pool old API
  scsi: lpfc: Replace PCI pool old API
  scsi: megaraid: Replace PCI pool old API
  scsi: mpt3sas: Replace PCI pool old API
  scsi: mvsas: Replace PCI pool old API
  scsi: pmcraid: Replace PCI pool old API
  usb: gadget: amd5536udc: Replace PCI pool old API
  usb: gadget: net2280: Replace PCI pool old API
  usb: gadget: pch_udc: Replace PCI pool old API
  usb: host: Remove remaining pci_pool in comments
  PCI: Remove PCI pool macro functions

 drivers/block/DAC960.c|  36 -
 drivers/block/DAC960.h|   4 +-
 drivers/dma/pch_dma.c |  12 +--
 drivers/infiniband/hw/mthca/mthca_av.c|  10 +--
 drivers/infiniband/hw/mthca/mthca_cmd.c   |   8 +-
 drivers/infiniband/hw/mthca/mthca_dev.h   |   4 +-
 drivers/net/ethernet/intel/e100.c |  12 +--
 drivers/net/ethernet/mellanox/mlx4/cmd.c  |  10 +--
 drivers/net/ethernet/mellanox/mlx4/mlx4.h |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |  11 +--
 drivers/net/wireless/intel/ipw2x00/ipw2200.c  |  13 ++--
 drivers/scsi/be2iscsi/be_iscsi.c  |   6 +-
 drivers/scsi/be2iscsi/be_main.c   |   6 +-
 drivers/scsi/be2iscsi/be_main.h   |   2 +-
 drivers/scsi/csiostor/csio_hw.h   |   2 +-
 drivers/scsi/csiostor/csio_init.c |  11 +--
 drivers/scsi/csiostor/csio_scsi.c |   6 +-
 drivers/scsi/lpfc/lpfc.h  |  12 +--
 drivers/scsi/lpfc/lpfc_init.c |  16 ++--
 drivers/scsi/lpfc/lpfc_mem.c  | 105 +-
 drivers/scsi/lpfc/lpfc_nvme.c |   6 +-
 drivers/scsi/lpfc/lpfc_nvmet.c|   4 +-
 drivers/scsi/lpfc/lpfc_scsi.c |  12 +--
 drivers/scsi/megaraid/megaraid_mbox.c |  33 
 drivers/scsi/megaraid/megaraid_mm.c   |  32 
 drivers/scsi/megaraid/megaraid_sas_base.c |  29 +++
 drivers/scsi/megaraid/megaraid_sas_fusion.c   |  66 
 drivers/scsi/mpt3sas/mpt3sas_base.c   |  73 +-
 drivers/scsi/mvsas/mv_init.c  |   6 +-
 drivers/scsi/mvsas/mv_sas.c   |   6 +-
 drivers/scsi/pmcraid.c|  10 +--
 drivers/scsi/pmcraid.h|   2 +-
 drivers/usb/gadget/udc/amd5536udc.c   |   8 +-
 drivers/usb/gadget/udc/amd5536udc.h   |   4 +-
 drivers/usb/gadget/udc/net2280.c  |  12 +--
 drivers/usb/gadget/udc/net2280.h  |   2 +-
 drivers/usb/gadget/udc/pch_udc.c  |  31 
 drivers/usb/host/ehci-hcd.c   |   2 +-
 drivers/usb/host/fotg210-hcd.c|   2 +-
 drivers/usb/host/oxu210hp-hcd.c   |   2 +-
 include/linux/mlx5/driver.h   |   2 +-
 include/linux/pci.h   |   9 ---
 42 files changed, 310 insertions(+), 331 deletions(-)

-- 
2.9.3



[PATCH] Bluetooth: fix assignments on error variable err

2017-03-01 Thread Colin King
From: Colin Ian King 

Variable err is being initialized to zero and then later being
set to the error return from the call to hci_req_run_skb; hence
we can remove the redundant initialization to zero.

Also on two occasions err is not being set from the error return
from the call to hci_req_run_skb, so add these missing assignments.
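
The effect of the missing assignment can be shown with a small userspace
sketch. run_request(), handler_old() and handler_fixed() are hypothetical
names used for illustration only; run_request() stands in for a call that
fails. With err preset to zero and the return value dropped, the error branch
can never run.

```c
/* Hypothetical stand-in for a request call that fails. */
static int run_request(void)
{
	return -5;
}

/* Old pattern: err is preset to 0 and the return value is dropped. */
static int handler_old(void)
{
	int err = 0;
	int fallback_sent = 0;

	run_request();		/* result ignored */
	if (err < 0)		/* never true */
		fallback_sent = 1;
	return fallback_sent;
}

/* Fixed pattern: no dummy initializer, err takes the return value. */
static int handler_fixed(void)
{
	int err;
	int fallback_sent = 0;

	err = run_request();
	if (err < 0)
		fallback_sent = 1;
	return fallback_sent;
}
```

Dropping the initializer also lets the compiler warn about an uninitialized
err if a future change forgets the assignment again.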

Signed-off-by: Colin Ian King 
---
 net/bluetooth/amp.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/bluetooth/amp.c b/net/bluetooth/amp.c
index 02a4ccc..ebcab5b 100644
--- a/net/bluetooth/amp.c
+++ b/net/bluetooth/amp.c
@@ -263,7 +263,7 @@ void amp_read_loc_assoc_frag(struct hci_dev *hdev, u8 
phy_handle)
struct hci_cp_read_local_amp_assoc cp;
struct amp_assoc *loc_assoc = &hdev->loc_assoc;
struct hci_request req;
-   int err = 0;
+   int err;
 
BT_DBG("%s handle %d", hdev->name, phy_handle);
 
@@ -282,7 +282,7 @@ void amp_read_loc_assoc(struct hci_dev *hdev, struct 
amp_mgr *mgr)
 {
struct hci_cp_read_local_amp_assoc cp;
struct hci_request req;
-   int err = 0;
+   int err;
 
memset(&hdev->loc_assoc, 0, sizeof(struct amp_assoc));
memset(&cp, 0, sizeof(cp));
@@ -292,7 +292,7 @@ void amp_read_loc_assoc(struct hci_dev *hdev, struct 
amp_mgr *mgr)
set_bit(READ_LOC_AMP_ASSOC, &mgr->state);
hci_req_init(&req, hdev);
hci_req_add(&req, HCI_OP_READ_LOCAL_AMP_ASSOC, sizeof(cp), &cp);
-   hci_req_run_skb(&req, read_local_amp_assoc_complete);
+   err = hci_req_run_skb(&req, read_local_amp_assoc_complete);
if (err < 0)
a2mp_send_getampassoc_rsp(hdev, A2MP_STATUS_INVALID_CTRL_ID);
 }
@@ -303,7 +303,7 @@ void amp_read_loc_assoc_final_data(struct hci_dev *hdev,
struct hci_cp_read_local_amp_assoc cp;
struct amp_mgr *mgr = hcon->amp_mgr;
struct hci_request req;
-   int err = 0;
+   int err;
 
cp.phy_handle = hcon->handle;
cp.len_so_far = cpu_to_le16(0);
@@ -314,7 +314,7 @@ void amp_read_loc_assoc_final_data(struct hci_dev *hdev,
/* Read Local AMP Assoc final link information data */
hci_req_init(&req, hdev);
hci_req_add(&req, HCI_OP_READ_LOCAL_AMP_ASSOC, sizeof(cp), &cp);
-   hci_req_run_skb(&req, read_local_amp_assoc_complete);
+   err = hci_req_run_skb(&req, read_local_amp_assoc_complete);
if (err < 0)
a2mp_send_getampassoc_rsp(hdev, A2MP_STATUS_INVALID_CTRL_ID);
 }
-- 
2.10.2



[PATCH v4 17/19] usb: gadget: pch_udc: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the appropriate functions from the DMA pool API.

Signed-off-by: Romain Perier 
Reviewed-by: Peter Senna Tschudin 
---
 drivers/usb/gadget/udc/pch_udc.c | 31 +++
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/usb/gadget/udc/pch_udc.c b/drivers/usb/gadget/udc/pch_udc.c
index a97da64..84dcbcd 100644
--- a/drivers/usb/gadget/udc/pch_udc.c
+++ b/drivers/usb/gadget/udc/pch_udc.c
@@ -355,8 +355,8 @@ struct pch_udc_dev {
vbus_session:1,
set_cfg_not_acked:1,
waiting_zlp_ack:1;
-   struct pci_pool *data_requests;
-   struct pci_pool *stp_requests;
+   struct dma_pool *data_requests;
+   struct dma_pool *stp_requests;
dma_addr_t  dma_addr;
struct usb_ctrlrequest  setup_data;
void __iomem*base_addr;
@@ -1522,7 +1522,7 @@ static void pch_udc_free_dma_chain(struct pch_udc_dev 
*dev,
/* do not free first desc., will be done by free for request */
td = phys_to_virt(addr);
addr2 = (dma_addr_t)td->next;
-   pci_pool_free(dev->data_requests, td, addr);
+   dma_pool_free(dev->data_requests, td, addr);
td->next = 0x00;
addr = addr2;
}
@@ -1539,7 +1539,7 @@ static void pch_udc_free_dma_chain(struct pch_udc_dev 
*dev,
  *
  * Return codes:
  * 0:  success,
- * -ENOMEM:pci_pool_alloc invocation fails
+ * -ENOMEM:dma_pool_alloc invocation fails
  */
 static int pch_udc_create_dma_chain(struct pch_udc_ep *ep,
struct pch_udc_request *req,
@@ -1565,7 +1565,7 @@ static int pch_udc_create_dma_chain(struct pch_udc_ep *ep,
if (bytes <= buf_len)
break;
last = td;
-   td = pci_pool_alloc(ep->dev->data_requests, gfp_flags,
+   td = dma_pool_alloc(ep->dev->data_requests, gfp_flags,
&dma_addr);
if (!td)
goto nomem;
@@ -1770,7 +1770,7 @@ static struct usb_request *pch_udc_alloc_request(struct 
usb_ep *usbep,
if (!ep->dev->dma_addr)
return &req->req;
/* ep0 in requests are allocated from data pool here */
-   dma_desc = pci_pool_alloc(ep->dev->data_requests, gfp,
+   dma_desc = dma_pool_alloc(ep->dev->data_requests, gfp,
  &req->td_data_phys);
if (NULL == dma_desc) {
kfree(req);
@@ -1809,7 +1809,7 @@ static void pch_udc_free_request(struct usb_ep *usbep,
if (req->td_data != NULL) {
if (req->chain_len > 1)
pch_udc_free_dma_chain(ep->dev, req);
-   pci_pool_free(ep->dev->data_requests, req->td_data,
+   dma_pool_free(ep->dev->data_requests, req->td_data,
  req->td_data_phys);
}
kfree(req);
@@ -2914,7 +2914,7 @@ static int init_dma_pools(struct pch_udc_dev *dev)
void*ep0out_buf;
 
/* DMA setup */
-   dev->data_requests = pci_pool_create("data_requests", dev->pdev,
+   dev->data_requests = dma_pool_create("data_requests", &dev->pdev->dev,
sizeof(struct pch_udc_data_dma_desc), 0, 0);
if (!dev->data_requests) {
dev_err(&dev->pdev->dev, "%s: can't get request data pool\n",
@@ -2923,7 +2923,7 @@ static int init_dma_pools(struct pch_udc_dev *dev)
}
 
/* dma desc for setup data */
-   dev->stp_requests = pci_pool_create("setup requests", dev->pdev,
+   dev->stp_requests = dma_pool_create("setup requests", &dev->pdev->dev,
sizeof(struct pch_udc_stp_dma_desc), 0, 0);
if (!dev->stp_requests) {
dev_err(&dev->pdev->dev, "%s: can't get setup request pool\n",
@@ -2931,7 +2931,7 @@ static int init_dma_pools(struct pch_udc_dev *dev)
return -ENOMEM;
}
/* setup */
-   td_stp = pci_pool_alloc(dev->stp_requests, GFP_KERNEL,
+   td_stp = dma_pool_alloc(dev->stp_requests, GFP_KERNEL,
&dev->ep[UDC_EP0OUT_IDX].td_stp_phys);
if (!td_stp) {
dev_err(&dev->pdev->dev,
@@ -2941,7 +2941,7 @@ static int init_dma_pools(struct pch_udc_dev *dev)
dev->ep[UDC_EP0OUT_IDX].td_stp = td_stp;
 
/* data: 0 packets !? */
-   td_data = pci_pool_alloc(dev->data_requests, GFP_KERNEL,
+   td_data = dma_pool_alloc(dev->data_requests, GFP_KERNEL,
&dev->ep[UDC_EP0OUT_IDX].td_data_phys);
if (!td_data) {
dev_err(&dev->pdev->dev,
@@ -3021,22 +3021,21 @@ static void pch_udc_remove(struct pci_dev *pdev)
   

[PATCH v4 08/19] scsi: be2iscsi: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the appropriate functions from the DMA pool API.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/scsi/be2iscsi/be_iscsi.c | 6 +++---
 drivers/scsi/be2iscsi/be_main.c  | 6 +++---
 drivers/scsi/be2iscsi/be_main.h  | 2 +-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/be2iscsi/be_iscsi.c b/drivers/scsi/be2iscsi/be_iscsi.c
index a484457..d76ef77 100644
--- a/drivers/scsi/be2iscsi/be_iscsi.c
+++ b/drivers/scsi/be2iscsi/be_iscsi.c
@@ -87,8 +87,8 @@ struct iscsi_cls_session *beiscsi_session_create(struct 
iscsi_endpoint *ep,
return NULL;
sess = cls_session->dd_data;
beiscsi_sess = sess->dd_data;
-   beiscsi_sess->bhs_pool =  pci_pool_create("beiscsi_bhs_pool",
-  phba->pcidev,
+   beiscsi_sess->bhs_pool =  dma_pool_create("beiscsi_bhs_pool",
+  &phba->pcidev->dev,
   sizeof(struct be_cmd_bhs),
   64, 0);
if (!beiscsi_sess->bhs_pool)
@@ -113,7 +113,7 @@ void beiscsi_session_destroy(struct iscsi_cls_session 
*cls_session)
struct beiscsi_session *beiscsi_sess = sess->dd_data;
 
printk(KERN_INFO "In beiscsi_session_destroy\n");
-   pci_pool_destroy(beiscsi_sess->bhs_pool);
+   dma_pool_destroy(beiscsi_sess->bhs_pool);
iscsi_session_teardown(cls_session);
 }
 
diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c
index 32b2713..dd43480 100644
--- a/drivers/scsi/be2iscsi/be_main.c
+++ b/drivers/scsi/be2iscsi/be_main.c
@@ -4307,7 +4307,7 @@ static void beiscsi_cleanup_task(struct iscsi_task *task)
pwrb_context = &phwi_ctrlr->wrb_context[cri_index];
 
if (io_task->cmd_bhs) {
-   pci_pool_free(beiscsi_sess->bhs_pool, io_task->cmd_bhs,
+   dma_pool_free(beiscsi_sess->bhs_pool, io_task->cmd_bhs,
  io_task->bhs_pa.u.a64.address);
io_task->cmd_bhs = NULL;
task->hdr = NULL;
@@ -4424,7 +4424,7 @@ static int beiscsi_alloc_pdu(struct iscsi_task *task, 
uint8_t opcode)
struct beiscsi_session *beiscsi_sess = beiscsi_conn->beiscsi_sess;
dma_addr_t paddr;
 
-   io_task->cmd_bhs = pci_pool_alloc(beiscsi_sess->bhs_pool,
+   io_task->cmd_bhs = dma_pool_alloc(beiscsi_sess->bhs_pool,
  GFP_ATOMIC, &paddr);
if (!io_task->cmd_bhs)
return -ENOMEM;
@@ -4551,7 +4551,7 @@ static int beiscsi_alloc_pdu(struct iscsi_task *task, 
uint8_t opcode)
if (io_task->pwrb_handle)
free_wrb_handle(phba, pwrb_context, io_task->pwrb_handle);
io_task->pwrb_handle = NULL;
-   pci_pool_free(beiscsi_sess->bhs_pool, io_task->cmd_bhs,
+   dma_pool_free(beiscsi_sess->bhs_pool, io_task->cmd_bhs,
  io_task->bhs_pa.u.a64.address);
io_task->cmd_bhs = NULL;
return -ENOMEM;
diff --git a/drivers/scsi/be2iscsi/be_main.h b/drivers/scsi/be2iscsi/be_main.h
index 2188579..cf58d31 100644
--- a/drivers/scsi/be2iscsi/be_main.h
+++ b/drivers/scsi/be2iscsi/be_main.h
@@ -446,7 +446,7 @@ struct beiscsi_hba {
 test_bit(BEISCSI_HBA_ONLINE, &phba->state))
 
 struct beiscsi_session {
-   struct pci_pool *bhs_pool;
+   struct dma_pool *bhs_pool;
 };
 
 /**
-- 
2.9.3



[PATCH v4 03/19] IB/mthca: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the appropriate functions from the DMA pool API.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/infiniband/hw/mthca/mthca_av.c  | 10 +-
 drivers/infiniband/hw/mthca/mthca_cmd.c |  8 
 drivers/infiniband/hw/mthca/mthca_dev.h |  4 ++--
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_av.c 
b/drivers/infiniband/hw/mthca/mthca_av.c
index c9f0f36..9d041b6 100644
--- a/drivers/infiniband/hw/mthca/mthca_av.c
+++ b/drivers/infiniband/hw/mthca/mthca_av.c
@@ -186,7 +186,7 @@ int mthca_create_ah(struct mthca_dev *dev,
 
 on_hca_fail:
if (ah->type == MTHCA_AH_PCI_POOL) {
-   ah->av = pci_pool_zalloc(dev->av_table.pool,
+   ah->av = dma_pool_zalloc(dev->av_table.pool,
 GFP_ATOMIC, &ah->avdma);
if (!ah->av)
return -ENOMEM;
@@ -245,7 +245,7 @@ int mthca_destroy_ah(struct mthca_dev *dev, struct mthca_ah 
*ah)
break;
 
case MTHCA_AH_PCI_POOL:
-   pci_pool_free(dev->av_table.pool, ah->av, ah->avdma);
+   dma_pool_free(dev->av_table.pool, ah->av, ah->avdma);
break;
 
case MTHCA_AH_KMALLOC:
@@ -333,7 +333,7 @@ int mthca_init_av_table(struct mthca_dev *dev)
if (err)
return err;
 
-   dev->av_table.pool = pci_pool_create("mthca_av", dev->pdev,
+   dev->av_table.pool = dma_pool_create("mthca_av", &dev->pdev->dev,
 MTHCA_AV_SIZE,
 MTHCA_AV_SIZE, 0);
if (!dev->av_table.pool)
@@ -353,7 +353,7 @@ int mthca_init_av_table(struct mthca_dev *dev)
return 0;
 
  out_free_pool:
-   pci_pool_destroy(dev->av_table.pool);
+   dma_pool_destroy(dev->av_table.pool);
 
  out_free_alloc:
mthca_alloc_cleanup(&dev->av_table.alloc);
@@ -367,6 +367,6 @@ void mthca_cleanup_av_table(struct mthca_dev *dev)
 
if (dev->av_table.av_map)
iounmap(dev->av_table.av_map);
-   pci_pool_destroy(dev->av_table.pool);
+   dma_pool_destroy(dev->av_table.pool);
mthca_alloc_cleanup(&dev->av_table.alloc);
 }
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c 
b/drivers/infiniband/hw/mthca/mthca_cmd.c
index c7f49bb..7f219c8 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -530,7 +530,7 @@ int mthca_cmd_init(struct mthca_dev *dev)
return -ENOMEM;
}
 
-   dev->cmd.pool = pci_pool_create("mthca_cmd", dev->pdev,
+   dev->cmd.pool = dma_pool_create("mthca_cmd", &dev->pdev->dev,
MTHCA_MAILBOX_SIZE,
MTHCA_MAILBOX_SIZE, 0);
if (!dev->cmd.pool) {
@@ -543,7 +543,7 @@ int mthca_cmd_init(struct mthca_dev *dev)
 
 void mthca_cmd_cleanup(struct mthca_dev *dev)
 {
-   pci_pool_destroy(dev->cmd.pool);
+   dma_pool_destroy(dev->cmd.pool);
iounmap(dev->hcr);
if (dev->cmd.flags & MTHCA_CMD_POST_DOORBELLS)
iounmap(dev->cmd.dbell_map);
@@ -613,7 +613,7 @@ struct mthca_mailbox *mthca_alloc_mailbox(struct mthca_dev 
*dev,
if (!mailbox)
return ERR_PTR(-ENOMEM);
 
-   mailbox->buf = pci_pool_alloc(dev->cmd.pool, gfp_mask, &mailbox->dma);
+   mailbox->buf = dma_pool_alloc(dev->cmd.pool, gfp_mask, &mailbox->dma);
if (!mailbox->buf) {
kfree(mailbox);
return ERR_PTR(-ENOMEM);
@@ -627,7 +627,7 @@ void mthca_free_mailbox(struct mthca_dev *dev, struct 
mthca_mailbox *mailbox)
if (!mailbox)
return;
 
-   pci_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
+   dma_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
kfree(mailbox);
 }
 
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h 
b/drivers/infiniband/hw/mthca/mthca_dev.h
index 4393a02..8c3f6ed 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -118,7 +118,7 @@ enum {
 };
 
 struct mthca_cmd {
-   struct pci_pool  *pool;
+   struct dma_pool  *pool;
struct mutex  hcr_mutex;
struct semaphore  poll_sem;
struct semaphore  event_sem;
@@ -263,7 +263,7 @@ struct mthca_qp_table {
 };
 
 struct mthca_av_table {
-   struct pci_pool   *pool;
+   struct dma_pool   *pool;
intnum_ddr_avs;
u64ddr_av_base;
void __iomem  *av_map;
-- 
2.9.3



[PATCH v4 01/19] block: DAC960: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the appropriate functions from the DMA pool API.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/block/DAC960.c | 36 ++--
 drivers/block/DAC960.h |  4 ++--
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/drivers/block/DAC960.c b/drivers/block/DAC960.c
index 26a51be..2b221cc 100644
--- a/drivers/block/DAC960.c
+++ b/drivers/block/DAC960.c
@@ -268,17 +268,17 @@ static bool 
DAC960_CreateAuxiliaryStructures(DAC960_Controller_T *Controller)
   void *AllocationPointer = NULL;
   void *ScatterGatherCPU = NULL;
   dma_addr_t ScatterGatherDMA;
-  struct pci_pool *ScatterGatherPool;
+  struct dma_pool *ScatterGatherPool;
   void *RequestSenseCPU = NULL;
   dma_addr_t RequestSenseDMA;
-  struct pci_pool *RequestSensePool = NULL;
+  struct dma_pool *RequestSensePool = NULL;
 
   if (Controller->FirmwareType == DAC960_V1_Controller)
 {
   CommandAllocationLength = offsetof(DAC960_Command_T, V1.EndMarker);
   CommandAllocationGroupSize = DAC960_V1_CommandAllocationGroupSize;
-  ScatterGatherPool = pci_pool_create("DAC960_V1_ScatterGather",
-   Controller->PCIDevice,
+  ScatterGatherPool = dma_pool_create("DAC960_V1_ScatterGather",
+   &Controller->PCIDevice->dev,
DAC960_V1_ScatterGatherLimit * sizeof(DAC960_V1_ScatterGatherSegment_T),
sizeof(DAC960_V1_ScatterGatherSegment_T), 0);
   if (ScatterGatherPool == NULL)
@@ -290,18 +290,18 @@ static bool 
DAC960_CreateAuxiliaryStructures(DAC960_Controller_T *Controller)
 {
   CommandAllocationLength = offsetof(DAC960_Command_T, V2.EndMarker);
   CommandAllocationGroupSize = DAC960_V2_CommandAllocationGroupSize;
-  ScatterGatherPool = pci_pool_create("DAC960_V2_ScatterGather",
-   Controller->PCIDevice,
+  ScatterGatherPool = dma_pool_create("DAC960_V2_ScatterGather",
+   &Controller->PCIDevice->dev,
DAC960_V2_ScatterGatherLimit * sizeof(DAC960_V2_ScatterGatherSegment_T),
sizeof(DAC960_V2_ScatterGatherSegment_T), 0);
   if (ScatterGatherPool == NULL)
return DAC960_Failure(Controller,
"AUXILIARY STRUCTURE CREATION (SG)");
-  RequestSensePool = pci_pool_create("DAC960_V2_RequestSense",
-   Controller->PCIDevice, sizeof(DAC960_SCSI_RequestSense_T),
+  RequestSensePool = dma_pool_create("DAC960_V2_RequestSense",
+   &Controller->PCIDevice->dev, sizeof(DAC960_SCSI_RequestSense_T),
sizeof(int), 0);
   if (RequestSensePool == NULL) {
-   pci_pool_destroy(ScatterGatherPool);
+   dma_pool_destroy(ScatterGatherPool);
return DAC960_Failure(Controller,
"AUXILIARY STRUCTURE CREATION (SG)");
   }
@@ -335,16 +335,16 @@ static bool 
DAC960_CreateAuxiliaryStructures(DAC960_Controller_T *Controller)
   Command->Next = Controller->FreeCommands;
   Controller->FreeCommands = Command;
   Controller->Commands[CommandIdentifier-1] = Command;
-  ScatterGatherCPU = pci_pool_alloc(ScatterGatherPool, GFP_ATOMIC,
+  ScatterGatherCPU = dma_pool_alloc(ScatterGatherPool, GFP_ATOMIC,
&ScatterGatherDMA);
   if (ScatterGatherCPU == NULL)
  return DAC960_Failure(Controller, "AUXILIARY STRUCTURE CREATION");
 
   if (RequestSensePool != NULL) {
- RequestSenseCPU = pci_pool_alloc(RequestSensePool, GFP_ATOMIC,
+ RequestSenseCPU = dma_pool_alloc(RequestSensePool, GFP_ATOMIC,
&RequestSenseDMA);
  if (RequestSenseCPU == NULL) {
-pci_pool_free(ScatterGatherPool, ScatterGatherCPU,
+dma_pool_free(ScatterGatherPool, ScatterGatherCPU,
 ScatterGatherDMA);
return DAC960_Failure(Controller,
"AUXILIARY STRUCTURE CREATION");
@@ -379,8 +379,8 @@ static bool 
DAC960_CreateAuxiliaryStructures(DAC960_Controller_T *Controller)
 static void DAC960_DestroyAuxiliaryStructures(DAC960_Controller_T *Controller)
 {
   int i;
-  struct pci_pool *ScatterGatherPool = Controller->ScatterGatherPool;
-  struct pci_pool *RequestSensePool = NULL;
+  struct dma_pool *ScatterGatherPool = Controller->ScatterGatherPool;
+  struct dma_pool *RequestSensePool = NULL;
   void *ScatterGatherCPU;
   dma_addr_t ScatterGatherDMA;
   void *RequestSenseCPU;
@@ -411,9 +411,9 @@ static void 
DAC960_DestroyAuxiliaryStructures(DAC960_Controller_T *Controller)
  RequestSenseDMA = Command->V2.RequestSenseDMA;
   }
   if (ScatterGatherCPU != NULL)
-  pci_pool_free(ScatterGatherPool, ScatterGatherCPU, ScatterGatherDMA);
+  dma_pool_free(ScatterGatherPool, ScatterGatherCPU, ScatterGatherDMA);
  

[PATCH v4 02/19] dmaengine: pch_dma: Replace PCI pool old API

2017-03-01 Thread Romain Perier
The PCI pool API is deprecated. This commit replaces the old PCI pool
API with the appropriate functions from the DMA pool API.

Signed-off-by: Romain Perier 
Acked-by: Peter Senna Tschudin 
Tested-by: Peter Senna Tschudin 
---
 drivers/dma/pch_dma.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/dma/pch_dma.c b/drivers/dma/pch_dma.c
index f9028e9..afd8f27 100644
--- a/drivers/dma/pch_dma.c
+++ b/drivers/dma/pch_dma.c
@@ -123,7 +123,7 @@ struct pch_dma_chan {
 struct pch_dma {
struct dma_device   dma;
void __iomem *membase;
-   struct pci_pool *pool;
+   struct dma_pool *pool;
struct pch_dma_regs regs;
struct pch_dma_desc_regs ch_regs[MAX_CHAN_NR];
struct pch_dma_chan channels[MAX_CHAN_NR];
@@ -437,7 +437,7 @@ static struct pch_dma_desc *pdc_alloc_desc(struct dma_chan 
*chan, gfp_t flags)
struct pch_dma *pd = to_pd(chan->device);
dma_addr_t addr;
 
-   desc = pci_pool_zalloc(pd->pool, flags, &addr);
+   desc = dma_pool_zalloc(pd->pool, flags, &addr);
if (desc) {
INIT_LIST_HEAD(&desc->tx_list);
dma_async_tx_descriptor_init(&desc->txd, chan);
@@ -549,7 +549,7 @@ static void pd_free_chan_resources(struct dma_chan *chan)
spin_unlock_irq(&pd_chan->lock);
 
list_for_each_entry_safe(desc, _d, &tmp_list, desc_node)
-   pci_pool_free(pd->pool, desc, desc->txd.phys);
+   dma_pool_free(pd->pool, desc, desc->txd.phys);
 
pdc_enable_irq(chan, 0);
 }
@@ -880,7 +880,7 @@ static int pch_dma_probe(struct pci_dev *pdev,
goto err_iounmap;
}
 
-   pd->pool = pci_pool_create("pch_dma_desc_pool", pdev,
+   pd->pool = dma_pool_create("pch_dma_desc_pool", &pdev->dev,
   sizeof(struct pch_dma_desc), 4, 0);
if (!pd->pool) {
dev_err(&pdev->dev, "Failed to alloc DMA descriptors\n");
@@ -931,7 +931,7 @@ static int pch_dma_probe(struct pci_dev *pdev,
return 0;
 
 err_free_pool:
-   pci_pool_destroy(pd->pool);
+   dma_pool_destroy(pd->pool);
 err_free_irq:
free_irq(pdev->irq, pd);
 err_iounmap:
@@ -963,7 +963,7 @@ static void pch_dma_remove(struct pci_dev *pdev)
tasklet_kill(&pd_chan->tasklet);
}
 
-   pci_pool_destroy(pd->pool);
+   dma_pool_destroy(pd->pool);
pci_iounmap(pdev, pd->membase);
pci_release_regions(pdev);
pci_disable_device(pdev);
-- 
2.9.3



Re: [PATCH v4] net: don't call strlen() on the user buffer in packet_bind_spkt()

2017-03-01 Thread Eric Dumazet
On Wed, 2017-03-01 at 12:57 +0100, Alexander Potapenko wrote:
> KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
> uninitialized memory in packet_bind_spkt():
...
> Signed-off-by: Alexander Potapenko 
> ---
> Changes since v3:
>  - addressed comments by Eric Dumazet (avoid using constants,
>use memcpy() instead of strncpy())
> ---

Acked-by: Eric Dumazet 





[PATCH 1/2] batman-adv: Fix double free during fragment merge error

2017-03-01 Thread Simon Wunderlich
From: Sven Eckelmann 

The function batadv_frag_skb_buffer was supposed not to consume the skbuff
on errors. This was followed in the helper function
batadv_frag_insert_packet when the skb would potentially be inserted in the
fragment queue. But it could happen that the next helper function
batadv_frag_merge_packets would try to merge the fragments and fail. This
results in a kfree_skb of all the enqueued fragments (including the just
inserted one). batadv_recv_frag_packet would detect the error in
batadv_frag_skb_buffer and try to free the skb again.

The behavior of batadv_frag_skb_buffer (and its helper
batadv_frag_insert_packet) must therefore be changed to always consume the
skbuff to have a common behavior and avoid the double kfree_skb.
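
The ownership problem can be sketched in userspace as follows. Everything here
is a hypothetical stand-in (struct buf, buf_free(), the buffer_*() and
receive_*() helpers); frees are only counted, not performed, so the double
free becomes observable. The fixed variant loosely mirrors the approach
described above: the callee always consumes the buffer and clears the caller's
pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer that counts how often it was "freed". */
struct buf { int frees; };

static void buf_free(struct buf *b)
{
	b->frees++;
}

/* Old contract: claims not to consume on error, but the merge path does. */
static bool buffer_old(struct buf *b, bool merge_fails)
{
	if (merge_fails) {
		buf_free(b);	/* merge failure drops all queued fragments */
		return false;
	}
	return true;
}

static int receive_old(struct buf *b, bool merge_fails)
{
	if (!buffer_old(b, merge_fails))
		buf_free(b);	/* caller frees again: double free */
	return b->frees;
}

/* Fixed contract: the callee always consumes and clears the pointer. */
static bool buffer_fixed(struct buf **b, bool merge_fails)
{
	if (merge_fails) {
		buf_free(*b);
		*b = NULL;	/* nothing left for the caller to free */
		return false;
	}
	return true;
}

static int receive_fixed(struct buf *b, bool merge_fails)
{
	struct buf *owned = b;

	/* Freeing a NULL pointer is treated as a no-op here. */
	if (!buffer_fixed(&owned, merge_fails) && owned)
		buf_free(owned);
	return b->frees;
}
```

On a failing merge, receive_old() reports two frees of the same buffer while
receive_fixed() reports one.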

Fixes: 610bfc6bc99b ("batman-adv: Receive fragmented packets and merge")
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/fragmentation.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index 0854ebd8613e..31e97e9aee0d 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -239,8 +239,10 @@ static bool batadv_frag_insert_packet(struct 
batadv_orig_node *orig_node,
spin_unlock_bh(&chain->lock);
 
 err:
-   if (!ret)
+   if (!ret) {
kfree(frag_entry_new);
+   kfree_skb(skb);
+   }
 
return ret;
 }
@@ -313,7 +315,7 @@ batadv_frag_merge_packets(struct hlist_head *chain)
  *
  * There are three possible outcomes: 1) Packet is merged: Return true and
  * set *skb to merged packet; 2) Packet is buffered: Return true and set *skb
- * to NULL; 3) Error: Return false and leave skb as is.
+ * to NULL; 3) Error: Return false and free skb.
  *
  * Return: true when packet is merged or buffered, false when skb is not not
  * used.
@@ -338,9 +340,9 @@ bool batadv_frag_skb_buffer(struct sk_buff **skb,
goto out_err;
 
 out:
-   *skb = skb_out;
ret = true;
 out_err:
+   *skb = skb_out;
return ret;
 }
 
-- 
2.11.0



[PATCH 0/2] pull request for net: batman-adv 2017-03-01

2017-03-01 Thread Simon Wunderlich
Hi David,

here are two bugfixes which we would like to see integrated into net.

Please pull or let me know of any problem!

Thank you,
  Simon

The following changes since commit 4ea33ef0f9e95b69db9131d7afd98563713e81b0:

  batman-adv: Decrease hardif refcnt on fragmentation send error (2017-01-04 
08:22:04 +0100)

are available in the git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batadv-net-for-davem-20170301

for you to fetch changes up to 51c6b429c0c95e67edd1cb0b548c5cf6a6604763:

  batman-adv: Fix transmission of final, 16th fragment (2017-02-21 18:33:35 
+0100)


Here are two batman-adv bugfixes:

 - fix a potential double free when fragment merges fail,
   by Sven Eckelmann

 - fix failing transmission of the 16th (last) fragment if that exists,
   by Linus Lüssing


Linus Lüssing (1):
  batman-adv: Fix transmission of final, 16th fragment

Sven Eckelmann (1):
  batman-adv: Fix double free during fragment merge error

 net/batman-adv/fragmentation.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)


[PATCH 2/2] batman-adv: Fix transmission of final, 16th fragment

2017-03-01 Thread Simon Wunderlich
From: Linus Lüssing 

Trying to split and transmit a unicast packet in 16 parts will fail for
the final fragment: After having sent the 15th one with a frag_packet.no
index of 14, we will increase the index to 15 - and return with an
error code immediately, even though one more fragment is due for
transmission and allowed.

Fixing this issue by moving the check before incrementing the index.

While at it, adding an unlikely(), because the check is actually more of
an assertion.
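
The off-by-one can be reproduced with a minimal userspace sketch.
MAX_FRAGMENTS, send_old() and send_fixed() are hypothetical names and not the
batman-adv code; each function returns how many fragments would be sent, or -1
if the limit check trips. The loop handles every fragment except the last one,
which is sent after the loop with the current index.

```c
#define MAX_FRAGMENTS 16	/* stand-in for BATADV_FRAG_MAX_FRAGMENTS */

/* Old order: send, increment, then check the index. */
static int send_old(int total_fragments)
{
	int no = 0;
	int remaining = total_fragments;

	while (remaining > 1) {		/* more than the final fragment left */
		remaining--;		/* "send" fragment number no */
		no++;
		if (no == MAX_FRAGMENTS - 1)
			return -1;	/* trips although one more is allowed */
	}
	return no + 1;			/* final fragment carries index no */
}

/* Fixed order: check the index before creating the next fragment. */
static int send_fixed(int total_fragments)
{
	int no = 0;
	int remaining = total_fragments;

	while (remaining > 1) {
		if (no == MAX_FRAGMENTS - 1)
			return -1;	/* only trips for a 17th fragment */
		remaining--;
		no++;
	}
	return no + 1;
}
```

With 16 fragments, send_old() fails after the 15th while send_fixed() sends
all 16; both still reject 17.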

Fixes: ee75ed88879a ("batman-adv: Fragment and send skbs larger than mtu")
Signed-off-by: Linus Lüssing 
Signed-off-by: Sven Eckelmann 
Signed-off-by: Simon Wunderlich 
---
 net/batman-adv/fragmentation.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index 31e97e9aee0d..11149e5be4e0 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -501,6 +501,12 @@ int batadv_frag_send_packet(struct sk_buff *skb,
 
/* Eat and send fragments from the tail of skb */
while (skb->len > max_fragment_size) {
+   /* The initial check in this function should cover this case */
+   if (unlikely(frag_header.no == BATADV_FRAG_MAX_FRAGMENTS - 1)) {
+   ret = -EINVAL;
+   goto put_primary_if;
+   }
+
skb_fragment = batadv_frag_create(skb, &frag_header, mtu);
if (!skb_fragment) {
ret = -ENOMEM;
@@ -517,12 +523,6 @@ int batadv_frag_send_packet(struct sk_buff *skb,
}
 
frag_header.no++;
-
-   /* The initial check in this function should cover this case */
-   if (frag_header.no == BATADV_FRAG_MAX_FRAGMENTS - 1) {
-   ret = -EINVAL;
-   goto put_primary_if;
-   }
}
 
/* Make room for the fragment header. */
-- 
2.11.0



Re: net/dccp: dccp_create_openreq_child freed held lock

2017-03-01 Thread Arnaldo Carvalho de Melo
Em Wed, Mar 01, 2017 at 10:38:54AM +0100, Dmitry Vyukov escreveu:
> Hello,
> 
> I've got the following report while running syzkaller fuzzer on
> 86292b33d4b79ee03e2f43ea0381ef85f077c760:
> 
> 
> It seems that dccp_create_openreq_child needs to unlock the sock if
> dccp_feat_activate_values fails.

Yeah, can you please use the patch below, that mimics the error paths in
sk_clone_new(), from where I think even the comment about it being a raw
copy came, but the bh_unlock_sock() didn't?

- Arnaldo

diff --git a/net/dccp/minisocks.c b/net/dccp/minisocks.c
index 53eddf99e4f6..d20d948a98ed 100644
--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(const struct sock 
*sk,
/* It is still raw copy of parent, so invalidate
 * destructor and make plain sk_free() */
newsk->sk_destruct = NULL;
+   bh_unlock_sock(newsk);
sk_free(newsk);
return NULL;
}

