date:20170708

[PATCH] net-next: Fix minor code bug in timestamping.txt

2017-07-08 Thread Ahmad Fatoum

Passing (void*)val instead of &val would make a pointer out of an integer
and cause sock_setsockopt to -EFAULT.

See tools/testing/selftests/networking/timestamping/timestamping.c
for a working example.

Cc: David S. Miller 
Cc: netdev@vger.kernel.org
Signed-off-by: Ahmad Fatoum 
---
 Documentation/networking/timestamping.txt | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 196ba17cc344..1be0b6f9e0cb 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -44,8 +44,7 @@ timeval of SO_TIMESTAMP (ms).
 Supports multiple types of timestamp requests. As a result, this
 socket option takes a bitmap of flags, not a boolean. In
 
-  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
-   sizeof(val));
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
 
 val is an integer with any of the following bits set. Setting other
 bit returns EINVAL and does not change the current state.
@@ -249,8 +248,7 @@ setsockopt to receive timestamps:
 
   __u32 val = SOF_TIMESTAMPING_SOFTWARE |
  SOF_TIMESTAMPING_OPT_ID /* or any other flag */;
-  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
-   sizeof(val));
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
 
 
 1.4 Bytestream Timestamps
-- 
2.13.2
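
For reference, a minimal user-space sketch of the corrected call. The helper name enable_sw_timestamping() is hypothetical and does not appear in timestamping.txt; the flags are assumed to come from <linux/net_tstamp.h>, and the choice of flags is only an example.

```c
#include <errno.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Hypothetical helper: request software timestamps on fd.
 * Returns 0 on success or a negative errno value. */
int enable_sw_timestamping(int fd)
{
	unsigned int val = SOF_TIMESTAMPING_SOFTWARE |
			   SOF_TIMESTAMPING_RX_SOFTWARE;

	/* Pass the address of val. Casting val itself to a pointer would
	 * hand the kernel a bogus user address, which fails with EFAULT. */
	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val)) < 0)
		return -errno;
	return 0;
}
```

A caller would typically invoke this once after socket() and before the first recvmsg() that expects timestamp control messages.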

Re: [RFC PATCH 00/12] Implement XDP bpf_redirect vairants

2017-07-08 Thread Jesper Dangaard Brouer

On Sat, 08 Jul 2017 10:46:18 +0100 (WEST)
David Miller  wrote:

> From: John Fastabend 
> Date: Fri, 07 Jul 2017 10:48:36 -0700
> 
> > On 07/07/2017 10:34 AM, John Fastabend wrote:  
> >> This series adds two new XDP helper routines bpf_redirect() and
> >> bpf_redirect_map(). The first variant bpf_redirect() is meant
> >> to be used the same way it is currently being used by the cls_bpf
> >> classifier. An xdp packet will be redirected immediately when this
> >> is called.  
> > 
> > Also other than the typo in the title there ;) I'm going to CC
> > the driver maintainers working on XDP (makes for a long CC list but)
> > because we would want to try and get support in as many as possible in
> > the next merge window.
> > 
> > For this rev I just implemented on ixgbe because I wrote the
> > original XDP support there. I'll volunteer to do virtio as well.  
> 
> I went over this series a few times and it looks great to me.
> You didn't even give me some coding style issues to pick on :-)

We (Daniel, Andy and I) have been reviewing and improving on this
patchset the last couple of weeks ;-).  We had some stability issues,
which is why it wasn't published earlier. My plan is to test this
latest patchset again, Monday and Tuesday. I'll try to assess stability
and provide some performance numbers.

I've complained/warned about the danger of redirecting with XDP
without providing (1) a way to debug/see XDP redirects and (2) a way
for interfaces to opt in to being redirect targets. (1) is solved by
patch-07/12 via a tracepoint. (2) is currently done by only forwarding
to interfaces that have an XDP program loaded themselves; this also
comes from a practical need for NIC drivers to allocate XDP-TX HW queues.

I'm not satisfied with the (UAPI) name for the new map
"BPF_MAP_TYPE_DEVMAP" and the filename this is placed in
"kernel/bpf/devmap.c", as we want to take advantage of compiler
inlining for the next redirect map types.  (1) because the name doesn't
tell the user that this map is connected to the redirect_map call.
(2) we want to introduce other kinds of redirect maps (like redirect to
CPUs and sockets), and it would be good if they shared a common "text"
string.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Re: [RFC PATCH 08/12] bpf: add devmap, a map for storing net device references

2017-07-08 Thread Jesper Dangaard Brouer

On Fri, 07 Jul 2017 10:37:12 -0700
John Fastabend  wrote:

> Device map (devmap) is a BPF map, primarily useful for networking
> applications, that uses a key to lookup a reference to a netdevice.
> 
> The map provides a clean way for BPF programs to build virtual port
> to physical port maps. Additionally, it provides a scoping function
> for the redirect action itself allowing multiple optimizations. Future
> patches will leverage the map to provide batching at the XDP layer.
> 
> Another optimization/feature, that is not yet implemented, would be
> to support multiple netdevices per key to support efficient multicast
> and broadcast support.

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 74ea96e..06073ba 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -1294,6 +1294,14 @@ static int check_map_func_compatibility(struct bpf_map 
> *map, int func_id)
>   func_id != BPF_FUNC_current_task_under_cgroup)
>   goto error;
>   break;
> + /* devmap returns a pointer to a live net_device ifindex that we cannot
> +  * allow to be modified from bpf side. So do not allow lookup elemnts
  ^^^
Spelling of elements

> +  * for now.
> +  */
> + case BPF_MAP_TYPE_DEVMAP:
> + if (func_id == BPF_FUNC_map_lookup_elem)
> + goto error;
> + break;

Reviewers: note this limitation from the bpf_prog side.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
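
To make the idea in the quoted description concrete, here is a small user-space model of a port-to-device map. It is only a sketch under stated assumptions: the toy_devmap type and both functions are hypothetical names invented for illustration and are not the kernel's devmap implementation or any BPF helper.

```c
#include <stddef.h>

#define TOY_DEVMAP_SIZE 4

/* Toy stand-in for a devmap: slot index = virtual port,
 * slot value = ifindex of the target device (0 = empty slot). */
struct toy_devmap {
	int ifindex[TOY_DEVMAP_SIZE];
};

/* Control-plane update, loosely analogous to a map update from user space. */
int toy_devmap_update(struct toy_devmap *map, size_t port, int ifindex)
{
	if (port >= TOY_DEVMAP_SIZE || ifindex <= 0)
		return -1;
	map->ifindex[port] = ifindex;
	return 0;
}

/* Data-path redirect: the caller only names a port; resolving it to a
 * device stays inside the map code. Returns the ifindex the packet would
 * be sent out on, or -1 if the slot is empty (drop). */
int toy_redirect_map(const struct toy_devmap *map, size_t port)
{
	if (port >= TOY_DEVMAP_SIZE || map->ifindex[port] == 0)
		return -1;
	return map->ifindex[port];
}
```

Keeping the resolution step inside the map code is what gives the redirect a scope in which later optimizations such as batching could be applied.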

Re: [PATCH net] ip[6]: don't register inet[6]dev when dev is down

2017-07-08 Thread Cong Wang

On Fri, Jul 7, 2017 at 5:39 AM, Nicolas Dichtel
 wrote:
> Le 06/07/2017 à 20:16, Cong Wang a écrit :
>> On Thu, Jul 6, 2017 at 5:08 AM, Nicolas Dichtel
>>  wrote:
>>> Le 06/07/2017 à 00:43, Cong Wang a écrit :
>>>> On Wed, Jul 5, 2017 at 8:57 AM, Nicolas Dichtel
>>>>  wrote:
>>>>> When a device changes from one netns to another, it's first unregistered,
>>>>> then the netns reference is updated and the dev is registered in the new
>>>>> netns. Thus, when a slave moves to another netns, it is first
>>>>> unregistered. This triggers a NETDEV_UNREGISTER event which is caught by
>>>>> the bonding driver. The driver calls bond_release(), which calls
>>>>> dev_set_mtu() and thus triggers NETDEV_CHANGEMTU (the device is still in
>>>>> the old netns).
>>>>
>>>> I think in this special case it is meaningless to send
>>>> NETDEV_CHANGEMTU, because the device is dying within
>>>> its old netns, who still cares about its mtu change?
>>>>
>>>> Something like the attached patch...
>>> Yes, your patch seems good and I hesitated with something like this.
>>> But I don't see a valid case where the inet[6]dev must be created on a down
>>> interface. I think the patch is valid, even with your patch.
>>
>> Your patch is more risky because it affects the normal CHANGEMTU path,
>> I am not sure if it is correct not to add idev when it is down either.
> Why would it be needed to add this idev on a down interface?
> If idev wasn't there I don't see why changing the mtu would justify to create
> this idev.
>


There must be a reason to check idev->if_flags & IF_READY instead
of IFF_UP.


>>
>> This is a very unusual path, we don't have to take the risk.
> I still think that this approach is better for two reasons:
>  - we don't know if another path like this exists (need an audit) and it would
> be easy to add one again by side effect in the future;

Perhaps we need to add a warning for these events triggered after
UNREGISTER, except UNREGISTER_FINAL, in case of trouble.
But again, the CHANGEMTU case is so special because of the
idev.


>  - the patch is easy to backport in older kernel.
>

Easy to backport doesn't mean easy to verify. ;) As David said, this
code is a mess, especially for the keep_addr_on_down logic.

Re: [Patch] mqueue: fix the retry logic for netlink_attachskb()

2017-07-08 Thread Linus Torvalds

On Sat, Jul 8, 2017 at 11:04 AM, Cong Wang  wrote:
>>
>> Can you confirm that? I don't know where the original report is.
>
> Yes of course, setting 'sock' to NULL before 'goto retry' is sufficient
> to fix it, that is in fact my initial thought. And I realized retry'ing
> fdget() can't help anything in this situation but increases the
> attack vector, so I decided to get rid of it from the retry loop
> instead of just NULL'ing 'sock'.
>
> Or do you prefer the simpler fix? Or should I just resend it with
> an improved changelog?

It was just the combination of that nasty code, your patch, and the
explanation that confused me.

Reading the patch, I actually thought that one of the things you fixed
was moving the "fdput()" later, to after the netlink_attachskb().

And I thought you did that because netlink_attachskb() would need the
file to be still around keeping a reference count to the socket, and
without it the socket could have been dropped in the meantime.

But reading the code more closely, I notice that
netlink_getsockbyfilp() gets a reference to the sock, and that
netlink_attachskb() will drop that reference on error or retry.

So the fdput() makes sense after netlink_getsockbyfilp(), but that's
also why the retry code currently includes repeating the fdget()...
And the error handling for the fdget is what then triggers the real
bug.

So the reason you moved the fdput() later wasn't to protect the socket
reference, it was just because of how the whole retry loop needs to
have the file descriptor just to get a new reference to the socket.

That's why I thought you fixed a bug even in the first iteration, but
it turns out that was just me making assumptions based on mis-reading
the patch without looking at all the context and the logic of the
called functions.

Now that I have checked deeper, I realize that your patch description
was actually correct about this only being a retry problem - the first
time around the reference count ends up moving correctly from file to
socket, but then when it repeats and 'sock' may contain a stale
pointer, we may end up doing the wrong thing when the fdget fails.

Honestly, now I feel like either patch is fine, and your original
commit message is fine too - but I just hate that code.

And making it use some nice helper function to clean it up looks
painful too, because the error handling is so odd (ie
mq_netlink_attachskb() will free the skb on error, while the other
error cases won't, so you'd have to have some special handling for the
different errors that can happen).

Honestly, this code is nasty, and right now my feeling is that it
would be good to have a minimal patch that also backports cleanly.
Maybe somebody can clean it up later, but that's a separate windmill
to rail against.

And due to the recent compat cleanups by Al, your bigger patch does
not apply cleanly to current git - but the smaller patch that just
sets 'sock' to NULL before that 'goto retry' should apply cleanly
to all versions of this code.

So purely because of that reason, I think I'd prefer to see that
smaller patch instead. Would you mind re-sending the thing?

Sorry about the whole confusion.

   Linus
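
The stale-pointer problem described in this message can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the mqueue code: toy_sock, toy_notify() and the boolean knobs are hypothetical, and the plain integer counter only stands in for the socket reference count.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy socket with a bare reference counter (not the kernel's struct sock). */
struct toy_sock {
	int refs;
};

/* Models the retry loop: the first attach attempt always asks for a retry.
 * fd_closed_in_window makes the descriptor lookup fail on the second pass.
 * clear_before_retry applies the minimal fix (sock = NULL before retrying). */
int toy_notify(struct toy_sock *target, bool fd_closed_in_window,
	       bool clear_before_retry)
{
	struct toy_sock *sock = NULL;
	bool fd_open = true;
	bool first_pass = true;

retry:
	if (!fd_open) {			/* models fdget() failing */
		if (sock)
			sock->refs--;	/* error path drops a reference */
		return -EBADF;
	}
	sock = target;
	sock->refs++;			/* lookup takes a reference */

	if (first_pass) {		/* attach asks for a retry... */
		first_pass = false;
		sock->refs--;		/* ...after dropping the reference */
		fd_open = !fd_closed_in_window;
		if (clear_before_retry)
			sock = NULL;	/* the minimal fix */
		goto retry;
	}
	return 0;			/* attached; the reference is kept */
}
```

Starting from one reference held elsewhere, the unfixed variant ends at zero when the descriptor disappears during the retry window (one drop too many), while the fixed variant leaves the count at one.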

Re: [PATCH 00/17] v3 net generic subsystem refcount conversions

2017-07-08 Thread Levin, Alexander (Sasha Levin)

On Mon, Jul 03, 2017 at 02:28:56AM -0700, Eric Dumazet wrote:
>On Fri, 2017-06-30 at 13:07 +0300, Elena Reshetova wrote:
>> Changes in v3:
>> Rebased on top of the net-next tree.
>>
>> Changes in v2:
>> No changes in patches apart from rebases, but now by
>> default refcount_t = atomic_t (*) and uses all atomic standard operations
>> unless CONFIG_REFCOUNT_FULL is enabled. This is a compromise for the
>> systems that are critical on performance (such as net) and cannot accept even
>> slight delay on the refcounter operations.
>>
>> This series, for core network subsystem components, replaces atomic_t 
>> reference
>> counters with the new refcount_t type and API (see include/linux/refcount.h).
>> By doing this we prevent intentional or accidental
>> underflows or overflows that can lead to use-after-free vulnerabilities.
>> These patches contain only generic net pieces. Other changes will be sent 
>> separately.
>>
>> The patches are fully independent and can be cherry-picked separately.
>> The big patches, such as conversions for sock structure, need a very detailed
>> look from maintainers: refcount managing is quite complex in them and while
>> it seems that they would benefit from the change, extra checking is needed.
>> The biggest corner issue is the fact that refcount_inc() does not increment
>> from zero.
>>
>> If there are no objections to the patches, please merge them via respective 
>> trees.
>>
>> * The respective change is currently merged into -next as
>>   "locking/refcount: Create unchecked atomic_t implementation".
>>
>> Elena Reshetova (17):
>>   net: convert inet_peer.refcnt from atomic_t to refcount_t
>>   net: convert neighbour.refcnt from atomic_t to refcount_t
>>   net: convert neigh_params.refcnt from atomic_t to refcount_t
>>   net: convert nf_bridge_info.use from atomic_t to refcount_t
>>   net: convert sk_buff.users from atomic_t to refcount_t
>>   net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
>>   net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
>>   net: convert sock.sk_refcnt from atomic_t to refcount_t
>>   net: convert ip_mc_list.refcnt from atomic_t to refcount_t
>>   net: convert in_device.refcnt from atomic_t to refcount_t
>>   net: convert netpoll_info.refcnt from atomic_t to refcount_t
>>   net: convert unix_address.refcnt from atomic_t to refcount_t
>>   net: convert fib_rule.refcnt from atomic_t to refcount_t
>>   net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
>>   net: convert net.passive from atomic_t to refcount_t
>>   net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
>>   net: convert packet_fanout.sk_ref from atomic_t to refcount_t
>
>
>Can you take a look at this please ?
>
>[   64.601749] ------------[ cut here ]------------
>[   64.601757] WARNING: CPU: 0 PID: 6476 at lib/refcount.c:184 
>refcount_sub_and_test+0x75/0xa0
>[   64.601758] Modules linked in: w1_therm wire cdc_acm ehci_pci ehci_hcd 
>mlx4_en ib_uverbs mlx4_ib ib_core mlx4_core
>[   64.601769] CPU: 0 PID: 6476 Comm: ip Tainted: GW   
>4.12.0-smp-DEV #274
>[   64.601770] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.40.0 06/22/2016
>[   64.601771] task: 8837bf482040 task.stack: 8837bdc08000
>[   64.601773] RIP: 0010:refcount_sub_and_test+0x75/0xa0
>[   64.601774] RSP: 0018:8837bdc0f5c0 EFLAGS: 00010286
>[   64.601776] RAX: 0026 RBX: 0001 RCX: 
>
>[   64.601777] RDX: 0026 RSI: 0096 RDI: 
>ed06f7b81eae
>[   64.601778] RBP: 8837bdc0f5d0 R08: 0004 R09: 
>fbfff4a54c25
>[   64.601779] R10: cbc500e5 R11: a52a6128 R12: 
>881febcf6f24
>[   64.601779] R13: 881fbf4eaf00 R14: 881febcf6f80 R15: 
>8837d7a4ed00
>[   64.601781] FS:  7ff5a2f6b700() GS:881fff80() 
>knlGS:
>[   64.601782] CS:  0010 DS:  ES:  CR0: 80050033
>[   64.601783] CR2: 7ffcdc70d000 CR3: 001f9c91e000 CR4: 
>001406f0
>[   64.601783] Call Trace:
>[   64.601786]  refcount_dec_and_test+0x11/0x20
>[   64.601790]  fib_nl_delrule+0xc39/0x1630
[snip]

I'm seeing a similar one coming from sctp:

refcount_t: underflow; use-after-free.
------------[ cut here ]------------
WARNING: CPU: 3 PID: 15570 at lib/refcount.c:186 
refcount_sub_and_test.cold.13+0x18/0x21 lib/refcount.c:186
Kernel panic - not syncing: panic_on_warn set ...

CPU: 3 PID: 15570 Comm: syz-executor0 Not tainted 4.12.0-next-20170706+ #186
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 
04/01/2014
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x11d/0x1ef lib/dump_stack.c:52
 panic+0x1bc/0x3ad kernel/panic.c:180
 __warn.cold.6+0x2f/0x2f kernel/panic.c:541
 report_bug+0x20d/0x2d0 lib/bug.c:183
 fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:190
 do_trap_no_signal arch/x86/kernel/traps.c:224 [inline]
 do_trap+0x132/0x390 arch/x86/kernel/traps.c:273

Re: [PATCH net] ip[6]: don't register inet[6]dev when dev is down

2017-07-08 Thread Cong Wang

On Sat, Jul 8, 2017 at 3:02 AM, David Miller  wrote:
> From: Nicolas Dichtel 
> Date: Wed,  5 Jul 2017 17:57:25 +0200
>
>> From: Hongjun Li 
>>
>> When the netdev event NETDEV_CHANGEMTU is triggered, the inet[6]dev may be
>> created even if the corresponding device is down. This may lead to a leak
>> in the procfs when the device is unregistered, and finally trigger a
>> backtrace:
>  ...
>> When a device changes from one netns to another, it's first unregistered,
>> then the netns reference is updated and the dev is registered in the new
>> netns. Thus, when a slave moves to another netns, it is first
>> unregistered. This triggers a NETDEV_UNREGISTER event which is caught by
>> the bonding driver. The driver calls bond_release(), which calls
>> dev_set_mtu() and thus triggers NETDEV_CHANGEMTU (the device is still in
>> the old netns).
>>
>> Signed-off-by: Hongjun Li 
>> Signed-off-by: Nicolas Dichtel 
>
> I'm still not convinced about this.
>
> We have lots of code which iterates ipv6 idevs, and then has a
> check for IFF_UP.
>
> So having an idev attached to a down interface is not a bug nor
> illegal.
>
> In fact, addrconf_cleanup() walks all of the init_net idevs and
> calls addrconf_ifdown() with how=1 regardless of IFF_UP or not.
>
> This entire area is quite a mess.

+1. I fixed a nasty bug with how=1 for loopback before...

>
> Can you show exactly why the procfs state isn't cleaned up for
> these devices moving between namespaces?  Maybe that is the real
> bug and a better place to fix this.
>

It is because the ipv6_add_dev() adds these proc files back after
NETDEV_UNREGISTER event.

[RFC] get_compat_bpf_fprog(): don't copyin field-by-field

2017-07-08 Thread Al Viro

Signed-off-by: Al Viro 
---
diff --git a/net/compat.c b/net/compat.c
index dba5e222a0e5..6ded6c821d7a 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -313,15 +313,15 @@ struct sock_fprog __user *get_compat_bpf_fprog(char __user *optval)
 {
    struct compat_sock_fprog __user *fprog32 = (struct compat_sock_fprog __user *)optval;
    struct sock_fprog __user *kfprog = compat_alloc_user_space(sizeof(struct sock_fprog));
-   compat_uptr_t ptr;
-   u16 len;
-
-   if (!access_ok(VERIFY_READ, fprog32, sizeof(*fprog32)) ||
-   !access_ok(VERIFY_WRITE, kfprog, sizeof(struct sock_fprog)) ||
-   __get_user(len, &fprog32->len) ||
-   __get_user(ptr, &fprog32->filter) ||
-   __put_user(len, &kfprog->len) ||
-   __put_user(compat_ptr(ptr), &kfprog->filter))
+   struct compat_sock_fprog f32;
+   struct sock_fprog f;
+
+   if (copy_from_user(&f32, fprog32, sizeof(*fprog32)))
+   return NULL;
+   memset(&f, 0, sizeof(f));
+   f.len = f32.len;
+   f.filter = compat_ptr(f32.filter);
+   if (copy_to_user(kfprog, &f, sizeof(struct sock_fprog)))
return NULL;
 
return kfprog;
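
The pattern in this patch (copy the whole source structure once, then convert from the local copy) can be sketched in user space. Everything here is an assumption made for illustration: the toy_ structures only loosely mirror the compat and native layouts, and toy_copy_in() is a hypothetical stand-in for copy_from_user(), not a real kernel or libc call.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 32-bit layout, loosely modelled on compat_sock_fprog. */
struct toy_fprog32 {
	uint16_t len;
	uint32_t filter;	/* 32-bit user pointer value */
};

/* Hypothetical native layout. */
struct toy_fprog {
	uint16_t len;
	void *filter;
};

/* Stand-in for copy_from_user(): returns 0 on success, nonzero on "fault". */
int toy_copy_in(void *dst, const void *src, size_t n)
{
	if (!src)
		return 1;
	memcpy(dst, src, n);
	return 0;
}

/* Snapshot the whole source struct once, then convert from the local copy. */
int toy_get_fprog(struct toy_fprog *out, const struct toy_fprog32 *src)
{
	struct toy_fprog32 f32;

	if (toy_copy_in(&f32, src, sizeof(f32)))
		return -1;
	memset(out, 0, sizeof(*out));	/* also clears padding */
	out->len = f32.len;
	out->filter = (void *)(uintptr_t)f32.filter;
	return 0;
}
```

Working from one snapshot means every field is read from the source exactly once, and there is a single failure point instead of one per field.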

[RFC] copy_msghdr_from_user(): get rid of field-by-field copyin

2017-07-08 Thread Al Viro


Signed-off-by: Al Viro 
---
diff --git a/net/socket.c b/net/socket.c
index c2564eb25c6b..af33d929135a 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1870,22 +1870,18 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
 struct sockaddr __user **save_addr,
 struct iovec **iov)
 {
-   struct sockaddr __user *uaddr;
-   struct iovec __user *uiov;
-   size_t nr_segs;
+   struct user_msghdr msg;
ssize_t err;
 
-   if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
-   __get_user(uaddr, &umsg->msg_name) ||
-   __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
-   __get_user(uiov, &umsg->msg_iov) ||
-   __get_user(nr_segs, &umsg->msg_iovlen) ||
-   __get_user(kmsg->msg_control, &umsg->msg_control) ||
-   __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
-   __get_user(kmsg->msg_flags, &umsg->msg_flags))
+   if (copy_from_user(&msg, umsg, sizeof(*umsg)))
return -EFAULT;
 
-   if (!uaddr)
+   kmsg->msg_control = msg.msg_control;
+   kmsg->msg_controllen = msg.msg_controllen;
+   kmsg->msg_flags = msg.msg_flags;
+
+   kmsg->msg_namelen = msg.msg_namelen;
+   if (!msg.msg_name)
kmsg->msg_namelen = 0;
 
if (kmsg->msg_namelen < 0)
@@ -1895,11 +1891,11 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
kmsg->msg_namelen = sizeof(struct sockaddr_storage);
 
if (save_addr)
-   *save_addr = uaddr;
+   *save_addr = msg.msg_name;
 
-   if (uaddr && kmsg->msg_namelen) {
+   if (msg.msg_name && kmsg->msg_namelen) {
if (!save_addr) {
-   err = move_addr_to_kernel(uaddr, kmsg->msg_namelen,
+   err = move_addr_to_kernel(msg.msg_name, kmsg->msg_namelen,
  kmsg->msg_name);
if (err < 0)
return err;
@@ -1909,12 +1905,13 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
kmsg->msg_namelen = 0;
}
 
-   if (nr_segs > UIO_MAXIOV)
+   if (msg.msg_iovlen > UIO_MAXIOV)
return -EMSGSIZE;
 
kmsg->msg_iocb = NULL;
 
-   return import_iovec(save_addr ? READ : WRITE, uiov, nr_segs,
+   return import_iovec(save_addr ? READ : WRITE,
+   msg.msg_iov, msg.msg_iovlen,
    UIO_FASTIOV, iov, &kmsg->msg_iter);
 }

[RFC] get_compat_msghdr(): get rid of field-by-field copyin

2017-07-08 Thread Al Viro

There are 3 commits in vfs.git#misc.compat I hadn't pushed to Linus yet;
they touch net/* and I'd like to see at least "no objections" from networking
folks before asking to pull that; all of those are about getting rid of
field-by-field copyin.  Please, review and comment.

Signed-off-by: Al Viro 
---
diff --git a/net/compat.c b/net/compat.c
index aba929e5250f..dba5e222a0e5 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -37,21 +37,16 @@ int get_compat_msghdr(struct msghdr *kmsg,
  struct sockaddr __user **save_addr,
  struct iovec **iov)
 {
-   compat_uptr_t uaddr, uiov, tmp3;
-   compat_size_t nr_segs;
+   struct compat_msghdr msg;
ssize_t err;
 
-   if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
-   __get_user(uaddr, &umsg->msg_name) ||
-   __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
-   __get_user(uiov, &umsg->msg_iov) ||
-   __get_user(nr_segs, &umsg->msg_iovlen) ||
-   __get_user(tmp3, &umsg->msg_control) ||
-   __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
-   __get_user(kmsg->msg_flags, &umsg->msg_flags))
+   if (copy_from_user(&msg, umsg, sizeof(*umsg)))
return -EFAULT;
 
-   if (!uaddr)
+   kmsg->msg_flags = msg.msg_flags;
+   kmsg->msg_namelen = msg.msg_namelen;
+
+   if (!msg.msg_name)
kmsg->msg_namelen = 0;
 
if (kmsg->msg_namelen < 0)
@@ -59,14 +54,16 @@ int get_compat_msghdr(struct msghdr *kmsg,
 
if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
kmsg->msg_namelen = sizeof(struct sockaddr_storage);
-   kmsg->msg_control = compat_ptr(tmp3);
+
+   kmsg->msg_control = compat_ptr(msg.msg_control);
+   kmsg->msg_controllen = msg.msg_controllen;
 
if (save_addr)
-   *save_addr = compat_ptr(uaddr);
+   *save_addr = compat_ptr(msg.msg_name);
 
-   if (uaddr && kmsg->msg_namelen) {
+   if (msg.msg_name && kmsg->msg_namelen) {
if (!save_addr) {
-   err = move_addr_to_kernel(compat_ptr(uaddr),
+   err = move_addr_to_kernel(compat_ptr(msg.msg_name),
  kmsg->msg_namelen,
  kmsg->msg_name);
if (err < 0)
@@ -77,13 +74,13 @@ int get_compat_msghdr(struct msghdr *kmsg,
kmsg->msg_namelen = 0;
}
 
-   if (nr_segs > UIO_MAXIOV)
+   if (msg.msg_iovlen > UIO_MAXIOV)
return -EMSGSIZE;
 
kmsg->msg_iocb = NULL;
 
return compat_import_iovec(save_addr ? READ : WRITE,
-  compat_ptr(uiov), nr_segs,
+  compat_ptr(msg.msg_iov), msg.msg_iovlen,
    UIO_FASTIOV, iov, &kmsg->msg_iter);
 }

Re: [Patch] mqueue: fix the retry logic for netlink_attachskb()

2017-07-08 Thread Cong Wang

On Fri, Jul 7, 2017 at 5:23 PM, Linus Torvalds
 wrote:
> On Fri, Jul 7, 2017 at 11:32 AM, Cong Wang  wrote:
>> so when we retry and the fd has been closed during this small
>> window, we end up calling netlink_detachskb() on the error path
>> which releases the sock again and could lead to a use-after-free.
>
> So this seems to be a real problem: "sock" is not NULL'ed out in that
>
> if (!f.file) {
>
> error case (or alternatively, in the retry case).  Plus, since we did
> the "fput()" early, "sock" may be gone by the time we do the
> netlink_attachskb() even when it's all successful.
>
> But I don't think this is really so much about the retrying - the
> "sock may be gone" case seems to be true even the first time around,
> and even if we never retry at all.
>
> Am I reading this correctly?


Yes you are correct.

>
> Basically, I think the patch is fine, but the explanation seems a bit
> misleading. This isn't really about the re-trying: that would be fine
> if we just cleaned up sock properly.
>
> Can you confirm that? I don't know where the original report is.

Yes of course, setting 'sock' to NULL before 'goto retry' is sufficient
to fix it, that is in fact my initial thought. And I realized retry'ing
fdget() can't help anything in this situation but increases the
attack vector, so I decided to get rid of it from the retry loop
instead of just NULL'ing 'sock'.

Or do you prefer the simpler fix? Or should I just resend it with
an improved changelog?

BTW, the original report is here:
https://groups.google.com/forum/#!topic/syzkaller/QsmbsGoYPzA


>
> And that code is ancient, so we should do a "cc: stable" there too,
> and backport it basically forever. I think most of the code in this
> area predates the git tree, although Al Viro actually touched some
> things around here very recently to make the compat case cleaner.
>

Yeah, sorry about forgetting it.

Thanks!

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-08 Thread Alan Stern

Pardon me for barging in, but I found this whole interchange extremely 
confusing...

On Sat, 8 Jul 2017, Ingo Molnar wrote:

> * Paul E. McKenney  wrote:
> 
> > On Sat, Jul 08, 2017 at 10:35:43AM +0200, Ingo Molnar wrote:
> > > 
> > > * Manfred Spraul  wrote:
> > > 
> > > > Hi Ingo,
> > > > 
> > > > On 07/07/2017 10:31 AM, Ingo Molnar wrote:
> > > > > 
> > > > > There's another, probably just as significant advantage: 
> > > > > queued_spin_unlock_wait()
> > > > > is 'read-only', while spin_lock()+spin_unlock() dirties the lock 
> > > > > cache line. On
> > > > > any bigger system this should make a very measurable difference - if
> > > > > spin_unlock_wait() is ever used in a performance critical code path.
> > > > At least for ipc/sem:
> > > > Dirtying the cacheline (in the slow path) allows to remove a smp_mb() 
> > > > in the
> > > > hot path.
> > > > So for sem_lock(), I either need a primitive that dirties the cacheline 
> > > > or
> > > > sem_lock() must continue to use spin_lock()/spin_unlock().

This statement doesn't seem to make sense.  Did Manfred mean to write 
"smp_mb()" instead of "spin_lock()/spin_unlock()"?

> > > Technically you could use spin_trylock()+spin_unlock() and avoid the lock 
> > > acquire 
> > > spinning on spin_unlock() and get very close to the slow path performance 
> > > of a 
> > > pure cacheline-dirtying behavior.

This is even more confusing.  Did Ingo mean to suggest using 
"spin_trylock()+spin_unlock()" in place of "spin_lock()+spin_unlock()" 
could provide the desired ordering guarantee without delaying other 
CPUs that may try to acquire the lock?  That seems highly questionable.

> > > But adding something like spin_barrier(), which purely dirties the lock 
> > > cacheline, 
> > > would be even faster, right?
> > 
> > Interestingly enough, the arm64 and powerpc implementations of
> > spin_unlock_wait() were very close to what it sounds like you are
> > describing.
> 
> So could we perhaps solve all our problems by defining the generic version 
> thusly:
> 
> void spin_unlock_wait(spinlock_t *lock)
> {
>   if (spin_trylock(lock))
>   spin_unlock(lock);
> }

How could this possibly be a generic version of spin_unlock_wait()?  
It does nothing at all (with no ordering properties) if some other CPU
currently holds the lock, whereas the real spin_unlock_wait() would
wait until the other CPU released the lock (or possibly longer).

And if no other CPU currently holds the lock, this has exactly the same
performance properties as spin_lock()+spin_unlock(), so what's the
advantage?

Alan Stern

> ... and perhaps rename it to spin_barrier() [or whatever proper name there 
> would 
> be]?
> 
> Architectures can still optimize it, to remove the small window where the 
> lock is 
> held locally - as long as the ordering is at least as strong as the generic 
> version.
> 
> This would have various advantages:
> 
>  - semantics are well-defined
> 
>  - the generic implementation is already pretty well optimized (no spinning)
> 
>  - it would make it usable for the IPC performance optimization
> 
>  - architectures could still optimize it to eliminate the window where the 
> lock is
>held locally - if there's such instructions available.
> 
> Was this proposed before, or am I missing something?
> 
> Thanks,
> 
>   Ingo
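
The objection raised in this message can be illustrated with a toy test-and-set lock in user space. This is a sketch under stated assumptions: toy_spinlock and its functions are hypothetical names built on C11 atomics, not the kernel's spinlock_t, and the memory orderings shown say nothing about what any kernel architecture guarantees.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy test-and-set lock; not the kernel's spinlock_t. */
struct toy_spinlock {
	atomic_flag locked;
};

bool toy_spin_trylock(struct toy_spinlock *lock)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set_explicit(&lock->locked,
						  memory_order_acquire);
}

void toy_spin_unlock(struct toy_spinlock *lock)
{
	atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}

/* The trylock+unlock pattern from the quoted proposal. Returns true if the
 * lock was taken and released, false if it was held and nothing happened. */
bool toy_trylock_unlock(struct toy_spinlock *lock)
{
	if (!toy_spin_trylock(lock))
		return false;	/* held elsewhere: no waiting at all */
	toy_spin_unlock(lock);
	return true;
}
```

When the lock is already held, toy_trylock_unlock() returns immediately without waiting for the holder, which is the difference from a wait-until-released primitive that the message points out.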

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-08 Thread Paul E. McKenney

On Sat, Jul 08, 2017 at 02:30:19PM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney  wrote:
> 
> > On Sat, Jul 08, 2017 at 10:35:43AM +0200, Ingo Molnar wrote:
> > > 
> > > * Manfred Spraul  wrote:
> > > 
> > > > Hi Ingo,
> > > > 
> > > > On 07/07/2017 10:31 AM, Ingo Molnar wrote:
> > > > > 
> > > > > There's another, probably just as significant advantage: 
> > > > > queued_spin_unlock_wait()
> > > > > is 'read-only', while spin_lock()+spin_unlock() dirties the lock 
> > > > > cache line. On
> > > > > any bigger system this should make a very measurable difference - if
> > > > > spin_unlock_wait() is ever used in a performance critical code path.
> > > > At least for ipc/sem:
> > > > Dirtying the cacheline (in the slow path) allows to remove a smp_mb() 
> > > > in the
> > > > hot path.
> > > > So for sem_lock(), I either need a primitive that dirties the cacheline 
> > > > or
> > > > sem_lock() must continue to use spin_lock()/spin_unlock().
> > > 
> > > Technically you could use spin_trylock()+spin_unlock() and avoid the lock 
> > > acquire 
> > > spinning on spin_unlock() and get very close to the slow path performance 
> > > of a 
> > > pure cacheline-dirtying behavior.
> > > 
> > > But adding something like spin_barrier(), which purely dirties the lock 
> > > cacheline, 
> > > would be even faster, right?
> > 
> > Interestingly enough, the arm64 and powerpc implementations of
> > spin_unlock_wait() were very close to what it sounds like you are
> > describing.
> 
> So could we perhaps solve all our problems by defining the generic version 
> thusly:
> 
> void spin_unlock_wait(spinlock_t *lock)
> {
>   if (spin_trylock(lock))
>   spin_unlock(lock);
> }
> 
> ... and perhaps rename it to spin_barrier() [or whatever proper name there 
> would 
> be]?

As lockdep, 0day Test Robot, Linus Torvalds, and several others let me
know in response to my original (thankfully RFC!) patch series, this needs
to disable irqs to work in the general case.  For example, if the lock
in question is an irq-disabling lock, you take an interrupt just after
a successful spin_trylock(), and that interrupt acquires the same lock,
the actuarial statistics of your kernel degrade sharply and suddenly.

What I get for sending out untested patches!  :-/

> Architectures can still optimize it, to remove the small window where the 
> lock is 
> held locally - as long as the ordering is at least as strong as the generic 
> version.
> 
> This would have various advantages:
> 
>  - semantics are well-defined
> 
>  - the generic implementation is already pretty well optimized (no spinning)
> 
>  - it would make it usable for the IPC performance optimization
> 
>  - architectures could still optimize it to eliminate the window where the 
> lock is
>held locally - if there's such instructions available.
> 
> Was this proposed before, or am I missing something?

It was sort of proposed...

https://marc.info/?l=linux-arch&m=149912878628355&w=2

But do we have a situation where normal usage of spin_lock() and
spin_unlock() is causing performance or scalability trouble?

(We do have at least one situation in fnic that appears to be buggy use of
spin_is_locked(), and proposing a patch for that case is on my todo list.)

Thanx, Paul
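
[Editorial sketch, not part of the original mail.] The interrupt hazard
described above can be illustrated with a toy single-CPU model in plain C.
Everything here is hypothetical and made up for illustration (struct toy_cpu
and all toy_* functions); it is not kernel code and only models the timing
argument: an interrupt that lands between a successful trylock and the
unlock finds the lock held by the very context it interrupted. In the kernel
the usual remedy would presumably be one of the irq-disabling lock variants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy single-CPU model; every name here is hypothetical. */
struct toy_cpu {
    bool lock_held;     /* the one lock in the model */
    bool irqs_enabled;  /* are interrupts currently allowed? */
    bool irq_pending;   /* an interrupt arrived while irqs were off */
    bool deadlocked;    /* handler found the lock held by its own CPU */
};

/* Interrupt handler that wants the same lock. */
void toy_irq_handler(struct toy_cpu *c)
{
    assert(c != NULL);
    if (c->lock_held) {
        /* A real spinning acquire would never return here: the holder
         * is the interrupted context, which cannot run until we do. */
        c->deadlocked = true;
        return;
    }
    c->lock_held = true;    /* take the lock ... */
    c->lock_held = false;   /* ... and release it again */
}

/* An interrupt fires: run it now, or defer it if irqs are off. */
void toy_raise_irq(struct toy_cpu *c)
{
    if (c->irqs_enabled)
        toy_irq_handler(c);
    else
        c->irq_pending = true;
}

/* trylock + unlock, with an interrupt arriving at the worst moment. */
void toy_barrier(struct toy_cpu *c, bool disable_irqs)
{
    bool saved = c->irqs_enabled;

    if (disable_irqs)
        c->irqs_enabled = false;

    if (!c->lock_held) {        /* the trylock succeeded */
        c->lock_held = true;
        toy_raise_irq(c);       /* interrupt lands right here */
        c->lock_held = false;   /* the unlock */
    }

    c->irqs_enabled = saved;    /* restore the previous irq state */
    if (c->irqs_enabled && c->irq_pending) {
        c->irq_pending = false;
        toy_irq_handler(c);     /* deferred interrupt runs safely now */
    }
}
```

Calling toy_barrier() with disable_irqs set to false marks the CPU as
deadlocked; with it set to true the interrupt is deferred until after the
unlock and completes normally.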

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-08 Thread Ingo Molnar


* Paul E. McKenney  wrote:

> On Sat, Jul 08, 2017 at 10:35:43AM +0200, Ingo Molnar wrote:
> > 
> > * Manfred Spraul  wrote:
> > 
> > > Hi Ingo,
> > > 
> > > On 07/07/2017 10:31 AM, Ingo Molnar wrote:
> > > > 
> > > > There's another, probably just as significant advantage: 
> > > > queued_spin_unlock_wait()
> > > > is 'read-only', while spin_lock()+spin_unlock() dirties the lock cache 
> > > > line. On
> > > > any bigger system this should make a very measurable difference - if
> > > > spin_unlock_wait() is ever used in a performance critical code path.
> > > At least for ipc/sem:
> > > Dirtying the cacheline (in the slow path) allows to remove a smp_mb() in 
> > > the
> > > hot path.
> > > So for sem_lock(), I either need a primitive that dirties the cacheline or
> > > sem_lock() must continue to use spin_lock()/spin_unlock().
> > 
> > Technically you could use spin_trylock()+spin_unlock() and avoid the lock 
> > acquire 
> > spinning on spin_unlock() and get very close to the slow path performance 
> > of a 
> > pure cacheline-dirtying behavior.
> > 
> > But adding something like spin_barrier(), which purely dirties the lock 
> > cacheline, 
> > would be even faster, right?
> 
> Interestingly enough, the arm64 and powerpc implementations of
> spin_unlock_wait() were very close to what it sounds like you are
> describing.

So could we perhaps solve all our problems by defining the generic version
thusly:

void spin_unlock_wait(spinlock_t *lock)
{
	if (spin_trylock(lock))
		spin_unlock(lock);
}

... and perhaps rename it to spin_barrier() [or whatever proper name there
would be]?

Architectures can still optimize it, to remove the small window where the
lock is held locally - as long as the ordering is at least as strong as the
generic version.

This would have various advantages:

 - semantics are well-defined

 - the generic implementation is already pretty well optimized (no spinning)

 - it would make it usable for the IPC performance optimization

 - architectures could still optimize it to eliminate the window where the
   lock is held locally - if there's such instructions available.

Was this proposed before, or am I missing something?

Thanks,

Ingo
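
[Editorial sketch, not part of the original mail.] For readers who want to
see the shape of the trylock-then-unlock idea outside the kernel, here is a
minimal userspace model using C11 atomics. toy_spinlock_t and the toy_*
functions are hypothetical names invented for this sketch; it makes no claim
about the ordering guarantees a real kernel primitive would need, and unlike
the void function proposed above it returns a bool so a caller can see
whether the lock was briefly taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace toy model, NOT kernel code; all names are hypothetical. */
typedef struct {
    atomic_flag locked;     /* set while the lock is held */
} toy_spinlock_t;

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One acquire attempt, never spins. Returns true on success. */
bool toy_trylock(toy_spinlock_t *l)
{
    assert(l != NULL);
    /* test-and-set is a read-modify-write, so it typically pulls the
     * cacheline in exclusive state whether or not the lock was free. */
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

void toy_unlock(toy_spinlock_t *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* The proposed generic shape: try once, release if we got the lock.
 * Returns true if the lock was taken and released, false if somebody
 * else was holding it at the time of the attempt. */
bool toy_barrier(toy_spinlock_t *l)
{
    if (toy_trylock(l)) {
        toy_unlock(l);
        return true;
    }
    return false;
}
```

Note that toy_barrier() never waits: if the lock is held it simply returns,
which is the "no spinning" property listed among the advantages above.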

Re: Request to add bluetooth module identifier to net/rfkill/rfkill-gpio.c

2017-07-08 Thread Marcel Holtmann

Hi Sundar,

> I have a Cherry Trail laptop with an Atom X5-Z8300. It has a bluetooth
> chip that needs the r8723bs (coexisting RTL 8723BS wifi and
> bluetooth).
> 
> I am using linux-next (20150817) with the r8723bs staging driver and
> the firmware and utility from https://github.com/lwfinger/rtl8723bs_bt
> by Larry Finger.
> 
> With linux-next the bluetooth works SOMETIMES, but on reboot, it does
> not work any more (no bluetooth interfaces are detected and hciconfig
> shows nothing).
> 
> I am a kernel novice, but I saw that another kernel that I had tried
> had the following line added in struct acpi_device_id
> rfkill_acpi_match:
> 
> { "OBDA8723",RFKILL_TYPE_BLUETOOTH }
> 
> With the attached patch applied (one line added), my bluetooth works every 
> time.
> 
> Does this belong in rfkill-gpio.c, or should I contact someone else?

it does not belong in rfkill-gpio.c since this controls the power of the 
Bluetooth device. This belongs in the Bluetooth driver. We already moved all 
the Intel and Broadcom ones into hci_intel.c and respectively hci_bcm.c. So 
have the staging driver deal with the power GPIO.

Regards

Marcel

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-08 Thread Paul E. McKenney

On Sat, Jul 08, 2017 at 10:43:24AM +0200, Ingo Molnar wrote:
> 
> * Paul E. McKenney  wrote:
> 
> > On Fri, Jul 07, 2017 at 10:31:28AM +0200, Ingo Molnar wrote:
> > 
> > [ . . . ]
> > 
> > > In fact I'd argue that any future high performance spin_unlock_wait() 
> > > user is 
> > > probably better off open coding the unlock-wait poll loop (and possibly 
> > > thinking 
> > > hard about eliminating it altogether). If such patterns pop up in the 
> > > kernel we 
> > > can think about consolidating them into a single read-only primitive 
> > > again.
> > 
> > I would like any reintroduction to include a header comment saying exactly
> > what the consolidated primitive actually does and does not do.  ;-)
> > 
> > > I.e. I think the proposed changes are doing no harm, and the 
> > > unavailability of a 
> > > generic primitive does not hinder future optimizations either in any 
> > > significant 
> > > fashion.
> > 
> > I will have a v3 with updated comments from Manfred.  Thoughts on when/where
> > to push this?
> 
> Once everyone agrees I can apply it to the locking tree. I think PeterZ's was 
> the 
> only objection?

Oleg wasn't all that happy, either, but he did supply the relevant patch.

> > The reason I ask is if this does not go in during this merge window, I need
> > to fix the header comment on spin_unlock_wait().
> 
> Can try it next week after some testing - let's see how busy things get for 
> Linus 
> in the merge window?

Sounds good!  Either way is fine with me.

Thanx, Paul

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-08 Thread Paul E. McKenney

On Sat, Jul 08, 2017 at 10:35:43AM +0200, Ingo Molnar wrote:
> 
> * Manfred Spraul  wrote:
> 
> > Hi Ingo,
> > 
> > On 07/07/2017 10:31 AM, Ingo Molnar wrote:
> > > 
> > > There's another, probably just as significant advantage: 
> > > queued_spin_unlock_wait()
> > > is 'read-only', while spin_lock()+spin_unlock() dirties the lock cache 
> > > line. On
> > > any bigger system this should make a very measurable difference - if
> > > spin_unlock_wait() is ever used in a performance critical code path.
> > At least for ipc/sem:
> > Dirtying the cacheline (in the slow path) allows to remove a smp_mb() in the
> > hot path.
> > So for sem_lock(), I either need a primitive that dirties the cacheline or
> > sem_lock() must continue to use spin_lock()/spin_unlock().
> 
> Technically you could use spin_trylock()+spin_unlock() and avoid the lock 
> acquire 
> spinning on spin_unlock() and get very close to the slow path performance of 
> a 
> pure cacheline-dirtying behavior.
> 
> But adding something like spin_barrier(), which purely dirties the lock 
> cacheline, 
> would be even faster, right?

Interestingly enough, the arm64 and powerpc implementations of
spin_unlock_wait() were very close to what it sounds like you are
describing.

Thanx, Paul

[GIT] Networking

2017-07-08 Thread David Miller


Mostly fixing some light fallout from the changes that went into
the merge window.

1) Fix memory leaks on network namespace teardown in netfilter, from
   Liping Zhang.

2) When comparing ipv6 nexthops, we have to take the lightweight tunnel
   state into account as well.  From David Ahern.

3) Fix socket option object length check in the new TLS code, from
   Matthias Rosenfelder.

4) Fix memory leak in nfp driver flower support, from Jakub Kicinski.

5) Several netlink attribute validation fixes in cfg80211, from Srinivas
   Dasari.

6) Fix context array leak in virtio_net, from Jason Wang.

7) SKB use after free in hns driver, from Yunsheng Lin.

8) Fix socket leak on accept() in RDS, from Sowmini Varadhan.  Also
   add a WARN_ON() to sock_graft() so other protocol stacks don't trip
   over this as well.

Please pull, thanks a lot!

The following changes since commit 9b51f04424e17051a89ab32d892ca66b2a104825:

  Merge branch 'parisc-4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux (2017-07-05 17:41:31 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to e8df760307830ca26cf380a9a4b36468a0352fa5:

  net: ethernet: mediatek: remove useless code in mtk_probe() (2017-07-08 11:27:55 +0100)


Christophe Jaillet (1):
  arcnet: com20020-pci: Fix an error handling path in 'com20020pci_probe()'

David Ahern (1):
  net: ipv6: Compare lwstate in detecting duplicate nexthops

David S. Miller (5):
  Merge git://git.kernel.org/.../pablo/nf
  Merge tag 'mac80211-for-davem-2017-07-07' of git://git.kernel.org/.../jberg/mac80211
  net: Update networking MAINTAINERS entry.
  Merge branch 'hns-fixes'
  Merge branch 'rds-tcp-sock_graft-leak'

Derek Chickles (1):
  liquidio: fix bug in soft reset failure detection

Geert Uytterhoeven (1):
  ptp: dte: Use LL suffix for 64-bit constants

Gustavo A. R. Silva (1):
  net: ethernet: mediatek: remove useless code in mtk_probe()

Jakub Kicinski (1):
  nfp: flower: add missing clean up call to avoid memory leaks

Jason Wang (1):
  virtio-net: fix leaking of ctx array

Liping Zhang (2):
  netfilter: nf_ct_dccp/sctp: fix memory leak after netns cleanup
  netfilter: ebt_nflog: fix unexpected truncated packet

Matthias Rosenfelder (1):
  TLS: Fix length check in do_tls_getsockopt_tx()

Nicolas Dichtel (1):
  doc: SKB_GSO_[IPIP|SIT] have been replaced

Nikolay Aleksandrov (1):
  vrf: fix bug_on triggered by rx when destroying a vrf

Roopa Prabhu (1):
  mpls: fix uninitialized in_label var warning in mpls_getroute

Sowmini Varadhan (2):
  rds: tcp: use sock_create_lite() to create the accept socket
  net/sock: add WARN_ON(parent->sk) in sock_graft()

Srinivas Dasari (4):
  cfg80211: Check if PMKID attribute is of expected size
  cfg80211: Check if NAN service ID is of expected size
  cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE
  cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES

WANG Cong (1):
  bonding: avoid NETDEV_CHANGEMTU event when unregistering slave

Wu Fengguang (1):
  tcp: md5: tcp_md5_do_lookup_exact() can be static

Yunsheng Lin (2):
  net: hns: Fix a wrong op phy C45 code
  net: hns: Fix a skb used after free bug

Zheng Li (1):
  sctp: set the value of flowi6_oif to sk_bound_dev_if to make sctp_v6_get_dst to find the correct route entry.

vishnuvardhan (1):
  net: macb: Adding Support for Jumbo Frames up to 10240 Bytes in SAMA5D3

 Documentation/networking/segmentation-offloads.txt      |  2 +-
 MAINTAINERS                                             |  2 --
 drivers/net/arcnet/com20020-pci.c                       |  6 --
 drivers/net/bonding/bond_main.c                         | 15 +--
 drivers/net/ethernet/cadence/macb_main.c                |  3 ++-
 drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c |  2 +-
 drivers/net/ethernet/cavium/liquidio/cn66xx_device.c    |  2 +-
 drivers/net/ethernet/hisilicon/hns/hns_enet.c           | 22 ++
 drivers/net/ethernet/hisilicon/hns/hns_enet.h           |  6 +++---
 drivers/net/ethernet/hisilicon/hns_mdio.c               |  2 +-
 drivers/net/ethernet/mediatek/mtk_eth_soc.c             |  5 -
 drivers/net/ethernet/netronome/nfp/flower/main.c        |  1 +
 drivers/net/virtio_net.c                                |  1 +
 drivers/net/vrf.c                                       | 11 ++-
 drivers/ptp/ptp_dte.c                                   |  2 +-
 include/linux/netdevice.h                               |  1 +
 include/net/ip6_route.h                                 |  8
 include/net/sock.h                                      |  1 +
 net/bridge/netfilter/ebt_nflog.c                        |  1 +
 net/core/dev.c

Re: [PATCH] net: ethernet: mediatek: remove useless code in mtk_probe()

2017-07-08 Thread David Miller

From: "Gustavo A. R. Silva" 
Date: Fri, 7 Jul 2017 15:23:34 -0500

> Remove useless local variables _match_, _soc_ and the code related.
> 
> Notice that
> 
> const struct of_device_id of_mtk_match[] = {
> { .compatible = "mediatek,mt2701-eth" },
> {},
> };
> 
> So match->data is NULL.
> 
> Suggested-by: Andrew Lunn 
> Signed-off-by: Gustavo A. R. Silva 

Applied, thanks.

If someone needs this they can add it back, in a less buggy form.
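
[Editorial sketch, not part of the original mail.] The "match->data is NULL"
observation follows from a general C rule: members left out of a designated
initializer are zero-initialized. The userspace model below shows this with
a hypothetical stand-in for the match table (struct toy_device_id and
toy_match() are invented names, not the kernel's of_device_id API).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a device match entry. */
struct toy_device_id {
    const char *compatible;
    const void *data;       /* per-device payload, optional */
};

const struct toy_device_id toy_match_table[] = {
    { .compatible = "mediatek,mt2701-eth" },    /* .data omitted -> NULL */
    { 0 },                                      /* sentinel entry */
};

/* Return the first entry whose compatible string matches, or NULL. */
const struct toy_device_id *toy_match(const struct toy_device_id *table,
                                      const char *compatible)
{
    assert(table != NULL && compatible != NULL);
    for (; table->compatible; table++) {
        if (strcmp(table->compatible, compatible) == 0)
            return table;
    }
    return NULL;
}
```

A lookup of "mediatek,mt2701-eth" succeeds, but the entry's data pointer is
NULL, so any code that dereferences it without first adding a payload to the
table would be reading nothing useful.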

Re: [PATCH net] mpls: fix uninitialized in_label var warning in mpls_getroute

2017-07-08 Thread David Miller

From: Roopa Prabhu 
Date: Fri,  7 Jul 2017 11:21:49 -0700

> From: Roopa Prabhu 
> 
> Fix the below warning generated by static checker:
> net/mpls/af_mpls.c:2111 mpls_getroute()
> error: uninitialized symbol 'in_label'."
> 
> Fixes: 397fc9e5cefe ("mpls: route get support")
> Reported-by: Dan Carpenter 
> Signed-off-by: Roopa Prabhu 

Applied.

Re: [PATCH net] doc: SKB_GSO_[IPIP|SIT] have been replaced

2017-07-08 Thread David Miller

From: Nicolas Dichtel 
Date: Fri,  7 Jul 2017 14:08:25 +0200

> Those enum values don't exist anymore.
> 
> Fixes: 7e13318daa4a ("net: define gso types for IPx over IPv4 and IPv6")
> CC: Tom Herbert 
> Signed-off-by: Nicolas Dichtel 

Applied, thanks.

Re: [Patch net] bonding: avoid NETDEV_CHANGEMTU event when unregistering slave

2017-07-08 Thread David Miller

From: Cong Wang 
Date: Thu,  6 Jul 2017 15:01:57 -0700

> As Hongjun/Nicolas summarized in their original patch:
> 
> "
> When a device changes from one netns to another, it's first unregistered,
> then the netns reference is updated and the dev is registered in the new
> netns. Thus, when a slave moves to another netns, it is first
> unregistered. This triggers a NETDEV_UNREGISTER event which is caught by
> the bonding driver. The driver calls bond_release(), which calls
> dev_set_mtu() and thus triggers NETDEV_CHANGEMTU (the device is still in
> the old netns).
> "
> 
> This is a very special case, because the device is being unregistered
> no one should still care about the NETDEV_CHANGEMTU event triggered
> at this point, we can avoid broadcasting this event on this path,
> and avoid touching inetdev_event()/addrconf_notify() path.
> 
> It requires to export __dev_set_mtu() to bonding driver.
> 
> Reported-by: Hongjun Li 
> Reported-by: Nicolas Dichtel 
> Cc: Jay Vosburgh 
> Cc: Veaceslav Falico 
> Cc: Andy Gospodarek 
> Signed-off-by: Cong Wang 

Applied, thanks.
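
[Editorial sketch, not part of the original mail.] The idea in the patch
description, restoring the slave's MTU without broadcasting a change event
when the device is already being unregistered, can be modelled in a few
lines of userspace C. All names below (struct toy_netdev, toy_set_mtu(),
toy_set_mtu_quiet(), toy_release_slave()) are hypothetical; the real change
involves exporting __dev_set_mtu() to the bonding driver, as stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device: an MTU plus a counter standing in for notifier calls. */
struct toy_netdev {
    unsigned int mtu;
    int changemtu_events;   /* how many times listeners were notified */
};

/* Change the MTU without telling anyone (the quiet variant). */
void toy_set_mtu_quiet(struct toy_netdev *dev, unsigned int mtu)
{
    assert(dev != NULL);
    dev->mtu = mtu;
}

/* Change the MTU and notify listeners if the value really changed. */
void toy_set_mtu(struct toy_netdev *dev, unsigned int mtu)
{
    assert(dev != NULL);
    if (dev->mtu == mtu)
        return;
    toy_set_mtu_quiet(dev, mtu);
    dev->changemtu_events++;
}

/* Restore a slave's original MTU when it leaves the bond. */
void toy_release_slave(struct toy_netdev *slave, unsigned int orig_mtu,
                       bool unregistering)
{
    if (unregistering)
        toy_set_mtu_quiet(slave, orig_mtu);  /* nobody should care now */
    else
        toy_set_mtu(slave, orig_mtu);
}
```

In both cases the MTU ends up restored; only the normal release path
generates an event for listeners to react to.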

Re: [PATCH net 0/2] rds-tcp: sock_graft() leak

2017-07-08 Thread David Miller

From: Sowmini Varadhan 
Date: Thu,  6 Jul 2017 08:15:05 -0700

> Following up on the discussion at
>   https://www.spinics.net/lists/netdev/msg442859.html
> - make rds_tcp_accept_one() call sock_create_lite()
> - add a WARN_ON() to sock_graft() 
> 
> Tested by running an infinite while() loop that does
> (module-load; rds-stress; module-unload) and monitors
> TCP slabinfo while the test is running.

This looks great, thanks for following up on this.

Series applied and queued up for -stable.

Re: [PATCH net V2 0/2] Bugfixs for hns ethernet driver

2017-07-08 Thread David Miller

From: Lin Yun Sheng 
Date: Thu, 6 Jul 2017 10:21:58 +0800

> This patchset fix skb used after free and C45 op code issues
> in hns driver.
> 
> Patch V2:
>   1. Remove ndev->feature checking in TX description patch.
>   2. Add Fixes: Tag in patch description.
> 
> Patch V1:
>   Initial Submit

Series applied, thanks.

Re: [PATCH net] ip[6]: don't register inet[6]dev when dev is down

2017-07-08 Thread David Miller

From: Nicolas Dichtel 
Date: Wed,  5 Jul 2017 17:57:25 +0200

> From: Hongjun Li 
> 
> When the netdev event NETDEV_CHANGEMTU is triggered, the inet[6]dev may be
> created even if the corresponding device is down. This may lead to a leak
> in the procfs when the device is unregistered, and finally trigger a
> backtrace:
 ...
> When a device changes from one netns to another, it's first unregistered,
> then the netns reference is updated and the dev is registered in the new
> netns. Thus, when a slave moves to another netns, it is first
> unregistered. This triggers a NETDEV_UNREGISTER event which is caught by
> the bonding driver. The driver calls bond_release(), which calls
> dev_set_mtu() and thus triggers NETDEV_CHANGEMTU (the device is still in
> the old netns).
> 
> Signed-off-by: Hongjun Li 
> Signed-off-by: Nicolas Dichtel 

I'm still not convinced about this.

We have lots of code which iterates ipv6 idevs, and then has a
check for IFF_UP.

So having an idev attached to a down interface is not a bug nor
illegal.

In fact, addrconf_cleanup() walks all of the init_net idevs and
calls addrconf_ifdown() with how=1 regardless of IFF_UP or not.

This entire area is quite a mess.

Can you show exactly why the procfs state isn't cleaned up for
these devices moving between namespaces?  Maybe that is the real
bug and a better place to fix this.

Thanks.

Re: [PATCH v2 RFC 0/13] Remove UDP Fragmentation Offload support

2017-07-08 Thread David Miller

From: Michal Kubecek 
Date: Fri, 7 Jul 2017 14:45:31 +0200

> On Fri, Jul 07, 2017 at 10:43:26AM +0100, David Miller wrote:
>> 
>> This is an RFC patch series, based upon some discussions with
>> various developers, that removes UFO offloading.
>> 
>> Very few devices support this operation, its usefulness is
>> questionable at best, and it adds a non-trivial amount of
>> complexity to our data paths.
> 
> My understanding from the communication with the customer whose reports
> resulted in commits acf8dd0a9d0b ("udp: only allow UFO for packets from
> SOCK_DGRAM sockets") and a5cb659bbc1c ("net: account for current skb
> length when deciding about UFO") was that the real benefit from UFO is 
> in the case when UFO allows to avoid the need to actually fragment the 
> packets. In their case it's when UDP packets are sent via virtio_net 
> either between a guest and its host or between two guests on the same 
> host.
> 
> Personally I have no idea how big the effect is in their use cases so 
> I forwarded the link to your series to them and asked them to provide 
> some real life data if they want to step in. If there is no significant
> performance benefit even in this case, I would agree the feature is not
> worth the hassle - if nothing else, the ever growing list of exceptions
> in ip{,6}_append_data() is getting out of hand.

Thanks for letting us know about this.

However, unless the performance gains are significant and there are no
conceivable alternative ways to achieve the same thing, I'm still
removing this.

Re: [RFC PATCH 00/12] Implement XDP bpf_redirect vairants

2017-07-08 Thread David Miller

From: John Fastabend 
Date: Fri, 07 Jul 2017 10:48:36 -0700

> On 07/07/2017 10:34 AM, John Fastabend wrote:
>> This series adds two new XDP helper routines bpf_redirect() and
>> bpf_redirect_map(). The first variant bpf_redirect() is meant
>> to be used the same way it is currently being used by the cls_bpf
>> classifier. An xdp packet will be redirected immediately when this
>> is called.
> 
> Also other than the typo in the title there ;) I'm going to CC
> the driver maintainers working on XDP (makes for a long CC list but)
> because we would want to try and get support in as many as possible in
> the next merge window.
> 
> For this rev I just implemented on ixgbe because I wrote the
> original XDP support there. I'll volunteer to do virtio as well.

I went over this series a few times and it looks great to me.
You didn't even give me some coding style issues to pick on :-)

Re: [PATCH] net: macb: Adding Support for Jumbo Frames up to 10240 Bytes in SAMA5D3

2017-07-08 Thread David Miller

From: Nicolas Ferre 
Date: Wed, 5 Jul 2017 17:36:16 +0200

> From: vishnuvardhan 
> 
> As per the SAMA5D3 device specification it supports Jumbo frames.
> But the suggested flag and length of bytes it supports was not updated
> in this driver config_structure.
> The maximum jumbo frames the device supports :
> 10240 bytes as per the device spec.
> 
> While changing the MTU value greater than 1500, it threw error:
> sudo ifconfig eth1 mtu 9000
> SIOCSIFMTU: Invalid argument
> 
> Add this support to driver so that it works as expected and designed.
> 
> Signed-off-by: vishnuvardhan 
> [nicolas.fe...@microchip.com: modify slightly commit msg]
> Signed-off-by: Nicolas Ferre 

Applied, thank you.
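
[Editorial sketch, not part of the original mail.] The "SIOCSIFMTU: Invalid
argument" failure quoted above is what an MTU range check produces when the
driver's configured maximum frame length is too small. The toy model below
assumes the largest allowed MTU is the maximum frame length minus the
Ethernet header (14 bytes) and frame check sequence (4 bytes); that
relationship, and every name used (struct toy_nic, toy_max_mtu(),
toy_change_mtu()), is an assumption made for illustration rather than a
quote from the macb driver.

```c
#include <assert.h>
#include <errno.h>

#define TOY_ETH_HLEN     14u    /* Ethernet header length */
#define TOY_ETH_FCS_LEN   4u    /* frame check sequence length */

/* Toy NIC: the largest frame it can handle and its current MTU. */
struct toy_nic {
    unsigned int max_frame_len;
    unsigned int mtu;
};

/* Largest MTU that still fits in one frame (assumed relationship). */
unsigned int toy_max_mtu(const struct toy_nic *nic)
{
    assert(nic->max_frame_len >= TOY_ETH_HLEN + TOY_ETH_FCS_LEN);
    return nic->max_frame_len - TOY_ETH_HLEN - TOY_ETH_FCS_LEN;
}

/* Reject MTUs above the limit, the way the failing ifconfig call did. */
int toy_change_mtu(struct toy_nic *nic, unsigned int new_mtu)
{
    if (new_mtu > toy_max_mtu(nic))
        return -EINVAL;
    nic->mtu = new_mtu;
    return 0;
}
```

With a 1518-byte frame limit a request for MTU 9000 is refused; with the
10240-byte limit from the device spec the same request fits.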

Fwd: Request to add bluetooth module identifier to net/rfkill/rfkill-gpio.c

2017-07-08 Thread Sundar Nagarajan

Hi,

I have a Cherry Trail laptop with an Atom X5-Z8300. It has a bluetooth
chip that needs the r8723bs (coexisting RTL 8723BS wifi and
bluetooth).

I am using linux-next (20150817) with the r8723bs staging driver and
the firmware and utility from https://github.com/lwfinger/rtl8723bs_bt
by Larry Finger.

With linux-next the bluetooth works SOMETIMES, but on reboot, it does
not work any more (no bluetooth interfaces are detected and hciconfig
shows nothing).

I am a kernel novice, but I saw that another kernel that I had tried
had the following line added in struct acpi_device_id
rfkill_acpi_match:

{ "OBDA8723",RFKILL_TYPE_BLUETOOTH }

With the attached patch applied (one line added), my bluetooth works every time.

Does this belong in rfkill-gpio.c, or should I contact someone else?
diff --git a/net/rfkill/rfkill-gpio.c b/net/rfkill/rfkill-gpio.c
index 76c01cb..4e32def 100644
--- a/net/rfkill/rfkill-gpio.c
+++ b/net/rfkill/rfkill-gpio.c
@@ -163,6 +163,7 @@ static int rfkill_gpio_remove(struct platform_device *pdev)
 static const struct acpi_device_id rfkill_acpi_match[] = {
 	{ "BCM4752", RFKILL_TYPE_GPS },
 	{ "LNV4752", RFKILL_TYPE_GPS },
+ { "OBDA8723",RFKILL_TYPE_BLUETOOTH },
 	{ },
 };
 MODULE_DEVICE_TABLE(acpi, rfkill_acpi_match);

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-08 Thread Ingo Molnar


* Paul E. McKenney  wrote:

> On Fri, Jul 07, 2017 at 10:31:28AM +0200, Ingo Molnar wrote:
> 
> [ . . . ]
> 
> > In fact I'd argue that any future high performance spin_unlock_wait() user 
> > is 
> > probably better off open coding the unlock-wait poll loop (and possibly 
> > thinking 
> > hard about eliminating it altogether). If such patterns pop up in the 
> > kernel we 
> > can think about consolidating them into a single read-only primitive again.
> 
> I would like any reintroduction to include a header comment saying exactly
> what the consolidated primitive actually does and does not do.  ;-)
> 
> > I.e. I think the proposed changes are doing no harm, and the unavailability 
> > of a 
> > generic primitive does not hinder future optimizations either in any 
> > significant 
> > fashion.
> 
> I will have a v3 with updated comments from Manfred.  Thoughts on when/where
> to push this?

Once everyone agrees I can apply it to the locking tree. I think PeterZ's was
the only objection?

> The reason I ask is if this does not go in during this merge window, I need
> to fix the header comment on spin_unlock_wait().

Can try it next week after some testing - let's see how busy things get for
Linus in the merge window?

Thanks,

Ingo
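
[Editorial sketch, not part of the original mail.] An open-coded, read-only
unlock-wait poll loop of the kind mentioned in the quoted text might look
roughly like the userspace model below. It only loads the lock word and
never writes it. The names toy_lock_t and toy_unlock_wait() are
hypothetical, the poll budget exists only so the sketch always terminates,
and no claim is made about matching the memory-ordering semantics that were
under discussion in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace toy lock word: 0 = free, 1 = held. Hypothetical type. */
typedef struct {
    atomic_int locked;
} toy_lock_t;

/* Poll until the lock is observed free, using loads only.
 * Gives up after max_polls attempts; returns true if it saw the
 * lock free, false if the budget ran out first. */
bool toy_unlock_wait(toy_lock_t *l, unsigned long max_polls)
{
    assert(l != NULL);
    while (max_polls--) {
        if (atomic_load_explicit(&l->locked, memory_order_acquire) == 0)
            return true;
        /* a real loop would insert a CPU relax hint here */
    }
    return false;
}
```

Because the loop never stores to the lock word, waiting does not change the
lock state, in contrast to the lock-then-unlock pattern that replaces it.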

[PATCH v2] mrf24j40: Fix an error handling path in 'mrf24j40_probe()'

2017-07-08 Thread Christophe JAILLET

If this check fails, we must release some resources as done everywhere
else in this function before returning an error code.

Signed-off-by: Christophe JAILLET 
---
V2: initialization of ret in this error path was missing. Stupid me!
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 7d334963dc08..da8683782ffc 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -1330,7 +1330,8 @@ static int mrf24j40_probe(struct spi_device *spi)
 	if (spi->max_speed_hz > MAX_SPI_SPEED_HZ) {
 		dev_warn(&spi->dev, "spi clock above possible maximum: %d",
 			 MAX_SPI_SPEED_HZ);
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err_register_device;
 	}
 
 	ret = mrf24j40_hw_init(devrec);
-- 
2.11.0
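
[Editorial sketch, not part of the original mail.] The difference between
the two versions of this patch comes down to a common pattern: when an error
path jumps to a shared cleanup label, the return code has to be set before
the jump. The userspace model below shows the corrected shape. toy_probe(),
TOY_MAX_SPEED_HZ and toy_live_devices are hypothetical names, and malloc()
merely stands in for whatever the real probe function allocates.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_SPEED_HZ 10000000u  /* hypothetical clock limit */

int toy_live_devices;   /* resources allocated and not yet released */

/* Allocate a device, validate the bus speed, hand the device back.
 * On failure everything allocated so far is released again. */
int toy_probe(unsigned int speed_hz, void **out)
{
    int ret;
    void *dev;

    assert(out != NULL);
    dev = malloc(64);           /* stands in for the device allocation */
    if (!dev)
        return -ENOMEM;         /* nothing to undo yet */
    toy_live_devices++;

    if (speed_hz > TOY_MAX_SPEED_HZ) {
        ret = -EINVAL;          /* set the code BEFORE jumping */
        goto err_free;
    }

    *out = dev;
    return 0;

err_free:
    free(dev);
    toy_live_devices--;
    return ret;
}
```

Without the assignment to ret before the goto, the cleanup label would
return whatever value ret happened to hold, which is the problem the V2 note
above refers to.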

Re: [PATCH v2 0/9] Remove spin_unlock_wait()

2017-07-08 Thread Ingo Molnar

* Manfred Spraul  wrote:

> Hi Ingo,
> 
> On 07/07/2017 10:31 AM, Ingo Molnar wrote:
> > 
> > There's another, probably just as significant advantage: 
> > queued_spin_unlock_wait()
> > is 'read-only', while spin_lock()+spin_unlock() dirties the lock cache 
> > line. On
> > any bigger system this should make a very measurable difference - if
> > spin_unlock_wait() is ever used in a performance critical code path.
> At least for ipc/sem:
> Dirtying the cacheline (in the slow path) allows to remove a smp_mb() in the
> hot path.
> So for sem_lock(), I either need a primitive that dirties the cacheline or
> sem_lock() must continue to use spin_lock()/spin_unlock().

Technically you could use spin_trylock()+spin_unlock() and avoid the lock
acquire spinning on spin_unlock() and get very close to the slow path
performance of a pure cacheline-dirtying behavior.

But adding something like spin_barrier(), which purely dirties the lock
cacheline, would be even faster, right?

Thanks,

Ingo

[PATCH] mrf24j40: Fix an error handling path in 'mrf24j40_probe()'

2017-07-08 Thread Christophe JAILLET

If this check fails, we must release some resources as done everywhere
else in this function before returning an error code.

Signed-off-by: Christophe JAILLET 
---
 drivers/net/ieee802154/mrf24j40.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ieee802154/mrf24j40.c 
b/drivers/net/ieee802154/mrf24j40.c
index 7d334963dc08..da8683782ffc 100644
--- a/drivers/net/ieee802154/mrf24j40.c
+++ b/drivers/net/ieee802154/mrf24j40.c
@@ -1330,7 +1330,7 @@ static int mrf24j40_probe(struct spi_device *spi)
 	if (spi->max_speed_hz > MAX_SPI_SPEED_HZ) {
 		dev_warn(&spi->dev, "spi clock above possible maximum: %d",
 			 MAX_SPI_SPEED_HZ);
-		return -EINVAL;
+		goto err_register_device;
 	}
 
 	ret = mrf24j40_hw_init(devrec);
-- 
2.11.0

[PATCH 3/3] net: stmmac: Make 'alloc_dma_[rt]x_desc_resources()' look even closer

2017-07-08 Thread Christophe JAILLET

'alloc_dma_[rt]x_desc_resources()' functions look very close.
Remove a useless initialization and use the same label name for error
handling path in order to get them even closer.

Signed-off-by: Christophe JAILLET 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 07d486a70118..1853f7ff6657 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1449,7 +1449,7 @@ static void free_dma_rx_desc_resources(struct stmmac_priv 
*priv)
 static void free_dma_tx_desc_resources(struct stmmac_priv *priv)
 {
u32 tx_count = priv->plat->tx_queues_to_use;
-   u32 queue = 0;
+   u32 queue;
 
/* Free TX queue resources */
for (queue = 0; queue < tx_count; queue++) {
@@ -1561,13 +1561,13 @@ static int alloc_dma_tx_desc_resources(struct 
stmmac_priv *priv)

sizeof(*tx_q->tx_skbuff_dma),
GFP_KERNEL);
if (!tx_q->tx_skbuff_dma)
-   goto err_dma_buffers;
+   goto err_dma;
 
tx_q->tx_skbuff = kmalloc_array(DMA_TX_SIZE,
sizeof(struct sk_buff *),
GFP_KERNEL);
if (!tx_q->tx_skbuff)
-   goto err_dma_buffers;
+   goto err_dma;
 
if (priv->extend_desc) {
tx_q->dma_etx = dma_zalloc_coherent(priv->device,
@@ -1577,7 +1577,7 @@ static int alloc_dma_tx_desc_resources(struct stmmac_priv 
*priv)
&tx_q->dma_tx_phy,
GFP_KERNEL);
if (!tx_q->dma_etx)
-   goto err_dma_buffers;
+   goto err_dma;
} else {
tx_q->dma_tx = dma_zalloc_coherent(priv->device,
   DMA_TX_SIZE *
@@ -1586,13 +1586,13 @@ static int alloc_dma_tx_desc_resources(struct 
stmmac_priv *priv)
   &tx_q->dma_tx_phy,
   GFP_KERNEL);
if (!tx_q->dma_tx)
-   goto err_dma_buffers;
+   goto err_dma;
}
}
 
return 0;
 
-err_dma_buffers:
+err_dma:
free_dma_tx_desc_resources(priv);
 
return ret;
-- 
2.11.0

[PATCH 1/3] net: stmmac: Fix error handling path in 'alloc_dma_rx_desc_resources()'

2017-07-08 Thread Christophe JAILLET

If the first 'kmalloc_array' within the loop fails, we should free what
has already been allocated, as done in all other error handling paths.

Fixes: 54139cf3bb33 ("net: stmmac: adding multiple buffers for rx")
Signed-off-by: Christophe JAILLET 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 19bba6281dab..4322fa4a13e8 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1498,7 +1498,7 @@ static int alloc_dma_rx_desc_resources(struct stmmac_priv 
*priv)
sizeof(dma_addr_t),
GFP_KERNEL);
if (!rx_q->rx_skbuff_dma)
-   return -ENOMEM;
+   goto err_dma;
 
rx_q->rx_skbuff = kmalloc_array(DMA_RX_SIZE,
sizeof(struct sk_buff *),
-- 
2.11.0
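
[Editorial sketch, not part of the original mail.] The leak this patch
addresses arises when an allocation fails in a later loop iteration and the
function returns directly, abandoning what earlier iterations allocated. The
userspace model below shows the fixed shape: every failure inside the loop
jumps to one label that frees all queues. All names (struct toy_queue,
toy_alloc(), toy_fail_at and so on) are hypothetical, and the failure
injection hook exists only so the error path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_QUEUES 4

/* Two per-queue buffers, loosely modelled on the patch context. */
struct toy_queue {
    void *skbuff_dma;
    void *skbuff;
};

int toy_fail_at = -1;   /* test hook: index of the allocation to fail */
int toy_alloc_calls;    /* allocations attempted so far */

/* malloc() wrapper with failure injection. */
void *toy_alloc(size_t n)
{
    if (toy_alloc_calls++ == toy_fail_at)
        return NULL;
    return malloc(n);
}

/* Free every queue; safe on partially filled arrays (free(NULL) is ok). */
void toy_free_queues(struct toy_queue *q)
{
    assert(q != NULL);
    for (int i = 0; i < TOY_QUEUES; i++) {
        free(q[i].skbuff_dma);
        free(q[i].skbuff);
        q[i].skbuff_dma = NULL;
        q[i].skbuff = NULL;
    }
}

/* Allocate all queues; q must be zero-initialized by the caller. */
int toy_alloc_queues(struct toy_queue *q)
{
    int ret = -ENOMEM;

    for (int i = 0; i < TOY_QUEUES; i++) {
        q[i].skbuff_dma = toy_alloc(16);
        if (!q[i].skbuff_dma)
            goto err_dma;       /* not a bare return: undo earlier work */

        q[i].skbuff = toy_alloc(16);
        if (!q[i].skbuff)
            goto err_dma;
    }
    return 0;

err_dma:
    toy_free_queues(q);
    return ret;
}
```

The cleanup helper walks all queues rather than only the ones known to be
filled, which works because unallocated slots are NULL and free(NULL) is a
no-op.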

[PATCH 2/3] net: stmmac: Fix error handling path in 'alloc_dma_tx_desc_resources()'

2017-07-08 Thread Christophe JAILLET

If the first 'kmalloc_array' within the loop fails, we should free what
has already been allocated, as done in all other error handling paths.

Fixes: ce736788e8a9 ("net: stmmac: adding multiple buffers for TX")
Signed-off-by: Christophe JAILLET 
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c 
b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 4322fa4a13e8..07d486a70118 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1561,7 +1561,7 @@ static int alloc_dma_tx_desc_resources(struct stmmac_priv 
*priv)

sizeof(*tx_q->tx_skbuff_dma),
GFP_KERNEL);
if (!tx_q->tx_skbuff_dma)
-   return -ENOMEM;
+   goto err_dma_buffers;
 
tx_q->tx_skbuff = kmalloc_array(DMA_TX_SIZE,
sizeof(struct sk_buff *),
-- 
2.11.0

[PATCH 0/3] net: stmmac: Fixes and cleanups in 'alloc_dma_[rt]x_desc_resources()'

2017-07-08 Thread Christophe JAILLET

These patches are all related to 'alloc_dma_[rt]x_desc_resources()' functions.

The first 2 fix an error path where some resources are leaking. I've
separated them into 2 patches because the issues have been introduced by
2 different commits.

The 3rd patch is just a clean-up.

Christophe JAILLET (3):
  net: stmmac: Fix error handling path in
'alloc_dma_rx_desc_resources()'
  net: stmmac: Fix error handling path in
'alloc_dma_tx_desc_resources()'
  net: stmmac: Make 'alloc_dma_[rt]x_desc_resources()' look even closer

 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

-- 
2.11.0

Re: [PATCH] net: ethernet: mediatek: remove useless code in mtk_probe()

2017-07-08 Thread Sean Wang

Hi,  Gustavo

It indeed is useless at the current time point.


but actually I will add new SoC support to the driver in the next week,
which requires the variable match :-(

Sean


On Fri, 2017-07-07 at 15:23 -0500, Gustavo A. R. Silva wrote:
> Remove useless local variables _match_, _soc_ and the code related.
> 
> Notice that
> 
> const struct of_device_id of_mtk_match[] = {
> { .compatible = "mediatek,mt2701-eth" },
> {},
> };
> 
> So match->data is NULL.
> 
> Suggested-by: Andrew Lunn 
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index adaaafc..b9a5a65 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -2401,15 +2401,10 @@ static int mtk_probe(struct platform_device *pdev)
>  {
>   struct resource *res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>   struct device_node *mac_np;
> - const struct of_device_id *match;
> - struct mtk_soc_data *soc;
>   struct mtk_eth *eth;
>   int err;
>   int i;
>  
> - match = of_match_device(of_mtk_match, &pdev->dev);
> - soc = (struct mtk_soc_data *)match->data;
> -
>   eth = devm_kzalloc(&pdev->dev, sizeof(*eth), GFP_KERNEL);
>   if (!eth)
>   return -ENOMEM;

Re: [PATCH iproute2] ip: change flag names to an array

2017-07-08 Thread Lorenzo Colitti

On Sat, Jul 8, 2017 at 12:39 AM, Stephen Hemminger
 wrote:
> For the most of the address flags, use a table of bit values rather
> than open coding every value.  This allows for easier inevitable
> expansion of flags.

Thanks for doing this.

> +static unsigned int get_ifa_flag_mask(const char *name)
> +{
> +   unsigned int i;
> +
> +   for (i = 0; i < ARRAY_SIZE(ifa_flag_names); i++) {
> +   if (!strcmp(name, ifa_flag_names[i]))
> +   return 1u << i;
> +   }
> +   return 0;
> +}

It looks like the user can specify things such as "-deprecated".
Perhaps that sort of thing can be handled by this function too?
Instead of returning an unsigned int, this function could be void and
take a flag value and a mask as output parameters. If the flag name
starts with "-" it would set the value to 0, and if it doesn't it
would set the value to 1.

Otherwise, some of the "-"-prefixed parameters will either need to be
special-cased or cease to be supported. Specifically, I see this one:

> -   } else if (strcmp(*argv, "-tentative") == 0) {
> -   filter.flags &= ~IFA_F_TENTATIVE;
> -   filter.flagmask |= IFA_F_TENTATIVE;

this one:

> -   } else if (strcmp(*argv, "-deprecated") == 0) {
> -   filter.flags &= ~IFA_F_DEPRECATED;
> -   filter.flagmask |= IFA_F_DEPRECATED;

and this one:

> -   } else if (strcmp(*argv, "-dadfailed") == 0) {
> -   filter.flags &= ~IFA_F_DADFAILED;
> -   filter.flagmask |= IFA_F_DADFAILED;
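
A rough sketch of the idea, written as standalone userspace C rather than as a
patch against iproute2. The names get_ifa_flag(), apply_ifa_flag() and
struct flag_filter are made up for illustration, and the table contents and
order below are an assumption (index i is taken to correspond to bit i); the
real table is whatever the patch defines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Assumed table: entry i names bit (1u << i). Illustrative subset only. */
static const char *const ifa_flag_names[] = {
    "secondary", "nodad", "optimistic", "dadfailed",
    "home", "deprecated", "tentative", "permanent",
};

/* Hypothetical stand-in for the two filter fields touched in the diff. */
struct flag_filter {
    unsigned int flags;
    unsigned int flagmask;
};

/*
 * Look up a flag name, optionally prefixed with "-".
 * *value is 0 for "-name" and 1 for "name"; *mask is the matching bit,
 * or 0 if the name is not in the table.
 */
static void get_ifa_flag(const char *name, unsigned int *value,
                         unsigned int *mask)
{
    unsigned int i;

    assert(name && value && mask);

    *value = 1;
    *mask = 0;
    if (name[0] == '-') {
        *value = 0;
        name++;
    }

    for (i = 0; i < ARRAY_SIZE(ifa_flag_names); i++) {
        if (!strcmp(name, ifa_flag_names[i])) {
            *mask = 1u << i;
            return;
        }
    }
}

/* Apply one command-line word to the filter; returns -1 if it is not a flag. */
static int apply_ifa_flag(struct flag_filter *f, const char *arg)
{
    unsigned int value, mask;

    get_ifa_flag(arg, &value, &mask);
    if (!mask)
        return -1;

    if (value)
        f->flags |= mask;
    else
        f->flags &= ~mask;
    f->flagmask |= mask;
    return 0;
}
```

With something like this, "-tentative", "-deprecated" and "-dadfailed" go
through the same table lookup as the positive forms, and the caller only has
to fall through to the remaining keywords when apply_ifa_flag() returns -1.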
