Re: [net-next, RFC, 4/8] net: core: add recycle capabilities on skbs via page_pool API

2018-12-08 Thread Willy Tarreau
On Sat, Dec 08, 2018 at 10:14:47PM +0200, Ilias Apalodimas wrote: > On Sat, Dec 08, 2018 at 09:11:53PM +0100, Jesper Dangaard Brouer wrote: > > > > > > I want to make sure you guys thought about splice() stuff, and > > > skb_try_coalesce(), and GRO, and skb cloning, and ... > > > > Thanks for

Re: bring back IPX and NCPFS, please!

2018-11-09 Thread Willy Tarreau
On Fri, Nov 09, 2018 at 06:30:14PM +0100, Johannes C. Schulz wrote: > Hello Willy, hello Stephen > > Thankyou for your reply. > But I'm not able to maintain or code these modules. I'm just a bloody > user/webdev. That's what we've all claimed before taking over something many years ago you know

Re: bring back IPX and NCPFS, please!

2018-11-09 Thread Willy Tarreau
On Fri, Nov 09, 2018 at 02:23:27PM +0100, Johannes C. Schulz wrote: > Hello all! > > I like to please you to bring back IPX and NCPFS modules to the kernel. > Whyever my admins using Novell-shares on our network which I'm not be > able to use anymore - I'm forced to use cifs instead (and the

Re: Kernel Panic on high bandwidth transfer over wifi

2018-08-29 Thread Willy Tarreau
On Wed, Aug 29, 2018 at 11:42:44AM +, Nathaniel Munk wrote: > As you can see from the attached log You apparently forgot to attach the log. Willy

Re: how to (cross)connect two (physical) eth ports for ping test?

2018-08-18 Thread Willy Tarreau
On Sat, Aug 18, 2018 at 09:10:25PM +0200, Andrew Lunn wrote: > On Sat, Aug 18, 2018 at 01:39:50PM -0400, Robert P. J. Day wrote: > > > > (i'm sure this has been explained many times before, so a link > > covering this will almost certainly do just fine.) > > > > i want to loop one physical

Re: ANNOUNCE: Enhanced IP v1.4

2018-06-05 Thread Willy Tarreau
On Tue, Jun 05, 2018 at 02:33:03PM +0200, Bjørn Mork wrote: > > I do have IPv6 at home (a /48, waste of addressing space, I'd be fine > > with less), > > Any reason you would want less? Any reason the ISP should give you > less? What I mean is that *if* the availability of /48 networks was an

Re: ANNOUNCE: Enhanced IP v1.4

2018-06-03 Thread Willy Tarreau
On Sun, Jun 03, 2018 at 03:41:08PM -0700, Eric Dumazet wrote: > > > On 06/03/2018 01:37 PM, Tom Herbert wrote: > > > This is not an inconsequential mechanism that is being proposed. It's > > a modification to IP protocol that is intended to work on the > > Internet, but it looks like the draft

Re: ANNOUNCE: Enhanced IP v1.4

2018-06-02 Thread Willy Tarreau
On Sat, Jun 02, 2018 at 12:17:12PM -0400, Sam Patton wrote: > As far as application examples, check out this simple netcat-like > program I use for testing: > > https://github.com/EnIP/enhancedip/blob/master/userspace/netcat/netcat.c > > Lines 61-67 show how to connect directly via an EnIP

Re: ANNOUNCE: Enhanced IP v1.4

2018-06-01 Thread Willy Tarreau
Hello Sam, On Fri, Jun 01, 2018 at 09:48:28PM -0400, Sam Patton wrote: > Hello! > > If you do not know what Enhanced IP is, read this post on netdev first: > > https://www.spinics.net/lists/netdev/msg327242.html > > > The Enhanced IP project presents: > > Enhanced IP v1.4 > >

Re: Request for -stable inclusion: time stamping fix for nfp

2018-05-17 Thread Willy Tarreau
Adding Greg here. Greg, apparently a backport of 46f1c52e66db is needed in 4.9 according to the thread below. It was merged in 4.13 so 4.14 already has it. Willy On Thu, May 17, 2018 at 02:09:03PM -0400, David Miller wrote: > From: Guillaume Nault > Date: Thu, 17 May 2018

Re: [PATCH linux-stable-4.14] tcp: reset sk_send_head in tcp_write_queue_purge

2018-03-20 Thread Willy Tarreau
Hi David, regarding the patch below, I'm not certain whether you planned to take it since it's marked "not applicable" on patchwork, but I suspect it's only because it doesn't apply to mainline. However, please note that there are two typos in commit IDs referenced in the commit message that

Re: [PATCH stable 4.9 1/8] x86: bpf_jit: small optimization in emit_bpf_tail_call()

2018-01-29 Thread Willy Tarreau
Hi Eric, On Mon, Jan 29, 2018 at 06:04:30AM -0800, Eric Dumazet wrote: > > If these 4 bytes matter, why not use > > cmpq with an immediate value instead, which saves 2 extra bytes ? : > > > > - the mov above is 11 bytes total : > > > >0: 48 8b 84 d6 78 56 34mov

Re: [PATCH stable 4.9 1/8] x86: bpf_jit: small optimization in emit_bpf_tail_call()

2018-01-28 Thread Willy Tarreau
Hi, [ replaced stable@ and greg@ by netdev@ as my question below is not relevant to stable ] On Mon, Jan 29, 2018 at 02:48:54AM +0100, Daniel Borkmann wrote: > From: Eric Dumazet > > [ upstream commit 84ccac6e7854ebbfb56d2fc6d5bef9be49bb304c ] > > Saves 4 bytes

Re: TCP many-connection regression between 4.7 and 4.13 kernels.

2018-01-22 Thread Willy Tarreau
Hi Eric, On Mon, Jan 22, 2018 at 10:16:06AM -0800, Eric Dumazet wrote: > On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote: > > My test case is to have 6 processes each create 5000 TCP IPv4 connections > > to each other > > on a system with 16GB RAM and send slow-speed data. This works fine

Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-07 Thread Willy Tarreau
On Sun, Jan 07, 2018 at 12:17:11PM -0800, Linus Torvalds wrote: > We need to fix the security problem, but we need to do it *without* > these braindead arguments that performance is somehow secondary. OK OK. At least we should have security by default and let people trade it against performance

Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-07 Thread Willy Tarreau
On Sun, Jan 07, 2018 at 11:47:07AM -0800, Linus Torvalds wrote: > And the whole "normal people won't even notice" is pure garbage too. > Don't spread that bullshit when you see actual normal people > complaining. > > Performance matters. A *LOT*. Linus, no need to explain that to me, I'm

Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-06 Thread Willy Tarreau
On Sat, Jan 06, 2018 at 07:38:14PM -0800, Alexei Starovoitov wrote: > yep. plenty of unknowns and what's happening now is an overreaction. To be fair there's overreaction on both sides. The vast majority of users need to get a 100% safe system and will never notice any difference. A few of us

Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

2018-01-06 Thread Willy Tarreau
On Sat, Jan 06, 2018 at 06:38:59PM +, Alan Cox wrote: > Normally people who propose security fixes don't have to argue about the > fact they added 30 clocks to avoid your box being 0wned. In fact it depends, because if a fix makes the system unusable for its initial purpose, this fix will

Re: BUG warnings in 4.14.9

2017-12-26 Thread Willy Tarreau
Guys, Chris reported the bug below and confirmed that reverting commit 9704f81 (ipv6: grab rt->rt6i_ref before allocating pcpu rt) seems to have fixed the issue for him. This patch is a94b9367 in mainline. I personally have no opinion on the patch, just found it because it was the only one

Re: [PATCH net 0/3] Few mvneta fixes

2017-12-19 Thread Willy Tarreau
Hi Arnd, On Tue, Dec 19, 2017 at 09:18:35PM +0100, Arnd Bergmann wrote: > On Tue, Dec 19, 2017 at 5:59 PM, Gregory CLEMENT > wrote: > > Hello, > > > > here it is a small series of fixes found on the mvneta driver. They > > had been already used in the vendor

Re: [PATCH] net: bridge: add max_fdb_count

2017-11-17 Thread Willy Tarreau
Hi Andrew, On Fri, Nov 17, 2017 at 03:06:23PM +0100, Andrew Lunn wrote: > > Usually it's better to apply LRU or random here in my opinion, as the > > new entry is much more likely to be needed than older ones by definition. > > Hi Willy > > I think this depends on why you need to discard. If it

Re: [PATCH] net: bridge: add max_fdb_count

2017-11-16 Thread Willy Tarreau
Hi Stephen, On Thu, Nov 16, 2017 at 04:27:18PM -0800, Stephen Hemminger wrote: > On Thu, 16 Nov 2017 21:21:55 +0100 > Vincent Bernat wrote: > > > ? 16 novembre 2017 20:23 +0100, Andrew Lunn  : > > > > > struct net_bridge_fdb_entry is 40 bytes. > > > > > > My

Re: [PATCH] net: bridge: add max_fdb_count

2017-11-16 Thread Willy Tarreau
Hi Sarah, On Thu, Nov 16, 2017 at 01:20:18AM -0800, Sarah Newman wrote: > I note that anyone who would run up against a too-low limit on the maximum > number of fdb entries would also be savvy enough to fix it in a matter of > minutes. I disagree on this point. There's a huge difference between

Re: [PATCH] net: recvmsg: Unconditionally zero struct sockaddr_storage

2017-11-01 Thread Willy Tarreau
On Tue, Oct 31, 2017 at 09:14:45AM -0700, Kees Cook wrote: > diff --git a/net/socket.c b/net/socket.c > index c729625eb5d3..34183f4fbdf8 100644 > --- a/net/socket.c > +++ b/net/socket.c > @@ -2188,6 +2188,7 @@ static int ___sys_recvmsg(struct socket *sock, struct > user_msghdr __user *msg, >

Re: net.ipv4.tcp_max_syn_backlog implementation

2017-08-28 Thread Willy Tarreau
On Mon, Aug 28, 2017 at 11:47:41PM -0400, Harsha Chenji wrote: > So I have ubuntu 12.04 x32 in a VM with syncookies turned off. I tried > to do a syn flood (with netwox) on 3 different processes. Each of them > returns a different value with netstat -na | grep -c RECV : > > nc -l returns 16

Re: [PATCH net 3/3] tcp: fix xmit timer to only be reset if data ACKed/SACKed

2017-08-06 Thread Willy Tarreau
On Sun, Aug 06, 2017 at 07:39:57AM +, maowenan wrote: > > > > -Original Message- > > From: Willy Tarreau [mailto:w...@1wt.eu] > > Sent: Saturday, August 05, 2017 2:19 AM > > To: Neal Cardwell > > Cc: maowenan; David Miller; netdev@vger.kernel.org;

Re: [PATCH net 3/3] tcp: fix xmit timer to only be reset if data ACKed/SACKed

2017-08-04 Thread Willy Tarreau
On Fri, Aug 04, 2017 at 02:01:34PM -0400, Neal Cardwell wrote: > On Fri, Aug 4, 2017 at 1:10 PM, Willy Tarreau <w...@1wt.eu> wrote: > > Hi Neal, > > > > On Fri, Aug 04, 2017 at 12:59:51PM -0400, Neal Cardwell wrote: > >> I have attached patches for this fix reba

Re: [PATCH net 3/3] tcp: fix xmit timer to only be reset if data ACKed/SACKed

2017-08-04 Thread Willy Tarreau
Hi Neal, On Fri, Aug 04, 2017 at 12:59:51PM -0400, Neal Cardwell wrote: > I have attached patches for this fix rebased on to v3.10.107, the > latest stable release for 3.10. That's pretty far back in history, so > there were substantial conflict resolutions and adjustments required. > :-) Hope

Re: STABLE: net: reduce skb_warn_bad_offload() noise

2017-07-29 Thread Willy Tarreau
On Fri, Jul 28, 2017 at 10:22:52PM -0700, Eric Dumazet wrote: > On Fri, 2017-07-28 at 12:30 -0700, David Miller wrote: > > From: Mark Salyzyn > > Date: Fri, 28 Jul 2017 10:29:57 -0700 > > > > > Please backport the upstream patch to the stable trees (including > > > 3.10.y,

Re: TCP fast retransmit issues

2017-07-28 Thread Willy Tarreau
On Fri, Jul 28, 2017 at 08:36:49AM +0200, Klavs Klavsen wrote: > The network guys know what caused it. > > Appearently on (atleast some) Cisco equipment the feature: > > TCP Sequence Number Randomization > > is enabled by default. I didn't want to suggest names but since you did it first ;-)

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 07:32:12AM -0700, Eric Dumazet wrote: > On Wed, 2017-07-26 at 15:42 +0200, Willy Tarreau wrote: > > On Wed, Jul 26, 2017 at 06:31:21AM -0700, Eric Dumazet wrote: > > > On Wed, 2017-07-26 at 14:18 +0200, Klavs Klavsen wrote: > > > > the 1

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 04:25:29PM +0200, Klavs Klavsen wrote: > Thank you very much guys for your insight.. its highly appreciated. > > Next up for me, is waiting till the network guys come back from summer > vacation, and convince them to sniff on the devices in between to pinpoint > the

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 04:08:19PM +0200, Klavs Klavsen wrote: > Grabbed on both ends. > > http://blog.klavsen.info/fast-retransmit-problem-junos-linux (updated to new > dump - from client scp'ing) > http://blog.klavsen.info/fast-retransmit-problem-junos-linux-receiving-side > (receiving host)

Re: TCP fast retransmit issues

2017-07-26 Thread Willy Tarreau
On Wed, Jul 26, 2017 at 06:31:21AM -0700, Eric Dumazet wrote: > On Wed, 2017-07-26 at 14:18 +0200, Klavs Klavsen wrote: > > the 192.168.32.44 is a Centos 7 box. > > Could you grab a capture on this box, to see if the bogus packets are > sent by it, or later mangled by a middle box ? Given the

Re: [PATCH RFC 0/2] kproxy: Kernel Proxy

2017-06-29 Thread Willy Tarreau
On Thu, Jun 29, 2017 at 04:43:28PM -0700, Tom Herbert wrote: > On Thu, Jun 29, 2017 at 1:58 PM, Willy Tarreau <w...@1wt.eu> wrote: > > On Thu, Jun 29, 2017 at 01:40:26PM -0700, Tom Herbert wrote: > >> > In fact that's not much what I observe in field. In practice

Re: [PATCH RFC 0/2] kproxy: Kernel Proxy

2017-06-29 Thread Willy Tarreau
On Thu, Jun 29, 2017 at 01:40:26PM -0700, Tom Herbert wrote: > > In fact that's not much what I observe in field. In practice, large > > data streams are cheaply relayed using splice(), I could achieve > > 60 Gbps of HTTP forwarding via HAProxy on a 4-core xeon 2 years ago. > > And when you use

Re: [PATCH RFC 0/2] kproxy: Kernel Proxy

2017-06-29 Thread Willy Tarreau
Hi Tom, On Thu, Jun 29, 2017 at 11:27:03AM -0700, Tom Herbert wrote: > Sidecar proxies are becoming quite popular on server as a means to > perform layer 7 processing on application data as it is sent. Such > sidecars are used for SSL proxies, application firewalls, and L7 > load balancers. While

Re: [PATCH net-next 1/2] tcp: remove per-destination timestamp cache

2017-03-16 Thread Willy Tarreau
Hi Neal, On Thu, Mar 16, 2017 at 11:40:52AM -0400, Neal Cardwell wrote: > On Thu, Mar 16, 2017 at 7:31 AM, Lutz Vieweg <l...@5t9.de> wrote: > > > > On 03/15/2017 11:55 PM, Willy Tarreau wrote: > >> > >> At least I can say I've seen many people enable it

Re: [PATCH net-next 1/2] tcp: remove per-destination timestamp cache

2017-03-15 Thread Willy Tarreau
Hi David, On Wed, Mar 15, 2017 at 03:40:44PM -0700, David Miller wrote: > From: Soheil Hassas Yeganeh > Date: Wed, 15 Mar 2017 16:30:45 -0400 > > > Note that this cache was already broken for caching timestamps of > > multiple machines behind a NAT sharing the same

Re: net: BUG in unix_notinflight

2017-03-07 Thread Willy Tarreau
On Wed, Mar 08, 2017 at 12:23:56AM +0200, Nikolay Borisov wrote: > > >> > >> > >> New report from linux-next/c0b7b2b33bd17f7155956d0338ce92615da686c9 > >> > >> [ cut here ] > >> kernel BUG at net/unix/garbage.c:149! > >> invalid opcode: [#1] SMP KASAN > >> Dumping

Re: [4.9.10] ip_route_me_harder() reading off-slab

2017-02-16 Thread Willy Tarreau
On Fri, Feb 17, 2017 at 01:34:11PM +0800, Daniel J Blueman wrote: > When booting a VM in libvirt/KVM attached to a local bridge and KASAN > enabled on 4.9.10, we see a stream of KASAN warnings about off-slab > access [1]. Did it start to appear with 4.9.10 or is 4.9.10 the first 4.9 kernel you

[PATCH 3.10 257/319] net: sctp, forbid negative length

2017-02-05 Thread Willy Tarreau
ft.net> Cc: linux-s...@vger.kernel.org Cc: netdev@vger.kernel.org Acked-by: Neil Horman <nhor...@tuxdriver.com> Signed-off-by: David S. Miller <da...@davemloft.net> Signed-off-by: Willy Tarreau <w...@1wt.eu> --- net/sctp/socket.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) dif

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-25 Thread Willy Tarreau
On Wed, Jan 25, 2017 at 12:22:05PM -0500, David Miller wrote: > From: Wei Wang > Date: Wed, 25 Jan 2017 09:15:34 -0800 > > > Looks like you sent a separate patch on top of this patch series to > > address double connect(). Then I think this patch series should be > > good

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-25 Thread Willy Tarreau
Hi Wei, On Wed, Jan 25, 2017 at 09:15:34AM -0800, Wei Wang wrote: > Willy, > > Looks like you sent a separate patch on top of this patch series to address > double connect(). Yes, sorry, I wanted to reply to this thread after the git-send-email and got caught immediately after :-) So as

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-24 Thread Willy Tarreau
On Tue, Jan 24, 2017 at 10:51:25AM -0800, Eric Dumazet wrote: > We do not return -1 / EINPROGRESS but 0 > > Do not call connect() twice, it is clearly not supposed to work. Yes it is, it normally returns -1 / EISCONN on a regular socket : EISCONN The socket is already
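
For reference, the behaviour described above for a regular socket can be checked with a small user-space sketch. It assumes a blocking TCP socket connecting to a listener on loopback; the helper name second_connect_errno() is made up for this illustration and is not part of the patch series under discussion.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connects a blocking TCP socket to a loopback
 * listener, then calls connect() a second time on the same socket.
 * Returns the errno of the second call, 0 if it unexpectedly
 * succeeded, or -1 if the setup itself failed.
 */
int second_connect_errno(void)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int lfd, cfd, ret = -1;

	lfd = socket(AF_INET, SOCK_STREAM, 0);
	if (lfd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0; /* let the kernel pick a free port */

	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 1) < 0 ||
	    getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) {
		close(lfd);
		return -1;
	}

	cfd = socket(AF_INET, SOCK_STREAM, 0);
	if (cfd < 0) {
		close(lfd);
		return -1;
	}

	/* first connect(): completes against the listen backlog */
	if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
		/* second connect() on the now established socket */
		if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
			ret = 0;
		else
			ret = errno;
	}

	close(cfd);
	close(lfd);
	return ret;
}
```

On Linux the returned value is expected to be EISCONN, which is what lets an application use a second connect() as a cheap probe of the connection state.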

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-24 Thread Willy Tarreau
On Tue, Jan 24, 2017 at 09:42:07AM -0800, Eric Dumazet wrote: > On Tue, 2017-01-24 at 09:26 -0800, Yuchung Cheng wrote: > > > > > > > Do you think there's a compelling reason for adding a new option or > > > are you interested in a small patch to perform the change above ? > > I like the proposal

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-24 Thread Willy Tarreau
Hi Eric, On Tue, Jan 24, 2017 at 09:44:49AM -0800, Eric Dumazet wrote: > I believe there is a bug in this application. > > It does not check connect() return value. Yes in fact it does but I noticed the same thing, there's something causing the event not to be registered or something like this.

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-23 Thread Willy Tarreau
On Mon, Jan 23, 2017 at 10:59:22AM -0800, Wei Wang wrote: > This patch adds a new socket option, TCP_FASTOPEN_CONNECT, as an > alternative way to perform Fast Open on the active side (client). Wei, I think that nothing prevents from reusing the original TCP_FASTOPEN sockopt instead of adding a new

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-23 Thread Willy Tarreau
On Mon, Jan 23, 2017 at 02:57:31PM -0800, Wei Wang wrote: > Yes. That seems to be a valid fix to it. > Let me try it with my existing test cases as well to see if it works for > all scenarios I have. Perfect. Note that since the state 2 is transient I initially thought about abusing the flags

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-23 Thread Willy Tarreau
On Mon, Jan 23, 2017 at 11:01:21PM +0100, Willy Tarreau wrote: > On Mon, Jan 23, 2017 at 10:37:32PM +0100, Willy Tarreau wrote: > > On Mon, Jan 23, 2017 at 01:28:53PM -0800, Wei Wang wrote: > > > Hi Willy, > > > > > > True. If you call connect() multiple

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-23 Thread Willy Tarreau
On Mon, Jan 23, 2017 at 10:37:32PM +0100, Willy Tarreau wrote: > On Mon, Jan 23, 2017 at 01:28:53PM -0800, Wei Wang wrote: > > Hi Willy, > > > > True. If you call connect() multiple times on a socket which already has > > cookie without a write(), the second and onward

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-23 Thread Willy Tarreau
On Mon, Jan 23, 2017 at 01:28:53PM -0800, Wei Wang wrote: > Hi Willy, > > True. If you call connect() multiple times on a socket which already has > cookie without a write(), the second and onward connect() call will return > EINPROGRESS. > It is basically because the following code block in

Re: [PATCH net-next 3/3] net/tcp-fastopen: Add new API support

2017-01-23 Thread Willy Tarreau
Hi Wei, first, thanks a lot for doing this, it's really awesome! I'm testing it on 4.9 on haproxy and I met a corner case : when I perform a connect() to a server and I have nothing to send, upon POLLOUT notification since I have nothing to send I simply probe the connection using connect()

Re: Misalignment, MIPS, and ip_hdr(skb)->version

2016-12-11 Thread Willy Tarreau
On Sun, Dec 11, 2016 at 03:50:31PM +0100, Jason A. Donenfeld wrote: > 3. Add 3 bytes of padding, set to zero, to the encrypted section just > before the IP header, marked for future use. > Pros: satisfies IETF mantras, can use those extra bits in the future > for interesting protocol extensions

Re: Misalignment, MIPS, and ip_hdr(skb)->version

2016-12-11 Thread Willy Tarreau
Hi Jason, On Thu, Dec 08, 2016 at 11:20:04PM +0100, Jason A. Donenfeld wrote: > Hi David, > > On Thu, Dec 8, 2016 at 1:37 AM, David Miller wrote: > > You really have to land the IP header on a proper 4 byte boundary. > > > > I would suggest pushing 3 dummy garbage bytes of

Re: [ANNOUNCE] ndiv: line-rate network traffic processing

2016-09-21 Thread Willy Tarreau
Hi Tom, On Wed, Sep 21, 2016 at 10:16:45AM -0700, Tom Herbert wrote: > This does seem interesting and indeed the driver datapath looks very > much like XDP. It would be quite interesting if you could rebase and > then maybe look at how this can work with XDP that would be helpful. OK I'll assign

Re: [ANNOUNCE] ndiv: line-rate network traffic processing

2016-09-21 Thread Willy Tarreau
Hi Jesper! On Wed, Sep 21, 2016 at 06:26:39PM +0200, Jesper Dangaard Brouer wrote: > I definitely want to study it! Great, at least I've not put this online for nothing :-) > You mention XDP. If you didn't notice, I've created some documentation > on XDP (it is very "live" documentation at

[ANNOUNCE] ndiv: line-rate network traffic processing

2016-09-21 Thread Willy Tarreau
Hi, Over the last 3 years I've been working a bit on high traffic processing for various reasons. It started with the wish to capture line-rate GigE traffic on very small fanless ARM machines and the framework has evolved to be used at my company as a basis for our anti-DDoS engine capable of

Re: [PATCH net] net: mvneta: set real interrupt per packet for tx_done

2016-07-05 Thread Willy Tarreau
some users for testing. I also remember that on more recent kernels by then (>=3.13) we observed a slightly better performance with this value set to zero. Acked-by: Willy Tarreau <w...@1wt.eu> Willy

[PATCH 3.10 128/143] VSOCK: do not disconnect socket when peer has shutdown SEND only

2016-06-05 Thread Willy Tarreau
vmware.com> Cc: Jorgen Hansen <jhan...@vmware.com> Cc: Adit Ranadive <ad...@vmware.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <da...@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org> Signed-off-by: Willy Tarreau <w...@1wt.

Re: [PATCH] nf_conntrack: avoid kernel pointer value leak in slab name

2016-05-14 Thread Willy Tarreau
On Sat, May 14, 2016 at 03:21:31PM -0700, Linus Torvalds wrote: > On Sat, May 14, 2016 at 2:33 PM, Willy Tarreau <w...@1wt.eu> wrote: > > > > Why simply not cast the atomic to (unsigned long long) instead of (u64) > > so that %llu always matches ? > > Yes, that
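
The cast suggested above can be illustrated in user space. This is only a sketch: format_slab_name() is a hypothetical helper, uint64_t stands in for the kernel's u64, and the "nf_conntrack_%llu" format string is an assumption combining the existing prefix with the %llu mentioned in the thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: builds a name from a 64-bit identifier.
 * uint64_t may be "unsigned long" or "unsigned long long" depending
 * on the architecture, so it is cast explicitly to unsigned long long
 * to make sure it always matches the %llu conversion.
 */
int format_slab_name(char *buf, size_t size, uint64_t id)
{
	return snprintf(buf, size, "nf_conntrack_%llu",
			(unsigned long long)id);
}
```

With the explicit cast the same format string is valid on both 32-bit and 64-bit builds, without needing a per-architecture format macro.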

Re: [PATCH] nf_conntrack: avoid kernel pointer value leak in slab name

2016-05-14 Thread Willy Tarreau
On Sat, May 14, 2016 at 02:31:04PM -0700, Linus Torvalds wrote: > On Sat, May 14, 2016 at 11:24 AM, Linus Torvalds > wrote: > > > > > > - net->ct.slabname = kasprintf(GFP_KERNEL, "nf_conntrack_%p", net); > > + net->ct.slabname = kasprintf(GFP_KERNEL,

Re: [PATCH 3.2 085/115] veth: don't modify ip_summed; doing so treats packets with bad checksums as good.

2016-04-30 Thread Willy Tarreau
On Sat, Apr 30, 2016 at 03:43:51PM -0700, Ben Greear wrote: > On 04/30/2016 03:01 PM, Vijay Pandurangan wrote: > > Consider: > > > > - App A sends out corrupt packets 50% of the time and discards inbound > > data. (...) > How can you make a generic app C know how to do this? The path could be,

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-25 Thread Willy Tarreau
Hi Eric, On Thu, Mar 24, 2016 at 11:49:41PM -0700, Eric Dumazet wrote: > Everything is possible, but do not complain because BPF went in the > kernel before your changes. Don't get me wrong, I'm not complaining, I'm more asking for help to try to elaborate the alternate solution. I understood

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
On Thu, Mar 24, 2016 at 04:54:03PM -0700, Tom Herbert wrote: > On Thu, Mar 24, 2016 at 4:40 PM, Yann Ylavic wrote: > > I'll learn how to do this to get the best performances from the > > server, but having to do so to work around what looks like a defect > > (for

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
On Thu, Mar 24, 2016 at 11:20:49AM -0700, Tolga Ceylan wrote: > I would appreciate a conceptual description on how this would work > especially for a common scenario > as described by Willy. My initial impression was that a coordinator > (master) process takes this > responsibility to adjust BPF

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
On Thu, Mar 24, 2016 at 07:00:11PM +0100, Willy Tarreau wrote: > Since it's not about > load distribution and that processes are totally independant, I don't see > well how to (ab)use BPF to achieve this. > > The pattern is : > > t0 : unprivileged processes 1 and 2 are

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
On Thu, Mar 24, 2016 at 10:01:37AM -0700, Eric Dumazet wrote: > On Thu, 2016-03-24 at 17:50 +0100, Willy Tarreau wrote: > > On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote: > > > > --- a/net/ipv4/inet_hashtables.c > > > > +++ b/net/ipv4/inet_hash

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote: > > --- a/net/ipv4/inet_hashtables.c > > +++ b/net/ipv4/inet_hashtables.c > > @@ -189,6 +189,8 @@ static inline int compute_score(struct sock *sk, struct > > net *net, > > return -1; > >

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
Hi Eric, (just lost my e-mail, trying not to forget some points) On Thu, Mar 24, 2016 at 07:45:44AM -0700, Eric Dumazet wrote: > On Thu, 2016-03-24 at 15:22 +0100, Willy Tarreau wrote: > > Hi Eric, > > > But that means that any software making use of SO_REUSEPORT needs to >

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
Hi Eric, On Thu, Mar 24, 2016 at 07:13:33AM -0700, Eric Dumazet wrote: > On Thu, 2016-03-24 at 07:12 +0100, Willy Tarreau wrote: > > Hi, > > > > On Wed, Mar 23, 2016 at 10:10:06PM -0700, Tolga Ceylan wrote: > > > I apologize for not properly following up on th

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2016-03-24 Thread Willy Tarreau
Hi, On Wed, Mar 23, 2016 at 10:10:06PM -0700, Tolga Ceylan wrote: > I apologize for not properly following up on this. I had the > impression that we did not want to merge my original patch and then I > also noticed that it fails to keep the hash consistent. Recently, I > read the follow ups on

Re: [PATCH v2 net-next 0/8] API set for HW Buffer management

2016-02-17 Thread Willy Tarreau
ere, we don't really care about a possible 1% performance drop. I'll try to provide more results as time permits. In the mean time if you want (or plan to submit a next batch), feel free to add a Tested-by: Willy Tarreau <w...@1wt.eu>. cheers, Willy

Re: [PATCH v2] unix: properly account for FDs passed over unix sockets

2016-02-02 Thread Willy Tarreau
On Tue, Feb 02, 2016 at 09:32:56PM +0100, Hannes Frederic Sowa wrote: > But "struct pid *" in unix_skb_parms should be enough to get us to > corresponding "struct cred *" so we can decrement the correct counter > during skb destruction. > > So: > > We increment current task's unix_inflight and

Re: [PATCH v2] unix: properly account for FDs passed over unix sockets

2016-02-02 Thread Willy Tarreau
On Tue, Feb 02, 2016 at 12:53:20PM -0800, Linus Torvalds wrote: > On Tue, Feb 2, 2016 at 12:49 PM, Willy Tarreau <w...@1wt.eu> wrote: > > On Tue, Feb 02, 2016 at 12:44:54PM -0800, Linus Torvalds wrote: > >> > >> Umm. I think the "struct cred" may ch

Re: [PATCH v2] unix: properly account for FDs passed over unix sockets

2016-02-02 Thread Willy Tarreau
On Tue, Feb 02, 2016 at 12:44:54PM -0800, Linus Torvalds wrote: > On Tue, Feb 2, 2016 at 12:32 PM, Hannes Frederic Sowa > wrote: > > But "struct pid *" in unix_skb_parms should be enough to get us to > > corresponding "struct cred *" so we can decrement the correct

Re: Mis-backport in af_unix patch for Linux 3.10.95

2016-01-24 Thread Willy Tarreau
ly due to the patch being mis-applied. In unix_stream_recvmsg(), it's still used as well. Does the attached patch seem better to you (not compile-tested) ? Greg/Ben, both 3.2.76 and 3.14.59 are OK regarding this, it seems like only 3.10.95 was affected. Thanks, Willy From 77f6e82adf349cbccf7e2

Re: [PATCH net] af_unix: fix struct pid memory leak

2016-01-24 Thread Willy Tarreau
Hi Eric, On Sun, Jan 24, 2016 at 01:53:50PM -0800, Eric Dumazet wrote: > From: Eric Dumazet > > Dmitry reported a struct pid leak detected by a syzkaller program. > > Bug happens in unix_stream_recvmsg() when we break the loop when a > signal is pending, without properly

Re: struct pid memory leak

2016-01-23 Thread Willy Tarreau
Hi Eric, Dmitry, On Fri, Jan 22, 2016 at 08:50:01AM -0800, Eric Dumazet wrote: > CC netdev, as it looks some af_unix issue ... > > On Fri, 2016-01-22 at 16:08 +0100, Dmitry Vyukov wrote: > > Hello, > > > > The following program causes struct pid memory leak: > > > > // autogenerated by

Re: struct pid memory leak

2016-01-23 Thread Willy Tarreau
On Sat, Jan 23, 2016 at 07:14:33PM +0100, Dmitry Vyukov wrote: > I've attached my .config. > Also run this program in a parallel loop. I think it's leaking not > every time, probably some race is involved. Thank you. Just in order to confirm, am I supposed to see the messages you quoted in dmesg

Re: struct pid memory leak

2016-01-23 Thread Willy Tarreau
On Sat, Jan 23, 2016 at 07:46:45PM +0100, Dmitry Vyukov wrote: > On Sat, Jan 23, 2016 at 7:40 PM, Willy Tarreau <w...@1wt.eu> wrote: > > On Sat, Jan 23, 2016 at 07:14:33PM +0100, Dmitry Vyukov wrote: > >> I've attached my .config. > >> Also run this program in a pa

Re: struct pid memory leak

2016-01-23 Thread Willy Tarreau
On Sat, Jan 23, 2016 at 06:50:11PM -0800, Eric Dumazet wrote: > On Sat, Jan 23, 2016 at 6:38 PM, Willy Tarreau <w...@1wt.eu> wrote: > > On Sun, Jan 24, 2016 at 03:11:45AM +0100, Willy Tarreau wrote: > >> It doesn't report this on 3.10. > > > > To be more precise

Re: struct pid memory leak

2016-01-23 Thread Willy Tarreau
On Sun, Jan 24, 2016 at 03:11:45AM +0100, Willy Tarreau wrote: > It doesn't report this on 3.10. To be more precise, kmemleak reports the issue on 3.13 and not on 3.12. I'm not sure if it's reliable enough to run a bisect though. Willy

Re: [PATCH] unix: properly account for FDs passed over unix sockets

2015-12-30 Thread Willy Tarreau
On Thu, Dec 31, 2015 at 03:08:53PM +0900, Tetsuo Handa wrote: > Willy Tarreau wrote: > > On Wed, Dec 30, 2015 at 09:58:42AM +0100, Hannes Frederic Sowa wrote: > > > The MSG_PEEK code should not be harmful and the patch is good as is. I > > > first understood from t

Re: [PATCH] unix: properly account for FDs passed over unix sockets

2015-12-30 Thread Willy Tarreau
On Wed, Dec 30, 2015 at 09:58:42AM +0100, Hannes Frederic Sowa wrote: > The MSG_PEEK code should not be harmful and the patch is good as is. I > first understood from the published private thread, that it is possible > for a program to exceed the rlimit of fds. But the DoS is only by > keeping

Re: [PATCH] unix: properly account for FDs passed over unix sockets

2015-12-29 Thread Willy Tarreau
On Tue, Dec 29, 2015 at 03:48:45PM +0100, Hannes Frederic Sowa wrote: > On 28.12.2015 15:14, Willy Tarreau wrote: > >It is possible for a process to allocate and accumulate far more FDs than > >the process' limit by sending them over a unix socket then closing them > >to keep

[PATCH] unix: properly account for FDs passed over unix sockets

2015-12-28 Thread Willy Tarreau
-privileged processes from having more FDs in flight than their configured FD limit. Reported-by: socketp...@gmail.com Suggested-by: Linus Torvalds <torva...@linux-foundation.org> Signed-off-by: Willy Tarreau <w...@1wt.eu> --- It would be nice if (if accepted) it would be backporte

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-21 Thread Willy Tarreau
On Mon, Dec 21, 2015 at 12:38:27PM -0800, Tom Herbert wrote: > On Fri, Dec 18, 2015 at 11:00 PM, Willy Tarreau <w...@1wt.eu> wrote: > > On Fri, Dec 18, 2015 at 06:38:03PM -0800, Eric Dumazet wrote: > >> On Fri, 2015-12-18 at 19:58 +0100, Willy Tarreau wrote: > >>

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-18 Thread Willy Tarreau
Hi Josh, On Fri, Dec 18, 2015 at 08:33:45AM -0800, Josh Snyder wrote: > I was also puzzled that binding succeeded. Looking into the code paths > involved, in inet_csk_get_port, we quickly goto have_snum. From there, we end > up dropping into tb_found. Since !hlist_empty(&tb->owners), we end up
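As background for this thread, a minimal sketch of the baseline SO_REUSEPORT behaviour on Linux: two listening sockets that both set the option can bind to the same address and port. This is not the drain-mode option proposed in the thread nor the scenario Josh describes, whose details are cut off above; the function names are hypothetical and the loopback address is an assumption chosen for the example.

```python
import socket


def reuseport_listener(port=0):
    """Create a TCP listener on loopback with SO_REUSEPORT set before bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(16)
    return s


def demo():
    """Bind two listeners to the same port and report whether both succeeded."""
    first = reuseport_listener()           # port 0: let the kernel pick a free port
    port = first.getsockname()[1]
    second = reuseport_listener(port)      # same port, allowed because both set the option
    same = second.getsockname()[1] == port
    first.close()
    second.close()
    return same
```

Without the option on both sockets, the second bind() would normally fail with EADDRINUSE.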

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-18 Thread Willy Tarreau
On Fri, Dec 18, 2015 at 06:38:03PM -0800, Eric Dumazet wrote: > On Fri, 2015-12-18 at 19:58 +0100, Willy Tarreau wrote: > > Hi Josh, > > > > On Fri, Dec 18, 2015 at 08:33:45AM -0800, Josh Snyder wrote: > > > I was also puzzled that binding succeeded. Looking into

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-16 Thread Willy Tarreau
Hi Eric, On Wed, Dec 16, 2015 at 08:38:14AM +0100, Willy Tarreau wrote: > On Tue, Dec 15, 2015 at 01:21:15PM -0800, Eric Dumazet wrote: > > On Tue, 2015-12-15 at 20:44 +0100, Willy Tarreau wrote: > > > > > Thus do you think it's worth adding a new option as Tolga pr

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-15 Thread Willy Tarreau
eed to add any new socket options nor anything. Please let me know what you think about it (patch attached), if it's accepted it's trivial to adapt haproxy to this new behaviour. Thanks! Willy From 7b79e362479fa7084798e6aa41da2a2045f0d6bb Mon Sep 17 00:00:00 2001 From: Willy Tarreau <w...@

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-15 Thread Willy Tarreau
On Tue, Dec 15, 2015 at 10:21:52AM -0800, Eric Dumazet wrote: > On Tue, 2015-12-15 at 18:43 +0100, Willy Tarreau wrote: > > > Ah ? but what does it bring in this case ? I'm not seeing it used > > anywhere on a listening socket. The code took care of not breaking > > th

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-15 Thread Willy Tarreau
On Tue, Dec 15, 2015 at 09:10:24AM -0800, Eric Dumazet wrote: > On Tue, 2015-12-15 at 17:14 +0100, Willy Tarreau wrote: > > Hi Eric, > > > > On Wed, Nov 11, 2015 at 05:09:01PM -0800, Eric Dumazet wrote: > > > On Wed, 2015-11-11 at 10:43 -0800, Eric Dumazet wrote: >

Re: [PATCH 1/1] net: Add SO_REUSEPORT_LISTEN_OFF socket option as drain mode

2015-12-15 Thread Willy Tarreau
On Tue, Dec 15, 2015 at 01:21:15PM -0800, Eric Dumazet wrote: > On Tue, 2015-12-15 at 20:44 +0100, Willy Tarreau wrote: > > > Thus do you think it's worth adding a new option as Tolga proposed ? > > > I thought we tried hard to avoid adding the option but determined

Please backport commit to 3.12+

2015-11-06 Thread Willy Tarreau
Hi, We recently faced the issue described in the patch below on 3.14.56. This fix was merged in 4.2-rc7. I checked Davem's queue and stable queue and it's not there yet. Could we please have it in 3.12 and above ? (feature was introduced in 3.11). I can confirm that it properly fixes the problem

Re: "ss -p" segfaults (updated to 4.2)

2015-10-12 Thread Willy Tarreau
On Mon, Oct 12, 2015 at 09:50:19AM -0700, Stephen Hemminger wrote: > Applied, and did some editing on commit msg Thank you Stephen! Willy

Re: "ss -p" segfaults (updated to 4.2)

2015-10-06 Thread Willy Tarreau
if (!(u = malloc(sizeof(*u)))) break; Also patched some other situations (strcpy and sprintf uses) that potentially produce the same results. Signed-off-by: Jose P Santos <j...@openmailbox.org> [ wt: made Jose's patch slightly simpler, all credits to him for the diag ] Signed

Re: NFS/TCP/IPv6 acting strangely in 4.2

2015-09-16 Thread Willy Tarreau
Hi, On Wed, Sep 16, 2015 at 06:53:57AM +, Damien Thébault wrote: > On Fri, 2015-09-11 at 12:38 +0100, Russell King - ARM Linux wrote: > > I have a recent Marvell Armada 388 board here which uses the mvneta > > driver. I'm seeing some weird effects with NFS with it acting as a > > client. >

Re: [PATCH][kernel 2.6.32] Bond interface can't send gratuitous ARP

2015-07-26 Thread Willy Tarreau
Hi Qingjie, On Mon, Jul 27, 2015 at 09:05:29AM +0800, ? wrote: Hi, Bond interface worked as Active-Backup mode. If the bond interface was added in bridge, then it's just a port of bridge. It doesn't have IP address. When bond slave changing, the current code bond_send_gratuitous_arp
