On Fri, Mar 25, 2016 at 1:00 PM, Eric Dumazet wrote:
> On Fri, 2016-03-25 at 12:31 -0400, Craig Gallek wrote:
>
>> I believe the issue here is that closing the listen sockets will drop
>> any connections that are in the listen queue but have not been
>> accepted yet. In the case of reuseport, you
On Fri, 2016-03-25 at 12:31 -0400, Craig Gallek wrote:
> I believe the issue here is that closing the listen sockets will drop
> any connections that are in the listen queue but have not been
> accepted yet. In the case of reuseport, you could in theory drain
> those queues into the non-closed so
On Fri, Mar 25, 2016 at 12:21 PM, Alexei Starovoitov
wrote:
> On Fri, Mar 25, 2016 at 11:29:10AM -0400, Craig Gallek wrote:
>> On Thu, Mar 24, 2016 at 2:00 PM, Willy Tarreau wrote:
>> > The pattern is :
>> >
>> > t0 : unprivileged processes 1 and 2 are listening to the same port
>> >(so
On Fri, Mar 25, 2016 at 11:29:10AM -0400, Craig Gallek wrote:
> On Thu, Mar 24, 2016 at 2:00 PM, Willy Tarreau wrote:
> > The pattern is :
> >
> > t0 : unprivileged processes 1 and 2 are listening to the same port
> >(sock1@pid1) (sock2@pid2)
> ><-- listening -->
> >
> >
On Thu, Mar 24, 2016 at 2:00 PM, Willy Tarreau wrote:
> The pattern is :
>
> t0 : unprivileged processes 1 and 2 are listening to the same port
>(sock1@pid1) (sock2@pid2)
><-- listening -->
>
> t1 : new processes are started to replace the old ones
>(sock1@pid1)
On Fri, 2016-03-25 at 12:21 +0100, Yann Ylavic wrote:
> Not my intention, you guys know what's best for the kernel and its APIs.
> My concern is (was) admittedly due to my own lack of knowledge of
> (e)BPF, hence how much of kernel internals I'd need to know to make
> the SO_REUSEPORT work i
On Fri, Mar 25, 2016 at 9:53 AM, Willy Tarreau wrote:
>
> On Thu, Mar 24, 2016 at 11:49:41PM -0700, Eric Dumazet wrote:
>> Everything is possible, but do not complain because BPF went in the
>> kernel before your changes.
>
> Don't get me wrong, I'm not complaining, I'm more asking for help to
> tr
Hi Eric,
On Thu, Mar 24, 2016 at 11:49:41PM -0700, Eric Dumazet wrote:
> Everything is possible, but do not complain because BPF went in the
> kernel before your changes.
Don't get me wrong, I'm not complaining, I'm more asking for help to
try to elaborate the alternate solution. I understood wel
On Fri, 2016-03-25 at 06:28 +0100, Willy Tarreau wrote:
> On Thu, Mar 24, 2016 at 04:54:03PM -0700, Tom Herbert wrote:
> > On Thu, Mar 24, 2016 at 4:40 PM, Yann Ylavic wrote:
> > > I'll learn how to do this to get the best performance from the
> > > server, but having to do so to work around what
On Thu, Mar 24, 2016 at 04:54:03PM -0700, Tom Herbert wrote:
> On Thu, Mar 24, 2016 at 4:40 PM, Yann Ylavic wrote:
> > I'll learn how to do this to get the best performance from the
> > server, but having to do so to work around what looks like a defect
> > (for simple/default SMP configurations
From: Yann Ylavic
Date: Thu, 24 Mar 2016 23:40:30 +0100
> On Thu, Mar 24, 2016 at 6:55 PM, Daniel Borkmann wrote:
>> On 03/24/2016 06:26 PM, Tom Herbert wrote:
>>>
>>> On Thu, Mar 24, 2016 at 10:01 AM, Eric Dumazet wrote:
Really, when BPF can be the solution, we won't allow adding new st
From: Eric Dumazet
Date: Thu, 24 Mar 2016 15:49:48 -0700
> That is why EBPF has LLVM backend.
>
> Basically you can write your "BPF" program in C, and let llvm convert it
> into EBPF.
>
> Sure, you still can write BPF manually, as you could write HTTPS server
> in assembly.
+1
On Fri, Mar 25, 2016 at 12:54 AM, Tom Herbert wrote:
> On Thu, Mar 24, 2016 at 4:40 PM, Yann Ylavic wrote:
>>
>> From this POV, draining the (ending) listeners is already non obvious
>> but might be reasonable, (e)BPF sounds really overkill.
>>
> Just the opposite, it's a simplification. With BPF
On Thu, Mar 24, 2016 at 4:40 PM, Yann Ylavic wrote:
> On Thu, Mar 24, 2016 at 11:49 PM, Eric Dumazet wrote:
>> On Thu, 2016-03-24 at 23:40 +0100, Yann Ylavic wrote:
>>
>>> FWIW, I find:
>>>
>>> const struct bpf_insn prog[] = {
>>> /* BPF_MOV64_REG(BPF_REG_6, BPF_REG_1) */
>>>
On Thu, Mar 24, 2016 at 11:49 PM, Eric Dumazet wrote:
> On Thu, 2016-03-24 at 23:40 +0100, Yann Ylavic wrote:
>
>> FWIW, I find:
>>
>> const struct bpf_insn prog[] = {
>> /* BPF_MOV64_REG(BPF_REG_6, BPF_REG_1) */
>> { BPF_ALU64 | BPF_MOV | BPF_X, BPF_REG_6, BPF_REG_1, 0, 0 },
>
On Thu, 2016-03-24 at 23:40 +0100, Yann Ylavic wrote:
> FWIW, I find:
>
> const struct bpf_insn prog[] = {
> /* BPF_MOV64_REG(BPF_REG_6, BPF_REG_1) */
> { BPF_ALU64 | BPF_MOV | BPF_X, BPF_REG_6, BPF_REG_1, 0, 0 },
> /* BPF_LD_ABS(BPF_W, 0) R0 = (uint32_t)skb[0] */
>
On Thu, Mar 24, 2016 at 6:55 PM, Daniel Borkmann wrote:
> On 03/24/2016 06:26 PM, Tom Herbert wrote:
>>
>> On Thu, Mar 24, 2016 at 10:01 AM, Eric Dumazet wrote:
>>>
>>> Really, when BPF can be the solution, we won't allow adding new stuff in
>>> the kernel in the old way.
>>
>> I completely agree wi
On Thu, 2016-03-24 at 11:20 -0700, Tolga Ceylan wrote:
> On Thu, Mar 24, 2016 at 10:55 AM, Daniel Borkmann
> wrote:
> > On 03/24/2016 06:26 PM, Tom Herbert wrote:
> >>
> >> I completely agree with this, but I wonder if we now need a repository
> >> of useful BPF modules. So in the case of impleme
On Thu, 2016-03-24 at 19:00 +0100, Willy Tarreau wrote:
> OK so this means we have to find a way to expand it to allow an individual
> non-privileged process to change the distribution algorithm without impacting
> other processes.
Just to clarify : Installing a BPF filter on a SO_REUSEPORT socke
On Thu, Mar 24, 2016 at 11:20:49AM -0700, Tolga Ceylan wrote:
> I would appreciate a conceptual description on how this would work
> especially for a common scenario
> as described by Willy. My initial impression was that a coordinator
> (master) process takes this
> responsibility to adjust BPF fi
On Thu, Mar 24, 2016 at 07:00:11PM +0100, Willy Tarreau wrote:
> Since it's not about
> load distribution and that processes are totally independent, I don't see
> well how to (ab)use BPF to achieve this.
>
> The pattern is :
>
> t0 : unprivileged processes 1 and 2 are listening to the same por
On Thu, Mar 24, 2016 at 10:55 AM, Daniel Borkmann wrote:
> On 03/24/2016 06:26 PM, Tom Herbert wrote:
>>
>> I completely agree with this, but I wonder if we now need a repository
>> of useful BPF modules. So in the case of implementing functionality
>> like in SO_REUSEPORT_LISTEN_OFF that might ju
On Thu, Mar 24, 2016 at 10:01:37AM -0700, Eric Dumazet wrote:
> On Thu, 2016-03-24 at 17:50 +0100, Willy Tarreau wrote:
> > On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote:
> > > > --- a/net/ipv4/inet_hashtables.c
> > > > +++ b/net/ipv4/inet_hashtables.c
> > > > @@ -189,6 +189,8 @@ sta
On 03/24/2016 06:26 PM, Tom Herbert wrote:
> On Thu, Mar 24, 2016 at 10:01 AM, Eric Dumazet wrote:
>> On Thu, 2016-03-24 at 17:50 +0100, Willy Tarreau wrote:
>>> On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote:
>>>>> --- a/net/ipv4/inet_hashtables.c
>>>>> +++ b/net/ipv4/inet_hashtables.c
>>>>> @@ -189,6 +1
On Thu, Mar 24, 2016 at 10:01 AM, Eric Dumazet wrote:
> On Thu, 2016-03-24 at 17:50 +0100, Willy Tarreau wrote:
>> On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote:
>> > > --- a/net/ipv4/inet_hashtables.c
>> > > +++ b/net/ipv4/inet_hashtables.c
>> > > @@ -189,6 +189,8 @@ static inline
On Thu, 2016-03-24 at 17:50 +0100, Willy Tarreau wrote:
> On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote:
> > > --- a/net/ipv4/inet_hashtables.c
> > > +++ b/net/ipv4/inet_hashtables.c
> > > @@ -189,6 +189,8 @@ static inline int compute_score(struct sock *sk,
> > > struct net *net,
>
On Thu, Mar 24, 2016 at 09:33:11AM -0700, Eric Dumazet wrote:
> > --- a/net/ipv4/inet_hashtables.c
> > +++ b/net/ipv4/inet_hashtables.c
> > @@ -189,6 +189,8 @@ static inline int compute_score(struct sock *sk, struct
> > net *net,
> > return -1;
> >
On Thu, 2016-03-24 at 16:30 +0100, Willy Tarreau wrote:
> Hi Eric,
>
> (just lost my e-mail, trying not to forget some points)
>
> On Thu, Mar 24, 2016 at 07:45:44AM -0700, Eric Dumazet wrote:
> > On Thu, 2016-03-24 at 15:22 +0100, Willy Tarreau wrote:
> > > Hi Eric,
> >
> > > But that means th
Hi Eric,
(just lost my e-mail, trying not to forget some points)
On Thu, Mar 24, 2016 at 07:45:44AM -0700, Eric Dumazet wrote:
> On Thu, 2016-03-24 at 15:22 +0100, Willy Tarreau wrote:
> > Hi Eric,
>
> > But that means that any software making use of SO_REUSEPORT needs to
> > also implement BPF
On Thu, 2016-03-24 at 15:22 +0100, Willy Tarreau wrote:
> Hi Eric,
> But that means that any software making use of SO_REUSEPORT needs to
> also implement BPF on Linux to achieve the same as what it does on
> other OSes ? Also I found a case where a dying process would still
> cause trouble in the
Hi Eric,
On Thu, Mar 24, 2016 at 07:13:33AM -0700, Eric Dumazet wrote:
> On Thu, 2016-03-24 at 07:12 +0100, Willy Tarreau wrote:
> > Hi,
> >
> > On Wed, Mar 23, 2016 at 10:10:06PM -0700, Tolga Ceylan wrote:
> > > I apologize for not properly following up on this. I had the
> > > impression that w
On Thu, 2016-03-24 at 07:12 +0100, Willy Tarreau wrote:
> Hi,
>
> On Wed, Mar 23, 2016 at 10:10:06PM -0700, Tolga Ceylan wrote:
> > I apologize for not properly following up on this. I had the
> > impression that we did not want to merge my original patch and then I
> > also noticed that it fails
Hi,
On Wed, Mar 23, 2016 at 10:10:06PM -0700, Tolga Ceylan wrote:
> I apologize for not properly following up on this. I had the
> impression that we did not want to merge my original patch and then I
> also noticed that it fails to keep the hash consistent. Recently, I
> read the follow ups on it
On Mon, Dec 21, 2015 at 12:41 PM, Willy Tarreau wrote:
> On Mon, Dec 21, 2015 at 12:38:27PM -0800, Tom Herbert wrote:
>> On Fri, Dec 18, 2015 at 11:00 PM, Willy Tarreau wrote:
>> > On Fri, Dec 18, 2015 at 06:38:03PM -0800, Eric Dumazet wrote:
>> >> On Fri, 2015-12-18 at 19:58 +0100, Willy Tarreau
On Mon, Dec 21, 2015 at 12:38:27PM -0800, Tom Herbert wrote:
> On Fri, Dec 18, 2015 at 11:00 PM, Willy Tarreau wrote:
> > On Fri, Dec 18, 2015 at 06:38:03PM -0800, Eric Dumazet wrote:
> >> On Fri, 2015-12-18 at 19:58 +0100, Willy Tarreau wrote:
> >> > Hi Josh,
> >> >
> >> > On Fri, Dec 18, 2015 at
On Fri, Dec 18, 2015 at 11:00 PM, Willy Tarreau wrote:
> On Fri, Dec 18, 2015 at 06:38:03PM -0800, Eric Dumazet wrote:
>> On Fri, 2015-12-18 at 19:58 +0100, Willy Tarreau wrote:
>> > Hi Josh,
>> >
>> > On Fri, Dec 18, 2015 at 08:33:45AM -0800, Josh Snyder wrote:
>> > > I was also puzzled that bind
On Fri, Dec 18, 2015 at 06:38:03PM -0800, Eric Dumazet wrote:
> On Fri, 2015-12-18 at 19:58 +0100, Willy Tarreau wrote:
> > Hi Josh,
> >
> > On Fri, Dec 18, 2015 at 08:33:45AM -0800, Josh Snyder wrote:
> > > I was also puzzled that binding succeeded. Looking into the code paths
> > > involved, in
On Fri, 2015-12-18 at 19:58 +0100, Willy Tarreau wrote:
> Hi Josh,
>
> On Fri, Dec 18, 2015 at 08:33:45AM -0800, Josh Snyder wrote:
> > I was also puzzled that binding succeeded. Looking into the code paths
> > involved, in inet_csk_get_port, we quickly goto have_snum. From there, we
> > end
> >
Hi Josh,
On Fri, Dec 18, 2015 at 08:33:45AM -0800, Josh Snyder wrote:
> I was also puzzled that binding succeeded. Looking into the code paths
> involved, in inet_csk_get_port, we quickly goto have_snum. From there, we end
> up dropping into tb_found. Since !hlist_empty(&tb->owners), we end up che
I was also puzzled that binding succeeded. Looking into the code paths
involved, in inet_csk_get_port, we quickly goto have_snum. From there, we end
up dropping into tb_found. Since !hlist_empty(&tb->owners), we end up checking
that (tb->fastreuseport > 0 && sk->sk_reuseport && uid_eq(tb->fastuid,
Hi Eric,
On Wed, Dec 16, 2015 at 08:38:14AM +0100, Willy Tarreau wrote:
> On Tue, Dec 15, 2015 at 01:21:15PM -0800, Eric Dumazet wrote:
> > On Tue, 2015-12-15 at 20:44 +0100, Willy Tarreau wrote:
> >
> > > Thus do you think it's worth adding a new option as Tolga proposed ?
> >
> >
> > I though
On Tue, Dec 15, 2015 at 01:21:15PM -0800, Eric Dumazet wrote:
> On Tue, 2015-12-15 at 20:44 +0100, Willy Tarreau wrote:
>
> > Thus do you think it's worth adding a new option as Tolga proposed ?
>
>
> I thought we tried hard to avoid adding the option but determined
> we could not avoid it ;)
N
On Tue, 2015-12-15 at 20:44 +0100, Willy Tarreau wrote:
> Thus do you think it's worth adding a new option as Tolga proposed ?
I thought we tried hard to avoid adding the option but determined
we could not avoid it ;)
So I would simply resend the patch for another review.
On Tue, Dec 15, 2015 at 10:21:52AM -0800, Eric Dumazet wrote:
> On Tue, 2015-12-15 at 18:43 +0100, Willy Tarreau wrote:
>
> > Ah ? but what does it bring in this case ? I'm not seeing it used
> > anywhere on a listening socket. The code took care of not breaking
> > them though (ie they still acce
On Tue, Dec 15, 2015 at 09:10:24AM -0800, Eric Dumazet wrote:
> On Tue, 2015-12-15 at 17:14 +0100, Willy Tarreau wrote:
> > Hi Eric,
> >
> > On Wed, Nov 11, 2015 at 05:09:01PM -0800, Eric Dumazet wrote:
> > > On Wed, 2015-11-11 at 10:43 -0800, Eric Dumazet wrote:
> > > > On Wed, 2015-11-11 at 10:2
On Tue, 2015-12-15 at 18:43 +0100, Willy Tarreau wrote:
> Ah ? but what does it bring in this case ? I'm not seeing it used
> anywhere on a listening socket. The code took care of not breaking
> them though (ie they still accept if no other socket shows up with
> a higher score). Otherwise we'll h
On Tue, 2015-12-15 at 17:14 +0100, Willy Tarreau wrote:
> Hi Eric,
>
> On Wed, Nov 11, 2015 at 05:09:01PM -0800, Eric Dumazet wrote:
> > On Wed, 2015-11-11 at 10:43 -0800, Eric Dumazet wrote:
> > > On Wed, 2015-11-11 at 10:23 -0800, Tom Herbert wrote:
> > >
> > > > How about doing this in shutdow
Hi Eric,
On Wed, Nov 11, 2015 at 05:09:01PM -0800, Eric Dumazet wrote:
> On Wed, 2015-11-11 at 10:43 -0800, Eric Dumazet wrote:
> > On Wed, 2015-11-11 at 10:23 -0800, Tom Herbert wrote:
> >
> > > How about doing this in shutdown called for a listener?
> >
> > Seems a good idea, I will try it, th
On Wed, 2015-11-11 at 10:43 -0800, Eric Dumazet wrote:
> On Wed, 2015-11-11 at 10:23 -0800, Tom Herbert wrote:
>
> > How about doing this in shutdown called for a listener?
>
> Seems a good idea, I will try it, thanks !
>
Arg, I forgot about this shutdown() discussion we had recently
with Oracl
On Wed, 2015-11-11 at 10:23 -0800, Tom Herbert wrote:
> How about doing this in shutdown called for a listener?
Seems a good idea, I will try it, thanks !
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo in
On Wed, Nov 11, 2015 at 9:23 AM, Eric Dumazet wrote:
> On Wed, 2015-11-11 at 09:05 -0800, Tom Herbert wrote:
>> On Tue, Nov 10, 2015 at 10:19 PM, Eric Dumazet
>> wrote:
>> > On Tue, 2015-11-10 at 21:41 -0800, Tom Herbert wrote:
>> >> Tolga, are you still planning to respin this patch (when tree
On Wed, 2015-11-11 at 09:05 -0800, Tom Herbert wrote:
> On Tue, Nov 10, 2015 at 10:19 PM, Eric Dumazet wrote:
> > On Tue, 2015-11-10 at 21:41 -0800, Tom Herbert wrote:
> >> Tolga, are you still planning to respin this patch (when tree opens?)
> >
> > I was planning to add a union on skc_tx_queue_
On Tue, Nov 10, 2015 at 10:19 PM, Eric Dumazet wrote:
> On Tue, 2015-11-10 at 21:41 -0800, Tom Herbert wrote:
>> Tolga, are you still planning to respin this patch (when tree opens?)
>
> I was planning to add a union on skc_tx_queue_mapping and
> sk_max_ack_backlog, so that adding a check on sk_m
On Tue, 2015-11-10 at 21:41 -0800, Tom Herbert wrote:
> Tolga, are you still planning to respin this patch (when tree opens?)
I was planning to add a union on skc_tx_queue_mapping and
sk_max_ack_backlog, so that adding a check on sk_max_ack_backlog in
listener lookup would not add an additional c
Tolga, are you still planning to respin this patch (when tree opens?)
Thanks,
Tom
On Sat, Sep 26, 2015 at 7:24 PM, Eric Dumazet wrote:
> On Sat, 2015-09-26 at 19:02 -0700, Tolga Ceylan wrote:
>> By keeping hiscore/matches as is, I'm trying to keep the hashing consistent.
>> Otherwise, this woul
On Sat, 2015-09-26 at 19:02 -0700, Tolga Ceylan wrote:
> By keeping hiscore/matches as is, I'm trying to keep the hashing consistent.
> Otherwise, this would behave similar to removing a listener which
> drops connections.
Right, this problem will soon disappear when listener rewrite is
complete.
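Why the hash has to stay consistent can be shown with a simplified model. This is a sketch only, not the kernel's listener lookup: it assumes a listener is chosen by scaling a 32-bit flow hash onto the number of candidates, in the style of the kernel's reciprocal_scale() helper, and the name pick_listener is hypothetical.

```c
#include <stdint.h>

/*
 * Simplified model (assumption): map a 32-bit flow hash onto one of
 * `n` listeners by multiply-shift scaling, i.e. (hash * n) >> 32.
 */
uint32_t pick_listener(uint32_t hash, uint32_t n)
{
    return (uint32_t)(((uint64_t)hash * n) >> 32);
}
```

With this model, a flow whose hash is 0x60000000 maps to listener 1 while three listeners are candidates, and to listener 0 once only two remain. If the candidate count changes between a connection's SYN and its final ACK, the two packets can land on different listeners, which is the kind of drop the patch tries to avoid by leaving hiscore/matches untouched.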
On Sat, Sep 26, 2015 at 6:44 PM, Aaron Conole wrote:
> Greetings.
>
> Tolga Ceylan writes:
>> +#define SO_REUSEPORT_LISTEN_OFF 51
>> +
> For all of these, I think the space should be tab.
>
>> unsigned char skc_reuseport:1;
>>+ unsigned char skc_reuseport_listen_off:1;
Greetings.
Tolga Ceylan writes:
> +#define SO_REUSEPORT_LISTEN_OFF 51
> +
For all of these, I think the space should be tab.
> unsigned char skc_reuseport:1;
>+ unsigned char skc_reuseport_listen_off:1;
> unsigned char skc_ipv6only:1;
The spacing here i
On Sat, Sep 26, 2015 at 6:04 PM, Eric Dumazet wrote:
>
> What about listen(fd, 0) ?
>
> Not sure we need to add a new socket option.
>
> It makes sense to extend reuseport logic to ignore listeners with a 0
> backlog (if not already done, I did not check)
>
>
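At the API level, the suggestion amounts to calling listen() a second time on a socket that is already listening; on Linux this only adjusts the backlog. A minimal sketch follows, with hypothetical helper names loopback_listener and set_backlog. Whether the reuseport lookup then skips such a listener is a separate kernel question that this sketch does not answer.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: TCP listener on an ephemeral loopback port. */
int loopback_listener(int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: re-issue listen() on an already-listening socket
 * to change its backlog. Returns 0 on success, -1 with errno set.
 */
int set_backlog(int fd, int backlog)
{
    return listen(fd, backlog);
}
```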
Just checked this and no listen(fd, 0
On Sat, 2015-09-26 at 17:30 -0700, Tolga Ceylan wrote:
> For applications using SO_REUSEPORT listeners, there is
> no clean way to switch traffic on/off or add/remove
> listeners without dropping pending connections. With this
> patch, applications can turn off queueing of new connections
> for a s
For applications using SO_REUSEPORT listeners, there is
no clean way to switch traffic on/off or add/remove
listeners without dropping pending connections. With this
patch, applications can turn off queueing of new connections
for a specific listener socket which enables implementation of
zero down