Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-12-05 Thread Adrian Chadd
I was thinking of n netisrs per m CPUs, where n < m; or maybe 1 netisr
for m CPUs, where m is less than the total number.

Having 48 cores contending on netisr stuff is a bit crazy. It's highly
unlikely you need that many cores doing packet pushing.
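
A minimal sketch of the "n netisrs per m CPUs" idea, assuming CPUs are
numbered 0..m-1 and that neighbouring CPU ids should share a netisr. The
helper name netisr_for_cpu is hypothetical and does not exist in FreeBSD
or DragonFly; it only illustrates the mapping.

```c
#include <assert.h>

/*
 * Hypothetical helper: map CPU id `cpu` (0..ncpus-1) onto one of
 * `nnetisr` netisr threads (nnetisr <= ncpus), keeping neighbouring
 * CPU ids on the same netisr.
 */
static unsigned
netisr_for_cpu(unsigned cpu, unsigned ncpus, unsigned nnetisr)
{
    assert(nnetisr > 0 && nnetisr <= ncpus && cpu < ncpus);
    /* Scale the CPU id down into the smaller netisr index range. */
    return (unsigned)(((unsigned long long)cpu * nnetisr) / ncpus);
}
```

With 48 CPUs and 4 netisrs, CPUs 0-11 would share netisr 0, CPUs 12-23
netisr 1, and so on. Whether that grouping matches the real socket or
cache topology is something the sketch does not check.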


-a
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-12-05 Thread Sepherosa Ziehau
On Tue, Dec 3, 2013 at 5:41 AM, Adrian Chadd  wrote:
>
> On 2 December 2013 03:45, Sepherosa Ziehau  wrote:
> >
> > On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd  wrote:
> >
> >> Ok, so given this, how do you guarantee the UTHREAD stays on the given
> >> CPU? You assume it stays on the CPU that the initial listen socket was
> >> created on, right? If it's migrated to another CPU core then the
> >> listen queue still stays in the original hash group that's in a netisr
> >> on a different CPU?
> >
> > As I wrote in the above brief introduction, Dfly currently relies on the
> > scheduler doing the proper thing (the scheduler does do a very good job
> > during my tests).  I need to export certain kind of socket option to make
> > that information available to user space programs.  Force UTHREAD binding in
> > kernel is not helpful, given in reverse proxy application, things are
> > different.  And even if that kind of binding information was exported to
> > user space, user space program still would have to poll it periodically (in
> > Dfly at least), since other programs binding to the same addr/port could
> > come and go, which will cause reorganizing of the inp localgroup in the
> > current Dfly implementation.
>
> Right. I kinda gathered that. It's fine, I was conceptually thinking
> of doing some thread pinning into this anyway.
>
> How do you see this scaling on massively multi-core machines? Like 32,
> 48, 64, 128 cores? I had some vague handwav-y notion of maybe limiting

We do have a 48-core box.  It is mainly used for package building and
other things.  I have not run network stress tests on it.  However, we
did address some message-passing problems on it that do not show up
on 8-CPU boxes.

> the concept of pcbgroup hash / netisr threads to a subset of CPUs, or
> have them be able to float between sockets but only have 1 (or n,

Floating around may be good, but by pinning netisr to a specific CPU
you could enjoy lockless per-cpu data.

> maybe) per socket. Or just have a fixed, smaller pool. The idea then

We used to have dedicated threads for UDP and TCP processing, but it
turns out that one netisr per cpu works best in Dfly.  You probably
need to try and measure before deciding to move to 1 or N netisrs per
cpu.

Best Regards,
sephe

> is the scheduler would need to be told that a given userland
> thread/process belongs to a given netisr thread, and to schedule them
> on the same CPU when possible.
>
> Anyway, thanks for doing this work. I only wish that you'd do it for
> FreeBSD. :-)
>
>
>
> -adrian




-- 
Tomorrow Will Never Die


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-12-02 Thread Adrian Chadd
On 2 December 2013 03:45, Sepherosa Ziehau  wrote:
>
> On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd  wrote:
>
>> Ok, so given this, how do you guarantee the UTHREAD stays on the given
>> CPU? You assume it stays on the CPU that the initial listen socket was
>> created on, right? If it's migrated to another CPU core then the
>> listen queue still stays in the original hash group that's in a netisr
>> on a different CPU?
>
> As I wrote in the above brief introduction, Dfly currently relies on the
> scheduler doing the proper thing (the scheduler does do a very good job
> during my tests).  I need to export certain kind of socket option to make
> that information available to user space programs.  Force UTHREAD binding in
> kernel is not helpful, given in reverse proxy application, things are
> different.  And even if that kind of binding information was exported to
> user space, user space program still would have to poll it periodically (in
> Dfly at least), since other programs binding to the same addr/port could
> come and go, which will cause reorganizing of the inp localgroup in the
> current Dfly implementation.

Right. I kinda gathered that. It's fine, I was conceptually thinking
of doing some thread pinning into this anyway.

How do you see this scaling on massively multi-core machines? Like 32,
48, 64, 128 cores? I had some vague handwav-y notion of maybe limiting
the concept of pcbgroup hash / netisr threads to a subset of CPUs, or
have them be able to float between sockets but only have 1 (or n,
maybe) per socket. Or just have a fixed, smaller pool. The idea then
is the scheduler would need to be told that a given userland
thread/process belongs to a given netisr thread, and to schedule them
on the same CPU when possible.

Anyway, thanks for doing this work. I only wish that you'd do it for
FreeBSD. :-)



-adrian


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-12-02 Thread Sepherosa Ziehau
On Mon, Dec 2, 2013 at 1:02 PM, Adrian Chadd  wrote:

> Hi! Thanks for the writeup!
>
> On 1 December 2013 20:17, Sepherosa Ziehau  wrote:
>
> > I also put up a brief description of SO_REUSEPORT in dfly; may be useful to
> > you:
> > http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt
>
> Ok, so given this, how do you guarantee the UTHREAD stays on the given
> CPU? You assume it stays on the CPU that the initial listen socket was
> created on, right? If it's migrated to another CPU core then the
> listen queue still stays in the original hash group that's in a netisr
> on a different CPU?
>
>
As I wrote in the brief introduction above, Dfly currently relies on the
scheduler doing the proper thing (the scheduler did a very good job
during my tests).  I need to export some kind of socket option to make
that information available to user space programs.  Forcing UTHREAD binding
in the kernel is not helpful, given that things are different in a reverse
proxy application.  And even if that kind of binding information were
exported to user space, the user space program would still have to poll it
periodically (in Dfly at least), since other programs binding to the same
addr/port could come and go, which causes the inp localgroup to be
reorganized in the current Dfly implementation.

Best Regards,
sephe

-- 
Tomorrow Will Never Die


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-12-02 Thread Sepherosa Ziehau
On Mon, Dec 2, 2013 at 12:29 PM, Oleg Moskalenko wrote:

> Sepherosa, while reading your description I noticed another long-standing
> problem for UDP application developers: the UDP sockets are always hashed
> with 2-tuple. But UDP sockets can be "connected", too, to a remote address,
> with connect(...)
>

Connected UDP sockets are placed in the connect hash, which is keyed on
faddr/laddr/fport/lport.  SO_REUSEPORT only affects wildcard sockets.
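
To illustrate the difference between the two lookups discussed here, a
minimal sketch, assuming a toy 64-bucket table and a toy integer mixer.
Neither is the hash used by DragonFly or FreeBSD, and all names below
are made up for illustration. It counts how many buckets 256 connected
peers on one local port occupy under each keying scheme.

```c
#include <assert.h>
#include <stdint.h>

#define NBUCKETS 64u    /* illustrative table size, power of two */

/* Toy integer mixer (illustrative; not a kernel hash function). */
static uint32_t
mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* Bucket chosen from the local address and port only. */
static unsigned
bucket_2tuple(uint32_t laddr, uint16_t lport)
{
    return mix32(laddr ^ lport) & (NBUCKETS - 1);
}

/* Bucket chosen from the full faddr/laddr/fport/lport tuple. */
static unsigned
bucket_4tuple(uint32_t faddr, uint16_t fport, uint32_t laddr, uint16_t lport)
{
    uint32_t h = mix32(faddr);

    h = mix32(h ^ laddr);
    h = mix32(h ^ (((uint32_t)fport << 16) | lport));
    return h & (NBUCKETS - 1);
}

/* Count the distinct buckets used by 256 peers sharing one local port. */
static unsigned
distinct_buckets(int use_4tuple)
{
    unsigned char seen[NBUCKETS] = {0};
    unsigned n = 0;

    for (uint32_t i = 0; i < 256; i++) {
        uint32_t faddr = 0x0a000000u + i;   /* 10.0.0.i */
        unsigned b = use_4tuple
            ? bucket_4tuple(faddr, 5000, 0xc0a80001u, 3478)
            : bucket_2tuple(0xc0a80001u, 3478);

        assert(b < NBUCKETS);
        if (!seen[b]) {
            seen[b] = 1;
            n++;
        }
    }
    return n;
}
```

distinct_buckets(0) is always 1, because every socket shares the same
local address and port; this is the long chain Oleg describes. With the
4-tuple key the same sockets spread over many buckets.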


> function. Unfortunately, with 2-tuple hashing, that pattern is useless for
> large-scale applications: if a large number of UDP sockets on the same
> local port are "connected" to remote address, then the kernel have to go
> thru the long list of UDP sockets with the same hash value.
>
> If the connected UDP sockets would use 4-tuples, then it would be very
> helpful for the new generation of the UDP-based media applications. For
> example, servers which use DTLS protocol would become simpler and more
> efficient.
>
>
If you are talking about RSS, then igb, ixgbe and mxge (and maybe other
drivers) support an RSS extension (mxge does not use RSS, but still
computes a 4-tuple hash) that includes the UDP fport/lport in the Toeplitz
hash calculation.  However, for a fragmented UDP datagram, if the ports are
taken into consideration, the RSS hash of the leading fragment differs from
that of the remaining fragments, which carry no UDP header; I think that's
why MS didn't include ports for UDP.
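
To make the fragment problem concrete, here is a minimal C sketch of a
Toeplitz hash over IPv4 input. The key bytes and the names toeplitz_hash
and rss_ipv4 are illustrative assumptions, not taken from any driver;
any 40-byte key behaves the same way. The input layout (source address,
destination address, then source and destination port, all big-endian)
follows the usual RSS convention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sample 40-byte RSS key (assumed for illustration). */
static const uint8_t rss_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/* Toeplitz hash: XOR a sliding 32-bit key window for every set input bit. */
static uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen, const uint8_t *data,
    size_t len)
{
    uint32_t hash = 0;
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
        ((uint32_t)key[2] << 8) | key[3];

    assert(keylen >= len + 4);
    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                hash ^= window;
            /* Slide the window left one bit, pulling in the next key bit. */
            window <<= 1;
            if (key[i + 4] & (1u << bit))
                window |= 1;
        }
    }
    return hash;
}

/* Hash an IPv4 flow, with or without the ports (host byte order inputs). */
static uint32_t
rss_ipv4(uint32_t saddr, uint32_t daddr, uint16_t sport, uint16_t dport,
    int with_ports)
{
    uint8_t in[12];

    for (int i = 0; i < 4; i++) {
        in[i] = (uint8_t)(saddr >> (24 - 8 * i));
        in[4 + i] = (uint8_t)(daddr >> (24 - 8 * i));
    }
    in[8] = (uint8_t)(sport >> 8);
    in[9] = (uint8_t)sport;
    in[10] = (uint8_t)(dport >> 8);
    in[11] = (uint8_t)dport;
    return toeplitz_hash(rss_key, sizeof(rss_key), in, with_ports ? 12 : 8);
}
```

A NIC that hashes the leading fragment as rss_ipv4(s, d, sp, dp, 1) can
only hash the later fragments as rss_ipv4(s, d, 0, 0, 0), since they
carry no ports. The two values generally differ, so fragments of one
datagram may be steered to different queues.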

Best Regards,
sephe


> Thanks
> Oleg
>
>
>
> On Sun, Dec 1, 2013 at 8:17 PM, Sepherosa Ziehau wrote:
>
>>
>>
>>
>> On Sat, Nov 30, 2013 at 2:42 AM, Ermal Luçi  wrote:
>>
>>> Well seems Dragonfly has some version of it already from commit [1].
>>>
>>>
>> The distribution algorithm was changed a little bit after initial commit
>> to gain more idle time (bnx(4) output has already been maxed out):
>>
>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/c275f18d832361be28b150d3f4fd518914bdeba6
>>
>> Well, I also addressed a reasonable concern from nginx folks (I am not
>> quite sure about Linux's position on it; Linux original implementation of
>> SO_REUSEPORT from Google had this drawback, which I mentioned in the commit
>> message):
>>
>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/02ad2f0b874fb0a45eb69750219f79f5e8982272
>>
>> As about nginx, SO_REUSEPORT patch for nginx (both 1.4.x and 1.5.x) is in
>> dports; should be easier to be back ported to FreeBSD's ports.  I failed to
>> convince nginx folks to merge it into mainline and I am currently onto
>> other stuffs, will come back to them later.  If FreeBSD is going to
>> implement Linux's style of SO_REUSEPORT, pushing the patch to the nginx
>> mainline will be easier.
>>
>> I also put up a brief description of SO_REUSEPORT in dfly; may be useful
>> to you:
>> http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt
>>
>> Best Regards,
>> sephe
>>
>>
>>>  In FreeBSD there is the framework for this with by defining PCBGROUP.
>>> Also the explanation of it at [2] and [3].
>>> It can achieve approximately the same features of SO_RESUSEPORT of linux.
>>> The only thing missing is the marketing behind it and i think and better
>>> RSS support.
>>> By looking at dates the support is there before linux so all you guys
>>> looking for it can experiment with it.
>>>
>>> What i was trying to accomplish was something else from performance
>>> improvement and
>>> maybe put a sysctl behind it to make it more acceptable..
>>>
>>> [1]
>>>
>>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
>>> [2]
>>> http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
>>> [3]
>>> http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>>>
>>>
>>> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko wrote:
>>>
>>> > Tim, you are wrong. Read what is "multicast" definition, and read how
>>> > UDP and TCP sockets work in Linux 3.9+ kernels.
>>> >
>>> > Oleg .
>>> >
>>> >
>>> > On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle wrote:
>>> >
>>> >>
>>> >> On Nov 29, 2013, at 4:04 AM, Ermal Luçi  wrote:
>>> >>
>>> >> > Hello,
>>> >> >
>>> >> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two
>>> daemons to
>>> >> > share the same port and possibly listening ip …
>>> >>
>>> >> These flags are used with TCP-based servers.
>>> >>
>>> >> I’ve used them to make software upgrades go more smoothly.
>>> >> Without them, the following often happens:
>>> >>
>>> >> * Old server stops.  In the process, all of its TCP connections are
>>> >> closed.
>>> >>
>>> >> * Connections to old server remain in the TCP connection table until
>>> the
>>> >> remote end can acknowledge.
>>> >>
>>> >> * New server starts.
>>> >>
>>> >> * New server tries to open port but fails because that port is “still
>>> in
>>> >> use” by connections in the TCP connection table.
>>> >>
>>> >> With these flags, the new server can open the port even though
>>> >> it is “still in use” by existing connections.
>>> >>
>>> >>
>>> >> >

Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-12-01 Thread Adrian Chadd
Hi! Thanks for the writeup!

On 1 December 2013 20:17, Sepherosa Ziehau  wrote:

> I also put up a brief description of SO_REUSEPORT in dfly; may be useful to
> you:
> http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt

Ok, so given this, how do you guarantee the UTHREAD stays on the given
CPU? You assume it stays on the CPU that the initial listen socket was
created on, right? If it's migrated to another CPU core then the
listen queue still stays in the original hash group that's in a netisr
on a different CPU?


-adrian


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-12-01 Thread Sepherosa Ziehau
On Sat, Nov 30, 2013 at 2:42 AM, Ermal Luçi  wrote:

> Well seems Dragonfly has some version of it already from commit [1].
>
>
The distribution algorithm was changed a little bit after initial commit to
gain more idle time (bnx(4) output has already been maxed out):
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/c275f18d832361be28b150d3f4fd518914bdeba6

Well, I also addressed a reasonable concern from the nginx folks (I am not
quite sure about Linux's position on it; Linux's original implementation of
SO_REUSEPORT from Google had this drawback, which I mentioned in the commit
message):
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/02ad2f0b874fb0a45eb69750219f79f5e8982272

As for nginx, the SO_REUSEPORT patch for nginx (both 1.4.x and 1.5.x) is in
dports; it should be easy to backport to FreeBSD's ports.  I failed to
convince the nginx folks to merge it into mainline, and I am currently busy
with other things; I will come back to it later.  If FreeBSD is going to
implement Linux-style SO_REUSEPORT, pushing the patch into the nginx
mainline will be easier.

I also put up a brief description of SO_REUSEPORT in dfly; may be useful to
you:
http://leaf.dragonflybsd.org/~sephe/netisr_so_reuseport.txt

Best Regards,
sephe


> In FreeBSD there is the framework for this with by defining PCBGROUP.
> Also the explanation of it at [2] and [3].
> It can achieve approximately the same features of SO_RESUSEPORT of linux.
> The only thing missing is the marketing behind it and i think and better
> RSS support.
> By looking at dates the support is there before linux so all you guys
> looking for it can experiment with it.
>
> What i was trying to accomplish was something else from performance
> improvement and
> maybe put a sysctl behind it to make it more acceptable..
>
> [1]
>
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
> [2]
> http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
> [3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>
>
> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko wrote:
>
> > Tim, you are wrong. Read what is "multicast" definition, and read how UDP
> > and TCP sockets work in Linux 3.9+ kernels.
> >
> > Oleg .
> >
> >
> > On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle wrote:
> >
> >>
> >> On Nov 29, 2013, at 4:04 AM, Ermal Luçi  wrote:
> >>
> >> > Hello,
> >> >
> >> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons
> to
> >> > share the same port and possibly listening ip …
> >>
> >> These flags are used with TCP-based servers.
> >>
> >> I’ve used them to make software upgrades go more smoothly.
> >> Without them, the following often happens:
> >>
> >> * Old server stops.  In the process, all of its TCP connections are
> >> closed.
> >>
> >> * Connections to old server remain in the TCP connection table until the
> >> remote end can acknowledge.
> >>
> >> * New server starts.
> >>
> >> * New server tries to open port but fails because that port is “still in
> >> use” by connections in the TCP connection table.
> >>
> >> With these flags, the new server can open the port even though
> >> it is “still in use” by existing connections.
> >>
> >>
> >> > This is not the case today.
> >> > Only multicast sockets seem to have the behaviour of broadcasting the
> >> data
> >> > to all sockets sharing the same properties through these options!
> >>
> >> That is what multicast is for.
> >>
> >> If you want the same data sent to all listeners, then
> >> that is multicast behavior and you should be using
> >> a multicast socket.
> >>
> >> > The patch at [1] implements/corrects the behaviour for UDP sockets.
> >>
> >> You’re trying to turn all UDP sockets with those options
> >> into multicast sockets.
> >>
> >> If you want a multicast socket, you should ask for one.
> >>
> >> Tim
> >>
> >> ___
> >> freebsd-...@freebsd.org mailing list
> >> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> >> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
> >>
> >
> >
>
>
> --
> Ermal
>



-- 
Tomorrow Will Never Die


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Adrian Chadd
Sure, is there a TCP version of this patch floating around? How's it
doing load balancing to multiple listeners?


-a

On 29 November 2013 11:28, Oleg Moskalenko  wrote:
> It would be nice to have this feature compiled and supported in FreeBSD
> kernel by default.
>
> Thanks
> Oleg
>
>
> On Fri, Nov 29, 2013 at 11:01 AM, Ermal Luçi  wrote:
>
>> And some better marketing from Dragonfly about it
>> http://forum.nginx.org/read.php?29,241283,241283 :)
>>
>>
>> On Fri, Nov 29, 2013 at 7:55 PM, Ermal Luçi  wrote:
>>
>>> Also some discussions and improvements to it.
>>>
>>> http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2013-09/msg00165.html
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Ermal
>>>
>>
>>
>>
>> --
>> Ermal
>>


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Ermal Luçi
And some better marketing from Dragonfly about it
http://forum.nginx.org/read.php?29,241283,241283 :)


On Fri, Nov 29, 2013 at 7:55 PM, Ermal Luçi  wrote:

> Also some discussions and improvements to it.
>
> http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2013-09/msg00165.html
>
>
> On Fri, Nov 29, 2013 at 7:42 PM, Ermal Luçi  wrote:
>
>> Well seems Dragonfly has some version of it already from commit [1].
>>
>> In FreeBSD there is the framework for this with by defining PCBGROUP.
>> Also the explanation of it at [2] and [3].
>> It can achieve approximately the same features of SO_RESUSEPORT of linux.
>> The only thing missing is the marketing behind it and i think and better
>> RSS support.
>> By looking at dates the support is there before linux so all you guys
>> looking for it can experiment with it.
>>
>> What i was trying to accomplish was something else from performance
>> improvement and
>> maybe put a sysctl behind it to make it more acceptable..
>>
>> [1]
>> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
>> [2]
>> http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
>> [3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>>
>>
>> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko wrote:
>>
>>> Tim, you are wrong. Read what is "multicast" definition, and read how
>>> UDP and TCP sockets work in Linux 3.9+ kernels.
>>>
>>> Oleg .
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Ermal
>>
>
>
>
> --
> Ermal
>



-- 
Ermal


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Ermal Luçi
Also some discussions and improvements to it.

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2013-09/msg00165.html


On Fri, Nov 29, 2013 at 7:42 PM, Ermal Luçi  wrote:

> Well seems Dragonfly has some version of it already from commit [1].
>
> In FreeBSD there is the framework for this with by defining PCBGROUP.
> Also the explanation of it at [2] and [3].
> It can achieve approximately the same features of SO_RESUSEPORT of linux.
> The only thing missing is the marketing behind it and i think and better
> RSS support.
> By looking at dates the support is there before linux so all you guys
> looking for it can experiment with it.
>
> What i was trying to accomplish was something else from performance
> improvement and
> maybe put a sysctl behind it to make it more acceptable..
>
> [1]
> http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
> [2]
> http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
> [3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html
>
>
> On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko wrote:
>
>> Tim, you are wrong. Read what is "multicast" definition, and read how UDP
>> and TCP sockets work in Linux 3.9+ kernels.
>>
>> Oleg .
>>
>>
>> On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle wrote:
>>
>>>
>>> On Nov 29, 2013, at 4:04 AM, Ermal Luçi  wrote:
>>>
>>> > Hello,
>>> >
>>> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons
>>> to
>>> > share the same port and possibly listening ip …
>>>
>>> These flags are used with TCP-based servers.
>>>
>>> I’ve used them to make software upgrades go more smoothly.
>>> Without them, the following often happens:
>>>
>>> * Old server stops.  In the process, all of its TCP connections are
>>> closed.
>>>
>>> * Connections to old server remain in the TCP connection table until the
>>> remote end can acknowledge.
>>>
>>> * New server starts.
>>>
>>> * New server tries to open port but fails because that port is “still in
>>> use” by connections in the TCP connection table.
>>>
>>> With these flags, the new server can open the port even though
>>> it is “still in use” by existing connections.
>>>
>>>
>>> > This is not the case today.
>>> > Only multicast sockets seem to have the behaviour of broadcasting the
>>> data
>>> > to all sockets sharing the same properties through these options!
>>>
>>> That is what multicast is for.
>>>
>>> If you want the same data sent to all listeners, then
>>> that is multicast behavior and you should be using
>>> a multicast socket.
>>>
>>> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>>>
>>> You’re trying to turn all UDP sockets with those options
>>> into multicast sockets.
>>>
>>> If you want a multicast socket, you should ask for one.
>>>
>>> Tim
>>>
>>>
>>
>>
>
>
> --
> Ermal
>



-- 
Ermal


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Ermal Luçi
On Fri, Nov 29, 2013 at 6:59 PM, Tim Kientzle  wrote:

>
> On Nov 29, 2013, at 4:04 AM, Ermal Luçi  wrote:
>
> > Hello,
> >
> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> > share the same port and possibly listening ip …
>
> These flags are used with TCP-based servers.
>

Everyone has their own use case!


>
> I’ve used them to make software upgrades go more smoothly.
> Without them, the following often happens:
>
> * Old server stops.  In the process, all of its TCP connections are closed.
>
> * Connections to old server remain in the TCP connection table until the
> remote end can acknowledge.
>
> * New server starts.
>
> * New server tries to open port but fails because that port is “still in
> use” by connections in the TCP connection table.
>
> With these flags, the new server can open the port even though
> it is “still in use” by existing connections.
>
>
> > This is not the case today.
> > Only multicast sockets seem to have the behaviour of broadcasting the data
> > to all sockets sharing the same properties through these options!
>
> That is what multicast is for.
>
>
Multicast has its defined scope and its applications, though I think it is
interpreting the same socket options and respecting them for what they
should do and how they should behave.


> If you want the same data sent to all listeners, then
> that is multicast behavior and you should be using
> a multicast socket.
>
> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>
> You’re trying to turn all UDP sockets with those options
> into multicast sockets.
>

Not really. The idea is to support the use case of two daemons that use
the same port number but speak different protocols.
The best option would be to merge these daemons, but when you cannot, there
should be some support for it.
In the end there are only 65k ports :).

Perhaps putting the feature behind a sysctl would be a further compromise?


>
> If you want a multicast socket, you should ask for one.
>
> Tim
>
>


-- 
Ermal


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Ermal Luçi
Well, it seems Dragonfly already has some version of it, from commit [1].

In FreeBSD the framework for this is available by defining PCBGROUP;
it is explained at [2] and [3].
It can achieve approximately the same features as Linux's SO_REUSEPORT.
The only things missing are the marketing behind it and, I think, better
RSS support.
Judging by the dates, the support was there before Linux, so all of you
looking for it can experiment with it.

What I was trying to accomplish was something other than a performance
improvement; maybe putting a sysctl behind it would make it more
acceptable.

[1]
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
[2]
http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
[3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html


On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko wrote:

> Tim, you are wrong. Read what is "multicast" definition, and read how UDP
> and TCP sockets work in Linux 3.9+ kernels.
>
> Oleg .
>
>
> On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle wrote:
>
>>
>> On Nov 29, 2013, at 4:04 AM, Ermal Luçi  wrote:
>>
>> > Hello,
>> >
>> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
>> > share the same port and possibly listening ip …
>>
>> These flags are used with TCP-based servers.
>>
>> I’ve used them to make software upgrades go more smoothly.
>> Without them, the following often happens:
>>
>> * Old server stops.  In the process, all of its TCP connections are
>> closed.
>>
>> * Connections to old server remain in the TCP connection table until the
>> remote end can acknowledge.
>>
>> * New server starts.
>>
>> * New server tries to open port but fails because that port is “still in
>> use” by connections in the TCP connection table.
>>
>> With these flags, the new server can open the port even though
>> it is “still in use” by existing connections.
>>
>>
>> > This is not the case today.
>> > Only multicast sockets seem to have the behaviour of broadcasting the data
>> > to all sockets sharing the same properties through these options!
>>
>> That is what multicast is for.
>>
>> If you want the same data sent to all listeners, then
>> that is multicast behavior and you should be using
>> a multicast socket.
>>
>> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>>
>> You’re trying to turn all UDP sockets with those options
>> into multicast sockets.
>>
>> If you want a multicast socket, you should ask for one.
>>
>> Tim
>>
>>
>
>


-- 
Ermal


Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Tim Kientzle

On Nov 29, 2013, at 4:04 AM, Ermal Luçi  wrote:

> Hello,
> 
> since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> share the same port and possibly listening ip …

These flags are used with TCP-based servers.

I’ve used them to make software upgrades go more smoothly.
Without them, the following often happens:

* Old server stops.  In the process, all of its TCP connections are closed.

* Connections to old server remain in the TCP connection table until the remote 
end can acknowledge.

* New server starts.

* New server tries to open port but fails because that port is “still in use” 
by connections in the TCP connection table.

With these flags, the new server can open the port even though
it is “still in use” by existing connections.
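
A minimal sketch of the restart scenario above, assuming a loopback test
on a system where SO_REUSEADDR behaves as described. The names
make_listener and restart_demo are made up for illustration; the option is
set before bind(), the "old" server closes its connection first so that
its end lingers in the connection table, and a "new" listener then binds
the same port.

```python
import socket


def make_listener(host, port, backlog=16):
    """Create a listening TCP socket with SO_REUSEADDR set before bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option has to be set before bind() to affect the address check.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind((host, port))
    s.listen(backlog)
    return s


def restart_demo(host="127.0.0.1"):
    """Simulate 'old server stops, new server starts' on one port."""
    old = make_listener(host, 0)          # port 0: let the kernel pick one
    port = old.getsockname()[1]

    client = socket.create_connection((host, port))
    conn, _ = old.accept()

    # The old server closes first, so its side of the connection is the
    # one that lingers in the TCP connection table afterwards.
    conn.close()
    old.close()
    client.close()

    # The new server binds the same port while that entry may still exist.
    new = make_listener(host, port)
    try:
        return new.getsockname()[1] == port
    finally:
        new.close()


if __name__ == "__main__":
    print("rebind succeeded:", restart_demo())
```

Whether the rebind would fail without the option depends on the platform
and on timing, so the sketch only shows the successful path.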


> This is not the case today.
> Only multicast sockets seem to have the behaviour of broadcasting the data
> to all sockets sharing the same properties through these options!

That is what multicast is for.

If you want the same data sent to all listeners, then
that is multicast behavior and you should be using
a multicast socket.

> The patch at [1] implements/corrects the behaviour for UDP sockets.

You’re trying to turn all UDP sockets with those options
into multicast sockets.

If you want a multicast socket, you should ask for one.

Tim



Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Julian Elischer

On 11/29/13, 8:04 PM, Ermal Luçi wrote:

> Hello,
>
> since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> share the same port and possibly listening IP, you would expect that if you
> bind two daemons with such options to the same port, both see the same
> traffic!

This is not how I interpret it. I presume it is to allow two
OUTGOING sessions from the same source.

> This is not the case today.
> Only multicast sockets seem to have the behaviour of broadcasting the data
> to all sockets sharing the same properties through these options!
>
> The patch at [1] implements/corrects the behaviour for UDP sockets.
> Is there anything to be corrected in that patch?
> Why has it not been provided before?
> Can it be committed to the tree?
> Any extra security checks for jails needed there?
>
>
> [1]
> https://github.com/pfsense/pfsense-tools/blob/master/patches/RELENG_10_0/udp_SO_REUSEADDR%2BPORT.diff





Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Daniel Nebdal
On Fri, Nov 29, 2013 at 1:04 PM, Ermal Luçi  wrote:

> Hello,
>
> since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> share the same port and possibly listening IP, you would expect that if you
> bind two daemons with such options to the same port, both see the same
> traffic!
>
> This is not the case today.
> Only multicast sockets seem to have the behaviour of broadcasting the data
> to all sockets sharing the same properties through these options!
>
> The patch at [1] implements/corrects the behaviour for UDP sockets.
> Is there anything to be corrected in that patch?
> Why has it not been provided before?
> Can it be committed to the tree?
> Any extra security checks for jails needed there?
>
>
> [1]
>
> https://github.com/pfsense/pfsense-tools/blob/master/patches/RELENG_10_0/udp_SO_REUSEADDR%2BPORT.diff
>
> --
> Ermal


I understood it as working sort of like for TCP, where packets from a
given remote host+port all end up at exactly one of the local sockets? If
the idea is to split the workload over multiple threads holding their own
sockets listening to the same interface+port, wouldn't sending all packets
to all sockets all the time be kind of counterproductive?

Of course, I haven't actually used it much; I might be wrong.
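
The "one flow, one socket" idea above can be sketched as a hash over the
flow's addresses and ports. This is only an illustration of the concept,
not how any kernel does it: pick_socket and the use of CRC32 are
assumptions made up for this example.

```python
import zlib


def pick_socket(src_ip, src_port, dst_ip, dst_port, n_sockets):
    """Map a flow to one of n_sockets local sockets (hypothetical helper).

    Every packet of the same flow hashes to the same index, so it always
    reaches the same socket; different flows spread over the sockets.
    """
    if n_sockets <= 0:
        raise ValueError("n_sockets must be positive")
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % n_sockets


if __name__ == "__main__":
    # Same remote host+port: always the same socket index.
    print(pick_socket("192.0.2.10", 40000, "192.0.2.1", 53, 4))
    print(pick_socket("192.0.2.10", 40000, "192.0.2.1", 53, 4))
    # Another remote port may land on a different socket.
    print(pick_socket("192.0.2.10", 40001, "192.0.2.1", 53, 4))
```

With this kind of mapping each worker thread only sees its own share of
the flows, which is the load-splitting case; copying every packet to every
socket would defeat that.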

-- 
Daniel Nebdal


[PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour

2013-11-29 Thread Ermal Luçi
Hello,

since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
share the same port and possibly listening IP, you would expect that if you
bind two daemons with such options to the same port, both see the same
traffic!

This is not the case today.
Only multicast sockets seem to have the behaviour of broadcasting the data
to all sockets sharing the same properties through these options!
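
A small sketch of how the behaviour described above can be observed from
user space, assuming a platform that defines SO_REUSEPORT and a working
loopback interface. The helper names bind_shared_udp and count_receivers
are made up for this example. Two UDP sockets are bound to the same
address and port with both options set, one unicast datagram is sent, and
the code counts how many of the two sockets received it.

```python
import select
import socket


def bind_shared_udp(host, port):
    """Bind a UDP socket with SO_REUSEADDR and SO_REUSEPORT set."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    return s


def count_receivers(payload=b"ping", host="127.0.0.1", timeout=0.5):
    """Send one unicast datagram and count the sockets that receive it."""
    first = bind_shared_udp(host, 0)       # port 0: let the kernel pick one
    port = first.getsockname()[1]
    second = bind_shared_udp(host, port)   # same address and port
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = 0
    try:
        sender.sendto(payload, (host, port))
        for s in (first, second):
            # Wait briefly on each socket in turn for a copy of the datagram.
            readable, _, _ = select.select([s], [], [], timeout)
            if readable and s.recv(2048) == payload:
                received += 1
    finally:
        for s in (first, second, sender):
            s.close()
    return received


if __name__ == "__main__":
    print("sockets that saw the datagram:", count_receivers())
```

Which of the two sockets gets the datagram is up to the kernel; the sketch
only counts deliveries and does not rely on a particular choice.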

The patch at [1] implements/corrects the behaviour for UDP sockets.
Is there anything to be corrected in that patch?
Why has it not been provided before?
Can it be committed to the tree?
Any extra security checks for jails needed there?


[1]
https://github.com/pfsense/pfsense-tools/blob/master/patches/RELENG_10_0/udp_SO_REUSEADDR%2BPORT.diff

-- 
Ermal