Re: [PATCH v2] socket.7: Document some BPF-related socket options
On Tue, Mar 1, 2016 at 5:29 AM, Michael Kerrisk (man-pages) wrote:
> On 03/01/2016 11:10 AM, Vincent Bernat wrote:
>> ❦ 1 March 2016 11:03 +0100, "Michael Kerrisk (man-pages)":
>>
>>> Once the SO_LOCK_FILTER option has been enabled,
>>> attempts by an unprivileged process to change or remove
>>> the filter attached to a socket, or to disable the
>>> SO_LOCK_FILTER option will fail with the error EPERM.
>>
>> You should remove "unprivileged". I didn't try to check for permissions
>> because I was just lazy (and I didn't have a need for it). As root, you
>> can just recreate another socket.
>
> Bother. That's what I meant to do, and then I omitted to do it! Done now.
> And thanks for catching that, Vincent.
>
> Revised text below, with another query.
>
> SO_LOCK_FILTER
>        When set, this option will prevent changing the filters
>        associated with the socket. These filters include any
>        set using the socket options SO_ATTACH_FILTER,
>        SO_ATTACH_BPF, SO_ATTACH_REUSEPORT_CBPF and
>        SO_ATTACH_REUSEPORT_EBPF.
>
>        The typical use case is for a privileged process to set
>        up a socket with restrictive filters, set SO_LOCK_FILTER,
>        and then either drop its privileges or pass the
>        socket file descriptor to an unprivileged process.
>
>        Once the SO_LOCK_FILTER option has been enabled,
>        attempts to change or remove the filter attached to a
>        socket, or to disable the SO_LOCK_FILTER option will
>        fail with the error EPERM.
>
> I think the second paragraph should probably drop mention of privileges,
> right? In fact, maybe just drop the paragraph altogether?

Thanks Michael, all of your changes in the git tree look good to me.

I parsed the one-way nature of LOCK_FILTER completely backwards from the
commit message. It's describing BSD's root-modify behavior, not the
implementation in Linux. I think I like this last paragraph as you have
it, to explicitly call this out as intended behavior.

Thanks again,
Craig

> Cheers,
>
> Michael
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/
Re: [PATCH v2] socket.7: Document some BPF-related socket options
On 03/01/2016 11:10 AM, Vincent Bernat wrote:
> ❦ 1 March 2016 11:03 +0100, "Michael Kerrisk (man-pages)":
>
>> Once the SO_LOCK_FILTER option has been enabled,
>> attempts by an unprivileged process to change or remove
>> the filter attached to a socket, or to disable the
>> SO_LOCK_FILTER option will fail with the error EPERM.
>
> You should remove "unprivileged". I didn't try to check for permissions
> because I was just lazy (and I didn't have a need for it). As root, you
> can just recreate another socket.

Bother. That's what I meant to do, and then I omitted to do it! Done now.
And thanks for catching that, Vincent.

Revised text below, with another query.

SO_LOCK_FILTER
       When set, this option will prevent changing the filters
       associated with the socket. These filters include any
       set using the socket options SO_ATTACH_FILTER,
       SO_ATTACH_BPF, SO_ATTACH_REUSEPORT_CBPF and
       SO_ATTACH_REUSEPORT_EBPF.

       The typical use case is for a privileged process to set
       up a socket with restrictive filters, set SO_LOCK_FILTER,
       and then either drop its privileges or pass the
       socket file descriptor to an unprivileged process.

       Once the SO_LOCK_FILTER option has been enabled,
       attempts to change or remove the filter attached to a
       socket, or to disable the SO_LOCK_FILTER option will
       fail with the error EPERM.

I think the second paragraph should probably drop mention of privileges,
right? In fact, maybe just drop the paragraph altogether?

Cheers,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
Re: [PATCH v2] socket.7: Document some BPF-related socket options
❦ 1 March 2016 11:03 +0100, "Michael Kerrisk (man-pages)":

> Once the SO_LOCK_FILTER option has been enabled,
> attempts by an unprivileged process to change or remove
> the filter attached to a socket, or to disable the
> SO_LOCK_FILTER option will fail with the error EPERM.

You should remove "unprivileged". I didn't try to check for permissions
because I was just lazy (and I didn't have a need for it). As root, you
can just recreate another socket.

--
Choose a data representation that makes the program simple.
            - The Elements of Programming Style (Kernighan & Plauger)
Re: [PATCH v2] socket.7: Document some BPF-related socket options
Hi Craig,

On 02/29/2016 06:36 PM, Craig Gallek wrote:
> From: Craig Gallek

Thanks for the improvements. I've applied the patch and tweaked things
somewhat, but I have a few comments and queries below. I'd be grateful
if you'd check these, in case I have introduced any errors. (The tweaked
version of the page can be found in the Git repo.)

> Document the behavior and the first kernel version for each of the
> following socket options:
> SO_ATTACH_FILTER
> SO_ATTACH_BPF
> SO_ATTACH_REUSEPORT_CBPF
> SO_ATTACH_REUSEPORT_EBPF
> SO_DETACH_FILTER
> SO_DETACH_BPF
> SO_LOCK_FILTER
>
> Signed-off-by: Craig Gallek
> ---
> v2 changes:
> - Content suggestions from Michael Kerrisk:
>   * Clarify socket filter return value semantics
>   * Clarify wording of minimal kernel versions
>   * Explain behavior of multiple calls using SO_ATTACH_[BPF|FILTER]
>   * Define 'reuseport groups' in SO_ATTACH_REUSEPORT_*
> - Include SO_LOCK_FILTER documentation mostly based off of the wording
>   in the commit message by Vincent Bernat
>   d59577b6ffd3 ("sk-filter: Add ability to lock a socket filter program")
>
> ---
>  man7/socket.7 | 136 +-
>  1 file changed, 115 insertions(+), 21 deletions(-)
>
> diff --git a/man7/socket.7 b/man7/socket.7
> index db7cb8324dde..d22107cc47d7 100644
> --- a/man7/socket.7
> +++ b/man7/socket.7
> @@ -41,9 +41,6 @@
>  .\" SO_GET_FILTER (3.8)
>  .\"     commit a8fc92778080c845eaadc369a0ecf5699a03bef0
>  .\"     Author: Pavel Emelyanov
> -.\" SO_LOCK_FILTER (3.9)
> -.\"     commit d59577b6ffd313d0ab3be39cb1ab47e29bdc9182
> -.\"     Author: Vincent Bernat
>  .\" SO_SELECT_ERR_QUEUE (3.10)
>  .\"     commit 7d4c04fc170087119727119074e72445f2bb192b
>  .\"     Author: Keller, Jacob E
> @@ -53,13 +50,6 @@
>  .\" SO_BPF_EXTENSIONS (3.14)
>  .\"     commit ea02f9411d9faa3553ed09ce0ec9f00ceae9885e
>  .\"     Author: Michal Sekletar
> -.\" SO_ATTACH_BPF (3.19)
> -.\"     and SO_DETACH_BPF as synonym for SO_DETACH_FILTER
> -.\"     commit 89aa075832b0da4402acebd698d0411dcc82d03e
> -.\"     Author: Alexei Starovoitov
> -.\" SO_ATTACH_REUSEPORT_CBPF, SO_ATTACH_REUSEPORT_EBPF (4.5)
> -.\"     commit 538950a1b7527a0a52ccd9337e3fcd304f027f13
> -.\"     Author: Craig Gallek
>  .\"
>  .TH SOCKET 7 2015-05-07 Linux "Linux Programmer's Manual"
>  .SH NAME
> @@ -311,6 +301,90 @@ The value 0 indicates that this is not a listening socket,
>  the value 1 indicates that this is a listening socket.
>  This socket option is read-only.
>  .TP
> +.BR SO_ATTACH_FILTER " and " SO_ATTACH_BPF
> +Attach a classic or extended BPF program (respectively) to the socket
> +for use as a filter of incoming packets. A packet will be dropped if
> +the filter program returns zero. If the filter program returns a
> +non-zero value which is less than the packet's data length, the packet
> +will be truncated to the length returned. If the value returned by
> +the filter is greater than or equal to the packet's data length, the
> +packet is allowed to proceed unmodified.
> +
> +The argument for
> +.BR SO_ATTACH_FILTER
> +is a
> +.I sock_fprog
> +structure in
> +.B .
> +.sp
> +.in +4n
> +.nf
> +struct sock_fprog {
> +    unsigned short len;
> +    struct sock_filter *filter;
> +};
> +.fi
> +.in
> +.IP
> +The argument for
> +.BR SO_ATTACH_BPF
> +is a file descriptor returned by the
> +.BR bpf (2)
> +system call and must refer to a program of type
> +.BR BPF_PROG_TYPE_SOCKET_FILTER.
> +These options may be set multiple times for a given socket, each time
> +replacing the previous filter program. The classic and extended
> +versions may be called on the same socket, but the previous filter
> +will always be replaced such that a socket never has more than one
> +filter defined.
> +
> +.BR SO_ATTACH_FILTER
> +is available since Linux 2.2.
> +.BR SO_ATTACH_BPF
> +is available since Linux 3.19. Both classic and extended BPF are
> +explained in the kernel source file
> +.I Documentation/networking/filter.txt
> +.TP
> +.BR SO_ATTACH_REUSEPORT_CBPF " and " SO_ATTACH_REUSEPORT_EBPF " (since Linux 4.5)"
> +For use with the
> +.BR SO_REUSEPORT
> +option, these options allow the user to set a classic or extended
> +BPF program (respectively) which defines how packets are assigned to
> +the sockets in the reuseport group (that is, all sockets which have
> +.BR SO_REUSEPORT
> +set and are using the same local address to receive packets). The BPF
> +program must return an index between 0 and N-1 representing the socket
> +which should receive the packet (where N is the number of sockets in
> +the group). If the BPF program returns an invalid index, socket
> +selection will fall back to the plain
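
The return-value rules in the quoted patch text (drop, truncate, or pass for socket filters; index or fallback for reuseport programs) can be modeled with two small pure functions. This is a sketch of the documented semantics only, not kernel code: the names delivered_length() and reuseport_pick() are invented for illustration, and -1 is an arbitrary marker chosen here for "dropped" and "fall back to plain SO_REUSEPORT".

```c
#include <stdint.h>

/* Hypothetical model of the socket-filter verdict described above.
 * Returns -1 if the packet is dropped, otherwise the number of bytes
 * delivered to the socket. */
int64_t delivered_length(uint32_t filter_ret, uint32_t pkt_len)
{
    if (filter_ret == 0)
        return -1;                    /* zero: packet is dropped */
    if (filter_ret < pkt_len)
        return (int64_t)filter_ret;   /* shorter: truncated to that length */
    return (int64_t)pkt_len;          /* otherwise: passed unmodified */
}

/* Hypothetical model of reuseport socket selection. Returns the chosen
 * index if it lies in 0..N-1 (N = group_size), or -1 to signal that
 * selection falls back to the plain SO_REUSEPORT mechanism. */
int64_t reuseport_pick(uint32_t bpf_ret, uint32_t group_size)
{
    if (bpf_ret < group_size)
        return (int64_t)bpf_ret;
    return -1;
}
```

For example, under this model a filter returning 40 for a 100-byte packet delivers 40 bytes, and a reuseport program returning 4 for a group of 4 sockets triggers the fallback.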
[PATCH v2] socket.7: Document some BPF-related socket options
From: Craig Gallek

Document the behavior and the first kernel version for each of the
following socket options:
SO_ATTACH_FILTER
SO_ATTACH_BPF
SO_ATTACH_REUSEPORT_CBPF
SO_ATTACH_REUSEPORT_EBPF
SO_DETACH_FILTER
SO_DETACH_BPF
SO_LOCK_FILTER

Signed-off-by: Craig Gallek
---
v2 changes:
- Content suggestions from Michael Kerrisk:
  * Clarify socket filter return value semantics
  * Clarify wording of minimal kernel versions
  * Explain behavior of multiple calls using SO_ATTACH_[BPF|FILTER]
  * Define 'reuseport groups' in SO_ATTACH_REUSEPORT_*
- Include SO_LOCK_FILTER documentation mostly based off of the wording
  in the commit message by Vincent Bernat
  d59577b6ffd3 ("sk-filter: Add ability to lock a socket filter program")

---
 man7/socket.7 | 136 +-
 1 file changed, 115 insertions(+), 21 deletions(-)

diff --git a/man7/socket.7 b/man7/socket.7
index db7cb8324dde..d22107cc47d7 100644
--- a/man7/socket.7
+++ b/man7/socket.7
@@ -41,9 +41,6 @@
 .\" SO_GET_FILTER (3.8)
 .\"     commit a8fc92778080c845eaadc369a0ecf5699a03bef0
 .\"     Author: Pavel Emelyanov
-.\" SO_LOCK_FILTER (3.9)
-.\"     commit d59577b6ffd313d0ab3be39cb1ab47e29bdc9182
-.\"     Author: Vincent Bernat
 .\" SO_SELECT_ERR_QUEUE (3.10)
 .\"     commit 7d4c04fc170087119727119074e72445f2bb192b
 .\"     Author: Keller, Jacob E
@@ -53,13 +50,6 @@
 .\" SO_BPF_EXTENSIONS (3.14)
 .\"     commit ea02f9411d9faa3553ed09ce0ec9f00ceae9885e
 .\"     Author: Michal Sekletar
-.\" SO_ATTACH_BPF (3.19)
-.\"     and SO_DETACH_BPF as synonym for SO_DETACH_FILTER
-.\"     commit 89aa075832b0da4402acebd698d0411dcc82d03e
-.\"     Author: Alexei Starovoitov
-.\" SO_ATTACH_REUSEPORT_CBPF, SO_ATTACH_REUSEPORT_EBPF (4.5)
-.\"     commit 538950a1b7527a0a52ccd9337e3fcd304f027f13
-.\"     Author: Craig Gallek
 .\"
 .TH SOCKET 7 2015-05-07 Linux "Linux Programmer's Manual"
 .SH NAME
@@ -311,6 +301,90 @@ The value 0 indicates that this is not a listening socket,
 the value 1 indicates that this is a listening socket.
 This socket option is read-only.
 .TP
+.BR SO_ATTACH_FILTER " and " SO_ATTACH_BPF
+Attach a classic or extended BPF program (respectively) to the socket
+for use as a filter of incoming packets. A packet will be dropped if
+the filter program returns zero. If the filter program returns a
+non-zero value which is less than the packet's data length, the packet
+will be truncated to the length returned. If the value returned by
+the filter is greater than or equal to the packet's data length, the
+packet is allowed to proceed unmodified.
+
+The argument for
+.BR SO_ATTACH_FILTER
+is a
+.I sock_fprog
+structure in
+.B .
+.sp
+.in +4n
+.nf
+struct sock_fprog {
+    unsigned short len;
+    struct sock_filter *filter;
+};
+.fi
+.in
+.IP
+The argument for
+.BR SO_ATTACH_BPF
+is a file descriptor returned by the
+.BR bpf (2)
+system call and must refer to a program of type
+.BR BPF_PROG_TYPE_SOCKET_FILTER.
+These options may be set multiple times for a given socket, each time
+replacing the previous filter program. The classic and extended
+versions may be called on the same socket, but the previous filter
+will always be replaced such that a socket never has more than one
+filter defined.
+
+.BR SO_ATTACH_FILTER
+is available since Linux 2.2.
+.BR SO_ATTACH_BPF
+is available since Linux 3.19. Both classic and extended BPF are
+explained in the kernel source file
+.I Documentation/networking/filter.txt
+.TP
+.BR SO_ATTACH_REUSEPORT_CBPF " and " SO_ATTACH_REUSEPORT_EBPF " (since Linux 4.5)"
+For use with the
+.BR SO_REUSEPORT
+option, these options allow the user to set a classic or extended
+BPF program (respectively) which defines how packets are assigned to
+the sockets in the reuseport group (that is, all sockets which have
+.BR SO_REUSEPORT
+set and are using the same local address to receive packets). The BPF
+program must return an index between 0 and N-1 representing the socket
+which should receive the packet (where N is the number of sockets in
+the group). If the BPF program returns an invalid index, socket
+selection will fall back to the plain
+.BR SO_REUSEPORT
+mechanism.
+
+Sockets are numbered in the order in which they are added to the group
+(that is, the order of
+.BR bind (2)
+calls for UDP sockets or the order of
+.BR listen (2)
+calls for TCP sockets). New sockets added to a reuseport group will
+inherit the BPF program. When a socket is removed from a reuseport
+group (via
+.BR close (2))
+the last socket in the group will be moved into the closed socket's
+position.
+
+These options may be set repeatedly at any time on any single socket
+in the group to