Re: [PATCH v2] socket.7: Document some BPF-related socket options

2016-03-01 Thread Craig Gallek
On Tue, Mar 1, 2016 at 5:29 AM, Michael Kerrisk (man-pages) wrote:
> On 03/01/2016 11:10 AM, Vincent Bernat wrote:
>>  ❦  1 March 2016 11:03 +0100, "Michael Kerrisk (man-pages)":
>>
>>>   Once   the   SO_LOCK_FILTER  option  has  been  enabled,
>>>   attempts by an unprivileged process to change or  remove
>>>   the  filter  attached  to  a  socket,  or to disable the
>>>   SO_LOCK_FILTER option will fail with the error EPERM.
>>
>> You should remove "unprivileged". I didn't try to check for permissions
>> because I was just lazy (and I didn't have a need for it). As root, you
>> can just recreate another socket.
>
> Bother. That's what I meant to do, and then I omitted to do it! Done now.
> And thanks for catching that, Vincent.
>
> Revised text below, with another query.
>
>SO_LOCK_FILTER
>   When set, this option will prevent changing the  filters
>   associated  with  the socket.  These filters include any
>   set   using   the   socket   options   SO_ATTACH_FILTER,
>   SO_ATTACH_BPF,  SO_ATTACH_REUSEPORT_CBPF   and
>   SO_ATTACH_REUSEPORT_EBPF.
>
>   The typical use case is for a privileged process to  set
>   up  a  socket with restrictive filters, set SO_LOCK_FIL‐
>   TER, and then either drop its  privileges  or  pass  the
>   socket file descriptor to an unprivileged process.
>
>   Once   the   SO_LOCK_FILTER  option  has  been  enabled,
>   attempts to change or remove the filter  attached  to  a
>   socket,  or  to  disable  the SO_LOCK_FILTER option will
>   fail with the error EPERM.
>
> I think the second paragraph should probably drop mention of privileges,
> right? In fact, maybe just drop the paragraph altogether?
Thanks Michael, all of your changes in the git tree look good to me.
I parsed the one-way nature of LOCK_FILTER completely backwards from
the commit message.  It's describing BSD's root-modify behavior, not
the implementation in Linux.  I think I like this last paragraph as
you have it to explicitly call out this as intended behavior.

Thanks again,
Craig

> Cheers,
>
> Michael
>
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/


Re: [PATCH v2] socket.7: Document some BPF-related socket options

2016-03-01 Thread Michael Kerrisk (man-pages)
On 03/01/2016 11:10 AM, Vincent Bernat wrote:
>  ❦  1 March 2016 11:03 +0100, "Michael Kerrisk (man-pages)":
> 
>>   Once   the   SO_LOCK_FILTER  option  has  been  enabled,
>>   attempts by an unprivileged process to change or  remove
>>   the  filter  attached  to  a  socket,  or to disable the
>>   SO_LOCK_FILTER option will fail with the error EPERM.
> 
> You should remove "unprivileged". I didn't try to check for permissions
> because I was just lazy (and I didn't have a need for it). As root, you
> can just recreate another socket.

Bother. That's what I meant to do, and then I omitted to do it! Done now.
And thanks for catching that, Vincent.

Revised text below, with another query.

   SO_LOCK_FILTER
  When set, this option will prevent changing the  filters
  associated  with  the socket.  These filters include any
  set   using   the   socket   options   SO_ATTACH_FILTER,
  SO_ATTACH_BPF,  SO_ATTACH_REUSEPORT_CBPF   and
  SO_ATTACH_REUSEPORT_EBPF.

  The typical use case is for a privileged process to  set
  up  a  socket with restrictive filters, set SO_LOCK_FIL‐
  TER, and then either drop its  privileges  or  pass  the
  socket file descriptor to an unprivileged process.

  Once   the   SO_LOCK_FILTER  option  has  been  enabled,
  attempts to change or remove the filter  attached  to  a
  socket,  or  to  disable  the SO_LOCK_FILTER option will
  fail with the error EPERM.

I think the second paragraph should probably drop mention of privileges,
right? In fact, maybe just drop the paragraph altogether?

Cheers,

Michael
 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


Re: [PATCH v2] socket.7: Document some BPF-related socket options

2016-03-01 Thread Vincent Bernat
 ❦  1 March 2016 11:03 +0100, "Michael Kerrisk (man-pages)":

>   Once   the   SO_LOCK_FILTER  option  has  been  enabled,
>   attempts by an unprivileged process to change or  remove
>   the  filter  attached  to  a  socket,  or to disable the
>   SO_LOCK_FILTER option will fail with the error EPERM.

You should remove "unprivileged". I didn't try to check for permissions
because I was just lazy (and I didn't have a need for it). As root, you
can just recreate another socket.
-- 
Choose a data representation that makes the program simple.
- The Elements of Programming Style (Kernighan & Plauger)


Re: [PATCH v2] socket.7: Document some BPF-related socket options

2016-03-01 Thread Michael Kerrisk (man-pages)
Hi Craig,

On 02/29/2016 06:36 PM, Craig Gallek wrote:
> From: Craig Gallek 

Thanks for the improvements. I've applied the patch and tweaked things
somewhat, but I have a few comments and queries below. I'd be 
grateful if you'd check these, in case I have introduced any errors.
(The tweaked version of the page can be found in the Git repo.)

> Document the behavior and the first kernel version for each of the
> following socket options:
> SO_ATTACH_FILTER
> SO_ATTACH_BPF
> SO_ATTACH_REUSEPORT_CBPF
> SO_ATTACH_REUSEPORT_EBPF
> SO_DETACH_FILTER
> SO_DETACH_BPF
> SO_LOCK_FILTER
> 
> Signed-off-by: Craig Gallek 
> ---
> v2 changes:
> - Content suggestions from Michael Kerrisk :
>   * Clarify socket filter return value semantics
>   * Clarify wording of minimal kernel versions
>   * Explain behavior of multiple calls using SO_ATTACH_[BPF|FILTER]
>   * Define 'reuseport groups' in SO_ATTACH_REUSEPORT_*
> - Include SO_LOCK_FILTER documentation mostly based off of the wording
>   in the commit message by Vincent Bernat 
>   d59577b6ffd3 ("sk-filter: Add ability to lock a socket filter program")
> 
> ---
>  man7/socket.7 | 136 +-
>  1 file changed, 115 insertions(+), 21 deletions(-)
> 
> diff --git a/man7/socket.7 b/man7/socket.7
> index db7cb8324dde..d22107cc47d7 100644
> --- a/man7/socket.7
> +++ b/man7/socket.7
> @@ -41,9 +41,6 @@
>  .\"  SO_GET_FILTER (3.8)
>  .\"  commit a8fc92778080c845eaadc369a0ecf5699a03bef0
>  .\"  Author: Pavel Emelyanov 
> -.\"  SO_LOCK_FILTER (3.9)
> -.\"  commit d59577b6ffd313d0ab3be39cb1ab47e29bdc9182
> -.\"  Author: Vincent Bernat 
>  .\"  SO_SELECT_ERR_QUEUE (3.10)
>  .\" commit 7d4c04fc170087119727119074e72445f2bb192b
>  .\"  Author: Keller, Jacob E 
> @@ -53,13 +50,6 @@
>  .\" SO_BPF_EXTENSIONS (3.14)
>  .\" commit ea02f9411d9faa3553ed09ce0ec9f00ceae9885e
>  .\"  Author: Michal Sekletar 
> -.\" SO_ATTACH_BPF (3.19)
> -.\" and SO_DETACH_BPF as synonym for SO_DETACH_FILTER
> -.\" commit 89aa075832b0da4402acebd698d0411dcc82d03e
> -.\"  Author: Alexei Starovoitov 
> -.\"  SO_ATTACH_REUSEPORT_CBPF, SO_ATTACH_REUSEPORT_EBPF (4.5)
> -.\"  commit 538950a1b7527a0a52ccd9337e3fcd304f027f13
> -.\"  Author: Craig Gallek 
>  .\"
>  .TH SOCKET 7 2015-05-07 Linux "Linux Programmer's Manual"
>  .SH NAME
> @@ -311,6 +301,90 @@ The value 0 indicates that this is not a listening socket,
>  the value 1 indicates that this is a listening socket.
>  This socket option is read-only.
>  .TP
> +.BR SO_ATTACH_FILTER " and " SO_ATTACH_BPF
> +Attach a classic or extended BPF program (respectively) to the socket
> +for use as a filter of incoming packets. A packet will be dropped if
> +the filter program returns zero.  If the filter program returns a
> +non-zero value which is less than the packet's data length, the packet
> +will be truncated to the length returned.  If the value returned by
> +the filter is greater than or equal to the packet's data length, the
> +packet is allowed to proceed unmodified.
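
To make the three return-value cases above concrete, here is a small model of
them in C. It only restates the wording of the paragraph and is not the
kernel's code; delivered_length() is a hypothetical helper name.

```c
#include <stddef.h>

/* Hypothetical helper: how many bytes of a packet reach the socket,
 * given the filter program's return value and the packet's data length. */
size_t delivered_length(unsigned int filter_ret, size_t pkt_len)
{
    if (filter_ret == 0)
        return 0;            /* filter returned zero: packet dropped */
    if (filter_ret < pkt_len)
        return filter_ret;   /* shorter than the data: packet truncated */
    return pkt_len;          /* greater or equal: packet unmodified */
}
```

For a 100-byte packet, a return value of 60 gives 60 bytes, and a return value
of 0xffffffff (the usual "accept everything" constant) gives all 100.
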
> +
> +The argument for
> +.BR SO_ATTACH_FILTER
> +is a
> +.I sock_fprog
> +structure in
> +.B <linux/filter.h>.
> +.sp
> +.in +4n
> +.nf
> +struct sock_fprog {
> +unsigned short  len;
> +struct sock_filter *filter;
> +};
> +.fi
> +.in
> +.IP
> +The argument for
> +.BR SO_ATTACH_BPF
> +is a file descriptor returned by the
> +.BR bpf (2)
> +system call and must refer to a program of type
> +.BR BPF_PROG_TYPE_SOCKET_FILTER.
> +These options may be set multiple times for a given socket, each time
> +replacing the previous filter program.  The classic and extended
> +versions may be called on the same socket, but the previous filter
> +will always be replaced such that a socket never has more than one
> +filter defined.
> +
> +.BR SO_ATTACH_FILTER
> +is available since Linux 2.2.
> +.BR SO_ATTACH_BPF
> +is available since Linux 3.19.  Both classic and extended BPF are
> +explained in the kernel source file
> +.I Documentation/networking/filter.txt
> +.TP
> +.BR SO_ATTACH_REUSEPORT_CBPF " and " SO_ATTACH_REUSEPORT_EBPF " (since Linux 4.5)"
> +For use with the
> +.BR SO_REUSEPORT
> +option, these options allow the user to set a classic or extended
> +BPF program (respectively) which defines how packets are assigned to
> +the sockets in the reuseport group (that is, all sockets which have
> +.BR SO_REUSEPORT
> +set and are using the same local address to receive packets).  The BPF
> +program must return an index between 0 and N-1 representing the socket
> +which should receive the packet (where N is the number of sockets in
> +the group). If the BPF program returns an invalid index, socket
> +selection will fall back to the plain
> 

[PATCH v2] socket.7: Document some BPF-related socket options

2016-02-29 Thread Craig Gallek
From: Craig Gallek 

Document the behavior and the first kernel version for each of the
following socket options:
SO_ATTACH_FILTER
SO_ATTACH_BPF
SO_ATTACH_REUSEPORT_CBPF
SO_ATTACH_REUSEPORT_EBPF
SO_DETACH_FILTER
SO_DETACH_BPF
SO_LOCK_FILTER

Signed-off-by: Craig Gallek 
---
v2 changes:
- Content suggestions from Michael Kerrisk :
  * Clarify socket filter return value semantics
  * Clarify wording of minimal kernel versions
  * Explain behavior of multiple calls using SO_ATTACH_[BPF|FILTER]
  * Define 'reuseport groups' in SO_ATTACH_REUSEPORT_*
- Include SO_LOCK_FILTER documentation mostly based off of the wording
  in the commit message by Vincent Bernat 
  d59577b6ffd3 ("sk-filter: Add ability to lock a socket filter program")

---
 man7/socket.7 | 136 +-
 1 file changed, 115 insertions(+), 21 deletions(-)

diff --git a/man7/socket.7 b/man7/socket.7
index db7cb8324dde..d22107cc47d7 100644
--- a/man7/socket.7
+++ b/man7/socket.7
@@ -41,9 +41,6 @@
 .\"SO_GET_FILTER (3.8)
 .\"commit a8fc92778080c845eaadc369a0ecf5699a03bef0
 .\"Author: Pavel Emelyanov 
-.\"SO_LOCK_FILTER (3.9)
-.\"commit d59577b6ffd313d0ab3be39cb1ab47e29bdc9182
-.\"Author: Vincent Bernat 
 .\"SO_SELECT_ERR_QUEUE (3.10)
 .\" commit 7d4c04fc170087119727119074e72445f2bb192b
 .\"Author: Keller, Jacob E 
@@ -53,13 +50,6 @@
 .\" SO_BPF_EXTENSIONS (3.14)
 .\" commit ea02f9411d9faa3553ed09ce0ec9f00ceae9885e
 .\"Author: Michal Sekletar 
-.\" SO_ATTACH_BPF (3.19)
-.\" and SO_DETACH_BPF as synonym for SO_DETACH_FILTER
-.\" commit 89aa075832b0da4402acebd698d0411dcc82d03e
-.\"Author: Alexei Starovoitov 
-.\"SO_ATTACH_REUSEPORT_CBPF, SO_ATTACH_REUSEPORT_EBPF (4.5)
-.\"commit 538950a1b7527a0a52ccd9337e3fcd304f027f13
-.\"Author: Craig Gallek 
 .\"
 .TH SOCKET 7 2015-05-07 Linux "Linux Programmer's Manual"
 .SH NAME
@@ -311,6 +301,90 @@ The value 0 indicates that this is not a listening socket,
 the value 1 indicates that this is a listening socket.
 This socket option is read-only.
 .TP
+.BR SO_ATTACH_FILTER " and " SO_ATTACH_BPF
+Attach a classic or extended BPF program (respectively) to the socket
+for use as a filter of incoming packets. A packet will be dropped if
+the filter program returns zero.  If the filter program returns a
+non-zero value which is less than the packet's data length, the packet
+will be truncated to the length returned.  If the value returned by
+the filter is greater than or equal to the packet's data length, the
+packet is allowed to proceed unmodified.
+
+The argument for
+.BR SO_ATTACH_FILTER
+is a
+.I sock_fprog
+structure in
+.B <linux/filter.h>.
+.sp
+.in +4n
+.nf
+struct sock_fprog {
+unsigned short  len;
+struct sock_filter *filter;
+};
+.fi
+.in
+.IP
+The argument for
+.BR SO_ATTACH_BPF
+is a file descriptor returned by the
+.BR bpf (2)
+system call and must refer to a program of type
+.BR BPF_PROG_TYPE_SOCKET_FILTER.
+These options may be set multiple times for a given socket, each time
+replacing the previous filter program.  The classic and extended
+versions may be called on the same socket, but the previous filter
+will always be replaced such that a socket never has more than one
+filter defined.
+
+.BR SO_ATTACH_FILTER
+is available since Linux 2.2.
+.BR SO_ATTACH_BPF
+is available since Linux 3.19.  Both classic and extended BPF are
+explained in the kernel source file
+.I Documentation/networking/filter.txt
+.TP
+.BR SO_ATTACH_REUSEPORT_CBPF " and " SO_ATTACH_REUSEPORT_EBPF " (since Linux 4.5)"
+For use with the
+.BR SO_REUSEPORT
+option, these options allow the user to set a classic or extended
+BPF program (respectively) which defines how packets are assigned to
+the sockets in the reuseport group (that is, all sockets which have
+.BR SO_REUSEPORT
+set and are using the same local address to receive packets).  The BPF
+program must return an index between 0 and N-1 representing the socket
+which should receive the packet (where N is the number of sockets in
+the group). If the BPF program returns an invalid index, socket
+selection will fall back to the plain
+.BR SO_REUSEPORT
+mechanism.
+
+Sockets are numbered in the order in which they are added to the group
+(that is, the order of
+.BR bind (2)
+calls for UDP sockets or the order of
+.BR listen (2)
+calls for TCP sockets).  New sockets added to a reuseport group will
+inherit the BPF program.  When a socket is removed from a reuseport
+group (via
+.BR close (2))
+the last socket in the group will be moved into the closed socket's
+position.
+
+These options may be set repeatedly at any time on any single socket
+in the group to