Re: [PATCH net-next v4 3/4] bpf: BPF for lightweight tunnel infrastructure

2016-12-01 Thread Thomas Graf
On 12/01/16 at 01:08pm, Daniel Borkmann wrote:
> For the verifier change in may_access_direct_pkt_data(), would be
> great if you could later on follow up with a selftest-suite case,
> one where BPF_PROG_TYPE_LWT_IN/OUT prog tries to write and fails,
> and one where BPF_PROG_TYPE_LWT_IN/OUT prog uses pkt data to pass
> to helpers, for example, so that we can keep testing it when future
> changes in that area are made. Thanks.

Good idea, will do.


Re: [PATCH net-next v4 3/4] bpf: BPF for lightweight tunnel infrastructure

2016-12-01 Thread Daniel Borkmann

On 11/30/2016 05:10 PM, Thomas Graf wrote:

Registers new BPF program types which correspond to the LWT hooks:
   - BPF_PROG_TYPE_LWT_IN   => dst_input()
   - BPF_PROG_TYPE_LWT_OUT  => dst_output()
   - BPF_PROG_TYPE_LWT_XMIT => lwtunnel_xmit()

The separate program types are required to differentiate between the
capabilities each LWT hook allows:

  * Programs attached to dst_input() or dst_output() are restricted and
may only read the data of an skb. This prevent modification and
possible invalidation of already validated packet headers on receive
and the construction of illegal headers while the IP headers are
still being assembled.

  * Programs attached to lwtunnel_xmit() are allowed to modify packet
content as well as prepending an L2 header via a newly introduced
helper bpf_skb_change_head(). This is safe as lwtunnel_xmit() is
invoked after the IP header has been assembled completely.

[...]


Signed-off-by: Thomas Graf 


LGTMAFAICT, so:

Acked-by: Daniel Borkmann 

For the verifier change in may_access_direct_pkt_data(), would be
great if you could later on follow up with a selftest-suite case,
one where BPF_PROG_TYPE_LWT_IN/OUT prog tries to write and fails,
and one where BPF_PROG_TYPE_LWT_IN/OUT prog uses pkt data to pass
to helpers, for example, so that we can keep testing it when future
changes in that area are made. Thanks.


Re: [PATCH net-next v4 3/4] bpf: BPF for lightweight tunnel infrastructure

2016-11-30 Thread Alexei Starovoitov
On Wed, Nov 30, 2016 at 05:10:10PM +0100, Thomas Graf wrote:
> Registers new BPF program types which correspond to the LWT hooks:
>   - BPF_PROG_TYPE_LWT_IN   => dst_input()
>   - BPF_PROG_TYPE_LWT_OUT  => dst_output()
>   - BPF_PROG_TYPE_LWT_XMIT => lwtunnel_xmit()
> 
> The separate program types are required to differentiate between the
> capabilities each LWT hook allows:
> 
>  * Programs attached to dst_input() or dst_output() are restricted and
>may only read the data of an skb. This prevent modification and
>possible invalidation of already validated packet headers on receive
>and the construction of illegal headers while the IP headers are
>still being assembled.
> 
>  * Programs attached to lwtunnel_xmit() are allowed to modify packet
>content as well as prepending an L2 header via a newly introduced
>helper bpf_skb_change_head(). This is safe as lwtunnel_xmit() is
>invoked after the IP header has been assembled completely.
> 
> All BPF programs receive an skb with L3 headers attached and may return
> one of the following error codes:
> 
>  BPF_OK - Continue routing as per nexthop
>  BPF_DROP - Drop skb and return EPERM
>  BPF_REDIRECT - Redirect skb to device as per redirect() helper.
> (Only valid in lwtunnel_xmit() context)
> 
> The return codes are binary compatible with their TC_ACT_
> relatives to ease compatibility.
> 
> Signed-off-by: Thomas Graf 

Looks great.

Acked-by: Alexei Starovoitov 



[PATCH net-next v4 3/4] bpf: BPF for lightweight tunnel infrastructure

2016-11-30 Thread Thomas Graf
Registers new BPF program types which correspond to the LWT hooks:
  - BPF_PROG_TYPE_LWT_IN   => dst_input()
  - BPF_PROG_TYPE_LWT_OUT  => dst_output()
  - BPF_PROG_TYPE_LWT_XMIT => lwtunnel_xmit()

The separate program types are required to differentiate between the
capabilities each LWT hook allows:

 * Programs attached to dst_input() or dst_output() are restricted and
   may only read the data of an skb. This prevent modification and
   possible invalidation of already validated packet headers on receive
   and the construction of illegal headers while the IP headers are
   still being assembled.

 * Programs attached to lwtunnel_xmit() are allowed to modify packet
   content as well as prepending an L2 header via a newly introduced
   helper bpf_skb_change_head(). This is safe as lwtunnel_xmit() is
   invoked after the IP header has been assembled completely.

All BPF programs receive an skb with L3 headers attached and may return
one of the following error codes:

 BPF_OK - Continue routing as per nexthop
 BPF_DROP - Drop skb and return EPERM
 BPF_REDIRECT - Redirect skb to device as per redirect() helper.
(Only valid in lwtunnel_xmit() context)

The return codes are binary compatible with their TC_ACT_
relatives to ease compatibility.

Signed-off-by: Thomas Graf 
---
 include/linux/filter.h|   2 +-
 include/uapi/linux/bpf.h  |  32 +++-
 include/uapi/linux/lwtunnel.h |  23 +++
 kernel/bpf/verifier.c |  14 +-
 net/Kconfig   |   8 +
 net/core/Makefile |   1 +
 net/core/filter.c | 173 ++
 net/core/lwt_bpf.c| 396 ++
 net/core/lwtunnel.c   |   2 +
 9 files changed, 646 insertions(+), 5 deletions(-)
 create mode 100644 net/core/lwt_bpf.c

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 7f246a2..7ba6446 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -438,7 +438,7 @@ struct xdp_buff {
 };
 
 /* compute the linear packet data range [data, data_end) which
- * will be accessed by cls_bpf and act_bpf programs
+ * will be accessed by cls_bpf, act_bpf and lwt programs
  */
 static inline void bpf_compute_data_end(struct sk_buff *skb)
 {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 1370a9d..22ac827 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -101,6 +101,9 @@ enum bpf_prog_type {
BPF_PROG_TYPE_XDP,
BPF_PROG_TYPE_PERF_EVENT,
BPF_PROG_TYPE_CGROUP_SKB,
+   BPF_PROG_TYPE_LWT_IN,
+   BPF_PROG_TYPE_LWT_OUT,
+   BPF_PROG_TYPE_LWT_XMIT,
 };
 
 enum bpf_attach_type {
@@ -409,6 +412,16 @@ union bpf_attr {
  *
  * int bpf_get_numa_node_id()
  * Return: Id of current NUMA node.
+ *
+ * int bpf_skb_change_head()
+ * Grows headroom of skb and adjusts MAC header offset accordingly.
+ * Will extends/reallocae as required automatically.
+ * May change skb data pointer and will thus invalidate any check
+ * performed for direct packet access.
+ * @skb: pointer to skb
+ * @len: length of header to be pushed in front
+ * @flags: Flags (unused for now)
+ * Return: 0 on success or negative error
  */
 #define __BPF_FUNC_MAPPER(FN)  \
FN(unspec), \
@@ -453,7 +466,8 @@ union bpf_attr {
FN(skb_pull_data),  \
FN(csum_update),\
FN(set_hash_invalid),   \
-   FN(get_numa_node_id),
+   FN(get_numa_node_id),   \
+   FN(skb_change_head),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
@@ -537,6 +551,22 @@ struct bpf_tunnel_key {
__u32 tunnel_label;
 };
 
+/* Generic BPF return codes which all BPF program types may support.
+ * The values are binary compatible with their TC_ACT_* counter-part to
+ * provide backwards compatibility with existing SCHED_CLS and SCHED_ACT
+ * programs.
+ *
+ * XDP is handled seprately, see XDP_*.
+ */
+enum bpf_ret_code {
+   BPF_OK = 0,
+   /* 1 reserved */
+   BPF_DROP = 2,
+   /* 3-6 reserved */
+   BPF_REDIRECT = 7,
+   /* >127 are reserved for prog type specific return codes */
+};
+
 /* User return codes for XDP prog type.
  * A valid XDP program must return one of these defined values. All other
  * return codes are reserved for future use. Unknown return codes will result
diff --git a/include/uapi/linux/lwtunnel.h b/include/uapi/linux/lwtunnel.h
index 453cc62..92724cba 100644
--- a/include/uapi/linux/lwtunnel.h
+++ b/include/uapi/linux/lwtunnel.h
@@ -10,6 +10,7 @@ enum lwtunnel_encap_types {
LWTUNNEL_ENCAP_ILA,
LWTUNNEL_ENCAP_IP6,
LWTUNNEL_ENCAP_SEG6,
+   LWTUNNEL_ENCAP_BPF,
__LWTUNNEL_ENCAP_MAX,
 };
 
@@ -43,4 +44,26 @@ enum lwtunnel_ip6_t {
 
 #define LWTUNNEL_IP6_MAX (__LWTUNNEL_IP6_MAX - 1)
 
+enum {
+   LWT_BPF_PROG_UNSPEC,
+   LWT_B