Re: [bpf-next, v4 0/5] Introduce eBPF flow dissector

2018-09-24 Thread Willem de Bruijn
On Fri, Sep 14, 2018 at 5:51 PM Petar Penkov  wrote:
>
> On Fri, Sep 14, 2018 at 2:47 PM, Y Song  wrote:
> > On Fri, Sep 14, 2018 at 12:24 PM Alexei Starovoitov
> >  wrote:
> >>
> >> On Fri, Sep 14, 2018 at 07:46:17AM -0700, Petar Penkov wrote:
> >> > From: Petar Penkov 
> >> >
> >> > This patch series hardens the RX stack by allowing flow dissection in BPF,
> >> > as previously discussed [1]. Because of the rigorous checks of the BPF
> >> > verifier, this provides significant security guarantees. In particular, the
> >> > BPF flow dissector cannot get inside of an infinite loop, as with
> >> > CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
> >> > read outside of packet bounds, because all memory accesses are checked.
> >> > Also, with BPF the administrator can decide which protocols to support,
> >> > reducing potential attack surface. Rarely encountered protocols can be
> >> > excluded from dissection and the program can be updated without kernel
> >> > recompile or reboot if a bug is discovered.
> >> >
> >> > Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
> >> > This includes a new BPF program and attach type.
> >> >
> >> > Patch 2 adds the new BPF flow dissector definitions to tools/uapi.
> >> >
> >> > Patch 3 adds support for the new BPF program type to libbpf and bpftool.
> >> >
> >> > Patch 4 adds a flow dissector program in BPF. This parses most protocols in
> >> > __skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
> >> > and address types).
> >> >
> >> > Patch 5 adds a selftest that attaches the BPF program to the flow dissector
> >> > and sends traffic with different levels of encapsulation.
> >> >
> >> > Performance Evaluation:
> >> > The in-kernel implementation was compared against the demo program from
> >> > patch 4 using the test in patch 5 with IPv4/UDP traffic over 10 seconds.
> >> >   $perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \
> >> >   -t 10
> >>
> >> Looks great. Applied to bpf-next with one extra patch:
> >>  SEC("dissect")
> >> -int dissect(struct __sk_buff *skb)
> >> +int _dissect(struct __sk_buff *skb)
> >>
> >> otherwise the test doesn't build.
> >> I'm not sure how it builds for you. Which llvm did you use?
> >
> > This is a known issue. IIRC, llvm <= 4 should be okay and llvm >= 5 would fail.
> >
> I was running a much older version of llvm so I imagine this was the
> issue. Thanks for the fix!
> >>
> >> Also above command works and ipv4 test in ./test_flow_dissector.sh
> >> is passing as well, but it still fails at the end for me:
> >> ./test_flow_dissector.sh
> >> bpffs not mounted. Mounting...
> >> 0: IP
> >> 1: IPV6
> >> 2: IPV6OP
> >> 3: IPV6FR
> >> 4: MPLS
> >> 5: VLAN
> >> Testing IPv4...
> >> inner.dest4: 127.0.0.1
> >> inner.source4: 127.0.0.3
> >> pkts: tx=10 rx=10
> >> inner.dest4: 127.0.0.1
> >> inner.source4: 127.0.0.3
> >> pkts: tx=10 rx=0
> >> inner.dest4: 127.0.0.1
> >> inner.source4: 127.0.0.3
> >> pkts: tx=10 rx=10
> >> Testing IPIP...
> >> tunnels before test:
> >> tunl0: any/ip remote any local any ttl inherit nopmtudisc
> >> sit_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
> >> ipip_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
> >> sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
> >> gre_test_LV5N: gre/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
> >> gre0: gre/ip remote any local any ttl inherit nopmtudisc
> >> inner.dest4: 192.168.0.1
> >> inner.source4: 1.1.1.1
> >> encap proto:   4
> >> outer.dest4: 127.0.0.1
> >> outer.source4: 127.0.0.2
> >> pkts: tx=10 rx=0
> >> tunnels after test:
> >> tunl0: any/ip remote any local any ttl inherit nopmtudisc
> >> sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
> >> gre0: gre/ip remote any local any ttl inherit nopmtudisc
> >> selftests: test_flow_dissector [FAILED]
> >>
> >> is it something in my setup or test is broken?
> >>
> I just reran the test and it is passing. We will investigate what
> could be causing the issue.

I've tried, but I am still not able to reproduce this exact issue.

This may be due to how we split the compilation and execution.

With

  make defconfig &&
  make kvmconfig &&
  make kselftest-merge &&
  make -j $N bzImage

and starting the result in qemu -kernel I miss a few built-ins. The test
complains loudly if sch_ingress or ipip are missing. After updating to
=y, the tests all pass for me.

The confounding part is that in the above output the test shows no
error output and all tunnels are set up correctly.

For debugging purposes, I can update the script to run all tests instead
of exiting on the first failure and to output more state, including the
address on the device and the state of the netns.


Re: [bpf-next, v4 0/5] Introduce eBPF flow dissector

2018-09-14 Thread Petar Penkov
On Fri, Sep 14, 2018 at 2:47 PM, Y Song  wrote:
> On Fri, Sep 14, 2018 at 12:24 PM Alexei Starovoitov
>  wrote:
>>
>> On Fri, Sep 14, 2018 at 07:46:17AM -0700, Petar Penkov wrote:
>> > From: Petar Penkov 
>> >
>> > This patch series hardens the RX stack by allowing flow dissection in BPF,
>> > as previously discussed [1]. Because of the rigorous checks of the BPF
>> > verifier, this provides significant security guarantees. In particular, the
>> > BPF flow dissector cannot get inside of an infinite loop, as with
>> > CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
>> > read outside of packet bounds, because all memory accesses are checked.
>> > Also, with BPF the administrator can decide which protocols to support,
>> > reducing potential attack surface. Rarely encountered protocols can be
>> > excluded from dissection and the program can be updated without kernel
>> > recompile or reboot if a bug is discovered.
>> >
>> > Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
>> > This includes a new BPF program and attach type.
>> >
>> > Patch 2 adds the new BPF flow dissector definitions to tools/uapi.
>> >
>> > Patch 3 adds support for the new BPF program type to libbpf and bpftool.
>> >
>> > Patch 4 adds a flow dissector program in BPF. This parses most protocols in
>> > __skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
>> > and address types).
>> >
>> > Patch 5 adds a selftest that attaches the BPF program to the flow dissector
>> > and sends traffic with different levels of encapsulation.
>> >
>> > Performance Evaluation:
>> > The in-kernel implementation was compared against the demo program from
>> > patch 4 using the test in patch 5 with IPv4/UDP traffic over 10 seconds.
>> >   $perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \
>> >   -t 10
>>
>> Looks great. Applied to bpf-next with one extra patch:
>>  SEC("dissect")
>> -int dissect(struct __sk_buff *skb)
>> +int _dissect(struct __sk_buff *skb)
>>
>> otherwise the test doesn't build.
>> I'm not sure how it builds for you. Which llvm did you use?
>
> This is a known issue. IIRC, llvm <= 4 should be okay and llvm >= 5 would fail.
>
I was running a much older version of llvm so I imagine this was the
issue. Thanks for the fix!
>>
>> Also above command works and ipv4 test in ./test_flow_dissector.sh
>> is passing as well, but it still fails at the end for me:
>> ./test_flow_dissector.sh
>> bpffs not mounted. Mounting...
>> 0: IP
>> 1: IPV6
>> 2: IPV6OP
>> 3: IPV6FR
>> 4: MPLS
>> 5: VLAN
>> Testing IPv4...
>> inner.dest4: 127.0.0.1
>> inner.source4: 127.0.0.3
>> pkts: tx=10 rx=10
>> inner.dest4: 127.0.0.1
>> inner.source4: 127.0.0.3
>> pkts: tx=10 rx=0
>> inner.dest4: 127.0.0.1
>> inner.source4: 127.0.0.3
>> pkts: tx=10 rx=10
>> Testing IPIP...
>> tunnels before test:
>> tunl0: any/ip remote any local any ttl inherit nopmtudisc
>> sit_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
>> ipip_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
>> sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
>> gre_test_LV5N: gre/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
>> gre0: gre/ip remote any local any ttl inherit nopmtudisc
>> inner.dest4: 192.168.0.1
>> inner.source4: 1.1.1.1
>> encap proto:   4
>> outer.dest4: 127.0.0.1
>> outer.source4: 127.0.0.2
>> pkts: tx=10 rx=0
>> tunnels after test:
>> tunl0: any/ip remote any local any ttl inherit nopmtudisc
>> sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
>> gre0: gre/ip remote any local any ttl inherit nopmtudisc
>> selftests: test_flow_dissector [FAILED]
>>
>> is it something in my setup or test is broken?
>>
I just reran the test and it is passing. We will investigate what
could be causing the issue.


Re: [bpf-next, v4 0/5] Introduce eBPF flow dissector

2018-09-14 Thread Y Song
On Fri, Sep 14, 2018 at 12:24 PM Alexei Starovoitov
 wrote:
>
> On Fri, Sep 14, 2018 at 07:46:17AM -0700, Petar Penkov wrote:
> > From: Petar Penkov 
> >
> > This patch series hardens the RX stack by allowing flow dissection in BPF,
> > as previously discussed [1]. Because of the rigorous checks of the BPF
> > verifier, this provides significant security guarantees. In particular, the
> > BPF flow dissector cannot get inside of an infinite loop, as with
> > CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
> > read outside of packet bounds, because all memory accesses are checked.
> > Also, with BPF the administrator can decide which protocols to support,
> > reducing potential attack surface. Rarely encountered protocols can be
> > excluded from dissection and the program can be updated without kernel
> > recompile or reboot if a bug is discovered.
> >
> > Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
> > This includes a new BPF program and attach type.
> >
> > Patch 2 adds the new BPF flow dissector definitions to tools/uapi.
> >
> > Patch 3 adds support for the new BPF program type to libbpf and bpftool.
> >
> > Patch 4 adds a flow dissector program in BPF. This parses most protocols in
> > __skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
> > and address types).
> >
> > Patch 5 adds a selftest that attaches the BPF program to the flow dissector
> > and sends traffic with different levels of encapsulation.
> >
> > Performance Evaluation:
> > The in-kernel implementation was compared against the demo program from
> > patch 4 using the test in patch 5 with IPv4/UDP traffic over 10 seconds.
> >   $perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \
> >   -t 10
>
> Looks great. Applied to bpf-next with one extra patch:
>  SEC("dissect")
> -int dissect(struct __sk_buff *skb)
> +int _dissect(struct __sk_buff *skb)
>
> otherwise the test doesn't build.
> I'm not sure how it builds for you. Which llvm did you use?

This is a known issue. IIRC, llvm <= 4 should be okay and llvm >= 5 would fail.

>
> Also above command works and ipv4 test in ./test_flow_dissector.sh
> is passing as well, but it still fails at the end for me:
> ./test_flow_dissector.sh
> bpffs not mounted. Mounting...
> 0: IP
> 1: IPV6
> 2: IPV6OP
> 3: IPV6FR
> 4: MPLS
> 5: VLAN
> Testing IPv4...
> inner.dest4: 127.0.0.1
> inner.source4: 127.0.0.3
> pkts: tx=10 rx=10
> inner.dest4: 127.0.0.1
> inner.source4: 127.0.0.3
> pkts: tx=10 rx=0
> inner.dest4: 127.0.0.1
> inner.source4: 127.0.0.3
> pkts: tx=10 rx=10
> Testing IPIP...
> tunnels before test:
> tunl0: any/ip remote any local any ttl inherit nopmtudisc
> sit_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
> ipip_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
> sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
> gre_test_LV5N: gre/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
> gre0: gre/ip remote any local any ttl inherit nopmtudisc
> inner.dest4: 192.168.0.1
> inner.source4: 1.1.1.1
> encap proto:   4
> outer.dest4: 127.0.0.1
> outer.source4: 127.0.0.2
> pkts: tx=10 rx=0
> tunnels after test:
> tunl0: any/ip remote any local any ttl inherit nopmtudisc
> sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
> gre0: gre/ip remote any local any ttl inherit nopmtudisc
> selftests: test_flow_dissector [FAILED]
>
> is it something in my setup or test is broken?
>


Re: [bpf-next, v4 0/5] Introduce eBPF flow dissector

2018-09-14 Thread Alexei Starovoitov
On Fri, Sep 14, 2018 at 07:46:17AM -0700, Petar Penkov wrote:
> From: Petar Penkov 
> 
> This patch series hardens the RX stack by allowing flow dissection in BPF,
> as previously discussed [1]. Because of the rigorous checks of the BPF
> verifier, this provides significant security guarantees. In particular, the
> BPF flow dissector cannot get inside of an infinite loop, as with
> CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
> read outside of packet bounds, because all memory accesses are checked.
> Also, with BPF the administrator can decide which protocols to support,
> reducing potential attack surface. Rarely encountered protocols can be
> excluded from dissection and the program can be updated without kernel
> recompile or reboot if a bug is discovered.
> 
> Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
> This includes a new BPF program and attach type.
> 
> Patch 2 adds the new BPF flow dissector definitions to tools/uapi.
> 
> Patch 3 adds support for the new BPF program type to libbpf and bpftool.
> 
> Patch 4 adds a flow dissector program in BPF. This parses most protocols in
> __skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
> and address types).
> 
> Patch 5 adds a selftest that attaches the BPF program to the flow dissector
> and sends traffic with different levels of encapsulation.
> 
> Performance Evaluation:
> The in-kernel implementation was compared against the demo program from
> patch 4 using the test in patch 5 with IPv4/UDP traffic over 10 seconds.
>   $perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \
>   -t 10

Looks great. Applied to bpf-next with one extra patch:
 SEC("dissect")
-int dissect(struct __sk_buff *skb)
+int _dissect(struct __sk_buff *skb)

otherwise the test doesn't build.
I'm not sure how it builds for you. Which llvm did you use?

Also above command works and ipv4 test in ./test_flow_dissector.sh
is passing as well, but it still fails at the end for me:
./test_flow_dissector.sh
bpffs not mounted. Mounting...
0: IP
1: IPV6
2: IPV6OP
3: IPV6FR
4: MPLS
5: VLAN
Testing IPv4...
inner.dest4: 127.0.0.1
inner.source4: 127.0.0.3
pkts: tx=10 rx=10
inner.dest4: 127.0.0.1
inner.source4: 127.0.0.3
pkts: tx=10 rx=0
inner.dest4: 127.0.0.1
inner.source4: 127.0.0.3
pkts: tx=10 rx=10
Testing IPIP...
tunnels before test:
tunl0: any/ip remote any local any ttl inherit nopmtudisc
sit_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
ipip_test_LV5N: any/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
gre_test_LV5N: gre/ip remote 127.0.0.2 local 127.0.0.1 dev lo ttl inherit
gre0: gre/ip remote any local any ttl inherit nopmtudisc
inner.dest4: 192.168.0.1
inner.source4: 1.1.1.1
encap proto:   4
outer.dest4: 127.0.0.1
outer.source4: 127.0.0.2
pkts: tx=10 rx=0
tunnels after test:
tunl0: any/ip remote any local any ttl inherit nopmtudisc
sit0: ipv6/ip remote any local any ttl 64 nopmtudisc
gre0: gre/ip remote any local any ttl inherit nopmtudisc
selftests: test_flow_dissector [FAILED]

is it something in my setup or test is broken?



[bpf-next, v4 0/5] Introduce eBPF flow dissector

2018-09-14 Thread Petar Penkov
From: Petar Penkov 

This patch series hardens the RX stack by allowing flow dissection in BPF,
as previously discussed [1]. Because of the rigorous checks of the BPF
verifier, this provides significant security guarantees. In particular, the
BPF flow dissector cannot get inside of an infinite loop, as with
CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
read outside of packet bounds, because all memory accesses are checked.
Also, with BPF the administrator can decide which protocols to support,
reducing potential attack surface. Rarely encountered protocols can be
excluded from dissection and the program can be updated without kernel
recompile or reboot if a bug is discovered.
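
To make the bounds-checking point concrete, here is a minimal userspace C
sketch of the access pattern the verifier insists on: every read is preceded
by an explicit length check, and a packet that is too short is dropped rather
than read past its end. This is not the program from patch 4. The names
sketch_flow_keys, sketch_dissect and the SKETCH_* return codes are
hypothetical, and the struct is not the struct bpf_flow_keys layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified flow keys for this sketch only. */
struct sketch_flow_keys {
	uint16_t n_proto;   /* EtherType, host byte order */
	uint8_t  ip_proto;  /* IPv4 protocol field */
	uint32_t ipv4_src;  /* as found in the packet (network byte order) */
	uint32_t ipv4_dst;  /* as found in the packet (network byte order) */
	uint16_t sport;     /* host byte order */
	uint16_t dport;     /* host byte order */
};

enum { SKETCH_OK = 0, SKETCH_DROP = -1 };

/* Read a big-endian 16-bit value. The caller has already checked bounds. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Dissect Ethernet + IPv4 + TCP/UDP ports from a buffer of len bytes. */
static int sketch_dissect(const uint8_t *data, size_t len,
			  struct sketch_flow_keys *keys)
{
	size_t off = 0, ihl;

	memset(keys, 0, sizeof(*keys));

	/* Ethernet header: 14 bytes, EtherType at offset 12. */
	if (len - off < 14)
		return SKETCH_DROP;
	keys->n_proto = rd16(data + 12);
	if (keys->n_proto != 0x0800)	/* only IPv4 in this sketch */
		return SKETCH_DROP;
	off += 14;

	/* IPv4 header: at least 20 bytes, real length taken from IHL. */
	if (len - off < 20)
		return SKETCH_DROP;
	if ((data[off] >> 4) != 4)
		return SKETCH_DROP;
	ihl = (size_t)(data[off] & 0x0f) * 4;
	if (ihl < 20 || len - off < ihl)
		return SKETCH_DROP;
	keys->ip_proto = data[off + 9];
	memcpy(&keys->ipv4_src, data + off + 12, 4);
	memcpy(&keys->ipv4_dst, data + off + 16, 4);
	off += ihl;

	/* TCP (6) and UDP (17) both start with source and destination port. */
	if (keys->ip_proto == 6 || keys->ip_proto == 17) {
		if (len - off < 4)
			return SKETCH_DROP;
		keys->sport = rd16(data + off);
		keys->dport = rd16(data + off + 2);
	}
	return SKETCH_OK;
}
```

In a real BPF program the verifier rejects the program at load time if any of
these checks is missing; in this userspace sketch the checks are only a
convention.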

Patch 1 adds infrastructure to execute a BPF program in __skb_flow_dissect.
This includes a new BPF program and attach type.

Patch 2 adds the new BPF flow dissector definitions to tools/uapi.

Patch 3 adds support for the new BPF program type to libbpf and bpftool.

Patch 4 adds a flow dissector program in BPF. This parses most protocols in
__skb_flow_dissect in BPF for a subset of flow keys (basic, control, ports,
and address types).

Patch 5 adds a selftest that attaches the BPF program to the flow dissector
and sends traffic with different levels of encapsulation.

Performance Evaluation:
The in-kernel implementation was compared against the demo program from
patch 4 using the test in patch 5 with IPv4/UDP traffic over 10 seconds.
  $ perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \
      -t 10

In-kernel Dissector:
__skb_flow_dissect overhead: 2.12%
Total Packets: 3,272,597 (from output of ./test_flow_dissector)

BPF Dissector:
__skb_flow_dissect overhead: 1.63% 
Total Packets: 3,232,356 (from output of ./test_flow_dissector)

No-op BPF Dissector:
__skb_flow_dissect overhead: 1.52% 
Total Packets: 3,330,635 (from output of ./test_flow_dissector)
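
For readers who want the packet counts above as relative differences, a small
C helper is enough. The function name rel_diff_pct and the constant names are
hypothetical; the counts are copied from the figures above. Each count comes
from a single 10 second run, so the differences may well be within
run-to-run noise.

```c
/* Packet counts reported above (single 10 second runs). */
static const long PKTS_IN_KERNEL = 3272597;
static const long PKTS_BPF       = 3232356;
static const long PKTS_NOOP_BPF  = 3330635;

/* Relative difference of other against base, in percent. */
static double rel_diff_pct(long base, long other)
{
	return 100.0 * (double)(other - base) / (double)base;
}
```

With these inputs, rel_diff_pct(PKTS_IN_KERNEL, PKTS_BPF) is roughly -1.2 and
rel_diff_pct(PKTS_IN_KERNEL, PKTS_NOOP_BPF) is roughly 1.8.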

Changes since v3:
1/ struct bpf_flow_keys reorganized to remove holes in patch 1 and patch 2.

Changes since v2:
1/ Changes to tools/include/uapi pulled into a separate patch 2
2/ Changes to tools/lib and tools/bpftool pulled into a separate patch 3
3/ Changed flow_keys in __sk_buff from __u32 to struct bpf_flow_keys *
4/ Added nhoff field in struct bpf_flow_keys to pass initial offset
5/ Saving all of the modified control block, rather than just the qdisc
6/ Sample BPF program in patch 4 modified to use the changes above

Changes since v1:
1/ LD_ABS instructions now disallowed for the new BPF prog type 
2/ now checks if skb is NULL in __skb_flow_dissect()
3/ fixed incorrect accesses in flow_dissector_is_valid_access()
   - writes to the flow_keys field now disallowed
   - reads/writes to tc_classid and data_meta now disallowed
4/ headers pulled with bpf_skb_load_data if direct access fails 

Changes since RFC:
1/ Flow dissector hook changed from global to per-netns
2/ Defined struct bpf_flow_keys to be used in BPF flow dissector
   programs instead of exposing the internal flow keys layout. Added a
   function to translate from bpf_flow_keys to the internal layout after BPF
   dissection is complete. The pointer to this struct is stored in
   qdisc_skb_cb rather than inside of the 20 byte control block which
   simplifies verification and allows access to all 20 bytes of the cb.
3/ Removed GUE parsing as it relied on a hardcoded port
4/ MPLS parsing now stops at the first label which is consistent
   with the in-kernel flow dissector
5/ Refactored to use direct packet access and to write out to
   struct bpf_flow_keys
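
As an illustration of item 4 above, here is a minimal userspace C sketch that
decodes only the first MPLS label stack entry and then stops, without walking
the rest of the stack. The bit layout (20 bit label, 3 bit traffic class,
1 bit bottom-of-stack, 8 bit TTL) follows RFC 3032. The names mpls_entry and
parse_first_mpls_label are hypothetical and are not taken from the patches.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded MPLS label stack entry (RFC 3032 layout). */
struct mpls_entry {
	uint32_t label; /* 20 bits */
	uint8_t  tc;    /* 3 bits, traffic class */
	uint8_t  bos;   /* 1 bit, bottom of stack */
	uint8_t  ttl;   /* 8 bits */
};

/*
 * Decode the label stack entry at offset nhoff and stop there.
 * Returns 0 on success, -1 if the 4 byte entry does not fit in the buffer.
 */
static int parse_first_mpls_label(const uint8_t *data, size_t len,
				  size_t nhoff, struct mpls_entry *out)
{
	uint32_t w;

	if (nhoff > len || len - nhoff < 4)
		return -1;

	/* Assemble the 32-bit entry from network byte order. */
	w = ((uint32_t)data[nhoff] << 24) |
	    ((uint32_t)data[nhoff + 1] << 16) |
	    ((uint32_t)data[nhoff + 2] << 8) |
	    (uint32_t)data[nhoff + 3];

	out->label = w >> 12;
	out->tc    = (uint8_t)((w >> 9) & 0x7);
	out->bos   = (uint8_t)((w >> 8) & 0x1);
	out->ttl   = (uint8_t)(w & 0xff);
	return 0;
}
```

Because the function never loops over further entries, its running time does
not depend on how deep the label stack in the packet is.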

[1] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf

Petar Penkov (5):
  flow_dissector: implements flow dissector BPF hook
  bpf: sync bpf.h uapi with tools/
  bpf: support flow dissector in libbpf and bpftool
  flow_dissector: implements eBPF parser
  selftests/bpf: test bpf flow dissection

 include/linux/bpf.h                      |   1 +
 include/linux/bpf_types.h                |   1 +
 include/linux/skbuff.h                   |   7 +
 include/net/net_namespace.h              |   3 +
 include/net/sch_generic.h                |  12 +-
 include/uapi/linux/bpf.h                 |  26 +
 kernel/bpf/syscall.c                     |   8 +
 kernel/bpf/verifier.c                    |  32 +
 net/core/filter.c                        |  70 ++
 net/core/flow_dissector.c                | 134 +++
 tools/bpf/bpftool/prog.c                 |   1 +
 tools/include/uapi/linux/bpf.h           |  26 +
 tools/lib/bpf/libbpf.c                   |   2 +
 tools/testing/selftests/bpf/.gitignore   |   2 +
 tools/testing/selftests/bpf/Makefile     |   8 +-
 tools/testing/selftests/bpf/bpf_flow.c   | 373 +
 tools/testing/selftests/bpf/config       |   1 +
 .../selftests/bpf/flow_dissector_load.c  | 140
 .../selftests/bpf/test_flow_dissector.c  | 782 ++