Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On 03/15/2018 05:37 PM, Daniel Borkmann wrote:
> On 03/16/2018 12:06 AM, Alexei Starovoitov wrote:
>> On Thu, Mar 15, 2018 at 11:55:39PM +0100, Daniel Borkmann wrote:
>>> On 03/15/2018 11:20 PM, Alexei Starovoitov wrote:
>>>> On Thu, Mar 15, 2018 at 11:17:12PM +0100, Daniel Borkmann wrote:
>>>>> On 03/15/2018 10:59 PM, Alexei Starovoitov wrote:
>>>>>> On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
>>>>>>>
>>>>>>> +/* User return codes for SK_MSG prog type. */
>>>>>>> +enum sk_msg_action {
>>>>>>> +	SK_MSG_DROP = 0,
>>>>>>> +	SK_MSG_PASS,
>>>>>>> +};
>>>>>>
>>>>>> do we really need new enum here?
>>>>>> It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
>>>>>> and there will be only drop/pass in both enums.
>>>>>> Also I don't see where these two new SK_MSG_* are used...
>>>>>>
>>>>>>> +
>>>>>>> +/* user accessible metadata for SK_MSG packet hook, new fields must
>>>>>>> + * be added to the end of this structure
>>>>>>> + */
>>>>>>> +struct sk_msg_md {
>>>>>>> +	__u32 data;
>>>>>>> +	__u32 data_end;
>>>>>>> +};
>>>>>>
>>>>>> I think it's time for me to ask for forgiveness :)
>>>>>
>>>>> :-)
>>>>>
>>>>>> I used __u32 for data and data_end only because all other fields
>>>>>> in __sk_buff were __u32 at the time and I couldn't easily figure out
>>>>>> how to teach verifier to recognize 8-byte rewrites.
>>>>>> Unfortunately my mistake stuck and was copied over into xdp.
>>>>>> Since this is new struct let's do it right and add
>>>>>> 'void *data, *data_end' here,
>>>>>> since bpf prog will use them as 'void *' pointers.
>>>>>> There are no compat issues here, since bpf is always 64-bit.
>>>>>
>>>>> But at least offset-wise when you do the ctx rewrite this would then
>>>>> be a bit more tricky when you have 64 bit kernel with 32 bit user
>>>>> space since void * members are in each cases at different offset. So
>>>>> unless I'm missing something, this still should either be __u32 or
>>>>> __u64 instead of void *, no?
>>>>
>>>> there is no 32-bit user space. these structs are seen by bpf progs only
>>>> and bpf is 64-bit only too.
>>>> unless I'm missing your point.
>>>
>>> Ok, so lets say you have 32 bit LLVM binary and compile the prog where
>>> you access md->data_end. Given the void * in the struct will that access
>>> end up being BPF_W at ctx offset 4 or BPF_DW at ctx offset 8 from clang
>>> perspective (iow, is the back end treating this special and always use
>>> fixed BPF_DW in such case)? If not and it would be the first case with
>>> offset 4, then we could have the case that underlying 64 bit kernel is
>>> expecting ctx offset 8 for doing the md ctx conversion.
>>
>> i'm still not quite following.
>> Whether llvm itself is 32-bit binary or it's arm32 or sparc32 binary
>> doesn't matter. It will produce the same 64-bit bpf code.
>> It will see 'void *' deref from this struct and will emit DW.
>> May be confusion is from newly added -mattr=+alu32 flag?
>> That option doesn't change that sizeof(void*)==8.
>> It only allows backend to emit 32-bit alu insns.
>
> Ok, so conclusion we had is that while BPF target is unconditionally 64 bit,
> it depends which clang front end you use for compilation wrt structs. E.g.
> on 32 bit native (e.g. arm) clang front end it would compile the ctx void *
> pointers as 4 byte while using clang -target bpf it would compile it as 8
> byte. The native clang front end is needed in case of tracing when accessing
> pt_regs for walking data structures, but not for networking use case, so
> always using -target bpf there is proper way. Meaning there would be no
> confusion on the void * since size will always be 8 regardless of underlying
> arch being 32 or 64 bit or clang/llvm binary being 32 bit on 64 bit kernel.
> Thus, sticking to void * would be fine, but definitely samples/sockmap/Makefile
> must be fixed as well, such that people don't copy it wrongly.
>
> Cheers,
> Daniel

I'll send a fix for sockmap/Makefile then as a separate series. And
go ahead and change this series to use 'void *'.

Thanks for the follow-up on this.
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On 03/16/2018 12:06 AM, Alexei Starovoitov wrote:
> On Thu, Mar 15, 2018 at 11:55:39PM +0100, Daniel Borkmann wrote:
>> On 03/15/2018 11:20 PM, Alexei Starovoitov wrote:
>>> On Thu, Mar 15, 2018 at 11:17:12PM +0100, Daniel Borkmann wrote:
>>>> On 03/15/2018 10:59 PM, Alexei Starovoitov wrote:
>>>>> On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
>>>>>>
>>>>>> +/* User return codes for SK_MSG prog type. */
>>>>>> +enum sk_msg_action {
>>>>>> +	SK_MSG_DROP = 0,
>>>>>> +	SK_MSG_PASS,
>>>>>> +};
>>>>>
>>>>> do we really need new enum here?
>>>>> It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
>>>>> and there will be only drop/pass in both enums.
>>>>> Also I don't see where these two new SK_MSG_* are used...
>>>>>
>>>>>> +
>>>>>> +/* user accessible metadata for SK_MSG packet hook, new fields must
>>>>>> + * be added to the end of this structure
>>>>>> + */
>>>>>> +struct sk_msg_md {
>>>>>> +	__u32 data;
>>>>>> +	__u32 data_end;
>>>>>> +};
>>>>>
>>>>> I think it's time for me to ask for forgiveness :)
>>>>
>>>> :-)
>>>>
>>>>> I used __u32 for data and data_end only because all other fields
>>>>> in __sk_buff were __u32 at the time and I couldn't easily figure out
>>>>> how to teach verifier to recognize 8-byte rewrites.
>>>>> Unfortunately my mistake stuck and was copied over into xdp.
>>>>> Since this is new struct let's do it right and add
>>>>> 'void *data, *data_end' here,
>>>>> since bpf prog will use them as 'void *' pointers.
>>>>> There are no compat issues here, since bpf is always 64-bit.
>>>>
>>>> But at least offset-wise when you do the ctx rewrite this would then
>>>> be a bit more tricky when you have 64 bit kernel with 32 bit user
>>>> space since void * members are in each cases at different offset. So
>>>> unless I'm missing something, this still should either be __u32 or
>>>> __u64 instead of void *, no?
>>>
>>> there is no 32-bit user space. these structs are seen by bpf progs only
>>> and bpf is 64-bit only too.
>>> unless I'm missing your point.
>>
>> Ok, so lets say you have 32 bit LLVM binary and compile the prog where
>> you access md->data_end. Given the void * in the struct will that access
>> end up being BPF_W at ctx offset 4 or BPF_DW at ctx offset 8 from clang
>> perspective (iow, is the back end treating this special and always use
>> fixed BPF_DW in such case)? If not and it would be the first case with
>> offset 4, then we could have the case that underlying 64 bit kernel is
>> expecting ctx offset 8 for doing the md ctx conversion.
>
> i'm still not quite following.
> Whether llvm itself is 32-bit binary or it's arm32 or sparc32 binary
> doesn't matter. It will produce the same 64-bit bpf code.
> It will see 'void *' deref from this struct and will emit DW.
> May be confusion is from newly added -mattr=+alu32 flag?
> That option doesn't change that sizeof(void*)==8.
> It only allows backend to emit 32-bit alu insns.

Ok, so conclusion we had is that while BPF target is unconditionally 64 bit,
it depends which clang front end you use for compilation wrt structs. E.g.
on 32 bit native (e.g. arm) clang front end it would compile the ctx void *
pointers as 4 byte while using clang -target bpf it would compile it as 8
byte. The native clang front end is needed in case of tracing when accessing
pt_regs for walking data structures, but not for networking use case, so
always using -target bpf there is proper way. Meaning there would be no
confusion on the void * since size will always be 8 regardless of underlying
arch being 32 or 64 bit or clang/llvm binary being 32 bit on 64 bit kernel.
Thus, sticking to void * would be fine, but definitely samples/sockmap/Makefile
must be fixed as well, such that people don't copy it wrongly.

Cheers,
Daniel
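To make the offset question concrete, here is a minimal userspace sketch. It is not part of the patch. The struct names md_u32, md_ptr_bpf and md_ptr_native32 and the two helper functions are made up for illustration, and uint64_t/uint32_t stand in for 8-byte and 4-byte pointers so that the result does not depend on the machine compiling it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the v2 patch: two 32-bit fields. */
struct md_u32 {
	uint32_t data;
	uint32_t data_end;
};

/*
 * Hypothetical model of 'void *data, *data_end' as compiled with
 * clang -target bpf, where a pointer is 8 bytes. uint64_t stands in
 * for the pointer so the layout is the same on every host.
 */
struct md_ptr_bpf {
	uint64_t data;
	uint64_t data_end;
};

/*
 * Hypothetical model of the same declaration as seen by a native
 * 32-bit front end (for example arm), where a pointer is 4 bytes.
 */
struct md_ptr_native32 {
	uint32_t data;
	uint32_t data_end;
};

/* ctx offset a program would use for md->data_end under each model. */
static inline size_t data_end_off_bpf(void)
{
	return offsetof(struct md_ptr_bpf, data_end);
}

static inline size_t data_end_off_native32(void)
{
	return offsetof(struct md_ptr_native32, data_end);
}

/* With the bpf target the access is 8 bytes wide at offset 8 ... */
static_assert(offsetof(struct md_ptr_bpf, data_end) == 8,
	      "bpf target model: data_end at offset 8");
/* ... while the 32-bit native model puts it at offset 4, the same
 * place as the old __u32 layout. */
static_assert(offsetof(struct md_ptr_native32, data_end) == 4,
	      "native 32-bit model: data_end at offset 4");
static_assert(offsetof(struct md_u32, data_end) == 4,
	      "__u32 layout: data_end at offset 4");
```

The two models disagree on where data_end lives, which is the mismatch discussed above: a program built by a 32-bit native front end would ask for offset 4 while the kernel's ctx conversion expects offset 8. Building networking programs with -target bpf keeps both sides on the 8-byte layout.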
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On Thu, Mar 15, 2018 at 11:55:39PM +0100, Daniel Borkmann wrote:
> On 03/15/2018 11:20 PM, Alexei Starovoitov wrote:
> > On Thu, Mar 15, 2018 at 11:17:12PM +0100, Daniel Borkmann wrote:
> >> On 03/15/2018 10:59 PM, Alexei Starovoitov wrote:
> >>> On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
> >>>>
> >>>> +/* User return codes for SK_MSG prog type. */
> >>>> +enum sk_msg_action {
> >>>> +	SK_MSG_DROP = 0,
> >>>> +	SK_MSG_PASS,
> >>>> +};
> >>>
> >>> do we really need new enum here?
> >>> It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
> >>> and there will be only drop/pass in both enums.
> >>> Also I don't see where these two new SK_MSG_* are used...
> >>>
> >>>> +
> >>>> +/* user accessible metadata for SK_MSG packet hook, new fields must
> >>>> + * be added to the end of this structure
> >>>> + */
> >>>> +struct sk_msg_md {
> >>>> +	__u32 data;
> >>>> +	__u32 data_end;
> >>>> +};
> >>>
> >>> I think it's time for me to ask for forgiveness :)
> >>
> >> :-)
> >>
> >>> I used __u32 for data and data_end only because all other fields
> >>> in __sk_buff were __u32 at the time and I couldn't easily figure out
> >>> how to teach verifier to recognize 8-byte rewrites.
> >>> Unfortunately my mistake stuck and was copied over into xdp.
> >>> Since this is new struct let's do it right and add
> >>> 'void *data, *data_end' here,
> >>> since bpf prog will use them as 'void *' pointers.
> >>> There are no compat issues here, since bpf is always 64-bit.
> >>
> >> But at least offset-wise when you do the ctx rewrite this would then
> >> be a bit more tricky when you have 64 bit kernel with 32 bit user
> >> space since void * members are in each cases at different offset. So
> >> unless I'm missing something, this still should either be __u32 or
> >> __u64 instead of void *, no?
> >
> > there is no 32-bit user space. these structs are seen by bpf progs only
> > and bpf is 64-bit only too.
> > unless I'm missing your point.
>
> Ok, so lets say you have 32 bit LLVM binary and compile the prog where
> you access md->data_end. Given the void * in the struct will that access
> end up being BPF_W at ctx offset 4 or BPF_DW at ctx offset 8 from clang
> perspective (iow, is the back end treating this special and always use
> fixed BPF_DW in such case)? If not and it would be the first case with
> offset 4, then we could have the case that underlying 64 bit kernel is
> expecting ctx offset 8 for doing the md ctx conversion.

i'm still not quite following.
Whether llvm itself is 32-bit binary or it's arm32 or sparc32 binary
doesn't matter. It will produce the same 64-bit bpf code.
It will see 'void *' deref from this struct and will emit DW.
May be confusion is from newly added -mattr=+alu32 flag?
That option doesn't change that sizeof(void*)==8.
It only allows backend to emit 32-bit alu insns.
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On 03/15/2018 11:20 PM, Alexei Starovoitov wrote:
> On Thu, Mar 15, 2018 at 11:17:12PM +0100, Daniel Borkmann wrote:
>> On 03/15/2018 10:59 PM, Alexei Starovoitov wrote:
>>> On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
>>>>
>>>> +/* User return codes for SK_MSG prog type. */
>>>> +enum sk_msg_action {
>>>> +	SK_MSG_DROP = 0,
>>>> +	SK_MSG_PASS,
>>>> +};
>>>
>>> do we really need new enum here?
>>> It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
>>> and there will be only drop/pass in both enums.
>>> Also I don't see where these two new SK_MSG_* are used...
>>>
>>>> +
>>>> +/* user accessible metadata for SK_MSG packet hook, new fields must
>>>> + * be added to the end of this structure
>>>> + */
>>>> +struct sk_msg_md {
>>>> +	__u32 data;
>>>> +	__u32 data_end;
>>>> +};
>>>
>>> I think it's time for me to ask for forgiveness :)
>>
>> :-)
>>
>>> I used __u32 for data and data_end only because all other fields
>>> in __sk_buff were __u32 at the time and I couldn't easily figure out
>>> how to teach verifier to recognize 8-byte rewrites.
>>> Unfortunately my mistake stuck and was copied over into xdp.
>>> Since this is new struct let's do it right and add
>>> 'void *data, *data_end' here,
>>> since bpf prog will use them as 'void *' pointers.
>>> There are no compat issues here, since bpf is always 64-bit.
>>
>> But at least offset-wise when you do the ctx rewrite this would then
>> be a bit more tricky when you have 64 bit kernel with 32 bit user
>> space since void * members are in each cases at different offset. So
>> unless I'm missing something, this still should either be __u32 or
>> __u64 instead of void *, no?
>
> there is no 32-bit user space. these structs are seen by bpf progs only
> and bpf is 64-bit only too.
> unless I'm missing your point.

Ok, so lets say you have 32 bit LLVM binary and compile the prog where
you access md->data_end. Given the void * in the struct will that access
end up being BPF_W at ctx offset 4 or BPF_DW at ctx offset 8 from clang
perspective (iow, is the back end treating this special and always use
fixed BPF_DW in such case)? If not and it would be the first case with
offset 4, then we could have the case that underlying 64 bit kernel is
expecting ctx offset 8 for doing the md ctx conversion.
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On Thu, Mar 15, 2018 at 11:17:12PM +0100, Daniel Borkmann wrote:
> On 03/15/2018 10:59 PM, Alexei Starovoitov wrote:
> > On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
> >>
> >> +/* User return codes for SK_MSG prog type. */
> >> +enum sk_msg_action {
> >> +	SK_MSG_DROP = 0,
> >> +	SK_MSG_PASS,
> >> +};
> >
> > do we really need new enum here?
> > It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
> > and there will be only drop/pass in both enums.
> > Also I don't see where these two new SK_MSG_* are used...
> >
> >> +
> >> +/* user accessible metadata for SK_MSG packet hook, new fields must
> >> + * be added to the end of this structure
> >> + */
> >> +struct sk_msg_md {
> >> +	__u32 data;
> >> +	__u32 data_end;
> >> +};
> >
> > I think it's time for me to ask for forgiveness :)
>
> :-)
>
> > I used __u32 for data and data_end only because all other fields
> > in __sk_buff were __u32 at the time and I couldn't easily figure out
> > how to teach verifier to recognize 8-byte rewrites.
> > Unfortunately my mistake stuck and was copied over into xdp.
> > Since this is new struct let's do it right and add
> > 'void *data, *data_end' here,
> > since bpf prog will use them as 'void *' pointers.
> > There are no compat issues here, since bpf is always 64-bit.
>
> But at least offset-wise when you do the ctx rewrite this would then
> be a bit more tricky when you have 64 bit kernel with 32 bit user
> space since void * members are in each cases at different offset. So
> unless I'm missing something, this still should either be __u32 or
> __u64 instead of void *, no?

there is no 32-bit user space. these structs are seen by bpf progs only
and bpf is 64-bit only too.
unless I'm missing your point.
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On 03/15/2018 10:59 PM, Alexei Starovoitov wrote:
> On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
>>
>> +/* User return codes for SK_MSG prog type. */
>> +enum sk_msg_action {
>> +	SK_MSG_DROP = 0,
>> +	SK_MSG_PASS,
>> +};
>
> do we really need new enum here?
> It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
> and there will be only drop/pass in both enums.
> Also I don't see where these two new SK_MSG_* are used...
>
>> +
>> +/* user accessible metadata for SK_MSG packet hook, new fields must
>> + * be added to the end of this structure
>> + */
>> +struct sk_msg_md {
>> +	__u32 data;
>> +	__u32 data_end;
>> +};
>
> I think it's time for me to ask for forgiveness :)

:-)

> I used __u32 for data and data_end only because all other fields
> in __sk_buff were __u32 at the time and I couldn't easily figure out
> how to teach verifier to recognize 8-byte rewrites.
> Unfortunately my mistake stuck and was copied over into xdp.
> Since this is new struct let's do it right and add
> 'void *data, *data_end' here,
> since bpf prog will use them as 'void *' pointers.
> There are no compat issues here, since bpf is always 64-bit.

But at least offset-wise when you do the ctx rewrite this would then
be a bit more tricky when you have 64 bit kernel with 32 bit user
space since void * members are in each cases at different offset. So
unless I'm missing something, this still should either be __u32 or
__u64 instead of void *, no?

>> +static int bpf_map_msg_verdict(int _rc, struct sk_msg_buff *md)
>> +{
>> +	return ((_rc == SK_PASS) ?
>> +	       (md->map ? __SK_REDIRECT : __SK_PASS) :
>> +	       __SK_DROP);
>
> you're using old SK_PASS here too ;)
> that's to my point of not adding SK_MSG_PASS...
>
> Overall the patch set looks absolutely great.
> Thank you for working on it.

+1
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On 03/15/2018 02:59 PM, Alexei Starovoitov wrote:
> On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
>>
>> +/* User return codes for SK_MSG prog type. */
>> +enum sk_msg_action {
>> +	SK_MSG_DROP = 0,
>> +	SK_MSG_PASS,
>> +};
>
> do we really need new enum here?

Nope and as you noticed the actual code uses the SK_{DROP|PASS} enum.
Will remove this.

> It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
> and there will be only drop/pass in both enums.
> Also I don't see where these two new SK_MSG_* are used...
>
>> +
>> +/* user accessible metadata for SK_MSG packet hook, new fields must
>> + * be added to the end of this structure
>> + */
>> +struct sk_msg_md {
>> +	__u32 data;
>> +	__u32 data_end;
>> +};
>
> I think it's time for me to ask for forgiveness :)
> I used __u32 for data and data_end only because all other fields
> in __sk_buff were __u32 at the time and I couldn't easily figure out
> how to teach verifier to recognize 8-byte rewrites.
> Unfortunately my mistake stuck and was copied over into xdp.
> Since this is new struct let's do it right and add
> 'void *data, *data_end' here,
> since bpf prog will use them as 'void *' pointers.
> There are no compat issues here, since bpf is always 64-bit.
>

aha nice catch. Yep lets use 'void*' here. I had forgot about that
discussion and copied them here as well.

>> +static int bpf_map_msg_verdict(int _rc, struct sk_msg_buff *md)
>> +{
>> +	return ((_rc == SK_PASS) ?
>> +	       (md->map ? __SK_REDIRECT : __SK_PASS) :
>> +	       __SK_DROP);
>
> you're using old SK_PASS here too ;)
> that's to my point of not adding SK_MSG_PASS...
>

+1

> Overall the patch set looks absolutely great.
> Thank you for working on it.
>

I'll fixup a few of these small things now and should have a v3 shortly.
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
On Mon, Mar 12, 2018 at 12:23:29PM -0700, John Fastabend wrote:
>
> +/* User return codes for SK_MSG prog type. */
> +enum sk_msg_action {
> +	SK_MSG_DROP = 0,
> +	SK_MSG_PASS,
> +};

do we really need new enum here?
It's the same as 'enum sk_action' and SK_DROP == SK_MSG_DROP
and there will be only drop/pass in both enums.
Also I don't see where these two new SK_MSG_* are used...

> +
> +/* user accessible metadata for SK_MSG packet hook, new fields must
> + * be added to the end of this structure
> + */
> +struct sk_msg_md {
> +	__u32 data;
> +	__u32 data_end;
> +};

I think it's time for me to ask for forgiveness :)
I used __u32 for data and data_end only because all other fields
in __sk_buff were __u32 at the time and I couldn't easily figure out
how to teach verifier to recognize 8-byte rewrites.
Unfortunately my mistake stuck and was copied over into xdp.
Since this is new struct let's do it right and add
'void *data, *data_end' here,
since bpf prog will use them as 'void *' pointers.
There are no compat issues here, since bpf is always 64-bit.

> +static int bpf_map_msg_verdict(int _rc, struct sk_msg_buff *md)
> +{
> +	return ((_rc == SK_PASS) ?
> +	       (md->map ? __SK_REDIRECT : __SK_PASS) :
> +	       __SK_DROP);

you're using old SK_PASS here too ;)
that's to my point of not adding SK_MSG_PASS...

Overall the patch set looks absolutely great.
Thank you for working on it.
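For readers following the verdict discussion, the mapping done by the quoted bpf_map_msg_verdict() hunk can be sketched in plain userspace C. This is an illustration only: struct msg_buff_model is a made-up, reduced stand-in for struct sk_msg_buff, the VERDICT_* names replace the kernel-internal __SK_* values (whose numeric values are assumed here), and only 'enum sk_action' with SK_DROP == 0 is taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes seen by the BPF program; the existing 'enum sk_action'
 * is reused instead of adding a second enum with the same values. */
enum sk_action {
	SK_DROP = 0,
	SK_PASS,
};

/* Stand-ins for the kernel-internal verdicts named in the patch.
 * Names and numeric values are assumptions for illustration. */
enum verdict_model {
	VERDICT_DROP = 0,
	VERDICT_PASS,
	VERDICT_REDIRECT,
};

/* Hypothetical reduced model of struct sk_msg_buff: only the redirect
 * target is kept. */
struct msg_buff_model {
	const void *map;	/* non-NULL once a redirect map was chosen */
};

/* Same decision as the quoted ternary: anything other than SK_PASS is
 * a drop; SK_PASS becomes a redirect when a map was selected. */
static int map_msg_verdict(int rc, const struct msg_buff_model *md)
{
	if (rc != SK_PASS)
		return VERDICT_DROP;
	return md->map ? VERDICT_REDIRECT : VERDICT_PASS;
}
```

Note that the function only ever compares against SK_PASS, so a separate SK_MSG_PASS with the same value would add nothing, which is the point made in the review.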
Re: [bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
From: John Fastabend
Date: Mon, 12 Mar 2018 12:23:29 -0700

> This implements a BPF ULP layer to allow policy enforcement and
> monitoring at the socket layer. In order to support this a new
> program type BPF_PROG_TYPE_SK_MSG is used to run the policy at
> the sendmsg/sendpage hook. To attach the policy to sockets a
> sockmap is used with a new program attach type BPF_SK_MSG_VERDICT.
 ...
> Signed-off-by: John Fastabend

Acked-by: David S. Miller
[bpf-next PATCH v2 05/18] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
This implements a BPF ULP layer to allow policy enforcement and
monitoring at the socket layer. In order to support this a new
program type BPF_PROG_TYPE_SK_MSG is used to run the policy at
the sendmsg/sendpage hook. To attach the policy to sockets a
sockmap is used with a new program attach type BPF_SK_MSG_VERDICT.

Similar to previous sockmap usages when a sock is added to a
sockmap, via a map update, if the map contains a BPF_SK_MSG_VERDICT
program type attached then the BPF ULP layer is created on the
socket and the attached BPF_PROG_TYPE_SK_MSG program is run for
every msg in sendmsg case and page/offset in sendpage case.

BPF_PROG_TYPE_SK_MSG Semantics/API:

BPF_PROG_TYPE_SK_MSG supports only two return codes SK_PASS and
SK_DROP. Returning SK_DROP frees the copied data in the sendmsg
case and in the sendpage case leaves the data untouched. Both cases
return -EACCES to the user. Returning SK_PASS will allow the msg to
be sent.

In the sendmsg case data is copied into kernel space buffers before
running the BPF program. The kernel space buffers are stored in a
scatterlist object where each element is a kernel memory buffer.
Some effort is made to coalesce data from the sendmsg call here.
For example a sendmsg call with many one byte iov entries will
likely be pushed into a single entry. The BPF program is run with
data pointers (start/end) pointing to the first sg element.

In the sendpage case data is not copied. We opt not to copy the
data by default here, because the BPF infrastructure does not
know what bytes will be needed nor when they will be needed. So
copying all bytes may be wasteful. Because of this the initial
start/end data pointers are (0,0). Meaning no data can be read or
written. This avoids reading data that may be modified by the
user. A new helper is added later in this series if reading and
writing the data is needed. The helper call will do a copy by
default so that the page is exclusively owned by the BPF call.
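As an illustration of what the start/end pointers mean for a program, here is a small userspace model of the bounds check a program has to make before touching message bytes. It is a sketch and not code from this series: struct msg_window, window_can_read() and window_first_byte() are hypothetical names, and the pointers are modelled as uintptr_t so that the (0,0) sendpage case can be represented directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of the start/end pointers handed to
 * the program; this is not the kernel's struct sk_msg_md. */
struct msg_window {
	uintptr_t data;		/* first readable byte */
	uintptr_t data_end;	/* one past the last readable byte */
};

/* Return 1 if 'len' bytes starting at offset 'off' lie inside the
 * window, 0 otherwise. */
static int window_can_read(const struct msg_window *w, size_t off, size_t len)
{
	size_t avail = (size_t)(w->data_end - w->data);

	return off <= avail && len <= avail - off;
}

/* Read the first byte, or return -1 when the window is empty, as it
 * is in the sendpage case where start/end are (0,0). */
static int window_first_byte(const struct msg_window *w)
{
	if (!window_can_read(w, 0, 1))
		return -1;
	return *(const uint8_t *)w->data;
}
```

With a sendmsg-style window covering the first sg element the check succeeds for in-range accesses, while a (0,0) window rejects every access, so the program can still return a verdict but cannot inspect the payload.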
The verdict from the BPF_PROG_TYPE_SK_MSG applies to the entire msg
in the sendmsg() case and the entire page/offset in the sendpage case.
This avoids ambiguity on how to handle mixed return codes in the
sendmsg case. Again a helper is added later in the series if
a verdict needs to apply to multiple system calls and/or only
a subpart of the currently being processed message.

The helper msg_redirect_map() can be used to select the socket to
send the data on. This is used similar to existing redirect use
cases. This allows policy to redirect msgs.

Pseudo code simple example:

The basic logic to attach a program to a socket is as follows,

  // load the programs
  bpf_prog_load(SOCKMAP_TCP_MSG_PROG, BPF_PROG_TYPE_SK_MSG,
                &obj, &msg_prog);

  // lookup the sockmap
  bpf_map_msg = bpf_object__find_map_by_name(obj, "my_sock_map");

  // get fd for sockmap
  map_fd_msg = bpf_map__fd(bpf_map_msg);

  // attach program to sockmap
  bpf_prog_attach(msg_prog, map_fd_msg, BPF_SK_MSG_VERDICT, 0);

Adding sockets to the map is done in the normal way,

  // Add a socket 'fd' to sockmap at location 'i'
  bpf_map_update_elem(map_fd_msg, &i, &fd, BPF_ANY);

After the above any socket attached to "my_sock_map", in this case
'fd', will run the BPF msg verdict program (msg_prog) on every
sendmsg and sendpage system call.

For a complete example see BPF selftests or sockmap samples.

Implementation notes:

It seemed the simplest, to me at least, to use a refcnt to ensure
psock is not lost across the sendmsg copy into the sg, the bpf program
running on the data in sg_data, and the final pass to the TCP stack.
Some performance testing may show a better method to do this and avoid
the refcnt cost, but for now use the simpler method.

Another item that will come after basic support is in place is
supporting MSG_MORE flag. At the moment we call sendpages even if
the MSG_MORE flag is set. An enhancement would be to collect the
pages into a larger scatterlist and pass down the stack.
Notice that bpf_tcp_sendmsg() could support this with some additional
state saved across sendmsg calls. I built the code to support this
without having to do refactoring work. Other features TBD include
ZEROCOPY and the TCP_RECV_QUEUE/TCP_NO_QUEUE support. This will follow
initial series shortly.

Future work could improve size limits on the scatterlist rings used
here. Currently, we use MAX_SKB_FRAGS simply because this was being
used already in the TLS case. Future work could extend the kernel sk
APIs to tune this depending on workload. This is a trade-off
between memory usage and throughput performance.

Signed-off-by: John Fastabend
---
 include/linux/bpf.h       |    1
 include/linux/bpf_types.h |    1
 include/linux/filter.h    |   17 +
 include/uapi/linux/bpf.h  |   28 ++
 kernel/bpf/sockmap.c      |  714 -
 kernel/bpf/syscall.c      |   14 +
 kernel/bpf/verifier.c     |    5
 net/core/filter.c         |  106 +++
 8 files