From: John Fastabend <john.fastab...@gmail.com>
Date: Mon, 5 Mar 2018 23:06:01 -0800
> On 03/05/2018 10:42 PM, David Miller wrote:
>> From: John Fastabend <john.fastab...@gmail.com>
>> Date: Mon, 5 Mar 2018 22:22:21 -0800
>>> All I meant by this is if an application uses sendfile() call
>>> there is no good way to know when/if the kernel side will copy or
>>> xmit the data. So a reliable user space application will need to
>>> only modify the data if it "knows" there are no outstanding sends
>>> in-flight. So if we assume applications follow this then it
>>> is OK to avoid the copy. Of course this is not good enough for
>>> security, but for monitoring/statistics (my use case 1 it works).
>> For an application implementing a networking file system, it's pretty
>> legitimate for file contents to change before the page gets DMA's to
>> the networking card.
> Still there are useful BPF programs that can tolerate this. So I
> would prefer to allow BPF programs to operate in the no-copy mode
> if wanted. It doesn't have to be the default though as it currently
> is. A l7 load balancer is a good example of this.
Maybe I'd be ok if it were not the default. But do you really want to
expose a potential attack vector, even if the app gets to choose and
say "I'm ok"?
>> And that's perfectly fine, and we everything such that this will work
>> The card checksums what ends up being DMA'd so nothing from the
>> networking side is broken.
> Assuming the card has checksum support correct? Which is why we have
> the SKBTX_SHARED_FRAG checked in skb_has_shared_frag() and the checksum
> helpers called by the drivers when they do not support the protocol
> being used. So probably OK assumption if using supported protocols and
> hardware? Perhaps in general folks just use normal protocols and
> hardware so it works.
If the hardware doesn't support the checksums, we linearize the SKB
(therefore obtain a snapshot of the data), and checksum. Exactly what
would happen if the hardware did the checksum.
So OK in that case too.
We always guarantee that you will always get a correct checksum on
outgoing packets, even if you modify the page contents meanwhile.
> So the "I need at least X more bytes" is the msg_cork_bytes() in patch
> 7. I could handle the sendpage case the same as I handle the sendmsg
> case and copy the data into the buffer until N bytes are received. I
> had planned to add this mode in a follow up series but could add it in
> this series so we have all the pieces in one submission.
> Although I used a scatterlist instead of a linear buffer. I was
> planning to add a helper to pull in next sg list item if needed
> rather than try to allocate a large linear block up front.
For non-deep packet inspection cases this re-running of the parser case
will probably not trigger at all.