On Tue, Sep 20, 2016 at 06:27:17PM +0200, Jesper Dangaard Brouer wrote:
> On Tue, 20 Sep 2016 09:13:56 -0700
> Eric Dumazet <eric.duma...@gmail.com> wrote:
> 
> > On Tue, 2016-09-20 at 18:06 +0200, Jesper Dangaard Brouer wrote:
> > > On Tue, 20 Sep 2016 08:58:30 -0700
> > > Eric Dumazet <eric.duma...@gmail.com> wrote:  
> > 
> > > > Same for XDP_TX if/when packet is dropped because output ring is full.  
> > > 
> > > For the XDP_TX case a counter is already incremented[1] but it is a
> > > local ring counter (ring->tx_dropped++).
> > > 
> > > Do you think we should maintain separate counters for XDP? (to have a
> > > more consistent interface across drivers...)  
> > 
> > No, as long as the admin can learn drops are occurring.
> 
> Okay, so recording these drops is important for an admin, agreed.  Now
> that we have the chance to define the API, wouldn't is be nice if the
> admin across drive drivers knew what counter to look for???
> 
> > "perf ... -e skb:kfree_skb ..." or drop monitor wont work,
> > unfortunately.
> 
> That is actually a good idea... why not add a trace point for these
> rare drop cases, which is a hassle to debug for an admin?
> Let's actually take advantage of all the nice infrastructure the kernel
> provides(?)

that's a great idea. Instead of adding aborted, ring_full, blahblah counters
everywhere that cost performance, let's add tracepoints that are nops
when not in use.
And if all drivers call the same tracepoints it will be even better.
Monitoring trace_xdp_aborted tracepoint would be generic way to debug
'divide by zero'.

To your other question:
> Please explain why a eBPF program error (div by zero) must be a silent drop?

because 'div by zero' is an abnormal situation that shouldn't be exploited.
Meaning if xdp program is doing DoS prevention and it has a bug that
attacker can now exploit by sending a crafted packet that causes
'div by zero' and kernel will warn then attack got successful.
Therefore it has to be silent drop.
tracpoint in such case is great, since the user can do debugging with it
and even monitoring 24/7 and if suddenly the control plan sees a lot
of such trace_xdp_abotred events, it can disable that tracepoint to avoid
spam and adjust the program or act on attack some other way.
Hardcoded warnings and counters are not generic enough for all
the use cases people want to throw at XDP.
The tracepoints idea is awesome, in a sense that it's optional.

Reply via email to