Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On Thu, Jul 26, 2018 at 5:52 AM Jamal Hadi Salim wrote: > > On 25/07/18 01:09 PM, Marcelo Ricardo Leitner wrote: > > On Wed, Jul 25, 2018 at 09:48:16AM -0700, Cong Wang wrote: > >> On Wed, Jul 25, 2018 at 5:27 AM Jamal Hadi Salim wrote: > >>> > >>> Those changes were there from the beginning (above patch did > >>> not introduce them). > >>> IIRC, the reason was to distinguish between policy intended > >>> drops and drops because of errors. > >> > >> There must be a limit for "overlimit" to make sense. There is > >> no limit in mirred action's context, probably there is only > >> such a limit in act_police. So, all rest should not touch overlimit. > > > > +1 > > > > I agree we should at least record drop count(unrelated patch though). > we should keep overlimit (for no other reason other than this > has been around for at least 15 years). > > On why "overlimit"? It is just a name for a counter that is useless > for most actions (but was still being transfered to user space). > It is the closest counter to counting "this failed because of > runtime errors" as opposed to "user asked us to drop this". > > Probably a good alternative is to make a very small stats v3 structure > (we have migrated stats structures before) and extend for > each action/classifier/qdisc to add its extra counters using XSTATS. Agreed. > > Note: > If you are _mirroring_ packets - incrementing the drop counter is > misleading because the packet is not dropped by the system. > i.e the qdisc will not record it as dropped; it should for > redirect policy. It is useful to be able to tell > them apart when you are collecting analytics just for actions. Sounds like we just need another counter rather than re-using overlimit or drops. > (if youve worked on a massive amount of machines you'll appreciate > being able to debug by looking at counters that reduce ambiguity). > Yes, this is how I found out the overlimit of htb qdisc is inaccurate or misleading, instead the one in htb class is accurate, see: commit 3c75f6ee139d464351f8ab77a042dd3635769d98 Author: Eric Dumazet Date: Mon Sep 18 12:36:22 2017 -0700 net_sched: sch_htb: add per class overlimits counter
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On 25/07/18 01:09 PM, Marcelo Ricardo Leitner wrote: On Wed, Jul 25, 2018 at 09:48:16AM -0700, Cong Wang wrote: On Wed, Jul 25, 2018 at 5:27 AM Jamal Hadi Salim wrote: Those changes were there from the beginning (above patch did not introduce them). IIRC, the reason was to distinguish between policy intended drops and drops because of errors. There must be a limit for "overlimit" to make sense. There is no limit in mirred action's context, probably there is only such a limit in act_police. So, all rest should not touch overlimit. +1 I agree we should at least record drop count(unrelated patch though). we should keep overlimit (for no other reason other than this has been around for at least 15 years). On why "overlimit"? It is just a name for a counter that is useless for most actions (but was still being transfered to user space). It is the closest counter to counting "this failed because of runtime errors" as opposed to "user asked us to drop this". Probably a good alternative is to make a very small stats v3 structure (we have migrated stats structures before) and extend for each action/classifier/qdisc to add its extra counters using XSTATS. Note: If you are _mirroring_ packets - incrementing the drop counter is misleading because the packet is not dropped by the system. i.e the qdisc will not record it as dropped; it should for redirect policy. It is useful to be able to tell them apart when you are collecting analytics just for actions. (if youve worked on a massive amount of machines you'll appreciate being able to debug by looking at counters that reduce ambiguity). cheers, jamal
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On Wed, Jul 25, 2018 at 09:48:16AM -0700, Cong Wang wrote: > On Wed, Jul 25, 2018 at 5:27 AM Jamal Hadi Salim wrote: > > > > Those changes were there from the beginning (above patch did > > not introduce them). > > IIRC, the reason was to distinguish between policy intended > > drops and drops because of errors. > > There must be a limit for "overlimit" to make sense. There is > no limit in mirred action's context, probably there is only > such a limit in act_police. So, all rest should not touch overlimit. +1
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On Wed, Jul 25, 2018 at 5:27 AM Jamal Hadi Salim wrote: > > Those changes were there from the beginning (above patch did > not introduce them). > IIRC, the reason was to distinguish between policy intended > drops and drops because of errors. There must be a limit for "overlimit" to make sense. There is no limit in mirred action's context, probably there is only such a limit in act_police. So, all rest should not touch overlimit.
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On 25/07/18 10:24 AM, Paolo Abeni wrote: On Wed, 2018-07-25 at 08:27 -0400, Jamal Hadi Salim wrote: Those changes were there from the beginning (above patch did not introduce them). IIRC, the reason was to distinguish between policy intended drops and drops because of errors. Double-checking to avoid misinterepration on my side: you are ok with keeping the 'overlimits' increment, right? Yes. cheers, jamal
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On Wed, 2018-07-25 at 08:27 -0400, Jamal Hadi Salim wrote: > On 25/07/18 04:29 AM, Paolo Abeni wrote: > > On Tue, 2018-07-24 at 13:50 -0700, Cong Wang wrote: > > [..] > > > > I fail to understand why overlimit is increased in your case > > > > here. I guess you want to increase 'drops' instead. > > > > > > Hmm, actually the current mirred code increases overlimit too. > > > But I still don't think it makes sense. > > > > Yep, I chose to increment 'overlimits' to preserve the current mirred > > semantic. > > > > AFAICS, that was first introduced with: > > > > commit 8919bc13e8d92c5b082c5c0321567383a071f5bc > > Author: Jamal Hadi Salim > > Date: Mon Aug 15 05:25:40 2011 + > > > > net_sched: fix port mirror/redirect stats reporting > > > > Likely increasing 'drops' would be "better", but I'm unsure we can > > change this established behavior without affecting some user. > > > > Those changes were there from the beginning (above patch did > not introduce them). > IIRC, the reason was to distinguish between policy intended > drops and drops because of errors. Double-checking to avoid misinterepration on my side: you are ok with keeping the 'overlimits' increment, right? Thanks, Paolo
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
Hi, On Wed, 2018-07-25 at 08:16 -0400, Jamal Hadi Salim wrote: > +Cc Shmulik > > Paolo - please also run the tdc tests (and add anymore if you > feel they dont do coverage to your changes) I run successfully tdc tests on a patched before posting. I plan to rerun them before posting the v4. > On 24/07/18 04:06 PM, Paolo Abeni wrote: > > This is similar TC_ACT_REDIRECT, but with a slightly different > > semantic: > > - on ingress the mirred skbs are passed to the target device > > network stack without any additional check not scrubbing. > > - the rcu-protected stats provided via the tcf_result struct > >are updated on error conditions. > > > > This new tcfa_action value is not exposed to the user-space > > and can be used only internally by clsact. > > > > v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce > > a new action type instead > > > > v2 -> v3: > > - rename the new action value TC_ACT_REINJECT, update the > > helper accordingly > > - take care of uncloned reinjected packets in XDP generic > > hook > > > > Signed-off-by: Paolo Abeni > > --- > > include/net/pkt_cls.h | 3 +++ > > include/net/sch_generic.h | 19 +++ > > net/core/dev.c| 6 +- > > 3 files changed, 27 insertions(+), 1 deletion(-) > > > > diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h > > index 2081e4219f81..36ccfe2a303a 100644 > > --- a/include/net/pkt_cls.h > > +++ b/include/net/pkt_cls.h > > @@ -7,6 +7,9 @@ > > #include > > #include > > > > +/* TC action not accessible from user space */ > > +#define TC_ACT_REINJECT(TC_ACT_VALUE_MAX + 1) > > Lets say in the future we add a new opcode. > Will old kernel, new iproute2 (new value) work? It will works as it currently does in similar situation: opcode unknown to the kernel are treated as TC_ACT_UNSPEC even now. Cheers, Paolo
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
Wed, Jul 25, 2018 at 02:16:12PM CEST, j...@mojatatu.com wrote: [...] >> diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h >> index 2081e4219f81..36ccfe2a303a 100644 >> --- a/include/net/pkt_cls.h >> +++ b/include/net/pkt_cls.h >> @@ -7,6 +7,9 @@ >> #include >> #include >> +/* TC action not accessible from user space */ >> +#define TC_ACT_REINJECT (TC_ACT_VALUE_MAX + 1) > >Lets say in the future we add a new opcode. >Will old kernel, new iproute2 (new value) work? This is safe. See patch #2.
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
Tue, Jul 24, 2018 at 10:06:42PM CEST, pab...@redhat.com wrote: >This is similar TC_ACT_REDIRECT, but with a slightly different >semantic: >- on ingress the mirred skbs are passed to the target device >network stack without any additional check not scrubbing. >- the rcu-protected stats provided via the tcf_result struct > are updated on error conditions. > >This new tcfa_action value is not exposed to the user-space >and can be used only internally by clsact. > >v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce > a new action type instead > >v2 -> v3: > - rename the new action value TC_ACT_REINJECT, update the > helper accordingly > - take care of uncloned reinjected packets in XDP generic > hook > >Signed-off-by: Paolo Abeni >--- > include/net/pkt_cls.h | 3 +++ > include/net/sch_generic.h | 19 +++ > net/core/dev.c| 6 +- > 3 files changed, 27 insertions(+), 1 deletion(-) > >diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h >index 2081e4219f81..36ccfe2a303a 100644 >--- a/include/net/pkt_cls.h >+++ b/include/net/pkt_cls.h >@@ -7,6 +7,9 @@ > #include > #include > >+/* TC action not accessible from user space */ >+#define TC_ACT_REINJECT (TC_ACT_VALUE_MAX + 1) >+ > /* Basic packet classifier frontend definitions. */ > > struct tcf_walker { >diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h >index 056dc1083aa3..95e81a70f549 100644 >--- a/include/net/sch_generic.h >+++ b/include/net/sch_generic.h >@@ -235,6 +235,12 @@ struct tcf_result { > u32 classid; > }; > const struct tcf_proto *goto_tp; >+ >+ /* used by the TC_ACT_REINJECT action */ >+ struct { >+ boolingress; >+ struct gnet_stats_queue *qstats; >+ }; > }; > }; > >@@ -1091,4 +1097,17 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair >*miniqp, > void mini_qdisc_pair_init(struct mini_Qdisc_pair *miniqp, struct Qdisc *qdisc, > struct mini_Qdisc __rcu **p_miniq); > >+static inline void skb_tc_reinject(struct sk_buff *skb, struct tcf_result >*res) >+{ >+ struct gnet_stats_queue *stats = res->qstats; >+ int ret; >+ >+ if (res->ingress) >+ ret = netif_receive_skb(skb); >+ else >+ ret = dev_queue_xmit(skb); Hmm. "reinject" by the name tells me that the packet should be injected again. By "inject", I understand beginning of the rx path. However, this does xmit as well :/ It is a bit misleading. Maybe "reinsert" would sound better? >+ if (ret && stats) >+ qstats_overlimit_inc(res->qstats); >+} >+ > #endif >diff --git a/net/core/dev.c b/net/core/dev.c >index 14a748ee8cc9..826ec74fe1d9 100644 >--- a/net/core/dev.c >+++ b/net/core/dev.c >@@ -4252,7 +4252,7 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb, > /* Reinjected packets coming from act_mirred or similar should >* not get XDP generic processing. >*/ >- if (skb_cloned(skb)) >+ if (skb_cloned(skb) || skb->tc_redirected) > return XDP_PASS; > > /* XDP packets must be linear and must have sufficient headroom >@@ -4602,6 +4602,10 @@ sch_handle_ingress(struct sk_buff *skb, struct >packet_type **pt_prev, int *ret, > __skb_push(skb, skb->mac_len); > skb_do_redirect(skb); > return NULL; >+ case TC_ACT_REINJECT: >+ /* this does not scrub the packet, and updates stats on error */ >+ skb_tc_reinject(skb, _res); >+ return NULL; > default: > break; > } >-- >2.17.1 >
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On 25/07/18 04:29 AM, Paolo Abeni wrote: On Tue, 2018-07-24 at 13:50 -0700, Cong Wang wrote: [..] I fail to understand why overlimit is increased in your case here. I guess you want to increase 'drops' instead. Hmm, actually the current mirred code increases overlimit too. But I still don't think it makes sense. Yep, I chose to increment 'overlimits' to preserve the current mirred semantic. AFAICS, that was first introduced with: commit 8919bc13e8d92c5b082c5c0321567383a071f5bc Author: Jamal Hadi Salim Date: Mon Aug 15 05:25:40 2011 + net_sched: fix port mirror/redirect stats reporting Likely increasing 'drops' would be "better", but I'm unsure we can change this established behavior without affecting some user. Those changes were there from the beginning (above patch did not introduce them). IIRC, the reason was to distinguish between policy intended drops and drops because of errors. cheers, jamal
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
+Cc Shmulik Paolo - please also run the tdc tests (and add anymore if you feel they dont do coverage to your changes) On 24/07/18 04:06 PM, Paolo Abeni wrote: This is similar TC_ACT_REDIRECT, but with a slightly different semantic: - on ingress the mirred skbs are passed to the target device network stack without any additional check not scrubbing. - the rcu-protected stats provided via the tcf_result struct are updated on error conditions. This new tcfa_action value is not exposed to the user-space and can be used only internally by clsact. v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce a new action type instead v2 -> v3: - rename the new action value TC_ACT_REINJECT, update the helper accordingly - take care of uncloned reinjected packets in XDP generic hook Signed-off-by: Paolo Abeni --- include/net/pkt_cls.h | 3 +++ include/net/sch_generic.h | 19 +++ net/core/dev.c| 6 +- 3 files changed, 27 insertions(+), 1 deletion(-) diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 2081e4219f81..36ccfe2a303a 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h @@ -7,6 +7,9 @@ #include #include +/* TC action not accessible from user space */ +#define TC_ACT_REINJECT(TC_ACT_VALUE_MAX + 1) Lets say in the future we add a new opcode. Will old kernel, new iproute2 (new value) work? Maybe use a negative number below -1. I am honestly still unclear about introducing this new code. Could you not use the result code to carry sufficient info to indicate the intent? cheers, jamal + /* Basic packet classifier frontend definitions. */ struct tcf_walker { diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 056dc1083aa3..95e81a70f549 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -235,6 +235,12 @@ struct tcf_result { u32 classid; }; const struct tcf_proto *goto_tp; + + /* used by the TC_ACT_REINJECT action */ + struct { + boolingress; + struct gnet_stats_queue *qstats; + }; }; }; @@ -1091,4 +1097,17 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp, void mini_qdisc_pair_init(struct mini_Qdisc_pair *miniqp, struct Qdisc *qdisc, struct mini_Qdisc __rcu **p_miniq); +static inline void skb_tc_reinject(struct sk_buff *skb, struct tcf_result *res) +{ + struct gnet_stats_queue *stats = res->qstats; + int ret; + + if (res->ingress) + ret = netif_receive_skb(skb); + else + ret = dev_queue_xmit(skb); + if (ret && stats) + qstats_overlimit_inc(res->qstats); +} + #endif diff --git a/net/core/dev.c b/net/core/dev.c index 14a748ee8cc9..826ec74fe1d9 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4252,7 +4252,7 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb, /* Reinjected packets coming from act_mirred or similar should * not get XDP generic processing. */ - if (skb_cloned(skb)) + if (skb_cloned(skb) || skb->tc_redirected) return XDP_PASS; /* XDP packets must be linear and must have sufficient headroom @@ -4602,6 +4602,10 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, __skb_push(skb, skb->mac_len); skb_do_redirect(skb); return NULL; + case TC_ACT_REINJECT: + /* this does not scrub the packet, and updates stats on error */ + skb_tc_reinject(skb, _res); + return NULL; default: break; }
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On Tue, 2018-07-24 at 13:50 -0700, Cong Wang wrote: > On Tue, Jul 24, 2018 at 1:38 PM Cong Wang wrote: > > > > On Tue, Jul 24, 2018 at 1:07 PM Paolo Abeni wrote: > > > +static inline void skb_tc_reinject(struct sk_buff *skb, struct > > > tcf_result *res) > > > +{ > > > + struct gnet_stats_queue *stats = res->qstats; > > > + int ret; > > > + > > > + if (res->ingress) > > > + ret = netif_receive_skb(skb); > > > + else > > > + ret = dev_queue_xmit(skb); > > > + if (ret && stats) > > > + qstats_overlimit_inc(res->qstats); > > > > Why increasing overlimit? Overlimit is typically increased > > by traffic shapers to indicate there is no bandwidth to send > > out the packet. > > > > I fail to understand why overlimit is increased in your case > > here. I guess you want to increase 'drops' instead. > > Hmm, actually the current mirred code increases overlimit too. > But I still don't think it makes sense. Yep, I chose to increment 'overlimits' to preserve the current mirred semantic. AFAICS, that was first introduced with: commit 8919bc13e8d92c5b082c5c0321567383a071f5bc Author: Jamal Hadi Salim Date: Mon Aug 15 05:25:40 2011 + net_sched: fix port mirror/redirect stats reporting Likely increasing 'drops' would be "better", but I'm unsure we can change this established behavior without affecting some user. Anyway, I'm fine either way. Please advice. Cheers, Paolo
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On Tue, Jul 24, 2018 at 1:38 PM Cong Wang wrote: > > On Tue, Jul 24, 2018 at 1:07 PM Paolo Abeni wrote: > > +static inline void skb_tc_reinject(struct sk_buff *skb, struct tcf_result > > *res) > > +{ > > + struct gnet_stats_queue *stats = res->qstats; > > + int ret; > > + > > + if (res->ingress) > > + ret = netif_receive_skb(skb); > > + else > > + ret = dev_queue_xmit(skb); > > + if (ret && stats) > > + qstats_overlimit_inc(res->qstats); > > Why increasing overlimit? Overlimit is typically increased > by traffic shapers to indicate there is no bandwidth to send > out the packet. > > I fail to understand why overlimit is increased in your case > here. I guess you want to increase 'drops' instead. Hmm, actually the current mirred code increases overlimit too. But I still don't think it makes sense.
Re: [PATCH net-next v3 4/5] net/tc: introduce TC_ACT_REINJECT.
On Tue, Jul 24, 2018 at 1:07 PM Paolo Abeni wrote: > +static inline void skb_tc_reinject(struct sk_buff *skb, struct tcf_result > *res) > +{ > + struct gnet_stats_queue *stats = res->qstats; > + int ret; > + > + if (res->ingress) > + ret = netif_receive_skb(skb); > + else > + ret = dev_queue_xmit(skb); > + if (ret && stats) > + qstats_overlimit_inc(res->qstats); Why increasing overlimit? Overlimit is typically increased by traffic shapers to indicate there is no bandwidth to send out the packet. I fail to understand why overlimit is increased in your case here. I guess you want to increase 'drops' instead.