Re: [RFC net-next sample action optimization 3/3] openvswitch: Optimize sample action for the clone use cases

2017-03-10 Thread Joe Stringer
On 10 March 2017 at 14:07, Andy Zhou  wrote:
> On Thu, Mar 9, 2017 at 11:46 AM, Joe Stringer  wrote:
>> On 7 March 2017 at 16:15, Andy Zhou  wrote:
>>> With the introduction of open flow 'clone' action, the OVS user space
>>> can now translate the 'clone' action into kernel datapath 'sample'
>>> action, with 100% probability, to ensure that the clone semantics,
>>> which is that the packet seen by the clone action is the same as the
>>> packet seen by the action after clone, is faithfully carried out
>>> in the datapath.
>>>
>>> While the sample action in the datpath has the matching semantics,
>>> its implementation is only optimized for its original use.
>>> Specifically, there are two limitation: First, there is a 3 level of
>>> nesting restriction, enforced at the flow downloading time. This
>>> limit turns out to be too restrictive for the 'clone' use case.
>>> Second, the implementation avoid recursive call only if the sample
>>> action list has a single userspace action.
>>>
>>> The optimization implemented in this series removes the static
>>> nesting limit check, instead, implement the run time recursion limit
>>> check, and recursion avoidance similar to that of the 'recirc' action.
>>> This optimization solve both #1 and #2 issues above.
>>>
>>> Another optimization implemented is to avoid coping flow key as
>>
>> *copying
>>
>>> long as the actions enclosed does not change the flow key. The
>>> detection is performed only once at the flow downloading time.
>>>
>>> The third optimization implemented is to rewrite the action list
>>> at flow downloading time in order to save the fast path from parsing
>>> the sample action list in its original form repeatedly.
>>
>> Whenever there is an enumeration (1, 2, 3; ..another..., third thing
>> implemented) in a commit message, I have to ask whether each "another
>> change..." should be a separate patch. It certainly makes it easier to
>> review.
>>
> They are all part of the same implementation. Spliting them probably won't
> make much sense. I think I will drop #2 and #3 in the commit message since
> #1 is the main optimization.

Fair enough. You don't have to drop them from the commit message, it
makes the intention of all of the changes more clear.

>> I ran this through the OVS kernel tests and it's working correctly
>> from that point of view, but I didn't get a chance to dig in and
>> ensure for example, runtime behaviour of several nested
>> sample(actions(sample(actions(sample(actions(output)) handles
>> reasonably when it runs out of stack and deferred actions space. At a
>> high level though, this seems pretty straightforward.
>>
>> Several comments below, thanks.
>>
>>>
>>> Signed-off-by: Andy Zhou 
>>> ---
>>>  net/openvswitch/actions.c  | 106 ++--
>>>  net/openvswitch/datapath.h |   7 +++
>>>  net/openvswitch/flow_netlink.c | 118 
>>> -
>>>  3 files changed, 140 insertions(+), 91 deletions(-)
>>>
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>> index 259aea9..2e8c372 100644
>>> --- a/net/openvswitch/actions.c
>>> +++ b/net/openvswitch/actions.c
>>> @@ -930,71 +930,52 @@ static int output_userspace(struct datapath *dp, 
>>> struct sk_buff *skb,
>>>  }
>>>
>>>  static int sample(struct datapath *dp, struct sk_buff *skb,
>>> - struct sw_flow_key *key, const struct nlattr *attr,
>>> - const struct nlattr *actions, int actions_len)
>>> + struct sw_flow_key *key, const struct nlattr *attr)
>>>  {
>>> -   const struct nlattr *acts_list = NULL;
>>> -   const struct nlattr *a;
>>> -   int rem;
>>> -   u32 cutlen = 0;
>>> -
>>> -   for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>>> -a = nla_next(a, )) {
>>> -   u32 probability;
>>> -
>>> -   switch (nla_type(a)) {
>>> -   case OVS_SAMPLE_ATTR_PROBABILITY:
>>> -   probability = nla_get_u32(a);
>>> -   if (!probability || prandom_u32() > probability)
>>> -   return 0;
>>> -   break;
>>> -
>>> -   case OVS_SAMPLE_ATTR_ACTIONS:
>>> -   acts_list = a;
>>> -   break;
>>> -   }
>>> -   }
>>> +   struct nlattr *actions;
>>> +   struct nlattr *sample_arg;
>>> +   struct sw_flow_key *orig = key;
>>> +   int rem = nla_len(attr);
>>> +   int err = 0;
>>> +   const struct sample_arg *arg;
>>>
>>> -   rem = nla_len(acts_list);
>>> -   a = nla_data(acts_list);
>>> +   /* The first action is always 'OVS_SAMPLE_ATTR_AUX'. */
>>
>> Is it? This is the only reference to OVS_SAMPLE_ATTR_AUX I can see.
>>
>>> +   sample_arg = nla_data(attr);
>>
>> We could do this in the parent call, like several other actions do.
>
> What do you mean?


Re: [RFC net-next sample action optimization 3/3] openvswitch: Optimize sample action for the clone use cases

2017-03-10 Thread Andy Zhou
On Thu, Mar 9, 2017 at 11:46 AM, Joe Stringer  wrote:
> On 7 March 2017 at 16:15, Andy Zhou  wrote:
>> With the introduction of open flow 'clone' action, the OVS user space
>> can now translate the 'clone' action into kernel datapath 'sample'
>> action, with 100% probability, to ensure that the clone semantics,
>> which is that the packet seen by the clone action is the same as the
>> packet seen by the action after clone, is faithfully carried out
>> in the datapath.
>>
>> While the sample action in the datpath has the matching semantics,
>> its implementation is only optimized for its original use.
>> Specifically, there are two limitation: First, there is a 3 level of
>> nesting restriction, enforced at the flow downloading time. This
>> limit turns out to be too restrictive for the 'clone' use case.
>> Second, the implementation avoid recursive call only if the sample
>> action list has a single userspace action.
>>
>> The optimization implemented in this series removes the static
>> nesting limit check, instead, implement the run time recursion limit
>> check, and recursion avoidance similar to that of the 'recirc' action.
>> This optimization solve both #1 and #2 issues above.
>>
>> Another optimization implemented is to avoid coping flow key as
>
> *copying
>
>> long as the actions enclosed does not change the flow key. The
>> detection is performed only once at the flow downloading time.
>>
>> The third optimization implemented is to rewrite the action list
>> at flow downloading time in order to save the fast path from parsing
>> the sample action list in its original form repeatedly.
>
> Whenever there is an enumeration (1, 2, 3; ..another..., third thing
> implemented) in a commit message, I have to ask whether each "another
> change..." should be a separate patch. It certainly makes it easier to
> review.
>
They are all part of the same implementation. Spliting them probably won't
make much sense. I think I will drop #2 and #3 in the commit message since
#1 is the main optimization.

> I ran this through the OVS kernel tests and it's working correctly
> from that point of view, but I didn't get a chance to dig in and
> ensure for example, runtime behaviour of several nested
> sample(actions(sample(actions(sample(actions(output)) handles
> reasonably when it runs out of stack and deferred actions space. At a
> high level though, this seems pretty straightforward.
>
> Several comments below, thanks.
>
>>
>> Signed-off-by: Andy Zhou 
>> ---
>>  net/openvswitch/actions.c  | 106 ++--
>>  net/openvswitch/datapath.h |   7 +++
>>  net/openvswitch/flow_netlink.c | 118 
>> -
>>  3 files changed, 140 insertions(+), 91 deletions(-)
>>
>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>> index 259aea9..2e8c372 100644
>> --- a/net/openvswitch/actions.c
>> +++ b/net/openvswitch/actions.c
>> @@ -930,71 +930,52 @@ static int output_userspace(struct datapath *dp, 
>> struct sk_buff *skb,
>>  }
>>
>>  static int sample(struct datapath *dp, struct sk_buff *skb,
>> - struct sw_flow_key *key, const struct nlattr *attr,
>> - const struct nlattr *actions, int actions_len)
>> + struct sw_flow_key *key, const struct nlattr *attr)
>>  {
>> -   const struct nlattr *acts_list = NULL;
>> -   const struct nlattr *a;
>> -   int rem;
>> -   u32 cutlen = 0;
>> -
>> -   for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>> -a = nla_next(a, )) {
>> -   u32 probability;
>> -
>> -   switch (nla_type(a)) {
>> -   case OVS_SAMPLE_ATTR_PROBABILITY:
>> -   probability = nla_get_u32(a);
>> -   if (!probability || prandom_u32() > probability)
>> -   return 0;
>> -   break;
>> -
>> -   case OVS_SAMPLE_ATTR_ACTIONS:
>> -   acts_list = a;
>> -   break;
>> -   }
>> -   }
>> +   struct nlattr *actions;
>> +   struct nlattr *sample_arg;
>> +   struct sw_flow_key *orig = key;
>> +   int rem = nla_len(attr);
>> +   int err = 0;
>> +   const struct sample_arg *arg;
>>
>> -   rem = nla_len(acts_list);
>> -   a = nla_data(acts_list);
>> +   /* The first action is always 'OVS_SAMPLE_ATTR_AUX'. */
>
> Is it? This is the only reference to OVS_SAMPLE_ATTR_AUX I can see.
>
>> +   sample_arg = nla_data(attr);
>
> We could do this in the parent call, like several other actions do.

What do you mean?

>
> 
>
>> @@ -1246,9 +1227,24 @@ static int do_execute_actions(struct datapath *dp, 
>> struct sk_buff *skb,
>> err = execute_masked_set_action(skb, key, 
>> nla_data(a));
>> break;
>>
>> -   case OVS_ACTION_ATTR_SAMPLE:
>> -

Re: [RFC net-next sample action optimization 3/3] openvswitch: Optimize sample action for the clone use cases

2017-03-09 Thread Joe Stringer
On 7 March 2017 at 16:15, Andy Zhou  wrote:
> With the introduction of open flow 'clone' action, the OVS user space
> can now translate the 'clone' action into kernel datapath 'sample'
> action, with 100% probability, to ensure that the clone semantics,
> which is that the packet seen by the clone action is the same as the
> packet seen by the action after clone, is faithfully carried out
> in the datapath.
>
> While the sample action in the datpath has the matching semantics,
> its implementation is only optimized for its original use.
> Specifically, there are two limitation: First, there is a 3 level of
> nesting restriction, enforced at the flow downloading time. This
> limit turns out to be too restrictive for the 'clone' use case.
> Second, the implementation avoid recursive call only if the sample
> action list has a single userspace action.
>
> The optimization implemented in this series removes the static
> nesting limit check, instead, implement the run time recursion limit
> check, and recursion avoidance similar to that of the 'recirc' action.
> This optimization solve both #1 and #2 issues above.
>
> Another optimization implemented is to avoid coping flow key as

*copying

> long as the actions enclosed does not change the flow key. The
> detection is performed only once at the flow downloading time.
>
> The third optimization implemented is to rewrite the action list
> at flow downloading time in order to save the fast path from parsing
> the sample action list in its original form repeatedly.

Whenever there is an enumeration (1, 2, 3; ..another..., third thing
implemented) in a commit message, I have to ask whether each "another
change..." should be a separate patch. It certainly makes it easier to
review.

I ran this through the OVS kernel tests and it's working correctly
from that point of view, but I didn't get a chance to dig in and
ensure for example, runtime behaviour of several nested
sample(actions(sample(actions(sample(actions(output)) handles
reasonably when it runs out of stack and deferred actions space. At a
high level though, this seems pretty straightforward.

Several comments below, thanks.

>
> Signed-off-by: Andy Zhou 
> ---
>  net/openvswitch/actions.c  | 106 ++--
>  net/openvswitch/datapath.h |   7 +++
>  net/openvswitch/flow_netlink.c | 118 
> -
>  3 files changed, 140 insertions(+), 91 deletions(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 259aea9..2e8c372 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -930,71 +930,52 @@ static int output_userspace(struct datapath *dp, struct 
> sk_buff *skb,
>  }
>
>  static int sample(struct datapath *dp, struct sk_buff *skb,
> - struct sw_flow_key *key, const struct nlattr *attr,
> - const struct nlattr *actions, int actions_len)
> + struct sw_flow_key *key, const struct nlattr *attr)
>  {
> -   const struct nlattr *acts_list = NULL;
> -   const struct nlattr *a;
> -   int rem;
> -   u32 cutlen = 0;
> -
> -   for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> -a = nla_next(a, )) {
> -   u32 probability;
> -
> -   switch (nla_type(a)) {
> -   case OVS_SAMPLE_ATTR_PROBABILITY:
> -   probability = nla_get_u32(a);
> -   if (!probability || prandom_u32() > probability)
> -   return 0;
> -   break;
> -
> -   case OVS_SAMPLE_ATTR_ACTIONS:
> -   acts_list = a;
> -   break;
> -   }
> -   }
> +   struct nlattr *actions;
> +   struct nlattr *sample_arg;
> +   struct sw_flow_key *orig = key;
> +   int rem = nla_len(attr);
> +   int err = 0;
> +   const struct sample_arg *arg;
>
> -   rem = nla_len(acts_list);
> -   a = nla_data(acts_list);
> +   /* The first action is always 'OVS_SAMPLE_ATTR_AUX'. */

Is it? This is the only reference to OVS_SAMPLE_ATTR_AUX I can see.

> +   sample_arg = nla_data(attr);

We could do this in the parent call, like several other actions do.



> @@ -1246,9 +1227,24 @@ static int do_execute_actions(struct datapath *dp, 
> struct sk_buff *skb,
> err = execute_masked_set_action(skb, key, 
> nla_data(a));
> break;
>
> -   case OVS_ACTION_ATTR_SAMPLE:
> -   err = sample(dp, skb, key, a, attr, len);
> +   case OVS_ACTION_ATTR_SAMPLE: {
> +   bool last = nla_is_last(a, rem);
> +   struct sk_buff *clone_skb;
> +
> +   clone_skb = last ? skb : skb_clone(skb, GFP_ATOMIC);
> +
> +   if (!clone_skb)
> +   /* Out of memory, skip 

[RFC net-next sample action optimization 3/3] openvswitch: Optimize sample action for the clone use cases

2017-03-07 Thread Andy Zhou
With the introduction of open flow 'clone' action, the OVS user space
can now translate the 'clone' action into kernel datapath 'sample'
action, with 100% probability, to ensure that the clone semantics,
which is that the packet seen by the clone action is the same as the
packet seen by the action after clone, is faithfully carried out
in the datapath.

While the sample action in the datpath has the matching semantics,
its implementation is only optimized for its original use.
Specifically, there are two limitation: First, there is a 3 level of
nesting restriction, enforced at the flow downloading time. This
limit turns out to be too restrictive for the 'clone' use case.
Second, the implementation avoid recursive call only if the sample
action list has a single userspace action.

The optimization implemented in this series removes the static
nesting limit check, instead, implement the run time recursion limit
check, and recursion avoidance similar to that of the 'recirc' action.
This optimization solve both #1 and #2 issues above.

Another optimization implemented is to avoid coping flow key as
long as the actions enclosed does not change the flow key. The
detection is performed only once at the flow downloading time.

The third optimization implemented is to rewrite the action list
at flow downloading time in order to save the fast path from parsing
the sample action list in its original form repeatedly.

Signed-off-by: Andy Zhou 
---
 net/openvswitch/actions.c  | 106 ++--
 net/openvswitch/datapath.h |   7 +++
 net/openvswitch/flow_netlink.c | 118 -
 3 files changed, 140 insertions(+), 91 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 259aea9..2e8c372 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -930,71 +930,52 @@ static int output_userspace(struct datapath *dp, struct 
sk_buff *skb,
 }
 
 static int sample(struct datapath *dp, struct sk_buff *skb,
- struct sw_flow_key *key, const struct nlattr *attr,
- const struct nlattr *actions, int actions_len)
+ struct sw_flow_key *key, const struct nlattr *attr)
 {
-   const struct nlattr *acts_list = NULL;
-   const struct nlattr *a;
-   int rem;
-   u32 cutlen = 0;
-
-   for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
-a = nla_next(a, )) {
-   u32 probability;
-
-   switch (nla_type(a)) {
-   case OVS_SAMPLE_ATTR_PROBABILITY:
-   probability = nla_get_u32(a);
-   if (!probability || prandom_u32() > probability)
-   return 0;
-   break;
-
-   case OVS_SAMPLE_ATTR_ACTIONS:
-   acts_list = a;
-   break;
-   }
-   }
+   struct nlattr *actions;
+   struct nlattr *sample_arg;
+   struct sw_flow_key *orig = key;
+   int rem = nla_len(attr);
+   int err = 0;
+   const struct sample_arg *arg;
 
-   rem = nla_len(acts_list);
-   a = nla_data(acts_list);
+   /* The first action is always 'OVS_SAMPLE_ATTR_AUX'. */
+   sample_arg = nla_data(attr);
+   arg = nla_data(sample_arg);
+   actions = nla_next(sample_arg, );
 
-   /* Actions list is empty, do nothing */
-   if (unlikely(!rem))
-   return 0;
+   if (arg->probability != U32_MAX)
+   if (!arg->probability || prandom_u32() > arg->probability)
+   return 0;
 
-   /* The only known usage of sample action is having a single user-space
-* action, or having a truncate action followed by a single user-space
-* action. Treat this usage as a special case.
-* The output_userspace() should clone the skb to be sent to the
-* user space. This skb will be consumed by its caller.
+   /* In case the sample actions won't change 'key',
+* we can use key for the clone execution.
+* Otherwise, try to allocate a key from the
+* next recursion level of 'flow_keys'. If
+* successful, we can still execute the clone
+* actions  without deferring.
+*
+* Defer the clone action if the action recursion
+* limit has been reached.
 */
-   if (unlikely(nla_type(a) == OVS_ACTION_ATTR_TRUNC)) {
-   struct ovs_action_trunc *trunc = nla_data(a);
-
-   if (skb->len > trunc->max_len)
-   cutlen = skb->len - trunc->max_len;
-
-   a = nla_next(a, );
+   if (!arg->exec) {
+   __this_cpu_inc(exec_actions_level);
+   key = clone_key(key);
}
 
-   if (likely(nla_type(a) == OVS_ACTION_ATTR_USERSPACE &&
-  nla_is_last(a, rem)))
-   return output_userspace(dp, skb, key, a, actions,
-