On Thu, Jan 19, 2017 at 6:21 PM, Jarno Rajahalme <[email protected]> wrote:
> > > On Jan 13, 2017, at 4:13 PM, Andy Zhou <[email protected]> wrote: > > > > Add support for userspace datapath clone action. The clone action > > provides an action envelope to enclose an action list. > > For example, with action A, B, C and D, and an action list: > > “with action”->”with actions” > > > A, clone(B, C), D > > > > The clone action will ensure that: > > > > - D will see that same packet, and any meta states, such as flow, > > associated with the packet. > > > > “same packet”->”same packet as B” > > > - D will be executed regardless whether B, or C drops a packet. They > > can only drop a clone. > > > > - When B drop a packet, the clone will skip all actions within > > “drop”->”drops” > > “all actions”->”all remaining actions”, > consider this: CLONE(OUTPUT:1,METER:1,OUTPUT:2), if meter 1 drops, > OUTPUT:1 is still executed. > Good point. I agree. > > > the clone envelope. This feature is useful when we add meter action > > later. Since Meter action does not have to provide its own envelope, > > but can be a simple action enclosed within the clone action. > > > > The clone action is very similar with the OpenFlow clone action > > recently added. This is by design to simplify vswitchd flow translation > > logic. > > > > Without datapath clone, vswitchd simulate the effect by inserting > > “simulate”->”simulates” > > > datapath actions to "undo" clone actions. The above flow will be > > translated into A, B, C, -C, -B, D. > > > > However, there are two issues: > > - The resulting datapath action list may be longer without using > > clone. > > “clone”->”datapath clone action”. > > > > > - Some actions, such as NAT may not be possible to reverse. > > > > This patch implements clone() simply with packet copy. The performance > > can be improved with later patches, for example, to delay or avoid > > packet copy if possible. It seems datapath should have enough context > > to carry out such optimization without the userspace context. > > > > Signed-off-by: Andy Zhou <[email protected]> > > --- > > datapath/linux/compat/include/linux/openvswitch.h | 16 ++++++- > > lib/dpif-netdev.c | 1 + > > lib/dpif.c | 1 + > > lib/odp-execute.c | 56 > ++++++++++++++++++++++ > > lib/odp-util.c | 29 ++++++++++++ > > ofproto/ofproto-dpif-sflow.c | 1 + > > ofproto/ofproto-dpif.c | 57 > +++++++++++++++++++++++ > > ofproto/ofproto-dpif.h | 3 ++ > > 8 files changed, 163 insertions(+), 1 deletion(-) > > > > diff --git a/datapath/linux/compat/include/linux/openvswitch.h > b/datapath/linux/compat/include/linux/openvswitch.h > > index 12260d8..54f5fae 100644 > > --- a/datapath/linux/compat/include/linux/openvswitch.h > > +++ b/datapath/linux/compat/include/linux/openvswitch.h > > @@ -1,5 +1,5 @@ > > /* > > - * Copyright (c) 2007-2014 Nicira, Inc. > > + * Copyright (c) 2007-2017 Nicira, Inc. > > * > > * This file is offered under your choice of two licenses: Apache 2.0 or > GNU > > * GPL 2.0 or later. The permission statements for each of these > licenses is > > @@ -585,6 +585,19 @@ enum ovs_sample_attr { > > #define OVS_SAMPLE_ATTR_MAX (__OVS_SAMPLE_ATTR_MAX - 1) > > > > /** > > + * enum ovs_clone_attr - Attributes for %OVS_ACTION_ATTR_CLONE action. > > + * @OVS_CLONE_ATTR_ACTIONS: Set of actions to execute with the clone. > > + * Actions are passed as nested attributes. > > + */ > > +enum ovs_clone_attr { > > + OVS_CLONE_ATTR_UNSPEC, > > + OVS_CLONE_ATTR_ACTIONS, /* Nested OVS_ACTION_ATTR_* > attributes. */ > > + __OVS_CLONE_ATTR_MAX, > > +}; > > + > > +#define OVS_CLONE_ATTR_MAX (__OVS_CLONE_ATTR_MAX - 1) > > + > > +/** > > Could we have the OVS_ACTION_ATTR_CLONE directly embed the encapped > actions? I do not see why we need the double encapsulation, > > The idea is to allow possible future expansion of clone to be passed in without having to have a clone2 action. However, I don't have a good example of such parameter handy. > > * enum ovs_userspace_attr - Attributes for %OVS_ACTION_ATTR_USERSPACE > action. > > * @OVS_USERSPACE_ATTR_PID: u32 Netlink PID to which the > %OVS_PACKET_CMD_ACTION > > * message should be sent. Required. > > @@ -818,6 +831,7 @@ enum ovs_action_attr { > > #ifndef __KERNEL__ > > OVS_ACTION_ATTR_TUNNEL_PUSH, /* struct ovs_action_push_tnl*/ > > OVS_ACTION_ATTR_TUNNEL_POP, /* u32 port number. */ > > + OVS_ACTION_ATTR_CLONE, /* Nested OVS_CLONE_ATTR_*. */ > > #endif > > __OVS_ACTION_ATTR_MAX, /* Nothing past this will be accepted > > * from userspace. */ > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > > index 546a1e9..94cb840 100644 > > --- a/lib/dpif-netdev.c > > +++ b/lib/dpif-netdev.c > > @@ -4626,6 +4626,7 @@ dp_execute_cb(void *aux_, struct dp_packet_batch > *packets_, > > case OVS_ACTION_ATTR_HASH: > > case OVS_ACTION_ATTR_UNSPEC: > > case OVS_ACTION_ATTR_TRUNC: > > + case OVS_ACTION_ATTR_CLONE: > > case __OVS_ACTION_ATTR_MAX: > > OVS_NOT_REACHED(); > > } > > diff --git a/lib/dpif.c b/lib/dpif.c > > index cea2448..4c4e560 100644 > > --- a/lib/dpif.c > > +++ b/lib/dpif.c > > @@ -1182,6 +1182,7 @@ dpif_execute_helper_cb(void *aux_, struct > dp_packet_batch *packets_, > > case OVS_ACTION_ATTR_SET_MASKED: > > case OVS_ACTION_ATTR_SAMPLE: > > case OVS_ACTION_ATTR_TRUNC: > > + case OVS_ACTION_ATTR_CLONE: > > case OVS_ACTION_ATTR_UNSPEC: > > case __OVS_ACTION_ATTR_MAX: > > OVS_NOT_REACHED(); > > diff --git a/lib/odp-execute.c b/lib/odp-execute.c > > index 73e1016..f34360e 100644 > > --- a/lib/odp-execute.c > > +++ b/lib/odp-execute.c > > @@ -528,6 +528,48 @@ odp_execute_sample(void *dp, struct dp_packet > *packet, bool steal, > > nl_attr_get_size(subactions), dp_execute_action); > > } > > > > +static void > > +odp_execute_clone(void *dp, struct dp_packet *packet, bool steal, > > + const struct nlattr *action, > > + odp_execute_cb dp_execute_action) > > +{ > > + const struct nlattr *subactions = NULL; > > + const struct nlattr *a; > > + struct dp_packet_batch pb; > > + size_t left; > > + > > + NL_NESTED_FOR_EACH_UNSAFE (a, left, action) { > > + int type = nl_attr_type(a); > > + > > + switch ((enum ovs_clone_attr) type) { > > + case OVS_CLONE_ATTR_ACTIONS: > > + subactions = a; > > + break; > > + > > + case OVS_CLONE_ATTR_UNSPEC: > > + case __OVS_CLONE_ATTR_MAX: > > + default: > > + OVS_NOT_REACHED(); > > + } > > + } > > + > > + if (!subactions) { > > + return; > > + } > > + > > + if (!steal) { > > + /* The 'subactions' may modify the packet, but the modification > > + * should not propagate beyond this clone action. Make a copy > > + * the packet in case we don't own the packet, so that the > > + * 'subactions' are only applid to the clone. > 'odp_execute_actions' > > + * will free the clone. */ > > + packet = dp_packet_clone(packet); > > + } > > + packet_batch_init_packet(&pb, packet); > > + odp_execute_actions(dp, &pb, true, nl_attr_get(subactions), > > + nl_attr_get_size(subactions), > dp_execute_action); > > +} > > + > > static bool > > requires_datapath_assistance(const struct nlattr *a) > > { > > @@ -552,6 +594,7 @@ requires_datapath_assistance(const struct nlattr *a) > > case OVS_ACTION_ATTR_PUSH_MPLS: > > case OVS_ACTION_ATTR_POP_MPLS: > > case OVS_ACTION_ATTR_TRUNC: > > + case OVS_ACTION_ATTR_CLONE: > > return false; > > > > case OVS_ACTION_ATTR_UNSPEC: > > @@ -685,6 +728,19 @@ odp_execute_actions(void *dp, struct > dp_packet_batch *batch, bool steal, > > break; > > } > > > > + case OVS_ACTION_ATTR_CLONE: > > + for (i = 0; i < cnt; i++) { > > + odp_execute_clone(dp, packets[i], steal && last_action, > a, > > + dp_execute_action); > > + } > > + > > + if (last_action) { > > + /* We do not need to free the packets. > odp_execute_clone() has > > + * stolen them. */ > > + return; > > + } > > + break; > > + > > case OVS_ACTION_ATTR_OUTPUT: > > case OVS_ACTION_ATTR_TUNNEL_PUSH: > > case OVS_ACTION_ATTR_TUNNEL_POP: > > diff --git a/lib/odp-util.c b/lib/odp-util.c > > index 1e70e3a..e6d3deb 100644 > > --- a/lib/odp-util.c > > +++ b/lib/odp-util.c > > @@ -121,6 +121,7 @@ odp_action_len(uint16_t type) > > case OVS_ACTION_ATTR_SET_MASKED: return ATTR_LEN_VARIABLE; > > case OVS_ACTION_ATTR_SAMPLE: return ATTR_LEN_VARIABLE; > > case OVS_ACTION_ATTR_CT: return ATTR_LEN_VARIABLE; > > + case OVS_ACTION_ATTR_CLONE: return ATTR_LEN_VARIABLE; > > > > case OVS_ACTION_ATTR_UNSPEC: > > case __OVS_ACTION_ATTR_MAX: > > @@ -222,6 +223,31 @@ format_odp_sample_action(struct ds *ds, const > struct nlattr *attr) > > ds_put_format(ds, "))"); > > } > > > > +static void > > +format_odp_clone_action(struct ds *ds, const struct nlattr *attr) > > +{ > > + static const struct nl_policy ovs_clone_policy[] = { > > + [OVS_CLONE_ATTR_ACTIONS] = { .type = NL_A_NESTED } > > + }; > > + > > + struct nlattr *a[ARRAY_SIZE(ovs_clone_policy)]; > > + const struct nlattr *nla_acts; > > + int len; > > + > > + ds_put_cstr(ds, "clone"); > > + > > + if (!nl_parse_nested(attr, ovs_clone_policy, a, ARRAY_SIZE(a))) { > > + ds_put_cstr(ds, "(error)"); > > + return; > > + } > > + > > + nla_acts = nl_attr_get(a[OVS_CLONE_ATTR_ACTIONS]); > > + len = nl_attr_get_size(a[OVS_CLONE_ATTR_ACTIONS]); > > + ds_put_format(ds, "("); > > + format_odp_actions(ds, nla_acts, len); > > + ds_put_format(ds, ")"); > > +} > > + > > static const char * > > slow_path_reason_to_string(uint32_t reason) > > { > > @@ -865,6 +891,9 @@ format_odp_action(struct ds *ds, const struct nlattr > *a) > > case OVS_ACTION_ATTR_CT: > > format_odp_conntrack_action(ds, a); > > break; > > + case OVS_ACTION_ATTR_CLONE: > > + format_odp_clone_action(ds, a); > > + break; > > case OVS_ACTION_ATTR_UNSPEC: > > case __OVS_ACTION_ATTR_MAX: > > default: > > diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c > > index 37992b4..e4ae760 100644 > > --- a/ofproto/ofproto-dpif-sflow.c > > +++ b/ofproto/ofproto-dpif-sflow.c > > @@ -1162,6 +1162,7 @@ dpif_sflow_read_actions(const struct flow *flow, > > break; > > } > > case OVS_ACTION_ATTR_SAMPLE: > > + case OVS_ACTION_ATTR_CLONE: > > case OVS_ACTION_ATTR_UNSPEC: > > case __OVS_ACTION_ATTR_MAX: > > default: > > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c > > index 3b274ee..fca93a0 100644 > > --- a/ofproto/ofproto-dpif.c > > +++ b/ofproto/ofproto-dpif.c > > @@ -1097,6 +1097,60 @@ check_trunc_action(struct dpif_backer *backer) > > return !error; > > } > > > > +/* Tests whether 'backer''s datapath supports the clone action > > + * OVS_ACTION_ATTR_CLONE. */ > > +static bool > > +check_clone(struct dpif_backer *backer) > > +{ > > + struct dpif_execute execute; > > + struct eth_header *eth; > > + struct flow flow; > > + struct dp_packet packet; > > + struct ofpbuf actions; > > + size_t clone_start, action_start; > > + int error; > > + > > + /* Compose clone(output:1). */ > > + ofpbuf_init(&actions, 64); > > + clone_start = nl_msg_start_nested(&actions, OVS_ACTION_ATTR_CLONE); > > + action_start = nl_msg_start_nested(&actions, > OVS_CLONE_ATTR_ACTIONS); > > + nl_msg_put_odp_port(&actions, OVS_ACTION_ATTR_OUTPUT, > u32_to_odp(1)); > > Should we actually output a packet if the probe is successful? I see that > the truncate probe does that as well, but in both cases I think it would be > safer to skip the output action. > I was wondering about the same thing looking at the probe for trunc. I will add a separate patch to fix this. > > See e.g. the check_masked_set_action(). Also, there is a lot of repeated > code in checking support for different actions, maybe it is time to factor > out the common part like was done earlier for the flow key attributes via > dpif_probe_feature()? > > I agree, but may be this can be addressed in a follow up patch as an enhancement. > + nl_msg_end_nested(&actions, action_start); > > + nl_msg_end_nested(&actions, clone_start); > > + > > As said above, I don’t think the double nesting is necessary. O.K. I will drop it. > > > + /* Compose a dummy Ethernet packet. */ > > + dp_packet_init(&packet, ETH_HEADER_LEN); > > + eth = dp_packet_put_zeros(&packet, ETH_HEADER_LEN); > > + eth->eth_type = htons(0x1234); > > + > > + flow_extract(&packet, &flow); > > + > > + /* Execute the actions. On older datapaths this fails with EINVAL, > on > > + * newer datapaths it succeeds. */ > > + execute.actions = actions.data; > > + execute.actions_len = actions.size; > > + execute.packet = &packet; > > + execute.flow = &flow; > > + execute.needs_help = false; > > + execute.probe = true; > > + execute.mtu = 0; > > + > > + error = dpif_execute(backer->dpif, &execute); > > + > > + dp_packet_uninit(&packet); > > + ofpbuf_uninit(&actions); > > + > > + if (error) { > > + VLOG_INFO("%s: Datapath does not support clone action", > > + dpif_name(backer->dpif)); > > + } else { > > + VLOG_INFO("%s: Datapath supports clone action", > > + dpif_name(backer->dpif)); > > + } > > + > > + return !error; > > +} > > + > > #define CHECK_FEATURE__(NAME, SUPPORT, FIELD, VALUE) > \ > > static bool > \ > > check_##NAME(struct dpif_backer *backer) > \ > > @@ -1145,13 +1199,16 @@ check_support(struct dpif_backer *backer) > > /* This feature needs to be tested after udpif threads are set. */ > > backer->support.variable_length_userdata = false; > > > > + /* Actions. */ > > backer->support.odp.recirc = check_recirc(backer); > > backer->support.odp.max_mpls_depth = check_max_mpls_depth(backer); > > backer->support.masked_set_action = check_masked_set_action(backer); > > backer->support.trunc = check_trunc_action(backer); > > backer->support.ufid = check_ufid(backer); > > backer->support.tnl_push_pop = dpif_supports_tnl_push_pop( > backer->dpif); > > + backer->support.clone = check_clone(backer); > > > > + /* Flow fields. */ > > backer->support.odp.ct_state = check_ct_state(backer); > > backer->support.odp.ct_zone = check_ct_zone(backer); > > backer->support.odp.ct_mark = check_ct_mark(backer); > > diff --git a/ofproto/ofproto-dpif.h b/ofproto/ofproto-dpif.h > > index c4f7baa..a3f9753 100644 > > --- a/ofproto/ofproto-dpif.h > > +++ b/ofproto/ofproto-dpif.h > > @@ -165,6 +165,9 @@ struct dpif_backer_support { > > > > /* Each member represents support for related OVS_KEY_ATTR_* fields. > */ > > struct odp_support odp; > > + > > + /* Ture if the datapath supports OVS_ACTION_ATTR_CLONE action. */ > > “Ture”->”True” > > > + bool clone; > > }; > > > > /* Reasons that we might need to revalidate every datapath flow, and > > -- > > 2.7.4 > > > > _______________________________________________ > > dev mailing list > > [email protected] > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
