On Wed, Oct 14, 2020 at 5:51 PM Dumitru Ceara <[email protected]> wrote: > > On 10/14/20 2:13 PM, Numan Siddique wrote: > > On Wed, Oct 14, 2020 at 5:34 PM Dumitru Ceara <[email protected]> wrote: > >> > >> On 10/14/20 1:54 PM, Numan Siddique wrote: > >>> On Wed, Oct 14, 2020 at 4:40 PM Dumitru Ceara <[email protected]> wrote: > >>>> > >>>> On 10/14/20 12:50 PM, Numan Siddique wrote: > >>>>> On Wed, Oct 14, 2020 at 3:50 PM Dumitru Ceara <[email protected]> wrote: > >>>>>> > >>>>>> On 10/14/20 11:15 AM, [email protected] wrote: > >>>>>>> From: Numan Siddique <[email protected]> > >>>>>>> > >>>>>>> Thee new fields are version independent and these can be used in any > >>>>>> > >>>>>> Hi Numan, > >>>>>> > >>>>>> Nit: s/Thee/These > >>>>>> > >>>>>>> OVN action. Right now the usage of these fields are restricted to > >>>>>>> exchanging the IP source and destination fields. > >>>>>>> > >>>>>>> Eg. reject {... ip.src <-> ip.dst ... }; > >>>>>>> > >>>>>>> "ip.src <-> ip.dst" translates to controller action with "pause" flag > >>>>>>> set. > >>>>>>> When pinctrl thread receives the packet in, it checks the IP version > >>>>>>> of the > >>>>>>> packet and either does - "ip4.src <-> ip4.dst" or "ip6.src <-> > >>>>>>> ip6.dst" and > >>>>>>> resumes the packet to continue with the pipeline. > >>>>>>> > >>>>>>> Upcoming patch will make use of these fields. > >>>>>>> > >>>>>>> Signed-off-by: Numan Siddique <[email protected]> > >>>>>>> --- > >>>>>>> controller/pinctrl.c | 49 +++++++++++++++ > >>>>>>> include/ovn/actions.h | 4 ++ > >>>>>>> include/ovn/logical-fields.h | 18 ++++++ > >>>>>>> lib/actions.c | 118 > >>>>>>> +++++++++++++++++++++++++++++++++++ > >>>>>>> lib/logical-fields.c | 10 +++ > >>>>>>> ovn-sb.xml | 37 +++++++++++ > >>>>>>> tests/ovn.at | 27 ++++++++ > >>>>>>> utilities/ovn-trace.c | 28 +++++++++ > >>>>>>> 8 files changed, 291 insertions(+) > >>>>>>> > >>>>>>> diff --git a/controller/pinctrl.c b/controller/pinctrl.c > >>>>>>> index 631d058458..bc16a82404 100644 > >>>>>>> --- a/controller/pinctrl.c > >>>>>>> +++ b/controller/pinctrl.c > >>>>>>> @@ -233,6 +233,11 @@ static void > >>>>>>> pinctrl_handle_put_icmp_frag_mtu(struct rconn *swconn, > >>>>>>> struct > >>>>>>> ofputil_packet_in *pin, > >>>>>>> struct ofpbuf *userdata, > >>>>>>> struct ofpbuf > >>>>>>> *continuation); > >>>>>>> +static void pinctrl_handle_swap_src_dst_ip(struct rconn *swconn, > >>>>>>> + const struct flow > >>>>>>> *in_flow, > >>>>>>> + struct dp_packet *pkt_in, > >>>>>>> + struct ofputil_packet_in > >>>>>>> *pin, > >>>>>>> + struct ofpbuf > >>>>>>> *continuation); > >>>>>>> static void > >>>>>>> pinctrl_handle_event(struct ofpbuf *userdata) > >>>>>>> OVS_REQUIRES(pinctrl_mutex); > >>>>>>> @@ -2835,6 +2840,11 @@ process_packet_in(struct rconn *swconn, const > >>>>>>> struct ofp_header *msg) > >>>>>>> ovs_mutex_unlock(&pinctrl_mutex); > >>>>>>> break; > >>>>>>> > >>>>>>> + case ACTION_OPCODE_SWAP_SRC_DST_IP: > >>>>>>> + pinctrl_handle_swap_src_dst_ip(swconn, &headers, &packet, > >>>>>>> &pin, > >>>>>>> + &continuation); > >>>>>>> + break; > >>>>>>> + > >>>>>>> default: > >>>>>>> VLOG_WARN_RL(&rl, "unrecognized packet-in opcode %"PRIu32, > >>>>>>> ntohl(ah->opcode)); > >>>>>>> @@ -6508,3 +6518,42 @@ pinctrl_handle_svc_check(struct rconn *swconn, > >>>>>>> const struct flow *ip_flow, > >>>>>>> svc_mon->next_send_time = time_msec() + svc_mon->interval; > >>>>>>> } > >>>>>>> } > >>>>>>> + > >>>>>>> +static void > >>>>>>> +pinctrl_handle_swap_src_dst_ip(struct rconn *swconn, > >>>>>>> + const struct flow *in_flow, > >>>>>>> + struct dp_packet *pkt_in, > >>>>>>> + struct ofputil_packet_in *pin, > >>>>>>> + struct ofpbuf *continuation) > >>>>>>> +{ > >>>>>>> + enum ofp_version version = rconn_get_version(swconn); > >>>>>>> + enum ofputil_protocol proto = > >>>>>>> ofputil_protocol_from_ofp_version(version); > >>>>>>> + struct dp_packet *pkt_out; > >>>>>>> + > >>>>>>> + pkt_out = dp_packet_clone(pkt_in); > >>>>>>> + pkt_out->l2_5_ofs = pkt_in->l2_5_ofs; > >>>>>>> + pkt_out->l2_pad_size = pkt_in->l2_pad_size; > >>>>>>> + pkt_out->l3_ofs = pkt_in->l3_ofs; > >>>>>>> + pkt_out->l4_ofs = pkt_in->l4_ofs; > >>>>>>> + > >>>>>>> + if (get_dl_type(in_flow) == htons(ETH_TYPE_IP)) { > >>>>>>> + /* IPv4 packet. Swap nw_src with nw_dst. */ > >>>>>>> + packet_set_ipv4(pkt_out, in_flow->nw_dst, in_flow->nw_src, > >>>>>>> + in_flow->nw_tos, in_flow->nw_ttl); > >>>>>>> + } else { > >>>>>>> + /* IPv6 packet. Swap ip6_src with ip6_dst. > >>>>>>> + * We could also use packet_set_ipv6() here, but that would > >>>>>>> require > >>>>>>> + * to extract the 'tc' and 'label' from in_nh->ip6_flow > >>>>>>> which seems > >>>>>>> + * unnecessary. */ > >>>>>>> + struct ovs_16aligned_ip6_hdr *out_nh = dp_packet_l3(pkt_out); > >>>>>>> + union ovs_16aligned_in6_addr tmp = out_nh->ip6_src; > >>>>>>> + out_nh->ip6_src = out_nh->ip6_dst; > >>>>>>> + out_nh->ip6_dst = tmp; > >>>>>>> + } > >>>>>>> + > >>>>>>> + pin->packet = dp_packet_data(pkt_out); > >>>>>>> + pin->packet_len = dp_packet_size(pkt_out); > >>>>>>> + > >>>>>>> + queue_msg(swconn, ofputil_encode_resume(pin, continuation, > >>>>>>> proto)); > >>>>>>> + dp_packet_delete(pkt_out); > >>>>>>> +} > >>>>>>> diff --git a/include/ovn/actions.h b/include/ovn/actions.h > >>>>>>> index b4e5acabb9..bf1fe848b7 100644 > >>>>>>> --- a/include/ovn/actions.h > >>>>>>> +++ b/include/ovn/actions.h > >>>>>>> @@ -97,6 +97,7 @@ struct ovn_extend_table; > >>>>>>> OVNACT(DHCP6_REPLY, ovnact_null) \ > >>>>>>> OVNACT(ICMP6_ERROR, ovnact_nest) \ > >>>>>>> OVNACT(REJECT, ovnact_nest) \ > >>>>>>> + OVNACT(OVNFIELD_EXCHANGE, ovnact_move) \ > >>>>>>> > >>>>>>> /* enum ovnact_type, with a member OVNACT_<ENUM> for each action. */ > >>>>>>> enum OVS_PACKED_ENUM ovnact_type { > >>>>>>> @@ -612,6 +613,9 @@ enum action_opcode { > >>>>>>> * The actions, in OpenFlow 1.3 format, follow the action_header. > >>>>>>> */ > >>>>>>> ACTION_OPCODE_REJECT, > >>>>>>> + > >>>>>>> + /* ip.src <-> ip.dst */ > >>>>>>> + ACTION_OPCODE_SWAP_SRC_DST_IP, > >>>>>>> }; > >>>>>>> > >>>>>>> /* Header. */ > >>>>>>> diff --git a/include/ovn/logical-fields.h > >>>>>>> b/include/ovn/logical-fields.h > >>>>>>> index ac6f2f909b..bb6daa50ca 100644 > >>>>>>> --- a/include/ovn/logical-fields.h > >>>>>>> +++ b/include/ovn/logical-fields.h > >>>>>>> @@ -116,6 +116,24 @@ enum ovn_field_id { > >>>>>>> */ > >>>>>>> OVN_ICMP6_FRAG_MTU, > >>>>>>> > >>>>>>> + /* > >>>>>>> + * Name: "ip.src" > >>>>>>> + * Type: be128 > >>>>>>> + * Description: Sets the field MFF_IPV4_SRC if eth.type == 0x800 > >>>>>>> (IPv4) > >>>>>>> + * or sets the field MFF_IPV6_SRC if > >>>>>>> + * eth.type == 0x86dd (IPV6). > >>>>>>> + */ > >>>>>>> + OVN_IP_SRC, > >>>>>>> + > >>>>>>> + /* > >>>>>>> + * Name: "ip.dst" > >>>>>>> + * Type: be128 > >>>>>>> + * Description: Sets the field MFF_IPV4_DST if eth.type == 0x800 > >>>>>>> (IPv4) > >>>>>>> + * or sets the field MFF_IPV6_DST if > >>>>>>> + * eth.type == 0x86dd (IPV6). > >>>>>>> + */ > >>>>>>> + OVN_IP_DST, > >>>>>>> + > >>>>>>> OVN_FIELD_N_IDS > >>>>>>> }; > >>>>>>> > >>>>>>> diff --git a/lib/actions.c b/lib/actions.c > >>>>>>> index 1e1bdeff24..e9c77d2a0a 100644 > >>>>>>> --- a/lib/actions.c > >>>>>>> +++ b/lib/actions.c > >>>>>>> @@ -371,12 +371,35 @@ ovnact_next_free(struct ovnact_next *a > >>>>>>> OVS_UNUSED) > >>>>>>> { > >>>>>>> } > >>>>>>> > >>>>>>> +static bool > >>>>>>> +check_ovnfield_load(struct action_context *ctx, const struct > >>>>>>> expr_field *field) > >>>>>>> +{ > >>>>>>> + switch (field->symbol->ovn_field->id) { > >>>>>>> + case OVN_ICMP4_FRAG_MTU: > >>>>>>> + case OVN_ICMP6_FRAG_MTU: > >>>>>>> + return true; > >>>>>>> + > >>>>>>> + case OVN_IP_SRC: > >>>>>>> + case OVN_IP_DST: > >>>>>>> + lexer_error(ctx->lexer, "Can't load a value to ovn field > >>>>>>> (%s).", > >>>>>>> + field->symbol->name); > >>>>>>> + return false; > >>>>>>> + > >>>>>>> + case OVN_FIELD_N_IDS: > >>>>>>> + default: > >>>>>>> + OVS_NOT_REACHED(); > >>>>>>> + } > >>>>>>> +} > >>>>>>> + > >>>>>>> static void > >>>>>>> parse_LOAD(struct action_context *ctx, const struct expr_field *lhs) > >>>>>>> { > >>>>>>> size_t ofs = ctx->ovnacts->size; > >>>>>>> struct ovnact_load *load; > >>>>>>> if (lhs->symbol->ovn_field) { > >>>>>>> + if (!check_ovnfield_load(ctx, lhs)) { > >>>>>>> + return; > >>>>>>> + } > >>>>>>> load = ovnact_put_OVNFIELD_LOAD(ctx->ovnacts); > >>>>>>> } else { > >>>>>>> load = ovnact_put_LOAD(ctx->ovnacts); > >>>>>>> @@ -474,6 +497,46 @@ format_EXCHANGE(const struct ovnact_move *move, > >>>>>>> struct ds *s) > >>>>>>> format_assignment(move, "<->", s); > >>>>>>> } > >>>>>>> > >>>>>>> +static void > >>>>>>> +parse_ovnfield_exchange(struct action_context *ctx, > >>>>>>> + const struct expr_field *lhs, > >>>>>>> + const struct expr_field *rhs) > >>>>>>> +{ > >>>>>>> + if (!lhs->symbol->ovn_field || !rhs->symbol->ovn_field) { > >>>>>>> + lexer_error(ctx->lexer, > >>>>>>> + "Can't exchange ovn field with non ovn field (%s > >>>>>>> <-> %s).", > >>>>>>> + lhs->symbol->name, rhs->symbol->name); > >>>>>>> + return; > >>>>>>> + } > >>>>>>> + > >>>>>>> + if (lhs->symbol->ovn_field->id != OVN_IP_SRC && > >>>>>>> + lhs->symbol->ovn_field->id != OVN_IP_DST) { > >>>>>>> + lexer_error(ctx->lexer, > >>>>>>> + "Can't exchange ovn field (%s) with ovn field > >>>>>>> (%s).", > >>>>>>> + lhs->symbol->name, rhs->symbol->name); > >>>>>>> + return; > >>>>>>> + } > >>>>>>> + > >>>>>>> + if (rhs->symbol->ovn_field->id != OVN_IP_SRC && > >>>>>>> + rhs->symbol->ovn_field->id != OVN_IP_DST) { > >>>>>>> + lexer_error(ctx->lexer, > >>>>>>> + "Can't exchange ovn field (%s) with ovn field > >>>>>>> (%s).", > >>>>>>> + lhs->symbol->name, rhs->symbol->name); > >>>>>>> + return; > >>>>>>> + } > >>>>>>> + > >>>>>>> + if (lhs->symbol->ovn_field->id == rhs->symbol->ovn_field->id) { > >>>>>>> + lexer_error(ctx->lexer, > >>>>>>> + "Can't exchange ovn field (%s) with ovn field > >>>>>>> (%s).", > >>>>>>> + lhs->symbol->name, rhs->symbol->name); > >>>>>>> + return; > >>>>>>> + } > >>>>>>> + > >>>>>>> + struct ovnact_move *move = > >>>>>>> ovnact_put_OVNFIELD_EXCHANGE(ctx->ovnacts); > >>>>>>> + move->lhs = *lhs; > >>>>>>> + move->rhs = *rhs; > >>>>>>> +} > >>>>>>> + > >>>>>>> static void > >>>>>>> parse_assignment_action(struct action_context *ctx, bool exchange, > >>>>>>> const struct expr_field *lhs) > >>>>>>> @@ -483,6 +546,11 @@ parse_assignment_action(struct action_context > >>>>>>> *ctx, bool exchange, > >>>>>>> return; > >>>>>>> } > >>>>>>> > >>>>>>> + if (exchange && (lhs->symbol->ovn_field || > >>>>>>> rhs.symbol->ovn_field)) { > >>>>>>> + parse_ovnfield_exchange(ctx, lhs, &rhs); > >>>>>>> + return; > >>>>>>> + } > >>>>>>> + > >>>>>>> const struct expr_symbol *ls = lhs->symbol; > >>>>>>> const struct expr_symbol *rs = rhs.symbol; > >>>>>>> if ((ls->width != 0) != (rs->width != 0)) { > >>>>>>> @@ -3128,6 +3196,8 @@ format_OVNFIELD_LOAD(const struct ovnact_load > >>>>>>> *load , struct ds *s) > >>>>>>> ntohs(load->imm.value.be16_int)); > >>>>>>> break; > >>>>>>> > >>>>>>> + case OVN_IP_SRC: > >>>>>>> + case OVN_IP_DST: > >>>>>>> case OVN_FIELD_N_IDS: > >>>>>>> default: > >>>>>>> OVS_NOT_REACHED(); > >>>>>>> @@ -3157,6 +3227,9 @@ encode_OVNFIELD_LOAD(const struct ovnact_load > >>>>>>> *load, > >>>>>>> encode_finish_controller_op(oc_offset, ofpacts); > >>>>>>> break; > >>>>>>> } > >>>>>>> + > >>>>>>> + case OVN_IP_SRC: > >>>>>>> + case OVN_IP_DST: > >>>>>>> case OVN_FIELD_N_IDS: > >>>>>>> default: > >>>>>>> OVS_NOT_REACHED(); > >>>>>>> @@ -3451,6 +3524,51 @@ ovnact_fwd_group_free(struct ovnact_fwd_group > >>>>>>> *fwd_group) > >>>>>>> free(fwd_group->child_ports); > >>>>>>> } > >>>>>>> > >>>>>>> +static void > >>>>>>> +format_OVNFIELD_EXCHANGE(const struct ovnact_move *move , struct ds > >>>>>>> *s) > >>>>>>> +{ > >>>>>>> + const struct ovn_field *lhs = > >>>>>>> ovn_field_from_name(move->lhs.symbol->name); > >>>>>>> + const struct ovn_field *rhs = > >>>>>>> ovn_field_from_name(move->rhs.symbol->name); > >>>>>>> + switch (lhs->id) { > >>>>>>> + case OVN_IP_SRC: > >>>>>>> + case OVN_IP_DST: > >>>>>>> + break; > >>>>>>> + > >>>>>>> + case OVN_ICMP4_FRAG_MTU: > >>>>>>> + case OVN_ICMP6_FRAG_MTU: > >>>>>>> + case OVN_FIELD_N_IDS: > >>>>>>> + default: > >>>>>>> + OVS_NOT_REACHED(); > >>>>>>> + } > >>>>>> > >>>>>> Nit: I would use the shorter: > >>>>>> ovs_asssert(lhs->id == OVN_IP_SRC || lhs->id == OVN_IP_DST); > >>>>> > >>>>> You mean to drop the switch right ? If so, Ack. > >>>>> > >>>> > >>>> Right, I meant ovs_assert instead of switch. > >>>> > >>>>> > >>>>>> > >>>>>>> + > >>>>>>> + switch (rhs->id) { > >>>>>>> + case OVN_IP_SRC: > >>>>>>> + case OVN_IP_DST: > >>>>>>> + break; > >>>>>>> + > >>>>>>> + case OVN_ICMP4_FRAG_MTU: > >>>>>>> + case OVN_ICMP6_FRAG_MTU: > >>>>>>> + case OVN_FIELD_N_IDS: > >>>>>>> + default: > >>>>>>> + OVS_NOT_REACHED(); > >>>>>>> + } > >>>>>> > >>>>>> Nit: I would use the shorter: > >>>>>> ovs_asssert(rhs->id == OVN_IP_SRC || rhs->id == OVN_IP_DST); > >>>>>> > >>>>>>> + > >>>>>>> + ds_put_format(s, "%s <-> %s;", lhs->name, rhs->name); > >>>>>>> +} > >>>>>>> + > >>>>>>> +static void > >>>>>>> +encode_OVNFIELD_EXCHANGE(const struct ovnact_move *move OVS_UNUSED, > >>>>>>> + const struct ovnact_encode_params *ep > >>>>>>> OVS_UNUSED, > >>>>>>> + struct ofpbuf *ofpacts OVS_UNUSED) > >>>>>>> +{ > >>>>>>> + /* Right now we only support exchanging the IPs in > >>>>>>> + * OVN fields (ip.src <-> ip.dst). */ > >>>>>>> + size_t oc_offset = > >>>>>>> + encode_start_controller_op(ACTION_OPCODE_SWAP_SRC_DST_IP, > >>>>>>> + true, NX_CTLR_NO_METER, ofpacts); > >>>>>>> + encode_finish_controller_op(oc_offset, ofpacts); > >>>>>> > >>>>>> I think this means that all packets matching a flow with action > >>>>>> "reject { ... > >>>>>> ip.src <-> ip.dst ... }" will generate two packet-ins, right? One for > >>>>>> the > >>>>>> reject action, one for the ovnfield-exchange action. > >>>>>> > >>>>>> Is that a concern wrt. performance? > >>>>> > >>>>> Yes. It would have 2 packet-ins. Just like how we handle - icmp4 {... > >>>>> icmp4.frag_mtu = 1500; ...}; > >>>>> > >>>>> If performance is a concern, we could just drop ... ip.src <-> ip.dst > >>>>> in reject and let reject action > >>>>> swap the ip src with ip destination. I thought about that and my take > >>>>> is that since reject { } results in > >>>>> TCP RST or ICMP unreachable packet, it should be fine. I see these > >>>>> packets as more of control packets. > >>>>> > >>>>> I think it should be ok. Let me know what you think. > >>>>> > >>>> > >>>> I think I like the approach of "reject" action implicitly swapping IPs > >>>> best. > >>>> In the end it's not like we could implement reject withough swapping IP > >>>> dst > >>>> with IP src so why not do it as part of the reject action. > >>>> > >>> > >>> Agree. reject action is expected to be used to send the generated tcp > >>> rst/icmp unreachable > >>> packet back to the sender. > >>> > >>> I'm ok to change the reject action to just swap the IPs internally if > >>> everyone is fine. > >>> > >>> @Mark Michelson @Han Zhou Do you have any comments here ? > >>> > >>> Option 1 - reject { ...} -> The reject action handling in pinctrl.c > >>> will swap the ip source and destination > >>> > >>> Option 2: reject {... ip.src <-> ip.dst ...} which is the proposed > >>> approach in this patch series. This results in > >>> 2 packet-ins. > >>> > >>> I personally prefer (2) since in OVN we normally tell what inner > >>> actions to apply for many OVN actions. > >>> But I have no strong preference and I'm fine changing to (1). > >>> > >>> If we chose option (1), we could add a comment inside the action like > >>> - reject { /* ip.src <-> ip.dst is done implicitly*/, eth.src <-> > >>> eth.dst; output ; } > >>> > >> > >> Then, why not swap ETH addresses implicitly too? > > > > As I mentioned above, I'd prefer if ovn-northd says what actions to > > apply for the reject > > packet. Just like for icmp4 or arp actions, ovn-northd says what > > actions to apply for the > > transformed packet from the source packet. > > > > I was thinking that if we go this way when we build the packet in > ovn-controller it might look weird to a reader of the pinctrl code to see > something like: > > /* We only swap IPs because ETH addresses are swapped in OVS. */ > pinctrl_compose_ipv4(&packet, ip_flow->dl_src, ip_flow->dl_dst, > ip_flow->nw_dst, ip_flow->nw_src, ...)
Right. I agree. It makes sense to swap both eth and ip if we go this way. Sorry for the confusion. Thanks Numan > > Regards, > Dumitru > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
