On 18 July 2017 at 01:59, Chandran, Sugesh <[email protected]> wrote:
>
>
> Regards
> _Sugesh
>
>
>> -----Original Message-----
>> From: Joe Stringer [mailto:[email protected]]
>> Sent: Tuesday, July 18, 2017 6:42 AM
>> To: Chandran, Sugesh <[email protected]>
>> Cc: Zoltán Balogh <[email protected]>; ovs dev
>> <[email protected]>; Andy Zhou <[email protected]>
>> Subject: Re: [PATCH v2 4/4] tunneling: Avoid datapath-recirc by combining
>> recirc actions at xlate.
>>
>> On 17 July 2017 at 15:17, Chandran, Sugesh <[email protected]>
>> wrote:
>> > Comments inline
>> >
>> > Regards
>> > _Sugesh
>> >
>> >> -----Original Message-----
>> >> From: Joe Stringer [mailto:[email protected]]
>> >> Sent: Monday, July 17, 2017 7:33 PM
>> >> To: Chandran, Sugesh <[email protected]>
>> >> Cc: Zoltán Balogh <[email protected]>; ovs dev
>> >> <[email protected]>; Andy Zhou <[email protected]>
>> >> Subject: Re: [PATCH v2 4/4] tunneling: Avoid datapath-recirc by
>> >> combining recirc actions at xlate.
>> >>
>> >> On 17 July 2017 at 10:27, Chandran, Sugesh
>> >> <[email protected]>
>> >> wrote:
>> >> >> > -----Original Message-----
>> >> >> > From: Joe Stringer [mailto:[email protected]]
>> >> >> > Sent: Saturday, July 15, 2017 1:46 AM
>> >> >> > To: Sugesh Chandran <[email protected]>
>> >> >> > Cc: ovs dev <[email protected]>; Andy Zhou <[email protected]>;
>> >> >> > Zoltán Balogh <[email protected]>
>> >> >> > Subject: Re: [PATCH v2 4/4] tunneling: Avoid datapath-recirc by
>> >> >> > combining
>> >> >> recirc actions at xlate.
>> >> >> >
>> >> >> > On 4 July 2017 at 02:46, Sugesh Chandran
>> >> >> > <[email protected]>
>> >> >> wrote:
>> >>
>> >> <snip>
>> >>
>> >> >> > > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
>> >> >> > > 4e29085..4d996c1 100644
>> >> >> > > --- a/lib/dpif-netdev.c
>> >> >> > > +++ b/lib/dpif-netdev.c
>> >> >> > > @@ -5028,24 +5028,8 @@ dp_execute_cb(void *aux_, struct
>> >> >> > > dp_packet_batch *packets_,
>> >> >> > >
>> >> >> > > case OVS_ACTION_ATTR_TUNNEL_PUSH:
>> >> >> > > if (*depth < MAX_RECIRC_DEPTH) {
>> >> >> > > - struct dp_packet_batch tnl_pkt;
>> >> >> > > - struct dp_packet_batch *orig_packets_ = packets_;
>> >> >> > > - int err;
>> >> >> > > -
>> >> >> > > - if (!may_steal) {
>> >> >> > > - dp_packet_batch_clone(&tnl_pkt, packets_);
>> >> >> > > - packets_ = &tnl_pkt;
>> >> >> > > - dp_packet_batch_reset_cutlen(orig_packets_);
>> >> >> > > - }
>> >> >> > > -
>> >> >> > > dp_packet_batch_apply_cutlen(packets_);
>> >> >> > > -
>> >> >> > > - err = push_tnl_action(pmd, a, packets_);
>> >> >> > > - if (!err) {
>> >> >> > > - (*depth)++;
>> >> >> > > - dp_netdev_recirculate(pmd, packets_);
>> >> >> > > - (*depth)--;
>> >> >> > > - }
>> >> >> > > + push_tnl_action(pmd, a, packets_);
>> >> >> >
>> >> >> > If I follow, this was the previous datapath-level code to ensure
>> >> >> > that when packets are output to a tunnel, only the copy of the
>> >> >> > packet that goes to the tunnel gets the cutlen applied. With the
>> >> >> > new version of the code, we replace this by making sure that the
>> >> >> > ofproto layer generates a clone to wrap the push_tunnel and all
>> >> >> > subsequent bridge translation, so then it results in the
>> >> >> > equivalent
>> >> >> > behaviour: The cutlen is only ever applied to a clone of the
>> >> >> > packets. Is
>> >> that right?
>> >> > [Sugesh] Hmm. That’s the good catch. Looks like the packets are not
>> >> > cloned when combine_tunnel_action is found impossible in ofproto
>> layer.
>> >> >
>> >> > As a fix we should keep the code block in 'may_steal' to handle the
>> >> > non tunnel-combine case.
>> >>
>> >> Oh, really? I thought that we always end up with a clone wrapping the
>> >> tunnel_push now, which means we shouldn't need this bit any more. My
>> >> comment was intended just to ask if I understood correctly to make
>> >> sure we don't miss anything.
>> >>
>> >> > Any other suggestion to apply cut_len in ofproto?
>> >>
>> >> Unfortunately the cut_len is set up through a previous truncate
>> >> action, and it is supposed to apply to the "next output action" (and
>> >> reset afterwards), so it is part of the datapath actions interface
>> >> definition; I don't think there's a way to push it up to ofproto.
>> > [Sugesh] Or at ofproto layer, the non-combine tunnel push also placed
>> > in a clone, similar to the combine case? That will help to keep the
>> > datapath
>> as simple as just a push.
>>
>> That sounds simpler, but I don't think we can get rid of applying the cut_len
>> just yet. This is related to some other discussions going on around the
>> interaction between truncate() and clone(), so it's probably a bit safer to
>> just
>> leave this code to apply cut len in place for now.
> [Sugesh] I don’t meant to get rid of cut_len. Instead we make sure any
> tunnel-push
> always happen on a clone. We keep the push +cut_len as is in the patch.
> So the ofproto actions would become
> clone(tnl_push), trunc(200) -->Non-combine case. (instead of just [tnl_push,
> trunc(200)]
> clone(tnl_push, output(1)) ---> Combine case
> Do you think it has any other implication that we may missed?
OK, that sounds about right, except that the truncate action is
serialized first - see xlate_output_trunc_action(). Ie, from a
translation perspective it generates trunc(),clone(tnl_push), then
from datapath perspective when executing, the trunc() sets the
cut_len, then the clone(tnl_push) is responsible for cutting the
packet before pushing the header.
I'm not aware of any other implications of this.
>> >> >> > > +static bool
>> >> >> > > +validate_and_combine_post_tnl_actions(struct xlate_ctx *ctx,
>> >> >> > > + const struct xport *xport,
>> >> >> > > + struct xport *out_dev,
>> >> >> > > + struct
>> >> >> > > +ovs_action_push_tnl
>> >> >> > > +tnl_push_data) {
>> >> >> > > + const struct dpif_flow_stats *backup_resubmit_stats;
>> >> >> > > + struct xlate_cache *backup_xcache;
>> >> >> > > + bool nested_act_flag = false;
>> >> >> > > + struct flow_wildcards tmp_flow_wc;
>> >> >> > > + struct flow_wildcards *backup_flow_wc_ptr;
>> >> >> > > + bool backup_side_effects;
>> >> >> > > + const struct dp_packet *backup_pkt;
>> >> >> > > +
>> >> >> > > + memset(&tmp_flow_wc, 0 , sizeof tmp_flow_wc);
>> >> >> > > + backup_flow_wc_ptr = ctx->wc;
>> >> >> > > + ctx->wc = &tmp_flow_wc;
>> >> >> > > + ctx->xin->wc = NULL;
>> >> >> > > + backup_resubmit_stats = ctx->xin->resubmit_stats;
>> >> >> > > + backup_xcache = ctx->xin->xcache;
>> >> >> > > + backup_side_effects = ctx->xin->allow_side_effects;
>> >> >> > > + backup_pkt = ctx->xin->packet;
>> >> >> > > +
>> >> >> > > + size_t push_action_size = 0;
>> >> >> > > + size_t clone_ofs = nl_msg_start_nested(ctx->odp_actions,
>> >> >> > > + OVS_ACTION_ATTR_CLONE);
>> >> >> > > + odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data);
>> >> >> > > + push_action_size = ctx->odp_actions->size;
>> >> >> > > +
>> >> >> > > + ctx->xin->resubmit_stats = NULL;
>> >> >> > > + ctx->xin->xcache = xlate_cache_new(); /* Use new
>> >> >> > > + temporary
>> >> cache.
>> >> >> */
>> >> >> > > + ctx->xin->allow_side_effects = false;
>> >> >> > > + ctx->xin->packet = NULL;
>> >> >> >
>> >> >> > I realised that you asked about this after the previous patch,
>> >> >> > but I didn't understand the implications of it. I believe that
>> >> >> > this packet is how, for instance, the second bridge can forward
>> >> >> > packets to the controller. If you reset it to NULL here, then I
>> >> >> > don't think that the controller action is executed correctly
>> >> >> > during the second bridge translation. (That's a test we could add).
>> >> > [Sugesh] Yes, that’s a good test case for it.
>> >> >> >
>> >> >> > I suspect that some multicast snooping stuff and some other
>> >> >> > pieces that would be affected by this.
>> >> > [Sugesh] This is interesting. So does the controller expects a
>> >> > encaped packets in that case? Do you think, the controller action
>> >> > also be an exception case like the TRUNC?
>> >>
>> >> Yes, I think the controller expects encapped packets in this case.
>> >> Yes, I think that the easiest way to handle this is to put controller
>> >> in that exception case with truncate.
>> > [Sugesh] Ok. Will add that case as an exception.
>> >
>> > Before that, Would like to clarify the controller output action bit
>> > more How does the following case are currently handled that involves
>> > controller action (We haven’t run any test case that involves a
>> > controller and very limited knowledge on that side)
>> >
>> > MOD_VLAN(10) --> (PATCH-PORT) --> (SECOND-BR) What if there is no
>> > rules to handle the packet in 'second-br' and wanted to report to
>> controller?
>> >
>> > Will the packet report with VLAN 10 to controller?
>>
>> Currently, this will work because patch port doesn't NULLify this
>> ctx->xin->packet pointer, then when it translates through the second
>> bridge and gets to execute_controller_action(), that function will make a
>> clone, apply the current actions to that packet, then send the modified
>> packet to the controller.
> [Sugesh] Ok. So all the pre-actions will be applied before
> cloning the packets(applying mod_vlan) using the callback.
Yup, that's right.
>> So, longer term I think that this NULLification should be removed and we
>> keep the same packet attached to the context. However at the moment,
>> from execute_controller_action() ->
>> xlate_execute_odp_actions() -> odp_execute_actions() there is no datapath
>> callback provided for implementing the tunnel_push functionality. Maybe it's
>> as simple as plugging
>> dpif_execute_helper_cb() into the odp_execute_actions() call there, but
> [Sugesh] I think we should offer the callback to do the modification.
> Will work on when this patch series is accepted.
>> that sounds like a separate change that should get some thorough review
>> and testing as well. I don't think it's strictly necessary for getting this
>> series in,
>> this series can function correctly using the fallback path as we discussed
>> above. It'd be nice to see this at some point in future though to consolidate
>> the paths through this code.
> [Sugesh] Will try to do as a separate patch as you suggested.
OK, thanks!
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev