On Wed, Dec 14, 2016 at 9:02 AM, Jarno Rajahalme <[email protected]> wrote:

>
> On Dec 13, 2016, at 3:32 PM, Takashi YAMAMOTO <[email protected]> wrote:
>
>
>
> On Tue, Dec 13, 2016 at 6:49 PM, Thomas Morin <[email protected]>
> wrote:
>
>> Hi Jarno,
>>
>> 2016-12-10, Jarno Rajahalme:
>>
>> On Dec 9, 2016, at 7:04 AM, Thomas Morin <[email protected]> wrote:
>>
>>
>> 2016-12-09, Thomas Morin:
>>
>> In the same setup as the one on which the bug was observed, [...]
>>
>>
>> I was confused, I in fact tested the patch (branch-2.5) on a /different/
>> setup as the one on which we hit the bug, using MPLS over a GRE tunnel
>> port, rather than plain MPLS over an eth port.
>> Sorry if any confusion arised... I can test on the first setup if
>> relevant.
>>
>>
>> Maybe the kernel datapath does not support MPLS over a GRE tunnel port.
>> Having ‘dmesg’ output for the test run might reveal why the actions
>> validation fails.
>>
>>
>> The dmesg output was the following:
>>
>> [171295.258939] openvswitch: netlink: Flow actions may not be safe on all
>> matching packets.
>>
>> I've tested the patch on the platform on which the bug was initially hit
>> (*not* using MPLS/GRE), and I have the following a few times in the logs
>> right after I do an "ovs-appctl fdb/flush":
>>
>> 2016-12-13T09:44:08.449Z|00001|dpif(handler68)|WARN|Dropped 3 log
>> messages in last 1 seconds (most recently, 1 seconds ago) due to excessive
>> rate
>> 2016-12-13T09:44:08.449Z|00002|dpif(handler68)|WARN|system@ovs-system:
>> failed to put[create] (Invalid argument) 
>> ufid:f046c4c4-b97f-436d-bd7c-91ed307275ac
>> recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(9),skb_m
>> ark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/
>> 0),eth(src=fa:16:3e:61:c0:b5,dst=00:00:5e:00:43:64),eth_
>> type(0x0800),ipv4(src=10.0.1.29,dst=10.0.0.3,proto=6,tos=0/
>> 0xfc,ttl=64,frag=no),tcp(src=54253,dst=8080),tcp_flags(0/0),
>> actions:set(ipv4(src=10.0.1.29,dst=10.0.0.3,ttl=63)),set(eth
>> (src=b8:2a:72:de:1b:e3,dst=00:17:cb:79:2c:01)),push_mpls(lab
>> el=433680,tc=0,ttl=63,bos=1,eth_type=0x8847),7,set(eth(src
>> =fa:16:3e:61:c0:b5,dst=00:00:5e:00:43:64)),pop_mpls(eth_typ
>> e=0x800),set(ipv4(src=10.0.1.29,dst=10.0.0.3,tos=0/0xfc,ttl
>> =64)),push_vlan(vid=1,pcp=0),3,8,pop_vlan,13
>>
>> And dmesg:
>> [926833.612372] openvswitch: netlink: Flow actions may not be safe on all
>> matching packets.
>>
>
> it's complaining about set ipv4 after pop_mpls because it doesn't know
> about the "restored" l3.
> while we can improve the kernel, i guess we need to stick with recirc for
> now.
>
>
>
> This should do it? Does not break any existing tests on branch-2.5, but I
> did not create a test for this yet.
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index fb25f63..6ee2a72 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -2899,6 +2899,15 @@ xlate_commit_actions(struct xlate_ctx *ctx)
>  {
>      bool use_masked = ctx->xbridge->support.masked_set_action;
>
> +    /* An MPLS packet can be implicitly popped back to a non-MPLS packet,
> if a
> +     * patch port peer or a group bucket pushed MPLS.  Set the 'was_mpls'
> flag
> +     * also in that case. */
> +    if (eth_type_mpls(ctx->base_flow.dl_type)
> +        && !eth_type_mpls(ctx->xin->flow.dl_type)
> +        && ctx->xbridge->support.odp.recirc) {
> +        ctx->was_mpls = true;
> +    }
> +
>

i guess this check needs to be somewhere around "ctx->was_mpls =
old_was_mpls" in
affected functions.

     ctx->xout->slow |= commit_odp_actions(&ctx->xin->flow, &ctx->base_flow,
>                                            ctx->odp_actions, ctx->wc,
>                                            use_masked);
>
>   Jarno
>
>
>
>>
>> -Thomas
>>
>>
>>
>>
>>
>> On Dec 1, 2016, at 5:57 PM, Jarno Rajahalme <[email protected]> wrote:
>>
>>
>> On Nov 30, 2016, at 8:50 PM, Ben Pfaff <[email protected]> wrote:
>>
>> On Wed, Nov 30, 2016 at 06:58:57PM -0800, Jarno Rajahalme wrote:
>>
>>
>> On Nov 30, 2016, at 8:41 AM, Ben Pfaff <[email protected]> wrote:
>>
>> On Wed, Nov 30, 2016 at 12:05:46PM +0100, Thomas Morin wrote:
>>
>> Hi Ben,
>>
>> 2016-11-30, Ben Pfaff:
>>
>> Do you have any idea what in your OpenFlow pipeline might do that,
>> i.e. is there anything especially tricky in the OpenFlow flows?
>>
>> Are you willing to show us your OpenFlow flow table?
>>
>>
>> The setup involves three OVS bridges connected with patch-ports: br-int --
>> br-tun -- br-mpls, with the traffic that triggers the assert being
>> processed
>> by br-int with a NORMAL action (ie. MAC learning).
>>
>> The flows in this setup aren't particularly tricky, I think, although I'm
>> not sure what qualifies as tricky or non-tricky :)
>>
>> Anyway, since yesterday I managed to identify the event that trigger the
>> assert, by adding more logging before the assert and displaying the
>> actions
>> taken:
>>
>> 2016-11-29T14:44:40.126Z|00001|odp_util(revalidator45)|WARN|
>> commit_set_ipv4_action
>> assert would fail....
>> 2016-11-29T14:44:40.126Z|00002|odp_util(revalidator45)|WARN|  base_flow:
>> ip,in_port=5,dl_vlan=3,dl_vlan_pcp=0,dl_src=fa:16:3e:33:f7:
>> fe,dl_dst=00:00:5e:00:43:64,nw_src=0.0.0.0,nw_dst=0.0.0.0,
>> nw_proto=0,nw_tos=0,nw_ecn=0,nw_ttl=0
>> 2016-11-29T14:44:40.126Z|00003|odp_util(revalidator45)|WARN|  flow:
>> tcp,in_port=5,dl_vlan=3,dl_vlan_pcp=0,dl_src=fa:16:3e:33:f7:
>> fe,dl_dst=00:00:5e:00:43:64,nw_src=10.0.1.22,nw_dst=10.0.
>> 0.3,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=53295,tp_dst=8080,
>> tcp_flags=psh|ack
>> 2016-11-29T14:44:40.126Z|00004|odp_util(revalidator45)|WARN|  masks:
>> recirc_id=0xffffffff,reg0=0xffffffff,in_port=4294967295,dl_
>> vlan=4095,dl_vlan_pcp=7,dl_src=ff:ff:ff:ff:ff:ff,dl_dst=ff:
>> ff:ff:ff:ff:ff,dl_type=0xffff
>> 2016-11-29T14:44:40.126Z|00005|odp_util(revalidator45)|WARN|  actions:
>> set(ipv4(src=10.0.1.22,dst=10.0.0.3,ttl=63)),set(eth(src=b8:
>> 2a:72:de:1b:e3,dst=00:17:cb:79:2c:01)),push_mpls(label=41038
>> 4,tc=0,ttl=63,bos=1,eth_type=0x8847),9,set(eth(src=fa:16:
>> 3e:33:f7:fe,dst=00:00:5e:00:43:64)),pop_mpls(eth_type=0x80
>> 0),push_vlan(vid=3,pcp=0),1
>>
>>
>> push_mpls clears L3/L4, while pop_mpls re-populates them, and then
>> processing the output to port 1 hits the assert?
>>
>>
>> That's what I'm thinking too.
>>
>> Jarno, is this something you have time to look into?  It'd be great, if
>> you do.  I'm way behind.
>>
>>
>> I’m looking at this.
>>
>> Based on the trace given it seems that:
>> 1. Packet is received on br-int port 32, which outputs it via NORMAL
>> action over a patch port to another bridge. The only patch-port on br-int
>> is 2 (patch-tun). The NORMAL action adds dl_vlan=1.
>> 2. br-tun receives the packet on in_port 1 (patch-int), and outputs it on
>> it’s port 2 (patch-to-mpls)
>> 3. br-mpls receives the packet on it’s in_port 2 (patch-to-tun), does
>> pop_vlan, and outputs to it’s port 21 (ipvpn-pp-out), which is also an
>> patch port.
>> 4. br-mpls (?) receives the packet on it’s in_port 20 (ipvpn-pp-in), does
>> dec_ttl,push_mpls:0x8847,load:0x644c0->OXM_OF_MPLS_LABEL[],s
>> et_field:b8:2a:72:de:1b:e3->eth_src,set_field:00:17:cb:79:2c
>> :01->eth_dst,output:1
>>
>> All this generates a megaflow: set(ipv4(src=10.0.1.23,dst=10.
>> 0.0.3,ttl=63)),set(eth(src=b8:2a:72:de:1b:e3,dst=00:17:cb:79
>> :2c:01)),push_mpls(label=410816,tc=0,ttl=63,bos=1,eth_type=0x8847),9
>>
>> This is only the beginning part of the trouble-some megaflow, in which
>> br-int sends the packet also to another port (vlan 3), and as part of that
>> pops the MPLS and restores the original ethernet addresses. Maybe this
>> would happen with the trace too, if you flushed MACs before the trace?
>>
>> The patch ports 21 and 20 appear to be in the same bridge and patched to
>> each other. Is this the case?
>>
>> The crashing megaflow has in_port=5,dl_vlan=3. Is this also on br-int?
>>
>> Also, OVS 2.6 is a little bit less aggressive about avoiding
>> recirculation after mpls operations, and I’d be interested to know if your
>> case fails the same way with OVS 2.6?
>>
>> Thanks,
>>
>>   Jarno
>>
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to