Re: [ovs-dev] [PATCH] dpif-netdev: Fix crash when PACKET_OUT is metered

Ilya Maximets Wed, 11 Aug 2021 15:51:55 -0700

Hi.

It's been some time since the last email in this thread.  I spent more
time debugging and analyzing the code and, I think, I got to the bottom
of this.  First of all, I had a question why the set of actions in that
particular unit test in OVN needs help for execution, so it hits the
dpif_execute_with_help() path?  The answer is the action set that looks
like this:

actions:set(eth(dst=ff:ff:ff:ff:ff:ff)),set(arp(sip=172.16.1.1)),
        clone(tnl_push(tnl_port(6081),...,out_port(1)),101)

Here the 'set(arp())' action is always marked as SLOW_ACTION and leads
to 'needs_help == true'.  Changes of arp fields is not supported by
kernel.  We may argue that userspace datapath supports it and we should
be able to execute these actions inside the datapath avoiding the
slow path execution process, but there are other cases where slow path
execution can be requested.  One of the most obvious is a direct request
of OFPACT_DEBUG_SLOW.  So, it's hard to completely avoid slow path
execution even for a userspace datapath.

Anyway, dpif_execute_with_help() or, more precisely, the callback
dpif_execute_helper_cb() seems to be the source of all issues.  The
overall design of the actions execution seems to be flawed here.
This function makes several opposite assumptions:

1. It assumes that dpif_execute() can't modify packets and that
   there is no way to get result of the execution from the datapath.

2. It assumes that dpif_execute() can and will modify the packet.

Assumption 1 leads to a workaround for 'meter' actions which are
accumulated from several calls and pushed along with the first terminal
action (output, userspace or ct).  This assumption is true for the
kernel datapath, because the packet basically gets copied to the kernel.

Assumption 2 at the same time allows to implement tunneling for the
userpsace datapath.  dpif_execute_helper_cb() relies on the fact
that dpif_execute(OVS_ACTION_ATTR_TUNNEL_PUSH) will modify the packet
by pushing the tunnel header.

So, this inconsistent behavior for different actions breaks the
dpif abstraction and doesn't allow us to correctly implement the
dpif_execute() interface consistently across different datapath
interfaces.

Even though I see what is going on here, I don't currently see a way
to fix this architectural/design flaw.  Definitely packet reference
counters will still help in correct tracking of dp_packets and we
need that.  I know that Aaron started working on this, but it will
take a while before we will have a working solution.  I'll try
to think how to actually re-work the actions execution path so we
will not have this kind of self-contradicting functions and APIs
(these inconsistencies are really the main thing that complicates
the code and doesn't allow to understand it without a deep dive).
If someone has ideas on that front, I'd like to hear.

Meanwhile, I still think that we need to get the "bandaid" workaround
(clone + data copy) in, so we can fix the actual crash.  As far as
I understand now, change like this should not have any unexpected
side effect.

Tony, do you want to sent the workaround patch?  Or else I can do that.

Bets regards, Ilya Maximets.

On 6/21/21 10:58 PM, Ilya Maximets wrote:
> On 6/19/21 4:11 AM, Tony van der Peet wrote:
>> Hi all
>>
>> I am in favour of a better fix, bandaids tend to come back and bite you in 
>> the end anyway. So, will be happy to help with this effort, though I am 
>> probably going to have to defer to the rest of you when it comes to actually 
>> knowing what to do in this area.
>>
> 
> It's great to have a full correct solution, but unfortunately, I don't
> think that we can come up with something like a full and correct
> reference counting in a short period of time.  But we still need to fix
> a crash that is pretty easy to trigger.  I'm also nervous that OVN is
> using meters more and more (e.g. control plane protection patch set)
> and they might actually hit this issue at the high load.  Another point
> is that we need a fix that we can backport to LTS, and I don't think that
> reference counting would be a small change that we can easily backport
> without worrying about breaking something else.
> 
> All in all, I think that we need to come up with a "bandaid" solution
> and work further on correct reference counting after that.
> 
> And we also need to create a unit test that will mimic packet-out that
> I encountered in OVN tests, so we can test this kind of behavior in OVS.
> 
> For the reference, the packet-out generated by OVN controller had a few
> set() actions and the resubmit() to a different table. And this
> table had rules leading to packet output to a tunnel port, resulting
> in a tunnel push + output datapath actions.
> 
>> Cheers
>> Tony
>> ________________________________________
>> From: Aaron Conole <[email protected]>
>> Sent: Saturday, 19 June 2021 2:50 a.m.
>> To: Ilya Maximets
>> Cc: Ben Pfaff; Tony van der Peet; [email protected]; Tony van der Peet
>> Subject: Re: [ovs-dev] [PATCH] dpif-netdev: Fix crash when PACKET_OUT is 
>> metered
>>
>> Ilya Maximets <[email protected]> writes:
>>
>>> On 6/17/21 11:59 PM, Ben Pfaff wrote:
>>>> All these flags for stealing, allowing stealing, blah blah, are just
>>>> ways to do some kind of dumb reference counting without actually have a
>>>> reference count.  When it gets super complex like this, maybe
>>>> introducing a reference count is the way to go.  It would be a bigger
>>>> change, but perhaps more maintainable over time.
>>>>
>>>
>>> Yes, I absolutely agree.  We just removed one of these hacky flags from
>>> the struct dp_packet_batch and we need to get rid of the rest of them.
>>
>> +1 from me - it's a bigger scoped change, but over-all, I think it's a
>> better one, and makes the most sense.
>>
>>> One thing that bothers me is that some parts of the code treats
>>> 'should_steal=false' as "do not modify".  For example, I don't really
>>> understand why these functions are carrying cutlen around and clones
>>> the packet before truncating if 'should_steal=false'.
>>>
>>> Some action executors also have optimizations that allows to not clone
>>> the packet before the fatal action if this action is the last one.
>>>
>>> So, in general, yes, we need to get rid of these flags and accurately
>>> re-work all the packet paths.  And yes, this would be not a small
>>> change.
>>>
>>> For now, I think, we need to find a less ugly as possible solution for
>>> existing crash (maybe the one that I suggested), so we will be able to
>>> backport it and continue working on correct reference counting.
>>>
>>> What do you think?  Tony?  Aaron?
>>
>> That makes sense to me, and I hope we will actually work on the
>> ref-count stuff.  I had started taking a look a few weeks back, but the
>> idea of 'steal' is pretty ingrained throughout the code, so getting a
>> ref-count semantic correct is a big effort.  As an example, the
>> odp-execute area expects that each sub-action will have its own copy to
>> modify (and this is documented with the 'should_steal' semantics).
>> Taking that out will be a big rework in that area.  I think it makes
>> the most sense, and we probably could implement something like COW to
>> cover those cases where we really need to modify a packet without
>> propagating those changes back up.
>>
>>> Best regards, Ilya Maximets.
>>
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] dpif-netdev: Fix crash when PACKET_OUT is metered

Reply via email to