Re: [ovs-discuss] Can't send to controller when doing a resubmit and learn (regression)

Murphy McCauley Mon, 28 Apr 2014 12:18:16 -0700

On Apr 23, 2014, at 9:03 PM, Ben Pfaff <[email protected]> wrote:

> On Tue, Apr 22, 2014 at 11:29:55PM -0700, Murphy McCauley wrote:
>> On Apr 22, 2014, at 8:54 AM, Ben Pfaff <[email protected]> wrote:
>> 
>>> On Sat, Apr 19, 2014 at 07:50:31PM -0700, Murphy McCauley wrote:
>>>> I recently found a technique I'd used with OVS 1.9 no longer worked under 
>>>> OVS built from master a few days ago.  Here's a pretty minimal example:
>>>> 
>>>> table=0, actions=resubmit(,2),resubmit(,1)
>>>> table=1, reg1=0 actions=learn(table=2,hard_timeout=1,load:1->NXM_NX_REG1[]),controller
>>>> 
>>>> In this example, it's a poor man's controller rate limiter.  The previous 
>>>> (and expected) behavior is that you can spam packets (e.g., ping -i 0.1) 
>>>> and only one per second goes to the controller.  The observed behavior on 
>>>> new versions of OVS is that nothing ever comes to the controller.
>>>> 
>>>> Adding a reg1=1 match to table 1, it was clear the matching was working 
>>>> right (the packet counts of the table 1 rules summed to the packet count 
>>>> of the table 0 rule).  But still nothing at the controller.  A flood 
>>>> action, however, works just fine -- one per second.  This got me thinking 
>>>> it's a fast path/slow path issue.  I did some digging and found:
>>>> 
>>>> Before 4dff909 (Move odp_actions from subfacet to facet), things worked as 
>>>> expected.  After this commit, it didn't work, but I found a workaround 
>>>> based on a glance through the diff and a hunch: if I put a controller 
>>>> action in the table 0 rule too, both controller actions worked.  I was 
>>>> inspired to try this by the change around line 5027.  Without the table 0 
>>>> controller action, facet_revalidate() gives up when the facet goes from 
>>>> fast path to slow path.  With it, I am guessing it starts out on the slow 
>>>> path and never changes.  Whether any of that is significant or not, by 
>>>> sending to a nonexistent controller ID in table 0, I had the behavior I 
>>>> wanted again.
>>>> 
>>>> Unfortunately, this workaround didn't work on master.  So more digging.  
>>>> It turns out that after 3d9c5e5 (Handle learn action flow mods 
>>>> asynchronously), the workaround wasn't required anymore and things were 
>>>> back to working as expected.
>>>> 
>>>> Obviously this didn't last forever.  Specifically, once 9129672 (Move 
>>>> "learn" actions into individual threads) more or less undid the previous 
>>>> commit, even the workaround stopped working.
>>>> 
>>>> I tried to find anything related on the mailing list and didn't come up 
>>>> with anything.  Is it unknown?  Is there any reason why this *shouldn't* 
>>>> work?  Any thoughts on getting it to work again?
>>> 
>>> At a glance, this should work (although it's not a use case I've
>>> considered before).  It's not obvious to me why it doesn't.  If you
>>> figure out a fix (though I'd like to take a look myself, I don't have
>>> the time), please submit it, and then we'll add a test to avoid future
>>> regression.
>> 
>> Hi, Ben, thanks for confirming that it should work.
>> 
>> I believe the reason it doesn't lies in handle_upcalls(), which calls 
>> xlate_actions() without the packet and later calls 
>> xlate_actions_for_side_effects() with the packet which should actually send 
>> to the controller.  Unfortunately, by the time the actions are xlated the 
>> second time, the world has changed due to the flow_mod resulting from the 
>> learn action the first time they were xlated.  The result is that things go 
>> differently when trying to run the side effects -- we now hit the newly 
>> learned rule.  Before 9129672, the flow_mods were queued and I guess hadn't 
>> actually executed when xlate_actions_for_side_effects() ran.
>> I think the additional xlate stems from bcd2633 (Store relevant fields for 
>> wildcarding in facet).
>> 
>> I don't know the code well enough to know if there's a particularly elegant 
>> way of solving this.  Someone else must have a better idea. :)
>> 
>> A sidenote is that the actions are xlated both times with may_learn, which 
>> seemed odd to me.  Just for fun I turned it off the second time (which, of 
>> course, is the not-useful-to-me case), and it didn't change the results of 
>> make check for whatever that's worth.
> 
> Thanks for the insight.  That ought to help.
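
The sequence described in the quoted message can be sketched with a toy Python
model.  This is not OVS code: ToySwitch, translate, handle_upcall, and run are
made-up names, every packet is treated as an upcall, and the model assumes that
after 9129672 the learned flow is installed during the first translation while
earlier it was queued until both translations had run.

```python
# Toy model only: ToySwitch, translate, and handle_upcall are made-up names,
# not OVS functions. Times are integer milliseconds.

HARD_TIMEOUT_MS = 1000  # hard_timeout=1 from the learn action


class ToySwitch:
    def __init__(self, install_immediately):
        # True models the learned flow being installed during translation
        # (assumed post-9129672 behavior); False models queued flow_mods.
        self.install_immediately = install_immediately
        self.learned_until = None  # expiry of the learned table 2 flow
        self.pending = []          # queued flow_mods (expiry times)
        self.to_controller = []    # packets delivered to the controller

    def translate(self, now, packet=None):
        # resubmit(,2): a live learned flow loads 1 into reg1.
        live = self.learned_until is not None and now < self.learned_until
        reg1 = 1 if live else 0
        # resubmit(,1): the table 1 rule matches only reg1=0.
        if reg1 == 0:
            expiry = now + HARD_TIMEOUT_MS  # the learn action
            if self.install_immediately:
                self.learned_until = expiry
            else:
                self.pending.append(expiry)
            if packet is not None:          # the controller action
                self.to_controller.append(packet)

    def handle_upcall(self, now, packet):
        self.translate(now)          # stands in for xlate_actions()
        self.translate(now, packet)  # stands in for xlate_actions_for_side_effects()
        for expiry in self.pending:  # queued flow_mods land afterwards
            self.learned_until = expiry
        self.pending.clear()


def run(install_immediately):
    switch = ToySwitch(install_immediately)
    for now in range(0, 3000, 100):  # one packet every 100 ms for 3 s
        switch.handle_upcall(now, packet=now)
    return switch.to_controller


print(run(install_immediately=False))  # [0, 1000, 2000]
print(run(install_immediately=True))   # []
```

With queued flow_mods, one packet per second reaches the controller.  With
immediate installation, the second translation always hits the freshly learned
flow, so nothing is ever sent, which matches the reported behavior.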


One way that occurred to me to resolve this is to cache the results of the 
first xlate_actions() so that the call for side effects would be a replay 
rather than an entirely new translation.  Although it has a different aim, 
there seems to be some conceptual crossover with Joe Stringer's "Cache the 
modules affected by xlate_actions()" series of patches.  Just thought I'd 
mention this (CCing Joe Stringer) in case anyone has any deeper thoughts.
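
A minimal sketch of the replay idea, again as a toy Python model rather than
OVS code.  ReplaySwitch, translate, replay, and run_replay are hypothetical
names, and the sketch assumes the learn action takes effect immediately during
the single translation.

```python
# Toy model only: all names here are hypothetical, not OVS APIs.
# Times are integer milliseconds.

HARD_TIMEOUT_MS = 1000  # hard_timeout=1 from the learn action


class ReplaySwitch:
    def __init__(self):
        self.learned_until = None  # expiry of the learned table 2 flow
        self.to_controller = []    # packets delivered to the controller

    def translate(self, now):
        """Translate once, apply the learn, and return the packet-dependent
        side effects seen along the way instead of executing them."""
        side_effects = []
        live = self.learned_until is not None and now < self.learned_until
        if not live:  # table 1 rule (reg1=0) matches
            self.learned_until = now + HARD_TIMEOUT_MS  # the learn action
            side_effects.append("controller")
        return side_effects

    def replay(self, side_effects, packet):
        # Execute the cached side effects without consulting the flow
        # tables again, so the newly learned flow cannot change the result.
        for effect in side_effects:
            if effect == "controller":
                self.to_controller.append(packet)

    def handle_upcall(self, now, packet):
        cached = self.translate(now)
        self.replay(cached, packet)


def run_replay():
    switch = ReplaySwitch()
    for now in range(0, 3000, 100):  # one packet every 100 ms for 3 s
        switch.handle_upcall(now, packet=now)
    return switch.to_controller


print(run_replay())  # [0, 1000, 2000]
```

Because the side effects are recorded by the same translation that triggers
the learn, the controller action survives even though the flow table has
already changed by the time the packet is available.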

-- Murphy