On Apr 23, 2014, at 9:03 PM, Ben Pfaff <[email protected]> wrote:

> On Tue, Apr 22, 2014 at 11:29:55PM -0700, Murphy McCauley wrote:
>> On Apr 22, 2014, at 8:54 AM, Ben Pfaff <[email protected]> wrote:
>>
>>> On Sat, Apr 19, 2014 at 07:50:31PM -0700, Murphy McCauley wrote:
>>>> I recently found a technique I'd used with OVS 1.9 no longer worked under
>>>> OVS built from master a few days ago. Here's a pretty minimal example:
>>>>
>>>> table=0, actions=resubmit(,2),resubmit(,1)
>>>> table=1, reg1=0 actions=learn(table=2,hard_timeout=1,load:1->NXM_NX_REG1[]),controller
>>>>
>>>> In this example, it's a poor man's controller rate limiter. The previous
>>>> (and expected) behavior is that you can spam packets (e.g., ping -i 0.1)
>>>> and only one per second goes to the controller. The observed behavior on
>>>> new versions of OVS is that nothing ever comes to the controller.
>>>>
>>>> Adding a reg1=1 match to table 1, it was clear the matching was working
>>>> right (the packet counts of the table 1 rules summed to the packet count
>>>> of the table 0 rule). But still nothing at the controller. A flood
>>>> action, however, works just fine -- one per second. This got me thinking
>>>> it's a fast path/slow path issue. I did some digging and found:
>>>>
>>>> Before 4dff909 (Move odp_actions from subfacet to facet), things worked as
>>>> expected. After this commit, it didn't work, but I found a workaround
>>>> based on a glance through the diff and a hunch: if I put a controller
>>>> action in the table 0 rule too, both controller actions worked. I was
>>>> inspired to try this by the change around line 5027. Without the table 0
>>>> controller action, facet_revalidate() gives up when the facet goes from
>>>> fast path to slow path. With it, I am guessing it starts out on the slow
>>>> path and never changes. Whether any of that is significant or not, by
>>>> sending to a nonexistent controller ID in table 0, I had the behavior I
>>>> wanted again.
>>>>
>>>> Unfortunately, this workaround didn't work on master. So more digging.
>>>> It turns out that after 3d9c5e5 (Handle learn action flow mods
>>>> asynchronously), the workaround wasn't required anymore and things were
>>>> back to working as expected.
>>>>
>>>> Obviously this didn't last forever. Specifically, when 9129672 (Move
>>>> "learn" actions into individual threads) more or less undid the previous,
>>>> even the workaround doesn't work.
>>>>
>>>> I tried to find anything related on the mailing list and didn't come up
>>>> with anything. Is it unknown? Is there any reason why this *shouldn't*
>>>> work? Any thoughts on getting it to work again?
>>>
>>> At a glance, this should work (although it's not a use case I've
>>> considered before). It's not obvious to me why it doesn't. If you
>>> figure out a fix (though I'd like to take a look myself, I don't have
>>> the time), please submit it, and then we'll add a test to avoid future
>>> regression.
>>
>> Hi, Ben, thanks for confirming that it should work.
>>
>> I believe the reason it doesn't lies in handle_upcalls(), which calls
>> xlate_actions() without the packet and later calls
>> xlate_actions_for_side_effects() with the packet which should actually send
>> to the controller. Unfortunately, by the time the actions are xlated the
>> second time, the world has changed due to the flow_mod resulting from the
>> learn action the first time they were xlated. The result is that things go
>> differently when trying to run the side effects -- we now hit the newly
>> learned rule. Before 9129672, the flow_mods were queued and I guess hadn't
>> actually executed when xlate_actions_for_side_effects() ran.
>> I think the additional xlate stems from bcd2633 (Store relevant fields for
>> wildcarding in facet).
>>
>> I don't know the code well enough to know if there's a particularly elegant
>> way of solving this. Someone else must have a better idea. :)
>>
>> A sidenote is that the actions are xlated both times with may_learn, which
>> seemed odd to me. Just for fun I turned it off the second time (which, of
>> course, is the not-useful-to-me case), and it didn't change the results of
>> make check for whatever that's worth.
>
> Thanks for the insight. That ought to help.
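
To make the ordering problem concrete, here is a toy model in plain Python. It is only a sketch of my reading of the behavior described above, not OVS code: the function names, the `state` dict, and the `apply_learn_immediately` switch are all made up for illustration, and the hard_timeout expiry is not modeled. It runs the example flows twice per upcall (once without the packet, once with it) and compares applying the learned flow right away against queueing it until both passes are done.

```python
# Toy model only; nothing here is real OVS code or a real OVS API.

def translate(table2_learned, with_packet):
    """One pass over the example flows.

    Returns (controller_sends, wants_learn).
    """
    # resubmit(,2): a learned rule, if present, loads 1 into reg1.
    reg1 = 1 if table2_learned else 0
    # resubmit(,1): the table 1 rule only matches reg1=0.
    if reg1 != 0:
        return 0, False
    # The controller action can only fire when the packet is available.
    return (1 if with_packet else 0), True


def handle_upcall(state, apply_learn_immediately):
    """Translate twice: first without the packet, then for side effects."""
    queued_learn = False
    controller_sends = 0
    for with_packet in (False, True):
        sends, wants_learn = translate(state["learned"], with_packet)
        controller_sends += sends
        if wants_learn:
            if apply_learn_immediately:
                state["learned"] = True   # the table changes between passes
            else:
                queued_learn = True       # flow mod deferred until later
    if queued_learn:
        state["learned"] = True
    return controller_sends


print(handle_upcall({"learned": False}, apply_learn_immediately=True))   # 0
print(handle_upcall({"learned": False}, apply_learn_immediately=False))  # 1
```

In the immediate case the second pass already sees the learned rule, so the controller action is never reached, which matches what I'm observing.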
One way that had occurred to me to resolve this is to cache the results of the first xlate_actions() call, so that the pass for side effects is a replay of that translation rather than an entirely new one.

Although it was written with a different aim, there seems to be some conceptual crossover here with Joe Stringer's "Cache the modules affected by xlate_actions()" series of patches. Just thought I'd mention this (CCing Joe Stringer) in case anyone has any deeper thoughts.

-- Murphy
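
P.S. A rough sketch of what I mean by "replay", again as a toy model in plain Python rather than anything resembling the real code. Every name here is hypothetical, and I'm assuming the side effects of one translation can be captured as a simple list. The point is only that the flow table is consulted once, and the recorded effects are then carried out with the packet in hand.

```python
# Toy model only; hypothetical names, not the OVS xlate API.

def translate_recording(state, side_effects):
    """Translate once, recording side effects instead of performing them."""
    # resubmit(,2): a learned rule, if present, loads 1 into reg1.
    reg1 = 1 if state["learned"] else 0
    # resubmit(,1): the table 1 rule only matches reg1=0.
    if reg1 == 0:
        side_effects.append("learn")
        side_effects.append("controller")


def replay(state, side_effects, controller_log, packet):
    """Carry out recorded effects without looking at the flow table again."""
    for effect in side_effects:
        if effect == "learn":
            state["learned"] = True
        elif effect == "controller":
            controller_log.append(packet)


def handle_upcall_with_replay(state, controller_log, packet):
    cached = []
    translate_recording(state, cached)
    replay(state, cached, controller_log, packet)


state = {"learned": False}
log = []
handle_upcall_with_replay(state, log, "p1")  # learns and goes to controller
handle_upcall_with_replay(state, log, "p2")  # suppressed by the learned rule
print(log)  # ['p1']
```

Because the replay never re-reads the table, it doesn't matter whether the learned flow is installed before or after the controller send.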
