Hi Jim,

This is an option.
It isn't totally robust, since in theory there is still a race: the
openflowplugin listener wakes up on the same config store change (in order to
send the flow mod), while this new listener sends the packet out in parallel.
It should narrow the race somewhat by delaying the packet out.

We're looking at using a barrier to enforce this, as should be done according
to OpenFlow (send the flow mods, then a barrier, then the packet out).
It would be nice to have such an option in the openflowplugin API: we could
write a packet out to the datastore and have the flow mods linked to that
packet out.
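
To make the ordering concrete, here is a rough sketch of "flow mods, then
barrier, then packet out". It is not openflowplugin code: SwitchChannel,
RecordingChannel and their methods are hypothetical names made up for the
illustration, and the barrier is modelled as a future that completes when the
(simulated) switch replies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class BarrierOrderingSketch {
    /** Sends the flow mods, then a barrier, and the packet out only after the barrier reply. */
    static CompletableFuture<Void> installThenPacketOut(
            SwitchChannel channel, List<String> flows, String packet) {
        for (String flow : flows) {
            channel.sendFlowMod(flow);
        }
        return channel.sendBarrier().thenRun(() -> channel.sendPacketOut(packet));
    }

    public static void main(String[] args) {
        RecordingChannel channel = new RecordingChannel();
        CompletableFuture<Void> done =
                installThenPacketOut(channel, List.of("inbound", "outbound"), "original");
        channel.replyToBarriers(); // the packet out is sent only at this point
        done.join();
        System.out.println(channel.log);
    }
}

/** Hypothetical stand-in for one switch connection; not an openflowplugin type. */
interface SwitchChannel {
    void sendFlowMod(String flow);

    /** Returns a future that completes when the barrier reply arrives. */
    CompletableFuture<Void> sendBarrier();

    void sendPacketOut(String packet);
}

/** In-memory fake that records the order of messages on the channel. */
class RecordingChannel implements SwitchChannel {
    final List<String> log = new ArrayList<>();
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();

    @Override
    public void sendFlowMod(String flow) {
        log.add("flow-mod:" + flow);
    }

    @Override
    public CompletableFuture<Void> sendBarrier() {
        log.add("barrier-request");
        CompletableFuture<Void> reply = new CompletableFuture<>();
        pending.add(reply);
        return reply;
    }

    @Override
    public void sendPacketOut(String packet) {
        log.add("packet-out:" + packet);
    }

    /** Simulates the switch answering every outstanding barrier request. */
    void replyToBarriers() {
        List<CompletableFuture<Void>> replies = new ArrayList<>(pending);
        pending.clear();
        for (CompletableFuture<Void> reply : replies) {
            log.add("barrier-reply");
            reply.complete(null);
        }
    }
}
```

The point of the sketch is only that the packet out hangs off the barrier
reply rather than off the datastore write; how that would map onto the real
plugin API is exactly the open question.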

--alon

From: [email protected] 
[mailto:[email protected]] On Behalf Of Jim West
Sent: Tuesday, 28 February 2017 20:14
To: [email protected]
Subject: Re: [openflowplugin-dev] SNAT test UDP failure - race between flows 
being written and packet out?


Hi Alon,



I hope I'm understanding your issue correctly.  I've had problems in OpenFlow 
programming when I was using Groups/Meters.  I had to ensure that the 
groups/meters were created before sending the FlowMods down that referenced 
them.  I believe you are having a similar problem?  You want to recirculate a 
packet through the OF tables and the packet out is hitting before the FlowMod 
has been processed by the switch, yes?



I was coding on top of the openflowplugin and using the MD-SAL API to create 
the flows/groups/meters.  When I did a put into the config store it appeared 
that sometimes the meter/group mod came out _after_ the flow mod.  The way I 
solved this problem was registering a listener for all the group/meter 
modifications on the MD-SAL config store.  I wouldn't put the FlowMod into 
MD-SAL until I had heard that the Group/Meter had made it through the config 
store.   Then I added a barrier to the group/meter mod message to make sure 
that the switch would process it before the packet out.



Could you do something similar?  Don't send the pkt-out until you get notified 
that the FlowMod has been added to or updated in the MD-SAL config store?
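
A minimal sketch of that idea, holding the packet out until a listener has
seen every expected flow written. FakeConfigStore and PacketOutGate are
hypothetical names for illustration and do not correspond to MD-SAL classes.
Note that it only shows the write reached the store, not that the switch has
processed the flow mod.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

/** Holds back an action until every expected flow id has been seen in the store. */
class PacketOutGate {
    private final Set<String> waitingFor = ConcurrentHashMap.newKeySet();
    private final AtomicBoolean fired = new AtomicBoolean(false);
    private final Runnable sendPacketOut;

    PacketOutGate(Set<String> expectedFlowIds, Runnable sendPacketOut) {
        this.waitingFor.addAll(expectedFlowIds);
        this.sendPacketOut = sendPacketOut;
    }

    /** Listener callback: invoked when a flow write is observed in the store. */
    void onFlowWritten(String flowId) {
        waitingFor.remove(flowId);
        // Fire exactly once, and only after all expected flows were observed.
        if (waitingFor.isEmpty() && fired.compareAndSet(false, true)) {
            sendPacketOut.run();
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        FakeConfigStore store = new FakeConfigStore();
        PacketOutGate gate =
                new PacketOutGate(Set.of("inbound", "outbound"), () -> events.add("packet-out"));
        store.registerListener(gate::onFlowWritten);

        store.putFlow("inbound");  // gate still closed
        store.putFlow("outbound"); // both flows seen: the packet out is sent
        System.out.println(events);
    }
}

/** Hypothetical stand-in for the config datastore; not the MD-SAL API. */
class FakeConfigStore {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void registerListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Records a write by notifying every registered listener with the flow id. */
    void putFlow(String flowId) {
        for (Consumer<String> listener : listeners) {
            listener.accept(flowId);
        }
    }
}
```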



Jim


------------------------------

Message: 2
Date: Tue, 28 Feb 2017 16:00:30 +0000
From: "Kochba, Alon" <[email protected]>
To: [email protected], openflowplugin-dev <[email protected]>,
        [email protected]
Subject: [openflowplugin-dev] SNAT test UDP failure - race between
        flows being written and packet out?
Message-ID:
        <at5pr84mb00983ea0f0ba67bc10641474c0...@at5pr84mb0098.namprd84.prod.outlook.com>

Content-Type: text/plain; charset="us-ascii"

Hi,

We have a CSIT test in netvirt that tests UDP connectivity through netvirt's
SNAT feature, and it fails sporadically.
The same test over TCP always succeeds.

We debugged the flows, and there seems to be a race between the following
three events. ODL receives the initial packet, and then:

a. Installs an inbound flow using NaptEventHandler#buildAndInstallNatFlows()

b. Installs an outbound flow using NaptEventHandler#buildAndInstallNatFlows()

c. Sends the original packet as a packet out to OFPP_TABLE for re-processing
   by the pipeline.

For the test to work properly, (c) must happen after (a) and (b) have been 
programmed to the switch properly.
The flows are written using genius' mdsalManager.syncInstallFlow(), which does 
a synchronous write into the flows CONFIG data store.
The packet out is sent via openflowplugin's
PacketProcessingService.transmitPacket().

Is there a way to ensure (c) is triggered only after (a) and (b) are properly 
configured?
Perhaps by delaying the packet out somehow, or by using a barrier?

If this is not a possibility, we can try two things that are pretty ugly:

a. Reverse the order of (a) and (b) above, because then if only (a) is
   installed (which seems to be the more common case), the request would have
   to be re-punted to ODL, delaying it. [1] does that, but we need to run it
   many times to verify that it helps.

   This still won't fix anything if (c) happens before both (a) and (b).

b. Leave the bug as it is, but try to work around it at the test level: add a
   delay to the server's response, or use something other than netcat that
   might retry UDP.

   The reason this might be OK is that flow-based SNAT is coming in Carbon,
   which should eliminate this race.
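
For the test-level workaround, a client that retries UDP could look roughly
like the sketch below. It uses only java.net; UdpRetryClient and
DropFirstEchoServer are hypothetical names, and the server here merely mimics
the first datagram being lost. It is not part of the CSIT suite.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

class UdpRetryClient {
    /** Sends the payload and resends it on timeout; returns the reply, or null if none arrived. */
    static String sendWithRetry(InetAddress host, int port, String payload,
                                int attempts, int timeoutMs) {
        byte[] out = payload.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(timeoutMs);
            for (int attempt = 0; attempt < attempts; attempt++) {
                socket.send(new DatagramPacket(out, out.length, host, port));
                DatagramPacket reply = new DatagramPacket(new byte[1500], 1500);
                try {
                    socket.receive(reply);
                    return new String(reply.getData(), 0, reply.getLength(),
                            StandardCharsets.UTF_8);
                } catch (SocketTimeoutException e) {
                    // No reply in time: loop and resend the same datagram.
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        DropFirstEchoServer server = new DropFirstEchoServer();
        Thread thread = new Thread(server);
        thread.setDaemon(true);
        thread.start();
        System.out.println(sendWithRetry(
                InetAddress.getLoopbackAddress(), server.port(), "ping", 3, 300));
        server.close();
    }
}

/** Hypothetical test server: ignores the first datagram and echoes the second. */
class DropFirstEchoServer implements Runnable {
    private final DatagramSocket socket;

    DropFirstEchoServer() {
        try {
            socket = new DatagramSocket(0, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int port() {
        return socket.getLocalPort();
    }

    @Override
    public void run() {
        try {
            DatagramPacket dropped = new DatagramPacket(new byte[1500], 1500);
            socket.receive(dropped); // first datagram: read and ignored, no reply
            DatagramPacket request = new DatagramPacket(new byte[1500], 1500);
            socket.receive(request); // second datagram: echoed back to the sender
            socket.send(new DatagramPacket(request.getData(), request.getLength(),
                    request.getAddress(), request.getPort()));
        } catch (IOException e) {
            // Socket closed while waiting: nothing more to do in this sketch.
        }
    }

    void close() {
        socket.close();
    }
}
```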

[1] https://git.opendaylight.org/gerrit/#/c/52380
--alon
