Hi Jim,

This is an option. It isn't totally robust, since theoretically there will still be a race: the openflowplugin listener wakes up on the config store change (in order to send the flow mod), and the packet out is sent in parallel by this new listener. It should narrow the race window somewhat by delaying the packet out a bit.
We're looking at using a barrier somehow to enforce this, as should be done according to OpenFlow (send flowmods, then barrier, then packet out). It would be nice to have such an option in the openflowplugin API: we could write a packet out to the datastore and have the flowmods linked to that packet out.

--alon

From: [email protected] [mailto:[email protected]] On Behalf Of Jim West
Sent: Tuesday, 28 February 2017 20:14
To: [email protected]
Subject: Re: [openflowplugin-dev] SNAT test UDP failure - race between flows being written and packet out?

Hi Alon,

I hope I'm understanding your issue correctly. I've had problems in OpenFlow programming when I was using Groups/Meters. I had to ensure that the groups/meters were created before sending the FlowMods down that referenced them. I believe you are having a similar problem? You want to recirculate a packet through the OF tables and the packet out is hitting before the FlowMod has been processed by the switch, yes?

I was coding on top of the openflowplugin and using the MD-SAL API to create the flows/groups/meters. When I did a put into the config store it appeared that sometimes the meter/group mod came out _after_ the flow mod. The way I solved this problem was registering a listener for all the group/meter modifications on the MD-SAL config store. I wouldn't put the FlowMod into MD-SAL until I had heard that the Group/Meter had made it through the config store. Then I added a barrier to the group/meter mod message to make sure that the switch would process it before the packet out.

Could you do something similar? Don't send the pkt-out until you get notified that the FlowMod has been added/updated to the MD-SAL config store?
Jim

------------------------------

Message: 2
Date: Tue, 28 Feb 2017 16:00:30 +0000
From: "Kochba, Alon" <[email protected]>
To: "[email protected]" <[email protected]>, openflowplugin-dev <[email protected]>, "[email protected]" <[email protected]>
Subject: [openflowplugin-dev] SNAT test UDP failure - race between flows being written and packet out?
Message-ID: <at5pr84mb00983ea0f0ba67bc10641474c0...@at5pr84mb0098.namprd84.prod.outlook.com>
Content-Type: text/plain; charset="us-ascii"

Hi,

We have a CSIT test in netvirt that tests UDP connectivity using netvirt's SNAT feature, and it is failing sporadically. The same test over TCP succeeds all the time.

We debugged the flows and it seems there is a race between these three events. ODL receives the initial packet, and then:

a. Installs an inbound flow using NaptEventHandler#buildAndInstallNatFlows()
b. Installs an outbound flow using NaptEventHandler#buildAndInstallNatFlows()
c. Sends the original packet as a packet out to OFPP_TABLE for re-processing by the pipeline.

For the test to work properly, (c) must happen after (a) and (b) have been programmed to the switch properly.

The flows are written using genius' mdsalManager.syncInstallFlow(), which does a synchronous write into the flows CONFIG data store. The packet out is sent via openflowplugin PacketProcessingService.transmitPacket().

Is there a way to ensure (c) is triggered only after (a) and (b) are properly configured? Perhaps delay the packet out somehow? Use barrier somehow?

If this is not a possibility, we can try two things that are pretty ugly:

a. Reverse the order of (a) and (b), because then if only (a) is installed (which seems more common), the request would have to be re-punted to ODL, delaying it. [1] will do that, but we need to run it many times to verify it helps. This still won't fix anything if (c) happens before both (a) and (b).
b. Leave the entire bug, but try to fix this on a test level: add a delay to the server's response, or use something other than netcat that might retry UDP.

The reason this might be ok is that we have flow based SNAT coming up in Carbon, which should eliminate this race.

[1] https://git.opendaylight.org/gerrit/#/c/52380

--alon
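For illustration only, the ordering asked for above (flow mods (a) and (b), then a barrier, then packet out (c)) can be modelled with a small toy sketch. This is not openflowplugin or OpenFlow library code: FakeSwitch, flow_mod, barrier, packet_out and handle_first_packet are all hypothetical names made up for this example, and the random delay merely stands in for the time a switch might take to program a flow.

```python
import asyncio
import random


class FakeSwitch:
    """Toy stand-in for a switch (hypothetical; not an ODL or OpenFlow API)."""

    def __init__(self):
        self.flows = set()    # flows that have actually been programmed
        self._pending = []    # flow mods accepted but not yet applied
        self.log = []         # order in which events took effect

    async def _apply(self, name):
        # Simulate a variable programming delay for each flow mod.
        await asyncio.sleep(random.uniform(0.0, 0.01))
        self.flows.add(name)
        self.log.append(f"flow:{name}")

    def flow_mod(self, name):
        # Fire-and-forget: returns before the flow is programmed.
        self._pending.append(asyncio.create_task(self._apply(name)))

    async def barrier(self):
        # Completes only once every earlier flow mod has been applied.
        pending, self._pending = self._pending, []
        await asyncio.gather(*pending)
        self.log.append("barrier-reply")

    def packet_out(self, required_flows):
        # True if the re-injected packet would match all required flows.
        hit = set(required_flows) <= self.flows
        self.log.append("packet-out:hit" if hit else "packet-out:miss")
        return hit


async def handle_first_packet(switch):
    switch.flow_mod("inbound")     # step (a)
    switch.flow_mod("outbound")    # step (b)
    await switch.barrier()         # wait until (a) and (b) are programmed
    return switch.packet_out(["inbound", "outbound"])  # step (c)


def run_once():
    switch = FakeSwitch()
    hit = asyncio.run(handle_first_packet(switch))
    return hit, switch.log
```

In this model, dropping the `await switch.barrier()` line makes the packet out miss, since neither flow mod has been applied yet at that point; that is the race described above in miniature. Whether and how a barrier can be tied to a packet out through the openflowplugin API is exactly the open question in this thread.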
_______________________________________________ openflowplugin-dev mailing list [email protected] https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
