Hi Anil, guys, I am facing the same issue you mentioned in Issue 2 with my single-layer implementation. The plugin has no way to know whether another controller is still connected to the switch, so the only (and slow) solution I am using right now is this: when we lose mastership, we delete the node from the data store and hope that happens before the new master writes the node back in. The best solution would be to know whether this controller was the last master in the cluster for that switch, and only then delete the node from the data store. What I am trying right now is to hold the status before the node is deleted from the data store and then send an ImmediateFuture back to the md-sal singleton service, so that a new master can be elected; a rough sketch of that pattern follows below.
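Roughly, the shape of what I mean is this -- a minimal sketch against the ClusterSingletonService API, not our actual code; the class and field names are illustrative, and I am assuming the current ListenableFuture<Void> signature of closeServiceInstance():

import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.ListenableFuture;
import org.opendaylight.controller.md.sal.binding.api.DataBroker;
import org.opendaylight.controller.md.sal.binding.api.WriteTransaction;
import org.opendaylight.controller.md.sal.common.api.data.LogicalDatastoreType;
import org.opendaylight.mdsal.singleton.common.api.ClusterSingletonService;
import org.opendaylight.mdsal.singleton.common.api.ServiceGroupIdentifier;
import org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node;
import org.opendaylight.yangtools.yang.binding.InstanceIdentifier;

// Illustrative only: delete the node in closeServiceInstance() and return an
// immediate future so the singleton service is not blocked and a new master
// can be elected right away. This is the "delete and hope" race described above.
public class DeviceMasterService implements ClusterSingletonService {

    private final DataBroker dataBroker;
    private final ServiceGroupIdentifier identifier;
    private final InstanceIdentifier<Node> nodePath;

    public DeviceMasterService(DataBroker dataBroker, String nodeId,
                               InstanceIdentifier<Node> nodePath) {
        this.dataBroker = dataBroker;
        this.identifier = ServiceGroupIdentifier.create(nodeId);
        this.nodePath = nodePath;
    }

    @Override
    public void instantiateServiceInstance() {
        // On becoming master: set the master role on the device and write
        // the node into the operational data store (omitted here).
    }

    @Override
    public ListenableFuture<Void> closeServiceInstance() {
        // On losing mastership: delete the node and do NOT wait for the
        // transaction to complete -- hand control back immediately so the
        // new master election is not delayed.
        WriteTransaction tx = dataBroker.newWriteOnlyTransaction();
        tx.delete(LogicalDatastoreType.OPERATIONAL, nodePath);
        tx.submit();
        return Futures.immediateFuture(null);
    }

    @Override
    public ServiceGroupIdentifier getIdentifier() {
        return identifier;
    }
}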
Anyway, it is a very bad implementation for the plugin to build on top of the singleton service.

Jozef

From: Anil Vishnoi [mailto:[email protected]]
Sent: Tuesday, February 14, 2017 4:37 AM
To: Jozef Bacigál <[email protected]>; Abhijit Kumbhare <[email protected]>; Tomáš Slušný <[email protected]>; Shuva Jyoti Kar <[email protected]>; Luis Gomez <[email protected]>; Muthukumaran K <[email protected]>
Cc: [email protected]
Subject: Singleton Clustering issue

Hi Jozef/Tomas/Luis,

I was investigating Bug 7736 <https://bugs.opendaylight.org/show_bug.cgi?id=7736> and came across a few issues in our clustering implementation, as well as some limitations of singleton clustering itself.

Issue 1: Registering applications on data change notifications.

In the current implementation, when the plugin receives a connection from a device, it registers itself as a service instance with the clustering singleton service. After registering, it receives the notification to initialize the instance, tries to set the master role on the device, and then writes the device data to the data store. Forwarding-Rules-Manager listens for data store notifications, and whenever it sees that a node was added it registers itself as a service instance for that node. Because we use ClusteredDataTreeChangeListener, every FRM instance gets the node-added notification from the data store, and all the cluster nodes end up registering themselves as service instances under the same service identifier. So even if the device is connected to only one controller, FRM registers itself on all three nodes; that is not correct behavior (the shape of the anti-pattern is sketched below).

This bug can make an openflowplugin cluster almost unusable. We have seen a scenario where, if you connect the device to two controllers, disconnect it from the first controller and reconnect it, ownership moves to the second controller (where the device is also connected). Disconnect it from the second controller and reconnect it, and ownership moves to the third controller. But now that ownership of that service identity sits with controller 3, even if the device reconnects to controller 1 or 2, those controllers do not push the master role down. This scenario can be triggered the moment your device disconnects from any of the controllers.

The problem is that applications have no way to find out whether the device is connected to their host controller instance (unless we write some hardcoded controller number/name into the data store for each device it is connected to). The only way I can see is through yang notifications: the plugin sends nodeAdded/nodeRemoved notifications, and applications register themselves as service instances only when they receive those events. That way we avoid the problem I mentioned above. I pushed a patch that does exactly this and it resolves the issue: https://git.opendaylight.org/gerrit/#/c/51489/
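For clarity, the problematic pattern looks roughly like this -- a minimal sketch, not the actual FRM code; FrmNodeListener and FrmNodeService are illustrative names, and the registration handle is dropped for brevity:

import java.util.Collection;
import org.opendaylight.controller.md.sal.binding.api.ClusteredDataTreeChangeListener;
import org.opendaylight.controller.md.sal.binding.api.DataObjectModification.ModificationType;
import org.opendaylight.controller.md.sal.binding.api.DataTreeModification;
import org.opendaylight.mdsal.singleton.common.api.ClusterSingletonServiceProvider;
import org.opendaylight.yang.gen.v1.urn.opendaylight.inventory.rev130819.nodes.Node;

// Illustrative anti-pattern: because this is a ClusteredDataTreeChangeListener,
// onDataTreeChanged() fires on EVERY cluster member, so every member registers
// a singleton service instance for the node -- whether or not the device is
// actually connected to that member.
public class FrmNodeListener implements ClusteredDataTreeChangeListener<Node> {

    private final ClusterSingletonServiceProvider singletonProvider;

    public FrmNodeListener(ClusterSingletonServiceProvider singletonProvider) {
        this.singletonProvider = singletonProvider;
    }

    @Override
    public void onDataTreeChanged(Collection<DataTreeModification<Node>> changes) {
        for (DataTreeModification<Node> change : changes) {
            if (change.getRootNode().getModificationType() == ModificationType.WRITE) {
                Node node = change.getRootNode().getDataAfter();
                // Runs on all three controllers, so all three join the same
                // service group for this node's service identifier.
                // FrmNodeService: hypothetical ClusterSingletonService keyed
                // on the node id.
                singletonProvider.registerClusterSingletonService(
                        new FrmNodeService(node.getId()));
            }
        }
    }
}

The patch above replaces this data-store listener with the nodeAdded/nodeRemoved yang notifications, which (unlike ClusteredDataTreeChangeListener callbacks) are delivered only on the controller that publishes them, so only the controller that actually has the connection registers.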
Issue 2: Data change notification every time the device disconnects from any node in the cluster.

In the current implementation we see that even if the device is connected to all three controllers, the moment it disconnects from one of them, applications receive a data change notification saying the node data was removed, shortly followed by another notification saying it was added again. The application concludes that the device disconnected from the controllers and reconnected, but in reality it is still connected to the remaining two controllers.

I think the reason is that the current singleton service implementation does not send non-owner controllers any notification about the ownership of the device (e.g. isOwner=false, hasOwner=false, wasOwner=false). Because of that limitation we wrote the code so that whenever closeServiceInstance() is called, the plugin removes the data from the data store, and when another controller gets instantiateServiceInstance() it puts the data back; that is exactly what generates the two events the application sees. Given that the device is connected to all the controllers, this behavior is not correct, and I can't think of any fix until the singleton clustering service provides such a notification to the other controllers, so that they can decide whether to clean up the data or ignore the event because one of them is still an owner of the device.
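For what it's worth, the Entity Ownership Service does deliver exactly these flags to listeners on every member, owner or not, which is the basis of the EOS hack I mention below. A minimal sketch of the cleanup decision I have in mind -- the entity type, class name and removal helper are illustrative:

import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipChange;
import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipListener;
import org.opendaylight.controller.md.sal.common.api.clustering.EntityOwnershipService;

// Illustrative: an EOS listener receives ownership changes on every member,
// including non-owners, with the isOwner/hasOwner/wasOwner flags. A non-owner
// can therefore tell "the device lost its last owner" apart from "ownership
// moved to another controller" and clean up only in the first case.
public class DeviceOwnershipListener implements EntityOwnershipListener {

    // "openflow" is an illustrative entity type; register once at startup.
    public static void register(EntityOwnershipService eos,
                                DeviceOwnershipListener listener) {
        eos.registerListener("openflow", listener);
    }

    @Override
    public void ownershipChanged(EntityOwnershipChange change) {
        if (!change.isOwner() && !change.hasOwner()) {
            // No controller owns this device any more (e.g. the only connected
            // controller was killed): safe to remove the node from the data
            // store without racing a new master.
            // removeNodeFromDatastore(change.getEntity()); // hypothetical helper
        } else if (!change.isOwner() && change.hasOwner()) {
            // Ownership moved to another controller that is still connected:
            // ignore the event, do not delete the data.
        }
    }
}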
This same behavior can create another issue: if the device is connected to only one controller in the cluster and the user kills that controller, stale data is left in the data store, because the other controllers are never notified (they never registered as service instances for that service group id). I think this is a major limitation, and I'm not sure the plugin can resolve it by itself (unless we use the EOS + Singleton Clustering Service hack to make it work).

Let me know your thoughts. Side question: does anybody know whether any enhancement is proposed in the md-sal project that could help solve this?

--
Thanks
Anil

Jozef Bacigál
Senior Software Engineer
Sídlo / Mlynské Nivy 56 / 821 05 Bratislava / Slovakia
R&D centrum / Janka Kráľa 9 / 974 01 Banská Bystrica / Slovakia
+421 908 766 972 / [email protected]
reception: +421 2 206 65 114 / www.pantheon.tech

_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
