Sorry, I think I made a minor mistake: it's not a device disconnect, it's the controller dying. If the controller dies, nobody cleans up the data store, so FRM won't deregister, and in that case the FRM on the third controller can get the ownership.
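To make the controller-dies case concrete, here is a toy Python model, not plugin code. The function names and the c1/c2/c3 labels are made up for illustration, and the real singleton service decides ownership in its own way; the sketch only shows which controllers are eligible to become owner once c2 dies.

```python
def frm_candidates(controllers, connected_to, clustered_listener):
    """Controllers whose FRM registers as a service instance for the device.

    With a clustered listener every controller sees the node-added
    notification, so every controller registers. Otherwise only the
    controllers that actually hold a connection register.
    """
    if clustered_listener:
        return set(controllers)
    return set(controllers) & set(connected_to)


def possible_owners_after_death(candidates, dead):
    """Candidates still eligible for ownership once one controller dies."""
    return candidates - {dead}


controllers = ["c1", "c2", "c3"]
connected_to = ["c1", "c2"]  # assumption: the device never connected to c3

clustered = possible_owners_after_death(
    frm_candidates(controllers, connected_to, clustered_listener=True), "c2")
local = possible_owners_after_death(
    frm_candidates(controllers, connected_to, clustered_listener=False), "c2")

print(sorted(clustered))  # ['c1', 'c3']
print(sorted(local))      # ['c1']
```

With registration driven by the clustered notification, c3 stays a candidate even though the device never connected to it. With registration limited to connected controllers, only c1 remains.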
On Thu, Feb 16, 2017 at 12:32 AM, guo <[email protected]> wrote:

> Hi Anil,
>
> Why is it happening in Issue 1?
> *"and then you disconnect the device from second controller and reconnect it, ownership goes to third controller"*
>
> I found that when the device is disconnected from the second controller, the device data in the data store is deleted. The FRM then deregisters the service instance on the third controller, so the ownership goes to the first controller.
>
> guo
>
> ------------------ Original Message ------------------
> *From:* "Anil Vishnoi";<[email protected]>;
> *Sent:* Thursday, February 16, 2017, 4:32 AM
> *To:* "Jozef Bacigál"<[email protected]>;
> *Cc:* "[email protected]"<openflowplug [email protected]>;
> *Subject:* Re: [openflowplugin-dev] Singleton Clustering issue
>
> Hi Jozef,
>
> I think this does not solve the issue. It makes sure that the node is deleted first and added after that, so that the user can see the node. But this delete and add will create two data change notifications for the application and will give the impression that the device was disconnected and connected back, which is not really the case. I think the ideal solution, as you mentioned, is for the clustering service to provide a notification saying the device has no owner, so that the plugin can clean up. I think we should raise a bug to the clustering team to provide this kind of API, so that we can use it to give a proper solution.
>
> On Tue, Feb 14, 2017 at 12:54 AM, Jozef Bacigál <[email protected]> wrote:
>
>> Hi Anil, guys
>>
>> I am facing the same issue you mentioned in Issue 2 with my single layer implementation. The plugin is not able to know whether another controller is connected to the switch, so the only solution (not a good one, and slow; it is what I am using right now) is that when we lose mastership we delete the node from DS and HOPE that this happens sooner than the new master writes the new node into DS.
>> The best solution would be to have the information whether this was the last master in the cluster for the switch, and then and only then delete the node from DS. What I am trying right now is to hold the status until the node is deleted from DS and then send the ImmediateFuture back to the mdsal singleton, so the new master can be elected.
>>
>> Anyway, it is a very bad implementation FOR the plugin from the singleton service.
>>
>> Jozef
>>
>> *From:* Anil Vishnoi [mailto:[email protected]]
>> *Sent:* Tuesday, February 14, 2017 4:37 AM
>> *To:* Jozef Bacigál <[email protected]>; Abhijit Kumbhare <[email protected]>; Tomáš Slušný <[email protected]>; Shuva Jyoti Kar <[email protected]>; Luis Gomez <[email protected]>; Muthukumaran K <[email protected]>
>> *Cc:* [email protected]
>> *Subject:* Singleton Clustering issue
>>
>> Hi Jozef/Tomas/Luis,
>>
>> I was investigating Bug 7736 <https://bugs.opendaylight.org/show_bug.cgi?id=7736> and came across a few issues in our clustering implementation, and also some limitations of singleton clustering.
>>
>> Issue 1: Registering applications on data change notification.
>>
>> In the current implementation, when the plugin receives a connection from a device, it registers itself as a service instance with the clustering singleton service. After registering with the clustering service, it receives the notification to initialize the instance. It then tries to set the master role on the device and writes the device data to the data store. Forwarding-Rule-Manager listens for data store notifications, and whenever it sees that a node is added to the data store, it registers itself as a service instance for that node. Given that we are using ClusteredDataTreeChangeListener, all the FRM instances get the node-added notification from the data store, and all the cluster nodes end up registering themselves as a service instance on the same service identifier.
>> So even if the device is connected to only one controller, FRM registers itself on all three nodes, which is not correct behavior. This bug can leave the openflowplugin cluster almost unusable. We have seen an issue where, if you connect the device to two controllers and disconnect the device from the first controller and connect it back, ownership goes to the second controller where the device is also connected, and then you disconnect the device from second controller and reconnect it, ownership goes to third controller. Given that ownership for that service identity is now with controller 3, even if the device connects back to controller 1/2, those controllers don't push the master role down. This scenario can be triggered the moment your device disconnects from any of the controllers.
>>
>> Now the problem is that applications have no way to find out whether the device is connected to their host controller instance (unless we write some hardcoded controller number/name in the data store for each device, recording where it is connected). The only way I can see is through yang notifications, where the plugin sends nodeAdded/nodeRemoved notifications and applications register themselves as a service instance if they receive those events. That way we can avoid the problem I mentioned above. I pushed a patch that does this, and it resolves the issue.
>>
>> https://git.opendaylight.org/gerrit/#/c/51489/
>>
>> Issue 2: Data change notification every time the device disconnects from any of the nodes in the cluster
>>
>> In the current implementation we see that even if the device is connected to all three controllers, the moment the device disconnects from one of the controllers, applications receive a data change notification where the node data is removed, and shortly after another notification with the node data added.
>> The application thinks that the device just got disconnected from the controllers and reconnected, but in reality the device is still connected to the remaining two controllers. I think the reason behind this is that the current implementation of the singleton service doesn't send any notification to non-owner controllers about the ownership of the device (e.g. isOwner=false, hasOwner=false, wasOwner=false). I think because of this limitation we wrote the code in such a way that whenever closeServiceInstance() is called the plugin removes the data from the data store, and when the other controller gets instantiateServiceInstance() it puts the data back into the data store. That generates two events for the application. Given that the device is connected to all the controllers, this behavior is not correct. I can't think of any solution that can fix this unless the singleton clustering service provides a specific notification about it to the other controllers, so that those controllers can decide whether they want to clean up the data or ignore it, given that one of them is still an owner of the device.
>>
>> This same functional behavior can create another issue. If the device is connected to only one controller in the cluster and the user kills that controller, it leaves stale data in the data store, because the other controllers won't be notified, given that they didn't register as a service instance for the service-group-id. I think this is a major limitation, and I am not sure the plugin can resolve it by itself (unless we use an EOS + Singleton Clustering Service hack to make it work).
>>
>> Let me know your thoughts.
>>
>> Side question: does anybody know if any enhancement is proposed in the md-sal project that can help solve this issue?
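A rough sketch of the Issue 2 event churn described above, as a toy Python model rather than plugin code. The function name and the has_no_owner_signal flag are hypothetical; the flag stands in for the "device has no owner" notification that the singleton service does not provide today.

```python
def events_on_master_loss(remaining_connected, has_no_owner_signal):
    """Data change events an application sees when the current master loses the device."""
    events = []
    if not has_no_owner_signal:
        # Behaviour described in the thread: the old master deletes the node
        # on closeServiceInstance(), and the new master (if there is one)
        # writes it back on instantiateServiceInstance().
        events.append("node-removed")
        if remaining_connected:
            events.append("node-added")
    elif not remaining_connected:
        # Hypothetical: clean up only when told that nobody owns the device.
        events.append("node-removed")
    return events


print(events_on_master_loss(["c2", "c3"], has_no_owner_signal=False))
print(events_on_master_loss(["c2", "c3"], has_no_owner_signal=True))
print(events_on_master_loss([], has_no_owner_signal=True))
```

The first call models today's remove-then-add pair while two controllers are still connected. The second shows no events on a plain handover, and the third shows a single removal once no controller is left.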
>>
>> --
>> Thanks
>> Anil
>>
>> Jozef Bacigál
>> Senior Software Engineer
>>
>> Headquarters / Mlynské Nivy 56 / 821 05 Bratislava / Slovakia
>> R&D center / Janka Kráľa 9 / 974 01 Banská Bystrica / Slovakia
>> +421 908 766 972 / [email protected]
>> reception: +421 2 206 65 114 / www.pantheon.tech
>
> --
> Thanks
> Anil

--
Thanks
Anil
_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
