I opened a bug against md-sal for the enhancement: https://bugs.opendaylight.org/show_bug.cgi?id=7820
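
To make the request concrete, here is a rough sketch of how a plugin could react if every candidate, owner or not, were told about ownership changes. Nothing in it is an existing md-sal API: the OwnershipChange type, the Action enum and the decide() method are hypothetical names, and the three flags are simply the ones mentioned further down in this thread (wasOwner, isOwner, hasOwner).

```java
/**
 * Sketch only: none of these types exist in md-sal. They model the kind of
 * ownership notification discussed in this thread.
 */
final class OwnershipCleanupSketch {

    /** Hypothetical event delivered to every candidate, owner or not. */
    static final class OwnershipChange {
        final boolean wasOwner; // this controller owned the device before the change
        final boolean isOwner;  // this controller owns the device after the change
        final boolean hasOwner; // some controller in the cluster owns the device

        OwnershipChange(boolean wasOwner, boolean isOwner, boolean hasOwner) {
            this.wasOwner = wasOwner;
            this.isOwner = isOwner;
            this.hasOwner = hasOwner;
        }
    }

    /** What the plugin would do with the node in the data store. */
    enum Action { WRITE_NODE, REMOVE_NODE, KEEP_NODE }

    static Action decide(OwnershipChange change) {
        if (!change.hasOwner) {
            // Nobody owns the device any more (for example the only connected
            // controller died), so the node data is stale and can be removed.
            return Action.REMOVE_NODE;
        }
        if (change.isOwner && !change.wasOwner) {
            // This controller just became the owner: publish the node.
            return Action.WRITE_NODE;
        }
        // Another controller is (or we still are) the owner: leave the data
        // alone, so applications do not see a remove followed by an add.
        return Action.KEEP_NODE;
    }

    public static void main(String[] args) {
        // Mastership moved from this controller to a peer.
        System.out.println(decide(new OwnershipChange(true, false, true)));
        // The owner went away and nobody took over.
        System.out.println(decide(new OwnershipChange(false, false, false)));
        // This controller just became the owner.
        System.out.println(decide(new OwnershipChange(false, true, true)));
    }
}
```

With something along these lines, losing mastership while another controller takes over would not touch the data store, and a controller that never owned the device could still clean up after a dead owner. Several controllers may see hasOwner=false at once, so the removal would have to be safe to repeat.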

On Fri, Feb 17, 2017 at 5:47 PM, Anil Vishnoi <[email protected]> wrote:

> Sorry, I think I made a minor mistake: it's not a device disconnect,
> it's the controller dying. If the controller dies, nobody cleans up the
> data store, so FRM won't deregister, and in that case the FRM on the
> third controller can get the ownership.
>
> On Thu, Feb 16, 2017 at 12:32 AM, guo <[email protected]> wrote:
>
>> Hi Anil,
>>
>> Why is it happening in Issue 1?
>> *"and then you disconnect the device from second controller and
>> reconnect it, ownership goes to third controller"*
>>
>> I found that when the device is disconnected from the second
>> controller, the device data in the data store is deleted. So the FRM
>> deregisters the service instance on the third controller, and the
>> ownership goes to the first controller.
>>
>> guo
>>
>> ------------------ Original Message ------------------
>> *From:* "Anil Vishnoi" <[email protected]>
>> *Sent:* Thursday, February 16, 2017, 4:32 AM
>> *To:* "Jozef Bacigál" <[email protected]>
>> *Cc:* "[email protected]" <openflowplug[email protected]>
>> *Subject:* Re: [openflowplugin-dev] Singleton Clustering issue
>>
>> Hi Jozef,
>>
>> I think this does not solve the issue. It makes sure that the node is
>> deleted first and added afterwards, so that the user can see the node.
>> But this delete and add creates two data change notifications for the
>> application and gives the impression that the device was disconnected
>> and connected back, which is not really the case. I think the ideal
>> solution, as you mentioned, is for the clustering service to provide a
>> notification saying the device has no owner, so that it can clean up.
>> I think we should raise a bug with the clustering team to provide this
>> kind of API, so that we can use it to give a proper solution.
>>
>> On Tue, Feb 14, 2017 at 12:54 AM, Jozef Bacigál
>> <[email protected]> wrote:
>>
>>> Hi Anil, guys,
>>>
>>> I am facing the same issue you mentioned in Issue 2 with my
>>> single-layer implementation. The plugin is not able to know whether
>>> another controller is connected to the switch, so the only solution,
>>> which is not good and is even slow (and which I am using right now),
>>> is that when we lose mastership we delete the node from the DS and
>>> HOPE that this happens sooner than the new master writes the new node
>>> into the DS. The best solution would be to have the information
>>> whether this was the last master in the cluster for the switch, and
>>> then and only then delete the node from the DS. What I am trying
>>> right now is to hold the status until the node is deleted from the DS
>>> and then send the ImmediateFuture back to the mdsal singleton, so
>>> that the new master can be elected.
>>>
>>> Anyway, it is a very bad implementation FOR the plugin from the
>>> singleton service.
>>>
>>> Jozef
>>>
>>> *From:* Anil Vishnoi [mailto:[email protected]]
>>> *Sent:* Tuesday, February 14, 2017 4:37 AM
>>> *To:* Jozef Bacigál <[email protected]>; Abhijit Kumbhare
>>> <[email protected]>; Tomáš Slušný <[email protected]>;
>>> Shuva Jyoti Kar <[email protected]>; Luis Gomez
>>> <[email protected]>; Muthukumaran K <[email protected]>
>>> *Cc:* [email protected]
>>> *Subject:* Singleton Clustering issue
>>>
>>> Hi Jozef/Tomas/Luis,
>>>
>>> I was investigating Bug 7736
>>> <https://bugs.opendaylight.org/show_bug.cgi?id=7736> and came across
>>> a few issues in our clustering implementation, and also some
>>> limitations of singleton clustering.
>>>
>>> Issue 1: Registering applications on data change notification.
>>>
>>> In the current implementation, when the plugin receives a connection
>>> from a device, it registers itself as a service instance with the
>>> clustering singleton service. After registering with the clustering
>>> service, it receives the notification to initialize the instance. It
>>> then tries to set the master role on the device and writes the device
>>> data to the data store. Forwarding-Rule-Manager (FRM) listens for
>>> data store notifications, and whenever it sees that a node is added
>>> to the data store, it registers itself as a service instance for that
>>> node. Given that we are using ClusteredDataTreeChangeListener, all
>>> the FRM instances get the node-added notification from the data
>>> store, and all the cluster nodes end up registering themselves as a
>>> service instance on the same service identifier. So even if the
>>> device is connected to only one controller, FRM registers itself on
>>> all three nodes, which is not correct behavior. This bug can make an
>>> openflowplugin cluster almost unusable. We have seen an issue where,
>>> if you connect the device to two controllers, disconnect it from the
>>> first controller and connect it back, ownership goes to the second
>>> controller, where the device is also connected. If you then
>>> disconnect the device from the second controller and reconnect it,
>>> ownership goes to the third controller. Given that ownership for that
>>> service identity is now with controller 3, even if the device
>>> connects back to controller 1 or 2, those controllers don't push the
>>> master role down. And this scenario can be triggered the moment your
>>> device disconnects from any of the controllers.
>>>
>>> Now the problem is that applications have no way to find out whether
>>> the device is connected to their host controller instance (unless we
>>> write some hardcoded controller number/name in the data store for
>>> each device, saying where it's connected). The only way I can see is
>>> through YANG notifications, where the plugin sends the
>>> nodeAdded/nodeRemoved notifications and applications register
>>> themselves as a service instance when they receive those events. That
>>> way we can avoid the problem I mentioned above. I pushed a patch that
>>> does this, and it resolves the issue.
>>>
>>> https://git.opendaylight.org/gerrit/#/c/51489/
>>>
>>> Issue 2: Data change notification every time a device disconnects
>>> from any of the nodes in the cluster.
>>>
>>> In the current implementation we see that even if the device is
>>> connected to all three controllers, the moment the device disconnects
>>> from one of them, applications receive a data change notification in
>>> which the node data is removed, and shortly after another
>>> notification with the node data added. The application thinks that
>>> the device just got disconnected from the controllers and reconnected
>>> back, but in reality the device is still connected to the remaining
>>> two controllers. I think the reason behind this is that the current
>>> implementation of the singleton service doesn't send any notification
>>> to non-owner controllers about the ownership of the device (e.g.
>>> isOwner=false, hasOwner=false, wasOwner=false). I think because of
>>> this limitation we wrote the code in such a way that whenever
>>> closeServiceInstance() is called the plugin removes the data from the
>>> data store, and when the other controller gets
>>> instantiateServiceInstance() it puts the data back into the data
>>> store. That actually generates two events for the application. Given
>>> that the device is connected to all the controllers, this behavior is
>>> not correct. I can't think of any solution that can fix that, unless
>>> the singleton clustering service provides a specific notification
>>> about it to the other controllers, so that those controllers can
>>> decide whether they want to clean up the data or ignore it, given
>>> that one of them is still an owner of the device.
>>>
>>> This same functional behavior can create another issue. If the device
>>> is connected to only one controller in the cluster and the user kills
>>> that controller, it leaves stale data in the data store, because the
>>> other controllers won't be notified, given that they didn't register
>>> as a service instance for the service-group-id. I think this is a
>>> major limitation, and I'm not sure the plugin can resolve it by
>>> itself (unless we use an EOS + Singleton Clustering Service hack to
>>> make it work).
>>>
>>> Let me know your thoughts.
>>>
>>> Side question: does anybody know if any enhancement is proposed in
>>> the md-sal project that can help solve this issue?
>>>
>>> --
>>> Thanks
>>> Anil
>>>
>>> Jozef Bacigál
>>> Senior Software Engineer
>>>
>>> Headquarters / Mlynské Nivy 56 / 821 05 Bratislava / Slovakia
>>> R&D center / Janka Kráľa 9 / 974 01 Banská Bystrica / Slovakia
>>> +421 908 766 972 / [email protected]
>>> reception: +421 2 206 65 114 / www.pantheon.tech
>>
>> --
>> Thanks
>> Anil
>
> --
> Thanks
> Anil

--
Thanks
Anil
_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
