Sorry, I think I made a minor mistake: it's not a device disconnect, it's
the controller dying. If the controller dies, nobody cleans up the data
store, so FRM won't deregister, and in that case the FRM on the third
controller can get the ownership.
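
To make the scenario concrete, here is a rough toy model in Python. It is
not ODL code: the function name, the controller names (c1, c2, c3) and the
"every remaining candidate is eligible" election rule are all assumptions
made up for illustration. It only shows why FRM on the third controller
stays eligible for ownership after the second controller dies.

```python
# Toy model only; NOT the ODL singleton service API.
# eligible_owners and the c1/c2/c3 names are hypothetical.

def eligible_owners(candidates, dead):
    """Controllers that may still be elected once the dead ones drop out."""
    return [c for c in candidates if c not in dead]

# Assumption: the device is connected to c1 and c2 only.
connected = {"c1", "c2"}

# The node is in the data store and the clustered listener fires on every
# member, so FRM has registered as a candidate on all three controllers.
frm_candidates = ["c1", "c2", "c3"]

# c2 dies. Nobody removes the node from the data store, so FRM on c3
# never deregisters and remains a candidate.
remaining = eligible_owners(frm_candidates, dead={"c2"})

# Candidates that could win ownership without having the device connected.
wrong_owners = [c for c in remaining if c not in connected]

print(remaining)     # ['c1', 'c3']
print(wrong_owners)  # ['c3']
```

If c3 wins that election, the owner of the service identifier is a
controller the device is not connected to, which is the case I meant.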

On Thu, Feb 16, 2017 at 12:32 AM, guo <[email protected]> wrote:

> Hi Anil,
>
> Why is it happening in Issue 1?
> *"and then you disconnect the device from second controller and reconnect
> it, ownership goes to third controller"*
>
> I found that when the device is disconnected from the second controller,
> the device data in the data store is deleted. So the FRM will deregister
> the service instance on the third controller, and the ownership goes to
> the first controller.
>
> guo
>
> ------------------ Original Message ------------------
> *From:* "Anil Vishnoi";<[email protected]>;
> *Sent:* Thursday, February 16, 2017, 4:32 AM
> *To:* "Jozef Bacigál"<[email protected]>;
> *Cc:* "[email protected]"<openflowplug
> [email protected]>;
> *Subject:* Re: [openflowplugin-dev] Singleton Clustering issue
>
> Hi Jozef,
>
> I don't think this solves the issue. It just makes sure that the node is
> deleted first and added after that, so that the user can see the node. But
> this delete and add will create two data change notifications for the
> application and give the impression that the device was disconnected and
> connected back, which is not really the case. I think the ideal solution,
> as you mentioned, is for the clustering service to provide a notification
> saying the device has no owner, so that it can clean up. I think we should
> raise a bug with the clustering team to provide this kind of API, so that
> we can use it to give a proper solution.
>
> On Tue, Feb 14, 2017 at 12:54 AM, Jozef Bacigál <
> [email protected]> wrote:
>
>> Hi Anil, guys
>>
>>
>>
>> I am facing the same issue you mentioned in Issue 2 with my single-layer
>> implementation. The plugin is not able to know whether another controller
>> is connected to the switch, so the only solution (not a good one, and
>> slow, but the one I am using right now) is that if we lose mastership we
>> delete the node from DS and HOPE that this happens sooner than the new
>> master writes the new node into DS. The best solution would be to have
>> the information whether this was the last master in the cluster for the
>> switch, and then and only then delete the node from DS. What I am trying
>> right now is to hold the status until the node is deleted from DS and then
>> send the ImmediateFuture back to the mdsal singleton, so the new master
>> can be elected.
>>
>>
>>
>> Anyway, it is a very bad implementation FOR the plugin from the singleton
>> service.
>>
>>
>>
>> Jozef
>>
>>
>>
>> *From:* Anil Vishnoi [mailto:[email protected]]
>> *Sent:* Tuesday, February 14, 2017 4:37 AM
>> *To:* Jozef Bacigál <[email protected]>; Abhijit Kumbhare <
>> [email protected]>; Tomáš Slušný <[email protected]>; Shuva
>> Jyoti Kar <[email protected]>; Luis Gomez <[email protected]>;
>> Muthukumaran K <[email protected]>
>> *Cc:* [email protected]
>> *Subject:* Singleton Clustering issue
>>
>>
>>
>> Hi Jozef/Tomas/Luis,
>>
>>
>>
>> I was investigating Bug 7736
>> <https://bugs.opendaylight.org/show_bug.cgi?id=7736> and came across a few
>> issues in our clustering implementation and also some limitations with
>> singleton clustering.
>>
>>
>>
>> Issue 1: Registering applications on data change notification.
>>
>> In the current implementation, when the plugin receives a connection from
>> a device, it registers itself as a service instance with the clustering
>> singleton service. After registering with the clustering service, it
>> receives the notification to initialize the instance. It then tries to set
>> the master role on the device and writes the device data to the data
>> store. Forwarding-Rule-Manager (FRM) listens for data store notifications
>> and, whenever it sees that a node is added to the data store, it registers
>> itself as a service instance for that node. Given that we are using
>> ClusteredDataTreeChangeListener, all the FRM instances get the node-added
>> notification from the data store, and all the cluster nodes end up
>> registering themselves as service instances on the same service
>> identifier. So even if the device is connected to only one controller, FRM
>> registers itself on all three nodes, which is not correct behavior. This
>> bug can leave the openflowplugin cluster almost unusable. We have seen an
>> issue where you connect the device to two controllers, then disconnect
>> the device from the first controller and connect it back, and ownership
>> goes to the second controller, where the device is also connected. Then
>> you disconnect the device from the second controller and reconnect it, and
>> ownership goes to the third controller. Given that ownership for that
>> service identity is now with controller 3, even if the device connects
>> back to controller 1/2, those controllers don't push the master role down.
>> And this scenario can be triggered the moment your device disconnects from
>> any of the controllers.
>>
>>
>>
>> Now the problem is that for applications there is no way to find out
>> whether the device is connected to its host controller instance (unless
>> we write some hardcoded controller number/name in the data store for each
>> device where it's connected). The only way I can see is through YANG
>> notifications, where the plugin can send the nodeAdded/nodeRemoved
>> notification and applications can register themselves as service
>> instances if they receive those events. That way we can avoid the problem
>> I mentioned above. I pushed a patch that does the same thing and it
>> resolves this issue.
>>
>>
>>
>> https://git.opendaylight.org/gerrit/#/c/51489/
>>
>>
>>
>> Issue 2: Data change notification every time the device disconnects from
>> any of the controllers in the cluster
>>
>>
>>
>> In the current implementation we see that even if the device is connected
>> to all three controllers, the moment the device disconnects from one of
>> the controllers, applications receive a data change notification where
>> the node data is removed and, shortly after, another notification with the
>> node data added. The application thinks that the device just got
>> disconnected from the controllers and reconnected, but in reality the
>> device is still connected to the remaining two controllers. I think the
>> reason behind this is that the current implementation of the singleton
>> service doesn't send any notification to non-owner controllers about the
>> ownership of the device (e.g. isOwner=false, hasOwner=false,
>> wasOwner=false). I think because of this limitation we wrote the code in
>> a way that whenever closeServiceInstance() is called, the plugin removes
>> the data from the data store, and when the other controller gets
>> instantiateServiceInstance() it puts the data back into the data store.
>> That actually generates two events for the application. Given that the
>> device is connected to all the controllers, this behavior is not correct.
>> I can't think of any solution that can fix that unless the singleton
>> clustering service provides a specific notification about it to the other
>> controllers, so that those controllers can decide whether they want to
>> clean up the data or ignore it, given that one of them is still an owner
>> of the device.
>>
>>
>>
>> This same functional behavior can create another issue. If the device is
>> connected to only one controller in the cluster and the user kills that
>> controller, it would leave stale data in the data store, because the other
>> controllers won't be notified, given that they didn't register as a
>> service instance for the service-group-id. I think this is a major
>> limitation, and I am not sure the plugin can resolve it by itself (unless
>> we use an EOS + Singleton Clustering Service hack to make it work).
>>
>>
>>
>> Let me know your thoughts.
>>
>>
>>
>> Side question: does anybody know if any enhancement is proposed in the
>> md-sal project that can help solve this issue?
>>
>>
>>
>> --
>>
>> Thanks
>>
>> Anil
>>
>>
>>
>> Jozef Bacigál
>>
>> Senior Software Engineer
>>
>>
>> Sídlo / Mlynské Nivy 56 / 821 05 Bratislava / Slovakia
>> R&D centrum / Janka Kráľa 9 /  974 01 Banská Bystrica / Slovakia
>> +421 908 766 972 / [email protected]
>> reception: +421 2 206 65 114 / www.pantheon.tech
>>
>>
>>
>
>
>
> --
> Thanks
> Anil
>



-- 
Thanks
Anil
_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
