Hi Anil,
Why is it happening in Issue 1?
"and then you disconnect the device from second controller and reconnect it,
ownership goes to third controller"
I found that when the device is disconnected from the second controller, the
device data in the data store is deleted. The FRM then deregisters the service
instance on the third controller, so the ownership goes to the first controller.
guo
------------------ Original Message ------------------
From: "Anil Vishnoi" <vishnoia...@gmail.com>
Sent: Thursday, February 16, 2017, 4:32 AM
To: "Jozef Bacigál" <jozef.baci...@pantheon.tech>
Cc: "openflowplugin-dev@lists.opendaylight.org" <openflowplugin-dev@lists.opendaylight.org>
Subject: Re: [openflowplugin-dev] Singleton Clustering issue
Hi Jozef,
I think this does not solve the issue. It only makes sure that the node is
deleted first and added afterwards, so that the user can see the node. But this
delete and add creates two data change notifications for the application and
gives the impression that the device was disconnected and connected back, which
is not really the case. I think the ideal solution, as you mentioned, is for the
clustering service to provide a notification saying that the device has no
owner, so that the data can be cleaned up. I think we should raise a bug with
the clustering team to provide this kind of API, so that we can use it for a
proper solution.
On Tue, Feb 14, 2017 at 12:54 AM, Jozef Bacigál <jozef.baci...@pantheon.tech>
wrote:
Hi Anil, guys,
I am facing the same issue you mentioned in Issue 2 with my single-layer
implementation. The plugin cannot know whether another controller is connected
to the switch, so the only solution, which is neither good nor fast and which I
am using right now, is to delete the node from the DS when we lose mastership
and HOPE that this happens sooner than the new master writes the new node into
the DS. The best solution would be to have the information whether this was the
last master in the cluster for the switch, and then and only then delete the
node from the DS. What I am trying right now is to hold the status until the
node is deleted from the DS and then send the ImmediateFuture back to the mdsal
singleton, so that the new master can be elected.
Anyway, it is a very bad implementation FOR the plugin from the singleton service.
Jozef
From: Anil Vishnoi [mailto:vishnoia...@gmail.com]
Sent: Tuesday, February 14, 2017 4:37 AM
To: Jozef Bacigál <jozef.baci...@pantheon.tech>; Abhijit Kumbhare
<abhijitk...@gmail.com>; Tomáš Slušný <tomas.slu...@pantheon.tech>; Shuva Jyoti
Kar <shuva.jyoti....@ericsson.com>; Luis Gomez <ece...@gmail.com>; Muthukumaran
K <muthukumara...@ericsson.com>
Cc: openflowplugin-dev@lists.opendaylight.org
Subject: Singleton Clustering issue
Hi Jozef/Tomas/Luis,
I was investigating Bug 7736 and came across a few issues in our clustering
implementation, as well as some limitations of singleton clustering.
Issue 1: Applications registering on data change notifications.
In the current implementation, when the plugin receives a connection from a
device, it registers itself as a service instance with the clustering singleton
service. After registering, it receives the notification to initialize the
instance. It then tries to set the master role on the device and writes the
device data to the data store. Forwarding-Rule-Manager (FRM) listens for data
store notifications, and whenever it sees that a node has been added to the data
store, it registers itself as a service instance for that node. Given that we
are using ClusteredDataTreeChangeListener, all the FRM instances get the
node-added notification from the data store, and all the cluster nodes end up
registering themselves as service instances on the same service identifier. So
even if the device is connected to only one controller, FRM registers itself on
all three nodes, which is not correct behavior. This bug can leave the
openflowplugin cluster almost unusable. We have seen a case where you connect
the device to two controllers, disconnect it from the first controller and
connect it back, and ownership goes to the second controller, where the device
is also connected. Then you disconnect the device from the second controller
and reconnect it, and ownership goes to the third controller. Given that
ownership of that service identity is now with controller 3, even if the device
connects back to controller 1 or 2, those controllers don't push the master
role down. This scenario can be triggered the moment your device disconnects
from any of the controllers.
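To make the registration problem concrete, here is a minimal sketch in plain
Java. It does not use the real OpenDaylight APIs: FrmInstance, ClusteredStore
and Issue1Demo are hypothetical stand-ins, and the only assumption modelled is
the one described above, namely that a clustered listener fires on every
cluster member regardless of which member wrote the node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the FRM running on one controller.
class FrmInstance {
    final String controller;
    final List<String> registeredDevices = new ArrayList<>();

    FrmInstance(String controller) {
        this.controller = controller;
    }

    // On "node added", register as a service instance for that device,
    // without knowing where the device is actually connected.
    void onNodeAdded(String deviceId) {
        registeredDevices.add(deviceId);
    }
}

// Hypothetical data store whose clustered listeners are notified on every
// cluster member, not only on the member that performed the write.
class ClusteredStore {
    private final List<FrmInstance> listeners = new ArrayList<>();

    void addClusteredListener(FrmInstance frm) {
        listeners.add(frm);
    }

    void writeNode(String deviceId) {
        for (FrmInstance frm : listeners) {
            frm.onNodeAdded(deviceId);
        }
    }
}

class Issue1Demo {
    // Returns the controllers whose FRM registered for the device.
    static List<String> candidatesAfterConnect(String deviceId) {
        ClusteredStore store = new ClusteredStore();
        List<FrmInstance> frms = List.of(
                new FrmInstance("controller1"),
                new FrmInstance("controller2"),
                new FrmInstance("controller3"));
        frms.forEach(store::addClusteredListener);

        // The device connects to controller1 only; controller1 writes the node.
        store.writeNode(deviceId);

        List<String> candidates = new ArrayList<>();
        for (FrmInstance frm : frms) {
            if (frm.registeredDevices.contains(deviceId)) {
                candidates.add(frm.controller);
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // prints [controller1, controller2, controller3]
        System.out.println(candidatesAfterConnect("openflow:1"));
    }
}
```

All three controllers end up as candidates for the same service identifier,
although the device is connected to only one of them.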
Now the problem is that applications have no way to find out whether the device
is connected to their host controller instance (unless we write some hardcoded
controller number/name in the data store for each device, recording where it is
connected). The only way I can see is through YANG notifications: the plugin
can send the nodeAdded/nodeRemoved notifications, and applications can register
themselves as service instances when they receive those events. That way we can
avoid the problem I mentioned above. I pushed a patch that does this, and it
resolves the issue.
https://git.opendaylight.org/gerrit/#/c/51489/
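The idea behind the notification-based approach can be sketched roughly as
follows. This is not the code from the patch: NodeEventListener, LocalFrm,
LocalPlugin and NotificationFixDemo are hypothetical names, and the sketch
assumes that a nodeAdded/nodeRemoved notification is delivered only on the
controller that published it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical listener for the plugin's nodeAdded/nodeRemoved events.
interface NodeEventListener {
    void onNodeAdded(String deviceId);
    void onNodeRemoved(String deviceId);
}

// Hypothetical FRM that registers only for devices reported by its local plugin.
class LocalFrm implements NodeEventListener {
    final Set<String> registered = new HashSet<>();

    @Override
    public void onNodeAdded(String deviceId) {
        registered.add(deviceId);
    }

    @Override
    public void onNodeRemoved(String deviceId) {
        registered.remove(deviceId);
    }
}

// Hypothetical per-controller plugin. Assumption: events reach only the
// listeners on the same controller, never the other cluster members.
class LocalPlugin {
    private final List<NodeEventListener> listeners = new ArrayList<>();

    void subscribe(NodeEventListener listener) {
        listeners.add(listener);
    }

    void deviceConnected(String deviceId) {
        listeners.forEach(l -> l.onNodeAdded(deviceId));
    }

    void deviceDisconnected(String deviceId) {
        listeners.forEach(l -> l.onNodeRemoved(deviceId));
    }
}

class NotificationFixDemo {
    // Maps each controller name to whether its FRM registered for the device.
    static Map<String, Boolean> registrationsAfterConnect(
            String deviceId, String... connectedTo) {
        Map<String, LocalPlugin> plugins = new LinkedHashMap<>();
        Map<String, LocalFrm> frms = new LinkedHashMap<>();
        for (String name : List.of("controller1", "controller2", "controller3")) {
            LocalPlugin plugin = new LocalPlugin();
            LocalFrm frm = new LocalFrm();
            plugin.subscribe(frm);
            plugins.put(name, plugin);
            frms.put(name, frm);
        }
        for (String name : connectedTo) {
            plugins.get(name).deviceConnected(deviceId);
        }
        Map<String, Boolean> result = new LinkedHashMap<>();
        frms.forEach((name, frm) -> result.put(name, frm.registered.contains(deviceId)));
        return result;
    }
}
```

Calling NotificationFixDemo.registrationsAfterConnect("openflow:1", "controller1")
marks only controller1 as registered, so only the controller that actually
holds the connection becomes a candidate.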
Issue 2: Data change notification every time a device disconnects from any
node in the cluster
In the current implementation, even if the device is connected to all three
controllers, the moment the device disconnects from one of them, applications
receive a data change notification in which the node data is removed and,
shortly after, another notification in which the node data is added. The
application thinks that the device just got disconnected from the controllers
and reconnected, but in reality the device is still connected to the remaining
two controllers. I think the reason is that the current implementation of the
singleton service doesn't send any notification to non-owner controllers about
the ownership of the device (e.g. isOwner=false, hasOwner=false,
wasOwner=false). I think because of this limitation we wrote the code so that
whenever closeServiceInstance() is called, the plugin removes the data from the
data store, and when the other controller gets instantiateServiceInstance(), it
puts the data back into the data store. That generates two events for the
application. Given that the device is connected to all the controllers, this
behavior is not correct. I can't think of any solution that fixes this unless
the singleton clustering service provides a specific notification about it to
the other controllers, so that those controllers can decide whether to clean up
the data or ignore it, given that one of them is still an owner of the device.
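The difference between the two behaviors can be sketched in plain Java. This
is only an illustration: RecordingStore and OwnershipDemo are hypothetical, the
store is a simplified model of the data store, and the hasOwner flag stands for
the notification that the singleton service does not provide today.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical data store that records the change events an application sees.
class RecordingStore {
    private final Set<String> nodes = new HashSet<>();
    final List<String> events = new ArrayList<>();

    void put(String deviceId) {
        if (nodes.add(deviceId)) {
            events.add("added:" + deviceId);
        }
    }

    void remove(String deviceId) {
        if (nodes.remove(deviceId)) {
            events.add("removed:" + deviceId);
        }
    }
}

class OwnershipDemo {
    // Behavior described above: the old owner always removes the node and the
    // new owner always writes it again.
    static List<String> currentBehaviour(String deviceId) {
        RecordingStore store = new RecordingStore();
        store.put(deviceId);      // controller1: instantiateServiceInstance()
        store.events.clear();     // only count events caused by the failover
        store.remove(deviceId);   // controller1: closeServiceInstance()
        store.put(deviceId);      // controller2: instantiateServiceInstance()
        return new ArrayList<>(store.events);
    }

    // Assumed behavior if a notification carrying hasOwner existed: the node is
    // removed only when no controller owns the device any more.
    static List<String> withOwnerNotification(String deviceId, boolean hasOwner) {
        RecordingStore store = new RecordingStore();
        store.put(deviceId);
        store.events.clear();
        if (!hasOwner) {
            store.remove(deviceId); // last owner gone: clean up
        }
        // hasOwner == true: another controller took over, leave the data alone
        return new ArrayList<>(store.events);
    }
}
```

In this model currentBehaviour("openflow:1") yields a removed event followed by
an added event, while withOwnerNotification("openflow:1", true) yields no
events at all, which is what an application should see when the device is
still connected to another controller.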
This same functional behavior can create another issue. If the device is
connected to only one controller in the cluster and the user kills that
controller, stale data is left in the data store, because the other controllers
won't be notified, given that they didn't register as service instances for
the service-group-id. I think this is a major limitation, and I am not sure the
plugin can resolve it by itself (unless we use an EOS + Singleton Clustering
Service hack to make it work).
Let me know your thoughts.
Side question: does anybody know whether any enhancement has been proposed in
the md-sal project that could help solve this issue?
--
Thanks
Anil