Hi Anil, guys,

I am facing the same problem you describe in Issue 2 with my single-layer
implementation. The plugin has no way of knowing whether another controller is
connected to the switch. The only solution I have (and am using right now),
which is neither good nor fast, is to delete the node from the DS when we lose
mastership and HOPE that this happens before the new master writes the new
node into the DS. The best solution would be to know whether this was the last
master in the cluster for the switch, and then and only then delete the node
from the DS. What I am trying right now is to hold the status until the node
has been deleted from the DS and only then return the ImmediateFuture to the
mdsal singleton, so that the new master can be elected.
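
To make the "last master" idea concrete, here is a minimal sketch in Python. It
is a toy model, not OpenDaylight code: ToyDataStore, close_service_instance and
the connected_controllers argument are hypothetical names, and the set of
connected controllers is exactly the information the plugin does not have
today.

```python
# Toy model only: none of these names are OpenDaylight APIs.
class ToyDataStore:
    """Stands in for the operational DS; holds node ids."""

    def __init__(self):
        self.nodes = set()

    def put(self, node_id):
        self.nodes.add(node_id)

    def delete(self, node_id):
        self.nodes.discard(node_id)


def close_service_instance(ds, node_id, controller, connected_controllers):
    """Hypothetical close hook for the controller that loses mastership.

    connected_controllers is ASSUMED to be known; the plugin currently
    has no way to obtain it.
    """
    remaining = set(connected_controllers) - {controller}
    if not remaining:
        # This was the last master for the switch: safe to delete.
        ds.delete(node_id)
        return "deleted"
    # Another controller can take over: leave the node in the DS.
    return "kept"
```

With such information the node would stay in the DS while any other controller
is still connected, and no race with the new master would be needed.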

Anyway, the singleton service implementation is a very bad fit FOR the plugin.

Jozef

From: Anil Vishnoi [mailto:[email protected]]
Sent: Tuesday, February 14, 2017 4:37 AM
To: Jozef Bacigál <[email protected]>; Abhijit Kumbhare 
<[email protected]>; Tomáš Slušný <[email protected]>; Shuva Jyoti 
Kar <[email protected]>; Luis Gomez <[email protected]>; Muthukumaran 
K <[email protected]>
Cc: [email protected]
Subject: Singleton Clustering issue

Hi Jozef/Tomas/Luis,

I was investigating Bug 7736
<https://bugs.opendaylight.org/show_bug.cgi?id=7736> and came across a few
issues in our clustering implementation, as well as some limitations of
singleton clustering.

Issue 1: Registering applications on data change notification
In the current implementation, when the plugin receives a connection from a
device, it registers itself as a service instance with the clustering
singleton service. After registering, it receives the notification to
initialize the instance. It then tries to set the master role on the device
and writes the device data to the data store. Forwarding-Rule-Manager (FRM)
listens for data store notifications, and whenever it sees that a node has
been added to the data store, it registers itself as a service instance for
that node. Given that we are using ClusteredDataTreeChangeListener, all the
FRM instances get the node-added notification from the data store, and all the
cluster nodes end up registering themselves as service instances on the same
service identifier. So even if the device is connected to only one controller,
FRM registers itself on all three nodes, which is not correct behavior. This
bug can leave the openflowplugin cluster almost unusable. We have seen the
following: connect the device to two controllers, then disconnect it from the
first controller and connect it back; ownership goes to the second controller,
where the device is also connected. Then disconnect the device from the second
controller and reconnect it; ownership goes to the third controller. But now
that ownership of that service identity is with controller 3, even if the
device connects back to controller 1 or 2, those controllers don't push the
master role down. This scenario can be triggered the moment your device
disconnects from any of the controllers.
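
The scenario above can be replayed with a small toy model in Python. This is
only a sketch under stated assumptions: ToyElection and replay_issue_1 are
hypothetical names, and the first-come-first-served election order is an
assumption made for illustration, not how the ownership service necessarily
picks an owner.

```python
# Toy model only: none of these names are OpenDaylight APIs.
class ToyElection:
    """Owner election for ONE service identifier.

    ASSUMPTION: the oldest registered candidate is the owner (FIFO).
    """

    def __init__(self):
        self.candidates = []

    def register(self, controller):
        if controller not in self.candidates:
            self.candidates.append(controller)

    def unregister(self, controller):
        if controller in self.candidates:
            self.candidates.remove(controller)

    @property
    def owner(self):
        return self.candidates[0] if self.candidates else None


def replay_issue_1():
    election = ToyElection()
    connected = {"c1", "c2"}  # the device is connected to c1 and c2 only

    # Clustered listener: every cluster member sees "node added" and
    # registers, whether or not the device is connected to it.
    for controller in ("c1", "c2", "c3"):
        election.register(controller)

    owners = [election.owner]
    for controller in ("c1", "c2"):
        # The device disconnects from this controller ...
        connected.discard(controller)
        election.unregister(controller)
        owners.append(election.owner)
        # ... and reconnects, so the controller registers again.
        connected.add(controller)
        election.register(controller)

    # Ownership history, final owner, and whether the final owner
    # actually holds a connection to the device.
    return owners, election.owner, election.owner in connected
```

In this model ownership moves c1, c2, c3 and ends on a controller that has no
connection to the device, so nobody pushes the master role down.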

Now the problem is that applications have no way to find out whether the
device is connected to their host controller instance (unless we write some
hardcoded controller number/name in the data store for each device, recording
where it is connected). The only way I can see is through YANG notifications:
the plugin can send nodeAdded/nodeRemoved notifications, and applications can
register themselves as service instances when they receive those events. That
way we can avoid the problem I mentioned above. I pushed a patch that does
this, and it resolves the issue.

https://git.opendaylight.org/gerrit/#/c/51489/
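
The idea behind the notification-based approach can be sketched as follows, in
Python. This is a toy model and does not reflect the contents of the patch:
ToyController, connect and candidates_for are hypothetical names. The one
assumption that matters is that a nodeAdded notification is delivered only on
the cluster member that holds the device connection.

```python
# Toy model only: none of these names are OpenDaylight APIs.
class ToyController:
    def __init__(self, name):
        self.name = name
        self.registered = set()  # node ids this member is a candidate for

    def on_node_added(self, node_id):
        # The application registers as a service instance for the node.
        self.registered.add(node_id)

    def on_node_removed(self, node_id):
        self.registered.discard(node_id)


def connect(node_id, controller):
    # ASSUMPTION: the notification is local to the member that holds
    # the connection, unlike a clustered data change listener.
    controller.on_node_added(node_id)


def candidates_for(node_id, controllers):
    """Names of the members registered as candidates for node_id."""
    return sorted(c.name for c in controllers if node_id in c.registered)
```

Only members the device is really connected to become candidates, so ownership
can never land on a controller without a connection.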

Issue 2: Data change notification every time a node disconnects from any of
the nodes in the cluster

In the current implementation we see that even if the device is connected to
all three controllers, the moment the device disconnects from one of them,
applications receive a data change notification in which the node data is
removed, followed shortly by another notification in which the node data is
added. The application thinks that the device just disconnected from the
controllers and reconnected, but in reality the device is still connected to
the remaining two controllers. I think the reason is that the current
implementation of the singleton service doesn't send any notification to
non-owner controllers about the ownership of the device (e.g. isOwner=false,
hasOwner=false, wasOwner=false). I think that because of this limitation we
wrote the code so that whenever closeServiceInstance() is called the plugin
removes the data from the data store, and when another controller gets
instantiateServiceInstance() it puts the data back into the data store. That
generates two events for the application. Given that the device is connected
to all the controllers, this behavior is not correct. I can't think of any
solution that fixes this unless the singleton clustering service provides a
specific notification about it to the other controllers, so that those
controllers can decide whether to clean up the data or ignore it, given that
one of them is still an owner of the device.
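
The two events described above can be reproduced with a toy model in Python.
It is a sketch, not plugin code: ToyDataStore, ToyPlugin and handover are
hypothetical names, and the two methods only mirror the names of the
closeServiceInstance() and instantiateServiceInstance() callbacks.

```python
# Toy model only: none of these names are OpenDaylight APIs.
class ToyDataStore:
    """Records every change as an event an application would see."""

    def __init__(self):
        self.nodes = set()
        self.events = []

    def write(self, node_id):
        self.nodes.add(node_id)
        self.events.append(("added", node_id))

    def remove(self, node_id):
        self.nodes.discard(node_id)
        self.events.append(("removed", node_id))


class ToyPlugin:
    def __init__(self, name, ds):
        self.name = name
        self.ds = ds

    def instantiate_service_instance(self, node_id):
        # The new owner puts the node data back into the data store.
        self.ds.write(node_id)

    def close_service_instance(self, node_id):
        # The old owner removes the node data from the data store.
        self.ds.remove(node_id)


def handover(old_owner, new_owner, node_id):
    """Ownership moves while the device stays connected elsewhere."""
    old_owner.close_service_instance(node_id)
    new_owner.instantiate_service_instance(node_id)
```

A single handover produces a "removed" event followed by an "added" event,
although the device never left the cluster.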

This same behavior can create another issue. If the device is connected to
only one controller in the cluster and the user kills that controller, stale
data is left in the data store, because the other controllers won't be
notified, given that they didn't register as a service instance for the
service-group-id. I think this is a major limitation, and I am not sure the
plugin can resolve it by itself (unless we use an EOS + Singleton Clustering
Service hack to make it work).

Let me know your thoughts.

Side question: does anybody know whether any enhancement has been proposed in
the md-sal project that could help solve this issue?

--
Thanks
Anil


Jozef Bacigál
Senior Software Engineer

Sídlo / Mlynské Nivy 56 / 821 05 Bratislava / Slovakia
R&D centrum / Janka Kráľa 9 /  974 01 Banská Bystrica / Slovakia
+421 908 766 972 / [email protected]
reception: +421 2 206 65 114 / www.pantheon.tech



