Re: [controller-dev] Migrating inventory/topology models

2019-09-05 Thread Anil Vishnoi
On Thu, Sep 5, 2019 at 8:56 AM Robert Varga  wrote:

> Hello everyone,
>
> as it currently stands, the only projects which are using
> opendaylight-{inventory,topology}*.yang models are OpenFlow-specific
> projects (openflowplugin, genius, sfc (in sfc-genius-utils), netvirt),
> plus a soon-to-be-deprecated component in controller/netconf.
>
> These models have been deemed as deprecated a long time (3+ years) ago,
> but the effort to migrate off of them has never materialized, which has
> left us in a sorry state, where the usage of those models incurs
> deprecation warnings (all over the place) and there is no target to
> transition to.
>
> We have
>
> https://git.opendaylight.org/gerrit/q/+I1e3d27374ffba0e584f194d468cebcfa9cecfe81
> merged on master, which will be followed by all other branches. This
> will alleviate the deprecation pain downstream.
>
+1

>
> As for the next steps, I think we need to migrate these models to
> openflowplugin, where they can be maintained, as that world is the only
> place that really uses them.
>
As far as upstream OpenDaylight is concerned, this makes sense to me, but we
need to be careful about downstream consumers. Downstream users who use only
the core ODL projects (controller, yangtools, mdsal, aaa) to develop
standalone applications might be using these models. This move would break
them, and to fix it they would have to take a dependency on openflowplugin,
which they might not want.

>
> Any objections?
>
> Thanks,
> Robert
>
> ___
> controller-dev mailing list
> controller-dev@lists.opendaylight.org
> https://lists.opendaylight.org/mailman/listinfo/controller-dev
>


-- 
Thanks
Anil


Re: [controller-dev] Reply: Is Read from follower shard ok and openflowplugin master must be shard leader?

2019-06-03 Thread Anil Vishnoi
On Sun, Jun 2, 2019 at 8:58 PM Yi Yang (杨燚)-云服务集团 
wrote:

> Thanks Anil. Let us discuss more in today’s weekly meeting.
>
>
>
> Do you mean any part in openflowplugin project won’t write config
> inventory? But I think the below features will write it.
>
>
>
> odl-openflowplugin-app-config-pusher
>
> odl-openflowplugin-app-reconciliation-framework
>
> odl-openflowplugin-app-forwardingrules-manager
>
> odl-openflowplugin-app-arbitratorreconciliation
>
>
>
> In addition, I know some flows are installed by packet-in, is it possible
> to install all the flows in config inventory into open vswitch bridge
> regardless of packet in? Controller disconnection  shouldn’t affect normal
> packet forwarding anyway.
>
These are support applications that are written on top of openflowplugin,
so yes, applications are allowed to modify the config datastore.

>
>
> It looks like an ODL redesign is the only feasible way for a super-scale
> data center :). I read your Google doc from ONS; is Lumina developing such
> a solution? See you in the weekly meeting today.
>

>
> *From:* Anil Vishnoi [mailto:vishnoia...@gmail.com]
> *Sent:* June 1, 2019 2:40
> *To:* Yi Yang (杨燚)-云服务集团 
> *Cc:* mdsal-...@lists.opendaylight.org;
> controller-dev@lists.opendaylight.org;
> openflowplugin-...@lists.opendaylight.org; d...@lists.opendaylight.org;
> abhijit.kumbh...@ericsson.com; avish...@luminanetworks.com;
> robert.va...@pantheon.tech
> *Subject:* Re: [controller-dev] Reply: Is Read from follower shard ok and
> openflowplugin master must be shard leader?
>
>
>
> Hi Yi,
>
>
>
> Please see inline...
>
>
>
> On Thu, May 30, 2019 at 5:04 PM Yi Yang (杨燚)-云服务集团 
> wrote:
>
> Also cc dev mailing list for getting more responses.
>
>
>
> *From:* Yi Yang (杨燚)-云服务集团
> *Sent:* May 30, 2019 14:08
> *To:* 'mdsal-...@lists.opendaylight.org' <
> mdsal-...@lists.opendaylight.org>; 'controller-dev@lists.opendaylight.org'
> ; '
> openflowplugin-...@lists.opendaylight.org' <
> openflowplugin-...@lists.opendaylight.org>
> *Cc:* 'robert.va...@pantheon.tech' ; '
> tompante...@gmail.com' ; '
> avish...@luminanetworks.com' ; '
> abhijit.kumbh...@ericsson.com' 
> *Subject:* Is Read from follower shard ok and openflowplugin master must be
> shard leader?
> *Importance:* High
>
>
>
> Hi, folks
>
>
>
> I have some questions about ODL clustering and openflowplugin clustering,
> look forward to getting your great help, thank you in advance.
>
>
>
> # Q1. Is only leader node responsible for synchronizing data store to
> other followers for any shard?
>
>
>
> # Q2. Openflowplugin clustering also has master, per its document, only
> openflowplugin master node can do write operation against inventory data
> store, then what if this openflowplugin master node is follower shard?
>
> The OpenFlow plugin is driven by the devices connected to it in a clustered
> setup. It allows you to connect your device to any of the controller nodes
> (one or more), and internally it decides which node in the cluster will be
> the owner/master of the device, using the Cluster Singleton Service + EOS.
> Once the owner/master is decided, only that node is allowed to write data
> to the "operational" inventory (the plugin doesn't write to the config
> inventory).
>
>
>
> # Q3. Can we do more granular shard per openflow node(DPID) in inventory?
> I don’t think it makes sense that the inventory for one openflowplugin
> cluster is replicated to all the other openflowplugin clusters (assume
> there are many openflowplugin clusters because many south nodes/devices are
> there)
>
> Are you assuming multiple OpenDaylight cluster instances running and
> sharing data with each other? E.g. 2 clusters running and sharing data
> through some external mechanism, or a single cluster with 6 nodes in it? If
> you are looking at a scale of 10,000 devices, and assuming that each cluster
> can manage 500 devices, you will have to deploy 20 clusters, or you will
> have to create a single cluster with 60 nodes in it. Both of these options
> are pretty much impractical for a production environment.
>
>
>
> # Q4. Anybody can recommend node number of a ODL cluster which will manage
> 10,000 compute/network nodes? I think leader nodes will have too high
> workload if number of ODL cluster node is too big so that it can’t do
> horizontal scale, per current default shard strategy, every node has all
> the data store, that looks more like data store replication, not distribute
> data store on all the nodes.
>
> In my experience and opinion, ODL in clustered setup is not a solution
> here. As i mentioned above, with cluster setup i can think o

Re: [controller-dev] Reply: Is Read from follower shard ok and openflowplugin master must be shard leader?

2019-05-31 Thread Anil Vishnoi
Hi Yi,

Please see inline...

On Thu, May 30, 2019 at 5:04 PM Yi Yang (杨燚)-云服务集团 
wrote:

> Also cc dev mailing list for getting more responses.
>
>
>
> *From:* Yi Yang (杨燚)-云服务集团
> *Sent:* May 30, 2019 14:08
> *To:* 'mdsal-...@lists.opendaylight.org' <
> mdsal-...@lists.opendaylight.org>; 'controller-dev@lists.opendaylight.org'
> ; '
> openflowplugin-...@lists.opendaylight.org' <
> openflowplugin-...@lists.opendaylight.org>
> *Cc:* 'robert.va...@pantheon.tech' ; '
> tompante...@gmail.com' ; '
> avish...@luminanetworks.com' ; '
> abhijit.kumbh...@ericsson.com' 
> *Subject:* Is Read from follower shard ok and openflowplugin master must be
> shard leader?
> *Importance:* High
>
>
>
> Hi, folks
>
>
>
> I have some questions about ODL clustering and openflowplugin clustering,
> look forward to getting your great help, thank you in advance.
>
>
>
> # Q1. Is only leader node responsible for synchronizing data store to
> other followers for any shard?
>
>
>
> # Q2. Openflowplugin clustering also has master, per its document, only
> openflowplugin master node can do write operation against inventory data
> store, then what if this openflowplugin master node is follower shard?
>
The OpenFlow plugin is driven by the devices connected to it in a clustered
setup. It allows you to connect your device to any of the controller nodes
(one or more), and internally it decides which node in the cluster will be
the owner/master of the device, using the Cluster Singleton Service + EOS.
Once the owner/master is decided, only that node is allowed to write data
to the "operational" inventory (the plugin doesn't write to the config
inventory).
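
A minimal sketch of the ownership rule described above: a device may connect
to several nodes, one node is chosen as owner, and only the owner may write
operational data. Everything here is hypothetical and for illustration only:
ToyCluster, its methods, and the "first node to connect wins" election are
stand-ins, not the openflowplugin, Cluster Singleton Service, or EOS APIs.

```python
class ToyCluster:
    """Toy model of per-device ownership in a controller cluster."""

    def __init__(self):
        self.connections = {}   # device -> set of nodes it is connected to
        self.owner = {}         # device -> node currently acting as owner
        self.operational = {}   # device -> last operational data written

    def connect(self, device, node):
        self.connections.setdefault(device, set()).add(node)
        # Placeholder election (assumption): the first node a device
        # connects to becomes its owner.
        self.owner.setdefault(device, node)

    def write_operational(self, node, device, data):
        """Accept the write only if `node` is the owner of `device`."""
        if self.owner.get(device) != node:
            return False
        self.operational[device] = data
        return True


cluster = ToyCluster()
cluster.connect("openflow:1", "member-1")
cluster.connect("openflow:1", "member-2")
print(cluster.write_operational("member-2", "openflow:1", {"ports": 4}))  # False
print(cluster.write_operational("member-1", "openflow:1", {"ports": 4}))  # True
```

Note that ownership here is independent of which node is the shard leader,
which is the point of the answer to Q2.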

>
>
> # Q3. Can we do more granular shard per openflow node(DPID) in inventory?
> I don’t think it makes sense that the inventory for one openflowplugin
> cluster is replicated to all the other openflowplugin clusters (assume
> there are many openflowplugin clusters because many south nodes/devices are
> there)
>
Are you assuming multiple OpenDaylight cluster instances running and sharing
data with each other? E.g. 2 clusters running and sharing data through some
external mechanism, or a single cluster with 6 nodes in it? If you are
looking at a scale of 10,000 devices, and assuming that each cluster can
manage 500 devices, you will have to deploy 20 clusters, or you will have to
create a single cluster with 60 nodes in it. Both of these options are pretty
much impractical for a production environment.
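
The sizing arithmetic above can be written out as a short sketch. It assumes
a target of 10,000 devices (the figure implied by 20 clusters of 500 devices
each) and 3 nodes per cluster (the figure implied by 60 nodes across 20
clusters); both are inferences from the numbers in this message, not
measured limits.

```python
import math


def clusters_needed(devices, devices_per_cluster):
    """Clusters required when each cluster manages a fixed number of devices."""
    return math.ceil(devices / devices_per_cluster)


DEVICES = 10_000           # assumed target scale
DEVICES_PER_CLUSTER = 500  # per-cluster capacity assumed in the message
NODES_PER_CLUSTER = 3      # assumption: the usual 3-node cluster

clusters = clusters_needed(DEVICES, DEVICES_PER_CLUSTER)
print(clusters)                      # 20
print(clusters * NODES_PER_CLUSTER)  # 60
```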

>
>
> # Q4. Anybody can recommend node number of a ODL cluster which will manage
> 10,000 compute/network nodes? I think leader nodes will have too high
> workload if number of ODL cluster node is too big so that it can’t do
> horizontal scale, per current default shard strategy, every node has all
> the data store, that looks more like data store replication, not distribute
> data store on all the nodes.
>
In my experience and opinion, ODL in a clustered setup is not a solution
here. As I mentioned above, with a clustered setup I can think of two
possible solutions. Deploying 20 clusters will be an operational nightmare
(e.g. per-cluster partition issues, devices switching between clusters,
sharing device inventory data across clusters when a device switches, etc.).
Apart from that, you will need an external mechanism to share data between
these clusters. And depending on your application, things can get even more
complicated to maintain in a production environment. If you go with the
second option of 60 nodes in one cluster, I am not even sure this cluster
will boot up properly :), let alone manage the devices. To make it work,
you need to go with prefix-based sharding and cook up a solution per device
(a per-device shard, the nodes where this shard can be replicated, making
sure that the device connection only switches to a node where the device
shard is replicated, etc.).
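
To make the per-device idea concrete, here is a toy sketch of the constraint
described above: each device gets its own shard, the shard is replicated on a
subset of nodes, and a device may only connect to a node holding a replica.
The function names, the round-robin placement, and the replication factor of
3 are assumptions for illustration; this is not the prefix-based sharding
API.

```python
def place_replicas(devices, nodes, replication_factor=3):
    """Assign each device shard to `replication_factor` nodes, round-robin."""
    if replication_factor > len(nodes):
        raise ValueError("replication factor exceeds cluster size")
    placement = {}
    for i, device in enumerate(devices):
        placement[device] = {
            nodes[(i + k) % len(nodes)] for k in range(replication_factor)
        }
    return placement


def may_connect(device, node, placement):
    """A device may only connect to a node that replicates its shard."""
    return node in placement.get(device, set())


nodes = [f"member-{n}" for n in range(1, 7)]
placement = place_replicas(["openflow:1", "openflow:2"], nodes)
print(may_connect("openflow:1", "member-2", placement))  # True
print(may_connect("openflow:1", "member-5", placement))  # False
```

Even in this toy form, something outside the datastore has to steer device
connections, which is part of the operational cost mentioned above.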

>
>
> # Q5. Is it possible to run an asymmetric ODL cluster? I mean some nodes
> are full stack (there are netvirt, sfc, genius, etc), some nodes are
> southbound only (only install openflowplugin, ovsdb). I don’t think we must
> run other stuff in south bound device management nodes except southbound
> protocols.
>
I think you can do that, but if you want HA for both your application and
the southbound plugins, and you also want to run them on separate nodes, a
3-node cluster is not going to work (you need at least 4 nodes in the
cluster).

>
>
> #Q6. I know data store read can be done in any node, but is it read from
> local shard in fact? Per document, it seems shard manager is doing this, if
> local shard is not leader, it will do this from remote shard leader.
>
>
>
> #Q7. Anybody can propose a good ODL clustering solution for a super scale
> data center which has 10,000 nodes?
>
In my experience, if you are looking for a stable production environment with
low operational cost (logistics, resources, support, etc.), ODL in a
"clustering" environment is probably not an at-par solution. Luis and I shared
some high-level thoughts on how we can achieve this kind of 

Re: [controller-dev] owner changed failure

2018-09-20 Thread Anil Vishnoi
On Thu, Sep 20, 2018 at 3:38 AM Vratko Polák 
wrote:

> > It really isn't necessary that the new owner stays the same
>
>
> For some applications it does not matter.
>
> For other applications (such as openflow)
>
> each owner change means the old owner has to disconnect
>
> from the device and the new owner has to connect.
>
That's not entirely correct. It won't disconnect and reconnect, but it does
push the master/slave role down to the switch, and that also has a cost. So
for OFP, minimal ownership flapping is the optimal solution.

> That can be expensive, and in worst case it can lead to data loss.
>
>
> > I'm not sure if it's guaranteed the owner won't change after
> > re-join
>
>
> I know this test happened to work when we developed it,
>
> but that behavior might depend on timing details
>
> of an underlying algorithm implementation.
>
>
> For curiosity, I looked at unit tests.
>
> I have not found anything conclusive,
>
> but I found one comment [0] related to timing.
>
>
> Vratko.
>
>
> [0]
> https://github.com/opendaylight/controller/blob/9bce68c4712d00951d121be68b09578bc6e09151/opendaylight/md-sal/sal-distributed-datastore/src/test/java/org/opendaylight/controller/cluster/datastore/entityownership/DistributedEntityOwnershipIntegrationTest.java#L224-L226
> --
> *From:* Jamo Luhrsen 
> *Sent:* Thursday, September 20, 2018 7:45:12 AM
> *To:* Tom Pantelis
> *Cc:* controller-dev; pgup...@cisco.com; Vratko Polák
> *Subject:* Re: owner changed failure
>
>
>
> On 09/19/2018 02:36 PM, Tom Pantelis wrote:
> >
> >
> > On Wed, Sep 19, 2018 at 4:46 PM Jamo Luhrsen <jluhr...@gmail.com> wrote:
> >
> >  > Anyway we'll need to enable debug for
> >  > org.opendaylight.controller.cluster.datastore.entityownership.  I would
> >  > suggest to pull out that test on its own like you've done before (run it
> >  > standalone in sandbox I guess). Also delete the log files in between
> >  > each run. This will make debugging much easier.
> >
> > Is there not enough info in the existing logs?
> >
> >
> > No - not with the default INFO logging. In order to dig deeper we need to
> > enable targeted debug, in this case
> > org.opendaylight.controller.cluster.datastore.entityownership.
>
> I don't think it worked. Here is what the logger config lines looked
> like:
>
> 22:38:53 log4j2.logger.org_opendaylight_org_opendaylight_controller_cluster_datastore_entityownership.name = org.opendaylight.org.opendaylight.controller.cluster.datastore.entityownership
> 22:38:53 log4j2.logger.org_opendaylight_org_opendaylight_controller_cluster_datastore_entityownership.level = DEBUG
>
> you can see the whole org.ops4j.pax.logging.cfg file here:
>
> https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/3/console-timestamp.log.gz
>
> but, I don't see any DEBUG messages showing up in these karaf logs:
>
>
> https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/3/odl_1/odl1_karaf.log.gz
>
> https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/3/odl_2/odl2_karaf.log.gz
>
> https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/controller-csit-3node-clustering-ask-all-oxygen/3/odl_3/odl3_karaf.log.gz
>
>
> > You can easily trim the
> > karaf logs based on the test cases. We are logging the start of every
> > suite and test case.
> >
> >
> > I think it's much easier and faster to debug a failing test if it's
> > isolated. Of course the logs are much smaller and don't require trimming.
> > Enabling debug can result in huge logs even just running one test, let
> > alone the whole batch of them. Also I assume this test fails sporadically
> > which means it needs to be run over and over. Doing that with the entire
> > job will take a long time. But if it's too much of a pain to isolate the
> > test, then OK.
>
> The suite takes aprox 90 from start to finish and it's just a matter of
> adding a single string to the parameters when we start the build. This
> failure has happened 2 of the 3 jobs we ran. That's easiest for now,
> and splitting up the karaf logs after the fact shouldn't be too bad.
>
> Thanks,
> JamO
>
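
On the logger configuration quoted above: the configured logger name starts
with "org.opendaylight.org.opendaylight.", i.e. the prefix appears twice.
Log4j 2 logger names are hierarchical, so a logger with that name would not
apply to classes under org.opendaylight.controller.cluster.datastore.entityownership.
That is one possible explanation for the missing DEBUG lines, not a confirmed
diagnosis. The sketch below only models the name lookup; effective_level and
the class name SomeClass are hypothetical and not part of Log4j or pax-logging.

```python
def effective_level(class_name, configured, root_level="INFO"):
    """Walk up the dotted name until a configured logger is found."""
    name = class_name
    while name:
        if name in configured:
            return configured[name]
        # Drop the last dotted segment: "a.b.c" -> "a.b", "a" -> "".
        name = name.rpartition(".")[0]
    return root_level


cls = "org.opendaylight.controller.cluster.datastore.entityownership.SomeClass"

as_configured = {
    "org.opendaylight.org.opendaylight.controller.cluster.datastore.entityownership": "DEBUG",
}
without_doubled_prefix = {
    "org.opendaylight.controller.cluster.datastore.entityownership": "DEBUG",
}

print(effective_level(cls, as_configured))           # INFO
print(effective_level(cls, without_doubled_prefix))  # DEBUG
```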


-- 
Thanks
Anil


Re: [controller-dev] ODL Cassandra Persistence

2018-09-19 Thread Anil Vishnoi
Muthu,

Do you have any comparative performance numbers for ODL's default persistence
versus the Cassandra plugin?

Thanks
Anil

On Tue, Sep 18, 2018 at 11:52 PM Muthukumaran K 
wrote:

> Cassandra plugin is quite active and we also get reasonably good responses
> from the Akka Persistence forums.
>
> Depending upon the volume of journals and snapshots, deployment scheme and
> size of snapshots planned to be stored in Cassandra, some level of tuning
> would be required on backend Cassandra cluster
>
>
>
> Regards
>
> Muthu
>
>
>
> *From:* controller-dev-boun...@lists.opendaylight.org [mailto:
> controller-dev-boun...@lists.opendaylight.org] *On Behalf Of *Michael
> Vorburger
> *Sent:* Wednesday, September 19, 2018 2:42 AM
> *To:* sat 
> *Cc:* controller-dev 
> *Subject:* Re: [controller-dev] ODL Cassandra Persistence
>
>
>
> On Tue, 18 Sep 2018, 23:04 sat,  wrote:
>
> Hi,
>
>
>
> Yes, we were looking for a project like this. Unfortunately the project is
> discontinued.
>
>
>
> https://github.com/akka/akka-persistence-cassandra seems to be active?
>
>
>
> Thanks
>
> A.SathishKumar
>
>
>
> On Tue, Sep 18, 2018 at 6:54 AM Tom Pantelis 
> wrote:
>
>
>
> On Mon, Sep 17, 2018 at 11:28 PM sat  wrote:
>
> Hi Michael Vorburger,
>
>
>
> Thanks, i will check it out.
>
>
>
> Thanks
>
> A.SathishKumar
>
>
>
>
>
> There is an akka persistence plugin for Cassandra -
> https://github.com/krasserm/akka-persistence-cassandra.  I think this is
> what you're looking for.
>
>
>
>
>
> On Mon, Sep 17, 2018 at 3:13 PM Michael Vorburger 
> wrote:
>
>
> Sat,
>
>
>
> On Thu, Sep 13, 2018 at 2:07 AM sat  wrote:
>
> Hi,
>
>
>
> ODL uses "LevelDB" for persistence, and we came to know that it's prone to
> corruption. Did anyone try using Cassandra for persistence rather than
> LevelDB?
>
>
>
> I see some posts with the same requirement, but there is no reply.
>
>
>
> https://pantheon.tech/cassandra-datastore/ is a blog post which may
> interest you in this context; it's from a company that I am not affiliated
> with (and won't be able to further comment on here).
>
>
>
> BTW: https://github.com/vorburger/opendaylight-etcd is somewhat related
> WIP work in FLOSS where I'm actively exploring the use of etcd (not
> Cassandra) as a data store.
>
>
> Tx,
>
> M.
>
> --
>
> Michael Vorburger, Red Hat
> vorbur...@redhat.com | IRC: vorburger @freenode | ~ = http://vorburger.ch
>
>
>
>
>
>
> --
>
> A.SathishKumar
> 044-24735023
>
>
>
>
>
> --
>
> A.SathishKumar
> 044-24735023
>
>


-- 
Thanks
Anil


Re: [controller-dev] Signing Off

2018-08-27 Thread Anil Vishnoi
Good luck with the next gig, Ryan, and thank you for your extraordinary
contributions to the ODL community.

On Mon, Aug 27, 2018 at 2:02 PM Ryan Goulding 
wrote:

> Hi ODL Community,
>
>
> I have decided to make a career transition to an opportunity I could not
> pass up, and will no longer contribute to ODL after this week.  This past
> 4 years has been great;  I saw this entire community grow, thrive and
> mature, and I am very proud of what we collectively built.  I have
> learned so much from everyone regarding technology and business fronts, and
> I am grateful to all of you.
>
>
> I wish you all the best of luck going forward.  Stay in touch.
>
>
> Best Regards,
>
> Ryan
>


-- 
Thanks
Anil


Re: [controller-dev] [mdsal-dev] Does commiting a transaction with no operations have any noteworthy (real life) overhead compared to cancelling it?

2018-07-27 Thread Anil Vishnoi
On Fri, Jul 27, 2018 at 4:54 AM Robert Varga  wrote:

> On 27/07/18 11:41, Anil Vishnoi wrote:
> >
> > My initial reaction is that such an optimization in
> > ManagedNewTransactionRunner is probably pointless as whatever
> > happens behind the scenes on a commit is surely already smart enough
> > by itself for a submit on an empty transaction to basically be a low
> > overhead NOOP anyway?
> >
> > Or if the transaction API could expose something like isEmpty() (just an
> > example), that could come in handy here?
>
> I don't think the benefit of such a method justifies additional state
> tracking required to support it.
>
It depends on the cost of the two operations. If submitting an empty
transaction costs less than cancelling it, then yes, it doesn't make sense;
otherwise I think it makes life more convenient for folks consuming the API
directly (rather than through wrappers).

>
> Regards,
> Robert
>
>

-- 
Thanks
Anil


Re: [controller-dev] Does commiting a transaction with no operations have any noteworthy (real life) overhead compared to cancelling it?

2018-07-27 Thread Anil Vishnoi
On Fri, Jul 27, 2018 at 4:35 AM Michael Vorburger 
wrote:

> On Fri, Jul 27, 2018 at 11:42 AM Anil Vishnoi 
> wrote:
>
>> On Fri, Jul 27, 2018 at 2:30 AM Michael Vorburger 
>> wrote:
>>
>>> Hello, Tom, Robert, Stephen,
>>>
>>> Does commiting a transaction with no put/.../merge/delete operations on
>>> that Tx have any noteworthy (real life) overhead compared to cancelling it?
>>>
>>> I am asking because that has come up in
>>>
>>> https://git.opendaylight.org/gerrit/#/c/74506/1/lockmanager/lockmanager-impl/src/main/java/org/opendaylight/genius/lockmanager/impl/LockManagerServiceImpl.java@a229
>>> and I was wondering if there is any point in making our
>>> ManagedNewTransactionRunner "smarter" so that it does a cancel if it wasn't
>>> actually used.
>>>
>>> My initial reaction is that such an optimization in
>>> ManagedNewTransactionRunner is probably pointless as whatever happens
>>> behind the scenes on a commit is surely already smart enough by itself for
>>> a submit on an empty transaction to basically be a low overhead NOOP anyway?
>>>
>> Or if transaction API can expose some api like isEmpty() (just example),
>> that can come bit handy here?
>>
>
> we would not even need an isEmpty() to be able to do such an optimization
> in our ManagedNewTransactionRunner, we could track it ourselves. What I
> am wondering if such a short cut optimization has any real value, and if it
> does if it shouldn't go into core MD SAL instead of
> ManagedNewTransactionRunner.
>
Not everybody is using ManagedNewTransactionRunner, so whoever uses the
APIs directly needs to do the tracking anyway. To me it seems like adding
elements to an array but keeping the counter outside of the array, rather
than relying on the List contract to say whether it's empty or not.
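
A minimal sketch of the idea, assuming the transaction itself records its
operations and can report whether it is empty, so that every caller does not
have to keep a separate counter. TrackingTransaction, is_empty() and finish()
are hypothetical names for illustration; they are not the MD-SAL transaction
API or ManagedNewTransactionRunner.

```python
class TrackingTransaction:
    """Toy write transaction that knows whether anything was queued."""

    def __init__(self):
        self._ops = []        # queued (operation, path, data) tuples
        self.state = "open"

    def put(self, path, data):
        self._ops.append(("put", path, data))

    def merge(self, path, data):
        self._ops.append(("merge", path, data))

    def delete(self, path):
        self._ops.append(("delete", path, None))

    def is_empty(self):
        return not self._ops

    def commit(self):
        self.state = "committed"

    def cancel(self):
        self.state = "cancelled"


def finish(tx):
    """Cancel an unused transaction, commit a used one."""
    if tx.is_empty():
        tx.cancel()
    else:
        tx.commit()
    return tx.state


unused = TrackingTransaction()
used = TrackingTransaction()
used.put("/nodes/node/openflow:1", {"id": "openflow:1"})
print(finish(unused))  # cancelled
print(finish(used))    # committed
```

Whether the extra state tracking is worth it still depends on the relative
cost of committing versus cancelling an empty transaction.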


>
>
>>
>>> Tx,
>>> M.
>>> --
>>> Michael Vorburger, Red Hat
>>> vorbur...@redhat.com | IRC: vorburger @freenode | ~ =
>>> http://vorburger.ch
>>>
>>
>>
>> --
>> Thanks
>> Anil
>>
>

-- 
Thanks
Anil


Re: [controller-dev] [infrautils-dev] Sharding evolution

2018-06-08 Thread Anil Vishnoi
Reading the wiki page, I understood that there are two main issues:

(1) Transaction management and rollback -- I was not able to figure out the
relevance to distributed shard location.
(2) Performance in a clustered setup -- if shard leaders are distributed, any
transaction will incur network latency, because the transaction needs to be
routed to the leader's controller node?

Let me know if there is any other reason that I missed from the wiki.
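
A rough way to reason about issue (2), under a deliberately simplified model:
if a shard's leader is equally likely to be on any node and transactions
originate uniformly on all nodes, the fraction of transactions that must
cross the network to reach the leader is (N - 1) / N. The uniform placement
and the round-trip figure below are assumptions for illustration, not
measurements.

```python
from fractions import Fraction


def remote_leader_fraction(num_nodes):
    """Fraction of transactions whose shard leader is on another node."""
    return Fraction(num_nodes - 1, num_nodes)


def expected_extra_latency_ms(num_nodes, round_trip_ms):
    """Average added latency per transaction from routing to a remote leader."""
    return float(remote_leader_fraction(num_nodes)) * round_trip_ms


print(remote_leader_fraction(3))          # 2/3
print(expected_extra_latency_ms(3, 1.5))  # 1.0
```

Collocating all shard leaders only brings this to zero for transactions that
start on the leader node; the device ownership is still spread across nodes.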

I think (2) is what you are addressing by localizing the shards on one
controller? But that probably solves just one problem; you still have the
following problems if you really want to solve this:

(1) Ownership of OVSDB devices is distributed across the 3 nodes (and
they all depend on ClusteredDataChangeListeners, which has a cost as
well).
(2) Ownership of OpenFlow devices is distributed across the 3 nodes.
(3) Operational data replication across the three-node cluster also has a
cost, and if your business logic depends on it, that will hurt
performance as well.

So by locating the shards in one place, you might solve one (minor) problem
in the whole end-to-end stack. Probably the quickest way to significantly
improve end-to-end performance is to force the OVSDB and OpenFlow devices to
be owned by the same controller as well. But if you do that, the only
remaining purpose of the cluster is data replication across two more nodes,
with the 2/3 performance hit in datastore performance :).



On Fri, Jun 8, 2018 at 5:06 PM, Faseela K  wrote:

> [Changed the subject]
>
>
>
> Anil, now you can ask ;)
>
>
>
> https://wiki.opendaylight.org/view/Genius:Sharding_evolution
>
>
>
> Thanks,
>
> Faseela
>
>
>
> *From:* Anil Vishnoi [mailto:vishnoia...@gmail.com]
> *Sent:* Saturday, June 09, 2018 5:30 AM
> *To:* Faseela K 
> *Cc:* Tom Pantelis ; Michael Vorburger <
> vorbur...@redhat.com>; infrautils-...@lists.opendaylight.org;
> controller-dev ;
> genius-...@lists.opendaylight.org
> *Subject:* Re: [controller-dev] [infrautils-dev] OK to resurrect c/64522
> to first move infrautils.DiagStatus integration for datastore from genius
> to controller, and then improve it for GENIUS-138 ?
>
>
>
>
>
>
>
> On Fri, Jun 8, 2018 at 4:50 PM, Faseela K  wrote:
>
>
>
>
>
> *From:* Tom Pantelis [mailto:tompante...@gmail.com]
> *Sent:* Saturday, June 09, 2018 2:24 AM
> *To:* Anil Vishnoi 
> *Cc:* Faseela K ; Michael Vorburger <
> vorbur...@redhat.com>; infrautils-...@lists.opendaylight.org;
> controller-dev ;
> genius-...@lists.opendaylight.org
> *Subject:* Re: [controller-dev] [infrautils-dev] OK to resurrect c/64522
> to first move infrautils.DiagStatus integration for datastore from genius
> to controller, and then improve it for GENIUS-138 ?
>
>
>
>
>
>
>
> On Fri, Jun 8, 2018 at 3:11 PM, Anil Vishnoi 
> wrote:
>
>
>
>
>
> On Thu, Jun 7, 2018 at 11:39 AM, Faseela K  wrote:
>
> Not related in this context, but if we can get shard leader change
> notification, can we use that to derive an entity owner instead of using
> EOS? ;)
>
> Humble suggestion, don't use shard location/ownership status in your
> business logic ;-)
>
>
>
>
>
> +1. And knowledge, assumptions about shard names, member names ... :)
>
>
>
> >> Of course we all like to avoid such complex logics in the application
> code. In a 3 node cluster, for an application like netvirt which has to
> push a lot of flows, plus a set of OVSDB configuration, based on some
> events coming from neutron datastores(note that all of these are different
> config shards), I am just trying to understand what is the best way to
> place things.  It is always good not to make application logic, depend on
> internals of infra, but is the only way then to collocate shards?
>
> I have a few questions about what led to the conclusion that putting all the
> shards on one node is the only solution, but I don't want to hijack this
> thread with that topic :).
>
>
>
>
>
> Thanks,
>
> Faseela
>
>
>
> *From:* infrautils-dev-boun...@lists.opendaylight.org [mailto:
> infrautils-dev-boun...@lists.opendaylight.org] *On Behalf Of *Tom Pantelis
> *Sent:* Friday, June 08, 2018 12:07 AM
> *To:* Michael Vorburger 
> *Cc:* infrautils-...@lists.opendaylight.org; controller-dev <
> controller-dev@lists.opendaylight.org>; genius-...@lists.opendaylight.org;
> Robert Varga 
> *Subject:* Re: [infrautils-dev] [controller-dev] OK to resurrect c/64522
> to first move infrautils.DiagStatus integration for datastore from genius
> to controller, and then improve it for GENIUS-138 ?
>
>
>
>
>
>
>
> --
>
> Thanks
>
> Anil
>



-- 
Thanks
Anil


Re: [controller-dev] [infrautils-dev] OK to resurrect c/64522 to first move infrautils.DiagStatus integration for datastore from genius to controller, and then improve it for GENIUS-138 ?

2018-06-08 Thread Anil Vishnoi
On Fri, Jun 8, 2018 at 4:50 PM, Faseela K  wrote:

>
>
>
>
> *From:* Tom Pantelis [mailto:tompante...@gmail.com]
> *Sent:* Saturday, June 09, 2018 2:24 AM
> *To:* Anil Vishnoi 
> *Cc:* Faseela K ; Michael Vorburger <
> vorbur...@redhat.com>; infrautils-...@lists.opendaylight.org;
> controller-dev ;
> genius-...@lists.opendaylight.org
> *Subject:* Re: [controller-dev] [infrautils-dev] OK to resurrect c/64522
> to first move infrautils.DiagStatus integration for datastore from genius
> to controller, and then improve it for GENIUS-138 ?
>
>
>
>
>
>
>
> On Fri, Jun 8, 2018 at 3:11 PM, Anil Vishnoi 
> wrote:
>
>
>
>
>
> On Thu, Jun 7, 2018 at 11:39 AM, Faseela K  wrote:
>
> Not related in this context, but if we can get shard leader change
> notification, can we use that to derive an entity owner instead of using
> EOS? ;)
>
> Humble suggestion, don't use shard location/ownership status in your
> business logic ;-)
>
>
>
>
>
> +1. And knowledge, assumptions about shard names, member names ... :)
>
>
>
> >> Of course we all like to avoid such complex logics in the application
> code. In a 3 node cluster, for an application like netvirt which has to
> push a lot of flows, plus a set of OVSDB configuration, based on some
> events coming from neutron datastores(note that all of these are different
> config shards), I am just trying to understand what is the best way to
> place things.  It is always good not to make application logic, depend on
> internals of infra, but is the only way then to collocate shards?
>
I have a few questions about what led to the conclusion that putting all the
shards on one node is the only solution, but I don't want to hijack this
thread with that topic :).

>
>
>
>
> Thanks,
>
> Faseela
>
>
>
> *From:* infrautils-dev-boun...@lists.opendaylight.org [mailto:
> infrautils-dev-boun...@lists.opendaylight.org] *On Behalf Of *Tom Pantelis
> *Sent:* Friday, June 08, 2018 12:07 AM
> *To:* Michael Vorburger 
> *Cc:* infrautils-...@lists.opendaylight.org; controller-dev <
> controller-dev@lists.opendaylight.org>; genius-...@lists.opendaylight.org;
> Robert Varga 
> *Subject:* Re: [infrautils-dev] [controller-dev] OK to resurrect c/64522
> to first move infrautils.DiagStatus integration for datastore from genius
> to controller, and then improve it for GENIUS-138 ?
>
>
>
>


-- 
Thanks
Anil


Re: [controller-dev] OK to resurrect c/64522 to first move infrautils.DiagStatus integration for datastore from genius to controller, and then improve it for GENIUS-138 ?

2018-06-08 Thread Anil Vishnoi
On Fri, Jun 8, 2018 at 1:49 PM, Tom Pantelis  wrote:

>
>
> On Fri, Jun 8, 2018 at 3:10 PM, Anil Vishnoi 
> wrote:
>
>>
>>
>> On Thu, Jun 7, 2018 at 11:37 AM, Tom Pantelis 
>> wrote:
>>
>>>
>>>
>>> On Thu, Jun 7, 2018 at 1:14 PM, Michael Vorburger 
>>> wrote:
>>>
>>>> Robert,
>>>>
>>>> just to avoid any misunderstandings and unnecessary extra work to throw
>>>> away, may we double check and confirm that we correctly understand your
>>>> comment in  https://jira.opendaylight.org/browse/GENIUS-138 to mean
>>>> that we are past the "dependency of a mature project on an incubation
>>>> project" objection and you are now OK with that we resurrect
>>>> https://git.opendaylight.org/gerrit/#/c/64522/, to first move
>>>> infrautils.DiagStatus integration for datastore from genius to controller?
>>>> We would then improve it, in controller instead of genius, for the
>>>> improvement proposed in issue GENIUS-138.
>>>>
>>>> Tom, OK for you as well to have such a dependency from controller to
>>>> infrautils?
>>>>
>>>
>>> I don't have a problem with it.
>>>
>>> BTW - I'm planning to add yang notifications to CDS to emit interesting
>>> state/status changes, eg akka member state changes (Up, Down, Unreachable
>>> etc), shard leader/role changes 
>>>
>> Tom, is there a Jira ticket where we can get some details about it?
>> Are these yang notifications going to be local or routed?
>>
>>
>
> All yang notifications are local - not sure what you mean by routed.
>
I mean, like we route RPCs, I was wondering if you are building something
that will route the yang notifications to other nodes as well.


>
> My intention for these yang notifications is for telemetry, alarming...
>
>
Okay, that makes sense. I was thinking of a scenario where, in a 3-node
cluster, the shard leader moves from controller-1 to controller-3: will
controller-2 know about that? As of now I'm not sure whether any use case
requires that, which is why I'm interested in the details to see what's
coming :)

>
>
>>
>>>
>>>
>>>
>>>>
>>>> Tx,
>>>> M.
>>>> --
>>>> Michael Vorburger, Red Hat
>>>> vorbur...@redhat.com | IRC: vorburger @freenode | ~ =
>>>> http://vorburger.ch
>>>>
>>
>>
>> --
>> Thanks
>> Anil
>>
>
>


-- 
Thanks
Anil


Re: [controller-dev] [infrautils-dev] OK to resurrect c/64522 to first move infrautils.DiagStatus integration for datastore from genius to controller, and then improve it for GENIUS-138 ?

2018-06-08 Thread Anil Vishnoi
On Thu, Jun 7, 2018 at 11:39 AM, Faseela K  wrote:

> Not related in this context, but if we can get shard leader change
> notification, can we use that to derive an entity owner instead of using
> EOS? ;)
>
Humble suggestion: don't use shard location/ownership status in your
business logic ;-)


>
>
> Thanks,
>
> Faseela
>
>
>
> *From:* infrautils-dev-boun...@lists.opendaylight.org [mailto:
> infrautils-dev-boun...@lists.opendaylight.org] *On Behalf Of *Tom Pantelis
> *Sent:* Friday, June 08, 2018 12:07 AM
> *To:* Michael Vorburger 
> *Cc:* infrautils-...@lists.opendaylight.org; controller-dev <
> controller-dev@lists.opendaylight.org>; genius-...@lists.opendaylight.org;
> Robert Varga 
> *Subject:* Re: [infrautils-dev] [controller-dev] OK to resurrect c/64522
> to first move infrautils.DiagStatus integration for datastore from genius
> to controller, and then improve it for GENIUS-138 ?
>
>
>
>
>
>
>
> On Thu, Jun 7, 2018 at 1:14 PM, Michael Vorburger 
> wrote:
>
> Robert,
>
>
>
> just to avoid any misunderstandings and unnecessary extra work to throw
> away, may we double check and confirm that we correctly understand your
> comment in  https://jira.opendaylight.org/browse/GENIUS-138 to mean that
> we are past the "dependency of a mature project on an incubation project"
> objection and you are now OK with that we resurrect
> https://git.opendaylight.org/gerrit/#/c/64522/, to first move infrautils.DiagStatus
> integration for datastore from genius to controller? We would then improve
> it, in controller instead of genius, for the improvement proposed in issue
> GENIUS-138.
>
>
>
> Tom, OK for you as well to have such a dependency from controller to
> infrautils?
>
>
>
> I don't have a problem with it.
>
>
>
> BTW - I'm planning to add yang notifications to CDS to emit interesting
> state/status changes, eg akka member state changes (Up, Down, Unreachable
> etc), shard leader/role changes 
>
>
>
>
>
>
>
>
> Tx,
>
> M.
>
> --
>
> Michael Vorburger, Red Hat
> vorbur...@redhat.com | IRC: vorburger @freenode | ~ = http://vorburger.ch
>
>
>
>


-- 
Thanks
Anil


Re: [controller-dev] OK to resurrect c/64522 to first move infrautils.DiagStatus integration for datastore from genius to controller, and then improve it for GENIUS-138 ?

2018-06-08 Thread Anil Vishnoi
On Thu, Jun 7, 2018 at 11:37 AM, Tom Pantelis  wrote:

>
>
> On Thu, Jun 7, 2018 at 1:14 PM, Michael Vorburger 
> wrote:
>
>> Robert,
>>
>> just to avoid any misunderstandings and unnecessary extra work to throw
>> away, may we double check and confirm that we correctly understand your
>> comment in  https://jira.opendaylight.org/browse/GENIUS-138 to mean that
>> we are past the "dependency of a mature project on an incubation project"
>> objection and you are now OK with that we resurrect
>> https://git.opendaylight.org/gerrit/#/c/64522/, to first move infrautils.DiagStatus
>> integration for datastore from genius to controller? We would then improve
>> it, in controller instead of genius, for the improvement proposed in issue
>> GENIUS-138.
>>
>> Tom, OK for you as well to have such a dependency from controller to
>> infrautils?
>>
>
> I don't have a problem with it.
>
> BTW - I'm planning to add yang notifications to CDS to emit interesting
> state/status changes, eg akka member state changes (Up, Down, Unreachable
> etc), shard leader/role changes 
>
Tom, is there a Jira ticket where we can get some details about it? Are
these yang notifications going to be local or routed?


>
>
>
>
>>
>> Tx,
>> M.
>> --
>> Michael Vorburger, Red Hat
>> vorbur...@redhat.com | IRC: vorburger @freenode | ~ = http://vorburger.ch
>>
>
>


-- 
Thanks
Anil


Re: [controller-dev] ODL crashing in CSIT jobs

2017-11-09 Thread Anil Vishnoi
I suspect you might hit it again if you run it a bit longer, because of this:

[Thu Nov  2 05:58:08 2017] Free swap  = 0kB
[Thu Nov  2 05:58:08 2017] Total swap = 0kB


On Thu, Nov 9, 2017 at 1:39 AM, Stephen Kitt  wrote:

> On Thu, 9 Nov 2017 10:28:14 +0100
> Robert Varga  wrote:
> > On 02/11/17 23:02, Luis Gomez wrote:
> > > 1) JVM does not kill itself, the OS does instead after the java
> > > process grows to 3.7G in a VM of 4G RAM  (note Xmx is set to 2G but
> > > still the jvm goes far beyond that).
> >
> > Indicates this lies outside of the heap -- check thread count.
>
> We verified separately that this is an OOM issue, but one detected by
> the kernel rather than by the JVM (the OOM killer kills the JVM, see
> https://jira.opendaylight.org/secure/attachment/14207/dmesg.log.txt
> for details; the number of threads wasn’t an issue here, but the lack
> of swap probably didn’t help).
>
> Upgrading to OpenJDK 8 patch 151 fixed the problem, it might have been
> related to one of the several memory usage bugs in 144 that were fixed
> in 151. It’s probably just moving the goalposts though since the
> problem was new — basically, I suspect we recently started using a
> little too much off-heap memory for some reason, and the upgrade to 151
> reduces the JVM’s memory usage enough to make us fit in our VMs again.
>
> Regards,
>
> Stephen
>
>
>


-- 
Thanks
Anil


Re: [controller-dev] ODL crashing in CSIT jobs

2017-11-09 Thread Anil Vishnoi
Reading through the mail thread, I doubt that the OS is killing the JVM. It
looks like a native OOM, where the JVM calls malloc/calloc, the allocation
fails, and the JVM process crashes (seg fault). I have seen many native OOM
issues with 32-bit JVMs because of their 4 GB virtual address space, but
with a 64-bit JVM I have seen this issue when swap space is not enabled on
the system. So the first thing I would check is whether your VM has swap
space; if not, I would suggest creating swap space for the VM and running
the JVM again. This will most probably fix the issue, although JVM
performance will suck because the OS will do a lot of swapping for the JVM
process.

It might not be a native memory leak; it might just be that your JVM needs
more memory to run. Now the question is why it's not failing in Carbon.
This can happen for a few reasons: (1) Karaf 4 loads more classes than
Karaf 3, which requires more native memory to store the bytecode; (2)
Karaf 4 code gets JITted faster than Karaf 3 (depends on the code execution
stack), which also requires more memory; or (3) it really is a leak (most
probably a classloader or NIO leak).

Generally, when a native OOM happens you can verify it in two ways: (1)
check dmesg and see if it has an OOM-related message; (2) check whether the
OS dumped a core for that JVM process. If you are not seeing a core dump,
it probably means your ulimit -c is set to 0.

I would suggest looking at the dmesg log first to see if there is
anything like "Out of memory: Killed process X, UID YY, (java)"; that
will verify that it's a native OOM. To debug it, you need a core dump.
Check whether ulimit -c is set to unlimited; if not, run
"ulimit -c unlimited", then start the JVM from the same shell and recreate
the issue. This should generate the core dump.
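
As a rough illustration of the dmesg check, here is a minimal Java sketch
that scans saved dmesg output for OOM-related lines. The class name, the
regex, and the sample lines are my own assumptions for illustration (the
exact kernel wording varies between kernel versions), so treat it as a
starting point rather than a definitive tool.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper (hypothetical name): finds OOM victims named in dmesg text. */
public class OomKillerScan {
    // Assumption: the kernel logs "Out of memory: Kill process <pid>" or
    // "Out of memory: Killed process <pid>"; wording differs across kernels.
    private static final Pattern OOM_LINE =
            Pattern.compile("Out of memory: Kill(?:ed)? process (\\d+)");

    /** Returns the PIDs named in matching lines, in the order they appear. */
    public static List<Integer> killedPids(List<String> dmesgLines) {
        List<Integer> pids = new ArrayList<>();
        for (String line : dmesgLines) {
            Matcher m = OOM_LINE.matcher(line);
            if (m.find()) {
                pids.add(Integer.parseInt(m.group(1)));
            }
        }
        return pids;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not taken from the real CSIT logs.
        List<String> sample = Arrays.asList(
                "[Thu Nov  2 05:58:08 2017] Free swap  = 0kB",
                "[Thu Nov  2 05:58:08 2017] Out of memory: Kill process 1234 (java) score 900");
        System.out.println(killedPids(sample));
    }
}
```

In practice you would feed it the lines of a saved dmesg log and compare
the reported PIDs with the PID of the karaf JVM.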

Once you have the core dump, load it into gdb and dump the stack of the
current thread; this will tell you why the JVM crashed. Most probably the
top function of the stack trace will be calloc() or malloc(). In my
experience the crashed thread is usually a native GC thread, NIO thread,
file I/O thread, classloader thread or native JNI thread. If the thread was
doing classloader or JNI activity, it is probably a sign of something wrong
with the Nitrogen controller code, but if it's anything else it might be
that your JVM needs more native memory to run the Karaf 4 controller.

On Mon, Nov 6, 2017 at 11:32 AM, Jamo Luhrsen  wrote:

>
>
> On 11/03/2017 10:24 AM, Michael Vorburger wrote:
> > On Fri, Nov 3, 2017 at 11:02 AM, Stephen Kitt > wrote:
> >
> > On Fri, 3 Nov 2017 10:56:48 +0100
> > Stephen Kitt > wrote:
> > > Another avenue could be to look at the overcommit settings, but I
> > > suspect that the JVM would fail to start altogether if we adjusted
> > > those.
> >
> > Actually that’s worth trying anyway:
> >
> > sysctl -w vm.overcommit_memory=2
> >
> > before starting the JVM, will prevent it from over-allocating memory.
> > We’ll get a heap dump when that happens, if the JVM actually manages
> to
> > start (which I suspect it won’t, with a 2G Xmx on a 4G system).
> >
> >
> >
> > https://jira.opendaylight.org/browse/NETVIRT-974 now has some sub-tasks
> for various next actions..
>
> I've updated NETVIRT-975 [0]. The TL;DR is that with a different host
> image we do not
> see the crash.
>
> I'm looking for opinions on the next steps, one of which could be that we
> update
> our images, close the bug and forget this ever happened :)
>
> JamO
>
> [0]
> https://jira.opendaylight.org/browse/NETVIRT-974?focusedCommentId=59941&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-59941
>



-- 
Thanks
Anil


Re: [controller-dev] Expose Datastore health to applications via infrautils.diagstatus

2017-10-12 Thread Anil Vishnoi
On Thu, Oct 12, 2017 at 1:38 AM, Faseela K <faseel...@ericsson.com> wrote:

> I had the same discussion with Michael.
>
> Since infrautils is placed as some entity below controller/md-sal, he had
> the opinion that the service should be registered from above and the status
> should be exposed.
>
Yeah, we had a brief discussion about it at DDF, and I think that's the
ideal way to do it.


> So here is how diagstatus module works – any application should register
> as a “service” with the framework, report an initial status(using the APIs
> provided by diagstatus).
>
> There is another OsgiService “ServiceStatusProvider” exposed, and if
> applications implement the same, that will be called everytime an external
> request is made to get the current service status.
>
Isn't it the case that in both approaches, controller will have to depend
on infrautils?


>
>
> Thanks,
>
> Faseela
>
>
>
> *From:* Anil Vishnoi [mailto:vishnoia...@gmail.com]
> *Sent:* Thursday, October 12, 2017 2:04 PM
> *To:* Faseela K <faseel...@ericsson.com>
> *Cc:* Muthukumaran K <muthukumara...@ericsson.com>; Tom Pantelis <
> tompante...@gmail.com>; infrautils-...@lists.opendaylight.org;
> controller-dev@lists.opendaylight.org; R Srinivasan E <
> r.e.sriniva...@ericsson.com>; Dayavanti Gopal Kamath <
> dayavanti.gopal.kam...@ericsson.com>
>
> *Subject:* Re: [controller-dev] Expose Datastore health to applications
> via infrautils.diagstatus
>
>
>
> Given that we are relying on MBean info to determine the readiness, does
> it matter whether it's should be part of the controller or a service in
> infrautils? Given that data store is core service, probably writing a
> service in infrautils is probably not a bad idea .
>
>
>
> With the current MBean exposed by controller, i think it does not support
> MBean notification ( i might be wrong), but if we can enable MBean
> notification, it will be like a status published through controller, it's
> just that in infrautils we need to make sense out of the changes.
>
>
>
> On Thu, Oct 12, 2017 at 1:24 AM, Faseela K <faseel...@ericsson.com> wrote:
>
> Thanks Muthu, Tom!
>
> Now the only question is whether CONTROLLER can depend on INFRAUTILS to
> implement the ServiceStatusPoller OsgiService, so that we will have one
> single point where all major service status will be available.
>
>
>
> Thanks,
>
> Faseela
>
>
>
> *From:* Muthukumaran K
> *Sent:* Thursday, October 12, 2017 1:52 PM
> *To:* Tom Pantelis <tompante...@gmail.com>
> *Cc:* Faseela K <faseel...@ericsson.com>; infrautils-dev@lists.
> opendaylight.org; controller-dev@lists.opendaylight.org; R Srinivasan E <
> r.e.sriniva...@ericsson.com>; Dayavanti Gopal Kamath <
> dayavanti.gopal.kam...@ericsson.com>
> *Subject:* RE: [controller-dev] Expose Datastore health to applications
> via infrautils.diagstatus
>
>
>
> Thanks Tom. Then we would use the aggregate SyncStatus at Shard-Manager
> level.
>
>
>
> If we need to further drill down at shard-level (I do not have a usecase
> readily for that though) we can use Shard Level MXBeans anyway for any
> manual troubleshooting
>
>
>
> Regards
>
> Muthu
>
>
>
>
>
> *From:* Tom Pantelis [mailto:tompante...@gmail.com <tompante...@gmail.com>]
>
> *Sent:* Thursday, October 12, 2017 1:05 PM
> *To:* Muthukumaran K
> *Cc:* Faseela K; infrautils-...@lists.opendaylight.org;
> controller-dev@lists.opendaylight.org; R Srinivasan E; Dayavanti Gopal
> Kamath
> *Subject:* Re: [controller-dev] Expose Datastore health to applications
> via infrautils.diagstatus
>
>
>
>
>
>
>
> On Thu, Oct 12, 2017 at 3:08 AM, Muthukumaran K <
> muthukumara...@ericsson.com> wrote:
>
> Hi Tom,
>
>
>
> While the initial status of the CDS is inferable using the aggregate
> SyncStatus, for dynamic status (eg. after startup, leader mobility in
> cluster due to load, availability scenarios like node-loss etc.), we were
> thinking of explicitly checking if all configured shards do have the leader
> or not (of course using the Shard Level MBeans).
>
>
>
> But, from your mail, I understand that aggregate SyncStatus being set to
> false can be a more easier way to address dynamic changes post start
> instead of doing shardwise checking.
>
>
>
> Is my understanding correct ?
>
>
>
>
>
> That is correct. The shard will report a sync status change if it's a
> follower and the leader changes or if it goes to candidate. Of course if
> it's the leader, its sync status is automatically true. Also a follower
> shard will report it's not in sync if it lags

Re: [controller-dev] Expose Datastore health to applications via infrautils.diagstatus

2017-10-12 Thread Anil Vishnoi
Given that we are relying on MBean info to determine readiness, does it
matter whether this is part of the controller or a service in infrautils?
Given that the data store is a core service, writing a service in
infrautils is probably not a bad idea.

With the current MBeans exposed by the controller, I think MBean
notifications are not supported (I might be wrong), but if we can enable
MBean notifications, it will be like a status published by the controller;
infrautils would then just need to make sense of the changes.
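
To make the MBean-based readiness check a bit more concrete, here is a
minimal sketch of reading a boolean SyncStatus attribute over JMX with the
standard javax.management API. The DemoShardManager bean and the demo.cds
object name are stand-ins I made up so the example runs on its own; against
a real controller you would use whatever ObjectName the ShardManager MBean
is actually registered under, which I have not verified here. It polls
rather than subscribes, since I am not sure the existing MBeans emit
notifications.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class SyncStatusProbe {
    /** Hypothetical stand-in for the real bean; standard MBean interfaces must be public. */
    public interface DemoShardManagerMBean {
        boolean getSyncStatus();
    }

    /** Demo implementation exposing a single read-only SyncStatus attribute. */
    public static class DemoShardManager implements DemoShardManagerMBean {
        private final boolean syncStatus;

        public DemoShardManager(boolean syncStatus) {
            this.syncStatus = syncStatus;
        }

        @Override
        public boolean getSyncStatus() {
            return syncStatus;
        }
    }

    /** Reads the SyncStatus attribute of the named MBean; wraps JMX failures. */
    public static boolean readSyncStatus(MBeanServer server, ObjectName name) {
        try {
            return (Boolean) server.getAttribute(name, "SyncStatus");
        } catch (Exception e) {
            throw new IllegalStateException("Cannot read SyncStatus from " + name, e);
        }
    }

    /** Registers the demo bean, reads it back through JMX, and cleans up. */
    public static boolean demo(boolean status) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            // Assumed name for the demo only, not the controller's real ObjectName.
            ObjectName name = new ObjectName(
                    "demo.cds:type=DistributedConfigDatastore,name=shard-manager-config");
            server.registerMBean(new DemoShardManager(status), name);
            try {
                return readSyncStatus(server, name);
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("in sync: " + demo(true));
    }
}
```

A service in infrautils could call something like readSyncStatus() on each
status request, which keeps the dependency pointing from infrautils to JMX
rather than from controller to infrautils.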

On Thu, Oct 12, 2017 at 1:24 AM, Faseela K  wrote:

> Thanks Muthu, Tom!
>
> Now the only question is whether CONTROLLER can depend on INFRAUTILS to
> implement the ServiceStatusPoller OsgiService, so that we will have one
> single point where all major service status will be available.
>
>
>
> Thanks,
>
> Faseela
>
>
>
> *From:* Muthukumaran K
> *Sent:* Thursday, October 12, 2017 1:52 PM
> *To:* Tom Pantelis 
> *Cc:* Faseela K ; infrautils-dev@lists.
> opendaylight.org; controller-dev@lists.opendaylight.org; R Srinivasan E <
> r.e.sriniva...@ericsson.com>; Dayavanti Gopal Kamath <
> dayavanti.gopal.kam...@ericsson.com>
> *Subject:* RE: [controller-dev] Expose Datastore health to applications
> via infrautils.diagstatus
>
>
>
> Thanks Tom. Then we would use the aggregate SyncStatus at Shard-Manager
> level.
>
>
>
> If we need to further drill down at shard-level (I do not have a usecase
> readily for that though) we can use Shard Level MXBeans anyway for any
> manual troubleshooting
>
>
>
> Regards
>
> Muthu
>
>
>
>
>
> *From:* Tom Pantelis [mailto:tompante...@gmail.com ]
>
> *Sent:* Thursday, October 12, 2017 1:05 PM
> *To:* Muthukumaran K
> *Cc:* Faseela K; infrautils-...@lists.opendaylight.org;
> controller-dev@lists.opendaylight.org; R Srinivasan E; Dayavanti Gopal
> Kamath
> *Subject:* Re: [controller-dev] Expose Datastore health to applications
> via infrautils.diagstatus
>
>
>
>
>
>
>
> On Thu, Oct 12, 2017 at 3:08 AM, Muthukumaran K <
> muthukumara...@ericsson.com> wrote:
>
> Hi Tom,
>
>
>
> While the initial status of the CDS is inferable using the aggregate
> SyncStatus, for dynamic status (eg. after startup, leader mobility in
> cluster due to load, availability scenarios like node-loss etc.), we were
> thinking of explicitly checking if all configured shards do have the leader
> or not (of course using the Shard Level MBeans).
>
>
>
> But, from your mail, I understand that aggregate SyncStatus being set to
> false can be a more easier way to address dynamic changes post start
> instead of doing shardwise checking.
>
>
>
> Is my understanding correct ?
>
>
>
>
>
> That is correct. The shard will report a sync status change if it's a
> follower and the leader changes or if it goes to candidate. Of course if
> it's the leader, its sync status is automatically true. Also a follower
> shard will report it's not in sync if it lags behind the leader by a
> certain # of commits (default 10).
>
>
>
> Regards
>
> Muthu
>
>
>
>
>
> *From:* controller-dev-boun...@lists.opendaylight.org [mailto:
> controller-dev-boun...@lists.opendaylight.org] *On Behalf Of *Tom Pantelis
> *Sent:* Thursday, October 12, 2017 12:28 PM
> *To:* Faseela K
> *Cc:* infrautils-...@lists.opendaylight.org; controller-dev@lists.
> opendaylight.org; R Srinivasan E; Dayavanti Gopal Kamath
> *Subject:* Re: [controller-dev] Expose Datastore health to applications
> via infrautils.diagstatus
>
>
>
>
>
>
>
> On Wed, Oct 11, 2017 at 2:16 PM, Faseela K  wrote:
>
> Hello controller-dev,
>
>
>
>We @ infrautils have developed a status-and-diagnostics framework,
> where applications can register their services,
>
>And report when they are functionally up. Northbound and Southbound
> interfaces for ODL can open-up and accept configurations,
>
>When all the required services are UP. As part of this, we were
> thinking if we can have a “DATASTORE” service, whose status can
>
>Be shown as “OPERATIONAL” when all the shards have properly elected
> their leaders. We do see that there are several MBeans  exposed by
> controller repo under *org.opendaylight.controller:Category=Shards,name="*
> +**+*",type=DistributedConfigDatastore*
>
>   which can be used to derive the same information.
>
>Instead of doing that from outside, wanted to explore the possibility
> of integrating controller.sal-distributed-datastore with
> infrautils.diagstatus to report the status when the initial shard leader
> election is complete,
>
>And implement the dynamic poll interface to fetch the shard leader
> status at random points in time. Please share your thoughts.
>
>
>
> This sounds like a reasonable idea.  CDS does have an aggregated shard
> sync status that is collected and reported by the ShardManager to
> the ShardManagerInfo MBean's SyncStatus attribute for each data store (eg
> 

Re: [controller-dev] [mdsal-dev] How to increase the timeout value for a commit operation to the data store ?

2017-05-16 Thread Anil Vishnoi
Tom,

I think it depends on the use case for which the user is using a cohort.
For example, a user may send very few REST requests to the controller from
the northbound side but want to run every possible check against that data
to avoid any wrong configuration (wrong according to the use case, not
necessarily according to the YANG schema). In general I agree with you that
anything that takes more than 5 seconds is probably better written in the
application than in the cohort, but we don't know all the use cases people
use it for. So I think a config knob (with a default value of 5 seconds or
lower) will give users the option to change it (increase or decrease) as
per their use case.
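
Something along these lines is what I have in mind for the knob. It is a
minimal sketch only, not how ShardDataTree works today: the property name
cohort.cancommit.timeout.seconds and the class are hypothetical, and the
default just mirrors the 5 seconds discussed in this thread.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CohortTimeoutKnob {
    /** Hypothetical property name; nothing in the controller reads it. */
    public static final String PROPERTY = "cohort.cancommit.timeout.seconds";
    public static final Duration DEFAULT_TIMEOUT = Duration.ofSeconds(5);

    /** Parses the configured value, falling back to the default when unset or invalid. */
    public static Duration resolve(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_TIMEOUT;
        }
        try {
            long seconds = Long.parseLong(configured.trim());
            return seconds > 0 ? Duration.ofSeconds(seconds) : DEFAULT_TIMEOUT;
        } catch (NumberFormatException e) {
            return DEFAULT_TIMEOUT;
        }
    }

    /** Waits for a validation result; a timeout or failure counts as "cannot commit". */
    public static boolean awaitCanCommit(CompletableFuture<Boolean> result, Duration timeout) {
        try {
            return result.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Duration timeout = resolve(System.getProperty(PROPERTY));
        CompletableFuture<Boolean> validation = CompletableFuture.completedFuture(true);
        System.out.println("timeout=" + timeout
                + " canCommit=" + awaitCanCommit(validation, timeout));
    }
}
```

Running it with -Dcohort.cancommit.timeout.seconds=20 would cover the
20-second validation case from this thread, while everyone else keeps the
default.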

On Tue, May 16, 2017 at 3:19 AM, Tom Pantelis <tompante...@gmail.com> wrote:

> yes - it is currently hard-coded to 5 sec. It was not intended for cohorts
> to take 5-20 sec to validate. Cohorts are supposed to be performant as the
> API javadocs stress, especially since they're currently invoked
> synchronously and thus block the Shard.
>
> On Mon, May 15, 2017 at 10:20 PM, Anil Vishnoi <vishnoia...@gmail.com>
> wrote:
>
>> I believe this is where it is set
>>
>> https://github.com/opendaylight/controller/blob/master/opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/datastore/ShardDataTree.java#L106
>>
>> Not sure if there is any way to configure it though any akka/cluster
>> config knob.
>>
>> On Mon, May 15, 2017 at 8:23 AM, Michael Vorburger <vorbur...@redhat.com>
>> wrote:
>>
>>> On Mon, May 15, 2017 at 5:06 PM, Satish Dutt <sd...@advaoptical.com>
>>> wrote:
>>>
>>>> Hi Michael,
>>>>
>>>>
>>>>
>>>> Thanks for your response. I am writing a custom cohort class for some
>>>> validation purpose, which extends the DOMDataTreeCommitCohort of the
>>>> mdsal.dom.api package and overrides the canCommit(). canCommit() in my
>>>> cohort does some validations and just return a Future object indicating
>>>> success or failure. I am NOT actually timing out the Future . Sometimes my
>>>> cohort class takes more than 5 seconds to execute and the MDSAL commit
>>>> times-out.
>>>>
>>>
>>> Oh you didn't specify that in your first email... sorry, I don't know
>>> anything more about this; maybe others will chime in.
>>>
>>>
>>>> I  suspect  that some classes in MDSAL are probably timing out the
>>>> commit, since it is exceeding the default timeout value of 5 seconds. But
>>>> in my application, I can wait for more than 5 seconds possibly around 20
>>>> seconds for doing the validation. For this I want to know the code of the
>>>> MDSAL which I can tweak for setting a higher timeout value and use that
>>>> code locally in my application.
>>>>
>>>>
>>>>
>>>>   "errors": {
>>>>
>>>> "error": [
>>>>
>>>>   {
>>>>
>>>> "error-type": "application",
>>>>
>>>> "error-tag": "operation-failed",
>>>>
>>>> "error-message": "canCommit encountered an unexpected failure",
>>>>
>>>> "error-info": "java.util.concurrent.TimeoutException: Futures timed out after [5 seconds]\n\tat
>>>> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)\n\tat
>>>> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)\n\tat
>>>> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)\n\tat
>>>> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread$$anon$3.block(ThreadPoolBuilder.scala:167)\n\tat
>>>> scala.concurrent.forkjoin.ForkJoinPool.managedBlock(ForkJoinPool.java:3640)\n\tat
>>>> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.blockOn(ThreadPoolBuilder.scala:165)\n\tat
>>>> scala.concurrent.Await$.result(package.scala:190)\n\tat
>>>> scala.concurrent.Await.result(package.scala)\n\tat
>>>> org.opendaylight.controller.cluster.datastore.CompositeDataTreeCohort.processResponses(CompositeDataTreeCohort.java:162)\n\tat
>>>> org.opendaylight.controller.cluster.datastore.CompositeDataTreeCohort.canCommit(CompositeDataTreeCohort.java:122)\n\t

Re: [controller-dev] [mdsal-dev] How to increase the timeout value for a commit operation to the data store ?

2017-05-15 Thread Anil Vishnoi
I believe this is where it is set

https://github.com/opendaylight/controller/blob/master/opendaylight/md-sal/sal-distributed-datastore/src/main/java/org/opendaylight/controller/cluster/datastore/ShardDataTree.java#L106

Not sure if there is any way to configure it through any akka/cluster
config knob.

On Mon, May 15, 2017 at 8:23 AM, Michael Vorburger 
wrote:

> On Mon, May 15, 2017 at 5:06 PM, Satish Dutt 
> wrote:
>
>> Hi Michael,
>>
>>
>>
>> Thanks for your response. I am writing a custom cohort class for some
>> validation purpose, which extends the DOMDataTreeCommitCohort of the
>> mdsal.dom.api package and overrides the canCommit(). canCommit() in my
>> cohort does some validations and just return a Future object indicating
>> success or failure. I am NOT actually timing out the Future . Sometimes my
>> cohort class takes more than 5 seconds to execute and the MDSAL commit
>> times-out.
>>
>
> Oh you didn't specify that in your first email... sorry, I don't know
> anything more about this; maybe others will chime in.
>
>
>> I  suspect  that some classes in MDSAL are probably timing out the
>> commit, since it is exceeding the default timeout value of 5 seconds. But
>> in my application, I can wait for more than 5 seconds possibly around 20
>> seconds for doing the validation. For this I want to know the code of the
>> MDSAL which I can tweak for setting a higher timeout value and use that
>> code locally in my application.
>>
>>
>>
>>   "errors": {
>>
>> "error": [
>>
>>   {
>>
>> "error-type": "application",
>>
>> "error-tag": "operation-failed",
>>
>> "error-message": "canCommit encountered an unexpected failure",
>>
>> "error-info": "java.util.concurrent.TimeoutException: Futures timed out after [5 seconds]\n\tat
>> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)\n\tat
>> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)\n\tat
>> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)\n\tat
>> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread$$anon$3.block(ThreadPoolBuilder.scala:167)\n\tat
>> scala.concurrent.forkjoin.ForkJoinPool.managedBlock(ForkJoinPool.java:3640)\n\tat
>> akka.dispatch.MonitorableThreadFactory$AkkaForkJoinWorkerThread.blockOn(ThreadPoolBuilder.scala:165)\n\tat
>> scala.concurrent.Await$.result(package.scala:190)\n\tat
>> scala.concurrent.Await.result(package.scala)\n\tat
>> org.opendaylight.controller.cluster.datastore.CompositeDataTreeCohort.processResponses(CompositeDataTreeCohort.java:162)\n\tat
>> org.opendaylight.controller.cluster.datastore.CompositeDataTreeCohort.canCommit(CompositeDataTreeCohort.java:122)\n\tat
>> org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.userPreCommit(SimpleShardDataTreeCohort.java:162)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardDataTree.startPreCommit(ShardDataTree.java:584)\n\tat
>> org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.preCommit(SimpleShardDataTreeCohort.java:91)\n\tat
>> org.opendaylight.controller.cluster.datastore.CohortEntry.preCommit(CohortEntry.java:102)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.doCommit(ShardCommitCoordinator.java:296)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.access$200(ShardCommitCoordinator.java:49)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$2.onSuccess(ShardCommitCoordinator.java:243)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator$2.onSuccess(ShardCommitCoordinator.java:237)\n\tat
>> org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.successfulCanCommit(SimpleShardDataTreeCohort.java:145)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardDataTree.processNextTransaction(ShardDataTree.java:526)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardDataTree.startCanCommit(ShardDataTree.java:560)\n\tat
>> org.opendaylight.controller.cluster.datastore.SimpleShardDataTreeCohort.canCommit(SimpleShardDataTreeCohort.java:81)\n\tat
>> org.opendaylight.controller.cluster.datastore.CohortEntry.canCommit(CohortEntry.java:98)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleCanCommit(ShardCommitCoordinator.java:237)\n\tat
>> org.opendaylight.controller.cluster.datastore.ShardCommitCoordinator.handleReadyLocalTransaction(ShardCommitCoordinator.java:201)\n\tat
>> org.opendaylight.controller.cluster.datastore.Shard.handleReadyLocalTransaction(Shard.java:437)\n\tat
>> org.opendaylight.controller.cluster.datastore.Shard.handleNonRaftCommand(Shard.java:243)\n\tat
>> org.opendaylight.controller.cluster.raft.RaftActor.handleCommand(RaftActor.java:291)\n\tat
>> org.opendaylight.controller.cluster.common.actor.AbstractUnt
>> 

Re: [controller-dev] Nomination for new committer

2017-04-18 Thread Anil Vishnoi
+1

Sent from my iPhone

> On Apr 18, 2017, at 8:06 AM, Tom Pantelis  wrote:
> 
> Hello controller committers,
> 
> I'd like to nominate Robert Varga for the committer role on the controller 
> project. As we all know, Robert has been a significant contributor to all the 
> kernel projects since the inception of ODL and, particularly, has been 
> contributing to CDS code in the controller project (he reviews most of my 
> patches).
> 
> Please vote +1,0,-1 on whether you would like to see them as a committer.
> 
> Thanks
> Tom


Re: [controller-dev] Nomination for new Committers

2017-04-07 Thread Anil Vishnoi
+1 for both Stephen and Michael

> On Apr 7, 2017, at 9:12 AM, Tom Pantelis  wrote:
> 
> Hello controller committers,
> 
> I'd like to nominate Stephen Kitt and Michael Vorburger for the committer 
> role on the controller project. Both have pushed and reviewed many patches 
> over the last year or so. Also I've been the only active committer for quite 
> a while now so we need other active committers going forward.
> 
> Please vote +1,0,-1 on whether you would like to see them as a committer.
> 
> Thanks
> Tom


Re: [controller-dev] Stepping Down as Controller PTL, PTL Election

2016-10-27 Thread Anil Vishnoi
Congratulations Tom :)

On Thu, Oct 27, 2016 at 5:03 PM, Abhijit Kumbhare <abhijitk...@gmail.com>
wrote:

> Congrats Tom! Well deserved!
>
> On Thu, Oct 27, 2016 at 3:42 PM, Tom Pantelis <tompante...@gmail.com>
> wrote:
>
>> So that's 3 of 5 committers voting +1, that's a majority. I'm declaring
>> victory
>>
>> On Wed, Oct 26, 2016 at 8:24 AM, Anil Vishnoi <vishnoia...@gmail.com>
>> wrote:
>>
>>> +1 for Tom Pantelis Nomination for PTL.
>>>
>>> On Tue, Oct 25, 2016 at 9:50 AM, Edward Warnicke <hagb...@gmail.com>
>>> wrote:
>>>
>>>> +1
>>>>
>>>> On Thu, Oct 13, 2016 at 8:42 AM, Tom Pantelis <tompante...@gmail.com>
>>>> wrote:
>>>>
>>>>> I'd like to nominate myself.
>>>>>
>>>>> Tom
>>>>>
>>>>> On Thu, Oct 13, 2016 at 11:35 AM, Anton Tkáčik <tony.tka...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> As you noticed for  I was not able to participate as PTL during last
>>>>>> months..
>>>>>> For Carbon this situation will not change, and I will be unable to
>>>>>> participate as PTL for Controller project, so:
>>>>>>
>>>>>> - I am stepping down from role of  Controller PTL
>>>>>>
>>>>>> I want to open PTL nominations / election for Controller project lead.
>>>>>>
>>>>>> Thanks for understanding,
>>>>>> Tony Tkacik
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Thanks
>>> Anil
>>>
>>
>>
>


-- 
Thanks
Anil


Re: [controller-dev] Stepping Down as Controller PTL, PTL Election

2016-10-26 Thread Anil Vishnoi
+1 for Tom Pantelis Nomination for PTL.

On Tue, Oct 25, 2016 at 9:50 AM, Edward Warnicke  wrote:

> +1
>
> On Thu, Oct 13, 2016 at 8:42 AM, Tom Pantelis 
> wrote:
>
>> I'd like to nominate myself.
>>
>> Tom
>>
>> On Thu, Oct 13, 2016 at 11:35 AM, Anton Tkáčik 
>> wrote:
>>
>>> Hi,
>>> As you noticed for  I was not able to participate as PTL during last
>>> months..
>>> For Carbon this situation will not change, and I will be unable to
>>> participate as PTL for Controller project, so:
>>>
>>> - I am stepping down from role of  Controller PTL
>>>
>>> I want to open PTL nominations / election for Controller project lead.
>>>
>>> Thanks for understanding,
>>> Tony Tkacik
>>>
>>
>


-- 
Thanks
Anil


Re: [controller-dev] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem

2016-08-25 Thread Anil Vishnoi


> On Aug 25, 2016, at 10:02 AM, Robert Varga  wrote:
> 
>> On 08/25/2016 05:06 PM, Tom Pantelis wrote:
>> It can be adjusted via the akka.jvm-exit-on-fatal-error setting. It only
>> exits on JVM Error. But JVM Error like NoClassDefFoundError is a serious
>> error - does it make sense to continue startup with a broken controller?
> 
> This is the correct setting: if we encounter uncaught Errors, the entire
> JVM is entering undefined behavior -- better kill it before it starts
> eating pets...
+1
> 
> Bye,
> Robert
> 
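
For readers following the Error-versus-Exception point in this thread: the sketch below is plain Java, not Akka code, and the class and method names (FatalErrorDemo, classify, runGuarded) are made up for illustration. It only shows why a NoClassDefFoundError slips past ordinary exception handling: it is a java.lang.Error, so a catch (Exception e) block never sees it. That is roughly the class of failure the akka.jvm-exit-on-fatal-error setting discussed above reacts to.

```java
// Hypothetical demo class; it does not use or reproduce Akka internals.
class FatalErrorDemo {

    /** Labels a throwable: subclasses of java.lang.Error are treated as fatal. */
    static String classify(Throwable t) {
        return (t instanceof Error) ? "fatal" : "recoverable";
    }

    /** Runs a task, handling ordinary exceptions but letting Errors propagate. */
    static String runGuarded(Runnable task) {
        try {
            task.run();
            return "ok";
        } catch (Exception e) {
            // Only Exception subclasses land here; Errors bypass this block.
            return "handled " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new NoClassDefFoundError("demo")));
        System.out.println(classify(new IllegalStateException("demo")));
        System.out.println(runGuarded(() -> { throw new IllegalStateException("demo"); }));
        try {
            runGuarded(() -> { throw new NoClassDefFoundError("demo"); });
        } catch (Error e) {
            // The Error escaped runGuarded because it is not an Exception.
            System.out.println("escaped: " + e.getClass().getSimpleName());
        }
    }
}
```

An application that catches only Exception would therefore keep running in an unknown state after such an Error unless something, such as the exit-on-fatal-error behavior, stops the JVM.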