Hi Vratko,

I investigated the issue I mentioned to you and created a bug for it. Currently 
we have these cluster-related bugs in OpenFlow identified by the system test 
(there could be more):

1) table miss flow only pushed by 1 instance (new bug): 
https://bugs.opendaylight.org/show_bug.cgi?id=7770
2) restart of device owner in non-HA scenarios does not work (old bug): 
https://bugs.opendaylight.org/show_bug.cgi?id=6459
3) OpenFlow cluster performance issues (old bug): 
https://bugs.opendaylight.org/show_bug.cgi?id=6755

As you said, it is unclear whether the OpenFlow cluster issues are OpenFlow or 
cluster related. All bugs are now in the OpenFlow queue, and I would expect the 
OpenFlow devs to move them to the cluster queue if that is where they belong.

BR/Luis


> On Feb 7, 2017, at 10:34 AM, Luis Gomez <ece...@gmail.com> wrote:
> 
> 
>> On Feb 7, 2017, at 10:03 AM, Vratko Polak -X (vrpolak - PANTHEON 
>> TECHNOLOGIES at Cisco) <vrpo...@cisco.com> wrote:
>> 
>> Two more questions.
>> 
>> > https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-clustering-only-boron/
>> > Cluster non HA test
>>  
>> I just realized 1) and 2) are the same job.
>> I am not sure which of the six suites [1]
>> you are referring to.
> 
> Typo, this is the link for non-HA: 
> https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-3node-periodic-bulkomatic-clustering-daily-only-boron/
>>  
>> >> but other tests are not, I will have to investigate this.
>> > 
>> > Keep us informed.
>>  
>> Do you have an ETA?
> 
> I would say in the next 2 weeks I will have something in place for cluster 
> scalability.
> 
>>  
>> Vratko.
>>  
>> [1] 
>> https://logs.opendaylight.org/releng/jenkins092/openflowplugin-csit-3node-clustering-only-carbon/470/archives/log.html.gz
>>  
>> From: Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) 
>> Sent: 7 February, 2017 15:05
>> To: 'Luis Gomez' <ece...@gmail.com>
>> Cc: integration-...@lists.opendaylight.org; 
>> controller-dev@lists.opendaylight.org; openflowplugin-dev 
>> <openflowplugin-...@lists.opendaylight.org>
>> Subject: RE: [integration-dev] Clustering acceptance tests
>>  
>> Thanks Luis.
>>  
>> > but other tests are not, I will have to investigate this.
>>  
>> Keep us informed.
>>  
>> > 3) & 4) is probably controller cluster limitation.
>>  
>> Both jobs occasionally pass,
>> and I have opened a Bug [0] for exceptions in karaf log.
>> To me, it looks like an error in OpenflowPlugin
>> (as opposed to Controller) code.
>>  
>> > writing very fast (REST or internal app) on a shard follower DS, and 
>> > reading on the other follower.
>>  
>> We plan to expand controller-csit-3node-rest-clust-cars-perf-only-carbon,
>> not sure yet whether this scenario will be included.
>>  
>> Vratko.
>>  
>> [0] https://bugs.opendaylight.org/show_bug.cgi?id=7750
>>  
>> From: Luis Gomez [mailto:ece...@gmail.com] 
>> Sent: 7 February, 2017 08:35
>> To: Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES at Cisco) 
>> <vrpo...@cisco.com>
>> Cc: integration-...@lists.opendaylight.org; 
>> controller-dev@lists.opendaylight.org; openflowplugin-dev 
>> <openflowplugin-...@lists.opendaylight.org>
>> Subject: Re: [integration-dev] Clustering acceptance tests
>>  
>> Here is what I know from OpenFlow plugin (cc-ing ofplugin devs):
>>  
>> * Does your project have a test plan mentioning specific cluster scenarios?
>>  
>> No written test plan, but we are running a bunch of cluster tests.
>>  
>> 
>> * Do you have any of such scenarios implemented as Robot suites?
>>  
>> 1) 
>> https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-clustering-only-boron/
>>  ->  Cluster HA test (DPN connect to all nodes), it used to pass except for 
>> 1 test (member isolation with iptables), now I see this test is stable but 
>> other tests are not, I will have to investigate this.
>>  
>> 2) 
>> https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-clustering-only-boron/
>>  -> Cluster non HA test (DPN connect to 1 node), failing because of this old 
>> bug: https://bugs.opendaylight.org/show_bug.cgi?id=6459
>>  
>> 3) 
>> https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-periodic-bulkomatic-clustering-perf-daily-only-boron/
>>  -> Max flows/sec using bulk-o-matic DS on cluster setup. Not fully working 
>> because of some cluster backend limitation: 
>> https://bugs.opendaylight.org/show_bug.cgi?id=6755
>>  
>> 4) 
>> https://jenkins.opendaylight.org/releng/view/CSIT-3node/job/openflowplugin-csit-3node-periodic-restconf-clustering-perf-daily-only-boron/
>>  -> Max flows/sec using NB REST on cluster setup, this never worked very 
>> well because of the previous bug.
>>  
>> * Do the robot suites have failures, suspected to be caused by clustering
>>   (as opposed to application logic, or mistakes in Robot code)?
>>  
>> So far I think the issue in 2) is in the OpenFlow cluster implementation and 
>> the issue in 3) & 4) is probably controller cluster limitation.
>>  
>> 
>> * Are there open Bugs corresponding to the clustering failures?
>>  
>> Yes, except for 1), which will require some analysis of the unstable tests.
>>  
>> 
>> * Are you planning to implement more Robot 3node suites until Carbon release?
>>  
>> I will probably replace 1 of the performance suites (no point in running 2 
>> if they do not work) with a cluster switch scalability test. 
>>  
>> 
>> * Are there scenarios you would like Controller team to cover using mock 
>> apps?
>>  
>> I think the issue in 3) & 4) could be reproduced in the controller project by 
>> just writing very fast (REST or internal app) on a shard follower DS, and 
>> reading on the other follower. 
>>  
>> On Feb 6, 2017, at 5:31 AM, Vratko Polak -X (vrpolak - PANTHEON TECHNOLOGIES 
>> at Cisco) <vrpo...@cisco.com> wrote:
>>  
>> Hello Test Contacts.
>>  
>> In Controller project, our highest priority
>> for Carbon release is to make sure ODL clustering
>> is usable and stable.
>>  
>> We are in the phase of formulating explicit acceptance criteria,
>> so we can create execution plan for turning them into Robot suites.
>>  
>> Of course, clustering is not very useful just by itself,
>> it is used as a tool applications can use to reach their goals.
>> So real acceptance criteria for clustering should also
>> take into account whether ODL applications can work in cluster.
>>  
>> Many projects are already running their 3node CSIT tests,
>> but some important scenarios might not be covered yet,
>> and some suites might be too unstable to serve as acceptance tests.
>>  
>> Controller team is small and busy, so we are asking for help.
>> Here is a set of quick questions for test contacts:
>> * Does your project have a test plan mentioning specific cluster scenarios?
>> * Do you have any of such scenarios implemented as Robot suites?
>> * Do the robot suites have failures, suspected to be caused by clustering
>>   (as opposed to application logic, or mistakes in Robot code)?
>> * Are there open Bugs corresponding to the clustering failures?
>> * Are you planning to implement more Robot 3node suites until Carbon release?
>> * Are there scenarios you would like Controller team to cover using mock 
>> apps?
>>  
>> Vratko (as a Controller test contact).
>> _______________________________________________
>> integration-dev mailing list
>> integration-...@lists.opendaylight.org
>> https://lists.opendaylight.org/mailman/listinfo/integration-dev

_______________________________________________
controller-dev mailing list
controller-dev@lists.opendaylight.org
https://lists.opendaylight.org/mailman/listinfo/controller-dev