Shuva,

This OpenFlow test run using your patch:

https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-1node-flow-services-only-boron/676/

shows port openflow:2:1 down in inventory, the same as detected by the VTN manager suite.

I am adding a check to detect this issue in the OpenFlow test:

https://git.opendaylight.org/gerrit/#/c/45193/

BR/Luis

> On Sep 5, 2016, at 7:54 PM, Shuva Jyoti Kar <[email protected]> wrote:
>
> Thanks Luis and Hideyuki for the analysis. After going through all the CSIT logs where the failure occurred, I have come to the following conclusion.
>
> The failures that we are facing are due to a couple of reasons:
> 1. The port status update comes in before the mastership of the switch has been established, leading to a TransactionChainClosed Exception and thus a missed event.
> 2. The port status as reported by VTNInventoryManager is down (enabled = false), which can be due to state/port-config being null or the booleans being reported true. OpenDaylight inventory will not allow a null state, however port-config can be null.
> (Refer to PortTranslatorUtil in openflowplugin).
>
> Bug 6595 was raised highlighting issue [1], which has been addressed by retrying the status update, as in https://git.opendaylight.org/gerrit/#/c/45125/
>
> But the failures are still occurring due to [2], and as such we need to check the inventory-config DS to ensure that the state and the port configurations are present at the time VTN tries to read them.
>
> URL: <ip-address>:8181/restconf/operational/opendaylight-inventory:nodes/
>
> Code snippet from VTN (InventoryUtils.toVtnPortBuilder()):
>
>     PortConfig pcfg = fcnc.getConfiguration();
>     Boolean portDown = (pcfg == null) ? null : pcfg.isPORTDOWN();
>     State state = fcnc.getState();
>     Boolean linkDown = (state == null) ? null : state.isLinkDown();
>
>     boolean enabled = false;
>     if (Boolean.FALSE.equals(portDown) && Boolean.FALSE.equals(linkDown)) {
>         enabled = true;
>     }
>
> I tested with a tree topology at levels 2 and 3 in ovs 2.4 and checked that the Booleans were populated correctly and the state was present. However, I am not familiar with the VTN test case and the topology. Hence I would request anyone from VTN to reproduce the issue with the openflowplugin fix in place and take a dump of the inventory-operational datastore to check whether the status is null or either of the Booleans is coming as true.
>
> Also do let us know the version of ovs that you are testing with and when the PortUpdateTask kicks in. I guess it’s a DTCL listener for of-inventory-port, am I correct?
>
> Thanks
> Shuva
>
>
> From: Luis Gomez [mailto:[email protected]]
> Sent: Tuesday, September 06, 2016 12:43 AM
> To: Tai, Hideyuki
> Cc: Shuva Jyoti Kar; [email protected]; [email protected]; [email protected]
> Subject: Re: [vtn-dev] [integration-dev] [openflowplugin-dev] Blocking bugs
>
> Right, I was going to send a mail with the same observation; the WARN below is weird because mininet ports are normally UP.
>
> 2016-09-05 16:20:45,905 | WARN | Runner: VTN Main | VBridge | 194 - org.opendaylight.vtn.manager.implementation - 0.5.0.SNAPSHOT | vBridge:Tenant1/vBridge1: Drop packet because egress port is down: src=8a:da:81:51:9e:e1, dst=36:ee:d2:07:79:e4, port=openflow:2:1, type=0x800, vlan=0
>
> BR/Luis
>
> On Sep 5, 2016, at 12:06 PM, Tai, Hideyuki <[email protected]> wrote:
>
> Hi Luis,
>
> Thank you for running the test!!!
>
> I've checked the log file.
>
> https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-boron/649/archives/karaf.log.gz
>
> The port status WARN message "Error processing port status message" didn't appear in the above log file.
>
> However, the VTN CSIT failed in the same way as before.
> I mean it seems to me that the OpenFlow Plugin still failed to update the port status of the port "openflow:2:1".
> I think, even with Shuva's patch (Gerrit 45125), the OpenFlow plugin reported that the port status of the port "openflow:2:1" was down.
>
> Just for your information, I still saw the following log messages in the above log file.
>
> 2016-09-05 16:20:39,328 | INFO | on-dispatcher-41 | VTNInventoryManager | 194 - org.opendaylight.vtn.manager.implementation - 0.5.0.SNAPSHOT | Port has been created: {id=openflow:2:1, name=s2-eth1, enabled=false, cost=1000, links=none}
>
> 2016-09-05 16:20:45,905 | WARN | Runner: VTN Main | VBridge | 194 - org.opendaylight.vtn.manager.implementation - 0.5.0.SNAPSHOT | vBridge:Tenant1/vBridge1: Drop packet because egress port is down: src=8a:da:81:51:9e:e1, dst=36:ee:d2:07:79:e4, port=openflow:2:1, type=0x800, vlan=0
>
>
> Actually, the VTN CSIT failed without the port status WARN message "Error processing port status message" even without Shuva's patch (Gerrit 45125).
> For example, the following CSIT (September 2nd) failed in the same way, but it didn't show the port status WARN message "Error processing port status message".
> https://jenkins.opendaylight.org/releng/view/vtn/job/vtn-csit-1node-manager-only-boron/640/
>
> Best Regards,
> Hideyuki Tai
>
> From: [email protected] <[email protected]> on behalf of Luis Gomez <[email protected]>
> Sent: Monday, September 5, 2016 10:02 AM
> To: Shuva Jyoti Kar
> Cc: [email protected]; [email protected]; [email protected]
> Subject: Re: [vtn-dev] [integration-dev] [openflowplugin-dev] Blocking bugs
>
> Hi Shuva,
>
> I ran your fix in the VTN suite:
>
> https://jenkins.opendaylight.org/releng/view/vtn/job/vtn-csit-1node-manager-only-boron/649/
>
> And it still fails:
>
> https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-boron/649/archives/log.html.gz
>
> However, there is no trace of the Port Status ERROR anymore, so maybe this ERROR was not connected with the failure:
>
> https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-boron/649/archives/karaf.log.gz
>
> BR/Luis
>
>
> On Sep 5, 2016, at 6:07 AM, Shuva Jyoti Kar <[email protected]> wrote:
>
> For the first error, I do not see it in either of the logs attached to bug 6595:
> https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-coordinator-only-boron/209/archives/karaf.log.gz
> https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-boron/637/archives/karaf.log.gz
>
> and it's not even mentioned in the bug description. I understand it's new then.
>
> For the second one I notice there is a bug open:
> https://bugs.opendaylight.org/show_bug.cgi?id=6620
>
> so the fix provided in https://git.opendaylight.org/gerrit/#/c/45126/ does improve the situation.
> Is my understanding correct?
>
> Thanks
> Shuva
>
>
> From: Venkatrangan G - ERS, HCL Tech [mailto:[email protected]]
> Sent: Monday, September 05, 2016 4:34 PM
> To: Shuva Jyoti Kar
> Cc: Abhijit Kumbhare; [email protected]; [email protected]; [email protected]
> Subject: RE: [openflowplugin-dev] Blocking bugs
>
> The failures related to the “Device Failure” were seen in the earlier logs also.
> I don’t think the port status problem is observed in this log.
>
> There are two errors
>
> a. LostLeadership Error
> -------------
> 2016-09-05 08:43:58,819 | ERROR | lt-dispatcher-16 | ClusterSingletonServiceGroupImpl | 134 - org.opendaylight.mdsal.singleton-dom-impl - 2.1.0.SNAPSHOT | Unexpected exception state for service Provider openflow:1 in LostLeadership
>
>
> b. BADSETARGUMENT / BADACTION errors blocking flow entry installations
>
> c. BADACTION code BADSETARGUMENT for flow remove (Failed to rollback Flow entry)
>
>
> There are some OF10 cases still failing, due to these flow installation issues.
>
>
> Regards,
> Venkat G
>
> From: Shuva Jyoti Kar [mailto:[email protected]]
> Sent: Monday, September 5, 2016 3:40 AM
> To: Venkatrangan G - ERS, HCL Tech <[email protected]>
> Cc: Abhijit Kumbhare <[email protected]>; [email protected]; [email protected]
> Subject: RE: [openflowplugin-dev] Blocking bugs
>
> Hi Venkat,
> Are the failures old or something new? Are you still observing the port status error?
> Thanks,
> shuva
> On Mon, Sep 05, 2016 at 3:01 PM, Venkatrangan G - ERS, HCL Tech <[email protected]> wrote:
>
> Hi All,
>
> I tested the image [1] in sandbox and still see some failures
>
> a. Some of them with bgpcep, could be a different issue
> b. Device side failures are still observed for some ADD Flow in this run [2]
>
> [1] - https://jenkins.opendaylight.org/releng/job/openflowplugin-distribution-check-boron/82/artifact/distribution-karaf-0.5.0-SNAPSHOT.zip
>
> [2] - https://logs.opendaylight.org/sandbox/jenkins091/vtn-csit-1node-manager-all-boron/2/archives/karaf.log.gz
>
> Regards,
> Venkat G
>
> From: [email protected] [mailto:[email protected]] On Behalf Of Shuva Jyoti Kar
> Sent: Sunday, September 4, 2016 11:40 PM
> To: Abhijit Kumbhare <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [openflowplugin-dev] Blocking bugs
>
> Let's wait for the VTN guys to test and see. I tested it locally and do not see the exception that I used to, but being unaware of what their TC is, let's wait.
> On Mon, Sep 05, 2016 at 11:20 AM, Abhijit Kumbhare <[email protected]> wrote:
>
> Do you want to get this merged since you have already tested it or do you want to wait for VTN to test it out first before merge? If it is the former - probably Jozef can do it in his day time (Monday). If it is the latter (VTN review) - VTN may not be able to review for a day and a half more as there is a holiday in the US on Monday.
>
> On Sat, Sep 3, 2016 at 8:47 AM, Shuva Jyoti Kar <[email protected]> wrote:
> Sorry for crashing all the builds yesterday.
>
> I have coded another fix for the same:
>
> https://git.opendaylight.org/gerrit/#/c/45125/
>
> It passes all the tests that it should, and does not introduce any additional failures.
>
> Results on stable/Bo :
> https://jenkins.opendaylight.org/releng/job/openflowplugin-patch-test-boron/37/
> Results on Carbon:
> https://jenkins.opendaylight.org/releng/job/openflowplugin-patch-test-carbon/11/
>
>
> @Venkat/Hideyuki - Request you to please test it out with this patch to see if it alleviates the VTN issues
>
> Please feel free to ping me
>
> Thanks
> Shuva
>
> -----Original Message-----
> From: Shuva Jyoti Kar
> Sent: Friday, September 02, 2016 4:20 PM
> To: 'Tai, Hideyuki'; Luis Gomez
> Cc: [email protected]; [email protected]
> Subject: RE: [openflowplugin-dev] Blocking bugs
>
> Thanks Luis and Hideyuki for the analysis.
>
> I have put in a fix to address the issue of the port status getting updated before the mastership of the switch is determined
>
> https://git.opendaylight.org/gerrit/#/c/45049/
>
> Do test it with the latest build and let us know
>
> Thanks
> Shuva
>
> -----Original Message-----
> From: Tai, Hideyuki [mailto:[email protected]]
> Sent: Friday, September 02, 2016 7:58 AM
> To: Luis Gomez; Shuva Jyoti Kar
> Cc: [email protected]; [email protected]
> Subject: Re: [openflowplugin-dev] Blocking bugs
>
> Hi Luis,
>
> Thank you for sharing the great information.
>
> The information matches my investigation.
>
> I've written the outcome of my investigation so far into the bug report.
> https://bugs.opendaylight.org/show_bug.cgi?id=6595#c2
>
> I think the OpenFlow plugin failed to process the port status message because it failed to get a WriteTransaction.
> And the OpenFlow plugin failed to get the WriteTransaction because this happened before it started the service as MASTER for the switch.
>
> Best Regards,
> Hideyuki Tai
> ________________________________________
> From: [email protected] <[email protected]> on behalf of Luis Gomez <[email protected]>
> Sent: Thursday, September 1, 2016 7:17 PM
> To: Shuva Jyoti Kar
> Cc: [email protected]
> Subject: Re: [openflowplugin-dev] Blocking bugs
>
> In the case of cluster, the port status ERROR seems to happen just after a member becomes SLAVE for a device:
>
> https://logs.opendaylight.org/releng/jenkins092/openflowplugin-csit-3node-clustering-only-boron/606/archives/odl3_karaf.log.gz
>
> While in VTN Manager the issue seems to happen because of a longer Master selection the first time the switch connects to the controller:
>
> 2016-08-31 23:18:01,788 | INFO | entLoopGroup-5-2 | ConnectionAdapterImpl | 173 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.8.0.Boron-RC2 | Hello received / branch
> 2016-08-31 23:18:02,428 | WARN | entLoopGroup-5-3 | DeviceContextImpl | 183 - org.opendaylight.openflowplugin.impl - 0.3.0.Boron-RC2 | Error processing port status message:
> 2016-08-31 23:18:02,553 | INFO | lt-dispatcher-20 | LifecycleServiceImpl | 183 - org.opendaylight.openflowplugin.impl - 0.3.0.Boron-RC2 | ========== Start-up clustering MASTER services for node openflow:1 was SUCCESSFUL ==========
>
> In successive switch connections the Master resolution is very fast:
>
> 2016-08-31 23:27:28,551 | INFO | entLoopGroup-5-6 | ConnectionAdapterImpl | 173 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.8.0.Boron-RC2 | Hello received / branch
> 2016-08-31 23:27:28,582 | INFO | lt-dispatcher-21 | LifecycleServiceImpl | 183 - org.opendaylight.openflowplugin.impl - 0.3.0.Boron-RC2 | ========== Start-up clustering MASTER services for node openflow:1 was SUCCESSFUL ==========
>
> BR/Luis
>
>
>
> On Sep 1, 2016, at 10:33 AM, Luis Gomez <[email protected]> wrote:
> >
> > Sure, will do that today.
> >
> >
> >> On Sep 1, 2016, at 10:11 AM, Shuva Jyoti Kar <[email protected]> wrote:
> >>
> >> Luis/Venkat,
> >>
> >> Have pushed a gerrit to get more information on the port whose status is being missed:
> >> https://git.opendaylight.org/gerrit/#/c/45024/
> >>
> >> Please try it out with this and share the logs to analyse the failure to process the port status message.
> >> We need to establish that the tx chain is activated while we are trying to update the port status
> >>
> >> Thanks
> >> Shuva
> >>
> >> From: Luis Gomez [mailto:[email protected]]
> >> Sent: Thursday, September 01, 2016 2:40 PM
> >> To: Shuva Jyoti Kar
> >> Cc: Venkatrangan G - ERS, HCL Tech; Andrej Leitner; [email protected]
> >> Subject: Re: [openflowplugin-dev] Blocking bugs
> >>
> >> FYI the "Error processing port status message" is not seen very often in openflow test suites for 1 node but it is very present in the 3 node cluster test:
> >>
> >> https://logs.opendaylight.org/releng/jenkins092/openflowplugin-csit-3node-clustering-only-boron/604/archives/odl1_karaf.log.gz
> >>
> >> BR/Luis
> >>
> >>
> >> On Sep 1, 2016, at 1:21 AM, Shuva Jyoti Kar <[email protected]> wrote:
> >>
> >> Venkat,
> >>
> >> I see add flow failing with "Device reported error type BADACTION code BADSETARGUMENT", "Device disconnected" and "java.util.concurrent.CancellationException: Task was cancelled", which are genuine.
> >> Do we know which flow is being rejected as a BADACTION, why there is a flow push when the device is disconnected, or why the task was cancelled?
> >>
> >> Thanks
> >> Shuva
> >>
> >> From: Venkatrangan G - ERS, HCL Tech [mailto:[email protected]]
> >> Sent: Thursday, September 01, 2016 1:14 PM
> >> To: Shuva Jyoti Kar; Luis Gomez; Andrej Leitner
> >> Cc: [email protected]
> >> Subject: RE: [openflowplugin-dev] Blocking bugs
> >>
> >> Shuva,
> >> It fails very frequently since 23rd this month. Every time it fails all the ping tests in OF10 fail, which makes it unusable for Openflow 1.0.
> >> Ref: https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-coordinator-only-boron/195/archives/karaf.log.gz
> >>
> >> Remove Flows RPC has resulted in a failure, Add Flow has resulted in a failure, Statistics has resulted in a failure. Due to these failures the flow entries were not installed.
> >>
> >>
> >> Regards,
> >> Venkat G
> >>
> >> From: Shuva Jyoti Kar [mailto:[email protected]]
> >> Sent: Wednesday, August 31, 2016 11:31 PM
> >> To: Venkatrangan G - ERS, HCL Tech <[email protected]>; Luis Gomez <[email protected]>; Andrej Leitner <[email protected]>
> >> Cc: [email protected]
> >> Subject: RE: [openflowplugin-dev] Blocking bugs
> >>
> >> Venkat,
> >>
> >> Could you please explain your last statement? Does it fail sporadically or consistently?
> >>
> >> Also the statement for bug 6595 states "The latest regression runs for VTN CSIT jobs reports failure in normal flow installation when testing with OF 1.0 switches"
> >> Is that correct, since the flow installation fails genuinely because of errors?
> >>
> >> Please let us know
> >>
> >> Thanks
> >> Shuva
> >>
> >> From: Venkatrangan G - ERS, HCL Tech [mailto:[email protected]]
> >> Sent: Thursday, September 01, 2016 11:57 AM
> >> To: Shuva Jyoti Kar; Luis Gomez; Andrej Leitner
> >> Cc: [email protected]
> >> Subject: RE: [openflowplugin-dev] Blocking bugs
> >>
> >> Shuva,
> >>
> >> We also see that the Port Statistics RPC has failed, which results in VTN concluding that ports are DOWN. VTN is a reactive implementation, hence VTN does not install entries when it considers the egress port to be DOWN.
> >> The tests are not failing in every run, but every time it fails, OF1.0 failures are observed.
> >>
> >>
> >> Regards,
> >> Venkat G
> >>
> >> From: [email protected] [mailto:[email protected]] On Behalf Of Shuva Jyoti Kar
> >> Sent: Wednesday, August 31, 2016 11:18 PM
> >> To: Luis Gomez <[email protected]>; Andrej Leitner <[email protected]>
> >> Cc: [email protected]
> >> Subject: Re: [openflowplugin-dev] Blocking bugs
> >>
> >> Yes, but if I drill into the logs
> >>
> >> 1) https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-coordinator-only-boron/209/archives/karaf.log.gz
> >>    a. errors=Device disconnected (3 times)
> >> 2) https://logs.opendaylight.org/releng/jenkins092/vtn-csit-1node-manager-only-boron/637/archives/karaf.log.gz
> >>    a. errors=Device reported error type BADACTION code BADSETARGUMENT (19 times)
> >>    b. errors=Device disconnected (10 times)
> >>    c. reason java.util.concurrent.CancellationException: Task was cancelled (5 times)
> >>
> >> If you search for SalFlowServiceImpl.
> >> We do have OF1.0 TCs for flow provisioning, are those passing? If yes, then VTN might need to drill down further
> >>
> >> Thanks
> >> Shuva
> >>
> >>
> >>
> >> From: [email protected] [mailto:[email protected]] On Behalf Of Luis Gomez
> >> Sent: Thursday, September 01, 2016 11:35 AM
> >> To: Andrej Leitner
> >> Cc: [email protected]
> >> Subject: Re: [openflowplugin-dev] Blocking bugs
> >>
> >> The bug 6654 is happening sporadically in our system test but you can reproduce it very quickly if you stop and start mininet with no delay a few times. That's what I did to confirm the issue.
> >>
> >> The bug 6595 is a blocker because VTN has a regression in its OF10 system test using the new plugin. BTW I also see this exception very often in the openflow system test so it would be good to fix it for the Boron release.
> >>
> >> BR/Luis
> >>
> >>
> >>
> >> On Aug 31, 2016, at 8:43 PM, Andrej Leitner <[email protected]> wrote:
> >>
> >> Hi,
> >> I assume 6554 is not fixed yet. I've tried to get into it, but was unable to reproduce it locally. Today is a public holiday in Slovakia so I will continue on Friday. I don't know 6595, must be something new.
> >> From: Abhijit Kumbhare <[email protected]>
> >> Sent: Wednesday, August 31, 2016 9:09:54 PM
> >> To: [email protected]
> >> Subject: [openflowplugin-dev] Blocking bugs
> >>
> >> Hi folks,
> >>
> >> We seem to have two blocking bugs - that we should discuss tomorrow if not fixed already:
> >>
> >> https://bugs.opendaylight.org/show_bug.cgi?id=6554
> >>
> >> https://bugs.opendaylight.org/show_bug.cgi?id=6595
> >>
> >> Thanks,
> >> Abhijit
> >> AndrejLeitner
> >> Software Developer
> >>
> >> Sídlo / Mlynské Nivy 56 / 821 05 Bratislava / Slovakia
> >> R&D centrum / Janka Kráľa 9 / 974 01 Banská Bystrica / Slovakia
> >> / [email protected]
> >> reception: +421 2 206 65 114 / www.pantheon.sk
> >>
> >>
> >> _______________________________________________
> >> openflowplugin-dev mailing list
> >> [email protected]
> >> https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
> >>
> >>
> >>
> >> ::DISCLAIMER::
> >> ----------------------------------------------------------------------
> >> The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only.
> >> E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents (with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates.
> >> Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of authorized representative of HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately.
> >> Before opening any email and/or attachments, please check them for viruses and other defects.
> >> ----------------------------------------------------------------------
> >
> >
> _______________________________________________
> integration-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/integration-dev
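
For reference, the enabled check quoted from InventoryUtils.toVtnPortBuilder() earlier in this thread can be exercised on its own. The sketch below is only an illustration under stated assumptions: the class name PortEnabledCheck and the method isEnabled are hypothetical, and the two Boolean arguments stand in for pcfg.isPORTDOWN() and state.isLinkDown(), with null meaning that port-config or state was absent from the datastore. It suggests why a missing port-config alone would be enough for VTN to report enabled=false.

```java
public class PortEnabledCheck {

    // Hypothetical helper mirroring the condition quoted in this thread:
    // a port counts as enabled only when both flags are explicitly FALSE.
    // A null flag (port-config or state absent) therefore yields false.
    public static boolean isEnabled(Boolean portDown, Boolean linkDown) {
        return Boolean.FALSE.equals(portDown) && Boolean.FALSE.equals(linkDown);
    }

    public static void main(String[] args) {
        Boolean[] values = {null, Boolean.FALSE, Boolean.TRUE};
        // Print the result for every combination of the two three-valued flags.
        for (Boolean portDown : values) {
            for (Boolean linkDown : values) {
                System.out.println("portDown=" + portDown
                        + " linkDown=" + linkDown
                        + " -> enabled=" + isEnabled(portDown, linkDown));
            }
        }
    }
}
```

Of the nine combinations, only portDown=false with linkDown=false gives enabled=true, so a dump of the operational inventory should show both values present and false for a port that VTN is expected to treat as up.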
