Hi Jamo,

All the TCs failed; please check. It seems to be a basic issue:
===========
Message: Parent suite setup failed: No keyword with name 'KarafKeywords.Execute_Controller_Karaf_Command_On_Background log:set TRACE org.opendaylight.openflowplugin.impl' found.
===========

Regards,
Arun

-----Original Message-----
From: Jamo Luhrsen [mailto:[email protected]]
Sent: Thursday, January 25, 2018 11:21 AM
To: D Arunprakash <[email protected]>; Sam Hague <[email protected]>
Cc: openflowplugin-dev <[email protected]>; Manu B <[email protected]>; Jamo Luhrsen <[email protected]>
Subject: Re: [openflowplugin-dev] is dhcp issue fixed on carbon?

I didn't add any extra tcpdump yet, but here is a job that will enable TRACE for ofp.impl. it's only running the four connectivity/ folder suites.

https://jenkins.opendaylight.org/sandbox/job/netvirt-csit-1node-openstack-ocata-jamo-upstream-stateful-carbon/11/

JamO

On 1/24/18 8:04 PM, D Arunprakash wrote:
> Please enable the logs for the below package,
>
> org.opendaylight.openflowplugin.impl
>
> Regards,
>
> Arun
>
> *From:*Sam Hague [mailto:[email protected]]
> *Sent:* Thursday, January 25, 2018 9:32 AM
> *To:* D Arunprakash <[email protected]>
> *Cc:* Anil Vishnoi <[email protected]>; openflowplugin-dev
> <[email protected]>; Vishal Thapar
> <[email protected]>; Faseela K <[email protected]>; Josh
> Hershberg <[email protected]>; Jamo Luhrsen <[email protected]>;
> Manu B <[email protected]>
> *Subject:* RE: is dhcp issue fixed on carbon?
>
> What logs to enable? Or just the whole openflowplugin, which is very noisy.
>
> Is packet capture on port 6653 good?
>
> On Jan 24, 2018 10:54 PM, "D Arunprakash" <[email protected]
> <mailto:[email protected]>> wrote:
>
> Hi Sam,
>
> We need to enable Openflowplugin logs and do packet capture for the
> time window when the tunnel port is delete and added back.
> > Regards, > > Arun > > *From:*Anil Vishnoi [mailto:[email protected] > <mailto:[email protected]>] > *Sent:* Thursday, January 25, 2018 1:17 AM > *To:* Sam Hague <[email protected] <mailto:[email protected]>> > *Cc:* D Arunprakash <[email protected] > <mailto:[email protected]>>; openflowplugin-dev > <[email protected] > <mailto:[email protected]>>; Vishal Thapar > <[email protected] <mailto:[email protected]>>; > Faseela K <[email protected] <mailto:[email protected]>>; > Josh Hershberg <[email protected] <mailto:[email protected]>>; > Jamo Luhrsen <[email protected] <mailto:[email protected]>>; > Manu B <[email protected] <mailto:[email protected]>> > > > *Subject:* Re: is dhcp issue fixed on carbon? > > Hi Sam, > > Looks like Arun is looking at it ? > > Arun, if you are not looking at it currently, please let me know I > will take a look at it. > > Thanks > > Anil > > On Wed, Jan 24, 2018 at 4:25 AM, Sam Hague <[email protected] > <mailto:[email protected]>> wrote: > > Adding openflow to thread. > > Anil, could someone take a look at this for carbon? We are > seeing a connection flapping and end up missing port status > updates. This leads to stale models and flows. > > This is blocking the carbon sr3. > > On Jan 24, 2018 12:58 AM, "D Arunprakash" > <[email protected] <mailto:[email protected]>> > wrote: > > Ignore my previous email. 
>
> The tunnel port got deleted around 18:49:29.373 and added back at 18:52:46.261:
>
> 2018-01-23T18:49:29.373Z|01979|vconn|DBG|tcp:10.30.170.63:6653: sent (Success): OFPT_PORT_STATUS (OF1.3) (xid=0x0): DEL: 4(tun55fb50d0a2b): addr:3e:0c:ed:2e:a9:ba
>
> 2018-01-23T18:52:46.261Z|03083|vconn|DBG|tcp:10.30.170.63:6653: sent (Success): OFPT_PORT_STATUS (OF1.3) (xid=0x0): ADD: 9(tun55fb50d0a2b): addr:8a:2f:9f:c6:fe:d9
>
> Immediately after the tunnel delete, I’m seeing multiple switch flaps for quite some time:
>
> 2018-01-23T18:49:35.155Z|02108|rconn|DBG|br-int<->unix: entering ACTIVE
>
> 2018-01-23T18:49:35.155Z|02109|vconn|DBG|unix: sent (Success): OFPT_HELLO (OF1.3) (xid=0x75):
>
> version bitmap: 0x04
>
> 2018-01-23T18:49:35.155Z|02110|vconn|DBG|unix: received: OFPT_HELLO (OF1.3) (xid=0x1):
>
> 2018-01-23T18:49:35.307Z|02144|rconn|DBG|br-int<->unix: connection closed by peer
>
> 2018-01-23T18:49:35.307Z|02145|rconn|DBG|br-int<->unix: entering DISCONNECTED
>
> 2018-01-23T18:49:35.324Z|02146|rconn|DBG|br-int<->unix: entering ACTIVE
>
> Also, I’m seeing errors in the karaf log:
>
> 2018-01-23 18:49:29,378 | WARN | entLoopGroup-7-3 | DeviceContextImpl | 280 - org.opendaylight.openflowplugin.impl - 0.4.3.SNAPSHOT | writePortStatusMessage
>
> 2018-01-23 18:49:29,379 | WARN | entLoopGroup-7-3 | DeviceContextImpl | 280 - org.opendaylight.openflowplugin.impl - 0.4.3.SNAPSHOT | submit transaction for write port status message
>
> 2018-01-23 18:49:29,379 | WARN | rd-dispatcher-23 | ShardDataTree | 184 - org.opendaylight.controller.sal-distributed-datastore - 1.5.3.SNAPSHOT | member-1-shard-inventory-operational: Store Tx member-1-datastore-operational-fe-0-chn-8-txn-11-0: Data validation failed for path
>
/(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[{(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:246869078989547}]/AugmentationIdentifier{childNames=[(urn:opendaylight:flow:inventory?revision=2013-08-19)port-number, > (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-group, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-match-types, > (urn:opendaylight:flow:inventory?revision=2013-08-19)table, > (urn:opendaylight:flow:inventory?revision=2013-08-19)group, > (urn:opendaylight:flow:inventory?revision=2013-08-19)manufacturer, > (urn:opendaylight:flow:inventory?revision=2013-08-19)software, > (urn:opendaylight:flow:inventory?revision=2013-08-19)ip-address, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)serial-number, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)table-features, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-actions, > (urn:opendaylight:flow:inventory?revision=2013-08-19)hardware, > (urn:opendaylight:flow:inventory?revision=2013-08-19)description, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)switch-features, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-instructions, > (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-meter, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)meter]}/(urn:opendaylight:flow:inventory?revision=2013-08-19)table/table[{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=50}]/flow. 
> > > org.opendaylight.yangtools.yang.data.api.schema.tree.ModifiedNodeDoesNotExistException: > Node > > /(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[{(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:246869078989547}]/AugmentationIdentifier{childNames=[(urn:opendaylight:flow:inventory?revision=2013-08-19)port-number, > (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-group, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-match-types, > (urn:opendaylight:flow:inventory?revision=2013-08-19)table, > (urn:opendaylight:flow:inventory?revision=2013-08-19)group, > (urn:opendaylight:flow:inventory?revision=2013-08-19)manufacturer, > (urn:opendaylight:flow:inventory?revision=2013-08-19)software, > (urn:opendaylight:flow:inventory?revision=2013-08-19)ip-address, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)serial-number, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)table-features, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-actions, > (urn:opendaylight:flow:inventory?revision=2013-08-19)hardware, > (urn:opendaylight:flow:inventory?revision=2013-08-19)description, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)switch-features, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-instructions, > (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-meter, > > (urn:opendaylight:flow:inventory?revision=2013-08-19)meter]}/(urn:opendaylight:flow:inventory?revision=2013-08-19)table/table[{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=50}]/flow > does not exist. Cannot apply modification to its children. > > We need to check why there is multiple switch disconnect and > reconnect and how ofp handles the same. 
> > Regards, > > Arun > > *From:*Vishal Thapar > *Sent:* Wednesday, January 24, 2018 9:52 AM > *To:* Faseela K <[email protected] > <mailto:[email protected]>>; Sam Hague > <[email protected] <mailto:[email protected]>>; Josh > Hershberg <[email protected] > <mailto:[email protected]>>; D Arunprakash > <[email protected] <mailto:[email protected]>> > *Cc:* Jamo Luhrsen <[email protected] > <mailto:[email protected]>>; Manu B <[email protected] > <mailto:[email protected]>> > *Subject:* RE: is dhcp issue fixed on carbon? > > Missed adding most important detail and added Arun. > > Inventory operational is still showing old port and new port > for some reason. I guess that is what caused problems. > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz#s1- > t25-k4-k2-k1-k2-k56 > > > {"id":"openflow:246869078989547:4","flow-node-inventory:supported":"", > "flow-node-inventory:peer-features":"","flow-node-inventory:port-numbe > r":4,"flow-node-inventory:hardware-address":"3e:0c:ed:2e:a9:ba","flow- > node-inventory:current-feature":"","flow-node-inventory:maximum-speed" > :0,"flow-node-inventory:reason":"add","flow-node-inventory:configurati > on":"","flow-node-inventory:advertised-features":"","flow-node-invento > ry:current-speed":0,"flow-node-inventory:name":"tun55fb50d0a2b","flow- > node-inventory:state":{"link-down":false,"blocked":false,"live":false} > } > > > {"id":"openflow:246869078989547:9","flow-node-inventory:supported":"", > "flow-node-inventory:peer-features":"","flow-node-inventory:port-numbe > r":9,"flow-node-inventory:hardware-address":"8a:2f:9f:c6:fe:d9","flow- > node-inventory:current-feature":"","flow-node-inventory:maximum-speed" > :0,"flow-node-inventory:reason":"add","flow-node-inventory:configurati > on":"","flow-node-inventory:advertised-features":"","flow-node-invento > ry:current-speed":0,"flow-node-inventory:name":"tun55fb50d0a2b","flow- > 
node-inventory:state":{"link-down":false,"blocked":false,"live":false} > } > > OVS output from same set of logs: > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz#s1- > t25-k4-k1-k3-k1-k11-k4 > > 9(tun55fb50d0a2b): addr:8a:2f:9f:c6:fe:d9 > > config: 0 > > state: 0 > > speed: 0 Mbps now, 0 Mbps max > > So for now I’d peg it as OFPlugin issue. It didn’t detect or > inform us of old port delete and that is why we didn’t > delete old flows. Though wondering if something else in IFM > code could’ve handled it, but don’t think we handle OfPort > number changes, expect a delete+add in such scenarios. > Faseela can pitch in why we have service binding entry with > new port number but flow is still using old one. > > Regards, > > Vishal. > > *From:*Vishal Thapar > *Sent:* 24 January 2018 09:26 > *To:* Faseela K <[email protected] > <mailto:[email protected]>>; Sam Hague > <[email protected] <mailto:[email protected]>>; Josh > Hershberg <[email protected] <mailto:[email protected]>> > *Cc:* Jamo Luhrsen <[email protected] > <mailto:[email protected]>>; Manu B <[email protected] > <mailto:[email protected]>> > *Subject:* RE: is dhcp issue fixed on carbon? > > Quick analysis: > > Not related to policy stuff. Service binding has entry for > the new port number but table 220 flow is still using old > port number. 
> > { "bound-services": [ { "flow-cookie": 134217735, > "flow-priority": 9, "instruction": [ { "apply-actions": { > "action": [ { "order": 0, "output-action": { "max-length": > 0, "output-node-connector": "*9*" } } ] }, "order": 0 } ], > "service-name": "default.tun55fb50d0a2b", > "service-priority": 9, "service-type": > "interface-service-bindings:service-type-flow-based" } ], > "interface-name": "tun55fb50d0a2b", "service-mode": > "interface-service-bindings:service-mode-egress" } > > > {"id":"246869078989547.220.tun55fb50d0a2b.0","priority":9,"table_id":2 > 20,"installHw":true,"hard-timeout":0,"match":{"openflowplugin-extensio > n-general:extension-list":[{"extension-key":"openflowplugin-extension- > nicira-match:nxm-nx-reg6-key","extension":{"openflowplugin-extension-n > icira-match:nxm-nx-reg":{"value":4096,"reg":"nicira-match:nxm-nx-reg6" > }}}]},"cookie":134217735,"flow-name":"default.tun55fb50d0a2b","strict" > :true,"instructions":{"instruction":[{"order":0,"apply-actions":{"acti > on":[{"order":0,"output-action":{"max-length":0,"output-node-connector > ":"*4*"}}]}}]},"barrier":false,"idle-timeout":0} > > cookie=0x8000007, duration=403.965s, table=220, n_packets=0, > n_bytes=0, priority=9,reg6=0x1000 actions=output:*4* > > In OVS logs you can see this tunnel port getting deleted and > then coming back in with a different OfPort. > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-gate-stateful-carbon/263/compute_2/ovs-vswitch > d.log.gz > > It goes from 4 to 9. This happens due to clean up in > previous suite which doesn’t actually clean up everything > and leaves entry for that old service binding. Can confirm > it from interfaces-state entry for same port in first and > second suites. So we have stale flows and stale service > bindings for old tunnel port. Could probably check with > OFPlugin how they handle update of a flow, may probably not > work. 
> > We need to check if cleanup has been done completely before > moving to next suite. This is where the work we been doing > on tools comes in. > > Regards, > > Vishal. > > *From:*Faseela K > *Sent:* 24 January 2018 08:10 > *To:* Sam Hague <[email protected] > <mailto:[email protected]>>; Josh Hershberg > <[email protected] <mailto:[email protected]>> > *Cc:* Vishal Thapar <[email protected] > <mailto:[email protected]>>; Jamo Luhrsen > <[email protected] <mailto:[email protected]>>; Manu B > <[email protected] <mailto:[email protected]>> > *Subject:* RE: is dhcp issue fixed on carbon? > > Looks more or less similar issue, tunnel flow is programmed > in table 220 with older tunnel’s port number, which was > deleted in l2 suite. However policy code has not kicked in. > I will take a detailed look on what is causing this issue now. > > Thanks, > > Faseela > > *From:*Faseela K > *Sent:* Wednesday, January 24, 2018 7:48 AM > *To:* 'Sam Hague' <[email protected] > <mailto:[email protected]>>; Josh Hershberg > <[email protected] <mailto:[email protected]>> > *Cc:* Vishal Thapar <[email protected] > <mailto:[email protected]>>; Jamo Luhrsen > <[email protected] <mailto:[email protected]>>; Manu B > <[email protected] <mailto:[email protected]>> > *Subject:* RE: is dhcp issue fixed on carbon? > > Thanks Sam for initial triaging. > > I will take a look at this. > > *From:*Sam Hague [mailto:[email protected]] > *Sent:* Wednesday, January 24, 2018 6:54 AM > *To:* Faseela K <[email protected] > <mailto:[email protected]>>; Josh Hershberg > <[email protected] <mailto:[email protected]>> > *Cc:* Vishal Thapar <[email protected] > <mailto:[email protected]>>; Jamo Luhrsen > <[email protected] <mailto:[email protected]>>; Manu B > <[email protected] <mailto:[email protected]>> > *Subject:* Re: is dhcp issue fixed on carbon? > > OK, seems pretty consistent that table 220 flows are not > showing up. 
Vishal, Faseela, can you see if it is like the > policymgr one where the bind/unbind was wrong? That seems > the closest culprit as those were the last patches merged. > > Here is another case where the table 220 flow is missing in > suite [5] of job [6]. This time the port missing is a tunnel > port. "9(tun55fb50d0a2b): addr:8a:2f:9f:c6:fe:d9" is missing > from table 220. And then in suite [7] of the same job this > port has the same issue where the tunnel port is missing: > "16(tap28760838-a7): addr:fe:16:3e:26:0a:e3" > > [5] > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz#s1- > t25-k4-k1-k3-k1-k12-k4 > > [6] > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz > > [7] > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-gate-stateful-carbon/263/log_04_security_group > .html.gz > > On Tue, Jan 23, 2018 at 3:33 PM, Sam Hague > <[email protected] <mailto:[email protected]>> wrote: > > further details for Josh since the original email > doesn't have many... > > - so the "l3.Check Vm Instances Have Ip Address" test > fails with the net1 not being able to get all the vm ips > for it's three vms. 
> > - '[u'None', u'31.0.0.9', u'31.0.0.10']' contains 'None' > - this means the first vm of the three did not get a > ip > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz > #s1-t11-k8 > > - looks at the neutron ports to find which port goes > with vm1 > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz > #s1-t11-k9-k1-k4-k1-k2 > > get the missing ip as 31.0.0.6, then look at next log to > get the port > > - look at the 31.0.0.x addresses, we know 31,0.0 > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz > #s1-t11-k9-k1-k8-k2 > > 3862fa17-4e7d-4d41-9237-c372fca11c03 | | > fa:16:3e:96:06:3f | ip_address='31.0.0.6', > subnet_id='697e1b34-1adb-4299-b50f-6527b15260fd' | > ACTIVE | > > - I know the first vm (and second) are both on the > compute_1 so look at the ovs logs on compute_1 > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz > #s1-t11-k9-k2-k1-k2-k1-k11-k4 > > - compute_1, in the ofctl show br-int, we see port 7 > > 7(tap3862fa17-4e): addr:fe:16:3e:96:06:3f > > - then check flows to see if there is a table 220 flow > for port 7 > > > https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csi > t-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz > #s1-t11-k9-k2-k1-k2-k1-k12-k4 > > And the table 220 flow for port 7 is not there, so the > vm can't get an IP. > > [3] is the patch vishal pushed to fix a similar issue > the first time we saw this. What we found is that the > elan tag was being reused, because a port was deleted > and then a new one created and the elan tag reused. So > you ended up with a tunnel port stomping on a vm port. 
>
> [3] https://git.opendaylight.org/gerrit/#/c/67009/
>
> On Tue, Jan 23, 2018 at 3:07 PM, Sam Hague <[email protected] <mailto:[email protected]>> wrote:
>
> Adding Josh to thread.
>
> On Tue, Jan 23, 2018 at 2:25 PM, Faseela K <[email protected] <mailto:[email protected]>> wrote:
>
> Manu,
>
> Could you please take a look at the DHCP failure in the below run?
>
> I am caught up with something else, will help you out in initial triaging.
>
> Thanks,
>
> Faseela
>
> *From:*Sam Hague [mailto:[email protected] <mailto:[email protected]>]
> *Sent:* Monday, January 22, 2018 10:57 PM
> *To:* Vishal Thapar <[email protected] <mailto:[email protected]>>; Faseela K <[email protected] <mailto:[email protected]>>; Jamo Luhrsen <[email protected] <mailto:[email protected]>>
> *Subject:* is dhcp issue fixed on carbon?
>
> Vishal, Faseela,
>
> Can you look at this job run to see if the issue you fixed with the policymgr binding is fixed? In this build the whole policymgr bundle has been removed. This is carbon, so I just removed the whole bundle as we would never use it. Could that have uncovered something that the code was doing? If so, then even master and nitrogen should have the issue, since there we disabled building policymgr, so it should be the same as removing it.
>
> The other thing merged in carbon is the bind/unbind patches for elan and dhcp. Could those have an impact?
>
> Thanks, Sam
>
> I don't see "7(tap3862fa17-4e): addr:fe:16:3e:96:06:3f" pop up in the table 220 flows which was the problem before.
>
> 3862fa17-4e7d-4d41-9237-c372fca11c03 | | fa:16:3e:96:06:3f | ip_address='31.0.0.6', subnet_id='697e1b34-1adb-4299-b50f-6527b15260fd' | ACTIVE |
>
> Thanks, Sam
>
> [1]
> https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz#s1-t11-k9-k2-k1-k2-k1-k11-k4
>
> --
>
> Thanks
>
> Anil
>
> _______________________________________________
> openflowplugin-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
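
The manual triage described in this thread (list the ports on br-int, then look for a table 220 flow that outputs to each port) could be scripted roughly as below. This is only a sketch and not part of the CSIT suites: the function names and regular expressions are hypothetical, it assumes the text has the same shape as the port and flow lines quoted in the thread, and it assumes every numbered port on the bridge is expected to have a table 220 flow outputting to it, which may not be true for every port type. The sample input is the tunnel port line and the table 220 flow quoted earlier in the thread.

```python
import re

# Hypothetical patterns, written against the lines quoted in this thread.
PORT_RE = re.compile(r"^\s*(\d+)\((\S+?)\):\s+addr:\S+")
TABLE_RE = re.compile(r"\btable=(\d+)")
OUTPUT_RE = re.compile(r"\boutput:(\d+)")


def parse_ports(show_output):
    """Map OpenFlow port number -> interface name.

    Only numbered ports are kept; a line such as 'LOCAL(br-int): ...'
    is skipped because it has no numeric port.
    """
    ports = {}
    for line in show_output.splitlines():
        match = PORT_RE.match(line)
        if match:
            ports[int(match.group(1))] = match.group(2)
    return ports


def table_output_ports(flow_dump, table=220):
    """Collect every port number used in an output action of one table."""
    outputs = set()
    for line in flow_dump.splitlines():
        match = TABLE_RE.search(line)
        if not match or int(match.group(1)) != table:
            continue
        # Only look at the action list, not at the match fields.
        _, _, actions = line.partition("actions=")
        outputs.update(int(port) for port in OUTPUT_RE.findall(actions))
    return outputs


def check_table(show_output, flow_dump, table=220):
    """Return (stale, missing) port numbers for the given table.

    stale:   ports a flow outputs to that the bridge no longer has
    missing: bridge ports with no flow outputting to them
    """
    ports = parse_ports(show_output)
    outputs = table_output_ports(flow_dump, table)
    stale = sorted(outputs - set(ports))
    missing = sorted(set(ports) - outputs)
    return stale, missing


# Sample data copied from the thread: the tunnel port is now 9,
# but the table 220 flow still outputs to the old port 4.
SAMPLE_SHOW = " 9(tun55fb50d0a2b): addr:8a:2f:9f:c6:fe:d9"
SAMPLE_FLOWS = (
    "cookie=0x8000007, duration=403.965s, table=220, n_packets=0, "
    "n_bytes=0, priority=9,reg6=0x1000 actions=output:4"
)

stale_ports, missing_ports = check_table(SAMPLE_SHOW, SAMPLE_FLOWS)
print("stale:", stale_ports, "missing:", missing_ports)
```

With the sample above the script reports port 4 as stale and port 9 as missing, which matches what was found by hand. Running a check like this at the end of each suite would be one way to confirm that cleanup finished before the next suite starts.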
