Please enable the logs for the below package, org.opendaylight.openflowplugin.impl
Regards, Arun From: Sam Hague [mailto:[email protected]] Sent: Thursday, January 25, 2018 9:32 AM To: D Arunprakash <[email protected]> Cc: Anil Vishnoi <[email protected]>; openflowplugin-dev <[email protected]>; Vishal Thapar <[email protected]>; Faseela K <[email protected]>; Josh Hershberg <[email protected]>; Jamo Luhrsen <[email protected]>; Manu B <[email protected]> Subject: RE: is dhcp issue fixed on carbon? What logs to enable? Or just the whole openflowplugin, which is very noisy. Is packet capture on port 6653 good? On Jan 24, 2018 10:54 PM, "D Arunprakash" <[email protected]<mailto:[email protected]>> wrote: Hi Sam, We need to enable Openflowplugin logs and do packet capture for the time window when the tunnel port is delete and added back. Regards, Arun From: Anil Vishnoi [mailto:[email protected]<mailto:[email protected]>] Sent: Thursday, January 25, 2018 1:17 AM To: Sam Hague <[email protected]<mailto:[email protected]>> Cc: D Arunprakash <[email protected]<mailto:[email protected]>>; openflowplugin-dev <[email protected]<mailto:[email protected]>>; Vishal Thapar <[email protected]<mailto:[email protected]>>; Faseela K <[email protected]<mailto:[email protected]>>; Josh Hershberg <[email protected]<mailto:[email protected]>>; Jamo Luhrsen <[email protected]<mailto:[email protected]>>; Manu B <[email protected]<mailto:[email protected]>> Subject: Re: is dhcp issue fixed on carbon? Hi Sam, Looks like Arun is looking at it ? Arun, if you are not looking at it currently, please let me know I will take a look at it. Thanks Anil On Wed, Jan 24, 2018 at 4:25 AM, Sam Hague <[email protected]<mailto:[email protected]>> wrote: Adding openflow to thread. Anil, could someone take a look at this for carbon? We are seeing a connection flapping and end up missing port status updates. This leads to stale models and flows. This is blocking the carbon sr3. On Jan 24, 2018 12:58 AM, "D Arunprakash" <[email protected]<mailto:[email protected]>> wrote: Ignore my previous email. The tunnel port got deleted around 18:49:29.373 and added back on 18:52:46.26 2018-01-23T18:49:29.373Z|01979|vconn|DBG|tcp:10.30.170.63:6653<http://10.30.170.63:6653>: sent (Success): OFPT_PORT_STATUS (OF1.3) (xid=0x0): DEL: 4(tun55fb50d0a2b): addr:3e:0c:ed:2e:a9:ba 2018-01-23T18:52:46.261Z|03083|vconn|DBG|tcp:10.30.170.63:6653<http://10.30.170.63:6653>: sent (Success): OFPT_PORT_STATUS (OF1.3) (xid=0x0): ADD: 9(tun55fb50d0a2b): addr:8a:2f:9f:c6:fe:d9 Immediately after tunnel delete, I’m seeing so multiple switch flaps for quite sometime, 2018-01-23T18:49:35.155Z|02108|rconn|DBG|br-int<->unix: entering ACTIVE 2018-01-23T18:49:35.155Z|02109|vconn|DBG|unix: sent (Success): OFPT_HELLO (OF1.3) (xid=0x75): version bitmap: 0x04 2018-01-23T18:49:35.155Z|02110|vconn|DBG|unix: received: OFPT_HELLO (OF1.3) (xid=0x1): 2018-01-23T18:49:35.307Z|02144|rconn|DBG|br-int<->unix: connection closed by peer 2018-01-23T18:49:35.307Z|02145|rconn|DBG|br-int<->unix: entering DISCONNECTED 2018-01-23T18:49:35.324Z|02146|rconn|DBG|br-int<->unix: entering ACTIVE Also, I’m seeing error in karaf log 2018-01-23 18:49:29,378 | WARN | entLoopGroup-7-3 | DeviceContextImpl | 280 - org.opendaylight.openflowplugin.impl - 0.4.3.SNAPSHOT | writePortStatusMessage 2018-01-23 18:49:29,379 | WARN | entLoopGroup-7-3 | DeviceContextImpl | 280 - org.opendaylight.openflowplugin.impl - 0.4.3.SNAPSHOT | submit transaction for write port status message 2018-01-23 18:49:29,379 | WARN | rd-dispatcher-23 | ShardDataTree | 184 - org.opendaylight.controller.sa<http://org.opendaylight.controller.sa>l-distributed-datastore - 1.5.3.SNAPSHOT | member-1-shard-inventory-operational: Store Tx member-1-datastore-operational-fe-0-chn-8-txn-11-0: Data validation failed for path /(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[{(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:246869078989547}]/AugmentationIdentifier{childNames=[(urn:opendaylight:flow:inventory?revision=2013-08-19)port-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-group, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-match-types, (urn:opendaylight:flow:inventory?revision=2013-08-19)table, (urn:opendaylight:flow:inventory?revision=2013-08-19)group, (urn:opendaylight:flow:inventory?revision=2013-08-19)manufacturer, (urn:opendaylight:flow:inventory?revision=2013-08-19)software, (urn:opendaylight:flow:inventory?revision=2013-08-19)ip-address, (urn:opendaylight:flow:inventory?revision=2013-08-19)serial-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)table-features, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-actions, (urn:opendaylight:flow:inventory?revision=2013-08-19)hardware, (urn:opendaylight:flow:inventory?revision=2013-08-19)description, (urn:opendaylight:flow:inventory?revision=2013-08-19)switch-features, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-instructions, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-meter, (urn:opendaylight:flow:inventory?revision=2013-08-19)meter]}/(urn:opendaylight:flow:inventory?revision=2013-08-19)table/table[{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=50}]/flow. org.opendaylight.yangtools.yang.data.api.schema.tree.ModifiedNodeDoesNotExistException: Node /(urn:opendaylight:inventory?revision=2013-08-19)nodes/node/node[{(urn:opendaylight:inventory?revision=2013-08-19)id=openflow:246869078989547}]/AugmentationIdentifier{childNames=[(urn:opendaylight:flow:inventory?revision=2013-08-19)port-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-group, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-match-types, (urn:opendaylight:flow:inventory?revision=2013-08-19)table, (urn:opendaylight:flow:inventory?revision=2013-08-19)group, (urn:opendaylight:flow:inventory?revision=2013-08-19)manufacturer, (urn:opendaylight:flow:inventory?revision=2013-08-19)software, (urn:opendaylight:flow:inventory?revision=2013-08-19)ip-address, (urn:opendaylight:flow:inventory?revision=2013-08-19)serial-number, (urn:opendaylight:flow:inventory?revision=2013-08-19)table-features, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-actions, (urn:opendaylight:flow:inventory?revision=2013-08-19)hardware, (urn:opendaylight:flow:inventory?revision=2013-08-19)description, (urn:opendaylight:flow:inventory?revision=2013-08-19)switch-features, (urn:opendaylight:flow:inventory?revision=2013-08-19)supported-instructions, (urn:opendaylight:flow:inventory?revision=2013-08-19)stale-meter, (urn:opendaylight:flow:inventory?revision=2013-08-19)meter]}/(urn:opendaylight:flow:inventory?revision=2013-08-19)table/table[{(urn:opendaylight:flow:inventory?revision=2013-08-19)id=50}]/flow does not exist. Cannot apply modification to its children. We need to check why there is multiple switch disconnect and reconnect and how ofp handles the same. Regards, Arun From: Vishal Thapar Sent: Wednesday, January 24, 2018 9:52 AM To: Faseela K <[email protected]<mailto:[email protected]>>; Sam Hague <[email protected]<mailto:[email protected]>>; Josh Hershberg <[email protected]<mailto:[email protected]>>; D Arunprakash <[email protected]<mailto:[email protected]>> Cc: Jamo Luhrsen <[email protected]<mailto:[email protected]>>; Manu B <[email protected]<mailto:[email protected]>> Subject: RE: is dhcp issue fixed on carbon? Missed adding most important detail and added Arun. Inventory operational is still showing old port and new port for some reason. I guess that is what caused problems. https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz#s1-t25-k4-k2-k1-k2-k56 {"id":"openflow:246869078989547:4","flow-node-inventory:supported":"","flow-node-inventory:peer-features":"","flow-node-inventory:port-number":4,"flow-node-inventory:hardware-address":"3e:0c:ed:2e:a9:ba","flow-node-inventory:current-feature":"","flow-node-inventory:maximum-speed":0,"flow-node-inventory:reason":"add","flow-node-inventory:configuration":"","flow-node-inventory:advertised-features":"","flow-node-inventory:current-speed":0,"flow-node-inventory:name":"tun55fb50d0a2b","flow-node-inventory:state":{"link-down":false,"blocked":false,"live":false}} {"id":"openflow:246869078989547:9","flow-node-inventory:supported":"","flow-node-inventory:peer-features":"","flow-node-inventory:port-number":9,"flow-node-inventory:hardware-address":"8a:2f:9f:c6:fe:d9","flow-node-inventory:current-feature":"","flow-node-inventory:maximum-speed":0,"flow-node-inventory:reason":"add","flow-node-inventory:configuration":"","flow-node-inventory:advertised-features":"","flow-node-inventory:current-speed":0,"flow-node-inventory:name":"tun55fb50d0a2b","flow-node-inventory:state":{"link-down":false,"blocked":false,"live":false}} OVS output from same set of logs: https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz#s1-t25-k4-k1-k3-k1-k11-k4 9(tun55fb50d0a2b): addr:8a:2f:9f:c6:fe:d9 config: 0 state: 0 speed: 0 Mbps now, 0 Mbps max So for now I’d peg it as OFPlugin issue. It didn’t detect or inform us of old port delete and that is why we didn’t delete old flows. Though wondering if something else in IFM code could’ve handled it, but don’t think we handle OfPort number changes, expect a delete+add in such scenarios. Faseela can pitch in why we have service binding entry with new port number but flow is still using old one. Regards, Vishal. From: Vishal Thapar Sent: 24 January 2018 09:26 To: Faseela K <[email protected]<mailto:[email protected]>>; Sam Hague <[email protected]<mailto:[email protected]>>; Josh Hershberg <[email protected]<mailto:[email protected]>> Cc: Jamo Luhrsen <[email protected]<mailto:[email protected]>>; Manu B <[email protected]<mailto:[email protected]>> Subject: RE: is dhcp issue fixed on carbon? Quick analysis: Not related to policy stuff. Service binding has entry for the new port number but table 220 flow is still using old port number. { "bound-services": [ { "flow-cookie": 134217735, "flow-priority": 9, "instruction": [ { "apply-actions": { "action": [ { "order": 0, "output-action": { "max-length": 0, "output-node-connector": "9" } } ] }, "order": 0 } ], "service-name": "default.tun55fb50d0a2b", "service-priority": 9, "service-type": "interface-service-bindings:service-type-flow-based" } ], "interface-name": "tun55fb50d0a2b", "service-mode": "interface-service-bindings:service-mode-egress" } {"id":"246869078989547.220.tun55fb50d0a2b.0","priority":9,"table_id":220,"installHw":true,"hard-timeout":0,"match":{"openflowplugin-extension-general:extension-list":[{"extension-key":"openflowplugin-extension-nicira-match:nxm-nx-reg6-key","extension":{"openflowplugin-extension-nicira-match:nxm-nx-reg":{"value":4096,"reg":"nicira-match:nxm-nx-reg6"}}}]},"cookie":134217735,"flow-name":"default.tun55fb50d0a2b","strict":true,"instructions":{"instruction":[{"order":0,"apply-actions":{"action":[{"order":0,"output-action":{"max-length":0,"output-node-connector":"4"}}]}}]},"barrier":false,"idle-timeout":0} cookie=0x8000007, duration=403.965s, table=220, n_packets=0, n_bytes=0, priority=9,reg6=0x1000 actions=output:4 In OVS logs you can see this tunnel port getting deleted and then coming back in with a different OfPort. https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-carbon/263/compute_2/ovs-vswitchd.log.gz It goes from 4 to 9. This happens due to clean up in previous suite which doesn’t actually clean up everything and leaves entry for that old service binding. Can confirm it from interfaces-state entry for same port in first and second suites. So we have stale flows and stale service bindings for old tunnel port. Could probably check with OFPlugin how they handle update of a flow, may probably not work. We need to check if cleanup has been done completely before moving to next suite. This is where the work we been doing on tools comes in. Regards, Vishal. From: Faseela K Sent: 24 January 2018 08:10 To: Sam Hague <[email protected]<mailto:[email protected]>>; Josh Hershberg <[email protected]<mailto:[email protected]>> Cc: Vishal Thapar <[email protected]<mailto:[email protected]>>; Jamo Luhrsen <[email protected]<mailto:[email protected]>>; Manu B <[email protected]<mailto:[email protected]>> Subject: RE: is dhcp issue fixed on carbon? Looks more or less similar issue, tunnel flow is programmed in table 220 with older tunnel’s port number, which was deleted in l2 suite. However policy code has not kicked in. I will take a detailed look on what is causing this issue now. Thanks, Faseela From: Faseela K Sent: Wednesday, January 24, 2018 7:48 AM To: 'Sam Hague' <[email protected]<mailto:[email protected]>>; Josh Hershberg <[email protected]<mailto:[email protected]>> Cc: Vishal Thapar <[email protected]<mailto:[email protected]>>; Jamo Luhrsen <[email protected]<mailto:[email protected]>>; Manu B <[email protected]<mailto:[email protected]>> Subject: RE: is dhcp issue fixed on carbon? Thanks Sam for initial triaging. I will take a look at this. From: Sam Hague [mailto:[email protected]] Sent: Wednesday, January 24, 2018 6:54 AM To: Faseela K <[email protected]<mailto:[email protected]>>; Josh Hershberg <[email protected]<mailto:[email protected]>> Cc: Vishal Thapar <[email protected]<mailto:[email protected]>>; Jamo Luhrsen <[email protected]<mailto:[email protected]>>; Manu B <[email protected]<mailto:[email protected]>> Subject: Re: is dhcp issue fixed on carbon? OK, seems pretty consistent that table 220 flows are not showing up. Vishal, Faseela, can you see if it is like the policymgr one where the bind/unbind was wrong? That seems the closest culprit as those were the last patches merged. Here is another case where the table 220 flow is missing in suite [5] of job [6]. This time the port missing is a tunnel port. "9(tun55fb50d0a2b): addr:8a:2f:9f:c6:fe:d9" is missing from table 220. And then in suite [7] of the same job this port has the same issue where the tunnel port is missing: "16(tap28760838-a7): addr:fe:16:3e:26:0a:e3" [5] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz#s1-t25-k4-k1-k3-k1-k12-k4 [6] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-carbon/263/log_02_l3.html.gz [7] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-gate-stateful-carbon/263/log_04_security_group.html.gz On Tue, Jan 23, 2018 at 3:33 PM, Sam Hague <[email protected]<mailto:[email protected]>> wrote: further details for Josh since the original email doesn't have many... - so the "l3.Check Vm Instances Have Ip Address" test fails with the net1 not being able to get all the vm ips for it's three vms. - '[u'None', u'31.0.0.9', u'31.0.0.10']' contains 'None' - this means the first vm of the three did not get a ip https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz#s1-t11-k8 - looks at the neutron ports to find which port goes with vm1 https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz#s1-t11-k9-k1-k4-k1-k2 get the missing ip as 31.0.0.6, then look at next log to get the port - look at the 31.0.0.x addresses, we know 31,0.0 https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz#s1-t11-k9-k1-k8-k2 3862fa17-4e7d-4d41-9237-c372fca11c03 | | fa:16:3e:96:06:3f | ip_address='31.0.0.6', subnet_id='697e1b34-1adb-4299-b50f-6527b15260fd' | ACTIVE | - I know the first vm (and second) are both on the compute_1 so look at the ovs logs on compute_1 https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz#s1-t11-k9-k2-k1-k2-k1-k11-k4 - compute_1, in the ofctl show br-int, we see port 7 7(tap3862fa17-4e): addr:fe:16:3e:96:06:3f - then check flows to see if there is a table 220 flow for port 7 https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz#s1-t11-k9-k2-k1-k2-k1-k12-k4 And the table 220 flow for port 7 is not there, so the vm can't get an IP. [3] is the patch vishal pushed to fix a similar issue the first time we saw this. What we found is that the elan tag was being reused, because a port was deleted and then a new one created and the elan tag reused. So you ended up with a tunnel port stomping on a vm port. [3] https://git.opendaylight.org/gerrit/#/c/67009/ On Tue, Jan 23, 2018 at 3:07 PM, Sam Hague <[email protected]<mailto:[email protected]>> wrote: Adding Josh to thread. On Tue, Jan 23, 2018 at 2:25 PM, Faseela K <[email protected]<mailto:[email protected]>> wrote: Manu, Could you please take a look at the DHCP failure in the below run? I am caught up with something else, will help you out in initial triaging. Thanks, Faseela From: Sam Hague [mailto:[email protected]<mailto:[email protected]>] Sent: Monday, January 22, 2018 10:57 PM To: Vishal Thapar <[email protected]<mailto:[email protected]>>; Faseela K <[email protected]<mailto:[email protected]>>; Jamo Luhrsen <[email protected]<mailto:[email protected]>> Subject: is dhcp issue fixed on carbon? Vishal, Faseela, can you look at this job run to see if the issue you fixed with the policymgr binding is fixed? in this build the whole poligymgr bundle has been removed. This is carbon so I just removed the whole bundle as we would never use it. Could that have uncovered something that the code was doing? If so, then even master and nitrogen should have the issue since there we disabled building policymgr - so should be the same as removing it. Other thing, merged in carbon is the bind/unbind patches for elan and dhcp. Could those have an impact? Thanks, Sam I don't see "7(tap3862fa17-4e): addr:fe:16:3e:96:06:3f" pop up in the table 220 flows which was the problem before. 3862fa17-4e7d-4d41-9237-c372fca11c03 | | fa:16:3e:96:06:3f | ip_address='31.0.0.6', subnet_id='697e1b34-1adb-4299-b50f-6527b15260fd' | ACTIVE | Thanks, Sam [1] https://logs.opendaylight.org/releng/vex-yul-odl-jenkins-1/netvirt-csit-1node-openstack-ocata-upstream-stateful-carbon/298/log_02_l3.html.gz#s1-t11-k9-k2-k1-k2-k1-k11-k4 -- Thanks Anil
_______________________________________________ openflowplugin-dev mailing list [email protected] https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
