Re: [ovs-discuss] Failed assertion in ovs-vswitchd when running OVN

2017-12-08 Thread Ben Pfaff
Thanks for testing!  I applied this to master.

I don't think it needs backporting, because branch-2.8 doesn't have this
assertion.

On Thu, Dec 07, 2017 at 06:59:40PM -0800, Kevin Lin wrote:
> The patch worked for me! (network works, and no more warnings in ovs-vswitchd)
> 
> > On Dec 7, 2017, at 3:54 PM, Ben Pfaff  wrote:
> > 
> > On Thu, Dec 07, 2017 at 02:44:36PM -0800, Kevin Lin wrote:
> >> Hi Ben,
> >> 
> >> I’ve included the traces for an ARP request, and a ping. ovs-vswitchd also 
> >> logs errors for the return traffic, but I didn’t include that as it seems 
> >> redundant.
> >> 
> >> root@ip-172-31-2-45:/# ovs-appctl ofproto/trace kelda-int 
> >> 'arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5
> >> :34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e
> >> ,arp_tha=ff:ff:ff:ff:ff:ff'
> >> Flow: 
> >> arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff
> >> 
> >> bridge("kelda-int")
> >> ---
> >> 0. in_port=1,dl_src=02:00:0a:c5:34:1e, priority 32768
> >>load:0x2->NXM_NX_REG0[]
> >>resubmit(,1)
> >> 1. arp,dl_dst=ff:ff:ff:ff:ff:ff, priority 1000
> >>LOCAL
> >>output:NXM_NX_REG0[]
> >> -> output port is 2
> >> Cannot truncate output to patch port 
> > 
> > Thanks for the details.
> > 
> > Would you mind testing this patch?
> >https://patchwork.ozlabs.org/patch/845908/ 
> > 
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Failed assertion in ovs-vswitchd when running OVN

2017-12-07 Thread Kevin Lin
The patch worked for me! (network works, and no more warnings in ovs-vswitchd)

> On Dec 7, 2017, at 3:54 PM, Ben Pfaff  wrote:
> 
> On Thu, Dec 07, 2017 at 02:44:36PM -0800, Kevin Lin wrote:
>> Hi Ben,
>> 
>> I’ve included the traces for an ARP request, and a ping. ovs-vswitchd also 
>> logs errors for the return traffic, but I didn’t include that as it seems 
>> redundant.
>> 
>> root@ip-172-31-2-45:/# ovs-appctl ofproto/trace kelda-int 
>> 'arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5
>> :34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e
>> ,arp_tha=ff:ff:ff:ff:ff:ff'
>> Flow: 
>> arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff
>> 
>> bridge("kelda-int")
>> ---
>> 0. in_port=1,dl_src=02:00:0a:c5:34:1e, priority 32768
>>load:0x2->NXM_NX_REG0[]
>>resubmit(,1)
>> 1. arp,dl_dst=ff:ff:ff:ff:ff:ff, priority 1000
>>LOCAL
>>output:NXM_NX_REG0[]
>> -> output port is 2
>> Cannot truncate output to patch port 
> 
> Thanks for the details.
> 
> Would you mind testing this patch?
>https://patchwork.ozlabs.org/patch/845908/ 
> 
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Failed assertion in ovs-vswitchd when running OVN

2017-12-07 Thread Ben Pfaff
On Thu, Dec 07, 2017 at 02:44:36PM -0800, Kevin Lin wrote:
> Hi Ben,
> 
> I’ve included the traces for an ARP request, and a ping. ovs-vswitchd also 
> logs errors for the return traffic, but I didn’t include that as it seems 
> redundant.
> 
> root@ip-172-31-2-45:/# ovs-appctl ofproto/trace kelda-int 
> 'arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5
> :34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e
> ,arp_tha=ff:ff:ff:ff:ff:ff'
> Flow: 
> arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff
> 
> bridge("kelda-int")
> ---
>  0. in_port=1,dl_src=02:00:0a:c5:34:1e, priority 32768
> load:0x2->NXM_NX_REG0[]
> resubmit(,1)
>  1. arp,dl_dst=ff:ff:ff:ff:ff:ff, priority 1000
> LOCAL
> output:NXM_NX_REG0[]
>  -> output port is 2
>   Cannot truncate output to patch port 

Thanks for the details.

Would you mind testing this patch?
https://patchwork.ozlabs.org/patch/845908/
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Failed assertion in ovs-vswitchd when running OVN

2017-12-07 Thread Kevin Lin
Hi Ben,

I’ve included the traces for an ARP request, and a ping. ovs-vswitchd also logs 
errors for the return traffic, but I didn’t include that as it seems redundant.

root@ip-172-31-2-45:/# ovs-appctl ofproto/trace kelda-int 
'arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5
:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e
,arp_tha=ff:ff:ff:ff:ff:ff'
Flow: 
arp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff

bridge("kelda-int")
---
 0. in_port=1,dl_src=02:00:0a:c5:34:1e, priority 32768
load:0x2->NXM_NX_REG0[]
resubmit(,1)
 1. arp,dl_dst=ff:ff:ff:ff:ff:ff, priority 1000
LOCAL
output:NXM_NX_REG0[]
 -> output port is 2
  Cannot truncate output to patch port 

bridge("br-int")

 0. in_port=1, priority 100
set_field:0x1->reg13
set_field:0x3->reg11
set_field:0x2->reg12
set_field:0x1->metadata
set_field:0x1->reg14
resubmit(,8)
 8. reg14=0x1,metadata=0x1, priority 50, cookie 0x1d48851d
resubmit(,9)
 9. metadata=0x1, priority 0, cookie 0xb38c4d37
resubmit(,10)
10. metadata=0x1, priority 0, cookie 0xa1f73f3a
resubmit(,11)
11. metadata=0x1, priority 0, cookie 0x8515cdd5
resubmit(,12)
12. metadata=0x1, priority 0, cookie 0x1163ca42
resubmit(,13)
13. metadata=0x1, priority 0, cookie 0xbbb6e5e5
resubmit(,14)
14. metadata=0x1, priority 0, cookie 0x98e3172d
resubmit(,15)
15. metadata=0x1, priority 0, cookie 0x10496832
resubmit(,16)
16. metadata=0x1, priority 0, cookie 0xe88c85a
resubmit(,17)
17. metadata=0x1, priority 0, cookie 0x6516996b
resubmit(,18)
18. arp,metadata=0x1,arp_tpa=10.203.4.66,arp_op=1, priority 50, cookie 
0xb33c89ed
move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[]
 -> NXM_OF_ETH_DST[] is now 02:00:0a:c5:34:1e
set_field:02:00:0a:cb:04:42->eth_src
set_field:2->arp_op
move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[]
 -> NXM_NX_ARP_THA[] is now 02:00:0a:c5:34:1e
set_field:02:00:0a:cb:04:42->arp_sha
move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[]
 -> NXM_OF_ARP_TPA[] is now 10.197.52.30
set_field:10.203.4.66->arp_spa
move:NXM_NX_REG14[]->NXM_NX_REG15[]
 -> NXM_NX_REG15[] is now 0x1
load:0x1->NXM_NX_REG10[0]
resubmit(,32)
32. priority 0
resubmit(,33)
33. reg15=0x1,metadata=0x1, priority 100
set_field:0x1->reg13
set_field:0x3->reg11
set_field:0x2->reg12
resubmit(,34)
34. priority 0
set_field:0->reg0
set_field:0->reg1
set_field:0->reg2
set_field:0->reg3
set_field:0->reg4
set_field:0->reg5
set_field:0->reg6
set_field:0->reg7
set_field:0->reg8
set_field:0->reg9
resubmit(,40)
40. metadata=0x1, priority 0, cookie 0x69ea0511
resubmit(,41)
41. metadata=0x1, priority 0, cookie 0xe98a052d
resubmit(,42)
42. metadata=0x1, priority 0, cookie 0xb9ce92f6
resubmit(,43)
43. metadata=0x1, priority 0, cookie 0x95c9e574
resubmit(,44)
44. metadata=0x1, priority 0, cookie 0xd46c593f
resubmit(,45)
45. metadata=0x1, priority 0, cookie 0x2b14b7f8
resubmit(,46)
46. metadata=0x1, priority 0, cookie 0xacf215be
resubmit(,47)
47. metadata=0x1, priority 0, cookie 0x390a9bed
resubmit(,48)
48. reg15=0x1,metadata=0x1, priority 50, cookie 0x9c81921d
resubmit(,64)
64. reg10=0x1/0x1,reg15=0x1,metadata=0x1, priority 100
push:NXM_OF_IN_PORT[]
set_field:0->in_port
resubmit(,65)
65. reg15=0x1,metadata=0x1, priority 100
output:1

bridge("kelda-int")
---
 0. in_port=2, priority 32768
output:1
pop:NXM_OF_IN_PORT[]
 -> NXM_OF_IN_PORT[] is now 1

Final flow: 
arp,reg0=0x2,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff
Megaflow: 
recirc_id=0,ct_state=-new-est-rel-rpl-inv-trk,ct_label=0/0x1,eth,arp,in_port=1,vlan_tci=0x/0x1000,dl_src=02:00:0a:c5:34:1e,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.197.52.30,arp_tpa=10.203.4.66,arp_op=1,arp_sha=02:00:0a:c5:34:1e,arp_tha=ff:ff:ff:ff:ff:ff
Datapath actions: 
1,set(eth(src=02:00:0a:cb:04:42,dst=02:00:0a:c5:34:1e)),set(arp(sip=10.203.4.66,tip=10.197.52.30,op=2/0xff,sha=02:00:0a:cb:04:42,tha=02:00:0a:c5:34:1e)),3
This flow is handled by the userspace slow path because it:
- Uses action(s) not supported by datapath.









root@ip-172-31-2-45:/# ovs-appctl ofproto/trace kelda-int 
'icmp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0'
Flow: 
icmp,in_port=1,vlan_tci=0x,dl_src=02:00:0a:c5:34:1e,dl_dst=02:00:0a:cb:04:42,nw_src=10.197.52.30,nw_dst=10.203.4.66,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0

bridge("kelda-int")

Re: [ovs-discuss] Failed assertion in ovs-vswitchd when running OVN

2017-12-07 Thread Ben Pfaff
On Thu, Dec 07, 2017 at 09:26:14AM -0800, Kevin Lin wrote:
> Hi,
> 
> I work on Kelda (kelda.io ) with Ethan Jackson. We run a 
> containerized, distributed version of OVN. The master branch of 
> openvswitch/ovs (commit 07754b23ee5027508d64804d445e617b017cc2d1) fails with 
> the following assertion in ovs-vswitchd:
> 
> ovs-vswitchd(handler2): ofproto/ofproto-dpif-xlate.c:3704: assertion 
> !truncate failed in compose_output_action__()
> 
> whenever we try to use the OVN network. A little background on our setup:
> We’re a container orchestrator that uses OVN for the container network.
> One machine in our cluster runs ovn-northd and ovsdb-server. The network is 
> mostly configured from here (creating the logical ports, creating ACLs etc).
> Another machine runs ovn-controller, ovs-vswitchd, and ovsdb-server. We 
> install some container-specific OpenFlow rules by connecting directly to 
> ovs-vswitchd, and of course ovs-vswitchd also receives rules from OVN.
> ovs-vswitchd does not crash immediately after the rules are installed. But it 
> crashes as soon as the network is used (e.g. a ping from one container to 
> another).
> 
> The commit before the commit that introduced the assertion works for us 
> (https://github.com/openvswitch/ovs/commit/48f704f4768d13f85252bac4f93c8d45d8ab3eea
>  
> ).
> 
> I’ve attached the ovs-vswitchd logs. I’m not sure how helpful the output of 
> ovs-bugtool will be given our containerized setup, but I’ve also attached the 
> output of running that from within the ovs-vswitchd container from before and 
> after the crash. Note, because the ovs-vswitchd container crashed, the 
> “after” tarball was generated after restarting the container, so I’m not sure 
> if any of the commands it ran actually succeeded.
> 
> The crash is trivial for me to reproduce, so please let me know if there’s 
> anymore information I can give you.

Thank you for the report.

I don't see a good reason that this should be a condition that kills
ovs-vswitchd.  I think that it will be both easier to debug and less
inherently harmful if we change it to an error message.  I sent out a
patch that does that:
https://patchwork.ozlabs.org/patch/845845/

It would be great if you could apply the patch and then try to track
down the activity that triggers the error.  "ofproto/trace" is the best
way to do that, if you can find the right packet or flow, because it
will give us all the details on how the problem gets triggered.

Thanks,

Ben.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss