Re: [ovs-discuss] Open vSwitch fails to allocate memory pool for DPDK port

2019-04-03 Thread Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) via discuss
Sorry I just accidentally sent out the last mail too early...

Hi Ian,

To answer your questions, I just reproduced the issue on my system:

What commands have you used to configure the hugepage memory on your system?
   - I have added the following kernel parameters: hugepagesz=2M hugepages=512 
and then rebooted the system. In other scenarios I allocated the HugePages by 
writing into /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages. The 
HugePages are also mounted on: hugetlbfs on /dev/hugepages type hugetlbfs 
(rw,relatime,seclabel)

Before you start OVS with DPDK, if you execute cat /proc/meminfo how many 
hugepages do you see available and how many free? (for 2mb I would assume 512 
in both cases).
  - In this specific case, I only saw 512 HugePages_Total and 0 free because 
after the restart OVS was already using the 512 pages.
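As a cross-check of the numbers above, here is a minimal Python sketch that pulls the hugepage counters out of /proc/meminfo-style text and converts them to bytes. The sample text is illustrative: only the 512 total / 0 free figures come from this thread, the 2048 kB page size is assumed from the hugepagesz=2M setting, and parse_hugepages is a hypothetical helper name.

```python
def parse_hugepages(meminfo_text):
    """Return (total_pages, free_pages, page_size_kb) from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep:
            # Keep the whitespace-separated values, e.g. ["2048", "kB"].
            fields[key.strip()] = rest.split()
    total = int(fields["HugePages_Total"][0])
    free = int(fields["HugePages_Free"][0])
    size_kb = int(fields["Hugepagesize"][0])
    return total, free, size_kb


# Illustrative sample, not copied from a real system.
SAMPLE = """\
HugePages_Total:     512
HugePages_Free:        0
Hugepagesize:       2048 kB
"""

total, free, size_kb = parse_hugepages(SAMPLE)
total_bytes = total * size_kb * 1024
in_use_bytes = (total - free) * size_kb * 1024
print(total_bytes // 2**20, "MiB total,", in_use_bytes // 2**20, "MiB in use")
```

On a live system the same helper could be fed the contents of /proc/meminfo instead of the sample string.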
 
What memory commands are you passing to OVS with DPDK (e.g. dpdk-socket-mem 
parameter etc.)?
   - Nothing, meaning the default memory of 1GB.

Is it just 1 bridge and a single DPDK interface you are adding or are there 
more than 1 DPDK interface attached?
   - There are 4 bridges in total but only one is using netdev datapath and 
DPDK ports. Here is an extract of that one DPDK bridge:
   Bridge lan-br
       Port lan-br
           Interface lan-br
               type: internal
       Port "dpdk-p0"
           Interface "dpdk-p0"
               type: dpdk
               options: {dpdk-devargs=":08:0b.2"}
               error: "could not add network device dpdk-p0 to ofproto (No such device)"

Can you provide the entire log? I'd be interested in seeing the memory  info at 
initialization of OVS DPDK.
 - I attached it to that mail.

What type of DPDK device are you adding? It seems to be a Virtual function from 
the log above, can you provide more detail as regards the underlying NIC type 
the VF is associated with?
-  It's a VF. NIC type is Intel 710 and driver is i40

DPDK Version is 17.11.0

Thanks
Tobias

On 4/3/19, 9:07 AM, "Tobias Hofmann -T (tohofman - AAP3 INC at Cisco)" 
 wrote:

Hi Ian,

To answer your questions, I just reproduced the issue on my system:

What commands have you used to configure the hugepage memory on your system?
- I have added the following kernel parameters: hugepagesz=2M hugepages=512 
and then rebooted the system. In other scenarios I allocated the HugePages by 
writing into /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages.
  The HugePages are also mounted on: hugetlbfs on /dev/hugepages type 
hugetlbfs (rw,relatime,seclabel)

Before you start OVS with DPDK, if you execute cat /proc/meminfo how many 
hugepages do you see available and how many free? (for 2mb I would assume 512 
in both cases).
- In this specific case, I only saw 512 HugePages_Total and 0 free because 
after the restart OVS was already using the 512 pages.

What memory commands are you passing to OVS with DPDK (e.g. dpdk-socket-mem 
parameter etc.)?

On 4/3/19, 6:00 AM, "Ian Stokes"  wrote:

On 4/3/2019 1:04 AM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) 
via discuss wrote:
> Hello,
> 
> I’m trying to attach a DPDK port with an mtu_size of 9216 to a bridge. 
> For this purpose, I have allocated 512 HugePages of size 2MB for OVS 
> (1GB in total).

Hi,

I couldn't reproduce the behavior above on my own system with 512 x 2MB 
hugepages. Ports were successfully configured with MTU 9216. Perhaps 
some more detail as regards your setup will help reproduce/root cause.

Questions inline below.

What commands have you used to configure the hugepage memory on your 
system?

Before you start OVS with DPDK, if you execute cat /proc/meminfo how 
many hugepages do you see available and how many free? (for 2mb I would 
assume 512 in both cases).

What memory commands are you passing to OVS with DPDK (e.g. 
dpdk-socket-mem parameter etc.)?

Is it just 1 bridge and a single DPDK interface you are adding or are 
there more than 1 DPDK interface attached?

> 
> Doing so will constantly fail, two workarounds to get it working were 
> either to decrease the MTU size to 1500 or to increase the total amount 
> of HugePage memory to 3GB.
> 
> Actually, I did expect the setup to also work with just 1GB because if 
> the amount of memory is not sufficient, OVS will try to halve the number 
> of buffers until 16K.
> 
> However, inside the logs I couldn’t find any details regarding this. The 
> only error message I observed was:
> 
> netdev_dpdk|ERR|Failed to create memory pool for netdev dpdk-p0, with 
> MTU 9216 on socket 0: Invalid argument

  

Re: [ovs-discuss] Open vSwitch fails to allocate memory pool for DPDK port

2019-04-03 Thread Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) via discuss
Hi Ian,

To answer your questions, I just reproduced the issue on my system:

What commands have you used to configure the hugepage memory on your system?
- I have added the following kernel parameters: hugepagesz=2M hugepages=512 and 
then rebooted the system. In other scenarios I allocated the HugePages by 
writing into /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages.
  The HugePages are also mounted on: hugetlbfs on /dev/hugepages type hugetlbfs 
(rw,relatime,seclabel)

Before you start OVS with DPDK, if you execute cat /proc/meminfo how many 
hugepages do you see available and how many free? (for 2mb I would assume 512 
in both cases).
- In this specific case, I only saw 512 HugePages_Total and 0 free because 
after the restart OVS was already using the 512 pages.

What memory commands are you passing to OVS with DPDK (e.g. dpdk-socket-mem 
parameter etc.)?

On 4/3/19, 6:00 AM, "Ian Stokes"  wrote:

On 4/3/2019 1:04 AM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) 
via discuss wrote:
> Hello,
> 
> I’m trying to attach a DPDK port with an mtu_size of 9216 to a bridge. 
> For this purpose, I have allocated 512 HugePages of size 2MB for OVS 
> (1GB in total).

Hi,

I couldn't reproduce the behavior above on my own system with 512 x 2MB 
hugepages. Ports were successfully configured with MTU 9216. Perhaps 
some more detail as regards your setup will help reproduce/root cause.

Questions inline below.

What commands have you used to configure the hugepage memory on your system?

Before you start OVS with DPDK, if you execute cat /proc/meminfo how 
many hugepages do you see available and how many free? (for 2mb I would 
assume 512 in both cases).

What memory commands are you passing to OVS with DPDK (e.g. 
dpdk-socket-mem parameter etc.)?

Is it just 1 bridge and a single DPDK interface you are adding or are 
there more than 1 DPDK interface attached?

> 
> Doing so will constantly fail, two workarounds to get it working were 
> either to decrease the MTU size to 1500 or to increase the total amount 
> of HugePage memory to 3GB.
> 
> Actually, I did expect the setup to also work with just 1GB because if 
> the amount of memory is not sufficient, OVS will try to halve the number 
> of buffers until 16K.
> 
> However, inside the logs I couldn’t find any details regarding this. The 
> only error message I observed was:
> 
> netdev_dpdk|ERR|Failed to create memory pool for netdev dpdk-p0, with 
> MTU 9216 on socket 0: Invalid argument

Can you provide the entire log? I'd be interested in seeing the memory 
info at initialization of OVS DPDK.

> 
> That log message is weird as I would have expected an error message 
> saying something like ‘could not reserve memory’ but not ‘Invalid argument’.
> 
> I then found this very similar bug on Openstack: 
> https://bugs.launchpad.net/starlingx/+bug/1796380
> 
> After having read this, I tried the exact same setup as described above 
> but this time with HugePages of size 1GB instead of 2MB. In this 
> scenario, it also worked with just 1GB of memory reserved for OVS.
> 
> Inside the logs I could observe this time:
> 
> 2019-04-02T22:55:31.849Z|00098|dpdk|ERR|RING: Cannot reserve memory
> 
> 2019-04-02T22:55:32.019Z|00099|dpdk|ERR|RING: Cannot reserve memory
> 
> 2019-04-02T22:55:32.200Z|00100|netdev_dpdk|INFO|Virtual function 
> detected, HW_CRRC_STRIP will be enabled
> 

What type of DPDK device are you adding? It seems to be a Virtual 
function from the log above, can you provide more detail as regards the 
underlying NIC type the VF is associated with?

> 2019-04-02T22:55:32.281Z|00101|netdev_dpdk|INFO|Port 0: f6:e9:29:4d:f9:cf
> 
> 2019-04-02T22:55:32.281Z|00102|dpif_netdev|INFO|Core 1 on numa node 0 
> assigned port 'dpdk-p0' rx queue 0 (measured processing cycles 0).
> 
> The two times where OVS cannot reserve memory are, I guess, the two 
> times where it has to halve the number of buffers to get it working.

Yes this is correct. For example in my setup with 512 x 2MB pages I see 
"Cannot reserve memory" message 4 times before it completes configuration.

> 
> My question now is, is the fact that it does not work for 2MB HugePages 
> a bug? Also, is the error message in the first log extract the intended one?
> 

Yes, it seems like a bug if it can be reproduced. The invalid argument 
in this case would refer to the number of mbufs being requested by OVS 
DPDK being less than the minimum allowed (4096 * 64). i.e. the amount of 
memory available cannot support the minimum 262,144 mbufs OVS DPDK 
requires. The memory configuration at the system level is outside of OVS 
control however.
 

Re: [ovs-discuss] libvirt: error : cannot execute binary ovs-vsctl: Permission denied

2019-04-03 Thread Harsh Gondaliya
Can you please share how to disable apparmor for libvirt? It would be
helpful in running OVS with DPDK, SR-IOV and other use cases.

On Wed, 3 Apr 2019 7:50 pm  wrote:

> I do not think it is an ovs question so I answer directly: did you reboot
> ? apparmor profile cannot be changed dynamically.
>
> Anyway, I would just disable apparmor for libvirt... too many problems as
> soon as you experiment non standard use cases (sriov, etc.)
>
>
> Le 03/04/2019 à 16:16, Harsh Gondaliya a écrit :
>
> When I changed the prefix to --prefix=/usr everything worked well. Now
> when I want to change the AppArmor profile similar error pops up. I changed
> /usr/bin/* PUx to /usr/local/bin/* PUx in
> /etc/apparmor.d/usr.sbin.libvirtd. Unable to troubleshoot what is going
> wrong.
> These are my system logs:
>
> Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.045860] audit: type=1400
> audit(1554300298.503:71): apparmor="STATUS" operation="profile_load"
> profile="unconfined" name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c"
> pid=9093 comm="apparmor_parser"
> Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.046045] audit: type=1400
> audit(1554300298.503:72): apparmor="STATUS" operation="profile_load"
> profile="unconfined"
> name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c//qemu_bridge_helper"
> pid=9093 comm="apparmor_parser"
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.5148] manager: (vnet0): new Tun device
> (/org/freedesktop/NetworkManager/Devices/14)
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.5197] devices added (path: /sys/devices/virtual/net/vnet0,
> iface: vnet0)
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.5197] device added (path: /sys/devices/virtual/net/vnet0,
> iface: vnet0): no ifupdown configuration found.
> Apr  3 19:34:58 dpdk-OptiPlex-5040 libvirtd[8951]: internal error: Unable
> to add port vnet0 to OVS bridge br0
> Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.397087] audit: type=1400
> audit(1554300298.855:73): apparmor="DENIED" operation="exec"
> profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=9110
> comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.8658] devices removed (path: /sys/devices/virtual/net/vnet0,
> iface: vnet0)
> Apr  3 19:34:58 dpdk-OptiPlex-5040 virtlogd[5635]: End of file while
> reading data: Input/output error
> Apr  3 19:34:59 dpdk-OptiPlex-5040 kernel: [ 2818.935155] audit: type=1400
> audit(1554300299.391:74): apparmor="STATUS" operation="profile_remove"
> profile="unconfined" name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c"
> pid=9117 comm="apparmor_parser"
> Apr  3 19:34:58 dpdk-OptiPlex-5040 virtlogd[5635]: End of file while
> reading data: Input/output error
> Apr  3 19:34:59 dpdk-OptiPlex-5040 libvirtd[8951]: internal error: Unable
> to delete port (null) from OVS
> Apr  3 19:34:59 dpdk-OptiPlex-5040 kernel: [ 2819.157913] audit: type=1400
> audit(1554300299.615:75): apparmor="DENIED" operation="exec"
> profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=9118
> comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
>
> On Wed, Mar 27, 2019 at 5:06 PM Harsh Gondaliya <
> harshgondaliya_vinodb...@srmuniv.edu.in> wrote:
>
>> Thank you very much. I used --prefix=/usr while configuring OVS and the
>> issue got resolved.
>>
>> On Tue, Mar 26, 2019 at 10:48 PM  wrote:
>>
>>> You are probably using an ubuntu distribution. The apparmor profile for
>>> libvirt in /etc/apparmor.d/usr.sbin.libvirtd states "/usr/bin/* PUx,"
>>> but not "/usr/local/bin/* PUx". When you use the distribution ovs, it is
>>> installed in /usr/bin but yours is in /usr/local.
>>>
>>> Either modify your apparmor profile or launch ./configure with
>>> --prefix=/usr
>>>
>>> Le 26/03/2019 à 15:06, Harsh Gondaliya a écrit :
>>> > I installed OVS from source into my /usr/src directory using the
>>> > installation steps mentioned here:
>>> > http://docs.openvswitch.org/en/latest/intro/install/general/
>>> >
>>> > However when I try to create a VM in KVM-QEMU and add it to OVS Bridge
>>> > I get error: Error starting domain: internal error: Unable to add port
>>> > vnet0 to OVS bridge br0
>>> >
>>> > The system logs shows this error:
>>> >
>>> > Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: error :
>>> > virCommandWait:2553 : internal error: Child process (ovs-vsctl
>>> > --timeout=5 -- --if-exists del-port vnet0 -- add-port br0 vnet0 -- set
>>> > Interface vnet0 'external-ids:attached-mac="52:54:00:90:c6:c3"' -- set
>>> > Interface vnet0
>>> > 'external-ids:iface-id="a9700eff-03a7-4c47-a112-429fc20677a2"' -- set
>>> > Interface vnet0
>>> > 'external-ids:vm-id="41b4eef0-b820-41da-9034-9de22e1379e0"' -- set
>>> > Interface vnet0 external-ids:iface-status=active) unexpected exit
>>> > status 126:
>>> > *
>>> > *
>>> > *libvirt:  error : cannot execute binary ovs-vsctl: Permission 

Re: [ovs-discuss] libvirt: error : cannot execute binary ovs-vsctl: Permission denied

2019-04-03 Thread Harsh Gondaliya
The config worked once I rebooted the host PC. So the issue got resolved.
Thanks.

On Wed, Apr 3, 2019 at 7:46 PM Harsh Gondaliya <
harshgondaliya_vinodb...@srmuniv.edu.in> wrote:

> When I changed the prefix to --prefix=/usr everything worked well. Now
> when I want to change the AppArmor profile similar error pops up. I changed
> /usr/bin/* PUx to /usr/local/bin/* PUx in
> /etc/apparmor.d/usr.sbin.libvirtd. Unable to troubleshoot what is going
> wrong.
> These are my system logs:
>
> Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.045860] audit: type=1400
> audit(1554300298.503:71): apparmor="STATUS" operation="profile_load"
> profile="unconfined" name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c"
> pid=9093 comm="apparmor_parser"
> Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.046045] audit: type=1400
> audit(1554300298.503:72): apparmor="STATUS" operation="profile_load"
> profile="unconfined"
> name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c//qemu_bridge_helper"
> pid=9093 comm="apparmor_parser"
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.5148] manager: (vnet0): new Tun device
> (/org/freedesktop/NetworkManager/Devices/14)
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.5197] devices added (path: /sys/devices/virtual/net/vnet0,
> iface: vnet0)
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.5197] device added (path: /sys/devices/virtual/net/vnet0,
> iface: vnet0): no ifupdown configuration found.
> Apr  3 19:34:58 dpdk-OptiPlex-5040 libvirtd[8951]: internal error: Unable
> to add port vnet0 to OVS bridge br0
> Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.397087] audit: type=1400
> audit(1554300298.855:73): apparmor="DENIED" operation="exec"
> profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=9110
> comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
> Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
> [1554300298.8658] devices removed (path: /sys/devices/virtual/net/vnet0,
> iface: vnet0)
> Apr  3 19:34:58 dpdk-OptiPlex-5040 virtlogd[5635]: End of file while
> reading data: Input/output error
> Apr  3 19:34:59 dpdk-OptiPlex-5040 kernel: [ 2818.935155] audit: type=1400
> audit(1554300299.391:74): apparmor="STATUS" operation="profile_remove"
> profile="unconfined" name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c"
> pid=9117 comm="apparmor_parser"
> Apr  3 19:34:58 dpdk-OptiPlex-5040 virtlogd[5635]: End of file while
> reading data: Input/output error
> Apr  3 19:34:59 dpdk-OptiPlex-5040 libvirtd[8951]: internal error: Unable
> to delete port (null) from OVS
> Apr  3 19:34:59 dpdk-OptiPlex-5040 kernel: [ 2819.157913] audit: type=1400
> audit(1554300299.615:75): apparmor="DENIED" operation="exec"
> profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=9118
> comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
>
> On Wed, Mar 27, 2019 at 5:06 PM Harsh Gondaliya <
> harshgondaliya_vinodb...@srmuniv.edu.in> wrote:
>
>> Thank you very much. I used --prefix=/usr while configuring OVS and the
>> issue got resolved.
>>
>> On Tue, Mar 26, 2019 at 10:48 PM  wrote:
>>
>>> You are probably using an ubuntu distribution. The apparmor profile for
>>> libvirt in /etc/apparmor.d/usr.sbin.libvirtd states "/usr/bin/* PUx,"
>>> but not "/usr/local/bin/* PUx". When you use the distribution ovs, it is
>>> installed in /usr/bin but yours is in /usr/local.
>>>
>>> Either modify your apparmor profile or launch ./configure with
>>> --prefix=/usr
>>>
>>> Le 26/03/2019 à 15:06, Harsh Gondaliya a écrit :
>>> > I installed OVS from source into my /usr/src directory using the
>>> > installation steps mentioned here:
>>> > http://docs.openvswitch.org/en/latest/intro/install/general/
>>> >
>>> > However when I try to create a VM in KVM-QEMU and add it to OVS Bridge
>>> > I get error: Error starting domain: internal error: Unable to add port
>>> > vnet0 to OVS bridge br0
>>> >
>>> > The system logs shows this error:
>>> >
>>> > Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: error :
>>> > virCommandWait:2553 : internal error: Child process (ovs-vsctl
>>> > --timeout=5 -- --if-exists del-port vnet0 -- add-port br0 vnet0 -- set
>>> > Interface vnet0 'external-ids:attached-mac="52:54:00:90:c6:c3"' -- set
>>> > Interface vnet0
>>> > 'external-ids:iface-id="a9700eff-03a7-4c47-a112-429fc20677a2"' -- set
>>> > Interface vnet0
>>> > 'external-ids:vm-id="41b4eef0-b820-41da-9034-9de22e1379e0"' -- set
>>> > Interface vnet0 external-ids:iface-status=active) unexpected exit
>>> > status 126:
>>> > *
>>> > *
>>> > *libvirt:  error : cannot execute binary ovs-vsctl: Permission denied*
>>> >
>>> > Mar 26 19:25:01 dpdk-OptiPlex-5040 kernel: [ 1932.243181] audit:
>>> > type=1400 audit(1553608501.701:59): apparmor="DENIED" operation="exec"
>>> > profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=20679
>>> > comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 

Re: [ovs-discuss] libvirt: error : cannot execute binary ovs-vsctl: Permission denied

2019-04-03 Thread Harsh Gondaliya
When I changed the prefix to --prefix=/usr everything worked well. Now when
I want to change the AppArmor profile similar error pops up. I changed
/usr/bin/* PUx to /usr/local/bin/* PUx in
/etc/apparmor.d/usr.sbin.libvirtd. Unable to troubleshoot what is going
wrong.
These are my system logs:

Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.045860] audit: type=1400
audit(1554300298.503:71): apparmor="STATUS" operation="profile_load"
profile="unconfined" name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c"
pid=9093 comm="apparmor_parser"
Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.046045] audit: type=1400
audit(1554300298.503:72): apparmor="STATUS" operation="profile_load"
profile="unconfined"
name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c//qemu_bridge_helper"
pid=9093 comm="apparmor_parser"
Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
[1554300298.5148] manager: (vnet0): new Tun device
(/org/freedesktop/NetworkManager/Devices/14)
Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
[1554300298.5197] devices added (path: /sys/devices/virtual/net/vnet0,
iface: vnet0)
Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
[1554300298.5197] device added (path: /sys/devices/virtual/net/vnet0,
iface: vnet0): no ifupdown configuration found.
Apr  3 19:34:58 dpdk-OptiPlex-5040 libvirtd[8951]: internal error: Unable
to add port vnet0 to OVS bridge br0
Apr  3 19:34:58 dpdk-OptiPlex-5040 kernel: [ 2818.397087] audit: type=1400
audit(1554300298.855:73): apparmor="DENIED" operation="exec"
profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=9110
comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
Apr  3 19:34:58 dpdk-OptiPlex-5040 NetworkManager[8158]: 
[1554300298.8658] devices removed (path: /sys/devices/virtual/net/vnet0,
iface: vnet0)
Apr  3 19:34:58 dpdk-OptiPlex-5040 virtlogd[5635]: End of file while
reading data: Input/output error
Apr  3 19:34:59 dpdk-OptiPlex-5040 kernel: [ 2818.935155] audit: type=1400
audit(1554300299.391:74): apparmor="STATUS" operation="profile_remove"
profile="unconfined" name="libvirt-ae767ff5-9d0f-4413-999b-b6b14dbf9b0c"
pid=9117 comm="apparmor_parser"
Apr  3 19:34:58 dpdk-OptiPlex-5040 virtlogd[5635]: End of file while
reading data: Input/output error
Apr  3 19:34:59 dpdk-OptiPlex-5040 libvirtd[8951]: internal error: Unable
to delete port (null) from OVS
Apr  3 19:34:59 dpdk-OptiPlex-5040 kernel: [ 2819.157913] audit: type=1400
audit(1554300299.615:75): apparmor="DENIED" operation="exec"
profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=9118
comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
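The DENIED records in the log above name both the confining profile and the path that was refused. A minimal Python sketch for extracting those fields from one audit record follows; parse_audit_fields is a hypothetical helper, and the sample line is the first DENIED record from this log joined onto one line.

```python
import re

# key=value pairs where the value is either "quoted" or a bare token.
FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_audit_fields(line):
    """Return the key=value pairs of an audit record as a dict of strings."""
    fields = {}
    for match in FIELD_RE.finditer(line):
        key, quoted, bare = match.groups()
        fields[key] = quoted if quoted is not None else bare
    return fields


SAMPLE = (
    'audit: type=1400 audit(1554300298.855:73): apparmor="DENIED" '
    'operation="exec" profile="/usr/sbin/libvirtd" '
    'name="/usr/local/bin/ovs-vsctl" pid=9110 comm="libvirtd" '
    'requested_mask="x" denied_mask="x" fsuid=0 ouid=0'
)

record = parse_audit_fields(SAMPLE)
if record.get("apparmor") == "DENIED":
    print(record["profile"], "was denied", record["operation"],
          "on", record["name"])
```

Here the refused path is still /usr/local/bin/ovs-vsctl under the /usr/sbin/libvirtd profile, which suggests the edited profile was not the one in effect when the record was written.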

On Wed, Mar 27, 2019 at 5:06 PM Harsh Gondaliya <
harshgondaliya_vinodb...@srmuniv.edu.in> wrote:

> Thank you very much. I used --prefix=/usr while configuring OVS and the
> issue got resolved.
>
> On Tue, Mar 26, 2019 at 10:48 PM  wrote:
>
>> You are probably using an ubuntu distribution. The apparmor profile for
>> libvirt in /etc/apparmor.d/usr.sbin.libvirtd states "/usr/bin/* PUx,"
>> but not "/usr/local/bin/* PUx". When you use the distribution ovs, it is
>> installed in /usr/bin but yours is in /usr/local.
>>
>> Either modify your apparmor profile or launch ./configure with
>> --prefix=/usr
>>
>> Le 26/03/2019 à 15:06, Harsh Gondaliya a écrit :
>> > I installed OVS from source into my /usr/src directory using the
>> > installation steps mentioned here:
>> > http://docs.openvswitch.org/en/latest/intro/install/general/
>> >
>> > However when I try to create a VM in KVM-QEMU and add it to OVS Bridge
>> > I get error: Error starting domain: internal error: Unable to add port
>> > vnet0 to OVS bridge br0
>> >
>> > The system logs shows this error:
>> >
>> > Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: error :
>> > virCommandWait:2553 : internal error: Child process (ovs-vsctl
>> > --timeout=5 -- --if-exists del-port vnet0 -- add-port br0 vnet0 -- set
>> > Interface vnet0 'external-ids:attached-mac="52:54:00:90:c6:c3"' -- set
>> > Interface vnet0
>> > 'external-ids:iface-id="a9700eff-03a7-4c47-a112-429fc20677a2"' -- set
>> > Interface vnet0
>> > 'external-ids:vm-id="41b4eef0-b820-41da-9034-9de22e1379e0"' -- set
>> > Interface vnet0 external-ids:iface-status=active) unexpected exit
>> > status 126:
>> > *
>> > *
>> > *libvirt:  error : cannot execute binary ovs-vsctl: Permission denied*
>> >
>> > Mar 26 19:25:01 dpdk-OptiPlex-5040 kernel: [ 1932.243181] audit:
>> > type=1400 audit(1553608501.701:59): apparmor="DENIED" operation="exec"
>> > profile="/usr/sbin/libvirtd" name="/usr/local/bin/ovs-vsctl" pid=20679
>> > comm="libvirtd" requested_mask="x" denied_mask="x" fsuid=0 ouid=0
>> >
>> > Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: debug :
>> > virCommandRun:2280 : Result status 0, stdout: '' stderr: 'libvirt:
>> > error : cannot execute binary ovs-vsctl: Permission denied#012'
>> > Mar 26 19:25:01 dpdk-OptiPlex-5040 libvirtd.service: 20423: error :
>> > 

Re: [ovs-discuss] ovs-master crashes due to double locking of ofproto_mutex.

2019-04-03 Thread Vishal Deep Ajmera
Hi Ben, Jarno,

I am not sure why we have to take a lock (ofproto_mutex) to process a 
packet-out action. I think this action does not modify any OpenFlow table. I 
can see that the lock is required for actions like flow_add/flow_mod etc., as they 
modify the existing flow tables.

Any pointers to the reasoning behind this would be very helpful. We are frequently 
hitting this scenario in our deployment and, as you can see below, it is quite easy 
to reproduce with a set of OpenFlow rules.

Warm Regards,
Vishal Ajmera

The following commit introduced this lock in handle_packet_out:

commit 1f4a893366826e392722d5b1ba59e94331bfe5c9
Author: Jarno Rajahalme 
Date:   Wed Sep 14 16:51:27 2016 -0700

ofproto: Refactor packet_out handling.

Refactor handle_packet_out() to prepare for bundle support for packet
outs in a later patch.

Two new callbacks are introduced in ofproto-provider class:
->packet_xlate() and ->packet_execute().  ->packet_xlate() translates
the packet using the flow and actions provided by the caller, but
defers all OpenFlow-visible side-effects (stats, learn actions, actual
packet output, etc.) to be explicitly executed with the
->packet_execute() call.

Adds a new ofproto_rule_reduce_timeouts__() that must be called with
'ofproto_mutex' held.  This is used in the next patch.

Signed-off-by: Jarno Rajahalme 
Acked-by: Ben Pfaff 


From: ovs-discuss-boun...@openvswitch.org  
On Behalf Of Anil Kumar Koli via discuss
Sent: Wednesday, April 3, 2019 3:37 PM
To: ovs-discuss@openvswitch.org
Subject: [ovs-discuss] ovs-master crashes due to double locking of 
ofproto_mutex.

Hello OVS team,

An OVS crash is observed in ovs-master when the controller sends a packet-out 
(with OFPP_TABLE as the output port) and any OpenFlow table entry 
that the packet hits results in a learn action.

Steps to reproduce:
1. Start the ovs-vswitchd in dpdk mode.
2. Configure the following flows
ovs-ofctl -OOpenflow13 add-flow br-int "table=0, priority=50, ct_state=-trk,ip, in_port=10 actions=ct(table=0)"
ovs-ofctl -OOpenflow13 add-flow br-int "table=0, priority=50, ct_state=+trk,ip, in_port=10 actions=ct(commit),resubmit(,1)"
ovs-ofctl -OOpenflow13 add-flow br-int "table=1 actions=learn(table=2,NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_OF_IN_PORT[]->NXM_NX_REG0[0..15], output:NXM_NX_REG0[0..15]),resubmit(,2)"
3. Send a packet with output as OFPP_TABLE
ovs-ofctl -OOpenflow13 packet-out br-int 'in_port=10 packet=505400071011080045284006f97cc0a80001c0a800020008000a50022e7d, actions=TABLE'

This leads to a crash caused by double locking of ofproto_mutex in 
handle_packet_out and ofproto_flow_mod_learn.
Can someone provide more insight into why the ofproto_mutex lock was introduced in 
handle_packet_out, which is the cause of the above crash?
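To illustrate the failure pattern described here (not the OVS code itself), a generic Python sketch of one thread re-acquiring a non-recursive mutex it already holds. The two functions are hypothetical stand-ins named after handle_packet_out and ofproto_flow_mod_learn, and the non-blocking acquire is used only so the example reports the conflict instead of hanging.

```python
import threading

# Stand-in for ofproto_mutex: a plain, non-reentrant lock.
ofproto_mutex = threading.Lock()


def flow_mod_learn():
    """Inner step that also wants the lock (stand-in for the learn action path)."""
    acquired = ofproto_mutex.acquire(blocking=False)
    if acquired:
        ofproto_mutex.release()
    return acquired


def handle_packet_out():
    """Outer step that holds the lock while the packet is processed."""
    with ofproto_mutex:
        # The same thread asks for the lock a second time here.
        return flow_mod_learn()


inner_got_lock = handle_packet_out()
print("inner acquire succeeded:", inner_got_lock)
```

With a blocking acquire the second attempt would never return; how a real mutex implementation reacts (deadlock, error return, or abort) depends on the mutex type in use.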

Please find the backtrace in the attachment.

Thanks & Regards,
Anil Kumar.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Open vSwitch fails to allocate memory pool for DPDK port

2019-04-03 Thread Ian Stokes
On 4/3/2019 1:04 AM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) 
via discuss wrote:

Hello,

I’m trying to attach a DPDK port with an mtu_size of 9216 to a bridge. 
For this purpose, I have allocated 512 HugePages of size 2MB for OVS 
(1GB in total).


Hi,

I couldn't reproduce the behavior above on my own system with 512 x 2MB 
hugepages. Ports were successfully configured with MTU 9216. Perhaps 
some more detail as regards your setup will help reproduce/root cause.


Questions inline below.

What commands have you used to configure the hugepage memory on your system?

Before you start OVS with DPDK, if you execute cat /proc/meminfo how 
many hugepages do you see available and how many free? (for 2mb I would 
assume 512 in both cases).


What memory commands are you passing to OVS with DPDK (e.g. 
dpdk-socket-mem parameter etc.)?


Is it just 1 bridge and a single DPDK interface you are adding or are 
there more than 1 DPDK interface attached?




Doing so will constantly fail, two workarounds to get it working were 
either to decrease the MTU size to 1500 or to increase the total amount 
of HugePage memory to 3GB.


Actually, I did expect the setup to also work with just 1GB because if 
the amount of memory is not sufficient, OVS will try to halve the number 
of buffers until 16K.


However, inside the logs I couldn’t find any details regarding this. The 
only error message I observed was:


netdev_dpdk|ERR|Failed to create memory pool for netdev dpdk-p0, with 
MTU 9216 on socket 0: Invalid argument


Can you provide the entire log? I'd be interested in seeing the memory 
info at initialization of OVS DPDK.




That log message is weird as I would have expected an error message 
saying something like ‘could not reserve memory’ but not ‘Invalid argument’.


I then found this very similar bug on Openstack: 
https://bugs.launchpad.net/starlingx/+bug/1796380


After having read this, I tried the exact same setup as described above 
but this time with HugePages of size 1GB instead of 2MB. In this 
scenario, it also worked with just 1GB of memory reserved for OVS.


Inside the logs I could observe this time:

2019-04-02T22:55:31.849Z|00098|dpdk|ERR|RING: Cannot reserve memory

2019-04-02T22:55:32.019Z|00099|dpdk|ERR|RING: Cannot reserve memory

2019-04-02T22:55:32.200Z|00100|netdev_dpdk|INFO|Virtual function 
detected, HW_CRRC_STRIP will be enabled




What type of DPDK device are you adding? It seems to be a Virtual 
function from the log above, can you provide more detail as regards the 
underlying NIC type the VF is associated with?



2019-04-02T22:55:32.281Z|00101|netdev_dpdk|INFO|Port 0: f6:e9:29:4d:f9:cf

2019-04-02T22:55:32.281Z|00102|dpif_netdev|INFO|Core 1 on numa node 0 
assigned port 'dpdk-p0' rx queue 0 (measured processing cycles 0).


The two times where OVS cannot reserve memory are, I guess, the two 
times where it has to halve the number of buffers to get it working.


Yes this is correct. For example in my setup with 512 x 2MB pages I see 
"Cannot reserve memory" message 4 times before it completes configuration.




My question now is, is the fact that it does not work for 2MB HugePages 
a bug? Also, is the error message in the first log extract the intended one?




Yes, it seems like a bug if it can be reproduced. The invalid argument 
in this case would refer to the number of mbufs being requested by OVS 
DPDK being less than the minimum allowed (4096 * 64), i.e. the amount of 
memory available cannot support the minimum 262,144 mbufs OVS DPDK 
requires. The memory configuration at the system level is outside of OVS 
control, however.
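The halving behaviour discussed in this thread can be sketched in a few lines of Python. This is a rough model under stated assumptions, not the OVS implementation: the starting request of 4096 * 64 mbufs and the 16K floor are pieced together from the two figures quoted in this thread, the 10240 bytes per mbuf (MTU 9216 plus a round overhead) is made up for illustration, and both function names are hypothetical.

```python
def halving_schedule(start, floor):
    """Yield candidate mbuf counts, halving each time, down to the floor."""
    n = start
    while n >= floor:
        yield n
        n //= 2


def first_fit(start, floor, bytes_per_mbuf, budget_bytes):
    """Return the first candidate count whose pool fits the budget, else None."""
    for n in halving_schedule(start, floor):
        if n * bytes_per_mbuf <= budget_bytes:
            return n
    return None


START = 4096 * 64          # 262,144 mbufs, the figure quoted above
FLOOR = 16 * 1024          # "until 16K", as described earlier in the thread
BYTES_PER_MBUF = 10240     # assumption: MTU 9216 plus rough per-buffer overhead
ONE_GIB = 1024 ** 3

schedule = list(halving_schedule(START, FLOOR))
fit = first_fit(START, FLOOR, BYTES_PER_MBUF, ONE_GIB)
print(schedule)
print("first count that fits in 1 GiB:", fit)
```

Under these illustrative numbers the full request needs about 2.5 GiB, so it only fits after the count has been halved. Real allocations also depend on memory contiguity and on other consumers of the hugepage pool, which this model ignores.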



*My version numbers:*

  * CentOS 7.6
  * Open vSwitch version: 2.9.3
  * DPDK version: 17.11

Is this DPDK 17.11.0? As an FYI, the latest DPDK 17.11.x version 
recommended for OVS is 17.11.4 with OVS 2.9.4. These releases typically include 
bug fixes in DPDK, so we recommend moving to them if possible.


Ian

  * System has a single NUMA node.

Thank you

Tobias


___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss





Re: [ovs-discuss] OVS

2019-04-03 Thread Ammu
Hello,

Do you have any update on this?

-
Keerthana

On Mon, Apr 1, 2019 at 5:27 PM Ammu  wrote:

> Hello Greg,
>
> Sorry for the late reply.
>
> I tried moving the source interface to another bridge, but after making
> the configuration changes the result is still not as expected.
>
> I am facing a few hurdles. The configuration is in the attachment.
>
> Kindly help me with the configuration if I have missed something.
>
> Also, I have a question: why should the source interface not be in the
> same bridge (as it was in my earlier configuration)?
> Is there any limitation?
>
> -
> Keerthana
>
> On Fri, Mar 29, 2019 at 9:18 PM Gregory Rose  wrote:
>
>>
>> On 3/27/2019 10:33 PM, Ammu wrote:
>>
>> Hi Greg,
>>
>> *ens256:*
>>
>>- Source interface which is intended to receive all traffic (the
>>reason why promisc mode is ON)
>>- Trying to tunnel all traffic that I receive in this interface
>>
>>
>> *Output for* *ip addr show ens256:*
>>
>> 4: ens256:  mtu 1500 qdisc
>> pfifo_fast state UP group default qlen 1000
>> link/ether 00:50:56:95:ed:4f brd ff:ff:ff:ff:ff:ff
>> inet6 fe80::26b9:a4a6:c518:8738/64 scope link noprefixroute
>>valid_lft forever preferred_lft forever
>>
>>
>> Please move this interface to a different bridge and then retry.  If you
>> check my configuration I do not have any
>> other interface on the bridge where the tunnel is located.
>>
>> Thanks,
>>
>> - Greg
>>
>>
>> -
>> Keerthana
>>
>> On Wed, Mar 27, 2019 at 11:39 PM Gregory Rose 
>> wrote:
>>
>>>
>>>
>>> On 3/27/2019 2:30 AM, Ammu wrote:
>>>
>>> Hello Greg,
>>>
>>> I still don't see any change to my previous results.
>>>
>>> I am attaching the results along with the configurations made.
>>>
>>> I have attached first few packets of the capture.
>>>
>>> Kindly let me know if I am missing something or if my explanation is
>>> unclear.
>>>
>>>
>>> I'm curious - why do you do this in your configuration?
>>>
>>> ovs-vsctl add-port br0 ens256
>>> ip link set ens256 up
>>> ip link set ens256 promisc on
>>>
>>> Can you provide the output of 'ip addr show ens256'?
>>> And what is the source of that interface?
>>>
>>> Thanks,
>>>
>>> - Greg
>>>
>>>
>>> -
>>> Keerthana
>>>
>>>
>>>
>>> On Tue, Mar 26, 2019 at 12:24 PM Ammu  wrote:
>>>
>>>> Hi Greg,
>>>>
>>>> Thank you for the update!
>>>>
>>>> Yeah, we are all on the same page.
>>>>
>>>> Maybe I will update the OS version to 7.5 and get back with the result at
>>>> the earliest.
>>>>
>>>> -
>>>> Keerthana
>>>>
>>>> On Tue, Mar 26, 2019 at 2:34 AM Gregory Rose 
>>>> wrote:

>
>
> On 3/25/2019 9:03 AM, Gregory Rose wrote:
> >
> > On 3/23/2019 2:28 AM, Ammu wrote:
> >> Hey Greg,
> >>
> >> The recent check with OVS 2.10.1 version was done with CentOS Linux
> >> release 7.3.1611
> >>
> >> But, I will have to support the solution with distributions
> >> CentOS/Red Hat/Ubuntu.
> >>
> >> Currently giving you the output of distribution CentOS alone.
> >>
> >> [root@localhost ~]# modinfo openvswitch
> >> filename:
> >>
>  /lib/modules/3.10.0-514.el7.x86_64/kernel/net/openvswitch/openvswitch.ko
> >> license:GPL
> >> description:Open vSwitch switching datapath
> >> rhelversion:7.3
> >> srcversion: B31AE95554C9D9A0067F935
> >> depends:
> >> nf_conntrack,nf_nat,libcrc32c,nf_nat_ipv6,nf_nat_ipv4,nf_defrag_ipv6
> >> intree: Y
> >> vermagic:   3.10.0-514.el7.x86_64 SMP mod_unload modversions
> >> signer: CentOS Linux kernel signing key
> >> sig_key: D4:88:63:A7:C1:6F:CC:27:41:23:E6:29:8F:74:F0:57:AF:19:FC:54
> >> sig_hashalgo:   sha256
> >
> > OK, I wanted to make sure that's the case before I do the repro
> > attempt today.  I'm using a 7.5 based
> > driver but it should be substantially the same.  I'll update in a
> bit
> > after I try it out.
> >
>
> Hi Ammu,
>
> I have tried your setup but am not seeing the same results.
>
> Here is my configuration on machine A:
>
> [root@localhost ovs-test-scripts]# ovs-vsctl show
> a83453d5-27f8-4873-9356-e94b0d488797
>  Bridge "br0"
>  Port "vxlan1"
>  Interface "vxlan1"
>  type: vxlan
>  options: {df_default="false", key="100",
> remote_ip="200.0.0.102"}
>  Port "br0"
>  Interface "br0"
>  type: internal
>  ovs_version: "2.10.1"
>
> I have the identical configuration on Machine B, with the tunnel
> pointing back
> to machine A:
>
> [root@localhost ovs-test-scripts]# ovs-vsctl show
> bd184ee4-6e36-415b-ab90-e447046470c9
>  Bridge "br0"
>  Port "vxlan1"
>  Interface "vxlan1"
>  type: vxlan
>  options: {df_default="false", key="100",
> remote_ip="200.0.0.109"}
>  Port "br0"
>

Re: [ovs-discuss] Packet drop after openvswitch bond interface toggles

2019-04-03 Thread Ian Stokes

On 4/3/2019 12:28 PM, Inakoti, Satish (Nokia - HU/Budapest) wrote:

Hello Ian,
I already tried patching these two fixes in one of the environments some time 
ago, and they did not help.


Ok, thanks for trying them.


I am thinking along similar lines: the LACP control channel of OVS 
has to wait until the carrier is fully capable of handling traffic before 
sending LACP control PDUs.
The fixes below may be missing a trick for the other types of netdevs (e.g. 
DPDK)?


Possibly. Ideally the netdev_dpdk behavior would be similar to the other 
netdevs, but as the underlying hardware for a netdev_dpdk device can 
also differ, I'm wondering whether there is something specific to the ixgbe 
PMD used by the 82599ES card that needs to be addressed here if the 
patches below do not resolve the issue.


I'll need a little time to reproduce on my own system to investigate 
further and I'll follow up then.


Ian



-Satish Inakoti

-Original Message-
From: Ian Stokes 
Sent: Wednesday, April 03, 2019 1:09 PM
To: Inakoti, Satish (Nokia - HU/Budapest) ; 
b...@openvswitch.org
Subject: Re: [ovs-discuss] Packet drop after openvswitch bond interface toggles

On 4/2/2019 8:11 AM, Inakoti, Satish (Nokia - HU/Budapest) wrote:

Hi,
*Problem statement:*
If an OVS bond is configured with LACP active-active mode (SLB balancing)
and one of the links goes down and comes back up again, we observe
packet drops for a few seconds.
*Environment:*
Openvswitch version - ovs-vsctl (Open vSwitch) 2.9.3
   DB Schema 7.15.1
DPDK version - dpdk-17.11.4
Physical nics: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network
Connection
Bond mode - LACP active-active, SLB balanced.
*Steps to reproduce:*

  1. If one of the links goes down (taken down from the ToR switch), the
 other takes over and traffic flows smoothly as expected.
  2. When this link becomes active again, the VM connected to this
 bond interface observes packet (UDP) drops for a few seconds.

*Expected behavior:*
The traffic should flow without any drop, even after the interface comes up.
BR,


Hi,

this sounds similar to the issue described in

https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/356956.html

There are 2 patches under review to help address the issue (although as
you are using an 82599ES I would think patch 1 below should resolve the
issue for you, the second patch is aimed at i40e devices).

https://patchwork.ozlabs.org/patch/1051724/
https://patchwork.ozlabs.org/patch/1051725/

Could you check if they resolve the issue you are seeing?

Regards
Ian






Re: [ovs-discuss] Packet drop after openvswitch bond interface toggles

2019-04-03 Thread Inakoti, Satish (Nokia - HU/Budapest)
Hello Ian,
I already tried patching these two fixes in one of the environments some time 
ago, and they did not help.

I am thinking along similar lines: the LACP control channel of OVS 
has to wait until the carrier is fully capable of handling traffic before 
sending LACP control PDUs. 
The fixes below may be missing a trick for the other types of netdevs (e.g. 
DPDK)?


-Satish Inakoti

-Original Message-
From: Ian Stokes  
Sent: Wednesday, April 03, 2019 1:09 PM
To: Inakoti, Satish (Nokia - HU/Budapest) ; 
b...@openvswitch.org
Subject: Re: [ovs-discuss] Packet drop after openvswitch bond interface toggles

On 4/2/2019 8:11 AM, Inakoti, Satish (Nokia - HU/Budapest) wrote:
> Hi,
> *Problem statement:*
> If an OVS bond is configured with LACP active-active mode (SLB balancing) 
> and one of the links goes down and comes back up again, we observe 
> packet drops for a few seconds.
> *Environment:*
> Openvswitch version - ovs-vsctl (Open vSwitch) 2.9.3
>   DB Schema 7.15.1
> DPDK version - dpdk-17.11.4
> Physical nics: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network 
> Connection
> Bond mode - LACP active-active, SLB balanced.
> *Steps to reproduce:*
> 
>  1. If one of the links goes down (taken down from the ToR switch), the
> other takes over and traffic flows smoothly as expected.
>  2. When this link becomes active again, the VM connected to this
> bond interface observes packet (UDP) drops for a few seconds.
> 
> *Expected behavior:*
> The traffic should flow without any drop, even after the interface comes up.
> BR,

Hi,

this sounds similar to the issue described in

https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/356956.html

There are 2 patches under review to help address the issue (although as 
you are using an 82599ES I would think patch 1 below should resolve the 
issue for you, the second patch is aimed at i40e devices).

https://patchwork.ozlabs.org/patch/1051724/
https://patchwork.ozlabs.org/patch/1051725/

Could you check if they resolve the issue you are seeing?

Regards
Ian




Re: [ovs-discuss] Packet drop after openvswitch bond interface toggles

2019-04-03 Thread Ian Stokes

On 4/2/2019 8:11 AM, Inakoti, Satish (Nokia - HU/Budapest) wrote:

Hi,
*Problem statement:*
If an OVS bond is configured with LACP active-active mode (SLB balancing) 
and one of the links goes down and comes back up again, we observe 
packet drops for a few seconds.

*Environment:*
Openvswitch version - ovs-vsctl (Open vSwitch) 2.9.3
  DB Schema 7.15.1
DPDK version - dpdk-17.11.4
Physical nics: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network 
Connection

Bond mode – LACP active-active, SLB balanced.
*Steps to reproduce:*

 1. If one of the links goes down (taken down from the ToR switch), the
other takes over and traffic flows smoothly as expected.
 2. When this link becomes active again, the VM connected to this
bond interface observes packet (UDP) drops for a few seconds.

*Expected behavior:*
The traffic should flow without any drop, even after the interface comes up.
BR,


Hi,

this sounds similar to the issue described in

https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/356956.html

There are 2 patches under review to help address the issue (although as 
you are using an 82599ES I would think patch 1 below should resolve the 
issue for you, the second patch is aimed at i40e devices).


https://patchwork.ozlabs.org/patch/1051724/
https://patchwork.ozlabs.org/patch/1051725/

Could you check if they resolve the issue you are seeing?

Regards
Ian




[ovs-discuss] ovs-master crashes due to double locking of ofproto_mutex.

2019-04-03 Thread Anil Kumar Koli via discuss
Hello OVS team,

 

An OVS crash is observed in ovs-master when the controller sends a packet with
packet-out (output port OFPP_TABLE) and any OpenFlow table entry that is hit
results in a learn action.

 

Steps to reproduce:

1. Start the ovs-vswitchd in dpdk mode.

2. Configure the following flows

ovs-ofctl -OOpenflow13 add-flow br-int "table=0, priority=50, ct_state=-trk,ip, in_port=10 actions=ct(table=0)"

ovs-ofctl -OOpenflow13 add-flow br-int "table=0, priority=50, ct_state=+trk,ip, in_port=10 actions=ct(commit),resubmit(,1)"

ovs-ofctl -OOpenflow13 add-flow br-int "table=1 actions=learn(table=2,NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:NXM_OF_IN_PORT[]->NXM_NX_REG0[0..15], output:NXM_NX_REG0[0..15]),resubmit(,2)"

3. Send a packet with output as OFPP_TABLE

ovs-ofctl -OOpenflow13 packet-out br-int 'in_port=10
packet=505400071011080045284006f97cc0a80001c0a800020
008000a50022e7d, actions=TABLE'

 

This leads to a crash caused by double locking of ofproto_mutex in
handle_packet_out and ofproto_flow_mod_learn.

Can someone provide more insight into why the ofproto_mutex lock was
introduced in handle_packet_out, which is the cause of the above crash?
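
The double-locking pattern described above can be illustrated with a small stand-alone sketch. This is not OVS code: the Python functions below only borrow the names from the report, and `threading.Lock` stands in for a non-recursive mutex. A non-blocking acquire is used so the example reports the problem instead of hanging.

```python
import threading

# Stand-in for a non-recursive mutex (assumption: illustrative only).
ofproto_mutex = threading.Lock()


def flow_mod_learn():
    # The inner function tries to take the lock itself. If the caller
    # already holds it, a non-recursive lock cannot be taken again.
    if not ofproto_mutex.acquire(blocking=False):
        return "double lock detected"
    try:
        return "learned"
    finally:
        ofproto_mutex.release()


def handle_packet_out():
    # The outer function holds the lock while the learn action runs.
    with ofproto_mutex:
        return flow_mod_learn()


print(handle_packet_out())  # lock is already held by the caller
print(flow_mod_learn())     # lock is free, so the inner path succeeds
```

In the sketch the inner function simply returns a message. In the backtrace below, the equivalent failure ends in ovs_abort being called from ovs_mutex_lock_at.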

 

Please find the backtrace in the attachment.

 

Thanks & Regards,

Anil Kumar.

root@ubuntu16:/home/sdn# ovs-vswitchd --version
ovs-vswitchd (Open vSwitch) 2.11.90
DPDK 18.11.0

root@ubuntu16:/home/sdn# apport-retrace -g 
/var/crash/_usr_sbin_ovs-vswitchd.0.crash
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Reading symbols from /usr/sbin/ovs-vswitchd...done.
[New LWP 12999]
[New LWP 13001]
[New LWP 13002]
[New LWP 13006]
[New LWP 13051]
[New LWP 13052]
[New LWP 13055]
[New LWP 13056]
[New LWP 13170]
[New LWP 13000]
[New LWP 13171]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `ovs-vswitchd unix:/var/run/openvswitch/db.sock 
-vconsole:emer -vsyslog:err -vfi'.
Program terminated with signal SIGABRT, Aborted.
#0  0x7fd393582428 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
54  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0x7fd394d16000 (LWP 12999))]
(gdb) bt
#0  0x7fd393582428 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
#1  0x7fd39358402a in __GI_abort () at abort.c:89
#2  0x00a5a72e in ovs_abort_valist (err_no=, 
format=, args=args@entry=0x7ffcdd7aab10)
at lib/util.c:363
#3  0x00a5a7c4 in ovs_abort (err_no=, 
format=format@entry=0xc42049 "%s: pthread_%s_%s failed")
at lib/util.c:355
#4  0x00a276bc in ovs_mutex_lock_at (l_=l_@entry=0xf8b240 
,
where=where@entry=0xc13c99 "ofproto/ofproto.c:5380") at lib/ovs-thread.c:75
#5  0x00951f7f in ofproto_flow_mod_learn (ofm=ofm@entry=0x7ffcdd7aae30, 
keep_ref=, limit=0,
below_limitp=below_limitp@entry=0x7ffcdd7aad50) at ofproto/ofproto.c:5380
#6  0x0097288e in xlate_learn_action (ctx=ctx@entry=0x7ffcdd7ac6d0, 
learn=learn@entry=0x2bac018)
at ofproto/ofproto-dpif-xlate.c:5362
#7  0x00978ef3 in do_xlate_actions (ofpacts=, 
ofpacts_len=, ctx=,
is_last_action=, group_bucket_action=) at 
ofproto/ofproto-dpif-xlate.c:6762
#8  0x0097353e in xlate_recursively (actions_xlator=0x977760 
, is_last_action=false, deepens=false,
rule=0x2babe80, ctx=0x7ffcdd7ac6d0) at ofproto/ofproto-dpif-xlate.c:4217
#9  xlate_table_action (ctx=0x7ffcdd7ac6d0, in_port=, 
table_id=, may_packet_in=,
honor_table_miss=, with_ct_orig=, 
is_last_action=false, xlator=0x977760 )
at ofproto/ofproto-dpif-xlate.c:4345
#10 0x0097857a in xlate_ofpact_resubmit (is_last_action=false, 
resubmit=0x2bad5f8, ctx=0x7ffcdd7ac6d0)
at ofproto/ofproto-dpif-xlate.c:4656
#11 do_xlate_actions (ofpacts=ofpacts@entry=0x2bad5d8, 
ofpacts_len=ofpacts_len@entry=48, ctx=ctx@entry=0x7ffcdd7ac6d0,
is_last_action=is_last_action@entry=true, 
group_bucket_action=group_bucket_action@entry=false)
at ofproto/ofproto-dpif-xlate.c:6642
#12 0x0097ce46 in xlate_actions (xin=xin@entry=0x7ffcdd7ad560, 
xout=xout@entry=0x7ffcdd7ad950)
at ofproto/ofproto-dpif-xlate.c:7424
#13 0x0096d151 in upcall_xlate (wc=0x7ffcdd7aebc0,