Re: [ovs-discuss] OVS-DPDK fails after clearing buffer

2019-04-08 Thread Burakov, Anatoly
> -----Original Message-----
> From: Tobias Hofmann -T (tohofman - AAP3 INC at Cisco)
> [mailto:tohof...@cisco.com]
> Sent: Friday, April 5, 2019 9:39 PM
> To: b...@openvswitch.org; Burakov, Anatoly 
> Cc: Shriroop Joshi (shrirjos) ; Stokes, Ian
> 
> Subject: Re: [ovs-discuss] OVS-DPDK fails after clearing buffer
> 
> Hi Anatoly,
> 
> I just wanted to follow up on the issue reported below. (It's been two
> weeks already.)
> 
> I don’t really understand the first solution you suggested: use IOVA as VA
> mode. Does that mean I should load the vfio-pci driver before I set
> dpdk-init to true, i.e. do a 'modprobe vfio-pci'? I actually do use
> vfio-pci, but I wait to load it until I actually bind an interface to it.

Hi Tobias,

As far as I can remember, in 18.08, IOVA as VA mode will be enabled if:

0) vfio-pci is loaded (modprobe vfio-pci), the IOMMU is enabled in the BIOS, etc.
1) you have *at least one physical device* (otherwise EAL defaults to IOVA as
PA mode)
2) *all* of your *physical* devices are bound to vfio-pci

Provided all of this is true, DPDK should run in IOVA as VA mode.
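
For illustration, a minimal sequence might look like this (the PCI address
below is a placeholder; dpdk-devbind.py ships with DPDK):

    modprobe vfio-pci
    dpdk-devbind.py --status                      # note your NIC's PCI address
    dpdk-devbind.py --bind=vfio-pci 0000:01:00.0  # placeholder address
    ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true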

Alternatively, DPDK 17.11 and 18.11 will have the --iova-mode command-line
switch, which allows forcing IOVA as VA mode where possible, but I'm not sure
if 18.08 has it.
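
If your DPDK version does have that switch, you could pass it through OVS via
the dpdk-extra option (which forwards extra EAL arguments), e.g.:

    ovs-vsctl set Open_vSwitch . other_config:dpdk-extra="--iova-mode=va"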

> 
> Also, to answer your last question: Transparent HugePages are enabled. I've
> just disabled them and was still able to reproduce the issue.

Unfortunately, I can't be of much help here, as I haven't looked into how the 
VM caches work on Linux, let alone what happens when hugepages end up in said 
cache. I obviously don't know the specifics of your use case and whether 
dropping caches is really necessary, but a cursory Google search indicates 
that the general sentiment is that you shouldn't be dropping caches in the 
first place, as it is considered bad practice.

> 
> Regards
> Toby
> 
> 
> On 3/21/19, 12:19 PM, "Ian Stokes"  wrote:
> 
> On 3/20/2019 10:37 PM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco)
> via discuss wrote:
> > Hello,
> >
> 
> Hi,
> 
> I wasn't sure at first glance what was happening, so I discussed it with
> Anatoly (Cc'd), who has worked a considerable amount with DPDK memory
> models. Please see the response below describing the suspected issue.
> Anatoly, thanks for your input on this.
> 
> > I want to use Open vSwitch with DPDK enabled. For this purpose, I first
> > allocate 512 HugePages of size 2MB to have a total of 1GB of HugePage
> > memory available for OVS-DPDK. (I don’t set any value for
> > dpdk-socket-mem, so the default value of 1GB is taken.) Then I set
> > dpdk-init=true. This normally works fine.
> >
> > However, I have realized that I can’t allocate HugePages from memory
> > that is inside the buff/cache (visible through free -h). To solve
> > this issue, I decided to clear the cache/buffer in Linux before
> > allocating HugePages by running echo 1 > /proc/sys/vm/drop_caches.
> >
> > After that, allocation of the HugePages still works fine. However, when
> > I then run ovs-vsctl set open_vswitch other_config:dpdk-init=true
> > the process crashes, and inside ovs-vswitchd.log I observe the
> > following:
> >
> > ovs-vswitchd log output:
> >
> > 2019-03-18T13:32:41.112Z|00015|dpdk|ERR|EAL: Can only reserve 270 pages
> > from 512 requested
> >
> > Current CONFIG_RTE_MAX_MEMSEG=256 is not enough
> 
> From the above log it is clear that, after you drop the cache, the
> hugepages’ physical addresses end up fragmented: DPDK cannot concatenate
> the pages into segments any more, which results in a one-page-per-segment
> situation and causes you to run out of memseg structures (of which there
> are only 256). We have no control over what addresses we get from the OS,
> so there’s really no way to “unfragment” the pages.
> 
> So, the above only happens when
> 
> 1) you’re running in IOVA as PA mode (so, using real physical addresses).
> 2) your hugepages are heavily fragmented.
> 
> Possible solutions for this are:
> 
> 1. Use IOVA as VA mode (i.e. use VFIO, not igb_uio). This way the pages
> will still be fragmented, but the IOMMU will remap them to be contiguous.
> This is the recommended option: where VFIO is available, it is the better
> choice than igb_uio.
> 
> 2. Use bigger page sizes. Strictly speaking, this isn’t a solution, as
> memory would be fragmented too, but a standalone 1GB-long segment is way
> more useful than a standalone 2MB-long segment.
>

Re: [ovs-discuss] OVS-DPDK fails after clearing buffer

2019-04-05 Thread Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) via discuss
Hi Anatoly,

I just wanted to follow up on the issue reported below. (It's been two weeks 
already.)

I don’t really understand the first solution you suggested: use IOVA as VA 
mode. Does that mean I should load the vfio-pci driver before I set dpdk-init 
to true, i.e. do a 'modprobe vfio-pci'? I actually do use vfio-pci, but I wait 
to load it until I actually bind an interface to it.

Also, to answer your last question: Transparent HugePages are enabled. I've 
just disabled them and was still able to reproduce the issue.

Regards
Toby


On 3/21/19, 12:19 PM, "Ian Stokes"  wrote:

On 3/20/2019 10:37 PM, Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) 
via discuss wrote:
> Hello,
> 

Hi,

I wasn't sure at first glance what was happening, so I discussed it with 
Anatoly (Cc'd), who has worked a considerable amount with DPDK memory 
models. Please see the response below describing the suspected issue. 
Anatoly, thanks for your input on this.

> I want to use Open vSwitch with DPDK enabled. For this purpose, I first 
> allocate 512 HugePages of size 2MB to have a total of 1GB of HugePage 
> memory available for OVS-DPDK. (I don’t set any value for 
> dpdk-socket-mem, so the default value of 1GB is taken.) Then I set 
> dpdk-init=true. This normally works fine.
> 
> However, I have realized that I can’t allocate HugePages from memory 
> that is inside the buff/cache (visible through free -h). To solve 
> this issue, I decided to clear the cache/buffer in Linux before 
> allocating HugePages by running echo 1 > /proc/sys/vm/drop_caches.
> 
> After that, allocation of the HugePages still works fine. However, when 
> I then run ovs-vsctl set open_vswitch other_config:dpdk-init=true 
> the process crashes, and inside ovs-vswitchd.log I observe the 
> following:
> 
> ovs-vswitchd log output:
> 
> 2019-03-18T13:32:41.112Z|00015|dpdk|ERR|EAL: Can only reserve 270 pages 
> from 512 requested
> 
> Current CONFIG_RTE_MAX_MEMSEG=256 is not enough

From the above log it is clear that, after you drop the cache, the 
hugepages’ physical addresses end up fragmented: DPDK cannot concatenate 
the pages into segments any more, which results in a one-page-per-segment 
situation and causes you to run out of memseg structures (of which there 
are only 256). We have no control over what addresses we get from the OS, 
so there’s really no way to “unfragment” the pages.
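
As the error message hints, the memseg limit is a build-time constant in DPDK 
17.11, so one stopgap is to raise it in config/common_base and rebuild DPDK, 
though that only papers over the fragmentation:

    CONFIG_RTE_MAX_MEMSEG=512    # default is 256; requires a DPDK rebuild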

So, the above only happens when

1) you’re running in IOVA as PA mode (so, using real physical addresses).
2) your hugepages are heavily fragmented.

Possible solutions for this are:

1. Use IOVA as VA mode (i.e. use VFIO, not igb_uio). This way the pages 
will still be fragmented, but the IOMMU will remap them to be contiguous. 
This is the recommended option: where VFIO is available, it is the better 
choice than igb_uio.

2. Use bigger page sizes. Strictly speaking, this isn’t a solution, as 
memory would be fragmented too, but a standalone 1GB-long segment is way 
more useful than a standalone 2MB-long segment.

3. Reboot (as you have done), or maybe try re-reserving all pages, e.g.:
i. Clean out your hugetlbfs contents to free any leftover pages
ii. echo 0 > /sys/kernel/mm/hugepages/hugepage-/nr_hugepages
iii. echo 512 > /sys/kernel/mm/hugepages/hugepage-/nr_hugepages
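
Concretely, for your 2MB pages, that might look like this (assuming
hugetlbfs is mounted at /dev/hugepages):

    rm -f /dev/hugepages/*    # with OVS stopped, free any leftover pages
    echo 0   > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
    echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
    grep Huge /proc/meminfo   # confirm HugePages_Total/HugePages_Free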

Alternatively, if you upgrade to OVS 2.11 it will use DPDK 18.11. This 
would make a difference, as since DPDK 18.05 we no longer require 
PA-contiguous segments.

I would also question why these pages are in the regular page cache in 
the first place. Are transparent hugepages enabled?
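
You can check with:

    cat /sys/kernel/mm/transparent_hugepage/enabled
    # prints e.g. "[always] madvise never"; the bracketed value is active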

HTH,
Ian

> 
> Please either increase it or request less amount of memory.
> 
> 2019-03-18T13:32:41.112Z|00016|dpdk|ERR|EAL: Cannot init memory
> 
> 2019-03-18T13:32:41.128Z|2|daemon_unix|ERR|fork child died before 
> signaling startup (killed (Aborted))
> 
> 2019-03-18T13:32:41.128Z|3|daemon_unix|EMER|could not detach from 
> foreground session
> 
> Tech Details:
> 
>   * Open vSwitch version: 2.9.2
>   * DPDK version: 17.11
>   * System has only a single NUMA node.
> 
> This problem is consistently reproducible when having a relatively high 
> amount of memory in the buffer/cache (usually around 5GB) and clearing 
> the buffer afterwards with the command outlined above.
> 
> On the Internet, I found some posts saying that this is due to memory 
> fragmentation, but normally I’m not even able to allocate HugePages in 
> the first place when my memory is already fragmented. In this scenario, 
> however, the allocation of HugePages works totally fine after clearing 
> the buffer, so why would they be fragmented?

[ovs-discuss] OVS-DPDK fails after clearing buffer

2019-03-20 Thread Tobias Hofmann -T (tohofman - AAP3 INC at Cisco) via discuss
Hello,

I want to use Open vSwitch with DPDK enabled. For this purpose, I first 
allocate 512 HugePages of size 2MB to have a total of 1GB of HugePage memory 
available for OVS-DPDK. (I don’t set any value for dpdk-socket-mem, so the 
default value of 1GB is taken.) Then I set dpdk-init=true. This normally works 
fine.
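
For reference, the allocation is done roughly like this (commands are
illustrative; 2MB is the default hugepage size on my system):

    echo 512 > /proc/sys/vm/nr_hugepages    # 512 x 2MB = 1GB
    grep Huge /proc/meminfo                 # verify the pages were reserved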

However, I have realized that I can’t allocate HugePages from memory that is 
inside the buff/cache (visible through free -h). To solve this issue, I decided 
to clear the cache/buffer in Linux before allocating HugePages by running echo 
1 > /proc/sys/vm/drop_caches.
After that, allocation of the HugePages still works fine. However, when I then 
run ovs-vsctl set open_vswitch other_config:dpdk-init=true the process crashes, 
and inside ovs-vswitchd.log I observe the following:

ovs-vswitchd log output:
2019-03-18T13:32:41.112Z|00015|dpdk|ERR|EAL: Can only reserve 270 pages from 
512 requested
Current CONFIG_RTE_MAX_MEMSEG=256 is not enough
Please either increase it or request less amount of memory.
2019-03-18T13:32:41.112Z|00016|dpdk|ERR|EAL: Cannot init memory
2019-03-18T13:32:41.128Z|2|daemon_unix|ERR|fork child died before signaling 
startup (killed (Aborted))
2019-03-18T13:32:41.128Z|3|daemon_unix|EMER|could not detach from 
foreground session

Tech Details:

  *   Open vSwitch version: 2.9.2
  *   DPDK version: 17.11
  *   System has only a single NUMA node.

This problem is consistently reproducible when having a relatively high amount 
of memory in the buffer/cache (usually around 5GB) and clearing the buffer 
afterwards with the command outlined above.
On the Internet, I found some posts saying that this is due to memory 
fragmentation, but normally I’m not even able to allocate HugePages in the 
first place when my memory is already fragmented. In this scenario, however, 
the allocation of HugePages works totally fine after clearing the buffer, so 
why would they be fragmented?

A workaround that I know of is a reboot.

I’d be very grateful for any opinions on this.

Thank you
Tobias