Re: [vpp-dev] heap sizes

2021-07-01 Thread Matthew Smith via lists.fd.io
On Thu, Jul 1, 2021 at 10:07 AM Damjan Marion  wrote:

>
>
> > On 01.07.2021., at 16:12, Matthew Smith  wrote:
> >
> >
> >
> > On Thu, Jul 1, 2021 at 6:36 AM Damjan Marion  wrote:
> >
> >
> > > On 01.07.2021., at 11:12, Benoit Ganne (bganne) via lists.fd.io wrote:
> > > >
> > > >> Yes, allowing dynamic heap growth sounds like it could be better.
> > > >> Alternatively... if memory allocations could fail and something more
> > > >> graceful than VPP exiting could occur, that may also be better. E.g. if
> > > >> I'm adding a route and try to allocate a counter for it and that fails, it
> > > >> would be better to refuse to add the route than to exit and take the
> > > >> network down.
> > > >>
> > > >> I realize that neither of those options is easy to do btw. I'm just trying
> > > >> to figure out how to make it easier and more forgiving for users to set up
> > > >> their configuration without making them learn about various memory
> > > >> parameters.
> > > >
> > > > Understood, but setting a very high default will just make users of smaller
> > > > config puzzled too and I think changing all memory allocation callsites to
> > > > check for NULL would be a big paradigm change in VPP.
> > > > That's why I think a dynamically growing heap might be better but I do not
> > > > really know what would be the complexity.
> > > > That said, you can probably change the default in your own build and that
> > > > should work.
> > >
> >
> > Fully agree with Benoit. We should not increase heap size default value.
> >
> > Things are actually a bit more complicated. For performance reasons people
> > should use hugepages whenever they are available, but they are also not
> > default.
> > When hugepages are used all pages are immediately backed with physical memory.
> >
> > So different use cases require different heap configurations and end user
> > needs to tune that.
> > Same applies for other things like stats segment page size which again may
> > impact forwarding performance significantly.
> >
> > If messing with startup.conf is too complicated for end user, some nice
> > configuration script may be helpful.
> > Or just throwing few startup.confs into extras/startup_configs.
> >
> > Dynamic heap is possible, but not straight forward, as at some places we use
> > offsets to the start of the heap, so additional allocation cannot be anywhere.
> > Also it will not help in some cases, i.e. when 1G hugepage is used for heap,
> > growing up to 2G will fail if 2nd 1G page is not pre-allocated.
> >
> >
> > Sorry for not being clear. I was not advocating any change to defaults in
> > VPP code in gerrit. I was trying to figure out the impact of changing the
> > default value written in startup.conf by the management plane I work on.
> > And also have a conversation on whether there are ways that it could be
> > made easier to tune memory parameters correctly.
>
> ok, so let me try to answer your original questions:
>
> > It's my understanding that when you set the size of the main heap or the
> > stat segment in startup.conf, the size you specify is used to set up
> > virtual address space and the system does not actually allocate that full
> > amount of memory to VPP. I think when VPP tries to read/write addresses
> > within the address space, then memory is requested from the system to back
> > the chunk of address space containing the address being accessed. Is my
> > understanding correct(ish)?
>
> heap-size parameter defines size of memory mapping created for the heap.
> With the normal 4K pages mapping is not backed by physical memory. Instead,
> first time you try to access specific page CPU will generate page fault,
> and kernel will handle it by allocating 4k chunk of physical memory to back
> that specific virtual address and setup MMU mapping for that page.
>
> In VPP we don’t have reverse process, even if all memory allocations which
> use specific 4k page are freed, that 4K page will not be returned to
> kernel, as kernel simply doesn’t know that specific page is not in use
> anymore.
> Solution would be to somehow track number of memory allocations sharing
> single 4K page and call madvise() system call when last one is freed...
>
> If you are using hugepages, all virtual memory is immediately backed by
> physical memory so VPP with 32G of hugepage heap will use 32G of physical
> memory as long as VPP is running.
>
> If you do `show memory main-heap` you will actually see how many physical
> pages are allocated:
>
> vpp# show memory main-heap
> Thread 0 vpp_main
>   base 0x7f6f95c9f000, size 1g, locked, unmap-on-destroy, name 'main heap'
> page stats: page-size 4K, total 262144, mapped 50702, not-mapped 211442
>   numa 1: 50702 pages, 198.05m bytes
> total: 1023.99M, used: 115.51M, free: 908.49M, trimmable: 905.75M
>
>
> Out of this you can see that heap is using 4K pages, 262144 total, and
> 50702 are mapped to physical memory.
> All 50702 pages are using memory on numa node 1.
>
> So effectively VPP is using around 198 MB of physical memory for heap
> while 

Re: [vpp-dev] heap sizes

2021-07-01 Thread Damjan Marion via lists.fd.io


> On 01.07.2021., at 16:12, Matthew Smith  wrote:
> 
> 
> 
> On Thu, Jul 1, 2021 at 6:36 AM Damjan Marion  wrote:
> 
> 
> > On 01.07.2021., at 11:12, Benoit Ganne (bganne) via lists.fd.io 
> >  wrote:
> > 
> >> Yes, allowing dynamic heap growth sounds like it could be better.
> >> Alternatively... if memory allocations could fail and something more
> >> graceful than VPP exiting could occur, that may also be better. E.g. if
> >> I'm adding a route and try to allocate a counter for it and that fails, it
> >> would be better to refuse to add the route than to exit and take the
> >> network down.
> >> 
> >> I realize that neither of those options is easy to do btw. I'm just trying
> >> to figure out how to make it easier and more forgiving for users to set up
> >> their configuration without making them learn about various memory
> >> parameters.
> > 
> > Understood, but setting a very high default will just make users of smaller 
> > config puzzled too  and I think changing all memory allocation callsites 
> > to check for NULL would be a big paradigm change in VPP.
> > That's why I think a dynamically growing heap might be better but I do not 
> > really know what would be the complexity.
> > That said, you can probably change the default in your own build and that 
> > should work.
> > 
> 
> Fully agree with Benoit. We should not increase heap size default value.
> 
> Things are actually a bit more complicated. For performance reasons people 
> should use 
> hugepages whenever they are available, but they are also not default.
> When hugepages are used all pages are immediately backed with physical memory.
> 
> So different use cases require different heap configurations and end user 
> needs to tune that.
> Same applies for other things like stats segment page size which again may 
> impact forwarding
> performance significantly.
> 
> If messing with startup.conf is too complicated for end user, some nice 
> configuration script may be helpful.
> Or just throwing few startup.confs into extras/startup_configs.
> 
> Dynamic heap is possible, but not straight forward, as at some places we use 
> offsets
> to the start of the heap, so additional allocation cannot be anywhere.
> Also it will not help in some cases, i.e. when 1G hugepage is used for heap, 
> growing up to 2G
> will fail if 2nd 1G page is not pre-allocated.
> 
> 
> Sorry for not being clear. I was not advocating any change to defaults in VPP 
> code in gerrit. I was trying to figure out the impact of changing the default 
> value written in startup.conf by the management plane I work on. And also 
> have a conversation on whether there are ways that it could be made easier to 
> tune memory parameters correctly. 

ok, so let me try to answer your original questions:

> It's my understanding that when you set the size of the main heap or the stat 
> segment in startup.conf, the size you specify is used to set up virtual 
> address space and the system does not actually allocate that full amount of 
> memory to VPP. I think when VPP tries to read/write addresses within the 
> address space, then memory is requested from the system to back the chunk of 
> address space containing the address being accessed. Is my understanding 
> correct(ish)?

heap-size parameter defines size of memory mapping created for the heap. With 
the normal 4K pages mapping is not backed by physical memory. Instead, first 
time you try to access specific page CPU will generate page fault, and kernel 
will handle it by allocating 4k chunk of physical memory to back that specific 
virtual address and setup MMU mapping for that page.
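
The demand-paging behaviour described above can be observed from user space. The following is a minimal sketch, not VPP code: it assumes a Linux host (it reads /proc/self/statm), and the helper names `resident_pages` and `demo` are made up for illustration. It reserves an anonymous mapping, then touches some of its pages and compares the resident page count before and after.

```python
import mmap
import os

PAGE = mmap.PAGESIZE  # system page size, typically 4096 bytes


def resident_pages():
    """Resident page count of this process (second field of /proc/self/statm)."""
    with open("/proc/self/statm") as f:
        return int(f.read().split()[1])


def demo(size=64 * 1024 * 1024, touch_pages=4096):
    """Map `size` bytes, touch `touch_pages` pages, return (before, after, touched)."""
    if not os.path.exists("/proc/self/statm"):
        return None  # not Linux: nothing to measure
    touch_pages = min(touch_pages, size // PAGE)
    # Creating the mapping only reserves address space; no physical pages yet.
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    before = resident_pages()
    for i in range(touch_pages):
        region[i * PAGE] = 1  # first write to a page triggers a page fault
    after = resident_pages()
    region.close()
    return before, after, touch_pages


if __name__ == "__main__":
    print(demo())
```

The resident count should grow by roughly the number of touched pages, while the untouched remainder of the mapping costs no physical memory. The kernel's counters are approximate, so the difference is not expected to match exactly.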

In VPP we don’t have reverse process, even if all memory allocations which use 
specific 4k page are freed, that 4K page will not be returned to kernel, as 
kernel simply doesn’t know that specific page is not in use anymore.
Solution would be to somehow track number of memory allocations sharing single 
4K page and call madvise() system call when last one is freed...
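
The tracking idea mentioned above could look roughly like the sketch below. This is purely illustrative and not how VPP's allocator works: `PageTracker`, `on_alloc`, `on_free` and the `release_page` callback are hypothetical names. It counts live allocations per 4K page and invokes the callback when a page's count drops to zero.

```python
from collections import Counter

PAGE_SIZE = 4096


class PageTracker:
    """Count live allocations per page; report pages that become unused."""

    def __init__(self, release_page, page_size=PAGE_SIZE):
        self.release_page = release_page  # called as release_page(offset, length)
        self.page_size = page_size
        self.live = Counter()             # page index -> number of live allocations

    def _pages(self, offset, size):
        first = offset // self.page_size
        last = (offset + size - 1) // self.page_size
        return range(first, last + 1)

    def on_alloc(self, offset, size):
        for page in self._pages(offset, size):
            self.live[page] += 1

    def on_free(self, offset, size):
        for page in self._pages(offset, size):
            self.live[page] -= 1
            if self.live[page] == 0:
                del self.live[page]
                # Last allocation on this page is gone: hand it back.
                self.release_page(page * self.page_size, self.page_size)


released = []
tracker = PageTracker(lambda offset, length: released.append(offset))
tracker.on_alloc(100, 200)     # lives entirely in page 0
tracker.on_alloc(3000, 2000)   # spans pages 0 and 1
tracker.on_free(100, 200)      # page 0 still has one live allocation
tracker.on_free(3000, 2000)    # pages 0 and 1 are now unused
print(released)                # [0, 4096]
```

In a real process the callback would be the place to issue the madvise() call for that page range (for example with MADV_DONTNEED on Linux). The bookkeeping cost on every alloc and free is the price of this approach.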

If you are using hugepages, all virtual memory is immediately backed by 
physical memory so VPP with 32G of hugepage heap will use 32G of physical 
memory as long as VPP is running.

If you do `show memory main-heap` you will actually see how many physical pages 
are allocated:

vpp# show memory main-heap
Thread 0 vpp_main
  base 0x7f6f95c9f000, size 1g, locked, unmap-on-destroy, name 'main heap'
page stats: page-size 4K, total 262144, mapped 50702, not-mapped 211442
  numa 1: 50702 pages, 198.05m bytes
total: 1023.99M, used: 115.51M, free: 908.49M, trimmable: 905.75M


Out of this you can see that heap is using 4K pages, 262144 total, and 50702 
are mapped to physical memory.
All 50702 pages are using memory on numa node 1.

So effectively VPP is using around 198 MB of physical memory for heap while 
real heap usage is only 115 MB.
Such a big difference is mainly caused by one place in our code which
temporarily allocates ~200M of memory for a temporary vector.
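
The figures in the `show memory main-heap` output above follow directly from the page counts. A small sketch that parses the "page stats" line and redoes the arithmetic; the function name `parse_page_stats` is an illustrative assumption, and the regular expression only targets the line format shown in this thread:

```python
import re

UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_page_stats(line):
    """Parse a 'page stats:' line and convert page counts to bytes."""
    m = re.search(
        r"page-size (\d+)([KMG]), total (\d+), mapped (\d+), not-mapped (\d+)",
        line,
    )
    if m is None:
        raise ValueError("not a page stats line")
    page_size = int(m.group(1)) * UNITS[m.group(2)]
    total, mapped, not_mapped = (int(m.group(i)) for i in (3, 4, 5))
    return {
        "page_size": page_size,
        "reserved_bytes": total * page_size,   # size of the virtual mapping
        "resident_bytes": mapped * page_size,  # backed by physical memory
        "not_mapped_pages": not_mapped,
    }


stats = parse_page_stats(
    "page stats: page-size 4K, total 262144, mapped 50702, not-mapped 211442"
)
print(stats["reserved_bytes"] / 2**30)            # 1.0 (GiB)
print(round(stats["resident_bytes"] / 2**20, 2))  # 198.05 (MiB)
```

So 262144 pages of 4K make up the 1G mapping, and the 50702 mapped pages account for the 198.05m reported for numa node 1.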

Re: [vpp-dev] heap sizes

2021-07-01 Thread Matthew Smith via lists.fd.io
On Thu, Jul 1, 2021 at 6:36 AM Damjan Marion  wrote:

>
>
> > On 01.07.2021., at 11:12, Benoit Ganne (bganne) via lists.fd.io wrote:
> >
> >> Yes, allowing dynamic heap growth sounds like it could be better.
> >> Alternatively... if memory allocations could fail and something more
> >> graceful than VPP exiting could occur, that may also be better. E.g. if
> >> I'm adding a route and try to allocate a counter for it and that fails, it
> >> would be better to refuse to add the route than to exit and take the
> >> network down.
> >>
> >> I realize that neither of those options is easy to do btw. I'm just trying
> >> to figure out how to make it easier and more forgiving for users to set up
> >> their configuration without making them learn about various memory
> >> parameters.
> >
> > Understood, but setting a very high default will just make users of smaller
> > config puzzled too and I think changing all memory allocation callsites to
> > check for NULL would be a big paradigm change in VPP.
> > That's why I think a dynamically growing heap might be better but I do not
> > really know what would be the complexity.
> > That said, you can probably change the default in your own build and that
> > should work.
> >
>
> Fully agree with Benoit. We should not increase heap size default value.
>
> Things are actually a bit more complicated. For performance reasons people
> should use
> hugepages whenever they are available, but they are also not default.
> When hugepages are used all pages are immediately backed with physical
> memory.
>
> So different use cases require different heap configurations and end user
> needs to tune that.
> Same applies for other things like stats segment page size which again may
> impact forwarding
> performance significantly.
>
> If messing with startup.conf is too complicated for end user, some nice
> configuration script may be helpful.
> Or just throwing few startup.confs into extras/startup_configs.
>
> Dynamic heap is possible, but not straight forward, as at some places we
> use offsets
> to the start of the heap, so additional allocation cannot be anywhere.
> Also it will not help in some cases, i.e. when 1G hugepage is used for
> heap, growing up to 2G
> will fail if 2nd 1G page is not pre-allocated.
>
>
Sorry for not being clear. I was not advocating any change to defaults in
VPP code in gerrit. I was trying to figure out the impact of changing the
default value written in startup.conf by the management plane I work on.
And also have a conversation on whether there are ways that it could be
made easier to tune memory parameters correctly.

-Matt

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#19685): https://lists.fd.io/g/vpp-dev/message/19685
Mute This Topic: https://lists.fd.io/mt/83856384/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] heap sizes

2021-07-01 Thread Damjan Marion via lists.fd.io


> On 01.07.2021., at 11:12, Benoit Ganne (bganne) via lists.fd.io 
>  wrote:
> 
>> Yes, allowing dynamic heap growth sounds like it could be better.
>> Alternatively... if memory allocations could fail and something more
>> graceful than VPP exiting could occur, that may also be better. E.g. if
>> I'm adding a route and try to allocate a counter for it and that fails, it
>> would be better to refuse to add the route than to exit and take the
>> network down.
>> 
>> I realize that neither of those options is easy to do btw. I'm just trying
>> to figure out how to make it easier and more forgiving for users to set up
>> their configuration without making them learn about various memory
>> parameters.
> 
> Understood, but setting a very high default will just make users of smaller 
> config puzzled too  and I think changing all memory allocation callsites to 
> check for NULL would be a big paradigm change in VPP.
> That's why I think a dynamically growing heap might be better but I do not 
> really know what would be the complexity.
> That said, you can probably change the default in your own build and that 
> should work.
> 

Fully agree with Benoit. We should not increase heap size default value.

Things are actually a bit more complicated. For performance reasons people 
should use 
hugepages whenever they are available, but they are also not default.
When hugepages are used all pages are immediately backed with physical memory.

So different use cases require different heap configurations and end user needs 
to tune that.
Same applies for other things like stats segment page size which again may 
impact forwarding
performance significantly.

If messing with startup.conf is too complicated for end user, some nice 
configuration script may be helpful.
Or just throwing few startup.confs into extras/startup_configs.

Dynamic heap is possible, but not straight forward, as at some places we use 
offsets
to the start of the heap, so additional allocation cannot be anywhere.
Also it will not help in some cases, i.e. when 1G hugepage is used for heap, 
growing up to 2G
will fail if 2nd 1G page is not pre-allocated.
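
Since a hugepage-backed heap needs its pages to be available up front, a deployment script could check the hugepage pool before choosing a heap size. A hedged sketch, assuming the standard `HugePages_Free` and `Hugepagesize` fields of /proc/meminfo (which only describe the default hugepage size); the function names and the sample text are invented for illustration:

```python
def hugepage_budget(meminfo_text):
    """Bytes available in free hugepages of the default size."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    free_pages = fields.get("HugePages_Free", 0)
    page_kb = fields.get("Hugepagesize", 0)  # reported in kB
    return free_pages * page_kb * 1024


def heap_fits(meminfo_text, heap_bytes):
    """True if a hugepage-backed heap of heap_bytes could be fully backed."""
    return heap_bytes <= hugepage_budget(meminfo_text)


# Made-up sample: a single free 1G hugepage.
SAMPLE = """\
HugePages_Total:       1
HugePages_Free:        1
Hugepagesize:    1048576 kB
"""

print(heap_fits(SAMPLE, 1 << 30))  # True: a 1G heap fits
print(heap_fits(SAMPLE, 2 << 30))  # False: no second 1G page to grow into
```

On a live system the text would come from reading /proc/meminfo. Pools of non-default hugepage sizes are reported elsewhere and are not covered by this sketch.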

— 
Damjan


View/Reply Online (#19680): https://lists.fd.io/g/vpp-dev/message/19680



Re: [vpp-dev] heap sizes

2021-07-01 Thread Benoit Ganne (bganne) via lists.fd.io
> Yes, allowing dynamic heap growth sounds like it could be better.
> Alternatively... if memory allocations could fail and something more
> graceful than VPP exiting could occur, that may also be better. E.g. if
> I'm adding a route and try to allocate a counter for it and that fails, it
> would be better to refuse to add the route than to exit and take the
> network down.
> 
> I realize that neither of those options is easy to do btw. I'm just trying
> to figure out how to make it easier and more forgiving for users to set up
> their configuration without making them learn about various memory
> parameters.

Understood, but setting a very high default will just make users of smaller 
config puzzled too  and I think changing all memory allocation callsites to 
check for NULL would be a big paradigm change in VPP.
That's why I think a dynamically growing heap might be better but I do not 
really know what would be the complexity.
That said, you can probably change the default in your own build and that 
should work.

Best
ben

View/Reply Online (#19676): https://lists.fd.io/g/vpp-dev/message/19676



Re: [vpp-dev] heap sizes

2021-06-30 Thread Matthew Smith via lists.fd.io
On Wed, Jun 30, 2021 at 3:01 AM Benoit Ganne (bganne) 
wrote:

> > What I'm trying to figure out is this: do I need to try and determine a
> > formula for the sizes that should be used for main heap and stat segment
> > based on X number of routes and Y number of worker threads? Or is there a
> > downside to just setting the main heap size to 32G (which seems like a
> > number that is unlikely to ever be exhausted sans memory leaks)?
>
> I do not think it would be a good idea:
>  - it depends upon overcommit configuration:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/overcommit-accounting.rst
>  - under the default overcommit setting ("heuristic") this would prevent
> small configs to run VPP by default: think developer VMs or smaller cloud
> instances (eg. AWS C5n.large are 4GB) which are pretty common
>
> Maybe having an (optional) dynamically growing heap could be a better
> option?
>
> ben
>

Hi Ben,

Ah, thanks for the pointer!

Yes, allowing dynamic heap growth sounds like it could be better.
Alternatively... if memory allocations could fail and something more
graceful than VPP exiting could occur, that may also be better. E.g. if I'm
adding a route and try to allocate a counter for it and that fails, it
would be better to refuse to add the route than to exit and take the
network down.

I realize that neither of those options is easy to do btw. I'm just trying
to figure out how to make it easier and more forgiving for users to set up
their configuration without making them learn about various memory
parameters.

Thanks,
-Matt

View/Reply Online (#19656): https://lists.fd.io/g/vpp-dev/message/19656



Re: [vpp-dev] heap sizes

2021-06-30 Thread Benoit Ganne (bganne) via lists.fd.io
> What I'm trying to figure out is this: do I need to try and determine a
> formula for the sizes that should be used for main heap and stat segment
> based on X number of routes and Y number of worker threads? Or is there a
> downside to just setting the main heap size to 32G (which seems like a
> number that is unlikely to ever be exhausted sans memory leaks)?

I do not think it would be a good idea:
 - it depends upon overcommit configuration: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/vm/overcommit-accounting.rst
 - under the default overcommit setting ("heuristic") this would prevent small 
configs to run VPP by default: think developer VMs or smaller cloud instances 
(eg. AWS C5n.large are 4GB) which are pretty common
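
To see which overcommit policy a given host uses, the kernel setting can be read directly. A minimal sketch assuming the standard Linux interface /proc/sys/vm/overcommit_memory; the helper names are illustrative, and the mode descriptions paraphrase the kernel documentation linked above:

```python
OVERCOMMIT_MODES = {
    0: "heuristic: obvious overcommits of address space are refused",
    1: "always: allocation requests are never refused for overcommit reasons",
    2: "never: total commit is limited to swap plus a configurable share of RAM",
}


def describe_overcommit(raw):
    """Translate the content of overcommit_memory into a description."""
    return OVERCOMMIT_MODES.get(int(raw.strip()), "unknown mode")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Return the description for this host, or None if the file is unavailable."""
    try:
        with open(path) as f:
            return describe_overcommit(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_overcommit())
```

Under mode 0, a very large heap mapping on a small machine is the kind of request the heuristic may refuse, which is the concern raised for 4GB instances.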

Maybe having an (optional) dynamically growing heap could be a better option?

ben

View/Reply Online (#19651): https://lists.fd.io/g/vpp-dev/message/19651



Re: [vpp-dev] heap sizes

2021-06-29 Thread Matthew Smith via lists.fd.io
Hi Pim,

The defaults we use are 96M for stat segment and 1G for main heap (default
main heap size). I know these need to scale up as the numbers of routes
and/or threads increase. These values can accommodate a million routes or
so when running single threaded, but allocations on the stat segment
fail if 2 worker threads are enabled with 1M routes. Allocations on the
main heap fail if the number of routes gets too far above 1M.

What I'm trying to figure out is this: do I need to try and determine a
formula for the sizes that should be used for main heap and stat segment
based on X number of routes and Y number of worker threads? Or is there a
downside to just setting the main heap size to 32G (which seems like a
number that is unlikely to ever be exhausted sans memory leaks)?

-Matt


On Mon, Jun 28, 2021 at 5:20 PM Pim van Pelt  wrote:

> Hoi Matt,
>
> Out of curiosity how large is your heap and stats segment? I ask because
> running VPP with a large FIB I needed 2G heap size (and I used page size of
> 2M), and 96M of statsseg:
>
> memory {
>   main-heap-size 3G
>   main-heap-page-size 2M
> }
>
> statseg {
> socket-name /run/vpp/stats.sock
> size 96M
> per-node-counters off
> }
>
> On Mon, Jun 28, 2021 at 11:53 PM Matthew Smith via lists.fd.io wrote:
>
>> Hi all,
>>
>> It's my understanding that when you set the size of the main heap or the
>> stat segment in startup.conf, the size you specify is used to set up
>> virtual address space and the system does not actually allocate that full
>> amount of memory to VPP. I think when VPP tries to read/write addresses
>> within the address space, then memory is requested from the system to back
>> the chunk of address space containing the address being accessed. Is my
>> understanding correct(ish)?
>>
>> When I add a large number of routes to the FIB (>1M), I have seen VPP
>> crash when the main heap or stats segment run out of space. I am wondering
>> if it makes sense to just set the heap sizes to some huge value that I am
>> confident will not be exceeded. If memory is not allocated unless it's
>> needed, it seems like that would be ok to do.
>>
>> Thanks,
>> -Matt
>>
>>
>>
>> 
>>
>>
>
> --
> Pim van Pelt 
> PBVP1-RIPE - http://www.ipng.nl/
>

View/Reply Online (#19648): https://lists.fd.io/g/vpp-dev/message/19648



Re: [vpp-dev] heap sizes

2021-06-28 Thread Pim van Pelt
Hoi Matt,

Out of curiosity how large is your heap and stats segment? I ask because
running VPP with a large FIB I needed 2G heap size (and I used page size of
2M), and 96M of statsseg:

memory {
  main-heap-size 3G
  main-heap-page-size 2M
}

statseg {
socket-name /run/vpp/stats.sock
size 96M
per-node-counters off
}
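
For a management plane that writes startup.conf, stanzas like the two above could be generated from a few parameters. A rough sketch only: `render_startup_conf` is a hypothetical helper, it emits just the options shown in this message, and the values are passed through as strings without validation.

```python
def render_startup_conf(main_heap_size, main_heap_page_size, statseg_size,
                        socket_name="/run/vpp/stats.sock"):
    """Render the memory and statseg stanzas as startup.conf text."""
    return (
        "memory {\n"
        f"  main-heap-size {main_heap_size}\n"
        f"  main-heap-page-size {main_heap_page_size}\n"
        "}\n"
        "\n"
        "statseg {\n"
        f"  socket-name {socket_name}\n"
        f"  size {statseg_size}\n"
        "  per-node-counters off\n"
        "}\n"
    )


conf_text = render_startup_conf("3G", "2M", "96M")
print(conf_text)
```

Called with "3G", "2M" and "96M", this reproduces the settings quoted above.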

On Mon, Jun 28, 2021 at 11:53 PM Matthew Smith via lists.fd.io  wrote:

> Hi all,
>
> It's my understanding that when you set the size of the main heap or the
> stat segment in startup.conf, the size you specify is used to set up
> virtual address space and the system does not actually allocate that full
> amount of memory to VPP. I think when VPP tries to read/write addresses
> within the address space, then memory is requested from the system to back
> the chunk of address space containing the address being accessed. Is my
> understanding correct(ish)?
>
> When I add a large number of routes to the FIB (>1M), I have seen VPP
> crash when the main heap or stats segment run out of space. I am wondering
> if it makes sense to just set the heap sizes to some huge value that I am
> confident will not be exceeded. If memory is not allocated unless it's
> needed, it seems like that would be ok to do.
>
> Thanks,
> -Matt
>
>
>
> 
>
>

-- 
Pim van Pelt 
PBVP1-RIPE - http://www.ipng.nl/

View/Reply Online (#19644): https://lists.fd.io/g/vpp-dev/message/19644