Re: Understanding ioBufAllocator behaviour

Dunkin, Nick Wed, 24 May 2017 08:49:55 -0700

Just to clarify.

Yes, this is a constant load via a test application.  The test scenario is live 
segmented video content (Apple HLS), so we have small, plain text, manifest 
files and large(ish) video files (3MB).  There is also constant cache churn 
because the content is live video.   We have set RAM cache and disk cache 
deliberately small (1GB each) to observe behavior as we churn cache.


The graph I attached was for ioBufAllocator[0], but we do see similar trends on 
[4] and [5].  We do see expected behavior, i.e. plateauing, from the other 
ioBufAllocators.  Sorry for not making that clear in my initial email.

Thanks,

Nick

From: Alan Carroll <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, May 24, 2017 at 11:39 AM
To: "[email protected]" <[email protected]>
Subject: Re: Understanding ioBufAllocator behaviour

That can certainly happen and is a known problem, but it looked like Nick's 
scenario was a constant load via a test application and he saw unbounded growth 
in a single iobuf bucket.

For threads, there is a single global pool and each thread keeps a smaller pool 
from the global one (via the ProxyAllocator instances). The ProxyAllocator has 
a high and low water mark - when the # of items in the thread exceeds the high 
water mark they are released back to the global pool until there are only low 
water mark items left. The values for these are in the 128-512 range, so not on 
the same scale as this memory growth.
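
As a rough illustration of the high/low water mark behaviour described above, here is a minimal Python sketch. It is not the Traffic Server implementation: the class names `GlobalPool` and `ThreadLocalPool` are hypothetical, and the marks of 128 and 512 are simply picked from the range quoted above.

```python
class GlobalPool:
    """Hypothetical stand-in for the single global free list."""

    def __init__(self):
        self.free = []

    def get(self):
        # Reuse a freed block if one exists, otherwise "allocate" a new one.
        return self.free.pop() if self.free else object()

    def put(self, block):
        self.free.append(block)


class ThreadLocalPool:
    """Hypothetical per-thread cache with high and low water marks."""

    def __init__(self, global_pool, low=128, high=512):
        self.global_pool = global_pool
        self.low = low      # assumed value, taken from the quoted range
        self.high = high    # assumed value, taken from the quoted range
        self.free = []

    def alloc(self):
        # Prefer the thread-local free list; fall back to the global pool.
        return self.free.pop() if self.free else self.global_pool.get()

    def release(self, block):
        self.free.append(block)
        # Once the local list exceeds the high water mark, hand blocks back
        # to the global pool until only `low` blocks remain locally.
        if len(self.free) > self.high:
            while len(self.free) > self.low:
                self.global_pool.put(self.free.pop())


global_pool = GlobalPool()
thread_pool = ThreadLocalPool(global_pool)

# Allocate 600 blocks on one "thread", then release them all.
blocks = [thread_pool.alloc() for _ in range(600)]
for block in blocks:
    thread_pool.release(block)

print(len(thread_pool.free), len(global_pool.free))
```

In this sketch the per-thread list never holds more than the high water mark, so the memory parked in thread-local pools stays small compared with the growth seen in the graph.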

There's been lots of discussion about jemalloc. What we lack is production 
performance data to see what the impact would be. We're working on that. As far 
as I understand it (Phil and Leif know more) we would keep the ProxyAllocators 
but instead of releasing to a global pool the memory would be released to 
jemalloc for re-use, thereby strongly bounding the amount of memory in a 
particular iobuf bucket.


On Wednesday, May 24, 2017, 10:29:32 AM CDT, Kapil Sharma (kapsharm) 
<[email protected]> wrote:
On plateauing - not necessarily; we do see the memory consumption increasing 
continuously in our deployments as well. It depends on the pattern of segment 
sizes over time.

ATS uses power-of-2 allocators for its memory pools - there are 15 of those, 
ranging from 128 bytes to 2M if my memory serves me right - and these are per 
thread! ATS will choose an optimal allocator for the segments.

As Alan mentioned, once chunks are allocated, they are never freed.
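
To make the size classes concrete, here is a small Python sketch built on the figures recalled above (15 classes, doubling from 128 bytes). The selection rule, "smallest class that fits", and the helper name `size_class_index` are assumptions for illustration, not a description of the actual ATS code.

```python
MIN_SIZE = 128          # smallest class, per the figure quoted above
NUM_CLASSES = 15        # 128 bytes doubled 14 times gives 2 MiB

# One entry per allocator: 128, 256, 512, ... up to 2 MiB.
SIZE_CLASSES = [MIN_SIZE << i for i in range(NUM_CLASSES)]


def size_class_index(nbytes):
    """Return the index of the smallest class that can hold nbytes."""
    for index, size in enumerate(SIZE_CLASSES):
        if nbytes <= size:
            return index
    raise ValueError(f"{nbytes} bytes exceeds the largest class")


print(SIZE_CLASSES[0], SIZE_CLASSES[-1])
print(size_class_index(100), size_class_index(4096), size_class_index(5000))
```

Note that a request just over a class boundary (5000 bytes here) lands in the next class up, so up to half of a chunk can go unused under this rule.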

Here is a totally artificial example just to make the point (please correct if 
my understanding is flawed):
* Suppose the traffic pattern is such that initially only 2M allocators are 
used. ATS will keep allocating 2M chunks until the RAM cache limit (let's say 
it is 64GB) is reached.
* Now the traffic pattern changes (smaller fragment requests) and only 1M 
allocators are used. ATS will now keep allocating 1M chunks, again capping at 
64GB. But in the end ATS will have allocated 128GB, well over the RAM cache 
size limit.
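
The artificial example above can be played out in a few lines of Python. This is a toy model under stated assumptions: `NeverFreeingAllocator` is a hypothetical class, the cache limit is scaled down from 64GB to 64MB so the loop stays short, and eviction is modelled as releasing every 2M chunk at once.

```python
class NeverFreeingAllocator:
    """Hypothetical allocator: one free list per chunk size, never shrinks."""

    def __init__(self):
        self.allocated = {}   # chunk size -> chunks ever obtained from the OS
        self.free = {}        # chunk size -> chunks currently idle

    def alloc(self, size):
        if self.free.get(size, 0) > 0:
            self.free[size] -= 1          # reuse an idle chunk of this size
        else:
            self.allocated[size] = self.allocated.get(size, 0) + 1

    def release(self, size):
        # The chunk goes back to its own size class only.
        self.free[size] = self.free.get(size, 0) + 1

    def footprint(self):
        # Total bytes held, whether in use or idle.
        return sum(size * count for size, count in self.allocated.items())


MB = 1024 * 1024
CACHE_LIMIT = 64 * MB   # scaled down from 64GB to keep the run fast

pool = NeverFreeingAllocator()

# Phase 1: fill the cache with 2M chunks, then evict them all.
for _ in range(CACHE_LIMIT // (2 * MB)):
    pool.alloc(2 * MB)
for _ in range(CACHE_LIMIT // (2 * MB)):
    pool.release(2 * MB)

# Phase 2: traffic shifts and the cache fills with 1M chunks.
for _ in range(CACHE_LIMIT // MB):
    pool.alloc(1 * MB)

print(pool.footprint() // MB)
```

The idle 2M chunks cannot serve 1M requests in this model, so the footprint ends at twice the cache limit even though only one cache's worth is in use.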


In the past there was a prototype of reclaimable buffer support added to 
ATS, but I believe it was removed in 7.0? Also, there is a recent discussion 
about adding jemalloc?



On May 24, 2017, at 11:01 AM, Alan Carroll 
<[email protected]> wrote:

One issue is that memory never moves between the iobuf sizes. Once a chunk of 
memory is used for a specific iobuf slot, it's there forever. But unless 
something is leaking, the total size should eventually plateau, certainly 
within less than a day if you have a basically constant load. There will be 
some growth due to blocks being kept in thread local allocation pools, but 
again that should level in less time than you've run.
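
The plateau argument can be checked with a toy model of a single iobuf size. This is a sketch under assumptions, not ATS code: demand is drawn at random between 90 and 110 blocks per tick to stand in for a "basically constant" load, and freed blocks always go back to the same free list.

```python
import random

random.seed(0)          # fixed seed so the run is repeatable

allocated = 0           # blocks ever created for this one iobuf size
idle = 0                # blocks sitting on the free list
in_use = 0
history = []            # value of `allocated` after each tick

for tick in range(2000):
    # Roughly constant load: between 90 and 110 blocks needed at a time.
    demand = random.randint(90, 110)
    if demand > in_use:
        need = demand - in_use
        reused = min(need, idle)
        idle -= reused
        allocated += need - reused     # grow only when the free list is empty
    else:
        idle += in_use - demand        # surplus blocks return to the free list
    in_use = demand
    history.append(allocated)

print(history[0], history[-1])
```

In this model `allocated` only ever rises to the highest demand seen so far, which is bounded by the peak load. Growth that continues for days, as in the graph, therefore points at blocks that are never returned to the free list.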


On Wednesday, May 24, 2017, 9:50:39 AM CDT, Dunkin, Nick 
<[email protected]> wrote:

Hi Alan,


This is 7.0.0


I only see this behavior on ioBufAllocator[0], [4] and [5].  The other 
ioBufAllocators’ usage looks as I would expect (i.e. allocated goes up then 
flat), so I was thinking it was more likely something to do with my 
configuration or use-case.


I’d also just like to understand, at a high level, how the ioBufAllocators are 
used.


Thanks,


Nick


From: Alan Carroll <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, May 24, 2017 at 10:33 AM
To: "[email protected]" <[email protected]>
Subject: Re: Understanding ioBufAllocator behaviour


Honestly it sounds like a leak. Can you specify which version of Traffic Server 
this is?



On Wednesday, May 24, 2017, 8:22:46 AM CDT, Dunkin, Nick 
<[email protected]> wrote:

Hi


I have a load test that I’ve been running for a number of days now.  I’m using 
the memory dump logging in traffic.out and I’m trying to understand how Traffic 
Server allocates and reuses memory.  I’m still quite new to Traffic Server.


Nearly all of the memory traces look as I would expect, i.e. memory is 
allocated and reused over the lifetime of the test.  However my readings from 
ioBufAllocator[0] show a continual increase in allocated AND used.  I am 
attaching a graph.  (FYI – This graph covers approximately 3 days of continual 
load test.)


I would have expected to start seeing reuse in ioBufAllocator by now, like I do 
in the other ioBufAllocators.  Can someone help me understand what I’m seeing?


Many thanks,


Nick Dunkin


Nick Dunkin

Principal Engineer

o:   678.258.4071

e:   [email protected]
4375 River Green Pkwy # 100, Duluth, GA 30096, USA

<image001.png>
