[ https://issues.apache.org/jira/browse/TS-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259540#comment-14259540 ]
Sudheer Vinukonda edited comment on TS-3261 at 12/28/14 12:51 AM:
------------------------------------------------------------------
Hi [~shinrich]: I agree that SSL_set_bio has the necessary protection to
release any old BIOs before setting the new ones. I'm now retesting a new patch
in prod with the second part removed.
However, I am not really sure that the Valgrind results above both point to the
same leak. Based on my experience, tools like Valgrind report "perceived"
leaks, which are basically any calls to new/malloc without a corresponding call
to delete/free. This works fine in normal situations, but it can sometimes
report "false" leaks, especially when free lists are used.
With a free-list-based memory model, memory is typically allocated up front and
never freed back to the system. Valgrind is not aware of this and reports all
such free list allocations as leaks. This is typically the case with ATS: when
you run Valgrind on ATS, it reports a bunch of leaks, most of which are not
really leaks, and one needs to carefully filter out the false leaks to identify
the real ones.
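To illustrate the point above, here is a minimal sketch of a free-list allocator in Python. It is a toy model, not the actual ATS or OpenSSL allocator; the class name FreeListAllocator and its counters are hypothetical and only mirror the "allocated" and "in-use" columns of the memory dump. A tool that pairs each system allocation with a matching release would see four allocations and no releases here, even though nothing is leaking.

```python
class FreeListAllocator:
    """Toy free-list allocator: blocks are recycled, never returned to the system."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.free_blocks = []      # blocks waiting to be reused
        self.allocated_bytes = 0   # bytes ever obtained from the "system"
        self.in_use_bytes = 0      # bytes currently handed out to callers

    def alloc(self):
        if self.free_blocks:
            # Reuse a recycled block; no new system allocation happens.
            block = self.free_blocks.pop()
        else:
            # Stands in for a malloc call that is never paired with a free.
            block = bytearray(self.block_size)
            self.allocated_bytes += self.block_size
        self.in_use_bytes += self.block_size
        return block

    def free(self, block):
        # The block goes back on the free list, not back to the system.
        self.free_blocks.append(block)
        self.in_use_bytes -= self.block_size


pool = FreeListAllocator(block_size=32768)
blocks = [pool.alloc() for _ in range(4)]
for block in blocks:
    pool.free(block)

print(pool.allocated_bytes, pool.in_use_bytes)
```

After the loop, in_use_bytes is back to zero while allocated_bytes stays at its peak, which is the pattern a leak checker tends to flag.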
For the specific leak in this jira, I've pasted only a tiny extract of the
entire Valgrind leak report (which is about 20K lines) after filtering out all
the other "false" leaks. Many other OpenSSL leaks were reported along with
these two, but I filtered them out after comparing them with a similar report
from v5.0. Most likely, all of those are due to OpenSSL's free lists (similar
to those of ATS); these two are the only ones that really stood out compared
to v5.0.
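One way such a comparison between two reports could be automated is sketched below. This is an assumption about the approach, not the filtering actually used for this jira: it assumes the usual Memcheck frame lines ("at 0x...: func" / "by 0x...: func"), keeps only the first token of each function name, and the helper names leak_stacks and new_stacks are made up for illustration.

```python
import re

# Matches Memcheck stack-frame lines such as
# "==1234==    at 0x4C2B0E0: malloc (vg_replace_malloc.c:270)".
# Only the first whitespace-free token of the function name is captured,
# so source line numbers do not affect the comparison.
FRAME_RE = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (\S+)")


def leak_stacks(report_text):
    """Return the set of call stacks (tuples of function names) in a report."""
    stacks = set()
    current = []
    for line in report_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            current.append(match.group(1))
        elif current:
            # Any non-frame line ends the stack collected so far.
            stacks.add(tuple(current))
            current = []
    if current:
        stacks.add(tuple(current))
    return stacks


def new_stacks(baseline_report, candidate_report):
    """Stacks present in the candidate report but absent from the baseline."""
    return leak_stacks(candidate_report) - leak_stacks(baseline_report)
```

Calling new_stacks(v50_report_text, v52_report_text) would then leave only the stacks that are new in the later version, which still need to be reviewed by hand.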
FWIW, I've never used Valgrind to detect iobuf-related leaks, since it simply
can't detect them; dump_mem_info_frequency and/or the internal iobuf tracking
(TS-3116) are better options.
PS: Valgrind has an option (a suppressions file) to "mark" certain call stacks
as not leaks (e.g., any free list allocations), after which it stops reporting
them.
> possible slow leak in v5.2.0
> ----------------------------
>
> Key: TS-3261
> URL: https://issues.apache.org/jira/browse/TS-3261
> Project: Traffic Server
> Issue Type: Bug
> Components: Core
> Affects Versions: 5.2.0
> Reporter: Sudheer Vinukonda
> Assignee: Sudheer Vinukonda
> Priority: Critical
> Fix For: 5.3.0
>
>
> After fixing the iobuffer leak in TS-3257, the iobuffers seem stable on
> v5.2.0, but there still seems to be a slow memory leak. The RES memory from
> top shows 15g after running v5.2.0 in prod for more than 3 days, whereas the
> corresponding v5.0 host shows 10g after running for more than a week.
> Below is the dump of iobuffers for the v5.2.0 and v5.0 hosts. As expected,
> most iobufs are lower than on the v5.0 host (since the v5.0 host has been
> running longer), except the 32k buffer (iobuf[8]). But the leak doesn't seem
> to be explained by the difference in 32k buffers either, as it is not large
> enough to explain the 5g difference in total memory.
> v5.2.0 host:
> {code}
>           allocated |             in-use |  type size | free list name
> --------------------|--------------------|------------|----------------------------------
>            67108864 |           25165824 |    2097152 | memory/ioBufAllocator[14]
>          2013265920 |         1825570816 |    1048576 | memory/ioBufAllocator[13]
>           620756992 |          549978112 |     524288 | memory/ioBufAllocator[12]
>           780140544 |          593494016 |     262144 | memory/ioBufAllocator[11]
>           742391808 |          574619648 |     131072 | memory/ioBufAllocator[10]
>           901775360 |          735576064 |      65536 | memory/ioBufAllocator[9]
>          1189085184 |         1093304320 |      32768 | memory/ioBufAllocator[8]
>           474480640 |          348733440 |      16384 | memory/ioBufAllocator[7]
>           269221888 |          211320832 |       8192 | memory/ioBufAllocator[6]
>           156762112 |          142999552 |       4096 | memory/ioBufAllocator[5]
>                   0 |                  0 |       2048 | memory/ioBufAllocator[4]
>              131072 |                  0 |       1024 | memory/ioBufAllocator[3]
>               65536 |                  0 |        512 | memory/ioBufAllocator[2]
>               65536 |                256 |        256 | memory/ioBufAllocator[1]
>               16384 |                  0 |        128 | memory/ioBufAllocator[0]
> {code}
> v5.0.0 host:
> {code}
>           allocated |             in-use |  type size | free list name
> --------------------|--------------------|------------|----------------------------------
>           134217728 |           31457280 |    2097152 | memory/ioBufAllocator[14]
>          2147483648 |         1843396608 |    1048576 | memory/ioBufAllocator[13]
>           788529152 |          608174080 |     524288 | memory/ioBufAllocator[12]
>           897581056 |          680525824 |     262144 | memory/ioBufAllocator[11]
>           796917760 |          660471808 |     131072 | memory/ioBufAllocator[10]
>           985661440 |          818479104 |      65536 | memory/ioBufAllocator[9]
>           873463808 |          677969920 |      32768 | memory/ioBufAllocator[8]
>           544735232 |          404439040 |      16384 | memory/ioBufAllocator[7]
>           310902784 |          237887488 |       8192 | memory/ioBufAllocator[6]
>           160956416 |          115515392 |       4096 | memory/ioBufAllocator[5]
>                   0 |                  0 |       2048 | memory/ioBufAllocator[4]
>              131072 |               2048 |       1024 | memory/ioBufAllocator[3]
>               65536 |                  0 |        512 | memory/ioBufAllocator[2]
>               98304 |              50688 |        256 | memory/ioBufAllocator[1]
> {code}
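The comparison described in the issue can be checked with a few lines of arithmetic. The sketch below copies the "allocated" and "in-use" values from the two dumps above (the v5.0 dump has no ioBufAllocator[0] row, so it is left out there); the variable and function names are illustrative only, and "5g" is assumed to be close enough to 5 GiB for this comparison.

```python
# (allocated, in_use) in bytes, keyed by ioBufAllocator index, from the dumps above.
V520 = {
    14: (67108864, 25165824), 13: (2013265920, 1825570816),
    12: (620756992, 549978112), 11: (780140544, 593494016),
    10: (742391808, 574619648), 9: (901775360, 735576064),
    8: (1189085184, 1093304320), 7: (474480640, 348733440),
    6: (269221888, 211320832), 5: (156762112, 142999552),
    4: (0, 0), 3: (131072, 0), 2: (65536, 0), 1: (65536, 256), 0: (16384, 0),
}
V500 = {
    14: (134217728, 31457280), 13: (2147483648, 1843396608),
    12: (788529152, 608174080), 11: (897581056, 680525824),
    10: (796917760, 660471808), 9: (985661440, 818479104),
    8: (873463808, 677969920), 7: (544735232, 404439040),
    6: (310902784, 237887488), 5: (160956416, 115515392),
    4: (0, 0), 3: (131072, 2048), 2: (65536, 0), 1: (98304, 50688),
}

GIB = 1024 ** 3


def total_allocated(dump):
    """Sum the 'allocated' column over all iobuf allocators."""
    return sum(allocated for allocated, _ in dump.values())


# Extra memory held by the 32k allocator (index 8) on the v5.2.0 host.
delta_32k = V520[8][0] - V500[8][0]
# Difference across all iobuf allocators (v5.2.0 minus v5.0).
delta_total = total_allocated(V520) - total_allocated(V500)

print(f"32k allocator delta:  {delta_32k / GIB:.2f} GiB")
print(f"all allocators delta: {delta_total / GIB:.2f} GiB")
```

The 32k allocator accounts for roughly 0.3 GiB of extra allocation, and the iobuf allocators as a whole hold less on the v5.2.0 host than on the v5.0 host, so the iobufs cannot account for the 5g gap in RES.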
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)