https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80658

            Bug ID: 80658
           Summary: Memory leak reported in libstdc++ (zerotier)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bernd at net2o dot de
  Target Milestone: ---

This rather unfriendly blog entry reports a memory leak in libstdc++
("worst bug of my entire career"):

https://www.zerotier.com/blog/2017-05-05-theleak.shtml

It includes a not very easy way to reproduce it (by installing their software
and stress-testing it).  Apparently the author didn't file a bug report here.

Solution proposed there: link against jemalloc (it's under BSDL), performance
goes up, memory consumption stays low, i.e. neither use glibc's "too slow"
malloc() nor use libstdc++'s memory allocator (still slower than jemalloc).

I don't like this discovery at all, because the implications are too bad...

1. Using your own allocator by default renders tools like valgrind blind.
2. Having two allocators means two times the possibility for bugs.  Actually
having about 10 different allocators is even worse ;-).
3. If glibc's malloc is slow, make it faster, don't implement your own
allocator.

There are some limited valid reasons to create your own allocator, but
libstdc++ shouldn't do that by default.  Especially if the multi-threaded speed
of glibc's malloc is too slow, please just fix glibc.
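
To make the "opt in, not default" point concrete, here is a minimal sketch of
selecting the pooling allocator explicitly for one container while everything
else keeps whichever allocator the library was configured to use by default.
It assumes libstdc++'s extension header <ext/mt_allocator.h> is available; the
helper names sum_with_mt_alloc and sum_with_default are made up for this
example.

```cpp
#include <cassert>
#include <cstddef>
#include <ext/mt_allocator.h>  // libstdc++ extension, not standard C++
#include <numeric>
#include <vector>

// A vector that explicitly opts in to the pooling mt_allocator.
using PooledVector = std::vector<int, __gnu_cxx::__mt_alloc<int>>;

// Sums 1..n using storage obtained from __mt_alloc.
int sum_with_mt_alloc(int n) {
    assert(n >= 0);
    PooledVector v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), 1);
    return std::accumulate(v.begin(), v.end(), 0);
}

// Same computation with the library's default std::allocator.
int sum_with_default(int n) {
    assert(n >= 0);
    std::vector<int> v(static_cast<std::size_t>(n));
    std::iota(v.begin(), v.end(), 1);
    return std::accumulate(v.begin(), v.end(), 0);
}
```

Only code that names the pooling allocator in its type pays for (or benefits
from) its behaviour; the rest of the program stays visible to malloc-level
tools.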

Due to #1, we don't even know how many people are affected by the bug.  Memory
leaks caused by the allocator itself aren't detectable by tools that replace
the allocator to find memory leaks (like valgrind), and what's worse: valgrind
doesn't help people to find memory leaks they caused themselves in libstdc++.

I assume that the mt_allocator is used here, because it is easiest to screw up
a multithreaded allocator.  Things that can go wrong:

* the handover from local free list to global free list doesn't work as it
should (forgets to add free stuff, race conditions)
* the access to the global free list doesn't work as it should (more race
conditions possible).
* threads terminating forget to merge their free list
* big chunks of memory never make it back to the global freelist, because only
a few such allocations happen, not enough to exceed the limit of the local
freelist
...
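
The handover points in the list above can be sketched as follows.  This is a
simplified illustration, not the actual mt_allocator code: GlobalPool,
ThreadCache, kBlockSize and the "merge everything once the limit is exceeded"
policy are all assumptions made for the example.  The two places where blocks
can get lost are the merge on exceeding the local limit and the merge in the
destructor, which stands in for thread termination.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <new>
#include <vector>

constexpr std::size_t kBlockSize = 64;  // assumed fixed block size

// Hypothetical global free list, guarded by a mutex.
class GlobalPool {
public:
    ~GlobalPool() {
        for (void* p : free_) ::operator delete(p);
    }
    // Move all blocks from a local list into the global list.
    void merge(std::vector<void*>& blocks) {
        std::lock_guard<std::mutex> lock(mutex_);
        free_.insert(free_.end(), blocks.begin(), blocks.end());
        blocks.clear();
    }
    // Hand out one free block, or nullptr if none is available.
    void* take() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return nullptr;
        void* p = free_.back();
        free_.pop_back();
        return p;
    }
    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    std::mutex mutex_;
    std::vector<void*> free_;
};

// Hypothetical per-thread cache; intended to be used by one thread only.
class ThreadCache {
public:
    ThreadCache(GlobalPool& pool, std::size_t limit)
        : pool_(pool), limit_(limit) {}
    ThreadCache(const ThreadCache&) = delete;
    ThreadCache& operator=(const ThreadCache&) = delete;

    // Thread termination: forgetting this merge strands the local blocks.
    ~ThreadCache() { pool_.merge(local_); }

    void* allocate() {
        if (!local_.empty()) {
            void* p = local_.back();
            local_.pop_back();
            return p;
        }
        if (void* p = pool_.take()) return p;
        return ::operator new(kBlockSize);
    }

    void deallocate(void* p) {
        assert(p != nullptr);
        local_.push_back(p);
        // Handover: once the local list exceeds its limit, give blocks back.
        if (local_.size() > limit_) pool_.merge(local_);
    }

private:
    GlobalPool& pool_;
    std::size_t limit_;
    std::vector<void*> local_;
};
```

In a real program each thread would own one ThreadCache (for instance as a
thread_local object), so that its destructor runs when the thread exits.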

The documentation of mt_allocator is at least somewhat misleading:

https://gcc.gnu.org/onlinedocs/libstdc++/manual/mt_allocator_impl.html

"Notes about deallocation. This allocator does not explicitly release memory."

Well, it does add freed memory to its freelists and reuse it.  It's just not
giving back unused memory to the OS.  However, for bigger allocations, using
mmap() and returning the memory to the OS on free is a very good idea.
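
A minimal sketch of that idea, assuming a POSIX system with anonymous mmap():
requests at or above a threshold get their own mapping and are handed back to
the OS with munmap() on free, smaller ones go through malloc().  The names
big_alloc, big_free, is_mapped and the 128 KiB value of kMmapThreshold are
choices made for this example, not anything taken from libstdc++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sys/mman.h>

constexpr std::size_t kMmapThreshold = 128 * 1024;  // assumed cut-off

// Bookkeeping stored in front of every block; aligned so that the user
// pointer that follows it is suitably aligned for any type.
struct alignas(std::max_align_t) Header {
    std::size_t total;  // bytes including this header
    bool mapped;        // true if the block came from mmap()
};

void* big_alloc(std::size_t n) {
    std::size_t total = n + sizeof(Header);
    if (total < n) return nullptr;  // size overflow
    Header* h = nullptr;
    bool mapped = total >= kMmapThreshold;
    if (mapped) {
        void* p = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) return nullptr;
        h = static_cast<Header*>(p);
    } else {
        h = static_cast<Header*>(std::malloc(total));
        if (h == nullptr) return nullptr;
    }
    h->total = total;
    h->mapped = mapped;
    return h + 1;  // user memory starts right after the header
}

bool is_mapped(const void* p) {
    assert(p != nullptr);
    return (static_cast<const Header*>(p) - 1)->mapped;
}

void big_free(void* p) {
    if (p == nullptr) return;
    Header* h = static_cast<Header*>(p) - 1;
    if (h->mapped) {
        munmap(h, h->total);  // pages go straight back to the OS
    } else {
        std::free(h);
    }
}
```

The design choice is the usual trade-off: a system call per large allocation
in exchange for not holding on to big, rarely reused regions.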

Related: I have some gripes with glibc's malloc as well.  If you turn on
debugging, so that your program doesn't get a C abort() and can print its
own diagnostics (usually you want that when you discover that there are memory
corruption bugs), malloc() stops being thread-safe.  That is just not at all
helpful.  I worked around this by wrapping malloc(), realloc() and free() in a
critical section when malloc() debugging is enabled.  Ulrich Drepper had that as
"wontfix", because he somehow couldn't see how to implement it.  Note that the
debugging version of malloc() doesn't have to be ultra-fast.  It's there for
debugging.  It can lock a mutex on every call.
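
Roughly, the workaround looks like the sketch below.  This is a reconstruction
for illustration, not the original code: the wrapper names locked_malloc,
locked_realloc and locked_free and the single global mutex g_heap_mutex are
assumptions, and a real program would still have to route its allocations
through these wrappers when debugging is enabled.

```cpp
#include <cassert>
#include <cstdlib>
#include <mutex>

namespace {
// One global lock serialises every heap operation; slow but simple.
std::mutex g_heap_mutex;
}  // namespace

void* locked_malloc(std::size_t n) {
    std::lock_guard<std::mutex> lock(g_heap_mutex);
    return std::malloc(n);
}

void* locked_realloc(void* p, std::size_t n) {
    std::lock_guard<std::mutex> lock(g_heap_mutex);
    return std::realloc(p, n);
}

void locked_free(void* p) {
    std::lock_guard<std::mutex> lock(g_heap_mutex);
    std::free(p);
}
```

Since the lock is only needed while debugging, the cost of taking a mutex on
every call does not matter for normal builds.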
