On Sat, Dec 26, 2020 at 11:31:32AM +0100, Otto Moerbeek wrote:
> On Sat, Dec 26, 2020 at 11:07:00AM +0100, Otto Moerbeek wrote:
>
> > On Fri, Dec 25, 2020 at 02:04:03PM +0100, Otto Moerbeek wrote:
> >
> > > On Fri, Dec 25, 2020 at 12:59:10PM +0100, Otto Moerbeek wrote:
> > >
> > > > On Fri, Dec 25, 2020 at 12:35:57PM +0100, Mark Kettenis wrote:
> > > >
> > > > > > Date: Fri, 25 Dec 2020 11:34:47 +0100
> > > > > > From: Otto Moerbeek <[email protected]>
> > > > > >
> > > > > > On Thu, Dec 24, 2020 at 01:29:28PM +0100, Uli Schlachter wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > due to that other thread, it occurred to me that getaddrinfo()
> > > > > > > also has
> > > > > > > another bug: It leaks memory. _asr_use_resolver() allocates memory
> > > > > > > per-thread (via _asr_resolver()) and saves it via
> > > > > > > _THREAD_PRIVATE() in
> > > > > > > _asr, but nothing frees that memory. A reproducer follows bellow.
> > > > > > > On
> > > > > > > Debian, no memory leak is observed (= RES in top stays constant
> > > > > > > over time).
> > > > > > >
> > > > > > > I have no good suggestion for how to fix this leak, but I feel
> > > > > > > like this
> > > > > > > might also be helpful in fixing the thread unsafety from "that
> > > > > > > other
> > > > > > > thread". Both bugs originate from storing a pointer to an
> > > > > > > allocation via
> > > > > > > _THREAD_PRIVATE(), which is something that does not really work
> > > > > > > with
> > > > > > > that API.
> > > > > > >
> > > > > > > IMHO this internal API needs to change. At this point, one can
> > > > > > > also fix
> > > > > > > the other problem by having an API that guarantees that each
> > > > > > > thread gets
> > > > > > > zeroed per-thread data instead of memcpy()ing from a global
> > > > > > > default.
> > > > > > >
> > > > > > > Other users of _THREAD_PRIVATE() instead seem to only store
> > > > > > > buffers,
> > > > > > > e.g. strerror_l() or localtime(). These buffers do not need extra
> > > > > > > cleanup.
> > > > > > >
> > > > > > > Reproducer (sorry for the line wrapping; this is basically just
> > > > > > > the
> > > > > > > previous example, but without calling getaddrinfo() on the main
> > > > > > > thread:
> > > > > > > lots of threads are started and each thread calls getaddrinfo()
> > > > > > > once):
> > > > > > >
> > > > > > > #include <pthread.h>
> > > > > > > #include <sys/types.h>
> > > > > > > #include <sys/socket.h>
> > > > > > > #include <netdb.h>
> > > > > > > #include <stdio.h>
> > > > > > > #include <string.h>
> > > > > > >
> > > > > > > #define NUM_THREADS 50
> > > > > > >
> > > > > > > static void do_lookup(const char *host)
> > > > > > > {
> > > > > > > int s;
> > > > > > > struct addrinfo hints;
> > > > > > > struct addrinfo *result;
> > > > > > >
> > > > > > > memset(&hints, 0, sizeof(hints));
> > > > > > > hints.ai_family = AF_UNSPEC;
> > > > > > > hints.ai_socktype = SOCK_STREAM;
> > > > > > > hints.ai_flags = AI_ADDRCONFIG;
> > > > > > > hints.ai_protocol = IPPROTO_TCP;
> > > > > > >
> > > > > > > s = getaddrinfo(host, NULL, &hints, &result);
> > > > > > > if (s != 0) {
> > > > > > > fprintf(stderr, "Lookup error for %s: %s\n", host,
> > > > > > > gai_strerror(s));
> > > > > > > } else {
> > > > > > > freeaddrinfo(result);
> > > > > > > }
> > > > > > > }
> > > > > > >
> > > > > > > static void *
> > > > > > > do_things(void *arg)
> > > > > > > {
> > > > > > > (void) arg;
> > > > > > > do_lookup("ipv4.google.com");
> > > > > > > return NULL;
> > > > > > > }
> > > > > > >
> > > > > > > int main()
> > > > > > > {
> > > > > > > pthread_t threads[NUM_THREADS];
> > > > > > > int i;
> > > > > > > int s;
> > > > > > >
> > > > > > > for (;;) {
> > > > > > > for (i = 0; i < NUM_THREADS; i++) {
> > > > > > > s = pthread_create(&threads[i], NULL,
> > > > > > > do_things, NULL);
> > > > > > > if (s != 0)
> > > > > > > fprintf(stderr, "Error creating
> > > > > > > thread");
> > > > > > > }
> > > > > > > for (i = 0; i < NUM_THREADS; i++) {
> > > > > > > pthread_join(threads[i], NULL);
> > > > > > > }
> > > > > > > }
> > > > > > > return 0;
> > > > > > > }
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Uli
> > > > > > > --
> > > > > > > This can be a, a little complicated. Listen, my advice is... ask
> > > > > > > somebody else for advice, at least someone who's... got more
> > > > > > > experience
> > > > > > > at... giving advice.
> > > > > > >
> > > > > >
> > > > > > Hoi,
> > > > > >
> > > > > > the diff (which is certainly wip) below fixes the asr related leaks.
> > > > > > There's still a leak on creating/destroying threads, in particular
> > > > > > the
> > > > > > stacks do not seem to be unmapped.
> > > > >
> > > > > That is (somewhat) expected. Stacks are recycled if they are
> > > > > allocated using "default" parameters; see _rthread_free_stack().
> > > >
> > > > yes, agreed, they do seem to be reused. I was looking with valgrind,
> > > > and this seem to inrtroduce a bug where the stack reuse is effectively
> > > > disabled. So this observation was a fluke.
> > > >
> > > > Still seeing growing mem usage when I create and destroy threads,
> > > > though. Strange thing is that while top show increasing SIZE and RES,
> > > > ktrace shows no syscall that do allocations, just repeating
> > > >
> > > > 67230/276969 a.out RET __tfork 445287/0x6cb67
> > > > 67230/614016 a.out RET futex 0
> > > > 67230/521766 a.out CALL __threxit(0xfe46b76aa30)
> > > > 67230/276969 a.out CALL __tfork(0x7f7ffffc5e80,24)
> > > > 67230/614016 a.out CALL __threxit(0xfe424f44230)
> > > > 67230/276969 a.out STRU struct __tfork { tcb=0xfe424f44400,
> > > > tid=0xfe424f44430, stack=0xfe3eed35630 }
> > > > 67230/445287 a.out RET __tfork 0
> > > > 67230/276969 a.out RET __tfork 118126/0x1cd6e
> > > > 67230/445287 a.out CALL futex(0xfe3fa8d7448,0x2<FUTEX_WAKE>,1,0,0)
> > > > 67230/276969 a.out CALL __tfork(0x7f7ffffc5e80,24)
> > > > 67230/118126 a.out RET __tfork 0
> > > > 67230/445287 a.out RET futex 0
> > > > 67230/276969 a.out STRU struct __tfork { tcb=0xfe3fa8d7200,
> > > > tid=0xfe3fa8d7230, stack=0xfe403a719b0 }
> > > > 67230/118126 a.out CALL futex(0xfe424f44448,0x2<FUTEX_WAKE>,1,0,0)
> > > > 67230/445287 a.out CALL __threxit(0xfe3fa8d7430)
> > > > 67230/276969 a.out RET __tfork 456926/0x6f8de
> > > >
> > > >
> > > > Could this indicate that one of these syscalls introduces a mem leak
> > > > in the process?
> > >
> > > Using procmap I'm seeing single page anon regions being added to the
> > > process.
> > > A suspect would be uvm_uarea_alloc() in thread_fork.
> >
> > Added some debug code and uvm_uarea_free() is called on the thread (in
> > the repaer) shortly after the thread exists. So that part seems to be
> > ok. Puzzled what other mapping could show up in the process when
> > creating/destroying threads.
> >
> > -Otto
> >
>
> Sorry, I misdiagnosed, the process itself is doing the occasioal mmap
> itself. It happens every +/- 1000 iterations when runinng the
> noop tets program with two threads. Now I need to find out where that
> mmap call is coming form.
>
> -Otto
>
The extra mmap calls were coming from ld.so (that's why I did not spot
in form gdb). After some head scratching it turns out to be a stupid
thing in ld.so/malloc.c
Adding this to the already posted diff makes the leak go away and the
noop threading test runs in constant memory.
I introduced this error myself in ld.so/malloc.c rev 1.24.
-Otto
Index: malloc.c
===================================================================
RCS file: /cvs/src/libexec/ld.so/malloc.c,v
retrieving revision 1.30
diff -u -p -U10 -r1.30 malloc.c
--- malloc.c 30 Sep 2019 03:35:09 -0000 1.30
+++ malloc.c 26 Dec 2020 12:32:31 -0000
@@ -575,23 +575,20 @@ omalloc_make_chunks(struct dir_info *d,
if (pp == MAP_FAILED)
return NULL;
bp = alloc_chunk_info(d, bits);
if (bp == NULL)
goto err;
/* memory protect the page allocated in the malloc(0) case */
if (bits == 0 && _dl_mprotect(pp, MALLOC_PAGESIZE, PROT_NONE) < 0)
goto err;
- bp = alloc_chunk_info(d, bits);
- if (bp == NULL)
- goto err;
bp->page = pp;
if (insert(d, (void *)((uintptr_t)pp | (bits + 1)), (uintptr_t)bp))
goto err;
LIST_INSERT_HEAD(&d->chunk_dir[bits][listnum], bp, entries);
return bp;
err:
unmap(d, pp, MALLOC_PAGESIZE, MALLOC_JUNK);
return NULL;