On Fri, Nov 27, 2015 at 03:54:23PM +0100, Pavel Reichl wrote:
> 
> 
> On 11/27/2015 10:06 AM, Jakub Hrozek wrote:
> >On Wed, Nov 18, 2015 at 03:06:03PM +0100, Pavel Reichl wrote:
> >>Hello,
> >>
> >>attached patch proposes solution for leaking memory when non-existing 
> >>netgroup is looked up.
> >>
> >>1st patch is just for testing - just call 'pkill -SIGUSR1 sssd_nss' and 
> >>talloc report will be generated in /tmp/sssd_nss_talloc_report_full.
> >>
> >>For details about the bug please see commit message in 2nd patch.
> >>
> >>User who reported the bug confirmed that so far it seems that memory leak 
> >>has been fixed and he didn't report any side effects.
> >>
> >>Thanks!
> >
> >> From abbf720b832beb6d04f909f598e7114a19f72a03 Mon Sep 17 00:00:00 2001
> >>From: Pavel Reichl <prei...@redhat.com>
> >>Date: Wed, 30 Sep 2015 07:54:31 -0400
> >>Subject: [PATCH 1/2] talloc_report for nss responder
> >>
> >>---
> >>  src/responder/nss/nsssrv.c | 26 ++++++++++++++++++++++++++
> >>  1 file changed, 26 insertions(+)
> >>
> >>diff --git a/src/responder/nss/nsssrv.c b/src/responder/nss/nsssrv.c
> >>index 
> >>d8eff7968c4929663412aa56d08414689b921a22..79ca9ba89d49a5fd4ee3389e0f881b1a01f9c7a9
> >> 100644
> >>--- a/src/responder/nss/nsssrv.c
> >>+++ b/src/responder/nss/nsssrv.c
> >>@@ -403,6 +403,20 @@ static void nss_dp_reconnect_init(struct 
> >>sbus_connection *conn,
> >>      /* nss_shutdown(rctx); */
> >>  }
> >>
> >>+static void signal_nss_talloc_report(struct tevent_context *ev,
> >>+                                     struct tevent_signal *se,
> >>+                                     int signum,
> >>+                                     int count,
> >>+                                     void *siginfo,
> >>+                                     void *private_data)
> >>+{
> >>+    FILE *f = fopen("/tmp/sssd_nss_talloc_report_full", "w");
> >>+    if (f != NULL) {
> >>+        talloc_report_full(NULL, f);
> >>+        fclose(f);
> >>+    }
> >>+}
> >>+
> >>  int nss_process_init(TALLOC_CTX *mem_ctx,
> >>                       struct tevent_context *ev,
> >>                       struct confdb_ctx *cdb)
> >>@@ -417,6 +431,8 @@ int nss_process_init(TALLOC_CTX *mem_ctx,
> >>      int hret;
> >>      int fd_limit;
> >>
> >>+    talloc_enable_leak_report_full();
> >>+
> >>      nss_cmds = get_nss_cmds();
> >>
> >>      ret = sss_process_init(mem_ctx, ev, cdb,
> >>@@ -558,6 +574,16 @@ int nss_process_init(TALLOC_CTX *mem_ctx,
> >>          goto fail;
> >>      }
> >>
> >>+    /* Handle SIGUSR1 to force offline behavior */
> >>+    BlockSignals(false, SIGUSR1);
> >>+    struct tevent_signal *tes;
> >>+    tes = tevent_add_signal(rctx->ev, rctx, SIGUSR1, 0,
> >>+                            signal_nss_talloc_report, rctx);
> >>+    if (tes == NULL) {
> >>+        ret = EIO;
> >>+        goto fail;
> >>+    }
> >>+
> >>      DEBUG(SSSDBG_TRACE_FUNC, "NSS Initialization complete\n");
> >>
> >>      return EOK;
> >>--
> >>2.4.3
> >>
> >
> >> From 0c05cef940b8a8650537056db4ade09b0b6a6fa6 Mon Sep 17 00:00:00 2001
> >>From: Pavel Reichl <prei...@redhat.com>
> >>Date: Tue, 10 Nov 2015 10:56:56 -0500
> >>Subject: [PATCH 2/2] NSS: Fix memory leak netgroup
> >>
> >>If netgroup cannot be found in setnetgrent_retry() function
> >>set_netgroup_entry() is called which steals getent_ctx directly to
> >>nss_ctx->netgroups.
> >>Subsequently function lookup_netgr_step() is called that (in case of
> >>nenexisting group) will call create_negcache_netgr() which creates
> >>a new dummy object to serve as negative cache. While doing so it calls
> >>again set_netgroup_entry() for the same netgroup and it calls
> >>hash_enter.
> >>
> >>hash_enter will remove previously hashed entry for netgroup (created in
> >>setnetgrent_retry()) from hash table but it won't be freed and thus it
> >>leaks.
> >>
> >>This patch sets netgroup lifetime for netgroups allocated in
> >>setnetgrent_retry().
> >
> >We already set the netgroup lifetime in both the found and notfound case
> >in lookup_netgr_step(). If I understand the code correctly, the issue is
> >that we create a new negative result in create_negcache_netgr(),
> >overwriting the one that we created in setnetgrent_retry(), correct?
> 
> Agree.
> 
> >
> >Wouldn't it then be better to see if another same object is already in
> >the hashtable and free it before replacing?
> 
> I agree it would be best. I tried that before and failed because I could not 
> decipher out the relation of talloc contexts.
> 
> I tried that again. Seems that leaks are gone. Segfaults were not happening 
> during my testing.

> Code got even messier :-(

I think the code would look nice if it was placed in the else branch of
create_negcache_netgr() :-)

I also don't think we need the tmp_ctx change..
_______________________________________________
sssd-devel mailing list
sssd-devel@lists.fedorahosted.org
https://lists.fedorahosted.org/admin/lists/sssd-devel@lists.fedorahosted.org

Reply via email to