On Tue, Sep 18, 2012 at 01:26:20PM +0200, Michal Židek wrote: > On 09/17/2012 06:02 PM, Simo Sorce wrote: > > > >Hi Michal, > >you bring up a good point about the race with startup, but I am not > >totally convinced about the approach you used to address it. > > > >The race you point out could be easily resolved by having the init code > >unlink() the reset file w/o acting on it. That would address the problem > >of cleaning up stale file. > > unlink() is added to the nss_process_init in the new patch, to avoid > unnecessary cache cleaning when not needed. > > > > >Your solution though has another race, if someone runs sss_cache and at > >the same time the system wants to rotate logs you may get the monitor to > >ignore the legit SIGHUP and only clean the cache but not rotate the logs. > > > >This is why I planned to have the check in sssd_nss and always do log > >rotation (monitor always sends the SIGHUP around). > > You are right. I changed it to always do log rotation. > > Btw. monitor does not send SIGHUP to other process. Monitor > receives SIGHUP and tells other services what to do (via D-Bus). > Right now, 'rotate logs' and 'clear memory cache'(NSS only) messages > are sent from monitor_hup function. I've changed the comment in > the code to make it more obvious. > > > > >The other 'problem' with the current code is that you are 'leaking' > >information specific to the sssd_nss backend into the monitor code. > >This is not really a major issue, but I prefer to avoid these 'shortcuts' > >and keep the monitor dumber wrt frontend or provider specific features. > > Everything I considered unrelated to monitor is removed in the new > patch, but it still needs to know the NSS process name (nsssrv.h), to > be able to send it the request for cache clean up. It would not need it > if we had a single NSS specific function to handle the logrotate > request, which would do both logrotation and memcache clean up > (I had it this way before and I did not like it myself). I thing if > monitor knows that on SIGHUP (sent to monitor) all processes rotate > logs, it should also know that NSS needs to clear caches. >
I think "leaking" the service name is OK or perhaps the lesser of all evils. I remember we even had a check in the past that would test if the list of configured services contains "nss" and "pam". > > > >Other nitpicks: in the current code if you get a fatal error do not do > >anything if the stat fails with an error different from ENOENT. I think > >the SIGHUP should always be sent (as I said above) and the failure should > >only be logged, not be fatal (of course this is not a problem if we move > >back checks into sssd_nss). > > I made the first possible unlink() error (the one that can happen in > nss_process_init) not fatal. I think sssd should at least try to > continue even if the caches might be removed on next log rotation > request. > > New patches are in attachment. > > Thanks > Michal > The code works as advertized. I only have some minor nitpicks style-wise: +static int nss_clear_memcache(DBusMessage *message, + struct sbus_connection *conn) Please align the line after the linebreak. + struct nss_ctx *nctx = (struct nss_ctx*)rctx->pvt_ctx; (and elsewhere) Please put whitespace after the cast. sss_mmap_cache_reinit: We have a convention to call our "local" talloc contexts tmp_ctx. We usually also allocate them on NULL so that a run with valgrind would immediatelly reveal them as leaks[*] I'm not sure about this check: + ret = talloc_free(*mc_ctx); Is there actually a descructor? Wouldn't we want to continue and try to init the cache again instead of just failing? Why the whitespace change in server_setup() ? Otherwise I'm fine with the patches. Good work, Michal! _______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel