On Wed, 2009-11-18 at 11:31 +0100, Martin Nagy wrote: > On Mon, 2009-11-16 at 17:51 -0500, Simo Sorce wrote: > > On Mon, 2009-11-16 at 17:08 +0100, Martin Nagy wrote: > > > Simo Sorce wrote: > > > > On Mon, 2009-11-16 at 08:46 +0100, Martin Nagy wrote: > > > > > Simo Sorce wrote: > > > > > > While working on a patch to use failover in the ldap driver I found > > > > > > a > > > > > > few bugs and a few things I felt missing. > > > > > > > > > > > > Attached a patch to fix bugs and add a function to get back a server > > > > > > name from a fo_server structure. > > > > > > > > > > > > Simo. > > > > > > > > > > > > P.S: if you are interested in the failover patch I have a working > > > > > > but > > > > > > unfinished patch here: > > > > > > http://fedorapeople.org/gitweb?p=simo/public_git/sssd.git;a=commit;h=29830482f5e44b5425fb91f82fd5a4ee692f3ae2 > > > > > > > > > > Just one thing. I was counting on the fact that the user of the fail > > > > > over will make sure that he doesn't try to create the same service > > > > > twice and doing so will mean a bug in the code. Looking at your patch > > > > > and having done a very quick look at your branch, it seems like you > > > > > count for this behavior (which I think is fairly reasonable). In that > > > > > case, could you also please change this debugging statement? > > > > > > > > > > DEBUG(1, ("Service %s already exists\n", name)); > > > > > > > > > > Please either completely remove it, or (IMHO better) change the debug > > > > > level. > > > > > > > > I will probably raise the debug level, but it's not a big deal for now. > > > > > > > > > Regarding the ares error code, I'm not 100% sure what's the right > > > > > thing to do. Perhaps we should use a constant, like EIO and count on > > > > > the logging statement few lines up to be sufficient for the user. Any > > > > > other ideas? > > > > > > > > Not sure yet, what we need from callers are 3 type of errors. > > > > 1. an error that tells us that no server is resolvable/available (to > > > > avoid looping forever) > > > > 2. an error that tells us the current server is not resolvable/available > > > > 3. any other error should mean something really bad happened and we > > > > should abort. > > > > > > > > We miss n.2 for now so I am treating any error except ENOENT to mean the > > > > current server is not resolvable and we need to try the next one. > > > > > > > > Simo. > > > > > > OK. Nothing actually wrong with the patch, so ack. > > > > Ok I found another bug, so I am re-spinning the patch. > > > > > > After the SERVER_NOT_WORKING timeout for a server was expired, we were > > setting the status to SERVER_NAME_RESOLVED > > > > This is wrong for 2 reasons: > > 1. we may have marked the server as SERVER_NOT_WORKING because we could > > not resolve the name, this means the name was not resolved at all, by > > setting it as SERVER_NAME_RESOLVED we have no way to force a new lookup, > > we will basically never be able to lookup the server again. > > > > 2. If a sever becomes unavailable because the server has been moved, we > > would never have a chance to resolve it to the new address again because > > setting the server to SERVER_NOT_WORKING status wouldn't work (see point > > 1). > > > > Even if the server was correctly resolved and SERVER_NOT_WORKING was set > > because of other reasons (a service stopped working for example), we > > want to try to resolve the name again just in case. > > > > Simo. > > Nack. I think you forgot to git add fail_over.h, > SERVER_NAME_NOT_RESOLVED is not defined in it.
Sorry, for some reason I didn't notice that SERVER_NAME_NOT_RESOLVED already *is* in fail_over.h. Ack. Martin _______________________________________________ sssd-devel mailing list [email protected] https://fedorahosted.org/mailman/listinfo/sssd-devel
