Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
I recently came across the nfsidmap -c option. I haven't thoroughly tried to reproduce the problem, but nslcd 0.9.1-1 in experimental has an option to flush various caches. You could put

    reconnect_invalidate nfsidmap

in nslcd.conf. I'm not 100% sure if this fixes the problem, but can you reproduce the problem with that option set?

Thanks,

-- arthur - adej...@debian.org - http://people.debian.org/~adejong --
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
Hi Arthur,

On Tue, Oct 14, 2008 at 12:18:11AM +0200, Arthur de Jong wrote:
> nfs-utils-1.1.3/utils/idmapd/idmapd.c:674:
>     /* XXX: I don't like ignoring this error in the id-name case,
>      * but we've never returned it, and I need to check that the client
>      * can handle it gracefully before starting to return it now. */
>     if (im.im_status == IDMAP_STATUS_LOOKUPFAIL)
>         im.im_status = IDMAP_STATUS_SUCCESS;

That does not seem to be the root of the problem. I've built nfs-utils with these lines commented out on one of my systems and disabled the workaround in idmapd, and the problem persists.

> That means that I think the only way to fix this in the short term is
> to remove the LOOKUPFAIL to SUCCESS mangling from idmapd.c (which could
> have other side effects) or to apply the workaround as described before.

Hmm. Probably the workaround should then be included in the default configuration of idmapd. It seems not to cause any harm and works around these problems, and IMHO it's unlikely that this can be fixed *properly* for lenny. What do you think about this approach? Shall we ask the NFS maintainers about this change to the default configuration?

Best Regards,
Patrick
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
On Tue, 14 Oct 2008, Patrick Schoenfeld wrote:
> That does not seem to be the root of the problem. I've built nfs-utils
> with these lines commented out on one of my systems and disabled the
> workaround in idmapd and the problem persists.

Thanks for investigating this. Another thought occurred to me: the kernel could be caching the contents of the directory at another level (e.g. it could cache the directory information without ever hitting any idmap code until that cache is expired).

>> That means that I think the only way to fix this in the short term is
>> to remove the LOOKUPFAIL to SUCCESS mangling from idmapd.c (which could
>> have other side effects) or to apply the workaround as described before.
>
> Hmm. Probably the workaround should then be included in the default
> configuration of idmapd. It seems not to cause any harm and works around
> these problems and IMHO it's unlikely that this can be fixed *properly*
> for lenny. What do you think about this approach? Shall we ask the NFS
> maintainers about this change to the default configuration?

If the NFS maintainers think this does not cause problems, then I think this will be the best solution for the short term. The only downside that I can think of is that there might be some reduced performance because the name-to-id lookups need to be done more frequently. Can you open a new bug report on nfs-utils?

For the longer term the kernel should probably provide a mechanism to flush the idmap cache.

-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
retitle 500778 nss-ldapd: problem resolving groups and users with nfs4
severity 500778 important
tags 500778 + help
thanks

On Mon, 2008-10-06 at 11:42 +0200, Patrick Schoenfeld wrote:
> 2008/10/3 Arthur de Jong [EMAIL PROTECTED]:
> > Patrick, does adding Cache-Expiration = 10 to /etc/idmapd.conf in the
> > [General] section help at all in your setup? (the correct values
> > should be loaded sooner)
>
> very good. This betters the situation a lot. It's a good workaround. Now
> if you'd find the reason why the behaviour differs from libnss-ldap and
> could enhance libnss-ldapd in this way, this would be great :-))

I am lowering the severity of this bug for now because the problem is limited to using nss-ldapd in combination with nfs4 and there is a workaround (adding Cache-Expiration = 10 to /etc/idmapd.conf).

I will try to investigate this some more, but help with this is appreciated.

-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --
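The workaround above (setting Cache-Expiration in the [General] section of /etc/idmapd.conf) can be illustrated with a small Python sketch. It treats the file as a generic INI document and only works on a string; the function name `set_cache_expiration` is hypothetical, and the sample section and option names other than Cache-Expiration are assumptions for the demonstration, not a statement of what a real idmapd.conf contains.

```python
import configparser
import io


def set_cache_expiration(conf_text: str, seconds: int = 10) -> str:
    """Return conf_text with Cache-Expiration set in the [General] section.

    Hypothetical helper: it parses the text as plain INI and re-serialises it,
    so comments in the input are not preserved.
    """
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names such as "Cache-Expiration" exactly as written.
    parser.optionxform = str
    parser.read_string(conf_text)
    if not parser.has_section("General"):
        parser.add_section("General")
    parser.set("General", "Cache-Expiration", str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Assumed sample content, for illustration only.
    sample = "[General]\nVerbosity = 0\nDomain = localdomain\n"
    print(set_cache_expiration(sample, 10))
```

Because the round trip through configparser drops comments, a real deployment would more likely edit the file by hand; the sketch only shows where the option goes.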
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
On Mon, 2008-10-13 at 22:17 +0200, Arthur de Jong wrote:
> I will try to investigate this some more but help is appreciated with
> this.

I have been able to reproduce the same behaviour with nss_ldap. If you freshly mount a filesystem while the LDAP server is unavailable, the kernel will not re-ask idmapd to look up the usernames until the timeout has expired.

I have dug a little through the code (nfs-utils, libnfsidmap and kernel) and from what I understand, the kernel should not cache negative lookups. But idmapd seems to map IDMAP_STATUS_LOOKUPFAIL to IDMAP_STATUS_SUCCESS, which causes the kernel to remember the mapping. This is done in:

nfs-utils-1.1.3/utils/idmapd/idmapd.c:674:
    /* XXX: I don't like ignoring this error in the id-name case,
     * but we've never returned it, and I need to check that the client
     * can handle it gracefully before starting to return it now. */
    if (im.im_status == IDMAP_STATUS_LOOKUPFAIL)
        im.im_status = IDMAP_STATUS_SUCCESS;

I'm not sure who made the comment or whether it is still valid. If this is fixed, negative entries would not be cached at all (except by nscd if it is enabled, but the kernel would ask idmapd, which would ask nscd).

From looking through the kernel code (fs/nfs/idmap.c), there is no way to flush the cache. Also, the value of /proc/sys/fs/nfs/idmap_cache_timeout at the time the cache entry was created is used, so there is no use in lowering the value after the fact.

That means that I think the only way to fix this in the short term is to remove the LOOKUPFAIL to SUCCESS mangling from idmapd.c (which could have other side effects) or to apply the workaround as described before.

Note that I have only read code and not done extensive debugging by deploying modified versions of either the kernel or idmapd.

One thing that remains a little puzzling in the kernel code is the cache retry. I can't explain the strange timeout if you set the cache value really low, like 1 jiffy. Then again, I don't know enough about jiffies and kernel internals to go hunting that problem anyway.

What nss-ldapd could do is document that the Cache-Expiration option should be set. Perhaps a check could be implemented with a debconf note during package installation.

Another option would be to start nslcd before nfs-common. This, however, would probably break an environment where /usr is mounted over NFS. It would also cause problems because it is best to start nslcd after slapd.

-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --
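To make the effect of the LOOKUPFAIL to SUCCESS mangling easier to see, here is a toy Python model of a name-to-id cache with a fixed timeout. It is not the kernel or idmapd code; `ToyIdmapCache`, the `cache_failures` flag and the injected `lookup` callable are all hypothetical names for illustration. With `cache_failures=True` a failed lookup is remembered as the nobody id until the entry expires, which mirrors the behaviour described above; with `cache_failures=False` the next request asks again.

```python
from typing import Callable, Dict, Optional, Tuple

NOBODY = 65534  # id that unresolved names are mapped to in this thread


class ToyIdmapCache:
    """Toy name -> id cache with a per-entry timeout (illustration only)."""

    def __init__(self, lookup: Callable[[str], Optional[int]],
                 timeout: float, cache_failures: bool) -> None:
        self.lookup = lookup                # returns an id, or None on failure
        self.timeout = timeout              # lifetime of a cache entry
        self.cache_failures = cache_failures
        self.entries: Dict[str, Tuple[int, float]] = {}

    def resolve(self, name: str, now: float) -> int:
        entry = self.entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]                 # still-valid cached answer
        uid = self.lookup(name)
        if uid is None:
            if self.cache_failures:
                # Failure reported as success: the nobody id gets cached.
                self.entries[name] = (NOBODY, now + self.timeout)
            return NOBODY
        self.entries[name] = (uid, now + self.timeout)
        return uid


if __name__ == "__main__":
    directory: Dict[str, int] = {}          # empty: "LDAP not reachable yet"
    mangling = ToyIdmapCache(directory.get, 600, cache_failures=True)
    strict = ToyIdmapCache(directory.get, 600, cache_failures=False)
    print(mangling.resolve("arthur", 0), strict.resolve("arthur", 0))
    directory["arthur"] = 1000              # lookups start working
    print(mangling.resolve("arthur", 5), strict.resolve("arthur", 5))
```

The model deliberately passes the clock in as `now` so the expiry behaviour can be stepped through by hand.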
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
Hi,

2008/10/3 Arthur de Jong [EMAIL PROTECTED]:
> Patrick, does adding Cache-Expiration = 10 to /etc/idmapd.conf in the
> [General] section help at all in your setup? (the correct values should
> be loaded sooner)

very good. This betters the situation a lot. It's a good workaround. Now if you'd find the reason why the behaviour differs from libnss-ldap and could enhance libnss-ldapd in this way, this would be great :-))

Best Regards,
Patrick
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
Hi,

On Fri, Oct 03, 2008 at 12:18:47AM +0200, Arthur de Jong wrote:
> With nfs4 (I've been doing some reading up but still no first-hand
> experience), if the user doesn't exist it is generally mapped to
> nobody:nogroup.

right.

> The mapping is done by idmapd but at some point in combination with
> something in the kernel. From what I understand from scanning the idmapd
> code, there is a default cache expiry time (in the kernel) of 600
> seconds (10 minutes). Current value should be available in
> /proc/sys/fs/nfs/idmap_cache_timeout.
>
> My guess is that name lookups are cached in idmapd. Can you check
> whether restarting idmapd (/etc/init.d/nfs-common restart) makes the
> problem go away?

Nope, it does not.

> Can you check the idmapd logs for anything out of the ordinary? Perhaps
> you can increase the verbosity in /etc/idmapd.conf.

Hm, no, nothing special. Setting the verbosity higher than the default (default: 3, tried up to 10) does not seem to change anything. Basically this is all:

Oct 3 09:46:36 teekanne rpc.idmapd[3309]: libnfsidmap: using domain: localdomain
Oct 3 09:46:36 teekanne rpc.idmapd[3309]: libnfsidmap: using translation method: nsswitch
Oct 3 09:46:36 teekanne rpc.idmapd[3310]: Expiration time is 600 seconds.
Oct 3 09:46:36 teekanne rpc.idmapd[3310]: Opened /proc/net/rpc/nfs4.nametoid/channel
Oct 3 09:46:36 teekanne rpc.idmapd[3310]: Opened /proc/net/rpc/nfs4.idtoname/channel
Oct 3 09:46:36 teekanne rpc.idmapd[3310]: New client: 0
Oct 3 09:46:36 teekanne rpc.idmapd[3310]: Opened /var/lib/nfs/rpc_pipefs/nfs/clnt0/idmap
Oct 3 09:46:36 teekanne rpc.idmapd[3310]: New client: 1
Oct 3 09:47:23 teekanne rpc.idmapd[3310]: Client 0: (user) id 30010 - name [EMAIL PROTECTED]
Oct 3 09:47:23 teekanne rpc.idmapd[3310]: Client 0: (group) id 65534 - name [EMAIL PROTECTED]

> Thanks. Perhaps I should set up a test environment myself with NFS4. Do
> you have some pointers for that (I use NFS3 myself).

That's not a big deal. You need to set up an export entry like you do for NFSv3; however, there is a fundamental difference from NFSv3: you export an NFS root, not single exports. So you possibly want to set up a virtual export directory. It's described here [1].

Best Regards,
Patrick

[1] http://www.crazysquirrel.com/computing/debian/servers/setting-up-nfs4.jspx
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
(Cc-ing the nfs-utils maintainers, perhaps they have some insight that could solve this)

On Sat, 2008-10-04 at 09:52 +0200, Patrick Schoenfeld wrote:
> > My guess is that name lookups are cached in idmapd. Can you check that
> > by restarting idmapd (/etc/init.d/nfs-common restart) the problem goes
> > away?
>
> Nope, it does not.

I have been able to reproduce this. On the server I have in /etc/exports (/export/newhome is a bind-mounted /home with half a dozen users):

/export          192.168.1.0/24(ro,sync,insecure,root_squash,no_subtree_check,fsid=0)
/export/newhome  192.168.1.0/24(rw,nohide,sync,insecure,root_squash,no_subtree_check)

On the client I have in /etc/fstab:

fs:/newhome  /mnt  nfs4  rw  0  0

Now if I stop nslcd (all name lookup calls should now return NSS_STATUS_UNAVAIL/ENOENT), an 'ls -l /mnt' shows:

[...]
drwx-x 148 nobody users 12288 Oct 3 21:02 arthur
[...]

(the user arthur from the server is mapped to the user nobody on the client because the name lookup failed). With some more verbose logging rpc.idmapd shows:

[...]
rpc.idmapd: nfs4_name_to_uid: calling nsswitch-name_to_uid
rpc.idmapd: nss_getpwnam: name '[EMAIL PROTECTED]' domain 'localdomain': resulting localname 'arthur'
rpc.idmapd: nss_getpwnam: name 'arthur' not found in domain 'localdomain'
rpc.idmapd: nfs4_name_to_uid: nsswitch-name_to_uid returned -2
rpc.idmapd: nfs4_name_to_uid: final return value is -2
rpc.idmapd: Client 16: (user) name [EMAIL PROTECTED] - id 65534
[...]

If I repeat the ls command a couple of times, rpc.idmapd no longer logs the failed lookups, and an strace of rpc.idmapd also shows that the process is no longer asked (by the kernel?) to look up the user.

If I then start nslcd (now name lookups should be performed as usual, and getent shows that they are), the results aren't quickly fixed. After a while (I've been messing about with stuff in /proc, so I don't know how long this normally takes) the kernel asks rpc.idmapd again to look up user arthur (and the other users in the filesystem). Also note that the bug reporter had problems with groups and I've reproduced the behaviour with users.

[...]
drwx-x 148 arthur users 12288 Oct 3 21:02 /mnt/arthur
[...]

Now the question is how this caching mechanism should be tuned and how we should solve this problem. Is there a reliable way to flush the cache?

There seems to be /proc/net/rpc/nfs4.nametoid, which contains some stuff that could be relevant, and /proc/sys/fs/nfs/idmap_cache_timeout.

However, setting /proc/sys/fs/nfs/idmap_cache_timeout or Cache-Expiration does not result in the expected timeout in seconds (as read from idmapd.c). Setting it to 10 results in a retry every 30 to 60 seconds; setting it to 100 seems to result in a retry in 60-120 seconds.

Also, writing to /proc/net/rpc/nfs4.idtoname/flush and /proc/net/rpc/nfs4.nametoid/flush (as is done in flush_nfsd_idmap_cache()) doesn't seem to make a difference.

I haven't had a look at the kernel code yet (this is running kernel Linux 2.6.26-1-686 (SMP w/2 CPU cores)).

Patrick, does adding Cache-Expiration = 10 to /etc/idmapd.conf in the [General] section help at all in your setup? (the correct values should be loaded sooner)

-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --
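For anyone repeating the experiment above, a small Python sketch for inspecting the timeout value mentioned in this message. The path is the one named in the thread; whether it exists depends on the kernel in use, and the function name `read_idmap_cache_timeout` is hypothetical. The sketch returns the raw number only and makes no claim about its unit, since the observed retry intervals did not match the configured value.

```python
from pathlib import Path
from typing import Optional

# Path as named in the thread; it may be absent on other kernels.
IDMAP_TIMEOUT_PATH = "/proc/sys/fs/nfs/idmap_cache_timeout"


def read_idmap_cache_timeout(path: str = IDMAP_TIMEOUT_PATH) -> Optional[int]:
    """Return the raw integer stored at path, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file, permission problem, or unexpected content.
        return None


if __name__ == "__main__":
    value = read_idmap_cache_timeout()
    if value is None:
        print("idmap_cache_timeout is not available on this system")
    else:
        print(f"idmap_cache_timeout = {value}")
```

Taking the path as a parameter keeps the helper usable on a copy of the file, for example when comparing values collected from several clients.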
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
Hi,

On Wed, Oct 01, 2008 at 10:27:04PM +0200, Arthur de Jong wrote:
> Can you produce logs of nslcd? It should report whether the LDAP server
> was reachable or not. If you can run nslcd with the -d option it should
> report more information that will help in tracking this down.

attached is a log, while the problem exists.

[EMAIL PROTECTED] ~ % ls -l test
-rw-rw-r-- 1 schoenfeld nogroup 0 12. Sep 09:49 test

Interesting enough: The symptom is similar to the system behaviour, if nslcd is _not_ running. Then all files resolve to nobody:nogroup. However there is no problem visible from the log.

Best Regards,
Patrick

nslcd: DEBUG: add_uri(ldap://majestix-linux.intra.in-medias-res.com)
nslcd: version 0.6.5 starting
nslcd: DEBUG: setgroups(0,NULL) done
nslcd: DEBUG: setgid(121) done
nslcd: DEBUG: setuid(117) done
nslcd: accepting connections
nslcd: [8b4567] DEBUG: connection from pid=3187 uid=30010 gid=1
nslcd: [8b4567] DEBUG: nslcd_passwd_byuid(30010)
nslcd: [8b4567] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixAccount)(uidNumber=30010)))
nslcd: [8b4567] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [8b4567] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [8b4567] DEBUG: ldap_result(): end of results
nslcd: [7b23c6] DEBUG: connection from pid=3188 uid=0 gid=1
nslcd: [7b23c6] DEBUG: nslcd_passwd_byuid(30010)
nslcd: [7b23c6] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixAccount)(uidNumber=30010)))
nslcd: [7b23c6] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [7b23c6] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [7b23c6] DEBUG: ldap_result(): end of results
nslcd: [3c9869] DEBUG: connection from pid=3188 uid=0 gid=1
nslcd: [3c9869] DEBUG: nslcd_group_bymember(root)
nslcd: [3c9869] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixAccount)(uid=root)))
nslcd: [3c9869] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [3c9869] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [3c9869] DEBUG: ldap_result(): end of results
nslcd: [3c9869] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixGroup)(memberUid=root)))
nslcd: [3c9869] DEBUG: ldap_result(): end of results
nslcd: [334873] DEBUG: connection from pid=3188 uid=0 gid=0
nslcd: [334873] DEBUG: nslcd_group_byname(Operations)
nslcd: [334873] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixGroup)(cn=Operations)))
nslcd: [334873] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [334873] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [334873] DEBUG: ldap_result(): end of results
nslcd: [b0dc51] DEBUG: connection from pid=3194 uid=0 gid=0
nslcd: [b0dc51] DEBUG: nslcd_group_bymember(root)
nslcd: [b0dc51] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixAccount)(uid=root)))
nslcd: [b0dc51] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [b0dc51] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [b0dc51] DEBUG: ldap_result(): end of results
nslcd: [b0dc51] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixGroup)(memberUid=root)))
nslcd: [b0dc51] DEBUG: ldap_result(): end of results
nslcd: [495cff] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [495cff] DEBUG: nslcd_passwd_byuid(30010)
nslcd: [495cff] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixAccount)(uidNumber=30010)))
nslcd: [495cff] DEBUG: ldap_result(): end of results
nslcd: [e8944a] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [e8944a] DEBUG: nslcd_group_bygid(1000)
nslcd: [e8944a] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixGroup)(gidNumber=1000)))
nslcd: [e8944a] DEBUG: ldap_result(): end of results
nslcd: [5558ec] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [5558ec] DEBUG: nslcd_passwd_byuid(-2)
nslcd: [5558ec] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixAccount)(uidNumber=-2)))
nslcd: [5558ec] DEBUG: ldap_result(): end of results
nslcd: [8e1f29] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [8e1f29] DEBUG: nslcd_group_bygid(-2)
nslcd: [8e1f29] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixGroup)(gidNumber=-2)))
nslcd: [8e1f29] DEBUG: ldap_result(): end of results
nslcd: [e87ccd] DEBUG: connection from pid=3204 uid=0 gid=0
nslcd: [e87ccd] DEBUG: nslcd_passwd_byname(schoenfeld)
nslcd: [e87ccd] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=((objectClass=posixAccount)(uid=schoenfeld)))
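A debug log like the one above is easier to scan once the requests are grouped per connection. The following Python sketch does that with a regular expression matched against the entry format seen in this log; the pattern and the function name `requests_by_connection` are assumptions based only on these lines, not on any documented nslcd log format.

```python
import re
from typing import Dict, List

# Matches entries such as "nslcd: [8b4567] DEBUG: nslcd_passwd_byuid(30010)".
# The pattern is inferred from the log in this thread only.
REQUEST_RE = re.compile(r"nslcd: \[([0-9a-f]+)\] DEBUG: (nslcd_\w+)\(([^)]*)\)")


def requests_by_connection(log_text: str) -> Dict[str, List[str]]:
    """Map each connection id to the nslcd_* requests logged for it."""
    result: Dict[str, List[str]] = {}
    for conn_id, func, arg in REQUEST_RE.findall(log_text):
        result.setdefault(conn_id, []).append(f"{func}({arg})")
    return result


if __name__ == "__main__":
    sample = (
        "nslcd: [8b4567] DEBUG: connection from pid=3187 uid=30010 gid=1\n"
        "nslcd: [8b4567] DEBUG: nslcd_passwd_byuid(30010)\n"
        "nslcd: [5558ec] DEBUG: nslcd_passwd_byuid(-2)\n"
        "nslcd: [8e1f29] DEBUG: nslcd_group_bygid(-2)\n"
    )
    for conn, calls in requests_by_connection(sample).items():
        print(conn, ", ".join(calls))
```

Run over the full log, this makes the lookups for id -2 stand out next to the ordinary by-uid and by-name requests.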
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
On Thu, 2008-10-02 at 10:28 +0200, Patrick Schoenfeld wrote:
> attached is a log, while the problem exists.
>
> [EMAIL PROTECTED] ~ % ls -l test
> -rw-rw-r-- 1 schoenfeld nogroup 0 12. Sep 09:49 test
>
> Interesting enough: The symptom is similar to the system behaviour, if
> nslcd is _not_ running. Then all files resolve to nobody:nogroup.

With nfs4 (I've been doing some reading up but still have no first-hand experience), if the user doesn't exist it is generally mapped to nobody:nogroup.

The mapping is done by idmapd but at some point in combination with something in the kernel. From what I understand from scanning the idmapd code, there is a default cache expiry time (in the kernel) of 600 seconds (10 minutes). The current value should be available in /proc/sys/fs/nfs/idmap_cache_timeout.

My guess is that name lookups are cached in idmapd. Can you check whether restarting idmapd (/etc/init.d/nfs-common restart) makes the problem go away?

On my system, idmapd is started way before nslcd, and it probably isn't a good idea to start it before idmapd. There seems to be an undocumented Cache-Expiration option in the General section of /etc/idmapd.conf that could help to bring down the cache timeout value.

Can you check the idmapd logs for anything out of the ordinary? Perhaps you can increase the verbosity in /etc/idmapd.conf.

Thanks. Perhaps I should set up a test environment myself with NFS4. Do you have some pointers for that (I use NFS3 myself)?

-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
Package: libnss-ldapd
Severity: serious
Version: 0.6.5

Hi,

since we use libnss-ldapd we have a problem that is quite serious for us, as it effectively affects login and group ACLs. However, we couldn't yet track down this issue to a specific component, therefore we didn't report it yet.

The setup:
Our setup is a mixed Windows/Linux environment with an LDAP server for central authentication. Linux clients use libnss-ldapd for resolution of usernames and groups.

The problem:
After a reboot of the Linux clients they are unable to resolve groups and sometimes are also unable to resolve users. The result is that files are owned by [nobody]:nogroup, while getent passwd and getent group show the right result. In consequence people are unable to properly log in (because desktop environments need read permissions on their settings ;) and user permissions are broken.

After 10-30 minutes of running the problem disappears. This makes me think that some timeout occurs, but I can't tell which. I thought it's probably somehow related to the udev resolution issues that are handled differently in libnss-ldapd than in libnss-ldap, which produces a significant delay when booting because groups can't be resolved while LDAP is not accessible, and which is handled gracefully by libnss-ldapd. Maybe you gather invalid results while booting, because LDAP is not accessible. But I don't see why nslcd should cache these results, so I think my idea is absurd.

The problem is reproducible with or without nscd running, so the problem is not related to it. The problem seems not to be related to the groups which contain spaces, except that it spams the log every second with error messages unless my patch is applied.

The problem does not occur with libnss-ldap, so the problem is specific to libnss-ldapd.

I've chosen severity serious for this issue because on the one hand the problem would fit severity 'critical', because it makes unrelated software on the system (or the whole system) break, but then again I felt uncomfortable with it, because the problem does not persist over the uptime of the system and after 10-30 minutes the problem disappears. But I think it should definitely be fixed for lenny.

Best Regards,
Patrick
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
On Wed, 2008-10-01 at 13:11 +0200, Patrick Schoenfeld wrote:
> Our setup is a mixed Windows/Linux environment with an LDAP server for
> central authentication. Linux clients use libnss-ldapd for resolution of
> usernames and groups.

Could you provide some more details? Is the LDAP server on the system that also runs nss-ldapd, what options do you use, which LDAP server software etc? Your configuration file should also help.

> After a reboot of the Linux clients they are unable to resolve groups
> and sometimes are also unable to resolve users. The result is that files
> are owned by [nobody]:nogroup, while getent passwd and getent group show
> the right result.

I don't understand this. If you perform getent passwd and getent group you get the expected result, but if you do ls -l the files are reported as nobody:nogroup? If ls can't resolve numeric user and group ids it should print the numeric form, not make up something.

Can you produce logs of nslcd? It should report whether the LDAP server was reachable or not. If you can run nslcd with the -d option it should report more information that will help in tracking this down.

> In consequence people are unable to properly log in (because desktop
> environments need read permissions on their settings ;) and user
> permissions are broken.

Note that for logging in you also need pam_ldap, which has its own configuration. If the problem is in that, you should probably also provide information about that.

> After 10-30 minutes of running the problem disappears. This makes me
> think that some timeout occurs, but I can't tell which. I thought it's
> probably somehow related to the udev resolution issues that are handled
> differently in libnss-ldapd than in libnss-ldap, which produces a
> significant delay when booting because groups can't be resolved while
> LDAP is not accessible, and which is handled gracefully by libnss-ldapd.
> Maybe you gather invalid results while booting, because LDAP is not
> accessible. But I don't see why nslcd should cache these results so I
> think my idea is absurd.

nslcd only caches the relationship between DNs and uids for group membership lookups (when the uniqueMember attribute is used). This timeout is hardcoded at 15 minutes. Other than that I can't think of a timeout that long unless you set it that high in the config.

The way nss-ldapd solves the udev problem is by not doing LDAP lookups that early during boot at all and failing quickly. Only when nslcd is started are lookups attempted. In any case I can't think of a case where getent passwd should work and ls would fail.

One known issue (#475626) is related to the order in which nslcd is started during boot. If the LDAP server is unavailable when nslcd is started, a timeout could occur and the LDAP server will not be found immediately when it becomes available.

> I've chosen severity serious for this issue because on the one hand the
> problem would fit severity 'critical', because it makes unrelated
> software on the system (or the whole system) break, but then again I
> felt uncomfortable with it, because the problem does not persist over
> the uptime of the system and after 10-30 minutes the problem disappears.

I am inclined to lower it to important because it seems to work in a lot of common environments.

> But I think it should definitely be fixed for lenny.

I hope to fix this soon. Thanks for your bug report.

-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --
Bug#500778: libnss-ldapd: groups resolve to nogroup after boot
Hi Arthur,

On Wed, Oct 01, 2008 at 10:27:04PM +0200, Arthur de Jong wrote:
> On Wed, 2008-10-01 at 13:11 +0200, Patrick Schoenfeld wrote:
> > Our setup is a mixed Windows/Linux environment with an LDAP server for
> > central authentication. Linux clients use libnss-ldapd for resolution
> > of usernames and groups.
>
> Could you provide some more details?

Yep, I can. I'm just unsure which information is of interest (I'm at a point where I'm kinda clueless what's the cause of the trouble :/).

> Is the LDAP server on the system that also runs nss-ldapd, what options
> do you use,

No, it runs on another host. I don't use any special options. In fact the configuration is the default configuration, except for the server address and the search base.

[EMAIL PROTECTED]:~# grep -v '\(^#\|^$\)' /etc/nss-ldapd.conf
uri ldap://majestix-linux.intra.in-medias-res.com
base dc=intra,dc=in-medias-res,dc=com
uid nslcd
gid nslcd

> which LDAP server software etc? Your configuration file should also
> help.

The LDAP server is a usual slapd as it is in Etch: slapd (2.3.30-5+etch1)

> > After a reboot of the Linux clients they are unable to resolve groups
> > and sometimes are also unable to resolve users. The result is that
> > files are owned by [nobody]:nogroup, while getent passwd and getent
> > group show the right result.
>
> I don't understand this. If you perform getent passwd and getent group
> you get the expected result but if you do ls -l the files are reported
> as nobody:nogroup?

Right. Sometimes all files are owned by nobody:nogroup, but the most common problem is that only groups are a problem. And yes, while the problem exists getent passwd and getent group show up groups properly.

> If ls can't resolve numeric user and group ids it should print the
> numeric form, not make up something.

Well, I think this is related to the fact that it is an NFSv4 filesystem. nobody:nogroup is what idmapd from NFS does if it cannot properly resolve the ids.

> Can you produce logs of nslcd? It should report whether the LDAP server
> was reachable or not. If you can run nslcd with the -d option it should
> report more information that will help in tracking this down.

OK. I will add these logs ASAP.

> > In consequence people are unable to properly log in (because desktop
> > environments need read permissions on their settings ;) and user
> > permissions are broken.
>
> Note that for logging in you also need pam_ldap, which has its own
> configuration. If the problem is in that you should probably also
> provide information about that.

Well, the problem is not the login per se, but that some programs (for example GNOME) simply do not work, because they can't read their settings (if the nobody problem exists as well; if the groups are the only problem, then only accessing shared files is a problem).

> nslcd only caches the relationship between DNs and uids for group
> membership lookups (when the uniqueMember attribute is used). This
> timeout is hardcoded at 15 minutes. Other than that I can't think of a
> timeout that long unless you set it that high in the config.

I would have said at first that 15 minutes could be the time frame, but then again: no. Today I saw the problem disappearing after more than half an hour.

> The way nss-ldapd solves the udev problem is by not doing LDAP lookups
> that early during boot at all and failing quickly. Only when nslcd is
> started are lookups attempted. In any case I can't think of a case where
> getent passwd should work and ls would fail.

Well, sounds reasonable, and I don't see why this should cause the problems.

> I am inclined to lower it to important because it seems to work in a lot
> of common environments.

Well, yes, that's true. But on the other hand it has a serious effect on the functionality of the system as a whole (because it is a client that mounts /home etc. from the server), so I felt serious is a good compromise.

> I hope to fix this soon. Thanks for your bug report.

No bug report, no solution, right? So no need to thank me; instead I thank you if you'd find a solution for it.

Best Regards,
Patrick
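The "getent works but ls -l shows nobody:nogroup" observation discussed above can be checked per file with a short Python sketch. It resolves the numeric owner reported by stat through the local name service and falls back to the numeric form when no name is found, which is the behaviour expected of ls in this thread. The function name `describe_owner` is hypothetical; on an NFSv4 mount the numeric ids returned by stat are whatever the client reports after its own id mapping, so a result of 65534 there would point at the mapping step rather than at NSS.

```python
import grp
import os
import pwd
from typing import Dict, Union


def describe_owner(path: str) -> Dict[str, Union[int, str]]:
    """Return numeric and resolved owner/group of path (Unix only)."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)       # no name known: keep the numeric id
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)      # no name known: keep the numeric id
    return {"uid": st.st_uid, "gid": st.st_gid, "user": user, "group": group}


if __name__ == "__main__":
    info = describe_owner(".")
    print(f"{info['user']}:{info['group']} ({info['uid']}:{info['gid']})")
```

Comparing the numeric ids with the names shows at which layer a wrong owner is introduced: a numeric id of 65534 comes from below NSS, while a correct id with a wrong or missing name would implicate the name service.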