Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2013-08-27 Thread Arthur de Jong
I recently came across the nfsidmap -c option. I haven't thoroughly
tried to reproduce the problem but nslcd 0.9.1-1 in experimental has an
option to flush various caches. You could put

reconnect_invalidate nfsidmap

in nslcd.conf. I'm not 100% sure if this fixes the problem but can you
reproduce the problem with that option set?

Thanks,

-- 
-- arthur - adej...@debian.org - http://people.debian.org/~adejong --


signature.asc
Description: This is a digitally signed message part


Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-14 Thread Patrick Schoenfeld
Hi Arthur,

On Tue, Oct 14, 2008 at 12:18:11AM +0200, Arthur de Jong wrote:
 nfs-utils-1.1.3/utils/idmapd/idmapd.c:674:
 
 /* XXX: I don't like ignoring this error in the id->name case,
  * but we've never returned it, and I need to check that the client
  * can handle it gracefully before starting to return it now. */
 
 if (im.im_status == IDMAP_STATUS_LOOKUPFAIL)
         im.im_status = IDMAP_STATUS_SUCCESS;

That does not seem to be the root of the problem. I've built nfs-utils
with these lines commented out on one of my systems and disabled the
workaround in idmapd and the problem persists.

 That means that I think the only way to fix this in the short term is
 to remove the LOOKUPFAIL to SUCCESS mangling from idmapd.c (which could
 have other side effects) or to apply the workaround as described before.

Hmm. Probably the workaround should then be included in the default
configuration of idmapd. It does not seem to cause any harm and works
around these problems, and IMHO it's unlikely that this can be fixed
*properly* for lenny. What do you think about this approach? Shall we ask
the NFS maintainers about this change to the default configuration?

Best Regards,
Patrick



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-14 Thread Arthur de Jong


On Tue, 14 Oct 2008, Patrick Schoenfeld wrote:
 That does not seem to be the root of the problem. I've built nfs-utils
 with these lines commented out on one of my systems and disabled the
 workaround in idmapd and the problem persists.


Thanks for investigating this. Another thought occurred to me: the
kernel could be caching the contents of the directory at another level
(e.g. it could cache the directory information without ever hitting any
idmap code until that cache has expired).


  That means that I think the only way to fix this in the short term
  is to remove the LOOKUPFAIL to SUCCESS mangling from idmapd.c (which
  could have other side effects) or to apply the workaround as described
  before.


 Hmm. Probably the workaround should then be included in the default
 configuration of idmapd. It does not seem to cause any harm and works
 around these problems, and IMHO it's unlikely that this can be fixed
 *properly* for lenny. What do you think about this approach? Shall we ask
 the NFS maintainers about this change to the default configuration?


If the NFS maintainers think this does not cause problems then I think 
this will be the best solution for the short term. The only downside that 
I can think of is that there might be some reduced performance because the 
name to id lookups need to be done more frequently.


Can you open a new bugreport on nfs-utils?

For the longer term the kernel should probably provide a mechanism to 
flush the idmap cache.


-- 
-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --







Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-13 Thread Arthur de Jong
retitle 500778 nss-ldapd: problem resolving groups and users with nfs4
severity 500778 important
tags 500778 + help
thanks

On Mon, 2008-10-06 at 11:42 +0200, Patrick Schoenfeld wrote:
 2008/10/3 Arthur de Jong [EMAIL PROTECTED]:
  Patrick, does adding Cache-Expiration = 10 to /etc/idmapd.conf in
  the [General] section help at all in your setup? (the correct values
  should be loaded sooner)

 very good. This betters the situation a lot. Its a good workaround.
 Now if you'd find the reason why the behaviour differs from
 libnss-ldap and could enhance libnss-ldapd in this way, this would be
 great :-))

I am lowering the severity of this bug for now because the problem is
limited to using nss-ldapd in combination with NFSv4 and there is a
workaround (adding Cache-Expiration = 10 to /etc/idmapd.conf).

I will try to investigate this some more but help is appreciated with
this.
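
For anyone who wants to roll the workaround out to several clients, here is
a minimal sketch, assuming idmapd.conf can be treated as a plain INI file.
It uses Python's configparser to set Cache-Expiration in the [General]
section of a configuration passed in as text. The function name
set_cache_expiration and the SAMPLE contents are made up for illustration,
and rewriting the file this way drops its comments, so treat it as a
starting point rather than a finished tool.

```python
import configparser
import io


def set_cache_expiration(conf_text: str, seconds: int = 10) -> str:
    """Return conf_text with Cache-Expiration set in the [General] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("General"):
        parser.add_section("General")
    parser.set("General", "Cache-Expiration", str(seconds))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Illustrative contents only, not a copy of any real /etc/idmapd.conf.
SAMPLE = """\
[General]
Verbosity = 0
Domain = localdomain

[Mapping]
Nobody-User = nobody
Nobody-Group = nogroup
"""

if __name__ == "__main__":
    print(set_cache_expiration(SAMPLE, 10))
```

idmapd reads its configuration at startup, so it would presumably need a
restart (/etc/init.d/nfs-common restart) before a changed value is used.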

-- 
-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --




Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-13 Thread Arthur de Jong
On Mon, 2008-10-13 at 22:17 +0200, Arthur de Jong wrote:
 I will try to investigate this some more but help is appreciated with
 this.

I have been able to reproduce the same behaviour with nss_ldap. If you
freshly mount a filesystem while the LDAP server is unavailable the
kernel will not re-ask idmapd to look up the usernames until the timeout
has expired.


I have dug a little through the code (nfs-utils, libnfsidmap and kernel)
and from what I understand the kernel should not cache negative
lookups. But idmapd seems to map IDMAP_STATUS_LOOKUPFAIL to
IDMAP_STATUS_SUCCESS, which causes the kernel to remember the mapping.
This is done in:

nfs-utils-1.1.3/utils/idmapd/idmapd.c:674:

/* XXX: I don't like ignoring this error in the id->name case,
 * but we've never returned it, and I need to check that the client
 * can handle it gracefully before starting to return it now. */

if (im.im_status == IDMAP_STATUS_LOOKUPFAIL)
        im.im_status = IDMAP_STATUS_SUCCESS;

Not sure who made the comment and whether it is still valid. If this
is fixed, negative entries would not be cached at all (except by nscd
if it is enabled, since the kernel would ask idmapd, which would ask
nscd).
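
The suspected effect of that status rewrite can be modelled in a few lines.
This is a toy sketch of the hypothesis described above, not of the real
kernel or idmapd code: every name in it (Status, daemon_lookup, KernelCache)
is hypothetical, and it assumes a cache that remembers only replies marked
SUCCESS.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = 0
    LOOKUPFAIL = 1


NOBODY = 65534  # id handed out when a name cannot be resolved


def daemon_lookup(name, directory, mangle):
    """Model of the user-space reply: returns (status, uid)."""
    if name in directory:
        return Status.SUCCESS, directory[name]
    # With mangle=True a failed lookup is reported as a success.
    status = Status.SUCCESS if mangle else Status.LOOKUPFAIL
    return status, NOBODY


class KernelCache:
    """Toy cache that only remembers replies flagged SUCCESS."""

    def __init__(self):
        self.entries = {}
        self.upcalls = 0  # how often the daemon was asked

    def resolve(self, name, directory, mangle):
        if name in self.entries:
            return self.entries[name]
        self.upcalls += 1
        status, uid = daemon_lookup(name, directory, mangle)
        if status is Status.SUCCESS:
            self.entries[name] = uid
        return uid


if __name__ == "__main__":
    for mangle in (True, False):
        directory = {}  # directory service unreachable: nobody resolves
        cache = KernelCache()
        cache.resolve("arthur", directory, mangle)
        directory["arthur"] = 1000  # directory service comes back
        uid = cache.resolve("arthur", directory, mangle)
        print(f"mangle={mangle}: uid={uid}, upcalls={cache.upcalls}")
```

In this model the mangled reply pins the name to the nobody id until the
entry is dropped, while the unmangled reply is looked up again on the next
access.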

Looking through the kernel code (fs/nfs/idmap.c), there appears to be no
way to flush the cache. Also, the value of
/proc/sys/fs/nfs/idmap_cache_timeout at the time the cache entry was
created is used, so there is no use in lowering the value after the fact.
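
Why lowering the timeout afterwards does not help is easiest to see in a
small model. The sketch below is an illustration only, with a hypothetical
ExpiringCache class: the absolute expiry time is computed once, when the
entry is stored, so a later change to the timeout affects only new entries.

```python
class ExpiringCache:
    """Toy cache whose entries fix their expiry time when they are stored."""

    def __init__(self, timeout):
        self.timeout = timeout  # current setting, in seconds
        self._entries = {}      # key -> (value, absolute expiry time)

    def put(self, key, value, now):
        # The expiry is derived from the timeout in force right now.
        self._entries[key] = (value, now + self.timeout)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._entries[key]
            return None
        return value


if __name__ == "__main__":
    cache = ExpiringCache(timeout=600)
    cache.put("arthur", 65534, now=0)  # bad mapping cached at boot
    cache.timeout = 10                 # lowered after the fact
    print(cache.get("arthur", now=30))   # still the stale value
    print(cache.get("arthur", now=600))  # gone only after the old timeout
```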

That means that I think the only way to fix this in the short term is
to remove the LOOKUPFAIL to SUCCESS mangling from idmapd.c (which could
have other side effects) or to apply the workaround as described before.

Note that I have only read code and not done extensive debugging by
deploying modified versions of either the kernel or idmapd.

One thing in the kernel code that remains a little puzzling is the cache
retry. I can't explain the strange timeout if you set the cache value
really low, like 1 jiffy. Then again, I don't know enough about jiffies
and kernel internals to go hunting that problem anyway.


What nss-ldapd could do is document that the Cache-Expiration option be
set. Perhaps a check could be implemented with a debconf note during
package installation.

Another option would be to start nslcd before nfs-common. This however
would probably break an environment where /usr is mounted over NFS. It
would also cause problems because it is best to start nslcd after slapd.

-- 
-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --




Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-06 Thread Patrick Schönfeld
Hi,

2008/10/3 Arthur de Jong [EMAIL PROTECTED]:
 Patrick, does adding Cache-Expiration = 10 to /etc/idmapd.conf in the
 [General] section help at all in your setup? (the correct values should
 be loaded sooner)

very good. This improves the situation a lot. It's a good workaround.
Now if you could find the reason why the behaviour differs from
libnss-ldap and could enhance libnss-ldapd in this way, this would be
great :-))

Best Regards,
Patrick






Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-03 Thread Patrick Schoenfeld
Hi,

On Fri, Oct 03, 2008 at 12:18:47AM +0200, Arthur de Jong wrote:
 If using nfs4 (I've been doing some reading up but still no first-hand
 experience) is that if the user doesn't exist it is generally mapped to
 nobody:nogroup.

right.

 The mapping is done by idmapd but at some point in combination with
 something in the kernel. From what I understand from scanning the idmapd
 code is that there is a default cache expiry time (in the kernel) of 600
 seconds (10 minutes). Current value should be available
 in /proc/sys/fs/nfs/idmap_cache_timeout.
 
 My guess is that name lookups are cached in idmapd. Can you check that
 by restarting idmapd (/etc/init.d/nfs-common restart) the problem goes
 away?

Nope, it does not.

 Can you check the idmapd logs anything out of the ordinary? Perhaps you
 can increase the verbosity in /etc/idmapd.conf.

Hm, no, nothing special. Setting the verbosity higher than the default
(default: 3, tried up to 10) does not seem to change anything.
Basically this is all:

Oct  3 09:46:36 teekanne rpc.idmapd[3309]: libnfsidmap: using domain: localdomain
Oct  3 09:46:36 teekanne rpc.idmapd[3309]: libnfsidmap: using translation method: nsswitch
Oct  3 09:46:36 teekanne rpc.idmapd[3310]: Expiration time is 600 seconds.
Oct  3 09:46:36 teekanne rpc.idmapd[3310]: Opened /proc/net/rpc/nfs4.nametoid/channel
Oct  3 09:46:36 teekanne rpc.idmapd[3310]: Opened /proc/net/rpc/nfs4.idtoname/channel
Oct  3 09:46:36 teekanne rpc.idmapd[3310]: New client: 0
Oct  3 09:46:36 teekanne rpc.idmapd[3310]: Opened /var/lib/nfs/rpc_pipefs/nfs/clnt0/idmap
Oct  3 09:46:36 teekanne rpc.idmapd[3310]: New client: 1
Oct  3 09:47:23 teekanne rpc.idmapd[3310]: Client 0: (user) id 30010 -> name [EMAIL PROTECTED]
Oct  3 09:47:23 teekanne rpc.idmapd[3310]: Client 0: (group) id 65534 -> name [EMAIL PROTECTED]

 Thanks. Perhaps I should set up a test environment myself with NFS4. Do
 you have some pointers for that (I use NFS3 myself).

That's not a big deal. You need to set up an export entry like you do
for NFSv3, but there is a fundamental difference from NFSv3: you
export a single NFS root, not individual exports. So you probably want to
set up a virtual export directory. It's described here [1].

Best Regards,
Patrick

[1] http://www.crazysquirrel.com/computing/debian/servers/setting-up-nfs4.jspx






Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-03 Thread Arthur de Jong
(Cc-ing the nfs-utils maintainers, perhaps they have some insight that
could solve this)

On Sat, 2008-10-04 at 09:52 +0200, Patrick Schoenfeld wrote:
  My guess is that name lookups are cached in idmapd. Can you check that
  by restarting idmapd (/etc/init.d/nfs-common restart) the problem goes
  away?
 
 Nope, it does not.

I have been able to reproduce this. On the server I have in /etc/exports
(/export/newhome is a bind-mounted /home with half a dozen users):

/export 192.168.1.0/24(ro,sync,insecure,root_squash,no_subtree_check,fsid=0)
/export/newhome 192.168.1.0/24(rw,nohide,sync,insecure,root_squash,no_subtree_check)

On the client I have in /etc/fstab:

fs:/newhome  /mnt  nfs4  rw  0  0

Now if I stop nslcd (all name lookup calls should now return
NSS_STATUS_UNAVAIL/ENOENT) an 'ls -l /mnt' shows:

[...]
drwx-x 148 nobody users 12288 Oct  3 21:02 arthur
[...]

(the user arthur from the server is mapped to the user nobody on the
client because the namelookup failed). With some more verbose logging
rpc.idmapd shows:

[...]
rpc.idmapd: nfs4_name_to_uid: calling nsswitch->name_to_uid
rpc.idmapd: nss_getpwnam: name '[EMAIL PROTECTED]' domain 'localdomain': resulting localname 'arthur'
rpc.idmapd: nss_getpwnam: name 'arthur' not found in domain 'localdomain'
rpc.idmapd: nfs4_name_to_uid: nsswitch->name_to_uid returned -2
rpc.idmapd: nfs4_name_to_uid: final return value is -2
rpc.idmapd: Client 16: (user) name [EMAIL PROTECTED] -> id 65534
[...]

If I repeat the ls command a couple of times rpc.idmapd no longer logs
the failed lookups and a strace of rpc.idmapd also shows that that
process is no longer asked (by the kernel?) to look up the user.

If I then start nslcd (now name lookups should be performed as usual and
getent shows that they do) the results aren't quickly fixed.

After a while (I've been messing about with stuff in /proc so I don't
know how long this normally takes) the kernel asks rpc.idmapd again to
look up user arthur (and the other users in the filesystem). Also note
that the bugreporter had problems with groups and I've reproduced the
behaviour with users.

[...]
drwx-x 148 arthur users 12288 Oct  3 21:02 /mnt/arthur
[...]


Now the question is: how should this caching mechanism be tuned, and how
should we solve this problem? Is there a reliable way to flush the
cache? There seems to be /proc/net/rpc/nfs4.nametoid which contains some
stuff that could be relevant and /proc/sys/fs/nfs/idmap_cache_timeout.

However setting /proc/sys/fs/nfs/idmap_cache_timeout or Cache-Expiration
does not result in the expected timeout in seconds (read from the
idmapd.c). Setting it to 10 results in a retry every 30 to 60 seconds,
setting it to 100 seems to result in a retry in 60-120 seconds. Also,
writing to /proc/net/rpc/nfs4.idtoname/flush
and /proc/net/rpc/nfs4.nametoid/flush (like is done in
flush_nfsd_idmap_cache()) doesn't seem to make a difference.
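
To put numbers on those retry intervals, one could poll a file on the NFSv4
mount and time how long its owner stays mapped to nobody. The following is a
minimal sketch with made-up names (wait_for_mapping, NOBODY_UID); it assumes
nobody has uid 65534 on the client and that repeated stat() calls eventually
reflect the new mapping. NFS attribute caching may add delay of its own, so
the result is an upper bound at best.

```python
import os
import time

NOBODY_UID = 65534  # assumption: uid of "nobody" on the client


def wait_for_mapping(path, nobody_uid=NOBODY_UID, interval=1.0, max_wait=900.0,
                     stat=os.stat, clock=time.monotonic, sleep=time.sleep):
    """Poll path until its owner is no longer nobody_uid.

    Returns the elapsed seconds, or None if max_wait passes first.
    The stat, clock and sleep arguments exist so the loop can be tested
    without a real mount.
    """
    start = clock()
    while True:
        if stat(path).st_uid != nobody_uid:
            return clock() - start
        if clock() - start >= max_wait:
            return None
        sleep(interval)


if __name__ == "__main__":
    # /mnt/arthur is the directory from the reproduction above.
    elapsed = wait_for_mapping("/mnt/arthur", interval=5.0)
    print("owner never changed" if elapsed is None else f"{elapsed:.0f}s")
```

Starting the script right after nslcd is started again gives a rough figure
to compare against the configured Cache-Expiration.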

I haven't had a look at the kernel code yet (this is running kernel
Linux 2.6.26-1-686 (SMP w/2 CPU cores)).


Patrick, does adding Cache-Expiration = 10 to /etc/idmapd.conf in the
[General] section help at all in your setup? (the correct values should
be loaded sooner)

-- 
-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --




Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-02 Thread Patrick Schoenfeld
Hi,

On Wed, Oct 01, 2008 at 10:27:04PM +0200, Arthur de Jong wrote:
 Can you produce logs of nslcd? It should report whether the LDAP server
 was reachable or not. If you can run nslcd with the -d option it should
 report more information that will help in tracking this down.

attached is a log, while the problem exists.

[EMAIL PROTECTED] ~ % ls -l test
-rw-rw-r-- 1 schoenfeld nogroup 0 12. Sep 09:49 test

Interestingly enough: the symptom is similar to the system behaviour if
nslcd is _not_ running. Then all files resolve to nobody:nogroup.

However there is no problem visible from the log.

Best Regards,
Patrick
nslcd: DEBUG: add_uri(ldap://majestix-linux.intra.in-medias-res.com)
nslcd: version 0.6.5 starting
nslcd: DEBUG: setgroups(0,NULL) done
nslcd: DEBUG: setgid(121) done
nslcd: DEBUG: setuid(117) done
nslcd: accepting connections
nslcd: [8b4567] DEBUG: connection from pid=3187 uid=30010 gid=1
nslcd: [8b4567] DEBUG: nslcd_passwd_byuid(30010)
nslcd: [8b4567] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixAccount)(uidNumber=30010)))
nslcd: [8b4567] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [8b4567] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [8b4567] DEBUG: ldap_result(): end of results
nslcd: [7b23c6] DEBUG: connection from pid=3188 uid=0 gid=1
nslcd: [7b23c6] DEBUG: nslcd_passwd_byuid(30010)
nslcd: [7b23c6] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixAccount)(uidNumber=30010)))
nslcd: [7b23c6] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [7b23c6] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [7b23c6] DEBUG: ldap_result(): end of results
nslcd: [3c9869] DEBUG: connection from pid=3188 uid=0 gid=1
nslcd: [3c9869] DEBUG: nslcd_group_bymember(root)
nslcd: [3c9869] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixAccount)(uid=root)))
nslcd: [3c9869] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [3c9869] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [3c9869] DEBUG: ldap_result(): end of results
nslcd: [3c9869] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixGroup)(memberUid=root)))
nslcd: [3c9869] DEBUG: ldap_result(): end of results
nslcd: [334873] DEBUG: connection from pid=3188 uid=0 gid=0
nslcd: [334873] DEBUG: nslcd_group_byname(Operations)
nslcd: [334873] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixGroup)(cn=Operations)))
nslcd: [334873] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [334873] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [334873] DEBUG: ldap_result(): end of results
nslcd: [b0dc51] DEBUG: connection from pid=3194 uid=0 gid=0
nslcd: [b0dc51] DEBUG: nslcd_group_bymember(root)
nslcd: [b0dc51] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixAccount)(uid=root)))
nslcd: [b0dc51] DEBUG: simple anonymous bind to ldap://majestix-linux.intra.in-medias-res.com
nslcd: [b0dc51] connected to LDAP server ldap://majestix-linux.intra.in-medias-res.com
nslcd: [b0dc51] DEBUG: ldap_result(): end of results
nslcd: [b0dc51] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixGroup)(memberUid=root)))
nslcd: [b0dc51] DEBUG: ldap_result(): end of results
nslcd: [495cff] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [495cff] DEBUG: nslcd_passwd_byuid(30010)
nslcd: [495cff] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixAccount)(uidNumber=30010)))
nslcd: [495cff] DEBUG: ldap_result(): end of results
nslcd: [e8944a] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [e8944a] DEBUG: nslcd_group_bygid(1000)
nslcd: [e8944a] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixGroup)(gidNumber=1000)))
nslcd: [e8944a] DEBUG: ldap_result(): end of results
nslcd: [5558ec] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [5558ec] DEBUG: nslcd_passwd_byuid(-2)
nslcd: [5558ec] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixAccount)(uidNumber=-2)))
nslcd: [5558ec] DEBUG: ldap_result(): end of results
nslcd: [8e1f29] DEBUG: connection from pid=3203 uid=0 gid=0
nslcd: [8e1f29] DEBUG: nslcd_group_bygid(-2)
nslcd: [8e1f29] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixGroup)(gidNumber=-2)))
nslcd: [8e1f29] DEBUG: ldap_result(): end of results
nslcd: [e87ccd] DEBUG: connection from pid=3204 uid=0 gid=0
nslcd: [e87ccd] DEBUG: nslcd_passwd_byname(schoenfeld)
nslcd: [e87ccd] DEBUG: myldap_search(base=dc=intra,dc=in-medias-res,dc=com, filter=(&(objectClass=posixAccount)(uid=schoenfeld)))

Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-02 Thread Arthur de Jong
On Thu, 2008-10-02 at 10:28 +0200, Patrick Schoenfeld wrote:
 attached is a log, while the problem exists.
 
 [EMAIL PROTECTED] ~ % ls -l test
 -rw-rw-r-- 1 schoenfeld nogroup 0 12. Sep 09:49 test
 
 Interesting enough: The symptom is similar to the system behaviour, if
 nslcd is _not_ running. Then all files resolve to nobody:nogroup.

With nfs4 (I've been doing some reading up but still have no first-hand
experience), a user that doesn't exist is generally mapped to
nobody:nogroup.

The mapping is done by idmapd but at some point in combination with
something in the kernel. From what I understand from scanning the idmapd
code, there is a default cache expiry time (in the kernel) of 600
seconds (10 minutes). The current value should be available
in /proc/sys/fs/nfs/idmap_cache_timeout.
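
A quick way to check that value from a script is sketched below, assuming
the file holds a single integer. The helper name read_idmap_cache_timeout is
hypothetical; the path is the one named above and may be absent on systems
without the NFS client loaded, in which case the function returns None.

```python
from pathlib import Path

TIMEOUT_PATH = Path("/proc/sys/fs/nfs/idmap_cache_timeout")


def read_idmap_cache_timeout(path=TIMEOUT_PATH):
    """Return the timeout as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Missing file, no permission, or unexpected contents.
        return None


if __name__ == "__main__":
    value = read_idmap_cache_timeout()
    if value is None:
        print("idmap_cache_timeout is not available on this system")
    else:
        print(f"idmap_cache_timeout = {value}")
```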

My guess is that name lookups are cached in idmapd. Can you check that
by restarting idmapd (/etc/init.d/nfs-common restart) the problem goes
away?

On my system, idmapd is started way before nslcd and it probably isn't a
good idea to start nslcd before idmapd. There seems to be an undocumented
Cache-Expiration option in the [General] section of /etc/idmapd.conf that
could help to bring down the cache timeout value.

Can you check the idmapd logs for anything out of the ordinary? Perhaps you
can increase the verbosity in /etc/idmapd.conf.

Thanks. Perhaps I should set up a test environment myself with NFS4. Do
you have some pointers for that? (I use NFS3 myself.)

-- 
-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --




Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-01 Thread Patrick Schoenfeld
Package: libnss-ldapd
Severity: serious
Version: 0.6.5

Hi,

since we started using libnss-ldapd we have had a problem that is quite
serious for us, as it effectively affects login and group ACLs. However,
we haven't yet been able to track this issue down to a specific
component, which is why we didn't report it earlier.

The setup:
Our setup is a mixed Windows/Linux environment with an LDAP server for
central authentication. Linux clients use libnss-ldapd for resolution of
usernames and groups.

The problem:
After a reboot of the Linux clients they are unable to resolve groups and
sometimes are also unable to resolve users. The result is that files are
owned by [nobody]:nogroup, while getent passwd and getent group show
the right result. As a consequence people are unable to log in properly
(because desktop environments need read permissions on their settings ;)
and user permissions are broken.

After 10-30 minutes of uptime the problem disappears. This makes me
think that some timeout occurs, but I can't tell which. I thought it was
probably somehow related to the udev resolution issues, which are handled
differently in libnss-ldapd than in libnss-ldap: the latter produces a
significant delay when booting because groups can't be resolved while
LDAP is not accessible, whereas libnss-ldapd handles this gracefully.
Maybe you gather invalid results while booting, because LDAP is not
accessible. But I don't see why nslcd should cache these results, so I
think my idea is absurd.

The problem is reproducible with or without nscd running, so the problem is
not related to it.

The problem does not seem to be related to the groups which contain spaces,
except that it spams the log every second with error messages unless my
patch is applied.

The problem does not occur with libnss-ldap, so the problem is specific
to libnss-ldapd.

I've chosen severity serious for this issue because on the one hand the
problem would fit severity 'Critical', because it makes unrelated
software on the system (or the whole system) break, but on the other hand I
felt uncomfortable with that, because the problem does not persist over
the uptime of the system: after 10-30 minutes it disappears.
But I think it should definitely be fixed for lenny.

Best Regards,
Patrick




Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-01 Thread Arthur de Jong
On Wed, 2008-10-01 at 13:11 +0200, Patrick Schoenfeld wrote:
 Our setup is a mixed Windows/Linux environment with a LDAP server, for
 central authentication. Linux clients use libnss-ldapd for resolution of
 usernames and groups.

Could you provide some more details? Is the LDAP server on the system
that also runs nss-ldapd, what options do you use, which LDAP server
software etc? Your configuration file should also help.

 After reboot of the Linux clients they are unable to resolve groups and
 sometimes are also unable to resolve users. The result is that files are
 owned by [nobody]:nogroup, while getent passwd and getent group show
 the right result.

I don't understand this. If you perform getent passwd and getent group
you get the expected result but if you do ls -l the files are reported
as nobody:nogroup?

If ls can't resolve numeric user and group ids it should print the
numeric form, not make up something.

Can you produce logs of nslcd? It should report whether the LDAP server
was reachable or not. If you can run nslcd with the -d option it should
report more information that will help in tracking this down.

 In consequence people are unable to properly login
 (because desktop environment need read permissions on their setting ;)
 and user permissions are broken.

Note that for logging in you also need pam_ldap, which has its own
configuration. If the problem lies there you should probably also
provide information about that.

 After 10-30 minutes of running the problem disappears. This makes me
 think that some timeout occours, but I can't tell which.

 I thought its probably somehow related to the udev resolution issues
 that are handled different in libnss-ldapd from libnss-ldap which
 produces a significant delay when booting because groups can't be
 resolved while ldap is accessible, which is handled gracefully bei
 libnss-ldapd. Maybe you gather invalid results while booting, because
 LDAP is not accessible. But I don't see why nslcd should cache these
 results so I think my idea is absurd.

nslcd only caches the relationship between DNs and uids for group
membership lookups (when the uniqueMember attribute is used). This
timeout is hardcoded at 15 minutes. Other than that I can't think of a
timeout that long unless you set it that high in the config.

The way nss-ldapd solves the udev problem is by not doing LDAP lookups
that early during boot at all and failing quickly. Only when nslcd is
started are lookups attempted. In any case I can't think of a case where
getent passwd should work and ls would fail.

One known issue (#475626) is related to the order in which nslcd is
started during boot. If the LDAP server is unavailable when nslcd is
started, a timeout could occur and the LDAP server will not be found
immediately when it becomes available.

 I've choosen severity serious for this issue because at the one hand
 the problem would fit severity 'Critical', because it makes unrelated
 software on the system (or the whole system) break, but then again I
 felt uncomfortable with it, because the problem does not persist over
 the uptime of the system and after 10-30 minutes the problem
 disappears.

I am inclined to lower it to important because it seems to work in a lot
of common environments.

 But I think it should definitive be fixed for lenny.

I hope to fix this soon. Thanks for your bugreport.

-- 
-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --




Bug#500778: libnss-ldapd: groups resolve to nogroup after boot

2008-10-01 Thread Patrick Schoenfeld
Hi Arthur,

On Wed, Oct 01, 2008 at 10:27:04PM +0200, Arthur de Jong wrote:
 On Wed, 2008-10-01 at 13:11 +0200, Patrick Schoenfeld wrote:
  Our setup is a mixed Windows/Linux environment with a LDAP server, for
  central authentication. Linux clients use libnss-ldapd for resolution of
  usernames and groups.
 
 Could you provide some more details? 

Yep, I can. I'm just unsure which information is of interest (I'm at a
point where I'm kinda clueless about what's causing the trouble :/).

 Is the LDAP server on the system that also runs nss-ldapd, what options do 
 you use,

No, it runs on another host. I don't use any special options. In fact
the configuration is the default configuration, except the server
address and the search base.

[EMAIL PROTECTED]:~# grep -v '\(^#\|^$\)' /etc/nss-ldapd.conf
uri ldap://majestix-linux.intra.in-medias-res.com
base dc=intra,dc=in-medias-res,dc=com
uid nslcd
gid nslcd

 which LDAP server software etc? Your configuration file should also help.

The LDAP server is the standard slapd as shipped in Etch:

slapd (2.3.30-5+etch1)

  After reboot of the Linux clients they are unable to resolve groups and
  sometimes are also unable to resolve users. The result is that files are
  owned by [nobody]:nogroup, while getent passwd and getent group show
  the right result.
 
 I don't understand this. If you perform getent passwd and getent group
 you get the expected result but if you do ls -l the files are reported
 as nobody:nogroup?

Right. Sometimes all files are owned by nobody:nogroup but the most
common problem is that only groups are a problem. And yes, while the
problem exists getent passwd and getent group show up groups properly.

 If ls can't resolve numeric user and group ids it should print the
 numeric form, not make up something.

Well, I think this is related to the fact that it is an NFSv4 filesystem.
nobody:nogroup is what idmapd from NFS returns if it cannot properly
resolve the ids.
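
One way to tell the two layers apart on an affected client is to look at the
numeric ids that stat() returns for a file next to what NSS makes of them.
This is a minimal sketch with a hypothetical helper name (describe_owner),
using only Python's os, pwd and grp modules. If the numeric ids are already
those of nobody/nogroup, the substitution happened below NSS, which would
fit the NFSv4 id mapping explanation rather than a failing lookup in ls.

```python
import grp
import os
import pwd


def describe_owner(path):
    """Return the ids stat() reports for path and the names NSS gives them."""
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = None  # NSS has no entry for this uid
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = None  # NSS has no entry for this gid
    return {"uid": st.st_uid, "gid": st.st_gid, "user": user, "group": group}


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, describe_owner(name))
```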

 Can you produce logs of nslcd? It should report whether the LDAP server
 was reachable or not. If you can run nslcd with the -d option it should
 report more information that will help in tracking this down.

OK. I will add these logs ASAP.

  In consequence people are unable to properly login
  (because desktop environment need read permissions on their setting ;)
  and user permissions are broken.
 
 Note that for logging in you also need pam_ldap which has it's own
 configuration. If the problem is in that you should probably also
 provide information about that.

Well, the problem is not the login per se, but that some programs (for
example GNOME) simply do not work, because they can't read their settings
(if the nobody problem exists as well; if the groups are the only
problem, then only accessing shared files is affected).

 nslcd only caches the relationship between DNs and uids for group
 membership lookups (when the uniqueMember attribute is used). This
 timeout is hardcoded at 15 minutes. Other than that I can't think of a
 timeout as long unless you set it that high in the config.

At first I would have said that 15 minutes could be the time frame, but
then again: no. Today I saw the problem disappear after more than
half an hour.

 The way nss-ldapd solves the udev problem is by not doing LDAP lookups
 that early during boot at all and fail quickly. Only when nslcd is
 started are lookups attempted. In any case I can't think of a case where
 getent passwd should work and ls would fail.

Well, sounds reasonable and I don't see why this should cause the
problems.

 I am inclined to lower it to important because it seems to work in a lot
 of common environments.

Well, yes, that's true. But on the other hand it has a serious effect on
the functionality of the system as a whole (because it is a client that
mounts /home etc. from the server), so I felt serious was a good
compromise.

 I hope to fix this soon. Thanks for your bugreport.

No bug report, no solution, right? So no need to thank me; instead I'll
thank you if you find a solution for it.

Best Regards,
Patrick


