(Cc-ing the nfs-utils maintainers, perhaps they have some insight that
could solve this)

On Sat, 2008-10-04 at 09:52 +0200, Patrick Schoenfeld wrote:
> > My guess is that name lookups are cached in idmapd. Can you check that
> > by restarting idmapd (/etc/init.d/nfs-common restart) the problem goes
> > away?
> 
> Nope, it does not.

I have been able to reproduce this. On the server I have in /etc/exports
(/export/newhome is a bind-mounted /home with half a dozen users):

/export         192.168.1.0/24(ro,sync,insecure,root_squash,no_subtree_check,fsid=0)
/export/newhome 192.168.1.0/24(rw,nohide,sync,insecure,root_squash,no_subtree_check)

On the client I have in /etc/fstab:

fs:/newhome    /mnt        nfs4 rw 0 0

Now if I stop nslcd (all name lookup calls should now return
NSS_STATUS_UNAVAIL/ENOENT) an 'ls -l /mnt' shows:

[...]
drwx-----x 148 nobody users 12288 Oct  3 21:02 arthur
[...]

(The user arthur from the server is mapped to the user nobody on the
client because the name lookup failed.) With more verbose logging,
rpc.idmapd shows:

[...]
rpc.idmapd: nfs4_name_to_uid: calling nsswitch->name_to_uid
rpc.idmapd: nss_getpwnam: name '[EMAIL PROTECTED]' domain 'localdomain': resulting localname 'arthur'
rpc.idmapd: nss_getpwnam: name 'arthur' not found in domain 'localdomain'
rpc.idmapd: nfs4_name_to_uid: nsswitch->name_to_uid returned -2
rpc.idmapd: nfs4_name_to_uid: final return value is -2
rpc.idmapd: Client 16: (user) name "[EMAIL PROTECTED]" -> id "65534"
[...]

If I repeat the ls command a couple of times, rpc.idmapd no longer logs
the failed lookups, and an strace of rpc.idmapd shows that the process
is no longer asked (by the kernel?) to look up the user.

If I then start nslcd again (name lookups should now be performed as
usual, and getent shows that they are), the ownership shown by ls is not
corrected right away.

After a while (I've been messing about with stuff in /proc so I don't
know how long this normally takes) the kernel asks rpc.idmapd again to
look up user arthur (and the other users in the filesystem). Also note
that the bug reporter had problems with groups, whereas I've reproduced
the behaviour with users.

[...]
drwx-----x 148 arthur users 12288 Oct  3 21:02 /mnt/arthur
[...]
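
To put a number on "after a while", something like the following could be
run on the client right after starting nslcd. It is only a minimal sketch:
the function names, the polling interval and the timeout are my own
choices, and the client's attribute caching may add to the delay it
reports.

```python
import os
import pwd
import time

def owner_name(path):
    """Return the owner of path as a user name (numeric uid as text if unknown)."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)

def wait_for_owner(path, expected, interval=5.0, timeout=600.0,
                   lookup=owner_name, clock=time.monotonic, sleep=time.sleep):
    """Poll until lookup(path) equals expected.

    Returns the elapsed seconds, or None if the timeout passes first.
    lookup, clock and sleep are parameters only so the loop can be tested
    without a real NFS mount.
    """
    start = clock()
    while True:
        if lookup(path) == expected:
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)
```

With the setup above, wait_for_owner("/mnt/arthur", "arthur") would return
roughly how long the mapping to nobody stuck around, to within one polling
interval.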


Now the question is: how should this caching mechanism be tuned, and how
should we solve this problem? Is there a reliable way to flush the
cache? There is /proc/net/rpc/nfs4.nametoid, which contains some entries
that could be relevant, and there is /proc/sys/fs/nfs/idmap_cache_timeout.

However, setting /proc/sys/fs/nfs/idmap_cache_timeout or Cache-Expiration
does not result in the expected timeout in seconds (the unit as I read it
from idmapd.c). Setting it to 10 results in a retry every 30 to 60
seconds; setting it to 100 seems to result in a retry after 60 to 120
seconds. Also, writing to /proc/net/rpc/nfs4.idtoname/flush
and /proc/net/rpc/nfs4.nametoid/flush (as is done in
flush_nfsd_idmap_cache()) doesn't seem to make a difference.
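
For reference, the flush attempt can be sketched as below. This assumes,
as I understand flush_nfsd_idmap_cache(), that flushing means writing the
current time in seconds to each flush file. The proc_root and now
parameters are hypothetical additions so the sketch can be pointed at a
scratch directory instead of the real /proc.

```python
import os
import time

# Flush files named above, relative to the proc mount point.
FLUSH_FILES = (
    "net/rpc/nfs4.idtoname/flush",
    "net/rpc/nfs4.nametoid/flush",
)

def flush_idmap_caches(proc_root="/proc", now=None):
    """Write a timestamp to each flush file.

    Returns a dict mapping each full path to None on success or to the
    OS error text on failure (for example when the file does not exist
    or the caller is not root).
    """
    stamp = "%d\n" % (int(time.time()) if now is None else now)
    results = {}
    for rel in FLUSH_FILES:
        path = os.path.join(proc_root, rel)
        try:
            with open(path, "w") as f:
                f.write(stamp)
            results[path] = None
        except OSError as exc:
            results[path] = exc.strerror
    return results
```

Calling flush_idmap_caches() as root and printing the returned dict at
least shows whether the writes themselves succeed, which separates "the
flush never happened" from "the flush has no effect on this cache".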

I haven't had a look at the kernel code yet (this is running kernel
Linux 2.6.26-1-686 (SMP w/2 CPU cores)).


Patrick, does adding "Cache-Expiration = 10" to /etc/idmapd.conf in the
[General] section help at all in your setup? (the correct values should
be loaded sooner)
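
When reporting back, it would help to see both values side by side. A
minimal sketch that reads them follows. It assumes idmapd.conf is close
enough to INI syntax for Python's configparser to parse, and the function
name read_idmap_settings is made up for this example.

```python
import configparser

def read_idmap_settings(conf_path="/etc/idmapd.conf",
                        proc_path="/proc/sys/fs/nfs/idmap_cache_timeout"):
    """Return the configured and the kernel-side timeout as strings.

    A value is None when the file is missing, unreadable or does not
    contain the setting.
    """
    settings = {"Cache-Expiration": None, "idmap_cache_timeout": None}

    # Cache-Expiration lives in the [General] section of idmapd.conf.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    try:
        parser.read(conf_path)  # a missing file is silently skipped
        settings["Cache-Expiration"] = parser.get(
            "General", "Cache-Expiration", fallback=None)
    except configparser.Error:
        pass  # not parseable as INI; leave the value as None

    # The kernel-side value is a single number in a /proc file.
    try:
        with open(proc_path) as f:
            settings["idmap_cache_timeout"] = f.read().strip()
    except OSError:
        pass

    return settings
```

Running print(read_idmap_settings()) before and after restarting
rpc.idmapd would show whether the value from idmapd.conf actually ends up
in /proc.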

-- 
-- arthur - [EMAIL PROTECTED] - http://people.debian.org/~adejong --
