Hi Jason,

Thanks for the valuable feedback! I'll defer to Layne on the "scope" of the document.

I think your feedback reflects what other folks have been running into, which is the lack of information on how to debug these types of issues. We're working on tools to enhance NFSv4 observability and diagnosis, but that is still a work in progress. In the meantime, I'll work w/Layne to integrate more formal debugging documentation. For now, hope this helps:

- The root causes of 'nobody' storms in NFSv4 are usually either an NFSv4 domain mismatch _or_ different naming db backends being used by client and server.

Domain mismatch
===============

o The 1st place to check on the Solaris implementation is (as you've noted) the /var/run/nfs4_domain file. This reflects what the local system has configured as the NFSv4 domain. If these don't match between client and server, that is the problem.

  NOTE: The domains could be mismatched due to a myriad of different causes. Here are some of them:

  a) NFSMAPID_DOMAIN settings in /etc/default/nfs don't match
  b) one system using DNS TXT RR's and another using the NFSMAPID_DOMAIN variable
  c) one system using /etc/resolv.conf 'domain' (due to misconfiguration) and the other system using the DNS TXT RR
  d) one system using the /etc/resolv.conf 'domain' setting and the other, with a different /etc/resolv.conf in which both the 'domain' and 'search' keywords are used, using the first field of the 'search' keyword
  e) others

  Admittedly, these are esoteric details of the Solaris implementation, but they can cause you heartache.

o If running against a different NFSv4 server implementation, snoop'ing the NFSv4 traffic verbosely and grep'ing for "(Owner|Group)" will show you both what the client is sending to the server and what it is receiving back.

Differing Naming Backends
=========================

o If the domains match, then checking for existence of the user (and/or group) in the current naming databases should reveal if the problem lies there.

o Here are a few quick tips I often use...

  1) domainname
     - this should match the NIS/LDAP domains for both client and server

  2) ypmatch ${USER} passwd _or_ ypmatch ${GROUP} group
     - again, these should match between client and server; if they don't match, then the 'nobody' storm is probably being caused because either the server or client cannot map the inbound user (group) name to a valid uid (gid).

This is not an all-encompassing debug cheat-sheet, but hopefully it will fill in the gaps until Layne and I can get this better documented.

hth,
rick

On Wed, Aug 09, 2006 at 06:58:32AM -0700, Jason King wrote:
| Don't know if this is part of the scope of the document, but I think more information on how to debug nfsv4 mapping would be *very* useful. I ran into a situation a while ago where things were being mapped to nobody, and even Sun's support could not figure out why -- I knew it was a domain mismatch, but no matter what I tried, couldn't seem to get it to take the correct value.
|
| It wasn't until I stumbled upon /var/run/nfs4_domain that I was able to start figuring out the issue -- it apparently was using the 'domain' value of /etc/resolv.conf (vs. using DNS to resolve the hostname) which had been misset. Unfortunately, I cannot recall finding any of this documented anywhere.
|
|
| This message posted from opensolaris.org
| _______________________________________________
| nfs-discuss mailing list
| nfs-discuss at opensolaris.org
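
As a footnote to the "Domain mismatch" notes in the reply above, here is a minimal Python sketch of the comparison. It assumes you have copied the contents of /var/run/nfs4_domain (and, if set, /etc/default/nfs) from both client and server into strings. The function names read_nfsmapid_domain and compare_domains are hypothetical and not part of any Solaris tool, the example.com values are made up, and the "case-only" result is an illustrative addition that flags values differing only in letter case; the sketch makes no claim about how nfsmapid itself treats such values.

```python
def read_nfsmapid_domain(default_nfs_text):
    """Return the NFSMAPID_DOMAIN value found in /etc/default/nfs text, or None.

    Assumes simple KEY=value lines; commented-out lines (leading '#') are skipped.
    """
    for raw in default_nfs_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out setting
        key, sep, value = line.partition("=")
        if sep and key.strip() == "NFSMAPID_DOMAIN":
            return value.strip() or None
    return None


def compare_domains(client_domain, server_domain):
    """Classify two NFSv4 domain strings as 'match', 'case-only', or 'mismatch'."""
    c = client_domain.strip()  # drop the trailing newline from the file contents
    s = server_domain.strip()
    if c == s:
        return "match"
    if c.lower() == s.lower():
        return "case-only"  # hypothetical category: differs only in letter case
    return "mismatch"


if __name__ == "__main__":
    # Made-up sample values standing in for each host's /var/run/nfs4_domain.
    client = "eng.example.com\n"
    server = "example.com\n"
    print(compare_domains(client, server))
```

A "mismatch" here only says the two configured domains differ; working out which of causes a) through e) is responsible still means checking NFSMAPID_DOMAIN, the DNS TXT RR, and /etc/resolv.conf on each host.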