[ https://issues.apache.org/jira/browse/HDFS-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005288#comment-14005288 ]
Brandon Li commented on HDFS-6411:
----------------------------------

Here is what happened:

1. A supported user mounted the export, say at /my-mount-point. The Linux NFS client keeps the related file attributes in its cache.
2. A non-supported user tried to access the export, starting with an ACCESS call. As [~zhongyi-altiscale] noticed, the NFS gateway returned empty file attributes for /my-mount-point as part of the ACCESS response when the request failed.
3. The Linux NFS client replaced the cached file attributes with the empty ones. Note that the cached file handle is now all zeros.
4. When the supported user tried to access the mounted directory, the Linux NFS client checked the cached file handle of /my-mount-point, saw that it was all zeros, and returned a stale-file-handle error to the caller. In this step the Linux NFS client does not send any request to the NFS server.

We have no control over what happens in step 4, since that is up to the NFS client implementation. Some client implementations might be more tolerant; for example, I believe the MacOS NFS client ignores the file attributes if the call failed (I didn't look into its implementation, so I could be wrong). However, we should fix step 3 by not sending the empty attributes when the call fails. This is compatible with the NFSv3 protocol and could make some NFS clients happy. I will post a patch soon.

> nfs-hdfs-gateway mount raises I/O error and hangs when an unauthorized user
> attempts to access it
> ------------------------------------------------------------------------------------------------
>
> Key: HDFS-6411
> URL: https://issues.apache.org/jira/browse/HDFS-6411
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: nfs
> Affects Versions: 2.2.0
> Reporter: Zhongyi Xie
>
> We use the nfs-hdfs gateway to expose hdfs through nfs.
> 0) Log in as root and run the nfs-hdfs gateway as a user, say, nfsserver.
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> backups hive mr-history system tmp user
> 1) Add a user nfs-test: adduser nfs-test (make sure that this user is not a
> proxyuser of nfsserver).
> 2) Switch to the test user: su - nfs-test
> 3) Access the hdfs nfs gateway:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot open directory /hdfs: Input/output error
> retry:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
> 4) Switch back to root and access the hdfs nfs gateway:
> [nfs-test@zhongyi-test-cluster-desktop ~]$ exit
> logout
> [root@zhongyi-test-cluster-desktop hdfs]# ls /hdfs
> ls: cannot access /hdfs: Stale NFS file handle
>
> The nfsserver log indicates that we hit an authorization error in the rpc handler:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException):
> User: nfsserver is not allowed to impersonate nfs-test
> and NFS3ERR_IO is returned, which explains why we see the input/output error.
> One can catch the AuthorizationException and return the correct error,
> NFS3ERR_ACCES, to fix the error message on the client side, but that doesn't
> seem to solve the mount hang issue. When the mount hangs, the nfsserver stops
> logging, which makes it more difficult to figure out the real cause of the
> hang. According to jstack and the debugger, the nfsserver seems to be waiting
> for client requests.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
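The two fixes discussed above (mapping the authorization failure to NFS3ERR_ACCES instead of NFS3ERR_IO, and omitting the file attributes from a failed ACCESS reply rather than sending empty ones) can be sketched roughly as follows. This is a minimal illustration with hypothetical names (AccessResponseSketch, handleAccess, AccessResponse), not the actual Hadoop NFS gateway code; only the NFSv3 status code values are taken from RFC 1813.

```java
// Hedged sketch of the proposed ACCESS handling; class and method names
// are hypothetical, not from the Hadoop code base.
public class AccessResponseSketch {
    // NFSv3 status codes, per RFC 1813
    public static final int NFS3_OK = 0;
    public static final int NFS3ERR_IO = 5;
    public static final int NFS3ERR_ACCES = 13;

    // Stand-in for the exception thrown when impersonation is not allowed
    // (org.apache.hadoop.security.authorize.AuthorizationException in Hadoop).
    public static class AuthorizationException extends Exception {}

    // Minimal stand-in for an ACCESS3 reply. On the wire, post-op attributes
    // are optional; here null means "attributes not present".
    public static class AccessResponse {
        public final int status;
        public final Object postOpAttr;

        public AccessResponse(int status, Object postOpAttr) {
            this.status = status;
            this.postOpAttr = postOpAttr;
        }
    }

    public static AccessResponse handleAccess(boolean authorized, Object attrs) {
        try {
            if (!authorized) {
                throw new AuthorizationException();
            }
            // Success: include the real file attributes in the reply.
            return new AccessResponse(NFS3_OK, attrs);
        } catch (AuthorizationException e) {
            // Report the specific access error, and omit the attributes
            // instead of sending empty ones, so the client does not
            // overwrite its cached file handle with zeros.
            return new AccessResponse(NFS3ERR_ACCES, null);
        } catch (Exception e) {
            // Unexpected errors still surface as a generic I/O failure.
            return new AccessResponse(NFS3ERR_IO, null);
        }
    }
}
```

With this shape, the unauthorized `ls /hdfs` would see "Permission denied" rather than "Input/output error", and because no empty attributes are returned, the client's cached handle for the mount point stays intact for the supported user.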