> Thanks Assaf! Good debugging work! Yes. That was the solution. After your changes I was able to "unbusy" the mount point by killing the processes that were holding it busy, and then unmount it. Then on download proper I stopped the nfs daemons and double-checked that all were dead. Then started nfs back up on download. Then mounted it on download0. Tested. The feature works. Yay! But there is still something to be understood, because while scp is working, rsync is not.
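The "unbusy" step can be sketched as a small filter over lsof output. This is only a sketch under assumptions: `holders` is a hypothetical helper name, not something installed on these hosts, and it assumes the default lsof column layout in which the PID is the second field.

```shell
# Hypothetical helper: read `lsof` output on stdin and print the unique
# PIDs (second column) of processes with files open under a path prefix.
holders() {
    prefix=$1
    # Keep only lines mentioning the prefix whose second field is numeric
    # (this also skips the lsof header line), then de-duplicate the PIDs.
    awk -v p="$prefix" 'index($0, p) && $2 ~ /^[0-9]+$/ { print $2 }' | sort -un
}
```

Usage would look like `lsof | holders /net/download`. It is probably wise to review the resulting list by hand before passing anything to kill, since a service such as nginx is better stopped through its init script.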
Compact test of the problem.

  root@download0:~# su -s /bin/bash -c 'touch /releases/administration/foo' rwp
  touch: cannot touch ‘/releases/administration/foo’: Permission denied

Failed. The problem is still active.

"Unbusy" the mount point so as to unmount it.

  root@download0:~# lsof | grep /net/download
  rsync       362      rwp  cwd  DIR  0,24     4096 6684673 /net/download/srv (olddownload:/)
  nginx      1272 www-data  31r  REG  0,24 28435804 8916141 /net/download/srv/audio-video/video/TEDxGE2014_Stallman05_LQ.webm (olddownload:/)
  rsync      1375      rwp  cwd  DIR  0,24     4096 6684673 /net/download/srv (olddownload:/)
  sh        19045      agn  cwd  DIR  0,24     4096 6692913 /net/download/srv/download/test-project (olddownload:/)
  rsync     32113      rwp  cwd  DIR  0,24     4096 6684673 /net/download/srv (olddownload:/)

Of course rwp is me and so those rsync processes were mine that I created while testing. Killed those. The agn sh is Assaf. I killed that one too. The nginx process is never going to make progress therefore killed it too.

  root@download0:~# kill -HUP 362 1375 32113 19045
  root@download0:~# service nginx stop
  root@download0:~# lsof | grep /net/download
  root@download0:~# umount /net/download
  root@download0:~# df
  root@download0:~# df | awk '$1!="none"&&$1!="tmpfs"'
  Filesystem     1K-blocks    Used Available Use% Mounted on
  udev              997676       4    997672   1% /dev
  /dev/xvda2      10190136 1863208   7786256  20% /

All clear on download0.

Stop the nfsds on download. First check to see what is running. A modest number of daemons are active.

  root@download:~# ps -ef | grep nfs
  root      4799     2  0  2016 ?        00:00:00 [nfsd4]
  root      4800     2  0  2016 ?        00:00:00 [nfsd4_callbacks]
  root      4801     2  0  2016 ?        00:03:44 [nfsd]
  root      4802     2  0  2016 ?        00:03:40 [nfsd]
  root      4803     2  0  2016 ?        00:04:07 [nfsd]
  root      4804     2  0  2016 ?        00:03:30 [nfsd]
  root      4805     2  0  2016 ?        00:03:21 [nfsd]
  root      4806     2  0  2016 ?        00:03:34 [nfsd]
  root      4807     2  0  2016 ?        00:02:45 [nfsd]
  root      4808     2  0  2016 ?        00:03:30 [nfsd]
  root      4823     2  0  2016 ?        00:00:00 [nfsiod]
  root     22846 12446  0 05:19 pts/1    00:00:00 grep nfs

  root@download:/# service nfs-kernel-server stop
  Stopping NFS kernel daemon: mountd nfsd.
  Unexporting directories for NFS kernel daemon....

  root@download:/# ps -ef | grep nfs
  root      4823     2  0  2016 ?        00:00:00 [nfsiod]
  root     24072 12446  0 05:33 pts/1    00:00:00 grep nfs
  root@download:/#

Verify that the libnss-mysql is working. Assaf already was here to make sure it was working in the baton pass.

  root@download:~# getent passwd rwp
  rwp:x:65821:1003:Bob Proulx:/srv:/usr/local/bin/sv_membersh

Start the nfs daemons again.

  root@download:/# service nfs-kernel-server start
  Exporting directories for NFS kernel daemon....
  Starting NFS kernel daemon: nfsd mountd.

  root@download:/# ps -ef |grep rpc
  statd     4576     1  0  2016 ?        00:00:00 /sbin/rpc.statd
  root      4790     2  0  2016 ?        00:00:00 [rpciod]
  root     24103     1  0 05:34 ?        00:00:00 /usr/sbin/rpc.mountd --manage-gids
  root     24106 12446  0 05:34 pts/1    00:00:00 grep rpc

  root@download:/# ps -ef |grep nfs
  root      4823     2  0  2016 ?        00:00:00 [nfsiod]
  root     24090     2  0 05:34 ?        00:00:00 [nfsd4]
  root     24091     2  0 05:34 ?        00:00:00 [nfsd4_callbacks]
  root     24092     2  0 05:34 ?        00:00:00 [nfsd]
  root     24093     2  0 05:34 ?        00:00:00 [nfsd]
  root     24094     2  0 05:34 ?        00:00:00 [nfsd]
  root     24095     2  0 05:34 ?        00:00:00 [nfsd]
  root     24096     2  0 05:34 ?        00:00:00 [nfsd]
  root     24097     2  0 05:34 ?        00:00:00 [nfsd]
  root     24098     2  0 05:34 ?        00:00:00 [nfsd]
  root     24099     2  0 05:34 ?        00:00:00 [nfsd]
  root     24108 12446  0 05:34 pts/1    00:00:00 grep nfs

The rpc.mountd --manage-gids is running. Along with a modest number of nfsd processes.

  root@download0:~# mount /net/download

Mounted. Check it.

  root@download0:~# ls /net/download
  archives  boot  etc   initrd      initrd.img.old  lost+found  mnt          opt   releases  sbin     srv  tmp  var      vmlinuz.old
  bin       dev   home  initrd.img  lib             media       nonexistent  proc  root      selinux  sys  usr  vmlinuz

Test that my group id is passed along.

  root@download0:~# su -s /bin/bash -c 'touch /releases/administration/foo' rwp
  root@download0:~# ll /releases/administration/foo
  -rw-r--r-- 1 rwp administration 0 Feb 24 00:34 /releases/administration/foo
  root@download0:~# su -s /bin/bash -c 'rm /releases/administration/foo' rwp

That worked! The problem should be resolved now.

Start up nginx that I just stopped a moment ago.

  root@download0:~# service nginx start

Peek at the df listing.

  root@download0:~# df | awk '$1!="none"&&$1!="tmpfs"'
  Filesystem     1K-blocks      Used Available Use% Mounted on
  udev              997676         4    997672   1% /dev
  /dev/xvda2      10190136   1863208   7786256  20% /
  olddownload:/  275577344 237392896  24186368  91% /net/download

Looks okay. Ignoring the full disk that we can't do anything about.

  /tmp/junk$ scp testfile [email protected]:/releases/administration/
  testfile                                      100%   32     0.4KB/s   00:00

  root@download0:~# ll /releases/administration/testfile
  -rw-r--r-- 1 rwp administration 32 Feb 24 00:38 /releases/administration/testfile

That worked. Yay! Try rsync.

  /tmp/junk$ rsync -avv testfile [email protected]:/releases/administration/
  opening connection using: ssh -l rwp download.savannah.gnu.org rsync --server -vvlogDtpre.iLsfxC . /releases/administration/  (9 args)
  sending incremental file list
  ...hangs seemingly forever...
  ^C
  rsync error: unexplained error (code 130) at rsync.c(638) [sender=3.1.2]
  rsync: [sender] write error: Broken pipe (32)

  root@download0:~# ps -ef | grep rsync
  rwp      10552     1  0 00:38 ?        00:00:00 rsync --server -vvlogDtpre.iLsfxC . /releases/administration/

Sooo.... Some things are working. But rsync is not working. rsync is hanging. I typically see this problem over nfs when there is a lock problem. NFS locks are serviced by the rpc.statd which if you notice above I didn't restart. I am going to see if I can restart the rpc.statd which is in the nfs-common set.
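
One way to probe the lock hypothesis directly, independent of rsync, might be to try taking a file lock on the mount with a timeout. This is a sketch only: `check_lock` is a hypothetical helper name, and it assumes flock(1) from util-linux and timeout(1) from coreutils are installed. It also assumes the NFS client forwards flock requests to the server as byte-range locks, which is the usual behavior on current Linux kernels but was not verified on these hosts.

```shell
# Hypothetical helper: try to take an exclusive lock on a scratch file
# in DIR, giving up after SECS seconds (default 10).
check_lock() {
    dir=$1
    secs=${2:-10}
    # Create a scratch file inside the directory under test.
    f=$(mktemp "$dir/locktest.XXXXXX") || return 2
    # flock runs `true` while holding the lock; timeout bounds the wait.
    if timeout "$secs" flock -x "$f" true; then
        echo "lock ok"
        rc=0
    else
        echo "lock failed or timed out"
        rc=1
    fi
    rm -f "$f"
    return "$rc"
}
```

Run as the affected user against a directory on the mount, for example `check_lock /releases/administration 10`. A timeout here would point toward the lock service, while "lock ok" would suggest looking elsewhere for the rsync hang.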

  root@download0:~# umount /net/download

  root@download:/# service nfs-kernel-server stop
  root@download:/# service nfs-common stop
  root@download:/# service nfs-common start
  root@download:/# service nfs-kernel-server start

  root@download0:~# mount /net/download

  root@download0:~# su -s /bin/bash -c 'rsync -av /tmp/testfile /releases/administration/' rwp
  sending incremental file list
  testfile

  sent 126 bytes  received 35 bytes  322.00 bytes/sec
  total size is 32  speedup is 0.20

Seems like it should be working. But it's not. Testing from rsync over ssh:

  /tmp/junk$ rsync -avv testfile [email protected]:/releases/administration/
  opening connection using: ssh -l rwp download.savannah.gnu.org rsync --server -vvlogDtpre.iLsfxC . /releases/administration/  (9 args)
  sending incremental file list
  ...hangs...

Things are better because scp now works. But still not happy because rsync is hanging.

Continuing...

Bob
