Bug#714974: [Jfs-discussion] NFS 'readdir loop' error on JFS

2013-08-12 Thread Christian Kujau
FWIW, this still happens when both client & server are running Linux
3.11.0-rc5 (vanilla).

$ dpkg -l | grep nfs | cut -c-70
ii  libnfsidmap2:amd64 0.25-4 amd64
ii  nfs-common 1:1.2.6-4 amd64
ii  nfs-kernel-server  1:1.2.6-4 amd64


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: 
http://lists.debian.org/alpine.deb.2.10.1308120113360.7...@trent.utfs.org



Bug#714974: [Jfs-discussion] NFS 'readdir loop' error on JFS

2013-08-12 Thread Christian Kujau
Sorry for the noise, here's another oddity, same setup (client & server
running 3.11-rc5):

$ find /mnt/nfs/usr/share/ -name getopt.awk -ls
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk

It's the same file, but gets reported 10 times! Hence the error when 
trying to tar(1) the directory:

$ tar -cf - /mnt/nfs/usr/share/awk/ > /dev/null
tar: Removing leading `/' from member names
tar: /mnt/nfs/usr/share/awk/: Cannot savedir: Too many levels of symbolic links
tar: Exiting with failure status due to previous errors

On the server:

$ find /mnt/disk/usr/share/ -name getopt.awk -ls
 250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/disk/usr/share/awk/getopt.awk

So, is JFS & NFS really br0ken and nobody noticed?





Bug#714974: [Jfs-discussion] NFS 'readdir loop' error on JFS

2013-08-12 Thread J. Bruce Fields
On Mon, Aug 12, 2013 at 01:29:15AM -0700, Christian Kujau wrote:
> Sorry for the noise, here's another oddity, same setup (client & server
> running 3.11-rc5):
>
> $ find /mnt/nfs/usr/share/ -name getopt.awk -ls
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/nfs/usr/share/awk/getopt.awk
>
> It's the same file, but gets reported 10 times! Hence the error when
> trying to tar(1) the directory:
>
> $ tar -cf - /mnt/nfs/usr/share/awk/ > /dev/null
> tar: Removing leading `/' from member names
> tar: /mnt/nfs/usr/share/awk/: Cannot savedir: Too many levels of symbolic links
> tar: Exiting with failure status due to previous errors
>
> On the server:
>
> $ find /mnt/disk/usr/share/ -name getopt.awk -ls
>  250724 -rw-r--r--   1 root root 2237 Mar 16 04:46 /mnt/disk/usr/share/awk/getopt.awk
>
> So, is JFS & NFS really br0ken and nobody noticed?

It does sound like a jfs bug, and I don't know if anyone tests nfs
exports of jfs regularly.

It might be interesting to get a network trace (something like
"tcpdump -s0 -w tmp.pcap", then "wireshark tmp.pcap") and look at the cookie
fields in the readdir calls and replies.  The server shouldn't return
the same one twice on one read through the directory.  And when the
client uses a cookie it should get the next entries, not
already-returned entries.
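
The rule about cookies can be checked mechanically once the values are out of
the trace. A minimal sketch, assuming the cookie and file name of each returned
entry have been copied out of the capture by hand into one "COOKIE NAME" pair
per line, in the order the server returned them. That input format and the
helper name check_cookies are invented for this illustration; they are not part
of tcpdump or wireshark.

```shell
# Hypothetical helper: stdin holds one "COOKIE NAME" pair per line, in the
# order the server returned the entries (names are assumed to contain no
# whitespace). Prints one line for every cookie that was already handed out
# for an earlier entry; the exit status is 1 if any duplicate was found.
check_cookies() {
    awk '
        NF < 2 { next }                      # skip blank or incomplete lines
        $1 in seen {
            printf "duplicate cookie %s: %s (first seen for %s)\n", $1, $2, seen[$1]
            bad = 1
            next
        }
        { seen[$1] = $2 }                    # remember which entry got this cookie
        END { exit bad }
    '
}

# Usage, with cookies.txt being the hand-made list described above:
#   check_cookies < cookies.txt
```

No output and exit status 0 would mean every cookie in the list was unique.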

You could also just run "strace -egetdents64 -v ls" on the server on
the exported filesystem, in a problem directory, and see if the offsets
are unique.

--b.





Bug#714974: [Jfs-discussion] NFS 'readdir loop' error on JFS

2013-08-12 Thread Christian Kujau
On Mon, 12 Aug 2013 at 12:29, J. Bruce Fields wrote:
> It might be interesting to get a network trace (something like
> "tcpdump -s0 -w tmp.pcap", then "wireshark tmp.pcap") and look at the cookie
> fields in the readdir calls and replies.

I've created bug #60737 [0] to track this issue upstream and attached a pcap
to it, obtained while running "find dir -ls" on the client. But I failed to
find the right details in tcpdump/wireshark; I don't see any cookie
information...

> You could also just run "strace -egetdents64 -v ls" on the server on
> the exported filesystem, in a problem directory, and see if the offsets
> are unique.

strace returned nothing for getdents64, only getdents. My test 
filesystems are 256 MB in size, maybe this is too small for getdents64 to 
be used? All the calls to getdents however return unique offsets, if I 
did this right:

$ strace -egetdents -v ls /mnt/disk_jfs/usr/share/terminfo/q 2>&1 | egrep -o "d_off=[0-9]*" | sort
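
Plain "sort" only orders the offsets, so a repeated value still has to be
spotted by eye. A minimal sketch that prints only the repeated offsets,
assuming strace -v shows a d_off=NUMBER field for each directory entry, as in
the command above. The helper name dup_offsets is made up here.

```shell
# Hypothetical helper: read strace output on stdin and print every d_off
# value that occurs more than once. No output means all offsets were unique.
dup_offsets() {
    grep -o 'd_off=[0-9]*' | sort | uniq -d
}

# Usage (strace writes its trace to stderr, hence the 2>&1):
#   strace -egetdents -v ls /mnt/disk_jfs/usr/share/terminfo/q 2>&1 | dup_offsets
```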

When running "ls" (even w/o -l) on the client on that NFS share, the
"readdir loop" message is printed.

HTH,
Christian.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=60737





Bug#714974: [Jfs-discussion] NFS 'readdir loop' error on JFS

2013-08-10 Thread Christian Kujau
Interesting stuff. Out of curiosity I just tried this myself, both client
& server are virtual machines running Debian/stable (3.2.0-4-amd64) and I
was able to reproduce this. A test case would be:

## server:
$ apt-get install nfs-kernel-server jfsutils
$ dd if=/dev/zero bs=1M count=256 > /var/test.img
$ losetup -f /var/test.img
$ mkfs.jfs /dev/loop0
$ mount -t jfs /dev/loop0  /mnt/disk
$ tar -C / -cf - usr/share | tar -C /mnt/disk/ -xf -
$ tail -1 /etc/exports
/mnt/disk  192.168.0.0/24(rw,sync,no_root_squash,no_subtree_check)
$ service nfs-kernel-server restart

## client
$ apt-get install nfs-common
$ showmount -e server | tail -1
/mnt/disk 192.168.0.0/24
$ tail -1 /etc/fstab
server:/mnt/disk /mnt/nfs  nfs  rsize=8192,wsize=8192,intr 0 0
$ mount /mnt/nfs
$ mount | tail -1
server:/mnt/disk on /mnt/nfs type nfs4 
(rw,relatime,vers=4,rsize=8192,wsize=8192,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.137,minorversion=0,local_lock=none,addr=192.168.0.138)

$ tar -cf - /mnt/nfs/ > /dev/null
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
tar: /mnt/nfs/usr/share/perl/5.14.2/Pod/: Cannot savedir: Too many levels of symbolic links
tar: Exiting with failure status due to previous errors

$ dmesg | tail
[   63.912327] RPC: Registered named UNIX socket transport module.
[   63.913801] RPC: Registered udp transport module.
[   63.914713] RPC: Registered tcp transport module.
[   63.915644] RPC: Registered tcp NFSv4.1 backchannel transport module.
[   63.949485] FS-Cache: Loaded
[   63.972688] FS-Cache: Netfs 'nfs' registered for caching
[   63.993300] Installing knfsd (copyright (C) 1996 o...@monad.swb.de).
[  284.733629] loop: module loaded
[  840.372846] NFS: directory 5.14.2/Pod contains a readdir loop.Please contact your server vendor.  The file: Simple has duplicate cookie 18
[  840.375842] NFS: directory 5.14.2/Pod contains a readdir loop.Please contact your server vendor.  The file: Simple has duplicate cookie 18

There are no messages on the server when this happens. The message on the
client repeats on every attempt; the "Cannot savedir" failure above may be
triggering it.





Bug#714974: [Jfs-discussion] NFS 'readdir loop' error on JFS

2013-08-10 Thread Karl Schmidt

On 08/10/2013 02:28 AM, Christian Kujau wrote:

> Interesting stuff. Out of curiosity I just tried this myself, both client
> & server are virtual machines running Debian/stable (3.2.0-4-amd64) and I
> was able to reproduce this. A test case would be:



I still haven't rebooted that machine, so this is the last chance to ask for
any test info, though it looks like you have a test case anyway.

I haven't lost any data that I know of, just programs complaining etc.

IMO, at one time jfs was really the better choice (good set of tools). Even in
a few cases where hardware failed, the jfs tools worked well. Today, with
everyone banging on ext4, ext4 has become the better choice. (I don't think IBM
is interested in supporting jfs; no idea if they are phasing out jfs2?)








