Dear J. R. Okajima

On Saturday, 7 February 2015 03:14:25 you wrote:
> Wolfgang Rosner:
> > > Do you mean
> > > - you specified "=ro+wh" as a branch permission
> > > - but /sys/fs/aufs/si_*/br2 shows "ro"
> > > right?
> >
> > yes. Thats what I did.
>
> I cannot reproduce this problem.
> I think this was a bug in aufs, and already fixed by these commits.
>
> 7a39204 2014-10-16 aufs: bugfix, fix the returning size of the branch attr
> 5001e71 2014-06-25 aufs: simply handing attribute string
> f7292d3 2014-06-25 aufs: introduce au_br_perm_str_t
>
> If you can, try updating aufs.
>
>
> J. R. Okajima

I'm afraid I cannot avoid diving into the build process.
Here is what I have done so far:

Thanks for the valuable hint about Wireshark.
As an old console-days guy I am usually reluctant to use GUI tools.
But the interactive filtering and the deep protocol-specific analysis really
help against the feeling of "sitting in the crankcase and watching the pistons
going up and down" that I remember from tcpdump.

Running Wireshark on the server, I could pin down the moment when things hang:
it is when gzip is run on /var/lib/dmesg.0
Before that, I see older log files being renamed.
I can watch these operations succeed in both the aufs branches and the aufs
mount.

The last signs of life of my NFS client look like

41090   122.518015      192.168.130.2   192.168.130.250
        NFS     3146    V4 Call (Reply In 41427) WRITE StateID: 0xdcdd Offset: 0 Len: 17341

->
41427   122.543403      192.168.130.250 192.168.130.2
        NFS     202     V4 Reply (Call In 41090) WRITE
        Status: NFS4_OK (0)

41565   122.549393      192.168.130.2   192.168.130.250
        NFS     270     V4 Call (Reply In 42087) CLOSE StateID: 0xdcdd
...
41589   122.549842      192.168.130.2   192.168.130.250
        TCP     66      781→2049 [ACK] Seq=637669 Ack=19538241 Win=949888 Len=0
        TSval=4294896938 TSecr=474615
....
42086   122.564122      192.168.130.2   192.168.130.250
        TCP     66      781→2049 [ACK] Seq=637669 Ack=21219681 Win=949888 Len=0
        TSval=4294896942 TSecr=474618
42087   122.564131      192.168.130.250 192.168.130.2
        NFS     47382   V4 Reply (Call In 41559) READ
42179   122.603758      192.168.130.2   192.168.130.250
        TCP     66      781→2049 [ACK] Seq=637669 Ack=21531981 Win=949888 Len=0
        TSval=4294896952 TSecr=474618

(The dots stand for TCP segments only, which I think belong to a large file
transfer.)

That's it. After that, I only see DHCP renewals and TCP keepalives at sparse
intervals.

I can ping the client, but I can neither log in nor do anything else, either
at the console or via ssh.
I think the NFS client has died and taken the root mount with it.

I don't know whether it is a pure NFS problem or whether the NFS client is
upset by some misbehaviour of aufs.

So the plan was to capture on both network interfaces, run rpcdebug -m nfs on
both sides, enable aufs debugging, and strace /usr/bin/savelog (which is what
calls gzip).
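
The plan above could be sketched roughly like this. It is only a sketch under
assumptions: the helper names (run, enable_nfs_debug, capture_nfs,
trace_savelog), the interface name and the log directory are made up for
illustration, and the savelog arguments have to be taken from whatever script
actually calls it. With DRY_RUN=1 the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the planned debug session (helper names are hypothetical).

run() {
    # Print the command; execute it only when DRY_RUN is not 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# Turn all NFS client debug flags on or off (needs root).
enable_nfs_debug()  { run rpcdebug -m nfs -s all; }
disable_nfs_debug() { run rpcdebug -m nfs -c all; }

# Capture NFS traffic (TCP port 2049) on interface $1 into directory $2.
# The directory should be on a local disk, not on the NFS root.
capture_nfs() {
    run tcpdump -i "$1" -w "$2/nfs.pcap" port 2049
}

# Trace savelog and its children (gzip), writing the trace to directory $1.
# All remaining arguments are passed to savelog unchanged.
trace_savelog() {
    logdir=$1
    shift
    run strace -f -tt -o "$logdir/savelog.strace" /usr/bin/savelog "$@"
}
```

A session would then be something like "enable_nfs_debug; capture_nfs eth0
/mnt/hd &" followed by trace_savelog with the real savelog arguments, where
eth0 and /mnt/hd are again only placeholders.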

I have already attached a hard disk to one of the client machines, since
neither echoing all the trace output back over the network nor storing it on a
ramdisk, which is gone after the system hangs, seems a good idea to me.


As I see it, I cannot avoid diving into the build process
- to enable debugging
- to rule out errors that are already fixed
- to solve the ro+wh permission display problem (although this might be a
minor issue)
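
To compare the displayed branch permissions before and after an update,
something like the following might do. It is a minimal sketch: the function
name show_aufs_branches is made up, and the optional argument only exists so
that it can be tried against a directory other than /sys. It reads the
si_*/brN entries mentioned above and nothing else.

```shell
#!/bin/sh
# Print every aufs branch entry (brN) found below <sysroot>/fs/aufs.
# $1 defaults to /sys; any other value is meant for testing only.
show_aufs_branches() {
    sysroot=${1:-/sys}
    for f in "$sysroot"/fs/aufs/si_*/br[0-9]*; do
        # Skip the unexpanded pattern when no aufs mount exists.
        [ -r "$f" ] || continue
        printf '%s: %s\n' "${f#"$sysroot"/fs/aufs/}" "$(cat "$f")"
    done
}
```

Called without an argument it prints one line per branch, so a branch that was
mounted as "=ro+wh" but is displayed as "ro" should be easy to spot.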

I haven't built a kernel for a decade or so, and back in the "good old SuSE
times" things (distribution-specific patching) worked quite differently from
how they work in Debian now. I'm afraid this will cost me a few more days to
get working.

I was quite glad to read in http://aufs.sourceforge.net/README.aufs2 about
the "module only method", and equally disappointed to find out that in
aufs3 this has obviously changed again :-(

Nevertheless, can I avoid a complete kernel rebuild if I build a module for
the 3.16 kernel, where aufs is still included?

Below is what ships with Debian (testing).

Is there a way to build the aufs module without going through the whole kernel
rebuild, starting from there?
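
What I have in mind is the generic kbuild invocation for an external module,
roughly as sketched here. Whether the aufs3 source can be built this way
against the Debian kernel is exactly my question, so please read it as an
illustration only: the function names and the source directory are
assumptions, and /lib/modules/<version>/build is assumed to be provided by the
matching kernel headers. With DRY_RUN=1 the command is only printed.

```shell
#!/bin/sh
# Sketch: build a module out of tree against an installed kernel's headers.

run() {
    # Print the command; execute it only when DRY_RUN is not 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = 1 ] || "$@"
}

# $1: directory holding the module source and its Makefile (assumption)
# $2: kernel version, defaults to the running kernel
build_ext_module() {
    srcdir=$1
    kver=${2:-$(uname -r)}
    kdir=/lib/modules/$kver/build
    run make -C "$kdir" M="$srcdir" modules
}
```

For the kernel shown below this would be called as "build_ext_module
<aufs source dir> 3.16.0-4-amd64".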

Thank you
Wolfgang Rosner


==============================================
.......# uname -v
#1 SMP Debian 3.16.7-ckt2-1 (2014-12-08)


.......$ grep AUFS /boot/config-3.16.0-4-amd64
CONFIG_AUFS_FS=m
CONFIG_AUFS_BRANCH_MAX_127=y
# CONFIG_AUFS_BRANCH_MAX_511 is not set
# CONFIG_AUFS_BRANCH_MAX_1023 is not set
# CONFIG_AUFS_BRANCH_MAX_32767 is not set
CONFIG_AUFS_SBILIST=y
# CONFIG_AUFS_HNOTIFY is not set
CONFIG_AUFS_EXPORT=y
CONFIG_AUFS_INO_T_64=y
# CONFIG_AUFS_FHSM is not set
# CONFIG_AUFS_RDU is not set
# CONFIG_AUFS_SHWH is not set
# CONFIG_AUFS_BR_RAMFS is not set
# CONFIG_AUFS_BR_FUSE is not set
CONFIG_AUFS_BR_HFSPLUS=y
CONFIG_AUFS_BDEV_LOOP=y
# CONFIG_AUFS_DEBUG is not set

........# modinfo aufs
filename:       /lib/modules/3.16.0-4-amd64/kernel/fs/aufs/aufs.ko
staging:        Y
alias:          fs-aufs
version:        3.16-20140908
description:    aufs -- Advanced multi layered unification filesystem
author:         Junjiro R. Okajima <aufs-users@lists.sourceforge.net>
license:        GPL
srcversion:     A0FD0D69EF9A1725B4DB4F8
depends:
intree:         Y
vermagic:       3.16.0-4-amd64 SMP mod_unload modversions
parm:           brs:use <sysfs>/fs/aufs/si_*/brN (int)
parm:           allow_userns:allow unprivileged to mount under userns (bool)


