Bruno Cesar Ribas wrote:
> On Mon, Dec 22, 2008 at 03:32:59PM -0800, Skylar Thompson wrote:
>> I'm running into some difficulty running aufs over NFS using a 2.6.27.10
>> kernel and the latest CVS fetch of aufs. I have a Linux 2.6.27.10 server
>> that does DHCP, PXE, and NFS for a diskless x86 Linux cluster. The nodes
>> fetch their aufs-enabled kernel and initramfs over TFTP, mount the
>> server's root filesystem over NFS, and then use aufs to provide a series
>> of overlays above the root filesystem. Here are my branches:
>>
>> 1. The entire server root filesystem, mounted over NFS and set as a
>> read-only branch.
>> 2. A subdirectory for the entire cluster, mounted as a read-only branch.
>> 3. A subdirectory for that specific node, based on its IP address, and
>> mounted as a read-write branch.
>>
>> Branch #3 is the only read-write branch, so it's where all the writes
>> end up. The problem is that about 75% of the time I boot a diskless
>> node, it hangs between when the init scripts complete, and when the init
>> process spawns a login prompt. Rebooting several times eventually gets a
>> node all the way to the login prompt.
>>
>> This is a stock Linux kernel with only aufs applied. I've also applied
>> the lhash, splice, sec_perm, and ksize patches. I also use this same
>> setup to provide diskless nodes with just a RAM disk (no NFS), and this
>> works fine, so I suspect it's an NFS problem.
>>
>> Let me know what kind of additional information would be useful to help
>> in debugging.
> 
> Send us your mount commands and the aufs trace from dmesg (if any).



> Also the items listed in the README =)

Ooops. Missed that part.

> 5. Contact
> ----------------------------------------
> When you have any problems or strange behaviour in aufs, please let me
> know with:
> - /proc/mounts (instead of the output of mount(8))

b...@bccd-ng1:/diskless/clients/10.4.0.112/tmp$ cat mounts.txt
rootfs / rootfs rw 0 0
none /root/sys sysfs rw,nosuid,nodev,noexec 0 0
none /root/proc proc rw,nosuid,nodev,noexec 0 0
udev /root/dev tmpfs rw,size=10240k,mode=755 0 0
10.4.0.1:/ /root nfs rw,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nointr,nolock,proto=tcp,timeo=7,retrans=10,sec=sys,addr=10.4.0.1 0 0
10.4.0.1:/diskless/bccd /root/diskless/bccd nfs rw,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.4.0.1 0 0
10.4.0.1:/diskless/clients /root/diskless/clients nfs rw,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.4.0.1 0 0
10.4.0.1:/ /root/dev/.static/dev nfs rw,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nointr,nolock,proto=tcp,timeo=7,retrans=10,sec=sys,addr=10.4.0.1 0 0
none /root/tmp tmpfs rw 0 0
none / aufs rw,si=d04e9d25,xino=/tmp/.aufs.xino,br:/root/diskless/clients/10.4.0.112=rw:/root/diskless/bccd=ro:/root=ro 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,mode=755 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec 0 0
usbfs /proc/bus/usb usbfs rw,nosuid,nodev,noexec,devmode=666 0 0
none /dev/.static/dev aufs rw,si=d04e9d25,xino=/tmp/.aufs.xino,br:/root/diskless/clients/10.4.0.112=rw:/root/diskless/bccd=ro:/root=ro 0 0
tmpfs /dev tmpfs rw,size=10240k,mode=755 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,gid=5,mode=620 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
10.4.0.1:/cluster/home /cluster/home nfs rw,vers=3,rsize=32768,wsize=32768,namlen=255,hard,intr,proto=tcp,timeo=600,retrans=2,sec=sys,addr=10.4.0.1 0 0
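
For anyone comparing the branch layout across nodes, the aufs entries in the mounts output above can be pulled apart with a few lines of Python. This is only a rough sketch, not part of aufs: the function name aufs_branches is made up for illustration, and it assumes branch paths contain no commas, colons, or escaped spaces (which holds for the paths shown here).

```python
def aufs_branches(mounts_text):
    """Return {mount_point: [(branch_path, mode), ...]} for each aufs mount.

    Hypothetical helper for illustration; expects text in /proc/mounts format.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts lines have six fields:
        # device, mount point, fs type, options, dump, pass
        if len(fields) != 6 or fields[2] != "aufs":
            continue
        mount_point, options = fields[1], fields[3]
        for opt in options.split(","):
            if not opt.startswith("br:"):
                continue
            branches = []
            # Branches are colon-separated "path=mode" pairs, topmost first.
            for entry in opt[3:].split(":"):
                path, _, mode = entry.partition("=")
                branches.append((path, mode))
            result[mount_point] = branches
    return result
```

Feeding it the contents of /proc/mounts from a node should show the per-node directory as the only rw branch, with the cluster-wide directory and the NFS root below it as ro.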


> - /sys/fs/aufs/* (if you have them)

/sys/fs/aufs/
/sys/fs/aufs/stat
/sys/fs/aufs/debug
/sys/fs/aufs/si_d04e9d25
/sys/fs/aufs/si_d04e9d25/xino
/sys/fs/aufs/si_d04e9d25/xigen

> - /sys/module/aufs/*

/sys/module/aufs/
/sys/module/aufs/parameters
/sys/module/aufs/parameters/brs
/sys/module/aufs/parameters/nwkq
/sys/module/aufs/parameters/sysrq
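
The listings above only show the file names. If the values are wanted as well, something along these lines could gather them in one go. It is a minimal sketch under the assumption that the entries are small text files; the helper name read_tree is hypothetical, and entries that cannot be read are recorded as such rather than aborting the walk.

```python
import os


def read_tree(root):
    """Map each regular file under root to its stripped text contents."""
    values = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as fh:
                    values[path] = fh.read().strip()
            except OSError as exc:
                # Some sysfs entries are write-only or fail on read.
                values[path] = "<unreadable: %s>" % exc.strerror
    return values
```

Calling read_tree("/sys/fs/aufs") and read_tree("/sys/module/aufs") on a booted node and printing each path with its value would give output that can be pasted straight into a report.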


> - linux kernel version
>   if your kernel is not plain, for example modified by distributor,
>   the url where i can download its source is necessary too.

2.6.27.10

> - aufs version which was printed at loading the module or booting the
>   system, instead of the date you downloaded.

aufs 20081208

> - configuration (define/undefine CONFIG_AUFS_xxx, or plain local.mk is
>   used or not)

I used a plain local.mk, with the only modification being the KDIR variable.

> - kernel configuration or /proc/config.gz (if you have it)

This one is big, so I've attached it.

> - behaviour which you think to be incorrect

init frequently hangs on boot after the init scripts finish running and
before the login prompt appears. If a node does manage to get past that
point, though, it is completely reliable from then on.

> - actual operation, reproducible one is better

Our setup wouldn't be easy to reproduce, but we do have it easily
automated from an ISO image. If this would be useful, I can provide a
link and directions for getting everything going.

-- 
-- Skylar Thompson ([email protected])
-- http://www.cs.earlham.edu/~skylar/

Attachment: config.gz
Description: application/gzip
