[email protected] wrote:
> Hello Skylar,
>
> Skylar Thompsonさん:
>> end up. The problem is that about 75% of the time I boot a diskless
>> node, it hangs between when the init scripts complete, and when the init
>> process spawns a login prompt. Rebooting several times gets a node all
>> the way to the init prompt.
>
> Just to make sure, your linuxrc (or the init in your initramfs/initrd)
> executes,
> - mount all branches
> - mount aufs
> - "mount --move" all branches
> - chroot or switch_root and executes the init under aufs
> - and stopped
>
> Since you wrote "the init scripts complete", I guess some messages
> about starting daemons were displayed on your console as expected.
> Are you sure that ALL init procedures (instead of linuxrc in initramfs)
> are processed? And only the last prompt was not displayed?

Right. The AUFS stack is set up in the kernel initramfs as /root, and then
the system pivots into that. Then /sbin/init is run as if the system had a
local hard drive. /sbin/init runs all the init scripts successfully (I put
a test echo at the end of the last one to make sure of this) and then hangs
right before starting the login prompt.

>
>> This is a stock Linux kernel with only aufs applied. I've also applied
>> the lhash, splice, sec_perm, and ksize patches. I also use this same
>> setup to provide diskless nodes with just a RAM disk (no NFS), and this
>> works fine, so I suspect it's an NFS problem.
>
> According to your config.gz, you set CONFIG_AUFS=y so you don't need
> the sec_perm patch. And the ksize patch is unnecessary too. It is for
> linux-2.6.21 and earlier.

Does it hurt to have them included? If it does, I can rebuild the kernel
and test again.

> ----------------------------------------------------------------------
>
>> b...@bccd-ng1:/diskless/clients/10.4.0.112/tmp$ cat mounts.txt
> :::
>> 10.4.0.1:/ /root nfs
>> rw,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nointr,nolock,proto=tcp,timeo=7,retrans=10,sec=sys,addr=10.4.0.1
>> 0 0
>> 10.4.0.1:/diskless/bccd /root/diskless/bccd nfs
>> rw,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.4.0.1
>> 0 0
>> 10.4.0.1:/diskless/clients /root/diskless/clients nfs
>> rw,vers=3,rsize=65536,wsize=65536,namlen=255,hard,nointr,nolock,proto=tcp,timeo=7,retrans=3,sec=sys,addr=10.4.0.1
>> 0 0
> :::
>> none /root/tmp tmpfs rw 0 0
>> none / aufs
>> rw,si=d04e9d25,xino=/tmp/.aufs.xino,br:/root/diskless/clients/10.4.0.112=rw:/root/diskless/bccd=ro:/root=ro
>> 0 0
>
> Here the aufs line shows "xino=/tmp/.aufs.xino" which is the default
> path when the writable branch is NFS. I am afraid that you forgot to
> "mount --move" /tmp before chroot. It is necessary because aufs refers
> to it even if your initramfs is freed.
> In your case, I'd recommend you to move /tmp.
> For instance,
> - mount all branches
> - mount tmpfs /tmp
> - mount aufs
> - mount --move all branches and /tmp under /root
> - chroot or switch_root to /root and execute the init

Examining this process actually led me to stumble onto what I think is the
real problem, which might be unrelated to aufs. It looks like the udev
tmpfs that is started in the initramfs isn't being migrated properly to the
new root filesystem. This means that init creates initctl over NFS in the
r/w aufs branch, which appears to be causing some locking problems. I don't
know if the locking problems are aufs-related or not, but obviously the
solution is to get /dev migrated properly.

> And I don't think CONFIG_AUFS_EXPORT=y, CONFIG_AUFS_ROBR=y,
> CONFIG_AUFS_SHWH=y and CONFIG_AUFS_RR_SQUASHFS=y are necessary for you.

Does it hurt to have these enabled? I wasn't sure which ones I needed, so I
enabled all the ones that looked like they could be useful.

Thanks for the suggestions!

--
-- Skylar Thompson ([email protected])
-- http://www.cs.earlham.edu/~skylar/
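
For reference, the sequence suggested above, with the udev /dev tmpfs moved
as well, might look roughly like this at the end of the initramfs init
script. This is only a sketch under assumptions: the /br staging mount point
and the run/boot_sequence helper names are hypothetical, the server address,
exports and client directory are copied from the quoted mounts.txt, and it
assumes a mount that accepts --move and a switch_root that does not move /dev
by itself. By default it only prints the commands (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch only: /br and the helper names are hypothetical, not from the thread.
# Server address, exports and client directory come from the quoted mounts.txt.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
SERVER="10.4.0.1"
CLIENT="10.4.0.112"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@" || exit 1
    fi
}

boot_sequence() {
    # 1. Mount all branches (NFS exports) on a staging directory.
    run mkdir -p /br /root
    run mount -t nfs -o nolock "$SERVER:/" /br
    run mount -t nfs -o nolock "$SERVER:/diskless/bccd" /br/diskless/bccd
    run mount -t nfs -o nolock "$SERVER:/diskless/clients" /br/diskless/clients

    # 2. Mount a tmpfs on /tmp to hold the aufs xino file.
    run mount -t tmpfs none /tmp

    # 3. Mount aufs on /root: writable client branch first, read-only below.
    run mount -t aufs \
        -o "xino=/tmp/.aufs.xino,br:/br/diskless/clients/$CLIENT=rw:/br/diskless/bccd=ro:/br=ro" \
        none /root

    # 4. Move the branches, /tmp and /dev under the new root so they remain
    #    reachable once the initramfs is freed. Moving /br takes the nested
    #    NFS mounts along with it.
    run mount --move /br /root/root
    run mount --move /tmp /root/tmp
    run mount --move /dev /root/dev

    # 5. Switch to the aufs root and start the real init.
    if [ "$DRY_RUN" = 1 ]; then
        echo "exec switch_root /root /sbin/init"
    else
        exec switch_root /root /sbin/init
    fi
}

boot_sequence
```

Running it as-is prints the planned commands, which makes it easy to compare
against what the existing init script actually does before setting DRY_RUN=0.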
