Hi Mark,
Just to report back:

We have tried your (no longer recommended) patch
https://gerrit.openafs.org/#/c/12796/
as you pointed out in the thread "getcwd() error for RHEL 7.4 kernel" on
the openafs-info list.

As far as we have seen, this indeed solved our disappearing mount point
problems.

We will of course switch to the new version of the patch (or maybe just
1.8.0) as soon as there is one.

Thanks for your work!

Best regards,
/ragge

> On 3 Nov 2017, at 17:29, Ragnar Sundblad <[email protected]> wrote:
>
> Hi Mark,
>
>> On 3 Nov 2017, at 15:51, Mark Vitale <[email protected]> wrote:
>>
>> Ragge,
>>
>>> On Nov 3, 2017, at 9:46 AM, Ragnar Sundblad <[email protected]> wrote:
>>>
>>> We have compute clusters where the nodes have almost everything of their
>>> roots in afs; most things in /, such as /etc and /usr, are soft links into a
>>> complete os installation in afs. To be able to have some writable files and
>>> directories, such as /etc/adjtime or /var/tmp, we bind mount files and
>>> directories in the tree which is actually in afs (mainly using the rwtab
>>> functionality), and a lustre client that also gets mounted in the afs tree.
>>>
>>> When we upgraded from CentOS 7.3 to 7.4, kernel 3.10.0-693.5.2.el7.x86_64,
>>> and using OpenAFS client 1.6.21.1 or 1.6.20.1, when users having home
>>> directories in afs log in and start accessing their data, mounts in the afs
>>> tree start to get randomly unmounted. In the lustre case, the lustre
>>> client nicely reports that it unmounts, so the unmounts seem to be handled
>>> in an orderly manner.
>>>
>>> We have a suspicion this may be related to the problem reported in the
>>> thread "getcwd() error for RHEL 7.4 kernel", and that the kernel for
>>> some reason decides that the path to the mount point is no good and unmounts.
>>> In addition, when this has started to happen, we are not able to mount
>>> anything more into afs; mount returns ENOENT.
>>>
>>> This is pretty easy to repeat.
>>
>> Thank you for your detailed report.
>> I have an idea about what this may be, but I will try to duplicate it on my
>> test system first.
>
> Thanks for investigating! :-)
>
>>> Our workaround for now is to use the tmpfs based root all the way down to
>>> the mount points, and have soft links into afs further down for the rest,
>>> which seems to work.
>>
>> It's good that you have a workaround; thank you for sharing that as well.
>>
>>> Please let us know if we can provide any help debugging this.
>>
>> For now I would like to see your afsd options, and also the output from
>> "cmdebug <client> -cache" for an affected client.
>
> We start it like so:
> /bin/chroot /sysimage /usr/vice/etc/afsd -memcache -verbose -nosettime
> -dynroot -mountdir /afs
> (Before systemd is started, we set up the runtime root in /sysimage, then
> chroot there, and start systemd to let it bring up the system.)
>
> Here is a cmdebug:
> # cmdebug tegner-login-2 -cache
> Chunk files: 1562
> Stat caches: 2343
> Data caches: 1562
> Volume caches: 200
> Chunk size: 65536
> Cache size: 100000 kB
> Set time: no
> Cache type: memory
>
> I now see that I forgot to mention that we use memory cache (since the nodes
> are diskless).
>
>> Although you haven't reported the getcwd() problem, could you please
>> confirm if you've seen it or not?
>
> We have not seen it, but we haven't really looked for it either. Is there
> some test we could try?
>
>> And finally, just to confirm, you have seen bind mounts in /afs unmounted at
>> CentOS 7.4 with both OpenAFS 1.6.21.1 and 1.6.20.1, but _not_ with CentOS
>> 7.3 and those same OpenAFS client releases - correct?
>
> With 7.3 (kernel 3.10.0-514.26.2.el7.x86_64) we actually used openafs client
> 1.6.20.2, but with that combination this mount-within-afs thing worked just
> fine.
>
> Thanks!
>
> /ragge
>
> _______________________________________________
> OpenAFS-devel mailing list
> [email protected]
> https://lists.openafs.org/mailman/listinfo/openafs-devel
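
For anyone wanting to check a node for the two symptoms discussed in this
thread (mounts disappearing from the afs tree, and getcwd() failing), here is
a minimal sketch in Python. It is not part of OpenAFS and not something Mark
suggested; the function names are hypothetical, it assumes the Linux
/proc/self/mounts format, and it assumes a getcwd() failure would surface as
an OSError.

```python
import os


def parse_mount_points(mounts_text):
    """Return the set of mount points found in /proc/self/mounts-style text."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            # The second field is the mount point; the kernel writes a
            # space inside a path as the octal escape \040.
            points.add(fields[1].replace("\\040", " "))
    return points


def missing_mounts(expected, mounts_text):
    """Return the expected mount points that are no longer mounted, sorted."""
    current = parse_mount_points(mounts_text)
    return sorted(p for p in expected if p not in current)


def getcwd_works(path):
    """chdir into path and report whether getcwd() succeeds there.

    Assumption: the problem, if present, shows up as an OSError
    (for example ENOENT) from chdir() or getcwd().
    """
    saved = os.getcwd()
    try:
        os.chdir(path)
        os.getcwd()
        return True
    except OSError:
        return False
    finally:
        os.chdir(saved)
```

On a node one might read /proc/self/mounts, pass its contents to
missing_mounts() together with the list of bind mounts that are supposed to
exist (for example /etc/adjtime and /var/tmp), and call getcwd_works() on a
directory inside /afs, repeating this while users are logged in.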
