On Mon, Mar 19, 2018 at 11:14:13AM +1100, Ian Wienand wrote:
> Hello,
>
> With 1.8.0~pre5 we occasionally get
>
>  [Fri Mar 16 08:00:41 2018] find_preferred_connection: no connection and !create
>  [Fri Mar 16 08:00:41 2018] find_preferred_connection: no connection and !create
>  [Fri Mar 16 10:00:07 2018] find_preferred_connection: no connection and !create
>  [Fri Mar 16 12:00:06 2018] find_preferred_connection: no connection and !create
>  [Fri Mar 16 14:00:07 2018] find_preferred_connection: no connection and !create
>  [Fri Mar 16 16:42:15 2018] find_preferred_connection: no connection and !create
>  [Fri Mar 16 18:21:58 2018] find_preferred_connection: no connection and !create
>
> in the kernel logs.  You can see from [1] it's usually around the top
> of the hour when mirroring processes start; but not always.  I've had
> a look at [2] ... there doesn't seem to be anything obviously tunable
> about this?  Is it something we should worry about?

I think it should be harmless, and should probably be removed.  It looks
like the only place where we call find_preferred_connection() with
create == 0 is within afs_ConnByHost(), where we first check if there's
an existing connection to reuse, and if not, we create one.  So this
message would just be telling us that we are not reusing a cached
connection and had to make a new one, which is mostly of interest only
to the developer working on the code.

> ---
>
> For background ... in OpenStack we have based our mirroring
> infrastructure off AFS.  We have a single host that updates from
> various upstream mirrors to RW volumes then releases them; mirror
> hosts in various remote clouds then serve the volumes via apache to
> local nodes in their own cloud.
>
> Unfortunately this mirror updater has been very unstable lately.  In
> particular, we use "reprepro" to mirror deb-based repositories like
> Debian, Ubuntu, Ubuntu Ports, etc. and its on-disk databases are very
> sensitive to corruption of files; when it does happen, recovering or
> remirroring these big repos is not fun (others we just rsync, which is
> much more tolerant to failures).
>
> We were previously running Trusty on this host, which would be openafs
> 1.6.7 [3].  We'd fairly regularly see things like:
>
>  afs: Lost contact with file server 104.130.138.161 in cell openstack.org (code -512) (all multi-homed ip addresses down for the server)
>  afs: failed to store file (110)
>
> and at the fs level we'd end up with files not written or corruption.
>
> Anyway, it didn't seem worth spending time on such old code; we have
> upgraded the host to Xenial now, and are using a backport of the
> bionic 1.8.0~pre5 packages in a PPA [4].  This is so far working well,
> modulo the warning above.

That's great feedback to hear; thanks!

-Ben
_______________________________________________
OpenAFS-devel mailing list
OpenAFS-devel@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-devel