Hi Thomas,

> I was asked offlist:
>
> Yes, I see wcc errors like these:
>
> Mar 29 20:51:20 yt /netbsd: [ 208996.0919434] 192.168.0.18:/volume2/foo:
> re-enabling wcc
> Mar 29 20:51:20 yt /netbsd: [ 209116.2535171] 192.168.0.18:/volume2/foo:
> inaccurate wcc data (ctime) detected, disabling wcc (ctime
> 1617042615.889440844 1617042615.889440844, mtime 1617042615.889440844
> 1617042615.889440844)
>
> Perhaps that's the root issue?

I guess that was me ;-)

I had seen a bug when building the distribution with the obj directory
served over NFS from a Linux (QNAP NAS) server. The build failed with:

    info.info: Invalid argument
    makeinfo: Removing output file `info.info' due to errors; use --force to preserve.

Looking at the makeinfo code, it appeared to write the file and then
read it back without closing it in between. I wrote a tiny reproducer
for this, but I didn't find the root cause of the problem.

I suspect that with some versions of Linux NFS, when we detect
inaccurate wcc data, something goes wrong between the vnode cache and
the NFS attribute cache when we process the mbuf, and we end up delaying
writes for up to 30 seconds.

I also tested by forcing us to detect inaccurate wcc data against a
NetBSD NFS server, but that didn't show the problem. A few people have
tested my reproducer with Linux NFS servers without seeing it either, so
it seems that this bug only occurs on a small number of servers.

However, this isn't the same bug as the one Thomas sees, so I'm only
guessing that it might be related. Also, the bug that I was seeing was
100% reproducible for me, whereas this one looks to be
intermittent/random.

Regards,

Julian

PS. I upgraded the software on my NAS and I no longer see this problem
when running my reproducer, although the wcc message is still present.
This is why I suspect that it only happens with some Linux NFS server
versions.
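
For anyone who wants to try the access pattern described above (write a
file, then read it back through the same descriptor without closing it),
here is a minimal sketch in Python. It is not the actual reproducer
mentioned in this mail; the function names, the file name and the
payload size are hypothetical, and it only shows the pattern rather than
being guaranteed to trigger the bug.

```python
import os


def write_then_read(path, payload):
    """Write payload to path, then read it back without closing the file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Write the whole payload, allowing for short writes.
        written = 0
        while written < len(payload):
            written += os.write(fd, payload[written:])

        # Rewind and read back through the same open descriptor.
        os.lseek(fd, 0, os.SEEK_SET)
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def check(directory, size=100000):
    """Return True if the data read back matches what was written."""
    # Hypothetical scratch file name; point `directory` at the NFS mount.
    path = os.path.join(directory, "wcc-test.tmp")
    payload = b"x" * size
    try:
        return write_then_read(path, payload) == payload
    finally:
        if os.path.exists(path):
            os.unlink(path)
```

Calling check() with a directory on the NFS mount returns False if the
data read back differs from what was written; an error such as EINVAL
would surface as an OSError instead.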
