Re: zero bytes in files written to NFS

Julian Coleman Tue, 30 Mar 2021 06:11:51 -0700

Hi Thomas,

> I was asked offlist:
> 
> Yes, I see wcc errors like these:
> 
> Mar 29 20:51:20 yt /netbsd: [ 208996.0919434] 192.168.0.18:/volume2/foo: re-enabling wcc
> Mar 29 20:51:20 yt /netbsd: [ 209116.2535171] 192.168.0.18:/volume2/foo: inaccurate wcc data (ctime) detected, disabling wcc (ctime 1617042615.889440844 1617042615.889440844, mtime 1617042615.889440844 1617042615.889440844)
> 
> Perhaps that's the root issue?


I guess that was me ;-)  I had seen a bug when building the distribution
with the obj directory served by NFS from a Linux (QNAP NAS) server.  The
build failed with:

  info.info: Invalid argument
  makeinfo: Removing output file `info.info' due to errors; use --force to preserve.

Looking at the makeinfo code, it looked like it was doing a write of the file
and then a read without closing.  I wrote a tiny reproducer for this, but I
didn't find the root cause of the problem.
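
For anyone who wants to try the same access pattern, here is a minimal sketch
of a write followed by a read on the same descriptor, with no close() or
fsync() in between.  It is not my original reproducer: the function name
write_then_read(), the test pattern and the return-value convention are all
made up for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the original reproducer): write a known pattern
 * to `path`, then read it back through the same descriptor without closing
 * it first.  Returns 0 if the data read back matches, 1 on a mismatch or
 * short read, and -1 if a system call fails.
 */
int
write_then_read(const char *path)
{
	static const char pattern[] = "wcc test pattern\n";
	char buf[sizeof(pattern)];
	const size_t len = sizeof(pattern) - 1;
	ssize_t n;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd == -1) {
		perror("open");
		return -1;
	}

	n = write(fd, pattern, len);
	if (n == -1) {
		perror("write");
		close(fd);
		return -1;
	}
	if ((size_t)n != len) {
		fprintf(stderr, "short write: %zd of %zu bytes\n", n, len);
		close(fd);
		return -1;
	}

	/* Deliberately no close() or fsync(): rewind and read back. */
	if (lseek(fd, 0, SEEK_SET) == -1) {
		perror("lseek");
		close(fd);
		return -1;
	}
	memset(buf, 0, sizeof(buf));
	n = read(fd, buf, len);
	if (n == -1) {
		perror("read");
		close(fd);
		return -1;
	}

	close(fd);
	return ((size_t)n == len && memcmp(buf, pattern, len) == 0) ? 0 : 1;
}
```

Call it from a small main() with a path on the NFS mount.  On a local file
system it should return 0; a non-zero result on NFS would be worth reporting
together with the server type and version.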

I suspect that with some versions of Linux NFS, when we detect inaccurate
wcc data, something goes wrong between the vnode cache and the NFS attribute
cache when we process the mbuf and we end up delaying writes for up to 30
seconds.  I also tested by forcing us to detect inaccurate wcc data on a
NetBSD NFS server, but that didn't show the problem, and a few people have
tested my reproducer with Linux NFS servers without seeing it either, so it
seems that this bug only occurs on a small number of servers.

However, this isn't the same bug as Thomas sees, so I'm only guessing that
it might be related.  Also, the bug that I was seeing was 100% reproducible
for me and this looks to be intermittent/random.

Regards,

Julian

PS.  I upgraded the software on my NAS and I no longer see this problem when
running my reproducer, although the wcc message is still present.  This is
why I suspect that this only happens with some versions of the Linux NFS server.
