Hi,

Last week, I upgraded three machines from Bookworm to Trixie. All three mount the same NFS export. Since the upgrade, the number of NFS RPCs per client has increased from 4-9k to 17-28k per 30 seconds. The user workload has not changed.

The kernel was upgraded from 6.12.22-1~bpo12+1 (Bookworm backports) to 6.18.9-1~bpo13+1 (Trixie backports). Using the previous kernel from Bookworm (on Trixie) does not help. I also gave 6.12.73+deb13-amd64 (stable, not backports) a shot; no change.

Many more DelegReturn, Close, OpenNoattr (Linux shorthand[1] for SEQUENCE + PUTFH + OPEN) and TestStateID RPCs are issued, even though the user workload has not changed. The number of StatFs and Getattr calls has decreased by factors of about 8.75 and 1.3, respectively.
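
For anyone who wants to compare on their own clients: below is a minimal sketch of how per-operation counts over a 30-second window could be sampled. It is not the tooling behind my graphs. It assumes the per-op section of /proc/self/mountstats, where the first number after each operation name is the operation count, and it sums across all NFS mounts. The function names (parse_per_op_counts, delta, sample) are made up for illustration, and the operation names in mountstats may be spelled differently from the ones I used above.

```python
import time
from collections import Counter


def parse_per_op_counts(mountstats_text):
    """Sum the per-op operation counter (first field) across all mounts."""
    counts = Counter()
    in_per_op = False
    for line in mountstats_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("device "):
            # A new mount entry begins; its per-op section comes later.
            in_per_op = False
        elif stripped.startswith("per-op statistics"):
            in_per_op = True
        elif in_per_op and ":" in stripped:
            op, _, fields = stripped.partition(":")
            values = fields.split()
            if values and values[0].isdigit():
                counts[op.strip()] += int(values[0])
    return counts


def delta(before, after):
    """Return only the operations whose count changed between two samples."""
    return {
        op: after[op] - before.get(op, 0)
        for op in after
        if after[op] != before.get(op, 0)
    }


def sample(interval=30, path="/proc/self/mountstats"):
    """Read the counters twice, `interval` seconds apart, and diff them."""
    with open(path) as f:
        before = parse_per_op_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_per_op_counts(f.read())
    return delta(before, after)


if __name__ == "__main__":
    # Print the busiest operations first.
    for op, n in sorted(sample().items(), key=lambda kv: -kv[1]):
        print(f"{op:>20} {n}")
```

Running this on a Bookworm client and a Trixie client at the same time should make it easy to see which operations account for the difference.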

Here are two graphs showing the requests: https://nextcloud.cyberfusion.nl/s/8De2coZyjaZgZk2 (password 'Kz55T8imXi'). The massive increase immediately follows the Trixie upgrade. (The mailing list does not seem to pick up my message when I attach the images.)

The NFS server is on Bookworm (6.1.0-41-amd64). There is NO increase on (still Bookworm) clients using the same NFS server.

I considered that memory allocation might have changed, i.e. that less cache is available, which would naturally cause more requests. That is not the case; the amount of available RAM is about the same as before the upgrade.

Have others seen this behaviour? Does anyone have an idea where this increase comes from? I haven't yet thought of anything in userspace that might affect this.

With kind regards,

William David Edwards

[1]: https://www.fsl.cs.stonybrook.edu/docs/nfs4perf/nfs4perf-microscope.pdf
