Hi, here is a webrev for
6707722 nfs4_start_fop() uses gethrestime_sec() with 1 second granularity to detect server change 6721281 panic: assertion failed: np != 0L, file: ../../common/fs/nfs/nfs4_callback.c, line: 1309 6500269 assertion failed: sp->mntinfo4_list == mi, file: ../../common/fs/nfs/nfs4_vfsops.c, line: 3258 http://cte-www.uk/net/tb3.uk/tank/u/pf199842/6500269-6707722-6721281/webrev/ Background info: All three CRs are related to the nfs4_server_t <-> mntinfo4_t relationship. Not to reinvent the wheel, I will use Sam Falkner's description of that relationship, taken from: *6667873 System panics in nfs4_recov_thread <http://monaco.sfbay.sun.com/detail.jsf?cr=6667873>* <snip> The data structures go roughly like this, where the arrows mean "you can go in this direction intuitively". vfs_t ^ | v mntinfo4_t <- nfs4_server_t | v servinfo4_t Note that it's relatively easy to get from nfs4_server_t to mntinfo4_t, but not vice-versa. mntinfo4_t is at the heart of things, and the client code needs to get from mntinfo4_t to nfs4_server_t sometimes; notably, during unmount and failover. In order to get from mntinfo4_t to nfs_server_t, we do the following: get the current servinfo4_t for our mntinfo4_t for every nfs4_server_t in the kernel: if the connection address in servinfo4_t matches the address in nfs4_server_t success </snip> Here is a short description and a fix for each CR: ---- 6500269: in this CR the MOUNT code in nfs4_mount() is interrupted. During the cleanup code, the lookup of nfs4_server_t for mntinfo4_t is mismatched. The code tries to remove mntinfo4_t from wrong nfs4_server_t. We get a panic. FIX : introduce a direct pointer from mntinfo4_t to nfs4_server_t ---- 6707722: This bug was never reported on a live system - just spotted during code reading. As synopsis says: "nfs4_start_fop() uses gethrestime_sec() with 1 second granularity to detect server change" If a failover of nfs4_server happens, nfs4_start_fop() might not notice that. FIX: use an incrementing counter for detection of a server change ---- 6721281: In this CR, the recovery thread is not able to find the corresponding nfs4_server_t. The situation is different here: - by design neither mntinfo4_t nor recovery thread gets a reference count (s_refcnt) on nfs4_server_t; so the code cannot assume that nfs4_server_t exists. - however nfs4delegreturn_cleanup_impl() always assumes that nfs4_server_t exists; this is a regression of 6632214 FIX: if r_deleg_type == OPEN_DELEGATE_NONE; do not lookup nfs4_server_t ===== Thanks, Pavel