While testing Ganesha NFS V2.4.0.3 using the CEPH FSAL against a Ceph file system, I am seeing the ganesha.nfsd process die from an assert call multiple times per hour. I have also seen it die at the same place in the code using the VFS FSAL with an ext4 file system, but much less often.
It is dying at line 917 in src/SAL/state_misc.c, which is called by src/SAL/state_misc.c at line 1010. The assert call is in dec_state_owner_ref(), at the line:

    assert(refcount > 0);

Looking at the core files and adding some debugging code confirms that refcount is -1 when the assert call is made. It looks like the owner count is trying to go to -1 in uncache_nfs4_owner(), but since it happens only occasionally, I think it is a race condition.

Info on the build:

Host OS is Ubuntu 14.04 with a 4.8.2 x86_64 kernel on an 8-processor system.

Cmake command:

# cmake -DCMAKE_INSTALL_PREFIX=/opt/keeper -DALLOCATOR=jemalloc -DUSE_ADMIN_TOOLS=ON -DUSE_DBUS=ON ../src

# ganesha.nfsd -v
ganesha.nfsd compiled on Oct 17 2016 at 16:50:18
Release = V2.4.0.3
Release comment = GANESHA file server is 64 bits compliant and supports NFS v3,4.0,4.1 (pNFS) and 9P
Git HEAD = 0f55a9a97a4bf232fb0e42542e4ca7491fbf84ce
Git Describe = V2.4.0.3-0-g0f55a9a

# ceph -v
ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)

# cat ganesha.conf
LOG {
    components {
        ALL = INFO;
    }
}

EXPORT_DEFAULTS {
    SecType = none, sys;
    Protocols = 3, 4;
    Transports = TCP;
}

# define CephFS export
EXPORT {
    Export_ID = 42;
    Path = /top;
    Pseudo = /top;
    Access_Type = RW;
    Squash = No_Root_Squash;
    FSAL {
        Name = CEPH;
    }
}

The VFS export for the ext4 tests was:

# define CephFS export
EXPORT {
    Export_ID = 43;
    Path = /var/top;
    Pseudo = /var/top;
    Access_Type = RW;
    Squash = No_Root_Squash;
    FSAL {
        Name = VFS;
    }
}

The test used 2 Ubuntu 14.04 NFS clients, each running 6 processes that write 11,000 256k files into separate directory trees, with 11 files per lowest-level node. On each client, 3 processes wrote to an NFS 3 mount and 3 wrote to an NFS 4 mount. The files are then read and verified, deleted, and the test restarts.

Regards,
Eric
_______________________________________________
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel