Hi,

I am consistently hitting a ganesha crash with a v3 client using the following steps:
1.) Start ganesha exporting a glusterfs volume
2.) Mount the volume using NFSv3
3.) cd <mount directory>
4.) ls ---> gives "invalid argument error on .."
5.) Running ls again crashes the ganesha process

Issue:
In this case, nfs3_readdirplus() performs a lookup on the parent directory of the root (which it should not). The first call fails with an invalid argument error and, if I understand it correctly, ends up cleaning up export->exp_root_obj. The next time "ls" is run, ganesha crashes.

bt
#0  0x0000000000000000 in ?? ()
#1  0x00000000004f1978 in nfs_export_get_root_entry (export=0xbe0968, obj=0x7fc6a1ff1a98)
    at /root/nfs-ganesha/src/support/exports.c:1551
#2  0x000000000042d6f1 in fsal_lookupp (obj=0x7fc688001688, parent=0x7fc6a1ff1ee8)
    at /root/nfs-ganesha/src/FSAL/fsal_helper.c:904
#3  0x000000000048b013 in nfs3_readdirplus (arg=0x7fc684000aa8, req=0x7fc6840008e8, res=0x7fc6bc000b20)
    at /root/nfs-ganesha/src/Protocols/NFS/nfs3_readdirplus.c:217
#4  0x0000000000447ea0 in nfs_rpc_execute (reqdata=0x7fc6840008c0)
    at /root/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1245
#5  0x00000000004487de in worker_run (ctx=0xcd3950)
    at /root/nfs-ganesha/src/MainNFSD/nfs_worker_thread.c:1553
#6  0x00000000004f5b88 in fridgethr_freeze (fr=0xcd3950, thr_ctx=0x7fc6840008c0)
    at /root/nfs-ganesha/src/support/fridgethr.c:472
#7  0x00007fc6d4077555 in ?? ()
#8  0x0000000000000000 in ?? ()
(gdb) f 1
#1  0x00000000004f1978 in nfs_export_get_root_entry (export=0xbe0968, obj=0x7fc6a1ff1a98)
(gdb) p export->exp_root_obj->obj_ops->get_ref
$3 = (void (*)(struct fsal_obj_handle *)) 0x0

It crashed at this point: get_ref is a NULL function pointer. This is my analysis based on my understanding of the code.
nfs3_readdirplus calls fsal_lookupp() to perform a lookup on the parent:
----------------
if (begin_cookie <= 1) {
        struct fsal_obj_handle *parent_dir_obj = NULL;

        fsal_status_gethandle = fsal_lookupp(dir_obj, &parent_dir_obj);
----------------

fsal_lookupp has a special check for the root entry that skips the lookup, but the check fails in this case (the two pointers are different):
----------------
if (obj == root_obj) {
        *parent = obj;
        return fsalstat(ERR_FSAL_NO_ERROR, 0);
}
----------------

So I checked how obj and root_obj are populated.

root_obj is obtained from nfs_export_get_root_entry(op_ctx->export, &root_obj); [fetched from op_ctx->export->exp_root_obj]

obj is passed from nfs3_readdirplus() as dir_obj, which is created using:
----------------
dir_obj = nfs3_FhandleToCache(&(arg->arg_readdirplus3.dir),
                              &(res->res_readdirplus3.status),
                              &rc);
----------------

If I understand correctly, below is the code flow:

nfs3_FhandleToCache()
  |
  |--------create_handle() in FSAL_GLUSTER
             |
             |----------construct_handle()
                          |
                          |---------------fsal_obj_handle_init() -- new fsal obj handle initialized.

So a new object is created every time, and it never matches the exp_root_obj entry. IMHO the check in fsal_lookupp is not reliable. Can we introduce a flag in fsal_obj_handle that specifies whether it is the root or not?

I created bug [1] for the issue and updated it with my findings.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1335062

Regards,
Jiffin
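P.S. To make the flag idea concrete, here is a minimal, self-contained sketch. It is not ganesha code: struct demo_obj_handle, demo_create_handle(), the is_root field and DEMO_KEY_LEN are all hypothetical stand-ins chosen for illustration. It only shows why a pointer comparison fails when every handle-to-object conversion allocates a new object, and how a flag set at creation time would avoid that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_KEY_LEN 16 /* arbitrary size for the demo key */

/* Hypothetical stand-in for struct fsal_obj_handle. */
struct demo_obj_handle {
	unsigned char key[DEMO_KEY_LEN]; /* identity of the underlying file */
	bool is_root;                    /* proposed flag, set once at creation */
};

/*
 * Mimics a create_handle() path that allocates a fresh object on every
 * call. The root flag is derived by comparing the key with the export's
 * root key (an assumption about how the flag could be populated).
 */
static struct demo_obj_handle *
demo_create_handle(const unsigned char *key, const unsigned char *root_key)
{
	struct demo_obj_handle *obj;

	assert(key != NULL && root_key != NULL);

	obj = calloc(1, sizeof(*obj));
	if (obj == NULL)
		return NULL;

	memcpy(obj->key, key, DEMO_KEY_LEN);
	obj->is_root = (memcmp(key, root_key, DEMO_KEY_LEN) == 0);
	return obj;
}

/* Same style as the current check: pointer identity. */
static bool demo_is_root_by_pointer(const struct demo_obj_handle *obj,
				    const struct demo_obj_handle *root_obj)
{
	return obj == root_obj;
}

/* Proposed style: consult the flag, independent of the allocation. */
static bool demo_is_root_by_flag(const struct demo_obj_handle *obj)
{
	return obj->is_root;
}
```

With two objects created from the same root key, demo_is_root_by_pointer() returns false while demo_is_root_by_flag() returns true. Comparing the handle keys directly would be another option if adding a field is not desirable.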
_______________________________________________
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel