[Nfs-ganesha-devel] RPC queue enqueued/dequeued counter size
Hi,

Recently I came across the two counters below for the RPC queue:

    static uint32_t enqueued_reqs;
    static uint32_t dequeued_reqs;

Shouldn't the counter size be uint64_t? With uint32_t, the counters can only grow to 4294967295 before wrapping around. Increasing the size would help long-running production environments.

--
with regards,
Sachin Punadikar

--
Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
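The wraparound concern can be demonstrated directly. This small sketch (illustrative only, not Ganesha code) shows that a uint32_t counter silently returns to 0 after 4294967295 increments, while a uint64_t keeps counting:

```c
#include <stdint.h>

/* Illustrative only (not Ganesha code): unsigned overflow is well
 * defined in C, so a busy server's 32-bit request counter silently
 * wraps to 0 after UINT32_MAX increments, corrupting statistics. */
static uint32_t bump32(uint32_t counter) { return counter + 1; }
static uint64_t bump64(uint64_t counter) { return counter + 1; }
```

At a sustained rate of, say, 50k requests/second, a 32-bit counter wraps in roughly a day, so production statistics would quietly reset.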
[Nfs-ganesha-devel] Need clarification on GSH_CACHE_PAD
Hi,

I came across the macro GSH_CACHE_PAD. From its name I can guess that it is used for padding, but I am unable to understand where exactly it is needed, and also what the number indicates. To be specific, I am talking about the structure below:

    struct req_q_pair {
        const char *s;
        GSH_CACHE_PAD(0);
        struct req_q producer;  /* from decoder */
        GSH_CACHE_PAD(1);
        struct req_q consumer;  /* to executor */
        GSH_CACHE_PAD(2);
    };

If I need to add another field as below, do I need to add a pad?

    struct req_q_pair {
        const char *s;
        GSH_CACHE_PAD(0);
        struct req_q producer;  /* from decoder */
        GSH_CACHE_PAD(1);
        struct req_q consumer;  /* to executor */
        GSH_CACHE_PAD(2);
        uint64_t total;         /* cumulative */
    };

Thanks in advance.

--
with regards,
Sachin Punadikar
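For context, cache-line padding macros of this kind usually expand to a filler array so that fields written by different threads land on separate cache lines, avoiding false sharing. A hedged sketch with assumed names and a 64-byte line size (MY_CACHE_PAD and struct pair_demo are hypothetical stand-ins, not Ganesha's actual definitions):

```c
#include <stddef.h>

/* Hypothetical sketch in the spirit of GSH_CACHE_PAD (the name, the
 * 64-byte size, and the struct below are assumptions, not Ganesha's
 * real code). The numeric argument only makes each filler field's
 * name unique; each pad pushes the next field onto a fresh cache
 * line, so the producer queue (written by the decoder thread) and
 * the consumer queue (written by the executor thread) never share a
 * line. */
#define CACHE_LINE 64
#define MY_CACHE_PAD(n) char pad_##n[CACHE_LINE]

struct pair_demo {
	const char *s;
	MY_CACHE_PAD(0);
	long producer;		/* stands in for struct req_q */
	MY_CACHE_PAD(1);
	long consumer;		/* stands in for struct req_q */
	MY_CACHE_PAD(2);
	unsigned long total;	/* a trailing field generally needs a pad
				 * only if it is hot in a different
				 * thread than the field before it */
};
```

On this reading, whether `total` needs its own pad depends on which thread updates it: if it is touched by the same thread as `consumer`, sharing a line is harmless.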
[Nfs-ganesha-devel] Unable to find epoll_create
Hi All,

While preparing to build the latest version of Ganesha 2.6, I am facing the errors below:

    -- Looking for include file sys/epoll.h
    -- Looking for include file sys/epoll.h - found
    -- Looking for epoll_create
    -- Looking for epoll_create - not found
    CMake Error at cmake/modules/FindPackageHandleStandardArgs.cmake:109 (message):
      Could NOT find EPOLL (missing: EPOLL_FUNC)
    Call Stack (most recent call first):
      cmake/modules/FindPackageHandleStandardArgs.cmake:317 (_FPHSA_FAILURE_MESSAGE)
      cmake/modules/FindEPOLL.cmake:21 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
      libntirpc/CMakeLists.txt:118 (find_package)
    -- Configuring incomplete, errors occurred!

The command I used is as below:

    cmake ../src -DBUILD_CONFIG=rpmbuild -DCMAKE_BUILD_TYPE=Release -DUSE_FSAL_GPFS=ON -DUSE_ADMIN_TOOLS=ON -DUSE_GUI_ADMIN_TOOLS=OFF -DUSE_DBUS=ON -D_MSPAC_SUPPORT=ON

Let me know what config changes I need to make.

--
with regards,
Sachin Punadikar
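The failing CMake probe essentially tries to compile and link a tiny program that calls epoll_create; reproducing that check by hand can tell whether the problem is the toolchain or a stale cached probe result. A minimal stand-in (assuming a Linux build host; `have_epoll` is a hypothetical helper, not part of the build system):

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hand-rolled version of what CMake's function-exists probe checks
 * for epoll_create: if this compiles, links, and succeeds at runtime,
 * the headers and libc are fine, and a failing probe more likely
 * points at a stale CMakeCache.txt or broken compiler flags. */
int have_epoll(void)
{
	int fd = epoll_create(1);	/* size hint is ignored by modern kernels */

	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}
```

If this snippet builds cleanly on the host, a common next step is to wipe the build directory (or at least CMakeCache.txt) and re-run cmake, since a probe failure from an earlier misconfigured run can be cached.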
[Nfs-ganesha-devel] Ganesha 2.3 - assert in dec_state_owner_ref
Hi All,

A customer is observing an assert with the Ganesha 2.3 code base, as shown in the back trace below:

    (gdb) where
    #0  0x7fbf55cb31d7 in raise () from /lib64/libc.so.6
    #1  0x7fbf55cb48c8 in abort () from /lib64/libc.so.6
    #2  0x7fbf55cac146 in __assert_fail_base () from /lib64/libc.so.6
    #3  0x7fbf55cac1f2 in __assert_fail () from /lib64/libc.so.6
    #4  0x004beb61 in dec_state_owner_ref (owner=0x7fbadc002c48) at /usr/src/debug/nfs-ganesha-2.3.2-ibm47-0.1.1-Source/SAL/state_misc.c:1007
    #5  0x004beed6 in uncache_nfs4_owner (nfs4_owner=0x7fbadc002c98) at /usr/src/debug/nfs-ganesha-2.3.2-ibm47-0.1.1-Source/SAL/state_misc.c:1100
    #6  0x004506d6 in reap_expired_open_owners () at /usr/src/debug/nfs-ganesha-2.3.2-ibm47-0.1.1-Source/MainNFSD/nfs_reaper_thread.c:185
    #7  0x0045093c in reaper_run (ctx=0x3b10fc0) at /usr/src/debug/nfs-ganesha-2.3.2-ibm47-0.1.1-Source/MainNFSD/nfs_reaper_thread.c:249
    #8  0x00521562 in fridgethr_start_routine (arg=0x3b10fc0) at /usr/src/debug/nfs-ganesha-2.3.2-ibm47-0.1.1-Source/support/fridgethr.c:561
    #9  0x7fbf566b4dc5 in start_thread () from /lib64/libpthread.so.0
    #10 0x7fbf55d7576d in __lseek_nocancel () from /lib64/libc.so.6
    #11 0x in ?? ()

In the function "reap_expired_open_owners", I observe that texpire (a local variable) has the value 0:

    texpire = atomic_fetch_time_t(_owner->cache_expire);

From the core:

    (gdb) frame 6
    #6  0x004506d6 in reap_expired_open_owners () at /usr/src/debug/nfs-ganesha-2.3.2-ibm47-0.1.1-Source/MainNFSD/nfs_reaper_thread.c:185
    185             uncache_nfs4_owner(nfs4_owner);
    (gdb) p tnow
    $1 = 1508596802
    (gdb) p texpire
    $2 = 0

In the code, nfs4_owner->cache_expire is set to 0 only in the function "uncache_nfs4_owner" (which is yet to be called for this crash), so I am wondering how this is happening. Going ahead, would it be good to safeguard the call to uncache_nfs4_owner in the reaper code, as below, in the function reap_expired_open_owners (as per the 2.3 code)?

    } else {
        if (texpire != 0) {
            /* This cached owner has expired, uncache it. */
            uncache_nfs4_owner(nfs4_owner);
            count++;
        }
        /* Get the next owner to examine. */
        owner = glist_first_entry(
            _open_owners, state_owner_t,
            so_owner.so_nfs4_owner.so_state_list);
    }

--
with regards,
Sachin Punadikar
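The proposed safeguard boils down to a predicate on the owner's expiry time. A minimal stand-alone model of it (struct demo_owner and should_uncache are simplified stand-ins, not the actual Ganesha structures):

```c
#include <stdbool.h>
#include <time.h>

/* Simplified stand-in for state_owner_t's nfs4 owner data. */
struct demo_owner {
	time_t cache_expire;	/* 0 means "not cached / already uncached" */
};

/* Model of the proposed guard: only uncache an owner whose expiry
 * was actually set (non-zero) and has passed. An owner observed with
 * cache_expire == 0, as in the core, is skipped instead of tripping
 * the assert in dec_state_owner_ref(). */
static bool should_uncache(const struct demo_owner *o, time_t tnow)
{
	time_t texpire = o->cache_expire; /* atomic_fetch_time_t in Ganesha */

	return texpire != 0 && texpire <= tnow;
}
```

The guard keeps the reaper from dropping a reference on an owner that some other path has already transitioned out of the cached state.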
[Nfs-ganesha-devel] Ganesha 2.5 - mdc_readdir_chunk_object :INODE :CRIT :Collision while adding dirent for .nfsFD8E
Hello,

During tests on Ganesha 2.5, we are getting the logs below with a critical message:

    2017-11-03 05:30:05 : epoch 000100d3 : c40abc1pn13.gpfs.net : ganesha.nfsd-36297[work-226] mdcache_avl_insert_ck :INODE :WARN :Already existent when inserting dirent 0x3ffbe8015a60 for .nfsFD8E on entry=0x3ffb08019ed0 FSAL cookie=7fff, duplicated directory cookies make READDIR unreliable.
    2017-11-03 05:30:05 : epoch 000100d3 : c40abc1pn13.gpfs.net : ganesha.nfsd-36297[work-226] mdc_readdir_chunk_object :INODE :CRIT :Collision while adding dirent for .nfsFD8E

I would like to understand what exactly is meant by an FSAL cookie collision. Does it mean the same operation has been done by the UPCALL thread? Is the message really CRIT? If I compare with the 2.3 code (I know there is a lot of change related to caching), we do not throw any CRIT message there.

--
with regards,
Sachin Punadikar
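For readers unfamiliar with the term: each dirent in a readdir chunk is keyed by the directory cookie the FSAL assigned it, so two dirents carrying the same cookie cannot both be inserted, and a client resuming READDIR at that cookie cannot be positioned unambiguously. A hypothetical miniature of the insert check (not the mdcache AVL code):

```c
#include <errno.h>

/* Hypothetical miniature of a cookie-keyed dirent insert (the real
 * code uses an AVL tree keyed by the FSAL cookie). A second insert
 * with an already-present cookie is a "collision" and is rejected,
 * which is what the WARN/CRIT pair above is reporting. */
#define MAX_COOKIES 8

static unsigned long cookies[MAX_COOKIES];
static int ncookies;

static int insert_cookie(unsigned long ck)
{
	int i;

	for (i = 0; i < ncookies; i++)
		if (cookies[i] == ck)
			return -EEXIST;	/* duplicate cookie: collision */
	if (ncookies == MAX_COOKIES)
		return -ENOSPC;
	cookies[ncookies++] = ck;
	return 0;
}
```

In this model the collision is a property of the cookies the FSAL hands back, not of which thread performed the insert, which is why the log blames "duplicated directory cookies" rather than a racing upcall.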
Re: [Nfs-ganesha-devel] Ganesha 2.3 and 2.5 - crash in free_nfs_request
William,

You are right, gsh_calloc is getting invoked (even for the 2.3 code). Interestingly, the core we got in testing has almost all of the fields filled with 0xFF. So I am wondering whether it is something to do with the underlying glibc, or RHEL in general. Here is the gdb output indicating the same:

    (gdb) p reqdata->r_u.req.svc
    $7 = {rq_prog = 4294967295, rq_vers = 4294967295, rq_proc = 4294967295, rq_cred = {oa_flavor = -1, oa_base = 0x , oa_length = 4294967295}, rq_clntcred = 0x7f183c0a83e0, rq_xprt = 0x7f1932423830, rq_clntname = 0x , rq_svcname = 0x , rq_msg = 0x7f183c0a8020, rq_context = 0x0, rq_u1 = 0x, rq_u2 = 0x, rq_cksum = 18446744073709551615, rq_xid = 4294967295, rq_verf = { oa_flavor = -1, oa_base = 0x , oa_length = 4294967295}, rq_auth = 0x, rq_ap1 = 0x, rq_ap2 = 0x, rq_raddr = {ss_family = 65535, __ss_align = 18446744073709551615, __ss_padding = '\377' }, rq_daddr = {ss_family = 65535, __ss_align = 18446744073709551615, __ss_padding = '\377' }, rq_raddr_len = 0, rq_daddr_len = 0}
    (gdb) p reqdata->r_u.req
    $8 = {xprt = 0x7f1932423830, svc = {rq_prog = 4294967295, rq_vers = 4294967295, rq_proc = 4294967295, rq_cred = { oa_flavor = -1, oa_base = 0x , oa_length = 4294967295}, rq_clntcred = 0x7f183c0a83e0, rq_xprt = 0x7f1932423830, rq_clntname = 0x , rq_svcname = 0x , rq_msg = 0x7f183c0a8020, rq_context = 0x0, rq_u1 = 0x, rq_u2 = 0x, rq_cksum = 18446744073709551615, rq_xid = 4294967295, rq_verf = {oa_flavor = -1, oa_base = 0x , oa_length = 4294967295}, rq_auth = 0x, rq_ap1 = 0x, rq_ap2 = 0x, rq_raddr = {ss_family = 65535, __ss_align = 18446744073709551615, __ss_padding = '\377' }, rq_daddr = {ss_family = 65535, __ss_align = 18446744073709551615, __ss_padding = '\377' }, rq_raddr_len = 0, rq_daddr_len = 0}, lookahead = {flags = 4294967295, read = 65535, write = 65535}, arg_nfs = { arg_getattr3 = {object = {data = {data_len = 4294967295, data_val = 0x }}}, arg_setattr3 = {object = {data = { data_len = 4294967295, data_val = 0x }}, new_attributes = {mode = {set_it = -1,
    set_mode3_u = {mode = 4294967295}}, uid = {set_it = -1, set_uid3_u = { uid = 4294967295}}, gid = {set_it = -1, set_gid3_u = {gid = 4294967295}}, size = {set_it = -1, set_size3_u = { size = 18446744073709551615}}, atime = { set_it = (SET_TO_SERVER_TIME | SET_TO_CLIENT_TIME | unknown: 4294967292), set_atime_u = {atime = { tv_sec = 4294967295, tv_nsec = 4294967295}}}, mtime = { set_it = (SET_TO_SERVER_TIME | SET_TO_CLIENT_TIME | unknown: 4294967292), set_mtime_u = {mtime = { tv_sec = 4294967295, tv_nsec = 4294967295, guard = {check = -1, sattrguard3_u = {obj_ctime = { tv_sec = 4294967295, tv_nsec = 4294967295, arg_lookup3 = {what = {dir = {data = {data_len = 4294967295, data_val = 0x }}, name = 0x }}, arg_access3 = {object = {data = { data_len = 4294967295, data_val = 0x }}, access = 4294967295}, arg_readlink3 = {symlink = {data = {data_len = 4294967295, data_val = 0x }}}, arg_read3 = {file = {data = { data_len = 4294967295, data_val = 0x }}, offset = 18446744073709551615, count = 4294967295}, arg_write3 = {file = {data = {data_len = 4294967295, data_val = 0x }}, offset = 18446744073709551615, count = 4294967295, stable = (DATA_SYNC | FILE_SYNC | unknown: 4294967292), data = {data_len = 4294967295, data_val = 0x }}, arg_create3 = {where = {dir = {data = { data_len = 4294967295, data_val = 0x }}, name = 0x }, how = { mode = (GUARDED | EXCLUSIVE | unknown: 4294967292), createhow3_u = {obj_attributes = {mode = {set_it = -1,

Let me know if anyone has observed this kind of behavior. Thanks in advance.

On Mon, Oct 30, 2017 at 9:38 PM, William Allen Simpson <william.allen.simp...@gmail.com> wrote:
> On 10/27/17 7:56 AM, Sachin Punadikar wrote:
>> Ganesha 2.3 got a segfault with below:
>> [...]
>> After analyzing the core and related code, found that in the
>> "thr_decode_rpc_request" function, if the call to SVC_RECV fails, then
>> free_nfs_request is invoked to free the resources. But at that point one of the
>> fields, "reqdata->r_u.req.svc.rq_auth", is not initialized nor allocated,
>> which is leading to the segfault.
>>
>> The code in this area is the same for Ganesha 2.3 and 2.5.
>> I
[Nfs-ganesha-devel] Ganesha 2.5 - crash/segfault while executing nlm4_Unlock
This list has been deprecated. Please subscribe to the new devel list at lists.nfs-ganesha.org.

Hi All,

Recently a crash was reported by a customer for Ganesha 2.5:

    (gdb) where
    #0  0x7f475872900b in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
    #1  0x0041eac9 in fsal_obj_handle_fini (obj=0x7f4378028028) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/commonlib.c:192
    #2  0x0053180f in mdcache_lru_clean (entry=0x7f4378027ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:589
    #3  0x00536587 in _mdcache_lru_unref (entry=0x7f4378027ff0, flags=0, func=0x5a9380 <__func__.23209> "cih_remove_checked", line=406) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1921
    #4  0x00543e91 in cih_remove_checked (entry=0x7f4378027ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_hash.h:406
    #5  0x00544b26 in mdc_clean_entry (entry=0x7f4378027ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_helpers.c:235
    #6  0x0053181e in mdcache_lru_clean (entry=0x7f4378027ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:592
    #7  0x00536587 in _mdcache_lru_unref (entry=0x7f4378027ff0, flags=0, func=0x5a70af <__func__.23112> "mdcache_put", line=190) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.c:1921
    #8  0x00539666 in mdcache_put (entry=0x7f4378027ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_lru.h:190
    #9  0x0053f062 in mdcache_put_ref (obj_hdl=0x7f4378028028) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_handle.c:1709
    #10 0x0049bf0f in nlm4_Unlock (args=0x7f4294165830, req=0x7f4294165028, res=0x7f43f001e0e0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/Protocols/NLM/nlm_Unlock.c:128
    #11 0x0044c719 in nfs_rpc_execute (reqdata=0x7f4294165000) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1290
    #12 0x0044cf23 in worker_run (ctx=0x3c200e0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/MainNFSD/nfs_worker_thread.c:1562
    #13 0x0050a3e7 in fridgethr_start_routine (arg=0x3c200e0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm013.00-0.1.1-Source/support/fridgethr.c:550
    #14 0x7f4758725dc5 in start_thread () from /lib64/libpthread.so.0
    #15 0x7f4757de673d in clone () from /lib64/libc.so.6

A closer look at the backtrace indicates that there was a cyclic flow of execution, as below:

    nlm4_Unlock -> mdcache_put_ref -> mdcache_put -> _mdcache_lru_unref
        -> mdcache_lru_clean -> fsal_obj_handle_fini
    and then mdc_clean_entry -> cih_remove_checked
        -> _mdcache_lru_unref -> mdcache_lru_clean -> fsal_obj_handle_fini (currently crashing here)

Do we see any code issue here? Any hints on how to RCA this issue? Thanks in advance.

--
with regards,
Sachin Punadikar
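The cycle described above can be modeled in miniature: cleaning an entry drops another reference on the same entry, which re-enters the cleaner. A common cure is a per-entry "cleaning" flag that makes the re-entry a no-op; the sketch below demonstrates that shape (all names hypothetical, not the mdcache code):

```c
#include <stdbool.h>

/* Toy model of the re-entrant clean cycle (names are hypothetical
 * stand-ins, not Ganesha's): demo_clean drops a reference of its
 * own, which would recurse back into demo_clean without the
 * "cleaning" guard flag. */
struct demo_entry {
	int refcnt;
	bool cleaning;
	int clean_calls;	/* instrumentation for the test */
};

static void demo_clean(struct demo_entry *e);

static void demo_unref(struct demo_entry *e)
{
	if (--e->refcnt == 0)
		demo_clean(e);
}

static void demo_clean(struct demo_entry *e)
{
	if (e->cleaning)
		return;		/* break the recursion */
	e->cleaning = true;
	e->clean_calls++;
	e->refcnt++;		/* models cih_remove_checked taking a ref... */
	demo_unref(e);		/* ...and dropping it again */
}
```

Without the guard, the second `demo_unref` would recurse into `demo_clean` on an entry whose teardown is already underway, which is the double-fini shape the backtrace shows.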
[Nfs-ganesha-devel] Ganesha 2.3 and 2.5 - crash in free_nfs_request
Hello,

Ganesha 2.3 got a segfault as below:

    Core was generated by `/usr/bin/ganesha.nfsd -L /var/log/ganesha.log -f /etc/ganesha/ganesha.conf -N N'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x0044b4dd in free_nfs_request (reqdata=0x7f19c5e48010) at /usr/src/debug/nfs-ganesha-2.3.2-ibm51-0.1.1-Source/MainNFSD/nfs_rpc_dispatcher_thread.c:1490
    1490            SVCAUTH_RELEASE(reqdata->r_u.req.svc.rq_auth,
    Missing separate debuginfos, use: debuginfo-install dbus-libs-1.6.12-13.el7.x86_64 glibc-2.17-105.el7.x86_64 gssproxy-0.4.1-7.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.13.2-10.el7.x86_64 libattr-2.4.46-12.el7.x86_64 libblkid-2.23.2-26.el7.x86_64 libcap-2.22-8.el7.x86_64 libcom_err-1.42.9-7.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-26.el7.x86_64 pcre-8.32-15.el7.x86_64 xz-libs-5.1.2-12alpha.el7.x86_64
    (gdb) where
    #0  0x0044b4dd in free_nfs_request (reqdata=0x7f19c5e48010) at /usr/src/debug/nfs-ganesha-2.3.2-ibm51-0.1.1-Source/MainNFSD/nfs_rpc_dispatcher_thread.c:1490
    #1  0x0044c297 in thr_decode_rpc_request (context=0x0, xprt=0x7f1932423830) at /usr/src/debug/nfs-ganesha-2.3.2-ibm51-0.1.1-Source/MainNFSD/nfs_rpc_dispatcher_thread.c:1836
    #2  0x0044c355 in thr_decode_rpc_requests (thr_ctx=0x7f17c00b6f10) at /usr/src/debug/nfs-ganesha-2.3.2-ibm51-0.1.1-Source/MainNFSD/nfs_rpc_dispatcher_thread.c:1858
    #3  0x00520bc6 in fridgethr_start_routine (arg=0x7f17c00b6f10) at /usr/src/debug/nfs-ganesha-2.3.2-ibm51-0.1.1-Source/support/fridgethr.c:561
    #4  0x7f19c462bdc5 in start_thread () from /lib64/libpthread.so.0
    #5  0x7f19c3ceb1cd in clone () from /lib64/libc.so.6

After analyzing the core and the related code, I found that in the "thr_decode_rpc_request" function, if the call to SVC_RECV fails, free_nfs_request is invoked to free the resources. But at that point the field "reqdata->r_u.req.svc.rq_auth" is neither initialized nor allocated, which leads to the segfault.

The code in this area is the same for Ganesha 2.3 and 2.5. I have created the patch below to overcome this issue. Please review, and if suitable, merge into Ganesha 2.5 stable.

https://github.com/sachinpunadikar/nfs-ganesha/commit/91baffa8bd197c78eff106f42927a370155ae6b4

The Ganesha 2.6 code in this area has a lot of changes; I was not able to check whether 2.6 is affected or not.

--
with regards,
Sachin Punadikar
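The shape of the fix is a null guard on the release path; a self-contained sketch of the idea (demo types are hypothetical stand-ins, the real guard belongs around SVCAUTH_RELEASE in free_nfs_request):

```c
#include <stddef.h>

/* Hypothetical stand-in for the SVCAUTH handle and its release hook;
 * the point is only the guard: a request torn down before SVC_RECV
 * succeeded never had its auth handle set, so releasing it must be
 * conditional. */
struct demo_auth {
	int released;	/* instrumentation for the test */
};

static void demo_release(struct demo_auth *a)
{
	a->released = 1;
}

static void safe_auth_release(struct demo_auth *auth)
{
	if (auth != NULL)	/* NULL when the receive path failed early */
		demo_release(auth);
}
```

The same pattern applies to any teardown routine that may run on a partially constructed request: every field released must either be unconditionally initialized (e.g. zeroed at allocation) or guarded at release.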
Re: [Nfs-ganesha-devel] READDIR doesn't return all entries.
Pradeep,

The patch is meant to catch FSAL cookie related issues, and when it catches one, it asks the NFS client to read again. Will you please provide Ganesha logs and a tcpdump for both the working (RC2) and non-working (RC6) cases?

- Sachin.

On Tue, Feb 13, 2018 at 7:05 AM, Pradeep <pradeeptho...@gmail.com> wrote:
> Hello,
>
> I noticed that with a large number of directory entries, READDIR does not
> return all entries. It happened with RC5, but works fine in RC2. I looked
> through the changes and the offending change seems to be this one:
>
> https://github.com/nfs-ganesha/nfs-ganesha/commit/985564cbd147b6acc5dd6de61a3ca8fbc6062eda
>
> (I reverted the change and verified that all entries are returned without
> this change.)
>
> I am still looking into why it broke READDIR for me. Any insights on debugging
> this would be helpful.
>
> Thanks,
> Pradeep

--
with regards,
Sachin Punadikar
[Nfs-ganesha-devel] Ganesha crash in dec_nlm_state_ref
Hello,

Recently a customer reported the crash below:

    #0  0x3fff7dbd39ac in raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
    37      return INLINE_SYSCALL (tgkill, 3, pid, THREAD_GETMEM (THREAD_SELF, tid),
    Missing separate debuginfos, use: debuginfo-install dbus-libs-1.10.24-7.el7.ppc64le elfutils-libelf-0.170-4.el7.ppc64le elfutils-libs-0.170-4.el7.ppc64le
    (gdb) where
    #0  0x3fff7dbd39ac in raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
    #1  0x10070b38 in crash_handler (signo=11, info=0x3ffaacffcdc8, ctx=0x3ffaacffc050) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/MainNFSD/nfs_init.c:225
    #2
    #3  0x101b4b70 in mdc_cur_export () at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_int.h:544
    #4  0x101b72e0 in mdcache_close2 (obj_hdl=0x3ffbb0897e98, state=0x3ffe8ceafb30) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_file.c:1047
    #5  0x10135054 in dec_nlm_state_ref (state=0x3ffe8ceafb30) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/nlm_state.c:340
    #6  0x100f7894 in dec_state_t_ref (state=0x3ffe8ceafb30) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/include/sal_functions.h:445
    #7  0x100fa5d4 in remove_from_locklist (lock_entry=0x3ffe8c8916e0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/state_lock.c:769
    #8  0x100fd2b0 in try_to_grant_lock (lock_entry=0x3ffe8c8916e0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/state_lock.c:1834
    #9  0x100fd45c in process_blocked_lock_upcall (block_data=0x3ffe8c1f4630) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/state_lock.c:1850
    #10 0x100f62e4 in state_blocked_lock_caller (ctx=0x3ffaa167a1c0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/state_async.c:68
    #11 0x10169144 in fridgethr_start_routine (arg=0x3ffaa167a1c0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/support/fridgethr.c:550
    #12 0x3fff7dbc8728 in start_thread (arg=0x3ffaacffe810) at pthread_create.c:310
    #13 0x3fff7da07ae0 in clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:109

To fix the same, I have uploaded a patch: https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/423882

- Sachin

--
with regards,
Sachin Punadikar
[Nfs-ganesha-devel] Ganesha abort due to double free
Hello,

A customer reported a Ganesha crash/abort due to a double free. The stack trace is as below:

    (gdb) where
    #0  0x3fff889c39ac in raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
    #1  0x10070b38 in crash_handler (signo=6, info=0x3ffefc7fc728, ctx=0x3ffefc7fb9b0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/MainNFSD/nfs_init.c:225
    #2
    #3  0x3fff8871e578 in __GI_raise (sig=) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
    #4  0x3fff887206fc in __GI_abort () at abort.c:90
    #5  0x3fff88764844 in __libc_message (do_abort=, fmt=0x3fff888656d0 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
    #6  0x3fff8876f284 in malloc_printerr (ar_ptr=0x3ffa9020, ptr=, str=0x3fff88865798 "double free or corruption (fasttop)", action=3) at malloc.c:5013
    #7  _int_free (av=0x3ffa9020, p=, have_lock=) at malloc.c:3835
    #8  0x100f6edc in gsh_free (p=0x3ffa9a00) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/include/abstract_mem.h:271
    #9  0x1010460c in cancel_all_nlm_blocked () at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/state_lock.c:3799
    #10 0x1012a154 in nfs_release_nlm_state (release_ip=0x10031a3c3d6 "10.200.10.107") at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/nfs4_recovery.c:1213
    #11 0x10125588 in nfs4_start_grace (gsp=0x3ffefc7fd978) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/SAL/nfs4_recovery.c:106
    #12 0x1007bc18 in admin_dbus_grace (args=0x3ffefc7fdaa0, reply=0x1002ea01350, error=0x3ffefc7fda80) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/MainNFSD/nfs_admin_thread.c:166
    #13 0x101ca3e4 in dbus_message_entrypoint (conn=0x1002ea00e10, msg=0x1002ea011b0, user_data=0x102414c0 ) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/dbus/dbus_server.c:512
    #14 0x3fff88d5164c in _dbus_object_tree_dispatch_and_unlock () from /lib64/libdbus-1.so.3
    #15 0x3fff88d3b950 in dbus_connection_dispatch () from /lib64/libdbus-1.so.3
    #16 0x3fff88d3bda8 in _dbus_connection_read_write_dispatch () from /lib64/libdbus-1.so.3
    #17 0x101cb360 in gsh_dbus_thread (arg=0x0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.04-0.1.1-Source/dbus/dbus_server.c:741
    #18 0x3fff889b8728 in start_thread (arg=0x3ffefc7fe810) at pthread_create.c:310
    #19 0x3fff887f7ae0 in clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:109

I have uploaded a patch which can potentially fix the double free: https://review.gerrithub.io/c/ffilz/nfs-ganesha/+/424260

--
with regards,
Sachin Punadikar
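A common defensive pattern for double frees of this kind is to free through a helper that takes the pointer's address and nulls it, so a second pass over the same list entry becomes a no-op. An illustrative sketch (free_block is hypothetical, not the actual fix in the patch above):

```c
#include <stdlib.h>

/* Instrumentation for the test below. */
static int free_calls;

/* Free *pp and clear it, so calling this twice on the same slot is
 * harmless instead of a double free. This only helps when all code
 * paths free through the same pointer slot; if two structures hold
 * the same raw pointer, ownership must be fixed instead. */
static void free_block(void **pp)
{
	if (*pp != NULL) {
		free(*pp);
		free_calls++;
		*pp = NULL;	/* a second call becomes a no-op */
	}
}
```

The caveat in the comment matters for the crash above: if the blocked-lock cancel path and another path each hold the same block pointer, nulling one copy does not protect the other, and the real fix is to make one owner responsible for the free.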
Re: [Nfs-ganesha-devel] Change in ffilz/nfs-ganesha[next]: "mdc_lookup" do not dispatch to FSAL
Bill,

The issue is always reproducible for the customer. In another email chain I have provided the Ganesha logs, which clearly indicate that the lookup does not go to the FSAL layer to fetch more current data when uncached is set to false.

On Fri, Feb 16, 2018 at 9:40 PM, William Allen Simpson <william.allen.simp...@gmail.com> wrote:
> On 2/15/18 6:44 AM, GerritHub wrote:
>> Sachin Punadikar has uploaded this change for *review*.
>>
>> View Change <https://review.gerrithub.io/400037>
>>
>> "mdc_lookup" do not dispatch to FSAL
>
> Are you sure? Do you have an actual reproducible error case?
>
>> "mdc_lookup" function first attempts to get the entry from cache
>> via function "mdc_try_get_cached". On getting ESTALE error, it
>> should dispatch to FSAL, but was again calling "mdc_try_get_cached".
>> Rectified code to make call to "mdc_lookup_uncached", so FSAL code
>> gets invoked.
>
> I'm not the mdcache expert, but don't think this is correct. The
> comments already explain.
>
> It tries under read lock (fastest). If stale, it write locks and
> tries again. If still fails, at the uncached label, then it does
> the mdc_lookup_uncached().
>
> mdc_try_get_cached() is likely faster than mdc_lookup_uncached().

--
with regards,
Sachin Punadikar
[Nfs-ganesha-devel] Ganesha 2.3 : NFSv4 client gets error NFS4ERR_OLD_STATEID
    nfs4_Compound :NFS4 :DEBUG :Status of OP_READ in position 1 = NFS4ERR_OLD_STATEID

--
with regards,
Sachin Punadikar
[Nfs-ganesha-devel] Ganesha crash in lock_avail
Hello,

A customer reported the crash below:

    (gdb) where
    #0  0x7fa70c161fcb in raise () from /lib64/libpthread.so.0
    #1  0x00454884 in crash_handler (signo=11, info=0x7fa5a1ff9f30, ctx=0x7fa5a1ff9e00) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/MainNFSD/nfs_init.c:225
    #2
    #3  0x in ?? ()
    #4  0x00435084 in lock_avail (vec=0x18f07c8, file=0x7fa420157fd8, owner=0x7fa4f8189fc0, lock_param=0x7fa420157ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_top.c:179
    #5  0x005386eb in mdc_up_lock_avail (vec=0x18f07c8, file=0x7fa420157fd8, owner=0x7fa4f8189fc0, lock_param=0x7fa420157ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:380
    #6  0x00439c72 in queue_lock_avail (ctx=0x7fa40c039c40) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_async.c:247
    #7  0x0050a32c in fridgethr_start_routine (arg=0x7fa40c039c40) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/support/fridgethr.c:550
    #8  0x7fa70c15adc5 in start_thread () from /lib64/libpthread.so.0
    #9  0x7fa70b81a1cd in clone () from /lib64/libc.so.6

It was found that op_ctx was not proper:

    (gdb) frame 4
    #4  0x00435084 in lock_avail (vec=0x18f07c8, file=0x7fa420157fd8, owner=0x7fa4f8189fc0, lock_param=0x7fa420157ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL_UP/fsal_up_top.c:179
    179             obj->obj_ops.put_ref(obj);
    (gdb) p *obj
    $2 = {handles = {next = 0x0, prev = 0x0}, fs = 0x193e240, fsal = 0x0, obj_ops = {get_ref = 0x0, put_ref = 0x0, release = 0x0, merge = 0x0, lookup = 0x0, readdir = 0x0, compute_readdir_cookie = 0x0, dirent_cmp = 0x0, create = 0x0, mkdir = 0x0, mknode = 0x0, symlink = 0x0, readlink = 0x0, test_access = 0x0, getattrs = 0x0, setattrs = 0x0, link = 0x0, fs_locations = 0x0, rename = 0x0, unlink = 0x0, open = 0x0, reopen = 0x0, status = 0x0, read = 0x0, read_plus = 0x0, write = 0x0, write_plus = 0x0, seek = 0x0, io_advise = 0x0, commit = 0x0, lock_op = 0x0, share_op = 0x0, close = 0x0, list_ext_attrs = 0x0, getextattr_id_by_name = 0x0, getextattr_value_by_name = 0x0, getextattr_value_by_id = 0x0, setextattr_value = 0x0, setextattr_value_by_id = 0x0, remove_extattr_by_id = 0x0, remove_extattr_by_name = 0x0, handle_is = 0x0, handle_to_wire = 0x0, handle_to_key = 0x0, handle_cmp = 0x0, layoutget = 0x0, layoutreturn = 0x0, layoutcommit = 0x0, getxattrs = 0x0, setxattrs = 0x0, removexattrs = 0x0, listxattrs = 0x0, open2 = 0x0, check_verifier = 0x0, status2 = 0x0, reopen2 = 0x0, read2 = 0x0, write2 = 0x0, seek2 = 0x0, io_advise2 = 0x0, commit2 = 0x0, lock_op2 = 0x0, setattr2 = 0x0, close2 = 0x0}, obj_lock = {__data = { __lock = 0, __nr_readers = 0, __readers_wakeup = 0, __writer_wakeup = 0, __nr_readers_queued = 0, __nr_writers_queued = 0, __writer = 0, __shared = 0, __pad1 = 0, __pad2 = 0, __flags = 0}, __size = '\000' , __align = 0}, type = REGULAR_FILE, fsid = {major = 11073324921844891658, minor = 1}, fileid = 229392385, state_hdl = 0x7fa51006aea0}
    (gdb) frame 5
    #5  0x005386eb in mdc_up_lock_avail (vec=0x18f07c8, file=0x7fa420157fd8, owner=0x7fa4f8189fc0, lock_param=0x7fa420157ff0) at /usr/src/debug/nfs-ganesha-2.5.3-ibm015.01-0.1.1-Source/FSAL/Stackable_FSALs/FSAL_MDCACHE/mdcache_up.c:380
    380             rc = myself->super_up_ops.lock_avail(vec, file, owner,
    (gdb) p op_ctx
    $3 = (struct req_op_context *) 0x7fa5a1ffa430
    (gdb) p *op_ctx
    $4 = {creds = 0x0, original_creds = {caller_uid = 0, caller_gid = 0, caller_glen = 0, caller_garray = 0x0}, caller_gdata = 0x0, caller_garray_copy = 0x0, managed_garray_copy = 0x0, cred_flags = 0, caller_addr = 0x0, clientid = 0x0, nfs_vers = 0, nfs_minorvers = 0, req_type = 0, client = 0x0, ctx_export = 0x18efc78, fsal_export = 0x18f0680, export_perms = 0x0, start_time = 0, queue_wait = 0, fsal_private = 0x0, fsal_module = 0x0, fsal_pnfs_ds = 0x0}

The above shows that op_ctx is not set properly: "fsal_module" is NULL. To fix this issue I have posted a patch: https://review.gerrithub.io/#/c/436356/

--
with regards,
Sachin Punadikar

___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
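The observation above, that the async upcall ran with an operation context whose fsal_module was NULL, suggests a defensive check before dereferencing through the context. A hedged sketch of that idea (struct demo_ctx and ctx_usable are simplified stand-ins, not req_op_context or the actual fix):

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the fields of req_op_context that the
 * upcall path dereferences; the real structure has many more. */
struct demo_ctx {
	void *fsal_export;
	void *fsal_module;
};

/* An async worker's context is only usable if the fields it will
 * dereference were actually populated when the work was queued. In
 * the crash above, fsal_module was NULL, matching the half-zeroed
 * op_ctx printed from frame 5. */
static bool ctx_usable(const struct demo_ctx *ctx)
{
	return ctx != NULL &&
	       ctx->fsal_export != NULL &&
	       ctx->fsal_module != NULL;
}
```

The deeper fix, of course, is to ensure the code that queues the upcall initializes the context fully before the worker runs, rather than relying on the worker to skip bad contexts.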