This list has been deprecated. Please subscribe to the new devel list at 
lists.nfs-ganesha.org.
I did notice some issues with rpc.statd (hangs) with some versions.
Unfortunately, Ganesha also uses rpc.statd for NFSv3 locking (in fact, I
have seen rpc.statd hangs with Ganesha too). If you really want to get away
from the rpc.statd issues you are having with kNFS, I would suggest NFSv4
mounts. If you are unable to resolve your Ganesha issue, you also have the
option of trying NFSv4 with kNFS.

Regards, Malahal.
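
A minimal client-side sketch of the NFSv4 suggestion above. NFSv4 carries
locking in the protocol itself, so lockd and rpc.statd are not involved. The
server name nfs.example.com, the export path and the mount point are
placeholders, and vers=4.1 is an assumption (plain vers=4 would avoid the
statd path just as well). On Ganesha the export would also need NFSv4 listed
in its Protocols setting. The script only prints the command unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: mount an export over NFSv4 so locking no longer goes through
# NLM (lockd) and NSM (rpc.statd). All names below are placeholders.

SERVER=${SERVER:-nfs.example.com}   # hypothetical server name
EXPORT=${EXPORT:-/mnt/dir}          # export path (the Pseudo path on Ganesha)
MOUNTPOINT=${MOUNTPOINT:-/mnt/dir}  # local mount point
DRY_RUN=${DRY_RUN:-1}               # default: print the command, do not run it

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount the export with NFSv4.1 (assumed version; adjust as needed).
nfs4_mount() {
    run mount -t nfs -o vers=4.1 "$SERVER:$EXPORT" "$MOUNTPOINT"
}

nfs4_mount
```

Running it with DRY_RUN=0 as root performs the actual mount.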

On Mon, Oct 1, 2018 at 10:03 PM David C <dcsysengin...@gmail.com> wrote:

> Hi All
>
> Thanks for the suggestions, I'll have a go and report back. Sharing some
> more info that may or may not be relevant:
>
> The clients I'm accessing Ganesha with were previously accessing a Kernel
> NFS server running on CentOS. I was experiencing frequent issues where
> lockd on the server would go into uninterruptible sleep and I needed to
> stop the nfslock service, clear out /var/lib/nfs/statd dirs and start
> nfslock again to get things working.
>
> Now that these clients are accessing the nfs-ganesha server, I'm seeing
> similar behaviour: the clients were showing "lockd: server *ipaddr* not
> responding" and I had to restart nfs-ganesha to resolve it. I don't know
> whether these crashes are related to that in any way.
>
> The other thing to note is some of the exports are on a cephfs mount but
> I'm using the VFS FSAL, not the CEPH FSAL.
>
> When it is working, performance seems good, and the crashes don't appear
> to happen during periods of high I/O.
>
> Thanks,
>
>
>
> On Mon, Oct 1, 2018 at 4:30 PM Malahal Naineni <mala...@gmail.com> wrote:
>
>> David, another option is to test with Ganesha 2.7, since you are able to
>> recreate the crash easily with V2.6.3.
>>
>> On Mon, Oct 1, 2018 at 7:49 PM Daniel Gryniewicz <d...@redhat.com> wrote:
>>
>>> I'm not seeing any easy way that cmpf could be corrupted.  The structure
>>> before it is fairly complex, with its last element being an integer, so
>>> it's unlikely that something wrote off the end of that.  That leaves a
>>> random memory corruption, which is almost impossible to detect.
>>>
>>> David, can you rebuild your Ganesha?  If so, can you build with the
>>> Address Sanitizer on?  To do this, install libasan on your distro, and
>>> then pass -DSANITIZE_ADDRESS=ON to cmake.  With ASAN enabled, you may
>>> get a crash at the time of corruption, rather than at some future point.
>>>
>>> Daniel
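
The ASAN rebuild described above could look roughly like the following
sketch. Only the libasan package and the -DSANITIZE_ADDRESS=ON flag come
from the message; the checkout location, the out-of-tree build directory,
the use of yum (the thread mentions CentOS) and the Debug build type are
assumptions. The script prints the commands by default and runs them only
when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: rebuild nfs-ganesha with AddressSanitizer enabled.
# Paths are placeholders; adjust them to your own source checkout.

SRC_DIR=${SRC_DIR:-$HOME/nfs-ganesha}   # hypothetical source checkout
BUILD_DIR=${BUILD_DIR:-$SRC_DIR/build}  # hypothetical out-of-tree build dir
DRY_RUN=${DRY_RUN:-1}                   # default: print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_with_asan() {
    # ASAN runtime library (package name as given in the message).
    run yum install -y libasan
    run mkdir -p "$BUILD_DIR"
    run cd "$BUILD_DIR"
    # The top-level CMakeLists.txt of nfs-ganesha lives under src/.
    # CMAKE_BUILD_TYPE=Debug is an assumption, to keep symbols readable.
    run cmake "$SRC_DIR/src" -DSANITIZE_ADDRESS=ON -DCMAKE_BUILD_TYPE=Debug
    run make
}

build_with_asan
```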
>>>
>>> On 10/01/2018 09:20 AM, Malahal Naineni wrote:
>>> >
>>> > Looking at the code, head->cmpf should hold the address of the
>>> > "clnt_req_xid_cmpf" function. Your gdb output didn't show that, and I
>>> > don't know how that could happen with the V2.6.3 code. @Dan, any
>>> > insights into this issue?
>>> >
>>> > On Mon, Oct 1, 2018 at 2:22 PM David C <dcsysengin...@gmail.com
>>> > <mailto:dcsysengin...@gmail.com>> wrote:
>>> >
>>> >     Hi Malahal
>>> >
>>> >     Result of that command:
>>> >
>>> >     (gdb) p head->cmpf
>>> >     $1 = (opr_rbtree_cmpf_t) 0x31fb0b405ba000b7
>>> >
>>> >     Thanks,
>>> >
>>> >     On Mon, Oct 1, 2018 at 5:55 AM Malahal Naineni <mala...@gmail.com
>>> >     <mailto:mala...@gmail.com>> wrote:
>>> >
>>> >         Looks like the head is messed up. Run these in gdb and let us
>>> >         know the second command's output: 1. "frame 0"   2.
>>> >         "p head->cmpf".  I believe the head->cmpf function pointer is
>>> >         NULL or bad, leading to this segfault. I haven't seen this
>>> >         crash before and have never used Ganesha 2.6.
>>> >
>>> >         Regards, Malahal.
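
The two gdb commands requested above can also be run non-interactively. A
minimal sketch, assuming the binary path shown elsewhere in this thread and
a core file named coredump; the extra "p *head" is an addition for context
and was not requested in the message. By default the script only prints the
command line (with the quoting flattened); set DRY_RUN=0 to run gdb.

```shell
#!/bin/sh
# Sketch: inspect the rbtree head from a core file in gdb batch mode.

BINARY=${BINARY:-/usr/bin/ganesha.nfsd}  # binary that produced the core
CORE=${CORE:-coredump}                   # placeholder path to the core file
DRY_RUN=${DRY_RUN:-1}                    # default: print the command only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

inspect_rbtree_head() {
    # -batch exits after the -ex commands have been executed in order.
    run gdb -batch \
        -ex "frame 0" \
        -ex "p head->cmpf" \
        -ex "p *head" \
        "$BINARY" "$CORE"
}

inspect_rbtree_head
```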
>>> >
>>> >         On Mon, Oct 1, 2018 at 1:25 AM David C <dcsysengin...@gmail.com
>>> >         <mailto:dcsysengin...@gmail.com>> wrote:
>>> >
>>> >             Hi Malahal
>>> >
>>> >             I've set up ABRT so I'm now getting coredumps for the
>>> >             crashes. I've installed the debuginfo packages for
>>> >             nfs-ganesha and libntirpc.
>>> >
>>> >             I'd be really grateful if you could give me some guidance
>>> >             on debugging this.
>>> >
>>> >             Some info on the latest crash:
>>> >
>>> >             The following was echoed to the kernel log:
>>> >
>>> >                 traps: ganesha.nfsd[28589] general protection
>>> >                 ip:7fcf2421dded sp:7fcd9d4d03a0 error:0 in
>>> >                 libntirpc.so.1.6.3[7fcf2420d000+3d000]
>>> >
>>> >
>>> >             Last lines of output from # gdb /usr/bin/ganesha.nfsd coredump:
>>> >
>>> >             [Thread debugging using libthread_db enabled]
>>> >             Using host libthread_db library "/lib64/libthread_db.so.1".
>>> >             Core was generated by `/usr/bin/ganesha.nfsd -L
>>> >             /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.c'.
>>> >             Program terminated with signal 11, Segmentation fault.
>>> >             #0  0x00007fcf2421dded in opr_rbtree_insert
>>> >             (head=head@entry=0x7fcef800c528,
>>> >             node=node@entry=0x7fce68004750) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/rbtree.c:271
>>> >             271                     switch (head->cmpf(node, parent)) {
>>> >             Missing separate debuginfos, use: debuginfo-install
>>> >             bzip2-libs-1.0.6-13.el7.x86_64
>>> >             dbus-libs-1.10.24-7.el7.x86_64
>>> >             elfutils-libelf-0.170-4.el7.x86_64
>>> >             elfutils-libs-0.170-4.el7.x86_64 glibc-2.17-222.el7.x86_64
>>> >             gssproxy-0.7.0-17.el7.x86_64
>>> >             keyutils-libs-1.5.8-3.el7.x86_64
>>> >             krb5-libs-1.15.1-19.el7.x86_64 libattr-2.4.46-13.el7.x86_64
>>> >             libblkid-2.23.2-52.el7.x86_64 libcap-2.22-9.el7.x86_64
>>> >             libcom_err-1.42.9-12.el7_5.x86_64
>>> >             libgcc-4.8.5-28.el7_5.1.x86_64 libgcrypt-1.5.3-14.el7.x86_64
>>> >             libgpg-error-1.12-3.el7.x86_64
>>> >             libnfsidmap-0.25-19.el7.x86_64 libselinux-2.5-12.el7.x86_64
>>> >             libuuid-2.23.2-52.el7.x86_64 lz4-1.7.5-2.el7.x86_64
>>> >             pcre-8.32-17.el7.x86_64 systemd-libs-219-57.el7.x86_64
>>> >             xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
>>> >
>>> >             Output from bt:
>>> >
>>> >             (gdb) bt
>>> >             #0  0x00007fcf2421dded in opr_rbtree_insert
>>> >             (head=head@entry=0x7fcef800c528,
>>> >             node=node@entry=0x7fce68004750) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/rbtree.c:271
>>> >             #1  0x00007fcf24218eac in clnt_req_setup
>>> >             (cc=cc@entry=0x7fce68004720, timeout=...) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/clnt_generic.c:515
>>> >             #2  0x000055d62490347f in nsm_unmonitor
>>> >             (host=host@entry=0x7fce00018ea0) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/Protocols/NLM/nsm.c:219
>>> >             #3  0x000055d6249425cf in dec_nsm_client_ref
>>> >             (client=0x7fce00018ea0) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:857
>>> >             #4  0x000055d624942f61 in free_nlm_client
>>> >             (client=0x7fce00017500) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:1039
>>> >             #5  0x000055d6249431d3 in dec_nlm_client_ref
>>> >             (client=0x7fce00017500) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:1130
>>> >             #6  0x000055d6249439ae in free_nlm_owner
>>> >             (owner=owner@entry=0x7fce00024bc0) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/nlm_owner.c:1314
>>> >             #7  0x000055d624929a48 in free_state_owner
>>> >             (owner=0x7fce00024bc0) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/state_misc.c:818
>>> >             #8  0x000055d624929dc0 in dec_state_owner_ref
>>> >             (owner=0x7fce00024bc0) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/SAL/state_misc.c:968
>>> >             #9  0x000055d6248ff173 in nlm4_Unlock (args=0x7fce68003b98,
>>> >             req=0x7fce68003490, res=0x7fce68000d70) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/Protocols/NLM/nlm_Unlock.c:127
>>> >             #10 0x000055d6248c0f0f in nfs_rpc_process_request
>>> >             (reqdata=0x7fce68003490) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/MainNFSD/nfs_worker_thread.c:1329
>>> >             #11 0x000055d6248c02ba in nfs_rpc_decode_request
>>> >             (xprt=0x7fcef011b600, xdrs=0x7fce68001480) at
>>> >             /usr/src/debug/nfs-ganesha-2.6.3/src/MainNFSD/nfs_rpc_dispatcher_thread.c:1341
>>> >             #12 0x00007fcf2422dbcd in svc_rqst_xprt_task
>>> >             (wpe=0x7fcef011b818) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:751
>>> >             #13 0x00007fcf2422df2a in svc_rqst_epoll_events
>>> >             (n_events=<optimized out>, sr_rec=0x55d6253b3fd0) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:923
>>> >             #14 svc_rqst_epoll_loop (sr_rec=<optimized out>) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:996
>>> >             #15 svc_rqst_run_task (wpe=0x55d6253b3fd0) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/svc_rqst.c:1032
>>> >             #16 0x00007fcf2423671a in work_pool_thread
>>> >             (arg=0x55d6282753f0) at
>>> >             /usr/src/debug/ntirpc-1.6.3/src/work_pool.c:176
>>> >             #17 0x00007fcf2465ce25 in start_thread () from
>>> >             /lib64/libpthread.so.0
>>> >             #18 0x00007fcf23d28bad in clone () from /lib64/libc.so.6
>>> >
>>> >             Thanks for your assistance so far on this
>>> >             David
>>> >
>>> >             On Fri, Sep 28, 2018 at 8:06 PM David C
>>> >             <dcsysengin...@gmail.com <mailto:dcsysengin...@gmail.com>>
>>> >             wrote:
>>> >
>>> >                 Thanks, Malahal. I'll get the coredumps enabled. I've
>>> >                 had a few more crashes today; hopefully they'll shed
>>> >                 some light on the issue.
>>> >
>>> >                 On Fri, Sep 28, 2018 at 1:20 PM Malahal Naineni
>>> >                 <mala...@gmail.com <mailto:mala...@gmail.com>> wrote:
>>> >
>>> >                     You need to enable coredumps for ganesha. Here are
>>> >                     some instructions! Step2 is NOT needed as your
>>> >                     packages are signed:
>>> >
>>> >                     https://ganltc.github.io/setup-to-take-ganesha-coredumps.html
>>> >
>>> >                     On Fri, Sep 28, 2018 at 4:38 PM David C
>>> >                     <dcsysengin...@gmail.com
>>> >                     <mailto:dcsysengin...@gmail.com>> wrote:
>>> >
>>> >                         Hi All
>>> >
>>> >                         CentOS 7.5
>>> >                         nfs-ganesha-2.6.3-1.el7.x86_64
>>> >                         nfs-ganesha-vfs-2.6.3-1.el7.x86_64
>>> >                         libntirpc-1.6.3-1.el7.x86_64
>>> >
>>> >                         My Ganesha service crashed and the following
>>> >                         was echoed to my kernel log:
>>> >
>>> >                             ganesha.nfsd[28752]: segfault at 0
>>> >                             ip           (null) sp 00007ff9a2af8458
>>> >                             error 14 in ganesha.nfsd[559170ef3000+1a4000]
>>> >
>>> >
>>> >                         Nothing in my ganesha.log
>>> >
>>> >                         These are the log settings from my
>>> >                         ganesha.conf:
>>> >
>>> >                             LOG {
>>> >                                      ## Default log level for all components
>>> >                                      Default_Log_Level = DEBUG;
>>> >
>>> >                                      ## Configure per-component log levels.
>>> >                                      #Components {
>>> >                                              #FSAL = INFO;
>>> >                                              #NFS4 = EVENT;
>>> >                                      #}
>>> >
>>> >                                      ## Where to log
>>> >                                      Facility {
>>> >                                              name = FILE;
>>> >                                              destination = "/var/log/ganesha.log";
>>> >                                              enable = active;
>>> >                                      }
>>> >                             }
>>> >
>>> >
>>> >                         This is an example of one of my exports
>>> >                         (they're all NFSv3 with the VFS FSAL):
>>> >
>>> >                             EXPORT
>>> >                             {
>>> >                                      Export_Id = 80;
>>> >                                      Path = /mnt/dir;
>>> >                                      Pseudo = /mnt/dir;
>>> >                                      Access_Type = RW;
>>> >                                      Protocols = 3;
>>> >                                      Transports = TCP;
>>> >                                      Squash = no_root_squash;
>>> >                                      Disable_ACL=False;
>>> >                                      Filesystem_Id = 101.1;
>>> >                                      CLIENT {
>>> >                                         Clients = *;
>>> >                                         Squash = None;
>>> >                                         Access_Type = RW;
>>> >                                      }
>>> >                                      FSAL {
>>> >                                            Name = VFS;
>>> >                                       }
>>> >                             }
>>> >
>>> >
>>> >                         The exports are mounted on CentOS 7.4 clients
>>> >                         with autofs-5.0.7 and
>>> >                         nfs-utils-1.3.0-0.48.el7_4.x86_64
>>> >
>>> >                         This crash occurred approx 2 hours after I
>>> >                         increased the number of clients accessing the
>>> >                         server by approx five. I don't know whether
>>> >                         that's related.
>>> >
>>> >                         Could someone help me troubleshoot this please?
>>> >
>>> >                         Many thanks
>>> >                         David
>>> >
>>> >                         _______________________________________________
>>> >                         Nfs-ganesha-devel mailing list
>>> >                         Nfs-ganesha-devel@lists.sourceforge.net
>>> >                         <mailto:Nfs-ganesha-devel@lists.sourceforge.net>
>>> >                         https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel