Bug#963746: nfs-common: Random Segmentation Violations of rpc.gssd Daemon

2020-10-03 Thread Kraus, Sebastian
root@all:~# coredumpctl debug 
           PID: 26824 (rpc.gssd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Sat 2020-10-03 10:29:16 CEST (5h 38min ago)
  Command Line: /usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10
    Executable: /usr/sbin/rpc.gssd
 Control Group: /system.slice/rpc-gssd.service
          Unit: rpc-gssd.service
         Slice: system.slice
       Boot ID: e60fc71ee667413c98017762004c67f2
    Machine ID: d3d1247edbd7490591d291e33e196b79
      Hostname: all
       Storage: /var/lib/systemd/coredump/core.rpc\x2egssd.0.e60fc71ee667413c98017762004c67f2.26824.160171375600.lz4
       Message: Process 26824 (rpc.gssd) of user 0 dumped core.

Stack trace of thread 4596:
#0  0x563f504ab38e create_auth_rpc_client (rpc.gssd)
#1  0x563f504ab9f8 krb5_use_machine_creds (rpc.gssd)
#2  0x563f504abb92 process_krb5_upcall (rpc.gssd)
#3  0x563f504ac3b3 handle_gssd_upcall (rpc.gssd)
#4  0x7f13dcd4dfa3 start_thread (libpthread.so.0)
#5  0x7f13dcc7e4cf __clone (libc.so.6)

Stack trace of thread 26824:
#0  0x7f13dcc73819 __GI___poll (libc.so.6)
#1  0x7f13dcb59207 send_dg (libresolv.so.2)
#2  0x7f13dcb56c43 __GI___res_context_query (libresolv.so.2)
#3  0x7f13dcb31536 __GI__nss_dns_gethostbyaddr2_r (libnss_dns.so.2)
#4  0x7f13dcb31823 _nss_dns_gethostbyaddr_r (libnss_dns.so.2)
#5  0x7f13dcc8fee2 __gethostbyaddr_r (libc.so.6)
#6  0x7f13dcc987d5 gni_host_inet_name (libc.so.6)
#7  0x563f504aa455 gssd_get_servername (rpc.gssd)
#8  0x563f504aa82c gssd_read_service_info (rpc.gssd)
#9  0x563f504ab067 gssd_inotify_clnt (rpc.gssd)
#10 0x7f13dcf269ba event_persist_closure (libevent-2.1.so.6)
#11 0x7f13dcf27537 event_process_active (libevent-2.1.so.6)
#12 0x563f504a8eaa main (rpc.gssd)
#13 0x7f13dcba909b __libc_start_main (libc.so.6)
#14 0x563f504a903a _start (rpc.gssd)

GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/rpc.gssd...Reading symbols from /usr/lib/debug/.build-id/97/484761d181f6a900fc8e41e4ff6cf038e00e4c.debug...done.
done.
[New LWP 4596]
[New LWP 26824]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x563f504ab38e in create_auth_rpc_client (clp=clp@entry=0x563f50687c30, 
tgtname=tgtname@entry=0x563f5069e67f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7f13dcb2cde8, 
auth_return=auth_return@entry=0x7f13dcb2cd50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
352 gssd_proc.c: No such file or directory.
[Current thread is 1 (Thread 0x7f13dcb2d700 (LWP 4596))]
(gdb) set pagination off
(gdb) bt full
#0  0x563f504ab38e in create_auth_rpc_client (clp=clp@entry=0x563f50687c30, 
tgtname=tgtname@entry=0x563f5069e67f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7f13dcb2cde8, 
auth_return=auth_return@entry=0x7f13dcb2cd50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
rpc_clnt = 0x0
sec = {mech = 0x563f504b7590 , qop = 0, svc = 
RPCSEC_GSS_SVC_NONE, cred = 0x7f13d80024d0, req_flags = 2}
auth = 0x0
retval = -1
min_stat = 256
rpc_errmsg = '\000' , 

Bug#963746: nfs-common: Random Segmentation Violations of rpc.gssd Daemon

2020-06-26 Thread Kraus, Sebastian
Package: nfs-common
Version: 1:1.3.4-2.5
OS Release: Buster

Dear all:

Since September 2019, the rpc.gssd user-space daemon on the NFSv4 file servers 
(VMware ESXi virtualized hosts) of my department has been crashing with random 
segmentation violations. The security flavour of the NFS exports is set to 
sec=krb5p. 
A few months ago, all NFS servers were still running Debian Stretch. I am now 
migrating all "my" NFS file servers to Debian Buster. 
Unfortunately, the problem persists on Debian Buster with the most recent 
versions of the nfs-common package and the Linux kernel. 
I have now managed to get a backtrace of a recent segfault on Debian Buster. 

Here is the full backtrace:

root@server:~# coredumpctl debug
           PID: 6356 (rpc.gssd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Thu 2020-06-25 11:46:08 CEST (21h ago)
  Command Line: /usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10
    Executable: /usr/sbin/rpc.gssd
 Control Group: /system.slice/rpc-gssd.service
          Unit: rpc-gssd.service
         Slice: system.slice
       Boot ID: (obfuscated)
    Machine ID: (obfuscated)
      Hostname: all
       Storage: /var/lib/systemd/coredump/core.rpc\x2egssd.0.7f31136228274af0a1a855b91ad1e75c.6356.159307836800.lz4
       Message: Process 6356 (rpc.gssd) of user 0 dumped core.

Stack trace of thread 14174:
#0  0x56233fff038e n/a (rpc.gssd)
#1  0x56233fff09f8 n/a (rpc.gssd)
#2  0x56233fff0b92 n/a (rpc.gssd)
#3  0x56233fff13b3 n/a (rpc.gssd)
#4  0x7fb2eb8dbfa3 start_thread (libpthread.so.0)
#5  0x7fb2eb80c4cf __clone (libc.so.6)

Stack trace of thread 6356:
#0  0x7fb2eb801819 __GI___poll (libc.so.6)
#1  0x7fb2eb6e7207 send_dg (libresolv.so.2)
#2  0x7fb2eb6e4c43 __GI___res_context_query (libresolv.so.2)
#3  0x7fb2eb6bf536 __GI__nss_dns_gethostbyaddr2_r (libnss_dns.so.2)
#4  0x7fb2eb6bf823 _nss_dns_gethostbyaddr_r (libnss_dns.so.2)
#5  0x7fb2eb81dee2 __gethostbyaddr_r (libc.so.6)
#6  0x7fb2eb8267d5 gni_host_inet_name (libc.so.6)
#7  0x56233ffef455 n/a (rpc.gssd)
#8  0x56233ffef82c n/a (rpc.gssd)
#9  0x56233fff01d0 n/a (rpc.gssd)
#10 0x7fb2ebab49ba n/a (libevent-2.1.so.6)
#11 0x7fb2ebab5537 event_base_loop (libevent-2.1.so.6)
#12 0x56233ffedeaa n/a (rpc.gssd)
#13 0x7fb2eb73709b __libc_start_main (libc.so.6)
#14 0x56233ffee03a n/a (rpc.gssd)

GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/rpc.gssd...Reading symbols from /usr/lib/debug/.build-id/08/a9957ac98e4e5a68f9238c4d763a95e9b4d492.debug...done.
done.
[New LWP 14174]
[New LWP 6356]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x56233fff038e in create_auth_rpc_client (clp=clp@entry=0x562341008fa0, 
tgtname=tgtname@entry=0x562341011c8f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7fb2eaeb9de8, 
auth_return=auth_return@entry=0x7fb2eaeb9d50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
352 gssd_proc.c: No such file or directory.
[Current thread is 1 (Thread 0x7fb2eaeba700 (LWP 14174))]

(gdb) bt full
#0  0x56233fff038e in create_auth_rpc_client (clp=clp@entry=0x562341008fa0, 
tgtname=tgtname@entry=0x562341011c8f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7fb2eaeb9de8, 
auth_return=auth_return@entry=0x7fb2eaeb9d50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
rpc_clnt = 0x0
sec = {mech = 0x56233fffc590 , qop = 0, svc = 
RPCSEC_GSS_SVC_NONE, cred = 0x7fb2dc000d60, req_flags = 2}
auth = 0x0
retval = -1
min_stat = 256
rpc_errmsg = '\000' ,