Bug#963746: nfs-common: Random Segmentation Violations of rpc.gssd Daemon

2020-10-03 Thread Kraus, Sebastian
root@all:~# coredumpctl debug 
           PID: 26824 (rpc.gssd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Sat 2020-10-03 10:29:16 CEST (5h 38min ago)
  Command Line: /usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10
    Executable: /usr/sbin/rpc.gssd
 Control Group: /system.slice/rpc-gssd.service
          Unit: rpc-gssd.service
         Slice: system.slice
       Boot ID: e60fc71ee667413c98017762004c67f2
    Machine ID: d3d1247edbd7490591d291e33e196b79
      Hostname: all
       Storage: /var/lib/systemd/coredump/core.rpc\x2egssd.0.e60fc71ee667413c98017762004c67f2.26824.160171375600.lz4
       Message: Process 26824 (rpc.gssd) of user 0 dumped core.

Stack trace of thread 4596:
#0  0x563f504ab38e create_auth_rpc_client (rpc.gssd)
#1  0x563f504ab9f8 krb5_use_machine_creds (rpc.gssd)
#2  0x563f504abb92 process_krb5_upcall (rpc.gssd)
#3  0x563f504ac3b3 handle_gssd_upcall (rpc.gssd)
#4  0x7f13dcd4dfa3 start_thread (libpthread.so.0)
#5  0x7f13dcc7e4cf __clone (libc.so.6)

Stack trace of thread 26824:
#0  0x7f13dcc73819 __GI___poll (libc.so.6)
#1  0x7f13dcb59207 send_dg (libresolv.so.2)
#2  0x7f13dcb56c43 __GI___res_context_query (libresolv.so.2)
#3  0x7f13dcb31536 __GI__nss_dns_gethostbyaddr2_r (libnss_dns.so.2)
#4  0x7f13dcb31823 _nss_dns_gethostbyaddr_r (libnss_dns.so.2)
#5  0x7f13dcc8fee2 __gethostbyaddr_r (libc.so.6)
#6  0x7f13dcc987d5 gni_host_inet_name (libc.so.6)
#7  0x563f504aa455 gssd_get_servername (rpc.gssd)
#8  0x563f504aa82c gssd_read_service_info (rpc.gssd)
#9  0x563f504ab067 gssd_inotify_clnt (rpc.gssd)
#10 0x7f13dcf269ba event_persist_closure (libevent-2.1.so.6)
#11 0x7f13dcf27537 event_process_active (libevent-2.1.so.6)
#12 0x563f504a8eaa main (rpc.gssd)
#13 0x7f13dcba909b __libc_start_main (libc.so.6)
#14 0x563f504a903a _start (rpc.gssd)

GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/rpc.gssd...Reading symbols from 
/usr/lib/debug/.build-id/97/484761d181f6a900fc8e41e4ff6cf038e00e4c.debug...done.
done.
[New LWP 4596]
[New LWP 26824]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x563f504ab38e in create_auth_rpc_client (clp=clp@entry=0x563f50687c30, 
tgtname=tgtname@entry=0x563f5069e67f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7f13dcb2cde8, 
auth_return=auth_return@entry=0x7f13dcb2cd50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
352 gssd_proc.c: No such file or directory.
[Current thread is 1 (Thread 0x7f13dcb2d700 (LWP 4596))]
(gdb) set pagination off
(gdb) bt full
#0  0x563f504ab38e in create_auth_rpc_client (clp=clp@entry=0x563f50687c30, 
tgtname=tgtname@entry=0x563f5069e67f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7f13dcb2cde8, 
auth_return=auth_return@entry=0x7f13dcb2cd50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
rpc_clnt = 0x0
sec = {mech = 0x563f504b7590 , qop = 0, svc = 
RPCSEC_GSS_SVC_NONE, cred = 0x7f13d80024d0, req_flags = 2}
auth = 0x0
retval = -1
min_stat = 256
rpc_errmsg = '\000' , 

Bug#963746: Fw: [PATCH v2] Re: Strange segmentation violations of rpc.gssd in Debian Buster

2020-06-28 Thread Kraus, Sebastian
Testing upstream patch by Doug Nazar:
Manually patching gssd.c, gssd_proc.c and gssd.h within nfs-utils_1.3.4 source.
Rebuilding binary package nfs-common-1.3.4-2.5 from nfs-utils_1.3.4 source.
Testing altered binary packages nfs-common_1.3.4-2.5_amd64.deb and 
nfs-common-dbgsym_1.3.4-2.5_amd64.deb on one machine.

ToDo:
Correctly backporting Nazar's patch to nfs-utils_1.3.4 source of Debian Stable 
and Oldstable.
Roll-out to all NFS file servers.


Jump into nightmare's heaven oUt
Sebastian


Sebastian Kraus
Team IT am Institut für Chemie
Gebäude C, Straße des 17. Juni 115, Raum C7

Technische Universität Berlin
Fakultät II
Institut für Chemie
Sekretariat C3
Straße des 17. Juni 135
10623 Berlin

Email: sebastian.kr...@tu-berlin.de


From: linux-nfs-ow...@vger.kernel.org on behalf of Doug Nazar
Sent: Friday, June 26, 2020 23:30
To: J. Bruce Fields
Cc: Kraus, Sebastian; linux-...@vger.kernel.org; Steve Dickson; Olga Kornievskaia
Subject: [PATCH v2] Re: Strange segmentation violations of rpc.gssd in Debian Buster

On 2020-06-26 17:02, J. Bruce Fields wrote:
> Unless I'm missing something--an upcall thread could still be using this
> file descriptor.
>
> If we're particularly unlucky, we could do a new open in a moment and
> reuse this file descriptor number, and then writes in do_downcall()
> could end up going to some other random file.
>
> I think we want these closes done by gssd_free_client() in the !refcnt
> case?

Makes sense. I was thinking more that it was an abort situation and we
shouldn't be sending any data to the kernel but re-use is definitely a
concern.
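
The descriptor re-use concern can be reproduced in isolation. The sketch below is not taken from nfs-utils: fd_number_is_reused() is a hypothetical helper, and /dev/null merely stands in for the files that rpc.gssd keeps open. It relies only on the POSIX rule that open() returns the lowest unused descriptor number.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo helper, not part of nfs-utils.
 * Returns 1 if the descriptor number released by close() is handed out
 * again by the next open(), 0 otherwise. */
int fd_number_is_reused(void)
{
    int stale, fresh, reused;

    /* Stands in for a descriptor such as clp->krb5_fd. */
    stale = open("/dev/null", O_WRONLY);
    assert(stale >= 0);   /* demo only: /dev/null is assumed to exist */
    close(stale);         /* closed while another thread may still hold the number */

    /* An unrelated file opened a moment later. */
    fresh = open("/dev/null", O_WRONLY);
    assert(fresh >= 0);

    /* If the numbers match, a write through the stale number would
     * silently land in the unrelated file. */
    reused = (fresh == stale);
    close(fresh);
    return reused;
}
```

In a single-threaded process with no other descriptor activity in between, the second open() gets the same number back, which is why a late write through the stale number is dangerous.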

I've split it so that we are removed from the event loop in destroy()
but the close happens in free().
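
A minimal sketch of that destroy()/free() split, assuming a simplified stand-in for struct clnt_info. The names struct client, client_get(), client_put() and client_destroy() are illustrative and are not the identifiers used in the patch; the registered flag only stands in for the event-loop registration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical, reduced stand-in for struct clnt_info. */
struct client {
    int refcount;      /* protected by client_lock */
    int fd;            /* descriptor shared with worker threads */
    bool registered;   /* stands in for event-loop registration */
};

static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

struct client *client_new(int fd)
{
    struct client *c = malloc(sizeof(*c));

    if (c == NULL)
        return NULL;
    c->refcount = 1;   /* reference held by the client list */
    c->fd = fd;
    c->registered = true;
    return c;
}

/* Taken on behalf of a worker before it starts using c. */
void client_get(struct client *c)
{
    pthread_mutex_lock(&client_lock);
    c->refcount++;
    pthread_mutex_unlock(&client_lock);
}

/* Drops one reference. Only the last reference closes the descriptor
 * and frees c. Returns true if c was freed. */
bool client_put(struct client *c)
{
    int refcnt;

    pthread_mutex_lock(&client_lock);
    assert(c->refcount > 0);
    refcnt = --c->refcount;
    pthread_mutex_unlock(&client_lock);
    if (refcnt > 0)
        return false;

    if (c->fd >= 0)
        close(c->fd);
    free(c);
    return true;
}

/* Stops event delivery right away, but leaves close and free to the
 * last client_put(). Returns true if c was freed. */
bool client_destroy(struct client *c)
{
    c->registered = false;
    return client_put(c);
}
```

A worker that still holds a reference when client_destroy() runs keeps a valid structure and a valid descriptor until it calls client_put() itself.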

Doug

From 8ef49081e8a42bfa05bb63265cd4f951e2b23413 Mon Sep 17 00:00:00 2001
From: Doug Nazar 
Date: Fri, 26 Jun 2020 16:02:04 -0400
Subject: [PATCH] gssd: Refcount struct clnt_info to protect multithread usage

Struct clnt_info is shared with the various upcall threads so
we need to ensure that it stays around even if the client dir
gets removed.

Reported-by: Sebastian Kraus 
Signed-off-by: Doug Nazar 
---
 utils/gssd/gssd.c  | 67 --
 utils/gssd/gssd.h  |  5 ++--
 utils/gssd/gssd_proc.c |  4 +--
 3 files changed, 55 insertions(+), 21 deletions(-)

diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c
index 588da0fb..b40c3220 100644
--- a/utils/gssd/gssd.c
+++ b/utils/gssd/gssd.c
@@ -90,9 +90,7 @@ char *ccachedir = NULL;
 /* Avoid DNS reverse lookups on server names */
 static bool avoid_dns = true;
 static bool use_gssproxy = false;
-int thread_started = false;
-pthread_mutex_t pmutex = PTHREAD_MUTEX_INITIALIZER;
-pthread_cond_t pcond = PTHREAD_COND_INITIALIZER;
+pthread_mutex_t clp_lock = PTHREAD_MUTEX_INITIALIZER;
 
 TAILQ_HEAD(topdir_list_head, topdir) topdir_list;
 
@@ -359,20 +357,28 @@ out:
free(port);
 }
 
+/* Actually frees clp and fields that might be used from other
+ * threads if was last reference.
+ */
 static void
-gssd_destroy_client(struct clnt_info *clp)
+gssd_free_client(struct clnt_info *clp)
 {
-   if (clp->krb5_fd >= 0) {
+   int refcnt;
+
+   pthread_mutex_lock(&clp_lock);
+   refcnt = --clp->refcount;
+   pthread_mutex_unlock(&clp_lock);
+   if (refcnt > 0)
+   return;
+
+   printerr(3, "freeing client %s\n", clp->relpath);
+
+   if (clp->krb5_fd >= 0)
close(clp->krb5_fd);
-   event_del(&clp->krb5_ev);
-   }
 
-   if (clp->gssd_fd >= 0) {
+   if (clp->gssd_fd >= 0)
close(clp->gssd_fd);
-   event_del(&clp->gssd_ev);
-   }
 
-   inotify_rm_watch(inotify_fd, clp->wd);
free(clp->relpath);
free(clp->servicename);
free(clp->servername);
@@ -380,6 +386,24 @@ gssd_destroy_client(struct clnt_info *clp)
free(clp);
 }
 
+/* Called when removing from clnt_list to tear down event handling.
+ * Will then free clp if was last reference.
+ */
+static void
+gssd_destroy_client(struct clnt_info *clp)
+{
+   printerr(3, "destroying client %s\n", clp->relpath);
+
+   if (clp->krb5_fd >= 0)
+   event_del(&clp->krb5_ev);
+
+   if (clp->gssd_fd >= 0)
+   event_del(&clp->gssd_ev);
+
+   inotify_rm_watch(inotify_fd, clp->wd);
+   gssd_free_client(clp);
+}
+
 static void gssd_scan(void);
 
 static int
@@ -416,11 +440,21 @@ static struct clnt_upcall_info *alloc_upcall_info(struct clnt_info *clp)
info = malloc(sizeof(struct clnt_upcall_info));
if (info == NULL)
return NULL;
+
+   pthread_mutex_lock(&clp_lock);
+   clp->refcount++;
+   pthread_mutex_unlock(&clp_lock);
info->clp = clp;
 
return info;
 }
 
+void free_upcall_info(struct clnt_upcall_info *info)
+{
+   gs

Bug#963746: nfs-common: Random Segmentation Violations of rpc.gssd Daemon

2020-06-26 Thread Kraus, Sebastian
Package: nfs-common
Version: 1:1.3.4-2.5
OS Release: Buster

Dear all:

Since September 2019, the rpc.gssd user space daemon on the NFSv4 file servers 
(VMware ESXi virtualized hosts) of my department has been crashing with random 
segmentation violations. The security flavour of the NFS exports is set to 
sec=krb5p. 
A few months ago, all NFS servers were still running Debian Stretch. I am 
now migrating all "my" NFS file servers to Debian Buster. 
Unfortunately, the problem persists on Debian Buster with the most recent 
versions of the nfs-common package and the Linux kernel. 
I have now managed to get a backtrace of a recent segfault on Debian Buster. 

Here is the full backtrace:

root@server:~# coredumpctl debug
           PID: 6356 (rpc.gssd)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 11 (SEGV)
     Timestamp: Thu 2020-06-25 11:46:08 CEST (21h ago)
  Command Line: /usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10
    Executable: /usr/sbin/rpc.gssd
 Control Group: /system.slice/rpc-gssd.service
          Unit: rpc-gssd.service
         Slice: system.slice
       Boot ID: (obfuscated)
    Machine ID: (obfuscated)
      Hostname: all
       Storage: /var/lib/systemd/coredump/core.rpc\x2egssd.0.7f31136228274af0a1a855b91ad1e75c.6356.159307836800.lz4
       Message: Process 6356 (rpc.gssd) of user 0 dumped core.

Stack trace of thread 14174:
#0  0x56233fff038e n/a (rpc.gssd)
#1  0x56233fff09f8 n/a (rpc.gssd)
#2  0x56233fff0b92 n/a (rpc.gssd)
#3  0x56233fff13b3 n/a (rpc.gssd)
#4  0x7fb2eb8dbfa3 start_thread (libpthread.so.0)
#5  0x7fb2eb80c4cf __clone (libc.so.6)

Stack trace of thread 6356:
#0  0x7fb2eb801819 __GI___poll (libc.so.6)
#1  0x7fb2eb6e7207 send_dg (libresolv.so.2)
#2  0x7fb2eb6e4c43 __GI___res_context_query (libresolv.so.2)
#3  0x7fb2eb6bf536 __GI__nss_dns_gethostbyaddr2_r (libnss_dns.so.2)
#4  0x7fb2eb6bf823 _nss_dns_gethostbyaddr_r (libnss_dns.so.2)
#5  0x7fb2eb81dee2 __gethostbyaddr_r (libc.so.6)
#6  0x7fb2eb8267d5 gni_host_inet_name (libc.so.6)
#7  0x56233ffef455 n/a (rpc.gssd)
#8  0x56233ffef82c n/a (rpc.gssd)
#9  0x56233fff01d0 n/a (rpc.gssd)
#10 0x7fb2ebab49ba n/a (libevent-2.1.so.6)
#11 0x7fb2ebab5537 event_base_loop (libevent-2.1.so.6)
#12 0x56233ffedeaa n/a (rpc.gssd)
#13 0x7fb2eb73709b __libc_start_main (libc.so.6)
#14 0x56233ffee03a n/a (rpc.gssd)

GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/rpc.gssd...Reading symbols from 
/usr/lib/debug/.build-id/08/a9957ac98e4e5a68f9238c4d763a95e9b4d492.debug...done.
done.
[New LWP 14174]
[New LWP 6356]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/rpc.gssd -vvv -rrr -t 3600 -T 10'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x56233fff038e in create_auth_rpc_client (clp=clp@entry=0x562341008fa0, 
tgtname=tgtname@entry=0x562341011c8f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7fb2eaeb9de8, 
auth_return=auth_return@entry=0x7fb2eaeb9d50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
352 gssd_proc.c: No such file or directory.
[Current thread is 1 (Thread 0x7fb2eaeba700 (LWP 14174))]

(gdb) bt full
#0  0x56233fff038e in create_auth_rpc_client (clp=clp@entry=0x562341008fa0, 
tgtname=tgtname@entry=0x562341011c8f "h...@client.domain.tu-berlin.de", 
clnt_return=clnt_return@entry=0x7fb2eaeb9de8, 
auth_return=auth_return@entry=0x7fb2eaeb9d50, uid=uid@entry=0, 
cred=cred@entry=0x0, authtype=0) at gssd_proc.c:352
rpc_clnt = 0x0
sec = {mech = 0x56233fffc590 , qop = 0, svc = 
RPCSEC_GSS_SVC_NONE, cred = 0x7fb2dc000d60, req_flags = 2}
auth = 0x0
retval = -1
min_stat = 256
rpc_errmsg = '\000' , 

Bug#858819: 2.10.0-2 missing from package index of stable

2018-06-15 Thread Kraus, Sebastian
In addition to my last post and for the sake of clarity:

Version 2.10.0-2 of the trac-email2trac package has apparently been built
for Debian stable as 2.10.0-2_all, but it has not been pushed to the
official package index in main:


Package: trac-email2trac
Source: email2trac
Version: 2.10.0-1
Installed-Size: 137
Maintainer: W. Martin Borgert 
Architecture: all
Depends: python, python:any, trac
Suggests: getmail4
Description: Creates and amends Trac tickets from e-mail
Homepage: https://oss.trac.surfsara.nl/email2trac
Description-md5: 2edfde2600de0a44fd3071ab45935cc4
Tag: role::plugin, works-with::bugs, works-with::mail
Section: web
Priority: optional
Filename: pool/main/e/email2trac/trac-email2trac_2.10.0-1_all.deb
Size: 43822
MD5sum: f2f7181a33ec850a7c3652b213a76dff
SHA256: 737fca926ad6c4aca1d2d63e4c170dc989b4d140bbe6c4955a61d7570f1e8626



Also on packages.debian.org, the latest release for stable remains at
2.10.0-1. It would be great to have the new version there.



Thanks and Regards

Sebastian





Bug#858819: availability new package version in stable

2018-06-14 Thread Kraus, Sebastian
Dear Adrian Bunk,



thanks for your corrections.

Do you plan to push the new package version to Debian stable (stretch) as
well? Only the buggy package version 2.10.0-1 is available there.



Regards

Sebastian




Bug#882117: non-assignment options for nfs mounts neither recognized in /etc/fstab nor /etc/nfsmount.conf

2017-11-18 Thread Kraus, Sebastian
Package: mount
Version: 2.29.2-1

Dear package maintainers,

simple (non-assignment) NFS mount options such as acl or (no)rdirplus are 
recognized neither in /etc/fstab nor in /etc/nfsmount.conf.
We need to mount some NFS shares at boot on our client machines, and the 
nordirplus option is mandatory there because of severe caching problems 
with our NFS servers.
When the NFS shares are mounted manually on the command line, these 
non-assignment options are recognized.
Manual mounts are not feasible in our case.

Tested with Debian stretch 4.9.0-4-amd64 #1 SMP Debian 4.9.51-1 (2017-09-28) 
x86_64

Package: linux-image-4.9.0-4-amd64  Version: 4.9.51-1 .



Regards


Sebastian Kraus







Sebastian Kraus
Team IT am Institut für Chemie
Gebäude C, Straße des 17. Juni 115, Raum C7

Technische Universität Berlin
Fakultät II
Institut für Chemie
Sekretariat C3
Straße des 17. Juni 135
10623 Berlin


Tel.: +49 30 314 22263
Fax: +49 30 314 29309
Email: sebastian.kr...@tu-berlin.de





Bug#859888: Impossible to install a kernel module for a specific kernel version in a clean manner

2017-04-08 Thread Kraus, Sebastian
Package: dkms
Version: 2.2.0.3-2
Architecture: all


Hello,


the script /usr/lib/dkms/dkms_autoinstaller runs the following command on 
line 42:


dkms autoinstall --kernelver $kernel

This does not allow installing modules for one specific kernel version only, 
and it ignores the list of kernel versions set in 
/usr/lib/dkms/common.postinst. 
According to the man page and the online help of dkms, the autoinstall 
command does not accept any further options. Is this behaviour intended or a 
mistake?



Best greetings



Sebastian Kraus