Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-02-04 Thread Trond Myklebust
On Wed, Feb 4, 2015 at 4:52 PM, Bruno Prémont  wrote:
> On Wed, 4 Feb 2015 14:06:48 Trond Myklebust wrote:
>> On Wed, Feb 4, 2015 at 12:08 PM, Bruno Prémont wrote:
>> > On Fri, 30 January 2015 Trond Myklebust wrote:
>> > > On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
>> > > > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont wrote:
>> > > > > On a system running home-brown container (mntns, utsns,
>> > > > > pidns, netns) with NFS mount-point bind-mounted into the
>> > > > > container I hit the following trace if nfs filesystem is
>> > > > > first umount()ed in init ns and then later umounted from
>> > > > > container when the container exists.
>> > > > >
>> > > > > 
>> > > >
>> > > > We should rather change rpcb_create() to pass the nodename from
>> > > > the parent. The point is that the rpc_clnt->cl_nodename is used
>> > > > in various different contexts (for instance in the AUTH_SYS
>> > > > credential) where it isn't always appropriate to have it set to
>> > > > an empty string.
>> > >
>> > > I was rather hoping that Bruno would fix up his patch and resend,
>> > > but since other reports of the same bug are now surfacing...
>> > > Please could you all check if something like the following patch
>> > > fixes it.
>> >
>> > This patch works for me, so
>> >   Tested-by: Bruno Prémont 
>> >
>> > Now I get just the following complaint in dmesg on shutdown:
>> >   [ 1940.173201] lockd: cannot unmonitor nfs.home
>> >    name of NFS
>> > server
>> >
>> > This complaint did not happen with my "empty string" name
>> > patch.
>>
>> Are there any clues from rpc.statd in your syslog that might help to
>> explain the error?
>
> I would say there is no more running rpc.statd at that time.
>
> My shutdown process looks about as follows (it's not necessarily the
> optimal ordering):
>
> - umount nfs mountpoints from root mntns
> - stop nfsclient
> - stop rpc.statd
> - stop rpcbind
> - stop containers (with bind-mounted nfs mountpoints from root mntns
> that get implicitly umounted on mntns release)
>

In that case, the above message can be taken as a likely good
indication that the kernel is working as expected. If there is no
rpc.statd to talk to, then the unmonitor RPC call is expected to fail.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-02-04 Thread Bruno Prémont
On Wed, 4 Feb 2015 14:06:48 Trond Myklebust wrote:
> On Wed, Feb 4, 2015 at 12:08 PM, Bruno Prémont wrote:
> > On Fri, 30 January 2015 Trond Myklebust wrote:
> > > On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
> > > > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont wrote:
> > > > > On a system running home-brown container (mntns, utsns,
> > > > > pidns, netns) with NFS mount-point bind-mounted into the
> > > > > container I hit the following trace if nfs filesystem is
> > > > > first umount()ed in init ns and then later umounted from
> > > > > container when the container exists.
> > > > >
> > > > > 
> > > >
> > > > We should rather change rpcb_create() to pass the nodename from
> > > > the parent. The point is that the rpc_clnt->cl_nodename is used
> > > > in various different contexts (for instance in the AUTH_SYS
> > > > credential) where it isn't always appropriate to have it set to
> > > > an empty string.
> > >
> > > I was rather hoping that Bruno would fix up his patch and resend,
> > > but since other reports of the same bug are now surfacing...
> > > Please could you all check if something like the following patch
> > > fixes it.
> >
> > This patch works for me, so
> >   Tested-by: Bruno Prémont 
> >
> > Now I get just the following complaint in dmesg on shutdown:
> >   [ 1940.173201] lockd: cannot unmonitor nfs.home
> >    name of NFS
> > server
> >
> > This complaint did not happen with my "empty string" name
> > patch.
> 
> Are there any clues from rpc.statd in your syslog that might help to
> explain the error?

I would say there is no more running rpc.statd at that time.

My shutdown process looks about as follows (it's not necessarily the
optimal ordering):

- umount nfs mountpoints from root mntns
- stop nfsclient
- stop rpc.statd
- stop rpcbind
- stop containers (with bind-mounted nfs mountpoints from root mntns
that get implicitly umounted on mntns release)

Bruno
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-02-04 Thread Trond Myklebust
On Wed, Feb 4, 2015 at 12:08 PM, Bruno Prémont
 wrote:
>
> On Fri, 30 January 2015 Trond Myklebust wrote:
> > On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
> > > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont wrote:
> > > > On a system running home-brown container (mntns, utsns, pidns, netns)
> > > > with NFS mount-point bind-mounted into the container I hit the following
> > > > trace if nfs filesystem is first umount()ed in init ns and then later
> > > > umounted from container when the container exists.
> > > >
> > > > [51397.767310] BUG: unable to handle kernel NULL pointer dereference at 
> > > > 0008
> > > > [51397.770671] IP: [] rpc_new_client+0x193/0x2b0
> > > > [51397.773967] PGD 0
> > > > [51397.777218] Oops:  [#1] SMP
> > > > [51397.780490] Modules linked in:
> > > > [51397.783751] CPU: 0 PID: 1711 Comm: image-starter Not tainted 
> > > > 3.19.0-rc2-kvm+ #7
> > > > [51397.787123] Hardware name: Gigabyte Technology Co., Ltd. 
> > > > GA-A75M-UD2H/GA-A75M-UD2H, BIOS F6 09/28/2012
> > > > [51397.790606] task: 8800c9fcbac0 ti: 8801fe754000 task.ti: 
> > > > 8801fe754000
> > > > [51397.794149] RIP: 0010:[]  [] 
> > > > rpc_new_client+0x193/0x2b0
> > > > [51397.797798] RSP: 0018:8801fe757908  EFLAGS: 00010246
> > > > [51397.801444] RAX:  RBX: 88009dafb240 RCX: 
> > > > 0180
> > > > [51397.805174] RDX: bae0 RSI: 1770 RDI: 
> > > > 88009dafb308
> > > > [51397.808913] RBP: 8801fe757948 R08: 88009daf92d8 R09: 
> > > > 88009dafb458
> > > > [51397.812673] R10: 88009dafb458 R11: 88020ec15bc0 R12: 
> > > > 8801fe757a40
> > > > [51397.816456] R13: 81b9d800 R14: 8800c6e31030 R15: 
> > > > 
> > > > [51397.820270] FS:  7f335a3a1700() GS:88020ec0() 
> > > > knlGS:f7287700
> > > > [51397.824168] CS:  0010 DS:  ES:  CR0: 8005003b
> > > > [51397.828066] CR2: 0008 CR3: 0001fc54d000 CR4: 
> > > > 07f0
> > > > [51397.832017] Stack:
> > > > [51397.835924]  000a 81b9d770 81826450 
> > > > 8801fe757a40
> > > > [51397.840023]  8801fe757a40 8800cf08d500 81826450 
> > > > 820f4728
> > > > [51397.844130]  8801fe757978 81828815 8801fe757978 
> > > > 8182aad8
> > > > [51397.848224] Call Trace:
> > > > [51397.852221]  [] ? call_start+0x20/0x20
> > > > [51397.856273]  [] ? call_start+0x20/0x20
> > > > [51397.860295]  [] rpc_create_xprt+0x15/0xb0
> > > > [51397.864324]  [] ? xprt_create_transport+0x108/0x1b0
> > > > [51397.868428]  [] rpc_create+0xc1/0x190
> > > > [51397.872574]  [] ? internal_add_timer+0x66/0x80
> > > > [51397.876733]  [] ? mod_timer+0x109/0x1e0
> > > > [51397.880877]  [] rpcb_create+0x6e/0x90
> > > > [51397.884999]  [] rpcb_getport_async+0x15a/0x330
> > > > [51397.889118]  [] ? rpc_malloc+0x3a/0x70
> > > > [51397.893240]  [] ? __kmalloc+0xc2/0x170
> > > > [51397.897354]  [] ? call_reserveresult+0x110/0x110
> > > > [51397.901490]  [] ? call_start+0x20/0x20
> > > > [51397.905606]  [] ? call_start+0x20/0x20
> > > > [51397.909662]  [] call_bind+0x3e/0x40
> > > > [51397.913709]  [] __rpc_execute+0x79/0x330
> > > > [51397.917778]  [] rpc_execute+0x5d/0xa0
> > > > [51397.921871]  [] rpc_run_task+0x6b/0x90
> > > > [51397.925989]  [] rpc_call_sync+0x3e/0xa0
> > > > [51397.930108]  [] nsm_mon_unmon+0xb9/0xd0
> > > > [51397.934191]  [] ? call_rcu_bh+0x20/0x20
> > > > [51397.938235]  [] nsm_unmonitor+0x8c/0x140
> > > > [51397.942309]  [] nlm_destroy_host_locked+0x63/0xa0
> > > > [51397.946442]  [] nlmclnt_release_host+0x7c/0x130
> > > > [51397.950591]  [] nlmclnt_done+0x15/0x30
> > > > [51397.954773]  [] nfs_destroy_server+0x12/0x20
> > > > [51397.958934]  [] nfs_free_server+0x22/0xa0
> > > > [51397.963053]  [] nfs_kill_super+0x1d/0x30
> > > > [51397.967158]  [] deactivate_locked_super+0x4c/0x70
> > > > [51397.971286]  [] deactivate_super+0x49/0x70
> > > > [51397.975398]  [] cleanup_mnt+0x3e/0x90
> > > > [51397.979499]  [] __cleanup_mnt+0xd/0x10
> > > > [51397.983598]  [] task_work_run+0xbc/0xe0
> > > > [51397.987697]  [] do_exit+0x295/0xaf0
> > > > [51397.991812]  [] ? fput+0x9/0x10
> > > > [51397.995937]  [] ? task_work_run+0xa4/0xe0
> > > > [51398.70]  [] do_group_exit+0x3a/0xa0
> > > > [51398.004201]  [] SyS_exit_group+0xf/0x10
> > > > [51398.008315]  [] system_call_fastpath+0x12/0x17
> > > > [51398.012438] Code: 43 78 48 8d bb c8 00 00 00 48 89 7b 70 48 8b 30 e8 
> > > > 63 2d 01 00 c7 03 01 00 00 00 65 48 8b 04 25 00 aa 00 00 48 8b 80 c0 09 
> > > > 00 00 <4c> 8b 68 08 49 83 c5 45 4
> > > > [51398.022378] RIP  [] rpc_new_client+0x193/0x2b0
> > > > [51398.026732]  RSP 
> > > > [51398.031025] CR2: 0008
> > > > [51398.035326] ---[ end trace b701b037bc457620 ]---
> > > > [51398.058223] Fixing recursive fault but reboot is needed!
> > >
> > > We should rather change rpcb_create() to pass the nodename f

Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-02-04 Thread Bruno Prémont
 [51398.012438] Code: 43 78 48 8d bb c8 00 00 00 48 89 7b 70 48 8b 30 e8 
> > > 63 2d 01 00 c7 03 01 00 00 00 65 48 8b 04 25 00 aa 00 00 48 8b 80 c0 09 
> > > 00 00 <4c> 8b 68 08 49 83 c5 45 4
> > > [51398.022378] RIP  [] rpc_new_client+0x193/0x2b0
> > > [51398.026732]  RSP 
> > > [51398.031025] CR2: 0008
> > > [51398.035326] ---[ end trace b701b037bc457620 ]---
> > > [51398.058223] Fixing recursive fault but reboot is needed!
> >
> > We should rather change rpcb_create() to pass the nodename from the
> > parent. The point is that the rpc_clnt->cl_nodename is used in various
> > different contexts (for instance in the AUTH_SYS credential) where it
> > isn't always appropriate to have it set to an empty string.
> 
> I was rather hoping that Bruno would fix up his patch and resend, but
> since other reports of the same bug are now surfacing... Please could
> you all check if something like the following patch fixes it.

This patch works for me, so
  Tested-by: Bruno Prémont 

Now I get just the following complaint in dmesg on shutdown:
  [ 1940.173201] lockd: cannot unmonitor nfs.home
   name of NFS server

This complaint did not happen with my "empty string" name
patch.

Thanks,
Bruno


> Thanks
>   Trond
> 
> 8<-
> From 87557df0ca2241da0edac558286650fdb081152c Mon Sep 17 00:00:00 2001
> From: Trond Myklebust 
> Date: Fri, 30 Jan 2015 18:12:28 -0500
> Subject: [PATCH] SUNRPC: NULL utsname dereference on NFS umount during
>  namespace cleanup
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Fix an Oopsable condition when nsm_mon_unmon is called as part of the
> namespace cleanup, which now apparently happens after the utsname
> has been freed.
> 
> Link: http://lkml.kernel.org/r/20150125220604.09012...@neptune.home
> Reported-by: Bruno Prémont 
> Cc: sta...@vger.kernel.org # 3.18
> Signed-off-by: Trond Myklebust 
> ---
>  fs/lockd/mon.c  | 13 +
>  include/linux/sunrpc/clnt.h |  3 ++-
>  net/sunrpc/clnt.c   | 12 +++-
>  net/sunrpc/rpcb_clnt.c  |  8 ++--
>  4 files changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
> index 1cc6ec51e6b1..47a32b6d9b90 100644
> --- a/fs/lockd/mon.c
> +++ b/fs/lockd/mon.c
> @@ -65,7 +65,7 @@ static inline struct sockaddr *nsm_addr(const struct 
> nsm_handle *nsm)
>   return (struct sockaddr *)&nsm->sm_addr;
>  }
>  
> -static struct rpc_clnt *nsm_create(struct net *net)
> +static struct rpc_clnt *nsm_create(struct net *net, const char *nodename)
>  {
>   struct sockaddr_in sin = {
>   .sin_family = AF_INET,
> @@ -77,6 +77,7 @@ static struct rpc_clnt *nsm_create(struct net *net)
>   .address= (struct sockaddr *)&sin,
>   .addrsize   = sizeof(sin),
>   .servername = "rpc.statd",
> + .nodename   = nodename,
>   .program= &nsm_program,
>   .version= NSM_VERSION,
>   .authflavor = RPC_AUTH_NULL,
> @@ -102,7 +103,7 @@ out:
>   return clnt;
>  }
>  
> -static struct rpc_clnt *nsm_client_get(struct net *net)
> +static struct rpc_clnt *nsm_client_get(struct net *net, const char *nodename)
>  {
>   struct rpc_clnt *clnt, *new;
>   struct lockd_net *ln = net_generic(net, lockd_net_id);
> @@ -111,7 +112,7 @@ static struct rpc_clnt *nsm_client_get(struct net *net)
>   if (clnt != NULL)
>   goto out;
>  
> - clnt = new = nsm_create(net);
> + clnt = new = nsm_create(net, nodename);
>   if (IS_ERR(clnt))
>   goto out;
>  
> @@ -190,19 +191,23 @@ int nsm_monitor(const struct nlm_host *host)
>   struct nsm_res  res;
>   int status;
>   struct rpc_clnt *clnt;
> + const char *nodename = NULL;
>  
>   dprintk("lockd: nsm_monitor(%s)\n", nsm->sm_name);
>  
>   if (nsm->sm_monitored)
>   return 0;
>  
> + if (host->h_rpcclnt)
> + nodename = host->h_rpcclnt->cl_nodename;
> +
>   /*
>* Choose whether to record the caller_name or IP address of
>* this peer in the local rpc.statd's database.
>*/
>   nsm->sm_mon_name = nsm_use_hostnames ? nsm->sm_name : nsm->sm_addrbuf;
>  
> - clnt = nsm_client_get(host->net);
>

Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-02-02 Thread Nix
On 2 Feb 2015, Trond Myklebust verbalised:
> Hmm... I'm at a loss to see how rpcb_create can ever call
> rpc_new_client() with a null value for the nodename with that patch
> applied. Are you 100% sure that the above Oops came from a patched
> kernel? That IP address of "rpc_new_client+0x13b/0x1f2" looks
> identical to the one in your original posting.

I've been swapping kernels a lot of late due to bisection -- it is
perfectly possible that I somehow ended up running an unpatched one :/

I'll do a build from scratch with the patch and reboot into it. My
apologies if this was a false positive (which it looks quite like it
might have been: your evidence is pretty persuasive!)

-- 
NULL && (void)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-02-02 Thread Trond Myklebust
On Mon, Feb 2, 2015 at 12:41 PM, Nix  wrote:
> On 31 Jan 2015, n...@esperi.org.uk told this:
>> I'll let it run overnight and give it a reboot in the morning.
>
> Alas, my latest reboot hit:
>
> [  215.245158] BUG: unable to handle kernel NULL pointer dereference at 
> 0004
> [  215.251602] IP: [] rpc_new_client+0x13b/0x1f2
> [  215.251602] *pde = 
> [  215.251602] Oops:  [#1]
> [  215.251602] CPU: 0 PID: 398 Comm: bash Not tainted 3.18.5+ #1
> [  215.251602] task: de1fcfc0 ti: de1fa000 task.ti: de1fa000
> [  215.251602] EIP: 0060:[] EFLAGS: 00010246 CPU: 0
> [  215.251602] EIP is at rpc_new_client+0x13b/0x1f2
> [  215.251602] EAX:  EBX: df70f000 ECX: bae0 EDX: 0005
> [  215.251602] ESI:  EDI: df6cc000 EBP: 0007 ESP: de1fbcac
> [  215.251602]  DS: 007b ES: 007b FS:  GS:  SS: 0068
> [  215.251602] CR0: 8005003b CR2: 0004 CR3: 1f645000 CR4: 0090
> [  215.251602] Stack:
> [  215.251602]  00d0 df6cc000 de1fbd4c  de1fbd4c de1fbd4c 
> de1fbd10 de1fbf8c
> [  215.251602]  c0350196 de1fbd4c de2056d0 de1fbd10 c0350300 c0262e32 
> c0263841 df528000
> [  215.251602]  a2e0b542 0006 c044b178  de1fbddc 0010 
> de2056d0 
> [  215.251602] Call Trace:
> [  215.251602]  [] ? rpc_create_xprt+0xc/0x74
> [  215.251602]  [] ? rpc_create+0x102/0x10f
> [  215.251602]  [] ? ata_sff_check_status+0x8/0x9
> [  215.251602]  [] ? ata_dev_select.constprop.20+0x83/0x95
> [  215.251602]  [] ? __block_commit_write.isra.25+0x56/0x7f
> [  215.251602]  [] ? rpcb_create+0x6e/0x7c
> [  215.251602]  [] ? rpcb_getport_async+0x124/0x25a
> [  215.251602]  [] ? update_curr+0x81/0xb3
> [  215.251602]  [] ? check_preempt_wakeup+0xf0/0x134
> [  215.251602]  [] ? check_preempt_curr+0x21/0x59
> [  215.251602]  [] ? rpcauth_lookupcred+0x3f/0x47
> [  215.251602]  [] ? __kmalloc+0xa3/0xc4
> [  215.251602]  [] ? rpc_malloc+0x39/0x48
> [  215.251602]  [] ? call_bind+0x2d/0x2e
> [  215.251602]  [] ? __rpc_execute+0x5c/0x187
> [  215.251602]  [] ? rpc_run_task+0x55/0x5a
> [  215.251602]  [] ? rpc_call_sync+0x69/0x81
> [  215.251602]  [] ? nsm_mon_unmon+0x8c/0xa0
> [  215.251602]  [] ? nsm_unmonitor+0x5f/0xd3
> [  215.251602]  [] ? bdi_unregister+0xf2/0x100
> [  215.251602]  [] ? nlm_destroy_host_locked+0x4f/0x7c
> [  215.251602]  [] ? nlmclnt_release_host+0xd8/0xe5
> [  215.251602]  [] ? nlmclnt_done+0xc/0x14
> [  215.251602]  [] ? nfs_free_server+0x16/0x72
> [  215.251602]  [] ? deactivate_locked_super+0x26/0x37
> [  215.251602]  [] ? cleanup_mnt+0x40/0x59
> [  215.251602]  [] ? task_work_run+0x4f/0x5f
> [  215.251602]  [] ? do_exit+0x264/0x670
> [  215.251602]  [] ? __set_current_blocked+0xd/0xf
> [  215.251602]  [] ? sigprocmask+0x77/0x87
> [  215.251602]  [] ? __task_pid_nr_ns+0x3a/0x41
> [  215.251602]  [] ? find_vpid+0xd/0x17
> [  215.251602]  [] ? do_group_exit+0x2b/0x5d
> [  215.251602]  [] ? SyS_exit_group+0xf/0xf
> [  215.251602]  [] ? syscall_call+0x7/0x7
> [  215.251602] Code: 89 43 40 8b 44 24 04 89 43 18 8d 43 78 8b 53 40 89 43 3c 
> 8b 12 e8 32 ac 00 00 c7 03 01 00 00 00 a1 dc f0 42 c0 8b 80 00 03 00 00 <8b> 
> 70 04 83 c6 45 89 f0 e8 ab d1 eb ff 83 f8 20 7f 05 89 43 44
> [  215.251602] EIP: [] rpc_new_client+0x13b/0x1f2 SS:ESP 
> 0068:de1fbcac
> [  215.251602] CR2: 0004
> [  215.251602] ---[ end trace 4b9e971f6b3f2dc8 ]---
> [  215.251602] Kernel panic - not syncing: Fatal exception
> [  215.251602] Kernel Offset: 0x0 from 0xc010 (relocation range: 
> 0xc000-0xe07f)
> [  215.251602] Rebooting in 5 seconds..
>
> so clearly the bug is still there; also the connection I thought might
> exist with the "xprt_adjust_timeout: rq_timeout = 0!" was illusory: that
> message hadn't recurred since before the last reboot but one, but the
> crash happened anyway.
>

Hmm... I'm at a loss to see how rpcb_create can ever call
rpc_new_client() with a null value for the nodename with that patch
applied. Are you 100% sure that the above Oops came from a patched
kernel? That IP address of "rpc_new_client+0x13b/0x1f2" looks
identical to the one in your original posting.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-02-02 Thread Nix
On 31 Jan 2015, n...@esperi.org.uk told this:
> I'll let it run overnight and give it a reboot in the morning.

Alas, my latest reboot hit:

[  215.245158] BUG: unable to handle kernel NULL pointer dereference at 0004
[  215.251602] IP: [] rpc_new_client+0x13b/0x1f2
[  215.251602] *pde = 
[  215.251602] Oops:  [#1]
[  215.251602] CPU: 0 PID: 398 Comm: bash Not tainted 3.18.5+ #1
[  215.251602] task: de1fcfc0 ti: de1fa000 task.ti: de1fa000
[  215.251602] EIP: 0060:[] EFLAGS: 00010246 CPU: 0
[  215.251602] EIP is at rpc_new_client+0x13b/0x1f2
[  215.251602] EAX:  EBX: df70f000 ECX: bae0 EDX: 0005
[  215.251602] ESI:  EDI: df6cc000 EBP: 0007 ESP: de1fbcac
[  215.251602]  DS: 007b ES: 007b FS:  GS:  SS: 0068
[  215.251602] CR0: 8005003b CR2: 0004 CR3: 1f645000 CR4: 0090
[  215.251602] Stack:
[  215.251602]  00d0 df6cc000 de1fbd4c  de1fbd4c de1fbd4c de1fbd10 
de1fbf8c
[  215.251602]  c0350196 de1fbd4c de2056d0 de1fbd10 c0350300 c0262e32 c0263841 
df528000
[  215.251602]  a2e0b542 0006 c044b178  de1fbddc 0010 de2056d0 

[  215.251602] Call Trace:
[  215.251602]  [] ? rpc_create_xprt+0xc/0x74
[  215.251602]  [] ? rpc_create+0x102/0x10f
[  215.251602]  [] ? ata_sff_check_status+0x8/0x9
[  215.251602]  [] ? ata_dev_select.constprop.20+0x83/0x95
[  215.251602]  [] ? __block_commit_write.isra.25+0x56/0x7f
[  215.251602]  [] ? rpcb_create+0x6e/0x7c
[  215.251602]  [] ? rpcb_getport_async+0x124/0x25a
[  215.251602]  [] ? update_curr+0x81/0xb3
[  215.251602]  [] ? check_preempt_wakeup+0xf0/0x134
[  215.251602]  [] ? check_preempt_curr+0x21/0x59
[  215.251602]  [] ? rpcauth_lookupcred+0x3f/0x47
[  215.251602]  [] ? __kmalloc+0xa3/0xc4
[  215.251602]  [] ? rpc_malloc+0x39/0x48
[  215.251602]  [] ? call_bind+0x2d/0x2e
[  215.251602]  [] ? __rpc_execute+0x5c/0x187
[  215.251602]  [] ? rpc_run_task+0x55/0x5a
[  215.251602]  [] ? rpc_call_sync+0x69/0x81
[  215.251602]  [] ? nsm_mon_unmon+0x8c/0xa0
[  215.251602]  [] ? nsm_unmonitor+0x5f/0xd3
[  215.251602]  [] ? bdi_unregister+0xf2/0x100
[  215.251602]  [] ? nlm_destroy_host_locked+0x4f/0x7c
[  215.251602]  [] ? nlmclnt_release_host+0xd8/0xe5
[  215.251602]  [] ? nlmclnt_done+0xc/0x14
[  215.251602]  [] ? nfs_free_server+0x16/0x72
[  215.251602]  [] ? deactivate_locked_super+0x26/0x37
[  215.251602]  [] ? cleanup_mnt+0x40/0x59
[  215.251602]  [] ? task_work_run+0x4f/0x5f
[  215.251602]  [] ? do_exit+0x264/0x670
[  215.251602]  [] ? __set_current_blocked+0xd/0xf
[  215.251602]  [] ? sigprocmask+0x77/0x87
[  215.251602]  [] ? __task_pid_nr_ns+0x3a/0x41
[  215.251602]  [] ? find_vpid+0xd/0x17
[  215.251602]  [] ? do_group_exit+0x2b/0x5d
[  215.251602]  [] ? SyS_exit_group+0xf/0xf
[  215.251602]  [] ? syscall_call+0x7/0x7
[  215.251602] Code: 89 43 40 8b 44 24 04 89 43 18 8d 43 78 8b 53 40 89 43 3c 
8b 12 e8 32 ac 00 00 c7 03 01 00 00 00 a1 dc f0 42 c0 8b 80 00 03 00 00 <8b> 70 
04 83 c6 45 89 f0 e8 ab d1 eb ff 83 f8 20 7f 05 89 43 44
[  215.251602] EIP: [] rpc_new_client+0x13b/0x1f2 SS:ESP 0068:de1fbcac
[  215.251602] CR2: 0004
[  215.251602] ---[ end trace 4b9e971f6b3f2dc8 ]---
[  215.251602] Kernel panic - not syncing: Fatal exception
[  215.251602] Kernel Offset: 0x0 from 0xc010 (relocation range: 
0xc000-0xe07f)
[  215.251602] Rebooting in 5 seconds..

so clearly the bug is still there; also the connection I thought might
exist with the "xprt_adjust_timeout: rq_timeout = 0!" was illusory: that
message hadn't recurred since before the last reboot but one, but the
crash happened anyway.

-- 
NULL && (void)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-01-31 Thread Bruno Prémont
On Fri, 30 Jan 2015 18:49:21 Trond Myklebust  
wrote:
> On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
> > On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont
> >  wrote:
> > > On a system running home-brown container (mntns, utsns, pidns,
> > > netns) with NFS mount-point bind-mounted into the container I hit
> > > the following trace if nfs filesystem is first umount()ed in init
> > > ns and then later umounted from container when the container
> > > exists.
> > >
> >
> > We should rather change rpcb_create() to pass the nodename from the
> > parent. The point is that the rpc_clnt->cl_nodename is used in
> > various different contexts (for instance in the AUTH_SYS
> > credential) where it isn't always appropriate to have it set to an
> > empty string.
> 
> I was rather hoping that Bruno would fix up his patch and resend, but
> since other reports of the same bug are now surfacing... Please could
> you all check if something like the following patch fixes it.

With FOSDEM this weekend I've had no time to look into it.

Will test when home on Wednesday.

Thanks,
Bruno


> Thanks
>   Trond
> 
> 8<-
> From 87557df0ca2241da0edac558286650fdb081152c Mon Sep 17 00:00:00 2001
> From: Trond Myklebust 
> Date: Fri, 30 Jan 2015 18:12:28 -0500
> Subject: [PATCH] SUNRPC: NULL utsname dereference on NFS umount during
>  namespace cleanup
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Fix an Oopsable condition when nsm_mon_unmon is called as part of the
> namespace cleanup, which now apparently happens after the utsname
> has been freed.
> 
> Link: http://lkml.kernel.org/r/20150125220604.09012...@neptune.home
> Reported-by: Bruno Prémont 
> Cc: sta...@vger.kernel.org # 3.18
> Signed-off-by: Trond Myklebust 
> ---
>  fs/lockd/mon.c  | 13 +
>  include/linux/sunrpc/clnt.h |  3 ++-
>  net/sunrpc/clnt.c   | 12 +++-
>  net/sunrpc/rpcb_clnt.c  |  8 ++--
>  4 files changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
> index 1cc6ec51e6b1..47a32b6d9b90 100644
> --- a/fs/lockd/mon.c
> +++ b/fs/lockd/mon.c
> @@ -65,7 +65,7 @@ static inline struct sockaddr *nsm_addr(const
> struct nsm_handle *nsm) return (struct sockaddr *)&nsm->sm_addr;
>  }
>  
> -static struct rpc_clnt *nsm_create(struct net *net)
> +static struct rpc_clnt *nsm_create(struct net *net, const char
> *nodename) {
>   struct sockaddr_in sin = {
>   .sin_family = AF_INET,
> @@ -77,6 +77,7 @@ static struct rpc_clnt *nsm_create(struct net *net)
>   .address= (struct sockaddr *)&sin,
>   .addrsize   = sizeof(sin),
>   .servername = "rpc.statd",
> + .nodename   = nodename,
>   .program= &nsm_program,
>   .version= NSM_VERSION,
>   .authflavor = RPC_AUTH_NULL,
> @@ -102,7 +103,7 @@ out:
>   return clnt;
>  }
>  
> -static struct rpc_clnt *nsm_client_get(struct net *net)
> +static struct rpc_clnt *nsm_client_get(struct net *net, const char
> *nodename) {
>   struct rpc_clnt *clnt, *new;
>   struct lockd_net *ln = net_generic(net, lockd_net_id);
> @@ -111,7 +112,7 @@ static struct rpc_clnt *nsm_client_get(struct net
> *net) if (clnt != NULL)
>   goto out;
>  
> - clnt = new = nsm_create(net);
> + clnt = new = nsm_create(net, nodename);
>   if (IS_ERR(clnt))
>   goto out;
>  
> @@ -190,19 +191,23 @@ int nsm_monitor(const struct nlm_host *host)
>   struct nsm_res  res;
>   int status;
>   struct rpc_clnt *clnt;
> + const char *nodename = NULL;
>  
>   dprintk("lockd: nsm_monitor(%s)\n", nsm->sm_name);
>  
>   if (nsm->sm_monitored)
>   return 0;
>  
> + if (host->h_rpcclnt)
> + nodename = host->h_rpcclnt->cl_nodename;
> +
>   /*
>* Choose whether to record the caller_name or IP address of
>* this peer in the local rpc.statd's database.
>*/
>   nsm->sm_mon_name = nsm_use_hostnames ? nsm->sm_name :
> nsm->sm_addrbuf; 
> - clnt = nsm_client_get(host->net);
> + clnt = nsm_client_get(host->net, nodename);
>   if (IS_ERR(clnt)) {
>   status = PTR_ERR(clnt);
>   dprintk("lockd: failed to create NSM upcall
> 

Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-01-30 Thread Nix
On 30 Jan 2015, Trond Myklebust uttered the following:

> On Sun, 2015-01-25 at 16:55 -0500, Trond Myklebust wrote:
>> On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont
>>  wrote:
>> > On a system running home-brown container (mntns, utsns, pidns, netns)
>> > with NFS mount-point bind-mounted into the container I hit the following
>> > trace if nfs filesystem is first umount()ed in init ns and then later
>> > umounted from container when the container exists.

I'm not using containers, but I *am* extensively bind-mounting
NFS filesystems, which probably has the same net effect.

> I was rather hoping that Bruno would fix up his patch and resend, but
> since other reports of the same bug are now surfacing... Please could
> you all check if something like the following patch fixes it.

I have to wait for another of those xprt != 0 warnings, I think. I've
had a couple of clean reboots, but I had the occasional clean reboot
even before this patch.

I'll let it run overnight and give it a reboot in the morning.

-- 
NULL && (void)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-01-30 Thread Trond Myklebust
58223] Fixing recursive fault but reboot is needed!
>
> We should rather change rpcb_create() to pass the nodename from the
> parent. The point is that the rpc_clnt->cl_nodename is used in various
> different contexts (for instance in the AUTH_SYS credential) where it
> isn't always appropriate to have it set to an empty string.

I was rather hoping that Bruno would fix up his patch and resend, but
since other reports of the same bug are now surfacing... Please could
you all check if something like the following patch fixes it.

Thanks
  Trond

8<-
>From 87557df0ca2241da0edac558286650fdb081152c Mon Sep 17 00:00:00 2001
From: Trond Myklebust 
Date: Fri, 30 Jan 2015 18:12:28 -0500
Subject: [PATCH] SUNRPC: NULL utsname dereference on NFS umount during
 namespace cleanup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fix an Oopsable condition when nsm_mon_unmon is called as part of the
namespace cleanup, which now apparently happens after the utsname
has been freed.

Link: http://lkml.kernel.org/r/20150125220604.09012...@neptune.home
Reported-by: Bruno Prémont 
Cc: sta...@vger.kernel.org # 3.18
Signed-off-by: Trond Myklebust 
---
 fs/lockd/mon.c  | 13 +
 include/linux/sunrpc/clnt.h |  3 ++-
 net/sunrpc/clnt.c   | 12 +++-
 net/sunrpc/rpcb_clnt.c  |  8 ++--
 4 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c
index 1cc6ec51e6b1..47a32b6d9b90 100644
--- a/fs/lockd/mon.c
+++ b/fs/lockd/mon.c
@@ -65,7 +65,7 @@ static inline struct sockaddr *nsm_addr(const struct 
nsm_handle *nsm)
return (struct sockaddr *)&nsm->sm_addr;
 }
 
-static struct rpc_clnt *nsm_create(struct net *net)
+static struct rpc_clnt *nsm_create(struct net *net, const char *nodename)
 {
struct sockaddr_in sin = {
.sin_family = AF_INET,
@@ -77,6 +77,7 @@ static struct rpc_clnt *nsm_create(struct net *net)
.address= (struct sockaddr *)&sin,
.addrsize   = sizeof(sin),
.servername = "rpc.statd",
+   .nodename   = nodename,
.program= &nsm_program,
.version= NSM_VERSION,
.authflavor = RPC_AUTH_NULL,
@@ -102,7 +103,7 @@ out:
return clnt;
 }
 
-static struct rpc_clnt *nsm_client_get(struct net *net)
+static struct rpc_clnt *nsm_client_get(struct net *net, const char *nodename)
 {
struct rpc_clnt *clnt, *new;
struct lockd_net *ln = net_generic(net, lockd_net_id);
@@ -111,7 +112,7 @@ static struct rpc_clnt *nsm_client_get(struct net *net)
if (clnt != NULL)
goto out;
 
-   clnt = new = nsm_create(net);
+   clnt = new = nsm_create(net, nodename);
if (IS_ERR(clnt))
goto out;
 
@@ -190,19 +191,23 @@ int nsm_monitor(const struct nlm_host *host)
struct nsm_res  res;
int status;
struct rpc_clnt *clnt;
+   const char *nodename = NULL;
 
dprintk("lockd: nsm_monitor(%s)\n", nsm->sm_name);
 
if (nsm->sm_monitored)
return 0;
 
+   if (host->h_rpcclnt)
+   nodename = host->h_rpcclnt->cl_nodename;
+
/*
 * Choose whether to record the caller_name or IP address of
 * this peer in the local rpc.statd's database.
 */
nsm->sm_mon_name = nsm_use_hostnames ? nsm->sm_name : nsm->sm_addrbuf;
 
-   clnt = nsm_client_get(host->net);
+   clnt = nsm_client_get(host->net, nodename);
if (IS_ERR(clnt)) {
status = PTR_ERR(clnt);
dprintk("lockd: failed to create NSM upcall transport, "
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index d86acc63b25f..598ba80ec30c 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -57,7 +57,7 @@ struct rpc_clnt {
const struct rpc_timeout *cl_timeout;   /* Timeout strategy */
 
int cl_nodelen; /* nodename length */
-   charcl_nodename[UNX_MAXNODENAME];
+   charcl_nodename[UNX_MAXNODENAME+1];
struct rpc_pipe_dir_head cl_pipedir_objects;
struct rpc_clnt *   cl_parent;  /* Points to parent of clones */
struct rpc_rtt  cl_rtt_default;
@@ -112,6 +112,7 @@ struct rpc_create_args {
struct sockaddr *saddress;
const struct rpc_timeout *timeout;
const char  *servername;
+   const char  *nodename;
const struct rpc_program *program;
u32 prognumber; /* overrides program->number */
u32 version;
diff --git a

Re: [Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-01-25 Thread Trond Myklebust
On Sun, Jan 25, 2015 at 4:06 PM, Bruno Prémont
 wrote:
> On a system running home-brown container (mntns, utsns, pidns, netns)
> with NFS mount-point bind-mounted into the container I hit the following
> trace if nfs filesystem is first umount()ed in init ns and then later
> umounted from container when the container exists.
>
> [51397.767310] BUG: unable to handle kernel NULL pointer dereference at 
> 0008
> [51397.770671] IP: [] rpc_new_client+0x193/0x2b0
> [51397.773967] PGD 0
> [51397.777218] Oops:  [#1] SMP
> [51397.780490] Modules linked in:
> [51397.783751] CPU: 0 PID: 1711 Comm: image-starter Not tainted 
> 3.19.0-rc2-kvm+ #7
> [51397.787123] Hardware name: Gigabyte Technology Co., Ltd. 
> GA-A75M-UD2H/GA-A75M-UD2H, BIOS F6 09/28/2012
> [51397.790606] task: 8800c9fcbac0 ti: 8801fe754000 task.ti: 
> 8801fe754000
> [51397.794149] RIP: 0010:[]  [] 
> rpc_new_client+0x193/0x2b0
> [51397.797798] RSP: 0018:8801fe757908  EFLAGS: 00010246
> [51397.801444] RAX:  RBX: 88009dafb240 RCX: 
> 0180
> [51397.805174] RDX: bae0 RSI: 1770 RDI: 
> 88009dafb308
> [51397.808913] RBP: 8801fe757948 R08: 88009daf92d8 R09: 
> 88009dafb458
> [51397.812673] R10: 88009dafb458 R11: 88020ec15bc0 R12: 
> 8801fe757a40
> [51397.816456] R13: 81b9d800 R14: 8800c6e31030 R15: 
> 
> [51397.820270] FS:  7f335a3a1700() GS:88020ec0() 
> knlGS:f7287700
> [51397.824168] CS:  0010 DS:  ES:  CR0: 8005003b
> [51397.828066] CR2: 0008 CR3: 0001fc54d000 CR4: 
> 07f0
> [51397.832017] Stack:
> [51397.835924]  000a 81b9d770 81826450 
> 8801fe757a40
> [51397.840023]  8801fe757a40 8800cf08d500 81826450 
> 820f4728
> [51397.844130]  8801fe757978 81828815 8801fe757978 
> 8182aad8
> [51397.848224] Call Trace:
> [51397.852221]  [] ? call_start+0x20/0x20
> [51397.856273]  [] ? call_start+0x20/0x20
> [51397.860295]  [] rpc_create_xprt+0x15/0xb0
> [51397.864324]  [] ? xprt_create_transport+0x108/0x1b0
> [51397.868428]  [] rpc_create+0xc1/0x190
> [51397.872574]  [] ? internal_add_timer+0x66/0x80
> [51397.876733]  [] ? mod_timer+0x109/0x1e0
> [51397.880877]  [] rpcb_create+0x6e/0x90
> [51397.884999]  [] rpcb_getport_async+0x15a/0x330
> [51397.889118]  [] ? rpc_malloc+0x3a/0x70
> [51397.893240]  [] ? __kmalloc+0xc2/0x170
> [51397.897354]  [] ? call_reserveresult+0x110/0x110
> [51397.901490]  [] ? call_start+0x20/0x20
> [51397.905606]  [] ? call_start+0x20/0x20
> [51397.909662]  [] call_bind+0x3e/0x40
> [51397.913709]  [] __rpc_execute+0x79/0x330
> [51397.917778]  [] rpc_execute+0x5d/0xa0
> [51397.921871]  [] rpc_run_task+0x6b/0x90
> [51397.925989]  [] rpc_call_sync+0x3e/0xa0
> [51397.930108]  [] nsm_mon_unmon+0xb9/0xd0
> [51397.934191]  [] ? call_rcu_bh+0x20/0x20
> [51397.938235]  [] nsm_unmonitor+0x8c/0x140
> [51397.942309]  [] nlm_destroy_host_locked+0x63/0xa0
> [51397.946442]  [] nlmclnt_release_host+0x7c/0x130
> [51397.950591]  [] nlmclnt_done+0x15/0x30
> [51397.954773]  [] nfs_destroy_server+0x12/0x20
> [51397.958934]  [] nfs_free_server+0x22/0xa0
> [51397.963053]  [] nfs_kill_super+0x1d/0x30
> [51397.967158]  [] deactivate_locked_super+0x4c/0x70
> [51397.971286]  [] deactivate_super+0x49/0x70
> [51397.975398]  [] cleanup_mnt+0x3e/0x90
> [51397.979499]  [] __cleanup_mnt+0xd/0x10
> [51397.983598]  [] task_work_run+0xbc/0xe0
> [51397.987697]  [] do_exit+0x295/0xaf0
> [51397.991812]  [] ? fput+0x9/0x10
> [51397.995937]  [] ? task_work_run+0xa4/0xe0
> [51398.70]  [] do_group_exit+0x3a/0xa0
> [51398.004201]  [] SyS_exit_group+0xf/0x10
> [51398.008315]  [] system_call_fastpath+0x12/0x17
> [51398.012438] Code: 43 78 48 8d bb c8 00 00 00 48 89 7b 70 48 8b 30 e8 63 2d 
> 01 00 c7 03 01 00 00 00 65 48 8b 04 25 00 aa 00 00 48 8b 80 c0 09 00 00 <4c> 
> 8b 68 08 49 83 c5 45 4
> [51398.022378] RIP  [] rpc_new_client+0x193/0x2b0
> [51398.026732]  RSP 
> [51398.031025] CR2: 0008
> [51398.035326] ---[ end trace b701b037bc457620 ]---
> [51398.058223] Fixing recursive fault but reboot is needed!
>
>
> The code executed in rpc_new_client() tries to dereference the
> struct new_utsname * returned by utsname() which has already been
> released at this time.
>
> Cc:  # 3.18
> Signed-off-by: Bruno Prémont 
> ---
> I've seen this trace on 3.18.x as well as 3.19-rc.
>
>
> This patch fixes the NULL dereference but I'm not sure this is the
> right fix.
> Should init uts_ns be referred to or what is the impact of using a random
> string to satisfy rpc_clnt_set_nodename()?
>
> Thanks,
> Bruno
> ---
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 05da12a..0246e94 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -365,6 +365,7 @@ static struct rpc_clnt * rpc_new_client(const struct 
> rpc_create_args *args,
> const struct rpc_version *version;

[Patch] sunrpc: NULL utsname dereference on NFS umount during namespace cleanup

2015-01-25 Thread Bruno Prémont
On a system running home-brown container (mntns, utsns, pidns, netns)
with NFS mount-point bind-mounted into the container I hit the following
trace if nfs filesystem is first umount()ed in init ns and then later
umounted from container when the container exists.

[51397.767310] BUG: unable to handle kernel NULL pointer dereference at 
0008
[51397.770671] IP: [] rpc_new_client+0x193/0x2b0
[51397.773967] PGD 0 
[51397.777218] Oops:  [#1] SMP 
[51397.780490] Modules linked in:
[51397.783751] CPU: 0 PID: 1711 Comm: image-starter Not tainted 3.19.0-rc2-kvm+ 
#7
[51397.787123] Hardware name: Gigabyte Technology Co., Ltd. 
GA-A75M-UD2H/GA-A75M-UD2H, BIOS F6 09/28/2012
[51397.790606] task: 8800c9fcbac0 ti: 8801fe754000 task.ti: 
8801fe754000
[51397.794149] RIP: 0010:[]  [] 
rpc_new_client+0x193/0x2b0
[51397.797798] RSP: 0018:8801fe757908  EFLAGS: 00010246
[51397.801444] RAX:  RBX: 88009dafb240 RCX: 0180
[51397.805174] RDX: bae0 RSI: 1770 RDI: 88009dafb308
[51397.808913] RBP: 8801fe757948 R08: 88009daf92d8 R09: 88009dafb458
[51397.812673] R10: 88009dafb458 R11: 88020ec15bc0 R12: 8801fe757a40
[51397.816456] R13: 81b9d800 R14: 8800c6e31030 R15: 
[51397.820270] FS:  7f335a3a1700() GS:88020ec0() 
knlGS:f7287700
[51397.824168] CS:  0010 DS:  ES:  CR0: 8005003b
[51397.828066] CR2: 0008 CR3: 0001fc54d000 CR4: 07f0
[51397.832017] Stack:
[51397.835924]  000a 81b9d770 81826450 
8801fe757a40
[51397.840023]  8801fe757a40 8800cf08d500 81826450 
820f4728
[51397.844130]  8801fe757978 81828815 8801fe757978 
8182aad8
[51397.848224] Call Trace:
[51397.852221]  [] ? call_start+0x20/0x20
[51397.856273]  [] ? call_start+0x20/0x20
[51397.860295]  [] rpc_create_xprt+0x15/0xb0
[51397.864324]  [] ? xprt_create_transport+0x108/0x1b0
[51397.868428]  [] rpc_create+0xc1/0x190
[51397.872574]  [] ? internal_add_timer+0x66/0x80
[51397.876733]  [] ? mod_timer+0x109/0x1e0
[51397.880877]  [] rpcb_create+0x6e/0x90
[51397.884999]  [] rpcb_getport_async+0x15a/0x330
[51397.889118]  [] ? rpc_malloc+0x3a/0x70
[51397.893240]  [] ? __kmalloc+0xc2/0x170
[51397.897354]  [] ? call_reserveresult+0x110/0x110
[51397.901490]  [] ? call_start+0x20/0x20
[51397.905606]  [] ? call_start+0x20/0x20
[51397.909662]  [] call_bind+0x3e/0x40
[51397.913709]  [] __rpc_execute+0x79/0x330
[51397.917778]  [] rpc_execute+0x5d/0xa0
[51397.921871]  [] rpc_run_task+0x6b/0x90
[51397.925989]  [] rpc_call_sync+0x3e/0xa0
[51397.930108]  [] nsm_mon_unmon+0xb9/0xd0
[51397.934191]  [] ? call_rcu_bh+0x20/0x20
[51397.938235]  [] nsm_unmonitor+0x8c/0x140
[51397.942309]  [] nlm_destroy_host_locked+0x63/0xa0
[51397.946442]  [] nlmclnt_release_host+0x7c/0x130
[51397.950591]  [] nlmclnt_done+0x15/0x30
[51397.954773]  [] nfs_destroy_server+0x12/0x20
[51397.958934]  [] nfs_free_server+0x22/0xa0
[51397.963053]  [] nfs_kill_super+0x1d/0x30
[51397.967158]  [] deactivate_locked_super+0x4c/0x70
[51397.971286]  [] deactivate_super+0x49/0x70
[51397.975398]  [] cleanup_mnt+0x3e/0x90
[51397.979499]  [] __cleanup_mnt+0xd/0x10
[51397.983598]  [] task_work_run+0xbc/0xe0
[51397.987697]  [] do_exit+0x295/0xaf0
[51397.991812]  [] ? fput+0x9/0x10
[51397.995937]  [] ? task_work_run+0xa4/0xe0
[51398.70]  [] do_group_exit+0x3a/0xa0
[51398.004201]  [] SyS_exit_group+0xf/0x10
[51398.008315]  [] system_call_fastpath+0x12/0x17
[51398.012438] Code: 43 78 48 8d bb c8 00 00 00 48 89 7b 70 48 8b 30 e8 63 2d 
01 00 c7 03 01 00 00 00 65 48 8b 04 25 00 aa 00 00 48 8b 80 c0 09 00 00 <4c> 8b 
68 08 49 83 c5 45 4
[51398.022378] RIP  [] rpc_new_client+0x193/0x2b0
[51398.026732]  RSP 
[51398.031025] CR2: 0008
[51398.035326] ---[ end trace b701b037bc457620 ]---
[51398.058223] Fixing recursive fault but reboot is needed!


The code executed in rpc_new_client() tries to dereference the
struct new_utsname * returned by utsname() which has already been
released at this time.

Cc:  # 3.18
Signed-off-by: Bruno Prémont 
---
I've seen this trace on 3.18.x as well as 3.19-rc.


This patch fixes the NULL dereference but I'm not sure this is the
right fix.
Should init uts_ns be referred to or what is the impact of using a random
string to satisfy rpc_clnt_set_nodename()?

Thanks,
Bruno
---
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 05da12a..0246e94 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -365,6 +365,7 @@ static struct rpc_clnt * rpc_new_client(const struct 
rpc_create_args *args,
const struct rpc_version *version;
struct rpc_clnt *clnt = NULL;
const struct rpc_timeout *timeout;
+   const struct new_utsname *uts_name;
int err;
 
/* sanity check the name before trying to print it */
@@ -421,7 +422,8 @@ static struct rpc_clnt * rpc_new_client(const struct 
rpc_cr