I spent some more time looking at this today...

On Fri, Nov 23, 2018 at 06:05:25PM +0000, Will Deacon wrote:
> Doing some more debugging, it looks like the usual failure case is where
> one CPU clears the inode field in the dentry via:
> 
>       devpts_pty_kill()
>               -> d_delete()   // dentry->d_lockref.count == 1
>                       -> dentry_unlink_inode()
> 
> whilst another CPU gets a pointer to the dentry via:
> 
>       sys_getdents64()
>               -> iterate_dir()
>                       -> dcache_readdir()
>                               -> next_positive()
> 
> and explodes on the subsequent inode dereference when trying to pass the
> inode number to dir_emit():
> 
>       if (!dir_emit(..., d_inode(next)->i_ino, ...))
> 
> Indeed, the hack below triggers a warning, indicating that the inode
> is being cleared concurrently.
> 
> I can't work out whether the getdents64() path should hold a refcount
> to stop d_delete() in its tracks, or whether devpts_pty_kill() shouldn't
> be calling d_delete() like this at all.

So the issue is that opening /dev/pts/ptmx creates a new pty in /dev/pts,
which disappears when you close /dev/pts/ptmx. Consequently, when we tear
down the dentry for the magic new file, we have to take the inode rwsem of
the *parent* so that concurrent path walkers don't trip over it whilst it's
being freed. I wrote a simple concurrent program to getdents(/dev/pts/) in
one thread, whilst another opens and closes /dev/pts/ptmx: it crashes the
kernel in seconds.
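
A reproducer along those lines could be sketched roughly as follows. This
is an illustrative sketch only, not the exact program that was run: the
run_race() helper, its parameters and the use of opendir()/readdir() (which
end up in getdents64() on Linux with glibc) are assumptions made for the
example.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical helper state; not taken from the original test program. */
struct race_args {
	const char *dir;	/* directory to list repeatedly */
	atomic_int stop;	/* set by the opener once it is done */
};

/* Reader thread: list the directory in a tight loop until told to stop. */
static void *reader(void *p)
{
	struct race_args *a = p;

	while (!atomic_load(&a->stop)) {
		DIR *d = opendir(a->dir);

		if (!d)
			break;
		while (readdir(d))
			;	/* only the directory walk matters here */
		closedir(d);
	}
	return NULL;
}

/*
 * Open and close 'node' 'iterations' times whilst a second thread lists
 * 'dir'. Returns 0 on success, -1 if the thread or an open() failed.
 */
int run_race(const char *dir, const char *node, int iterations)
{
	struct race_args a = { .dir = dir };
	pthread_t t;
	int i, ret = 0;

	atomic_init(&a.stop, 0);
	if (pthread_create(&t, NULL, reader, &a))
		return -1;

	for (i = 0; i < iterations; i++) {
		int fd = open(node, O_RDWR | O_NOCTTY);

		if (fd < 0) {
			ret = -1;
			break;
		}
		close(fd);	/* closing the master tears the pty down */
	}

	atomic_store(&a.stop, 1);
	pthread_join(t, NULL);
	return ret;
}
```

To exercise the path described above, one would call something like
run_race("/dev/pts", "/dev/pts/ptmx", 1000000) and build with -pthread.
Only do that on a disposable machine, since an affected kernel is expected
to crash.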

Patch below, but I'd still like somebody else to look at this, please.

Will

--->8

diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c
index c53814539070..50ddb95ff84c 100644
--- a/fs/devpts/inode.c
+++ b/fs/devpts/inode.c
@@ -619,11 +619,17 @@ void *devpts_get_priv(struct dentry *dentry)
  */
 void devpts_pty_kill(struct dentry *dentry)
 {
-       WARN_ON_ONCE(dentry->d_sb->s_magic != DEVPTS_SUPER_MAGIC);
+       struct super_block *sb = dentry->d_sb;
+       struct dentry *parent = sb->s_root;
 
+       WARN_ON_ONCE(sb->s_magic != DEVPTS_SUPER_MAGIC);
+
+       inode_lock(parent->d_inode);
        dentry->d_fsdata = NULL;
        drop_nlink(dentry->d_inode);
        d_delete(dentry);
+       inode_unlock(parent->d_inode);
+
        dput(dentry);   /* d_alloc_name() in devpts_pty_new() */
 }
 
