Re: [PATCH v3 1/2] fs: don't let getdents return bogus names

2019-01-14 Thread Dave Chinner
On Mon, Jan 14, 2019 at 07:23:17PM +0100, Jann Horn wrote:
> When you e.g. run `find` on a directory for which getdents returns
> "filenames" that contain slashes, `find` passes those "filenames" back to
> the kernel, which then interprets them as paths. That could conceivably
> cause userspace to do something bad when accessing something like an
> untrusted USB stick, but I'm not aware of any specific example.
> 
> Instead of returning bogus filenames to userspace, return -EUCLEAN.

Please don't use EUCLEAN directly to indicate filesystem corruption
directly.  If we want to indicate that the filesystem is corrupted,
please hoist the multiple XFS/ext4 definitions of:

#define EFSCORRUPTED EUCLEAN

up into include/uapi/asm-generic/errno.h and then use EFSCORRUPTED
in all the places where we want to indicate to userspace that the
filesystem is corrupted. That tells both the code reader and the
userspace developers that it's a corruption error and puts context
to the "structure needs cleaning" text that goes along with it...

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


[PATCH v3 1/2] fs: don't let getdents return bogus names

2019-01-14 Thread Jann Horn
When you e.g. run `find` on a directory for which getdents returns
"filenames" that contain slashes, `find` passes those "filenames" back to
the kernel, which then interprets them as paths. That could conceivably
cause userspace to do something bad when accessing something like an
untrusted USB stick, but I'm not aware of any specific example.

Instead of returning bogus filenames to userspace, return -EUCLEAN.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: sta...@vger.kernel.org
Signed-off-by: Jann Horn 

I ordered this fix before the refactoring one so that it can easily be
backported.
---
Bringing that half-year-old patch back to life...

changed in v2:
 - move bogus_dirent_name() out of the #ifdef (kbuild test robot)
changed in v3:
 - change calling convention (Al Viro)
 - comment fix

 arch/alpha/kernel/osf_sys.c |  4 
 fs/readdir.c| 35 +++
 include/linux/fs.h  |  2 ++
 3 files changed, 41 insertions(+)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index 792586038808..e60f8e72591b 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -117,6 +118,9 @@ osf_filldir(struct dir_context *ctx, const char *name, int 
namlen,
unsigned int reclen = ALIGN(NAME_OFFSET + namlen + 1, sizeof(u32));
unsigned int d_ino;
 
+   buf->error = check_dirent_name(name, namlen);
+   if (unlikely(buf->error))
+   return -EUCLEAN;
buf->error = -EINVAL;   /* only used if we fail */
if (reclen > buf->count)
return -EINVAL;
diff --git a/fs/readdir.c b/fs/readdir.c
index 2f6a4534e0df..102b0c86a97f 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -64,6 +64,26 @@ int iterate_dir(struct file *file, struct dir_context *ctx)
 }
 EXPORT_SYMBOL(iterate_dir);
 
+/*
+ * Most filesystems don't filter out bogus directory entry names, and userspace
+ * can get very confused by such names. Behave as if a filesystem error had
+ * happened while reading directory entries.
+ */
+int check_dirent_name(const char *name, int namlen)
+{
+   if (namlen == 0) {
+   pr_err_once("%s: filesystem returned bogus empty name\n",
+   __func__);
+   return -EUCLEAN;
+   }
+   if (memchr(name, '/', namlen)) {
+   pr_err_once("%s: filesystem returned bogus name '%*pEhp' 
(contains slash)\n",
+   __func__, namlen, name);
+   return -EUCLEAN;
+   }
+   return 0;
+}
+
 /*
  * Traditional linux readdir() handling..
  *
@@ -98,6 +118,9 @@ static int fillonedir(struct dir_context *ctx, const char 
*name, int namlen,
 
if (buf->result)
return -EINVAL;
+   buf->result = check_dirent_name(name, namlen);
+   if (unlikely(buf->result))
+   return -EUCLEAN;
d_ino = ino;
if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) {
buf->result = -EOVERFLOW;
@@ -173,6 +196,9 @@ static int filldir(struct dir_context *ctx, const char 
*name, int namlen,
int reclen = ALIGN(offsetof(struct linux_dirent, d_name) + namlen + 2,
sizeof(long));
 
+   buf->error = check_dirent_name(name, namlen);
+   if (unlikely(buf->error))
+   return -EUCLEAN;
buf->error = -EINVAL;   /* only used if we fail.. */
if (reclen > buf->count)
return -EINVAL;
@@ -259,6 +285,9 @@ static int filldir64(struct dir_context *ctx, const char 
*name, int namlen,
int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
sizeof(u64));
 
+   buf->error = check_dirent_name(name, namlen);
+   if (unlikely(buf->error))
+   return -EUCLEAN;
buf->error = -EINVAL;   /* only used if we fail.. */
if (reclen > buf->count)
return -EINVAL;
@@ -358,6 +387,9 @@ static int compat_fillonedir(struct dir_context *ctx, const 
char *name,
 
if (buf->result)
return -EINVAL;
+   buf->result = check_dirent_name(name, namlen);
+   if (unlikely(buf->result))
+   return -EUCLEAN;
d_ino = ino;
if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) {
buf->result = -EOVERFLOW;
@@ -427,6 +459,9 @@ static int compat_filldir(struct dir_context *ctx, const 
char *name, int namlen,
int reclen = ALIGN(offsetof(struct compat_linux_dirent, d_name) +
namlen + 2, sizeof(compat_long_t));
 
+   buf->error = check_dirent_name(name, namlen);
+   if (unlikely(buf->error))
+   return -EUCLEAN;
buf->error = -EINVAL;   /* only used if we fail.. */
if (reclen > buf->count)
return -EINVAL;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 811c77743dad..e14329741e3a 100644
--- a/include/linux/fs.h
+++ 

[PATCH v3 2/2] fs: let filldir_t return bool instead of an error code

2019-01-14 Thread Jann Horn
As Al Viro pointed out, many filldir_t functions return error codes, but
all callers of filldir_t functions just check whether the return value is
non-zero (to determine whether to continue reading the directory); more
precise errors have to be signalled via struct dir_context.
Change all filldir_t functions to return bool instead of int.

Suggested-by: Al Viro 
Signed-off-by: Jann Horn 
---
added in v3.

I ordered this change after the patch that adds the error check so that
the error check can be backported more easily.

 arch/alpha/kernel/osf_sys.c | 12 +++
 fs/afs/dir.c| 30 +
 fs/ecryptfs/file.c  | 13 
 fs/exportfs/expfs.c |  8 ++---
 fs/fat/dir.c|  8 ++---
 fs/gfs2/export.c|  6 ++--
 fs/nfsd/nfs4recover.c   |  8 ++---
 fs/nfsd/vfs.c   |  6 ++--
 fs/ocfs2/dir.c  | 10 +++---
 fs/ocfs2/journal.c  | 14 
 fs/overlayfs/readdir.c  | 24 +++---
 fs/readdir.c| 64 ++---
 fs/reiserfs/xattr.c | 20 ++--
 fs/xfs/scrub/dir.c  |  8 ++---
 fs/xfs/scrub/parent.c   |  4 +--
 include/linux/fs.h  | 10 +++---
 16 files changed, 125 insertions(+), 120 deletions(-)

diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c
index e60f8e72591b..14e5ae0dac50 100644
--- a/arch/alpha/kernel/osf_sys.c
+++ b/arch/alpha/kernel/osf_sys.c
@@ -108,7 +108,7 @@ struct osf_dirent_callback {
int error;
 };
 
-static int
+static bool
 osf_filldir(struct dir_context *ctx, const char *name, int namlen,
loff_t offset, u64 ino, unsigned int d_type)
 {
@@ -120,14 +120,14 @@ osf_filldir(struct dir_context *ctx, const char *name, 
int namlen,
 
buf->error = check_dirent_name(name, namlen);
if (unlikely(buf->error))
-   return -EUCLEAN;
+   return false;
buf->error = -EINVAL;   /* only used if we fail */
if (reclen > buf->count)
-   return -EINVAL;
+   return false;
d_ino = ino;
if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) {
buf->error = -EOVERFLOW;
-   return -EOVERFLOW;
+   return false;
}
if (buf->basep) {
if (put_user(offset, buf->basep))
@@ -144,10 +144,10 @@ osf_filldir(struct dir_context *ctx, const char *name, 
int namlen,
dirent = (void __user *)dirent + reclen;
buf->dirent = dirent;
buf->count -= reclen;
-   return 0;
+   return true;
 Efault:
buf->error = -EFAULT;
-   return -EFAULT;
+   return false;
 }
 
 SYSCALL_DEFINE4(osf_getdirentries, unsigned int, fd,
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 8a2562e3a316..84d74cc25127 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -26,10 +26,12 @@ static int afs_dir_open(struct inode *inode, struct file 
*file);
 static int afs_readdir(struct file *file, struct dir_context *ctx);
 static int afs_d_revalidate(struct dentry *dentry, unsigned int flags);
 static int afs_d_delete(const struct dentry *dentry);
-static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name, 
int nlen,
- loff_t fpos, u64 ino, unsigned dtype);
-static int afs_lookup_filldir(struct dir_context *ctx, const char *name, int 
nlen,
- loff_t fpos, u64 ino, unsigned dtype);
+static bool afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
+ int nlen, loff_t fpos, u64 ino,
+ unsigned int dtype);
+static bool afs_lookup_filldir(struct dir_context *ctx, const char *name,
+ int nlen, loff_t fpos, u64 ino,
+ unsigned int dtype);
 static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
  bool excl);
 static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode);
@@ -493,7 +495,7 @@ static int afs_readdir(struct file *file, struct 
dir_context *ctx)
  * - if afs_dir_iterate_block() spots this function, it'll pass the FID
  *   uniquifier through dtype
  */
-static int afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
+static bool afs_lookup_one_filldir(struct dir_context *ctx, const char *name,
  int nlen, loff_t fpos, u64 ino, unsigned 
dtype)
 {
struct afs_lookup_one_cookie *cookie =
@@ -509,16 +511,16 @@ static int afs_lookup_one_filldir(struct dir_context 
*ctx, const char *name,
 
if (cookie->name.len != nlen ||
memcmp(cookie->name.name, name, nlen) != 0) {
-   _leave(" = 0 [no]");
-   return 0;
+   _leave(" = true [no]");
+   return true;
}
 
cookie->fid.vnode = ino;
cookie->fid.unique = dtype;
cookie->found = 1;
 
-   _leave(" = -1 

Re: [PATCH 14/15] arch: add split IPC system calls where needed

2019-01-14 Thread Michael Ellerman
Michael Ellerman  writes:
> Hi Arnd,
>
> Arnd Bergmann  writes:
>> The IPC system call handling is highly inconsistent across architectures,
>> some use sys_ipc, some use separate calls, and some use both.  We also
>> have some architectures that require passing IPC_64 in the flags, and
>> others that set it implicitly.
...
>
> We already have a gap at 366-377 from when we tried to add the split IPC
> calls a few years back.
>
> I guess I don't mind leaving that gap and using the common numbers as
> you've done here.
>
> But it would be good to add a comment pointing out that we have room
> at 366 for more arch specific syscalls as well.
>
> cheers

Guess I sent that one twice. 臘

cheers


Re: [PATCH 15/15] arch: add pkey and rseq syscall numbers everywhere

2019-01-14 Thread Heiko Carstens
On Fri, Jan 11, 2019 at 06:30:43PM +0100, Arnd Bergmann wrote:
> On Thu, Jan 10, 2019 at 9:36 PM Heiko Carstens
>  wrote:
> > On Thu, Jan 10, 2019 at 05:24:35PM +0100, Arnd Bergmann wrote:
> 
> > Since you only need/want the system call numbers, could you please
> > change these lines to:
> >
> > > +384  common  pkey_alloc  -   -
> > > +385  common  pkey_free   -   -
> > > +386  common  pkey_mprotect   -   -
> >
> > Otherwise it _looks_ like we would need compat wrappers here as well,
> > even though all of them would just jump to sys_ni_syscall() in this
> > case. Making this explicit seems to better.
> 
> Ok, fair enough. I considered doing this originally and then
> decided against it for consistency with the asm-generic file,
> but I don't care much either way.
> 
> Is this something you may want to add later? I'm not sure exactly
> how pkey compares to s390 storage keys, or if this is something
> completely unrelated.

I don't think pkeys will ever work on s390, since they require a key
per mapping, while the s390 storage keys are per physical page.