Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-10-10 Thread Arthur Gautier
On Thu, Sep 26, 2019 at 04:09:39PM +0200, Borislav Petkov wrote:
> On Thu, Sep 26, 2019 at 09:58:25AM +, Arthur Gautier wrote:
> > I think Andy submitted a patch Feb 25 2019, but I was not copied on it
> > (I believe it was sent to x...@kernel.org) and I don't know which fate it
> > had.
> 
> I guess we're still waiting for Andy to do v2 with feedback incorporated
> and proper commit message. :)

Hello All,

I received neither the patch Andy provided nor the comments made on it,
but I'd be happy to help and/or take over fixing this if someone could
send me both.

I'd really like to have those problems fixed.

Thanks!

-- 
\o/ Arthur
 G  Gandi.net


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-09-26 Thread Borislav Petkov
On Thu, Sep 26, 2019 at 09:58:25AM +, Arthur Gautier wrote:
> I think Andy submitted a patch Feb 25 2019, but I was not copied on it
> (I believe it was sent to x...@kernel.org) and I don't know which fate it
> had.

I guess we're still waiting for Andy to do v2 with feedback incorporated
and proper commit message. :)

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-09-26 Thread Borislav Petkov
On Mon, Feb 18, 2019 at 11:15:44AM -0800, Andy Lutomirski wrote:
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index 58eacd41526c..709d6efe0d42 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -10,12 +10,7 @@
>  #include 
>  #include 
>  
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> -#define IS_UNALIGNED(src, dst)   0
> -#else
> -#define IS_UNALIGNED(src, dst)   \
> - (((long) dst | (long) src) & (sizeof(long) - 1))
> -#endif
> +#define IS_UNALIGNED(addr) (((long)(addr)) & (sizeof(long) - 1))
>  
>  /*
>   * Do a strncpy, return length of string without final '\0'.
> @@ -35,14 +30,39 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
>   if (max > count)
>   max = count;
>  
> - if (IS_UNALIGNED(src, dst))
> + /*
> +  * First handle any unaligned prefix of src.
> +  */
> + while (max && IS_UNALIGNED(src+res)) {

put spaces around the '+', below too.

> + char c;
> +
> + unsafe_get_user(c, src+res, efault);
> + dst[res] = c;
> + if (!c)
> + return res;
> + res++;
> + max--;
> + }
> +
> + /*
> +  * Now we know that src + res is aligned.  If dst is unaligned and
> +  * we don't have efficient unaligned access, then keep going one
> +  * byte at a time.  (This could be optimized, but it would make
> +  * the code more complicated.
> +  */
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> + if (IS_UNALIGNED(dst + res))
>   goto byte_at_a_time;
> +#endif
>  
>   while (max >= sizeof(unsigned long)) {
> + /*
> +  * src + res is aligned, so the reads in this loop will
> +  * not cross a page boundary.
> +  */
>   unsigned long c, data;
>  
> - /* Fall back to byte-at-a-time if we get a page fault */
> - unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
> + unsafe_get_user(c, (unsigned long __user *)(src+res), efault);
>  
>   *(unsigned long *)(dst+res) = c;
>   if (has_zero(c, &data, &constants)) {
> @@ -54,7 +74,9 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
>   max -= sizeof(unsigned long);
>   }
>  
> -byte_at_a_time:

You can't remove that label - the #ifndef above still jumps to it.

> + /*
> +  * Finish the job one byte at a time.
> +  */
>   while (max) {
>   char c;

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-09-26 Thread Arthur Gautier
On Mon, Feb 18, 2019 at 09:51:50PM +, Arthur Gautier wrote:
> On Mon, Feb 18, 2019 at 11:15:44AM -0800, Andy Lutomirski wrote:
> > This seems like it's just papering over the underlying problem: with
> > Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
> > the patch below look decent?  It's only compile-tested, but it's
> > conceptually straightforward.  I was hoping I could get rid of the
> > check-maximum-address stuff, but it's needed for architectures where
> > the user range is adjacent to the kernel range (i.e. not x86_64).
> 
> I'm unable to trigger the BUG I had with my initramfs with this patch
> applied. Thanks!
> 

Hello All,

Just a follow-up on this issue: I'm still able to reproduce the original
issue with:
truncate -s 8388313 a

SECONDFILENAME=
truncate -s 10 $SECONDFILENAME
echo "a\n$SECONDFILENAME" | cpio -o --format=newc | lz4 -l > initrd.img.lz4

I think Andy submitted a patch Feb 25 2019, but I was not copied on it
(I believe it was sent to x...@kernel.org) and I don't know which fate it
had.

Any chance we could have a look again?

Thanks a lot!

-- 
\o/ Arthur
 G  Gandi.net


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-18 Thread Arthur Gautier
On Mon, Feb 18, 2019 at 11:15:44AM -0800, Andy Lutomirski wrote:
> This seems like it's just papering over the underlying problem: with
> Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
> the patch below look decent?  It's only compile-tested, but it's
> conceptually straightforward.  I was hoping I could get rid of the
> check-maximum-address stuff, but it's needed for architectures where
> the user range is adjacent to the kernel range (i.e. not x86_64).

I'm unable to trigger the BUG I had with my initramfs with this patch
applied. Thanks!

-- 
\o/ Arthur
 G  Gandi.net


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-18 Thread Jann Horn
On Mon, Feb 18, 2019 at 8:15 PM Andy Lutomirski  wrote:
> On Mon, Feb 18, 2019 at 5:04 AM Thomas Gleixner  wrote:
> > > Another would be to have the buffer passed to flush_buffer() (i.e.
> > > the callback of decompress_fn) allocated with 4 bytes of padding
> > > past the part where the unpacked piece of data is placed for the
> > > callback to find.  As in,
> > >
> > > diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> > > index 63b4b7eee138..ca3f7ecc9b35 100644
> > > --- a/lib/decompress_inflate.c
> > > +++ b/lib/decompress_inflate.c
> > > @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
> > >   rc = -1;
> > >   if (flush) {
> > >   out_len = 0x8000; /* 32 K */
> > > - out_buf = malloc(out_len);
> > > + out_buf = malloc(out_len + 4);
> >
> >   +8 actually.
> >
> > >   } else {
> > >   if (!out_len)
> > >   out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> > >
> > > for gunzip/decompress and similar ones for bzip2, etc.  The contents
> > > layout doesn't have anything to do with that...
> >
> > Right. That works nicely.
> >
>
> This seems like it's just papering over the underlying problem: with
> Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
> the patch below look decent?  It's only compile-tested, but it's
> conceptually straightforward.

Nit: The "This could be optimized" comment has an unbalanced parenthesis.
Nit: Maybe switch the order of "max" and "IS_UNALIGNED(src+res)" in
the loop for unaligned prefix, or tell the compiler that "max" is
likely to be non-zero? That might be a tiny bit more efficient in the
aligned case? But I don't know much about making code fast, so feel
free to ignore this.

I think there is still similar code in some other places in arch/x86;
back when I did the conversion to _ASM_EXTABLE_UA, I stumbled over a
few similar-looking things. In particular:

load_unaligned_zeropad() (used in some VFS code that looks pretty
performance-critical) still uses _ASM_EXTABLE for reading from kernel
memory; it has a comment saying:
/*
 * Load an unaligned word from kernel space.
 *
 * In the (very unlikely) case of the word being a page-crosser
 * and the next page not being mapped, take the exception and
 * return zeroes in the non-existing part.
 */

csum_partial_copy_generic() uses PREFETCHT0 together with the
following macro; I think this can be used on both kernel and user
addresses?
/*
 * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
 * potentially unmapped kernel address.
 */
.macro ignore L=.Lignore
30:
_ASM_EXTABLE(30b, \L)
.endm
(That comment says "kernel address", but I wrote that, and I think
what I really meant is "kernel or userspace address".)

> I was hoping I could get rid of the
> check-maximum-address stuff, but it's needed for architectures where
> the user range is adjacent to the kernel range (i.e. not x86_64).

> Jann, I'm still unhappy that this code will write up to sizeof(long)-1
> user-controlled garbage bytes in-bounds past the null-terminator in
> the kernel buffer.  Do you think that's worth changing, too?  I don't
> think it's a bug per se, but it seems like a nifty little wart for an
> attacker to try to abuse.

Hm. I guess it might be interesting if some code path first memsets a
kernel buffer to all-zeroes, then uses strncpy_from_user() to copy
into it, and then exposes the entire buffer to a different user task
(or, if strncpy_from_user() was called on kernel memory, to any user
task).

And if the source is kernel memory (might happen e.g. via splice,
there are VFS write handlers that use strncpy_from_user()), it could
potentially be used to make an uninitialized memory read bug
(constrained to a specific field in some slab object, or something
like that) more powerful, by shoving out-of-bounds kernel data around
such that it is aligned properly for the leak.

So, yeah, I think if it's not too much trouble, changing that would be nice.


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-18 Thread Andy Lutomirski
On Mon, Feb 18, 2019 at 5:04 AM Thomas Gleixner  wrote:

> > Another would be to have the buffer passed to flush_buffer() (i.e.
> > the callback of decompress_fn) allocated with 4 bytes of padding
> > past the part where the unpacked piece of data is placed for the
> > callback to find.  As in,
> >
> > diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> > index 63b4b7eee138..ca3f7ecc9b35 100644
> > --- a/lib/decompress_inflate.c
> > +++ b/lib/decompress_inflate.c
> > @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
> >   rc = -1;
> >   if (flush) {
> >   out_len = 0x8000; /* 32 K */
> > - out_buf = malloc(out_len);
> > + out_buf = malloc(out_len + 4);
>
>   +8 actually.
>
> >   } else {
> >   if (!out_len)
> >   out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> >
> > for gunzip/decompress and similar ones for bzip2, etc.  The contents
> > layout doesn't have anything to do with that...
>
> Right. That works nicely.
>

This seems like it's just papering over the underlying problem: with
Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
the patch below look decent?  It's only compile-tested, but it's
conceptually straightforward.  I was hoping I could get rid of the
check-maximum-address stuff, but it's needed for architectures where
the user range is adjacent to the kernel range (i.e. not x86_64).

Jann, I'm still unhappy that this code will write up to sizeof(long)-1
user-controlled garbage bytes in-bounds past the null-terminator in
the kernel buffer.  Do you think that's worth changing, too?  I don't
think it's a bug per se, but it seems like a nifty little wart for an
attacker to try to abuse.

On brief inspection, strnlen_user() does not have an equivalent bug.
diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 58eacd41526c..709d6efe0d42 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -10,12 +10,7 @@
 #include 
 #include 
 
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-#define IS_UNALIGNED(src, dst)	0
-#else
-#define IS_UNALIGNED(src, dst)	\
-	(((long) dst | (long) src) & (sizeof(long) - 1))
-#endif
+#define IS_UNALIGNED(addr) (((long)(addr)) & (sizeof(long) - 1))
 
 /*
  * Do a strncpy, return length of string without final '\0'.
@@ -35,14 +30,39 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
 	if (max > count)
 		max = count;
 
-	if (IS_UNALIGNED(src, dst))
+	/*
+	 * First handle any unaligned prefix of src.
+	 */
+	while (max && IS_UNALIGNED(src+res)) {
+		char c;
+
+		unsafe_get_user(c, src+res, efault);
+		dst[res] = c;
+		if (!c)
+			return res;
+		res++;
+		max--;
+	}
+
+	/*
+	 * Now we know that src + res is aligned.  If dst is unaligned and
+	 * we don't have efficient unaligned access, then keep going one
+	 * byte at a time.  (This could be optimized, but it would make
+	 * the code more complicated.
+	 */
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+	if (IS_UNALIGNED(dst + res))
 		goto byte_at_a_time;
+#endif
 
 	while (max >= sizeof(unsigned long)) {
+		/*
+		 * src + res is aligned, so the reads in this loop will
+		 * not cross a page boundary.
+		 */
 		unsigned long c, data;
 
-		/* Fall back to byte-at-a-time if we get a page fault */
-		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
+		unsafe_get_user(c, (unsigned long __user *)(src+res), efault);
 
 		*(unsigned long *)(dst+res) = c;
		if (has_zero(c, &data, &constants)) {
@@ -54,7 +74,9 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
 		max -= sizeof(unsigned long);
 	}
 
-byte_at_a_time:
+	/*
+	 * Finish the job one byte at a time.
+	 */
 	while (max) {
 		char c;
 


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-18 Thread Thomas Gleixner
On Sun, 17 Feb 2019, Al Viro wrote:
> On Sun, Feb 17, 2019 at 03:41:21AM +, Arthur Gautier wrote:
> Who says anything about changing the format of the file?  At least
> one trivial way to handle that would be this:
> 
> diff --git a/init/initramfs.c b/init/initramfs.c
> index 7cea802d00ef..edbddfb73106 100644
> --- a/init/initramfs.c
> +++ b/init/initramfs.c
> @@ -265,8 +265,12 @@ static int __init do_header(void)
>   state = Collect;
>   return 0;
>   }
> - if (S_ISREG(mode) || !body_len)
> - read_into(name_buf, N_ALIGN(name_len), GotName);
> + if (S_ISREG(mode) || !body_len) {
> + collect = collected = name_buf;
> + remains = N_ALIGN(name_len);
> + next_state = GotName;
> + state = Collect;
> + }
>   return 0;
>  }

That does not help much because that is exactly at the end of the
decompressed image and the decompressor is done. So nothing would collect
the remainder anymore.

> Another would be to have the buffer passed to flush_buffer() (i.e.
> the callback of decompress_fn) allocated with 4 bytes of padding
> past the part where the unpacked piece of data is placed for the
> callback to find.  As in,
> 
> diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> index 63b4b7eee138..ca3f7ecc9b35 100644
> --- a/lib/decompress_inflate.c
> +++ b/lib/decompress_inflate.c
> @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
>   rc = -1;
>   if (flush) {
>   out_len = 0x8000; /* 32 K */
> - out_buf = malloc(out_len);
> + out_buf = malloc(out_len + 4);

  +8 actually.

>   } else {
>   if (!out_len)
>   out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> 
> for gunzip/decompress and similar ones for bzip2, etc.  The contents
> layout doesn't have anything to do with that...

Right. That works nicely.

Thanks,

tglx



Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Al Viro
On Sun, Feb 17, 2019 at 03:41:21AM +, Arthur Gautier wrote:
> On Sat, Feb 16, 2019 at 11:47:02PM +, Al Viro wrote:
> > On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> > 
> > > What is the actual problem?  We’re not actually demand-faulting this 
> > > data, are we?  Are we just overrunning the buffer because the from_user 
> > > helpers are too clever?  Can we fix it for real by having the fancy 
> > > helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, 
> > > this might be faster, too.
> > 
> > Unaligned _stores_ are not any cheaper, and you'd get one hell of
> > extra arithmetics from trying to avoid both.  Check something
> > like e.g. memcpy() on alpha, where you really have to keep all
> > accesses aligned, both on load and on store side.
> > 
> > Can't we just pad the buffers a bit?  Making sure that name_buf
> > and symlink_buf are _not_ followed by unmapped pages shouldn't
> > be hard.  Both are allocated by kmalloc(), so...
> 
> We cannot change alignment rules here. The input buffer string we're
> reading is coming from a cpio-formatted file, and the format is
> defined by cpio(5).
> Nothing much we can do there, I'm afraid. The input buffer is defined
> to be 4-byte aligned.

Who says anything about changing the format of the file?  At least
one trivial way to handle that would be this:

diff --git a/init/initramfs.c b/init/initramfs.c
index 7cea802d00ef..edbddfb73106 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -265,8 +265,12 @@ static int __init do_header(void)
state = Collect;
return 0;
}
-   if (S_ISREG(mode) || !body_len)
-   read_into(name_buf, N_ALIGN(name_len), GotName);
+   if (S_ISREG(mode) || !body_len) {
+   collect = collected = name_buf;
+   remains = N_ALIGN(name_len);
+   next_state = GotName;
+   state = Collect;
+   }
return 0;
 }
 
Another would be to have the buffer passed to flush_buffer() (i.e.
the callback of decompress_fn) allocated with 4 bytes of padding
past the part where the unpacked piece of data is placed for the
callback to find.  As in,

diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
index 63b4b7eee138..ca3f7ecc9b35 100644
--- a/lib/decompress_inflate.c
+++ b/lib/decompress_inflate.c
@@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
rc = -1;
if (flush) {
out_len = 0x8000; /* 32 K */
-   out_buf = malloc(out_len);
+   out_buf = malloc(out_len + 4);
} else {
if (!out_len)
out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */

for gunzip/decompress and similar ones for bzip2, etc.  The contents
layout doesn't have anything to do with that...


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Arthur Gautier
On Sat, Feb 16, 2019 at 11:47:02PM +, Al Viro wrote:
> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> 
> > What is the actual problem?  We’re not actually demand-faulting this data, 
> > are we?  Are we just overrunning the buffer because the from_user helpers 
> > are too clever?  Can we fix it for real by having the fancy helpers do 
> > *aligned* loads so that they don’t overrun the buffer?  Heck, this might be 
> > faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.
> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...

We cannot change alignment rules here. The input buffer string we're
reading is coming from a cpio-formatted file, and the format is
defined by cpio(5).
Nothing much we can do there, I'm afraid. The input buffer is defined
to be 4-byte aligned.

-- 
\o/ Arthur
 G  Gandi.net


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Andy Lutomirski



> On Feb 16, 2019, at 3:47 PM, Al Viro  wrote:
> 
>> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
>> 
>> What is the actual problem?  We’re not actually demand-faulting this data, 
>> are we?  Are we just overrunning the buffer because the from_user helpers 
>> are too clever?  Can we fix it for real by having the fancy helpers do 
>> *aligned* loads so that they don’t overrun the buffer?  Heck, this might be 
>> faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.

I think we should avoid unaligned loads and do unaligned stores instead.

I would generally expect unaligned stores to be a bit cheaper, since they
don't need to complete for the comparisons to happen.

But I maintain my claim that this code should not be overrunning its input 
buffer into the next page, since it could have observable side effects.

> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...
> 
> What am I missing here?


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Al Viro
On Sat, Feb 16, 2019 at 11:47:02PM +, Al Viro wrote:
> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> 
> > What is the actual problem?  We’re not actually demand-faulting this data, 
> > are we?  Are we just overrunning the buffer because the from_user helpers 
> > are too clever?  Can we fix it for real by having the fancy helpers do 
> > *aligned* loads so that they don’t overrun the buffer?  Heck, this might be 
> > faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.
> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...
> 
> What am I missing here?

... the fact that read_info() might skip copying if everything it
wants is already in the buffer passed by the decompressor ;-/  It's
been a while since I looked into that code...


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Al Viro
On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:

> What is the actual problem?  We’re not actually demand-faulting this data, 
> are we?  Are we just overrunning the buffer because the from_user helpers are 
> too clever?  Can we fix it for real by having the fancy helpers do *aligned* 
> loads so that they don’t overrun the buffer?  Heck, this might be faster, too.

Unaligned _stores_ are not any cheaper, and you'd get one hell of
extra arithmetics from trying to avoid both.  Check something
like e.g. memcpy() on alpha, where you really have to keep all
accesses aligned, both on load and on store side.

Can't we just pad the buffers a bit?  Making sure that name_buf
and symlink_buf are _not_ followed by unmapped pages shouldn't
be hard.  Both are allocated by kmalloc(), so...

What am I missing here?


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Andy Lutomirski


> On Feb 16, 2019, at 2:50 PM, Andy Lutomirski  wrote:
> 
> 
> 
>>> On Feb 16, 2019, at 12:18 PM, Thomas Gleixner  wrote:
>>> 
 On Sat, 16 Feb 2019, Jann Horn wrote:
 On Sat, Feb 16, 2019 at 12:59 AM  wrote:
 When extracting an initramfs, a filename may be near an allocation boundary.
 Should that happen, strncpy_from_user will invoke unsafe_get_user, which
 may cross the allocation boundary. Should that happen, unsafe_get_user will
 trigger a page fault, and strncpy_from_user would then bail out to
 byte_at_a_time behavior.
 
 unsafe_get_user is unsafe by nature, and relies on page faults to detect
 boundaries.
 After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
 it may no longer rely on page faults, as the new page fault handler would
 trigger a BUG().
 
 This commit allows unsafe_get_user to explicitly trigger page faults and
 handle them directly with the error target label.
>>> 
>>> Oof. So basically the init code is full of things that just call
>>> syscalls instead of using VFS functions (which don't actually exist
>>> for everything), and the VFS syscalls use getname_flags(), which uses
>>> strncpy_from_user(), which can access out-of-bounds pages on
>>> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
>>> that in summary means that all the init code is potentially prone to
>>> tripping over this?
>> 
>> Not all init code. It should be only the initramfs decompression.
>> 
>>> I don't particularly like this approach to fixing it, but I also don't
>>> have any better ideas, so I guess unless someone else has a bright
>>> idea, this patch might have to go in.
>> 
>> So we know that this happens in the context of decompress() which calls
>> flush_buffer() for every chunk. flush_buffer() gets the start_address and
>> the length. We also know that the fault can only happen within:
>> 
>>   start_address <= fault_address < start_address + length + 8;
>> 
>> So something like the untested workaround below should cover the initramfs
>> oddity and avoid to weaken the protection for all other cases.
> 
> What is the actual problem?  We’re not actually demand-faulting this data, 
> are we?  Are we just overrunning the buffer because the from_user helpers are 
> too clever?  Can we fix it for real by having the fancy helpers do *aligned* 
> loads so that they don’t overrun the buffer?  Heck, this might be faster, too.

Indeed.  I would argue that the current code is a bug even in the normal case.  
If I lay out my user address space so that I have f,o,o,o,\0 at the end of a 
page and I have non-side-effect-free memory after it (MMIO, userfaultfd, etc), 
then passing a pointer to that “fooo” string to a syscall should *not* overrun 
the buffer.

If I have some time this evening, I’ll see if I can whip up a credible fix.

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Andy Lutomirski



> On Feb 16, 2019, at 12:18 PM, Thomas Gleixner  wrote:
> 
>> On Sat, 16 Feb 2019, Jann Horn wrote:
>>> On Sat, Feb 16, 2019 at 12:59 AM  wrote:
>>> When extracting an initramfs, a filename may be near an allocation boundary.
>>> Should that happen, strncpy_from_user will invoke unsafe_get_user, which
>>> may cross the allocation boundary. Should that happen, unsafe_get_user will
>>> trigger a page fault, and strncpy_from_user would then bail out to
>>> byte_at_a_time behavior.
>>> 
>>> unsafe_get_user is unsafe by nature, and relies on page faults to detect
>>> boundaries.
>>> After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
>>> it may no longer rely on page faults, as the new page fault handler would
>>> trigger a BUG().
>>> 
>>> This commit allows unsafe_get_user to explicitly trigger page faults and
>>> handle them directly with the error target label.
>> 
>> Oof. So basically the init code is full of things that just call
>> syscalls instead of using VFS functions (which don't actually exist
>> for everything), and the VFS syscalls use getname_flags(), which uses
>> strncpy_from_user(), which can access out-of-bounds pages on
>> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
>> that in summary means that all the init code is potentially prone to
>> tripping over this?
> 
> Not all init code. It should be only the initramfs decompression.
> 
>> I don't particularly like this approach to fixing it, but I also don't
>> have any better ideas, so I guess unless someone else has a bright
>> idea, this patch might have to go in.
> 
> So we know that this happens in the context of decompress() which calls
> flush_buffer() for every chunk. flush_buffer() gets the start_address and
> the length. We also know that the fault can only happen within:
> 
>start_address <= fault_address < start_address + length + 8;
> 
> So something like the untested workaround below should cover the initramfs
> oddity and avoid to weaken the protection for all other cases.

What is the actual problem?  We’re not actually demand-faulting this data, are 
we?  Are we just overrunning the buffer because the from_user helpers are too 
clever?  Can we fix it for real by having the fancy helpers do *aligned* loads 
so that they don’t overrun the buffer?  Heck, this might be faster, too.


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Thomas Gleixner
On Sat, 16 Feb 2019, Thomas Gleixner wrote:

> On Sat, 16 Feb 2019, Jann Horn wrote:
> > On Sat, Feb 16, 2019 at 12:59 AM  wrote:
> > > When extracting an initramfs, a filename may be near an allocation boundary.
> > > Should that happen, strncpy_from_user will invoke unsafe_get_user, which
> > > may cross the allocation boundary. Should that happen, unsafe_get_user will
> > > trigger a page fault, and strncpy_from_user would then bail out to
> > > byte_at_a_time behavior.
> > >
> > > unsafe_get_user is unsafe by nature, and relies on page faults to detect
> > > boundaries.
> > > After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
> > > it may no longer rely on page faults, as the new page fault handler would
> > > trigger a BUG().
> > >
> > > This commit allows unsafe_get_user to explicitly trigger page faults and
> > > handle them directly with the error target label.
> > 
> > Oof. So basically the init code is full of things that just call
> > syscalls instead of using VFS functions (which don't actually exist
> > for everything), and the VFS syscalls use getname_flags(), which uses
> > strncpy_from_user(), which can access out-of-bounds pages on
> > architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
> > that in summary means that all the init code is potentially prone to
> > tripping over this?
> 
> Not all init code. It should be only the initramfs decompression.
> 
> > I don't particularly like this approach to fixing it, but I also don't
> > have any better ideas, so I guess unless someone else has a bright
> > idea, this patch might have to go in.
> 
> So we know that this happens in the context of decompress() which calls
> flush_buffer() for every chunk. flush_buffer() gets the start_address and
> the length. We also know that the fault can only happen within:
> 
> start_address <= fault_address < start_address + length + 8;
> 
> So something like the untested workaround below should cover the initramfs
> oddity and avoid to weaken the protection for all other cases.

The other, even simpler solution would be to force these functions into the
byte-at-a-time code path during init. Too tired to hack that up now.

Thanks,

tglx


Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-16 Thread Thomas Gleixner
On Sat, 16 Feb 2019, Jann Horn wrote:
> On Sat, Feb 16, 2019 at 12:59 AM  wrote:
> > When extracting an initramfs, a filename may be near an allocation boundary.
> > Should that happen, strncpy_from_user will invoke unsafe_get_user, which
> > may cross the allocation boundary. Should that happen, unsafe_get_user will
> > trigger a page fault, and strncpy_from_user would then bail out to
> > byte_at_a_time behavior.
> >
> > unsafe_get_user is unsafe by nature, and relies on page faults to detect
> > boundaries.
> > After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
> > it may no longer rely on page faults, as the new page fault handler would
> > trigger a BUG().
> >
> > This commit allows unsafe_get_user to explicitly trigger page faults and
> > handle them directly with the error target label.
> 
> Oof. So basically the init code is full of things that just call
> syscalls instead of using VFS functions (which don't actually exist
> for everything), and the VFS syscalls use getname_flags(), which uses
> strncpy_from_user(), which can access out-of-bounds pages on
> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
> that in summary means that all the init code is potentially prone to
> tripping over this?

Not all init code. It should be only the initramfs decompression.

> I don't particularly like this approach to fixing it, but I also don't
> have any better ideas, so I guess unless someone else has a bright
> idea, this patch might have to go in.

So we know that this happens in the context of decompress() which calls
flush_buffer() for every chunk. flush_buffer() gets the start_address and
the length. We also know that the fault can only happen within:

start_address <= fault_address < start_address + length + 8;

So something like the untested workaround below should cover the initramfs
oddity and avoid weakening the protection for all other cases.

Thanks,

tglx

8<---
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -1,5 +1,6 @@
 #include <linux/extable.h>
 #include <linux/uaccess.h>
+#include <linux/initrd.h>
 #include <linux/sched/debug.h>
 #include <xen/xen.h>
 
@@ -161,6 +162,14 @@ static bool bogus_uaccess(struct pt_regs
if (current->kernel_uaccess_faults_ok)
return false;
 
+   /*
+* initramfs decompression can trigger a fault when
+* unsafe_get_user() goes over the boundary of the buffer. That's a
+* valid case for e.g. strncpy_from_user().
+*/
+   if (initramfs_fault_in_decompress(fault_addr))
+   return false;
+
/* This is bad. Refuse the fixup so that we go into die(). */
if (trapnr == X86_TRAP_PF) {
pr_emerg("BUG: pagefault on kernel address 0x%lx in non-whitelisted uaccess\n",
--- a/include/linux/initrd.h
+++ b/include/linux/initrd.h
@@ -1,5 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#ifndef LINUX_INITRD_H
+#define LINUX_INITRD_H
+
 #define INITRD_MINOR 250 /* shouldn't collide with /dev/ram* too soon ... */
 
 /* 1 = load ramdisk, 0 = don't load */
@@ -25,3 +28,14 @@ extern phys_addr_t phys_initrd_start;
 extern unsigned long phys_initrd_size;
 
 extern unsigned int real_root_dev;
+
+#ifdef CONFIG_BLK_DEV_INITRD
+bool initramfs_fault_in_decompress(unsigned long addr);
+#else
+static inline bool initramfs_fault_in_decompress(unsigned long addr)
+{
+   return false;
+}
+#endif
+
+#endif
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -403,13 +403,27 @@ static __initdata int (*actions[])(void)
[Reset] = do_reset,
 };
 
+static unsigned long flush_start;
+static unsigned long flush_length;
+
+bool initramfs_fault_in_decompress(unsigned long addr)
+{
+   return addr >= flush_start && addr < flush_start + flush_length + 8;
+}
+
 static long __init write_buffer(char *buf, unsigned long len)
 {
+   /* Store address and length for uaccess fault handling */
+   flush_start = (unsigned long) buf;
+   flush_length = len;
+
byte_count = len;
victim = buf;
 
while (!actions[state]())
;
+   /* Clear the uaccess fault handling region */
+   flush_start = flush_length = 0;
return len - byte_count;
 }
 

Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user

2019-02-15 Thread Jann Horn
+Andy Lutomirski

On Sat, Feb 16, 2019 at 12:59 AM  wrote:
>
> From: Arthur Gautier 
>
> When extracting an initramfs, a filename may be near an allocation boundary.
> > Should that happen, strncpy_from_user will invoke unsafe_get_user, which
> > may cross the allocation boundary. Should that happen, unsafe_get_user will
> > trigger a page fault, and strncpy_from_user would then bail out to
> > byte-at-a-time behavior.
>
> > unsafe_get_user is unsafe by nature, and relies on page faults to detect
> > boundaries.
> > After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel
> > addresses") it may no longer rely on page faults, as the new page fault
> > handler would trigger a BUG().
>
> > This commit allows unsafe_get_user to explicitly trigger page faults and
> > handle them directly with the error target label.

Oof. So basically the init code is full of things that just call
syscalls instead of using VFS functions (which don't actually exist
for everything), and the VFS syscalls use getname_flags(), which uses
strncpy_from_user(), which can access out-of-bounds pages on
architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
that in summary means that all the init code is potentially prone to
tripping over this?

I don't particularly like this approach to fixing it, but I also don't
have any better ideas, so I guess unless someone else has a bright
idea, this patch might have to go in.

> Kernel bug:
> [0.965251] Unpacking initramfs...
> [1.797025] BUG: pagefault on kernel address 0xae80c0c7e000 in non-whitelisted uaccess
> [1.798992] BUG: unable to handle kernel paging request at ae80c0c7e000
> [1.798992] #PF error: [normal kernel read fault]
> [1.798992] PGD 68526067 P4D 68526067 PUD 68527067 PMD 67f0d067 PTE 0
> [1.798992] Oops:  [#1] SMP PTI
> [1.798992] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc6+ #14
> [1.798992] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [1.798992] RIP: 0010:strncpy_from_user+0x67/0xe0
> [1.798992] Code: fe fe 48 39 ca 49 ba 80 80 80 80 80 80 80 80 48 0f 46 ca 31 c0 45 31 db 49 89 c8 4c 89 c1 48 29 c1 48 83 f9 07 76 49 44 89 d9 <4c> 8b 0c 06 85 c9 75 3e 49 8d 0c 19 4c 89 0c 07 49 f7 d1 4c 21 c9
> [1.798992] RSP: :ae80c031fc40 EFLAGS: 00050216
> [1.798992] RAX: 0040 RBX: fefefefefefefeff RCX: 
> [1.798992] RDX: 0fe0 RSI: ae80c0c7dfba RDI: 8b3d27cce020
> [1.798992] RBP: ff9c R08: 0fe0 R09: caccd29190978b86
> [1.798992] R10: 8080808080808080 R11:  R12: ae80c0c7dfba
> [1.798992] R13: ae80c031fd00 R14:  R15: 
> [1.798992] FS:  () GS:8b3d28a0() knlGS:
> [1.798992] CS:  0010 DS:  ES:  CR0: 80050033
> [1.798992] CR2: ae80c0c7e000 CR3: 3a60e001 CR4: 00360eb0
> [1.798992] Call Trace:
> [1.798992]  getname_flags+0x69/0x187
> [1.798992]  user_path_at_empty+0x1e/0x41
> [1.798992]  vfs_statx+0x70/0xcc
> [1.798992]  clean_path+0x41/0xa2
> [1.798992]  ? parse_header+0x40/0x10a
> [1.798992]  do_name+0x78/0x2b5
> [1.798992]  write_buffer+0x27/0x37
> [1.798992]  flush_buffer+0x34/0x8b
> [1.798992]  ? md_run_setup+0x8a/0x8a
> [1.798992]  unlz4+0x20b/0x27c
> [1.798992]  ? write_buffer+0x37/0x37
> [1.798992]  ? decompress_method+0x80/0x80
> [1.798992]  unpack_to_rootfs+0x17a/0x2b7
> [1.798992]  ? md_run_setup+0x8a/0x8a
> [1.798992]  ? clean_rootfs+0x159/0x159
> [1.798992]  populate_rootfs+0x5d/0x105
> [1.798992]  do_one_initcall+0x86/0x169
> [1.798992]  ? do_early_param+0x8e/0x8e
> [1.798992]  kernel_init_freeable+0x16a/0x1f4
> [1.798992]  ? rest_init+0xaa/0xaa
> [1.798992]  kernel_init+0xa/0xfa
> [1.798992]  ret_from_fork+0x35/0x40
>
> You may reproduce the issue with the following initrd:
>   truncate -s 8388313 a
>   
> SECONDFILENAME=
>   truncate -s 10 $SECONDFILENAME
>   echo "a\n$SECONDFILENAME" | cpio -o --format=newc | lz4 -l > initrd.img.lz4
>
> This places the second filename in the cpio near the allocation boundary made
> by lz4 decompression and should trigger the bug.
>
> Fixes: 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel 
> addresses")
>
> Cc: Jann Horn 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Cc: x...@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Pascal Bouchareine 
> Signed-off-by: Arthur Gautier 
> ---
>  arch/x86/include/asm/uaccess.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index 780f2b42c8efe..2c272dc43e05a 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h