Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-06 Thread Linus Torvalds
On Wed, Dec 6, 2017 at 3:21 PM, Kees Cook  wrote:
> On Wed, Dec 6, 2017 at 2:31 AM, David Laight  wrote:
>>
>> I can also image issues where you want to know whether 2 pointers point
>> into the same structure (like an skb).
>
> This is already retained due to the hashing. i.e. the same pointer
> value will produce the same hash value, so that kind of direct
> comparison still works.

DavidL isn't talking about the _same_ pointer, he's talking about it
being offset from the same base.

But realistically, I don't think we ever hash things like that anyway.

Things like that do show up in register dumps etc during oopses, but
those aren't hashed.

Linus


Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-06 Thread Kees Cook
On Wed, Dec 6, 2017 at 2:31 AM, David Laight  wrote:
> From: David Miller
>> Sent: 05 December 2017 20:31
> ...
>> > Would it make sense to keep the 3 lowest bits of the address?
>> >
>> > Currently printed pointers no longer have any correlation with the actual
>> > alignment in memory of the object, which is a typical cause of a class of 
>> > bugs.
>>
>> Yeah, this is driving people nuts who wonder why pointers are aligned
>> all weird now.
>
> I can also image issues where you want to know whether 2 pointers point
> into the same structure (like an skb).

This is already retained due to the hashing. i.e. the same pointer
value will produce the same hash value, so that kind of direct
comparison still works.

> Useful if you suspect that code is using a stale pointer because the
> skb got reallocated by some internal function.
> I'm not sure such pointers are printed by default though.

I hope not. :) skb used to be exported for INET_DIAG, but that got
fixed a while back.

-Kees

-- 
Kees Cook
Pixel Security


RE: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-06 Thread David Laight
From: David Miller
> Sent: 05 December 2017 20:31
...
> > Would it make sense to keep the 3 lowest bits of the address?
> >
> > Currently printed pointers no longer have any correlation with the actual
> > alignment in memory of the object, which is a typical cause of a class of 
> > bugs.
> 
> Yeah, this is driving people nuts who wonder why pointers are aligned
> all weird now.

I can also image issues where you want to know whether 2 pointers point
into the same structure (like an skb).
Useful if you suspect that code is using a stale pointer because the
skb got reallocated by some internal function.
I'm not sure such pointers are printed by default though.

I know I have debugged things that required hexdumps of memory areas
referenced by traced pointers.
I won't have done that for anything written with the kernel printf because
getting any more info for a linux kernel oops is actually quite hard.

David



Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-06 Thread Geert Uytterhoeven
Hi Linus,

On Wed, Dec 6, 2017 at 12:33 AM, Linus Torvalds
 wrote:
> On Tue, Dec 5, 2017 at 2:57 PM, Geert Uytterhoeven  
> wrote:
>> Lowest 3 is good enough for all natural types, up to long long.
>> We may still receive complaints from people who care about seeing if
>> a pointer is cacheline-aligned or not. Fixing that may need up to 7 bits, I'm
>> afraid, which is a bit too much to give up.
>
> I really think even the lowest three is a bit too much.
>
> Who really cares? If it's something that is particularly _about_
> alignment (ie an alignment trap), maybe just print out those bits
> separately.

If I'm not mistaken, some architectures don't generate alignment traps, but
just zero the LSBs.

> And for everything else? It's purely about getting used to it.

Yes, we will get used to it.

I agree that for debugging during development, we will just use %px and be
fine.

Storm in a glass of water, everybody will have forgotten by the time v4.16 is
released.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-05 Thread Linus Torvalds
On Tue, Dec 5, 2017 at 2:57 PM, Geert Uytterhoeven  wrote:
> Lowest 3 is good enough for all natural types, up to long long.
> We may still receive complaints from people who care about seeing if
> a pointer is cacheline-aligned or not. Fixing that may need up to 7 bits, I'm
> afraid, which is a bit too much to give up.

I really think even the lowest three is a bit too much.

Who really cares? If it's something that is particularly _about_
alignment (ie an alignment trap), maybe just print out those bits
separately.

And for everything else? It's purely about getting used to it.

I will just cut-and-paste what I wrote in another thread about the hashing:

  I'm going to require some actual proof of an actual case where it
  mattered that a particular value was hashed.

  Not hand-waving.

  Not "it surprised and confused me" because it looked different. You'll
  get used to it.

  So an actual "this was critical information that mattered for this
  particular bug, and it was missing due to the hashing of this
  particular value and debugging was harder in actual reality due to
  that".

  Because the actual example I have seen so far, not only didn't the
  hashing matter AT ALL, most of the _unhashed_ values shouldn't have
  been there either, and were due to arm still printing stuff that
  shouldn't have been printed at all and just made the oops more complex
  and harder to read and report.

This subject is really easy to bike-shed around. Everybody can have an
opinion. I want actual hard data and facts, not opinions.

And if the hashing _really_ is a problem, we'll just change that
particular thing to %px. But it needs actual hard data and real
reasons first, ok?

  Linus

  Linus


Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-05 Thread Tobin C. Harding
On Tue, Dec 05, 2017 at 09:20:57PM +0100, Geert Uytterhoeven wrote:
> Hi Tobin,
> 
> On Wed, Nov 29, 2017 at 3:05 AM, Tobin C. Harding  wrote:
> > Currently there exist approximately 14 000 places in the kernel where
> > addresses are being printed using an unadorned %p. This potentially
> > leaks sensitive information regarding the Kernel layout in memory. Many
> > of these calls are stale, instead of fixing every call lets hash the
> > address by default before printing. This will of course break some
> > users, forcing code printing needed addresses to be updated.
> >
> > Code that _really_ needs the address will soon be able to use the new
> > printk specifier %px to print the address.
> 
> > --- a/lib/vsprintf.c
> > +++ b/lib/vsprintf.c
> 
> > +/* Maps a pointer to a 32 bit unique identifier. */
> > +static char *ptr_to_id(char *buf, char *end, void *ptr, struct printf_spec 
> > spec)
> > +{
> > +   unsigned long hashval;
> > +   const int default_width = 2 * sizeof(ptr);
> > +
> > +   if (unlikely(!have_filled_random_ptr_key)) {
> > +   spec.field_width = default_width;
> > +   /* string length must be less than default_width */
> > +   return string(buf, end, "(ptrval)", spec);
> > +   }
> > +
> > +#ifdef CONFIG_64BIT
> > +   hashval = (unsigned long)siphash_1u64((u64)ptr, _key);
> > +   /*
> > +* Mask off the first 32 bits, this makes explicit that we have
> > +* modified the address (and 32 bits is plenty for a unique ID).
> > +*/
> > +   hashval = hashval & 0x;
> > +#else
> > +   hashval = (unsigned long)siphash_1u32((u32)ptr, _key);
> > +#endif
> 
> Would it make sense to keep the 3 lowest bits of the address?
> 
> Currently printed pointers no longer have any correlation with the actual
> alignment in memory of the object, which is a typical cause of a class of 
> bugs.

We'd have to keep the lowest 4 since we are printing in hex, right? This
is easy enough to add. I wasn't the architect behind the hashing but I
can do up a patch and see if anyone who knows crypto objects.

thanks,
Tobin.


Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-05 Thread David Miller
From: Geert Uytterhoeven 
Date: Tue, 5 Dec 2017 21:20:57 +0100

> Hi Tobin,
> 
> On Wed, Nov 29, 2017 at 3:05 AM, Tobin C. Harding  wrote:
>> Currently there exist approximately 14 000 places in the kernel where
>> addresses are being printed using an unadorned %p. This potentially
>> leaks sensitive information regarding the Kernel layout in memory. Many
>> of these calls are stale, instead of fixing every call lets hash the
>> address by default before printing. This will of course break some
>> users, forcing code printing needed addresses to be updated.
>>
>> Code that _really_ needs the address will soon be able to use the new
>> printk specifier %px to print the address.
> 
>> --- a/lib/vsprintf.c
>> +++ b/lib/vsprintf.c
> 
>> +/* Maps a pointer to a 32 bit unique identifier. */
>> +static char *ptr_to_id(char *buf, char *end, void *ptr, struct printf_spec 
>> spec)
>> +{
>> +   unsigned long hashval;
>> +   const int default_width = 2 * sizeof(ptr);
>> +
>> +   if (unlikely(!have_filled_random_ptr_key)) {
>> +   spec.field_width = default_width;
>> +   /* string length must be less than default_width */
>> +   return string(buf, end, "(ptrval)", spec);
>> +   }
>> +
>> +#ifdef CONFIG_64BIT
>> +   hashval = (unsigned long)siphash_1u64((u64)ptr, _key);
>> +   /*
>> +* Mask off the first 32 bits, this makes explicit that we have
>> +* modified the address (and 32 bits is plenty for a unique ID).
>> +*/
>> +   hashval = hashval & 0x;
>> +#else
>> +   hashval = (unsigned long)siphash_1u32((u32)ptr, _key);
>> +#endif
> 
> Would it make sense to keep the 3 lowest bits of the address?
> 
> Currently printed pointers no longer have any correlation with the actual
> alignment in memory of the object, which is a typical cause of a class of 
> bugs.

Yeah, this is driving people nuts who wonder why pointers are aligned
all weird now.


Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-12-05 Thread Geert Uytterhoeven
Hi Tobin,

On Wed, Nov 29, 2017 at 3:05 AM, Tobin C. Harding  wrote:
> Currently there exist approximately 14 000 places in the kernel where
> addresses are being printed using an unadorned %p. This potentially
> leaks sensitive information regarding the Kernel layout in memory. Many
> of these calls are stale, instead of fixing every call lets hash the
> address by default before printing. This will of course break some
> users, forcing code printing needed addresses to be updated.
>
> Code that _really_ needs the address will soon be able to use the new
> printk specifier %px to print the address.

> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c

> +/* Maps a pointer to a 32 bit unique identifier. */
> +static char *ptr_to_id(char *buf, char *end, void *ptr, struct printf_spec 
> spec)
> +{
> +   unsigned long hashval;
> +   const int default_width = 2 * sizeof(ptr);
> +
> +   if (unlikely(!have_filled_random_ptr_key)) {
> +   spec.field_width = default_width;
> +   /* string length must be less than default_width */
> +   return string(buf, end, "(ptrval)", spec);
> +   }
> +
> +#ifdef CONFIG_64BIT
> +   hashval = (unsigned long)siphash_1u64((u64)ptr, _key);
> +   /*
> +* Mask off the first 32 bits, this makes explicit that we have
> +* modified the address (and 32 bits is plenty for a unique ID).
> +*/
> +   hashval = hashval & 0x;
> +#else
> +   hashval = (unsigned long)siphash_1u32((u32)ptr, _key);
> +#endif

Would it make sense to keep the 3 lowest bits of the address?

Currently printed pointers no longer have any correlation with the actual
alignment in memory of the object, which is a typical cause of a class of bugs.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH V11 3/5] printk: hash addresses printed with %p

2017-11-29 Thread Andrew Morton
On Wed, 29 Nov 2017 13:05:03 +1100 "Tobin C. Harding"  wrote:

> Currently there exist approximately 14 000 places in the kernel where
> addresses are being printed using an unadorned %p. This potentially
> leaks sensitive information regarding the Kernel layout in memory. Many
> of these calls are stale, instead of fixing every call lets hash the
> address by default before printing. This will of course break some
> users, forcing code printing needed addresses to be updated.
> 
> Code that _really_ needs the address will soon be able to use the new
> printk specifier %px to print the address.
> 
> For what it's worth, usage of unadorned %p can be broken down as
> follows (thanks to Joe Perches).
> 
> $ git grep -E '%p[^A-Za-z0-9]' | cut -f1 -d"/" | sort | uniq -c
>1084 arch
>  20 block
>  10 crypto
>  32 Documentation
>8121 drivers
>1221 fs
> 143 include
> 101 kernel
>  69 lib
> 100 mm
>1510 net
>  40 samples
>   7 scripts
>  11 security
> 166 sound
> 152 tools
>   2 virt
> 
> Add function ptr_to_id() to map an address to a 32 bit unique
> identifier. Hash any unadorned usage of specifier %p and any malformed
> specifiers.
> 
> ...
>
> @@ -1644,6 +1646,73 @@ char *device_node_string(char *buf, char *end, struct 
> device_node *dn,
>   return widen_string(buf, buf - buf_start, end, spec);
>  }
>  
> +static bool have_filled_random_ptr_key __read_mostly;
> +static siphash_key_t ptr_key __read_mostly;
> +
> +static void fill_random_ptr_key(struct random_ready_callback *unused)
> +{
> + get_random_bytes(_key, sizeof(ptr_key));
> + /*
> +  * have_filled_random_ptr_key==true is dependent on get_random_bytes().
> +  * ptr_to_id() needs to see have_filled_random_ptr_key==true
> +  * after get_random_bytes() returns.
> +  */
> + smp_mb();
> + WRITE_ONCE(have_filled_random_ptr_key, true);
> +}

I don't think I'm seeing anything which prevents two CPUs from
initializing ptr_key at the same time.  Probably doesn't matter much...



[PATCH V11 3/5] printk: hash addresses printed with %p

2017-11-28 Thread Tobin C. Harding
Currently there exist approximately 14 000 places in the kernel where
addresses are being printed using an unadorned %p. This potentially
leaks sensitive information regarding the Kernel layout in memory. Many
of these calls are stale, instead of fixing every call lets hash the
address by default before printing. This will of course break some
users, forcing code printing needed addresses to be updated.

Code that _really_ needs the address will soon be able to use the new
printk specifier %px to print the address.

For what it's worth, usage of unadorned %p can be broken down as
follows (thanks to Joe Perches).

$ git grep -E '%p[^A-Za-z0-9]' | cut -f1 -d"/" | sort | uniq -c
   1084 arch
 20 block
 10 crypto
 32 Documentation
   8121 drivers
   1221 fs
143 include
101 kernel
 69 lib
100 mm
   1510 net
 40 samples
  7 scripts
 11 security
166 sound
152 tools
  2 virt

Add function ptr_to_id() to map an address to a 32 bit unique
identifier. Hash any unadorned usage of specifier %p and any malformed
specifiers.

Signed-off-by: Tobin C. Harding 
---
 Documentation/printk-formats.txt |  12 -
 lib/test_printf.c| 108 +--
 lib/vsprintf.c   |  81 ++---
 3 files changed, 155 insertions(+), 46 deletions(-)

diff --git a/Documentation/printk-formats.txt b/Documentation/printk-formats.txt
index 71b62db7eca2..b4e668ac4fe3 100644
--- a/Documentation/printk-formats.txt
+++ b/Documentation/printk-formats.txt
@@ -5,7 +5,6 @@ How to get printk format specifiers right
 :Author: Randy Dunlap 
 :Author: Andrew Murray 
 
-
 Integer types
 =
 
@@ -45,6 +44,17 @@ return from vsnprintf.
 Raw pointer value SHOULD be printed with %p. The kernel supports
 the following extended format specifiers for pointer types:
 
+Pointer Types
+=
+
+Pointers printed without a specifier extension (i.e unadorned %p) are
+hashed to give a unique identifier without leaking kernel addresses to user
+space. On 64 bit machines the first 32 bits are zeroed.
+
+::
+
+   %p  abcdef12 or abcdef12
+
 Symbols/Function Pointers
 =
 
diff --git a/lib/test_printf.c b/lib/test_printf.c
index 563f10e6876a..71ebfa43ad05 100644
--- a/lib/test_printf.c
+++ b/lib/test_printf.c
@@ -24,24 +24,6 @@
 #define PAD_SIZE 16
 #define FILL_CHAR '$'
 
-#define PTR1 ((void*)0x01234567)
-#define PTR2 ((void*)(long)(int)0xfedcba98)
-
-#if BITS_PER_LONG == 64
-#define PTR1_ZEROES "0"
-#define PTR1_SPACES " "
-#define PTR1_STR "1234567"
-#define PTR2_STR "fedcba98"
-#define PTR_WIDTH 16
-#else
-#define PTR1_ZEROES "0"
-#define PTR1_SPACES " "
-#define PTR1_STR "1234567"
-#define PTR2_STR "fedcba98"
-#define PTR_WIDTH 8
-#endif
-#define PTR_WIDTH_STR stringify(PTR_WIDTH)
-
 static unsigned total_tests __initdata;
 static unsigned failed_tests __initdata;
 static char *test_buffer __initdata;
@@ -217,30 +199,79 @@ test_string(void)
test("a  |   |   ", "%-3.s|%-3.0s|%-3.*s", "a", "b", 0, "c");
 }
 
+#define PLAIN_BUF_SIZE 64  /* leave some space so we don't oops */
+
+#if BITS_PER_LONG == 64
+
+#define PTR_WIDTH 16
+#define PTR ((void *)0x0123456789ab)
+#define PTR_STR "0123456789ab"
+#define ZEROS ""   /* hex 32 zero bits */
+
+static int __init
+plain_format(void)
+{
+   char buf[PLAIN_BUF_SIZE];
+   int nchars;
+
+   nchars = snprintf(buf, PLAIN_BUF_SIZE, "%p", PTR);
+
+   if (nchars != PTR_WIDTH || strncmp(buf, ZEROS, strlen(ZEROS)) != 0)
+   return -1;
+
+   return 0;
+}
+
+#else
+
+#define PTR_WIDTH 8
+#define PTR ((void *)0x456789ab)
+#define PTR_STR "456789ab"
+
+static int __init
+plain_format(void)
+{
+   /* Format is implicitly tested for 32 bit machines by plain_hash() */
+   return 0;
+}
+
+#endif /* BITS_PER_LONG == 64 */
+
+static int __init
+plain_hash(void)
+{
+   char buf[PLAIN_BUF_SIZE];
+   int nchars;
+
+   nchars = snprintf(buf, PLAIN_BUF_SIZE, "%p", PTR);
+
+   if (nchars != PTR_WIDTH || strncmp(buf, PTR_STR, PTR_WIDTH) == 0)
+   return -1;
+
+   return 0;
+}
+
+/*
+ * We can't use test() to test %p because we don't know what output to expect
+ * after an address is hashed.
+ */
 static void __init
 plain(void)
 {
-   test(PTR1_ZEROES PTR1_STR " " PTR2_STR, "%p %p", PTR1, PTR2);
-   /*
-* The field width is overloaded for some %p extensions to
-* pass another piece of information. For plain pointers, the
-* behaviour is slightly odd: One cannot pass either the 0
-* flag nor a precision to %p without gcc complaining, and if
-* one explicitly gives a field width, the number is no longer
-* zero-padded.
-*/
-   test("|" PTR1_STR PTR1_SPACES "  |  " PTR1_SPACES PTR1_STR "|",
-"|%-*p|%*p|",