Re: libc malloc poison

2013-07-05 Thread Otto Moerbeek
On Thu, Jul 04, 2013 at 05:24:20PM +0200, Mark Kettenis wrote:

  From: Theo de Raadt dera...@cvs.openbsd.org
  Date: Thu, 04 Jul 2013 09:04:54 -0600
  
  I suspect the best approach would be a hybrid value.  The upper half
  of the address should try to land in an unmapped zone, or into the zero
  page, or into some address space hole, ir into super high memory above
  the stack which is gauranteed unmapped.
 
 Don't forget strict alignment architectures, where it is beneficial
 to have the lowest bit set to trigger alignment traps.

You also want the highest bit set. This makes sure signed indexes get
interpreted as negative, which should more often detect problems than
positive ones. There are not only pointers in the heap. 

I very much prefer the values te be easily recognizable in a debugger
and to keep a clear distinction between uninitialized and freed
chunks.  Too much random is not good thing in this case. I picked 0xdf
a few years ago to mark free'd memory. The f is supposed to help you
remember that when you see it in a debugger. 

Maybe put a fixed pattern in the low nibbles and a random in the high
nibble. Together with the lowest and highest bit set this would go
like:

0x8e0d0e0f || 0xr0r0r0r0

(pick another lhs for free'ed mem)

Wondering if that would produce easily recognizable patterns that
still trigger enough faults.

-Otto




Re: libc malloc poison

2013-07-05 Thread Henri Kemppainen
 On Thu, Jul 04, 2013 at 05:24:20PM +0200, Mark Kettenis wrote:
   From: Theo de Raadt dera...@cvs.openbsd.org
   Date: Thu, 04 Jul 2013 09:04:54 -0600
   
   I suspect the best approach would be a hybrid value.  The upper half
   of the address should try to land in an unmapped zone, or into the zero
   page, or into some address space hole, ir into super high memory above
   the stack which is gauranteed unmapped.
  
  Don't forget strict alignment architectures, where it is beneficial
  to have the lowest bit set to trigger alignment traps.

 You also want the highest bit set. This makes sure signed indexes get
 interpreted as negative, which should more often detect problems than
 positive ones. There are not only pointers in the heap. 

A while ago someone hit a bug in my code that I hadn't (and wouldn't ever
have) triggered with MALLOC_OPTIONS=S because the Duh pattern always sets
the high bit, giving a negative integer which just so happened to be
harmless.  Back then I wished malloc would've used all random bits instead.

If you test run your code more than a couple times (as one obviously should,
before going into release), then I'd say randomness (which will likely
soon produce a bug-trigger value) is better than a fixed pattern that attempts
to be useful in some (common?) scenarios but will reliably fail to make a
difference in other cases.

I do see the value in having the option to use fixed patterns though, so
perhaps the hybrid value with a configurable mask as Theo proposed is a
good idea.



Re: libc malloc poison

2013-07-04 Thread Theo de Raadt
 On Wed, Jul 03, 2013 at 17:21, Theo de Raadt wrote:
  +   int pval = 0xd0d0caca;
  
  Can you explain the choice of this?
 
 I thought it sounded clever.

Ok, because there's more to the picture.

Inside the kernel, we tend to use 0xdeadbeef, or the DEADBEEF0/DEADBEEF1 values.

Reason for the latter is that we can try to put this value into memory
which the kernel does not manage.  Basically on half of our kernel
architectures if that is loaded into a pointer, it will hit unmanaged
kernel memory, exposing the bug.

Furthermore, the idea is that the 0xdeadbeef value should have a high
mix of set bits versus clear bits, for when it lands in a flag
variable.  A lot of bits are set; some paper I read years ago
discussed that on average flag bits set tends to traverse.

Furthermore, each of the sub-fields tend to be odd, which in other
use after cases regarding offsets/indexes can lead to more unaligned
access, once again triggering and exposing a bug.

Those are all theoretical ideas to try to expose the bugs as early
as possible.

I think the use of 0xd0d0caca in userland might not be the most
suitable choice, especially in a MI fashion.

  There are arguments to make this MI; other arguments to make it MD;
  and other arguments to introduce a bit of randomness.
  
  I'd like to know which arguments you have
 
 Since libc doesn't do free list integrity checking, I'm currently
 leaning towards a random value. (even with random, we could still
 check that all words of a free chunk are the same.)
 
 Somebody also noticed that we don't have separate values for allocated
 and freed memory. I suppose this makes debugging harder since you
 can't obviously identify freed memory? I lean towards prioritizing
 finding more bugs, which implies we need more variability, since any
 one value may allow a program to work where a different value would
 not.

I agree with this last sentence.

I suspect the best approach would be a hybrid value.  The upper half
of the address should try to land in an unmapped zone, or into the zero
page, or into some address space hole, ir into super high memory above
the stack which is gauranteed unmapped.

The 64-bit machines require a bit more consideration as well; we want
two 32-bit values to combine into a nice trashy address.

Should we use a kernel sysctl to recommend the high word?, and a mask
against random?  Of course, that could also be done using MD includes.

 



Re: libc malloc poison

2013-07-04 Thread Mark Kettenis
 From: Theo de Raadt dera...@cvs.openbsd.org
 Date: Thu, 04 Jul 2013 09:04:54 -0600
 
 I suspect the best approach would be a hybrid value.  The upper half
 of the address should try to land in an unmapped zone, or into the zero
 page, or into some address space hole, ir into super high memory above
 the stack which is gauranteed unmapped.

Don't forget strict alignment architectures, where it is beneficial
to have the lowest bit set to trigger alignment traps.



Re: libc malloc poison

2013-07-04 Thread Theo de Raadt
  From: Theo de Raadt dera...@cvs.openbsd.org
  Date: Thu, 04 Jul 2013 09:04:54 -0600
  
  I suspect the best approach would be a hybrid value.  The upper half
  of the address should try to land in an unmapped zone, or into the zero
  page, or into some address space hole, ir into super high memory above
  the stack which is gauranteed unmapped.
 
 Don't forget strict alignment architectures, where it is beneficial
 to have the lowest bit set to trigger alignment traps.

That's why I vaguely mentioned the idea of a sysctl or MD defines, which
would declare a fixed component, plus a mask on top of random.

That fixed component need not just be high bits, it can also cover the
lowest bit (or two) for instance.



libc malloc poison

2013-07-03 Thread Ted Unangst
change the junking to be word size. maybe later we can also change the
values to be random or something.

Index: stdlib/malloc.c
===
RCS file: /cvs/src/lib/libc/stdlib/malloc.c,v
retrieving revision 1.149
diff -u -p -r1.149 malloc.c
--- stdlib/malloc.c 22 Dec 2012 07:32:17 -  1.149
+++ stdlib/malloc.c 3 Jul 2013 19:52:27 -
@@ -80,14 +80,6 @@
 
 #define PAGEROUND(x)  (((x) + (MALLOC_PAGEMASK))  ~MALLOC_PAGEMASK)
 
-/*
- * What to use for Junk.  This is the byte value we use to fill with
- * when the 'J' option is enabled. Use SOME_JUNK right after alloc,
- * and SOME_FREEJUNK right before free.
- */
-#define SOME_JUNK  0xd0/* as in Duh :-) */
-#define SOME_FREEJUNK  0xdf
-
 #define MMAP(sz)   mmap(NULL, (size_t)(sz), PROT_READ | PROT_WRITE, \
 MAP_ANON | MAP_PRIVATE, -1, (off_t) 0)
 
@@ -237,6 +229,18 @@ hash(void *p)
 }
 
 static void
+poison(void *p, size_t len)
+{
+   int *ip = p;
+   size_t i;
+   int pval = 0xd0d0caca;
+
+   len = len / 4;
+   for (i = 0; i  len; i++)
+   ip[i] = pval;
+}
+
+static void
 wrterror(char *msg, void *p)
 {
char*q =  error: ;
@@ -419,7 +423,7 @@ map(struct dir_info *d, size_t sz, int z
memset(p, 0, sz);
else if (mopts.malloc_junk 
mopts.malloc_freeunmap)
-   memset(p, SOME_FREEJUNK, sz);
+   poison(p, sz);
return p;
} else if (r-size  psz)
big = r;
@@ -437,7 +441,7 @@ map(struct dir_info *d, size_t sz, int z
if (zero_fill)
memset(p, 0, sz);
else if (mopts.malloc_junk  mopts.malloc_freeunmap)
-   memset(p, SOME_FREEJUNK, sz);
+   poison(p, sz);
return p;
}
p = MMAP(sz);
@@ -975,7 +979,7 @@ malloc_bytes(struct dir_info *d, size_t 
k = bp-shift;
 
if (mopts.malloc_junk  bp-size  0)
-   memset((char *)bp-page + k, SOME_JUNK, bp-size);
+   poison((char *)bp-page + k, bp-size);
return ((char *)bp-page + k);
 }
 
@@ -1073,7 +1077,7 @@ omalloc(size_t sz, int zero_fill, void *
MALLOC_LEEWAY) {
/* fill whole allocation */
if (mopts.malloc_junk)
-   memset(p, SOME_JUNK, psz - mopts.malloc_guard);
+   poison(p, psz - mopts.malloc_guard);
/* shift towards the end */
p = ((char *)p) + ((MALLOC_PAGESIZE - MALLOC_LEEWAY -
(sz - mopts.malloc_guard))  ~(MALLOC_MINSIZE-1));
@@ -1083,11 +1087,10 @@ omalloc(size_t sz, int zero_fill, void *
} else {
if (mopts.malloc_junk) {
if (zero_fill)
-   memset((char *)p + sz - 
mopts.malloc_guard,
-   SOME_JUNK, psz - sz);
+   poison((char *)p + sz -
+   mopts.malloc_guard, psz - sz);
else
-   memset(p, SOME_JUNK,
-   psz - mopts.malloc_guard);
+   poison(p, psz - mopts.malloc_guard);
}
}
 
@@ -1202,8 +1205,7 @@ ofree(void *p)
malloc_guarded -= mopts.malloc_guard;
}
if (mopts.malloc_junk  !mopts.malloc_freeunmap)
-   memset(p, SOME_FREEJUNK,
-   PAGEROUND(sz) - mopts.malloc_guard);
+   poison(p, PAGEROUND(sz) - mopts.malloc_guard);
unmap(g_pool, p, PAGEROUND(sz));
delete(g_pool, r);
} else {
@@ -1211,7 +1213,7 @@ ofree(void *p)
int i;
 
if (mopts.malloc_junk  sz  0)
-   memset(p, SOME_FREEJUNK, sz);
+   poison(p, sz);
if (!mopts.malloc_freenow) {
i = getrnibble();
tmp = p;
@@ -1308,7 +1310,7 @@ orealloc(void *p, size_t newsz, void *f)
if (q == hint) {
malloc_used += needed;
if (mopts.malloc_junk)
-   memset(q, SOME_JUNK, needed);
+   poison(q, needed);
r-size = newsz;
STATS_SETF(r, f);
 

Re: libc malloc poison

2013-07-03 Thread Theo de Raadt
 +   int pval = 0xd0d0caca;

Can you explain the choice of this?

There are arguments to make this MI; other arguments to make it MD;
and other arguments to introduce a bit of randomness.

I'd like to know which arguments you have



Re: libc malloc poison

2013-07-03 Thread Ted Unangst
On Wed, Jul 03, 2013 at 17:21, Theo de Raadt wrote:
 +   int pval = 0xd0d0caca;
 
 Can you explain the choice of this?

I thought it sounded clever.

 There are arguments to make this MI; other arguments to make it MD;
 and other arguments to introduce a bit of randomness.
 
 I'd like to know which arguments you have

Since libc doesn't do free list integrity checking, I'm currently
leaning towards a random value. (even with random, we could still
check that all words of a free chunk are the same.)

Somebody also noticed that we don't have separate values for allocated
and freed memory. I suppose this makes debugging harder since you
can't obviously identify freed memory? I lean towards prioritizing
finding more bugs, which implies we need more variability, since any
one value may allow a program to work where a different value would
not.