Re: malloc write after free error checking

2023-09-24 Thread Todd C . Miller
On Sun, 24 Sep 2023 09:58:53 +0200, Otto Moerbeek wrote:

> The wayland issue was found as well, using the same method.
> I'm working on programming the heuristic that is quite effective into
> malloc itself. It currently looks like this for the X case above:
>
> X(67417) in malloc(): write to free mem 0xdd0277d4270 [0..7]@16
> allocated at /usr/lib/libc.so.97.1 0x92f22
> (preceding chunk 0xdd0277d4260 allocated at /usr/X11R6/bin/X 0x177374 (now fr
> ee))
>
> $ addr2line -e /usr/lib/libc.so.97.1 0x92f22
> /usr/src/lib/libc/string/strdup.c:45
> $ addr2line -e /usr/X11R6/bin/X 0x177374
> /usr/xenocara/xserver/Xext/xvdisp.c:995
>
> The idea is: if a buffer overflow is detected, it is very wel possible
> that the overwrite happened as an out-of-bound write of the preceding
> chunk. It is also possible that it is a genuine write-after-free, in
> that case the developer should inspect the code that allocated and
> manipulated the first chunk. Malloc has no way to the the difference,
> that requires debugging by a human.
>
> This is all experimental and the final form may change but it
> certainly look promising, especially as regular malloc code did not
> change at all (the extra info needed is only collected if malloc flag
> D is set).

This is very cool.  Being able to tell where the (now-freed) chunk
was allocated is a huge help in debugging this kind of issue.

 - todd



Re: malloc write after free error checking

2023-09-24 Thread Otto Moerbeek
On Wed, Sep 20, 2023 at 08:08:23AM +0200, Otto Moerbeek wrote:

> On Sun, Sep 03, 2023 at 09:21:18AM +0200, Otto Moerbeek wrote:
> 
> > Hello,
> > 
> > I'm seeing some reports of "write after free" reported by malloc by
> > peolpe running current.  Malloc has become more strict since begining
> > of June. Let me explain:
> > 
> > Small allocations share a page. e.g. a 4k page will hold 8 512 byte
> > allocations.
> > 
> > When one such allocation is freed, it will be filled with a "free
> > junk" pattern, stay for a short while in the delayed free list (and be
> > subhject to write after free checks) and after that it will be marked
> > free and it might be allocated again. On a new allocation the write
> > after free check will also be done. That is the new part of my commit
> > in June.  Another change is that on the allocation of the page
> > containing chunks, all chunks wil be filled with a "free junk" (0xdf)
> > patttern. 
> > 
> > The change has a consequence: the time span of the "write after free"
> > checks is much longer, as it can take a long time for a chunk to get
> > used again.
> > 
> > I do believe the changes itself are correct and can help findiung
> > bugs in other code. 
> > 
> > You can also be set upon a wrong foot: if an out of bounds write on a
> > adjacent chunk happens and lands in (another) free chunk, upon
> > allocation of that free chunk it will be reported as a "write after
> > free" case. It might even be that an allocation that was never
> > allocated and never freed is still "write after free" because of
> > "close" out of bounds writes. 
> > 
> > I'm thinking how to change the error message to reflect that.
> > 
> > If these extra checks hinder progress, it is possible to swicth them
> > off using malloc option f (small F).
> 
> That should be j (small J).
> 
> > For general debugging I recommend using malloc option S, is has all
> > the extra strictness that can be enabled while still conforming to the
> > malloc API.
> > 
> > Please be aware of these things while hunting for bugs.
> 
> I'm happy to say two of the more complex ones are (being) fixed: one
> turned out to be a reference counting bug in firefox. See
> http://undeadly.org/cgi?action=article;sid=20230912094727
> 
> The other, a write after free that crashed the X server when running
> picard was diagnosed by me.  This one was a bit nasty, as it required
> instrumenting malloc to print some extra info to find the root cause. 
> 
> The bug is that the call in
> https://github.com/openbsd/xenocara/blob/master/xserver/Xext/xvdisp.c#L1002
> overwrites the first 4 bytes of the chunk next to the one allocated on
> line 995.
> 
> A workaround is to allocate 4 bytes extra, matthieu@ will be looking
> for a proper fix, as it requires knowledge of the X internals.
> 
> I am pondering if the instrumentation hack I used could be modified to
> be always available and print the function that did the original
> allocation when detecting a write after free.
> 
> The conclusion is that in these two cases, malloc helped in
> detecting/finding the bugs (and it was not a bug in malloc itself :-).
> Coincidentally this was the topioc of my talk during EuroBSD2023 last
> weekend, see http://www.openbsd.org/events.html
> 
> I hope to look at the reports I got for the Wayland port (which is
> WIP) as well, see
> https://github.com/openbsd/ports/blob/master/wayland/TODO-Wayland.md 

The wayland issue was found as well, using the same method.
I'm working on programming the heuristic that is quite effective into
malloc itself. It currently looks like this for the X case above:

X(67417) in malloc(): write to free mem 0xdd0277d4270 [0..7]@16
allocated at /usr/lib/libc.so.97.1 0x92f22
(preceding chunk 0xdd0277d4260 allocated at /usr/X11R6/bin/X 0x177374 (now 
free))

$ addr2line -e /usr/lib/libc.so.97.1 0x92f22
/usr/src/lib/libc/string/strdup.c:45
$ addr2line -e /usr/X11R6/bin/X 0x177374
/usr/xenocara/xserver/Xext/xvdisp.c:995

The idea is: if a buffer overflow is detected, it is very wel possible
that the overwrite happened as an out-of-bound write of the preceding
chunk. It is also possible that it is a genuine write-after-free, in
that case the developer should inspect the code that allocated and
manipulated the first chunk. Malloc has no way to the the difference,
that requires debugging by a human.

This is all experimental and the final form may change but it
certainly look promising, especially as regular malloc code did not
change at all (the extra info needed is only collected if malloc flag
D is set).

-Otto



Re: malloc write after free error checking

2023-09-20 Thread Otto Moerbeek
On Wed, Sep 20, 2023 at 03:02:27PM +0200, Matthieu Herrb wrote:

> On Wed, Sep 20, 2023 at 08:08:23AM +0200, Otto Moerbeek wrote:
> > 
> > The other, a write after free that crashed the X server when running
> > picard was diagnosed by me.  This one was a bit nasty, as it required
> > instrumenting malloc to print some extra info to find the root cause. 
> > 
> > The bug is that the call in
> > https://github.com/openbsd/xenocara/blob/master/xserver/Xext/xvdisp.c#L1002
> > overwrites the first 4 bytes of the chunk next to the one allocated on
> > line 995.
> > 
> > A workaround is to allocate 4 bytes extra, matthieu@ will be looking
> > for a proper fix, as it requires knowledge of the X internals.
> > 
> 
> Hi,
> 
> Thanks again for finding it. Can you try this patch ?
> 
> ===
> Fix overflow in glamor_xv_query_image_attributes for NV12 image format
> 
> This is a format with num_planes == 2, we have only 2 elements in
> offsets[] and pitches[]
> 
> Bug found by Otto Moerbeek on OpenBSD using his strict malloc checking.
> ---
>  glamor/glamor_xv.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/glamor/glamor_xv.c b/glamor/glamor_xv.c
> index a3d6b3bc3..e0e8e0ba9 100644
> --- a/glamor/glamor_xv.c
> +++ b/glamor/glamor_xv.c
> @@ -291,10 +291,10 @@ glamor_xv_query_image_attributes(int id,
>  pitches[0] = size;
>  size *= *h;
>  if (offsets)
> -offsets[1] = offsets[2] = size;
> +offsets[1] = size;
>  tmp = ALIGN(*w, 4);
>  if (pitches)
> -pitches[1] = pitches[2] = tmp;
> +pitches[1] = tmp;
>  tmp *= (*h >> 1);
>  size += tmp;
>  break;
> --- 
> 2.42.0

Yes, that works for me,

-Otto



Re: malloc write after free error checking

2023-09-20 Thread Matthieu Herrb
On Wed, Sep 20, 2023 at 08:08:23AM +0200, Otto Moerbeek wrote:
> 
> The other, a write after free that crashed the X server when running
> picard was diagnosed by me.  This one was a bit nasty, as it required
> instrumenting malloc to print some extra info to find the root cause. 
> 
> The bug is that the call in
> https://github.com/openbsd/xenocara/blob/master/xserver/Xext/xvdisp.c#L1002
> overwrites the first 4 bytes of the chunk next to the one allocated on
> line 995.
> 
> A workaround is to allocate 4 bytes extra, matthieu@ will be looking
> for a proper fix, as it requires knowledge of the X internals.
> 

Hi,

Thanks again for finding it. Can you try this patch ?

===
Fix overflow in glamor_xv_query_image_attributes for NV12 image format

This is a format with num_planes == 2, we have only 2 elements in
offsets[] and pitches[]

Bug found by Otto Moerbeek on OpenBSD using his strict malloc checking.
---
 glamor/glamor_xv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/glamor/glamor_xv.c b/glamor/glamor_xv.c
index a3d6b3bc3..e0e8e0ba9 100644
--- a/glamor/glamor_xv.c
+++ b/glamor/glamor_xv.c
@@ -291,10 +291,10 @@ glamor_xv_query_image_attributes(int id,
 pitches[0] = size;
 size *= *h;
 if (offsets)
-offsets[1] = offsets[2] = size;
+offsets[1] = size;
 tmp = ALIGN(*w, 4);
 if (pitches)
-pitches[1] = pitches[2] = tmp;
+pitches[1] = tmp;
 tmp *= (*h >> 1);
 size += tmp;
 break;
--- 
2.42.0


-- 
Matthieu Herrb



Re: malloc write after free error checking

2023-09-20 Thread Otto Moerbeek
On Sun, Sep 03, 2023 at 09:21:18AM +0200, Otto Moerbeek wrote:

> Hello,
> 
> I'm seeing some reports of "write after free" reported by malloc by
> peolpe running current.  Malloc has become more strict since begining
> of June. Let me explain:
> 
> Small allocations share a page. e.g. a 4k page will hold 8 512 byte
> allocations.
> 
> When one such allocation is freed, it will be filled with a "free
> junk" pattern, stay for a short while in the delayed free list (and be
> subhject to write after free checks) and after that it will be marked
> free and it might be allocated again. On a new allocation the write
> after free check will also be done. That is the new part of my commit
> in June.  Another change is that on the allocation of the page
> containing chunks, all chunks wil be filled with a "free junk" (0xdf)
> patttern. 
> 
> The change has a consequence: the time span of the "write after free"
> checks is much longer, as it can take a long time for a chunk to get
> used again.
> 
> I do believe the changes itself are correct and can help findiung
> bugs in other code. 
> 
> You can also be set upon a wrong foot: if an out of bounds write on a
> adjacent chunk happens and lands in (another) free chunk, upon
> allocation of that free chunk it will be reported as a "write after
> free" case. It might even be that an allocation that was never
> allocated and never freed is still "write after free" because of
> "close" out of bounds writes. 
> 
> I'm thinking how to change the error message to reflect that.
> 
> If these extra checks hinder progress, it is possible to swicth them
> off using malloc option f (small F).

That should be j (small J).

> For general debugging I recommend using malloc option S, is has all
> the extra strictness that can be enabled while still conforming to the
> malloc API.
> 
> Please be aware of these things while hunting for bugs.

I'm happy to say two of the more complex ones are (being) fixed: one
turned out to be a reference counting bug in firefox. See
http://undeadly.org/cgi?action=article;sid=20230912094727

The other, a write after free that crashed the X server when running
picard was diagnosed by me.  This one was a bit nasty, as it required
instrumenting malloc to print some extra info to find the root cause. 

The bug is that the call in
https://github.com/openbsd/xenocara/blob/master/xserver/Xext/xvdisp.c#L1002
overwrites the first 4 bytes of the chunk next to the one allocated on
line 995.

A workaround is to allocate 4 bytes extra, matthieu@ will be looking
for a proper fix, as it requires knowledge of the X internals.

I am pondering if the instrumentation hack I used could be modified to
be always available and print the function that did the original
allocation when detecting a write after free.

The conclusion is that in these two cases, malloc helped in
detecting/finding the bugs (and it was not a bug in malloc itself :-).
Coincidentally this was the topioc of my talk during EuroBSD2023 last
weekend, see http://www.openbsd.org/events.html

I hope to look at the reports I got for the Wayland port (which is
WIP) as well, see
https://github.com/openbsd/ports/blob/master/wayland/TODO-Wayland.md 

-Otto




Re: malloc write after free error checking

2023-09-03 Thread Otto Moerbeek
On Sun, Sep 03, 2023 at 09:30:49AM +0200, Otto Moerbeek wrote:

> > You can also be set upon a wrong foot: if an out of bounds write on a
> > adjacent chunk happens and lands in (another) free chunk, upon
> > allocation of that free chunk it will be reported as a "write after
> > free" case. It might even be that an allocation that was never
> > allocated and never freed is still "write after free" because of
> > "close" out of bounds writes. 
> > 
> > I'm thinking how to change the error message to reflect that.

I was thinking to change the message to "write to unallocated chunk",
trying to get the message trough that it also applies to chunks never
allocated but written to, not only to chunks that were freed and then
written to.

-Otto



Re: malloc write after free error checking

2023-09-03 Thread Otto Moerbeek
On Sun, Sep 03, 2023 at 09:21:18AM +0200, Otto Moerbeek wrote:

> Hello,
> 
> I'm seeing some reports of "write after free" reported by malloc by
> peolpe running current.  Malloc has become more strict since begining
> of June. Let me explain:
> 
> Small allocations share a page. e.g. a 4k page will hold 8 512 byte
> allocations.
> 
> When one such allocation is freed, it will be filled with a "free
> junk" pattern, stay for a short while in the delayed free list (and be
> subhject to write after free checks) and after that it will be marked
> free and it might be allocated again. On a new allocation the write
> after free check will also be done. That is the new part of my commit
> in June.  Another change is that on the allocation of the page
> containing chunks, all chunks wil be filled with a "free junk" (0xdf)
> patttern. 
> 
> The change has a consequence: the time span of the "write after free"
> checks is much longer, as it can take a long time for a chunk to get
> used again.
> 
> I do believe the changes itself are correct and can help findiung
> bugs in other code. 
> 
> You can also be set upon a wrong foot: if an out of bounds write on a
> adjacent chunk happens and lands in (another) free chunk, upon
> allocation of that free chunk it will be reported as a "write after
> free" case. It might even be that an allocation that was never
> allocated and never freed is still "write after free" because of
> "close" out of bounds writes. 
> 
> I'm thinking how to change the error message to reflect that.
> 
> If these extra checks hinder progress, it is possible to swicth them
> off using malloc option f (small F).

Oops, that should be j (small J).

> 
> For general debugging I recommend using malloc option S, is has all
> the extra strictness that can be enabled while still conforming to the
> malloc API.
> 
> Please be aware of these things while hunting for bugs.
> 
>   -Otto
> 



malloc write after free error checking

2023-09-03 Thread Otto Moerbeek
Hello,

I'm seeing some reports of "write after free" reported by malloc by
peolpe running current.  Malloc has become more strict since begining
of June. Let me explain:

Small allocations share a page. e.g. a 4k page will hold 8 512 byte
allocations.

When one such allocation is freed, it will be filled with a "free
junk" pattern, stay for a short while in the delayed free list (and be
subhject to write after free checks) and after that it will be marked
free and it might be allocated again. On a new allocation the write
after free check will also be done. That is the new part of my commit
in June.  Another change is that on the allocation of the page
containing chunks, all chunks wil be filled with a "free junk" (0xdf)
patttern. 

The change has a consequence: the time span of the "write after free"
checks is much longer, as it can take a long time for a chunk to get
used again.

I do believe the changes itself are correct and can help findiung
bugs in other code. 

You can also be set upon a wrong foot: if an out of bounds write on a
adjacent chunk happens and lands in (another) free chunk, upon
allocation of that free chunk it will be reported as a "write after
free" case. It might even be that an allocation that was never
allocated and never freed is still "write after free" because of
"close" out of bounds writes. 

I'm thinking how to change the error message to reflect that.

If these extra checks hinder progress, it is possible to swicth them
off using malloc option f (small F).

For general debugging I recommend using malloc option S, is has all
the extra strictness that can be enabled while still conforming to the
malloc API.

Please be aware of these things while hunting for bugs.

-Otto