Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-15 Thread Otto Moerbeek
On Thu, Apr 13, 2023 at 08:22:45PM +0200, Otto Moerbeek wrote:

> On Tue, Apr 11, 2023 at 05:50:43PM +0200, Otto Moerbeek wrote:
> 
> > On Sun, Apr 09, 2023 at 12:17:35PM +0200, Otto Moerbeek wrote:
> > 
> > > On Sun, Apr 09, 2023 at 10:08:25AM +0200, Claudio Jeker wrote:
> > > 
> > > > On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> > > > > On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> > > > > 
> > > > > > On Sun, Apr 09, 2023 at 07:53:31AM +0200, Sebastien Marie wrote:
> > > > > > 
> > > > > > > On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > This is work in progress. I have to think if the flags to 
> > > > > > > > > kdump I'm
> > > > > > > > > introducing should be two or a single one.
> > > > > > > > > 
> > > > > > > > > Currently, malloc.c can be compiled with MALLOC_STATS 
> > > > > > > > > defined. If run
> > > > > > > > > with option D it dumps its state to a malloc.out file at 
> > > > > > > > > exit. This
> > > > > > > > > state can be used to find leaks amongst other things.
> > > > > > > > > 
> > > > > > > > > This is not ideal for pledged processes, as they often have 
> > > > > > > > > no way to
> > > > > > > > > write files.
> > > > > > > > > 
> > > > > > > > > This changes malloc to use utrace(2) for that.
> > > > > > > > > 
> > > > > > > > > As kdump has no nice way to show those lines without all 
> > > > > > > > > extras it
> > > > > > > > > normally shows, so add two options to it to just show the 
> > > > > > > > > lines.
> > > > > > > > > 
> > > > > > > > > To use, compile and install libc with MALLOC_STATS defined.
> > > > > > > > > 
> > > > > > > > > Run :
> > > > > > > > > 
> > > > > > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > > > > > ...
> > > > > > > > > $ kdump -hu
> > > > > > > > > 
> > > > > > > > > Feedback appreciated.
> > > > > > > 
> > > > > > > I can't really comment on malloc(3) stuff, but I agree that 
> > > > > > > utrace(2) is a good 
> > > > > > > way to get information outside a pledged process.
> > > > > > > 
> > > > > > > I tend to think it is safe to use it, as the pledged process need 
> > > > > > > cooperation 
> > > > > > > from outside to exfiltrate informations (a process with 
> > > > > > > permission to call 
> > > > > > > ktrace(2) on this pid).
> > > > > > > 
> > > > > > > Please note it is a somehow generic problem: at least profiled 
> > > > > > > processes would 
> > > > > > > also get advantage of using it.
> > > > > > > 
> > > > > > > 
> > > > > > > Regarding kdump options, I think that -u option should implies -h 
> > > > > > > (no header).
> > > > > > > 
> > > > > > > Does it would make sens to considere a process using utrace(2) 
> > > > > > > with several 
> > > > > > > interleaved records for different sources ? A process with 
> > > > > > > MALLOC_OPTIONS=D and 
> > > > > > > profiling enabled for example ? An (another) option on kdump to 
> > > > > > > filter on utrace 
> > > > > > > label would be useful in such case, or have -u mandate a label to 
> > > > > > > filter on.
> > > > > > > 
> > > > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > > > $ kdump -u mallocdumpline
> > > > > > > 
> > > > > > > and for profiling:
> > > > > > > 
> > > > > > > $ kdump -u profil > gmon.out
> > > > > > > $ gprof your_program gmon.out
> > > > > > > 
> > > > > > > Thanks.
> > > > > > 
> > > > > > Thanks! Your suggestions make a lot of sense. I'll rework the kdump
> > > > > > part to make it more flexable for different purposes.
> > > > > 
> > > > > Anothew aspect of safety is the information availble in the heap
> > > > > itself. I'm pretty sure the addresses of call sites into malloc are
> > > > > interesting to attackers. To prevent a program having access to those
> > > > > (even if they are stored inside the malloc meta data that is not
> > > > > directly accesible to a program), I'm making sure the recording only
> > > > > takes place if malloc option D is used.
> > > > > 
> > > > > This is included in a diff with the kdump changes and a few other
> > > > > things below. I'm also compiling with MALLOC_STATS if !SMALL.
> > > > > 
> > > > > usage is now:
> > > > > 
> > > > > $ MALLOC_OPTIONS=D ktrace -tu a.out 
> > > > > $ kdump -u malloc
> > > > > 
> > > > 
> > > > I would prefer if every utrace() call is a full line (in other words 
> > > > ulog
> > > > should be line buffered). It makes the regular kdump output more 
> > > > useable.
> > > > Right now you depend on kdump -u to put the lines back together.
> > > 
> > > Right. Done line buffering in the new diff below.
> > > 
> > > > Whenever I used utrace() I normally passed binary objects to the call 
> > > > so I
> > > > could enrich the ktrace with userland trace info.  So if kdump -u is 
> > > > used
> > > > for more then just mallocstats it should have a true binary mode. For
> > > > example gmon.out is a binary format so the above example by semarie@ 
> > > > 

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-13 Thread Otto Moerbeek
On Tue, Apr 11, 2023 at 05:50:43PM +0200, Otto Moerbeek wrote:

> On Sun, Apr 09, 2023 at 12:17:35PM +0200, Otto Moerbeek wrote:
> 
> > On Sun, Apr 09, 2023 at 10:08:25AM +0200, Claudio Jeker wrote:
> > 
> > > On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> > > > On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> > > > 
> > > > > On Sun, Apr 09, 2023 at 07:53:31AM +0200, Sebastien Marie wrote:
> > > > > 
> > > > > > On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > This is work in progress. I have to think if the flags to kdump 
> > > > > > > > I'm
> > > > > > > > introducing should be two or a single one.
> > > > > > > > 
> > > > > > > > Currently, malloc.c can be compiled with MALLOC_STATS defined. 
> > > > > > > > If run
> > > > > > > > with option D it dumps its state to a malloc.out file at exit. 
> > > > > > > > This
> > > > > > > > state can be used to find leaks amongst other things.
> > > > > > > > 
> > > > > > > > This is not ideal for pledged processes, as they often have no 
> > > > > > > > way to
> > > > > > > > write files.
> > > > > > > > 
> > > > > > > > This changes malloc to use utrace(2) for that.
> > > > > > > > 
> > > > > > > > As kdump has no nice way to show those lines without all extras 
> > > > > > > > it
> > > > > > > > normally shows, so add two options to it to just show the lines.
> > > > > > > > 
> > > > > > > > To use, compile and install libc with MALLOC_STATS defined.
> > > > > > > > 
> > > > > > > > Run :
> > > > > > > > 
> > > > > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > > > > ...
> > > > > > > > $ kdump -hu
> > > > > > > > 
> > > > > > > > Feedback appreciated.
> > > > > > 
> > > > > > I can't really comment on malloc(3) stuff, but I agree that 
> > > > > > utrace(2) is a good 
> > > > > > way to get information outside a pledged process.
> > > > > > 
> > > > > > I tend to think it is safe to use it, as the pledged process need 
> > > > > > cooperation 
> > > > > > from outside to exfiltrate informations (a process with permission 
> > > > > > to call 
> > > > > > ktrace(2) on this pid).
> > > > > > 
> > > > > > Please note it is a somehow generic problem: at least profiled 
> > > > > > processes would 
> > > > > > also get advantage of using it.
> > > > > > 
> > > > > > 
> > > > > > Regarding kdump options, I think that -u option should implies -h 
> > > > > > (no header).
> > > > > > 
> > > > > > Does it would make sens to considere a process using utrace(2) with 
> > > > > > several 
> > > > > > interleaved records for different sources ? A process with 
> > > > > > MALLOC_OPTIONS=D and 
> > > > > > profiling enabled for example ? An (another) option on kdump to 
> > > > > > filter on utrace 
> > > > > > label would be useful in such case, or have -u mandate a label to 
> > > > > > filter on.
> > > > > > 
> > > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > > $ kdump -u mallocdumpline
> > > > > > 
> > > > > > and for profiling:
> > > > > > 
> > > > > > $ kdump -u profil > gmon.out
> > > > > > $ gprof your_program gmon.out
> > > > > > 
> > > > > > Thanks.
> > > > > 
> > > > > Thanks! Your suggestions make a lot of sense. I'll rework the kdump
> > > > > part to make it more flexable for different purposes.
> > > > 
> > > > Anothew aspect of safety is the information availble in the heap
> > > > itself. I'm pretty sure the addresses of call sites into malloc are
> > > > interesting to attackers. To prevent a program having access to those
> > > > (even if they are stored inside the malloc meta data that is not
> > > > directly accesible to a program), I'm making sure the recording only
> > > > takes place if malloc option D is used.
> > > > 
> > > > This is included in a diff with the kdump changes and a few other
> > > > things below. I'm also compiling with MALLOC_STATS if !SMALL.
> > > > 
> > > > usage is now:
> > > > 
> > > > $ MALLOC_OPTIONS=D ktrace -tu a.out 
> > > > $ kdump -u malloc
> > > > 
> > > 
> > > I would prefer if every utrace() call is a full line (in other words ulog
> > > should be line buffered). It makes the regular kdump output more useable.
> > > Right now you depend on kdump -u to put the lines back together.
> > 
> > Right. Done line buffering in the new diff below.
> > 
> > > Whenever I used utrace() I normally passed binary objects to the call so I
> > > could enrich the ktrace with userland trace info.  So if kdump -u is used
> > > for more then just mallocstats it should have a true binary mode. For
> > > example gmon.out is a binary format so the above example by semarie@ would
> > > not work.  As usual this can be solved in tree once that problem is hit.
> > 
> > Right, we can do that when needed. 
> > 
> > This is approaching a state where I'm happy with it. So reviews, tests
> > appreciated.
> > 
> > I would like to have the possibility to record a caller deeper in the
> > stack (as many program use wrappers to 

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-11 Thread Otto Moerbeek
On Sun, Apr 09, 2023 at 12:17:35PM +0200, Otto Moerbeek wrote:

> On Sun, Apr 09, 2023 at 10:08:25AM +0200, Claudio Jeker wrote:
> 
> > On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> > > On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> > > 
> > > > On Sun, Apr 09, 2023 at 07:53:31AM +0200, Sebastien Marie wrote:
> > > > 
> > > > > On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > This is work in progress. I have to think if the flags to kdump 
> > > > > > > I'm
> > > > > > > introducing should be two or a single one.
> > > > > > > 
> > > > > > > Currently, malloc.c can be compiled with MALLOC_STATS defined. If 
> > > > > > > run
> > > > > > > with option D it dumps its state to a malloc.out file at exit. 
> > > > > > > This
> > > > > > > state can be used to find leaks amongst other things.
> > > > > > > 
> > > > > > > This is not ideal for pledged processes, as they often have no 
> > > > > > > way to
> > > > > > > write files.
> > > > > > > 
> > > > > > > This changes malloc to use utrace(2) for that.
> > > > > > > 
> > > > > > > As kdump has no nice way to show those lines without all extras it
> > > > > > > normally shows, so add two options to it to just show the lines.
> > > > > > > 
> > > > > > > To use, compile and install libc with MALLOC_STATS defined.
> > > > > > > 
> > > > > > > Run :
> > > > > > > 
> > > > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > > > ...
> > > > > > > $ kdump -hu
> > > > > > > 
> > > > > > > Feedback appreciated.
> > > > > 
> > > > > I can't really comment on malloc(3) stuff, but I agree that utrace(2) 
> > > > > is a good 
> > > > > way to get information outside a pledged process.
> > > > > 
> > > > > I tend to think it is safe to use it, as the pledged process need 
> > > > > cooperation 
> > > > > from outside to exfiltrate informations (a process with permission to 
> > > > > call 
> > > > > ktrace(2) on this pid).
> > > > > 
> > > > > Please note it is a somehow generic problem: at least profiled 
> > > > > processes would 
> > > > > also get advantage of using it.
> > > > > 
> > > > > 
> > > > > Regarding kdump options, I think that -u option should implies -h (no 
> > > > > header).
> > > > > 
> > > > > Does it would make sens to considere a process using utrace(2) with 
> > > > > several 
> > > > > interleaved records for different sources ? A process with 
> > > > > MALLOC_OPTIONS=D and 
> > > > > profiling enabled for example ? An (another) option on kdump to 
> > > > > filter on utrace 
> > > > > label would be useful in such case, or have -u mandate a label to 
> > > > > filter on.
> > > > > 
> > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > $ kdump -u mallocdumpline
> > > > > 
> > > > > and for profiling:
> > > > > 
> > > > > $ kdump -u profil > gmon.out
> > > > > $ gprof your_program gmon.out
> > > > > 
> > > > > Thanks.
> > > > 
> > > > Thanks! Your suggestions make a lot of sense. I'll rework the kdump
> > > > part to make it more flexable for different purposes.
> > > 
> > > Anothew aspect of safety is the information availble in the heap
> > > itself. I'm pretty sure the addresses of call sites into malloc are
> > > interesting to attackers. To prevent a program having access to those
> > > (even if they are stored inside the malloc meta data that is not
> > > directly accesible to a program), I'm making sure the recording only
> > > takes place if malloc option D is used.
> > > 
> > > This is included in a diff with the kdump changes and a few other
> > > things below. I'm also compiling with MALLOC_STATS if !SMALL.
> > > 
> > > usage is now:
> > > 
> > > $ MALLOC_OPTIONS=D ktrace -tu a.out 
> > > $ kdump -u malloc
> > > 
> > 
> > I would prefer if every utrace() call is a full line (in other words ulog
> > should be line buffered). It makes the regular kdump output more useable.
> > Right now you depend on kdump -u to put the lines back together.
> 
> Right. Done line buffering in the new diff below.
> 
> > Whenever I used utrace() I normally passed binary objects to the call so I
> > could enrich the ktrace with userland trace info.  So if kdump -u is used
> > for more then just mallocstats it should have a true binary mode. For
> > example gmon.out is a binary format so the above example by semarie@ would
> > not work.  As usual this can be solved in tree once that problem is hit.
> 
> Right, we can do that when needed. 
> 
> This is approaching a state where I'm happy with it. So reviews, tests
> appreciated.
> 
> I would like to have the possibility to record a caller deeper in the
> stack (as many program use wrappers to call malloc), but
> __builtin_return_address only works with a constant argument.  So some
> magic needed. Will return to that later.

New version. By default now only the leak info accumulated over all
pools is dumped. To get the full dump, use MALLOC_OPTIONS=DD


-Otto

Index: lib/libc/stdlib/malloc.3

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Sebastien Marie
On Sun, Apr 09, 2023 at 12:17:35PM +0200, Otto Moerbeek wrote:
> On Sun, Apr 09, 2023 at 10:08:25AM +0200, Claudio Jeker wrote:
> > On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> > 
> > I would prefer if every utrace() call is a full line (in other words ulog
> > should be line buffered). It makes the regular kdump output more useable.
> > Right now you depend on kdump -u to put the lines back together.
> 
> Right. Done line buffering in the new diff below.
> 
> > Whenever I used utrace() I normally passed binary objects to the call so I
> > could enrich the ktrace with userland trace info.  So if kdump -u is used
> > for more then just mallocstats it should have a true binary mode. For
> > example gmon.out is a binary format so the above example by semarie@ would
> > not work.  As usual this can be solved in tree once that problem is hit.
> 
> Right, we can do that when needed. 
> 
> This is approaching a state where I'm happy with it. So reviews, tests
> appreciated.
> 
> I would like to have the possibility to record a caller deeper in the
> stack (as many program use wrappers to call malloc), but
> __builtin_return_address only works with a constant argument.  So some
> magic needed. Will return to that later.
> 
>   -Otto

the kdump part is ok semarie@. maybe wait a bit a let's others to comment.

regarding the malloc part, it reads fine.

If I correctly followed the code, ulog() function is only called during 
malloc_exit() at atexit(3) time. so, the static variables should be fine in 
multi-threaded context (atexit(3) finalizer is using spinlock to be 
thread-safe).

Thanks.

> Index: lib/libc/stdlib/malloc.3
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.3,v
> retrieving revision 1.130
> diff -u -p -r1.130 malloc.3
> --- lib/libc/stdlib/malloc.3  1 Apr 2023 18:47:51 -   1.130
> +++ lib/libc/stdlib/malloc.3  9 Apr 2023 10:16:11 -
> @@ -284,10 +284,13 @@ If it has been corrupted, the process is
>  .It Cm D
>  .Dq Dump .
>  .Fn malloc
> -will dump statistics to the file
> -.Pa ./malloc.out ,
> -if it already exists,
> +will dump statistics using
> +.Xr utrace 2
>  at exit.
> +To record the dump:
> +.Dl $ MALLOC_OPTIONS=D ktrace -tu program ...
> +To view the dump:
> +.Dl $ kdump -u malloc ...
>  This option requires the library to have been compiled with -DMALLOC_STATS in
>  order to have any effect.
>  .It Cm F
> Index: lib/libc/stdlib/malloc.c
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
> retrieving revision 1.280
> diff -u -p -r1.280 malloc.c
> --- lib/libc/stdlib/malloc.c  5 Apr 2023 06:25:38 -   1.280
> +++ lib/libc/stdlib/malloc.c  9 Apr 2023 10:16:11 -
> @@ -1,6 +1,6 @@
>  /*   $OpenBSD: malloc.c,v 1.280 2023/04/05 06:25:38 otto Exp $   */
>  /*
> - * Copyright (c) 2008, 2010, 2011, 2016 Otto Moerbeek 
> + * Copyright (c) 2008, 2010, 2011, 2016, 2023 Otto Moerbeek 
>   * Copyright (c) 2012 Matthew Dempsky 
>   * Copyright (c) 2008 Damien Miller 
>   * Copyright (c) 2000 Poul-Henning Kamp 
> @@ -23,7 +23,9 @@
>   * can buy me a beer in return. Poul-Henning Kamp
>   */
>  
> -/* #define MALLOC_STATS */
> +#ifndef SMALL
> +#define MALLOC_STATS
> +#endif
>  
>  #include 
>  #include 
> @@ -39,8 +41,10 @@
>  #include 
>  
>  #ifdef MALLOC_STATS
> +#include 
>  #include 
> -#include 
> +#include 
> +#include 
>  #endif
>  
>  #include "thread_private.h"
> @@ -225,10 +229,14 @@ struct malloc_readonly {
>   size_t  malloc_guard;   /* use guard pages after allocations? */
>  #ifdef MALLOC_STATS
>   int malloc_stats;   /* dump statistics at end */
> +#define  DO_STATSmopts.malloc_stats
> +#else
> +#define  DO_STATS0
>  #endif
>   u_int32_t malloc_canary;/* Matched against ones in pool */
>  };
>  
> +
>  /* This object is mapped PROT_READ after initialisation to prevent tampering 
> */
>  static union {
>   struct malloc_readonly mopts;
> @@ -243,15 +251,11 @@ static __dead void wrterror(struct dir_i
>  __attribute__((__format__ (printf, 2, 3)));
>  
>  #ifdef MALLOC_STATS
> -void malloc_dump(int, int, struct dir_info *);
> +void malloc_dump(void);
>  PROTO_NORMAL(malloc_dump);
> -void malloc_gdump(int);
> -PROTO_NORMAL(malloc_gdump);
>  static void malloc_exit(void);
> -#define CALLER   __builtin_return_address(0)
> -#else
> -#define CALLER   NULL
>  #endif
> +#define CALLER   (DO_STATS ? __builtin_return_address(0) : NULL)
>  
>  /* low bits of r->p determine size: 0 means >= page size and r->size holding
>   * real size, otherwise low bits is the bucket + 1
> @@ -318,9 +322,9 @@ wrterror(struct dir_info *d, char *msg, 
>   dprintf(STDERR_FILENO, "\n");
>  
>  #ifdef MALLOC_STATS
> - if (mopts.malloc_stats)
> - malloc_gdump(STDERR_FILENO);
> -#endif /* MALLOC_STATS */
> + if (DO_STATS)
> +  

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Otto Moerbeek
On Sun, Apr 09, 2023 at 10:08:25AM +0200, Claudio Jeker wrote:

> On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> > On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> > 
> > > On Sun, Apr 09, 2023 at 07:53:31AM +0200, Sebastien Marie wrote:
> > > 
> > > > On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > This is work in progress. I have to think if the flags to kdump I'm
> > > > > > introducing should be two or a single one.
> > > > > > 
> > > > > > Currently, malloc.c can be compiled with MALLOC_STATS defined. If 
> > > > > > run
> > > > > > with option D it dumps its state to a malloc.out file at exit. This
> > > > > > state can be used to find leaks amongst other things.
> > > > > > 
> > > > > > This is not ideal for pledged processes, as they often have no way 
> > > > > > to
> > > > > > write files.
> > > > > > 
> > > > > > This changes malloc to use utrace(2) for that.
> > > > > > 
> > > > > > As kdump has no nice way to show those lines without all extras it
> > > > > > normally shows, so add two options to it to just show the lines.
> > > > > > 
> > > > > > To use, compile and install libc with MALLOC_STATS defined.
> > > > > > 
> > > > > > Run :
> > > > > > 
> > > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > > ...
> > > > > > $ kdump -hu
> > > > > > 
> > > > > > Feedback appreciated.
> > > > 
> > > > I can't really comment on malloc(3) stuff, but I agree that utrace(2) 
> > > > is a good 
> > > > way to get information outside a pledged process.
> > > > 
> > > > I tend to think it is safe to use it, as the pledged process need 
> > > > cooperation 
> > > > from outside to exfiltrate informations (a process with permission to 
> > > > call 
> > > > ktrace(2) on this pid).
> > > > 
> > > > Please note it is a somehow generic problem: at least profiled 
> > > > processes would 
> > > > also get advantage of using it.
> > > > 
> > > > 
> > > > Regarding kdump options, I think that -u option should implies -h (no 
> > > > header).
> > > > 
> > > > Does it would make sens to considere a process using utrace(2) with 
> > > > several 
> > > > interleaved records for different sources ? A process with 
> > > > MALLOC_OPTIONS=D and 
> > > > profiling enabled for example ? An (another) option on kdump to filter 
> > > > on utrace 
> > > > label would be useful in such case, or have -u mandate a label to 
> > > > filter on.
> > > > 
> > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > $ kdump -u mallocdumpline
> > > > 
> > > > and for profiling:
> > > > 
> > > > $ kdump -u profil > gmon.out
> > > > $ gprof your_program gmon.out
> > > > 
> > > > Thanks.
> > > 
> > > Thanks! Your suggestions make a lot of sense. I'll rework the kdump
> > > part to make it more flexable for different purposes.
> > 
> > Anothew aspect of safety is the information availble in the heap
> > itself. I'm pretty sure the addresses of call sites into malloc are
> > interesting to attackers. To prevent a program having access to those
> > (even if they are stored inside the malloc meta data that is not
> > directly accesible to a program), I'm making sure the recording only
> > takes place if malloc option D is used.
> > 
> > This is included in a diff with the kdump changes and a few other
> > things below. I'm also compiling with MALLOC_STATS if !SMALL.
> > 
> > usage is now:
> > 
> > $ MALLOC_OPTIONS=D ktrace -tu a.out 
> > $ kdump -u malloc
> > 
> 
> I would prefer if every utrace() call is a full line (in other words ulog
> should be line buffered). It makes the regular kdump output more useable.
> Right now you depend on kdump -u to put the lines back together.

Right. Done line buffering in the new diff below.

> Whenever I used utrace() I normally passed binary objects to the call so I
> could enrich the ktrace with userland trace info.  So if kdump -u is used
> for more then just mallocstats it should have a true binary mode. For
> example gmon.out is a binary format so the above example by semarie@ would
> not work.  As usual this can be solved in tree once that problem is hit.

Right, we can do that when needed. 

This is approaching a state where I'm happy with it. So reviews, tests
appreciated.

I would like to have the possibility to record a caller deeper in the
stack (as many program use wrappers to call malloc), but
__builtin_return_address only works with a constant argument.  So some
magic needed. Will return to that later.

-Otto


Index: lib/libc/stdlib/malloc.3
===
RCS file: /home/cvs/src/lib/libc/stdlib/malloc.3,v
retrieving revision 1.130
diff -u -p -r1.130 malloc.3
--- lib/libc/stdlib/malloc.31 Apr 2023 18:47:51 -   1.130
+++ lib/libc/stdlib/malloc.39 Apr 2023 10:16:11 -
@@ -284,10 +284,13 @@ If it has been corrupted, the process is
 .It Cm D
 .Dq Dump .
 .Fn malloc
-will dump statistics to the file
-.Pa ./malloc.out ,

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Sebastien Marie
On Sun, Apr 09, 2023 at 10:08:25AM +0200, Claudio Jeker wrote:
> On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> > On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> 
> I would prefer if every utrace() call is a full line (in other words ulog
> should be line buffered). It makes the regular kdump output more useable.
> Right now you depend on kdump -u to put the lines back together.
> 
> Whenever I used utrace() I normally passed binary objects to the call so I
> could enrich the ktrace with userland trace info.  So if kdump -u is used
> for more then just mallocstats it should have a true binary mode. For
> example gmon.out is a binary format so the above example by semarie@ would
> not work.  As usual this can be solved in tree once that problem is hit.

Yes, I saw my example wasn't right for gmon.out. but it could be workarounded 
easily with unvis(1) as the vis(3) options used preserved the reversibility.

$ kdump -u profil | unvis > gmon.out

I agree it could be done in tree if a binary output in kdump(1) is prefered.

Regarding profiling, I wonder if it is currently broken. I am investigating, 
and 
will do a bug report once I have enough information.
-- 
Sebastien Marie



Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Claudio Jeker
On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> 
> > On Sun, Apr 09, 2023 at 07:53:31AM +0200, Sebastien Marie wrote:
> > 
> > > On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > > > > Hi,
> > > > > 
> > > > > This is work in progress. I have to think if the flags to kdump I'm
> > > > > introducing should be two or a single one.
> > > > > 
> > > > > Currently, malloc.c can be compiled with MALLOC_STATS defined. If run
> > > > > with option D it dumps its state to a malloc.out file at exit. This
> > > > > state can be used to find leaks amongst other things.
> > > > > 
> > > > > This is not ideal for pledged processes, as they often have no way to
> > > > > write files.
> > > > > 
> > > > > This changes malloc to use utrace(2) for that.
> > > > > 
> > > > > As kdump has no nice way to show those lines without all extras it
> > > > > normally shows, so add two options to it to just show the lines.
> > > > > 
> > > > > To use, compile and install libc with MALLOC_STATS defined.
> > > > > 
> > > > > Run :
> > > > > 
> > > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > > ...
> > > > > $ kdump -hu
> > > > > 
> > > > > Feedback appreciated.
> > > 
> > > I can't really comment on malloc(3) stuff, but I agree that utrace(2) is 
> > > a good 
> > > way to get information outside a pledged process.
> > > 
> > > I tend to think it is safe to use it, as the pledged process need 
> > > cooperation 
> > > from outside to exfiltrate informations (a process with permission to 
> > > call 
> > > ktrace(2) on this pid).
> > > 
> > > Please note it is a somehow generic problem: at least profiled processes 
> > > would 
> > > also get advantage of using it.
> > > 
> > > 
> > > Regarding kdump options, I think that -u option should implies -h (no 
> > > header).
> > > 
> > > Does it would make sens to considere a process using utrace(2) with 
> > > several 
> > > interleaved records for different sources ? A process with 
> > > MALLOC_OPTIONS=D and 
> > > profiling enabled for example ? An (another) option on kdump to filter on 
> > > utrace 
> > > label would be useful in such case, or have -u mandate a label to filter 
> > > on.
> > > 
> > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > $ kdump -u mallocdumpline
> > > 
> > > and for profiling:
> > > 
> > > $ kdump -u profil > gmon.out
> > > $ gprof your_program gmon.out
> > > 
> > > Thanks.
> > 
> > Thanks! Your suggestions make a lot of sense. I'll rework the kdump
> > part to make it more flexable for different purposes.
> 
> Anothew aspect of safety is the information availble in the heap
> itself. I'm pretty sure the addresses of call sites into malloc are
> interesting to attackers. To prevent a program having access to those
> (even if they are stored inside the malloc meta data that is not
> directly accesible to a program), I'm making sure the recording only
> takes place if malloc option D is used.
> 
> This is included in a diff with the kdump changes and a few other
> things below. I'm also compiling with MALLOC_STATS if !SMALL.
> 
> usage is now:
> 
> $ MALLOC_OPTIONS=D ktrace -tu a.out 
> $ kdump -u malloc
> 

I would prefer if every utrace() call is a full line (in other words ulog
should be line buffered). It makes the regular kdump output more useable.
Right now you depend on kdump -u to put the lines back together.

Whenever I used utrace() I normally passed binary objects to the call so I
could enrich the ktrace with userland trace info.  So if kdump -u is used
for more then just mallocstats it should have a true binary mode. For
example gmon.out is a binary format so the above example by semarie@ would
not work.  As usual this can be solved in tree once that problem is hit.

> Index: lib/libc/stdlib/malloc.3
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.3,v
> retrieving revision 1.130
> diff -u -p -r1.130 malloc.3
> --- lib/libc/stdlib/malloc.3  1 Apr 2023 18:47:51 -   1.130
> +++ lib/libc/stdlib/malloc.3  9 Apr 2023 07:08:20 -
> @@ -284,10 +284,13 @@ If it has been corrupted, the process is
>  .It Cm D
>  .Dq Dump .
>  .Fn malloc
> -will dump statistics to the file
> -.Pa ./malloc.out ,
> -if it already exists,
> +will dump statistics using
> +.Xr utrace 2
>  at exit.
> +To record the dump:
> +.Dl $ MALLOC_OPTIONS=D ktrace -tu program ...
> +To view the dump:
> +.Dl $ kdump -u malloc ...
>  This option requires the library to have been compiled with -DMALLOC_STATS in
>  order to have any effect.
>  .It Cm F
> Index: lib/libc/stdlib/malloc.c
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
> retrieving revision 1.280
> diff -u -p -r1.280 malloc.c
> --- lib/libc/stdlib/malloc.c  5 Apr 2023 06:25:38 -   1.280
> +++ lib/libc/stdlib/malloc.c  9 Apr 2023 07:08:20 -
> @@ -1,6 +1,6 @@
>  

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Sebastien Marie
On Sun, Apr 09, 2023 at 09:15:12AM +0200, Otto Moerbeek wrote:
> On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:
> 
> > 
> > Thanks! Your suggestions make a lot of sense. I'll rework the kdump
> > part to make it more flexable for different purposes.
> 
> Anothew aspect of safety is the information availble in the heap
> itself. I'm pretty sure the addresses of call sites into malloc are
> interesting to attackers. To prevent a program having access to those
> (even if they are stored inside the malloc meta data that is not
> directly accesible to a program), I'm making sure the recording only
> takes place if malloc option D is used.
> 
> This is included in a diff with the kdump changes and a few other
> things below. I'm also compiling with MALLOC_STATS if !SMALL.
> 
> usage is now:
> 
> $ MALLOC_OPTIONS=D ktrace -tu a.out 
> $ kdump -u malloc
> 
> -Otto

few comments regarding kdump below. I will try to look at malloc part too.

 
> Index: usr.bin/kdump/kdump.1
> ===
> RCS file: /home/cvs/src/usr.bin/kdump/kdump.1,v
> retrieving revision 1.35
> diff -u -p -r1.35 kdump.1
> --- usr.bin/kdump/kdump.1 30 Jul 2022 07:19:30 -  1.35
> +++ usr.bin/kdump/kdump.1 9 Apr 2023 07:08:20 -
> @@ -42,6 +42,7 @@
>  .Op Fl m Ar maxdata
>  .Op Fl p Ar pid
>  .Op Fl t Ar trstr
> +.Op Fl u Ar word

I would use the term 'label' (instead of 'word'), as it is the way it is called 
inside utrace(2) man page.

>  .Sh DESCRIPTION
>  .Nm
>  displays the kernel trace files produced with
> @@ -106,6 +107,18 @@ See the
>  option of
>  .Xr ktrace 1
>  for the meaning of the letters.
> +.It Fl u Ar word
> +Display
> +.Xr utrace 2
> +records having
> +.XR utrace 2
> +header
> +.Ar word
> +as strings with
> +.Xr vis 3
> +escaping, without
> +.Xr ktrace 2
> +header information.
>  .It Fl X
>  Display I/O data with hexadecimal data and printable ASCII characters
>  side by side.
> Index: usr.bin/kdump/kdump.c
> ===
> RCS file: /home/cvs/src/usr.bin/kdump/kdump.c,v
> retrieving revision 1.156
> diff -u -p -r1.156 kdump.c
> --- usr.bin/kdump/kdump.c 17 Feb 2023 18:01:26 -  1.156
> +++ usr.bin/kdump/kdump.c 9 Apr 2023 07:08:20 -
> @@ -88,6 +88,7 @@ int needtid, tail, basecol;
>  char *tracefile = DEF_TRACEFILE;
>  struct ktr_header ktr_header;
>  pid_t pid_opt = -1;
> +char* utracefilter;
>  
>  #define eqs(s1, s2)  (strcmp((s1), (s2)) == 0)
>  
> @@ -168,7 +169,7 @@ main(int argc, char *argv[])
>   screenwidth = 80;
>   }
>  
> - while ((ch = getopt(argc, argv, "f:dHlm:np:RTt:xX")) != -1)
> + while ((ch = getopt(argc, argv, "f:dHlm:np:RTt:u:xX")) != -1)
>   switch (ch) {
>   case 'f':
>   tracefile = optarg;
> @@ -212,6 +213,9 @@ main(int argc, char *argv[])
>   if (trpoints < 0)
>   errx(1, "unknown trace point in %s", optarg);
>   break;
> + case 'u':
> + utracefilter = optarg;

I think it would make sense to also force trpoints to only user data:

trpoints = KTRFAC_USER;

this way, if you trace others informations in the dump file, you only get the 
related utrace(2) entries.

-t and -u should be also mutually exclusive.

> + break;
>   case 'x':
>   iohex = 1;
>   break;
> @@ -246,7 +250,7 @@ main(int argc, char *argv[])
>   silent = 0;
>   if (pid_opt != -1 && pid_opt != ktr_header.ktr_pid)
>   silent = 1;
> - if (silent == 0 && trpoints & (1< + if (utracefilter == NULL && silent == 0 && trpoints & 
> (1<   dumpheader(_header);
>   ktrlen = ktr_header.ktr_len;
>   if (ktrlen > size) {
> @@ -1254,10 +1258,18 @@ showbufc(int col, unsigned char *dp, siz
>  static void
>  showbuf(unsigned char *dp, size_t datalen)
>  {
> - int i, j;
> + size_t i, j;
>   int col = 0, bpl;
>   unsigned char c;
> + char visbuf[5];
>  
> + if (utracefilter != NULL) {
> + for (i = 0; i < datalen; i++) {
> + vis(visbuf, dp[i], VIS_SAFE | VIS_OCTAL, 0);
> + printf("%s", visbuf);

I don't know if it would make a difference or not, but it is possible to use 
strvis(3) as dp is fixed size (KTR_USER_MAXLEN). so visbuf would be 
KTR_USER_MAXLEN*4+1 in size.

but the way you did is fine too.

> + }
> + return;
> + }
>   if (iohex == 1) {
>   putchar('\t');
>   col = 8;
> @@ -1280,7 +1292,7 @@ showbuf(unsigned char *dp, size_t datale
>   if (bpl <= 0)
>   bpl = 1;
>   for (i = 0; i < datalen; i += bpl) {
> - 

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Otto Moerbeek
On Sun, Apr 09, 2023 at 08:20:43AM +0200, Otto Moerbeek wrote:

> On Sun, Apr 09, 2023 at 07:53:31AM +0200, Sebastien Marie wrote:
> 
> > On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > > > Hi,
> > > > 
> > > > This is work in progress. I have to think if the flags to kdump I'm
> > > > introducing should be two or a single one.
> > > > 
> > > > Currently, malloc.c can be compiled with MALLOC_STATS defined. If run
> > > > with option D it dumps its state to a malloc.out file at exit. This
> > > > state can be used to find leaks amongst other things.
> > > > 
> > > > This is not ideal for pledged processes, as they often have no way to
> > > > write files.
> > > > 
> > > > This changes malloc to use utrace(2) for that.
> > > > 
> > > > As kdump has no nice way to show those lines without all extras it
> > > > normally shows, so add two options to it to just show the lines.
> > > > 
> > > > To use, compile and install libc with MALLOC_STATS defined.
> > > > 
> > > > Run :
> > > > 
> > > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > > ...
> > > > $ kdump -hu
> > > > 
> > > > Feedback appreciated.
> > 
> > I can't really comment on malloc(3) stuff, but I agree that utrace(2) is a 
> > good 
> > way to get information outside a pledged process.
> > 
> > I tend to think it is safe to use it, as the pledged process need 
> > cooperation 
> > from outside to exfiltrate informations (a process with permission to call 
> > ktrace(2) on this pid).
> > 
> > Please note it is a somehow generic problem: at least profiled processes 
> > would 
> > also get advantage of using it.
> > 
> > 
> > Regarding kdump options, I think that -u option should implies -h (no 
> > header).
> > 
> > Does it would make sens to considere a process using utrace(2) with several 
> > interleaved records for different sources ? A process with MALLOC_OPTIONS=D 
> > and 
> > profiling enabled for example ? An (another) option on kdump to filter on 
> > utrace 
> > label would be useful in such case, or have -u mandate a label to filter on.
> > 
> > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > $ kdump -u mallocdumpline
> > 
> > and for profiling:
> > 
> > $ kdump -u profil > gmon.out
> > $ gprof your_program gmon.out
> > 
> > Thanks.
> 
> Thanks! Your suggestions make a lot of sense. I'll rework the kdump
> part to make it more flexable for different purposes.

Anothew aspect of safety is the information availble in the heap
itself. I'm pretty sure the addresses of call sites into malloc are
interesting to attackers. To prevent a program having access to those
(even if they are stored inside the malloc meta data that is not
directly accesible to a program), I'm making sure the recording only
takes place if malloc option D is used.

This is included in a diff with the kdump changes and a few other
things below. I'm also compiling with MALLOC_STATS if !SMALL.

usage is now:

$ MALLOC_OPTIONS=D ktrace -tu a.out 
$ kdump -u malloc

-Otto


Index: lib/libc/stdlib/malloc.3
===
RCS file: /home/cvs/src/lib/libc/stdlib/malloc.3,v
retrieving revision 1.130
diff -u -p -r1.130 malloc.3
--- lib/libc/stdlib/malloc.31 Apr 2023 18:47:51 -   1.130
+++ lib/libc/stdlib/malloc.39 Apr 2023 07:08:20 -
@@ -284,10 +284,13 @@ If it has been corrupted, the process is
 .It Cm D
 .Dq Dump .
 .Fn malloc
-will dump statistics to the file
-.Pa ./malloc.out ,
-if it already exists,
+will dump statistics using
+.Xr utrace 2
 at exit.
+To record the dump:
+.Dl $ MALLOC_OPTIONS=D ktrace -tu program ...
+To view the dump:
+.Dl $ kdump -u malloc ...
 This option requires the library to have been compiled with -DMALLOC_STATS in
 order to have any effect.
 .It Cm F
Index: lib/libc/stdlib/malloc.c
===
RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
retrieving revision 1.280
diff -u -p -r1.280 malloc.c
--- lib/libc/stdlib/malloc.c5 Apr 2023 06:25:38 -   1.280
+++ lib/libc/stdlib/malloc.c9 Apr 2023 07:08:20 -
@@ -1,6 +1,6 @@
 /* $OpenBSD: malloc.c,v 1.280 2023/04/05 06:25:38 otto Exp $   */
 /*
- * Copyright (c) 2008, 2010, 2011, 2016 Otto Moerbeek 
+ * Copyright (c) 2008, 2010, 2011, 2016, 2023 Otto Moerbeek 
  * Copyright (c) 2012 Matthew Dempsky 
  * Copyright (c) 2008 Damien Miller 
  * Copyright (c) 2000 Poul-Henning Kamp 
@@ -23,7 +23,9 @@
  * can buy me a beer in return. Poul-Henning Kamp
  */
 
-/* #define MALLOC_STATS */
+#ifndef SMALL
+#define MALLOC_STATS
+#endif
 
 #include 
 #include 
@@ -39,8 +41,10 @@
 #include 
 
 #ifdef MALLOC_STATS
+#include 
 #include 
-#include 
+#include 
+#include 
 #endif
 
 #include "thread_private.h"
@@ -225,10 +229,14 @@ struct malloc_readonly {
size_t  malloc_guard;   /* use guard pages after allocations? */
 #ifdef MALLOC_STATS
int malloc_stats;   /* dump statistics at end */
+#define DO_STATS  

Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-09 Thread Otto Moerbeek
On Sun, Apr 09, 2023 at 07:53:31AM +0200, Sebastien Marie wrote:

> On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > > Hi,
> > > 
> > > This is work in progress. I have to think if the flags to kdump I'm
> > > introducing should be two or a single one.
> > > 
> > > Currently, malloc.c can be compiled with MALLOC_STATS defined. If run
> > > with option D it dumps its state to a malloc.out file at exit. This
> > > state can be used to find leaks amongst other things.
> > > 
> > > This is not ideal for pledged processes, as they often have no way to
> > > write files.
> > > 
> > > This changes malloc to use utrace(2) for that.
> > > 
> > > As kdump has no nice way to show those lines without all extras it
> > > normally shows, so add two options to it to just show the lines.
> > > 
> > > To use, compile and install libc with MALLOC_STATS defined.
> > > 
> > > Run :
> > > 
> > > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > > ...
> > > $ kdump -hu
> > > 
> > > Feedback appreciated.
> 
> I can't really comment on malloc(3) stuff, but I agree that utrace(2) is a 
> good 
> way to get information outside a pledged process.
> 
> I tend to think it is safe to use it, as the pledged process need cooperation 
> from outside to exfiltrate informations (a process with permission to call 
> ktrace(2) on this pid).
> 
> Please note it is a somehow generic problem: at least profiled processes 
> would 
> also get advantage of using it.
> 
> 
> Regarding kdump options, I think that -u option should implies -h (no header).
> 
> Does it would make sens to considere a process using utrace(2) with several 
> interleaved records for different sources ? A process with MALLOC_OPTIONS=D 
> and 
> profiling enabled for example ? An (another) option on kdump to filter on 
> utrace 
> label would be useful in such case, or have -u mandate a label to filter on.
> 
> $ MALLOC_OPTIONS=D ktrace -tu your_program
> $ kdump -u mallocdumpline
> 
> and for profiling:
> 
> $ kdump -u profil > gmon.out
> $ gprof your_program gmon.out
> 
> Thanks.

Thanks! Your suggestions make a lot of sense. I'll rework the kdump
part to make it more flexable for different purposes.

-Otto

> -- 
> Sebastien Marie



Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-08 Thread Sebastien Marie
On Fri, Apr 07, 2023 at 09:52:52AM +0200, Otto Moerbeek wrote:
> > Hi,
> > 
> > This is work in progress. I have to think if the flags to kdump I'm
> > introducing should be two or a single one.
> > 
> > Currently, malloc.c can be compiled with MALLOC_STATS defined. If run
> > with option D it dumps its state to a malloc.out file at exit. This
> > state can be used to find leaks amongst other things.
> > 
> > This is not ideal for pledged processes, as they often have no way to
> > write files.
> > 
> > This changes malloc to use utrace(2) for that.
> > 
> > As kdump has no nice way to show those lines without all extras it
> > normally shows, so add two options to it to just show the lines.
> > 
> > To use, compile and install libc with MALLOC_STATS defined.
> > 
> > Run :
> > 
> > $ MALLOC_OPTIONS=D ktrace -tu your_program
> > ...
> > $ kdump -hu
> > 
> > Feedback appreciated.

I can't really comment on malloc(3) stuff, but I agree that utrace(2) is a good 
way to get information outside a pledged process.

I tend to think it is safe to use it, as the pledged process need cooperation 
from outside to exfiltrate informations (a process with permission to call 
ktrace(2) on this pid).

Please note it is a somehow generic problem: at least profiled processes would 
also get advantage of using it.


Regarding kdump options, I think that -u option should implies -h (no header).

Does it would make sens to considere a process using utrace(2) with several 
interleaved records for different sources ? A process with MALLOC_OPTIONS=D and 
profiling enabled for example ? An (another) option on kdump to filter on 
utrace 
label would be useful in such case, or have -u mandate a label to filter on.

$ MALLOC_OPTIONS=D ktrace -tu your_program
$ kdump -u mallocdumpline

and for profiling:

$ kdump -u profil > gmon.out
$ gprof your_program gmon.out

Thanks.
-- 
Sebastien Marie



Re: MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-07 Thread Otto Moerbeek
On Wed, Apr 05, 2023 at 10:54:19AM +0200, Otto Moerbeek wrote:

> Hi,
> 
> This is work in progress. I have to think if the flags to kdump I'm
> introducing should be two or a single one.
> 
> Currently, malloc.c can be compiled with MALLOC_STATS defined. If run
> with option D it dumps its state to a malloc.out file at exit. This
> state can be used to find leaks amongst other things.
> 
> This is not ideal for pledged processes, as they often have no way to
> write files.
> 
> This changes malloc to use utrace(2) for that.
> 
> As kdump has no nice way to show those lines without all extras it
> normally shows, so add two options to it to just show the lines.
> 
> To use, compile and install libc with MALLOC_STATS defined.
> 
> Run :
> 
> $ MALLOC_OPTIONS=D ktrace -tu your_program
> ...
> $ kdump -hu
> 
> Feedback appreciated.

Second iteration. Main difference: docs and the leak report is greatly
enhanced by adding info that can be fed to addr2line. Exmaple:

...
Leak report:
 f sum  #avg
 0x9fd0eb58abb   40960 10   4096 addr2line -e ./a.out 0x1abb
 0x9fd0eb58ae98192  1   8192 addr2line -e ./a.out 0x1ae9
 0x9ffa92d9daf   65536  1  65536 addr2line -e /usr/lib/libc.so.97.0 
0x87daf

$ addr2line -e /usr/lib/libc.so.97.0 0x87daf
/usr/src/lib/libc/stdio/makebuf.c:62

Note that only the top caller is reported. I once wrote code to record
more stack frames, but I do not think I want that in. More elaborate
memory allocation analysis is better done by external tools or
instrumentation wrappers.

But even them, the current report is very useful IMO. It might be even
usefull enough to compile it always #ifndef SMALL. The runtime
overhead (if malloc option D is not used) is small. I'll be thinking
about possible reasons not to compile it in by default.

-Otto

Index: lib/libc/stdlib/malloc.3
===
RCS file: /home/cvs/src/lib/libc/stdlib/malloc.3,v
retrieving revision 1.130
diff -u -p -r1.130 malloc.3
--- lib/libc/stdlib/malloc.31 Apr 2023 18:47:51 -   1.130
+++ lib/libc/stdlib/malloc.37 Apr 2023 07:43:22 -
@@ -284,10 +284,13 @@ If it has been corrupted, the process is
 .It Cm D
 .Dq Dump .
 .Fn malloc
-will dump statistics to the file
-.Pa ./malloc.out ,
-if it already exists,
+will dump statistics using
+.Xr utrace 2
 at exit.
+To record the dump:
+.Dl $ MALLOC_OPTIONS=D ktrace -tu program ...
+To view the dump:
+.Dl $ kdump -hu ...
 This option requires the library to have been compiled with -DMALLOC_STATS in
 order to have any effect.
 .It Cm F
Index: lib/libc/stdlib/malloc.c
===
RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
retrieving revision 1.280
diff -u -p -r1.280 malloc.c
--- lib/libc/stdlib/malloc.c5 Apr 2023 06:25:38 -   1.280
+++ lib/libc/stdlib/malloc.c7 Apr 2023 07:43:22 -
@@ -23,7 +23,7 @@
  * can buy me a beer in return. Poul-Henning Kamp
  */
 
-/* #define MALLOC_STATS */
+#define MALLOC_STATS
 
 #include 
 #include 
@@ -39,8 +39,10 @@
 #include 
 
 #ifdef MALLOC_STATS
+#include 
 #include 
-#include 
+#include 
+#include 
 #endif
 
 #include "thread_private.h"
@@ -243,10 +245,8 @@ static __dead void wrterror(struct dir_i
 __attribute__((__format__ (printf, 2, 3)));
 
 #ifdef MALLOC_STATS
-void malloc_dump(int, int, struct dir_info *);
+void malloc_dump(void);
 PROTO_NORMAL(malloc_dump);
-void malloc_gdump(int);
-PROTO_NORMAL(malloc_gdump);
 static void malloc_exit(void);
 #define CALLER __builtin_return_address(0)
 #else
@@ -319,7 +319,7 @@ wrterror(struct dir_info *d, char *msg, 
 
 #ifdef MALLOC_STATS
if (mopts.malloc_stats)
-   malloc_gdump(STDERR_FILENO);
+   malloc_dump();
 #endif /* MALLOC_STATS */
 
errno = saved_errno;
@@ -2225,6 +2225,21 @@ aligned_alloc(size_t alignment, size_t s
 
 #ifdef MALLOC_STATS
 
+static void
+ulog(const char *format, ...)
+{
+   va_list ap;
+   char buf[100];
+   int len;
+
+   va_start(ap, format);
+   len = vsnprintf(buf, sizeof(buf), format, ap);
+   if (len > (int)sizeof(buf))
+   len = sizeof(buf);
+   utrace("mallocdumpline", buf, len);
+   va_end(ap);
+}
+
 struct malloc_leak {
void *f;
size_t total_size;
@@ -2280,21 +2295,32 @@ putleakinfo(void *f, size_t sz, int cnt)
 static struct malloc_leak *malloc_leaks;
 
 static void
-dump_leaks(int fd)
+dump_leaks(void)
 {
struct leaknode *p;
unsigned int i = 0;
 
-   dprintf(fd, "Leak report\n");
-   dprintf(fd, " f sum  #avg\n");
+   ulog("Leak report:\n");
+   ulog(" f sum  #avg\n");
/* XXX only one page of summary */
if (malloc_leaks == NULL)
malloc_leaks = MMAP(MALLOC_PAGESIZE, 0);
if (malloc_leaks != MAP_FAILED)

MALLOC_STATS: dump internal state and leak info via utrace(2)

2023-04-05 Thread Otto Moerbeek
Hi,

This is work in progress. I have to think if the flags to kdump I'm
introducing should be two or a single one.

Currently, malloc.c can be compiled with MALLOC_STATS defined. If run
with option D it dumps its state to a malloc.out file at exit. This
state can be used to find leaks amongst other things.

This is not ideal for pledged processes, as they often have no way to
write files.

This changes malloc to use utrace(2) for that.

As kdump has no nice way to show those lines without all extras it
normally shows, so add two options to it to just show the lines.

To use, compile and install libc with MALLOC_STATS defined.

Run :

$ MALLOC_OPTIONS=D ktrace -tu your_program
...
$ kdump -hu

Feedback appreciated.

-Otto

Index: lib/libc/stdlib/malloc.c
===
RCS file: /cvs/src/lib/libc/stdlib/malloc.c,v
retrieving revision 1.280
diff -u -p -r1.280 malloc.c
--- lib/libc/stdlib/malloc.c5 Apr 2023 06:25:38 -   1.280
+++ lib/libc/stdlib/malloc.c5 Apr 2023 08:23:42 -
@@ -23,7 +23,7 @@
  * can buy me a beer in return. Poul-Henning Kamp
  */
 
-/* #define MALLOC_STATS */
+#define MALLOC_STATS
 
 #include 
 #include 
@@ -39,7 +39,9 @@
 #include 
 
 #ifdef MALLOC_STATS
+#include 
 #include 
+#include 
 #include 
 #endif
 
@@ -243,10 +245,8 @@ static __dead void wrterror(struct dir_i
 __attribute__((__format__ (printf, 2, 3)));
 
 #ifdef MALLOC_STATS
-void malloc_dump(int, int, struct dir_info *);
+void malloc_dump(void);
 PROTO_NORMAL(malloc_dump);
-void malloc_gdump(int);
-PROTO_NORMAL(malloc_gdump);
 static void malloc_exit(void);
 #define CALLER __builtin_return_address(0)
 #else
@@ -319,7 +319,7 @@ wrterror(struct dir_info *d, char *msg, 
 
 #ifdef MALLOC_STATS
if (mopts.malloc_stats)
-   malloc_gdump(STDERR_FILENO);
+   malloc_dump();
 #endif /* MALLOC_STATS */
 
errno = saved_errno;
@@ -2225,6 +2225,21 @@ aligned_alloc(size_t alignment, size_t s
 
 #ifdef MALLOC_STATS
 
+static void
+ulog(const char *format, ...)
+{
+   va_list ap;
+   char buf[100];
+   int len;
+
+   va_start(ap, format);
+   len = vsnprintf(buf, sizeof(buf), format, ap);
+   if (len > (int)sizeof(buf))
+   len = sizeof(buf);
+   utrace("mallocdumpline", buf, len);
+   va_end(ap);
+}
+
 struct malloc_leak {
void *f;
size_t total_size;
@@ -2280,20 +2295,20 @@ putleakinfo(void *f, size_t sz, int cnt)
 static struct malloc_leak *malloc_leaks;
 
 static void
-dump_leaks(int fd)
+dump_leaks(void)
 {
struct leaknode *p;
unsigned int i = 0;
 
-   dprintf(fd, "Leak report\n");
-   dprintf(fd, " f sum  #avg\n");
+   ulog("Leak report\n");
+   ulog(" f sum  #avg\n");
/* XXX only one page of summary */
if (malloc_leaks == NULL)
malloc_leaks = MMAP(MALLOC_PAGESIZE, 0);
if (malloc_leaks != MAP_FAILED)
memset(malloc_leaks, 0, MALLOC_PAGESIZE);
RBT_FOREACH(p, leaktree, ) {
-   dprintf(fd, "%18p %7zu %6u %6zu\n", p->d.f,
+   ulog("%18p %7zu %6u %6zu\n", p->d.f,
p->d.total_size, p->d.count, p->d.total_size / p->d.count);
if (malloc_leaks == MAP_FAILED ||
i >= MALLOC_PAGESIZE / sizeof(struct malloc_leak))
@@ -2306,10 +2321,10 @@ dump_leaks(int fd)
 }
 
 static void
-dump_chunk(int fd, struct chunk_info *p, void *f, int fromfreelist)
+dump_chunk(struct chunk_info *p, void *f, int fromfreelist)
 {
while (p != NULL) {
-   dprintf(fd, "chunk %18p %18p %4zu %d/%d\n",
+   ulog("chunk %18p %18p %4zu %d/%d\n",
p->page, ((p->bits[0] & 1) ? NULL : f),
B2SIZE(p->bucket), p->free, p->total);
if (!fromfreelist) {
@@ -2325,17 +2340,17 @@ dump_chunk(int fd, struct chunk_info *p,
}
p = LIST_NEXT(p, entries);
if (p != NULL)
-   dprintf(fd, "");
+   ulog("");
}
 }
 
 static void
-dump_free_chunk_info(int fd, struct dir_info *d)
+dump_free_chunk_info(struct dir_info *d)
 {
int i, j, count;
struct chunk_info *p;
 
-   dprintf(fd, "Free chunk structs:\n");
+   ulog("Free chunk structs:\n");
for (i = 0; i <= BUCKETS; i++) {
count = 0;
LIST_FOREACH(p, >chunk_info_list[i], entries)
@@ -2345,99 +2360,97 @@ dump_free_chunk_info(int fd, struct dir_
if (p == NULL && count == 0)
continue;
if (j == 0)
-   dprintf(fd, "%3d) %3d ", i, count);
+   ulog("%3d) %3d ", i, count);
else
-   dprintf(fd, " ");
+