Edward Pilatowicz wrote:
> On Wed, Feb 18, 2009 at 06:14:59AM +0100, Roland Mainz wrote:
> > Glenn Fowler wrote:
> > > On Fri, 13 Feb 2009 15:19:30 -0800 Edward Pilatowicz wrote:
> > > > >
> > > > > What does $ /usr/xpg4/bin/file /usr/bin/sleep /usr/bin/alias # say on
> > > > > the system where this fails ?
> > >
> > > well not mentioning libumem in the original message was quite an omission
> >
> > Erm... AFAIK libumem isn't the source of the problem. Solaris's libumem
> > is an alternative memory allocator which "overrides" the default
> > |libc::malloc()| and provides configurable debugging aids (in a similar
> > way as libast's internal memory corruption checks controlled via
> > VMDEBUG/VMCHECK/&co. - see
> > http://docs.sun.com/app/docs/doc/816-5168/umem-debug-3malloc?l=ja&a=view
> > for some documentation). AFAIK Edward was only using libumem to
> > track-down the source of the problem via libumem and the crash happens
> > with and without it.
>
> actually, i run everything with libumem all the time. the only time i
> don't pre-load libumem is when i know that application has memory
> corruption problems that will cause libumem to kill it. (and i usually
> file a bug on these issues before disabling libumem.) this means that i
> file a lot of memory corruption bugs against other programs.
>
> i don't think that you'll see this issue without libumem.

Mhhh... Ok...

> > > we were careful in the solaris build to add an _ast_ prefix
> > > to any libast function that might interfere with solaris libc
> > >
> > > ast provides its own malloc/free, and those calls are mapped
> > > to _ast_malloc/_ast_free in the ksh/ast code for opensolaris builds
> > > so that call to free() in the stack trace was not done directly by
> > > any ksh/ast code
> >
> > Right... but it seems something has trashed the heap managed by
> > |libc::malloc()| - either something is writing randomly into areas where
> > it shouldn't write to... or maybe we hit bug in Solaris.
> >
> > > is there a description on how libumem allocates/frees physical memory?
> >
> > Uhm... good question... looking at
> > http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libumem/common/
> > it seems to have at least support for |sbrk()| and |mmap()|.
>
> libumem is a drop in place replace for libc malloc()/free(). so if an
> application never uses libc malloc()/free(), then it should never use
> libumem malloc/free.

Erm... that's not correct. libc/libnsl/etc.-internal functions may call
|libc::malloc()| even if your own application doesn't contain a single
call to |malloc()| ...

> as you've noticed, it can be configured to use
> either sbrk() or mmap(). now, since libast has it's own malloc()/free()
> replacements, i would think that you need to somehow ensure that no-one
> (not even the linker) calls into libc malloc/free. otherwise you could
> end up with libc and libast allocating from the same heap... (which if
> you run with libumem, would be initialized to 0xdeadbeef.)

Uhm... how should this happen ? libast and libc use _different_ ELF
symbols for their matching malloc versions (e.g. |libc::malloc()| vs.
|libast::_ast_malloc()|) and can both happily co-exist in one and the
same application. AFAIK the problem we're seeing here is that something
corrupts the heap managed by |libc::malloc()| when ksh93 is doing a
|setlocale()| call.
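
To illustrate the point about separate symbols, here is a rough C sketch
(not libast code). It models a private allocator under its own names
living next to |libc::malloc()|. The names demo_ast_malloc(),
demo_ast_free(), demo_in_arena() and demo_allocators_coexist() are
hypothetical, and the static bump arena is only an assumption standing in
for libast's real allocator. strdup() is used as an example of a libc
function that obtains memory as if by malloc() without the caller ever
writing a malloc() call.

```c
#define _POSIX_C_SOURCE 200809L   /* for strdup() under strict ISO C modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical private heap: a bump allocator over a static arena.
 * It only models "own symbol names, own memory"; it is NOT how libast
 * implements _ast_malloc(). */
#define DEMO_ARENA_SIZE 4096
static _Alignas(max_align_t) unsigned char demo_arena[DEMO_ARENA_SIZE];
static size_t demo_arena_used;

/* Returns 1 if p points into the private arena, 0 otherwise. */
int demo_in_arena(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)demo_arena;
    return a >= lo && a < lo + DEMO_ARENA_SIZE;
}

/* Private allocator under its own symbol; never touches libc's heap. */
void *demo_ast_malloc(size_t n)
{
    size_t align = _Alignof(max_align_t);
    size_t need = (n + align - 1) / align * align;  /* round up */
    if (n == 0 || need < n || need > DEMO_ARENA_SIZE - demo_arena_used)
        return NULL;
    void *p = demo_arena + demo_arena_used;
    demo_arena_used += need;
    return p;
}

/* Matching free: a no-op for a bump allocator, but it rejects pointers
 * that belong to some other allocator (e.g. libc's). */
void demo_ast_free(void *p)
{
    assert(p == NULL || demo_in_arena(p));
}

/* Returns 1 if each allocator handed out memory from its own heap. */
int demo_allocators_coexist(void)
{
    char *mine = demo_ast_malloc(32);               /* private heap */
    char *theirs = strdup("allocated inside libc"); /* libc heap    */
    int ok = mine != NULL && theirs != NULL
          && demo_in_arena(mine) && !demo_in_arena(theirs);
    free(theirs);         /* libc memory goes back to libc's free()  */
    demo_ast_free(mine);  /* private memory goes back to its own free */
    return ok;
}
```

demo_allocators_coexist() returns 1 when the private block lies inside
the arena and the strdup() result does not. Handing the strdup() pointer
to demo_ast_free() instead would trip the assert, which is the kind of
cross-allocator mix-up that would damage a heap; with distinct symbols
and matching alloc/free pairs the two heaps never overlap.
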
But that's all theory for now since I can't reproduce the crash on my
side even after trying

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 3992797
 (;O/ \/ \O;)