On Wed, 26 Dec 2007, Vamsee Priya wrote:

> Hi
>
> I thought that the bug might be in my program. But everything works fine
> with 64 bit binary....are there any flags that need to be set/unset
> while compiling?
> And I reiterate that things worked fine some times with the 32 bit
> binary also....

I'd put it this way:
"sometimes" is not exactly an indicator that the program works [ well
under all circumstances ].

There are differences between 64bit and 32bit (data structure padding and
operand sizes) that might hide a problem in 64bit. For example, suppose
you're overwriting the end of a data structure by two bytes. In 32bit mode,
no padding might've been added and you'd corrupt another piece of data that
happens to be adjacent in memory. In 64bit mode, padding might add e.g.
four bytes at the end of that struct, and while you still overflow it,
there's no data-in-use being overwritten and the program stays stable.

Your program would still have a bug, but unless you create a 32bit version 
of it, it won't bite.
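
To make the padding point concrete, here is a minimal sketch. The struct
and the function name are made up for illustration and have nothing to do
with your program; the sizes in the comments assume the usual ILP32 and
LP64 data models.

```c
#include <stddef.h>

/* Hypothetical struct, not taken from the program under discussion. */
struct record {
    void *next;   /* 4 bytes when built 32bit, 8 bytes when built 64bit */
    int   count;  /* 4 bytes under the usual ILP32 and LP64 models */
};

/* Number of bytes the compiler appended after the last member. */
size_t record_tail_padding(void)
{
    return sizeof(struct record)
        - (offsetof(struct record, count) + sizeof(int));
}
```

Built 32bit this typically returns 0, so a two-byte overrun past "count"
lands in whatever object follows. Built 64bit it typically returns 4, and
the same overrun only scribbles on padding.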

Similar differences can occur if you have a 32bit program on, say, Windows 
and Solaris - a bug may strike only on one and not on the other, due to 
differences in data structure alignments and sizes, or due to differences 
in how e.g. malloc() works.

Heap overflows and double frees in particular are not fatal every time, and
not fatal with every malloc implementation. But they're bugs in your
program nonetheless. libumem has caught one such bug in the stacktrace
you've shown; whether it's an overwrite-past-end or a double free, you'll
see from "::umem_status" in mdb.
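
As an illustration of how an allocator can tell those two bugs apart at
free() time, here is a toy sketch. It is not libumem's implementation; all
names (guard_alloc, guard_free, the GUARD_* constants) are invented for
this example. It keeps a small header in front of each buffer and a
sentinel byte right behind it, and it deliberately never hands memory back
to malloc so that the header of a freed buffer stays readable.

```c
#include <stdint.h>
#include <stdlib.h>

#define GUARD_LIVE 0xfeedfaceU   /* header mark: buffer is allocated */
#define GUARD_FREE 0xdeadbeefU   /* header mark: buffer was already freed */
#define GUARD_BYTE 0xbbU         /* sentinel stored right after the user data */

enum guard_status { GUARD_OK, GUARD_OVERRUN, GUARD_DOUBLE_FREE };

struct guard_hdr {
    size_t   size;    /* number of bytes the caller asked for */
    uint32_t state;   /* GUARD_LIVE or GUARD_FREE */
};

/* Allocate size bytes, bracketed by a header and one sentinel byte. */
void *guard_alloc(size_t size)
{
    struct guard_hdr *hdr = malloc(sizeof *hdr + size + 1);
    unsigned char *user;

    if (hdr == NULL)
        return NULL;
    hdr->size = size;
    hdr->state = GUARD_LIVE;
    user = (unsigned char *)(hdr + 1);
    user[size] = GUARD_BYTE;
    return user;
}

/* Check a buffer on release and report what, if anything, went wrong. */
enum guard_status guard_free(void *ptr)
{
    struct guard_hdr *hdr = (struct guard_hdr *)ptr - 1;
    const unsigned char *user = ptr;

    if (hdr->state == GUARD_FREE)
        return GUARD_DOUBLE_FREE;
    hdr->state = GUARD_FREE;
    /* Toy simplification: the block is leaked on purpose, never free()d,
     * so a later second guard_free() can still read the header safely. */
    return user[hdr->size] == GUARD_BYTE ? GUARD_OK : GUARD_OVERRUN;
}
```

Writing even one byte past the requested size clobbers the sentinel, so
guard_free() reports GUARD_OVERRUN; calling guard_free() twice on the same
pointer reports GUARD_DOUBLE_FREE. A plain malloc without such checks may
notice neither, which is why the program can appear to work.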

Ergo: That your program works in one environment is no guarantee that it's
bug free. If it falls over with a different compiler, a different OS/rev,
different input or different libraries linked in, then the first strategy
should always be to suspect your own program.

The exception is the sort of program that has already been tested and
validated against literally gazillions of such combinations and shown to
work well, where you now find that such a change (e.g. by the OS vendor)
breaks it. The vendor of the "failing OS" will then be highly interested in
working with you :)

FrankH.

>
> Thanks
> Priya
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> Of [EMAIL PROTECTED]
> Sent: Wednesday, December 26, 2007 5:59 PM
> To: Vamsee Priya
> Cc: [email protected]
> Subject: Re: [osol-discuss] SIGSEGV in libc.so.1`_malloc_unlocked
> on Solaris x86 machine
>
>
>> Hi
>> Apart from the scenario I explained below, I also get a SIGABRT with the
>> following stack trace
>>
>> libc.so.1`_lwp_kill+0x15(1, 6)
>> libc.so.1`raise+0x1f(6)
>> libumem.so.1`umem_do_abort+0x25(9, fefb5000, 804691c, fef98b1c,
>> fefa3ae8, 80aa810)
>> libumem.so.1`umem_err_recoverable+0x46(fefa3ae8)
>> libumem.so.1`umem_error+0x453(1, 80aa810, 80c7c68)
>> libumem.so.1`umem_free+0xf6(80c7c68, 50)
>> libumem.so.1`process_free+0xfd(80c7c70, 1, 0, 80469a8, 805890b, 80c7c70)
>> libumem.so.1`free+0x14(80c7c70, 80c1b48, 0, 80c1b60)
>> meta_free+0xbf(80469f0, 80c1b88, 1, 80a0bd0, 0, 0)
>> active_out+0x44e(6, 8047e1f, 0, 65)
>> active+0xe0(2, 8047e1f, 0, 0, 8046b57)
>> main+0xd59(6, 8047d0c, 8047d28)
>> _start+0x80(6, 8047dd8, 8047df7, 8047dfa, 8047e0c, 8047e1f)
>>
>> Please suggest what can be done to overcome these issues.
>
> Fix the bug in your program :-)
>
> This is an abort() from libumem which indicates that it found an error
> in your program.
>
> So you need to take the core from SIGABRT and run that with "mdb".
>
> Casper
>
>
>
> _______________________________________________
> opensolaris-discuss mailing list
> [email protected]
>
