On Wed, 26 Dec 2007, Vamsee Priya wrote:
> Hi
>
> I thought that the bug might be in my program. But everything works fine
> with 64 bit binary....are there any flags that need to be set/unset
> while compiling?
> And I reiterate that things worked fine some times with the 32 bit
> binary also....

I'd put it this way: "sometimes" is not exactly an indicator that the
program works [ well under all circumstances ].

There are differences between 64bit and 32bit (data structure padding and
operand sizes) that might hide a problem in 64bit. For example, suppose
you're overwriting the end of a data structure by two bytes. In 32bit
mode, no padding might've been added and you'd corrupt another piece of
data that happens to be adjacent in memory. In 64bit mode, padding might
add e.g. four bytes at the end of that struct, and while you still
overflow it, no data in use is overwritten and the program stays stable.
Your program would still have a bug, but unless you create a 32bit
version of it, it won't bite.

Similar differences can occur if you have a 32bit program on, say,
Windows and Solaris - a bug may strike only on one and not on the other,
due to differences in data structure alignments and sizes, or due to
differences in how e.g. malloc() works. Heap overflows and double frees
in particular are not fatal every time, and not fatal with every malloc
implementation. But they're bugs in your program nonetheless. One such
bug is what libumem has caught in the stack trace you've shown; whether
it's an overwrite-past-end or a double free, "::umem_status" will tell
you.

Ergo: That your program works in one environment is no guarantee that
it's bug free. If it falls over with a different compiler, a different
OS/rev, different input, or different libraries linked in, then the first
strategy should always be to suspect your own program.
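To make the padding point concrete, here is a minimal sketch in C. The
struct node and the helpers tail_padding() and report_layout() are made
up for illustration; they are not taken from your program. The sketch
only measures how many unused bytes the compiler leaves after the last
member. It assumes the usual ILP32 and LP64 data models, where a pointer
is 4 or 8 bytes, and it deliberately does not perform the overflow
itself, since that would be undefined behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure: one pointer followed by a small char array. */
struct node {
    struct node *next;   /* 4 bytes on ILP32, 8 bytes on LP64 (assumed) */
    char         tag[4]; /* 4 bytes of payload */
};

/* Bytes between the end of the last member and the end of the struct. */
size_t tail_padding(void)
{
    size_t used = offsetof(struct node, tag) + sizeof(((struct node *)0)->tag);

    assert(used <= sizeof(struct node));
    return sizeof(struct node) - used;
}

/* Print the layout so that a 32bit and a 64bit build can be compared. */
void report_layout(void)
{
    printf("sizeof(struct node) = %zu\n", sizeof(struct node));
    printf("offset of tag       = %zu\n", offsetof(struct node, tag));
    printf("tail padding        = %zu\n", tail_padding());
}
```

Call report_layout() from main() and build the file once as 32bit and
once as 64bit (e.g. with -m32 and -m64, if your compiler accepts those
flags). On typical x86 ABIs the 32bit build should report no tail padding
and the 64bit build four bytes of it, so a two-byte overrun of tag would
land in unused space only in the 64bit binary.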

The exception is the sort of program that has already been tested and
validated against literally gazillions of such combinations, and shown to
work well - and that now breaks after such a change (e.g. by the OS
vendor). The vendor of the "failing OS" will then be highly interested in
working with you :)

FrankH.

>
> Thanks
> Priya
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> Of [EMAIL PROTECTED]
> Sent: Wednesday, December 26, 2007 5:59 PM
> To: Vamsee Priya
> Cc: [email protected]
> Subject: Re: [osol-discuss] SIGSEGV in libc.so.1`_malloc_unlocked
> on Solaris x86 machine
>
>
>> Hi
>> Apart from the scenario I explained below, I also get a SIGABRT with
>> the following stack trace
>>
>> libc.so.1`_lwp_kill+0x15(1, 6)
>> libc.so.1`raise+0x1f(6)
>> libumem.so.1`umem_do_abort+0x25(9, fefb5000, 804691c, fef98b1c, fefa3ae8, 80aa810)
>> libumem.so.1`umem_err_recoverable+0x46(fefa3ae8)
>> libumem.so.1`umem_error+0x453(1, 80aa810, 80c7c68)
>> libumem.so.1`umem_free+0xf6(80c7c68, 50)
>> libumem.so.1`process_free+0xfd(80c7c70, 1, 0, 80469a8, 805890b, 80c7c70)
>> libumem.so.1`free+0x14(80c7c70, 80c1b48, 0, 80c1b60)
>> meta_free+0xbf(80469f0, 80c1b88, 1, 80a0bd0, 0, 0)
>> active_out+0x44e(6, 8047e1f, 0, 65)
>> active+0xe0(2, 8047e1f, 0, 0, 8046b57)
>> main+0xd59(6, 8047d0c, 8047d28)
>> _start+0x80(6, 8047dd8, 8047df7, 8047dfa, 8047e0c, 8047e1f)
>>
>> Please suggest what can be done to overcome these issues.
>
> Fix the bug in your program :-)
>
> This is an abort() from libumem which indicates that it found an error
> in your program.
>
> So you need to take the core from SIGABRT and run that with "mdb".
>
> Casper
>
> _______________________________________________
> opensolaris-discuss mailing list
> [email protected]
