Hi,

Thanks to Simon Knight and Matthew Gregan, who replied off-list with some pointers.
Matthew pointed out that the created threads were not being cleaned up
properly. I had to either detach each thread or call pthread_join on it.

With that change, the test program ran a lot better and had no
corruption-style errors. Matthew suggested that the corruption errors were
a result of too many active threads at once, which is indeed possible.

So, I guess that makes me one step closer to solving the original problem:

> I have been having some memory errors in a C++ program, which has been run
> on my quad cpu box here (FC1). It has been complaining about errors in
> malloc_consolidate.

Thanks for your time guys,

Derek.

=====================================================================

On Mon, 13 Feb 2006, Derek Smithies wrote:

> Hi,
>
> I have been having some memory errors in a C++ program, which has been run
> on my quad cpu box here (FC1). It has been complaining about errors in
> malloc_consolidate. Nasty stuff this, what's going on.
>
> So, I cut it down to the minimum, wrote a test application.
>
> compiled with:
>    gcc -g -o memtest memtest.c -lm -lc -lpthread
>
> debug version is important, it means the optimiser will not remove any
> code, and there will be debuggable core dumps on error.
>
> I ran it with the command line
>    ./memtest 0 1300
>
> Load averages of 150 on a quad cpu box were observed.
>
> the program will exit when you hit the return key.
>
> ======================================================
>
> My concern is that it is a memory allocation error within the glibc or
> similar place, and would love some verification of this for other distros.
> I wonder if any here are willing to try this program, and see if they can
> create memory allocation errors.
>
> The program has one loop that just spawns threads as fast as it can go,
> Each spawned thread does some memory allocations, frees the allocated
> memory, and exits.
>
> I wonder if someone here would like to try it on a multicpu box, and see
> if it generates a memory error. The single cpu boxes here did not "like"
> this program at all. They needed "power"ful intervention.
>
> You can set the environment variable MALLOC_CHECK_ to 2, and glibc will
> then do memory allocation/freeing testing for you, and abort (with core
> dump) on detecting memory errors.
>
> I would not run this program on a vital production server
>
> Thanks in advance for those who do test this program - it is a hard test.
> Specifically designed to hunt down memory allocation errors in threads on
> a multi cpu box.
>
>
> Cheers,
>
> Derek.
> --
> Derek Smithies Ph.D.                         Any fool can write code that
> IndraNet Technologies Ltd.                   a computer can understand.
> Email: [EMAIL PROTECTED]                     Good programmers write code
> ph +64 3 365 6485                            that humans can understand.
> Web: http://www.indranet-technologies.com/   Martin Fowler
>
--
Derek Smithies Ph.D.                         Any fool can write code that
IndraNet Technologies Ltd.                   a computer can understand.
Email: [EMAIL PROTECTED]                     Good programmers write code
ph +64 3 365 6485                            that humans can understand.
Web: http://www.indranet-technologies.com/   Martin Fowler
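
For anyone following the thread, here is a minimal sketch of the two cleanup
options mentioned at the top of this message: joining each thread, or creating
it detached. It is not the original memtest.c, which was not posted inline.
The names worker, spawn_joined and spawn_detached and the 4096-byte buffer are
assumptions made up for illustration. Only the pthread calls themselves are
standard POSIX.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker: allocate some memory, touch it, free it, exit.
 * This only mimics the behaviour described for the test program. */
static void *worker(void *arg)
{
    (void)arg;
    char *buf = malloc(4096);
    if (buf != NULL) {
        memset(buf, 0xAB, 4096);
        free(buf);
    }
    return NULL;
}

/* Option 1: keep the thread ids and pthread_join() every thread, so the
 * resources of each finished thread are reclaimed.
 * Returns the number of threads that were created and joined. */
int spawn_joined(int count)
{
    int created = 0, joined = 0;
    pthread_t *tids;

    if (count <= 0)
        return 0;
    tids = malloc((size_t)count * sizeof *tids);
    if (tids == NULL)
        return 0;

    for (created = 0; created < count; created++)
        if (pthread_create(&tids[created], NULL, worker, NULL) != 0)
            break;                      /* e.g. EAGAIN: too many threads */

    for (int i = 0; i < created; i++)
        if (pthread_join(tids[i], NULL) == 0)
            joined++;

    free(tids);
    return joined;
}

/* Option 2: create the threads detached, so the system reclaims each
 * thread's resources when it exits and no join is needed (or allowed).
 * Returns the number of threads that were started. */
int spawn_detached(int count)
{
    pthread_attr_t attr;
    int started = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) != 0) {
        pthread_attr_destroy(&attr);
        return 0;
    }

    for (int i = 0; i < count; i++) {
        pthread_t tid;
        if (pthread_create(&tid, &attr, worker, NULL) != 0)
            break;
        started++;
    }

    pthread_attr_destroy(&attr);
    return started;
}
```

A caller would invoke, say, spawn_joined(16) or spawn_detached(16) from main
and build with something like "gcc -g -pthread". Note that the detached
variant gives no way to wait for the workers, so a spawn loop using it can
still pile up many live threads at once. That matches Matthew's suggestion
about too many active threads.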
