Hi,

Thanks to Simon Knight and Matthew Gregan, who replied off-list with some
pointers.

Matthew pointed out that the created threads were not being cleaned up
properly. A joinable thread that is never joined keeps its resources
allocated after it exits, so I had to either detach each thread or call
pthread_join on it.
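
For anyone following along, here is a minimal sketch of the two cleanup
options. It is not the original memtest.c; the names alloc_worker,
spawn_and_join and spawn_detached are made up for illustration, and the
1024-byte buffer size is an arbitrary assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical worker: allocate a buffer, touch it, free it, and exit. */
void *alloc_worker(void *arg)
{
    char *buf = malloc(1024);
    (void)arg;
    if (buf != NULL) {
        memset(buf, 0, 1024);
        free(buf);
    }
    return NULL;
}

/* Option 1: joinable threads. Returns how many were created and joined. */
int spawn_and_join(int count)
{
    int i, ok = 0;
    assert(count >= 0);
    for (i = 0; i < count; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, alloc_worker, NULL) != 0)
            break;
        if (pthread_join(tid, NULL) == 0)   /* releases the thread's resources */
            ok++;
    }
    return ok;
}

/* Option 2: detached thread. Returns 0 on success, an error number otherwise. */
int spawn_detached(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, alloc_worker, NULL);
    if (rc != 0)
        return rc;
    return pthread_detach(tid);   /* resources are freed when the thread exits */
}
```

Joining is the simpler choice when the spawning loop can afford to wait;
detaching suits fire-and-forget threads, at the cost of not knowing when
they finish.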

With that change, the test program ran a lot better and showed no
corruption-style errors.

Matthew suggested that the corruption errors were a result of too many 
active threads at once, which is indeed possible.
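
One way to test that idea would be to cap how many threads are alive at
any moment. The sketch below is only an assumption about how that could
be done, not something from the original test program: it starts threads
in batches of BATCH (an arbitrary value) and joins each batch before
starting the next, recording the peak number of concurrently running
workers. All names here are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define BATCH 16   /* assumed cap on simultaneously active threads */

pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
int active_now;    /* workers currently running */
int peak_active;   /* highest value active_now has reached */

/* Hypothetical worker: track concurrency around a malloc/free pair. */
void *batch_worker(void *arg)
{
    char *buf;
    (void)arg;

    pthread_mutex_lock(&count_lock);
    if (++active_now > peak_active)
        peak_active = active_now;
    pthread_mutex_unlock(&count_lock);

    buf = malloc(4096);
    if (buf != NULL) {
        memset(buf, 0, 4096);
        free(buf);
    }

    pthread_mutex_lock(&count_lock);
    active_now--;
    pthread_mutex_unlock(&count_lock);
    return NULL;
}

/* Run `total` workers, at most BATCH at a time. Returns how many completed. */
int run_in_batches(int total)
{
    pthread_t tids[BATCH];
    int done = 0;

    assert(total >= 0);
    while (done < total) {
        int want = (total - done < BATCH) ? total - done : BATCH;
        int started = 0, i;

        for (i = 0; i < want; i++) {
            if (pthread_create(&tids[started], NULL, batch_worker, NULL) != 0)
                break;
            started++;
        }
        for (i = 0; i < started; i++)
            pthread_join(tids[i], NULL);   /* whole batch ends before the next starts */
        done += started;
        if (started < want)
            break;                         /* thread creation failed; stop early */
    }
    return done;
}
```

If the corruption errors disappear with a small BATCH and return as it
grows, that would support the "too many active threads" explanation.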

So, I guess that makes me one step closer to solving the original problem:
> I have been having some memory errors in a C++ program, which has been run
> on my quad cpu box here (FC1). It has been complaining about errors in 
> malloc_consolidate. 

Thanks for your time guys,

Derek.
=====================================================================

 On Mon, 13 Feb 2006, Derek Smithies wrote:

> Hi,
> 
> I have been having some memory errors in a C++ program, which has been run
> on my quad cpu box here (FC1). It has been complaining about errors in 
> malloc_consolidate. Nasty stuff this; what's going on?
> 
> So I cut it down to the minimum and wrote a test application.
> 
> compiled with:
>  gcc -g -o memtest memtest.c -lm -lc -lpthread
> 
> Building a debug version (-g, with no -O flag) is important: the optimiser
> will not remove any code, and core dumps will be debuggable on error.
> 
> I ran it with the command line
> ./memtest 0 1300
> 
> Load averages of 150 on a quad cpu box were observed.
> 
> The program will exit when you hit the return key.
> 
> ======================================================
> 
> My concern is that it is a memory allocation error within the glibc or 
> similar place, and would love some verification of this for other distros.
> I wonder if any here are willing to try this program, and see if they can 
> create memory allocation errors.
> 
> The program has one loop that just spawns threads as fast as it can go.
> Each spawned thread does some memory allocations, frees the allocated
> memory, and exits.
> 
> I wonder if someone here would like to try it on a multicpu box, and see
> if it generates a memory error. The single cpu boxes here did not "like"
> this program at all. They needed "power"ful intervention.
> 
> You can set the environment variable MALLOC_CHECK_ to 2, and glibc will
> then check memory allocation/freeing for you, and abort (with core
> dump) on detecting memory errors.
> 
> I would not run this program on a vital production server.
> 
> Thanks in advance to those who do test this program - it is a hard test,
> specifically designed to hunt down memory allocation errors in threads on
> a multi-CPU box.
> 
> 
> Cheers,
> 
>  Derek.
>  -- 
> Derek Smithies Ph.D.                 Any fool can write code that 
> IndraNet Technologies Ltd.                a computer can understand.        
> Email: [EMAIL PROTECTED]         Good programmers write code 
> ph +64 3 365 6485                          that humans can understand.
> Web: http://www.indranet-technologies.com/            Martin Fowler
> 


