Re: [HACKERS] shall we have a TRACE_MEMORY mode
Qingqing Zhou wrote:
> As I follow Relyea Mike's recent post about a possible memory leak, I think we lack a good way of identifying memory usage. Maybe we should also remember __FILE__, __LINE__, etc. for better memory usage diagnosis when TRACE_MEMORY is on?

I find __FILE__ and __LINE__ very helpful when debugging both leaks and corruption.

I also add a per-context call counter (AllocSetContext.callCount). Each time a new piece of memory is allocated from a context, I bump the call counter and record the new value in the header for that chunk of memory (AllocChunkData.callCount). That way, I can look at a chunk of memory and know that it was allocated the 42nd time that I grabbed a hunk of memory from that context. The next time I run my test, I can set a conditional breakpoint (cond callCounter==42) that stops at the correct allocation (and thus grab my stack dump). The numbers aren't always exactly the same, but in most cases they are.

Obviously you're now adding 12 extra bytes of overhead to each AllocChunkData (__FILE__, __LINE__, and callCount) and 4 bytes to each AllocSetContext (callCount).

-- Korry
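A minimal sketch of the call-counter idea described above, assuming a toy allocator built on malloc(). DemoContext, DemoChunk, demo_alloc and DEMO_ALLOC are hypothetical names for illustration; they are not the real AllocSetContext or AllocChunkData definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for AllocSetContext and AllocChunkData. */
typedef struct DemoContext
{
    const char *name;
    unsigned    callCount;      /* allocations made from this context so far */
} DemoContext;

typedef struct DemoChunk
{
    DemoContext *context;       /* owning context */
    const char  *file;          /* __FILE__ at the allocation site */
    int          line;          /* __LINE__ at the allocation site */
    unsigned     callCount;     /* context counter value when allocated */
    max_align_t  data[];        /* caller's payload, suitably aligned */
} DemoChunk;

/* Allocate from a context, stamping the chunk header with its origin. */
static void *
demo_alloc(DemoContext *cxt, size_t size, const char *file, int line)
{
    DemoChunk  *chunk;

    assert(cxt != NULL);
    chunk = malloc(sizeof(DemoChunk) + size);
    if (chunk == NULL)
        return NULL;

    chunk->context = cxt;
    chunk->file = file;
    chunk->line = line;
    /* A conditional breakpoint on this value stops at the Nth allocation. */
    chunk->callCount = ++cxt->callCount;
    return chunk->data;
}

/* Recover the header from a payload pointer. */
static DemoChunk *
demo_chunk_of(void *ptr)
{
    return (DemoChunk *) ((char *) ptr - offsetof(DemoChunk, data));
}

#define DEMO_ALLOC(cxt, size) demo_alloc((cxt), (size), __FILE__, __LINE__)
```

In a debugger, inspecting demo_chunk_of(ptr)->callCount on a suspicious chunk gives the number to use in the conditional breakpoint on the next run.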
Re: [HACKERS] shall we have a TRACE_MEMORY mode
On 6/20/06, Tom Lane wrote:
> One idea that comes to mind is to have a compile time option to record the palloc __FILE__ and __LINE__ in every AllocChunk header. Then it would not be so hard to identify the culprit while trawling through memory. The overhead costs would be so high that you'd never turn it on by default though :-(

Would adding 8 bytes, and only as a compile-time option, be a big overhead? That is 4 bytes for __FILE__'s char * and 4 for __LINE__'s int, assuming a 32-bit architecture and that the compiler does not store duplicate copies of each file's __FILE__ string in the binary (the option called 'Enable String Pooling' in VS.Net, http://msdn2.microsoft.com/en-us/library/s0s0asdt.aspx).

> Another thing to consider is that the proximate location of the palloc is frequently *not* very useful. For instance, if your memory is getting eaten by lists, all the palloc traces will point at new_tail_cell(). Not much help. I don't know what to do about that ... any ideas?

We can treat such utility functions as equivalent to palloc: the caller's __FILE__ and __LINE__ are passed in to these functions, and these functions pass them on when calling palloc (or whatever palloc's #define expands to). In effect, in the log files, the allocation will appear to have been made from the location that called the utility function.

Regards,
Gurjeet.
Re: [HACKERS] shall we have a TRACE_MEMORY mode
On Tue, 2006-06-20 at 00:18 -0400, Tom Lane wrote:
> One idea that comes to mind is to have a compile time option to record the palloc __FILE__ and __LINE__ in every AllocChunk header. Then it would not be so hard to identify the culprit while trawling through memory. The overhead costs would be so high that you'd never turn it on by default though :-(

Could we set that as an option for each memory context when we create it? All or nothing seems too extreme to me for most cases.

-- Simon Riggs
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Simon Riggs wrote:
> On Tue, 2006-06-20 at 00:18 -0400, Tom Lane wrote:
> > One idea that comes to mind is to have a compile time option to record the palloc __FILE__ and __LINE__ in every AllocChunk header. Then it would not be so hard to identify the culprit while trawling through memory. The overhead costs would be so high that you'd never turn it on by default though :-(
>
> Could we set that as an option for each memory context when we create it? All or nothing seems too extreme for me for most cases.

What most cases? There is only one case -- there is a big leak and you want to find out where. You don't have this code turned on all the time; you must enable it at compile time, so we want it to be as simple as possible.

On Tue, 2006-06-20 at 00:18 -0400, Tom Lane wrote:
> That seems mostly the hard way to me, because our memory management scheme is *not* based around "thou shalt free() what thou malloc()ed". You'd need a tool that understood about resetting memory contexts (recursively) to get anywhere at all in analyzing such a trace.

Of course. It's not difficult to do that; just tedious. I wrote such a tool to debug a Mammoth Replicator problem (I don't think I've kept it though). The logging code must emit messages about context creation, destruction and reset, and have the alloc message indicate which context the chunk is being created in.

-- Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Alvaro Herrera said:
> > That seems mostly the hard way to me, because our memory management scheme is *not* based around "thou shalt free() what thou malloc()ed". You'd need a tool that understood about resetting memory contexts (recursively) to get anywhere at all in analyzing such a trace.
>
> Of course. It's not difficult to do that; just tedious. I wrote such a tool to debug a Mammoth Replicator problem (I don't think I've kept it though). The logging code must emit messages about context creation, destruction and reset, and have the alloc message indicate which context the chunk is being created in.

Could we tag each context with its name? Then we could centralise a lot of this, ISTM, and the overhead involved in setting the tag at context creation doesn't seem like a heavy price to pay.

cheers

andrew
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Andrew Dunstan wrote:
> Alvaro Herrera said:
> > > That seems mostly the hard way to me, because our memory management scheme is *not* based around "thou shalt free() what thou malloc()ed". You'd need a tool that understood about resetting memory contexts (recursively) to get anywhere at all in analyzing such a trace.
> >
> > Of course. It's not difficult to do that; just tedious. I wrote such a tool to debug a Mammoth Replicator problem (I don't think I've kept it though). The logging code must emit messages about context creation, destruction and reset, and have the alloc message indicate which context the chunk is being created in.
>
> Could we tag each context with its name? Then we could centralise a lot of this, ISTM, and the overhead involved in setting the tag at context creation doesn't seem like a heavy price to pay.

Each context already keeps track of its own name. But the problem (or at least a part of the problem) is not which context each chunk is allocated in, but where a given chunk came from (where it was allocated), which is why saving __FILE__/__LINE__ is useful.

Regarding stuff allocated by lappend(), makeNode() or other functions that allocate centrally on behalf of the caller, maybe we could enhance the prototypes to get __FILE__ and __LINE__ from their caller. That would help pinpoint the true source of allocation. Something like

#ifdef TRACE_MEMORY
#define lappend(_list_, _elt_) \
    lappend_tracemem(_list_, _elt_, __FILE__, __LINE__)
#endif

etc.

-- Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
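The macro trick above can be sketched end to end with a toy list, assuming nothing about the real List implementation. DemoList, DemoCell, demo_lappend and demo_lappend_tracemem are hypothetical names, and the cells are allocated with malloc() rather than palloc() to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy list; each cell remembers who asked for it. */
typedef struct DemoCell
{
    void            *value;
    struct DemoCell *next;
    const char      *file;      /* caller's __FILE__ */
    int              line;      /* caller's __LINE__ */
} DemoCell;

typedef struct DemoList
{
    DemoCell   *head;
    DemoCell   *tail;
    int         length;
} DemoList;

/* The traced variant takes the caller's location as extra arguments. */
static DemoList *
demo_lappend_tracemem(DemoList *list, void *value, const char *file, int line)
{
    DemoCell   *cell = malloc(sizeof(DemoCell));

    if (cell == NULL)
        abort();
    cell->value = value;
    cell->next = NULL;
    cell->file = file;
    cell->line = line;

    if (list == NULL)
    {
        list = malloc(sizeof(DemoList));
        if (list == NULL)
            abort();
        list->head = list->tail = NULL;
        list->length = 0;
    }

    if (list->tail != NULL)
        list->tail->next = cell;
    else
        list->head = cell;
    list->tail = cell;
    list->length++;
    assert(list->head != NULL);
    return list;
}

/* Call sites keep the usual two-argument form; the macro adds the location. */
#define demo_lappend(list, value) \
    demo_lappend_tracemem((list), (value), __FILE__, __LINE__)
```

Because __FILE__ and __LINE__ are expanded where the macro is used, every cell records the caller of demo_lappend() rather than the line inside the list code that performs the allocation.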
Re: [HACKERS] shall we have a TRACE_MEMORY mode
On Tue, Jun 20, 2006 at 12:18:32AM -0400, Tom Lane wrote:
> Another thing to consider is that the proximate location of the palloc is frequently *not* very useful. For instance, if your memory is getting eaten by lists, all the palloc traces will point at new_tail_cell(). Not much help. I don't know what to do about that ... any ideas?

GCC has __builtin_return_address(LEVEL), which returns the return address of the LEVELth caller (on systems where this is possible). You could perhaps track the caller and the caller's caller. glibc comes with a function called backtrace() which can be used to grab several levels simultaneously. You can use dladdr() to turn these addresses into useful symbol names. These are probably not portable, on the whole.

As for overhead, maybe you can deal with that by only tracing blocks that exceed a megabyte or more. Perhaps a small test:

if( increasing size of context over 10 MB )
    dump_stack_trace()

Of course, you might just miss the allocations you need to look at... The backtrace_symbols_fd() function dumps straight to a file, if you want to avoid cluttering up the logs.

Have a nice day,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.
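A rough sketch of the threshold test suggested above, assuming glibc's <execinfo.h> is available (so it is not portable). DemoContext, demo_account and the 10 MB constant are hypothetical; only backtrace() and backtrace_symbols_fd() are real glibc calls.

```c
#include <assert.h>
#include <execinfo.h>           /* glibc-specific: backtrace() and friends */
#include <stddef.h>

#define DEMO_MAX_FRAMES 16
#define DEMO_THRESHOLD  ((size_t) 10 * 1024 * 1024)    /* 10 MB */

/* Hypothetical per-context accounting. */
typedef struct DemoContext
{
    size_t      total;          /* bytes allocated so far */
    int         dumped;         /* nonzero once a trace has been written */
} DemoContext;

/*
 * Account for an allocation of "size" bytes.  The first time the context
 * grows past the threshold, write a stack trace to file descriptor "fd".
 * Returns the number of frames dumped, or 0 if nothing was dumped.
 */
static int
demo_account(DemoContext *cxt, size_t size, int fd)
{
    assert(cxt != NULL);
    cxt->total += size;

    if (cxt->total > DEMO_THRESHOLD && !cxt->dumped)
    {
        void   *frames[DEMO_MAX_FRAMES];
        int     nframes = backtrace(frames, DEMO_MAX_FRAMES);

        /* Writes one line per frame without calling malloc(). */
        backtrace_symbols_fd(frames, nframes, fd);
        cxt->dumped = 1;
        return nframes;
    }
    return 0;
}
```

Dumping only once per context keeps the overhead low, at the cost noted above: the allocation that crosses the threshold is not necessarily the one responsible for the growth.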
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Tom Lane writes:
> Another thing to consider is that the proximate location of the palloc is frequently *not* very useful. For instance, if your memory is getting eaten by lists, all the palloc traces will point at new_tail_cell(). Not much help. I don't know what to do about that ... any ideas?

Well, the traditional thing to do is store a backtrace a la Dmalloc, one of the better debugging malloc libraries out there. It has mostly been superseded by Purify/Valgrind type tools but it still has a place for handling memory leaks.

It's unfortunate that it's impossible to use Dmalloc with Postgres. It would probably be nigh impossible to merge Dmalloc code into Postgres's allocator given the different models. But perhaps it would be possible to steal specific pieces of it, like the backtrace grabbing code.

-- greg
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Alvaro Herrera writes:
> Simon Riggs wrote:
> > Could we set that as an option for each memory context when we create it? All or nothing seems too extreme for me for most cases.
>
> What most cases? There is only one case -- there is a big leak and you want to find out where.

There's a more significant reason why not, which is that all AllocChunks must have the same header, else pfree doesn't know what to do.

> > That seems mostly the hard way to me, because our memory management scheme is *not* based around "thou shalt free() what thou malloc()ed". You'd need a tool that understood about resetting memory contexts (recursively) to get anywhere at all in analyzing such a trace.
>
> Of course. It's not difficult to do that; just tedious. I wrote such a tool to debug a Mammoth Replicator problem (I don't think I've kept it though). The logging code must emit messages about context creation, destruction and reset, and have the alloc message indicate which context the chunk is being created in.

Well, the logging approach would definitely be less intrusive to the system's runtime behavior, and would (maybe) not require gdb to use. If you can resurrect that tool it'd be interesting to look at. Maybe it's on a backup tape somewhere?

regards, tom lane
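To illustrate the point about a uniform header, here is a small sketch, assuming a toy allocator on top of malloc(). The free routine only receives the payload pointer, so it must step back by a fixed, known distance to find the header; that works only if every chunk uses the same layout, which is why the extra fields are selected at compile time for all chunks rather than per context. DemoChunkHeader, demo_palloc, demo_pfree and DEMO_TRACE_MEMORY are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One layout for every chunk; the trace fields are all-or-nothing. */
typedef struct DemoChunkHeader
{
    void       *context;        /* owning context */
    size_t      size;           /* requested payload size */
#ifdef DEMO_TRACE_MEMORY
    const char *file;           /* allocation site, only in trace builds */
    int         line;
#endif
    max_align_t data[];         /* caller's payload, suitably aligned */
} DemoChunkHeader;

static void *
demo_palloc_at(void *context, size_t size, const char *file, int line)
{
    DemoChunkHeader *hdr = malloc(sizeof(DemoChunkHeader) + size);

    if (hdr == NULL)
        abort();
    hdr->context = context;
    hdr->size = size;
#ifdef DEMO_TRACE_MEMORY
    hdr->file = file;
    hdr->line = line;
#else
    (void) file;                /* unused in non-trace builds */
    (void) line;
#endif
    return hdr->data;
}

#define demo_palloc(context, size) \
    demo_palloc_at((context), (size), __FILE__, __LINE__)

/* Works only because the header offset is the same for every chunk. */
static DemoChunkHeader *
demo_header_of(void *ptr)
{
    return (DemoChunkHeader *) ((char *) ptr - offsetof(DemoChunkHeader, data));
}

/* Frees the chunk and returns the payload size it had. */
static size_t
demo_pfree(void *ptr)
{
    DemoChunkHeader *hdr;
    size_t      size;

    assert(ptr != NULL);
    hdr = demo_header_of(ptr);
    size = hdr->size;
    free(hdr);
    return size;
}
```

If some contexts used the larger header and others the smaller one, demo_header_of() would have no way to know which offset to subtract before it had already read the header.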
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Alvaro Herrera wrote:
> But the problem (or at least a part of the problem) is not which context each chunk is allocated in, but where a given chunk came from (where it was allocated), which is why saving __FILE__/__LINE__ is useful.

Agreed. Maybe we should not clutter AllocChunkData with this trace info. We could save it in a separate memory context which is only activated when TRACE_MEMORY is on. Also, recording every __FILE__/__LINE__ seems unnecessary; we can merge identical ones and record the count of calls. Once a leak has happened, the usual suspect is the one with the highest count. So the output of the memory context dump would look like this:

execQual.c 1953 123456
execHash.c 208 12
...

> #ifdef TRACE_MEMORY
> #define lappend(_list_, _elt_) \
>     lappend_tracemem(_list_, _elt_, __FILE__, __LINE__)
> #endif

This might be the only portable way I can think of. We don't want to redefine all of the functions that call palloc()/MemoryContextAlloc(); we redefine only the most suspect ones, like those in heaptuple.c.

Regards,
Qingqing
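The merged per-site counting described above could look roughly like the following, assuming a small fixed-size table with a linear search (a real implementation would more likely use a hash table). All demo_* names are hypothetical, and the output format simply mirrors the file, line, count columns of the example dump.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_SITES 256

/* One entry per distinct allocation site. */
typedef struct DemoSite
{
    const char   *file;
    int           line;
    unsigned long count;        /* allocations seen from this site */
} DemoSite;

static DemoSite demo_sites[DEMO_MAX_SITES];
static int      demo_nsites = 0;

/* Record one allocation from file:line, merging with earlier ones. */
static void
demo_record(const char *file, int line)
{
    assert(file != NULL);
    for (int i = 0; i < demo_nsites; i++)
    {
        if (demo_sites[i].line == line &&
            strcmp(demo_sites[i].file, file) == 0)
        {
            demo_sites[i].count++;
            return;
        }
    }
    /* New site; silently dropped if the table is full. */
    if (demo_nsites < DEMO_MAX_SITES)
    {
        demo_sites[demo_nsites].file = file;
        demo_sites[demo_nsites].line = line;
        demo_sites[demo_nsites].count = 1;
        demo_nsites++;
    }
}

/* Look up the count for a site; 0 if it was never recorded. */
static unsigned long
demo_count(const char *file, int line)
{
    for (int i = 0; i < demo_nsites; i++)
    {
        if (demo_sites[i].line == line &&
            strcmp(demo_sites[i].file, file) == 0)
            return demo_sites[i].count;
    }
    return 0;
}

/* Print "file line count" for every recorded site. */
static void
demo_dump(FILE *out)
{
    for (int i = 0; i < demo_nsites; i++)
        fprintf(out, "%s %d %lu\n",
                demo_sites[i].file, demo_sites[i].line, demo_sites[i].count);
}
```

Note that this counts allocations only; to point at a leak rather than at a merely busy call site, the count would also have to be decremented on pfree, which requires the chunk to remember its site after all.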
[HACKERS] shall we have a TRACE_MEMORY mode
As I follow Relyea Mike's recent post about a possible memory leak, I think we lack a good way of identifying memory usage. Maybe we should also remember __FILE__, __LINE__, etc. for better memory usage diagnosis when TRACE_MEMORY is on?

Regards,
Qingqing
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Qingqing Zhou wrote:
> As I follow Relyea Mike's recent post about a possible memory leak, I think we lack a good way of identifying memory usage. Maybe we should also remember __FILE__, __LINE__, etc. for better memory usage diagnosis when TRACE_MEMORY is on?

Hmm, this would have been a great help to me not long ago, so I'd say it would be nice to have.

About the exact form we'd give the feature: maybe write each allocation/freeing to a per-backend file, say /tmp/pgmem.pid. Also memory context creation, destruction, reset. Having the __FILE__ and __LINE__ on each operation would be a good tracing tool as well. Then it's easy to write Perl tools to find specific problems.

-- Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
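One possible shape for such a trace, sketched under the assumption of one text line per event with the fields operation, context name, chunk address, size and source location. The line format, the demo_* function names and the sample context name are all hypothetical; only the /tmp/pgmem.pid naming comes from the message above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the per-backend trace file name, e.g. "/tmp/pgmem.1234". */
static int
demo_trace_path(char *buf, size_t len, long pid)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "/tmp/pgmem.%ld", pid);
}

/*
 * Format one trace event.  "op" would be something like "alloc", "free",
 * "create", "reset" or "delete"; for context-level events the pointer and
 * size can simply be NULL and 0.
 */
static int
demo_format_event(char *buf, size_t len,
                  const char *op, const char *context,
                  void *ptr, size_t size,
                  const char *file, int line)
{
    assert(buf != NULL && len > 0);
    assert(op != NULL && strlen(op) > 0);
    return snprintf(buf, len, "%s %s %p %zu %s:%d\n",
                    op, context, ptr, size, file, line);
}
```

A backend would open the file named by demo_trace_path() once at startup and append one formatted line per event. Because each line names its context, a post-processing script can discard every outstanding chunk of a context when it sees that context's reset or delete event.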
Re: [HACKERS] shall we have a TRACE_MEMORY mode
Alvaro Herrera writes:
> About the exact form we'd give the feature: maybe write each allocation/freeing to a per-backend file, say /tmp/pgmem.pid. Also memory context creation, destruction, reset. Having the __FILE__ and __LINE__ on each operation would be a good tracing tool as well. Then it's easy to write Perl tools to find specific problems.

That seems mostly the hard way to me, because our memory management scheme is *not* based around "thou shalt free() what thou malloc()ed". You'd need a tool that understood about resetting memory contexts (recursively) to get anywhere at all in analyzing such a trace.

I've had some success in the past with debugging memory leaks by trawling through the oversized memory contexts with gdb "x" and trying to understand what the bulk of the data was. This is certainly pretty painful though.

One idea that comes to mind is to have a compile time option to record the palloc __FILE__ and __LINE__ in every AllocChunk header. Then it would not be so hard to identify the culprit while trawling through memory. The overhead costs would be so high that you'd never turn it on by default though :-(

Another thing to consider is that the proximate location of the palloc is frequently *not* very useful. For instance, if your memory is getting eaten by lists, all the palloc traces will point at new_tail_cell(). Not much help. I don't know what to do about that ... any ideas?

regards, tom lane