Hi,

For the second-to-last trace involving SearchCatCacheList (catcache.c:1691), the ctlist's members are stored in the cl->members array, where cl is returned at the end of SearchCatCacheList. Maybe this was not accounted for by valgrind?
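To make that concrete, here is a small standalone toy (just an analogy I put together, not PostgreSQL code) with the same shape: payload pointers from a temporary list get copied into a longer-lived array, and afterwards only the list cells themselves are left without any referencing pointer:

#include <stdlib.h>

typedef struct Cell
{
    void        *payload;
    struct Cell *next;
} Cell;

typedef struct Result
{
    int     nmembers;
    void   *members[8];     /* keeps the payloads reachable */
} Result;

/* long-lived, like a catcache list entry */
static Result *result;

int
main(void)
{
    Cell   *head = NULL;
    int     n = 0;

    /* build a temporary list of separately allocated payloads */
    for (int i = 0; i < 4; i++)
    {
        Cell *c = malloc(sizeof(Cell));

        c->payload = malloc(16);
        c->next = head;
        head = c;
        n++;
    }

    /* copy the payload pointers into the long-lived result ... */
    result = malloc(sizeof(Result));
    result->nmembers = n;
    for (Cell *c = head; c != NULL; c = c->next)
        result->members[--n] = c->payload;

    /* ... and then drop the list cells without freeing them */
    head = NULL;

    return 0;
}

Under valgrind --leak-check=full the four Cell blocks are reported as definitely lost while the payloads stay reachable through result->members, which looks like the same split as in that trace, where the flagged block comes from new_list()/lappend() rather than from the members themselves.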
Cheers

On Tue, Mar 16, 2021 at 7:31 PM Andres Freund <and...@anarazel.de> wrote:
> Hi,
>
> David, there's a question about a commit of yours below, hence adding
> you.
>
> On 2021-03-16 19:36:10 -0400, Tom Lane wrote:
> > Out of curiosity I poked at this for a little while.
>
> Cool.
>
> > It doesn't appear to me that we leak much at all, at least not if you
> > are willing to take "still reachable" blocks as not-leaked.
>
> Well, I think for any sort of automated testing - which I think would be
> useful - we'd really need *no* leaks. I know that I get a few bleats
> whenever I forget to set --leak-check=no. It's also not just postgres
> itself, but some of the helper tools...
>
> And it's not just valgrind, also gcc/clang sanitizers...
>
> > What I found out is that we have a lot of issues that seem to devolve
> > to valgrind not being sure that a block is referenced. I identified
> > two main causes of that:
> >
> > (1) We have a pointer, but it doesn't actually point right at the start
> > of the block. A primary culprit here is lists of thingies that use the
> > slist and dlist infrastructure. As an experiment, I moved the dlist_node
> > fields of some popular structs to the beginning, and verified that that
> > silences associated complaints. I'm not sure that we want to insist on
> > put-the-link-first as policy (although if we did, it could provide some
> > notational savings perhaps). However, unless someone knows of a way to
> > teach valgrind about this situation, there may be no other way to silence
> > those leakage complaints. A secondary culprit is people randomly applying
> > CACHELINEALIGN or the like to a palloc'd address, so that the address we
> > have isn't pointing right at the block start.
>
> Hm, do you still have a backtrace / suppression for one of those? I
> didn't see any in a quick (*) serial installcheck I just ran. Or I
> wasn't able to pinpoint them to this issue.
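As a standalone toy illustration of Tom's point (1) (not PostgreSQL code, just the same shape): if the only live reference to a block is the address of a link field that isn't at offset 0, memcheck only ever sees an interior pointer:

#include <stdlib.h>

typedef struct Node
{
    char         payload[32];   /* the link is deliberately NOT first ... */
    struct Node *next;          /* ... so &node->next is an interior pointer */
} Node;

/* the list "head" stores the address of the embedded link field, dlist-style */
static Node **head;

int
main(void)
{
    Node   *n = malloc(sizeof(Node));

    n->next = NULL;
    head = &n->next;    /* only surviving reference: 32 bytes into the block */
    n = NULL;           /* drop the direct start-pointer */

    return 0;
}

With the members in this order the block is reported as possibly lost under --leak-check=full; swap them so the link comes first and head becomes a start-pointer, and the complaint goes away - which matches what moving the dlist_node fields to the front did.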
> I think the run might have shown a genuine leak:
>
> ==2048803== 16 bytes in 1 blocks are definitely lost in loss record 139 of 906
> ==2048803==    at 0x89D2EA: palloc (mcxt.c:975)
> ==2048803==    by 0x2392D3: heap_beginscan (heapam.c:1198)
> ==2048803==    by 0x264E8F: table_beginscan_strat (tableam.h:918)
> ==2048803==    by 0x265994: systable_beginscan (genam.c:453)
> ==2048803==    by 0x83C2D1: SearchCatCacheMiss (catcache.c:1359)
> ==2048803==    by 0x83C197: SearchCatCacheInternal (catcache.c:1299)
> ==2048803==    by 0x83BE9A: SearchCatCache1 (catcache.c:1167)
> ==2048803==    by 0x85876A: SearchSysCache1 (syscache.c:1134)
> ==2048803==    by 0x84CDB3: RelationInitTableAccessMethod (relcache.c:1795)
> ==2048803==    by 0x84F807: RelationBuildLocalRelation (relcache.c:3554)
> ==2048803==    by 0x303C9D: heap_create (heap.c:395)
> ==2048803==    by 0x305790: heap_create_with_catalog (heap.c:1291)
> ==2048803==    by 0x41A327: DefineRelation (tablecmds.c:885)
> ==2048803==    by 0x6C96B6: ProcessUtilitySlow (utility.c:1131)
> ==2048803==    by 0x6C948A: standard_ProcessUtility (utility.c:1034)
> ==2048803==    by 0x6C865F: ProcessUtility (utility.c:525)
> ==2048803==    by 0x6C7409: PortalRunUtility (pquery.c:1159)
> ==2048803==    by 0x6C7636: PortalRunMulti (pquery.c:1305)
> ==2048803==    by 0x6C6B11: PortalRun (pquery.c:779)
> ==2048803==    by 0x6C05AB: exec_simple_query (postgres.c:1173)
> ==2048803==
> {
>    <insert_a_suppression_name_here>
>    Memcheck:Leak
>    match-leak-kinds: definite
>    fun:palloc
>    fun:heap_beginscan
>    fun:table_beginscan_strat
>    fun:systable_beginscan
>    fun:SearchCatCacheMiss
>    fun:SearchCatCacheInternal
>    fun:SearchCatCache1
>    fun:SearchSysCache1
>    fun:RelationInitTableAccessMethod
>    fun:RelationBuildLocalRelation
>    fun:heap_create
>    fun:heap_create_with_catalog
>    fun:DefineRelation
>    fun:ProcessUtilitySlow
>    fun:standard_ProcessUtility
>    fun:ProcessUtility
>    fun:PortalRunUtility
>    fun:PortalRunMulti
>    fun:PortalRun
>    fun:exec_simple_query
> }
>
> Since 56788d2156fc, heap_beginscan() allocates
>     scan->rs_base.rs_private =
>         palloc(sizeof(ParallelBlockTableScanWorkerData));
> but doesn't free it in heap_endscan().
>
> In most of the places heap scans are begun inside transient contexts,
> but not always. In the above trace, for example,
> RelationBuildLocalRelation switched to CacheMemoryContext, and nothing
> switched to something else.
>
> I'm a bit confused about the precise design of rs_private /
> ParallelBlockTableScanWorkerData, specifically why it's been added to
> TableScanDesc instead of just adding it to HeapScanDesc? And why is it
> allocated unconditionally, instead of just for parallel scans?
>
> I don't think this is a false positive, even though it theoretically
> could be freed by resetting CacheMemoryContext (see below).
>
> I saw a lot of false positives from autovacuum workers, because
> AutovacMemCxt is never deleted, and because table_toast_map is created
> in TopMemoryContext. Adding an explicit
> MemoryContextDelete(AutovacMemCxt) and parenting table_toast_map in that
> shut that up.
>
> Which brings me to the question of why allocations in CacheMemoryContext
> and AutovacMemCxt are considered to be "lost", even though they're still
> "reachable" via a context reset. I think valgrind ends up treating
> memory allocated via memory contexts that still exist at process end as
> lost, regardless of being reachable via the memory pool (from
> valgrind's view). Which I guess actually makes sense, for things like
> TopMemoryContext and CacheMemoryContext - anything not reachable by means
> other than a context reset is effectively lost for those.
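To illustrate that last point, here is a tiny standalone sketch using valgrind's mempool client requests (the same mechanism the memory context code uses to tell valgrind about palloc chunks). It's an assumption-laden toy, not PostgreSQL code; it needs the valgrind headers installed, and the client requests are no-ops when run outside valgrind:

#include <stdlib.h>
#include <valgrind/memcheck.h>

/* the "memory context": a superblock that stays reachable via this pointer */
static char *superblock;

int
main(void)
{
    superblock = malloc(8192);
    VALGRIND_CREATE_MEMPOOL(superblock, 0, 0);

    /* carve a chunk out of the superblock, roughly what palloc does; note
     * that no pointer to the chunk itself is kept anywhere */
    VALGRIND_MEMPOOL_ALLOC(superblock, superblock + 64, 128);

    /* the chunk could still be reclaimed by resetting/destroying the pool,
     * but memcheck tracks mempool chunks like individual heap blocks, so
     * --leak-check=full seems to report it as lost even though the
     * superblock itself remains reachable */
    return 0;
}

If that reading is right, it would explain why allocations still sitting in CacheMemoryContext at exit show up as lost even though the context itself is perfectly reachable.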
> For autovac launcher / worker it seems like a sensible thing to just
> delete AutovacMemCxt.
>
> > (2) The only pointer to the start of a block is actually somewhere within
> > the block. This is common in dynahash tables, where we allocate a slab
> > of entries in a single palloc and then thread them together. Each entry
> > should have exactly one referencing pointer, but that pointer is more
> > likely to be elsewhere within the same palloc block than in the external
> > hash bucket array. AFAICT, all cases except where the slab's first entry
> > is pointed to by a hash bucket pointer confuse valgrind to some extent.
> > I was able to hack around this by preventing dynahash from allocating
> > more than one hash entry per palloc, but I wonder if there's a better way.
>
> Hm. For me the number of leaks seems to stay the same with or without
> your changes related to this. Is this a USE_VALGRIND build?
>
> I'm using valgrind-3.16.1
>
> > Attached is a very crude hack, not meant for commit, that hacks things
> > up enough to greatly reduce the number of complaints with
> > "--leak-check=full".
> >
> > One thing I've failed to silence so far is a bunch of entries like
> >
> > ==00:00:03:56.088 3467702== 1,861 bytes in 67 blocks are definitely lost in loss record 1,290 of 1,418
> > ==00:00:03:56.088 3467702==    at 0x950650: MemoryContextAlloc (mcxt.c:827)
> > ==00:00:03:56.088 3467702==    by 0x951710: MemoryContextStrdup (mcxt.c:1179)
> > ==00:00:03:56.088 3467702==    by 0x91C86E: RelationInitIndexAccessInfo (relcache.c:1444)
> > ==00:00:03:56.088 3467702==    by 0x91DA9C: RelationBuildDesc (relcache.c:1200)
> >
> > which is complaining about the memory context identifiers for system
> > indexes' rd_indexcxt contexts. Those are surely not being leaked in
> > any real sense. I suspect that this has something to do with valgrind
> > not counting the context->ident fields as live pointers, but I don't
> > have enough valgrind-fu to fix that.
>
> Yea. I suspect it's related to the fact that we mark the memory as a
> valgrind mempool; I'll try to investigate.
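Going back to (2) for a moment, here is the dynahash-style situation as a standalone toy (again not PostgreSQL code): several entries live in one malloc'd block, threaded together by pointers that are stored inside that same block, and the only outside pointer references a non-first entry:

#include <stdlib.h>

typedef struct Entry
{
    struct Entry *link;
    char          payload[24];
} Entry;

/* the "hash bucket": the only pointer that lives outside the slab */
static Entry *bucket;

int
main(void)
{
    int     nentries = 4;
    Entry  *slab = malloc(nentries * sizeof(Entry));

    /* thread the entries together; every one of these pointers, including
     * the one that happens to reference the slab's first entry (i.e. the
     * block start), is stored inside the slab itself */
    slab[0].link = NULL;
    for (int i = 1; i < nentries; i++)
        slab[i].link = &slab[i - 1];

    /* the external bucket points at the last entry, an address in the
     * middle of the block rather than its start */
    bucket = &slab[nentries - 1];
    slab = NULL;    /* drop the direct start-pointer, as dynahash effectively does */

    return 0;
}

Since no start-pointer to the slab exists outside the slab itself, memcheck typically downgrades the whole block to possibly lost even though nothing is actually leaked, which seems to match "all cases except where the slab's first entry is pointed to by a hash bucket pointer".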
> I do see a bunch of leak bleats below fun:plpgsql_compile that I don't
> yet understand. E.g.
>
> ==2054558== 32 bytes in 1 blocks are definitely lost in loss record 284 of 913
> ==2054558==    at 0x89D389: palloc (mcxt.c:975)
> ==2054558==    by 0x518732: new_list (list.c:134)
> ==2054558==    by 0x518C0C: lappend (list.c:341)
> ==2054558==    by 0x83CAE7: SearchCatCacheList (catcache.c:1691)
> ==2054558==    by 0x859A9C: SearchSysCacheList (syscache.c:1447)
> ==2054558==    by 0x313192: FuncnameGetCandidates (namespace.c:975)
> ==2054558==    by 0x313D91: FunctionIsVisible (namespace.c:1450)
> ==2054558==    by 0x7C2891: format_procedure_extended (regproc.c:375)
> ==2054558==    by 0x7C27C3: format_procedure (regproc.c:324)
> ==2054558==    by 0xA7693E1: do_compile (pl_comp.c:348)
> ==2054558==    by 0xA769130: plpgsql_compile (pl_comp.c:224)
>
> and
>
> ==2054558== 30 bytes in 4 blocks are definitely lost in loss record 225 of 913
> ==2054558==    at 0x89D389: palloc (mcxt.c:975)
> ==2054558==    by 0x3ADDAE: downcase_identifier (scansup.c:52)
> ==2054558==    by 0x3ADD85: downcase_truncate_identifier (scansup.c:39)
> ==2054558==    by 0x3AB5E4: core_yylex (scan.l:1032)
> ==2054558==    by 0xA789B2D: internal_yylex (pl_scanner.c:321)
> ==2054558==    by 0xA7896E3: plpgsql_yylex (pl_scanner.c:152)
> ==2054558==    by 0xA780015: plpgsql_yyparse (pl_gram.c:1945)
> ==2054558==    by 0xA76A652: do_compile (pl_comp.c:788)
> ==2054558==    by 0xA769130: plpgsql_compile (pl_comp.c:224)
> ==2054558==    by 0xA78948F: plpgsql_validator (pl_handler.c:539)
>
> Based on the quick look I had (dinner is calling) I didn't yet
> understand how plpgsql_compile_tmp_cxt error handling works.
>
> Greetings,
>
> Andres Freund
>
> (*) or not so quick, I had to figure out why valgrind was so slow. It
> turned out that I had typed shared_buffers=32MB as
> shared_buffers=32GB...