Hi,

On 2025-05-08 22:04:06 -0400, Tom Lane wrote:
> A nearby thread [1] reminded me to wonder why we seem to have
> so many false-positive leaks reported by Valgrind these days.
> For example, at exit of a backend that's executed a couple of
> trivial queries, I see
>
> ==00:00:00:25.515 260013== LEAK SUMMARY:
> ==00:00:00:25.515 260013==    definitely lost: 3,038 bytes in 90 blocks
> ==00:00:00:25.515 260013==    indirectly lost: 4,431 bytes in 61 blocks
> ==00:00:00:25.515 260013==      possibly lost: 390,242 bytes in 852 blocks
> ==00:00:00:25.515 260013==    still reachable: 579,139 bytes in 1,457 blocks
> ==00:00:00:25.515 260013==         suppressed: 0 bytes in 0 blocks
>
> so about a thousand "leaked" blocks, all but a couple of which
> are false positives --- including nearly all the "definitely"
> leaked ones.
>
> Some testing and reading of the Valgrind manual [2] turned up a
> number of answers, which mostly boil down to us using very
> Valgrind-unfriendly data structures.  Per [2],
>
>     There are two ways a block can be reached.  The first is with a
>     "start-pointer", i.e. a pointer to the start of the block.  The
>     second is with an "interior-pointer", i.e. a pointer to the middle
>     of the block.
>
>     [ A block is reported as "possibly lost" if ] a chain of one or
>     more pointers to the block has been found, but at least one of the
>     pointers is an interior-pointer.
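
For concreteness, the "possibly lost" classification can be reproduced
with a tiny standalone program along these lines (an illustrative sketch,
not code from the tree; build with -O0 so the compiler doesn't keep extra
copies of the start-pointer around):

    /* interior.c: run as "valgrind --leak-check=full ./interior" */
    #include <stdlib.h>

    static char *interior;      /* the only surviving reference */

    int
    main(void)
    {
        char *block = malloc(100);

        interior = block + 10;  /* keep only an interior-pointer */
        block = NULL;           /* drop the start-pointer */

        /*
         * At exit the block is still reachable, but only through an
         * interior-pointer, so memcheck reports it as "possibly lost".
         */
        return 0;
    }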
Huh. We use the memory pool client requests to inform valgrind about
memory contexts. I seem to recall that that "hid" many leak warnings
from valgrind. I wonder if we somehow broke (or weakened) that.

We currently don't reset TopMemoryContext at exit, which, obviously,
does massively increase the number of leaks. But OTOH, without that
there's not a whole lot of value in the leak check...

Greetings,

Andres Freund
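
For reference, the client-request pattern being referred to is roughly
the following. This is a deliberately simplified sketch of Valgrind's
mempool API as an allocator might use it, not the actual wrapping in the
tree (ToyContext and the toy_* names are made up for illustration); the
point is that each chunk handed out by the context, not just the
underlying malloc'd arena, becomes an object memcheck tracks and
leak-checks on its own:

    #include <stdlib.h>
    #include <valgrind/memcheck.h>

    typedef struct ToyContext
    {
        char   *block;          /* one malloc'd arena carved into chunks */
        size_t  used;
        size_t  size;
    } ToyContext;

    static ToyContext *
    toy_context_create(size_t size)
    {
        ToyContext *cxt = malloc(sizeof(ToyContext));

        cxt->block = malloc(size);
        cxt->used = 0;
        cxt->size = size;

        /* Declare a pool anchored at the context pointer. */
        VALGRIND_CREATE_MEMPOOL(cxt, 0, 0);
        /* The raw arena is off limits until chunks are handed out. */
        VALGRIND_MAKE_MEM_NOACCESS(cxt->block, size);
        return cxt;
    }

    static void *
    toy_alloc(ToyContext *cxt, size_t size)
    {
        void   *chunk = cxt->block + cxt->used;

        cxt->used += size;
        /* Report the chunk so memcheck tracks it like a malloc(). */
        VALGRIND_MEMPOOL_ALLOC(cxt, chunk, size);
        return chunk;
    }

    static void
    toy_context_delete(ToyContext *cxt)
    {
        /* Destroying the pool marks all its chunks as freed at once. */
        VALGRIND_DESTROY_MEMPOOL(cxt);
        free(cxt->block);
        free(cxt);
    }

The sketch leaves out redzones, alignment, and VALGRIND_MEMPOOL_FREE,
which a real allocator would also use; it only shows why chunks that are
never torn down (e.g. because TopMemoryContext isn't reset at exit) still
show up individually in the leak report.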