On Tue, Jul 23, 2019 at 06:18:26PM -0700, Melanie Plageman wrote:
On Thu, Jul 18, 2019 at 11:24 AM Jeff Davis <pg...@j-davis.com> wrote:

Previous discussion:
https://postgr.es/m/1407012053.15301.53.camel%40jeff-desktop

This patch introduces a way to ask a memory context how much memory it
currently has allocated. Each time a new block (not an individual
chunk, but a new malloc'd block) is allocated, it increments a struct
member; and each time a block is free'd, it decrements the struct
member. So it tracks blocks allocated by malloc, not what is actually
used for chunks allocated by palloc.
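
To make the mechanism concrete, here is a minimal stand-alone sketch of that
kind of block-level accounting. It is not the patch itself; the names
DemoContext, demo_alloc_block and demo_free_block are made up for illustration:

    #include <stdint.h>
    #include <stdlib.h>

    /* Toy context that only tracks block-level allocations. */
    typedef struct DemoContext
    {
        int64_t     mem_allocated;  /* bytes currently malloc'd as blocks */
    } DemoContext;

    /* Allocate a new block on behalf of the context and count it. */
    static void *
    demo_alloc_block(DemoContext *cxt, size_t size)
    {
        void       *block = malloc(size);

        if (block != NULL)
            cxt->mem_allocated += size;
        return block;
    }

    /* Release a block and stop counting it. */
    static void
    demo_free_block(DemoContext *cxt, void *block, size_t size)
    {
        cxt->mem_allocated -= size;
        free(block);
    }

Reading the current number is then just a matter of looking at mem_allocated,
so the cost of the accounting is paid incrementally when blocks are allocated
and freed, not when the question is asked.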


Cool! I like how straightforward this approach is. It seems easy to
build on, as well.

Are there cases where we are likely to palloc a lot without needing to
malloc in a certain memory context? For example, do we have a pattern
where, for some kind of memory-intensive operator, we might palloc in
a per-tuple context and consistently get chunks without having to
malloc, so that if we later tried to check the bytes allocated for one
of these per-tuple contexts to decide on some behavior, the number
would not be representative?


I think there are plenty of places where we quickly get into a state with
enough chunks in the freelist - the per-tuple contexts are a good
example of that, I think.
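
To illustrate that freelist effect in the same toy style (made-up names again,
and each chunk is malloc'd directly here to keep the sketch short, whereas a
real allocator carves chunks out of larger blocks): once a chunk can be
recycled, a palloc-style request never reaches malloc, so a block-level
counter does not move.

    #include <stddef.h>
    #include <stdlib.h>

    /* A recyclable chunk; the link is only used while it sits on the freelist. */
    typedef struct DemoChunk
    {
        struct DemoChunk *next;
        char        payload[64];    /* fixed chunk size keeps the sketch simple */
    } DemoChunk;

    /* Toy per-tuple context: a block-level counter plus a chunk freelist. */
    typedef struct DemoTupleContext
    {
        long        blocks_allocated;   /* what block-level accounting reports */
        DemoChunk  *freelist;           /* chunks returned by earlier "pfree" calls */
    } DemoTupleContext;

    /* palloc-style request: prefer a recycled chunk, fall back to malloc. */
    static void *
    demo_palloc(DemoTupleContext *cxt)
    {
        DemoChunk  *chunk;

        if (cxt->freelist != NULL)
        {
            /* reuse path: no malloc, blocks_allocated is unchanged */
            chunk = cxt->freelist;
            cxt->freelist = chunk->next;
        }
        else
        {
            /* only this path moves the block-level counter */
            chunk = malloc(sizeof(DemoChunk));
            if (chunk == NULL)
                return NULL;
            cxt->blocks_allocated += sizeof(DemoChunk);
        }
        return chunk->payload;
    }

    /* pfree-style release: the chunk goes back on the freelist, not to free(). */
    static void
    demo_pfree(DemoTupleContext *cxt, void *ptr)
    {
        DemoChunk  *chunk = (DemoChunk *) ((char *) ptr - offsetof(DemoChunk, payload));

        chunk->next = cxt->freelist;
        cxt->freelist = chunk;
    }

In a per-tuple loop that pallocs and pfrees the same chunks over and over, the
counter settles after the first pass - but the memory sitting on the freelist
is still genuinely malloc'd and held by the context, which is what the
block-level number is meant to report.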

I think that is basically what Heikki is asking about with HashAgg,
but I wondered if there were other cases that you had already thought
through where this might happen.


I think Heikki was asking about places with a lot of sub-contexts, which is a
completely different issue. It used to be the case that some aggregates
created a separate context for each group - like array_agg. That would
make Jeff's approach to accounting rather inefficient, because checking
how much memory is used would be very expensive (having to loop over a
large number of contexts).
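
As a rough sketch of why that is expensive (hypothetical names; the real
memory contexts form a similar tree of children): with a separate context per
group, a single "how much memory is used?" question turns into a walk over
every group's context.

    #include <stdint.h>

    /* Hypothetical context node: its own allocations plus a list of children. */
    typedef struct DemoNode
    {
        int64_t     mem_allocated;      /* bytes this context has malloc'd itself */
        struct DemoNode *firstchild;    /* sub-contexts, e.g. one per array_agg group */
        struct DemoNode *nextsibling;
    } DemoNode;

    /*
     * Recursive total over the context tree: O(number of contexts).  With one
     * sub-context per group this is the expensive loop; with a single large
     * context the answer is a single field read.
     */
    static int64_t
    demo_total_allocated(const DemoNode *node)
    {
        int64_t     total = node->mem_allocated;

        for (const DemoNode *child = node->firstchild; child != NULL; child = child->nextsibling)
            total += demo_total_allocated(child);

        return total;
    }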


The purpose is for Memory-Bounded Hash Aggregation, but it may be
useful in more cases as we try to better adapt to and control memory
usage at execution time.


This approach seems like it would be good for memory-intensive
operators that use a large, representative context. I think the
HashTableContext for HashJoin might be one?


Yes, that might be a good candidate (and it would be much simpler than
the manual accounting we use now).
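
As a sketch of what "simpler" might mean here (the struct and field names
below are placeholders, not the actual hash join code or the patch's API):
with manual accounting the operator has to bump its own counter on every
insertion, whereas with per-context accounting it can simply compare the
owning context's counter against its budget.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Toy counted context, repeated here so the sketch stands on its own. */
    typedef struct DemoContext
    {
        int64_t     mem_allocated;
    } DemoContext;

    /* Hypothetical hash table state; field names are placeholders. */
    typedef struct DemoHashTable
    {
        int64_t     space_used;     /* manual accounting, bumped on every insert */
        int64_t     space_allowed;  /* the operator's memory budget */
        DemoContext *context;       /* context that owns all hash table memory */
    } DemoHashTable;

    /* Manual style: every allocation site must remember to do this. */
    static bool
    demo_over_budget_manual(DemoHashTable *ht, size_t tuple_size)
    {
        ht->space_used += tuple_size;
        return ht->space_used > ht->space_allowed;
    }

    /* Context style: one comparison against the counter the context already keeps. */
    static bool
    demo_over_budget_context(DemoHashTable *ht)
    {
        return ht->context->mem_allocated > ht->space_allowed;
    }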

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


