Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg

Robert Haas Wed, 06 Aug 2014 08:45:12 -0700

On Sat, Aug 2, 2014 at 4:40 PM, Jeff Davis wrote:
> Attached is a patch that explicitly tracks allocated memory (the blocks,
> not the chunks) for each memory context, as well as its children.
>
> This is a prerequisite for memory-bounded HashAgg, which I intend to
> submit for the next CF. Hashjoin tracks the tuple sizes that it adds to
> the hash table, which is a good estimate for Hashjoin. But I don't think
> it's as easy for Hashagg, for which we need to track transition values,
> etc. (also, for HashAgg, I expect that the overhead will be more
> significant than for Hashjoin). If we track the space used by the memory
> contexts directly, it's easier and more accurate.
>
> I did some simple pgbench select-only tests, and I didn't see any TPS
> difference.


I was curious whether a performance difference would show up when
sorting, so I tried it out.  I set up a test with pgbench -i -s 300.  I
then repeatedly restarted the database, and after each restart, did
this:

time psql -c 'set trace_sort=on; reindex index pgbench_accounts_pkey;'

I alternated runs between master and master with this patch, and got
the following results:

master:
LOG:  internal sort ended, 1723933 KB used: CPU 2.58s/11.54u sec elapsed 16.88 sec
LOG:  internal sort ended, 1723933 KB used: CPU 2.50s/12.37u sec elapsed 17.60 sec
LOG:  internal sort ended, 1723933 KB used: CPU 2.14s/11.28u sec elapsed 16.11 sec

memory-accounting:
LOG:  internal sort ended, 1723933 KB used: CPU 2.57s/11.97u sec elapsed 17.39 sec
LOG:  internal sort ended, 1723933 KB used: CPU 2.30s/12.57u sec elapsed 17.68 sec
LOG:  internal sort ended, 1723933 KB used: CPU 2.54s/11.99u sec elapsed 17.25 sec

Comparing the median times, that's about a 3% regression.  For this
particular case, we might be able to recapture that by replacing the
bespoke memory-tracking logic in tuplesort.c with use of this new
facility.  I'm not sure whether there are other cases that we might
also want to test; I think stuff that runs all on the server side is
likely to show up problems more clearly than pgbench.  Maybe a
PL/pgsql loop that does something allocation-intensive on each
iteration, for example, like parsing a big JSON document.
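
For anyone who wants to check the arithmetic, the roughly 3% figure
follows from the elapsed times reported above.  Here is a minimal
sketch in Python; the variable and function names are made up for
illustration and are not part of the patch or of any PostgreSQL tool.

```python
from statistics import median

# Elapsed times in seconds, copied from the trace_sort output above.
master = [16.88, 17.60, 16.11]
patched = [17.39, 17.68, 17.25]


def median_regression_pct(baseline, candidate):
    """Percent slowdown of candidate's median relative to baseline's median."""
    base = median(baseline)
    return (median(candidate) - base) / base * 100.0


regression = median_regression_pct(master, patched)
print(f"median master={median(master):.2f}s "
      f"patched={median(patched):.2f}s "
      f"regression={regression:.1f}%")
```

With only three runs per branch the median is just the middle value
(16.88 vs. 17.39 sec), so this is a rough indication rather than a
statistically solid measurement.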

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

