On Sat, 28 Mar 2020 at 13:21, Jeff Davis <pg...@j-davis.com> wrote:
> Attached refactoring patch. There's enough in here that warrants
> discussion that I don't think this makes sense for v13 and I'm adding
> it to the July commitfest.
I had a read over this too. I noted down the following during my pass of it:

1. The comment mentions "passthru", but you've removed that parameter.

  * For now, the passthru pointer just points to "int level"; later we might
  * make that more complicated.
  */
 static void
-MemoryContextStatsPrint(MemoryContext context, void *passthru,
+MemoryContextStatsPrint(MemoryContext context, int level,
                         const char *stats_string)

2. I don't think MemoryContextCount is the best name for this function.
When I saw:

counters = MemoryContextCount(aggstate->hash_metacxt, flags, true);

I assumed it was returning some integer number, that is, until I looked
at the "counters" datatype. Maybe it would be better to name the
function MemoryContextGetTelemetry and the struct MemoryContextTelemetry
rather than MemoryContextCounters? Or maybe MemoryContextTally, and call
the function either MemoryContextGetTelemetry() or
MemoryContextGetTally(). Or perhaps MemoryContextGetAccounting() and the
struct MemoryContextAccounting.

3. I feel it would be nicer if you didn't change the "count" methods to
return a MemoryContextCounters. It means you may need to zero a struct
for each level, assign the values, then add them to the total. If you
were instead to zero the struct once in MemoryContextCount() and then
pass it into the count function, you could have the count function do
all the += work. It would reduce the code in MemoryContextCount() too.

4. Do you think it would be better to have two separate functions for
MemoryContextCount(): a recursive version and a non-recursive version?
I feel that the function should be so simple that it does not make a
great deal of sense to load it up to handle both cases. Looking around
mcxt.c, I see MemoryContextResetOnly() and MemoryContextResetChildren().
While that's not a perfect example, it does seem like a better lead to
follow.

5. For performance testing, I tried using the following table with 1MB
work_mem, then again with 1GB work_mem.
I wondered if making the accounting more complex would cause a slowdown
in nodeAgg.c, as I see we're calling this function each time we get a
tuple that belongs in a new group. The 1MB test is likely not such a
great way to measure this, since we do spill to disk in that case and
the change in accounting means we do that at a slightly different time,
but the 1GB test does not spill and it's a bit slower.

create table t (a int);
insert into t select x from generate_series(1,10000000) x;
analyze t;

-- Unpatched
set work_mem = '1MB';
explain analyze select a from t group by a; -- Ran 3 times.

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=481750.10..659875.95 rows=10000048 width=4) (actual time=3350.190..8193.400 rows=10000000 loops=1)
   Group Key: a
   Planned Partitions: 32
   Peak Memory Usage: 1177 kB
   Disk Usage: 234920 kB
   HashAgg Batches: 1188
   ->  Seq Scan on t  (cost=0.00..144248.48 rows=10000048 width=4) (actual time=0.013..1013.755 rows=10000000 loops=1)
 Planning Time: 0.131 ms
 Execution Time: 8586.420 ms
 Execution Time: 8446.961 ms
 Execution Time: 8449.492 ms
(9 rows)

-- Patched
set work_mem = '1MB';
explain analyze select a from t group by a;

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=481748.00..659873.00 rows=10000000 width=4) (actual time=3470.107..8598.836 rows=10000000 loops=1)
   Group Key: a
   Planned Partitions: 32
   Peak Memory Usage: 1033 kB
   Disk Usage: 234816 kB
   HashAgg Batches: 1056
   ->  Seq Scan on t  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.017..1091.820 rows=10000000 loops=1)
 Planning Time: 0.285 ms
 Execution Time: 8996.824 ms
 Execution Time: 8781.624 ms
 Execution Time: 8900.324 ms
(9 rows)

-- Unpatched
set work_mem = '1GB';
explain analyze select a from t group by a;

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=169248.00..269248.00 rows=10000000 width=4) (actual time=4537.779..7033.318 rows=10000000 loops=1)
   Group Key: a
   Peak Memory Usage: 868369 kB
   ->  Seq Scan on t  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.018..820.136 rows=10000000 loops=1)
 Planning Time: 0.054 ms
 Execution Time: 7561.063 ms
 Execution Time: 7573.985 ms
 Execution Time: 7572.058 ms
(6 rows)

-- Patched
set work_mem = '1GB';
explain analyze select a from t group by a;

                                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=169248.00..269248.00 rows=10000000 width=4) (actual time=4840.045..7359.970 rows=10000000 loops=1)
   Group Key: a
   Peak Memory Usage: 861975 kB
   ->  Seq Scan on t  (cost=0.00..144248.00 rows=10000000 width=4) (actual time=0.018..789.975 rows=10000000 loops=1)
 Planning Time: 0.055 ms
 Execution Time: 7904.069 ms
 Execution Time: 7913.692 ms
 Execution Time: 7927.061 ms
(6 rows)

Perhaps the slowdown is unrelated. If it's not, then maybe the reduction
in branching mentioned by Andres might help a bit, plus maybe a bit from
what I mentioned above about passing in the counter struct instead of
returning it at each level.

David