Re: Extremely slow HashAggregate in simple UNION query
> On 21. Aug 2019, at 20:26, Jeff Janes wrote:
>
> As noted elsewhere, v12 thwarts your attempts to deliberately design the bad
> estimates. You can still get them, you just have to work a bit harder at it:
>
> CREATE FUNCTION j (bigint, bigint) returns setof bigint as $$ select
> generate_series($1,$2) $$ rows 1000 language sql;

Yeah, that's awesome! I didn't know about this until I ran into this issue,
I'll definitely be using it for future estimation problems that are difficult
to fix otherwise!

> I've made an extension which has a function which always returns true, but
> lies about how often it is expected to return true. See the attachment. With
> that, you can fine-tune the planner.
>
> CREATE EXTENSION pg_selectivities ;

Very cool and useful : )!

I think in most cases I'll be okay with declaring a function with a static
ROWS estimate, but I'll consider your extension if I need more flexibility in
the future!

Thanks
Felix
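For the archives, a minimal sketch of the static-ROWS approach described
above, applied to the toy query from this thread (the function name
one_pair is made up for illustration):

```sql
-- Wrap the part of the query whose estimate is off in a SQL function and
-- declare a ROWS estimate that matches reality (here: 1 row).
CREATE FUNCTION one_pair() RETURNS TABLE (a int, b int) AS $$
  SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
$$ ROWS 1 LANGUAGE sql;

-- The planner now uses the declared estimate, so the HashAggregate for the
-- UNION is sized for ~2 input rows instead of millions:
EXPLAIN ANALYZE
SELECT * FROM one_pair() UNION SELECT * FROM one_pair();
```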
Re: Extremely slow HashAggregate in simple UNION query
Hi,

> On 20. Aug 2019, at 19:32, Andres Freund wrote:
>
> Hi,
>
> On 2019-08-20 17:11:58 +0200, Felix Geisendörfer wrote:
>>
>> HashAggregate (cost=80020.01..100020.01 rows=2000000 width=8) (actual
>> time=19.349..23.123 rows=1 loops=1)
>
> FWIW, that's not a mis-estimate I'm getting on master ;). Obviously
> that doesn't actually address your concern...

I suppose this is thanks to the new optimizer support functions mentioned by
Michael and Pavel?

Of course I'm very excited about those improvements, but yeah, my real query
is mis-estimating things for totally different reasons not involving any SRFs.

>> I'm certainly a novice when it comes to PostgreSQL internals, but I'm
>> wondering if this could be fixed by taking a more dynamic approach for
>> allocating HashAggregate hash tables?
>
> Under-sizing the hashtable just out of caution will have add overhead to
> a lot more common cases. That requires copying data around during
> growth, which is far far from free. Or you can use hashtables that don't
> need to copy, but they're also considerably slower in the more common
> cases.

How does PostgreSQL currently handle the case where the initial hash table is
under-sized due to the planner having underestimated things? Are the growth
costs getting amortized by using an exponential growth function?

Anyway, I can accept my situation to be an edge case that doesn't justify
making things more complicated.

>> 3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of
>> the query execution instead of 45ms [5].
>
> Well, there's plenty work that's not attributed to nodes. IIRC we don't
> track executor startup/shutdown overhead on a per-node basis.

I didn't know that, thanks for clarifying : ).
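To illustrate the amortization question above: a toy model (not PostgreSQL's
actual code) of a table that doubles its capacity when full shows why
exponential growth keeps the per-insert cost O(1) even when the initial size
is far too small.

```python
# Toy model: count how many item moves (rehashes) doubling-based growth costs
# in total. This is an illustration only, not PostgreSQL's implementation.

def count_moves(n_inserts, initial_size=8):
    """Insert n_inserts items, doubling capacity when full, and return the
    total number of item moves caused by the growth steps."""
    capacity = initial_size
    size = 0
    moves = 0
    for _ in range(n_inserts):
        if size == capacity:
            moves += size      # every existing item is rehashed into the new table
            capacity *= 2
        size += 1
    return moves

# Total moves stay below 2 * n (a geometric series), so growth costs O(1)
# amortized per insert, though each individual doubling is not free.
for n in (1_000, 100_000):
    assert count_moves(n) < 2 * n
```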
Extremely slow HashAggregate in simple UNION query
Hi all,

today I debugged a query that was executing about 100x slower than expected,
and was very surprised by what I found. I'm posting to this list to see if
this might be an issue that should be fixed in PostgreSQL itself.

Below is a simplified version of the query in question:

SET work_mem='64MB';
EXPLAIN ANALYZE
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
UNION
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;

HashAggregate  (cost=80020.01..100020.01 rows=2000000 width=8) (actual time=19.349..23.123 rows=1 loops=1)
  Group Key: a.a, b.b
  ->  Append  (cost=0.01..70020.01 rows=2000000 width=8) (actual time=0.022..0.030 rows=2 loops=1)
        ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.021..0.022 rows=1 loops=1)
              ->  Function Scan on generate_series a  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.014..0.014 rows=1 loops=1)
              ->  Function Scan on generate_series b  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=1)
        ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.006..0.007 rows=1 loops=1)
              ->  Function Scan on generate_series a_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=1)
              ->  Function Scan on generate_series b_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.002 rows=1 loops=1)
Planning Time: 0.101 ms
Execution Time: 45.986 ms

As you can see, it takes over 45ms (!) to execute what appears to be a very
simple query. How is this possible?

Based on my debugging, I think the following is going on:

1. The query overestimates the final output rows by a factor of 2 million. [1]
2. The executor uses the bad estimate to allocate a huge hash table [2], and
   the increased work_mem [3] gives it enough rope to hang itself [4].
3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of
   the query execution instead of 45ms [5].
I'm certainly a novice when it comes to PostgreSQL internals, but I'm
wondering if this could be fixed by taking a more dynamic approach for
allocating HashAggregate hash tables?

Anyway, for my query using UNION ALL was acceptable, so I'm not too troubled
by this. I was just really caught by surprise that bad estimates can not only
cause bad query plans, but also cause good query plans to execute extremely
slowly. I had never seen that before.

Best Regards
Felix

[1] My actual query had bad estimates for other reasons (GIN Index), but
that's another story. The query above was of course deliberately designed to
have bad estimates.
[2] nodeAgg.c: build_hash_table()
[3] A lot of queries in my application benefit from increased work_mem.
[4] execGrouping.c: nbuckets = Min(nbuckets, (long) ((work_mem * 1024L) / entrysize));
[5] In my actual query it was even worse, only 6 out of 40ms of the total
execution time were reported as being spent in the query nodes.
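For anyone finding this thread later, the UNION ALL workaround can be
sketched as follows (assuming duplicate rows are acceptable for the
workload):

```sql
-- UNION ALL skips the duplicate-eliminating HashAggregate entirely, so no
-- hash table is sized from the bad estimate in the first place.
EXPLAIN ANALYZE
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
UNION ALL
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;
```

The resulting plan is just an Append over the two Nested Loops, avoiding the
oversized hash table.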
GIN Index has O(N^2) complexity for array overlap operator?
Hi everybody,

I ran into an issue with using the && array operator on a GIN index of mine.
Basically I have a query that looks like this:

SELECT * FROM example WHERE keys && ARRAY[...];

This works fine for a small number of array elements (N), but gets really
slow as N gets bigger, in what appears to be O(N^2) complexity. However, from
studying the GIN data structure as described by the docs, it seems that the
performance for this could be O(N). In fact, it's possible to coerce the
query planner into an O(N) plan like this:

SELECT DISTINCT ON (example.id) *
FROM unnest(ARRAY[...]) key
JOIN example ON keys && ARRAY[key];

In order to illustrate this better, I've created a jupyter notebook that
populates an example table, shows the query plans for both queries, and most
importantly benchmarks them and plots a time vs array size (N) graph:

https://github.com/felixge/pg-slow-gin/blob/master/pg-slow-gin.ipynb

Please help me understand what causes the O(N^2) performance for query 1 and
whether query 2 is the best way to work around this issue.

Thanks
Felix Geisendörfer

PS: I'm using Postgres 10, but also verified that this problem exists with
Postgres 11.
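A minimal sketch of a setup for reproducing the comparison (the schema here
is made up for illustration; the linked notebook has the real one):

```sql
CREATE TABLE example (id serial PRIMARY KEY, keys int[]);
CREATE INDEX ON example USING gin (keys);

-- Query 1: one overlap check against the whole array at once.
EXPLAIN ANALYZE
SELECT * FROM example WHERE keys && ARRAY[1, 2, 3];

-- Query 2: one overlap probe per element, deduplicated by id.
EXPLAIN ANALYZE
SELECT DISTINCT ON (example.id) example.*
FROM unnest(ARRAY[1, 2, 3]) key
JOIN example ON keys && ARRAY[key];
```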