Re: Extremely slow HashAggregate in simple UNION query
I believe this would be relevant: https://www.cybertec-postgresql.com/en/optimizer-support-functions/

It seems there is hope down the road to improve those estimates.
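For the curious, one way to see which functions already have a planner support function attached is to look at the prosupport column of pg_proc (the column only exists on v12+; the exact support-function names shown are what one would expect there, not output captured from this thread):

SELECT oid::regprocedure AS func, prosupport
FROM pg_proc
WHERE proname = 'generate_series';
-- on v12+ the integer variants should show a support function such as
-- generate_series_int4_support; on v11 the prosupport column doesn't exist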
Re: Extremely slow HashAggregate in simple UNION query
Hi,

On 2019-08-20 19:55:56 +0200, Felix Geisendörfer wrote:
> > On 20. Aug 2019, at 19:32, Andres Freund wrote:
> >
> > FWIW, that's not a mis-estimate I'm getting on master ;). Obviously
> > that doesn't actually address your concern...
>
> I suppose this is thanks to the new optimizer support functions
> mentioned by Michael and Pavel?

Right.

> > Under-sizing the hashtable just out of caution would add overhead to
> > a lot more common cases. That requires copying data around during
> > growth, which is far, far from free. Or you can use hashtables that
> > don't need to copy, but they're also considerably slower in the more
> > common cases.
>
> How does PostgreSQL currently handle the case where the initial hash
> table is under-sized due to the planner having underestimated things?
> Are the growth costs amortized by using an exponential growth
> function?

Yes. But that's still far from free.

Greetings,

Andres Freund
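For context, the standard amortization arithmetic behind that exchange: growing a hash table from 1 to n buckets by repeated doubling moves 1 + 2 + 4 + ... + n/2 < n entries in total, so insertion stays amortized O(1). But each individual doubling still has to re-copy every live entry at once, which is the non-trivial cost being referred to above.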
Re: Extremely slow HashAggregate in simple UNION query
Hi,

> On 20. Aug 2019, at 19:32, Andres Freund wrote:
>
> Hi,
>
> On 2019-08-20 17:11:58 +0200, Felix Geisendörfer wrote:
>> HashAggregate (cost=80020.01..100020.01 rows=2000000 width=8) (actual
>> time=19.349..23.123 rows=1 loops=1)
>
> FWIW, that's not a mis-estimate I'm getting on master ;). Obviously
> that doesn't actually address your concern...

I suppose this is thanks to the new optimizer support functions
mentioned by Michael and Pavel?

Of course I'm very excited about those improvements, but my real query
mis-estimates things for totally different reasons that don't involve
any SRFs.

>> I'm certainly a novice when it comes to PostgreSQL internals, but I'm
>> wondering if this could be fixed by taking a more dynamic approach for
>> allocating HashAggregate hash tables?
>
> Under-sizing the hashtable just out of caution would add overhead to
> a lot more common cases. That requires copying data around during
> growth, which is far, far from free. Or you can use hashtables that
> don't need to copy, but they're also considerably slower in the more
> common cases.

How does PostgreSQL currently handle the case where the initial hash
table is under-sized due to the planner having underestimated things?
Are the growth costs amortized by using an exponential growth function?

Anyway, I can accept that my situation is an edge case that doesn't
justify making things more complicated.

>> 3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of
>> the query execution instead of 45ms [5].
>
> Well, there's plenty of work that's not attributed to nodes. IIRC we
> don't track executor startup/shutdown overhead on a per-node basis.

I didn't know that, thanks for clarifying :).
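The effect of those support functions is easy to see in isolation. A minimal check (the v12 line below is the roughly expected output of the constant-argument estimation, not output captured from this thread):

EXPLAIN SELECT * FROM generate_series(1, 1) s;

-- v11:  Function Scan on generate_series s  (cost=0.00..10.00 rows=1000 width=4)
-- v12+: the support function sees the constant arguments, giving roughly:
--       Function Scan on generate_series s  (cost=0.00..0.01 rows=1 width=4)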
Re: Extremely slow HashAggregate in simple UNION query
Hi,

On 2019-08-20 17:11:58 +0200, Felix Geisendörfer wrote:
> today I debugged a query that was executing about 100x slower than expected,
> and was very surprised by what I found.
>
> I'm posting to this list to see if this might be an issue that should be
> fixed in PostgreSQL itself.
>
> Below is a simplified version of the query in question:
>
> SET work_mem='64MB';
> EXPLAIN ANALYZE
> SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
> UNION
> SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;
>
> HashAggregate (cost=80020.01..100020.01 rows=2000000 width=8) (actual
> time=19.349..23.123 rows=1 loops=1)

FWIW, that's not a mis-estimate I'm getting on master ;). Obviously
that doesn't actually address your concern...

> 1. The query overestimates the final output rows by a factor of 2 million. [1]

Right. There's not really that much we can do about that in general.
That'll always be possible. Although we can obviously improve the
estimates a good bit more.

> I'm certainly a novice when it comes to PostgreSQL internals, but I'm
> wondering if this could be fixed by taking a more dynamic approach for
> allocating HashAggregate hash tables?

Under-sizing the hashtable just out of caution would add overhead to a
lot more common cases. That requires copying data around during growth,
which is far, far from free. Or you can use hashtables that don't need
to copy, but they're also considerably slower in the more common cases.

> 3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of
> the query execution instead of 45ms [5].

Well, there's plenty of work that's not attributed to nodes. IIRC we
don't track executor startup/shutdown overhead on a per-node basis. So
I don't really think this is necessarily something suspicious.

Which indeed seems to be what's happening here (this is with 11, to be
able to hit the problem with your reproducer):

+   33.01%  postgres  postgres       [.] tuplehash_iterate
-   18.39%  postgres  libc-2.28.so   [.] __memset_avx2_erms
   - 90.94% page_fault
        __memset_avx2_erms
        tuplehash_allocate
        tuplehash_create
        BuildTupleHashTableExt
        build_hash_table
        ExecInitAgg
        ExecInitNode
        InitPlan
        standard_ExecutorStart

Greetings,

Andres Freund
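Since the cap in [4] is nbuckets = Min(nbuckets, (work_mem * 1024) / entrysize), the size of the over-allocated table scales directly with work_mem. A quick way to confirm that on an affected version is to rerun the reproducer with the default setting (a sketch; the exact timing will vary, but it should drop to a few milliseconds even with the same rows=2000000 estimate):

SET work_mem = '4MB';  -- back to the default, capping nbuckets much lower
EXPLAIN ANALYZE
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
UNION
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;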
Re: Extremely slow HashAggregate in simple UNION query
On Tue, 20 Aug 2019 at 17:12, Felix Geisendörfer wrote:
> Hi all,
>
> today I debugged a query that was executing about 100x slower than
> expected, and was very surprised by what I found.
>
> I'm posting to this list to see if this might be an issue that should be
> fixed in PostgreSQL itself.
>
> Below is a simplified version of the query in question:
>
> SET work_mem='64MB';
> EXPLAIN ANALYZE
> SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
> UNION
> SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;
>
> HashAggregate  (cost=80020.01..100020.01 rows=2000000 width=8) (actual time=19.349..23.123 rows=1 loops=1)
>   Group Key: a.a, b.b
>   ->  Append  (cost=0.01..70020.01 rows=2000000 width=8) (actual time=0.022..0.030 rows=2 loops=1)
>         ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.021..0.022 rows=1 loops=1)
>               ->  Function Scan on generate_series a  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.014..0.014 rows=1 loops=1)
>               ->  Function Scan on generate_series b  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=1)
>         ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.006..0.007 rows=1 loops=1)
>               ->  Function Scan on generate_series a_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=1)
>               ->  Function Scan on generate_series b_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.002 rows=1 loops=1)
> Planning Time: 0.101 ms
> Execution Time: 45.986 ms
>
> As you can see, it takes over 45ms (!) to execute what appears to be a very
> simple query. How is this possible?
>
> Based on my debugging, I think the following is going on:
>
> 1. The query overestimates the final output rows by a factor of 2 million. [1]
> 2. The executor uses the bad estimate to allocate a huge hash table [2],
> and the increased work_mem [3] gives it enough rope to hang itself [4].
> 3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of
> the query execution instead of 45ms [5].
>
> I'm certainly a novice when it comes to PostgreSQL internals, but I'm
> wondering if this could be fixed by taking a more dynamic approach for
> allocating HashAggregate hash tables?

Table functions have no statistics, and the default ROWS value is 1000,
which is why the estimate is so far off. This specific case is handled
better in PostgreSQL 12, where some functions like generate_series have
support functions that produce better estimates.

You can profile this query with a profiler
(https://wiki.postgresql.org/wiki/Profiling_with_perf) to see the
reason why it is slow.

On PostgreSQL 12 the speed of your example is good - about 1 ms. But
when I repeated your example, the speed was much worse. On the other
hand, nobody can expect an optimal plan with such a wild misestimate.
It looks like something is wrong in some cleanup part.

Pavel

> Anyway, for my query using UNION ALL was acceptable, so I'm not too
> troubled by this. I was just really caught by surprise that bad estimates
> can not only cause bad query plans, but also cause good query plans to
> execute extremely slowly. I had never seen that before.
>
> Best Regards
> Felix
>
> [1] My actual query had bad estimates for other reasons (GIN index), but
> that's another story. The query above was of course deliberately designed
> to have bad estimates.
> [2] nodeAgg.c: build_hash_table()
> [3] A lot of queries in my application benefit from increased work_mem.
> [4] execGrouping.c: nbuckets = Min(nbuckets, (long) ((work_mem * 1024L) / entrysize));
> [5] In my actual query it was even worse, only 6 out of 40ms of the total
> execution time were reported as being spent in the query nodes.
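On versions without support functions, the ROWS clause Pavel mentions can at least be pinned on a wrapper function. A minimal sketch (the wrapper name is made up for illustration):

CREATE FUNCTION tiny_series(lo int, hi int)
  RETURNS SETOF int
  LANGUAGE sql
  ROWS 1  -- planner will assume ~1 result row instead of the default 1000
AS 'SELECT generate_series($1, $2)';

Using tiny_series(1, 1) in place of generate_series(1, 1) in the reproducer would give the HashAggregate a sane input estimate, at the price of hard-coding the expected row count in the function definition.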
Extremely slow HashAggregate in simple UNION query
Hi all,

today I debugged a query that was executing about 100x slower than
expected, and was very surprised by what I found.

I'm posting to this list to see if this might be an issue that should be
fixed in PostgreSQL itself.

Below is a simplified version of the query in question:

SET work_mem='64MB';
EXPLAIN ANALYZE
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
UNION
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;

HashAggregate  (cost=80020.01..100020.01 rows=2000000 width=8) (actual time=19.349..23.123 rows=1 loops=1)
  Group Key: a.a, b.b
  ->  Append  (cost=0.01..70020.01 rows=2000000 width=8) (actual time=0.022..0.030 rows=2 loops=1)
        ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.021..0.022 rows=1 loops=1)
              ->  Function Scan on generate_series a  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.014..0.014 rows=1 loops=1)
              ->  Function Scan on generate_series b  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=1)
        ->  Nested Loop  (cost=0.01..20010.01 rows=1000000 width=8) (actual time=0.006..0.007 rows=1 loops=1)
              ->  Function Scan on generate_series a_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=1)
              ->  Function Scan on generate_series b_1  (cost=0.00..10.00 rows=1000 width=4) (actual time=0.002..0.002 rows=1 loops=1)
Planning Time: 0.101 ms
Execution Time: 45.986 ms

As you can see, it takes over 45ms (!) to execute what appears to be a
very simple query. How is this possible?

Based on my debugging, I think the following is going on:

1. The query overestimates the final output rows by a factor of 2 million. [1]
2. The executor uses the bad estimate to allocate a huge hash table [2],
and the increased work_mem [3] gives it enough rope to hang itself [4].
3. Somehow EXPLAIN gets confused by this and only ends up tracking 23ms of
the query execution instead of 45ms [5].

I'm certainly a novice when it comes to PostgreSQL internals, but I'm
wondering if this could be fixed by taking a more dynamic approach for
allocating HashAggregate hash tables?

Anyway, for my query using UNION ALL was acceptable, so I'm not too
troubled by this. I was just really caught by surprise that bad estimates
can not only cause bad query plans, but also cause good query plans to
execute extremely slowly. I had never seen that before.

Best Regards
Felix

[1] My actual query had bad estimates for other reasons (GIN index), but
that's another story. The query above was of course deliberately designed
to have bad estimates.
[2] nodeAgg.c: build_hash_table()
[3] A lot of queries in my application benefit from increased work_mem.
[4] execGrouping.c: nbuckets = Min(nbuckets, (long) ((work_mem * 1024L) / entrysize));
[5] In my actual query it was even worse, only 6 out of 40ms of the total
execution time were reported as being spent in the query nodes.
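For reference, the UNION ALL variant of the reproducer looks like this (a sketch; since UNION ALL keeps duplicates, no HashAggregate node is planned and the oversized hash table is never built - note that this changes the query's semantics, which happened to be acceptable for the original query):

SET work_mem='64MB';
EXPLAIN ANALYZE
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b
UNION ALL  -- no implicit DISTINCT, so no HashAggregate
SELECT * FROM generate_series(1, 1) a, generate_series(1, 1) b;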
Re: Erratically behaving query needs optimization
Yes, adding another index might help reduce the number of rows filtered
("Rows Removed by Filter: 1308337"). Also, make sure you run VACUUM
ANALYZE on the tables involved in this query.

Regards,
Michael Vitale

Luís Roberto Weck wrote on 8/20/2019 10:58 AM:

On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:

Hello,

I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
64-bit" and I have a query that runs several times per user action
(9-10 times).

The query takes a long time to execute, especially at first, due to
cold caches I think, but the performance varies greatly during a run
of the application (while the user applies the said action several
times).

My tables are only getting bigger with time, with not many DELETEs and
even fewer UPDATEs as far as I can tell.

Problematic query:

EXPLAIN (ANALYZE,BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp, results.data
FROM results
JOIN scheduler_operation_executions
  ON scheduler_operation_executions.id = results.operation_execution_id
JOIN scheduler_task_executions
  ON scheduler_task_executions.id = scheduler_operation_executions.task_execution_id
WHERE scheduler_task_executions.device_id = 97
  AND results.data <> ''
  AND results.data IS NOT NULL
  AND results.object_id = 1955
  AND results.attribute_id IN (4, 5) -- possibly a longer list here
  AND results.data_access_result = 'SUCCESS'
ORDER BY results.attribute_id, results.timestamp DESC
LIMIT 2 -- limit by the length of the attributes list

In words: I want the latest (ORDER BY results.timestamp DESC) results
of a device (scheduler_task_executions.device_id = 97 - hence the joins
results -> scheduler_operation_executions -> scheduler_task_executions)
for a given object and attributes, with some additional constraints on
the data column. But I only want the latest result per attribute for
which we have results, hence the DISTINCT ON (results.attribute_id)
and LIMIT.

First run: https://explain.depesz.com/s/qh4C

Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual time=44068.166..44086.970 rows=2 loops=1)
  Buffers: shared hit=215928 read=85139
  ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual time=44068.164..44069.301 rows=2 loops=1)
        Buffers: shared hit=215928 read=85139
        ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54) (actual time=44068.161..44068.464 rows=2052 loops=1)
              Sort Key: results.attribute_id, results."timestamp" DESC
              Sort Method: quicksort  Memory: 641kB
              Buffers: shared hit=215928 read=85139
              ->  Gather  (cost=62853.04..157098.57 rows=3162 width=54) (actual time=23518.745..44076.385 rows=4102 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=215928 read=85139
                    ->  Nested Loop  (cost=61853.04..155782.37 rows=1318 width=54) (actual time=23290.514..43832.223 rows=1367 loops=3)
                          Buffers: shared hit=215928 read=85139
                          ->  Parallel Hash Join  (cost=61852.61..143316.27 rows=24085 width=4) (actual time=23271.275..40018.451 rows=19756 loops=3)
                                Hash Cond: (scheduler_operation_executions.task_execution_id = scheduler_task_executions.id)
                                Buffers: shared hit=6057 read=85139
                                ->  Parallel Seq Scan on scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982 width=8) (actual time=7.575..15694.435 rows=1986887 loops=3)
                                      Buffers: shared hit=2996 read=47120
                                ->  Parallel Hash  (cost=61652.25..61652.25 rows=16029 width=4) (actual time=23253.337..23253.337 rows=13558 loops=3)
                                      Buckets: 65536  Batches: 1  Memory Usage: 2144kB
                                      Buffers: shared hit=2977 read=38019
                                      ->  Parallel Seq Scan on scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4) (actual time=25.939..23222.174 rows=13558 loops=3)
                                            Filter: (device_id = 97)
                                            Rows Removed by Filter: 1308337
                                            Buffers: shared hit=2977 read=38019
                          ->  Index Scan using index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1 width=58) (actual time=0.191..0.191 rows=0 loops=59269)
                                Index Cond: (operation_execution_id = scheduler_operation_executions.id)
                                Filter: ((data IS NOT NULL) AND (data <> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                                Rows Removed by Filter: 0
                                Buffers: shared hit=209871
Planning Time: 29.295 ms
Execution Time: 44087.365 ms
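To make the index suggestion concrete, a hedged sketch inferred from the Parallel Seq Scan's "Filter: (device_id = 97)" in the plan (table and column names are taken from the plan; verify against the actual schema before creating it):

CREATE INDEX CONCURRENTLY idx_scheduler_task_executions_device_id
    ON scheduler_task_executions (device_id);

-- refresh planner statistics afterwards, as suggested above
VACUUM ANALYZE scheduler_task_executions;

This would aim to replace the ~23-second parallel seq scan that discards over a million rows per worker with a targeted lookup of the ~13.5k matching rows.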
Re: Erratically behaving query needs optimization
On 20/08/2019 10:54, Barbu Paul - Gheorghe wrote:

Hello,

I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
64-bit" and I have a query that runs several times per user action
(9-10 times).

The query takes a long time to execute, especially at first, due to
cold caches I think, but the performance varies greatly during a run
of the application (while the user applies the said action several
times).

My tables are only getting bigger with time, with not many DELETEs and
even fewer UPDATEs as far as I can tell.

Problematic query:

EXPLAIN (ANALYZE,BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp, results.data
FROM results
JOIN scheduler_operation_executions
  ON scheduler_operation_executions.id = results.operation_execution_id
JOIN scheduler_task_executions
  ON scheduler_task_executions.id = scheduler_operation_executions.task_execution_id
WHERE scheduler_task_executions.device_id = 97
  AND results.data <> ''
  AND results.data IS NOT NULL
  AND results.object_id = 1955
  AND results.attribute_id IN (4, 5) -- possibly a longer list here
  AND results.data_access_result = 'SUCCESS'
ORDER BY results.attribute_id, results.timestamp DESC
LIMIT 2 -- limit by the length of the attributes list

In words: I want the latest (ORDER BY results.timestamp DESC) results
of a device (scheduler_task_executions.device_id = 97 - hence the joins
results -> scheduler_operation_executions -> scheduler_task_executions)
for a given object and attributes, with some additional constraints on
the data column. But I only want the latest result per attribute for
which we have results, hence the DISTINCT ON (results.attribute_id)
and LIMIT.

First run: https://explain.depesz.com/s/qh4C

Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual time=44068.166..44086.970 rows=2 loops=1)
  Buffers: shared hit=215928 read=85139
  ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual time=44068.164..44069.301 rows=2 loops=1)
        Buffers: shared hit=215928 read=85139
        ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54) (actual time=44068.161..44068.464 rows=2052 loops=1)
              Sort Key: results.attribute_id, results."timestamp" DESC
              Sort Method: quicksort  Memory: 641kB
              Buffers: shared hit=215928 read=85139
              ->  Gather  (cost=62853.04..157098.57 rows=3162 width=54) (actual time=23518.745..44076.385 rows=4102 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=215928 read=85139
                    ->  Nested Loop  (cost=61853.04..155782.37 rows=1318 width=54) (actual time=23290.514..43832.223 rows=1367 loops=3)
                          Buffers: shared hit=215928 read=85139
                          ->  Parallel Hash Join  (cost=61852.61..143316.27 rows=24085 width=4) (actual time=23271.275..40018.451 rows=19756 loops=3)
                                Hash Cond: (scheduler_operation_executions.task_execution_id = scheduler_task_executions.id)
                                Buffers: shared hit=6057 read=85139
                                ->  Parallel Seq Scan on scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982 width=8) (actual time=7.575..15694.435 rows=1986887 loops=3)
                                      Buffers: shared hit=2996 read=47120
                                ->  Parallel Hash  (cost=61652.25..61652.25 rows=16029 width=4) (actual time=23253.337..23253.337 rows=13558 loops=3)
                                      Buckets: 65536  Batches: 1  Memory Usage: 2144kB
                                      Buffers: shared hit=2977 read=38019
                                      ->  Parallel Seq Scan on scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4) (actual time=25.939..23222.174 rows=13558 loops=3)
                                            Filter: (device_id = 97)
                                            Rows Removed by Filter: 1308337
                                            Buffers: shared hit=2977 read=38019
                          ->  Index Scan using index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1 width=58) (actual time=0.191..0.191 rows=0 loops=59269)
                                Index Cond: (operation_execution_id = scheduler_operation_executions.id)
                                Filter: ((data IS NOT NULL) AND (data <> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                                Rows Removed by Filter: 0
                                Buffers: shared hit=209871
Planning Time: 29.295 ms
Execution Time: 44087.365 ms

Second run: https://explain.depesz.com/s/uy9f

Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual time=789.363..810.440 rows=2 loops=1)
  Buffers: shared hit=216312 read=84755
  ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual time=789.361..789.535 rows=2 loops=1)
Erratically behaving query needs optimization
Hello,

I'm running "PostgreSQL 11.2, compiled by Visual C++ build 1914,
64-bit" and I have a query that runs several times per user action
(9-10 times).

The query takes a long time to execute, especially at first, due to
cold caches I think, but the performance varies greatly during a run
of the application (while the user applies the said action several
times).

My tables are only getting bigger with time, with not many DELETEs and
even fewer UPDATEs as far as I can tell.

Problematic query:

EXPLAIN (ANALYZE,BUFFERS)
SELECT DISTINCT ON (results.attribute_id) results.timestamp, results.data
FROM results
JOIN scheduler_operation_executions
  ON scheduler_operation_executions.id = results.operation_execution_id
JOIN scheduler_task_executions
  ON scheduler_task_executions.id = scheduler_operation_executions.task_execution_id
WHERE scheduler_task_executions.device_id = 97
  AND results.data <> ''
  AND results.data IS NOT NULL
  AND results.object_id = 1955
  AND results.attribute_id IN (4, 5) -- possibly a longer list here
  AND results.data_access_result = 'SUCCESS'
ORDER BY results.attribute_id, results.timestamp DESC
LIMIT 2 -- limit by the length of the attributes list

In words: I want the latest (ORDER BY results.timestamp DESC) results
of a device (scheduler_task_executions.device_id = 97 - hence the joins
results -> scheduler_operation_executions -> scheduler_task_executions)
for a given object and attributes, with some additional constraints on
the data column. But I only want the latest result per attribute for
which we have results, hence the DISTINCT ON (results.attribute_id)
and LIMIT.

First run: https://explain.depesz.com/s/qh4C

Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual time=44068.166..44086.970 rows=2 loops=1)
  Buffers: shared hit=215928 read=85139
  ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual time=44068.164..44069.301 rows=2 loops=1)
        Buffers: shared hit=215928 read=85139
        ->  Sort  (cost=157282.39..157290.29 rows=3162 width=54) (actual time=44068.161..44068.464 rows=2052 loops=1)
              Sort Key: results.attribute_id, results."timestamp" DESC
              Sort Method: quicksort  Memory: 641kB
              Buffers: shared hit=215928 read=85139
              ->  Gather  (cost=62853.04..157098.57 rows=3162 width=54) (actual time=23518.745..44076.385 rows=4102 loops=1)
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=215928 read=85139
                    ->  Nested Loop  (cost=61853.04..155782.37 rows=1318 width=54) (actual time=23290.514..43832.223 rows=1367 loops=3)
                          Buffers: shared hit=215928 read=85139
                          ->  Parallel Hash Join  (cost=61852.61..143316.27 rows=24085 width=4) (actual time=23271.275..40018.451 rows=19756 loops=3)
                                Hash Cond: (scheduler_operation_executions.task_execution_id = scheduler_task_executions.id)
                                Buffers: shared hit=6057 read=85139
                                ->  Parallel Seq Scan on scheduler_operation_executions  (cost=0.00..74945.82 rows=2482982 width=8) (actual time=7.575..15694.435 rows=1986887 loops=3)
                                      Buffers: shared hit=2996 read=47120
                                ->  Parallel Hash  (cost=61652.25..61652.25 rows=16029 width=4) (actual time=23253.337..23253.337 rows=13558 loops=3)
                                      Buckets: 65536  Batches: 1  Memory Usage: 2144kB
                                      Buffers: shared hit=2977 read=38019
                                      ->  Parallel Seq Scan on scheduler_task_executions  (cost=0.00..61652.25 rows=16029 width=4) (actual time=25.939..23222.174 rows=13558 loops=3)
                                            Filter: (device_id = 97)
                                            Rows Removed by Filter: 1308337
                                            Buffers: shared hit=2977 read=38019
                          ->  Index Scan using index_operation_execution_id_asc on results  (cost=0.43..0.51 rows=1 width=58) (actual time=0.191..0.191 rows=0 loops=59269)
                                Index Cond: (operation_execution_id = scheduler_operation_executions.id)
                                Filter: ((data IS NOT NULL) AND (data <> ''::text) AND (attribute_id = ANY ('{4,5}'::integer[])) AND (object_id = 1955) AND (data_access_result = 'SUCCESS'::text))
                                Rows Removed by Filter: 0
                                Buffers: shared hit=209871
Planning Time: 29.295 ms
Execution Time: 44087.365 ms

Second run: https://explain.depesz.com/s/uy9f

Limit  (cost=157282.39..157290.29 rows=2 width=54) (actual time=789.363..810.440 rows=2 loops=1)
  Buffers: shared hit=216312 read=84755
  ->  Unique  (cost=157282.39..157298.20 rows=4 width=54) (actual time=789.361..789.535 rows=2 loops=1)
        Buffers: shared
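One more idea worth testing against this schema - a sketch only, under the assumption that most rows in results fail the object/attribute filters (column names are taken from the query; whether the planner can use it depends on the data distribution, so measure rather than assume):

CREATE INDEX CONCURRENTLY idx_results_object_attr_ts
    ON results (object_id, attribute_id, "timestamp" DESC)
    WHERE data_access_result = 'SUCCESS'
      AND data IS NOT NULL
      AND data <> '';

With the equality on object_id and the short IN list on attribute_id, such an index matches the ORDER BY results.attribute_id, results.timestamp DESC, so the latest row per attribute could potentially be fetched without sorting thousands of intermediate rows - though the join against device_id = 97 still has to filter the candidates afterwards.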