Re: Parallel hints in PostgreSQL with consistent perfromance
Thank you for your response !! I am experimenting with SQL query performance for SELECT queries on large tables and I observed that changing/increasing the degree of parallel hint doesn't give the expected performance improvement. I have executed the SELECT query with 2,4 & 6 parallel degree however every time only 4 workers launched & there was a slight increase in Execution time as well, why there is an increase in execution time with parallel degree 6 as compared to 2 or 4? Please refer to the test results I am sharing the latest test results here : *Session variables set in psql prompt:* # show max_parallel_workers; max_parallel_workers -- 8 (1 row) # show max_parallel_workers_per_gather; max_parallel_workers_per_gather - 6 (1 row) *1st time query executed with PARALLEL DEGREE 2 * explain analyze select /*+* PARALLEL(A 2)* */ * from test_compare_all_col_src1 A; QUERY PLAN Gather (cost=10.00..45524.73 rows=949636 width=97) (actual time=0.673..173.017 rows=955000 loops=1) Workers Planned: 4 * Workers Launched: 4* -> Parallel Seq Scan on test_compare_all_col_src1 a (cost=0.00..44565.09 rows=237409 width=97) (actual time=0.039..51.941 rows=191000 loops=5) Planning Time: 0.093 ms * Execution Time: 209.745 ms* (6 rows) *2nd time query executed with PARALLEL DEGREE 4* explain analyze select /*+ *PARALLEL(A 4)* */ * from aparopka.test_compare_all_col_src1 A; QUERY PLAN Gather (cost=10.00..45524.73 rows=949636 width=97) (actual time=0.459..174.771 rows=955000 loops=1) Workers Planned: 4 *Workers Launched: 4* -> Parallel Seq Scan on test_compare_all_col_src1 a (cost=0.00..44565.09 rows=237409 width=97) (actual time=0.038..54.320 rows=191000 loops=5) Planning Time: 0.073 ms *Execution Time: 210.170 ms* (6 rows) 3rd time query executed with PARALLEL DEGREE 6 explain analyze select /**+ PARALLEL(A 6)* */ * from aparopka.test_compare_all_col_src1 A; QUERY PLAN Gather (cost=10.00..45524.73 rows=949636 width=97) (actual time=0.560..196.586 rows=955000 loops=1) Workers Planned: 4 *Workers Launched: 4* -> Parallel Seq Scan on test_compare_all_col_src1 a (cost=0.00..44565.09 rows=237409 width=97) (actual time=0.049..58.741 rows=191000 loops=5) Planning Time: 0.095 ms *Execution Time: 235.365 ms* (6 rows) Table Schema : Table "test_compare_all_col_src1" Column |Type | Collation | Nullable | Default | Storage | Stats target | Description -+-+---+--+-+--+--+- col_smallint| integer | | | | plain| | col_int | integer | | | | plain| | col_bigint | bigint | | not null | | plain| | col_numeric | numeric | | | | main | | col_real| real| | | | plain| | col_double | double precision| | | | plain| | col_bool| boolean | | | | plain| | col_char| character(1)| | | | extended | | col_varchar | character varying(2000) | | | | extended | | col_date| date| | | | plain| | col_time| time without time zone | | | | plain| | col_timetz | time with time zone | | | | plain| | col_timestamp | timestamp without time zone | | | | plain| | col_timestamptz | timestamp with time zone| | | | plain| | Indexes: "test_compare_all_col_src1_pkey" PRIMARY KEY, btree (col_bigint) Replica Identity: FULL Access method: heap # select count(*) from test_compare_all_col_src1; count 955000 (1 row) Thanks, --Mohini
Re: Parallel hints in PostgreSQL with consistent perfromance
On Thu, Dec 28, 2023 at 9:47 AM mohini mane wrote: > Thank you for your response !! > I am experimenting with SQL query performance for SELECT queries on large > tables and I observed that changing/increasing the degree of parallel hint > doesn't give the expected performance improvement. > Why do you believe you are changing the degree of parallelism? PostgreSQL does not have parallel hints (or any hint in comments), so you are just changing a comment in the queries, which changes nothing at all in the execution plan. Unless you are not using vanilla PostgreSQL or you have some extension in place, in which case you didn't provide enough information. Best regards, -- Matheus de Oliveira
Re: Parallel hints in PostgreSQL with consistent perfromance
On Thursday, December 28, 2023, mohini mane wrote: > Thank you for your response !! > I am experimenting with SQL query performance for SELECT queries on large > tables and I observed that changing/increasing the degree of parallel hint > doesn't give the expected performance improvement. > > I have executed the SELECT query with 2,4 & 6 parallel degree however > every time only 4 workers launched & there was a slight increase in > Execution time as well, why there is an increase in execution time with > parallel degree 6 as compared to 2 or 4? > Random environmental effects. Also, analyzing a performance test without understanding how “buffers” are used is largely pointless. Whatever told you about that comment syntax is hallucinating. Please don’t reply by top-posting. Inline reply to the comments others make directly and trim as needed. Simply restating your first email isn’t productive. You cannot enforce the number of workers used, only the the maximum. That is you knob. David J.
Slow GroupAggregate and Sort
Hello great day, we have a strange case with slow query and would like some help. I've already read the article https://wiki.postgresql.org/wiki/Slow_Query_Questions Explain: https://paste.depesz.com/s/PLP Explain2: https://explain-postgresql.com/archive/explain/8e4b573c5f7bcf3a0d30675a430051fd:0:2023-12-26 Query: https://paste.depesz.com/s/fd3 DDL: https://paste.depesz.com/s/vBW tunning: https://paste.depesz.com/s/dXa We have citus cluster with the following configuration: 1 master + 3 data nodes, each machine have: - 24 cores (Intel Xeon E5 2620) - 192 GB RAM - 1TB SSD each node has configured postgres settings using tuning.sql The main Table DDL is in (ddl.sql) also distributed are as follow: SELECT create_distributed_table('salert_post', 'id',shard_count := 72); SELECT create_distributed_table('salert_q56', 'post', colocate_with => 'salert_post'); when run the query (query.sql) as you can see in explain (plan4_v3.txt) citus take about 18s to run all fragments but each fragment take at most 2s, so my questions are- why citus take this time in run all fragments? - if I tuned each postgres node efficiently why take much time to make sort and aggregate with citus results? good night, I hope you can help me with some ideas also we remove partitions, and test only with citus, but query took more than a minute. as a note, we not have 72 shards on the same node we have 72 in total, 24 shards each node. I think the problem was in Sort and in GroupAggregate I no have idea how speed up this in master node, because the Custom Scan (Citus Adaptive) is not too slow, the most time is consumed in master on Sort and group I hope you can help me.
Re: Slow GroupAggregate and Sort
On Thu, Dec 28, 2023 at 12:03 PM Darwin Correa wrote: > > when run the query (query.sql) as you can see in explain (plan4_v3.txt) > citus take about 18s to run all fragments > Where is plan4_v3.txt? Is that hidden in some non-obvious way in one of your links? > but each fragment take at most 2s, so my questions are- why citus take > this time in run all fragments? > I only see that one arbitrary fragment takes 2.7s, with no indication whether that one is the slowest one or not. But I am not used to reading citus plans. > also we remove partitions, and test only with citus, but query took more > than a minute. > as a note, we not have 72 shards on the same node we have 72 in total, 24 > shards each node. > I thought the point of sharding was to bring more CPU and RAM to bear than can feasibly be obtained in one machine. Doesn't that make 24 shards per machine completely nuts? > > I think the problem was in Sort and in GroupAggregate I no have idea how > speed up this in master node, because the Custom Scan (Citus Adaptive) is > not too slow, the most time is consumed in master on Sort and group > You want to know why citus is so slow here, but also say it isn't slow and something else is slow instead? I'd break this down into more manageable chunks for investigation. Populate one scratch table (on one node, not a hypertable) with all 2.6 million rows. See how long it takes to populate it based on the citus query, and separately see how long it takes to run the aggregate query on the populated scratch table. What version of PostgreSQL (and citus) are you using? In my hands (without citus being involved), the sort includes "users" as the last column, to support the count(distinct users) operation. I don't know why yours doesn't do that. Cheers, Jeff
Re: Parallel hints in PostgreSQL with consistent perfromance
On Thu, Dec 28, 2023 at 7:47 AM mohini mane wrote: > Thank you for your response !! > I am experimenting with SQL query performance for SELECT queries on large > tables and I observed that changing/increasing the degree of parallel hint > doesn't give the expected performance improvement. > But you still have addressed the fact that PostgreSQL *does not have planner hints*. Are you using some nonstandard extension, or nonstandard fork? > I have executed the SELECT query with 2,4 & 6 parallel degree however > every time only 4 workers launched & there was a slight increase in > Execution time as well, > Adding an ignored comment to your SQL would not be expected to do anything. So it is not surprising that it does not do anything about the number of workers launched. It is just a comment. A note to the human who is reading the code. > why there is an increase in execution time with parallel degree 6 as > compared to 2 or 4? > Those small changes seem to be perfectly compatible with random noise. You would need to repeat them dozens of times in random order, and then do a statistical test to convince me otherwise. >