On Mon, Dec 8, 2014 at 10:40 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Sat, Dec 6, 2014 at 5:37 PM, Stephen Frost <sfr...@snowman.net> wrote:
>
> So to summarize my understanding, below are the set of things which I
> should work on, in the order they are listed.
>
> 1. Push down qualification
> 2. Performance Data
> 3. Improve the way to push down the information related to worker.
> 4. Dynamic allocation of work for workers.
>
I have worked on the patch to accomplish the above-mentioned points 1, 2
and partly 3, and would like to share the progress with the community.

If the statement contains quals that don't have volatile functions, they
will be pushed down and the parallel scan will be considered for cost
evaluation. I think eventually we might need some better way to decide
which kinds of functions are okay to be pushed down.

I have also unified the way information is passed from the master backend
to the worker backends: each node that has to be passed is converted to a
string, and the workers later convert the string back to a node. This has
simplified the related code.

I have taken performance data for different selectivities and complexities
of qual expressions. I understand that there will be other kinds of
scenarios which we need to consider; however, I think the current set of
tests is a good place to start. Please feel free to comment on the kinds
of scenarios you want me to check.

Performance Data
------------------------------
m/c details
IBM POWER-8
24 cores, 192 hardware threads
RAM = 492GB

non-default settings in postgresql.conf
max_connections = 300
shared_buffers = 8GB
checkpoint_segments = 300
checkpoint_timeout = 30min
max_worker_processes = 100

create table tbl_perf(c1 int, c2 char(1000));

30 million rows
------------------------
insert into tbl_perf values(generate_series(1,10000000),'aaaaa');
insert into tbl_perf values(generate_series(10000000,30000000),'aaaaa');

Function used in quals
-----------------------------------
A simple function which performs some calculation and returns the value
passed, so that it can be used in a qual condition.

create or replace function calc_factorial(a integer, fact_val integer)
returns integer
as $$
begin
    perform (fact_val)!;
    return a;
end;
$$ language plpgsql STABLE;

In the data below, num_workers is the number of parallel workers
configured using parallel_seqscan_degree: 0 means a sequential scan will
be executed, and greater than 0 means a parallel sequential scan.
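To illustrate the pushdown rule in isolation, here is a toy Python sketch
of the idea (my own illustration, not the patch's code; the server-side
check actually consults each function's volatility marking, and the qual
representation below is a hypothetical simplification):

```python
# Toy model: a qual can be pushed to a worker only if it calls no
# volatile function; otherwise it stays with the master backend.
VOLATILE_FUNCS = {"random", "nextval", "timeofday"}  # example volatile functions

def is_pushable(qual):
    """True if the qual references no volatile function."""
    return not any(f in VOLATILE_FUNCS for f in qual["funcs"])

def split_quals(quals):
    """Partition quals into (pushed-to-workers, kept-on-master)."""
    pushed = [q for q in quals if is_pushable(q)]
    kept = [q for q in quals if not is_pushable(q)]
    return pushed, kept

quals = [
    {"expr": "c1 > calc_factorial(29700000,10)", "funcs": ["calc_factorial"]},
    {"expr": "c2 like '%aa%'", "funcs": []},
    {"expr": "c1 > random()", "funcs": ["random"]},
]
pushed, kept = split_quals(quals)
```

Here the STABLE calc_factorial qual and the LIKE qual are pushed down,
while the qual calling random() is evaluated only by the master.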
exec_time - Execution time reported by the Explain Analyze statement.

Tests having quals containing function evaluation in qual expressions
---------------------------------------------------------------------

Test-1
Query - Explain analyze select c1 from tbl_perf
        where c1 > calc_factorial(29700000,10) and c2 like '%aa%';
Selection_criteria - 1% of rows will be selected

num_workers    exec_time (ms)
     0             229534
     2             121741
     4              67051
     8              35607
    16              24743

Test-2
Query - Explain analyze select c1 from tbl_perf
        where c1 > calc_factorial(27000000,10) and c2 like '%aa%';
Selection_criteria - 10% of rows will be selected

num_workers    exec_time (ms)
     0             226671
     2             151587
     4              93648
     8              70540
    16              55466

Test-3
Query - Explain analyze select c1 from tbl_perf
        where c1 > calc_factorial(22500000,10) and c2 like '%aa%';
Selection_criteria - 25% of rows will be selected

num_workers    exec_time (ms)
     0             232673
     2             197609
     4             142686
     8             111664
    16              98097

Tests having quals containing simple expressions in qual
--------------------------------------------------------

Test-4
Query - Explain analyze select c1 from tbl_perf
        where c1 > 29700000 and c2 like '%aa%';
Selection_criteria - 1% of rows will be selected

num_workers    exec_time (ms)
     0              15505
     2               9155
     4               6030
     8               4523
    16               4459
    32               8259
    64              13388

Test-5
Query - Explain analyze select c1 from tbl_perf
        where c1 > 28500000 and c2 like '%aa%';
Selection_criteria - 5% of rows will be selected

num_workers    exec_time (ms)
     0              18906
     2              13446
     4               8970
     8               7887
    16              10403

Test-6
Query - Explain analyze select c1 from tbl_perf
        where c1 > 27000000 and c2 like '%aa%';
Selection_criteria - 10% of rows will be selected

num_workers    exec_time (ms)
     0              16132
     2              23780
     4              20275
     8              11390
    16              11418

Conclusion
------------------
1. Parallel workers help a lot when there is an expensive qualification
to evaluate; the more expensive the qualification, the better the results.
2. It works well for low-selectivity quals; as the selectivity increases,
the benefit tends to go down due to the additional tuple communication
cost between workers and the master backend.
3.
After a certain point, having more workers won't help and will rather
have a negative impact; refer to Test-4.

I think, as discussed previously, we need to introduce two additional
cost variables (parallel_startup_cost, cpu_tuple_communication_cost) to
estimate the parallel seq scan cost, so that when the table is small or
the selectivity is high, the cost of the parallel plan increases
accordingly.

Thoughts and feedback on the current state of the patch are welcome.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
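P.S. To make the intent of the two proposed cost variables concrete, a
back-of-the-envelope sketch (the formula and the default values are my
own illustration, not the patch's actual costing):

```python
# Sketch of how parallel_startup_cost and cpu_tuple_communication_cost
# could feed into a parallel seq scan cost estimate.  The formula and the
# constants are illustrative assumptions, not the patch's implementation.
def parallel_scan_cost(seq_run_cost, ntuples, selectivity, nworkers,
                       parallel_startup_cost=1000.0,
                       cpu_tuple_communication_cost=0.1):
    # Scan work is divided among the workers plus the master backend...
    run_cost = seq_run_cost / (nworkers + 1)
    # ...but every qualifying tuple must be shipped back to the master.
    comm_cost = ntuples * selectivity * cpu_tuple_communication_cost
    return parallel_startup_cost + run_cost + comm_cost

# With high selectivity the communication term dominates, so the parallel
# plan's estimated cost rises above a hypothetical seq scan cost of 100000.
low = parallel_scan_cost(100000.0, 30_000_000, 0.01, 8)
high = parallel_scan_cost(100000.0, 30_000_000, 0.25, 8)
```

This captures the behavior described above: at 1% selectivity the parallel
plan beats the sequential cost, while at 25% the tuple communication term
pushes it well past it.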