On Mon, Nov 23, 2015 at 10:39 PM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Nov 23, 2015 at 7:45 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > Without this patch, that 0.5 (or 50% of leaders effort) is considered
> > Gather node irrespective of the number of workers or other factors, but
> > I think with Patch that is no longer true and that's what I am worrying
> > about.
> Nope, that patch does not change that at all.  We probably should, but
> this patch does not.

I have taken some performance data with this patch.

- Select data from an inheritance hierarchy with very few tuples.

Create table parent_rel(c1 int, c2 text);
Create table child1_rel () Inherits (parent_rel);
Create table child2_rel () Inherits (parent_rel);

insert into parent_rel values(generate_series(1,15), 'aaaa');
insert into child1_rel values(generate_series(10,20),'aaa');
insert into child2_rel values(generate_series(20,30),'aaa');

Analyze parent_rel;
Analyze child1_rel;
Analyze child2_rel;

set max_parallel_degree=4;
set parallel_setup_cost=0;
set parallel_tuple_cost=0.01;

postgres=# explain select count(*) from parent_rel;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Aggregate  (cost=2.71..2.72 rows=1 width=0)
   ->  Gather  (cost=0.00..2.62 rows=37 width=0)
         Number of Workers: 1
         ->  Append  (cost=0.00..2.25 rows=37 width=0)
               ->  Parallel Seq Scan on parent_rel  (cost=0.00..0.77 rows=15 width=0)
               ->  Parallel Seq Scan on child1_rel  (cost=0.00..0.74 rows=11 width=0)
               ->  Parallel Seq Scan on child2_rel  (cost=0.00..0.74 rows=11 width=0)

I have changed parallel_setup_cost and parallel_tuple_cost, so a
Gather path gets selected even for a small relation.  However, the
same won't be true for a non-inheritance relation: if the number of
pages in the relation is below the threshold (1000), the planner
won't select a parallel path.  We might want a similar restriction
for an Append relation as well: if the child subpaths combined don't
have more than the threshold number of pages, don't try to build the
parallel path.
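To make the idea concrete, here is a rough sketch (shell, to match the
attached driver script; this is not PostgreSQL source code — the
threshold of 1000 pages and the "one more worker per tripling of pages"
rule are assumptions based on the planner heuristic described above,
and the per-child page counts are made up):

```shell
#!/bin/sh
# Sketch only, not PostgreSQL code.  Assumes: no parallel path under
# 1000 pages, and one additional worker per tripling of pages.
parallel_degree_for_pages() {
    pages=$1
    threshold=1000                      # below this, no parallel path at all
    degree=0
    while [ "$pages" -ge "$threshold" ]; do
        degree=$((degree + 1))
        threshold=$((threshold * 3))    # each extra worker wants 3x the pages
    done
    echo "$degree"
}

# For an Append relation, sum the child pages first, so a pile of tiny
# children doesn't earn a parallel path that no child would get alone.
total=0
for child_pages in 400 300 350; do      # hypothetical per-child page counts
    total=$((total + child_pages))
done
parallel_degree_for_pages "$total"      # 1050 pages -> prints 1
```

With this shape of check, three 350-page children together still only
justify one worker, and anything under 1000 combined pages gets no
parallel path at all.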

- Choose a data set that fits in shared_buffers, then run statements
with different selectivity and max_parallel_degree

Test setup
1. Use pgbench -i -s 100 <db_name> to create the initial data.
2. Use the attached pgbench_partitions.sql to create 10 partitions with equal
data distribution.
3. Use parallel_append.sh to execute statements with different selectivity
and max_parallel_degree (parallel_tuple_cost changed to 0.001).
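For readers without the attachment, the driver presumably loops over the
degrees tested below; a minimal sketch (the query and its selectivity
predicate are hypothetical, not the ones actually used) that emits SQL
on stdout so it could be piped into psql:

```shell
#!/bin/sh
# Sketch of the kind of loop parallel_append.sh presumably contains
# (the real script is attached; the query here is hypothetical).
# Run as:  ./driver.sh | psql <db_name>
gen_sql() {
    echo "SET parallel_tuple_cost = 0.001;"
    for degree in 0 2 4 8 16 32 64; do
        echo "SET max_parallel_degree = $degree;"
        printf '%s\n' '\timing on'
        echo "SELECT count(*) FROM pgbench_accounts WHERE sqrt(abalance + 1000) > 30;"
    done
}
gen_sql
```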

Selection criteria: 1% of rows are selected, with a costly function
evaluated for each row.


 max_parallel_degree | exec_time (ms) | workers_used
---------------------+----------------+--------------
                   0 |          76202 |            0
                   2 |          28556 |            2
                   4 |          21620 |            3
                   8 |          21693 |            3
                  16 |          21654 |            3
                  32 |          21579 |            3
                  64 |          21474 |            3


 max_parallel_degree | exec_time (ms) | workers_used
---------------------+----------------+--------------
                   0 |          77027 |            0
                   2 |          27088 |            2
                   4 |          16648 |            4
                   8 |          13730 |            5
                  16 |          13787 |            5
                  32 |          13794 |            5
                  64 |          13872 |            5

So here we can see that performance is better with the patch, but I
think that is mainly due to the number of workers working on the plan.
It is not clear whether allowing more workers at a higher
max_parallel_degree would give any substantial benefit, but I think
that is a generic worker-allocation improvement which is not directly
related to this patch.  The data at different selectivities can be
found in the attached document; more or less it shows a similar trend.
Apart from this, I have tried a data set which doesn't fit in
shared_buffers but fits in RAM; that also shows a similar trend.
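As a back-of-the-envelope check (the page counts are approximate, and
the "one worker at 1000 pages, one more per tripling" rule is an
assumption about the planner heuristic): at scale 100 pgbench_accounts
is roughly 160,000 pages, or about 16,000 pages per partition with 10
equal partitions, which would cap the degree at 3 for any one child but
5 for the pages taken together, consistent with the two tables above:

```shell
#!/bin/sh
# Approximate check only; the heuristic and page counts are assumptions.
degree() {
    pages=$1
    threshold=1000
    d=0
    while [ "$pages" -ge "$threshold" ]; do
        d=$((d + 1))
        threshold=$((threshold * 3))
    done
    echo "$d"
}
echo "one partition  (~16000 pages):  $(degree 16000) workers"    # 3
echo "all partitions (~160000 pages): $(degree 160000) workers"   # 5
```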

The patch looks good, apart from the worker-allocation stuff, but I
think we can deal with that separately.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: pgbench_partitions.sql
Description: Binary data

Attachment: parallel_append.sh
Description: Bourne shell script

Attachment: parallel_append_data.ods
Description: application/vnd.oasis.opendocument.spreadsheet

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)