Re: [HACKERS] select_parallel test fails with nonstandard block size

2016-09-15 Thread Tom Lane
I wrote:
> OK, I'll take care of it (since I now realize that the inconsistency
> is my own fault --- I committed that GUC not you).  It's unclear what
> this will do for Peter's complaint though.

On closer inspection, the answer is "nothing", because the select_parallel
test overrides the default value of min_parallel_relation_size anyway.
(Without that, I don't think tenk1 is large enough to trigger
consideration of parallel scan at all.)

I find that at BLCKSZ 8K, the planner thinks the best plan is

 HashAggregate  (cost=5320.28..7920.28 rows=1 width=12)
   Group Key: parallel_restricted(unique1)
   ->  Index Only Scan using tenk1_unique1 on tenk1  (cost=0.29..2770.28 rows=1 width=8)

which is what the regression test script expects.  Forcing the parallel
plan to be chosen, we get this using the cost parameters set up by
select_parallel:

 HashAggregate  (cost=5433.00..8033.00 rows=1 width=12)
   Group Key: parallel_restricted(unique1)
   ->  Gather  (cost=0.00..2883.00 rows=1 width=8)
         Workers Planned: 4
         ->  Parallel Seq Scan on tenk1  (cost=0.00..383.00 rows=2500 width=4)

However, at BLCKSZ 16K, we get these numbers instead:

 HashAggregate  (cost=5264.28..7864.28 rows=1 width=12)
   Group Key: parallel_restricted(unique1)
   ->  Index Only Scan using tenk1_unique1 on tenk1  (cost=0.29..2714.28 rows=1 width=8)

 HashAggregate  (cost=5251.00..7851.00 rows=1 width=12)
   Group Key: parallel_restricted(unique1)
   ->  Gather  (cost=0.00..2701.00 rows=1 width=8)
         Workers Planned: 4
         ->  Parallel Seq Scan on tenk1  (cost=0.00..201.00 rows=2500 width=4)

so the planner goes for the second one.

I don't think there's anything particularly broken here.  The seqscan
cost estimate is largely dependent on the number of blocks, and there are
half as many blocks at 16K.  The indexscan estimate is also reduced,
but not as much, so it stops looking like the cheaper alternative.
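
To make the arithmetic concrete, here is a rough sketch of why the Parallel Seq Scan estimate drops from 383.00 to 201.00. It is a toy model, not the planner's actual code: it assumes the default seq_page_cost (1.0) and cpu_tuple_cost (0.01), assumes the CPU cost is simply divided by the four planned workers, and uses page counts (358 and 176) that are back-derived from the quoted costs rather than measured. The plan totals are copied from the EXPLAIN output quoted above.

```python
# Toy model of the Parallel Seq Scan estimate; NOT the planner's real code.
SEQ_PAGE_COST = 1.0    # PostgreSQL default
CPU_TUPLE_COST = 0.01  # PostgreSQL default


def parallel_seqscan_cost(pages, tuples, workers):
    """Disk cost scales with pages; CPU cost is split across workers."""
    disk_cost = SEQ_PAGE_COST * pages
    cpu_cost = CPU_TUPLE_COST * tuples / workers
    return disk_cost + cpu_cost


# Assumption: page counts back-derived from the quoted scan costs.
PAGES = {"8K": 358, "16K": 176}

# Total plan costs copied from the EXPLAIN output quoted in this message.
QUOTED_TOTALS = {
    "8K": {"index_only": 7920.28, "parallel": 8033.00},
    "16K": {"index_only": 7864.28, "parallel": 7851.00},
}


def cheapest_plan(blcksz):
    """Return the name of the plan with the lowest quoted total cost."""
    totals = QUOTED_TOTALS[blcksz]
    return min(totals, key=totals.get)


if __name__ == "__main__":
    for blcksz, pages in PAGES.items():
        # 10000 tuples assumed: 4 workers x the rows=2500 shown per worker.
        scan = parallel_seqscan_cost(pages, tuples=10000, workers=4)
        print(f"{blcksz}: scan cost {scan:.2f}, cheapest: {cheapest_plan(blcksz)}")
```

Under these assumptions the page term accounts for the whole 182-point drop in the parallel plan, while the index-only plan falls by only 56 points, which is enough to flip the comparison.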

We could maybe twiddle the cost parameters select_parallel uses so that
the same plan is chosen at both block sizes, but it seems like it would
be very fragile, and I'm not sure there's much point.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] select_parallel test fails with nonstandard block size

2016-09-15 Thread Tom Lane
Robert Haas  writes:
> On Thu, Sep 15, 2016 at 10:44 AM, Tom Lane  wrote:
>> Well, sure, but at any reasonable value of min_parallel_relation_size
>> that won't be a factor.  The question here is whether we want the default
>> value to be platform-independent.  I notice that both config.sgml and
>> postgresql.conf.sample claim that the default value is 8MB, which this
>> discussion reveals to be a lie.  If you want to keep the default expressed
>> as "1024" and not "(8 * 1024 * 1024) / BLCKSZ", we need to change the
>> documentation.

> I don't particularly care about that.  Changing it to 8MB always would
> be fine with me.

OK, I'll take care of it (since I now realize that the inconsistency
is my own fault --- I committed that GUC not you).  It's unclear what
this will do for Peter's complaint though.

regards, tom lane




Re: [HACKERS] select_parallel test fails with nonstandard block size

2016-09-15 Thread Robert Haas
On Thu, Sep 15, 2016 at 10:44 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Sep 15, 2016 at 9:59 AM, Tom Lane  wrote:
>>> Possibly we ought to change things so that the default value of
>>> min_parallel_relation_size is a fixed number of bytes rather
>>> than a fixed number of blocks.  Not sure though.
>
>> The reason why this was originally reckoned in blocks is because the
>> data is divided between the workers on the basis of a block number.
>> In the degenerate case where blocks < workers, the extra workers will
>> get no blocks at all, and thus no rows at all.
>
> Well, sure, but at any reasonable value of min_parallel_relation_size
> that won't be a factor.  The question here is whether we want the default
> value to be platform-independent.  I notice that both config.sgml and
> postgresql.conf.sample claim that the default value is 8MB, which this
> discussion reveals to be a lie.  If you want to keep the default expressed
> as "1024" and not "(8 * 1024 * 1024) / BLCKSZ", we need to change the
> documentation.

I don't particularly care about that.  Changing it to 8MB always would
be fine with me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] select_parallel test fails with nonstandard block size

2016-09-15 Thread Tom Lane
Robert Haas  writes:
> On Thu, Sep 15, 2016 at 9:59 AM, Tom Lane  wrote:
>> Possibly we ought to change things so that the default value of
>> min_parallel_relation_size is a fixed number of bytes rather
>> than a fixed number of blocks.  Not sure though.

> The reason why this was originally reckoned in blocks is because the
> data is divided between the workers on the basis of a block number.
> In the degenerate case where blocks < workers, the extra workers will
> get no blocks at all, and thus no rows at all.

Well, sure, but at any reasonable value of min_parallel_relation_size
that won't be a factor.  The question here is whether we want the default
value to be platform-independent.  I notice that both config.sgml and
postgresql.conf.sample claim that the default value is 8MB, which this
discussion reveals to be a lie.  If you want to keep the default expressed
as "1024" and not "(8 * 1024 * 1024) / BLCKSZ", we need to change the
documentation.
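
A small sketch of the discrepancy, for illustration only: the function names are hypothetical and nothing here is taken from the PostgreSQL source. It compares a default fixed at 1024 blocks with one fixed at 8MB, for a few block sizes.

```python
MB = 1024 * 1024


def threshold_bytes_fixed_blocks(blcksz, blocks=1024):
    """Byte size implied by a default that is fixed at `blocks` blocks."""
    return blocks * blcksz


def threshold_blocks_fixed_bytes(blcksz, nbytes=8 * MB):
    """Block count implied by a default that is fixed at `nbytes` bytes."""
    return nbytes // blcksz


if __name__ == "__main__":
    for blcksz in (8192, 16384, 32768):
        as_bytes = threshold_bytes_fixed_blocks(blcksz) // MB
        as_blocks = threshold_blocks_fixed_bytes(blcksz)
        print(f"BLCKSZ={blcksz}: 1024 blocks = {as_bytes}MB; 8MB = {as_blocks} blocks")
```

Only at BLCKSZ 8K do the two definitions agree; at 16K a "1024" default means 16MB, so the documented 8MB is wrong there.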

regards, tom lane




Re: [HACKERS] select_parallel test fails with nonstandard block size

2016-09-15 Thread Alvaro Herrera
Robert Haas wrote:
> On Thu, Sep 15, 2016 at 9:59 AM, Tom Lane  wrote:
> > Possibly we ought to change things so that the default value of
> > min_parallel_relation_size is a fixed number of bytes rather
> > than a fixed number of blocks.  Not sure though.
> 
> The reason why this was originally reckoned in blocks is because the
> data is divided between the workers on the basis of a block number.

Maybe the solution is to fill the table to a given number of blocks
rather than a number of rows.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] select_parallel test fails with nonstandard block size

2016-09-15 Thread Robert Haas
On Thu, Sep 15, 2016 at 9:59 AM, Tom Lane  wrote:
> Possibly we ought to change things so that the default value of
> min_parallel_relation_size is a fixed number of bytes rather
> than a fixed number of blocks.  Not sure though.

The reason why this was originally reckoned in blocks is because the
data is divided between the workers on the basis of a block number.
In the degenerate case where blocks < workers, the extra workers will
get no blocks at all, and thus no rows at all.  It seemed best to
insist that the relation had a reasonable number of blocks so that we
could hope for a reasonably even distribution of work among a pool of
workers.  I'm not altogether sure that's the right way of thinking
about this problem but I'm not sure it's wrong, either; anyway, it's
as far as my thought process had progressed at the time I wrote the
code.
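
The degenerate case can be illustrated with a toy model. This is an assumption-laden sketch, not the executor's actual allocation scheme: it simply hands out block numbers round-robin, and the function name is hypothetical.

```python
def distribute_blocks(nblocks, nworkers):
    """Hand out block numbers round-robin; return one list of blocks per worker."""
    assignments = [[] for _ in range(nworkers)]
    for block in range(nblocks):
        assignments[block % nworkers].append(block)
    return assignments


if __name__ == "__main__":
    # 3 blocks, 5 workers: the last two workers receive nothing.
    for worker, blocks in enumerate(distribute_blocks(3, 5)):
        print(f"worker {worker}: {blocks if blocks else 'no blocks'}")
```

With more blocks than workers the shares even out, which is the motivation for requiring a reasonable number of blocks before considering a parallel scan.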

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] select_parallel test fails with nonstandard block size

2016-09-15 Thread Tom Lane
Peter Eisentraut  writes:
> When building with --with-blocksize=16, the select_parallel test fails
> with this difference:

>  explain (costs off)
> select  sum(parallel_restricted(unique1)) from tenk1
> group by(parallel_restricted(unique1));
> - QUERY PLAN
> -
> +QUERY PLAN
> +---
>   HashAggregate
>     Group Key: parallel_restricted(unique1)
> -   ->  Index Only Scan using tenk1_unique1 on tenk1
> -(3 rows)
> +   ->  Gather
> +         Workers Planned: 4
> +         ->  Parallel Seq Scan on tenk1
> +(5 rows)

>  set force_parallel_mode=1;
>  explain (costs off)

> We know that different block sizes cause some test failures, mainly
> because of row ordering differences.  But this looked a bit different.

I suspect what is happening is that min_parallel_relation_size is
being interpreted differently (because the default is set at 1024
blocks, regardless of what BLCKSZ is) and that's affecting the
cost estimate for the parallel seqscan.  The direction of change
seems a bit surprising though; if the table is now half as big
blocks-wise, how did that make the parallel scan look cheaper?
Please step through create_plain_partial_paths and see what
is being done differently.

Possibly we ought to change things so that the default value of
min_parallel_relation_size is a fixed number of bytes rather
than a fixed number of blocks.  Not sure though.

regards, tom lane

