Re: [HACKERS] Cost model for parallel CREATE INDEX

Robert Haas Thu, 02 Mar 2017 05:51:10 -0800

On Wed, Mar 1, 2017 at 12:58 AM, Peter Geoghegan wrote:
> * This scales based on output size (projected index size), not input
> size (heap scan input). Apparently, that's what we always do right
> now.


Actually, I'm not aware of any precedent for that. I'd just pass the
heap size to compute_parallel_workers(), leaving the index size as 0,
and call it good.  What you're doing now seems exactly backwards from
parallel query generally.
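
To make the suggestion concrete, here is a rough, self-contained sketch of
how a worker count can be scaled from input size. It is only loosely
modeled on the planner's approach (one more worker each time the relation
triples in size past a minimum threshold) and is not the PostgreSQL
source. The function names, the fixed thresholds, and the worker cap are
all assumptions made for illustration; the real
compute_parallel_workers() also takes planner state and reads its limits
from settings.

```c
#include <limits.h>

/* Hypothetical stand-ins for planner settings; values are illustrative. */
#define SKETCH_MIN_TABLE_PAGES 1024   /* assumed: 8MB of 8kB pages */
#define SKETCH_MIN_INDEX_PAGES 64     /* assumed: 512kB of 8kB pages */
#define SKETCH_MAX_WORKERS     8      /* assumed per-operation cap */

/* One worker at the threshold, plus one more per tripling in size. */
static int
sketch_size_to_workers(unsigned int pages, int threshold)
{
    int workers = 1;

    if (threshold < 1)
        threshold = 1;
    while (pages >= (unsigned int) threshold * 3)
    {
        workers++;
        threshold *= 3;
        if (threshold > INT_MAX / 3)
            break;              /* avoid overflow on the next multiply */
    }
    return workers;
}

/*
 * A size of 0 means "ignore this input".  When both sizes are given,
 * the smaller of the two worker counts wins.
 */
int
sketch_parallel_workers(unsigned int heap_pages, unsigned int index_pages)
{
    int workers = 0;

    if (heap_pages > 0)
    {
        if (heap_pages < SKETCH_MIN_TABLE_PAGES)
            return 0;           /* too small to bother with workers */
        workers = sketch_size_to_workers(heap_pages, SKETCH_MIN_TABLE_PAGES);
    }
    if (index_pages > 0)
    {
        int index_workers =
            sketch_size_to_workers(index_pages, SKETCH_MIN_INDEX_PAGES);

        if (heap_pages == 0 || index_workers < workers)
            workers = index_workers;
    }
    if (workers > SKETCH_MAX_WORKERS)
        workers = SKETCH_MAX_WORKERS;
    return workers;
}
```

Under these assumed constants, passing the heap size with an index size of
0 lets the heap alone decide the result, whereas a small nonzero index
size would pull the worker count down.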

> So, the main factor that
> discourages parallel sequential scans doesn't really exist for
> parallel CREATE INDEX.

Agreed.

> We could always defer the cost model to another release, and only
> support the storage parameter for now, though that has disadvantages,
> some less obvious [4].

I think it's totally counter-intuitive that any hypothetical index
storage parameter would affect the degree of parallelism involved in
creating the index and also the degree of parallelism involved in
scanning it.  Whether or not other systems do such crazy things seems
to me to be beside the point.  I think if CREATE INDEX allows an explicit
specification of the degree of parallelism (a decision I would favor)
it should have a syntactically separate place for unsaved build
options vs. persistent storage parameters.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

