Re: [HACKERS] Parallel Index Scans

Rahila Syed Tue, 18 Oct 2016 03:39:08 -0700

>Another point which needs some thoughts is whether it is good idea to
>use index relation size to calculate parallel workers for index scan.
>I think ideally for index scans it should be based on number of pages
>to be fetched/scanned from index.
IIUC, it's not possible to know in advance the exact number of pages that
will be scanned from an index.
What we are essentially making parallel is the scan of the leaf pages,
so it would make sense to base the number of workers on the number of
leaf pages.
Having said that, I think it will not make much difference compared to
the existing method, which currently uses the total number of index pages
to calculate the number of workers. As far as I understand, in large
indexes the difference between the number of leaf pages and the total
number of pages is not significant. In other words, internal pages form
a small fraction of the total pages.
Also, the calculation is based on the log of the number of pages, so it
will make even less difference.
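
To illustrate the point about the log-based calculation, here is a rough
sketch. The function name, the 1024-page minimum, the tripling rule and the
cap of 8 workers are assumptions made for this example and are not taken
from the patch; the 1% share of internal pages is likewise only illustrative.

```python
def workers_for_pages(pages, min_pages=1024, max_workers=8):
    """Hypothetical heuristic: 0 workers below min_pages, 1 worker at
    min_pages, and one more worker each time the page count triples."""
    if pages < min_pages:
        return 0
    workers = 1
    threshold = min_pages * 3
    # Each tripling of the size adds one worker, so the count grows
    # roughly with log base 3 of the number of pages.
    while pages >= threshold and workers < max_workers:
        workers += 1
        threshold *= 3
    return workers


total_pages = 1_000_000
leaf_pages = 990_000  # assumes internal pages are about 1% of the index

print(workers_for_pages(total_pages), workers_for_pages(leaf_pages))
```

Under these assumptions both calls return 7. The two counts can differ by at
most one, and only when the total page count sits just above one of the
tripling thresholds.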


Thank you,
Rahila Syed

On Tue, Oct 18, 2016 at 8:38 AM, Amit Kapila <[email protected]>
wrote:

> On Thu, Oct 13, 2016 at 8:48 AM, Amit Kapila <[email protected]>
> wrote:
> > As of now, the driving table for parallel query is accessed by
> > parallel sequential scan which limits its usage to a certain degree.
> > Parallelising index scans would further increase the usage of parallel
> > query in many more cases.  This patch enables the parallelism for the
> > btree scans.  Supporting parallel index scan for other index types
> > like hash, gist, spgist can be done as separate patches.
> >
>
> I would like to have an input on the method of selecting parallel
> workers for scanning index.  Currently the patch selects number of
> workers based on size of index relation and the upper limit of
> parallel workers is max_parallel_workers_per_gather.  This is quite
> similar to what we do for parallel sequential scan except for the fact
> that in parallel seq. scan, we use the parallel_workers option if
> provided by user during Create Table.  User can provide
> parallel_workers option as below:
>
> Create Table .... With (parallel_workers = 4);
>
> Is it desirable to have similar option for parallel index scans, if
> yes then what should be the interface for same?  One possible way
> could be to allow user to provide it during Create Index as below:
>
> Create Index .... With (parallel_workers = 4);
>
> If above syntax looks sensible, then we might need to think what
> should be used for parallel index build.  It seems to me that parallel
> tuple sort patch [1] proposed by Peter G. is using above syntax for
> getting the parallel workers input from user for parallel index
> builds.
>
> Another point which needs some thoughts is whether it is good idea to
> use index relation size to calculate parallel workers for index scan.
> I think ideally for index scans it should be based on number of pages
> to be fetched/scanned from index.
>
>
> [1] - https://www.postgresql.org/message-id/CAM3SWZTmkOFEiCDpUNaO4n9-1xcmWP-1NXmT7h0Pu3gM2YuHvg%40mail.gmail.com
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
