Re: [HACKERS] Rename max_parallel_degree?

2016-12-04 Thread Julien Rouhaud
On Fri, Dec 02, 2016 at 07:45:56AM -0500, Robert Haas wrote:
> On Thu, Dec 1, 2016 at 10:07 PM, Haribabu Kommi
>  wrote:
> > From the recent mails, it is not clear to me what is the status of this
> > patch.  Moved to next CF with "needs review" state.  Please feel free to
> > update the status.
> 
> I have committed this patch.  And updated the status, too!

Thanks!

--
Julien Rouhaud
http://dalibo.com - http://dalibo.org


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rename max_parallel_degree?

2016-12-02 Thread Robert Haas
On Thu, Dec 1, 2016 at 10:07 PM, Haribabu Kommi
 wrote:
> From the recent mails, it is not clear to me what is the status of this
> patch.  Moved to next CF with "needs review" state.  Please feel free to
> update the status.

I have committed this patch.  And updated the status, too!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-12-01 Thread Haribabu Kommi
On Thu, Oct 27, 2016 at 3:38 AM, Robert Haas  wrote:

> On Mon, Oct 24, 2016 at 4:04 PM, Peter Eisentraut
>  wrote:
> > On 10/12/16 7:58 PM, Robert Haas wrote:
> >> I don't think it's wrong that the handling is done there, though.  The
> >> process that is registering the background worker is well-placed to
> >> check whether there are already too many, and if it does not then the
> >> slot is consumed at least temporarily even if it busts the cap.  On
> >> the flip side, the postmaster is the only process that is well-placed
> >> to know when a background worker terminates.  The worker process
> >> itself can't be made responsible for it, as you suggest below, because
> >> it may never even start up in the first place (e.g. fork() returns
> >> EAGAIN).  And the registering process can't be made responsible,
> >> because it might die before the worker.
> >
> > Those are valid technical points.  I have not worked out any
> alternatives.
> >
> > I'm concerned that all this makes background workers managed by
> > extensions second-class citizens.
>
> I suppose at some level that's true, but I have a hard time getting
> worked up about it.  There are always some areas where core code is
> going to be privileged over non-core code, but I'd say that background
> workers stand out as a shining example of enabling interesting
> non-core code.  In fact, the main reason why this patch exists at all
> is so that the sole in-core user of the background worker facility -
> parallel query - doesn't crowd out interesting non-core uses of that
> facility.  This isn't a parallel query feature; it's an
> everything-that-uses-background-workers-that-is-not-parallel-query
> feature.
>
> Also, for many types of background workers, this kind of limiting
> behavior isn't really useful or doesn't really make sense.  Obviously,
> any application that uses exactly one background worker (or any fixed
> number of background workers) doesn't need any feature similar to this
> one.  Also, any application that uses a "launcher" process and some
> number of per-database workers can limit the number of workers it
> consumes by teaching the launcher process to enforce a limit.  These
> two patterns are quite common, I think, and probably cover a lot of
> what background workers might like to do.  Parallel query is a little
> bit unusual in that any backend in the system might launch workers
> without having any idea what workers other backends are already doing.
> It seems unlikely to me that many extensions would share this pattern,
> but perhaps I lack sufficient imagination.  pg_background, or any
> facility derived from it, might qualify, but I can't think what else
> would.
>
> In any event, this is not written in stone.  I don't think it would be
> a huge deal to change this later if we come up with something better.
>


From the recent mails, it is not clear to me what is the status of this
patch.  Moved to next CF with "needs review" state.  Please feel free to
update the status.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Rename max_parallel_degree?

2016-10-26 Thread Robert Haas
On Mon, Oct 24, 2016 at 4:04 PM, Peter Eisentraut
 wrote:
> On 10/12/16 7:58 PM, Robert Haas wrote:
>> I don't think it's wrong that the handling is done there, though.  The
>> process that is registering the background worker is well-placed to
>> check whether there are already too many, and if it does not then the
>> slot is consumed at least temporarily even if it busts the cap.  On
>> the flip side, the postmaster is the only process that is well-placed
>> to know when a background worker terminates.  The worker process
>> itself can't be made responsible for it, as you suggest below, because
>> it may never even start up in the first place (e.g. fork() returns
>> EAGAIN).  And the registering process can't be made responsible,
>> because it might die before the worker.
>
> Those are valid technical points.  I have not worked out any alternatives.
>
> I'm concerned that all this makes background workers managed by
> extensions second-class citizens.

I suppose at some level that's true, but I have a hard time getting
worked up about it.  There are always some areas where core code is
going to be privileged over non-core code, but I'd say that background
workers stand out as a shining example of enabling interesting
non-core code.  In fact, the main reason why this patch exists at all
is so that the sole in-core user of the background worker facility -
parallel query - doesn't crowd out interesting non-core uses of that
facility.  This isn't a parallel query feature; it's an
everything-that-uses-background-workers-that-is-not-parallel-query
feature.

Also, for many types of background workers, this kind of limiting
behavior isn't really useful or doesn't really make sense.  Obviously,
any application that uses exactly one background worker (or any fixed
number of background workers) doesn't need any feature similar to this
one.  Also, any application that uses a "launcher" process and some
number of per-database workers can limit the number of workers it
consumes by teaching the launcher process to enforce a limit.  These
two patterns are quite common, I think, and probably cover a lot of
what background workers might like to do.  Parallel query is a little
bit unusual in that any backend in the system might launch workers
without having any idea what workers other backends are already doing.
It seems unlikely to me that many extensions would share this pattern,
but perhaps I lack sufficient imagination.  pg_background, or any
facility derived from it, might qualify, but I can't think what else
would.

In any event, this is not written in stone.  I don't think it would be
a huge deal to change this later if we come up with something better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-10-26 Thread Amit Kapila
On Wed, Oct 26, 2016 at 5:54 PM, Robert Haas  wrote:
> On Mon, Oct 24, 2016 at 3:48 AM, Amit Kapila  wrote:
>> On Thu, Oct 13, 2016 at 5:28 AM, Robert Haas  wrote:
>>> On Wed, Oct 12, 2016 at 8:13 AM, Peter Eisentraut
>>>  wrote:
 On 10/4/16 10:16 AM, Julien Rouhaud wrote:
> Please find attached v9 of the patch, adding the parallel worker class
> and changing max_worker_processes default to 16 and max_parallel_workers
> to 8.  I also added Amit's explanation for the need of a write barrier
> in ForgetBackgroundWorker().

 This approach totally messes up the decoupling between the background
 worker facilities and consumers of those facilities.  Having dozens of
 lines of code in bgworker.c that do the accounting and resource
 checking on behalf of parallel.c looks very suspicious.  Once we add
 logical replication workers or whatever, we'll be tempted to add even
 more stuff in there and it will become a mess.
>>>
>>> I attach a new version of the patch that I've been hacking on in my
>>> "spare time", which reduces the footprint in bgworker.c quite a bit.
>>>
>>
>> Couple of comments -
>>
>> @@ -370,6 +388,9 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)
>>
>>   Assert(rw->rw_shmem_slot < max_worker_processes);
>>   slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
>> + if ((rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
>> +  BackgroundWorkerData->parallel_terminate_count++;
>> +
>>   slot->in_use = false;
>>
>> It seems upthread [1], we agreed to have a write barrier before the
>> setting slot->in_use, but I don't see the same in patch.
>
> That's because I removed it.  The reason given for the barrier was
> that otherwise it might be reordered before the check of
> is_parallel_worker, but that's now done by checking the postmaster's
> backend-private copy of the flags, not the copy in shared memory.  So
> the reordering can't affect the result.
>

You are right.  I failed to notice that.

The patch looks good to me.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-10-26 Thread Robert Haas
On Mon, Oct 24, 2016 at 3:48 AM, Amit Kapila  wrote:
> On Thu, Oct 13, 2016 at 5:28 AM, Robert Haas  wrote:
>> On Wed, Oct 12, 2016 at 8:13 AM, Peter Eisentraut
>>  wrote:
>>> On 10/4/16 10:16 AM, Julien Rouhaud wrote:
 Please find attached v9 of the patch, adding the parallel worker class
 and changing max_worker_processes default to 16 and max_parallel_workers
 to 8.  I also added Amit's explanation for the need of a write barrier
 in ForgetBackgroundWorker().
>>>
>>> This approach totally messes up the decoupling between the background
>>> worker facilities and consumers of those facilities.  Having dozens of
>> lines of code in bgworker.c that do the accounting and resource
>>> checking on behalf of parallel.c looks very suspicious.  Once we add
>>> logical replication workers or whatever, we'll be tempted to add even
>>> more stuff in there and it will become a mess.
>>
>> I attach a new version of the patch that I've been hacking on in my
>> "spare time", which reduces the footprint in bgworker.c quite a bit.
>>
>
> Couple of comments -
>
> @@ -370,6 +388,9 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)
>
>   Assert(rw->rw_shmem_slot < max_worker_processes);
>   slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
> + if ((rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
> +  BackgroundWorkerData->parallel_terminate_count++;
> +
>   slot->in_use = false;
>
> It seems upthread [1], we agreed to have a write barrier before the
> setting slot->in_use, but I don't see the same in patch.

That's because I removed it.  The reason given for the barrier was
that otherwise it might be reordered before the check of
is_parallel_worker, but that's now done by checking the postmaster's
backend-private copy of the flags, not the copy in shared memory.  So
the reordering can't affect the result.

> Isn't it better to keep the planner-related checks from Julien's patch,
> especially below one:
>
> --- a/src/backend/optimizer/plan/planner.c
> +++ b/src/backend/optimizer/plan/planner.c
> @@ -249,6 +249,7 @@ standard_planner(Query *parse, int cursorOptions,
> ParamListInfo boundParams)
>   parse->utilityStmt == NULL &&
>   !parse->hasModifyingCTE &&
>   max_parallel_workers_per_gather > 0 &&
> + max_parallel_workers > 0 &&
>   !IsParallelWorker() &&
>   !IsolationIsSerializable())

I don't really think we need to do that.  If you set
max_parallel_workers_per_gather > max_parallel_workers, you might get
some plans that will never succeed in obtaining the number of workers
for which they planned.  If that bothers you, don't do it.  It's
actually quite useful to do that for testing purposes, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-10-24 Thread Peter Eisentraut
On 10/12/16 7:58 PM, Robert Haas wrote:
> I don't think it's wrong that the handling is done there, though.  The
> process that is registering the background worker is well-placed to
> check whether there are already too many, and if it does not then the
> slot is consumed at least temporarily even if it busts the cap.  On
> the flip side, the postmaster is the only process that is well-placed
> to know when a background worker terminates.  The worker process
> itself can't be made responsible for it, as you suggest below, because
> it may never even start up in the first place (e.g. fork() returns
> EAGAIN).  And the registering process can't be made responsible,
> because it might die before the worker.

Those are valid technical points.  I have not worked out any alternatives.

I'm concerned that all this makes background workers managed by
extensions second-class citizens.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-10-24 Thread Amit Kapila
On Thu, Oct 13, 2016 at 5:28 AM, Robert Haas  wrote:
> On Wed, Oct 12, 2016 at 8:13 AM, Peter Eisentraut
>  wrote:
>> On 10/4/16 10:16 AM, Julien Rouhaud wrote:
>>> Please find attached v9 of the patch, adding the parallel worker class
>>> and changing max_worker_processes default to 16 and max_parallel_workers
>>> to 8.  I also added Amit's explanation for the need of a write barrier
>>> in ForgetBackgroundWorker().
>>
>> This approach totally messes up the decoupling between the background
>> worker facilities and consumers of those facilities.  Having dozens of
>> lines of code in bgworker.c that do the accounting and resource
>> checking on behalf of parallel.c looks very suspicious.  Once we add
>> logical replication workers or whatever, we'll be tempted to add even
>> more stuff in there and it will become a mess.
>
> I attach a new version of the patch that I've been hacking on in my
> "spare time", which reduces the footprint in bgworker.c quite a bit.
>

Couple of comments -

@@ -370,6 +388,9 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)

  Assert(rw->rw_shmem_slot < max_worker_processes);
  slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
+ if ((rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
+  BackgroundWorkerData->parallel_terminate_count++;
+
  slot->in_use = false;

It seems upthread [1], we agreed to have a write barrier before the
setting slot->in_use, but I don't see the same in patch.


Isn't it better to keep the planner-related checks from Julien's patch,
especially below one:

--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -249,6 +249,7 @@ standard_planner(Query *parse, int cursorOptions,
ParamListInfo boundParams)
  parse->utilityStmt == NULL &&
  !parse->hasModifyingCTE &&
  max_parallel_workers_per_gather > 0 &&
+ max_parallel_workers > 0 &&
  !IsParallelWorker() &&
  !IsolationIsSerializable())

> I don't think it's wrong that the handling is done there, though.  The
> process that is registering the background worker is well-placed to
> check whether there are already too many, and if it does not then the
> slot is consumed at least temporarily even if it busts the cap.  On
> the flip side, the postmaster is the only process that is well-placed
> to know when a background worker terminates.  The worker process
> itself can't be made responsible for it, as you suggest below, because
> it may never even start up in the first place (e.g. fork() returns
> EAGAIN).  And the registering process can't be made responsible,
> because it might die before the worker.
>
>> I think we should think of a way where we can pass the required
>> information during background worker setup, like a pointer to the
>> resource-limiting GUC variable.
>
> I don't think this can work, per the above.
>

I see the value in Peter's point, but could not think of a way to
implement it.


[1] - 
https://www.postgresql.org/message-id/CA%2BTgmoZS7DFFGz7kKWLoTAsNnXRKUiQFDt5yHgf4L73z52dgAQ%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-10-12 Thread Robert Haas
On Wed, Oct 12, 2016 at 8:13 AM, Peter Eisentraut
 wrote:
> On 10/4/16 10:16 AM, Julien Rouhaud wrote:
>> Please find attached v9 of the patch, adding the parallel worker class
>> and changing max_worker_processes default to 16 and max_parallel_workers
>> to 8.  I also added Amit's explanation for the need of a write barrier
>> in ForgetBackgroundWorker().
>
> This approach totally messes up the decoupling between the background
> worker facilities and consumers of those facilities.  Having dozens of
> lines of code in bgworker.c that do the accounting and resource
> checking on behalf of parallel.c looks very suspicious.  Once we add
> logical replication workers or whatever, we'll be tempted to add even
> more stuff in there and it will become a mess.

I attach a new version of the patch that I've been hacking on in my
"spare time", which reduces the footprint in bgworker.c quite a bit.
I don't think it's wrong that the handling is done there, though.  The
process that is registering the background worker is well-placed to
check whether there are already too many, and if it does not then the
slot is consumed at least temporarily even if it busts the cap.  On
the flip side, the postmaster is the only process that is well-placed
to know when a background worker terminates.  The worker process
itself can't be made responsible for it, as you suggest below, because
it may never even start up in the first place (e.g. fork() returns
EAGAIN).  And the registering process can't be made responsible,
because it might die before the worker.

> I think we should think of a way where we can pass the required
> information during background worker setup, like a pointer to the
> resource-limiting GUC variable.

I don't think this can work, per the above.

> For style, I would also prefer the "class" to be a separate struct field
> from the flags.

I think that it makes sense to keep the class as part of the flags
because then a large amount of the stuff in this patch that is
concerned with propagating the flags around becomes unnecessary.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e826c19..8e53edc 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1984,7 +1984,7 @@ include_dir 'conf.d'
 
  Sets the maximum number of background processes that the system
  can support.  This parameter can only be set at server start.  The
- default is 8.
+ default is 16.
 
 
 
@@ -2006,8 +2006,9 @@ include_dir 'conf.d'
  Sets the maximum number of workers that can be started by a single
  Gather node.  Parallel workers are taken from the
  pool of processes established by
- .  Note that the requested
- number of workers may not actually be available at run time.  If this
+ , limited by
+ .  Note that the requested
+ number of workers may not actually be available at runtime.  If this
  occurs, the plan will run with fewer workers than expected, which may
  be inefficient.  The default value is 2.  Setting this value to 0
  disables parallel query execution.
@@ -2036,6 +2037,22 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration parameter
+   
+   
+   
+
+ Sets the maximum number of workers that the system can support for
+ parallel queries.  The default value is 8.  When increasing or
+ decreasing this value, consider also adjusting
+ .
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 59dc394..1c32fcd 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -454,7 +454,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 			 MyProcPid);
 	worker.bgw_flags =
-		BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+		| BGWORKER_CLASS_PARALLEL;
 	worker.bgw_start_time = BgWorkerStart_ConsistentState;
 	worker.bgw_restart_time = BGW_NEVER_RESTART;
 	worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 3a2929a..75f9a55 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -80,9 +80,22 @@ typedef struct BackgroundWorkerSlot
 	BackgroundWorker worker;
 } BackgroundWorkerSlot;
 
+/*
+ * In order to limit the total number of parallel workers (according to
+ * max_parallel_workers GUC), we maintain the number of active parallel
+ * workers.  Since the postmaster cannot take 

Re: [HACKERS] Rename max_parallel_degree?

2016-10-12 Thread Peter Eisentraut
On 10/4/16 10:16 AM, Julien Rouhaud wrote:
> Please find attached v9 of the patch, adding the parallel worker class
> and changing max_worker_processes default to 16 and max_parallel_workers
> to 8.  I also added Amit's explanation for the need of a write barrier
> in ForgetBackgroundWorker().

This approach totally messes up the decoupling between the background
worker facilities and consumers of those facilities.  Having dozens of
lines of code in bgworker.c that do the accounting and resource
checking on behalf of parallel.c looks very suspicious.  Once we add
logical replication workers or whatever, we'll be tempted to add even
more stuff in there and it will become a mess.

I think we should think of a way where we can pass the required
information during background worker setup, like a pointer to the
resource-limiting GUC variable.

For style, I would also prefer the "class" to be a separate struct field
from the flags.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-10-04 Thread Julien Rouhaud
On 03/10/2016 21:27, Robert Haas wrote:
> On Fri, Sep 30, 2016 at 12:23 PM, Julien Rouhaud
>  wrote:
>> I've already fixed every other issue mentioned upthread, but I'm facing
>> a problem for this one.  Assuming that the bgworker classes are supposed
>> to be mutually exclusive, I don't see a simple and clean way to add such
>> a check in SanityCheckBackgroundWorker().  Am I missing something
>> obvious, or can someone give me some advice for this?
> 
> My advice is "don't worry about it".  If somebody thinks of something
> that can be usefully added there, it will take very little time to add
> it and test that it works.  Don't get hung up on that for now.
> 

Ok, thanks!

Please find attached v9 of the patch, adding the parallel worker class
and changing max_worker_processes default to 16 and max_parallel_workers
to 8.  I also added Amit's explanation for the need of a write barrier
in ForgetBackgroundWorker().

I'll add this patch to the next commitfest.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e826c19..61c5a7c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1984,7 +1984,7 @@ include_dir 'conf.d'
 
  Sets the maximum number of background processes that the system
  can support.  This parameter can only be set at server start.  The
- default is 8.
+ default is 16.
 
 
 
@@ -2006,8 +2006,9 @@ include_dir 'conf.d'
  Sets the maximum number of workers that can be started by a single
  Gather node.  Parallel workers are taken from the
  pool of processes established by
- .  Note that the requested
- number of workers may not actually be available at run time.  If this
+ , limited by
+ .  Note that the requested
+ number of workers may not actually be available at runtime.  If this
  occurs, the plan will run with fewer workers than expected, which may
  be inefficient.  The default value is 2.  Setting this value to 0
  disables parallel query execution.
@@ -2036,6 +2037,21 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration parameter
+   
+   
+   
+
+ Sets the maximum number of workers that the system can support for
+ parallel queries.  The default value is 8.  Setting this value to 0
+ disables parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index cde0ed3..0177401 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -453,7 +453,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
 	snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 			 MyProcPid);
 	worker.bgw_flags =
-		BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+		BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+		| BGWORKER_CLASS_PARALLEL;
 	worker.bgw_start_time = BgWorkerStart_ConsistentState;
 	worker.bgw_restart_time = BGW_NEVER_RESTART;
 	worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index e42ef98..6ad8fd0 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -718,9 +718,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
 	}
 
 	/*
-	 * In no case use more than max_parallel_workers_per_gather workers.
+	 * In no case use more than max_parallel_workers or
+	 * max_parallel_workers_per_gather workers.
 	 */
-	parallel_workers = Min(parallel_workers, max_parallel_workers_per_gather);
+	parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+max_parallel_workers_per_gather));
 
 	/* If any limit was set to zero, the user doesn't want a parallel scan. */
 	if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 2a49639..09dc077 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int			effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost		disable_cost = 1.0e10;
 
+int			max_parallel_workers = 8;
 int			max_parallel_workers_per_gather = 2;
 
 bool		enable_seqscan = true;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index f657ffc..aa8670b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -249,6 +249,7 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
 		parse->utilityStmt == NULL &&
 		!parse->hasModifyingCTE &&
 		max_parallel_workers_per_gather > 0 &&
+		max_parallel_workers > 0 &&
 		

Re: [HACKERS] Rename max_parallel_degree?

2016-10-03 Thread Robert Haas
On Fri, Sep 30, 2016 at 12:23 PM, Julien Rouhaud
 wrote:
> I've already fixed every other issue mentioned upthread, but I'm facing
> a problem for this one.  Assuming that the bgworker classes are supposed
> to be mutually exclusive, I don't see a simple and clean way to add such
> a check in SanityCheckBackgroundWorker().  Am I missing something
> obvious, or can someone give me some advice for this?

My advice is "don't worry about it".  If somebody thinks of something
that can be usefully added there, it will take very little time to add
it and test that it works.  Don't get hung up on that for now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-10-02 Thread Michael Paquier
On Sat, Oct 1, 2016 at 1:23 AM, Julien Rouhaud
 wrote:
> On 23/09/2016 21:10, Robert Haas wrote:
>> On Fri, Sep 23, 2016 at 8:54 AM, Peter Eisentraut
>>  wrote:
>>> On 9/20/16 4:07 PM, Robert Haas wrote:
 No, I'm assuming that the classes would be built-in.  A string tag
 seems like over-engineering to me, particularly because the postmaster
 needs to switch on the tag, and we need to be very careful about the
 degree to which the postmaster trusts the contents of shared memory.
>>>
>>> I'm hoping that we can come up with something that extensions can
>>> participate in, without the core having to know ahead of time what those
>>> extensions are or how they would be categorized.
>>>
>>> My vision is something like
>>>
>>> max_processes = 512  # requires restart
>>>
>>> process_allowances = 'connection:300 superuser:10 autovacuum:10
>>> parallel:30 replication:10 someextension:20 someotherextension:20'
>>> # does not require restart
>>
>> I don't think it's going to be very practical to allow extensions to
>> participate in the mechanism because there have to be a finite number
>> of slots that is known at the time we create the main shared memory
>> segment.
>>
>> Also, it's really important that we don't add lots more surface area
>> for the postmaster to crash and burn.
>>
>
> It seems that there's no objection on Robert's initial proposal, so I'll
> try to implement it.
>
> I've already fixed every other issue mentioned upthread, but I'm facing
> a problem for this one.  Assuming that the bgworker classes are supposed
> to be mutually exclusive, I don't see a simple and clean way to add such
> a check in SanityCheckBackgroundWorker().  Am I missing something
> obvious, or can someone give me some advice for this?

Okay, so marking it as returned with feedback is appropriate?  I have done
so, but feel free to contradict me.
-- 
Michael




Re: [HACKERS] Rename max_parallel_degree?

2016-09-30 Thread Julien Rouhaud
On 23/09/2016 21:10, Robert Haas wrote:
> On Fri, Sep 23, 2016 at 8:54 AM, Peter Eisentraut
>  wrote:
>> On 9/20/16 4:07 PM, Robert Haas wrote:
>>> No, I'm assuming that the classes would be built-in.  A string tag
>>> seems like over-engineering to me, particularly because the postmaster
>>> needs to switch on the tag, and we need to be very careful about the
>>> degree to which the postmaster trusts the contents of shared memory.
>>
>> I'm hoping that we can come up with something that extensions can
>> participate in, without the core having to know ahead of time what those
>> extensions are or how they would be categorized.
>>
>> My vision is something like
>>
>> max_processes = 512  # requires restart
>>
>> process_allowances = 'connection:300 superuser:10 autovacuum:10
>> parallel:30 replication:10 someextension:20 someotherextension:20'
>> # does not require restart
> 
> I don't think it's going to be very practical to allow extensions to
> participate in the mechanism because there have to be a finite number
> of slots that is known at the time we create the main shared memory
> segment.
> 
> Also, it's really important that we don't add lots more surface area
> for the postmaster to crash and burn.
> 

It seems that there's no objection to Robert's initial proposal, so I'll
try to implement it.

I've already fixed every other issue mentioned upthread, but I'm facing
a problem for this one.  Assuming that the bgworker classes are supposed
to be mutually exclusive, I don't see a simple and clean way to add such
a check in SanityCheckBackgroundWorker().  Am I missing something
obvious, or can someone give me some advice on this?
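One way to express the mutual-exclusion check being asked about, assuming each class occupies a distinct bit inside BGWORKER_CLASS_MASK, is to verify that the masked flags have at most one bit set.  This is only a sketch: the constants follow Robert's proposal upthread, and BGWORKER_CLASS_REPLICATION is a hypothetical future class, not something in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Class bits following the layout proposed upthread.  The replication
 * class is hypothetical, shown only to make the check meaningful. */
#define BGWORKER_CLASS_MASK        0x00f0
#define BGWORKER_CLASS_PARALLEL    0x0010
#define BGWORKER_CLASS_REPLICATION 0x0020   /* assumed, not in the patch */

/*
 * Return true when bgw_flags names at most one worker class: the masked
 * value must be zero or have exactly one bit set (x & (x - 1) clears the
 * lowest set bit, so the result is zero exactly in those two cases).
 */
static bool
bgworker_class_is_valid(uint32_t bgw_flags)
{
    uint32_t cls = bgw_flags & BGWORKER_CLASS_MASK;

    return (cls & (cls - 1)) == 0;
}
```

SanityCheckBackgroundWorker() could then reject any registration whose flags fail this test, without the classes needing to know about each other.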

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Rename max_parallel_degree?

2016-09-23 Thread Robert Haas
On Fri, Sep 23, 2016 at 8:54 AM, Peter Eisentraut
 wrote:
> On 9/20/16 4:07 PM, Robert Haas wrote:
>> No, I'm assuming that the classes would be built-in.  A string tag
>> seems like over-engineering to me, particularly because the postmaster
>> needs to switch on the tag, and we need to be very careful about the
>> degree to which the postmaster trusts the contents of shared memory.
>
> I'm hoping that we can come up with something that extensions can
> participate in, without the core having to know ahead of time what those
> extensions are or how they would be categorized.
>
> My vision is something like
>
> max_processes = 512  # requires restart
>
> process_allowances = 'connection:300 superuser:10 autovacuum:10
> parallel:30 replication:10 someextension:20 someotherextension:20'
> # does not require restart

I don't think it's going to be very practical to allow extensions to
participate in the mechanism because there have to be a finite number
of slots that is known at the time we create the main shared memory
segment.

Also, it's really important that we don't add lots more surface area
for the postmaster to crash and burn.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-09-23 Thread Peter Eisentraut
On 9/20/16 4:07 PM, Robert Haas wrote:
> No, I'm assuming that the classes would be built-in.  A string tag
> seems like over-engineering to me, particularly because the postmaster
> needs to switch on the tag, and we need to be very careful about the
> degree to which the postmaster trusts the contents of shared memory.

I'm hoping that we can come up with something that extensions can
participate in, without the core having to know ahead of time what those
extensions are or how they would be categorized.

My vision is something like

max_processes = 512  # requires restart

process_allowances = 'connection:300 superuser:10 autovacuum:10
parallel:30 replication:10 someextension:20 someotherextension:20'
# does not require restart
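As a rough illustration of how such an allowance string might be parsed — purely a sketch of the idea above; parse_process_allowances and ProcessAllowance are invented names, not PostgreSQL GUC code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parsed entry from a hypothetical process_allowances setting. */
typedef struct
{
    char name[64];
    int  allowance;
} ProcessAllowance;

/*
 * Parse a space-separated list of "tag:count" entries, as in the sketch
 * above.  Returns the number of entries parsed, or -1 on a malformed
 * entry.
 */
static int
parse_process_allowances(const char *value, ProcessAllowance *out, int maxout)
{
    int         n = 0;
    const char *p = value;

    while (n < maxout)
    {
        int consumed = 0;

        while (*p == ' ')       /* skip separators between entries */
            p++;
        if (*p == '\0')
            break;
        if (sscanf(p, "%63[^: ]:%d%n", out[n].name, &out[n].allowance,
                   &consumed) != 2)
            return -1;          /* no colon, or count missing */
        p += consumed;
        n++;
    }
    return n;
}
```

Since this reparses a flat string, it would also let a reload change the per-class split without touching the max_processes ceiling, which is what makes the "does not require restart" property possible.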

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-09-20 Thread Robert Haas
On Tue, Sep 20, 2016 at 3:31 PM, Peter Eisentraut
 wrote:
> On 9/20/16 11:47 AM, Robert Haas wrote:
 >> #define BGWORKER_CLASS_MASK   0x00f0
 >> #define BGWORKER_CLASS_PARALLEL  0x0010
 >> /* add additional bgworker classes here */
>>> >
>>> > Unless we have a mechanism that makes use of this somehow, this attempt
>>> > at categorizing might be premature.
>> My main reason for wanting it is that I suspect there will be a similar
>> desire to limit the number of replication workers at some point.
>
> Would that work for something developed as an extension?  Would we all
> have to agree on non-conflicting numbers?  Maybe a string tag would work
> better?

No, I'm assuming that the classes would be built-in.  A string tag
seems like over-engineering to me, particularly because the postmaster
needs to switch on the tag, and we need to be very careful about the
degree to which the postmaster trusts the contents of shared memory.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-09-20 Thread Peter Eisentraut
On 9/20/16 11:47 AM, Robert Haas wrote:
>>> >> #define BGWORKER_CLASS_MASK   0x00f0
>>> >> #define BGWORKER_CLASS_PARALLEL  0x0010
>>> >> /* add additional bgworker classes here */
>> >
>> > Unless we have a mechanism that makes use of this somehow, this attempt
>> > at categorizing might be premature.
> My main reason for wanting it is that I suspect there will be a similar
> desire to limit the number of replication workers at some point.

Would that work for something developed as an extension?  Would we all
have to agree on non-conflicting numbers?  Maybe a string tag would work
better?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-09-20 Thread Petr Jelinek
On 20/09/16 17:47, Robert Haas wrote:
> On Tue, Sep 20, 2016 at 11:27 AM, Peter Eisentraut
>  wrote:
>> On 9/19/16 2:18 PM, Robert Haas wrote:
>>> I suggest that we make it part of bgw_flags, but use a bitmask for it,
>>> like this:
>>>
>>> #define BGWORKER_CLASS_MASK   0x00f0
>>> #define BGWORKER_CLASS_PARALLEL  0x0010
>>> /* add additional bgworker classes here */
>>
>> Unless we have a mechanism that makes use of this somehow, this attempt
>> at categorizing might be premature.
> 
> My main reason for wanting it is that I suspect there will be a similar
> desire to limit the number of replication workers at some point.
> 

Yes, there is definitely a desire to use this for replication workers as
well.  The logical replication launcher currently handles the limit by
itself, but I'd prefer to reuse something more generic.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-09-20 Thread Robert Haas
On Tue, Sep 20, 2016 at 11:27 AM, Peter Eisentraut
 wrote:
> On 9/19/16 2:18 PM, Robert Haas wrote:
>> I suggest that we make it part of bgw_flags, but use a bitmask for it,
>> like this:
>>
>> #define BGWORKER_CLASS_MASK   0x00f0
>> #define BGWORKER_CLASS_PARALLEL  0x0010
>> /* add additional bgworker classes here */
>
> Unless we have a mechanism that makes use of this somehow, this attempt
> at categorizing might be premature.

My main reason for wanting it is that I suspect there will be a similar
desire to limit the number of replication workers at some point.

However, that could be wrong.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-09-20 Thread Peter Eisentraut
On 9/19/16 2:18 PM, Robert Haas wrote:
> I suggest that we make it part of bgw_flags, but use a bitmask for it,
> like this:
> 
> #define BGWORKER_CLASS_MASK   0x00f0
> #define BGWORKER_CLASS_PARALLEL  0x0010
> /* add additional bgworker classes here */

Unless we have a mechanism that makes use of this somehow, this attempt
at categorizing might be premature.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-09-19 Thread Robert Haas
On Sat, Sep 17, 2016 at 2:40 AM, Amit Kapila  wrote:
> On Fri, Sep 16, 2016 at 11:54 PM, Robert Haas  wrote:
>>
>> +   /*
>> +* We need a write barrier to make sure the update of
>> +* parallel_terminate_count is done before the store to in_use
>> +*/
>>
>> Does the order actually matter here?
>>
>
> I think so.  If slot->in_use is reordered before the check of
> is_parallel_worker, then it is possible that concurrent registration
> of worker can mark the is_parallel_worker as false before we check the
> flag here.  See explanation in previous e-mail [1].

Tricky.  I believe you're right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-09-19 Thread Robert Haas
On Fri, Sep 16, 2016 at 3:53 PM, Julien Rouhaud
 wrote:
> That's fine by me.  Should this be done (if there's no objection) in the
> same patch, or in another one?

I'd say "same patch".

>> I'd suggest renaming the "parallel" flag to BackgroundWorkerSlot to
>> "is_parallel_worker".  Or, actually, what I think would be better is
>> to give it a name like worker_class, and then we can have
>> BGWORKER_CLASS_PARALLEL and perhaps eventually
>> BGWORKER_CLASS_REPLICATION, etc.
>
> For now I just renamed "parallel" to "is_parallel_worker", since this is
> straightforward.  For a new "worker_class", I guess we'd need a new enum
> stored in BackgroundWorker struct instead of the
> BGWORKER_IS_PARALLEL_WORKER flag, and store it in the
> BackgroundWorkerSlot. Should I do that instead?

I suggest that we make it part of bgw_flags, but use a bitmask for it,
like this:

#define BGWORKER_CLASS_MASK   0x00f0
#define BGWORKER_CLASS_PARALLEL  0x0010
/* add additional bgworker classes here */

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-09-17 Thread Amit Kapila
On Fri, Sep 16, 2016 at 11:54 PM, Robert Haas  wrote:
>
> +   /*
> +* We need a write barrier to make sure the update of
> +* parallel_terminate_count is done before the store to in_use
> +*/
>
> Does the order actually matter here?
>

I think so.  If slot->in_use is reordered before the check of
is_parallel_worker, then it is possible that concurrent registration
of worker can mark the is_parallel_worker as false before we check the
flag here.  See explanation in previous e-mail [1].


[1] - 
https://www.postgresql.org/message-id/CAA4eK1%2BQ_DdJ28qXYSHZiBKNf2MY1QFrv5XAOAw4ZXHw4TPMxA%40mail.gmail.com
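The ordering being described can be sketched as follows.  forget_worker stands in for ForgetBackgroundWorker(), with the write barrier elided so only the required ordering is visible; the struct and names are simplified stand-ins, not the actual bgworker structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified slot, after the suggested reordering. */
typedef struct
{
    bool in_use;
    bool is_parallel_worker;
} BackgroundWorkerSlotSketch;

static uint32_t parallel_terminate_count = 0;

/*
 * Free a slot with the ordering argued for above: bump the terminate
 * counter BEFORE clearing in_use, so that a backend that observes the
 * slot as free (and may immediately reuse it, flipping the parallel
 * flag) cannot also observe a stale counter.
 */
static void
forget_worker(BackgroundWorkerSlotSketch *slot)
{
    if (slot->is_parallel_worker)
        parallel_terminate_count++;
    /* pg_write_barrier() would go here in the real code */
    slot->in_use = false;
}
```

If the two stores were reordered, a concurrent RegisterDynamicBackgroundWorker() could grab the slot in between and the increment would be skipped or misattributed, which is exactly the race described in the message above.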

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-09-16 Thread Julien Rouhaud
On 16/09/2016 20:24, Robert Haas wrote:
> On Wed, Jun 29, 2016 at 10:46 PM, Amit Kapila  wrote:
>> Your patch looks good to me and is ready for a committer's look.
>>
>> Notes for committer -
>> a. Verify whether the description of the newly added GUC max_parallel_workers
>> looks okay to you; Julien and I are not in 100% agreement on that.
>> b. Comments might need some improvement.
> 
> This patch needs to be rebased.  I hope somebody can volunteer to do
> that, because I'd like to commit it once we've hashed out the details.
> 

I just rebased the previous patch on current HEAD, with some other
modifications, see below (attached v8 if that helps).

> Would it bother anybody very much if we bumped up these values, say by
> increasing max_worker_processes from 8 to 16 and max_parallel_workers
> from 4 (as it is in the current patch version) to 8?  I feel like 4 is
> a bit more conservative than I'd like to be by default, and I'd like
> to make sure that we leave room for other sorts of background workers
> between the two limits.
> 

That's fine by me.  Should this be done (if there's no objection) in the
same patch, or in another one?


> I'd suggest renaming the "parallel" flag to BackgroundWorkerSlot to
> "is_parallel_worker".  Or, actually, what I think would be better is
> to give it a name like worker_class, and then we can have
> BGWORKER_CLASS_PARALLEL and perhaps eventually
> BGWORKER_CLASS_REPLICATION, etc.
> 

For now I just renamed "parallel" to "is_parallel_worker", since this is
straightforward.  For a new "worker_class", I guess we'd need a new enum
stored in BackgroundWorker struct instead of the
BGWORKER_IS_PARALLEL_WORKER flag, and store it in the
BackgroundWorkerSlot. Should I do that instead?


> + * terminated ones.  These counters can of course overlaps, but it's not
> + * important here since the substraction will still give the right number.
> 
> overlaps -> overflow.  substraction -> subtraction.
> 

oops sorry, fixed

> +   /*
> +* We need a write barrier to make sure the update of
> +* parallel_terminate_count is done before the store to in_use
> +*/
> 
> Does the order actually matter here?
> 

After some more thinking, it looks like a reorder here won't have any
impact. I'll remove it, unless Amit has an objection about it.

> +   {"max_parallel_workers", PGC_USERSET, RESOURCES_ASYNCHRONOUS,
> +   gettext_noop("Sets the maximum number of parallel processes for the cluster."),
> 
> I suggest: sets the maximum number of parallel workers that can be
> active at one time.
> 

changed

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cd66abc..3abd2e5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2005,8 +2005,9 @@ include_dir 'conf.d'
  Sets the maximum number of workers that can be started by a single
  Gather node.  Parallel workers are taken from the
  pool of processes established by
- .  Note that the requested
- number of workers may not actually be available at run time.  If this
+ , limited by
+ .  Note that the requested
+ number of workers may not actually be available at runtime.  If this
  occurs, the plan will run with fewer workers than expected, which may
  be inefficient.  The default value is 2.  Setting this value to 0
  disables parallel query execution.
@@ -2030,6 +2031,21 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration 
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that the system can support for
+ parallel queries.  The default value is 4.  Setting this value to 0
+ disables parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index cde0ed3..5f6e429 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -453,7 +453,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);
worker.bgw_flags =
-   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+   | BGWORKER_IS_PARALLEL_WORKER;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
worker.bgw_restart_time = BGW_NEVER_RESTART;
worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 99b6bc8..8f33813 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ 

Re: [HACKERS] Rename max_parallel_degree?

2016-09-16 Thread Robert Haas
On Wed, Jun 29, 2016 at 10:46 PM, Amit Kapila  wrote:
> Your patch looks good to me and is ready for a committer's look.
>
> Notes for committer -
> a. Verify whether the description of the newly added GUC max_parallel_workers
> looks okay to you; Julien and I are not in 100% agreement on that.
> b. Comments might need some improvement.

This patch needs to be rebased.  I hope somebody can volunteer to do
that, because I'd like to commit it once we've hashed out the details.

Would it bother anybody very much if we bumped up these values, say by
increasing max_worker_processes from 8 to 16 and max_parallel_workers
from 4 (as it is in the current patch version) to 8?  I feel like 4 is
a bit more conservative than I'd like to be by default, and I'd like
to make sure that we leave room for other sorts of background workers
between the two limits.

I'd suggest renaming the "parallel" flag to BackgroundWorkerSlot to
"is_parallel_worker".  Or, actually, what I think would be better is
to give it a name like worker_class, and then we can have
BGWORKER_CLASS_PARALLEL and perhaps eventually
BGWORKER_CLASS_REPLICATION, etc.

+ * terminated ones.  These counters can of course overlaps, but it's not
+ * important here since the substraction will still give the right number.

overlaps -> overflow.  substraction -> subtraction.
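For what it's worth, the corrected comment's claim does hold with unsigned counters: even after either counter wraps around, the modular subtraction still yields the number of currently-active workers.  A minimal demonstration with uint32 counters (the function name is illustrative, not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/*
 * The two counters from the patch: total registrations and total
 * terminations of parallel workers.  Unsigned subtraction is arithmetic
 * modulo 2^32, so the difference stays correct across wraparound as long
 * as fewer than 2^32 workers are live at once.
 */
static uint32_t
active_parallel_workers(uint32_t register_count, uint32_t terminate_count)
{
    return register_count - terminate_count;
}
```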

+   /*
+* We need a write barrier to make sure the update of
+* parallel_terminate_count is done before the store to in_use
+*/

Does the order actually matter here?

+   {"max_parallel_workers", PGC_USERSET, RESOURCES_ASYNCHRONOUS,
+   gettext_noop("Sets the maximum number of parallel processes for the cluster."),

I suggest: sets the maximum number of parallel workers that can be
active at one time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-29 Thread Amit Kapila
On Wed, Jun 29, 2016 at 11:54 AM, Julien Rouhaud
 wrote:
> On 29/06/2016 06:29, Amit Kapila wrote:
>> On Wed, Jun 29, 2016 at 2:57 AM, Julien Rouhaud
>>  wrote:
>>>
>>> Thanks a lot for the help!
>>>
>>> PFA v6 which should fix all the issues mentioned.
>>
>> Couple of minor suggestions.
>>
>> - .  Note that the requested
>> + , limited by
>> + .  Note that the requested
>>
>> Typo.
>> /linked/linkend
>>
>
> Oops, fixed.
>
>> You can always find such mistakes by doing make check in doc/src/sgml/
>>
>
> I wasn't aware of that, it's really a nice thing to know, thanks!
>
>> + /*
>> + * We need a memory barrier here to make sure the above test doesn't get
>> + * reordered
>> + */
>> + pg_read_barrier();
>>
>> /memory barrier/read barrier
>>
>
> fixed
>
>> + if (max_parallel_workers == 0)
>> + {
>> + ereport(elevel,
>> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
>> + errmsg("background worker \"%s\": cannot request parallel worker if
>> no parallel worker allowed",
>>
>> " ..no parallel worker is allowed".  'is' seems to be missing.
>>
>
> fixed
>

Your patch looks good to me and is ready for a committer's look.

Notes for committer -
a. Verify whether the description of the newly added GUC max_parallel_workers
looks okay to you; Julien and I are not in 100% agreement on that.
b. Comments might need some improvement.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-06-29 Thread Julien Rouhaud
On 29/06/2016 08:51, Amit Kapila wrote:
> On Wed, Jun 29, 2016 at 11:54 AM, Julien Rouhaud
>  wrote:
>> Or should we allow setting it to -1 for instance to disable the limit?
>>
> 
> By disabling the limit, do you mean to say that only
> max_parallel_workers_per_gather will determine the workers required or
> something else?

I meant keeping the current behavior (before this patch), which is
probably what some users without custom dynamic bgworkers would like to
have.  Otherwise, you'd have to change two parameters to effectively
increase parallelism, which I think could also cause some confusion.

>  If earlier, then I am not sure if it is good idea,
> because it can cause some confusion to the user about usage of both
> the parameters together.
> 

I also agree.  I don't see an ideal solution, so just keeping this patch
behavior is fine for me.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Rename max_parallel_degree?

2016-06-29 Thread Amit Kapila
On Wed, Jun 29, 2016 at 11:54 AM, Julien Rouhaud
 wrote:
> Or should we allow setting it to -1 for instance to disable the limit?
>

By disabling the limit, do you mean to say that only
max_parallel_workers_per_gather will determine the workers required or
something else?  If earlier, then I am not sure if it is good idea,
because it can cause some confusion to the user about usage of both
the parameters together.



-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-06-29 Thread Julien Rouhaud
On 29/06/2016 06:29, Amit Kapila wrote:
> On Wed, Jun 29, 2016 at 2:57 AM, Julien Rouhaud
>  wrote:
>>
>> Thanks a lot for the help!
>>
>> PFA v6 which should fix all the issues mentioned.
> 
> Couple of minor suggestions.
> 
> - .  Note that the requested
> + , limited by
> + .  Note that the requested
> 
> Typo.
> /linked/linkend
> 

Oops, fixed.

> You can always find such mistakes by doing make check in doc/src/sgml/
> 

I wasn't aware of that, it's really a nice thing to know, thanks!

> + /*
> + * We need a memory barrier here to make sure the above test doesn't get
> + * reordered
> + */
> + pg_read_barrier();
> 
> /memory barrier/read barrier
> 

fixed

> + if (max_parallel_workers == 0)
> + {
> + ereport(elevel,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("background worker \"%s\": cannot request parallel worker if
> no parallel worker allowed",
> 
> " ..no parallel worker is allowed".  'is' seems to be missing.
> 

fixed

> 
>>  Also, after second
>> thought I didn't add the extra hint about max_worker_processes in the
>> max_parallel_workers paragraph, since that line was a duplicate of the
>> preceding paragraph; it seemed better to leave the text as is.
>>
> 
> Not a big problem; we can leave it for the committer to decide.
> However, just by reading the description of max_parallel_workers, a user
> can set its value higher than max_worker_processes, which we don't want.
> 

Right.  On the other hand I'm not sure that's really an issue, because
such a case is handled in the code, and setting max_parallel_workers way
above max_worker_processes could be a way to configure it as unlimited.
Or should we allow setting it to -1 for instance to disable the limit?
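A sketch of how a -1 sentinel could fold into the clamping done in create_plain_partial_paths() — illustrative only; the patch as posted has no such sentinel, and the function name here is invented:

```c
#include <assert.h>

#define Min(x, y) ((x) < (y) ? (x) : (y))

/*
 * Compute the worker count after applying both limits, treating a
 * max_parallel_workers of -1 as "no cluster-wide limit", so only the
 * per-Gather limit applies.
 */
static int
effective_parallel_workers(int parallel_workers,
                           int max_parallel_workers,
                           int max_parallel_workers_per_gather)
{
    int result = Min(parallel_workers, max_parallel_workers_per_gather);

    if (max_parallel_workers >= 0)
        result = Min(result, max_parallel_workers);
    return result;
}
```

With the sentinel, restoring pre-patch behavior needs only one setting changed, which is the convenience being discussed; without it, the same effect requires setting max_parallel_workers at least as high as max_worker_processes.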

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 061697b..3a47421 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2005,7 +2005,8 @@ include_dir 'conf.d'
  Sets the maximum number of workers that can be started by a single
  Gather node.  Parallel workers are taken from the
  pool of processes established by
- .  Note that the requested
+ , limited by
+ .  Note that the requested
  number of workers may not actually be available at runtime.  If this
  occurs, the plan will run with fewer workers than expected, which may
  be inefficient.  The default value is 2.  Setting this value to 0
@@ -2014,6 +2015,21 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration 
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that the system can support for
+ parallel queries.  The default value is 4.  Setting this value to 0
+ disables parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 088700e..ea7680b 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -452,7 +452,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);
worker.bgw_flags =
-   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+   | BGWORKER_IS_PARALLEL_WORKER;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
worker.bgw_restart_time = BGW_NEVER_RESTART;
worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 2e4b670..e1da5f9 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -724,9 +724,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
}
 
/*
-* In no case use more than max_parallel_workers_per_gather workers.
+* In no case use more than max_parallel_workers or
+* max_parallel_workers_per_gather workers.
 */
-   parallel_workers = Min(parallel_workers, max_parallel_workers_per_gather);
+   parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+   max_parallel_workers_per_gather));
 
/* If any limit was set to zero, the user doesn't want a parallel scan. */
if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 8c1dccc..6cb2f4e 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost   

Re: [HACKERS] Rename max_parallel_degree?

2016-06-28 Thread Amit Kapila
On Wed, Jun 29, 2016 at 2:57 AM, Julien Rouhaud
 wrote:
>
> Thanks a lot for the help!
>
> PFA v6 which should fix all the issues mentioned.

Couple of minor suggestions.

- .  Note that the requested
+ , limited by
+ .  Note that the requested

Typo.
/linked/linkend

You can always find such mistakes by doing make check in doc/src/sgml/

+ /*
+ * We need a memory barrier here to make sure the above test doesn't get
+ * reordered
+ */
+ pg_read_barrier();

/memory barrier/read barrier

+ if (max_parallel_workers == 0)
+ {
+ ereport(elevel,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("background worker \"%s\": cannot request parallel worker if
no parallel worker allowed",

" ..no parallel worker is allowed".  'is' seems to be missing.


>  Also, after second
> thought I didn't add the extra hint about max_worker_processes in the
> max_parallel_workers paragraph, since that line was a duplicate of the
> preceding paragraph; it seemed better to leave the text as is.
>

Not a big problem; we can leave it for the committer to decide.
However, just by reading the description of max_parallel_workers, a user
can set its value higher than max_worker_processes, which we don't want.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-06-28 Thread Julien Rouhaud
On 28/06/2016 04:44, Amit Kapila wrote:
> On Mon, Jun 27, 2016 at 10:35 PM, Julien Rouhaud
>>
>> There's already a pg_memory_barrier() call in
>> BackgroundWorkerStateChange(), to avoid reordering the notify_pid load.
>> Couldn't we use it to also make sure the parallel_terminate_count
>> increment happens before the slot->in_use store?
>>
> 
> Yes, that is enough, as memory barrier ensures that both loads and
> stores are completed before any loads and stores that are after
> barrier.
> 
>>  I guess that a write
>> barrier will be needed in ForgetBacgroundWorker().
>>
> 
> Yes.
> 
 2.
 + if (parallel && (BackgroundWorkerData->parallel_register_count -
 +      BackgroundWorkerData->parallel_terminate_count) >=
 +      max_parallel_workers)
 + {
 +     LWLockRelease(BackgroundWorkerLock);
 +     return false;
 + }
 +

 I think we need a read barrier here, so that this check doesn't get
 reordered with the for loop below it.
>>
>> You mean between the end of this block and the for loop?
>>
> 
> Yes.
> 
  Also, see if you find the code
 more readable by moving the after && part of check to next line.
>>
>> I think I'll just pgindent the file.
>>
> 
> make sense.
> 
> 

Thanks a lot for the help!

PFA v6 which should fix all the issues mentioned.  Also, after second
thought I didn't add the extra hint about max_worker_processes in the
max_parallel_workers paragraph, since that line was a duplicate of the
preceding paragraph; it seemed better to leave the text as is.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a82bf06..6812b0d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2009,7 +2009,8 @@ include_dir 'conf.d'
  Sets the maximum number of workers that can be started by a single
  Gather node.  Parallel workers are taken from the
  pool of processes established by
- .  Note that the requested
+ , limited by
+ .  Note that the requested
  number of workers may not actually be available at runtime.  If this
  occurs, the plan will run with fewer workers than expected, which may
  be inefficient.  The default value is 2.  Setting this value to 0
@@ -2018,6 +2019,21 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration 
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that the system can support for
+ parallel queries.  The default value is 4.  Setting this value to 0
+ disables parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 088700e..ea7680b 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -452,7 +452,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);
worker.bgw_flags =
-   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+   | BGWORKER_IS_PARALLEL_WORKER;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
worker.bgw_restart_time = BGW_NEVER_RESTART;
worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 2e4b670..e1da5f9 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -724,9 +724,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
}
 
/*
-* In no case use more than max_parallel_workers_per_gather workers.
+* In no case use more than max_parallel_workers or
+* max_parallel_workers_per_gather workers.
 */
-   parallel_workers = Min(parallel_workers, max_parallel_workers_per_gather);
+   parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+   max_parallel_workers_per_gather));
 
/* If any limit was set to zero, the user doesn't want a parallel scan. */
if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 8c1dccc..6cb2f4e 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost   disable_cost = 1.0e10;
 
+intmax_parallel_workers = 4;
 intmax_parallel_workers_per_gather = 2;
 
 bool   enable_seqscan = true;
diff --git a/src/backend/optimizer/plan/planner.c 

Re: [HACKERS] Rename max_parallel_degree?

2016-06-27 Thread Amit Kapila
On Mon, Jun 27, 2016 at 10:35 PM, Julien Rouhaud
 wrote:
> On 27/06/2016 07:18, Amit Kapila wrote:
>> On Mon, Jun 27, 2016 at 10:21 AM, Amit Kapila  
>> wrote:
>>>
>>> I think as the parallel_terminate_count is only modified by postmaster
>>> and read by other processes, such an operation will be considered
>>> atomic.  We don't need to specifically make it atomic unless multiple
>>> processes are allowed to modify such a counter.  Although, I think we
>>> need to use some barriers in code to make it safe.
>>>
>>> 1.
>>> @@ -272,6 +279,8 @@ BackgroundWorkerStateChange(void)
>>>   pg_memory_barrier();
>>>
>>> slot->pid = 0;
>>>   slot->in_use = false;
>>> + if (slot->parallel)
>>> + BackgroundWorkerData->parallel_terminate_count++;
>>>
>>> I think the check of slot->parallel and increment to
>>> parallel_terminate_count should be done before marking slot->in_use to
>>> false.  Consider the case if slot->in_use is marked as false and then
>>> before next line execution, one of the backends registers dynamic
>>> worker (non-parallel worker), so that
>>> backend can see this slot as free and use it.  It will also mark the
>>> parallel flag of slot as false.  Now when postmaster will check the
>>> slot->parallel flag, it will find it false and then skip the increment
>>> to parallel_terminate_count.
>>>
>>
>> If you agree with above theory, then you need to use write barrier as
>> well after incrementing the parallel_terminate_count to avoid
>> re-ordering with slot->in_use flag.
>>
>
> I totally agree, I didn't think about this corner case.
>
> There's already a pg_memory_barrier() call in
> BackgroundWorkerStateChange(), to avoid reordering the notify_pid load.
> Couldn't we use it to also make sure the parallel_terminate_count
> increment happens before the slot->in_use store?
>

Yes, that is enough, as memory barrier ensures that both loads and
stores are completed before any loads and stores that are after
barrier.

>  I guess that a write
> barrier will be needed in ForgetBackgroundWorker().
>

Yes.

>>> 2.
>>> + if (parallel && (BackgroundWorkerData->parallel_register_count -
>>> + BackgroundWorkerData->parallel_terminate_count) >=
>>> + max_parallel_workers)
>>> + {
>>> + LWLockRelease(BackgroundWorkerLock);
>>> + return false;
>>> + }
>>> +
>>>
>>> I think we need a read barrier here, so that this check doesn't get
>>> reordered with the for loop below it.
>
> You mean between the end of this block and the for loop?
>

Yes.

>>>  Also, see if you find the code
>>> more readable by moving the after && part of check to next line.
>
> I think I'll just pgindent the file.
>

Makes sense.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rename max_parallel_degree?

2016-06-27 Thread Julien Rouhaud
On 27/06/2016 07:18, Amit Kapila wrote:
> On Mon, Jun 27, 2016 at 10:21 AM, Amit Kapila  wrote:
>>
>> I think as the parallel_terminate_count is only modified by postmaster
>> and read by other processes, such an operation will be considered
>> atomic.  We don't need to specifically make it atomic unless multiple
>> processes are allowed to modify such a counter.  Although, I think we
>> need to use some barriers in code to make it safe.
>>
>> 1.
>> @@ -272,6 +279,8 @@ BackgroundWorkerStateChange(void)
>>   pg_memory_barrier();
>>
>> slot->pid = 0;
>>   slot->in_use = false;
>> + if (slot->parallel)
>> + BackgroundWorkerData->parallel_terminate_count++;
>>
>> I think the check of slot->parallel and increment to
>> parallel_terminate_count should be done before marking slot->in_use to
>> false.  Consider the case if slot->in_use is marked as false and then
>> before next line execution, one of the backends registers dynamic
>> worker (non-parallel worker), so that
>> backend can see this slot as free and use it.  It will also mark the
>> parallel flag of slot as false.  Now when postmaster will check the
>> slot->parallel flag, it will find it false and then skip the increment
>> to parallel_terminate_count.
>>
> 
> If you agree with above theory, then you need to use write barrier as
> well after incrementing the parallel_terminate_count to avoid
> re-ordering with slot->in_use flag.
> 

I totally agree, I didn't think about this corner case.

There's already a pg_memory_barrier() call in
BackgroundWorkerStateChange(), to avoid reordering the notify_pid load.
Couldn't we use it to also make sure the parallel_terminate_count
increment happens before the slot->in_use store?  I guess that a write
barrier will be needed in ForgetBackgroundWorker().

>> 2.
>> + if (parallel && (BackgroundWorkerData->parallel_register_count -
>> + BackgroundWorkerData->parallel_terminate_count) >=
>> + max_parallel_workers)
>> + {
>> + LWLockRelease(BackgroundWorkerLock);
>> + return false;
>> + }
>> +
>>
>> I think we need a read barrier here, so that this check doesn't get
>> reordered with the for loop below it.

You mean between the end of this block and the for loop? (Sorry, I'm not
at all familiar with the pg_barrier mechanism).

>>  Also, see if you find the code
>> more readable by moving the after && part of check to next line.

I think I'll just pgindent the file.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Rename max_parallel_degree?

2016-06-26 Thread Amit Kapila
On Mon, Jun 27, 2016 at 10:21 AM, Amit Kapila  wrote:
> On Sun, Jun 26, 2016 at 3:57 PM, Julien Rouhaud
>  wrote:
>> On 26/06/2016 08:37, Amit Kapila wrote:
>>>
>>> @@ -370,6 +379,8 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)
>>>   Assert(rw->rw_shmem_slot < max_worker_processes);
>>>   slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
>>>   slot->in_use = false;
>>> + if (slot->parallel)
>>> + BackgroundWorkerData->parallel_terminate_count++;
>>>
>>> I think operations on parallel_terminate_count are not safe.
>>> ForgetBackgroundWorker() and RegisterDynamicBackgroundWorker() can try
>>> to read write at same time.  It seems you need to use atomic
>>> operations to ensure safety.
>>>
>>>
>>
>> But operations on parallel_terminate_count are done by the postmaster,
>> and per Robert's previous email postmaster can't use atomics:
>>
>
> I think as the parallel_terminate_count is only modified by postmaster
> and read by other processes, such an operation will be considered
> atomic.  We don't need to specifically make it atomic unless multiple
> processes are allowed to modify such a counter.  Although, I think we
> need to use some barriers in code to make it safe.
>
> 1.
> @@ -272,6 +279,8 @@ BackgroundWorkerStateChange(void)
>   pg_memory_barrier();
>
> slot->pid = 0;
>   slot->in_use = false;
> + if (slot->parallel)
> + BackgroundWorkerData->parallel_terminate_count++;
>
> I think the check of slot->parallel and increment to
> parallel_terminate_count should be done before marking slot->in_use to
> false.  Consider the case if slot->in_use is marked as false and then
> before next line execution, one of the backends registers dynamic
> worker (non-parallel worker), so that
> backend can see this slot as free and use it.  It will also mark the
> parallel flag of slot as false.  Now when postmaster will check the
> slot->parallel flag, it will find it false and then skip the increment
> to parallel_terminate_count.
>

If you agree with above theory, then you need to use write barrier as
well after incrementing the parallel_terminate_count to avoid
re-ordering with slot->in_use flag.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-06-26 Thread Amit Kapila
On Sun, Jun 26, 2016 at 3:57 PM, Julien Rouhaud
 wrote:
> On 26/06/2016 08:37, Amit Kapila wrote:
>>
>> @@ -370,6 +379,8 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)
>>   Assert(rw->rw_shmem_slot < max_worker_processes);
>>   slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
>>   slot->in_use = false;
>> + if (slot->parallel)
>> + BackgroundWorkerData->parallel_terminate_count++;
>>
>> I think operations on parallel_terminate_count are not safe.
>> ForgetBackgroundWorker() and RegisterDynamicBackgroundWorker() can try
>> to read write at same time.  It seems you need to use atomic
>> operations to ensure safety.
>>
>>
>
> But operations on parallel_terminate_count are done by the postmaster,
> and per Robert's previous email postmaster can't use atomics:
>

I think as the parallel_terminate_count is only modified by postmaster
and read by other processes, such an operation will be considered
atomic.  We don't need to specifically make it atomic unless multiple
processes are allowed to modify such a counter.  Although, I think we
need to use some barriers in code to make it safe.

1.
@@ -272,6 +279,8 @@ BackgroundWorkerStateChange(void)
  pg_memory_barrier();

slot->pid = 0;
  slot->in_use = false;
+ if (slot->parallel)
+ BackgroundWorkerData->parallel_terminate_count++;

I think the check of slot->parallel and increment to
parallel_terminate_count should be done before marking slot->in_use to
false.  Consider the case if slot->in_use is marked as false and then
before next line execution, one of the backends registers dynamic
worker (non-parallel worker), so that
backend can see this slot as free and use it.  It will also mark the
parallel flag of slot as false.  Now when postmaster will check the
slot->parallel flag, it will find it false and then skip the increment
to parallel_terminate_count.

2.
+ if (parallel && (BackgroundWorkerData->parallel_register_count -
+ BackgroundWorkerData->parallel_terminate_count) >=
+ max_parallel_workers)
+ {
+ LWLockRelease(BackgroundWorkerLock);
+ return false;
+ }
+

I think we need a read barrier here, so that this check doesn't get
reordered with the for loop below it.   Also, see if you find the code
more readable by moving the after && part of check to next line.

3.
typedef struct BackgroundWorkerArray
 {
  int total_slots;
+ uint32 parallel_register_count;
+ uint32 parallel_terminate_count;
  BackgroundWorkerSlot slot[FLEXIBLE_ARRAY_MEMBER];
 } BackgroundWorkerArray;

See, if you can add comments on top of this structure to explain the
usage/meaning of newly added parallel_* counters?


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-06-26 Thread Julien Rouhaud
On 26/06/2016 08:37, Amit Kapila wrote:
> On Sat, Jun 25, 2016 at 2:27 PM, Julien Rouhaud
>  wrote:
>>>
>>
>> It's better thanks.  Should we document somewhere the link between this
>> parameter and custom dynamic background workers or is it pretty
>> self-explanatory?
>>
> 
> How about if we add an additional line like:
> Parallel workers are taken from the pool of processes established by
> guc-max-worker-processes.
> 
> I think one might feel some duplication of text between this and what
> we have for max_parallel_workers_per_gather, but it seems genuine to
> me.
> 

Ok, I'll add it.

> 
> @@ -370,6 +379,8 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)
>   Assert(rw->rw_shmem_slot < max_worker_processes);
>   slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
>   slot->in_use = false;
> + if (slot->parallel)
> + BackgroundWorkerData->parallel_terminate_count++;
> 
> I think operations on parallel_terminate_count are not safe.
> ForgetBackgroundWorker() and RegisterDynamicBackgroundWorker() can try
> to read write at same time.  It seems you need to use atomic
> operations to ensure safety.
> 
> 

But operations on parallel_terminate_count are done by the postmaster,
and per Robert's previous email postmaster can't use atomics:

> We can't have the postmaster and the
> backends share the same counter, because then it would need locking,
> and the postmaster can't try to take spinlocks - can't even use
> atomics, because those might be emulated using spinlocks.

Do we support platforms on which 32-bit operations are not atomic?

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Rename max_parallel_degree?

2016-06-26 Thread Amit Kapila
On Sat, Jun 25, 2016 at 2:27 PM, Julien Rouhaud
 wrote:
> On 25/06/2016 09:33, Amit Kapila wrote:
>> On Wed, Jun 15, 2016 at 11:43 PM, Julien Rouhaud
>>  wrote:
>>>
>>> Attached v4 implements the design you suggested, I hope everything's ok.
>>>
>>
>> Few review comments:
>>
>
> Thanks for the review.
>
>
>
>> 4.
>> +  <varlistentry id="guc-max-parallel-workers" xreflabel="max_parallel_workers">
>> +   max_parallel_workers (integer)
>> +   
>> +max_parallel_workers configuration
>> parameter
>> +   
>> +   
>> +   
>> +
>> + Sets the maximum number of workers that can be launched at the same
>> + time for the whole server.  This parameter allows the 
>> administrator to
>> + reserve background worker slots for for third part dynamic 
>> background
>> + workers.  The default value is 4.  Setting this value to 0 disables
>> + parallel query execution.
>> +
>> +   
>> +  
>>
>> How about phrasing it as:
>> Sets the maximum number of workers that the system can support for
>> parallel queries.  The default value is 4.  Setting this value to 0
>> disables parallel query execution.
>>
>
> It's better thanks.  Should we document somewhere the link between this
> parameter and custom dynamic background workers or is it pretty
> self-explanatory?
>

How about if we add an additional line like:
Parallel workers are taken from the pool of processes established by
guc-max-worker-processes.

I think one might feel some duplication of text between this and what
we have for max_parallel_workers_per_gather, but it seems genuine to
me.


@@ -370,6 +379,8 @@ ForgetBackgroundWorker(slist_mutable_iter *cur)
  Assert(rw->rw_shmem_slot < max_worker_processes);
  slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
  slot->in_use = false;
+ if (slot->parallel)
+ BackgroundWorkerData->parallel_terminate_count++;

I think operations on parallel_terminate_count are not safe.
ForgetBackgroundWorker() and RegisterDynamicBackgroundWorker() can try
to read write at same time.  It seems you need to use atomic
operations to ensure safety.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-06-25 Thread Julien Rouhaud
On 25/06/2016 09:33, Amit Kapila wrote:
> On Wed, Jun 15, 2016 at 11:43 PM, Julien Rouhaud
>  wrote:
>>
>> Attached v4 implements the design you suggested, I hope everything's ok.
>>
> 
> Few review comments:
> 

Thanks for the review.

> 1.
> + if (parallel && (BackgroundWorkerData->parallel_register_count -
> + BackgroundWorkerData->parallel_terminate_count) >=
> + max_parallel_workers)
> + return false;
> 
> I think this check should be done after acquiring
> BackgroundWorkerLock, otherwise some other backend can simultaneously
> increment parallel_register_count.
> 

You're right, fixed.

> 2.
> 
> +/*
> + * This flag is used internally for parallel queries, to keep track of the
> + * number of active parallel workers and make sure we never launch more than
> + * max_parallel_workers parallel workers at the same time.  Third part
> + * background workers should not use this flag.
> + */
> +#define BGWORKER_IS_PARALLEL_WORKER 0x0004
> +
> 
> "Third part", do yo want to say Third party?
> 

Yes, sorry. Fixed

> 3.
> static bool
> SanityCheckBackgroundWorker(BackgroundWorker *worker, int elevel)
> {
> ..
> }
> 
> Isn't it better to have a check in above function such that if
> bgw_flags is BGWORKER_IS_PARALLEL_WORKER and max_parallel_workers is
> zero, then ereport?  Also, consider if it is better to have some other
> checks related to BGWORKER_IS_PARALLEL_WORKER, like if caller sets
> BGWORKER_IS_PARALLEL_WORKER, then it must have database connection and
> shared memory access.
> 

I added these checks. I don't see another check to add right now.

> 4.
> +  <varlistentry id="guc-max-parallel-workers" xreflabel="max_parallel_workers">
> +   max_parallel_workers (integer)
> +   
> +max_parallel_workers configuration
> parameter
> +   
> +   
> +   
> +
> + Sets the maximum number of workers that can be launched at the same
> + time for the whole server.  This parameter allows the administrator 
> to
> + reserve background worker slots for for third part dynamic 
> background
> + workers.  The default value is 4.  Setting this value to 0 disables
> + parallel query execution.
> +
> +   
> +  
> 
> How about phrasing it as:
> Sets the maximum number of workers that the system can support for
> parallel queries.  The default value is 4.  Setting this value to 0
> disables parallel query execution.
> 

It's better thanks.  Should we document somewhere the link between this
parameter and custom dynamic background workers or is it pretty
self-explanatory?

> 5.
> max_parallel_workers_per_gather configuration
> parameter
>
>
>
> 
>  Sets the maximum number of workers that can be started by a single
>  Gather node.  Parallel workers are taken from the
>  pool of processes established by
>  <xref linkend="guc-max-worker-processes">.
> 
> I think it is better to change above in documentation to indicate that
> "pool of processes established by guc-max-parallel-workers".
> 

The real limit is the minimum of these two values; I think it's
important to be explicit here, since this pool is shared between
parallelism and custom bgworkers.

What about "pool of processes established by guc-max-worker-processes,
limited by guc-max-parallel-workers" (used in attached v5 patch)

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index a82bf06..6812b0d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2009,7 +2009,8 @@ include_dir 'conf.d'
  Sets the maximum number of workers that can be started by a single
  Gather node.  Parallel workers are taken from the
  pool of processes established by
- <xref linkend="guc-max-worker-processes">.  Note that the requested
+ <xref linkend="guc-max-worker-processes">, limited by
+ <xref linkend="guc-max-parallel-workers">.  Note that the requested
  number of workers may not actually be available at runtime.  If this
  occurs, the plan will run with fewer workers than expected, which may
  be inefficient.  The default value is 2.  Setting this value to 0
@@ -2018,6 +2019,21 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration parameter
+   
+   
+   
+
+ Sets the maximum number of workers that the system can support for
+ parallel queries.  The default value is 4.  Setting this value to 0
+ disables parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 088700e..ea7680b 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -452,7 +452,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);

Re: [HACKERS] Rename max_parallel_degree?

2016-06-25 Thread Amit Kapila
On Wed, Jun 15, 2016 at 11:43 PM, Julien Rouhaud
 wrote:
>
> Attached v4 implements the design you suggested, I hope everything's ok.
>

Few review comments:

1.
+ if (parallel && (BackgroundWorkerData->parallel_register_count -
+ BackgroundWorkerData->parallel_terminate_count) >=
+ max_parallel_workers)
+ return false;

I think this check should be done after acquiring
BackgroundWorkerLock, otherwise some other backend can simultaneously
increment parallel_register_count.

2.

+/*
+ * This flag is used internally for parallel queries, to keep track of the
+ * number of active parallel workers and make sure we never launch more than
+ * max_parallel_workers parallel workers at the same time.  Third part
+ * background workers should not use this flag.
+ */
+#define BGWORKER_IS_PARALLEL_WORKER 0x0004
+

"Third part", do yo want to say Third party?

3.
static bool
SanityCheckBackgroundWorker(BackgroundWorker *worker, int elevel)
{
..
}

Isn't it better to have a check in above function such that if
bgw_flags is BGWORKER_IS_PARALLEL_WORKER and max_parallel_workers is
zero, then ereport?  Also, consider if it is better to have some other
checks related to BGWORKER_IS_PARALLEL_WORKER, like if caller sets
BGWORKER_IS_PARALLEL_WORKER, then it must have database connection and
shared memory access.

4.
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that can be launched at the same
+ time for the whole server.  This parameter allows the administrator to
+ reserve background worker slots for for third part dynamic background
+ workers.  The default value is 4.  Setting this value to 0 disables
+ parallel query execution.
+
+   
+  

How about phrasing it as:
Sets the maximum number of workers that the system can support for
parallel queries.  The default value is 4.  Setting this value to 0
disables parallel query execution.

5.
max_parallel_workers_per_gather configuration
parameter
   
   
   

 Sets the maximum number of workers that can be started by a single
 Gather node.  Parallel workers are taken from the
 pool of processes established by
 <xref linkend="guc-max-worker-processes">.

I think it is better to change above in documentation to indicate that
"pool of processes established by guc-max-parallel-workers".

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] Rename max_parallel_degree?

2016-06-15 Thread Julien Rouhaud
On 15/06/2016 18:14, Julien Rouhaud wrote:
> On 15/06/2016 17:49, Robert Haas wrote:
>> On Tue, Jun 14, 2016 at 7:10 AM, Julien Rouhaud
>>  wrote:
 I don't entirely like the new logic in
 RegisterDynamicBackgroundWorker.
>>>
 I wonder if we can't drive this off
 of a couple of counters, instead of having the process registering the
 background worker iterate over every slot. [...]

>>
>> I think we should go that way.  Some day we might try to make the
>> process of finding a free slot more efficient than it is today; I'd
>> rather not double down on linear search.
>>
> 
> Ok.
> 
>> Are you going to update this patch?
>>
> 
> Sure, I'll post a new patch ASAP.
> 

Attached v4 implements the design you suggested, I hope everything's ok.

I didn't do anything for the statistics part, because I'm not sure
having the counters separately is useful (instead of just having the
current number of parallel workers).  If it is useful, I'd find it strange
to have these counters reset at restart instead of being stored and
accumulated like other stats, and that looks like too much for 9.6 material.
I'd be happy to work on this though.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0e5a1e..c661c7a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2018,6 +2018,23 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration parameter
+   
+   
+   
+
+ Sets the maximum number of workers that can be launched at the same
+ time for the whole server.  This parameter allows the administrator to
+ reserve background worker slots for for third part dynamic background
+ workers.  The default value is 4.  Setting this value to 0 disables
+ parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index ab5ef25..b429474 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -453,7 +453,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);
worker.bgw_flags =
-   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+   | BGWORKER_IS_PARALLEL_WORKER;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
worker.bgw_restart_time = BGW_NEVER_RESTART;
worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index cc8ba61..2bcd86b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -719,9 +719,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo *rel)
}
 
/*
-* In no case use more than max_parallel_workers_per_gather workers.
+* In no case use more than max_parallel_workers or
+* max_parallel_workers_per_gather workers.
 */
-   parallel_workers = Min(parallel_workers, max_parallel_workers_per_gather);
+   parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+   max_parallel_workers_per_gather));
 
/* If any limit was set to zero, the user doesn't want a parallel scan. */
if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index e7f63f4..fd62126 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int effective_cache_size = DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost   disable_cost = 1.0e10;
 
+intmax_parallel_workers = 4;
 intmax_parallel_workers_per_gather = 2;
 
 bool   enable_seqscan = true;
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 07b925e..66e65c8 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -246,8 +246,9 @@ standard_planner(Query *parse, int cursorOptions, ParamListInfo boundParams)
	IsUnderPostmaster && dynamic_shared_memory_type != DSM_IMPL_NONE &&
	parse->commandType == CMD_SELECT && !parse->hasModifyingCTE &&
	parse->utilityStmt == NULL && max_parallel_workers_per_gather > 0 &&
-	!IsParallelWorker() && !IsolationIsSerializable() &&
-	!has_parallel_hazard((Node *) parse, true);
+	max_parallel_workers > 0 && !IsParallelWorker() &&
+	!IsolationIsSerializable() && 

Re: [HACKERS] Rename max_parallel_degree?

2016-06-15 Thread Julien Rouhaud
On 15/06/2016 17:49, Robert Haas wrote:
> On Tue, Jun 14, 2016 at 7:10 AM, Julien Rouhaud
>  wrote:
>>> I don't entirely like the new logic in
>>> RegisterDynamicBackgroundWorker.
>>
>>> I wonder if we can't drive this off
>>> of a couple of counters, instead of having the process registering the
>>> background worker iterate over every slot. [...]
>>>
> 
> I think we should go that way.  Some day we might try to make the
> process of finding a free slot more efficient than it is today; I'd
> rather not double down on linear search.
> 

Ok.

> Are you going to update this patch?
> 

Sure, I'll post a new patch ASAP.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Rename max_parallel_degree?

2016-06-15 Thread Robert Haas
On Tue, Jun 14, 2016 at 7:10 AM, Julien Rouhaud
 wrote:
>> I don't entirely like the new logic in
>> RegisterDynamicBackgroundWorker.
>
> I'm not that happy with it either. We can avoid iterating over every slot
> if the feature isn't activated though (max_parallel_workers >=
> max_worker_processes).
>
>> I wonder if we can't drive this off
>> of a couple of counters, instead of having the process registering the
>> background worker iterate over every slot.  Suppose we add two
>> counters to BackgroundWorkerArray, parallel_register_count and
>> parallel_terminate_count.  Whenever a backend successfully registers a
>> parallel worker, it increments parallel_register_count.  Whenever the
> postmaster marks a parallel worker slot as no longer in use, it
>> increments parallel_terminate_count.  Then, the number of active
>> parallel workers is just parallel_register_count -
>> parallel_terminate_count.  (We can't have the postmaster and the
>> backends share the same counter, because then it would need locking,
>> and the postmaster can't try to take spinlocks - can't even use
>> atomics, because those might be emulated using spinlocks.)
>>
>
> I wanted to maintain counters at first, but it seemed more invasive, and
> I thought that max_parallel_workers would be useful in environments
> where there are lots of parallel workers and dynamic workers used, so
> finding a free slot would require iterating over most of the slots most
> of the time anyway.  I'm of course also ok with maintaining counters.

I think we should go that way.  Some day we might try to make the
process of finding a free slot more efficient than it is today; I'd
rather not double down on linear search.

Are you going to update this patch?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-14 Thread Robert Haas
On Tue, Jun 14, 2016 at 12:16 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Of course, it would be nice if we could make these counters 64-bit
>> integers, but we can't, because we don't rely on 64-bit reads and
>> writes to be atomic on all platforms.  So instead they'll have to be
>> uint32.  That means they could wrap (if you really work at it) but
>> subtraction will still return the right answer, so it's OK.
>
> OK ...
>
>> If we
>> want to allow the number of parallel workers started to be available
>> for statistical purposes, we can keep to uint32 values for that
>> (parallel_register_count_lo and parallel_register_count_hi, for
>> example), and increment the second one whenever the first one rolls
>> over to zero.
>
> And that's going to be atomic how exactly?

The only process that can look at that structure without taking a lock
is the postmaster.  And the postmaster would only examine
parallel_register_count_lo.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-14 Thread Tom Lane
Robert Haas  writes:
> Of course, it would be nice if we could make these counters 64-bit
> integers, but we can't, because we don't rely on 64-bit reads and
> writes to be atomic on all platforms.  So instead they'll have to be
> uint32.  That means they could wrap (if you really work at it) but
> subtraction will still return the right answer, so it's OK.

OK ...

> If we
> want to allow the number of parallel workers started to be available
> for statistical purposes, we can keep to uint32 values for that
> (parallel_register_count_lo and parallel_register_count_hi, for
> example), and increment the second one whenever the first one rolls
> over to zero.

And that's going to be atomic how exactly?

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-14 Thread Julien Rouhaud
On 14/06/2016 04:09, Robert Haas wrote:
> On Mon, Jun 13, 2016 at 5:42 AM, Julien Rouhaud
>  wrote:
>> Agreed, and fixed in attached v3.
> 
> I don't entirely like the new logic in
> RegisterDynamicBackgroundWorker.

I'm not that happy with it either.  We can avoid iterating over every
slot if the feature isn't activated, though (max_parallel_workers >=
max_worker_processes).

> I wonder if we can't drive this off
> of a couple of counters, instead of having the process registering the
> background worker iterate over every slot.  Suppose we add two
> counters to BackgroundWorkerArray, parallel_register_count and
> parallel_terminate_count.  Whenever a backend successfully registers a
> parallel worker, it increments parallel_register_count.  Whenever the
> postmaster marks a parallel worker slot as no longer in use, it
> increments parallel_terminate_count.  Then, the number of active
> parallel workers is just parallel_register_count -
> parallel_terminate_count.  (We can't have the postmaster and the
> backends share the same counter, because then it would need locking,
> and the postmaster can't try to take spinlocks - can't even use
> atomics, because those might be emulated using spinlocks.)
> 

I wanted to maintain counters at first, but it seemed more invasive, and
I thought that max_parallel_workers would be useful in environments
where lots of parallel workers and dynamic workers are used, so finding
a free slot would require iterating over most of the slots most of the
time anyway.  I'm of course also ok with maintaining counters.

> If we want to allow the number of parallel workers started to be available
> for statistical purposes, we can keep to uint32 values for that
> (parallel_register_count_lo and parallel_register_count_hi, for
> example), and increment the second one whenever the first one rolls
> over to zero.
> 

I didn't think about monitoring. I'm not sure this counter would be
really helpful without also having the number of times a parallel worker
couldn't be launched (and I'd really like to have that one).

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] Rename max_parallel_degree?

2016-06-13 Thread Robert Haas
On Mon, Jun 13, 2016 at 5:42 AM, Julien Rouhaud
 wrote:
> Agreed, and fixed in attached v3.

I don't entirely like the new logic in
RegisterDynamicBackgroundWorker.  I wonder if we can't drive this off
of a couple of counters, instead of having the process registering the
background worker iterate over every slot.  Suppose we add two
counters to BackgroundWorkerArray, parallel_register_count and
parallel_terminate_count.  Whenever a backend successfully registers a
parallel worker, it increments parallel_register_count.  Whenever the
postmaster marks a parallel worker slot as no longer in use, it
increments parallel_terminate_count.  Then, the number of active
parallel workers is just parallel_register_count -
parallel_terminate_count.  (We can't have the postmaster and the
backends share the same counter, because then it would need locking,
and the postmaster can't try to take spinlocks - can't even use
atomics, because those might be emulated using spinlocks.)

Of course, it would be nice if we could make these counters 64-bit
integers, but we can't, because we don't rely on 64-bit reads and
writes to be atomic on all platforms.  So instead they'll have to be
uint32.  That means they could wrap (if you really work at it) but
subtraction will still return the right answer, so it's OK.  If we
want to allow the number of parallel workers started to be available
for statistical purposes, we can keep to uint32 values for that
(parallel_register_count_lo and parallel_register_count_hi, for
example), and increment the second one whenever the first one rolls
over to zero.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-13 Thread Julien Rouhaud
On 13/06/2016 03:16, Robert Haas wrote:
> On Sat, Jun 11, 2016 at 6:24 PM, Julien Rouhaud
>  wrote:
>> On 11/06/2016 23:37, Julien Rouhaud wrote:
>>> On 09/06/2016 16:04, Robert Haas wrote:
 There remains only the task of adding max_parallel_degree
 as a system-wide limit

>>>
>>> PFA a patch to fix this open item.  I used the GUC name provided in the
>>> 9.6 open item page (max_parallel_workers), with a default value of 4.
>>> Value 0 is another way to disable parallel query.
>>>
>>
>> Sorry I just realized I made a stupid mistake, I didn't check in all
>> slots to get the number of active parallel workers. Fixed in attached v2.
> 
> I think instead of adding a "bool parallel" argument to
> RegisterDynamicBackgroundWorker, we should just define a new constant
> for bgw_flags, say BGW_IS_PARALLEL_WORKER.
> 

Agreed, and fixed in attached v3.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0e5a1e..c661c7a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2018,6 +2018,23 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration 
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that can be launched at the same
+ time for the whole server.  This parameter allows the administrator to
+ reserve background worker slots for third-party dynamic background
+ workers.  The default value is 4.  Setting this value to 0 disables
+ parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index ab5ef25..b429474 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -453,7 +453,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
snprintf(worker.bgw_name, BGW_MAXLEN, "parallel worker for PID %d",
 MyProcPid);
worker.bgw_flags =
-   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION;
+   BGWORKER_SHMEM_ACCESS | BGWORKER_BACKEND_DATABASE_CONNECTION
+   | BGWORKER_IS_PARALLEL_WORKER;
worker.bgw_start_time = BgWorkerStart_ConsistentState;
worker.bgw_restart_time = BGW_NEVER_RESTART;
worker.bgw_main = ParallelWorkerMain;
diff --git a/src/backend/optimizer/path/allpaths.c 
b/src/backend/optimizer/path/allpaths.c
index cc8ba61..2bcd86b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -719,9 +719,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo 
*rel)
}
 
/*
-* In no case use more than max_parallel_workers_per_gather workers.
+* In no case use more than max_parallel_workers or
+* max_parallel_workers_per_gather workers.
 */
-   parallel_workers = Min(parallel_workers, 
max_parallel_workers_per_gather);
+   parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+   max_parallel_workers_per_gather));
 
/* If any limit was set to zero, the user doesn't want a parallel scan. 
*/
if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index e7f63f4..fd62126 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int effective_cache_size = 
DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost   disable_cost = 1.0e10;
 
+int		max_parallel_workers = 4;
 int		max_parallel_workers_per_gather = 2;
 
 bool   enable_seqscan = true;
diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index 54c0440..81d124c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -246,8 +246,9 @@ standard_planner(Query *parse, int cursorOptions, 
ParamListInfo boundParams)
IsUnderPostmaster && dynamic_shared_memory_type != 
DSM_IMPL_NONE &&
parse->commandType == CMD_SELECT && !parse->hasModifyingCTE &&
parse->utilityStmt == NULL && max_parallel_workers_per_gather > 
0 &&
-   !IsParallelWorker() && !IsolationIsSerializable() &&
-   !has_parallel_hazard((Node *) parse, true);
+   max_parallel_workers > 0 && !IsParallelWorker() &&
+   !IsolationIsSerializable() && !has_parallel_hazard((Node *) 
parse,
+   true);
 
/*
 * glob->parallelModeNeeded should tell us whether it's necessary to
diff --git a/src/backend/postmaster/bgworker.c 
b/src/backend/postmaster/bgworker.c
index 382edad..3749d72 100644
--- 

Re: [HACKERS] Rename max_parallel_degree?

2016-06-12 Thread Robert Haas
On Sat, Jun 11, 2016 at 6:24 PM, Julien Rouhaud
 wrote:
> On 11/06/2016 23:37, Julien Rouhaud wrote:
>> On 09/06/2016 16:04, Robert Haas wrote:
>>> OK, I pushed this after re-reviewing it and fixing a number of
>>> oversights.  There remains only the task of adding max_parallel_degree
>>> as a system-wide limit (as opposed to max_parallel_degree now
>>> max_parallel_workers_per_gather which is a per-Gather limit), which
>>> I'm going to argue should be a new open item and not necessarily one
>>> that I have to own myself.  I would like to take care of it, but I
>>> will not put it ahead of fixing actual defects and I will not promise
>>> to have it done in time for 9.6.
>>>
>>
>> PFA a patch to fix this open item.  I used the GUC name provided in the
>> 9.6 open item page (max_parallel_workers), with a default value of 4.
>> Value 0 is another way to disable parallel query.
>>
>
> Sorry I just realized I made a stupid mistake, I didn't check in all
> slots to get the number of active parallel workers. Fixed in attached v2.

I think instead of adding a "bool parallel" argument to
RegisterDynamicBackgroundWorker, we should just define a new constant
for bgw_flags, say BGW_IS_PARALLEL_WORKER.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-11 Thread Julien Rouhaud
On 11/06/2016 23:37, Julien Rouhaud wrote:
> On 09/06/2016 16:04, Robert Haas wrote:
>>
>> OK, I pushed this after re-reviewing it and fixing a number of
>> oversights.  There remains only the task of adding max_parallel_degree
>> as a system-wide limit (as opposed to max_parallel_degree now
>> max_parallel_workers_per_gather which is a per-Gather limit), which
>> I'm going to argue should be a new open item and not necessarily one
>> that I have to own myself.  I would like to take care of it, but I
>> will not put it ahead of fixing actual defects and I will not promise
>> to have it done in time for 9.6.
>>
> 
> PFA a patch to fix this open item.  I used the GUC name provided in the
> 9.6 open item page (max_parallel_workers), with a default value of 4.
> Value 0 is another way to disable parallel query.
> 

Sorry I just realized I made a stupid mistake, I didn't check in all
slots to get the number of active parallel workers. Fixed in attached v2.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0e5a1e..c661c7a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2018,6 +2018,23 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration 
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that can be launched at the same
+ time for the whole server.  This parameter allows the administrator to
+ reserve background worker slots for third-party dynamic background
+ workers.  The default value is 4.  Setting this value to 0 disables
+ parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index ab5ef25..7b53b53 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -473,8 +473,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
-			RegisterDynamicBackgroundWorker(&worker,
-											&pcxt->worker[i].bgwhandle))
+			RegisterDynamicBackgroundWorkerInternal(&worker,
+											&pcxt->worker[i].bgwhandle, true))
 		{
 			shm_mq_set_handle(pcxt->worker[i].error_mqh,
 							  pcxt->worker[i].bgwhandle);
diff --git a/src/backend/optimizer/path/allpaths.c 
b/src/backend/optimizer/path/allpaths.c
index cc8ba61..2bcd86b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -719,9 +719,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo 
*rel)
}
 
/*
-* In no case use more than max_parallel_workers_per_gather workers.
+* In no case use more than max_parallel_workers or
+* max_parallel_workers_per_gather workers.
 */
-   parallel_workers = Min(parallel_workers, 
max_parallel_workers_per_gather);
+   parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+   max_parallel_workers_per_gather));
 
/* If any limit was set to zero, the user doesn't want a parallel scan. 
*/
if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index e7f63f4..fd62126 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int effective_cache_size = 
DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost   disable_cost = 1.0e10;
 
+int		max_parallel_workers = 4;
 int		max_parallel_workers_per_gather = 2;
 
 bool   enable_seqscan = true;
diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index 54c0440..81d124c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -246,8 +246,9 @@ standard_planner(Query *parse, int cursorOptions, 
ParamListInfo boundParams)
IsUnderPostmaster && dynamic_shared_memory_type != 
DSM_IMPL_NONE &&
parse->commandType == CMD_SELECT && !parse->hasModifyingCTE &&
parse->utilityStmt == NULL && max_parallel_workers_per_gather > 
0 &&
-   !IsParallelWorker() && !IsolationIsSerializable() &&
-   !has_parallel_hazard((Node *) parse, true);
+   max_parallel_workers > 0 && !IsParallelWorker() &&
+   !IsolationIsSerializable() && !has_parallel_hazard((Node *) 
parse,
+   true);
 
/*

Re: [HACKERS] Rename max_parallel_degree?

2016-06-11 Thread Julien Rouhaud
On 09/06/2016 16:04, Robert Haas wrote:
> 
> OK, I pushed this after re-reviewing it and fixing a number of
> oversights.  There remains only the task of adding max_parallel_degree
> as a system-wide limit (as opposed to max_parallel_degree now
> max_parallel_workers_per_gather which is a per-Gather limit), which
> I'm going to argue should be a new open item and not necessarily one
> that I have to own myself.  I would like to take care of it, but I
> will not put it ahead of fixing actual defects and I will not promise
> to have it done in time for 9.6.
> 

PFA a patch to fix this open item.  I used the GUC name provided in the
9.6 open item page (max_parallel_workers), with a default value of 4.
Value 0 is another way to disable parallel query.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e0e5a1e..c661c7a 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -2018,6 +2018,23 @@ include_dir 'conf.d'

   
 
+  
+   max_parallel_workers (integer)
+   
+max_parallel_workers configuration 
parameter
+   
+   
+   
+
+ Sets the maximum number of workers that can be launched at the same
+ time for the whole server.  This parameter allows the administrator to
+ reserve background worker slots for third-party dynamic background
+ workers.  The default value is 4.  Setting this value to 0 disables
+ parallel query execution.
+
+   
+  
+
   
backend_flush_after (integer)

diff --git a/src/backend/access/transam/parallel.c 
b/src/backend/access/transam/parallel.c
index ab5ef25..7b53b53 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -473,8 +473,8 @@ LaunchParallelWorkers(ParallelContext *pcxt)
{
 		memcpy(worker.bgw_extra, &i, sizeof(int));
 		if (!any_registrations_failed &&
-			RegisterDynamicBackgroundWorker(&worker,
-											&pcxt->worker[i].bgwhandle))
+			RegisterDynamicBackgroundWorkerInternal(&worker,
+											&pcxt->worker[i].bgwhandle, true))
 		{
 			shm_mq_set_handle(pcxt->worker[i].error_mqh,
 							  pcxt->worker[i].bgwhandle);
diff --git a/src/backend/optimizer/path/allpaths.c 
b/src/backend/optimizer/path/allpaths.c
index cc8ba61..2bcd86b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -719,9 +719,11 @@ create_plain_partial_paths(PlannerInfo *root, RelOptInfo 
*rel)
}
 
/*
-* In no case use more than max_parallel_workers_per_gather workers.
+* In no case use more than max_parallel_workers or
+* max_parallel_workers_per_gather workers.
 */
-   parallel_workers = Min(parallel_workers, 
max_parallel_workers_per_gather);
+   parallel_workers = Min(max_parallel_workers, Min(parallel_workers,
+   max_parallel_workers_per_gather));
 
/* If any limit was set to zero, the user doesn't want a parallel scan. 
*/
if (parallel_workers <= 0)
diff --git a/src/backend/optimizer/path/costsize.c 
b/src/backend/optimizer/path/costsize.c
index e7f63f4..fd62126 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -113,6 +113,7 @@ int effective_cache_size = 
DEFAULT_EFFECTIVE_CACHE_SIZE;
 
 Cost   disable_cost = 1.0e10;
 
+int		max_parallel_workers = 4;
 int		max_parallel_workers_per_gather = 2;
 
 bool   enable_seqscan = true;
diff --git a/src/backend/optimizer/plan/planner.c 
b/src/backend/optimizer/plan/planner.c
index 54c0440..81d124c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -246,8 +246,9 @@ standard_planner(Query *parse, int cursorOptions, 
ParamListInfo boundParams)
IsUnderPostmaster && dynamic_shared_memory_type != 
DSM_IMPL_NONE &&
parse->commandType == CMD_SELECT && !parse->hasModifyingCTE &&
parse->utilityStmt == NULL && max_parallel_workers_per_gather > 
0 &&
-   !IsParallelWorker() && !IsolationIsSerializable() &&
-   !has_parallel_hazard((Node *) parse, true);
+   max_parallel_workers > 0 && !IsParallelWorker() &&
+   !IsolationIsSerializable() && !has_parallel_hazard((Node *) 
parse,
+   true);
 
/*
 * glob->parallelModeNeeded should tell us whether it's necessary to
diff --git a/src/backend/postmaster/bgworker.c 
b/src/backend/postmaster/bgworker.c
index 382edad..71025b7 100644
--- 

Re: [HACKERS] Rename max_parallel_degree?

2016-06-09 Thread Robert Haas
On Thu, Jun 9, 2016 at 6:09 PM, Peter Geoghegan  wrote:
> On Thu, Jun 9, 2016 at 1:08 PM, Robert Haas  wrote:
>>> I am in favor of having something similar to
>>> max_parallel_workers_per_gather for utility statements like CREATE
>>> INDEX. That will need a cost model, at least where the DBA isn't
>>> explicit about the number of workers to use.
>>
>> We may well need that, but I think it should be discussed in
>> conjunction with the patches that add parallelism for those utility
>> statements, rather than discussing it on a thread for a 9.6 open item.
>
> Of course.
>
> I don't think it needs to be scoped to utility statements. It's just
> clear that it's not appropriate to use max_parallel_workers_per_gather
> within utility statements, even though something like that will be
> needed.

True.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-09 Thread Peter Geoghegan
On Thu, Jun 9, 2016 at 1:08 PM, Robert Haas  wrote:
>> I am in favor of having something similar to
>> max_parallel_workers_per_gather for utility statements like CREATE
>> INDEX. That will need a cost model, at least where the DBA isn't
>> explicit about the number of workers to use.
>
> We may well need that, but I think it should be discussed in
> conjunction with the patches that add parallelism for those utility
> statements, rather than discussing it on a thread for a 9.6 open item.

Of course.

I don't think it needs to be scoped to utility statements. It's just
clear that it's not appropriate to use max_parallel_workers_per_gather
within utility statements, even though something like that will be
needed.

-- 
Peter Geoghegan




Re: [HACKERS] Rename max_parallel_degree?

2016-06-09 Thread Robert Haas
On Thu, Jun 9, 2016 at 2:05 PM, Peter Geoghegan  wrote:
> On Thu, Jun 9, 2016 at 7:04 AM, Robert Haas  wrote:
>> OK, I pushed this after re-reviewing it and fixing a number of
>> oversights.  There remains only the task of adding max_parallel_degree
>> as a system-wide limit (as opposed to max_parallel_degree now
>> max_parallel_workers_per_gather which is a per-Gather limit), which
>> I'm going to argue should be a new open item and not necessarily one
>> that I have to own myself.  I would like to take care of it, but I
>> will not put it ahead of fixing actual defects and I will not promise
>> to have it done in time for 9.6.
>
> I am in favor of having something similar to
> max_parallel_workers_per_gather for utility statements like CREATE
> INDEX. That will need a cost model, at least where the DBA isn't
> explicit about the number of workers to use.

We may well need that, but I think it should be discussed in
conjunction with the patches that add parallelism for those utility
statements, rather than discussing it on a thread for a 9.6 open item.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-09 Thread Peter Geoghegan
On Thu, Jun 9, 2016 at 7:04 AM, Robert Haas  wrote:
> OK, I pushed this after re-reviewing it and fixing a number of
> oversights.  There remains only the task of adding max_parallel_degree
> as a system-wide limit (as opposed to max_parallel_degree now
> max_parallel_workers_per_gather which is a per-Gather limit), which
> I'm going to argue should be a new open item and not necessarily one
> that I have to own myself.  I would like to take care of it, but I
> will not put it ahead of fixing actual defects and I will not promise
> to have it done in time for 9.6.

I am in favor of having something similar to
max_parallel_workers_per_gather for utility statements like CREATE
INDEX. That will need a cost model, at least where the DBA isn't
explicit about the number of workers to use.

-- 
Peter Geoghegan




Re: [HACKERS] Rename max_parallel_degree?

2016-06-09 Thread Robert Haas
On Wed, Jun 8, 2016 at 10:18 AM, Tom Lane  wrote:
> Josh Berkus  writes:
>> On 06/07/2016 11:01 PM, Robert Haas wrote:
>>> Here's a patch changing max_parallel_degree to
>>> max_parallel_workers_per_gather, and also changing parallel_degree to
>>> parallel_workers.  I haven't tackled adding a separate
>>> max_parallel_workers, at least not yet.  Are people OK with this?
>
>> +1
>
> I've not read the patch in detail, but this sketch of what to do
> sounds fine.

OK, I pushed this after re-reviewing it and fixing a number of
oversights.  There remains only the task of adding max_parallel_degree
as a system-wide limit (as opposed to max_parallel_degree now
max_parallel_workers_per_gather which is a per-Gather limit), which
I'm going to argue should be a new open item and not necessarily one
that I have to own myself.  I would like to take care of it, but I
will not put it ahead of fixing actual defects and I will not promise
to have it done in time for 9.6.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-08 Thread Robert Haas
On Wed, Jun 8, 2016 at 10:41 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Jun 8, 2016 at 10:18 AM, Tom Lane  wrote:
>>> catversion is not relevant to GUC changes.  It's not really necessary,
>>> because you'd get a clean, easily diagnosed and repaired failure during
>>> postmaster startup anyway.  The point of bumping catversion is to prevent
>>> a postmaster starting when the incompatibility is subtler or harder to
>>> debug than that.
>
>> The reloption is also getting renamed here.
>
> Hmm.  I forget what the behavior is if we see an unrecognized reloption
> already stored in the catalogs, but it might be worth checking.  If
> that's something that's painful to get out of, maybe a catversion bump
> would be appropriate.

rhaas=# create table tgl (a int);
CREATE TABLE
rhaas=# alter table tgl set (parallel_degree = 3);
ALTER TABLE
rhaas=# select reloptions from pg_class where relname = 'tgl';
     reloptions      
---------------------
 {parallel_degree=3}
(1 row)

rhaas=# update pg_class set reloptions = '{beats_me=3}' where relname = 'tgl';
UPDATE 1
rhaas=# alter table tgl reset (beats_me);
ALTER TABLE
rhaas=# select reloptions from pg_class where relname = 'tgl';
 reloptions 
------------
 
(1 row)

So it's not hard to get rid of.  pg_dump does dump the unrecognized
option, which will fail at restore time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-08 Thread Tom Lane
Robert Haas  writes:
> On Wed, Jun 8, 2016 at 10:18 AM, Tom Lane  wrote:
>> catversion is not relevant to GUC changes.  It's not really necessary,
>> because you'd get a clean, easily diagnosed and repaired failure during
>> postmaster startup anyway.  The point of bumping catversion is to prevent
>> a postmaster starting when the incompatibility is subtler or harder to
>> debug than that.

> The reloption is also getting renamed here.

Hmm.  I forget what the behavior is if we see an unrecognized reloption
already stored in the catalogs, but it might be worth checking.  If
that's something that's painful to get out of, maybe a catversion bump
would be appropriate.

(In practice this affects nobody, because there was already a catversion
bump since beta1.)

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-08 Thread Robert Haas
On Wed, Jun 8, 2016 at 10:18 AM, Tom Lane  wrote:
>>> Note that there is a dump/restore hazard if people have set the
>>> parallel_degree reloption on a beta1 install, or used ALTER { USER |
>>> DATABASE } .. SET parallel_degree.  Can everybody live with that?
>>> Should I bump catversion when applying this?
>
>> IMHO, we just need to call it out in the beta2 announcement.
>
> catversion is not relevant to GUC changes.  It's not really necessary,
> because you'd get a clean, easily diagnosed and repaired failure during
> postmaster startup anyway.  The point of bumping catversion is to prevent
> a postmaster starting when the incompatibility is subtler or harder to
> debug than that.

The reloption is also getting renamed here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-08 Thread Tom Lane
Josh Berkus  writes:
> On 06/07/2016 11:01 PM, Robert Haas wrote:
>> Here's a patch changing max_parallel_degree to
>> max_parallel_workers_per_gather, and also changing parallel_degree to
>> parallel_workers.  I haven't tackled adding a separate
>> max_parallel_workers, at least not yet.  Are people OK with this?

> +1

I've not read the patch in detail, but this sketch of what to do
sounds fine.

>> Note that there is a dump/restore hazard if people have set the
>> parallel_degree reloption on a beta1 install, or used ALTER { USER |
>> DATABASE } .. SET parallel_degree.  Can everybody live with that?
>> Should I bump catversion when applying this?

> IMHO, we just need to call it out in the beta2 announcement.

catversion is not relevant to GUC changes.  It's not really necessary,
because you'd get a clean, easily diagnosed and repaired failure during
postmaster startup anyway.  The point of bumping catversion is to prevent
a postmaster starting when the incompatibility is subtler or harder to
debug than that.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-08 Thread Josh Berkus
On 06/07/2016 11:01 PM, Robert Haas wrote:
> On Fri, Jun 3, 2016 at 9:39 AM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> I think we should just go with max_parallel_workers for a limit on
>>> total parallel workers within max_worker_processes, and
>>> max_parallel_workers_per_gather for a per-Gather limit.  It's slightly
>>> annoying that we may end up renaming the latter GUC, but not as
>>> annoying as spending another three weeks debating this and missing
>>> beta2.
>>
>> +1.  I'm not as convinced as you are that we'll replace the GUC later,
>> but in any case this is an accurate description of the current
>> semantics.  And I'm really *not* in favor of fudging the name with
>> the intent of changing the GUC's semantics later --- that would fail
>> all sorts of compatibility expectations.
> 
> Here's a patch changing max_parallel_degree to
> max_parallel_workers_per_gather, and also changing parallel_degree to
> parallel_workers.  I haven't tackled adding a separate
> max_parallel_workers, at least not yet.  Are people OK with this?

+1

> 
> Note that there is a dump/restore hazard if people have set the
> parallel_degree reloption on a beta1 install, or used ALTER { USER |
> DATABASE } .. SET parallel_degree.  Can everybody live with that?
> Should I bump catversion when applying this?

IMHO, we just need to call it out in the beta2 announcement.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-07 Thread Robert Haas
On Fri, Jun 3, 2016 at 9:39 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> I think we should just go with max_parallel_workers for a limit on
>> total parallel workers within max_worker_processes, and
>> max_parallel_workers_per_gather for a per-Gather limit.  It's slightly
>> annoying that we may end up renaming the latter GUC, but not as
>> annoying as spending another three weeks debating this and missing
>> beta2.
>
> +1.  I'm not as convinced as you are that we'll replace the GUC later,
> but in any case this is an accurate description of the current
> semantics.  And I'm really *not* in favor of fudging the name with
> the intent of changing the GUC's semantics later --- that would fail
> all sorts of compatibility expectations.

Here's a patch changing max_parallel_degree to
max_parallel_workers_per_gather, and also changing parallel_degree to
parallel_workers.  I haven't tackled adding a separate
max_parallel_workers, at least not yet.  Are people OK with this?

Note that there is a dump/restore hazard if people have set the
parallel_degree reloption on a beta1 install, or used ALTER { USER |
DATABASE } .. SET parallel_degree.  Can everybody live with that?
Should I bump catversion when applying this?
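To make the hazard concrete, here is a hedged SQL sketch (the database and table names are invented for illustration, and the error text is approximate):

```sql
-- On a beta1 cluster, a user may have persisted the old names:
ALTER DATABASE mydb SET parallel_degree = 4;       -- old GUC name
ALTER TABLE big_table SET (parallel_degree = 4);   -- old reloption name

-- After the rename, restoring a dump of that cluster fails with
-- something like:
--   ERROR:  unrecognized configuration parameter "parallel_degree"
-- The settings would have to be re-created under the new names:
ALTER DATABASE mydb SET max_parallel_workers_per_gather = 4;
ALTER TABLE big_table SET (parallel_workers = 4);
```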

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


parallel-degree-to-workers-v1.patch
Description: binary/octet-stream



Re: [HACKERS] Rename max_parallel_degree?

2016-06-03 Thread Josh berkus
On 06/02/2016 09:33 PM, Peter Eisentraut wrote:
> On 6/3/16 12:21 AM, Petr Jelinek wrote:
>> On 01/06/16 17:55, David G. Johnston wrote:
>>> On Wed, Jun 1, 2016 at 11:45 AM, Petr Jelinek wrote:
>>>
>>> That GUC also controls worker processes that are started by
>>> extensions, not just ones that parallel query starts. This is btw
>>> one thing I don't like at all about how the current limits work, the
>>> parallel query will fight for workers with extensions because they
>>> share the same limit.
>>>
>>>
>>> Given that this models reality the GUC is doing its job.  Now, maybe we
>>> need additional knobs to give the end-user the ability to influence how
>>> those fights will turn out.
>>
>> Agreed, my point is that I think we do need an additional knob.
> 
> We need one knob to control how many process slots to create at server
> start, and then a bunch of sliders to control how to allocate those
> between regular connections, superuser connections, replication,
> autovacuum, parallel workers, background workers (by tag/label/group),
> and so on.

Now that's crazy talk.  I mean, next thing you'll be saying that we need
the ability to monitor this, or even change it at runtime.  Where does
the madness end?  ;-)

Seriously, you have a point here; it's maybe time to stop tackling
process management per server piecemeal.  Question is, who wants to work
on this?

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-03 Thread Tom Lane
Robert Haas  writes:
> I think we should just go with max_parallel_workers for a limit on
> total parallel workers within max_worker_processes, and
> max_parallel_workers_per_gather for a per-Gather limit.  It's slightly
> annoying that we may end up renaming the latter GUC, but not as
> annoying as spending another three weeks debating this and missing
> beta2.

+1.  I'm not as convinced as you are that we'll replace the GUC later,
but in any case this is an accurate description of the current
semantics.  And I'm really *not* in favor of fudging the name with
the intent of changing the GUC's semantics later --- that would fail
all sorts of compatibility expectations.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-03 Thread Robert Haas
On Fri, Jun 3, 2016 at 8:30 AM, David G. Johnston
 wrote:
> How big is the hazard of future-naming this and documenting the present
> limitation?  Is the casual user reading explains even going to be aware of
> that particular implementation detail?

Well, see, the nice thing about max_parallel_degree is that it is very
slightly ambiguous, just enough to paper over this.  But I'm going to
oppose naming the GUC inaccurately based on a hoped-for future
development project.

Another way to paper over the difference would be to call this
max_parallel_workers_per_operation.  Then we can document that an
operation is a Gather node, and in the future we could document that
it's now a statement.  However, if we pick a name like this, then
we're sort of implying that this GUC will control DDL, too.

I think we should just go with max_parallel_workers for a limit on
total parallel workers within max_worker_processes, and
max_parallel_workers_per_gather for a per-Gather limit.  It's slightly
annoying that we may end up renaming the latter GUC, but not as
annoying as spending another three weeks debating this and missing
beta2.
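As a rough illustration of how the two settings proposed above would interact (values are invented; this assumes the per-Gather semantics described in this thread):

```sql
-- Cluster-wide pool of background worker slots (set at postmaster start):
ALTER SYSTEM SET max_worker_processes = 8;   -- takes effect after restart

-- Per-Gather cap, settable per session:
SET max_parallel_workers_per_gather = 2;

-- A plan containing two Gather nodes may thus request up to 4 workers
-- in total, but only as many as remain free in max_worker_processes are
-- actually launched; the query still runs if no workers are available.
```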

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-03 Thread David G. Johnston
On Fri, Jun 3, 2016 at 8:20 AM, Robert Haas  wrote:

> On Thu, Jun 2, 2016 at 4:35 PM, Josh berkus  wrote:
> > I was assuming that we would have *both* per-operation and per-statement
> > limits.  I can see reasons for having both, I can see why power users
> > would want both, but it's going to be overwhelming to casual users.
>
> I don't think so.  I think the fact that this is per-gather-node
> rather than per-statement right now is basically a defect.  Once we
> have a per-statement limit, I see no value in having the
> per-gather-node setting.  So, yes, at that point, I would push to
> rename the GUC.
>
>
How big is the hazard of future-naming this and documenting the present
limitation?  Is the casual user reading explains even going to be aware of
that particular implementation detail?

David J.


Re: [HACKERS] Rename max_parallel_degree?

2016-06-03 Thread Robert Haas
On Thu, Jun 2, 2016 at 4:35 PM, Josh berkus  wrote:
> I was assuming that we would have *both* per-operation and per-statement
> limits.  I can see reasons for having both, I can see why power users
> would want both, but it's going to be overwhelming to casual users.

I don't think so.  I think the fact that this is per-gather-node
rather than per-statement right now is basically a defect.  Once we
have a per-statement limit, I see no value in having the
per-gather-node setting.  So, yes, at that point, I would push to
rename the GUC.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-03 Thread Albe Laurenz
Robert Haas wrote:
> On Wed, Jun 1, 2016 at 5:29 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> I've largely given up hope of coming up with an alternative that can
> >> attract more than one vote and that is also at least mildly accurate,
> >> but one idea is max_parallel_workers_per_gather_node.  That will be
> >> totally clear.
> >
> > Given the reference to Gather nodes, couldn't you drop the word
> > "parallel"?  "node" might not be necessary either.
> 
> Well, I think we could drop node, if you like.  I think parallel
> wouldn't be good to drop, though, because it sounds like we want a
> global limit on parallel workers also, and that can't be just
> max_workers.  So I think we should keep parallel in there for all of
> them, and have max_parallel_workers and
> max_parallel_workers_per_gather(_node).  The reloption and the Path
> struct field can be parallel_workers rather than parallel_degree.

I believe that it will be impossible to find a name that makes
the meaning clear to everybody.  Those who do not read the documentation
will always find a way to misunderstand it.

These suggestions have the added disadvantage that it is hard
to remember them.  I see myself going, "I have to change the maximum
for parallel workers, what was the name again?", and having to resort
to the manual (or SHOW ALL) each time.

I suggest to follow the precedent of "work_mem", stick with
something simple like "max_parallel_workers" and accept the risk
of not being totally self-explanatory.

Yours,
Laurenz Albe



Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Peter Eisentraut

On 6/3/16 12:21 AM, Petr Jelinek wrote:

On 01/06/16 17:55, David G. Johnston wrote:

On Wed, Jun 1, 2016 at 11:45 AM, Petr Jelinek wrote:

That GUC also controls worker processes that are started by
extensions, not just ones that parallel query starts. This is btw
one thing I don't like at all about how the current limits work, the
parallel query will fight for workers with extensions because they
share the same limit.


Given that this models reality the GUC is doing its job.  Now, maybe we
need additional knobs to give the end-user the ability to influence how
those fights will turn out.


Agreed, my point is that I think we do need an additional knob.


We need one knob to control how many process slots to create at server 
start, and then a bunch of sliders to control how to allocate those 
between regular connections, superuser connections, replication, 
autovacuum, parallel workers, background workers (by tag/label/group), 
and so on.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Petr Jelinek

On 01/06/16 17:55, David G. Johnston wrote:

On Wed, Jun 1, 2016 at 11:45 AM, Petr Jelinek wrote:

That GUC also controls worker processes that are started by
extensions, not just ones that parallel query starts. This is btw
one thing I don't like at all about how the current limits work, the
parallel query will fight for workers with extensions because they
share the same limit.


Given that this models reality the GUC is doing its job.  Now, maybe we
need additional knobs to give the end-user the ability to influence how
those fights will turn out.


Agreed, my point is that I think we do need an additional knob.

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 01:42 PM, David G. Johnston wrote:
> Are you referring to right now or if we move the goal posts to making
> > this a per-statement reservation?​
> 
> I was assuming that we would have *both* per-operation and per-statement
> limits.  I can see reasons for having both, I can see why power users
> would want both, but it's going to be overwhelming to casual users.
> 
> 
> Got that.  The only problem on that front with the current setup is
> that right now we are saying: "at most use 3 of the 8 available
> processes": i.e., we tie ourselves to a fixed number when I think a
> better algorithm would be: "on/off/auto - default auto" and we detect at
> runtime whatever values we feel are most appropriate based upon the
> machine we are running on.  If the user doesn't like our choices they
> can specify their own values.  But the only specified values in the
> configurations would be those placed there automatically by the user. 
> If value isn't specified but is required it gets set at startup and can
> be seen in pg_settings.
> 

Well, the hard part here is that the appropriate value is based on the
level of concurrency in the database as a whole.  For example, if it's a
website database with 200 active connections on a 32-core machine
already, you want zero parallelism.  Whereas if it's the only current
statement on a 16-core VM, you want like 8.

That sounds like a heuristic, except that the number of concurrent
statements can change in milliseconds.  So we'd really want to base this
on some sort of moving average, set conservatively.  This will be some
interesting code, and will probably need to be revised several times
before we get it right.  Particularly since this would involve scanning
some of the global catalogs we've been trying to move activity off of.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread David G. Johnston
On Thu, Jun 2, 2016 at 4:35 PM, Josh berkus  wrote:

> On 06/02/2016 01:08 PM, David G. Johnston wrote:
> > On Thu, Jun 2, 2016 at 3:52 PM, Josh berkus wrote:
> >
> > On 06/02/2016 08:53 AM, Tom Lane wrote:
> > > Josh berkus writes:
> > >> On 06/02/2016 04:58 AM, Robert Haas wrote:
> > >>> Well, I think we could drop node, if you like.  I think parallel
> > >>> wouldn't be good to drop, though, because it sounds like we want
> a
> > >>> global limit on parallel workers also, and that can't be just
> > >>> max_workers.  So I think we should keep parallel in there for
> all of
> > >>> them, and have max_parallel_workers and
> > >>> max_parallel_workers_per_gather(_node).  The reloption and the
> Path
> > >>> struct field can be parallel_workers rather than parallel_degree.
> > >
> > >> So does that mean we'll rename it if you manage to implement a
> parameter
> > >> which controls the number of workers for the whole statement?
> > >
> > > That would fit in as something like
> max_parallel_workers_per_statement.
> >
> > ETOOMANYKNOBS
> >
> > I'm trying to think of some way we can reasonably automate this for
> > users ...
> >
> >
> > Are you referring to right now or if we move the goal posts to making
> > this a per-statement reservation?
>
> I was assuming that we would have *both* per-operation and per-statement
> limits.  I can see reasons for having both, I can see why power users
> would want both, but it's going to be overwhelming to casual users.
>
>
Got that.  The only problem on that front with the current setup is that
right now we are saying: "at most use 3 of the 8 available processes":
i.e., we tie ourselves to a fixed number when I think a better algorithm
would be: "on/off/auto - default auto" and we detect at runtime whatever
values we feel are most appropriate based upon the machine we are running
on.  If the user doesn't like our choices they can specify their own
values.  But the only specified values in the configurations would be those
placed there automatically by the user.  If value isn't specified but is
required it gets set at startup and can be seen in pg_settings.

David J.


Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 01:08 PM, David G. Johnston wrote:
On Thu, Jun 2, 2016 at 3:52 PM, Josh berkus wrote:
> 
> On 06/02/2016 08:53 AM, Tom Lane wrote:
> > Josh berkus writes:
> >> On 06/02/2016 04:58 AM, Robert Haas wrote:
> >>> Well, I think we could drop node, if you like.  I think parallel
> >>> wouldn't be good to drop, though, because it sounds like we want a
> >>> global limit on parallel workers also, and that can't be just
> >>> max_workers.  So I think we should keep parallel in there for all of
> >>> them, and have max_parallel_workers and
> >>> max_parallel_workers_per_gather(_node).  The reloption and the Path
> >>> struct field can be parallel_workers rather than parallel_degree.
> >
> >> So does that mean we'll rename it if you manage to implement a 
> parameter
> >> which controls the number of workers for the whole statement?
> >
> > That would fit in as something like max_parallel_workers_per_statement.
> 
> ETOOMANYKNOBS
> 
> I'm trying to think of some way we can reasonably automate this for
> users ...
> 
> 
> Are you referring to right now or if we move the goal posts to making
> this a per-statement reservation?

I was assuming that we would have *both* per-operation and per-statement
limits.  I can see reasons for having both, I can see why power users
would want both, but it's going to be overwhelming to casual users.


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread David G. Johnston
On Thu, Jun 2, 2016 at 3:52 PM, Josh berkus  wrote:

> On 06/02/2016 08:53 AM, Tom Lane wrote:
> > Josh berkus  writes:
> >> On 06/02/2016 04:58 AM, Robert Haas wrote:
> >>> Well, I think we could drop node, if you like.  I think parallel
> >>> wouldn't be good to drop, though, because it sounds like we want a
> >>> global limit on parallel workers also, and that can't be just
> >>> max_workers.  So I think we should keep parallel in there for all of
> >>> them, and have max_parallel_workers and
> >>> max_parallel_workers_per_gather(_node).  The reloption and the Path
> >>> struct field can be parallel_workers rather than parallel_degree.
> >
> >> So does that mean we'll rename it if you manage to implement a parameter
> >> which controls the number of workers for the whole statement?
> >
> > That would fit in as something like max_parallel_workers_per_statement.
>
> ETOOMANYKNOBS
>
> I'm trying to think of some way we can reasonably automate this for
> users ...
>

Are you referring to right now or if we move the goal posts to making this
a per-statement reservation?

Oh, and how does one measure 0.718... of a knob?

David J.


Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 08:53 AM, Tom Lane wrote:
> Josh berkus  writes:
>> On 06/02/2016 04:58 AM, Robert Haas wrote:
>>> Well, I think we could drop node, if you like.  I think parallel
>>> wouldn't be good to drop, though, because it sounds like we want a
>>> global limit on parallel workers also, and that can't be just
>>> max_workers.  So I think we should keep parallel in there for all of
>>> them, and have max_parallel_workers and
>>> max_parallel_workers_per_gather(_node).  The reloption and the Path
>>> struct field can be parallel_workers rather than parallel_degree.
> 
>> So does that mean we'll rename it if you manage to implement a parameter
>> which controls the number of workers for the whole statement?
> 
> That would fit in as something like max_parallel_workers_per_statement.

ETOOMANYKNOBS

I'm trying to think of some way we can reasonably automate this for
users ...

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Tom Lane
Josh berkus  writes:
> On 06/02/2016 04:58 AM, Robert Haas wrote:
>> Well, I think we could drop node, if you like.  I think parallel
>> wouldn't be good to drop, though, because it sounds like we want a
>> global limit on parallel workers also, and that can't be just
>> max_workers.  So I think we should keep parallel in there for all of
>> them, and have max_parallel_workers and
>> max_parallel_workers_per_gather(_node).  The reloption and the Path
>> struct field can be parallel_workers rather than parallel_degree.

> So does that mean we'll rename it if you manage to implement a parameter
> which controls the number of workers for the whole statement?

That would fit in as something like max_parallel_workers_per_statement.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Josh berkus
On 06/02/2016 04:58 AM, Robert Haas wrote:

> Well, I think we could drop node, if you like.  I think parallel
> wouldn't be good to drop, though, because it sounds like we want a
> global limit on parallel workers also, and that can't be just
> max_workers.  So I think we should keep parallel in there for all of
> them, and have max_parallel_workers and
> max_parallel_workers_per_gather(_node).  The reloption and the Path
> struct field can be parallel_workers rather than parallel_degree.

So does that mean we'll rename it if you manage to implement a parameter
which controls the number of workers for the whole statement?


-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread David G. Johnston
On Thu, Jun 2, 2016 at 9:59 AM, Tom Lane  wrote:

> Robert Haas  writes:
> > Well, I think we could drop node, if you like.  I think parallel
> > wouldn't be good to drop, though, because it sounds like we want a
> > global limit on parallel workers also, and that can't be just
> > max_workers.  So I think we should keep parallel in there for all of
> > them, and have max_parallel_workers and
> > max_parallel_workers_per_gather(_node).  The reloption and the Path
> > struct field can be parallel_workers rather than parallel_degree.
>
> WFM.
>

+1


Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Tom Lane
Robert Haas  writes:
> Well, I think we could drop node, if you like.  I think parallel
> wouldn't be good to drop, though, because it sounds like we want a
> global limit on parallel workers also, and that can't be just
> max_workers.  So I think we should keep parallel in there for all of
> them, and have max_parallel_workers and
> max_parallel_workers_per_gather(_node).  The reloption and the Path
> struct field can be parallel_workers rather than parallel_degree.

WFM.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-02 Thread Robert Haas
On Wed, Jun 1, 2016 at 5:29 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I've largely given up hope of coming up with an alternative that can
>> attract more than one vote and that is also at least mildly accurate,
>> but one idea is max_parallel_workers_per_gather_node.  That will be
>> totally clear.
>
> Given the reference to Gather nodes, couldn't you drop the word
> "parallel"?  "node" might not be necessary either.

Well, I think we could drop node, if you like.  I think parallel
wouldn't be good to drop, though, because it sounds like we want a
global limit on parallel workers also, and that can't be just
max_workers.  So I think we should keep parallel in there for all of
them, and have max_parallel_workers and
max_parallel_workers_per_gather(_node).  The reloption and the Path
struct field can be parallel_workers rather than parallel_degree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Josh berkus
On 06/01/2016 02:21 PM, Robert Haas wrote:
> If you lined up ten people in a room all of whom were competent
> database professionals and none of whom knew anything about PostgreSQL
> and asked them to guess what a setting called work_mem does and what a
> setting called max_parallel_degree does, I will wager you $5 that
> they'd do better on the second one.  Likewise, I bet the guesses for
> max_parallel_degree would be closer to the mark than the guesses for
> maintenance_work_mem or replacement_sort_tuples or commit_siblings or
> bgwriter_lru_multiplier.

Incidentally, the reason I didn't jump into this thread until the
patches showed up is that I don't think it actually matters what the
parameters are named.  They're going to require documentation
regardless, parallelism just isn't something people grok instinctively.

I care about how the parameters *work*, and whether that's consistent
across our various resource management settings.

-- 
--
Josh Berkus
Red Hat OSAS
(any opinions are my own)




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Tom Lane
Robert Haas  writes:
> I've largely given up hope of coming up with an alternative that can
> attract more than one vote and that is also at least mildly accurate,
> but one idea is max_parallel_workers_per_gather_node.  That will be
> totally clear.

Given the reference to Gather nodes, couldn't you drop the word
"parallel"?  "node" might not be necessary either.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Robert Haas
On Wed, Jun 1, 2016 at 10:10 AM, Tom Lane  wrote:
> Amit Kapila  writes:
>> Your explanation is clear, however the name max_parallel_workers makes it
>> sound like parallelising an operation is all about workers.  Yes it
>> depends a lot on the number of workers allocated for parallel operation,
>> but that is not everything.  I think calling it max_parallelism as
>> suggested by Alvaro upthread suits better than max_parallel_workers.
>
> I don't think that's a good direction at all.  This entire discussion is
> caused by the fact that it's not very clear what "max_parallel_degree"
> measures.  Fixing that problem by renaming the variable to something that
> doesn't even pretend to tell you what it's counting is not an improvement.

I agree with that, but I also think you're holding the name of this
GUC to a ridiculously high standard.  It's not like you can look at
"work_mem" and know what it measures without reading the fine manual.
If you lined up ten people in a room all of whom were competent
database professionals and none of whom knew anything about PostgreSQL
and asked them to guess what a setting called work_mem does and what a
setting called max_parallel_degree does, I will wager you $5 that
they'd do better on the second one.  Likewise, I bet the guesses for
max_parallel_degree would be closer to the mark than the guesses for
maintenance_work_mem or replacement_sort_tuples or commit_siblings or
bgwriter_lru_multiplier.

I've largely given up hope of coming up with an alternative that can
attract more than one vote and that is also at least mildly accurate,
but one idea is max_parallel_workers_per_gather_node.  That will be
totally clear.  Three possible problems, none of them fatal, are:

- I have plans to eventually fix it so that the workers are shared
across the whole plan rather than just the plan node.  In most cases
that won't matter, but there are corner cases where it does.  Now when
that happens, we can rename this to max_parallel_workers_per_query,
breaking backward compatibility.
- It will force us to use a different GUC for utility commands.
- It's a bit long-winded.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread David G. Johnston
On Wed, Jun 1, 2016 at 11:45 AM, Petr Jelinek  wrote:

> That GUC also controls worker processes that are started by extensions,
> not just ones that parallel query starts. This is btw one thing I don't
> like at all about how the current limits work, the parallel query will
> fight for workers with extensions because they share the same limit.


Given that this models reality the GUC is doing its job.  Now, maybe we
need additional knobs to give the end-user the ability to influence how
those fights will turn out.

But as far as a high-level setting goes max_worker_processes seems to fit
the bill - and apparently fits within our existing cluster options naming
convention.

Parallel query uses workers to assist in query execution.
Background tasks use workers during execution.
Others.

David J.


Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Petr Jelinek

On 01/06/16 17:27, Jim Nasby wrote:

On 5/31/16 8:48 PM, Robert Haas wrote:

On Tue, May 31, 2016 at 5:58 PM, Tom Lane  wrote:

Alvaro Herrera  writes:

Robert Haas wrote:

I just want to point out that if we change #1, we're breaking
postgresql.conf compatibility for, IMHO, not a whole lot of benefit.
I'd just leave it alone.



We can add the old name as a synonym in guc.c to maintain
compatibility.


I doubt this is much of an issue at this point; max_worker_processes has
only been there a release or so, and surely there are very few people
explicitly setting it, given its limited use-case up to now.  It will be
really hard to change it after 9.6, but I think we could still get away
with that today.


max_worker_processes was added in 9.4, so it's been there for two
releases, but it probably is true that few people have set it.
Nevertheless, I don't think there's much evidence that it is a bad
enough name that we really must change it.


ISTM that all the confusion about parallel query would go away if the
setting was max_parallel_assistants instead of _workers. It's exactly
how parallel query works: there are helpers that *assist* the backend in
executing the query.

The big downside to "assistants" is it breaks all lexical connection to
max_worker_processes. So what if we change that to
max_assistant_processes? I think "assistant" and "worker" are close
enough in meaning for "stand alone" uses of BG workers so as not to be
confusing, and I don't see any options for parallelism that are any
clearer.


That GUC also controls worker processes that are started by extensions, 
not just ones that parallel query starts. This is btw one thing I don't 
like at all about how the current limits work, the parallel query will 
fight for workers with extensions because they share the same limit.
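A hedged sketch of the shared-limit problem described above (the numbers are invented for illustration):

```sql
-- Both parallel query and extension background workers draw from the
-- same pool of slots:
ALTER SYSTEM SET max_worker_processes = 8;   -- requires a restart

-- If an extension registers, say, 6 always-on background workers at
-- postmaster start, at most 2 slots remain for parallel query workers,
-- no matter how the parallel-query GUCs are set.
```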


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Jim Nasby

On 5/31/16 8:48 PM, Robert Haas wrote:

On Tue, May 31, 2016 at 5:58 PM, Tom Lane  wrote:

Alvaro Herrera  writes:

Robert Haas wrote:

I just want to point out that if we change #1, we're breaking
postgresql.conf compatibility for, IMHO, not a whole lot of benefit.
I'd just leave it alone.



We can add the old name as a synonym in guc.c to maintain compatibility.


I doubt this is much of an issue at this point; max_worker_processes has
only been there a release or so, and surely there are very few people
explicitly setting it, given its limited use-case up to now.  It will be
really hard to change it after 9.6, but I think we could still get away
with that today.


max_worker_processes was added in 9.4, so it's been there for two
releases, but it probably is true that few people have set it.
Nevertheless, I don't think there's much evidence that it is a bad
enough name that we really must change it.


ISTM that all the confusion about parallel query would go away if the 
setting was max_parallel_assistants instead of _workers. It's exactly 
how parallel query works: there are helpers that *assist* the backend in 
executing the query.


The big downside to "assistants" is it breaks all lexical connection to 
max_worker_processes. So what if we change that to 
max_assistant_processes? I think "assistant" and "worker" are close 
enough in meaning for "stand alone" uses of BG workers so as not to be 
confusing, and I don't see any options for parallelism that are any clearer.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Tom Lane
Amit Kapila  writes:
> Your explanation is clear, however the name max_parallel_workers makes it
> sound like that parallelising an operation is all about workers.  Yes it
> depends a lot on the number of workers allocated for parallel operation,
> but that is not everything.  I think calling it max_parallelism as
> suggested by Alvaro upthread suits better than max_parallel_workers.

I don't think that's a good direction at all.  This entire discussion is
caused by the fact that it's not very clear what "max_parallel_degree"
measures.  Fixing that problem by renaming the variable to something that
doesn't even pretend to tell you what it's counting is not an improvement.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-06-01 Thread Amit Kapila
On Tue, May 31, 2016 at 11:30 PM, Tom Lane  wrote:
>
> I wrote:
> > I really think that a GUC named "max_parallel_workers", which in fact
> > limits the number of workers and not something else, is the way to go.
>
> To be concrete, I suggest comparing the attached documentation patch
> with Robert's.  Which one is more understandable?
>

Your explanation is clear, however the name max_parallel_workers makes it
sound like that parallelising an operation is all about workers.  Yes it
depends a lot on the number of workers allocated for parallel operation,
but that is not everything.  I think calling it max_parallelism as
suggested by Alvaro upthread suits better than max_parallel_workers.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Robert Haas
On Tue, May 31, 2016 at 10:13 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Tue, May 31, 2016 at 10:02 PM, Tom Lane  wrote:
>>> The reloption does not set an exact value, according to the code:
>
>> True, max_parallel_degree is an overriding limit.  But the point is
>> that, without the reloption, you can't get lots of workers on a small
>> table.  The reloption lets you do that.
>
> Color me skeptical that that's actually a good idea ...

I'll color you as not having read the threads where multiple users asked for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Tom Lane
Robert Haas  writes:
> On Tue, May 31, 2016 at 10:02 PM, Tom Lane  wrote:
>> The reloption does not set an exact value, according to the code:

> True, max_parallel_degree is an overriding limit.  But the point is
> that, without the reloption, you can't get lots of workers on a small
> table.  The reloption lets you do that.

Color me skeptical that that's actually a good idea ...

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Robert Haas
On Tue, May 31, 2016 at 10:02 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Now, this case is a little trickier.  If we called it simply
>> parallel_degree rather than max_parallel_degree, then it would have
>> the same name as the reloption.  But the reloption sets an exact
>> value, and the GUC sets a cap, which is a significant difference.
>
> The reloption does not set an exact value, according to the code:
>
> /*
>  * Use the table parallel_degree, but don't go further than
>  * max_parallel_degree.
>  */
> parallel_degree = Min(rel->rel_parallel_degree, max_parallel_degree);
>
> although the existing documentation for it is so vague that you
> couldn't tell from that.

True, max_parallel_degree is an overriding limit.  But the point is
that, without the reloption, you can't get lots of workers on a small
table.  The reloption lets you do that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Tom Lane
Robert Haas  writes:
> Now, this case is a little trickier.  If we called it simply
> parallel_degree rather than max_parallel_degree, then it would have
> the same name as the reloption.  But the reloption sets an exact
> value, and the GUC sets a cap, which is a significant difference.

The reloption does not set an exact value, according to the code:

/*
 * Use the table parallel_degree, but don't go further than
 * max_parallel_degree.
 */
parallel_degree = Min(rel->rel_parallel_degree, max_parallel_degree);

although the existing documentation for it is so vague that you
couldn't tell from that.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Robert Haas
On Tue, May 31, 2016 at 8:51 PM, Peter Eisentraut
 wrote:
> On 5/31/16 4:04 PM, Tom Lane wrote:
>> The name should be closely related to what we use for #3.  I could go for
>> max_total_parallel_workers for #2 and max_parallel_workers for #3.
>> Or maybe max_parallel_workers_total?
>
> Most cluster-wide settings like this are named max_something
> (max_connections, max_wal_senders, max_replication_slots), whereas things
> that apply on a lower level are named max_something_per_something
> (max_files_per_process, max_locks_per_transaction).
>
> So let's leave max_worker_processes mostly alone and not add any _total_,
> _absolute_, _altogether_. ;-)

That's interesting, because it suggests that max_parallel_degree might
end up being called something that doesn't begin with "max".  Which is
an interesting line of thought.  Now, it does function as a maximum of
sorts, but that doesn't necessarily imply that it has to have max in
the name.  By way of analogy, work_mem is not called max_work_mem, yet
everybody still understands that the actual memory used might be less
than the configured value.

Now, this case is a little trickier.  If we called it simply
parallel_degree rather than max_parallel_degree, then it would have
the same name as the reloption.  But the reloption sets an exact
value, and the GUC sets a cap, which is a significant difference.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Robert Haas
On Tue, May 31, 2016 at 5:58 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Robert Haas wrote:
>>> I just want to point out that if we change #1, we're breaking
>>> postgresql.conf compatibility for, IMHO, not a whole lot of benefit.
>>> I'd just leave it alone.
>
>> We can add the old name as a synonym in guc.c to maintain compatibility.
>
> I doubt this is much of an issue at this point; max_worker_processes has
> only been there a release or so, and surely there are very few people
> explicitly setting it, given its limited use-case up to now.  It will be
> really hard to change it after 9.6, but I think we could still get away
> with that today.

max_worker_processes was added in 9.4, so it's been there for two
releases, but it probably is true that few people have set it.
Nevertheless, I don't think there's much evidence that it is a bad
enough name that we really must change it.

...Robert




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Peter Eisentraut

On 5/31/16 4:04 PM, Tom Lane wrote:

The name should be closely related to what we use for #3.  I could go for
max_total_parallel_workers for #2 and max_parallel_workers for #3.
Or maybe max_parallel_workers_total?


Most cluster-wide settings like this are named max_something 
(max_connections, max_wal_senders, max_replication_slots), whereas 
things that apply on a lower level are named max_something_per_something 
(max_files_per_process, max_locks_per_transaction).


So let's leave max_worker_processes mostly alone and not add any 
_total_, _absolute_, _altogether_. ;-)


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Tom Lane
Alvaro Herrera  writes:
> Robert Haas wrote:
>> I just want to point out that if we change #1, we're breaking
>> postgresql.conf compatibility for, IMHO, not a whole lot of benefit.
>> I'd just leave it alone.

> We can add the old name as a synonym in guc.c to maintain compatibility.

I doubt this is much of an issue at this point; max_worker_processes has
only been there a release or so, and surely there are very few people
explicitly setting it, given its limited use-case up to now.  It will be
really hard to change it after 9.6, but I think we could still get away
with that today.

regards, tom lane




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread David G. Johnston
On Tue, May 31, 2016 at 4:12 PM, Robert Haas  wrote:

> On Tue, May 31, 2016 at 4:04 PM, Tom Lane  wrote:
> > Alvaro Herrera  writes:
> >> Robert Haas wrote:
> >>> So I think in the long run we should have three limits:
> >>>
> >>> 1. Cluster-wide limit on number of worker processes for all purposes
> >>> (currently, max_worker_processes).
> >>>
> >>> 2. Cluster-wide limit on number of worker processes for parallelism
> >>> (don't have this yet).
> >>>
> >>> 3. Per-operation limit on number of worker processes for parallelism
> >>> (currently, max_parallel_degree).
> >>>
> >>> Whatever we rename, there needs to be enough semantic space between #1
> >>> and #3 to allow for the possibility - I think the very likely
> >>> possibility - that we will eventually also want #2.
> >
> >> max_background_workers sounds fine to me for #1, and I propose to add #2
> >> in 9.6 rather than wait.
> >
> > +1 to both points.
> >
> >> max_total_parallel_query_workers ?
> >
> > The name should be closely related to what we use for #3.  I could go for
> > max_total_parallel_workers for #2 and max_parallel_workers for #3.
> > Or maybe max_parallel_workers_total?
>
> I just want to point out that if we change #1, we're breaking
> postgresql.conf compatibility for, IMHO, not a whole lot of benefit.
> I'd just leave it alone.
>

+1

We shall assume that the DBA has set the value of max_worker_processes
according to hardware specifications, without regard to what exactly could
be parallelized.


>
> I would propose to call #2 max_parallel_workers and #3
> max_parallel_degree, which I think is clear as daylight, but since I
> invented all of these names, I guess I'm biased.
>

So we have operation, session, and cluster scope, yet no way to know which
is which without memorizing the documentation.

While we cannot enforce this, I'd say any of these that are intended to be
changed by the user should have functions published as their official API.
The name of the function can be made user friendly.  For all the
others, pick a name closer to the 64-character limit that is as expressive
as possible, so that seeing the name in postgresql.conf is all a person
really needs to see to know that they are changing the right thing.

I say your expectations are off if you want to encode these differences by
using workers and degree at the same time.  Let's go for a part-time DBA
vocabulary here.  I get the merit of degree, and by itself it might even be
OK, but in our usage degree is theory while workers are actual, and I'd say
people are likely to relate better to the latter.

David J.


Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Peter Geoghegan
On Tue, May 31, 2016 at 1:27 PM, Alvaro Herrera
 wrote:
> This is a very good point.
>
> I think parallel maintenance commands are going to require different
> tuning than different queries, and I'd rather have separate parameters
> for those things rather than force the same parameter being changed over
> and over depending on what is the session going to execute next.

FWIW, as things stand my parallel CREATE INDEX patch does have a cost
model which is pretty close to the one for parallel sequential scan.
The cost model cares about max_parallel_degree. However, it also
introduces a parallel_degree index storage parameter, which overrides
the cost model to make it indicate the number of parallel workers to
use (presumably, somebody will change master to make parallel_degree
in terms of "participant processes" rather than worker processes soon,
at which point I'll follow their lead).

The storage parameter sticks around, of course, so a REINDEX will
reuse it without asking. (I think CLUSTER should do the same with the
index storage parameter.)

What's novel about this is that for utility statements, you can
override the cost model, and so disregard max_parallel_degree if you
so choose. My guess is that DBAs will frequently want to do so.

I'm not attached to any of this, but that's what I've come up with to
date. Possibly the index storage parameter concept has baggage
elsewhere, which comes up when we later go to parallelize index scans.

-- 
Peter Geoghegan




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Alvaro Herrera
Robert Haas wrote:

> I just want to point out that if we change #1, we're breaking
> postgresql.conf compatibility for, IMHO, not a whole lot of benefit.
> I'd just leave it alone.

We can add the old name as a synonym in guc.c to maintain compatibility.

> I would propose to call #2 max_parallel_workers and #3
> max_parallel_degree, which I think is clear as daylight, but since I
> invented all of these names, I guess I'm biased.

You are biased :-)

> In terms of forward-compatibility, one thing to note is that we
> currently have only parallel QUERY, but in the future we will no doubt
> also have parallel operations that are not queries, like VACUUM and
> CLUSTER and ALTER TABLE.  If we put the word "query" into the name of
> a GUC, we're committing to the idea that it only applies to queries,
> and that there will be some other limit for non-queries.  If we don't
> put the word query in there, we are not necessarily committing to the
> reverse, but we're at least leaning in that direction.  So far I've
> largely dodged having to make a decision about this, because talking
> about the degree of parallelism for CLUSTER makes just as much sense
> as talking about the degree of parallelism for a query, but we could
> also decide to have e.g. maintenance_parallel_degree instead of
> max_parallel_degree for such operations.  But as we commit to names
> it's something to be aware of.

This is a very good point.

I think parallel maintenance commands are going to require different
tuning than different queries, and I'd rather have separate parameters
for those things rather than force the same parameter being changed over
and over depending on what is the session going to execute next.

Also, I think your proposed change from "max" to "maintenance" does not
make much sense; I'd rather have max_maintenance_parallel_degree.

-- 
Álvaro Herrera  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Rename max_parallel_degree?

2016-05-31 Thread Robert Haas
On Tue, May 31, 2016 at 4:04 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Robert Haas wrote:
>>> So I think in the long run we should have three limits:
>>>
>>> 1. Cluster-wide limit on number of worker processes for all purposes
>>> (currently, max_worker_processes).
>>>
>>> 2. Cluster-wide limit on number of worker processes for parallelism
>>> (don't have this yet).
>>>
>>> 3. Per-operation limit on number of worker processes for parallelism
>>> (currently, max_parallel_degree).
>>>
>>> Whatever we rename, there needs to be enough semantic space between #1
>>> and #3 to allow for the possibility - I think the very likely
>>> possibility - that we will eventually also want #2.
>
>> max_background_workers sounds fine to me for #1, and I propose to add #2
>> in 9.6 rather than wait.
>
> +1 to both points.
>
>> max_total_parallel_query_workers ?
>
> The name should be closely related to what we use for #3.  I could go for
> max_total_parallel_workers for #2 and max_parallel_workers for #3.
> Or maybe max_parallel_workers_total?

I just want to point out that if we change #1, we're breaking
postgresql.conf compatibility for, IMHO, not a whole lot of benefit.
I'd just leave it alone.

I would propose to call #2 max_parallel_workers and #3
max_parallel_degree, which I think is clear as daylight, but since I
invented all of these names, I guess I'm biased.

In terms of forward-compatibility, one thing to note is that we
currently have only parallel QUERY, but in the future we will no doubt
also have parallel operations that are not queries, like VACUUM and
CLUSTER and ALTER TABLE.  If we put the word "query" into the name of
a GUC, we're committing to the idea that it only applies to queries,
and that there will be some other limit for non-queries.  If we don't
put the word query in there, we are not necessarily committing to the
reverse, but we're at least leaning in that direction.  So far I've
largely dodged having to make a decision about this, because talking
about the degree of parallelism for CLUSTER makes just as much sense
as talking about the degree of parallelism for a query, but we could
also decide to have e.g. maintenance_parallel_degree instead of
max_parallel_degree for such operations.  But as we commit to names
it's something to be aware of.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



