Re: pg_upgrade: Pass -j down to vacuumdb

2019-04-04 Thread Tom Lane
Peter Eisentraut  writes:
> I think the real fix is to have pg_upgrade copy over the statistics.
> They are right there after all, just take them.

Well, it's not "just take them", we need to develop some infrastructure
that will permit coping with cross-version differences in the stats
storage.  This isn't insoluble, but it will take some work.  We had
a prior discussion that trailed off without much result:

https://www.postgresql.org/message-id/flat/724322880.K8vzik8zPz%40abook

regards, tom lane




Re: pg_upgrade: Pass -j down to vacuumdb

2019-04-04 Thread Peter Eisentraut
On 2019-04-03 23:24, Justin Pryzby wrote:
> I just did a test on one of our large-but-not-huge customers.  With
> stats_target=1, analyzing a 145GB partitioned table looks like it'll take
> perhaps an hour; they have ~1TB data, so delaying services during ANALYZE 
> would
> nullify the utility of pg_upgrade.  I can restore the essential tables from
> backup in 15-30 minutes.

I think the real fix is to have pg_upgrade copy over the statistics.
They are right there after all, just take them.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_upgrade: Pass -j down to vacuumdb

2019-04-03 Thread Justin Pryzby
On Wed, Apr 03, 2019 at 04:42:14PM -0400, Jeff Janes wrote:
> So maybe the first stage could
> be run by pg_upgrade itself, while the new server is still running on a
> linux socket in a private directory.

I think that would take too long.  It would be less of an issue if there was
feedback/progress from pg_upgrade during the analyze.

For our upgrades (which typically take ~15min but several customers take up to
~60min), I only analyze base tables (essentially, those which are neither
parents nor children), then start services, then ANALYZE with default stats
target.  I would want to avoid delaying services restart for more than another
(say) 5 minutes, and I would want to avoid even that unless there was a
progress report indicating that it's projected to take only a few more minutes.

I just did a test on one of our large-but-not-huge customers.  With
stats_target=1, analyzing a 145GB partitioned table looks like it'll take
perhaps an hour; they have ~1TB data, so delaying services during ANALYZE would
nullify the utility of pg_upgrade.  I can restore the essential tables from
backup in 15-30 minutes.

It might be fine if pg_upgrade took an option which enabled analyze, perhaps
instead of outputting analyze_new_cluster.sh.  But actually, a problem with
*that* is that currently pg_upgrade avoids starting the new cluster.  That
seems to be deliberate, since, with --link, that's an irreversible operation:
it's unsafe to start the old cluster afterwards.

Tangent: I have a queued mail from ~15 months ago wherein I proposed adding to
pg_upgrade an option to remove the old data dir (or probably only the files
associated with known relations).  I realized at the time that would be pretty
scary without having first verified that the new cluster at least starts.  I'm
not sure how good an idea that is, but --startnewcluster would be needed there,
too.

Justin




Re: pg_upgrade: Pass -j down to vacuumdb

2019-04-03 Thread Jeff Janes
On Fri, Mar 29, 2019 at 5:58 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-28 02:43, Jeff Janes wrote:
> > At first blush I thought it was obvious that you would not want to run
> > analyze-in-stages in parallel.  But after thinking about it some more
> > and reflecting on experience doing some troublesome upgrades, I would
> > reverse that and say it is now obvious you do want at least the first
> > stage of analyze-in-stages, and probably the first two, to run in
> > parallel.  That is not currently an option it supports, so we can't
> > really recommend it in the script or the docs.
>
> So do you think we should copy down the -j option from pg_upgrade, or
> make some other arrangement?
>

For 12, I think we should not pass it down so that it runs automatically,
and just go with a doc-only patch instead.  Where the text written into
the analyze_new_cluster.sh recommending to hit Ctrl-C and do something else
is counted as documentation.

I agree with you that the value of -j that was used for pg_ugrade might not
be suitable for vacuumdb, but anyone who tests things thoroughly enough to
know exactly what value to use for each is not going to be the target
audience of analyze_new_cluster.sh anyway.  So I think the -j used in
pg_upgrade can just be written directly into the suggestion, and that would
be good enough for the intended use.

Ideally I think the analyze-in-stages should have the ability to
parallelize the first stage and not the last one, but that is not v12
material.

Even more ideally it should only have two stages, not three.  One stage
runs to generate minimally-tolerable stats before the database is opened to
other users, and one runs after it is open (but hopefully before the big
end-of-month reports get run, or whatever the big statistics-sensitive
queries are on your system).   But we don't really have a concept of "open
to other users" in the first place, and doing it yourself by juggling
pg_hba files is annoying and error prone.  So maybe the first stage could
be run by pg_upgrade itself, while the new server is still running on a
linux socket in a private directory.

The defense of the current three-stage method for analyze-in-stages is that
very few people are likely to know just what the level of
"minimally-tolerable stats" for their system actually are, because upgrades
are rare enough not to learn by experience, and realistic load-generators
are rare enough not to learn from test/QA environments.  If the
default_statistics_target=1 stats are enough, then you got them quickly.
If they aren't, at least you didn't waste too much time collecting them
before moving on to the second-stage default_statistics_target=10 and then
the final stage.  So the three stages are amortizing over our ignorance.

Cheers,

Jeff


Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-29 Thread Peter Eisentraut
On 2019-03-28 02:43, Jeff Janes wrote:
> At first blush I thought it was obvious that you would not want to run
> analyze-in-stages in parallel.  But after thinking about it some more
> and reflecting on experience doing some troublesome upgrades, I would
> reverse that and say it is now obvious you do want at least the first
> stage of analyze-in-stages, and probably the first two, to run in
> parallel.  That is not currently an option it supports, so we can't
> really recommend it in the script or the docs.

So do you think we should copy down the -j option from pg_upgrade, or
make some other arrangement?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-27 Thread Jeff Janes
On Tue, Mar 26, 2019 at 7:28 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 2019-03-25 22:57, Tom Lane wrote:
> > + fprintf(script, "echo %sYou may wish to add --jobs=N for parallel
> analyzing.%s\n",
> > + ECHO_QUOTE, ECHO_QUOTE);
>
> But then you get that information after you have already started the
> script.
>

True, but the same goes for all the other information there, and it sleeps
to let you break out of it.  And I make it a habit to glance through any
scripts someone suggests that I run, so would notice the embedded advice
without running it at all.

I don't find any information about this analyze business on the
> pg_upgrade reference page.  Maybe a discussion there could explain the
> different paths better than making the output script extra complicated.
>
> Essentially: If you want a slow and gentle analyze, use the supplied
> script.  If you want a fast analyze, use vacuumdb, perhaps with an
> appropriate --jobs option.  Note that pg_upgrade --jobs and vacuumdb
> --jobs are resource-bound in different ways, so the same value might not
> be appropriate for both.
>
>
To me, analyze-in-stages is not about gentleness at all.  For example, it
does nothing to move vacuum_cost_delay away from its default of 0.  Rather,
it is about slamming the bare minimum statistics in there as fast as
possible, so your database doesn't keel over from horrible query plans on
even simple queries as soon as you reopen.  I want the database to survive
long enough for the more complete statistics to be gathered.  If you have
quickly accumulated max_connection processes all running horrible query
plans that never finish, your database might as well still be closed for
all the good it does the users. And all the load generated by those is
going to make the critical ANALYZE all that much slower.

At first blush I thought it was obvious that you would not want to run
analyze-in-stages in parallel.  But after thinking about it some more and
reflecting on experience doing some troublesome upgrades, I would reverse
that and say it is now obvious you do want at least the first stage of
analyze-in-stages, and probably the first two, to run in parallel.  That is
not currently an option it supports, so we can't really recommend it in the
script or the docs.

But we could at least adopt the more straightforward patch, suggest that if
they don't want analyze-in-stages they should consider doing the big-bang
analyze in parallel.

Cheers,

Jeff


Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-26 Thread Tom Lane
Peter Eisentraut  writes:
> On 2019-03-25 22:57, Tom Lane wrote:
>> +fprintf(script, "echo %sYou may wish to add --jobs=N for parallel 
>> analyzing.%s\n",
>> +ECHO_QUOTE, ECHO_QUOTE);

> But then you get that information after you have already started the script.

Yes, but that's already true of all the info that this script prints out.

> I don't find any information about this analyze business on the
> pg_upgrade reference page.  Maybe a discussion there could explain the
> different paths better than making the output script extra complicated.

> Essentially: If you want a slow and gentle analyze, use the supplied
> script.  If you want a fast analyze, use vacuumdb, perhaps with an
> appropriate --jobs option.  Note that pg_upgrade --jobs and vacuumdb
> --jobs are resource-bound in different ways, so the same value might not
> be appropriate for both.

I have no objection to handling it that way (i.e., just as a doc fix).
Or we could do both.

regards, tom lane



Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-26 Thread Peter Eisentraut
On 2019-03-25 22:57, Tom Lane wrote:
> + fprintf(script, "echo %sYou may wish to add --jobs=N for parallel 
> analyzing.%s\n",
> + ECHO_QUOTE, ECHO_QUOTE);

But then you get that information after you have already started the script.

I don't find any information about this analyze business on the
pg_upgrade reference page.  Maybe a discussion there could explain the
different paths better than making the output script extra complicated.

Essentially: If you want a slow and gentle analyze, use the supplied
script.  If you want a fast analyze, use vacuumdb, perhaps with an
appropriate --jobs option.  Note that pg_upgrade --jobs and vacuumdb
--jobs are resource-bound in different ways, so the same value might not
be appropriate for both.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-26 Thread Daniel Gustafsson
On Tuesday, March 26, 2019 3:20 AM, Michael Paquier  wrote:

> On Mon, Mar 25, 2019 at 05:57:50PM -0400, Tom Lane wrote:
>
> > In short, I'm not convinced that most of this patch is an improvement
> > on the status quo. I think we'd be best off to just take the idea
> > of explicitly mentioning adding --jobs to a manual run, ie roughly
>
> Yeah, no objections from here to keep that stuff the simpler the
> better. So I am on board with your suggestion.

+1, I'd rather have that as "documentation" after an upgrade.

cheers ./daniel



Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-25 Thread Michael Paquier
On Mon, Mar 25, 2019 at 05:57:50PM -0400, Tom Lane wrote:
> In short, I'm not convinced that most of this patch is an improvement
> on the status quo.  I think we'd be best off to just take the idea
> of explicitly mentioning adding --jobs to a manual run, ie roughly

Yeah, no objections from here to keep that stuff the simpler the
better.  So I am on board with your suggestion.
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-25 Thread Tom Lane
Jesper Pedersen  writes:
> [ v9_0001-Highlight-that-the-jobs-option-isn-t-passed-down-to-.patch ]

Now that I've looked at this, it seems like it's fairly confused about
what the proposed VACUUMDB_OPTS variable would actually do for you.
If you're going to run vacuumdb directly, it hardly seems like you'd
bother with setting such a variable; you'd just enter the options you
want directly on the command line.

Conversely, if what you plan to do is set VACUUMDB_OPTS and then
run this script, the fact that --analyze-in-stages is hard-wired
into the script's invocation doesn't seem right either: you probably
don't want that if your goal is to get done as fast as possible.

In short, I'm not convinced that most of this patch is an improvement
on the status quo.  I think we'd be best off to just take the idea
of explicitly mentioning adding --jobs to a manual run, ie roughly

fprintf(script, "echo %sthis script and run:%s\n",
ECHO_QUOTE, ECHO_QUOTE);
fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
new_cluster.bindir, user_specification.data,
/* Did we copy the free space files? */
(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
"--analyze-only" : "--analyze", ECHO_QUOTE);
+   fprintf(script, "echo %sYou may wish to add --jobs=N for parallel 
analyzing.%s\n",
+   ECHO_QUOTE, ECHO_QUOTE);
fprintf(script, "echo%s\n\n", ECHO_BLANK);

fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",


regards, tom lane



RE: pg_upgrade: Pass -j down to vacuumdb

2019-03-12 Thread Jamison, Kirk
Hi Jesper,

> Thanks Kirk !
>
> > On 3/12/19 2:20 PM, Robert Haas wrote:
> > The words 'by default' should be removed here, because there is also 
> > no non-default way to get that behavior, either.
> > 
> 
> Here is v9 based on Kirk's and your input.

Thanks! Although there were trailing whitespaces.
After fixing that, it works as intended (built & passed tests successfully).
That aside, the v9 patch looks good and comment was addressed.


Regards,
Kirk Jamison


Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-12 Thread Jesper Pedersen

Hi,

Thanks Kirk !

On 3/12/19 2:20 PM, Robert Haas wrote:

The words 'by default' should be removed here, because there is also
no non-default way to get that behavior, either.



Here is v9 based on Kirk's and your input.

Best regards,
 Jesper

>From 5b879f79300412638705e32aa3ed51aff3cbe75c Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Tue, 12 Mar 2019 16:02:27 -0400
Subject: [PATCH] Highlight that the --jobs option isn't passed down to
 vacuumdb.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Author: Jesper Pedersen and Kirk Jamison
Reviewed-by: Peter Eisentraut, Jerry Sievers, Michael Paquier, Tom Lane, Fabrízio de Royes Mello, Álvaro Herrera and Robert Haas
Discussion: https://www.postgresql.org/message-id/flat/11b6b108-6ac0-79a6-785a-f13dfe60a...@redhat.com
---
 doc/src/sgml/ref/pgupgrade.sgml |  6 +-
 src/bin/pg_upgrade/check.c  | 23 +++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7e1afaf0a5..5f2a94918d 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -395,7 +395,11 @@ NET STOP postgresql-
  in parallel;  a good place to start is the maximum of the number of
  CPU cores and tablespaces.  This option can dramatically reduce the
  time to upgrade a multi-database server running on a multiprocessor
- machine.
+ machine. Note that this option isn't passed to the
+ vacuumdb application.
+ The system environment variable VACUUMDB_OPTS can be
+ used to pass extra options to the vacuumdb
+ application during the script execution.
 
 
 
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..e4356f011c 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -493,17 +493,32 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 
 	fprintf(script, "echo %sIf you would like default statistics as quickly as possible, cancel%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %sthis script and run:%s\n",
+	fprintf(script, "echo %sthis script. You may add optional vacuumdb options (e.g. --jobs)%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+	fprintf(script, "echo %sby editing $VACUUMDB_OPTS, then run the command below.%s\n",
+			ECHO_QUOTE, ECHO_QUOTE);
+#ifndef WIN32
+	fprintf(script, "echo %s\"%s/vacuumdb\" %s $VACUUMDB_OPTS --all %s%s\n", ECHO_QUOTE,
 			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
+			/* Did we copy the free space files? */
 			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
 			"--analyze-only" : "--analyze", ECHO_QUOTE);
+#else
+	fprintf(script, "echo %s\"%s\\vacuumdb.exe\" %s %%VACUUMDB_OPTS%% --all %s%s\n", ECHO_QUOTE,
+			new_cluster.bindir, user_specification.data,
+			/* Did we copy the free space files? */
+			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+			"--analyze-only" : "--analyze", ECHO_QUOTE);
+#endif
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+#ifndef WIN32
+	fprintf(script, "\"%s/vacuumdb\" %s $VACUUMDB_OPTS --all --analyze-in-stages\n",
 			new_cluster.bindir, user_specification.data);
+#else
+	fprintf(script, "\"%s\\vacuumdb.exe\" %s %%VACUUMDB_OPTS%% --all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification.data);
+#endif
 	/* Did we copy the free space files? */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
 		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-- 
2.17.2



Re: pg_upgrade: Pass -j down to vacuumdb

2019-03-12 Thread Robert Haas
On Mon, Mar 11, 2019 at 10:05 PM Jamison, Kirk  wrote:
> I was thinking of something like the attached,

+ machine. Note that this option isn't passed to the
+ vacuumdb application by default.

The words 'by default' should be removed here, because there is also
no non-default way to get that behavior, either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



RE: pg_upgrade: Pass -j down to vacuumdb

2019-03-11 Thread Jamison, Kirk
Hi Jesper,

Sorry I almost completely forgot to get back to you on this.
Actually your patch works when I tested it before,
and I understand the intention.
. 
Although a point was raised by other developers by making
--jobs optional in the suggested line by using the env variable instead.

> > Kirk, can you provide a delta patch to match what you are thinking ?

I was thinking of something like the attached, 
but the user needs to edit the variable value though. 
I am not sure how to implement the suggestion to allow the 
script to absorb a value from the environment,
so that it doesn't need to be edited at all.

Regards,
Kirk Jamison


v8_0001-Use-vacuumdb-opt-env-variable-to-pass-vacuumdb-options.patch
Description: v8_0001-Use-vacuumdb-opt-env-variable-to-pass-vacuumdb-options.patch


analyze_new_cluster.sh
Description: analyze_new_cluster.sh


RE: pg_upgrade: Pass -j down to vacuumdb

2019-02-03 Thread Jamison, Kirk
Hi, 

On February 1, 2019 8:14 PM +, Jesper Pedersen wrote:

> On 2/1/19 4:58 AM, Alvaro Herrera wrote:
>> On 2019-Feb-01, Jamison, Kirk wrote:
>>> I'm not sure if misunderstood the purpose of $VACUUMDB_OPTS. I 
>>> thought what the other developers suggested is to utilize it so that 
>>> --jobs for vacuumdb would be optional and just based from user-specified 
>>> option $VACUUMDB_OPTS.
>>> By which it would not utilize the amount of jobs used for pg_upgrade.
>>> To address your need of needing a parallel, the script would just 
>>> then suggest if the user wants a --jobs option. The if/else for 
>>> number of jobs is not needed in that case, maybe only for ifndef WIN32 else 
>>> case.
>> 
>> No Kirk, you got it right -- (some of) those ifdefs are not needed, as 
>> adding the --jobs in the command line is not needed.  I do think some 
>> extra whitespace in the format string is needed.


> The point of the patch is to highlight to the user that even though he/she 
> does "pg_upgrade -j 8" the "-j 8" option isn't passed down to vacuumdb.
> Default value is 1, so in that case the echo command doesn't highlight that 
> fact, otherwise it is. The user can then cancel the script and use the 
> suggested command line.
> The script then executes the vaccumdb command with the $VACUUMDB_OPTS 
> argument as noted in the documentation.
> Sample script attached as well.

Sorry I think I might have understood now where you are coming from,
where your script clarifies that it's not necessarily passed down.
If committers can let it pass, maybe it's necessary to highlight in the script,
Something like:
"If you would like default statistics as quickly as possible (e.g. run in 
parallel)..."

The difference is still perhaps your use of --jobs option
as somehow "explicitly" embedded in the code, compared to the suggestion of
making it optional by using the $VACUUMDB_OPTS variable, so that
jobs need not to be explicitly written, unless the user wants to.
(which I also think it's a better improvement because it's optional)

Some quick code check. sorry, don't have much time to write a patch for it 
today.
Perhaps you could write it similar to what you did in the --analyze-in-stages 
part.
Maybe something like (not sure if correct):
+   fprintf(script, "echo %s\"%s/vacuumdb\" %s $VACUUMDB_OPTS --all 
%s%s\n", ECHO_QUOTE,
+   new_cluster.bindir, user_specification.data,
+   /* Did we copy the free space files? */
+   (GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+   "--analyze-only" : "--analyze", ECHO_QUOTE);

I'll continue to review though from time to time.

Regards,
Kirk Jamison


Re: pg_upgrade: Pass -j down to vacuumdb

2019-02-03 Thread Andres Freund
Hi,

On 2019-02-01 14:14:27 -0500, Jesper Pedersen wrote:

> From 4b5856725af5d971a26a2ba150c1067ee21bb242 Mon Sep 17 00:00:00 2001
> From: jesperpedersen 
> Date: Fri, 21 Dec 2018 05:05:31 -0500
> Subject: [PATCH] Highlight that the --jobs option isn't passed down to
>  vacuumdb by default.

This is currently marked as waiting for author, which doesn't seem
accurate. I've set it to needs review, and moved it to the next
commitfest - please revert if that's not accurate.

- Andres



Re: pg_upgrade: Pass -j down to vacuumdb

2019-02-01 Thread Jesper Pedersen

Hi,

On 2/1/19 4:58 AM, Alvaro Herrera wrote:

On 2019-Feb-01, Jamison, Kirk wrote:


I'm not sure if misunderstood the purpose of $VACUUMDB_OPTS. I thought what
the other developers suggested is to utilize it so that --jobs for vacuumdb
would be optional and just based from user-specified option $VACUUMDB_OPTS.
By which it would not utilize the amount of jobs used for pg_upgrade.
To address your need of needing a parallel, the script would just then
suggest if the user wants a --jobs option. The if/else for number of jobs is
not needed in that case, maybe only for ifndef WIN32 else case.


No Kirk, you got it right -- (some of) those ifdefs are not needed, as
adding the --jobs in the command line is not needed.  I do think some
extra whitespace in the format string is needed.



I'm confused by this.

The point of the patch is to highlight to the user that even though 
he/she does "pg_upgrade -j 8" the "-j 8" option isn't passed down to 
vacuumdb.


Default value is 1, so in that case the echo command doesn't highlight 
that fact, otherwise it is. The user can then cancel the script and use 
the suggested command line.


The script then executes the vaccumdb command with the $VACUUMDB_OPTS 
argument as noted in the documentation.


Sample script attached as well.

I added extra space in the --analyze-in-stages part.

Kirk, can you provide a delta patch to match what you are thinking ?

Best regards,
 Jesper

>From 4b5856725af5d971a26a2ba150c1067ee21bb242 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 21 Dec 2018 05:05:31 -0500
Subject: [PATCH] Highlight that the --jobs option isn't passed down to
 vacuumdb by default.

Author: Jesper Pedersen 
---
 doc/src/sgml/ref/pgupgrade.sgml |  6 +-
 src/bin/pg_upgrade/check.c  | 28 ++--
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7e1afaf0a5..84a7ccb6f1 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -395,7 +395,11 @@ NET STOP postgresql-
  in parallel;  a good place to start is the maximum of the number of
  CPU cores and tablespaces.  This option can dramatically reduce the
  time to upgrade a multi-database server running on a multiprocessor
- machine.
+ machine. Note that this option isn't passed to the
+ vacuumdb application by default.
+ The system environment variable VACUUMDB_OPTS can be
+ used to pass extra options to the vacuumdb
+ application during the script execution.
 
 
 
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..570b7c9ae7 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -495,15 +495,31 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 			ECHO_QUOTE, ECHO_QUOTE);
 	fprintf(script, "echo %sthis script and run:%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
-			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
-			"--analyze-only" : "--analyze", ECHO_QUOTE);
+	if (user_opts.jobs <= 1)
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
+	else
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--jobs %d --all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data, user_opts.jobs,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+#ifndef WIN32
+	fprintf(script, "\"%s/vacuumdb\" %s $VACUUMDB_OPTS --all --analyze-in-stages\n",
 			new_cluster.bindir, user_specification.data);
+#else
+	fprintf(script, "\"%s\\vacuumdb.exe\" %s %%VACUUMDB_OPTS%% --all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification.data);
+#endif
 	/* Did we copy the free space files? */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
 		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-- 
2.17.2



analyze_new_cluster.sh
Description: application/shellscript


Re: pg_upgrade: Pass -j down to vacuumdb

2019-02-01 Thread Alvaro Herrera
On 2019-Feb-01, Jamison, Kirk wrote:

> I'm not sure if misunderstood the purpose of $VACUUMDB_OPTS. I thought what
> the other developers suggested is to utilize it so that --jobs for vacuumdb
> would be optional and just based from user-specified option $VACUUMDB_OPTS.
> By which it would not utilize the amount of jobs used for pg_upgrade.
> To address your need of needing a parallel, the script would just then
> suggest if the user wants a --jobs option. The if/else for number of jobs is
> not needed in that case, maybe only for ifndef WIN32 else case.

No Kirk, you got it right -- (some of) those ifdefs are not needed, as
adding the --jobs in the command line is not needed.  I do think some
extra whitespace in the format string is needed.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: pg_upgrade: Pass -j down to vacuumdb

2019-02-01 Thread Jamison, Kirk
On January 31, 2019, 9:29PM +, Jesper Pedersen wrote:

>>> I added most of the documentation back, as requested by Kirk.
>> 
>> Actually, I find it useful to be documented. However, major contributors 
>> have higher opinions in terms of experience, so I think it's alright with me 
>> not to include the doc part if they finally say so. 
>
> I think we need to leave it to the Committer to decide, as both Peter and 
> Michael are committers; provided the patch reaches RfC.

Agreed.

>>> 1) You still enforce -j to use the number of jobs that the caller of 
>>> pg_upgrade provides, and we agreed that both things are separate 
>>> concepts upthread, no?  What has been suggested by Alvaro is to add 
>>> a comment so as one can use VACUUM_OPTS with -j optionally, instead 
>>> of suggesting a full-fledged vacuumdb command which depends on what 
>>> pg_upgrade uses.  So there is no actual need for the if/else 
>>> complication business.
> 
>> I think it is ok for the echo command to highlight to the user that 
>> running --analyze-only using the same amount of jobs will give a faster 
>> result.

Since you used user_opts.jobs (which is coming from pg_upgrade, is it not?),
could you clarify more the statement above? Or did you mean somehow that
it won't be a problem for vacuumdb to use the same?
Though correctness-wise is arguable, if the committers can let it pass
from your answer, then I think it's alright.

I'm not sure if misunderstood the purpose of $VACUUMDB_OPTS. I thought what
the other developers suggested is to utilize it so that --jobs for vacuumdb
would be optional and just based from user-specified option $VACUUMDB_OPTS.
By which it would not utilize the amount of jobs used for pg_upgrade.
To address your need of needing a parallel, the script would just then
suggest if the user wants a --jobs option. The if/else for number of jobs is
not needed in that case, maybe only for ifndef WIN32 else case.

Regards,
Kirk Jamison




Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-31 Thread Jesper Pedersen

Hi,

On 1/30/19 7:59 PM, Jamison, Kirk wrote:

I added most of the documentation back, as requested by Kirk.


Actually, I find it useful to be documented. However, major contributors have 
higher opinions in terms of experience, so I think it's alright with me not to 
include the doc part if they finally say so.



I think we need to leave it to the Committer to decide, as both Peter  
and Michael are committers; provided the patch reaches RfC.



It seems to me that the latest patch sent is incorrect for multiple
reasons:
1) You still enforce -j to use the number of jobs that the caller of
pg_upgrade provides, and we agreed that both things are separate
concepts upthread, no?  What has been suggested by Alvaro is to add a
comment so as one can use VACUUM_OPTS with -j optionally, instead of
suggesting a full-fledged vacuumdb command which depends on what
pg_upgrade uses.  So there is no actual need for the if/else
complication business.



I think it is ok for the echo command to highlight to the user that running 
--analyze-only using the same amount of jobs will give a faster result.


I missed that part.
IIUC, using the $VACUUMDB_OPTS variable would simplify and correct the usage of 
jobs for vacuumdb. (pg_upgrade jobs is not equal to vacuumdb jobs) Plus, it 
might simplify and reduce the number of additional lines.
Tom Lane also suggested above to make the script absorb the value from env.
Would that address your concern of getting a faster result, yes?



The actual line in the script executed is using $VACUUMDB_OPTS at the  
moment, so could you expand on the above ? The 'if' could be removed if  
we assume that pg_upgrade would only be used with server > 9.5, but to  
be safer I would leave it in, as it is only console output.


Best regards,
 Jesper



RE: pg_upgrade: Pass -j down to vacuumdb

2019-01-30 Thread Jamison, Kirk
On January 29, 2019 8:19 PM +, Jesper Pedersen wrote:
>On 1/29/19 12:08 AM, Michael Paquier wrote:
>> I would document the optional VACUUM_OPTS on the page of pg_upgrade.
>> If Peter thinks it is fine to not do so, that's fine for me as well.
>> 
..
>I added most of the documentation back, as requested by Kirk.

Actually, I find it useful to be documented. However, major contributors have 
higher opinions in terms of experience, so I think it's alright with me not to 
include the doc part if they finally say so.

>> It seems to me that the latest patch sent is incorrect for multiple
>> reasons:
>> 1) You still enforce -j to use the number of jobs that the caller of 
>> pg_upgrade provides, and we agreed that both things are separate 
>> concepts upthread, no?  What has been suggested by Alvaro is to add a 
>> comment so as one can use VACUUM_OPTS with -j optionally, instead of 
>> suggesting a full-fledged vacuumdb command which depends on what 
>> pg_upgrade uses.  So there is no actual need for the if/else 
>> complication business.

> I think it is ok for the echo command to highlight to the user that running 
> --analyze-only using the same amount of jobs will give a faster result.

I missed that part. 
IIUC, using the $VACUUMDB_OPTS variable would simplify and correct the usage of 
jobs for vacuumdb. (pg_upgrade jobs is not equal to vacuumdb jobs) Plus, it 
might simplify and reduce the number of additional lines.
Tom Lane also suggested above to make the script absorb the value from env.
Would that address your concern of getting a faster result, yes?


Regards,
Kirk Jamison




Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-29 Thread Jesper Pedersen

Hi,

On 1/29/19 12:08 AM, Michael Paquier wrote:

On Tue, Jan 29, 2019 at 12:35:30AM +, Jamison, Kirk wrote:

I just checked the patch.
As per advice, you removed the versioning and specified --jobs.
The patch still applies, builds and passed the tests successfully.


I would document the optional VACUUM_OPTS on the page of pg_upgrade.
If Peter thinks it is fine to not do so, that's fine for me as well.



I added most of the documentation back, as requested by Kirk.


It seems to me that the latest patch sent is incorrect for multiple
reasons:
1) You still enforce -j to use the number of jobs that the caller of
pg_upgrade provides, and we agreed that both things are separate
concepts upthread, no?  What has been suggested by Alvaro is to add a
comment so as one can use VACUUM_OPTS with -j optionally, instead of
suggesting a full-fledged vacuumdb command which depends on what
pg_upgrade uses.  So there is no actual need for the if/else
complication business.


I think it is ok for the echo command to highlight to the user that 
running --analyze-only using the same amount of jobs will give a faster 
result.



2) Perhaps we need to worry about the second vacuumdb --all command,
which may want custom options, which are not necessarily the same as
the options of the first command?  I don't think we need to care as it
applies only to an upgraded cluster using something older 8.4, just
wondering..


I think that --all --analyze-in-stages is what most people want. And 
with the $VACUUMDB_OPTS variable people have an option to put in more, 
such as -j X.


Best regards,
 Jesper


>From b02e1f414e309361595ab3fe84fb6f66bcf63265 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 21 Dec 2018 05:05:31 -0500
Subject: [PATCH] Highlight that the --jobs option isn't passed down to
 vacuumdb by default.

Author: Jesper Pedersen 
---
 doc/src/sgml/ref/pgupgrade.sgml |  6 +-
 src/bin/pg_upgrade/check.c  | 28 ++--
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7e1afaf0a5..84a7ccb6f1 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -395,7 +395,11 @@ NET STOP postgresql-
  in parallel;  a good place to start is the maximum of the number of
  CPU cores and tablespaces.  This option can dramatically reduce the
  time to upgrade a multi-database server running on a multiprocessor
- machine.
+ machine. Note that this option isn't passed to the
+ vacuumdb application by default.
+ The system environment variable VACUUMDB_OPTS can be
+ used to pass extra options to the vacuumdb
+ application during the script execution.
 
 
 
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..847e604088 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -495,15 +495,31 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 			ECHO_QUOTE, ECHO_QUOTE);
 	fprintf(script, "echo %sthis script and run:%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
-			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
-			"--analyze-only" : "--analyze", ECHO_QUOTE);
+	if (user_opts.jobs <= 1)
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
+	else
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--jobs %d --all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data, user_opts.jobs,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+#ifndef WIN32
+	fprintf(script, "\"%s/vacuumdb\" %s$VACUUMDB_OPTS --all --analyze-in-stages\n",
 			new_cluster.bindir, user_specification.data);
+#else
+	fprintf(script, "\"%s\\vacuumdb.exe\" %s%%VACUUMDB_OPTS%% --all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification.data);
+#endif
 	/* Did we copy the free space files? */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
 		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-- 
2.17.2



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-28 Thread Michael Paquier
On Tue, Jan 29, 2019 at 12:35:30AM +, Jamison, Kirk wrote:
> I just checked the patch.
> As per advice, you removed the versioning and specified --jobs. 
> The patch still applies, builds and passed the tests successfully.

I would document the optional VACUUM_OPTS on the page of pg_upgrade.
If Peter thinks it is fine to not do so, that's fine for me as well.

It seems to me that the latest patch sent is incorrect for multiple
reasons:
1) You still enforce -j to use the number of jobs that the caller of
pg_upgrade provides, and we agreed that both things are separate
concepts upthread, no?  What has been suggested by Alvaro is to add a
comment so as one can use VACUUM_OPTS with -j optionally, instead of
suggesting a full-fledged vacuumdb command which depends on what
pg_upgrade uses.  So there is no actual need for the if/else
complication business.
2) Perhaps we need to worry about the second vacuumdb --all command,
which may want custom options, which are not necessarily the same as
the options of the first command?  I don't think we need to care as it
applies only to an upgraded cluster using something older 8.4, just
wondering..
--
Michael


signature.asc
Description: PGP signature


RE: pg_upgrade: Pass -j down to vacuumdb

2019-01-28 Thread Jamison, Kirk
Hi Jesper,

Thanks for updating your patches quickly.

>On 1/28/19 3:50 PM, Peter Eisentraut wrote:
>>> Done, and v4 attached.
>> 
>> I would drop the changes in pgupgrade.sgml.  We don't need to explain 
>> what doesn't happen, when nobody before now ever thought that it would 
>> happen.
>> 
>> Also, we don't need the versioning stuff.  The new cluster in 
>> pg_upgrade is always of the current version, and we know what that one 
>> supports.
>> 

>Done.

I just checked the patch.
As per advice, you removed the versioning and specified --jobs. 
The patch still applies, builds and passed the tests successfully.

I'd argue though that it's helpful to find a short documentation to clarify
that it's not pass down by to vacuumdb default. I think the initial doc 
change from the previous patch would be good.
+ machine. Note, that this option isn't passed to the
+ vacuumdb application by default.
Just remove the comma symbol from "Note, that..."


Regards,
Kirk Jamison


Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-28 Thread Jesper Pedersen

On 1/28/19 3:50 PM, Peter Eisentraut wrote:

Done, and v4 attached.


I would drop the changes in pgupgrade.sgml.  We don't need to explain
what doesn't happen, when nobody before now ever thought that it would
happen.

Also, we don't need the versioning stuff.  The new cluster in pg_upgrade
is always of the current version, and we know what that one supports.



Done.

Best regards,
 Jesper
>From 5b51f927dff808b4a8516b424c2ef790dbe96da6 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 21 Dec 2018 05:05:31 -0500
Subject: [PATCH] Highlight that the --jobs option isn't passed down to
 vacuumdb by default.

Author: Jesper Pedersen 
---
 src/bin/pg_upgrade/check.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..847e604088 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -495,15 +495,31 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 			ECHO_QUOTE, ECHO_QUOTE);
 	fprintf(script, "echo %sthis script and run:%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
-			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
-			"--analyze-only" : "--analyze", ECHO_QUOTE);
+	if (user_opts.jobs <= 1)
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
+	else
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--jobs %d --all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data, user_opts.jobs,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+#ifndef WIN32
+	fprintf(script, "\"%s/vacuumdb\" %s$VACUUMDB_OPTS --all --analyze-in-stages\n",
 			new_cluster.bindir, user_specification.data);
+#else
+	fprintf(script, "\"%s\\vacuumdb.exe\" %s%%VACUUMDB_OPTS%% --all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification.data);
+#endif
 	/* Did we copy the free space files? */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
 		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-- 
2.17.2



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-28 Thread Peter Eisentraut
On 28/01/2019 17:47, Jesper Pedersen wrote:
> On 1/28/19 11:30 AM, Fabrízio de Royes Mello wrote:
>> IMHO you should use long option name '--jobs' like others.
>>
> 
> Thanks for your feedback !
> 
> Done, and v4 attached.

I would drop the changes in pgupgrade.sgml.  We don't need to explain
what doesn't happen, when nobody before now ever thought that it would
happen.

Also, we don't need the versioning stuff.  The new cluster in pg_upgrade
is always of the current version, and we know what that one supports.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-28 Thread Jesper Pedersen

On 1/28/19 11:30 AM, Fabrízio de Royes Mello wrote:

IMHO you should use long option name '--jobs' like others.



Thanks for your feedback !

Done, and v4 attached.

Best regards,
 Jesper
>From 142b4723f433f39d0eed2ced4beccc3721451103 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 21 Dec 2018 05:05:31 -0500
Subject: [PATCH] Highlight that the --jobs option isn't passed down to
 vacuumdb by default.

Author: Jesper Pedersen 
---
 doc/src/sgml/ref/pgupgrade.sgml |  6 +-
 src/bin/pg_upgrade/check.c  | 28 ++--
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7e1afaf0a5..b1da0cc7a2 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -395,7 +395,11 @@ NET STOP postgresql-
  in parallel;  a good place to start is the maximum of the number of
  CPU cores and tablespaces.  This option can dramatically reduce the
  time to upgrade a multi-database server running on a multiprocessor
- machine.
+ machine. Note, that this option isn't passed to the
+ vacuumdb application by default.
+ The system environment variable VACUUMDB_OPTS can be
+ used to pass extra options to the vacuumdb
+ application during the script execution such as --jobs.
 
 
 
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..3b478ef95f 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -495,15 +495,31 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 			ECHO_QUOTE, ECHO_QUOTE);
 	fprintf(script, "echo %sthis script and run:%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
-			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
-			"--analyze-only" : "--analyze", ECHO_QUOTE);
+	if (user_opts.jobs <= 1 || GET_MAJOR_VERSION(new_cluster.major_version) < 905)
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
+	else
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--jobs %d --all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data, user_opts.jobs,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+#ifndef WIN32
+	fprintf(script, "\"%s/vacuumdb\" %s$VACUUMDB_OPTS --all --analyze-in-stages\n",
 			new_cluster.bindir, user_specification.data);
+#else
+	fprintf(script, "\"%s\\vacuumdb.exe\" %s%%VACUUMDB_OPTS%% --all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification.data);
+#endif
 	/* Did we copy the free space files? */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
 		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-- 
2.17.2



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-28 Thread Fabrízio de Royes Mello
On Mon, Jan 28, 2019 at 1:15 PM Jesper Pedersen 
wrote:
>
> Done.
>
> Attached is v3.
>

IMHO you should use long option name '--jobs' like others.

Regards,

--
   Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-28 Thread Jesper Pedersen

Hi,

On 1/27/19 7:50 PM, Jamison, Kirk wrote:

There were also helpful comments from the developers above,
pointing it to the right direction.
In addition to Peter's comment, quoting "...if you want to do this
as fast as possible, don't use this script.  That comment could be
enhanced to suggest the use of the -j option.", so I think you
should also update the following script in create_script_for_cluster_analyze():
  fprintf(script, "echo %sIf you would like default statistics as quickly as 
possible, cancel%s\n",
  ECHO_QUOTE, ECHO_QUOTE);



Yes, it will echo the -j option if passed, and the server supports it.


Alvaro Herrera  writes:

So let's have it write with a $VACUUMDB_OPTS variable, which is by
default defined as empty but with a comment suggesting that maybe the
user wants to add the -j option.  This way, if they have to edit it,
they only have to edit the VACUUMDB_OPTS line instead of each of the
two vacuumdb lines.



Tom Lane wrote:

Even better, allow the script to absorb a value from the environment, so that 
it needn't be edited at all.




Done.

Attached is v3.

Best regards,
 Jesper
>From 1cba8299ae328942a4bf623d841a80b1b6043d3a Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 21 Dec 2018 05:05:31 -0500
Subject: [PATCH] Highlight that the -j option isn't passed down to vacuumdb by
 default.

Author: Jesper Pedersen 
---
 doc/src/sgml/ref/pgupgrade.sgml |  6 +-
 src/bin/pg_upgrade/check.c  | 28 ++--
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7e1afaf0a5..b1da0cc7a2 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -395,7 +395,11 @@ NET STOP postgresql-
  in parallel;  a good place to start is the maximum of the number of
  CPU cores and tablespaces.  This option can dramatically reduce the
  time to upgrade a multi-database server running on a multiprocessor
- machine.
+ machine. Note, that this option isn't passed to the
+ vacuumdb application by default.
+ The system environment variable VACUUMDB_OPTS can be
+ used to pass extra options to the vacuumdb
+ application during the script execution such as --jobs.
 
 
 
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..5114df01b5 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -495,15 +495,31 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 			ECHO_QUOTE, ECHO_QUOTE);
 	fprintf(script, "echo %sthis script and run:%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
-			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
-			"--analyze-only" : "--analyze", ECHO_QUOTE);
+	if (user_opts.jobs <= 1 || GET_MAJOR_VERSION(new_cluster.major_version) < 905)
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
+	else
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s-j %d --all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data, user_opts.jobs,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
-	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
+#ifndef WIN32
+	fprintf(script, "\"%s/vacuumdb\" %s$VACUUMDB_OPTS --all --analyze-in-stages\n",
 			new_cluster.bindir, user_specification.data);
+#else
+	fprintf(script, "\"%s\\vacuumdb.exe\" %s%%VACUUMDB_OPTS%% --all --analyze-in-stages\n",
+			new_cluster.bindir, user_specification.data);
+#endif
 	/* Did we copy the free space files? */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) < 804)
 		fprintf(script, "\"%s/vacuumdb\" %s--all\n", new_cluster.bindir,
-- 
2.17.2



RE: pg_upgrade: Pass -j down to vacuumdb

2019-01-27 Thread Jamison, Kirk
Hi Jesper,

On Friday, January 25, 2019, Jesper Pedersen  
> Thanks for your feedback !
>
> As per Peter's comments I have changed the patch (v2) to not pass down the -j 
> option to vacuumdb.
>
> Only an update to the documentation and console output is made in order to 
> make it more clear.

Yeah, I understood from the references that the parallel option use
for vacuumdb and pg_upgrade are different. I was also getting
clarification whether pg_upgrade gets to pass it down to vacuumdb.
Based from other Michael's and Peter's replies, I now understand it.

There were also helpful comments from the developers above,
pointing it to the right direction.
In addition to Peter's comment, quoting "...if you want to do this
as fast as possible, don't use this script.  That comment could be
enhanced to suggest the use of the -j option.", so I think you 
should also update the following script in create_script_for_cluster_analyze():
 fprintf(script, "echo %sIf you would like default statistics as quickly as 
possible, cancel%s\n",
 ECHO_QUOTE, ECHO_QUOTE);

There were also advice from Tom and Alvaro that you can incorporate
further in your next patch.

>Alvaro Herrera  writes:
>> So let's have it write with a $VACUUMDB_OPTS variable, which is by 
>> default defined as empty but with a comment suggesting that maybe the 
>> user wants to add the -j option.  This way, if they have to edit it, 
>> they only have to edit the VACUUMDB_OPTS line instead of each of the 
>> two vacuumdb lines.
>
Tom Lane wrote:
>Even better, allow the script to absorb a value from the environment, so that 
>it needn't be edited at all.

Regards,
Kirk Jamison


Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-27 Thread Michael Paquier
On Fri, Jan 25, 2019 at 12:16:49PM -0500, Tom Lane wrote:
> Alvaro Herrera  writes:
>> So let's have it write with a $VACUUMDB_OPTS variable, which is by
>> default defined as empty but with a comment suggesting that maybe the
>> user wants to add the -j option.  This way, if they have to edit it,
>> they only have to edit the VACUUMDB_OPTS line instead of each of the two
>> vacuumdb lines.
> 
> Even better, allow the script to absorb a value from the environment,
> so that it needn't be edited at all.

+1.
--
Michael


signature.asc
Description: PGP signature


Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-25 Thread Tom Lane
Alvaro Herrera  writes:
> So let's have it write with a $VACUUMDB_OPTS variable, which is by
> default defined as empty but with a comment suggesting that maybe the
> user wants to add the -j option.  This way, if they have to edit it,
> they only have to edit the VACUUMDB_OPTS line instead of each of the two
> vacuumdb lines.

Even better, allow the script to absorb a value from the environment,
so that it needn't be edited at all.

regards, tom lane



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-25 Thread Alvaro Herrera
On 2019-Jan-25, Peter Eisentraut wrote:

> On 25/01/2019 11:28, Michael Paquier wrote:
> > based on that linking the value used by pg_upgrade and vacuumdb is a
> > bad concept in my opinion, and the patch should be rejected.  More
> > documentation on pg_upgrade side to explain that a bit better could be
> > a good idea though, as it is perfectly possible to use your own
> > post-upgrade script or rewrite partially the generated one.
> 
> Right.  pg_upgrade doesn't actually call vacuumdb.  It creates a script
> that you may use.  The script itself contains a comment that says, if
> you want to do this as fast as possible, don't use this script.  That
> comment could be enhanced to suggest the use of the -j option.

So let's have it write with a $VACUUMDB_OPTS variable, which is by
default defined as empty but with a comment suggesting that maybe the
user wants to add the -j option.  This way, if they have to edit it,
they only have to edit the VACUUMDB_OPTS line instead of each of the two
vacuumdb lines.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-25 Thread Jesper Pedersen

Hi Kirk,

On 1/24/19 9:31 PM, Jamison, Kirk wrote:

According to CF app, this patch needs review so I took a look at it.
Currently, your patch applies and builds cleanly, and all tests are also 
successful
based from the CF bot patch tester.

I'm not sure if I have understood the argument raised by Peter correctly.
Quoting Peter, "it's not clear that pg_upgrade and vacuumdb are bound the same way, 
so it's not a given that the same -j number should be used."
I think it's whether the # jobs for pg_upgrade is used the same way for 
parallel vacuum.

According to the official docs, the recommended setting for pg_upgrade --j 
option is the maximum of the number of CPU cores and tablespaces. [1]
As for vacuumdb, parallel vacuum's (-j) recommended setting is based from your 
max_connections [2], which is the max # of concurrent connections to your db 
server.



Thanks for your feedback !

As per Peter's comments I have changed the patch (v2) to not pass down 
the -j option to vacuumdb.


Only an update to the documentation and console output is made in order 
to make it more clear.


Best regards,
 Jesper



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-25 Thread Peter Eisentraut
On 25/01/2019 11:28, Michael Paquier wrote:
> based on that linking the value used by pg_upgrade and vacuumdb is a
> bad concept in my opinion, and the patch should be rejected.  More
> documentation on pg_upgrade side to explain that a bit better could be
> a good idea though, as it is perfectly possible to use your own
> post-upgrade script or rewrite partially the generated one.

Right.  pg_upgrade doesn't actually call vacuumdb.  It creates a script
that you may use.  The script itself contains a comment that says, if
you want to do this as fast as possible, don't use this script.  That
comment could be enhanced to suggest the use of the -j option.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-25 Thread Michael Paquier
On Fri, Jan 25, 2019 at 02:31:57AM +, Jamison, Kirk wrote:
> I'm not sure if I have understood the argument raised by Peter
> correctly.  Quoting Peter, "it's not clear that pg_upgrade and
> vacuumdb are bound the same way, so it's not a given that the same
> -j number should be used."  I think it's whether the # jobs for
> pg_upgrade is used the same way for parallel vacuum.

vacuumdb and pg_upgrade are designed for different purposes and have
different properties, hence using a value of -j for one does not
necessarily mean that the same value should be used for the other.
An ANALYZE is nice because it is non-intrusive for a live production
system, and if you begin to pass down a -j argument you may finish by
eating more resources that would have been necessary for the same
job.  For some users perhaps that matters, for some it does not.  So
based on that linking the value used by pg_upgrade and vacuumdb is a
bad concept in my opinion, and the patch should be rejected.  More
documentation on pg_upgrade side to explain that a bit better could be
a good idea though, as it is perfectly possible to use your own
post-upgrade script or rewrite partially the generated one.
--
Michael


signature.asc
Description: PGP signature


RE: pg_upgrade: Pass -j down to vacuumdb

2019-01-24 Thread Jamison, Kirk
Hi,

According to CF app, this patch needs review so I took a look at it.
Currently, your patch applies and builds cleanly, and all tests are also 
successful
based from the CF bot patch tester.

I'm not sure if I have understood the argument raised by Peter correctly. 
Quoting Peter, "it's not clear that pg_upgrade and vacuumdb are bound the same 
way, so it's not a given that the same -j number should be used."
I think it's whether the # jobs for pg_upgrade is used the same way for 
parallel vacuum. 

According to the official docs, the recommended setting for pg_upgrade --j 
option is the maximum of the number of CPU cores and tablespaces. [1]
As for vacuumdb, parallel vacuum's (-j) recommended setting is based from your 
max_connections [2], which is the max # of concurrent connections to your db 
server.

[1] https://www.postgresql.org/docs/current/pgupgrade.html
[2] https://www.postgresql.org/docs/current/app-vacuumdb.html


Regards,
Kirk Jamison


Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-07 Thread Jesper Pedersen

Hi Peter,

On 1/3/19 7:03 AM, Peter Eisentraut wrote:

Perhaps more documentation would be useful here.



Here is v2 that just notes that the -j option isn't passed down.

Best regards,
 Jesper
>From fe0be6c1f5cbcaeb2981ff4dcfceae2e2cb286b7 Mon Sep 17 00:00:00 2001
From: jesperpedersen 
Date: Fri, 21 Dec 2018 05:05:31 -0500
Subject: [PATCH] Highlight that the -j option isn't passed down to vacuumdb by
 default.

Author: Jesper Pedersen 
---
 doc/src/sgml/ref/pgupgrade.sgml |  3 ++-
 src/bin/pg_upgrade/check.c  | 21 -
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
index 7e1afaf0a5..937e95688d 100644
--- a/doc/src/sgml/ref/pgupgrade.sgml
+++ b/doc/src/sgml/ref/pgupgrade.sgml
@@ -395,7 +395,8 @@ NET STOP postgresql-
  in parallel;  a good place to start is the maximum of the number of
  CPU cores and tablespaces.  This option can dramatically reduce the
  time to upgrade a multi-database server running on a multiprocessor
- machine.
+ machine. Note, that this option isn't passed to the
+ vacuumdb application by default.
 
 
 
diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index fc5aa7010f..7bae43a0b1 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -495,11 +495,22 @@ create_script_for_cluster_analyze(char **analyze_script_file_name)
 			ECHO_QUOTE, ECHO_QUOTE);
 	fprintf(script, "echo %sthis script and run:%s\n",
 			ECHO_QUOTE, ECHO_QUOTE);
-	fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
-			new_cluster.bindir, user_specification.data,
-	/* Did we copy the free space files? */
-			(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
-			"--analyze-only" : "--analyze", ECHO_QUOTE);
+	if (user_opts.jobs <= 1 || GET_MAJOR_VERSION(new_cluster.major_version) < 905)
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s--all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
+	else
+	{
+		fprintf(script, "echo %s\"%s/vacuumdb\" %s-j %d --all %s%s\n", ECHO_QUOTE,
+new_cluster.bindir, user_specification.data, user_opts.jobs,
+/* Did we copy the free space files? */
+(GET_MAJOR_VERSION(old_cluster.major_version) >= 804) ?
+"--analyze-only" : "--analyze", ECHO_QUOTE);
+	}
 	fprintf(script, "echo%s\n\n", ECHO_BLANK);
 
 	fprintf(script, "\"%s/vacuumdb\" %s--all --analyze-in-stages\n",
-- 
2.17.2



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-03 Thread Jerry Sievers
Peter Eisentraut  writes:

> On 02/01/2019 20:47, Jesper Pedersen wrote:
>
>> Well, that really depends. The user passed -j to pg_upgrade in order for 
>> the upgrade to happen faster, so maybe they would expect, as I would, 
>> that the ANALYZE phase would happen in parallel too.
>
> pg_upgrade -j reduces the *downtime* caused by pg_upgrade during the
> upgrade process.  Also, during said downtime, nothing else is happening,
> so you can use all the resources of the machine.
>
> Once the system is back up, you don't necessarily want to use all the
> resources.  The analyze script is specifically written to run while
> production traffic is active.  If you just want to run the analyze as
> fast as possible, you can just run vacuumdb -j directly, without using
> the script.

Peter, I'm skeptical here.

I might permit a connection to a just pg_upgraded DB prior to any
analyze being known finished only for the most trivial case.  At my site
however, *trivial* systems are a small minority.

In fact, our automated upgrade workflow uses our home-built parallel
analyzer which predates vacuumdb -j.  Apps are not allowed into the DB
until a fast 1st pass has been done.

We run it in 2 phases...

$all preceeding upgrade steps w/system locked out
analyze-lite (reduced stats target)
open DB for application traffic
analyze-full

Of course we are increasing downtime by disallowing app traffic till
finish of analyze-lite however the assumption is that many queries would
be too slow to attempt without full analyzer coverage, albiet at a
reduced stats target.

IMO this is certainly a case of no 1-size-fits-all solution so perhaps a
--analyze-jobs option :-)

FWIW

Thanks

> Moreover, it's not clear that pg_upgrade and vacuumdb are bound the same
> way, so it's not a given that the same -j number should be used.
>
> Perhaps more documentation would be useful here.

-- 
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres.consult...@comcast.net



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-03 Thread Peter Eisentraut
On 02/01/2019 20:47, Jesper Pedersen wrote:
> Well, that really depends. The user passed -j to pg_upgrade in order for 
> the upgrade to happen faster, so maybe they would expect, as I would, 
> that the ANALYZE phase would happen in parallel too.

pg_upgrade -j reduces the *downtime* caused by pg_upgrade during the
upgrade process.  Also, during said downtime, nothing else is happening,
so you can use all the resources of the machine.

Once the system is back up, you don't necessarily want to use all the
resources.  The analyze script is specifically written to run while
production traffic is active.  If you just want to run the analyze as
fast as possible, you can just run vacuumdb -j directly, without using
the script.

Moreover, it's not clear that pg_upgrade and vacuumdb are bound the same
way, so it's not a given that the same -j number should be used.

Perhaps more documentation would be useful here.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_upgrade: Pass -j down to vacuumdb

2019-01-02 Thread Jesper Pedersen

Hi,

On 12/29/18 10:03 AM, Peter Eisentraut wrote:

On 21/12/2018 11:12, Jesper Pedersen wrote:

Here is a patch that passes the -j option from pg_upgrade down to
vacuumdb if supported.


It's not clear to me that that is what is usually wanted.

The point of the post-upgrade analyze script is specifically to do the
analyze in a gentle fashion.  Mixing in parallelism would seem to defeat
that a bit.



Well, that really depends. The user passed -j to pg_upgrade in order for 
the upgrade to happen faster, so maybe they would expect, as I would, 
that the ANALYZE phase would happen in parallel too.


At least there should be a "notice" about what the script will do in 
this case.


Best regards,
 Jesper



Re: pg_upgrade: Pass -j down to vacuumdb

2018-12-29 Thread Peter Eisentraut
On 21/12/2018 11:12, Jesper Pedersen wrote:
> Here is a patch that passes the -j option from pg_upgrade down to 
> vacuumdb if supported.

It's not clear to me that that is what is usually wanted.

The point of the post-upgrade analyze script is specifically to do the
analyze in a gentle fashion.  Mixing in parallelism would seem to defeat
that a bit.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services