Re: [PATCHES] pg_dump additional options for performance

2008-07-19 Thread Simon Riggs

On Sun, 2008-07-20 at 05:47 +0100, Simon Riggs wrote:
> On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote:
> > Simon,
> > 
> >   I agree with adding these options in general, since I find myself
> >   frustrated by having to vi huge dumps to change simple schema things.
> >   A couple of comments on the patch though:
> > 
> >   - Conflicting option handling
> > I think we are doing our users a disservice by putting it on them to
> > figure out exactly what:
> > multiple object groups cannot be used together
> > means to them.  You and I may understand what an "object group" is,
> > and why there can be only one, but it's a great deal less clear than
> > the prior message of
> > options -s/--schema-only and -a/--data-only cannot be used together
> > My suggestion would be to either list out the specific options which
> > can't be used together, as was done previously, or add a bit of (I
> > realize, boring) code and actually tell the user which of the
> > conflicting options were used.
> > 
> >   - Documentation
> > When writing the documentation I would stress that "pre-schema" and
> > "post-schema" be defined in terms of PostgreSQL objects and why they
> > are pre vs. post.
> > 
> >   - Technically, the patch needs to be updated slightly since another
> > pg_dump-related patch was committed recently which also added
> > options and thus causes a conflict.
> > 
> >   Beyond those minor points, the patch looks good to me.
> 
> Thanks for the review. I'll make the changes you suggest.

Patch updated to head, plus changes/docs requested.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Index: doc/src/sgml/ref/pg_dump.sgml
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
retrieving revision 1.102
diff -c -r1.102 pg_dump.sgml
*** doc/src/sgml/ref/pg_dump.sgml	13 Apr 2008 03:49:21 -	1.102
--- doc/src/sgml/ref/pg_dump.sgml	20 Jul 2008 06:33:30 -
***
*** 133,139 
 
  Include large objects in the dump.  This is the default behavior
  except when --schema, --table, or
! --schema-only is specified, so the -b
  switch is only useful to add large objects to selective dumps.
 

--- 133,140 
 
  Include large objects in the dump.  This is the default behavior
  except when --schema, --table, or
! --schema-only or --schema-pre-load or
! 		--schema-post-load is specified, so the -b
  switch is only useful to add large objects to selective dumps.
 

***
*** 443,448 
--- 444,471 
   
  
   
+   --schema-pre-load
+   
+
+ 		Dump only the object definitions (schema) required to load data. Dumps
+ 		exactly what --schema-only would dump, but only those
+ 		statements before the data load.
+
+   
+  
+ 
+  
+   --schema-post-load
+   
+
+ 		Dump only the object definitions (schema) required after data has been
+ 		loaded. Dumps exactly what --schema-only would dump, but 
+ 		only those statements after the data load.
+
+   
+  
+ 
+  
-S username
--superuser=username

***
*** 774,779 
--- 797,830 

  

+The output of pg_dump can be notionally divided into three parts:
+
+ 
+  
+ 	  Pre-Schema - objects required before data loading, such as 
+ 	  CREATE TABLE.
+ 	  This part can be requested using --schema-pre-load.
+  
+ 
+ 
+  
+ 	  Table Data - data can be requested using --data-only.
+  
+ 
+ 
+  
+ 	  Post-Schema - objects required after data loading, such as
+ 	  ALTER TABLE and CREATE INDEX.
+ 	  This part can be requested using --schema-post-load.
+  
+ 
+
+This allows us to work more easily with large data dump files when
+there is some need to edit commands or resequence their execution for
+performance.
+   
+ 
+   
 Because pg_dump is used to transfer data
 to newer versions of PostgreSQL, the output of
 pg_dump can be loaded into
Index: doc/src/sgml/ref/pg_restore.sgml
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/doc/src/sgml/ref/pg_restore.sgml,v
retrieving revision 1.75
diff -c -r1.75 pg_restore.sgml
*** doc/src/sgml/ref/pg_restore.sgml	13 Apr 2008 03:49:21 -	1.75
--- doc/src/sgml/ref/pg_restore.sgml	20 Jul 2008 06:33:18 -
***
*** 321,326 
--- 321,350 
   
  
   
+   --schema-post-load
+   
+
+ 		Dump only the object definitions (schema) required after data has been
+ 		loaded. Dumps exactly what --schema-only would dump, but 
+ 		only those statements after the data load.
+
+   
+  
+ 
+  
+   -S usernam

Re: [PATCHES] pg_dump additional options for performance

2008-07-19 Thread Simon Riggs

On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote:
> Simon,
> 
>   I agree with adding these options in general, since I find myself
>   frustrated by having to vi huge dumps to change simple schema things.
>   A couple of comments on the patch though:
> 
>   - Conflicting option handling
> I think we are doing our users a disservice by putting it on them to
>   figure out exactly what:
>   multiple object groups cannot be used together
>   means to them.  You and I may understand what an "object group" is,
>   and why there can be only one, but it's a great deal less clear than
>   the prior message of
>   options -s/--schema-only and -a/--data-only cannot be used together
>   My suggestion would be to either list out the specific options which
>   can't be used together, as was done previously, or add a bit of (I
>   realize, boring) code and actually tell the user which of the
>   conflicting options were used.
> 
>   - Documentation
>   When writing the documentation I would stress that "pre-schema" and
>   "post-schema" be defined in terms of PostgreSQL objects and why they
>   are pre vs. post.
> 
>   - Technically, the patch needs to be updated slightly since another
>   pg_dump-related patch was committed recently which also added
>   options and thus causes a conflict.
> 
>   Beyond those minor points, the patch looks good to me.

Thanks for the review. I'll make the changes you suggest.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


[PATCHES] pg_dump additional options for performance

2008-07-19 Thread Stephen Frost
Simon,

  I agree with adding these options in general, since I find myself
  frustrated by having to vi huge dumps to change simple schema things.
  A couple of comments on the patch though:

  - Conflicting option handling
I think we are doing our users a disservice by putting it on them to
figure out exactly what:
multiple object groups cannot be used together
means to them.  You and I may understand what an "object group" is,
and why there can be only one, but it's a great deal less clear than
the prior message of
options -s/--schema-only and -a/--data-only cannot be used together
My suggestion would be to either list out the specific options which
can't be used together, as was done previously, or add a bit of (I
realize, boring) code and actually tell the user which of the
conflicting options were used.

  - Documentation
When writing the documentation I would stress that "pre-schema" and
"post-schema" be defined in terms of PostgreSQL objects and why they
are pre vs. post.

  - Technically, the patch needs to be updated slightly since another
pg_dump-related patch was committed recently which also added
options and thus causes a conflict.

  Beyond those minor points, the patch looks good to me.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [PATCHES] Is autovacuum doing a wraparound-avoiding VACUUM?

2008-07-19 Thread Simon Riggs

On Fri, 2008-07-18 at 01:44 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > On Thu, 2008-07-17 at 17:10 -0400, Alvaro Herrera wrote:
> >> I don't like your wording though; it feels too verbose (and you're
> >> losing the ANALYZE in case it's doing both things).  How about 
> >> 
> >> snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
> >> "autovacuum: VACUUM%s%s", vac
> >> tab->at_doanalyze ? " ANALYZE" : "",
> >> tab->at_wraparound ? " (wraparound)" : "");
> 
> > Yes, looks good.
> 
> May I suggest "(to prevent wraparound)" or something like that?
> Otherwise, +1.
> 
> >> You're not proposing it for 8.3 right?
> 
> > I think I am. It's an important diagnostic for your other fix.
> 
> I agree, this is important for visibility into what's happening.
> The string isn't getting translated so I don't see any big downside
> to applying the patch in back branches.

Patches for 8.3 and CVS HEAD.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.81
diff -c -r1.81 autovacuum.c
*** src/backend/postmaster/autovacuum.c	17 Jul 2008 21:02:31 -	1.81
--- src/backend/postmaster/autovacuum.c	19 Jul 2008 07:58:33 -
***
*** 2657,2664 
  	/* Report the command and possible options */
  	if (tab->at_dovacuum)
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
!  "autovacuum: VACUUM%s",
!  tab->at_doanalyze ? " ANALYZE" : "");
  	else
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
   "autovacuum: ANALYZE");
--- 2657,2665 
  	/* Report the command and possible options */
  	if (tab->at_dovacuum)
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
!  "autovacuum: VACUUM%s%s",
!  tab->at_doanalyze ? " ANALYZE" : "",
!  tab->at_wraparound ? " (to prevent wraparound)" : "");
  	else
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
   "autovacuum: ANALYZE");
Index: src/backend/postmaster/autovacuum.c
===
RCS file: /home/sriggs/pg/REPOSITORY/pgsql/src/backend/postmaster/autovacuum.c,v
retrieving revision 1.71.2.4
diff -c -r1.71.2.4 autovacuum.c
*** src/backend/postmaster/autovacuum.c	17 Jul 2008 21:02:41 -	1.71.2.4
--- src/backend/postmaster/autovacuum.c	19 Jul 2008 07:58:43 -
***
*** 291,297 
  static PgStat_StatTabEntry *get_pgstat_tabentry_relid(Oid relid, bool isshared,
  		  PgStat_StatDBEntry *shared,
  		  PgStat_StatDBEntry *dbentry);
! static void autovac_report_activity(VacuumStmt *vacstmt, Oid relid);
  static void avl_sighup_handler(SIGNAL_ARGS);
  static void avl_sigusr1_handler(SIGNAL_ARGS);
  static void avl_sigterm_handler(SIGNAL_ARGS);
--- 291,297 
  static PgStat_StatTabEntry *get_pgstat_tabentry_relid(Oid relid, bool isshared,
  		  PgStat_StatDBEntry *shared,
  		  PgStat_StatDBEntry *dbentry);
! static void autovac_report_activity(VacuumStmt *vacstmt, Oid relid, bool for_wraparound);
  static void avl_sighup_handler(SIGNAL_ARGS);
  static void avl_sigusr1_handler(SIGNAL_ARGS);
  static void avl_sigterm_handler(SIGNAL_ARGS);
***
*** 2633,2639 
  	MemoryContextSwitchTo(old_cxt);
  
  	/* Let pgstat know what we're doing */
! 	autovac_report_activity(&vacstmt, relid);
  
  	vacuum(&vacstmt, relids, bstrategy, for_wraparound, true);
  }
--- 2633,2639 
  	MemoryContextSwitchTo(old_cxt);
  
  	/* Let pgstat know what we're doing */
! 	autovac_report_activity(&vacstmt, relid, for_wraparound);
  
  	vacuum(&vacstmt, relids, bstrategy, for_wraparound, true);
  }
***
*** 2650,2656 
   * bother to report "" or some such.
   */
  static void
! autovac_report_activity(VacuumStmt *vacstmt, Oid relid)
  {
  	char	   *relname = get_rel_name(relid);
  	char	   *nspname = get_namespace_name(get_rel_namespace(relid));
--- 2650,2656 
   * bother to report "" or some such.
   */
  static void
! autovac_report_activity(VacuumStmt *vacstmt, Oid relid, bool for_wraparound)
  {
  	char	   *relname = get_rel_name(relid);
  	char	   *nspname = get_namespace_name(get_rel_namespace(relid));
***
*** 2661,2668 
  	/* Report the command and possible options */
  	if (vacstmt->vacuum)
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
!  "autovacuum: VACUUM%s",
!  vacstmt->analyze ? " ANALYZE" : "");
  	else
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
   "autovacuum: ANALYZE");
--- 2661,2669 
  	/* Report the command and possible options */
  	if (vacstmt->vacuum)
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
!  "autovacuum: VACUUM%s%s",
!  vacstmt->analyze ? " ANALYZE" : "",
!  for_wraparound ? " (to prevent wraparound)" : "");
  	else
  		snprintf(activity, MAX_AUTOVAC_ACTIV_LEN,
   "autovacuum: ANALYZE");

-- 
Sent via pgsql-patches mailing list (pgsql