Re: [HACKERS] analyze after a database restore?

2003-03-17 Thread Bruce Momjian

Attached is a committed patch to add a recommendation for ANALYZE after
restore.  It is a shame we only have vacuumdb -a to do analyze _and_
vacuum, and no analyze-only option.

---

Tom Lane wrote:
 mlw [EMAIL PROTECTED] writes:
   From an ease of use perspective, it would be one less step.
 
 There is something to be said for that.  As Rod notes, this has been
 considered and rejected before --- but I think that was back when
 ANALYZE (a) could only be done as part of VACUUM, and (b) insisted on
 scanning the whole table.  The current implementation is vastly
 lighter-weight than what we were looking at back then.  Perhaps it's
 time to reconsider.
 
 Although I suggested doing a single unconditional ANALYZE at the end
 of the script, second thought leads me to think the per-table ANALYZE
 (probably issued right after the table's data-load step) might be
 better.  That way you'd not have any side-effects on already-existing
 tables in the database you are loading to.  OTOH, that way would leave
 the system catalogs un-analyzed, which might be bad.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/backup.sgml
===
RCS file: /cvsroot/pgsql-server/doc/src/sgml/backup.sgml,v
retrieving revision 2.24
diff -c -c -r2.24 backup.sgml
*** doc/src/sgml/backup.sgml11 Nov 2002 20:14:02 -  2.24
--- doc/src/sgml/backup.sgml17 Mar 2003 23:58:37 -
***
*** 126,131 
--- 126,138 
 /para
  
 para
+ Once restored, it is wise to run commandANALYZE/ on each
+ database so the optimizer has useful statistics. You
+ can also run commandvacuumdb -a -z/ to commandANALYZE/ all
+ databases.
+/para
+ 
+para
  The ability of applicationpg_dump/ and applicationpsql/ to
  write to or read from pipes makes it possible to dump a database
  directly from one server to another, for example
Index: doc/src/sgml/ref/pg_dump.sgml
===
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/pg_dump.sgml,v
retrieving revision 1.56
diff -c -c -r1.56 pg_dump.sgml
*** doc/src/sgml/ref/pg_dump.sgml   13 Feb 2003 04:54:15 -  1.56
--- doc/src/sgml/ref/pg_dump.sgml   17 Mar 2003 23:58:38 -
***
*** 650,655 
--- 650,660 
  /programlisting
/para
  
+para
+ Once restored, it is wise to run commandANALYZE/ on each
+ restored object so the optimizer has useful statistics.
+/para
+ 
para
 applicationpg_dump/application has a few limitations:
  
***
*** 682,687 
--- 687,698 
 other output formats is not limited, except possibly by the
 operating system.
/para
+ 
+   para
+Once restored, it is wise to run commandANALYZE/ on each
+restored object so the optimizer has useful statistics.
+   /para
+ 
   /refsect1
  
   refsect1 id=pg-dump-examples
Index: doc/src/sgml/ref/pg_dumpall.sgml
===
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/pg_dumpall.sgml,v
retrieving revision 1.36
diff -c -c -r1.36 pg_dumpall.sgml
*** doc/src/sgml/ref/pg_dumpall.sgml6 Jan 2003 18:53:24 -   1.36
--- doc/src/sgml/ref/pg_dumpall.sgml17 Mar 2003 23:58:38 -
***
*** 258,267 
 applicationpg_dumpall/application will need to connect several
 times to the productnamePostgreSQL/productname server.  If password
 authentication is configured, it will ask for a password each time. In
!that case it would be convenient to set up a password file.
/para
  
-   commentBut where is that password file documented?/comment
   /refsect1
  
  
--- 258,274 
 applicationpg_dumpall/application will need to connect several
 times to the productnamePostgreSQL/productname server.  If password
 authentication is configured, it will ask for a password each time. In
!that case it would be convenient to set up a filename.pgpass/ 
!password file.
!   /para
! 
!   para
!Once restored, it is wise to run commandANALYZE/ on each
!database so the optimizer has useful statistics. You
!can also run commandvacuumdb -a -z/ to commandANALYZE/ all
!databases.
/para
  
   /refsect1
  
  
Index: doc/src/sgml/ref/pg_restore.sgml
===
RCS file: /cvsroot/pgsql-server/doc/src/sgml/ref/pg_restore.sgml,v
retrieving revision 1.35

Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread Tom Lane
mlw [EMAIL PROTECTED] writes:
 Should pg_dump appened an ANALYZE for each table?

A single ANALYZE at the end of the script would be sufficient.  I'm not
sure that pg_dump should do this automatically though.  If you're not
done restoring then it's mostly a waste of cycles, and how is pg_dump to
know that?

I do note that the docs are rather stingy with this important bit of
knowhow :-(  Neither of the obvious places that I looked in (pg_dump
reference page and admin guide's backup/restore chapter) mention the
need to issue an ANALYZE after completing a restore.  I'm pretty sure it
is mentioned *somewhere* ;-) ... but it needs to be more prominent.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html


Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread Rod Taylor
On Thu, 2003-02-27 at 12:27, mlw wrote:
 I just dumped and restored a rather large database, I upgraded from 
 7.2.x to 7.3.x. When I went to test my application against the new 
 database, it was dog slow. It had all the indexes, and looked fine.
 
 Then it dawned on me, Doh! ANALYZE!
 
 Should pg_dump appened an ANALYZE for each table?
 
 On small tables, this shouldn't take too long. On large tables, you're 
 gonna have to do it anyway. I guess it could be an option as well.
 
 It just seems like on of the tasks that is required for a restored 
 database to work properly, and as such, should probably be specified in 
 the backup procedure.

It's been debated before whether pg_dump should append ANALYZE to the
end of the load script.  It has always been determined that it shouldn't
(see archives for arguments).

However, an 'Auto-vacuum' process should watch stats and re-analyze the
table when the larger of 30% or 1000 rows has been affected (inserts, or
deletes mostly). That is probably a better solution overall.

-- 
Rod Taylor [EMAIL PROTECTED]

PGP Key: http://www.rbt.ca/rbtpub.asc


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread mlw


Tom Lane wrote:

mlw [EMAIL PROTECTED] writes:
 

Should pg_dump appened an ANALYZE for each table?
   

A single ANALYZE at the end of the script would be sufficient.  I'm not
sure that pg_dump should do this automatically though.  If you're not
done restoring then it's mostly a waste of cycles, and how is pg_dump to
know that?
I do note that the docs are rather stingy with this important bit of
knowhow :-(  Neither of the obvious places that I looked in (pg_dump
reference page and admin guide's backup/restore chapter) mention the
need to issue an ANALYZE after completing a restore.  I'm pretty sure it
is mentioned *somewhere* ;-) ... but it needs to be more prominent.
 

While these are all comforting points, I *know* about analyze and I 
occasionally forget. It just seems like a nessisary step after restoring 
a backup. Conceptually, one could consider it just as important as an 
index, i.e. the system will perform poorly without it.

From an ease of use perspective, it would be one less step.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/users-lounge/docs/faq.html


Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread Tom Lane
mlw [EMAIL PROTECTED] writes:
  From an ease of use perspective, it would be one less step.

There is something to be said for that.  As Rod notes, this has been
considered and rejected before --- but I think that was back when
ANALYZE (a) could only be done as part of VACUUM, and (b) insisted on
scanning the whole table.  The current implementation is vastly
lighter-weight than what we were looking at back then.  Perhaps it's
time to reconsider.

Although I suggested doing a single unconditional ANALYZE at the end
of the script, second thought leads me to think the per-table ANALYZE
(probably issued right after the table's data-load step) might be
better.  That way you'd not have any side-effects on already-existing
tables in the database you are loading to.  OTOH, that way would leave
the system catalogs un-analyzed, which might be bad.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread Andrew Sullivan
On Thu, Feb 27, 2003 at 02:45:46PM -0500, mlw wrote:
 
 Are there any reasons why it should not be an option on pg_dump?

I wonder whether that mightn't be the best answer.  Maybe it should
even be the default, and --noanalyse an option.

I agree that from the point of view of simplifying administration,
it'd be a nice addition.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Liberty RMS   Toronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread Alvaro Herrera
On Thu, Feb 27, 2003 at 03:12:36PM -0500, Tom Lane wrote:

 Although I suggested doing a single unconditional ANALYZE at the end
 of the script, second thought leads me to think the per-table ANALYZE
 (probably issued right after the table's data-load step) might be
 better.  That way you'd not have any side-effects on already-existing
 tables in the database you are loading to.  OTOH, that way would leave
 the system catalogs un-analyzed, which might be bad.

Huh... is there a way to analyze a specific schema?  Maybe that can
solve the problem of system catalogs being left un-analyzed, by having
pg_dump emit an ANALYZE SCHEMA pg_catalog or something.

-- 
Alvaro Herrera (alvherre[a]dcc.uchile.cl)
La libertad es como el dinero; el que no la sabe emplear la pierde (Alvarez)

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread D'Arcy J.M. Cain
On Thursday 27 February 2003 13:12, mlw wrote:
 Tom Lane wrote:
 A single ANALYZE at the end of the script would be sufficient.  I'm not
 sure that pg_dump should do this automatically though.  If you're not
 done restoring then it's mostly a waste of cycles, and how is pg_dump to
 know that?
 [...]
  From an ease of use perspective, it would be one less step.

Why not have pg_dump emit a friendly reminder?

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED]|vex}.net   |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html


Re: [HACKERS] analyze after a database restore?

2003-02-27 Thread mlw


D'Arcy J.M. Cain wrote:

On Thursday 27 February 2003 13:12, mlw wrote:
 

Tom Lane wrote:
   

A single ANALYZE at the end of the script would be sufficient.  I'm not
sure that pg_dump should do this automatically though.  If you're not
done restoring then it's mostly a waste of cycles, and how is pg_dump to
know that?
 

[...]
From an ease of use perspective, it would be one less step.
   

Why not have pg_dump emit a friendly reminder?

 

The reminder won't work, because the backup may be happening in an 
automatic fashion, and anything but error messages will be lost. I 
dislike having to have an expert be present at the database restore 
phase of operation.

Suppose a company loses the PG admin and a reasonably experienced person 
takes his or her place temporarily, This scenario happens all the time 
with all sorts of projects. A reasonably experienced person will be able 
to accomplish a DB restore but will probably not know about performing 
an analyze. Under the pressure of restoring after a crash on a live 
system, even a reasonably experienced PG admin may forget, hell I forgot 
and I've been using PG since 1997.

The correct view of a database backup should be to include the 
statistics of the database as it existed at the time backup, these 
statistics are part of this state snapshot because the directly affect 
the operation of the database. I do not want to evoke the name of 
Larry's evil product, but it saves its statistics when the data is exported.

Short of including the relevant statistics, there should be an option on 
pg_dump to emit an ANALYZE; at the end of a database dump. This will 
allow a knowledgeable admin to selectively add the vacuum so that 
someone possibly less qualified than he can do the restore.

Does anyone disagree that a query's explain should look the same or 
better after a successful restore? From a product QA point of view, if a 
valid backup set, when restored, does not recreate the system in a state 
at least as efficient and workable as the system when it was backed up, 
you did not have a successful restore. Any QA department would rate this 
as a serious bug.

Are there any reasons why it should not be an option on pg_dump?

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])