The long-lost pg_upgrade (was: Re: [HACKERS] 8.2 features status)

2006-08-12 Thread Lamar Owen
On Friday 04 August 2006 02:20, Josh Berkus wrote:
 grin Aren't I, the marketing geek, supposed to be the one whining about
 this?
[snip]
  * In-place upgrades (pg_upgrade)

 BTW, I may get Sun to contribute an engineer for this; will keep you posted.

Long time no post.  This statement really caught my attention; bravo if true 
upgrading can happen, and someone can be put on it to do it right.

As Tom said, a little farther down the thread, we have talked over this many 
times.  I specifically remember, oh, about a dozen times I personally 
have 'gadflied' this issue.

As one who now has a, let's see: 
[EMAIL PROTECTED] ~]# du /var/lib/pgsql/data -s
16668528	/var/lib/pgsql/data
[EMAIL PROTECTED] ~]# 

Yes, a 16GB inventory database, with in-database large object images.  Anyway, 
as one who does not look forward to migrating this the old-fashioned way (I 
can just imagine how fas^H^H^Hslow a restore of all those large objects is 
going to be; backup is slow enough, about 50 minutes on this 2.4GHz Xeon 
box), an in-place upgrade would cut this considerably; the database is not a 
complex one, just a largish one.  It's, let's see, only holding a little less 
than 5,000 items with associated lo images (due to many factors, this is 
handled through ODBC from Microsoft Access; it is a kludge, and a big one, 
but it works very smoothly from the users' points of view, where item images 
are literally 'dragged and dropped' from the digital camera straight to the 
database).
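For anyone sizing up a similar migration, a rough catalog query can show how much of the cluster is large-object data before committing to a dump/restore window. This is a hedged sketch, not from the post itself; it assumes `pg_relation_size()` is available (8.1 or later) and is run in the target database:

```sql
-- Rough sketch: gauge large-object volume before planning a dump/restore.
-- Assumes PostgreSQL 8.1+ for pg_relation_size(); run in the target database.
SELECT count(DISTINCT loid)               AS lo_count,  -- distinct large objects
       pg_relation_size('pg_largeobject') AS lo_bytes   -- raw LO storage used
FROM pg_largeobject;
```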

So, anyway, looking forward to seeing some progress in this department... :-)
-- 
Lamar Owen
Director of Information Technology
Pisgah Astronomical Research Institute
1 PARI Drive
Rosman, NC  28772
(828)862-5554
www.pari.edu

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-07 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 I don't think there is very much hope of an in-place upgrade for
 scenarios involving changes in contents of user tables.  In particular,
 what of a change that requires more space than before, such as adding a
 locale indicator to text fields?  There's no guarantee that the data on
 an old page will still fit, and there's certainly no hope of something
 operating at the xlog level being able to move tuples across pages ---
 if nothing else, because it's not in a position to compute new index
 entries.  I don't see this working for page-at-a-time updates even in a
 full backend environment; again, indexes are the killer consideration.
 I don't see how to get sane behavior from an index containing some
 old-style entries and some new-style ones for a changed datatype.
 
 As you mentioned, the scenarios that look practical for in-place upgrade
 are the ones where only system catalog contents need to change.  We've
 already discussed this (many times) and agreed that we could live with
 restricting user-table changes to happen only once every few releases.

What about having support in the backend for two heap formats in each version?
That is, 8.3 would be able to support 8.2 heaps as well as 8.3 heaps.  There
could be a flag in pg_class indicating the heap format for that heap.  Then you
would be able to upgrade without rewriting all your tables; the only
requirement is that, sometime prior to the next upgrade, you issue a per-table
ALTER TABLE that rewrites the table, such as CLUSTER or ALTER COLUMN TYPE USING.

That still requires 2x space, but only for a single table at a time, which is
much better than 2x space for the entire database.  It also lets you schedule
that job for a point in time when you can arrange to have the extra space.
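A sketch of what that migration step might look like in SQL, assuming the proposed pg_class flag existed. The column name `relheapversion`, the table `some_table`, its index, and column `c` are all invented for illustration; only the table-rewriting commands themselves are real:

```sql
-- Hypothetical: list tables still stored in the old (8.2) heap format.
-- "relheapversion" does not exist in pg_class; it stands in for the proposed flag.
--   SELECT relname FROM pg_class WHERE relkind = 'r' AND relheapversion < 2;

-- Real commands that rewrite a table, which would migrate it to the new format:
CLUSTER some_table_pkey ON some_table;          -- rewrite the heap in index order, or
ALTER TABLE some_table
    ALTER COLUMN c TYPE text USING c::text;     -- rewrite via ALTER COLUMN TYPE
```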



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-07 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 [ concerning handling a change in a single datatype's representation ]

 1. Under old system, munge the system catalog to add code for new inet
 type with new OID. Probably needs a shared lib (if you could create
 type input/output function with pl/pgsql it would help here).
 2. Execute ALTER COLUMN TYPE USING to switch to the new type.
 3. Shutdown old server
 4. Complete catalog changes
 5. Startup new version

Probably easier and more flexible just to include both versions of the
datatype in the new release, and tell people they'd better ALTER COLUMN
TYPE sometime before updating to release N+2.
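As a hedged illustration of that path, supposing a release shipped the old representation under a distinct name (the type name `inet_old`, the table `hosts`, and its column are hypothetical), the per-table conversion before the N+2 upgrade could look like:

```sql
-- Hypothetical: "inet_old" names the legacy representation kept for one release.
-- Round-tripping through text rewrites the column into the new on-disk format.
ALTER TABLE hosts
    ALTER COLUMN addr TYPE inet
    USING addr::text::inet;
```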

regards, tom lane



Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-07 Thread Martijn van Oosterhout
On Mon, Aug 07, 2006 at 08:37:14AM -0400, Tom Lane wrote:
 Probably easier and more flexible just to include both versions of the
 datatype in the new release, and tell people they'd better ALTER COLUMN
 TYPE sometime before updating to release N+2.

Right, but I wasn't sure if that'd be considered acceptable.

So, in a nutshell: if all the other issues of pg_upgrade get ironed out
and the only blocker were a change in the on-disk format of a type,
including both the old and the new versions in the next major release
would be seriously considered as an option.

Well, that's got to make it easier.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-07 Thread Jim C. Nasby
On Mon, Aug 07, 2006 at 06:05:30AM -0400, Gregory Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  I don't think there is very much hope of an in-place upgrade for
  scenarios involving changes in contents of user tables.  In particular,
  what of a change that requires more space than before, such as adding a
  locale indicator to text fields?  There's no guarantee that the data on
  an old page will still fit, and there's certainly no hope of something
  operating at the xlog level being able to move tuples across pages ---
  if nothing else, because it's not in a position to compute new index
  entries.  I don't see this working for page-at-a-time updates even in a
  full backend environment; again, indexes are the killer consideration.
  I don't see how to get sane behavior from an index containing some
  old-style entries and some new-style ones for a changed datatype.
  
  As you mentioned, the scenarios that look practical for in-place upgrade
  are the ones where only system catalog contents need to change.  We've
  already discussed this (many times) and agreed that we could live with
  restricting user-table changes to happen only once every few releases.
 
 What about having support in the backend for two heap formats in each version?
 That is, 8.3 would be able to support 8.2 heaps as well as 8.3 heaps.  There
 could be a flag in pg_class indicating the heap format for that heap.  Then you
 would be able to upgrade without rewriting all your tables; the only
 requirement is that, sometime prior to the next upgrade, you issue a per-table
 ALTER TABLE that rewrites the table, such as CLUSTER or ALTER COLUMN TYPE USING.
 
 That still requires 2x space, but only for a single table at a time, which is
 much better than 2x space for the entire database.  It also lets you schedule
 that job for a point in time when you can arrange to have the extra space.

Unless you make it so that the old-version pages get changed to the new
version on-the-fly as they get dirtied.

And please, folks: yes, it'd be insane to make the backend deal with more
than one old version at a time.  I think it's perfectly acceptable to
tell folks that if they want to jump 4 major versions at once,
they'll have to do it some other way.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-07 Thread Sander Steffann

Hi,


[ concerning handling a change in a single datatype's representation ]



1. Under old system, munge the system catalog to add code for new inet
type with new OID. Probably needs a shared lib (if you could create
type input/output function with pl/pgsql it would help here).
2. Execute ALTER COLUMN TYPE USING to switch to the new type.
3. Shutdown old server
4. Complete catalog changes
5. Startup new version


Probably easier and more flexible just to include both versions of the
datatype in the new release, and tell people they'd better ALTER COLUMN
TYPE sometime before updating to release N+2.


Sounds very nice to me!  Add an easy way to see which columns are 
still in the old format, and upgrading will be so much easier.
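If the old representation were exposed as a distinct type (say, a hypothetical `inet_old`, as in the upthread example), such a check could be an ordinary catalog query, along these lines:

```sql
-- Illustrative sketch: find columns still declared with the hypothetical old type.
SELECT n.nspname, c.relname, a.attname
FROM pg_attribute a
JOIN pg_class     c ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_type      t ON t.oid = a.atttypid
WHERE t.typname = 'inet_old'
  AND a.attnum > 0
  AND NOT a.attisdropped;
```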


- Sander





Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-05 Thread Josh Berkus
Rick,

 The objective is to smoothly upgrade to the new version with minimal
 downtime.

Thanks for jumping in.

 The different proposals as far as I can see are as follows:

 Proposal A - the big one time reformatting
 1) shutdown the db
 2) run a command that upgrades the data directory to the new format
 3) start up the new postgres version with the new data dir

 Pros: only pg_upgrade (or whatever it's called) needs to know about
 the old and new formats; each version of postgres knows about its
 format and that's it.  The postgres code stays clean.
 Cons: your database is down while the upgrade takes place.  This
 sucks because the people who need this are the same people who are
 trying to avoid downtime.  It's faster than a dump/reload, but it
 doesn't completely solve the problem, it just mitigates it.

Yes, but it mitigates it *considerably*.  The general idea is that doing an 
in-place binary upgrade should take 80% less time, and require only 10% (or 
so) extra disk space.  This would be sufficient for most users with large 
databases; they can afford to be down from midnight to 3 am, but not to be 
down for the whole weekend (as dump/reload requires for a 3TB database), nor 
do they have 100%+ extra disk space and 20% extra CPU on the machine (as 
upgrade-by-replication requires).

Plus, there are versions (as 8.2 looks to be) where there is *no* change in 
the file format.  For these, pg_upgrade would just need to bootstrap the 
system, swap the system tables and header files, and restart.  Seconds 
instead of hours.  As PostgreSQL matures further, I predict that there will 
be more releases with no change in the file format, making this even more 
useful.

 Proposal B - the gradual upgrade
 1) shutdown the db
 2) start it back up with the new version of postgres
 3) the new postgres version upgrades things in place as needed

This is impractical, because the number of version-to-version compatibility 
libraries required will grow geometrically over time.  We don't have to 
accommodate just the last version, but the last 4+ versions.

 Proposal C - PITR with on-the-fly disk upgrades
 1) setup PITR
 2) run pg_upgrade on your latest backed up data directories
 3) start up the new pg on that data directory in restartable
 recovery / read-only / hot-standby mode
 4) update the recovery log importer so that it can update the log
 files on the fly as it applies them
 5) failover to the hot standby as you normally would

I don't think this is practical.   Simon?

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-05 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 Proposal C - PITR with on-the-fly disk upgrades
 1) setup PITR
 2) run pg_upgrade on your latest backed up data directories
 3) start up the new pg on that data directory in restartable
 recovery / read-only / hot-standby mode
 4) update the recovery log importer so that it can update the log
 files on the fly as it applies them
 5) failover to the hot standby as you normally would

 I don't think this is practical.   Simon?

I don't think there is very much hope of an in-place upgrade for
scenarios involving changes in contents of user tables.  In particular,
what of a change that requires more space than before, such as adding a
locale indicator to text fields?  There's no guarantee that the data on
an old page will still fit, and there's certainly no hope of something
operating at the xlog level being able to move tuples across pages ---
if nothing else, because it's not in a position to compute new index
entries.  I don't see this working for page-at-a-time updates even in a
full backend environment; again, indexes are the killer consideration.
I don't see how to get sane behavior from an index containing some
old-style entries and some new-style ones for a changed datatype.

As you mentioned, the scenarios that look practical for in-place upgrade
are the ones where only system catalog contents need to change.  We've
already discussed this (many times) and agreed that we could live with
restricting user-table changes to happen only once every few releases.

regards, tom lane



pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-04 Thread Jim C. Nasby
On Thu, Aug 03, 2006 at 11:20:48PM -0700, Josh Berkus wrote:
  * In-place upgrades (pg_upgrade)
 
 BTW, I may get Sun to contribute an engineer for this; will keep you posted.

How would such a thing handle changes to page formats?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-04 Thread Stephen Frost
* Jim C. Nasby ([EMAIL PROTECTED]) wrote:
 On Thu, Aug 03, 2006 at 11:20:48PM -0700, Josh Berkus wrote:
   * In-place upgrades (pg_upgrade)
  
  BTW, I may get Sun to contribute an engineer for this; will keep you posted.
 
 How would such a thing handle changes to page formats?

Couldn't this be done by converting a table/partial-table at a time?
It wouldn't be something which could run while the system is live, but
it'd probably take less time than dump/restore and wouldn't require
double the disk space of the whole database... no?

Thanks,

Stephen




Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-04 Thread Jim C. Nasby
On Fri, Aug 04, 2006 at 02:12:16PM -0400, Stephen Frost wrote:
 * Jim C. Nasby ([EMAIL PROTECTED]) wrote:
  On Thu, Aug 03, 2006 at 11:20:48PM -0700, Josh Berkus wrote:
* In-place upgrades (pg_upgrade)
   
   BTW, I may get Sun to contribute an engineer for this; will keep you posted.
  
  How would such a thing handle changes to page formats?
 
 Couldn't this be done by converting a table/partial-table at a time?
 It wouldn't be something which could run while the system is live, but
 it'd probably take less time than dump/restore and wouldn't require
 double the disk space of the whole database... no?

True, but if you're going to go about creating code that can deal with 2
different versions of on-disk data, why not go one better: put that code
into the database itself, so that pages are converted on-the-fly as
they're dirtied. That way you have *no* downtime (or almost no, anyway).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-04 Thread Stephen Frost
* Jim C. Nasby ([EMAIL PROTECTED]) wrote:
 On Fri, Aug 04, 2006 at 02:12:16PM -0400, Stephen Frost wrote:
  * Jim C. Nasby ([EMAIL PROTECTED]) wrote:
   How would such a thing handle changes to page formats?
  
  Couldn't this be done by converting a table/partial-table at a time?
  It wouldn't be something which could run while the system is live, but
  it'd probably take less time than dump/restore and wouldn't require
  double the disk space of the whole database... no?
 
 True, but if you're going to go about creating code that can deal with 2
 different versions of on-disk data, why not go one better: put that code
 into the database itself, so that pages are converted on-the-fly as
 they're dirtied. That way you have *no* downtime (or almost no, anyway).

Certainly a good idea, but I'm really not sure that: a) we'd want the
core server to carry what is essentially cruft code, and b) it'd be anywhere
near simple to make this actually work in the core system.

Supporting two versions of the page format when the only goal is
changing the format is quite a different thing from fully supporting
two on-disk formats throughout the backend.  Additionally, if you have
something to convert the database wholesale, then that really just needs
to support 2 versions at a time; if it's piecemeal, then you could end up
in a situation where you have to support 3, 4, 5, however many prior
page formats existed.  If not, then you have to have a way to force all
the pages to be upgraded, and you're unlikely to be able to do a whole
lot more with the system while that's happening, thus adding a lot of
extra complication just to end up having everyone follow the
'pg_upgrade'-style path anyway...

In the end, though, I think once it's been done in pg_upgrade this could
be revisited and considered, and if enough people are interested in doing
it they could probably reuse some of the pg_upgrade code to make it
happen.

Thanks,

Stephen




Re: pg_upgrade (was: [HACKERS] 8.2 features status)

2006-08-04 Thread Rick Gigger

I had a few thoughts on this issue:

The objective is to smoothly upgrade to the new version with minimal  
downtime.


The different proposals as far as I can see are as follows:

Proposal A - the big one time reformatting
1) shutdown the db
2) run a command that upgrades the data directory to the new format
3) start up the new postgres version with the new data dir

Pros: only pg_upgrade (or whatever it's called) needs to know about  
the old and new formats; each version of postgres knows about its  
format and that's it.  The postgres code stays clean.
Cons: your database is down while the upgrade takes place.  This  
sucks because the people who need this are the same people who are  
trying to avoid downtime.  It's faster than a dump/reload, but it  
doesn't completely solve the problem, it just mitigates it.


Proposal B - the gradual upgrade
1) shutdown the db
2) start it back up with the new version of postgres
3) the new postgres version upgrades things in place as needed

Pros: very short downtime; only the time to shut down the old postgres  
version and start up the new one.
Cons: the postgres code gets filled with cruft; each version has to know  
about the old versions' on-disk data format and how to upgrade it.  
Until it is finished you will be left with a database that is part  
old format, part new format.  This could introduce bugs for people  
who never needed the feature in the first place.


Here is another proposal that I haven't heard anyone else suggest.  
My apologies in advance if it's obviously not workable or has already  
been discussed.


Proposal C - PITR with on-the-fly disk upgrades
1) setup PITR
2) run pg_upgrade on your latest backed up data directories
3) start up the new pg on that data directory in restartable  
recovery / read-only / hot-standby mode
4) update the recovery log importer so that it can update the log  
files on the fly as it applies them
5) failover to the hot standby as you normally would

Pros: essentially no downtime, just any incidental time needed for  
the failover to occur.  Cruft in the postgres main codebase is  
minimized; it's limited to the log importer.  All other parts of  
postgres are unaffected.
Cons: requires another server, or double the disk space on the  
original server.  Is this a problem for people with databases so  
large that a dump/reload is unacceptable?  Perhaps there are  
technical issues with postgres that I don't understand that would  
make this too hard.  Maybe it would take too long to update each log  
file as it's applied, so it wouldn't be able to catch up.


Oh yeah, there's another way:
Proposal D - Use Slony
But I figured that since Slony has been working for a long time, if it  
solved their problem then they wouldn't be looking for something else.


I have no need for this feature, as a dump/reload is not a problem for  
me.  I've always wondered, though, if that was a feasible answer to  
this problem.  Each time it crops up, people propose solutions A and B,  
but never C.



On Aug 4, 2006, at 1:30 PM, Jim C. Nasby wrote:

 On Fri, Aug 04, 2006 at 02:12:16PM -0400, Stephen Frost wrote:
  * Jim C. Nasby ([EMAIL PROTECTED]) wrote:
   On Thu, Aug 03, 2006 at 11:20:48PM -0700, Josh Berkus wrote:
     * In-place upgrades (pg_upgrade)

    BTW, I may get Sun to contribute an engineer for this; will keep you posted.

   How would such a thing handle changes to page formats?

  Couldn't this be done by converting a table/partial-table at a time?
  It wouldn't be something which could run while the system is live, but
  it'd probably take less time than dump/restore and wouldn't require
  double the disk space of the whole database... no?

 True, but if you're going to go about creating code that can deal with 2
 different versions of on-disk data, why not go one better: put that code
 into the database itself, so that pages are converted on-the-fly as
 they're dirtied. That way you have *no* downtime (or almost no, anyway).
 --
 Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
 Pervasive Software  http://pervasive.com   work: 512-231-6117
 vcard: http://jim.nasby.net/pervasive.vcf  cell: 512-569-9461




