Hello,

[I tried the general list but didn't hear from anyone so far, so I'm trying this list now. Please review, thanks.]


We are looking to distribute postgres databases to our customers along with our
application. We are currently evaluating postgres version 8.4.4. A database can
be up to 25 GB in size (the compressed files fit on a few DVDs; the product is
distributed on DVDs). A pg_restore of such a database takes several hours on the
low-end machines running Windows. The pg_restore is run during our product
install, and the projected install time is not acceptable. Our customers can
purchase additional databases over time, and the application makes transactional
updates to the databases after installation. Hence, copying the entire data
folder instead of using pg_restore is not an option, as those transactional
updates would be lost.

I have read the documentation and the few posts available that discourage
file-copy-based restores of individual databases, but I have found a way to do
this. I would appreciate it if the experts could read the approach below and
advise whether it will work, given our environment and usage boundaries.

Master Postgres instance (this is where we create the data; we have complete
control of this environment):
1. Create the database and populate the data.
2. Set vacuum_freeze_table_age to 0 in postgresql.conf.
3. Run VACUUM FULL; this resets the row xmin values to FrozenXid.
4. Shut down postgres and take a copy of the files for the given database
(a command sketch follows this list).
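
For concreteness, here is roughly how we script the master side. This is a
minimal sketch: "mydb", the paths, and the staging directory are placeholders,
and it assumes vacuum_freeze_table_age = 0 is already in postgresql.conf as per
step 2 above.

    # freeze every row in the database we are going to ship
    psql -d mydb -c "VACUUM FULL"

    # find the per-database directory: it is base/<OID of mydb>
    psql -d mydb -t -c "SELECT oid FROM pg_database WHERE datname = 'mydb'"

    # stop the server and capture the frozen files for the DVD image
    pg_ctl stop -D /path/to/data -m fast
    cp -R /path/to/data/base/<dboid> /path/to/staging/mydb_files/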

On the deploy instance at the customer site:
1. Create the new database.
2. Shut down the postgres instance and copy the database files created on the
master instance into the new database's folder (see the sketch after this list).
3. Start the postgres instance.
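
The customer-site side, again as a sketch: "mydb" and the paths are
placeholders, the stop/start and copy would actually go through the Windows
service manager and xcopy/robocopy in our installer, and it assumes both sides
run the same 8.4.4 binaries with the same encoding and locale.

    # create an empty shell database and note its OID
    psql -d postgres -c "CREATE DATABASE mydb"
    psql -d postgres -t -c "SELECT oid FROM pg_database WHERE datname = 'mydb'"

    # stop the server, then swap the shell's files for the frozen master copy
    pg_ctl stop -D /path/to/data -m fast
    rm -r /path/to/data/base/<new dboid>/*
    cp -R /path/to/staging/mydb_files/* /path/to/data/base/<new dboid>/
    pg_ctl start -D /path/to/data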

We don't use table row OIDs. If the cluster-wide OID counter collides with an
OID in the copied database files during subsequent DDL operations, postgres
resolves this by skipping to the next available OID. There will be a delay while
it finds the next available OID, which is acceptable in our case, as DDL
operations at the customer site are rare. And the VACUUM FULL with
vacuum_freeze_table_age set to 0 on the master instance takes care of xmin,
allowing the copied rows to be visible and further transactions at the customer
site to continue without colliding.
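
As a sanity check before taking the file copy, we verify on the master that
every row really was frozen (in 8.4, freezing overwrites xmin with FrozenXid,
which is transaction id 2). For a hypothetical table mytable:

    -- both counts should be equal after the VACUUM FULL
    SELECT count(*) AS total,
           sum(CASE WHEN xmin = 2 THEN 1 ELSE 0 END) AS frozen
    FROM mytable;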


I have tested this and it works, and I am continuing to test it further. I would
like validation of this idea from the experts and the community, to make sure I
haven't overlooked something obvious that might cause issues.

Thank you,
Srini
