I have an idea, to break pg_dump into functions within PostgreSQL.
Philip Warner wrote:
However, there are some complications because pg_dump is also the upgrade tool; the backed can only know how to describe itself for the current dialect of SQL accepted by PG. As we upgrade and improve the SQL, and add features, pg_dump needs to talk to old backends and dump prior versions in a format compatible with current (new) versions. This means that for some purposes it will not be able to use backend functions, or at least will have to have it's own mutant version of them.
Tom Lane wrote:
This strikes me as largely useless, because the problems that are actually hard for pg_dump are not at the level of individual objects;
they have to do with problems like determining a safe dump order and
altering definitions to break circularities in the dump requirements.
I don't think that extending the pg_get_xxx family of functions would
make pg_dump's life easier by any measurable amount.
I was thinking really of very low-level dumping functions,
that just dump simple single statements.
Its mainly for development purposes, we generally develop db apps in
three or four stages:
1, In a developer only db, each developer has a private schema to experiment in (prototypes, proof-of-concepts, alpha stage, etc).
2, Then its moved into its own schema on the same developer server for further stabilization (alpha/beta stage).
3, Once it reaches beta, it moves onto a mirror of the production server, for further testing by users (beta/pre-release stage).
4, For release, it's moved onto a production server for general consumption.
So there is lots of dumping and reloading of schemas, the only way to do it is by using pg_dump and then hand editing the dump (or with sed/awk scripts etc.), which is a bit of a pain.
I was thinking that if all the dumping functionality was available as PostgreSQL functions, then a simple PL/pgSQL function could be written to dump exactly what is required, and using dblink, fed directly from one db to another. Nice :)
Dependency functions could be developed to determine the ideal dump order for higher-level dumping, and would also be useful for displaying dependency graphs in admin interfaces (phpPgAdmin, pgAdmin III, etc.).
There is also a fundamental problem with the current pg_get_xxx functions, which is that they rely on backend-internal catalog operations that generally use SnapshotNow semantics. This is not what we want to guarantee that pg_dump dumps a consistent snapshot --- we need to look at catalog rows that existed as of pg_dump's serializable snapshot, instead. We have gotten away with it so far because pg_dump starts by taking read locks on every table in sight, and that is sufficient to block schema changes on the tables. But extending the pg_get_xxx approach to non-table-related objects would be seriously dangerous. (I think pg_get_viewdef is already broken, actually, since you can't lock a view.)
So, are pg_catalog tables subject to MVCC in the same way as user tables? ie. If I BEGIN a transaction, will the pg_catalog data remain consistent thoughout the transaction regardless of any DDL commands in a parallel session?
Philip Warner wrote: > Perhaps it would be nice if, in each new version we created a library > that could be built against old versions to provide the functions needed > by pg_dump to upgrade, and a similar library would form part of the new > version as well. Kind of a 'pg_dump translation plugin'. This may be way > too expensive an option, when a few 'if' statements inside pg_dump will > achieve almost the same result. It would remove/reduce bloat in pg_dump > and make the functions available more generally, at the expense of > duplicating lots of code for each supported version.
I was thinking of sharing a library (or just code base) between pg_dump and built-in functions, so that pg_dump doesn't have to rely on old built-in functions of a db you are trying to upgrade.
-- Mark Gibson <gibsonm |AT| cromwell |DOT| co |DOT| uk> Web Developer & Database Admin Cromwell Tools Ltd. Leicester, England.
---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html