Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-28 Thread Arian Prins
[EMAIL PROTECTED] (Christopher Kings-Lynne) wrote in message news:[EMAIL 
PROTECTED]...
 Lastly, I presume it's possible to create a system of circular 
 dependencies (eg create or replace view), which really cannot be solved 
 without a system of 'shells', similar to that needed to dump types and 
 their i/o functions.
 
 Views seem to be by far the nastiest object.  They can be dependent on 
 almost everything in the database.

Hello Group,

It might be an idea to keep track of all data-definition changes
during the lifetime of a database. Keep all the
SQL-definition-commands in a seperate systemtable somewhere. Then,
when the schema is dumped you'd have the choice of dumping:
- the most recent schema with database-determined ordering (as is the
case in the current situation)
- replay the recorded datadefinition of the past, so you know the
data definition is executed in a sound sequence.

Of course, data itself would be extracted seperate of the definitions
and there would also be the need to remove constrains while the data
is being loaded. This last thing could be done by parsing the
recorded data-definition-commands or by first blindly running the
recorded commands, then removing (or disabling) any constraints that
are there and finally replacing (or re-enabling) the constraints.

Negative side of this idea is obviously that you need to make changes
to a lot of internals of the postgresql database. AND you need to make
perfectly sure that the state of the database is always consistent
with the state of the recorded SQL.

Good Luck,
Arian.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-23 Thread Tom Lane
Rod Taylor [EMAIL PROTECTED] writes:
 Well.. the second one will be much slower when the foreign keys verify.
 Primary, unique constraints I'll buy in the create statement. Check
 constraints and defaults are a little fuzzier.

FK, primary, and unique constraints are already split out from the
CREATE TABLE for performance reasons.  We could think about folding them
back in in a schema-only dump, but in a full dump I don't think it's
negotiable --- you really want to load the table data before you install
these constraints.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-23 Thread Andreas Pflug
Tom Lane wrote:

Rod Taylor [EMAIL PROTECTED] writes:
 

Well.. the second one will be much slower when the foreign keys verify.
Primary, unique constraints I'll buy in the create statement. Check
constraints and defaults are a little fuzzier.
   

FK, primary, and unique constraints are already split out from the
CREATE TABLE for performance reasons.  We could think about folding them
back in in a schema-only dump, but in a full dump I don't think it's
negotiable --- you really want to load the table data before you install
these constraints.
 

IMHO here we have opposite requirements: The dump/restore process should 
run as fast as possible, so constraints have to be generated separately, 
but pg_dump is also used to reengineer and slightly modify the schema, 
where it's helpful if definitions are grouped. I've been asked for 
complete schema extraction features in pgAdmin3, and I replied 'use 
pg_dump'. It seems that pg_dump tries to serve both requirements, being 
a compromise where two dedicated tools could do it better.

Regards,
Andreas


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-23 Thread ow
--- Tom Lane [EMAIL PROTECTED] wrote:
 FK, primary, and unique constraints are already split out from the
 CREATE TABLE for performance reasons.  We could think about folding them
 back in in a schema-only dump, but in a full dump I don't think it's
 negotiable --- you really want to load the table data before you install
 these constraints.

IMHO, not only data need to loaded before FK constraints are created but also
there has got to be a feature to allow creation of an FK constraint WITHOUT
doing the verification that all loaded/existing records satisfy the FK
constraint. The ability to create a FK constraint without verification of
existing records should exist by itself (maybe only to superuser) and also as
an option in pg_restore. 

More details:
http://archives.postgresql.org/pgsql-admin/2003-11/msg00308.php
http://archives.postgresql.org/pgsql-admin/2003-11/msg00323.php


Thanks







__
Do you Yahoo!?
Free Pop-Up Blocker - Get it now
http://companion.yahoo.com/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-22 Thread Andreas Pflug
Christopher Kings-Lynne wrote:



There are two levels (sort of) of dependency.  The first is that whole 
classes of objects can be dependent on whole other classes.  eg. 
databases depend on users, or ALL FK's can be dumped after ALL tables, 
etc..  It would make the dump more readable if you dumped those 
definite dependencies in that order, rather than shuffling everything up.

I agree that dumping should be done class-wise (Tables, Functions, 
Views) whenever possible, but I don't agree on FKs dumped separately 
from the table. IMHO indexes and constraints belong to the table, and 
modifying the dump will be hard if a table's code is scattered all around.

Regards,
Andreas


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-22 Thread Stephan Szabo
On Sat, 22 Nov 2003, Andreas Pflug wrote:

 Christopher Kings-Lynne wrote:

 
 
  There are two levels (sort of) of dependency.  The first is that whole
  classes of objects can be dependent on whole other classes.  eg.
  databases depend on users, or ALL FK's can be dumped after ALL tables,
  etc..  It would make the dump more readable if you dumped those
  definite dependencies in that order, rather than shuffling everything up.
 
 I agree that dumping should be done class-wise (Tables, Functions,
 Views) whenever possible, but I don't agree on FKs dumped separately
 from the table. IMHO indexes and constraints belong to the table, and
 modifying the dump will be hard if a table's code is scattered all around.

You're going to potentially have the constraints scattered in any case due
to circular dependency chains. I'd think that having all the constraints
in one place would be easier than trying to go through the list of tables
that might be in a circular chain in order to find the constraints.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-22 Thread Andreas Pflug
Stephan Szabo wrote:

You're going to potentially have the constraints scattered in any case due
to circular dependency chains. I'd think that having all the constraints
in one place would be easier than trying to go through the list of tables
that might be in a circular chain in order to find the constraints.
 

I still disagree. cyclic dependencies should be avoided anyhow. You'll 
get an awful lot of trouble loading data in such a case. Some database 
systems refuse to create such stuff right away (mssql).

Regards,
Andreas


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-22 Thread Rod Taylor
On Sat, 2003-11-22 at 16:53, Andreas Pflug wrote:
 Stephan Szabo wrote:
 
 You're going to potentially have the constraints scattered in any case due
 to circular dependency chains. I'd think that having all the constraints
 in one place would be easier than trying to go through the list of tables
 that might be in a circular chain in order to find the constraints.
   
 
 I still disagree. cyclic dependencies should be avoided anyhow. You'll 
 get an awful lot of trouble loading data in such a case. Some database 
 systems refuse to create such stuff right away (mssql).

CREATE TABLE a (col integer primary key);
CREATE TABLE b (col integer primary key);
ALTER TABLE a ADD FOREIGN KEY (col) REFERENCES b INITIALLY DEFERRED;
ALTER TABLE b ADD FOREIGN KEY (col) REFERENCES a;

How does MSSQL deal with the above?


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-22 Thread Andreas Pflug
Rod Taylor wrote:



CREATE TABLE a (col integer primary key);
CREATE TABLE b (col integer primary key);
ALTER TABLE a ADD FOREIGN KEY (col) REFERENCES b INITIALLY DEFERRED;
ALTER TABLE b ADD FOREIGN KEY (col) REFERENCES a;
How does MSSQL deal with the above?#
 

It depends. Restricting FKs are generated silently, while ON DELETE 
CASCADE will throw a message and refuse to create. MSSQL doesn't know 
about deferred FKs; no chance to enter spuriously inconsistent data.
Still, using cyclic references is IMHO bad design style. I can't accept 
an exceptional case as reason to break *all* table's definition into 
pieces. The CREATE TABLE syntax shows that I'm probably not the only one 
thinking like this: it may include all constraint definitions as well.

There might be discussions whether its better to script
CREATE TABLE xxx ..;
ALTER TABLE xxx ADD PRIMARY KEY ;
ALTER TABLE xxx ADD FOREIGN KEY ;
or
CREATE TABLE xxx (, PRIMARY KEY (..), FOREIGN KEY (..));
I'd opt for the second version (a little formatted, maybe :-)

Regards,
Andreas
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-22 Thread Rod Taylor

 CREATE TABLE a (col integer primary key);
 CREATE TABLE b (col integer primary key);
 ALTER TABLE a ADD FOREIGN KEY (col) REFERENCES b INITIALLY DEFERRED;
 ALTER TABLE b ADD FOREIGN KEY (col) REFERENCES a;

 Still, using cyclic references is IMHO bad design style. I can't accept 

They're extremely useful when you have normalized data and a very
expensive (but repeatable) process whose results you want to cache for
performance reasons. It enforces that original data and cache are both
added in a single transaction.

But you're right. Normally they're a bad idea.

 an exceptional case as reason to break *all* table's definition into 
 pieces. The CREATE TABLE syntax shows that I'm probably not the only one 
 thinking like this: it may include all constraint definitions as well.
 
 There might be discussions whether its better to script
 CREATE TABLE xxx ..;
 ALTER TABLE xxx ADD PRIMARY KEY ;
 ALTER TABLE xxx ADD FOREIGN KEY ;
 or
 CREATE TABLE xxx (, PRIMARY KEY (..), FOREIGN KEY (..));
 
 I'd opt for the second version (a little formatted, maybe :-)

Well.. the second one will be much slower when the foreign keys verify.
Primary, unique constraints I'll buy in the create statement. Check
constraints and defaults are a little fuzzier.

Logic will be required to pull them out in the event they call functions
which depends on the table or we enable subqueries (assertion like
constraints) in them.





---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-21 Thread Tom Lane
I'm thinking about attacking pg_dump's lack of knowledge about using
dependencies to determine a safe dump order.  But if there's someone
out there actively working on the problem, I don't want to tread on
your toes ... anyone?

Also, if you've got uncommitted patches for pg_dump, please let me know.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Anyone working on pg_dump dependency ordering?

2003-11-21 Thread Christopher Kings-Lynne
I'm thinking about attacking pg_dump's lack of knowledge about using
dependencies to determine a safe dump order.  But if there's someone
out there actively working on the problem, I don't want to tread on
your toes ... anyone?
I've done a whole lot of _thinking_, but basically no _doing_, so go 
right ahead :)

I may as well let you know my thoughts:

There are two levels (sort of) of dependency.  The first is that whole 
classes of objects can be dependent on whole other classes.  eg. 
databases depend on users, or ALL FK's can be dumped after ALL tables, 
etc..  It would make the dump more readable if you dumped those definite 
dependencies in that order, rather than shuffling everything up.

The second level of dependency is when a bunch of object types can 
depend on each other.  The current solution for that is to sort by OID, 
but this fails when it is possible to add a dependency to an object 
after it has been created.

eg:

- Adding a column (with a type) to a table
- All the CREATE OR REPLACE commands
- etc.
Hence, a full db wide topological sort might not be necessary.

Lastly, I presume it's possible to create a system of circular 
dependencies (eg create or replace view), which really cannot be solved 
without a system of 'shells', similar to that needed to dump types and 
their i/o functions.

Views seem to be by far the nastiest object.  They can be dependent on 
almost everything in the database.

Also, if you've got uncommitted patches for pg_dump, please let me know.
Yes, my 'COMMENT ON' mega patch in the queue contains dumping of 
conversions and comments on a bunch of objects.

BTW, if you commit that patch - you might want to change my comment on 
type patch to put the  around any, and change the results file 
appropriately.  I noticed I accidentally included that in the patch, and 
was about to mention it.

CHris



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]