Re: [HACKERS] Anyone working on pg_dump dependency ordering?
[EMAIL PROTECTED] (Christopher Kings-Lynne) wrote in message news:[EMAIL PROTECTED]... Lastly, I presume it's possible to create a system of circular dependencies (eg create or replace view), which really cannot be solved without a system of 'shells', similar to that needed to dump types and their i/o functions. Views seem to be by far the nastiest object. They can be dependent on almost everything in the database. Hello Group, It might be an idea to keep track of all data-definition changes during the lifetime of a database. Keep all the SQL-definition-commands in a seperate systemtable somewhere. Then, when the schema is dumped you'd have the choice of dumping: - the most recent schema with database-determined ordering (as is the case in the current situation) - replay the recorded datadefinition of the past, so you know the data definition is executed in a sound sequence. Of course, data itself would be extracted seperate of the definitions and there would also be the need to remove constrains while the data is being loaded. This last thing could be done by parsing the recorded data-definition-commands or by first blindly running the recorded commands, then removing (or disabling) any constraints that are there and finally replacing (or re-enabling) the constraints. Negative side of this idea is obviously that you need to make changes to a lot of internals of the postgresql database. AND you need to make perfectly sure that the state of the database is always consistent with the state of the recorded SQL. Good Luck, Arian. ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
Rod Taylor [EMAIL PROTECTED] writes: Well.. the second one will be much slower when the foreign keys verify. Primary, unique constraints I'll buy in the create statement. Check constraints and defaults are a little fuzzier. FK, primary, and unique constraints are already split out from the CREATE TABLE for performance reasons. We could think about folding them back in in a schema-only dump, but in a full dump I don't think it's negotiable --- you really want to load the table data before you install these constraints. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
Tom Lane wrote: Rod Taylor [EMAIL PROTECTED] writes: Well.. the second one will be much slower when the foreign keys verify. Primary, unique constraints I'll buy in the create statement. Check constraints and defaults are a little fuzzier. FK, primary, and unique constraints are already split out from the CREATE TABLE for performance reasons. We could think about folding them back in in a schema-only dump, but in a full dump I don't think it's negotiable --- you really want to load the table data before you install these constraints. IMHO here we have opposite requirements: The dump/restore process should run as fast as possible, so constraints have to be generated separately, but pg_dump is also used to reengineer and slightly modify the schema, where it's helpful if definitions are grouped. I've been asked for complete schema extraction features in pgAdmin3, and I replied 'use pg_dump'. It seems that pg_dump tries to serve both requirements, being a compromise where two dedicated tools could do it better. Regards, Andreas ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
--- Tom Lane [EMAIL PROTECTED] wrote: FK, primary, and unique constraints are already split out from the CREATE TABLE for performance reasons. We could think about folding them back in in a schema-only dump, but in a full dump I don't think it's negotiable --- you really want to load the table data before you install these constraints. IMHO, not only data need to loaded before FK constraints are created but also there has got to be a feature to allow creation of an FK constraint WITHOUT doing the verification that all loaded/existing records satisfy the FK constraint. The ability to create a FK constraint without verification of existing records should exist by itself (maybe only to superuser) and also as an option in pg_restore. More details: http://archives.postgresql.org/pgsql-admin/2003-11/msg00308.php http://archives.postgresql.org/pgsql-admin/2003-11/msg00323.php Thanks __ Do you Yahoo!? Free Pop-Up Blocker - Get it now http://companion.yahoo.com/ ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
Christopher Kings-Lynne wrote: There are two levels (sort of) of dependency. The first is that whole classes of objects can be dependent on whole other classes. eg. databases depend on users, or ALL FK's can be dumped after ALL tables, etc.. It would make the dump more readable if you dumped those definite dependencies in that order, rather than shuffling everything up. I agree that dumping should be done class-wise (Tables, Functions, Views) whenever possible, but I don't agree on FKs dumped separately from the table. IMHO indexes and constraints belong to the table, and modifying the dump will be hard if a table's code is scattered all around. Regards, Andreas ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
On Sat, 22 Nov 2003, Andreas Pflug wrote: Christopher Kings-Lynne wrote: There are two levels (sort of) of dependency. The first is that whole classes of objects can be dependent on whole other classes. eg. databases depend on users, or ALL FK's can be dumped after ALL tables, etc.. It would make the dump more readable if you dumped those definite dependencies in that order, rather than shuffling everything up. I agree that dumping should be done class-wise (Tables, Functions, Views) whenever possible, but I don't agree on FKs dumped separately from the table. IMHO indexes and constraints belong to the table, and modifying the dump will be hard if a table's code is scattered all around. You're going to potentially have the constraints scattered in any case due to circular dependency chains. I'd think that having all the constraints in one place would be easier than trying to go through the list of tables that might be in a circular chain in order to find the constraints. ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
Stephan Szabo wrote: You're going to potentially have the constraints scattered in any case due to circular dependency chains. I'd think that having all the constraints in one place would be easier than trying to go through the list of tables that might be in a circular chain in order to find the constraints. I still disagree. cyclic dependencies should be avoided anyhow. You'll get an awful lot of trouble loading data in such a case. Some database systems refuse to create such stuff right away (mssql). Regards, Andreas ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
On Sat, 2003-11-22 at 16:53, Andreas Pflug wrote: Stephan Szabo wrote: You're going to potentially have the constraints scattered in any case due to circular dependency chains. I'd think that having all the constraints in one place would be easier than trying to go through the list of tables that might be in a circular chain in order to find the constraints. I still disagree. cyclic dependencies should be avoided anyhow. You'll get an awful lot of trouble loading data in such a case. Some database systems refuse to create such stuff right away (mssql). CREATE TABLE a (col integer primary key); CREATE TABLE b (col integer primary key); ALTER TABLE a ADD FOREIGN KEY (col) REFERENCES b INITIALLY DEFERRED; ALTER TABLE b ADD FOREIGN KEY (col) REFERENCES a; How does MSSQL deal with the above? ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
Rod Taylor wrote: CREATE TABLE a (col integer primary key); CREATE TABLE b (col integer primary key); ALTER TABLE a ADD FOREIGN KEY (col) REFERENCES b INITIALLY DEFERRED; ALTER TABLE b ADD FOREIGN KEY (col) REFERENCES a; How does MSSQL deal with the above?# It depends. Restricting FKs are generated silently, while ON DELETE CASCADE will throw a message and refuse to create. MSSQL doesn't know about deferred FKs; no chance to enter spuriously inconsistent data. Still, using cyclic references is IMHO bad design style. I can't accept an exceptional case as reason to break *all* table's definition into pieces. The CREATE TABLE syntax shows that I'm probably not the only one thinking like this: it may include all constraint definitions as well. There might be discussions whether its better to script CREATE TABLE xxx ..; ALTER TABLE xxx ADD PRIMARY KEY ; ALTER TABLE xxx ADD FOREIGN KEY ; or CREATE TABLE xxx (, PRIMARY KEY (..), FOREIGN KEY (..)); I'd opt for the second version (a little formatted, maybe :-) Regards, Andreas ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
CREATE TABLE a (col integer primary key); CREATE TABLE b (col integer primary key); ALTER TABLE a ADD FOREIGN KEY (col) REFERENCES b INITIALLY DEFERRED; ALTER TABLE b ADD FOREIGN KEY (col) REFERENCES a; Still, using cyclic references is IMHO bad design style. I can't accept They're extremely useful when you have normalized data and a very expensive (but repeatable) process whose results you want to cache for performance reasons. It enforces that original data and cache are both added in a single transaction. But you're right. Normally they're a bad idea. an exceptional case as reason to break *all* table's definition into pieces. The CREATE TABLE syntax shows that I'm probably not the only one thinking like this: it may include all constraint definitions as well. There might be discussions whether its better to script CREATE TABLE xxx ..; ALTER TABLE xxx ADD PRIMARY KEY ; ALTER TABLE xxx ADD FOREIGN KEY ; or CREATE TABLE xxx (, PRIMARY KEY (..), FOREIGN KEY (..)); I'd opt for the second version (a little formatted, maybe :-) Well.. the second one will be much slower when the foreign keys verify. Primary, unique constraints I'll buy in the create statement. Check constraints and defaults are a little fuzzier. Logic will be required to pull them out in the event they call functions which depends on the table or we enable subqueries (assertion like constraints) in them. ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Anyone working on pg_dump dependency ordering?
I'm thinking about attacking pg_dump's lack of knowledge about using dependencies to determine a safe dump order. But if there's someone out there actively working on the problem, I don't want to tread on your toes ... anyone? Also, if you've got uncommitted patches for pg_dump, please let me know. regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Anyone working on pg_dump dependency ordering?
I'm thinking about attacking pg_dump's lack of knowledge about using dependencies to determine a safe dump order. But if there's someone out there actively working on the problem, I don't want to tread on your toes ... anyone? I've done a whole lot of _thinking_, but basically no _doing_, so go right ahead :) I may as well let you know my thoughts: There are two levels (sort of) of dependency. The first is that whole classes of objects can be dependent on whole other classes. eg. databases depend on users, or ALL FK's can be dumped after ALL tables, etc.. It would make the dump more readable if you dumped those definite dependencies in that order, rather than shuffling everything up. The second level of dependency is when a bunch of object types can depend on each other. The current solution for that is to sort by OID, but this fails when it is possible to add a dependency to an object after it has been created. eg: - Adding a column (with a type) to a table - All the CREATE OR REPLACE commands - etc. Hence, a full db wide topological sort might not be necessary. Lastly, I presume it's possible to create a system of circular dependencies (eg create or replace view), which really cannot be solved without a system of 'shells', similar to that needed to dump types and their i/o functions. Views seem to be by far the nastiest object. They can be dependent on almost everything in the database. Also, if you've got uncommitted patches for pg_dump, please let me know. Yes, my 'COMMENT ON' mega patch in the queue contains dumping of conversions and comments on a bunch of objects. BTW, if you commit that patch - you might want to change my comment on type patch to put the around any, and change the results file appropriately. I noticed I accidentally included that in the patch, and was about to mention it. CHris ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]