Re: [HACKERS] pg_dump schema breakup

2006-08-19 Thread Naz Gassiep




Tom Lane wrote:

  Andrew Dunstan [EMAIL PROTECTED] writes:
  
  
Well, the other issue is how many canned breakup schemes we are going to 
support. If this particular one is of sufficiently general usefulness 
then I have no objection. But when you can produce it trivially from the 
output of "pg_dump -s", the need to hardcode it hardly seems pressing.

  
  
FWIW, I am in favor of providing a way to break up the dump output like
this, I was merely objecting to the vocabulary ;-).  We have certainly
seen tons of people burnt by the performance problems inherent in
separate-data-and-schema restores, and splitting the dump into three
parts instead of two seems like it would fix that.

But I also like Alvaro's comment that this should be on the restore side
not so much the dump side.  If you do two or three successive pg_dump
runs to make your dump then you run a nontrivial risk of not getting
consistent dumps.  My advice to people would be to do *one* full
"pg_dump -Fc" and then extract three scripts out of that.

The question then is whether it's worth providing the extraction
functionality in a more canned, user-friendly form than "here, hack up
the -L output with this perl script".  I'd vote yes.

			regards, tom lane

I greatly appreciate the comments here and am glad that my initial idea
has support. This thread highlights to me the difference between the
"hey there's a good idea there despite the fact that's he's obviously
not a veteran software developer" culture that the PostgreSQL community
has instead of the "he is obviously not a veteran software developer so
what on Earth could he have to offer us" responses I've had from
various other open source projects.

On a less obsequious note, I agree that pg_dump should be used to dump
everything in a single run to avoid consistency issues, and the
selection of data to be restored should be done with pg_restore. As
this is a feature that I would benefit greatly from, how do I go about
ensuring that this idea finds its way to the appropriate developer and
doesn't get forgotten in the mountain of ideas in the "that'd be nice
to have some day" category?

- Naz




Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Tom Lane
Naz Gassiep [EMAIL PROTECTED] writes:
 I propose that two more be added:
 --tables-only
 --constraints-only

This doesn't seem well-defined at all.  There are many objects in a
database that are definitely neither tables nor constraints, and it's
not very clear what things should be considered constraints either.

I think what you may really be after is the stuff that should be loaded
before inserting data and the stuff that should be loaded after, but
the above are poor names for these concepts.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Naz Gassiep





Tom Lane wrote:

  Naz Gassiep [EMAIL PROTECTED] writes:
  
  
I propose that two more be added:
--tables-only
--constraints-only

  
  
This doesn't seem well-defined at all.  There are many objects in a
database that are definitely neither tables nor constraints, and it's
not very clear what things should be considered constraints either.

I think what you may really be after is "the stuff that should be loaded
before inserting data" and "the stuff that should be loaded after", but
the above are poor names for these concepts.

			regards, tom lane

And here I was thinking that I'd been clear :)

But yes, you are right, what I want is "the stuff that gets loaded
before data insertion" and "the stuff that goes after data insertion"

Essentially, the CREATE TABLE statements are the first part as that is
what is needed for data to be inserted. Everything else is the second
part.




Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Andreas Joseph Krogh
On Friday 18 August 2006 18:52, Tom Lane wrote:
 Naz Gassiep [EMAIL PROTECTED] writes:
  I propose that two more be added:
  --tables-only
  --constraints-only

 This doesn't seem well-defined at all.  There are many objects in a
 database that are definitely neither tables nor constraints, and it's
 not very clear what things should be considered constraints either.

 I think what you may really be after is the stuff that should be loaded
 before inserting data and the stuff that should be loaded after, but
 the above are poor names for these concepts.

But it certainly would be nice to be able to dump all that stuff:-)

-- 
Andreas Joseph Krogh [EMAIL PROTECTED]
Senior Software Developer / Manager
gpg public_key: http://dev.officenet.no/~andreak/public_key.asc
+-+
OfficeNet AS| The most difficult thing in the world is to |
Karenslyst Allé 11  | know how to do a thing and to watch |
PO. Box 529 Skøyen  | somebody else doing it wrong, without   |
0214 Oslo   | comment.|
NORWAY  | |
Mobile: +47 909  56 963 | |
+-+

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Naz Gassiep




Andreas Joseph Krogh wrote:

  On Friday 18 August 2006 18:52, Tom Lane wrote:
  
  
Naz Gassiep [EMAIL PROTECTED] writes:


  I propose that two more be added:
--tables-only
--constraints-only
  

This doesn't seem well-defined at all.  There are many objects in a
database that are definitely neither tables nor constraints, and it's
not very clear what things should be considered constraints either.

I think what you may really be after is "the stuff that should be loaded
before inserting data" and "the stuff that should be loaded after", but
the above are poor names for these concepts.

  
  
But it certainly would be nice to be able to dump all that "stuff":-)

Yea, I've been told that this would not be a high demand feature. So do
I have a second vote? ;-)




Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Joshua D. Drake



I think what you may really be after is the stuff that should be loaded
before inserting data and the stuff that should be loaded after, but
the above are poor names for these concepts.



But it certainly would be nice to be able to dump all that stuff:-)
Yea, I've been told that this would not be a high demand feature. So 
do I have a second vote? ;-)


The ability to have a dump that automatically separated the before-data 
and after-data objects is definitely
useful. The amount of times I have had to dump the schema and data 
separately just so I can modify the
schema before restore, or disable some function that is causing problems 
can not even be counted.


Sincerely,

Joshua D. Drake


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Andrew Dunstan

Joshua D. Drake wrote:


I think what you may really be after is the stuff that should be 
loaded
before inserting data and the stuff that should be loaded after, 
but

the above are poor names for these concepts.



But it certainly would be nice to be able to dump all that stuff:-)
Yea, I've been told that this would not be a high demand feature. So 
do I have a second vote? ;-)


The ability to have a dump that automatically separated the 
before-data and after-data objects is definitely
useful. The amount of times I have had to dump the schema and data 
separately just so I can modify the
schema before restore, or disable some function that is causing 
problems can not even be counted.




We already have a highly selective and configurable restore mechanism, 
using the -L feature of pg_restore. Maybe there's a good special case 
for this particular split, but it is hardly undoable now.


As for Naz' needs - I gave him a perl script I whipped up in few minutes 
to do the split he wanted - and I'm sure you could do the same in python ;-)


cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Naz Gassiep

Andrew Dunstan wrote:
We already have a highly selective and configurable restore mechanism, 
using the -L feature of pg_restore. Maybe there's a good special case 
for this particular split, but it is hardly undoable now.


As for Naz' needs - I gave him a perl script I whipped up in few 
minutes to do the split he wanted - and I'm sure you could do the same 
in python ;-)


cheers

andrew
And very appreciative I am too. While I concede it is doable now, and I 
could do this in Python as well, I do feel that a feature like this 
would have wider use than may be apparent on a first glance.
Furthermore, I think that the simplicity of implementing it makes it 
really a question of why not rather than why.

- Naz

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Alvaro Herrera
Andrew Dunstan wrote:
 Joshua D. Drake wrote:
 
 I think what you may really be after is the stuff that should be 
 loaded
 before inserting data and the stuff that should be loaded after, 
 but
 the above are poor names for these concepts.
 
 
 But it certainly would be nice to be able to dump all that stuff:-)
 Yea, I've been told that this would not be a high demand feature. So 
 do I have a second vote? ;-)
 
 The ability to have a dump that automatically separated the 
 before-data and after-data objects is definitely
 useful. The amount of times I have had to dump the schema and data 
 separately just so I can modify the
 schema before restore, or disable some function that is causing 
 problems can not even be counted.
 
 
 We already have a highly selective and configurable restore mechanism, 
 using the -L feature of pg_restore. Maybe there's a good special case 
 for this particular split, but it is hardly undoable now.
 
 As for Naz' needs - I gave him a perl script I whipped up in few minutes 
 to do the split he wanted - and I'm sure you could do the same in python ;-)

Maybe what we need is a program to process the object list from
pg_restore -L.  Or a mode in pg_restore, from this dump give me all
the sutff to be loaded before inserting data, and the reverse.

I mean, should the problem be attacked while dumping, or while
restoring?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Andrew Dunstan

Alvaro Herrera wrote:

Andrew Dunstan wrote:
  

Joshua D. Drake wrote:

I think what you may really be after is the stuff that should be 
loaded
before inserting data and the stuff that should be loaded after, 
but

the above are poor names for these concepts.
   


But it certainly would be nice to be able to dump all that stuff:-)
  
Yea, I've been told that this would not be a high demand feature. So 
do I have a second vote? ;-)

The ability to have a dump that automatically separated the 
before-data and after-data objects is definitely
useful. The amount of times I have had to dump the schema and data 
separately just so I can modify the
schema before restore, or disable some function that is causing 
problems can not even be counted.


  
We already have a highly selective and configurable restore mechanism, 
using the -L feature of pg_restore. Maybe there's a good special case 
for this particular split, but it is hardly undoable now.


As for Naz' needs - I gave him a perl script I whipped up in few minutes 
to do the split he wanted - and I'm sure you could do the same in python ;-)



Maybe what we need is a program to process the object list from
pg_restore -L.  Or a mode in pg_restore, from this dump give me all
the sutff to be loaded before inserting data, and the reverse.

I mean, should the problem be attacked while dumping, or while
restoring?

  


Well, the other issue is how many canned breakup schemes we are going to 
support. If this particular one is of sufficiently general usefulness 
then I have no objection. But when you can produce it trivially from the 
output of pg_dump -s, the need to hardcode it hardly seems pressing.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] pg_dump schema breakup

2006-08-18 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Well, the other issue is how many canned breakup schemes we are going to 
 support. If this particular one is of sufficiently general usefulness 
 then I have no objection. But when you can produce it trivially from the 
 output of pg_dump -s, the need to hardcode it hardly seems pressing.

FWIW, I am in favor of providing a way to break up the dump output like
this, I was merely objecting to the vocabulary ;-).  We have certainly
seen tons of people burnt by the performance problems inherent in
separate-data-and-schema restores, and splitting the dump into three
parts instead of two seems like it would fix that.

But I also like Alvaro's comment that this should be on the restore side
not so much the dump side.  If you do two or three successive pg_dump
runs to make your dump then you run a nontrivial risk of not getting
consistent dumps.  My advice to people would be to do *one* full
pg_dump -Fc and then extract three scripts out of that.

The question then is whether it's worth providing the extraction
functionality in a more canned, user-friendly form than here, hack up
the -L output with this perl script.  I'd vote yes.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org