Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Rod Taylor

On Tue, 2002-08-13 at 07:49, Oleg Bartunov wrote:
> We have a patch for pg_dump which adds the possibility to dump
> a part of a table, for example:
>
> dump the Top.Science.Astronomy hierarchy from the dmoz catalog:
>
> pg_dump -d -t dmoz -w "select * from dmoz where path <@ 'Top.Science.Astronomy'" dmoz
>
> We have found it very useful. We'd like to extend it to work with COPY
> as well, but COPY has no support for SELECT (of course it would work only
> for the special case where ALL columns are retrieved).
>
> The question is: is it worth submitting the patch for pg_dump and looking
> into the COPY code?

I've been asked by co-workers for information on how to do this type of
thing.  They do partial table dumps to a development system for the
purpose of finding / eliminating bugs.

That said, it may be smart to make the 'select * from table' part
yourself, and let the user supply a where clause.
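Rod's suggestion amounts to pg_dump owning the SELECT * FROM part and accepting only a user-supplied WHERE clause. A minimal sketch, in Python rather than pg_dump's C for brevity (the function name is invented for illustration):

```python
def build_dump_query(table, where_clause=None, only=True):
    """Build the dump cursor query the way Rod suggests: pg_dump
    constructs the 'select * from table' part itself, and the user
    supplies only an optional WHERE clause."""
    # Double-quote the table name, doubling embedded quotes
    # (roughly what pg_dump's fmtId() does).
    quoted = '"%s"' % table.replace('"', '""')
    query = 'SELECT * FROM %s%s' % ('ONLY ' if only else '', quoted)
    if where_clause:
        query += ' WHERE %s' % where_clause
    return query

print(build_dump_query('dmoz', "path <@ 'Top.Science.Astronomy'"))
# → SELECT * FROM ONLY "dmoz" WHERE path <@ 'Top.Science.Astronomy'
```

This sidesteps the mismatch Tom complains about below: the user can no longer select from a table other than the one being dumped.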



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Oleg Bartunov

On 13 Aug 2002, Rod Taylor wrote:

> On Tue, 2002-08-13 at 07:49, Oleg Bartunov wrote:
> > We have a patch for pg_dump which adds the possibility to dump
> > a part of a table, for example:
> >
> > dump the Top.Science.Astronomy hierarchy from the dmoz catalog:
> >
> > pg_dump -d -t dmoz -w "select * from dmoz where path <@ 'Top.Science.Astronomy'" dmoz
> >
> > We have found it very useful. We'd like to extend it to work with COPY
> > as well, but COPY has no support for SELECT (of course it would work only
> > for the special case where ALL columns are retrieved).
> >
> > The question is: is it worth submitting the patch for pg_dump and looking
> > into the COPY code?
>
> I've been asked by co-workers for information on how to do this type of
> thing.  They do partial table dumps to a development system for the
> purpose of finding / eliminating bugs.
>
> That said, it may be smart to make the 'select * from table' part
> yourself, and let the user supply a where clause.


Find the patch (against 7.2) in the attachment. Note that it works with the -d (insert mode) option.





Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


--- pg_dump.c.old   Fri Aug  9 19:41:08 2002
+++ pg_dump.c   Fri Aug  9 19:28:34 2002
@@ -128,6 +128,7 @@
 	TableInfo  *tblinfo;
 	int			tblidx;
 	bool		oids;
+	const char *select_command;
 } DumpContext;
 
 static void
@@ -390,6 +391,7 @@
 {
 	const DumpContext *dctx = (DumpContext *) dctxv;
 	const char *classname = dctx->tblinfo[dctx->tblidx].relname;
+	const char *select_command = dctx->select_command;
 
 	PGresult   *res;
 	PQExpBuffer q = createPQExpBuffer();
@@ -397,9 +399,23 @@
 	int			field;
 
 	if (fout->remoteVersion >= 70100)
-		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM ONLY %s", fmtId(classname, force_quotes));
+		if (select_command)
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR %s", select_command);
+		}
+		else
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM ONLY %s", fmtId(classname, force_quotes));
+		}
 	else
-		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM %s", fmtId(classname, force_quotes));
+		if (select_command)
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR %s", select_command);
+		}
+		else
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM %s", fmtId(classname, force_quotes));
+		}
 
 	res = PQexec(g_conn, q->data);
 	if (!res ||
@@ -547,7 +563,7 @@
  */
 static void
 dumpClasses(const TableInfo *tblinfo, const int numTables, Archive *fout,
-			const char *onlytable, const bool oids, const bool force_quotes)
+			const char *onlytable, const bool oids, const bool force_quotes, const char *select_command)
 {
 	int			i;
 	DataDumperPtr dumpFn;
@@ -587,6 +603,7 @@
 		dumpCtx->tblinfo = (TableInfo *) tblinfo;
 		dumpCtx->tblidx = i;
 		dumpCtx->oids = oids;
+		dumpCtx->select_command = NULL;
 
 		if (!dumpData)
 		{
@@ -602,6 +619,10 @@
 			/* Restore using INSERT */
 			dumpFn = dumpClasses_dumpData;
 			copyStmt = NULL;
+			if (onlytable && (strcmp(classname, onlytable) == 0))
+			{
+				dumpCtx->select_command = select_command;
+			}
 		}
 
 		ArchiveEntry(fout, tblinfo[i].oid, tblinfo[i].relname,
@@ -648,6 +669,7 @@
 	const char *pghost = NULL;
 	const char *pgport = NULL;
 	const char *username = NULL;
+	const char *select_command = NULL;
 	char	   *tablename = NULL;
 	bool		oids = false;
 	TableInfo  *tblinfo;
@@ -694,6 +716,7 @@
 		{"no-privileges", no_argument, NULL, 'x'},
 		{"no-acl", no_argument, NULL, 'x'},
 		{"compress",

Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Tom Lane

This patch seems extremely messy to me.  Unless I'm missing something,
-w just plain fails except when you are dumping a specific table (ie,
-t must be given as well).  And heaven help you if you specify a
different table in -t than the one -w is selecting from.  This isn't
well thought out.

I'm not at all convinced that such a thing belongs in pg_dump anyway.
It'd be more useful as a manually-invokable feature, I think.  You
can almost do this in psql with
	select * from table where something
	\g outfile
but I don't think you can get psql to emit the data in a form that can
be reloaded reliably (it won't quote data characters that look the same
as column delimiters, for instance).
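The quoting problem Tom describes can be illustrated outside PostgreSQL entirely (Python, not pg code): a naive delimiter join is ambiguous as soon as the data contains the delimiter, which is exactly what COPY's escape rules prevent.

```python
import csv
import io

rows = [["1", "contains,comma"], ["2", "plain"]]

# Naive join: the embedded comma is indistinguishable from a column
# delimiter, so on reload the first row appears to have three columns.
naive = "\n".join(",".join(r) for r in rows)
assert naive.splitlines()[0].count(",") == 2  # two delimiters where one belongs

# Proper quoting (the role COPY's escaping plays) round-trips cleanly.
buf = io.StringIO()
csv.writer(buf).writerows(rows)
reloaded = list(csv.reader(io.StringIO(buf.getvalue())))
assert reloaded == rows
```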

What would seem to make sense is adding a WHERE-clause option to
COPY TO, and then you could go
	COPY table TO 'myfile' WHERE ...
We already have column-list support in COPY, so we can already slice the
table vertically --- WHERE would let you slice it horizontally, which
seems a natural extension.  (BTW, has anyone taught psql's \copy about
column lists?  AFAIR the original patch was only against the backend.)

I'm finding it hard to visualize situations where I'd want the extra
baggage of pg_dump for something like this.  If I want the schema at
all, I'll probably want it separate from the data so that I can hack
the schema conveniently --- so I'd want to do a pg_dump -s -t table
and then do the selective copying separately.

regards, tom lane




Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Oleg Bartunov

On Tue, 13 Aug 2002, Tom Lane wrote:

> This patch seems extremely messy to me.  Unless I'm missing something,
> -w just plain fails except when you are dumping a specific table (ie,
> -t must be given as well).  And heaven help you if you specify a
> different table in -t than the one -w is selecting from.  This isn't
> well thought out.

You're right. The patch I sent was in no way ready for submission!
I only asked about the feature and attached the patch to the message for
Rod Taylor.

> I'm not at all convinced that such a thing belongs in pg_dump anyway.
> It'd be more useful as a manually-invokable feature, I think.  You
> can almost do this in psql with
> 	select * from table where something
> 	\g outfile
> but I don't think you can get psql to emit the data in a form that can
> be reloaded reliably (it won't quote data characters that look the same
> as column delimiters, for instance).

That was the reason we didn't use psql for dumping.

> What would seem to make sense is adding a WHERE-clause option to
> COPY TO, and then you could go
> 	COPY table TO 'myfile' WHERE ...
> We already have column-list support in COPY, so we can already slice the
> table vertically --- WHERE would let you slice it horizontally, which
> seems a natural extension.  (BTW, has anyone taught psql's \copy about
> column lists?  AFAIR the original patch was only against the backend.)
>
> I'm finding it hard to visualize situations where I'd want the extra
> baggage of pg_dump for something like this.  If I want the schema at
> all, I'll probably want it separate from the data so that I can hack
> the schema conveniently --- so I'd want to do a pg_dump -s -t table
> and then do the selective copying separately.

That'd be nice! I often need such a feature because the DBs at work are
often too large to play with at home :-)

> 			regards, tom lane


Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83





Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Hannu Krosing

On Tue, 2002-08-13 at 20:24, Tom Lane wrote:
>
> What would seem to make sense is adding a WHERE-clause option to
> COPY TO, and then you could go
> 	COPY table TO 'myfile' WHERE ...

What about:

	COPY table TO 'myfile' [ WHERE cond ] [ AS INSERT [ WITH COLUMN ] ];

to get the data as INSERT INTO statements (pg_dump -d), with an optional
column list (pg_dump -D)?
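What the proposed AS INSERT output would look like can be sketched as follows (Python, purely illustrative of the pg_dump -d / -D distinction Hannu cites; the function name is made up, and every value is quoted as a string for simplicity, which real pg_dump does not do for numbers):

```python
def as_insert(table, columns, row, with_columns=False):
    """Render one row as an INSERT statement, with an optional column
    list -- mimicking pg_dump -d (no columns) vs -D (with columns)."""
    # Quote each value, doubling embedded single quotes; NULL stays bare.
    vals = ", ".join("NULL" if v is None else "'%s'" % str(v).replace("'", "''")
                     for v in row)
    cols = " (%s)" % ", ".join(columns) if with_columns else ""
    return "INSERT INTO %s%s VALUES (%s);" % (table, cols, vals)

print(as_insert("mystuff", ["id", "name"], [1, "O'Neil"], with_columns=True))
# → INSERT INTO mystuff (id, name) VALUES ('1', 'O''Neil');
```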


Hannu





Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Bruce Momjian


Actually, loading all this stuff into COPY is not the way to go, I
think.

Informix had:

	UNLOAD TO 'filename'
	SELECT ...

I have to admit, this is a superior way to do things compared to what we
have.  Is it possible for us to get:

	COPY TO 'filename'
	SELECT ...

It allows any arbitrary table, GROUP BY, even ORDER BY combination.


---

Hannu Krosing wrote:
> On Tue, 2002-08-13 at 20:24, Tom Lane wrote:
> >
> > What would seem to make sense is adding a WHERE-clause option to
> > COPY TO, and then you could go
> > 	COPY table TO 'myfile' WHERE ...
>
> What about:
>
> 	COPY table TO 'myfile' [ WHERE cond ] [ AS INSERT [ WITH COLUMN ] ];
>
> to get the data as INSERT INTO statements (pg_dump -d), with an optional
> column list (pg_dump -D)?
>
> Hannu

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Hannu Krosing

On Wed, 2002-08-14 at 00:03, Bruce Momjian wrote:
>
> Actually, loading all this stuff into COPY is not the way to go, I
> think.
>
> Informix had:
>
> 	UNLOAD TO 'filename'
> 	SELECT ...
>
> I have to admit, this is a superior way to do things compared to what we
> have.  Is it possible for us to get:
>
> 	COPY TO 'filename'
> 	SELECT ...
>
> It allows any arbitrary table, GROUP BY, even ORDER BY combination.

It would be more in line with the rest of the system to just allow a
subselect as the 'table':

	COPY (select id, name from mystuff where id > 10) over10stuff TO stdout;

-
Hannu





Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Bruce Momjian

Hannu Krosing wrote:
> On Wed, 2002-08-14 at 00:03, Bruce Momjian wrote:
> >
> > Actually, loading all this stuff into COPY is not the way to go, I
> > think.
> >
> > Informix had:
> >
> > 	UNLOAD TO 'filename'
> > 	SELECT ...
> >
> > I have to admit, this is a superior way to do things compared to what we
> > have.  Is it possible for us to get:
> >
> > 	COPY TO 'filename'
> > 	SELECT ...
> >
> > It allows any arbitrary table, GROUP BY, even ORDER BY combination.
>
> It would be more in line with the rest of the system to just allow a
> subselect as the 'table':
>
> 	COPY (select id, name from mystuff where id > 10) over10stuff TO stdout;

Yep, that would work too.  Clearly, we should shoot for something that
leverages the existing SELECT code rather than hang more clauses off of
COPY.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Tom Lane

Bruce Momjian <[EMAIL PROTECTED]> writes:
> Clearly, we should shoot for something that leverages the existing
> SELECT code rather than hang more clauses off of COPY.

Yeah, that's a good point.  COPY IN is still a special case, I think,
but it seems like COPY OUT could be reimplemented as a special tuple
destination for the regular executor machinery.

Q: how much of a performance hit would we be taking?  If this slows down
pg_dump a lot, the extra code is worth keeping.

regards, tom lane




Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Christopher Kings-Lynne

> I'm finding it hard to visualize situations where I'd want the extra
> baggage of pg_dump for something like this.  If I want the schema at
> all, I'll probably want it separate from the data so that I can hack
> the schema conveniently --- so I'd want to do a pg_dump -s -t table
> and then do the selective copying separately.

Tell you what I'd kill for:

	pg_dump -t users_\* db > users.sql

Sort of thing.

Chris





Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Tom Lane

Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> Tell you what I'd kill for:
> 	pg_dump -t users_\* db > users.sql

Yeah, I was just about to start working on pg_dump's -t switch.
It absolutely *must* be made schema-aware for 7.3 (-t schema.table
doesn't work today) and it seems like we might as well try to allow
wildcard support.  Another thought I had was to accept multiple -t
switches, eg pg_dump -t foo -t bar -t schema.baz* to dump all tables
matching any of the patterns.
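The multiple-pattern matching Tom describes can be sketched with shell-style wildcards (Python's fnmatch here, illustrative only; it is not pg_dump's eventual implementation):

```python
from fnmatch import fnmatchcase

def tables_to_dump(tables, patterns):
    """Select a table for dumping if it matches ANY of the -t patterns,
    mirroring 'pg_dump -t foo -t bar -t schema.baz*'."""
    return [t for t in tables if any(fnmatchcase(t, p) for p in patterns)]

tables = ["users_active", "users_old", "orders", "schema.baz_log"]
print(tables_to_dump(tables, ["users_*", "schema.baz*"]))
# → ['users_active', 'users_old', 'schema.baz_log']
```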

regards, tom lane




Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Christopher Kings-Lynne

> Yeah, I was just about to start working on pg_dump's -t switch.
> It absolutely *must* be made schema-aware for 7.3 (-t schema.table
> doesn't work today) and it seems like we might as well try to allow
> wildcard support.  Another thought I had was to accept multiple -t
> switches, eg pg_dump -t foo -t bar -t schema.baz* to dump all tables
> matching any of the patterns.

What about a switch to set the schema search path as well?  You could
also just have the tables comma-separated in the -t?

Chris





Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Tom Lane

Scott Shattuck <[EMAIL PROTECTED]> writes:
> I'd also kill for pg_restore --ignore-existing-objects so I could
> run the darn thing against a database that's already got pl/pgsql
> installed in template1 and the dump file wants to install it again etc.

In general, I think it's a mistake for pg_restore to bail out on errors.
The underlying pg_dump scripts have been built and optimized on the
assumption that psql would keep plugging after seeing an error.  For
example, scripts containing \connect - foo still work if there's no
foo user ... but only because psql doesn't go belly-up.  pg_restore
is way less forgiving.

I think the ideal behavior for pg_restore would be to abandon work on
the current dump item upon seeing a SQL error, but to pick up with the
next one.  (Of course we could have an --anal-retentive switch to bail
on first error, but I doubt it'd be used much.)  Errors associated with
ownership switches shouldn't cause failure in any case.
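The behavior Tom sketches — abandon the failing item, pick up with the next, with an optional bail-on-first-error switch — is essentially this loop (illustrative Python; the names and the fake executor are invented, not pg_restore code):

```python
def restore_items(items, execute, exit_on_error=False):
    """Run each dump item in order; on a SQL error, skip to the next
    item instead of aborting the whole restore, unless exit_on_error
    (the hypothetical bail-on-first-error switch) is set."""
    errors = []
    for name, sql in items:
        try:
            execute(sql)
        except Exception as exc:
            if exit_on_error:
                raise
            errors.append((name, str(exc)))  # log it and keep going
    return errors

def fake_execute(sql):
    """Stand-in backend that rejects re-installing an existing language."""
    if "plpgsql" in sql:
        raise RuntimeError("language already exists")

errs = restore_items([("lang", "CREATE LANGUAGE plpgsql"),
                      ("tbl", "CREATE TABLE t ()")], fake_execute)
print(errs)  # one recorded error; the rest of the restore still ran
```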

Anybody care to submit patches to make this happen?

regards, tom lane




Re: [HACKERS] db partial dumping with pg_dump

2002-08-13 Thread Tom Lane

Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
> What about a switch to set schema search path as well?

I'm leaning to the thought that this is a bad idea, and that pg_dump
should act in a way that's search-path-independent.  But I need to
think more.

> You could also just have the tables comma separated in the -t ??

Don't like it --- that makes comma a special character in pg_dump
tablename patterns, which is an exception we don't need to make.
The multiple-dash-t approach needs no such assumption.

regards, tom lane
