Re: [HACKERS] db partial dumping with pg_dump
On Tue, 2002-08-13 at 07:49, Oleg Bartunov wrote:
> We have a patch for pg_dump which adds the possibility to dump part of a
> table, for example: dump the Top.Science.Astronomy hierarchy from the dmoz
> catalog:
>
>	pg_dump -d -t dmoz -w "select * from dmoz where path @ 'Top.Science.Astronomy'" dmoz
>
> We found it very useful. We'd like to extend it to work with COPY as well,
> but COPY has no support for a select (of course it would work only for the
> special case where ALL columns are retrieved). The question is: is it
> worth submitting the patch for pg_dump and looking into the copy code?

I've been asked by co-workers for information on how to do this type of thing. They do partial table dumps to a development system for the purpose of finding / eliminating bugs.

That said, it may be smart to make the 'select * from table' part yourself, and let the user supply a where clause.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] db partial dumping with pg_dump
On 13 Aug 2002, Rod Taylor wrote:
> I've been asked by co-workers for information on how to do this type of
> thing. They do partial table dumps to a development system for the purpose
> of finding / eliminating bugs.
>
> That said, it may be smart to make the 'select * from table' part
> yourself, and let the user supply a where clause.

Find the patch (against 7.2) in the attachment. Note, it works with the -d (insert mode) option.

Regards, Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

--- pg_dump.c.old	Fri Aug  9 19:41:08 2002
+++ pg_dump.c	Fri Aug  9 19:28:34 2002
@@ -128,6 +128,7 @@
 	TableInfo  *tblinfo;
 	int			tblidx;
 	bool		oids;
+	const char *select_command;
 } DumpContext;
 
 static void
@@ -390,6 +391,7 @@
 {
 	const DumpContext *dctx = (DumpContext *) dctxv;
 	const char *classname = dctx->tblinfo[dctx->tblidx].relname;
+	const char *select_command = dctx->select_command;
 	PGresult   *res;
 	PQExpBuffer q = createPQExpBuffer();
@@ -397,9 +399,23 @@
 	int			field;
 
 	if (fout->remoteVersion >= 70100)
-		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM ONLY %s", fmtId(classname, force_quotes));
+		if (select_command)
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR %s",
+							  select_command);
+		}
+		else
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM ONLY %s",
+							  fmtId(classname, force_quotes));
+		}
 	else
-		appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM %s", fmtId(classname, force_quotes));
+		if (select_command)
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR %s",
+							  select_command);
+		}
+		else
+		{
+			appendPQExpBuffer(q, "DECLARE _pg_dump_cursor CURSOR FOR SELECT * FROM %s",
+							  fmtId(classname, force_quotes));
+		}
 
 	res = PQexec(g_conn, q->data);
 	if (!res ||
@@ -547,7 +563,7 @@
  */
 static void
 dumpClasses(const TableInfo *tblinfo, const int numTables, Archive *fout,
-			const char *onlytable, const bool oids, const bool force_quotes)
+			const char *onlytable, const bool oids, const bool force_quotes,
+			const char *select_command)
 {
 	int			i;
 	DataDumperPtr dumpFn;
@@ -587,6 +603,7 @@
 	dumpCtx->tblinfo = (TableInfo *) tblinfo;
 	dumpCtx->tblidx = i;
 	dumpCtx->oids = oids;
+	dumpCtx->select_command = NULL;
 
 	if (!dumpData)
 	{
@@ -602,6 +619,10 @@
 		/* Restore using INSERT */
 		dumpFn = dumpClasses_dumpData;
 		copyStmt = NULL;
+		if (onlytable && (strcmp(classname, onlytable) == 0))
+		{
+			dumpCtx->select_command = select_command;
+		}
 	}
 
 	ArchiveEntry(fout, tblinfo[i].oid, tblinfo[i].relname,
@@ -648,6 +669,7 @@
 	const char *pghost = NULL;
 	const char *pgport = NULL;
 	const char *username = NULL;
+	const char *select_command = NULL;
 	char	   *tablename = NULL;
 	bool		oids = false;
 	TableInfo  *tblinfo;
@@ -694,6 +716,7 @@
 	{"no-privileges", no_argument, NULL, 'x'},
 	{"no-acl", no_argument, NULL, 'x'},
 	{"compress",
Re: [HACKERS] db partial dumping with pg_dump
This patch seems extremely messy to me. Unless I'm missing something, -w just plain fails except when you are dumping a specific table (ie, -t must be given as well). And heaven help you if you specify a different table in -t than the one -w is selecting from. This isn't well thought out.

I'm not at all convinced that such a thing belongs in pg_dump anyway. It'd be more useful as a manually-invokable feature, I think. You can almost do this in psql with

	select * from table where something \g outfile

but I don't think you can get psql to emit the data in a form that can be reloaded reliably (it won't quote data characters that look the same as column delimiters, for instance).

What would seem to make sense is adding a WHERE-clause option to COPY TO, and then you could go

	COPY table TO 'myfile' WHERE ...

We already have column-list support in COPY, so we can already slice the table vertically --- WHERE would let you slice it horizontally, which seems a natural extension. (BTW, has anyone taught psql's \copy about column lists? AFAIR the original patch was only against the backend.)

I'm finding it hard to visualize situations where I'd want the extra baggage of pg_dump for something like this. If I want the schema at all, I'll probably want it separate from the data so that I can hack the schema conveniently --- so I'd want to do a pg_dump -s -t table and then do the selective copying separately.

			regards, tom lane
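To make the contrast concrete, here is a sketch of the two approaches (the COPY ... WHERE form is only the proposal under discussion here, not implemented syntax; the file paths are hypothetical):

```sql
-- psql today: redirect a query result to a file with \g. Not reliably
-- reloadable, since psql does not escape values that happen to contain
-- the column delimiter.
SELECT * FROM dmoz WHERE path @ 'Top.Science.Astronomy' \g /tmp/astronomy.out

-- The proposal: a WHERE clause on COPY TO, so the horizontal slice goes
-- through COPY's delimiter-escaping output path and can be reloaded
-- later with COPY FROM.
COPY dmoz TO '/tmp/astronomy.copy' WHERE path @ 'Top.Science.Astronomy';
```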
Re: [HACKERS] db partial dumping with pg_dump
On Tue, 13 Aug 2002, Tom Lane wrote:
> This patch seems extremely messy to me. Unless I'm missing something, -w
> just plain fails except when you are dumping a specific table (ie, -t must
> be given as well). And heaven help you if you specify a different table in
> -t than the one -w is selecting from. This isn't well thought out.

You're right. The patch I sent was in no way intended for submission to the sources! I just asked about the feature and attached it to the message for Rod Taylor.

> I'm not at all convinced that such a thing belongs in pg_dump anyway. It'd
> be more useful as a manually-invokable feature, I think. You can almost do
> this in psql with
>	select * from table where something \g outfile
> but I don't think you can get psql to emit the data in a form that can be
> reloaded reliably (it won't quote data characters that look the same as
> column delimiters, for instance).

That was the reason we didn't use psql for dumping.

> What would seem to make sense is adding a WHERE-clause option to COPY TO,
> and then you could go
>	COPY table TO 'myfile' WHERE ...
> We already have column-list support in COPY, so we can already slice the
> table vertically --- WHERE would let you slice it horizontally, which
> seems a natural extension. (BTW, has anyone taught psql's \copy about
> column lists? AFAIR the original patch was only against the backend.)
>
> I'm finding it hard to visualize situations where I'd want the extra
> baggage of pg_dump for something like this. If I want the schema at all,
> I'll probably want it separate from the data so that I can hack the schema
> conveniently --- so I'd want to do a pg_dump -s -t table and then do the
> selective copying separately.

That'd be nice! I often need such a feature because DBs at work are often too large to play with at home :-)

Regards, Oleg
Re: [HACKERS] db partial dumping with pg_dump
On Tue, 2002-08-13 at 20:24, Tom Lane wrote:
> What would seem to make sense is adding a WHERE-clause option to COPY TO,
> and then you could go
>	COPY table TO 'myfile' WHERE ...

What about:

	COPY table TO 'myfile' [ WHERE cond ] [ AS INSERT [ WITH COLUMN ] ];

to get the data as INSERT INTO statements (pg_dump -d), with an optional column list (pg_dump -D)?

Hannu
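A sketch of what the two proposed output modes would produce, modeled on pg_dump's existing -d and -D formats (the syntax is hypothetical, as are the table, columns, and sample row):

```sql
-- AS INSERT: each row becomes a bare INSERT, as with pg_dump -d
COPY mytable TO 'myfile' WHERE id < 10 AS INSERT;
--   produces lines like: INSERT INTO mytable VALUES (1, 'foo');

-- AS INSERT WITH COLUMN: INSERTs carry an explicit column list,
-- as with pg_dump -D
COPY mytable TO 'myfile' WHERE id < 10 AS INSERT WITH COLUMN;
--   produces lines like: INSERT INTO mytable (id, name) VALUES (1, 'foo');
```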
Re: [HACKERS] db partial dumping with pg_dump
Actually, loading all this stuff into COPY is not the way to go, I think. Informix had:

	UNLOAD TO 'filename' SELECT ...

I have to admit, this is a superior way to do things compared to what we have. Is it possible for us to get:

	COPY TO 'filename' SELECT ...

It allows any arbitrary table, group by, even order by combination.

---------------------------------------------------------------------------

Hannu Krosing wrote:
> What about:
>	COPY table TO 'myfile' [ WHERE cond ] [ AS INSERT [ WITH COLUMN ] ];
> to get the data as INSERT INTO statements (pg_dump -d), with an optional
> column list (pg_dump -D)

--
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [HACKERS] db partial dumping with pg_dump
On Wed, 2002-08-14 at 00:03, Bruce Momjian wrote:
> Actually, loading all this stuff into COPY is not the way to go, I think.
> Informix had:
>	UNLOAD TO 'filename' SELECT ...
> I have to admit, this is a superior way to do things compared to what we
> have. Is it possible for us to get:
>	COPY TO 'filename' SELECT ...
> It allows any arbitrary table, group by, even order by combination.

It would be more in line with the rest of the system to just allow a subselect as the 'table':

	COPY (select id,name from mystuff where id > 10) over10stuff TO stdout;

-
Hannu
Re: [HACKERS] db partial dumping with pg_dump
Hannu Krosing wrote:
> It would be more in line with the rest of the system to just allow a
> subselect as the 'table':
>	COPY (select id,name from mystuff where id > 10) over10stuff TO stdout;

Yep, that would work too. Clearly, we should shoot for something that leverages the existing SELECT code rather than hang more clauses off of COPY.
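The subselect-as-table form sketched in full (still only a proposal at this point; the names are hypothetical). The attraction is that a single mechanism gives both slices: the target list cuts the table vertically, the WHERE clause horizontally:

```sql
-- 'over10stuff' names the derived relation being copied out;
-- the subselect does all the projection and filtering, so COPY
-- itself needs no new clauses.
COPY (SELECT id, name FROM mystuff WHERE id > 10) over10stuff TO stdout;
```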
Re: [HACKERS] db partial dumping with pg_dump
Bruce Momjian [EMAIL PROTECTED] writes:
> Clearly, we should shoot for something that leverages the existing SELECT
> code rather than hang more clauses off of COPY.

Yeah, that's a good point. COPY IN is still a special case, I think, but seems like COPY OUT could be reimplemented as a special tuple destination for the regular executor machinery. Q: how much performance hit would we be taking? If this slows down pg_dump a lot, the extra code is worth keeping.

			regards, tom lane
Re: [HACKERS] db partial dumping with pg_dump
> I'm finding it hard to visualize situations where I'd want the extra
> baggage of pg_dump for something like this. If I want the schema at all,
> I'll probably want it separate from the data so that I can hack the schema
> conveniently --- so I'd want to do a pg_dump -s -t table and then do the
> selective copying separately.

Tell you what I'd kill for:

	pg_dump -t users_\* db > users.sql

Sort of thing.

Chris
Re: [HACKERS] db partial dumping with pg_dump
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
> Tell you what I'd kill for:
>	pg_dump -t users_\* db > users.sql

Yeah, I was just about to start working on pg_dump's -t switch. It absolutely *must* be made schema-aware for 7.3 (-t schema.table doesn't work today) and it seems like we might as well try to allow wildcard support. Another thought I had was to accept multiple -t switches, eg

	pg_dump -t foo -t bar -t schema.baz*

to dump all tables matching any of the patterns.

			regards, tom lane
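A sketch of the invocations being proposed (wildcard patterns and repeated -t switches were not yet implemented at this point; the database and table names are hypothetical):

```shell
# dump every table matching a pattern; quote it so the shell
# does not expand the wildcard itself
pg_dump -t 'users_*' mydb > users.sql

# repeated -t switches: dump all tables matching any of the patterns
pg_dump -t foo -t bar -t 'schema.baz*' mydb > subset.sql
```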
Re: [HACKERS] db partial dumping with pg_dump
> Yeah, I was just about to start working on pg_dump's -t switch. It
> absolutely *must* be made schema-aware for 7.3 (-t schema.table doesn't
> work today) and it seems like we might as well try to allow wildcard
> support. Another thought I had was to accept multiple -t switches, eg
>	pg_dump -t foo -t bar -t schema.baz*
> to dump all tables matching any of the patterns.

What about a switch to set the schema search path as well? You could also just have the tables comma-separated in the -t?

Chris
Re: [HACKERS] db partial dumping with pg_dump
Scott Shattuck [EMAIL PROTECTED] writes:
> I'd also kill for pg_restore --ignore-existing-objects so I could run the
> darn thing against a database that's already got pl/pgsql installed in
> template1 and the dump file wants to install it again etc.

In general, I think it's a mistake for pg_restore to bail out on errors. The underlying pg_dump scripts have been built and optimized on the assumption that psql would keep plugging after seeing an error. For example, scripts containing \connect - foo still work if there's no foo user ... but only because psql doesn't go belly-up. pg_restore is way less forgiving.

I think the ideal behavior for pg_restore would be to abandon work on the current dump item upon seeing a SQL error, but to pick up with the next one. (Of course we could have an --anal-retentive switch to bail on first error, but I doubt it'd be used much.) Errors associated with ownership switches shouldn't cause failure in any case.

Anybody care to submit patches to make this happen?

			regards, tom lane
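The psql behavior being contrasted with pg_restore can be toggled explicitly; ON_ERROR_STOP is a real psql variable, while the file and database names here are hypothetical:

```shell
# default: psql reports each error but keeps going, so a dump script
# with, say, a \connect to a missing user still restores the rest
psql -f dump.sql mydb

# strict mode: stop at the first error -- roughly the all-or-nothing
# behavior pg_restore currently forces
psql -v ON_ERROR_STOP=1 -f dump.sql mydb
```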
Re: [HACKERS] db partial dumping with pg_dump
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
> What about a switch to set schema search path as well?

I'm leaning to the thought that this is a bad idea, and that pg_dump should act in a way that's search-path-independent. But I need to think more.

> You could also just have the tables comma separated in the -t ??

Don't like it --- that makes comma a special character in pg_dump tablename patterns, which is an exception we don't need to make. The multiple-dash-t approach needs no such assumption.

			regards, tom lane