subject:"Re\: \[PATCHES\] pg_dump additional options for performance"

Re: [PATCHES] pg_dump additional options for performance

2008-08-02 Thread Tom Lane

chris <[EMAIL PROTECTED]> writes: > Do we need to wait until a fully-parallelizing pg_restore is > implemented before adding this functionality to pg_dump? They're independent problems ... and I would venture that parallel dump is harder. > Further, it's actually not obvious that we *necessarily*

Re: [PATCHES] pg_dump additional options for performance

2008-08-02 Thread chris

[EMAIL PROTECTED] (Tom Lane) writes: > Simon Riggs <[EMAIL PROTECTED]> writes: >> I want to dump tables separately for performance reasons. There are >> documented tests showing 100% gains using this method. There is no gain >> adding this to pg_restore. There is a gain to be had - parallelising >>

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Stephen Frost

* Joshua D. Drake ([EMAIL PROTECTED]) wrote: > Custom format rocks for partial set restores from a whole dump. See the > TOC option :) I imagine it does, but that's very rarely what I need. Most of the time we're dumping out a schema to load it into a seperate schema (usually on another host).

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Joshua D. Drake

Stephen Frost wrote: * Tom Lane ([EMAIL PROTECTED]) wrote: Stephen Frost <[EMAIL PROTECTED]> writes: I dislike, and doubt that I'd use, this approach. At the end of the day, it ends up processing the same (very large amount of data) multiple times. This would depend on the dump being in the

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Stephen Frost

* Tom Lane ([EMAIL PROTECTED]) wrote: > Right, but the parallelization is going to happen sometime, and it is > going to happen in the context of pg_restore. So I think it's pretty > silly to argue that no one will ever want this feature to work in > pg_restore. I think you've about convinced me

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Stephen Frost

* Tom Lane ([EMAIL PROTECTED]) wrote: > Stephen Frost <[EMAIL PROTECTED]> writes: > > I dislike, and doubt that I'd use, this approach. At the end of the > > day, it ends up processing the same (very large amount of data) multiple > > times. > > Well, that's easily avoided: just replace the third

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Joshua D. Drake

Andrew Dunstan wrote: Joshua D. Drake wrote: Agreed but that is a problem I understand with a solution I don't. I am all eyes on a way to fix that. One thought I had and please, be gentle in response was some sort of async transaction capability. I know that libpq has the ability to send a

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Andrew Dunstan

Joshua D. Drake wrote: Agreed but that is a problem I understand with a solution I don't. I am all eyes on a way to fix that. One thought I had and please, be gentle in response was some sort of async transaction capability. I know that libpq has the ability to send async queries. Is it pos

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Simon Riggs

On Sat, 2008-07-26 at 11:03 -0700, Joshua D. Drake wrote: > 2. We have no concurrency which means, anyone with any database over 50G > has unacceptable restore times. Agreed. Also the core reason for wanting -w > 3. We have to continue develop hacks to define custom utilization. Why > am I pas

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Simon Riggs

On Sat, 2008-07-26 at 13:56 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > I want to dump tables separately for performance reasons. There are > > documented tests showing 100% gains using this method. There is no gain > > adding this to pg_restore. There is a gain to be had

Re: [PATCHES] pg_dump additional options for performance

2008-07-27 Thread Joshua D. Drake

Simon Riggs wrote: On Sat, 2008-07-26 at 11:03 -0700, Joshua D. Drake wrote: 2. We have no concurrency which means, anyone with any database over 50G has unacceptable restore times. Agreed. Sounds good. Doesn't help with the main element of dump time: one table at a time to one output fil

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread daveg

On Sat, Jul 26, 2008 at 01:56:14PM -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > I want to dump tables separately for performance reasons. There are > > documented tests showing 100% gains using this method. There is no gain > > adding this to pg_restore. There is a gain to b

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Joshua D. Drake

On Sat, 2008-07-26 at 13:43 -0400, Tom Lane wrote: > Stephen Frost <[EMAIL PROTECTED]> writes: > > I dislike, and doubt that I'd use, this approach. At the end of the > > day, it ends up processing the same (very large amount of data) multiple > > times. > > Well, that's easily avoided: just rep

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Tom Lane

Simon Riggs <[EMAIL PROTECTED]> writes: > I want to dump tables separately for performance reasons. There are > documented tests showing 100% gains using this method. There is no gain > adding this to pg_restore. There is a gain to be had - parallelising > index creation, but this patch doesn't pro

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Tom Lane

Stephen Frost <[EMAIL PROTECTED]> writes: > * Tom Lane ([EMAIL PROTECTED]) wrote: >> This argument is nonsense. The typical usage of this capability, IMHO, >> will be >> >> pg_dump -Fc >whole.dump >> pg_restore --schema-before-data whole.dump >before.sql >> pg_restore --data-only whole.dump >data

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Simon Riggs

On Sat, 2008-07-26 at 12:20 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote: > >> The key problem is that pg_restore is broken: > > > The key capability here is being able to split the dump into multiple > > pieces. The equiva

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Stephen Frost

* Tom Lane ([EMAIL PROTECTED]) wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote: > >> The key problem is that pg_restore is broken: > > > The key capability here is being able to split the dump into multiple > > pieces. The equivalent capabilit

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Tom Lane

Stephen Frost <[EMAIL PROTECTED]> writes: >>> Another issue is that the rules for deciding which objects are "before >>> data" and which are "after data" are wrong. In particular ACLs are after >>> data not before data, which is relatively easy to fix. >> >> OK > This was partially why I was c

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Tom Lane

Simon Riggs <[EMAIL PROTECTED]> writes: > On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote: >> The key problem is that pg_restore is broken: > The key capability here is being able to split the dump into multiple > pieces. The equivalent capability on restore is *not* required, because > once the

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Stephen Frost

* Simon Riggs ([EMAIL PROTECTED]) wrote: > The key capability here is being able to split the dump into multiple > pieces. The equivalent capability on restore is *not* required, because > once the dump has been split the restore never needs to be. It might > seem that the patch should be symmetric

Re: [PATCHES] pg_dump additional options for performance

2008-07-26 Thread Simon Riggs

On Fri, 2008-07-25 at 19:16 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > [ pg_dump_beforeafter.v6.patch ] > Unfortunately there's still a lot of work to do, and I don't feel like > doing it so I'm bouncing this patch back for further work. Fair enough. Thanks for the revi

Re: [PATCHES] pg_dump additional options for performance

2008-07-25 Thread Greg Sabino Mullane

-BEGIN PGP SIGNED MESSAGE- Hash: RIPEMD160 Tom Lane wrote: > * --schema-before-data, --data-only, and --schema-after-data can be I thought you were arguing for some better names at one point? Those seem very confusing to me, especially "--schema-after-data". I know it means "the parts o

Re: [PATCHES] pg_dump additional options for performance

2008-07-25 Thread Tom Lane

Simon Riggs <[EMAIL PROTECTED]> writes: > [ pg_dump_beforeafter.v6.patch ] I looked over this patch a bit. I have a proposal for a slightly different way of defining the new switches: * --schema-before-data, --data-only, and --schema-after-data can be specified in any combination to obtain any s

Re: [PATCHES] pg_dump additional options for performance

2008-07-24 Thread Simon Riggs

On Thu, 2008-07-24 at 03:54 -0400, Tom Lane wrote: > Simon Riggs <[EMAIL PROTECTED]> writes: > > [80k patch] > > Surely there is a whole lot of unintended noise in this patch? > I certainly don't believe that you meant to change keywords.c > for instance. Removed, thanks. Unrelated to this patc

Re: [PATCHES] pg_dump additional options for performance

2008-07-24 Thread Tom Lane

Simon Riggs <[EMAIL PROTECTED]> writes: > [80k patch] Surely there is a whole lot of unintended noise in this patch? I certainly don't believe that you meant to change keywords.c for instance. regards, tom lane -- Sent via pgsql-patches mailing list (pgsql-patches@postgr

Re: [PATCHES] pg_dump additional options for performance

2008-07-23 Thread Stephen Frost

Simon, * Simon Riggs ([EMAIL PROTECTED]) wrote: > ...and with command line help also. The documentation and whatnot looks good to me now. There are a couple of other issues I found while looking through and testing the patch though- Index: src/bin/pg_dump/pg_dump.c =

Re: [PATCHES] pg_dump additional options for performance

2008-07-23 Thread Simon Riggs

On Wed, 2008-07-23 at 17:40 +0100, Simon Riggs wrote: > On Mon, 2008-07-21 at 07:56 -0400, Stephen Frost wrote: > > Simon, > > > > * Simon Riggs ([EMAIL PROTECTED]) wrote: > > > > I hadn't realized that Simon was using "pre-schema" and "post-schema" > > > > to name the first and third parts. I'd

Re: [PATCHES] pg_dump additional options for performance

2008-07-23 Thread Simon Riggs

On Mon, 2008-07-21 at 07:56 -0400, Stephen Frost wrote: > Simon, > > * Simon Riggs ([EMAIL PROTECTED]) wrote: > > > I hadn't realized that Simon was using "pre-schema" and "post-schema" > > > to name the first and third parts. I'd agree that this is confusing > > > nomenclature: it looks like it

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Simon Riggs

On Mon, 2008-07-21 at 19:19 -0400, Tom Lane wrote: > Stephen Frost <[EMAIL PROTECTED]> writes: > > Are there use cases for just --omit-post-load or --omit-pre-load? > > Probably not many. The thing that's bothering me is the > action-at-a-distance property of the positive-logic switches. > How a

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Tom Lane

Stephen Frost <[EMAIL PROTECTED]> writes: > Are there use cases for just --omit-post-load or --omit-pre-load? Probably not many. The thing that's bothering me is the action-at-a-distance property of the positive-logic switches. How are we going to explain this? "By default, --schema-pre-

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Stephen Frost

Tom, et al, * Tom Lane ([EMAIL PROTECTED]) wrote: > Ah, I see. No objection to those switch names, at least assuming we > want to stick to positive-logic switches. What did you think of the > negative-logic suggestion (--omit-xxx)? My preference is for positive-logic switches in general. The p

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Tom Lane

Stephen Frost <[EMAIL PROTECTED]> writes: > * Tom Lane ([EMAIL PROTECTED]) wrote: >> As far as the documentation/definition aspect goes, I think it should >> just say the parts are >> * stuff needed before you can load the data >> * the data >> * stuff needed after loading the data > Even that is

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Stephen Frost

Tom, * Tom Lane ([EMAIL PROTECTED]) wrote: > As far as the documentation/definition aspect goes, I think it should > just say the parts are > * stuff needed before you can load the data > * the data > * stuff needed after loading the data > and not try to be any more specific tha

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Stephen Frost

Simon, * Simon Riggs ([EMAIL PROTECTED]) wrote: > > I hadn't realized that Simon was using "pre-schema" and "post-schema" > > to name the first and third parts. I'd agree that this is confusing > > nomenclature: it looks like it's trying to say that the data is the > > schema, and the schema is n

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Simon Riggs

On Mon, 2008-07-21 at 07:46 -0400, Stephen Frost wrote: > * Simon Riggs ([EMAIL PROTECTED]) wrote: > > The options split the dump into 3 parts that's all: before the load, the > > load and after the load. > > > > --schema-pre-load says > > "Dumps exactly what --schema-only would dump, but only th

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Tom Lane

Andrew Dunstan <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> Maybe invert the logic? >> --omit-pre-data >> --omit-data >> --omit-post-data > Please, no. Negative logic seems likely to cause endless confusion. I think it might actually be less confusing, because with this approach, each switch

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Andrew Dunstan

Tom Lane wrote: Simon Riggs <[EMAIL PROTECTED]> writes: I also suggested having three options --want-pre-schema --want-data --want-post-schema so we could ask for any or all parts in the one dump. --data-only and --schema-only are negative options so don't allow this. (I don't like those na

Re: [PATCHES] pg_dump additional options for performance

2008-07-21 Thread Stephen Frost

* Simon Riggs ([EMAIL PROTECTED]) wrote: > The options split the dump into 3 parts that's all: before the load, the > load and after the load. > > --schema-pre-load says > "Dumps exactly what --schema-only would dump, but only those > statements before the data load." > > What is it you are sugge

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Tom Lane

Simon Riggs <[EMAIL PROTECTED]> writes: > I also suggested having three options > --want-pre-schema > --want-data > --want-post-schema > so we could ask for any or all parts in the one dump. --data-only and > --schema-only are negative options so don't allow this. > (I don't like those names either

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Simon Riggs

On Sun, 2008-07-20 at 23:34 -0400, Tom Lane wrote: > Stephen Frost <[EMAIL PROTECTED]> writes: > > * daveg ([EMAIL PROTECTED]) wrote: > >> One observation, indexes should be built right after the table data > >> is loaded for each table, this way, the index build gets a hot cache > >> for the tabl

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Simon Riggs

On Sun, 2008-07-20 at 21:18 -0400, Stephen Frost wrote: > * Simon Riggs ([EMAIL PROTECTED]) wrote: > > On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote: > > > Even this doesn't cover everything though- it's too focused on tables > > > and data loading. Where do functions go? What about typ

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Tom Lane

Stephen Frost <[EMAIL PROTECTED]> writes: > * daveg ([EMAIL PROTECTED]) wrote: >> One observation, indexes should be built right after the table data >> is loaded for each table, this way, the index build gets a hot cache >> for the table data instead of having to re-read it later as we do now. >

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Stephen Frost

* daveg ([EMAIL PROTECTED]) wrote: > One observation, indexes should be built right after the table data > is loaded for each table, this way, the index build gets a hot cache > for the table data instead of having to re-read it later as we do now. That's not how pg_dump has traditionally worked,

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread daveg

On Sun, Jul 20, 2008 at 09:18:29PM -0400, Stephen Frost wrote: > * Simon Riggs ([EMAIL PROTECTED]) wrote: > > On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote: > > > Even this doesn't cover everything though- it's too focused on tables > > > and data loading. Where do functions go? What abo

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Stephen Frost

* Simon Riggs ([EMAIL PROTECTED]) wrote: > On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote: > > Even this doesn't cover everything though- it's too focused on tables > > and data loading. Where do functions go? What about types? > > Yes, it is focused on tables and data loading. What abou

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Simon Riggs

On Sun, 2008-07-20 at 17:43 -0400, Stephen Frost wrote: > Perhaps this is up for some debate, but I find the documentation added > for these options to be lacking the definitions I was looking for, and > the explanation of why they are what they are. I'm also not sure I > agree with the "Pre-Sch

Re: [PATCHES] pg_dump additional options for performance

2008-07-20 Thread Stephen Frost

Simon, * Simon Riggs ([EMAIL PROTECTED]) wrote: > On Sun, 2008-07-20 at 05:47 +0100, Simon Riggs wrote: > > On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote: [...] > > > - Conflicting option handling Thanks for putting in the extra code to explicitly indicate which conflicting options wer

Re: [PATCHES] pg_dump additional options for performance

2008-07-19 Thread Simon Riggs

On Sun, 2008-07-20 at 05:47 +0100, Simon Riggs wrote: > On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote: > > Simon, > > > > I agree with adding these options in general, since I find myself > > frustrated by having to vi huge dumps to change simple schema things. > > A couple of comm

Re: [PATCHES] pg_dump additional options for performance

2008-07-19 Thread Simon Riggs

On Sat, 2008-07-19 at 23:07 -0400, Stephen Frost wrote: > Simon, > > I agree with adding these options in general, since I find myself > frustrated by having to vi huge dumps to change simple schema things. > A couple of comments on the patch though: > > - Conflicting option handling >

49 matches

Mail list logo