Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-25 Thread David Fetter
On Fri, Dec 24, 2010 at 06:37:26PM -0500, Andrew Dunstan wrote:
 On 12/24/2010 06:26 PM, Aidan Van Dyk wrote:
 On Fri, Dec 24, 2010 at 2:48 PM, Joshua D. Drake j...@commandprompt.com
 wrote:
 
 I would have to agree here. The idea that we have to search email
 is bad enough (issue/bug/feature tracker anyone?) but to have
 someone say, search the archives? That is just plain rude and
 anti-community.
 Saying search the bugtracker is no less rude than search the
 archives...
 
 And most of the bugtrackers I've had to search have way *less*
 ease-of-use for searching than a good mailing list archive (I tend
 to keep going back to gmane's search)
 
 It's deja vu all over again. See mailing list archives for details.

LOL!

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-25 Thread Gurjeet Singh
On Mon, Dec 6, 2010 at 7:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Josh Berkus j...@agliodbs.com writes:
  However, if you were doing something like parallel pg_dump you could
  just run the parent and child instances all against the slave, so the
  pg_dump scenario doesn't seem to offer much of a supporting use-case for
  worrying about this.  When would you really need to be able to do it?

  If you had several standbys, you could distribute the work of the
  pg_dump among them.  This would be a huge speedup for a large database,
  potentially, thanks to parallelization of I/O and network.  Imagine
  doing a pg_dump of a 300GB database in 10min.

 That does sound kind of attractive.  But to do that I think we'd have to
 go with the pass-the-snapshot-through-the-client approach.  Shipping
 internal snapshot files through the WAL stream doesn't seem attractive
 to me.

 While I see Robert's point about preferring not to expose the snapshot
 contents to clients, I don't think it outweighs all other considerations
 here; and every other one is pointing to doing it the other way.


How about having the publishing transaction put the snapshot in a (new) system
table and pass a UUID to its children, while the joining transactions look up
that UUID in the system table under a dirty snapshot (SnapshotAny), via a
security-definer function owned by a superuser.

No shared memory used, and if WAL-logged, the snapshot would get to the
slaves too.

I realize SnapshotAny wouldn't be sufficient since we want the tuple to
become invisible when the publishing transaction ends (commit/rollback),
hence something akin to (new) HeapTupleSatisfiesStillRunning() would be
needed.
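
To illustrate (a purely hypothetical sketch -- neither this function nor the
simplified signature exists in PostgreSQL, and the real HeapTupleSatisfies*
routines also take a snapshot and buffer argument), such a routine might boil
down to checking whether the inserting transaction is still in progress:

#include "postgres.h"
#include "access/htup.h"
#include "access/transam.h"
#include "storage/procarray.h"

/*
 * Sketch only: a published-snapshot row is "visible" just as long as the
 * transaction that inserted it (the publisher) is still running, so it
 * stops being adoptable the moment that transaction commits or rolls back.
 */
static bool
HeapTupleSatisfiesStillRunning(HeapTupleHeader tuple)
{
	TransactionId	xmin = HeapTupleHeaderGetXmin(tuple);

	if (!TransactionIdIsValid(xmin))
		return false;

	/* visible only while the publishing transaction is in progress */
	return TransactionIdIsInProgress(xmin);
}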

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-24 Thread Bruce Momjian
Robert Haas wrote:
 I actually think that the phrase "this has been discussed before and
 rejected" should be permanently removed from our list of excuses for
 rejecting a patch.  Or if we must use that excuse, then I think a link
 to the relevant discussion is a must, and the relevant discussion had
 better reflect the fact that $TOPIC was in fact rejected.  It seems to
 me that in at least 50% of cases, someone comes back and says one of
 the following things:
 
 1. I searched the archives and could find no discussion along those lines.
 2. I read that discussion and it doesn't appear to me that it reflects
 a rejection of this idea.  Instead what people seemed to be saying was
 X.
 3. At the time that might have been true, but what has changed in the
 meanwhile is X.

Agreed.  Perhaps we need an anti-TODO that lists things we don't want in
more detail.  The TODO has that for a few items, but scaling things up
there will be cumbersome.

I agree that having the person saying it was rejected find the email
discussion is ideal --- if they can't find it, odds are the patch person
will not be able to find it either.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-24 Thread Joshua D. Drake
 3. At the time that might have been true, but what has changed in the
 meanwhile is X.
 
 Agreed.  Perhaps we need an anti-TODO that lists things we don't want in
 more detail.  The TODO has that for a few items, but scaling things up
 there will be cumbersome.
 

Well there is a problem with this too. A good example is hints. A lot of
the community wants hints. A lot of the community doesn't. The community
changes as we get more mature and more hackers. It isn't hard to point
to dozens of items we have now that would have been on that list 5 years
ago.


 I agree that having the person saying it was rejected find the email
 discussion is ideal --- if they can't find it, odds are the patch person
 will not be able to find it either.

I would have to agree here. The idea that we have to search email is bad
enough (issue/bug/feature tracker anyone?) but to have someone say,
search the archives? That is just plain rude and anti-community.

Joshua D. Drake


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt




Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-24 Thread Aidan Van Dyk
On Fri, Dec 24, 2010 at 2:48 PM, Joshua D. Drake j...@commandprompt.com wrote:

 I would have to agree here. The idea that we have to search email is bad
 enough (issue/bug/feature tracker anyone?) but to have someone say,
 search the archives? That is just plain rude and anti-community.

Saying search the bugtracker is no less rude than search the archives...

And most of the bugtrackers I've had to search have way *less*
ease-of-use for searching than a good mailing list archive (I tend to
keep going back to gmane's search)

a.


-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-24 Thread Andrew Dunstan



On 12/24/2010 06:26 PM, Aidan Van Dyk wrote:

On Fri, Dec 24, 2010 at 2:48 PM, Joshua D. Drake j...@commandprompt.com  wrote:


I would have to agree here. The idea that we have to search email is bad
enough (issue/bug/feature tracker anyone?) but to have someone say,
search the archives? That is just plain rude and anti-community.

Saying search the bugtracker is no less rude than search the archives...

And most of the bugtrackers I've had to search have way *less*
ease-of-use for searching than a good mailing list archive (I tend to
keep going back to gmane's search)




It's deja vu all over again. See mailing list archives for details.

cheers

andrew



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-24 Thread Joshua D. Drake
On Fri, 2010-12-24 at 18:26 -0500, Aidan Van Dyk wrote:
 On Fri, Dec 24, 2010 at 2:48 PM, Joshua D. Drake j...@commandprompt.com 
 wrote:
 
  I would have to agree here. The idea that we have to search email is bad
  enough (issue/bug/feature tracker anyone?) but to have someone say,
  search the archives? That is just plain rude and anti-community.
 
 Saying search the bugtracker is no less rude than search the archives...
 
 And most of the bugtrackers I've had to search have way *less*
 ease-of-use for searching than a good mailing list archive (I tend to
 keep going back to gmane's search)

I think you kind of missed my point.

JD

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt




Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-24 Thread Robert Haas
On Dec 24, 2010, at 10:52 AM, Bruce Momjian br...@momjian.us wrote:
 Agreed.  Perhaps we need an anti-TODO that lists things we don't want in
 more detail.  The TODO has that for a few items, but scaling things up
 there will be cumbersome.

I don't really think that'd be much better.  What might be of some value is 
summaries of previous discussions, *with citations*.  Foo seems like it would 
be useful [1,2,3] but there are concerns about bar [4,5] and baz [6].

...Robert


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-14 Thread Robert Haas
On Tue, Dec 7, 2010 at 3:23 AM, Koichi Suzuki koichi@gmail.com wrote:
 This is what Postgres-XC is doing between a coordinator and a
 datanode.    Coordinator may correspond to poolers/loadbalancers.
 Does anyone think it makes sense to extract XC implementation of
 snapshot shipping to PostgreSQL itself?

Perhaps, though of course it would need to be re-licensed.  I'd be
happy to see us pursue a snapshot cloning framework, wherever it comes
from.  I remain unconvinced that it should be made a hard requirement
for parallel pg_dump, but of course if we can get it implemented then
the point becomes moot.

Let's not let this fall on the floor.  Someone should pursue this,
whether it's Joachim or Koichi or someone else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-14 Thread Koichi Suzuki
Robert;

Thank you very much for your advice.   Indeed, I'm considering to
change the license to PostgreSQL's one.   It may take a bit more
though...
--
Koichi Suzuki



2010/12/15 Robert Haas robertmh...@gmail.com:
 On Tue, Dec 7, 2010 at 3:23 AM, Koichi Suzuki koichi@gmail.com wrote:
 This is what Postgres-XC is doing between a coordinator and a
 datanode.    Coordinator may correspond to poolers/loadbalancers.
 Does anyone think it makes sense to extract XC implementation of
 snapshot shipping to PostgreSQL itself?

 Perhaps, though of course it would need to be re-licensed.  I'd be
 happy to see us pursue a snapshot cloning framework, wherever it comes
 from.  I remain unconvinced that it should be made a hard requirement
 for parallel pg_dump, but of course if we can get it implemented then
 the point becomes moot.

 Let's not let this fall on the floor.  Someone should pursue this,
 whether it's Joachim or Koichi or someone else.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-14 Thread Robert Haas
On Tue, Dec 14, 2010 at 7:06 PM, Koichi Suzuki koichi@gmail.com wrote:
 Thank you very much for your advice.   Indeed, I'm considering to
 change the license to PostgreSQL's one.   It may take a bit more
 though...

You wouldn't necessarily need to relicense all of Postgres-XC
(although that would be cool, too, at least IMO), just the portion you
were proposing for commit to PostgreSQL.  Or it doesn't sound like it
would be infeasible for someone to code this up from scratch.  But we
should try to make something good happen here!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-07 Thread Koichi Suzuki
This is what Postgres-XC is doing between a coordinator and a
datanode.  Coordinator may correspond to poolers/loadbalancers.
Does anyone think it makes sense to extract XC implementation of
snapshot shipping to PostgreSQL itself?

Cheers;
--
Koichi Suzuki



2010/12/7 Stefan Kaltenbrunner ste...@kaltenbrunner.cc:
 On 12/07/2010 01:22 AM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?

 If you had several standbys, you could distribute the work of the
 pg_dump among them.  This would be a huge speedup for a large database,
 potentially, thanks to parallelization of I/O and network.  Imagine
 doing a pg_dump of a 300GB database in 10min.

 That does sound kind of attractive.  But to do that I think we'd have to
 go with the pass-the-snapshot-through-the-client approach.  Shipping
 internal snapshot files through the WAL stream doesn't seem attractive
 to me.

 this kind of functionality would also be very useful/interesting for
 connection poolers/loadbalancers that are trying to distribute load
 across multiple hosts and could use that to at least give some sort of
 consistency guarantee.



 Stefan





Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-07 Thread Stefan Kaltenbrunner
On 12/07/2010 09:23 AM, Koichi Suzuki wrote:
 This is what Postgres-XC is doing between a coordinator and a
 datanode.  Coordinator may correspond to poolers/loadbalancers.
 Does anyone think it makes sense to extract XC implementation of
 snapshot shipping to PostgreSQL itself?

Well, if there is a preceding implementation of that it would certainly
be of interest to see it - but before you go and extract the code,
maybe you could tell us how exactly it works?



Stefan



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 2:29 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.12.2010 02:55, Robert Haas wrote:

 On Sun, Dec 5, 2010 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us  wrote:

 I'm wondering if we should reconsider the pass-it-through-the-client
 approach, because if we could make that work it would be more general and
 it wouldn't need any special privileges.  The trick seems to be to apply
 sufficient sanity testing to the snapshot proposed to be installed in
 the subsidiary transaction.  I think the requirements would basically be
 (1) xmin <= any listed XIDs < xmax
 (2) xmin not so old as to cause GlobalXmin to decrease
 (3) xmax not beyond current XID counter
 (4) XID list includes all still-running XIDs in the given range

 Thoughts?

 I think this is too ugly to live.  I really think it's a very bad idea
 for database clients to need to explicitly know anywhere near this
 many details about how the server represents snapshots.  It's not
 impossible we might want to change this in the future, and even if we
 don't, it seems to me to be exposing a whole lot of unnecessary
 internal grottiness.

 The client doesn't need to know anything about the snapshot blob that the
 server gives it. It just needs to pass it back to the server through the
 other connection. To the client, it's just an opaque chunk of bytes.

I suppose that would work, but I still think it's a bad idea.  We made
this mistake with expression trees.  Any oversight in the code that
validates the chunk of bytes when it (or a modified version) is sent
back to the server turns into a security hole.  I think it's a whole
lot simpler and cleaner to keep the representation details private to
the server.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 14:57, Robert Haas wrote:

On Mon, Dec 6, 2010 at 2:29 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

The client doesn't need to know anything about the snapshot blob that the
server gives it. It just needs to pass it back to the server through the
other connection. To the client, it's just an opaque chunk of bytes.


I suppose that would work, but I still think it's a bad idea.  We made
this mistake with expression trees.  Any oversight in the code that
validates the chunk of bytes when it (or a modified version) is sent
back to the server turns into a security hole.


True, but a snapshot is a lot simpler than an expression tree. It's 
pretty much impossible to plug all the holes in the expression-tree 
reading functions, and keep them hole-free in the future. The expression 
tree format is constantly in flux. A snapshot, however, is a fairly 
isolated small data structure that rarely changes.



 I think it's a whole
lot simpler and cleaner to keep the representation details private to
the server.


Well, then you need some sort of cross-backend communication, which is 
always a bit clumsy.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.12.2010 14:57, Robert Haas wrote:

 On Mon, Dec 6, 2010 at 2:29 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:

 The client doesn't need to know anything about the snapshot blob that the
 server gives it. It just needs to pass it back to the server through the
 other connection. To the client, it's just an opaque chunk of bytes.

 I suppose that would work, but I still think it's a bad idea.  We made
 this mistake with expression trees.  Any oversight in the code that
 validates the chunk of bytes when it (or a modified version) is sent
 back to the server turns into a security hole.

 True, but a snapshot is a lot simpler than an expression tree. It's pretty
 much impossible to plug all the holes in the expression-tree reading
 functions, and keep them hole-free in the future. The expression tree format
 is constantly in flux. A snapshot, however, is a fairly isolated small data
 structure that rarely changes.

I guess.  It still seems far too much like exposing the server's guts
for my taste.  It might not be as bad as the expression tree stuff,
but there's nothing particularly good about it either.

  I think it's a whole
 lot simpler and cleaner to keep the representation details private to
 the server.

 Well, then you need some sort of cross-backend communication, which is
 always a bit clumsy.

A temp file seems quite sufficient, and not at all difficult.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 15:53, Robert Haas wrote:

I guess.  It still seems far too much like exposing the server's guts
for my taste.  It might not be as bad as the expression tree stuff,
but there's nothing particularly good about it either.


Note that we already have txid_current_snapshot() function, which 
exposes all that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.12.2010 15:53, Robert Haas wrote:

 I guess.  It still seems far too much like exposing the server's guts
 for my taste.  It might not be as bad as the expression tree stuff,
 but there's nothing particularly good about it either.

 Note that we already have txid_current_snapshot() function, which exposes
 all that.

Fair enough, and I think that's actually useful for Slony &c.  But I
don't think we should shy away from providing a cleaner API here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 10:22 AM, Robert Haas wrote:

On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 06.12.2010 15:53, Robert Haas wrote:

I guess.  It still seems far too much like exposing the server's guts
for my taste.  It might not be as bad as the expression tree stuff,
but there's nothing particularly good about it either.

Note that we already have txid_current_snapshot() function, which exposes
all that.

Fair enough, and I think that's actually useful for Slony &c.  But I
don't think we should shy away from providing a cleaner API here.



Just don't let the perfect get in the way of the good :P

cheers

andrew



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 10:35 AM, Andrew Dunstan and...@dunslane.net wrote:
 On 12/06/2010 10:22 AM, Robert Haas wrote:

 On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:

 On 06.12.2010 15:53, Robert Haas wrote:

 I guess.  It still seems far too much like exposing the server's guts
 for my taste.  It might not be as bad as the expression tree stuff,
 but there's nothing particularly good about it either.

 Note that we already have txid_current_snapshot() function, which exposes
 all that.

 Fair enough, and I think that's actually useful for Slony &c.  But I
 don't think we should shy away from providing a cleaner API here.


 Just don't let the perfect get in the way of the good :P

I'll keep that in mind.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Well, then you need some sort of cross-backend communication, which is
 always a bit clumsy.

 A temp file seems quite sufficient, and not at all difficult.

"Not at all difficult" is nonsense.  To do that, you need to invent some
mechanism for sender and receivers to identify which temp file they want
to use, and you need to think of some way to clean up the files when the
client forgets to tell you to do so.  That's going to be at least as
ugly as anything else.  And I think it's unproven that this approach
would be security-hole-free either.  For instance, what about some other
session overwriting pg_dump's snapshot temp file?

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 10:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Well, then you need some sort of cross-backend communication, which is
 always a bit clumsy.

 A temp file seems quite sufficient, and not at all difficult.

 "Not at all difficult" is nonsense.  To do that, you need to invent some
 mechanism for sender and receivers to identify which temp file they want
 to use,

Why is this even remotely hard?  That's the whole point of having the
publish operation return a token.  The token either is, or uniquely
identifies, the file name.

 and you need to think of some way to clean up the files when the
 client forgets to tell you to do so.  That's going to be at least as
 ugly as anything else.

Backends don't forget to call their end-of-transaction hooks, do they?
 They might crash, but we already have code to remove temp files on
server restart.  At most it would need minor adjustment.

  And I think it's unproven that this approach
 would be security-hole-free either.  For instance, what about some other
 session overwriting pg_dump's snapshot temp file?

Why would this be any different from any other temp file?  We surely
must have a mechanism in place to ensure that the temporary files used
by sorts or hash joins don't get overwritten by some other session, or
the system would be totally unstable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 10:40 AM, Tom Lane wrote:

Robert Haas robertmh...@gmail.com  writes:

On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

Well, then you need some sort of cross-backend communication, which is
always a bit clumsy.

A temp file seems quite sufficient, and not at all difficult.

"Not at all difficult" is nonsense.  To do that, you need to invent some
mechanism for sender and receivers to identify which temp file they want
to use, and you need to think of some way to clean up the files when the
client forgets to tell you to do so.  That's going to be at least as
ugly as anything else.  And I think it's unproven that this approach
would be security-hole-free either.  For instance, what about some other
session overwriting pg_dump's snapshot temp file?




Yeah. I'm still not convinced that using shared memory is a bad way to 
pass these around. Surely we're not talking about large numbers of them. 
What am I missing here?


cheers

andrew



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Yeah. I'm still not convinced that using shared memory is a bad way to 
 pass these around. Surely we're not talking about large numbers of them. 
 What am I missing here?

They're not of a very predictable size.

Robert's idea of publish() returning a temp file identifier, which then
gets removed at transaction end, might work all right.

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
  Why not just say "give me the snapshot currently held by process <pid>"?

There's not a unique snapshot held by a particular process.  Also, we
don't want to expend the overhead to fully publish every snapshot.
I think it's really necessary that the sending process take some
deliberate action to publish a snapshot.

 And please, not temp files if possible.

Barring the cleanup issue, I don't see why not.  This is a relatively
low-usage feature, I think, so I wouldn't be much in favor of dedicating
shmem to it even if the space requirement were predictable.

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 12:28 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net  writes:

Yeah. I'm still not convinced that using shared memory is a bad way to
pass these around. Surely we're not talking about large numbers of them.
What am I missing here?

They're not of a very predictable size.




Ah. Ok.

cheers

andrew



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I'm still not convinced that using shared memory is a bad way to 
 pass these around. Surely we're not talking about large numbers
 of them.  What am I missing here?
 
 They're not of a very predictable size.
 
Surely you can predict that any snapshot is no larger than a fairly
small fixed portion plus sizeof(TransactionId) * MaxBackends?  So,
for example, if you're configured for 100 connections, you'd be
limited to something under 1kB, maximum?
 
-Kevin



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 I'm still not convinced that using shared memory is a bad way to 
 pass these around. Surely we're not talking about large numbers
 of them.  What am I missing here?
 
 They're not of a very predictable size.
 
 Surely you can predict that any snapshot is no larger than a fairly
 small fixed portion plus sizeof(TransactionId) * MaxBackends?

No.  See subtransactions.

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 
 Surely you can predict that any snapshot is no larger than a fairly
 small fixed portion plus sizeof(TransactionId) * MaxBackends?
 
 No.  See subtransactions.
 
Subtransactions are included in snapshots?
 
-Kevin



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 No.  See subtransactions.
 
 Subtransactions are included in snapshots?

Sure, see GetSnapshotData().  You could avoid it by setting
suboverflowed, but that comes at a nontrivial performance cost.

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 No.  See subtransactions.
 
 Subtransactions are included in snapshots?
 
 Sure, see GetSnapshotData().  You could avoid it by setting
 suboverflowed, but that comes at a nontrivial performance cost.
 
Yeah, sorry for blurting like that before I checked.  I was somewhat
panicked that I'd missed something important for SSI, because my
XidIsConcurrent check just uses xmin, xmax, and xip; I was afraid
what I have would fall down in the face of subtransactions.  But on
review I found that I'd thought that through and (discussion in
the archives) I always wanted to associate the locks and conflicts
with the top level transaction; so that was already identified
before checking for overlap, and it was therefore more efficient to
just check that.
 
Sorry for the senior moment.  :-/
 
Perhaps a line or two of comments about that in the SSI patch would
be a good idea.  And maybe some tests involving subtransactions
 
-Kevin



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread marcin mank
On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 IIRC, in old discussions of this problem we first considered allowing
 clients to pull down an explicit representation of their snapshot (which
 actually is an existing feature now, txid_current_snapshot()) and then
 upload that again to become the active snapshot in another connection.

Could a hot standby use such a snapshot representation? I.e. same
snapshot on the master and the standby?

Greetings
Marcin Mańk



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 21:48, marcin mank wrote:

On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us  wrote:

IIRC, in old discussions of this problem we first considered allowing
clients to pull down an explicit representation of their snapshot (which
actually is an existing feature now, txid_current_snapshot()) and then
upload that again to become the active snapshot in another connection.


Could a hot standby use such a snapshot representation? I.e. same
snapshot on the master and the standby?


Hmm, I suppose it could. That's an interesting idea, you could run 
parallel pg_dump or something else against master and/or multiple hot 
standby servers, all working on the same snapshot.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
marcin mank marcin.m...@gmail.com writes:
 On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 IIRC, in old discussions of this problem we first considered allowing
 clients to pull down an explicit representation of their snapshot (which
 actually is an existing feature now, txid_current_snapshot()) and then
 upload that again to become the active snapshot in another connection.

 Could a hot standby use such a snapshot representation? I.e. same
 snapshot on the master and the standby?

Hm, that's a good question.  It seems like it's at least possibly
workable, but I'm not sure if there are any showstoppers.  The other
proposal of publish-a-snapshot would presumably NOT support this, since
we'd not want to ship the snapshot temp files down the WAL stream.

However, if you were doing something like parallel pg_dump you could
just run the parent and child instances all against the slave, so the
pg_dump scenario doesn't seem to offer much of a supporting use-case for
worrying about this.  When would you really need to be able to do it?

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Josh Berkus

 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?

If you had several standbys, you could distribute the work of the
pg_dump among them.  This would be a huge speedup for a large database,
potentially, thanks to parallelization of I/O and network.  Imagine
doing a pg_dump of a 300GB database in 10min.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?

 If you had several standbys, you could distribute the work of the
 pg_dump among them.  This would be a huge speedup for a large database,
 potentially, thanks to parallelization of I/O and network.  Imagine
 doing a pg_dump of a 300GB database in 10min.

That does sound kind of attractive.  But to do that I think we'd have to
go with the pass-the-snapshot-through-the-client approach.  Shipping
internal snapshot files through the WAL stream doesn't seem attractive
to me.

While I see Robert's point about preferring not to expose the snapshot
contents to clients, I don't think it outweighs all other considerations
here; and every other one is pointing to doing it the other way.

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Koichi Suzuki
We may need other means to ensure that the snapshot is available on
the slave.  It could be a bit too early to use the snapshot on the
slave depending upon the delay of WAL replay.
--
Koichi Suzuki



2010/12/7 Tom Lane t...@sss.pgh.pa.us:
 marcin mank marcin.m...@gmail.com writes:
 On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 IIRC, in old discussions of this problem we first considered allowing
 clients to pull down an explicit representation of their snapshot (which
 actually is an existing feature now, txid_current_snapshot()) and then
 upload that again to become the active snapshot in another connection.

 Could a hot standby use such a snapshot representation? I.e. same
 snapshot on the master and the standby?

 Hm, that's a good question.  It seems like it's at least possibly
 workable, but I'm not sure if there are any showstoppers.  The other
 proposal of publish-a-snapshot would presumably NOT support this, since
 we'd not want to ship the snapshot temp files down the WAL stream.

 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?

                        regards, tom lane





Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Stefan Kaltenbrunner
On 12/07/2010 01:22 AM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?
 
 If you had several standbys, you could distribute the work of the
 pg_dump among them.  This would be a huge speedup for a large database,
 potentially, thanks to parallelization of I/O and network.  Imagine
 doing a pg_dump of a 300GB database in 10min.
 
 That does sound kind of attractive.  But to do that I think we'd have to
 go with the pass-the-snapshot-through-the-client approach.  Shipping
 internal snapshot files through the WAL stream doesn't seem attractive
 to me.

this kind of functionality would also be very useful/interesting for
connection poolers/loadbalancers that are trying to distribute load
across multiple hosts and could use that to at least give some sort of
consistency guarantee.



Stefan



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tatsuo Ishii
 On 12/07/2010 01:22 AM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?
 
 If you had several standbys, you could distribute the work of the
 pg_dump among them.  This would be a huge speedup for a large database,
 potentially, thanks to parallelization of I/O and network.  Imagine
 doing a pg_dump of a 300GB database in 10min.
 
 That does sound kind of attractive.  But to do that I think we'd have to
 go with the pass-the-snapshot-through-the-client approach.  Shipping
 internal snapshot files through the WAL stream doesn't seem attractive
 to me.
 
 this kind of functionality would also be very useful/interesting for
 connection poolers/loadbalancers that are trying to distribute load
 across multiple hosts and could use that to at least give some sort of
 consistency guarantee.

In addition to this, that will greatly help query based replication
tools such as pgpool-II. Sounds great.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Greg Smith

Joachim Wieland wrote:

Regarding snapshot cloning and dump consistency, I brought this up
already several months ago and asked if the feature is considered
useful even without snapshot cloning.


In addition, Joachim submitted a synchronized snapshot patch that looks 
to me like it slipped through the cracks without being fully explored.  
Since it's split in the official archives the easiest way to read the 
thread is at 
http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg143866.html


Or you can use these two:
http://archives.postgresql.org/pgsql-hackers/2010-01/msg00916.php
http://archives.postgresql.org/pgsql-hackers/2010-02/msg00363.php

That never made it into a CommitFest proper that I can see, it just 
picked up review mainly from Markus.  The way I read that thread, there 
were two objections:


1) This mechanism isn't general enough for all use-cases outside of 
pg_dump, which doesn't make it wrong when the question is how to get 
parallel pg_dump running


2) Running as superuser is excessive.  Running as the database owner was 
suggested as likely to be good enough for pg_dump purposes.


Ultimately I think that stalled because without a client that needed it 
the code wasn't so interesting yet.  But now there is one; should that 
get revived again?  It seems like all of the pieces needed to build 
what's really desired here are available, it's just the always 
non-trivial task of integrating them together the right way that's needed.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us





Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 In addition, Joachim submitted a synchronized snapshot patch that looks 
 to me like it slipped through the cracks without being fully explored.  
 ...
 The way I read that thread, there were two objections:

 1) This mechanism isn't general enough for all use-cases outside of 
 pg_dump, which doesn't make it wrong when the question is how to get 
 parallel pg_dump running

 2) Running as superuser is excessive.  Running as the database owner was 
 suggested as likely to be good enough for pg_dump purposes.

IIRC, in old discussions of this problem we first considered allowing
clients to pull down an explicit representation of their snapshot (which
actually is an existing feature now, txid_current_snapshot()) and then
upload that again to become the active snapshot in another connection.
That was rejected on the grounds that you could cause all kinds of
mischief by uploading a bad snapshot; so we decided to think about
providing a server-side-only means to clone another backend's current
snapshot.  Which is essentially what Joachim's above-mentioned patch
provides.  However, as was discussed in that thread, that approach is
far from being ideal either.

I'm wondering if we should reconsider the pass-it-through-the-client
approach, because if we could make that work it would be more general and
it wouldn't need any special privileges.  The trick seems to be to apply
sufficient sanity testing to the snapshot proposed to be installed in
the subsidiary transaction.  I think the requirements would basically be
(1) xmin <= any listed XIDs < xmax
(2) xmin not so old as to cause GlobalXmin to decrease
(3) xmax not beyond current XID counter
(4) XID list includes all still-running XIDs in the given range
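
As a rough illustration only (not proposed code: ReadNewTransactionId(),
TransactionIdPrecedes() and TransactionIdFollows() are existing backend
primitives, but the function itself is made up here), checks (1) and (3)
are cheap, while (2) and (4) need a ProcArray scan:

#include "postgres.h"
#include "access/transam.h"
#include "utils/snapshot.h"

static bool
ProposedSnapshotLooksSane(Snapshot snap)
{
	TransactionId	nextXid = ReadNewTransactionId();
	int				i;

	/* (3) xmax must not be beyond the current XID counter */
	if (TransactionIdFollows(snap->xmax, nextXid))
		return false;

	/* (1) every listed XID must satisfy xmin <= xid < xmax */
	for (i = 0; i < snap->xcnt; i++)
	{
		if (TransactionIdPrecedes(snap->xip[i], snap->xmin) ||
			!TransactionIdPrecedes(snap->xip[i], snap->xmax))
			return false;
	}

	/*
	 * (2) and (4) -- xmin recent enough that GlobalXmin cannot go backwards,
	 * and no still-running XID in [xmin, xmax) missing from the list --
	 * require scanning the ProcArray with ProcArrayLock held; omitted here.
	 */
	return true;
}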

One tricky part would be ensuring GlobalXmin doesn't decrease when the
snap is installed, but I think that could be made to work if we take
ProcArrayLock exclusively and insist on observing some other running
transaction with xmin <= proposed xmin.  For the pg_dump case this would
certainly hold since xmin would be the parent pg_dump's xmin.

Given the checks stated above, it would be possible for someone to
install a snapshot that corresponds to no actual state of the database,
eg it shows some T1 as running and T2 as committed when actually T1
committed before T2.  I don't see any simple way for the installation
function to detect that, but I'm not sure whether it matters.  The user
might see inconsistent data, but do we care?  Perhaps as a safety
measure we should only allow snapshot installation in read-only
transactions, so that even if the xact does observe inconsistent data it
can't possibly corrupt the database state thereby.  This'd be no skin
off pg_dump's nose, obviously.  Or compromise on "only superusers can
do it in non-read-only transactions".

Thoughts?

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Robert Haas
On Sun, Dec 5, 2010 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm wondering if we should reconsider the pass-it-through-the-client
 approach, because if we could make that work it would be more general and
 it wouldn't need any special privileges.  The trick seems to be to apply
 sufficient sanity testing to the snapshot proposed to be installed in
 the subsidiary transaction.  I think the requirements would basically be
 (1) xmin <= any listed XIDs < xmax
 (2) xmin not so old as to cause GlobalXmin to decrease
 (3) xmax not beyond current XID counter
 (4) XID list includes all still-running XIDs in the given range

 Thoughts?

I think this is too ugly to live.  I really think it's a very bad idea
for database clients to need to explicitly know anywhere near this
many details about how the server represents snapshots.  It's not
impossible we might want to change this in the future, and even if we
don't, it seems to me to be exposing a whole lot of unnecessary
internal grottiness.

How about just pg_publish_snapshot(), returning a token that is only
valid until the end of the transaction in which it was called, and
pg_subscribe_snapshot(token)?  The implementation can be that the
publisher writes its snapshot to a temp file and returns the name of
the temp file, setting an at-commit hook to remove the temp file.  The
subscriber reads the temp file and sets the contents as its
transaction snapshot.  If security is a concern, one could also save
the publisher's role OID to the file and require the subscriber's to
match.
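
A very rough sketch of what the publisher half could look like (hypothetical
code: GetActiveSnapshot(), RegisterXactCallback(), AllocateFile() and friends
are real backend APIs, but the function, the token scheme, the file layout and
the cleanup callback are all invented here):

#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"
#include "access/xact.h"
#include "storage/fd.h"
#include "utils/builtins.h"
#include "utils/snapmgr.h"

/* hypothetical at-commit/abort hook that unlinks the file (not shown) */
static void publish_snapshot_cleanup(XactEvent event, void *arg);

Datum
pg_publish_snapshot(PG_FUNCTION_ARGS)
{
	Snapshot	snap = GetActiveSnapshot();
	char		path[MAXPGPATH];
	FILE	   *fp;

	/* token: a per-backend file name; a real patch would be more careful */
	snprintf(path, sizeof(path), "pg_snapshots/%d.snap", MyProcPid);

	fp = AllocateFile(path, PG_BINARY_W);
	if (fp == NULL)
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not create snapshot file \"%s\": %m", path)));

	/* minimal serialization: xmin, xmax, xcnt, then the xip[] array */
	fwrite(&snap->xmin, sizeof(TransactionId), 1, fp);
	fwrite(&snap->xmax, sizeof(TransactionId), 1, fp);
	fwrite(&snap->xcnt, sizeof(uint32), 1, fp);
	fwrite(snap->xip, sizeof(TransactionId), snap->xcnt, fp);
	FreeFile(fp);

	/* make sure the published file goes away at end of transaction */
	RegisterXactCallback(publish_snapshot_cleanup, pstrdup(path));

	PG_RETURN_TEXT_P(cstring_to_text(path));
}

The subscriber would do the reverse: read the file named by the token, rebuild
a SnapshotData from it, sanity-check it, and install it as the transaction
snapshot, erroring out if a stored role OID does not match.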

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Andrew Dunstan



On 12/05/2010 08:55 PM, Robert Haas wrote:

On Sun, Dec 5, 2010 at 1:28 PM, Tom Lane t...@sss.pgh.pa.us  wrote:

I'm wondering if we should reconsider the pass-it-through-the-client
approach, because if we could make that work it would be more general and
it wouldn't need any special privileges.  The trick seems to be to apply
sufficient sanity testing to the snapshot proposed to be installed in
the subsidiary transaction.  I think the requirements would basically be
(1) xmin <= any listed XIDs < xmax
(2) xmin not so old as to cause GlobalXmin to decrease
(3) xmax not beyond current XID counter
(4) XID list includes all still-running XIDs in the given range

Thoughts?

I think this is too ugly to live.  I really think it's a very bad idea
for database clients to need to explicitly know anywhere near this
many details about how the server represents snapshots.  It's not
impossible we might want to change this in the future, and even if we
don't, it seems to me to be exposing a whole lot of unnecessary
internal grottiness.

How about just pg_publish_snapshot(), returning a token that is only
valid until the end of the transaction in which it was called, and
pg_subscribe_snapshot(token)?  The implementation can be that the
publisher writes its snapshot to a temp file and returns the name of
the temp file, setting an at-commit hook to remove the temp file.  The
subscriber reads the temp file and sets the contents as its
transaction snapshot.  If security is a concern, one could also save
the publisher's role OID to the file and require the subscriber's to
match.


Why not just say "give me the snapshot currently held by process <pid>"?

And please, not temp files if possible.

cheers

andrew



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Robert Haas
On Sun, Dec 5, 2010 at 9:04 PM, Andrew Dunstan and...@dunslane.net wrote:
 Why not just say "give me the snapshot currently held by process <pid>"?

 And please, not temp files if possible.

As far as I'm aware, the full snapshot doesn't normally exist in
shared memory, hence the need for publication of some sort.  We could
dedicate a shared memory region for publication but then you have to
decide how many slots to allocate, and any number you pick will be too
many for some people and not enough for others, not to mention that
shared memory is a fairly precious resource.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Joachim Wieland
On Sun, Dec 5, 2010 at 9:27 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Dec 5, 2010 at 9:04 PM, Andrew Dunstan and...@dunslane.net wrote:
 Why not just say "give me the snapshot currently held by process <pid>"?

 And please, not temp files if possible.

 As far as I'm aware, the full snapshot doesn't normally exist in
 shared memory, hence the need for publication of some sort.  We could
 dedicate a shared memory region for publication but then you have to
 decide how many slots to allocate, and any number you pick will be too
 many for some people and not enough for others, not to mention that
 shared memory is a fairly precious resource.

So here is a patch that I have been playing with in the past, I have
done it a while back and thanks go to Koichi Suzuki for his helpful
comments. I have not published it earlier because I haven't worked on
it recently and from the discussion that I brought up in march I got
the feeling that people are fine with having a first version of
parallel dump without synchronized snapshots.

I am not really sure that what the patch does is sufficient nor if it
does it in the right way but I hope that it can serve as a basis to
collect ideas (and doubt).

My idea is pretty much similar to Robert's about publishing snapshots
and subscribing to them, the patch even uses these words.

Basically the idea is that a transaction in isolation level
serializable can publish a snapshot and as long as this transaction is
alive, its snapshot can be adopted by other transactions. Requiring
the publishing transaction to be serializable guarantees that the copy
of the snapshot in shared memory is always current. When the
transaction ends, the copy of the snapshot is also invalidated and
cannot be adopted anymore. So instead of doing explicit checks, the
patch aims at always having a reference transaction around that
guarantees validity of the snapshot information in shared memory.

The patch currently creates a new area in shared memory to store
snapshot information but we can certainly discuss this... I had a GUC
in mind that can control the number of available slots, similar to
max_prepared_transactions. Snapshot information can become quite
large, especially with a high number of max_connections.

Known limitations: the patch completely lacks awareness of prepared
transactions and doesn't check whether both backends belong to the same
user.


Joachim
diff --git a/src/backend/storage/ipc/ipci.c b/src/backend/storage/ipc/ipci.c
index 95beba8..c24150f 100644
*** a/src/backend/storage/ipc/ipci.c
--- b/src/backend/storage/ipc/ipci.c
*** CreateSharedMemoryAndSemaphores(bool mak
*** 124,129 
--- 124,130 
  		size = add_size(size, BTreeShmemSize());
  		size = add_size(size, SyncScanShmemSize());
  		size = add_size(size, AsyncShmemSize());
+ 		size = add_size(size, SyncSnapshotShmemSize());
  #ifdef EXEC_BACKEND
  		size = add_size(size, ShmemBackendArraySize());
  #endif
*** CreateSharedMemoryAndSemaphores(bool mak
*** 228,233 
--- 229,235 
  	BTreeShmemInit();
  	SyncScanShmemInit();
  	AsyncShmemInit();
+ 	SyncSnapshotInit();
  
  #ifdef EXEC_BACKEND
  
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 6e7a6db..00522fb 100644
*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
*** typedef struct ProcArrayStruct
*** 91,96 
--- 91,111 
  
  static ProcArrayStruct *procArray;
  
+ 
+ /* this should be a GUC later... */
+ #define MAX_SYNC_SNAPSHOT_SETS	4
+ typedef struct
+ {
+ 	SnapshotData	ssd;
+ 	char			name[NAMEDATALEN];
+ 	BackendId		backendId;
+ 	Oid				databaseId;
+ } NamedSnapshotData;
+ 
+ typedef NamedSnapshotData* NamedSnapshot;
+ 
+ static NamedSnapshot syncSnapshots;
+ 
  /*
   * Bookkeeping for tracking emulated transactions in recovery
   */
*** static int KnownAssignedXidsGetAndSetXmi
*** 159,164 
--- 174,182 
  static TransactionId KnownAssignedXidsGetOldestXmin(void);
  static void KnownAssignedXidsDisplay(int trace_level);
  
+ static bool DeleteSyncSnapshot(const char *name);
+ static bool snapshotPublished = false;  /* true if we have published at least one snapshot */
+ 
  /*
   * Report shared-memory space needed by CreateSharedProcArray.
   */
*** ProcArrayRemove(PGPROC *proc, Transactio
*** 350,355 
--- 368,379 
  void
  ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
  {
+ 	if (snapshotPublished)
+ 	{
+ 		DeleteSyncSnapshot(NULL);
+ 		snapshotPublished = false;
+ 	}
+ 
  	if (TransactionIdIsValid(latestXid))
  	{
  		/*
*** KnownAssignedXidsDisplay(int trace_level
*** 3104,3106 
--- 3132,3374 
  
  	pfree(buf.data);
  }
+ 
+ 
+ /*
+  *  Report space needed for our shared memory area.
+  *
+  *  Memory is structured as follows:
+  *
+  *  NamedSnapshotData[0]
+  *  NamedSnapshotData[1]
+  *  NamedSnapshotData[2]
+  *  Xids for NamedSnapshotData[0]
+  *  

Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Koichi Suzuki
Thank you Joachim;

Yes, and the current patch requires that the original (publisher)
transaction stay alive to prevent RecentXmin from being updated.

I hope this restriction is acceptable if publishing/subscribing is
provided via functions, not statements.

Cheers;
--
Koichi Suzuki



2010/12/6 Joachim Wieland j...@mcknight.de:
 On Sun, Dec 5, 2010 at 9:27 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Dec 5, 2010 at 9:04 PM, Andrew Dunstan and...@dunslane.net wrote:
 Why not just say give me the snapshot currently held by process ?

 And please, not temp files if possible.

 As far as I'm aware, the full snapshot doesn't normally exist in
 shared memory, hence the need for publication of some sort.  We could
 dedicate a shared memory region for publication but then you have to
 decide how many slots to allocate, and any number you pick will be too
 many for some people and not enough for others, not to mention that
 shared memory is a fairly precious resource.

 So here is a patch that I have been playing with in the past; I did
 it a while back, and thanks go to Koichi Suzuki for his helpful
 comments. I have not published it earlier because I haven't worked on
 it recently, and from the discussion that I brought up in March I got
 the feeling that people are fine with having a first version of
 parallel dump without synchronized snapshots.

 I am not really sure that what the patch does is sufficient, nor
 whether it does it in the right way, but I hope that it can serve as a
 basis to collect ideas (and doubts).

 My idea is pretty much similar to Robert's about publishing snapshots
 and subscribing to them, the patch even uses these words.

 Basically the idea is that a transaction in isolation level
 serializable can publish a snapshot and as long as this transaction is
 alive, its snapshot can be adopted by other transactions. Requiring
 the publishing transaction to be serializable guarantees that the copy
 of the snapshot in shared memory is always current. When the
 transaction ends, the copy of the snapshot is also invalidated and
 cannot be adopted anymore. So instead of doing explicit checks, the
 patch aims at always having a reference transaction around that
 guarantees validity of the snapshot information in shared memory.

 The patch currently creates a new area in shared memory to store
 snapshot information but we can certainly discuss this... I had a GUC
 in mind that can control the number of available slots, similar to
 max_prepared_transactions. Snapshot information can become quite
 large, especially with a high number of max_connections.

 Known limitations: the patch is lacking awareness of prepared
 transactions completely and doesn't check if both backends belong to
 the same user.


 Joachim


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-05 Thread Heikki Linnakangas

On 06.12.2010 02:55, Robert Haas wrote:

On Sun, Dec 5, 2010 at 1:28 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

I'm wondering if we should reconsider the pass-it-through-the-client
approach, because if we could make that work it would be more general and
it wouldn't need any special privileges.  The trick seems to be to apply
sufficient sanity testing to the snapshot proposed to be installed in
the subsidiary transaction.  I think the requirements would basically be
(1) xmin <= any listed XIDs < xmax
(2) xmin not so old as to cause GlobalXmin to decrease
(3) xmax not beyond current XID counter
(4) XID list includes all still-running XIDs in the given range

Thoughts?


I think this is too ugly to live.  I really think it's a very bad idea
for database clients to need to explicitly know anywhere near this
many details about how the server represents snapshots.  It's not
impossible we might want to change this in the future, and even if we
don't, it seems to me to be exposing a whole lot of unnecessary
internal grottiness.


The client doesn't need to know anything about the snapshot blob that 
the server gives it. It just needs to pass it back to the server through 
the other connection. To the client, it's just an opaque chunk of bytes.
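
For instance (purely illustrative function names and a placeholder value;
nothing here requires the client to understand the contents):

-- connection A
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT pg_export_snapshot_blob();   -- returns some opaque text value

-- connection B: the client hands the value back verbatim, unparsed
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT pg_import_snapshot_blob('<value returned on connection A>');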


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-03 Thread Robert Haas
On Thu, Dec 2, 2010 at 9:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 Umm, nobody has attributed ridiculousness to anyone. Please don't put
 words in my mouth. But I think this is a perfectly reasonable discussion
 to have. Nobody gets to come along and get the features they want
 without some sort of consensus, not me, not you, not Joachim, not Tom.

 In particular, this issue *has* been discussed before, and there was a
 consensus that preserving dump consistency was a requirement.  I don't
 think that Joachim gets to bypass that decision just by submitting a
 patch that ignores it.

Well, the discussion that Joachim linked to certainly doesn't have
any sort of clear consensus that that's the only way to go.  In fact,
it seems to be much closer to the opposite consensus.  Perhaps there
is some OTHER time that this has been discussed where "synchronization
is a hard requirement" was the consensus.  There's an old saw that the
nice thing about standards is there are so many to choose from, and
the same thing can certainly be said about -hackers discussions on any
particular topic.

I actually think that the phrase "this has been discussed before and
rejected" should be permanently removed from our list of excuses for
rejecting a patch.  Or if we must use that excuse, then I think a link
to the relevant discussion is a must, and the relevant discussion had
better reflect the fact that $TOPIC was in fact rejected.  It seems to
me that in at least 50% of cases, someone comes back and says one of
the following things:

1. I searched the archives and could find no discussion along those lines.
2. I read that discussion and it doesn't appear to me that it reflects
a rejection of this idea.  Instead what people seemed to be saying was
X.
3. At the time that might have been true, but what has changed in the
meanwhile is X.

In short, the problem with referring to previous discussions is that
our memories grow fuzzy over time.  We remember that an idea was not
adopted, but not exactly why it wasn't adopted.  We reject a new patch
with a good implementation of $FEATURE because an old patch was badly
done, or fell down on some peripheral issue, or just never got done.
Veteran backend hackers understand the inevitable necessity of arguing
about what consensus is actually reflected in the archives and whether
it's still relevant, but new people can be (and frequently are) put
off by it; and even for experienced contributors, it does little to
advance the dialogue.  Hmm, according to so-and-so's memory, sometime
in the fourteen-year history of the project someone didn't like this
idea, or maybe a similar one.  Whee, time to start Googling.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-03 Thread Andrew Dunstan



On 12/02/2010 11:44 PM, Joachim Wieland wrote:

On Thu, Dec 2, 2010 at 9:33 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

In particular, this issue *has* been discussed before, and there was a
consensus that preserving dump consistency was a requirement.  I don't
think that Joachim gets to bypass that decision just by submitting a
patch that ignores it.

I am not trying to bypass anything here :)  Regarding the locking
issue I probably haven't done sufficient research, at least I managed
to miss the emails that mentioned it. Anyway, that seems to be solved
now fortunately, I'm going to implement your idea over the weekend.

Regarding snapshot cloning and dump consistency, I brought this up
already several months ago and asked if the feature is considered
useful even without snapshot cloning. And actually it was you who
motivated me to work on it even without having snapshot consistency...

http://archives.postgresql.org/pgsql-hackers/2010-03/msg01181.php

In my patch pg_dump emits a warning when called with -j, if you feel
better with an extra option
--i-know-that-i-have-no-synchronized-snapshots, fine with me :-)

In the end we provide a tool with limitations, it might not serve all
use cases but there are use cases that would benefit a lot. I
personally think this is better than to provide no tool at all...





I think Tom's statement there:


I think migration to a new server version (that's too incompatible for
PITR or pg_migrate migration) is really the only likely use case.


is just wrong. Say you have a site that's open 24/7. But there is a 
window of, say, 6 hours each day, when it's almost but not quite quiet. 
You want to be able to make your disaster recovery dump within that 
window, and the low level of traffic means you can afford the degraded 
performance that might result from a parallel dump. Or say you have a 
hot standby machine from which you want to make the dump but want to set 
the max_standby_*_delay as low as possible. These are both cases where 
you might want parallel dump and yet you want dump consistency. I have a 
client currently considering the latter setup, and the timing tolerances 
are a little tricky. The times in which the system is in a state that we 
want dumped are fixed, and we want to be sure that the dump is finished 
by the next time such a time rolls around. (This is a system that in 
effect makes one giant state change at a time.) If we can't complete the 
dump in that time then there will be a delay introduced to the system's 
critical path. Parallel dump will be very useful in helping us avoid 
such a situation, but only if it's properly consistent.


I think Josh Berkus' comments in the thread you mentioned are correct:


Actually, I'd say that there's a broad set of cases of people who want
to do a parallel pg_dump while their system is active.  Parallel pg_dump
on a stopped system will help some people (for migration, particularly)
but parallel pg_dump with snapshot cloning will help a lot more people.




cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-03 Thread Robert Haas
On Fri, Dec 3, 2010 at 8:02 AM, Andrew Dunstan and...@dunslane.net wrote:
 I think Josh Berkus' comments in the thread you mentioned are correct:

 Actually, I'd say that there's a broad set of cases of people who want
 to do a parallel pg_dump while their system is active.  Parallel pg_dump
 on a stopped system will help some people (for migration, particularly)
 but parallel pg_dump with snapshot cloning will help a lot more people.

But you failed to quote the rest of what he said:

 So: if parallel dump in single-user mode is what you can get done, then
 do it.  We can always improve it later, and we have to start somewhere.
 But we will eventually need parallel pg_dump on active systems, and
 that should remain on the TODO list.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-03 Thread Andrew Dunstan



On 12/03/2010 11:23 AM, Robert Haas wrote:

On Fri, Dec 3, 2010 at 8:02 AM, Andrew Dunstanand...@dunslane.net  wrote:

I think Josh Berkus' comments in the thread you mentioned are correct:


Actually, I'd say that there's a broad set of cases of people who want
to do a parallel pg_dump while their system is active.  Parallel pg_dump
on a stopped system will help some people (for migration, particularly)
but parallel pg_dump with snapshot cloning will help a lot more people.

But you failed to quote the rest of what he said:


So: if parallel dump in single-user mode is what you can get done, then
do it.  We can always improve it later, and we have to start somewhere.
But we will eventually need parallel pg_dump on active systems, and
that should remain on the TODO list.


Right, and the reason I don't think that's right is that it seems to me 
like a serious potential footgun.


But in any case, the reason I quoted Josh was in answer to a different 
point, namely Tom's statement about the limited potential uses.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-03 Thread Robert Haas
On Fri, Dec 3, 2010 at 11:40 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 12/03/2010 11:23 AM, Robert Haas wrote:

 On Fri, Dec 3, 2010 at 8:02 AM, Andrew Dunstanand...@dunslane.net
  wrote:

 I think Josh Berkus' comments in the thread you mentioned are correct:

 Actually, I'd say that there's a broad set of cases of people who want
 to do a parallel pg_dump while their system is active.  Parallel pg_dump
 on a stopped system will help some people (for migration, particularly)
 but parallel pg_dump with snapshot cloning will help a lot more people.

 But you failed to quote the rest of what he said:

 So: if parallel dump in single-user mode is what you can get done, then
 do it.  We can always improve it later, and we have to start somewhere.
 But we will eventually need parallel pg_dump on active systems, and
 that should remain on the TODO list.

 Right, and the reason I don't think that's right is that it seems to me like
 a serious potential footgun.

 But in any case, the reason I quoted Josh was in answer to a different
 point, namely Tom's statement about the limited potential uses.

I know the use cases are limited, but I think it's still useful on its own.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-03 Thread Alvaro Herrera
Excerpts from Robert Haas's message of Fri Dec 03 13:56:32 -0300 2010:

 I know the use cases are limited, but I think it's still useful on its own.

I don't understand what's so difficult about starting with the snapshot
cloning patch.  AFAIR it's already been written anyway, no?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-03 Thread Andrew Dunstan



On 12/03/2010 12:17 PM, Alvaro Herrera wrote:

Excerpts from Robert Haas's message of Fri Dec 03 13:56:32 -0300 2010:


I know the use cases are limited, but I think it's still useful on its own.

I don't understand what's so difficult about starting with the snapshot
cloning patch.  AFAIR it's already been written anyway, no?



Yeah. If we can do it then this whole argument becomes moot. Like you I 
don't see why we can't.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Heikki Linnakangas

On 02.12.2010 07:39, Joachim Wieland wrote:

On Sun, Nov 14, 2010 at 6:52 PM, Joachim Wielandj...@mcknight.de  wrote:

You would add a regular parallel dump with

$ pg_dump -j 4 -Fd -f out.dir dbname


So this is an updated series of patches for my parallel pg_dump WIP
patch. Most importantly it now runs on Windows once you get it to
compile there (I have added the new files to the respective project of
Mkvcbuild.pm but I wondered why the other archive formats do not need
to be defined in that file...).

So far nobody has volunteered to review this patch. It would be great
if people could at least check it out, run it and let me know if it
works and if they have any comments.


That's a big patch..

I don't see the point of the sort-by-relpages code. The order the 
objects are dumped should be irrelevant, as long as you obey the 
restrictions dictated by dependencies. Or is it only needed for the 
multiple-target-dirs feature? Frankly I don't see the point of that, so 
it would be good to cull it out at least in this first stage.



--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Dimitri Fontaine
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 I don't see the point of the sort-by-relpages code. The order the objects
 are dumped should be irrelevant, as long as you obey the restrictions
 dictated by dependencies. Or is it only needed for the multiple-target-dirs
 feature? Frankly I don't see the point of that, so it would be good to cull
 it out at least in this first stage.

From the talk at CHAR(10), and provided memory serves, it's an
optimisation so that you're doing the largest file in one process and all
the little files in other processes. In lots of cases the total pg_dump
duration is then reduced to about the time to dump the biggest files.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Joachim Wieland
On Thu, Dec 2, 2010 at 6:19 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I don't see the point of the sort-by-relpages code. The order the objects
 are dumped should be irrelevant, as long as you obey the restrictions
 dictated by dependencies. Or is it only needed for the multiple-target-dirs
 feature? Frankly I don't see the point of that, so it would be good to cull
 it out at least in this first stage.

A guy called Dimitri Fontaine actually proposed the
several-directories feature here and other people liked the idea.

http://archives.postgresql.org/pgsql-hackers/2008-02/msg01061.php  :-)

The code doesn't change much with or without it, and if people are no
longer in favour of it, I have no problem with taking it out.

As Dimitri has already pointed out, the relpages sorting thing is there
to start with the largest table(s) first.
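
Roughly speaking, the scheduling input amounts to something like this
query (a sketch only, not the patch's actual code):

SELECT c.oid::regclass AS table_name, c.relpages
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r'
   AND n.nspname NOT IN ('pg_catalog', 'information_schema')
 ORDER BY c.relpages DESC;   -- hand the biggest tables to workers first

Starting with the biggest tables avoids the case where a worker picks up
a huge table at the very end while the others sit idle, so the total
runtime tends toward the time needed for the single largest table.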


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Dimitri Fontaine
Joachim Wieland j...@mcknight.de writes:
 A guy called Dimitri Fontaine actually proposed the
 several-directories feature here and other people liked the idea.

Hehe :)

Reading that now, it could be that I didn't know at the time that given
a powerful enough disk subsystem there's no way to saturate it with one
CPU. So the use case of parallel dump in a bunch of user-given locations
would be to use different mount points (disk subsystems) at the same
time.  Not sure how relevant it is.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Josh Berkus

On 12/02/2010 05:50 AM, Dimitri Fontaine wrote:

So the use case of parallel dump in a bunch of user-given locations
would be to use different mount points (disk subsystems) at the same
time.  Not sure how relevant it is.


I think it will complicate this feature unnecessarily for 9.1. 
Personally, I need this patch so much I'm thinking of backporting it. 
However, having all the data go to one directory/mount wouldn't trouble 
me at all.


Now, if only I could think of some way to write a parallel dump to a set 
of pipes, I'd be in heaven.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Andrew Dunstan



On 12/02/2010 12:56 PM, Josh Berkus wrote:

On 12/02/2010 05:50 AM, Dimitri Fontaine wrote:

So the use case of parallel dump in a bunch of user-given locations
would be to use different mount points (disk subsystems) at the same
time.  Not sure how relevant it is.


I think it will complicate this feature unnecessarily for 9.1. 
Personally, I need this patch so much I'm thinking of backporting it. 
However, having all the data go to one directory/mount wouldn't 
trouble me at all.


Now, if only I could think of some way to write a parallel dump to a 
set of pipes, I'd be in heaven.


The only way I can see that working sanely would be to have a program 
gathering stuff at the other end of the pipes, and ensuring it was all 
coherent. That would be a huge growth in scope for this, and I seriously 
doubt it's worth it.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Josh Berkus

 Now, if only I could think of some way to write a parallel dump to a
 set of pipes, I'd be in heaven.
 
 The only way I can see that working sanely would be to have a program
 gathering stuff at the other end of the pipes, and ensuring it was all
 coherent. That would be a huge growth in scope for this, and I seriously
 doubt it's worth it.

Oh, no question.  And there's workarounds ... sshfs, for example.  I'm
just thinking of the ad-hoc parallel backup I'm running today, which
relies heavily on pipes.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Joachim Wieland
On Thu, Dec 2, 2010 at 12:56 PM, Josh Berkus j...@agliodbs.com wrote:
 Now, if only I could think of some way to write a parallel dump to a set of
 pipes, I'd be in heaven.

What exactly are you trying to accomplish with the pipes?

Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 That's a big patch..

Not nearly big enough :-(

In the past, proposals for this have always been rejected on the grounds
that it's impossible to assure a consistent dump if different
connections are used to read different tables.  I fail to understand
why that consideration can be allowed to go by the wayside now.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Andrew Dunstan



On 12/02/2010 05:01 PM, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

That's a big patch..

Not nearly big enough :-(

In the past, proposals for this have always been rejected on the grounds
that it's impossible to assure a consistent dump if different
connections are used to read different tables.  I fail to understand
why that consideration can be allowed to go by the wayside now.




Well, snapshot cloning should allow that objection to be overcome, no?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 12/02/2010 05:01 PM, Tom Lane wrote:
 In the past, proposals for this have always been rejected on the grounds
 that it's impossible to assure a consistent dump if different
 connections are used to read different tables.  I fail to understand
 why that consideration can be allowed to go by the wayside now.

 Well, snapshot cloning should allow that objection to be overcome, no?

Possibly, but we need to see that patch first not second.

(I'm not actually convinced that snapshot cloning is the only problem
here; locking could be an issue too, if there are concurrent processes
trying to take locks that will conflict with pg_dump's.  But the
snapshot issue is definitely a showstopper.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Bruce Momjian
Dimitri Fontaine wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
  I don't see the point of the sort-by-relpages code. The order the objects
  are dumped should be irrelevant, as long as you obey the restrictions
  dictated by dependencies. Or is it only needed for the multiple-target-dirs
  feature? Frankly I don't see the point of that, so it would be good to cull
  it out at least in this first stage.
 
 From the talk at CHAR(10), and provided memory serves, it's an
 optimisation so that you're doing the largest file in one process and all
 the little files in other processes. In lots of cases the total pg_dump
 duration is then reduced to about the time to dump the biggest files.

Seems there should be a comment in the code explaining why this is being
done.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Andrew Dunstan



On 12/02/2010 05:32 PM, Tom Lane wrote:

Andrew Dunstanand...@dunslane.net  writes:

On 12/02/2010 05:01 PM, Tom Lane wrote:

In the past, proposals for this have always been rejected on the grounds
that it's impossible to assure a consistent dump if different
connections are used to read different tables.  I fail to understand
why that consideration can be allowed to go by the wayside now.

Well, snapshot cloning should allow that objection to be overcome, no?

Possibly, but we need to see that patch first not second.


Yes, I agree with that.


(I'm not actually convinced that snapshot cloning is the only problem
here; locking could be an issue too, if there are concurrent processes
trying to take locks that will conflict with pg_dump's.  But the
snapshot issue is definitely a showstopper.)





Why is that more an issue with parallel pg_dump?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Robert Haas
On Thu, Dec 2, 2010 at 5:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 12/02/2010 05:01 PM, Tom Lane wrote:
 In the past, proposals for this have always been rejected on the grounds
 that it's impossible to assure a consistent dump if different
 connections are used to read different tables.  I fail to understand
 why that consideration can be allowed to go by the wayside now.

 Well, snapshot cloning should allow that objection to be overcome, no?

 Possibly, but we need to see that patch first not second.

Yes, by all means let's allow the perfect to be the enemy of the good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Andrew Dunstan



On 12/02/2010 07:13 PM, Robert Haas wrote:

On Thu, Dec 2, 2010 at 5:32 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

Andrew Dunstanand...@dunslane.net  writes:

On 12/02/2010 05:01 PM, Tom Lane wrote:

In the past, proposals for this have always been rejected on the grounds
that it's impossible to assure a consistent dump if different
connections are used to read different tables.  I fail to understand
why that consideration can be allowed to go by the wayside now.

Well, snapshot cloning should allow that objection to be overcome, no?

Possibly, but we need to see that patch first not second.

Yes, by all means let's allow the perfect to be the enemy of the good.



That seems like a bit of an easy shot. Requiring that parallel pg_dump 
produce a dump that is as consistent as non-parallel pg_dump currently 
produces isn't unreasonable. It's not stopping us moving forward, it's 
just not wanting to go backwards.


And it shouldn't be terribly hard. IIRC Joachim has already done some 
work on it.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Robert Haas
On Thu, Dec 2, 2010 at 7:21 PM, Andrew Dunstan and...@dunslane.net wrote:
 In the past, proposals for this have always been rejected on the
 grounds
 that it's impossible to assure a consistent dump if different
 connections are used to read different tables.  I fail to understand
 why that consideration can be allowed to go by the wayside now.
 Well, snapshot cloning should allow that objection to be overcome, no?
 Possibly, but we need to see that patch first not second.
 Yes, by all means let's allow the perfect to be the enemy of the good.


 That seems like a bit of an easy shot. Requiring that parallel pg_dump
 produce a dump that is as consistent as non-parallel pg_dump currently
 produces isn't unreasonable.  It's not stopping us moving forward, it's just
 not wanting to go backwards.

I certainly agree that would be nice.  But if Joachim thought the
patch were useless without that, perhaps he wouldn't have bothered
writing it at this point.  In fact, he doesn't think that, and he
mentioned the use cases he sees in his original post.  But even
supposing you wouldn't personally find this useful in those
situations, how can you possibly say that HE wouldn't find it useful
in those situations?  I understand that people sometimes show up here
and ask for ridiculous things, but I don't think we should be too
quick to attribute ridiculousness to regular contributors.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Andrew Dunstan



On 12/02/2010 07:48 PM, Robert Haas wrote:

On Thu, Dec 2, 2010 at 7:21 PM, Andrew Dunstanand...@dunslane.net  wrote:

In the past, proposals for this have always been rejected on the
grounds
that it's impossible to assure a consistent dump if different
connections are used to read different tables.  I fail to understand
why that consideration can be allowed to go by the wayside now.

Well, snapshot cloning should allow that objection to be overcome, no?

Possibly, but we need to see that patch first not second.

Yes, by all means let's allow the perfect to be the enemy of the good.


That seems like a bit of an easy shot. Requiring that parallel pg_dump
produce a dump that is as consistent as non-parallel pg_dump currently
produces isn't unreasonable.  It's not stopping us moving forward, it's just
not wanting to go backwards.

I certainly agree that would be nice.  But if Joachim thought the
patch were useless without that, perhaps he wouldn't have bothered
writing it at this point.  In fact, he doesn't think that, and he
mentioned the use cases he sees in his original post.  But even
supposing you wouldn't personally find this useful in those
situations, how can you possibly say that HE wouldn't find it useful
in those situations?  I understand that people sometimes show up here
and ask for ridiculous things, but I don't think we should be too
quick to attribute ridiculousness to regular contributors.



Umm, nobody has attributed ridiculousness to anyone. Please don't put 
words in my mouth. But I think this is a perfectly reasonable discussion 
to have. Nobody gets to come along and get the features they want 
without some sort of consensus, not me, not you, not Joachim, not Tom.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Robert Haas
On Dec 2, 2010, at 8:11 PM, Andrew Dunstan and...@dunslane.net wrote:
 Umm, nobody has attributed ridiculousness to anyone. Please don't put words 
 in my mouth. But I think this is a perfectly reasonable discussion to have. 
 Nobody gets to come along and get the features they want without some sort of 
 consensus, not me, not you, not Joachim, not Tom.

I'm not disputing that we COULD reject the patch. I AM disputing that we've 
made a cogent argument for doing so.

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 12/02/2010 05:32 PM, Tom Lane wrote:
 (I'm not actually convinced that snapshot cloning is the only problem
 here; locking could be an issue too, if there are concurrent processes
 trying to take locks that will conflict with pg_dump's.  But the
 snapshot issue is definitely a showstopper.)

 Why is that more an issue with parallel pg_dump?

The scenario that bothers me is

1. pg_dump parent process AccessShareLocks everything to be dumped.

2. somebody else tries to acquire AccessExclusiveLock on table foo.

3. pg_dump child process is told to dump foo, tries to acquire
AccessShareLock.

Now, process 3 is blocked behind process 2 is blocked behind process 1
which is waiting for 3 to complete.  Can you say undetectable deadlock?
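
Spelled out with an illustrative table foo, the sequence is roughly:

-- session 1: pg_dump parent
BEGIN;
LOCK TABLE foo IN ACCESS SHARE MODE;       -- granted

-- session 2: some other client
ALTER TABLE foo ADD COLUMN bar integer;    -- needs AccessExclusiveLock,
                                           -- queues behind session 1

-- session 3: pg_dump child, told to dump foo
BEGIN;
LOCK TABLE foo IN ACCESS SHARE MODE;       -- conflicts with the queued
                                           -- AccessExclusiveLock, so it waits

-- 1 waits (inside pg_dump) for 3, 3 waits for 2, 2 waits for 1; the server
-- never sees the 1-waits-for-3 edge, so its deadlock detector can't help.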

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Andrew Dunstan



On 12/02/2010 09:09 PM, Tom Lane wrote:

Andrew Dunstanand...@dunslane.net  writes:

On 12/02/2010 05:32 PM, Tom Lane wrote:

(I'm not actually convinced that snapshot cloning is the only problem
here; locking could be an issue too, if there are concurrent processes
trying to take locks that will conflict with pg_dump's.  But the
snapshot issue is definitely a showstopper.)

Why is that more an issue with parallel pg_dump?

The scenario that bothers me is

1. pg_dump parent process AccessShareLocks everything to be dumped.

2. somebody else tries to acquire AccessExclusiveLock on table foo.
hmm.
3. pg_dump child process is told to dump foo, tries to acquire
AccessShareLock.

Now, process 3 is blocked behind process 2 is blocked behind process 1
which is waiting for 3 to complete.  Can you say undetectable deadlock?




Hmm. Yeah. Maybe we could get around it if we prefork the workers and 
they all acquire locks on everything to be dumped up front in nowait 
mode, right after the parent, and if they can't the whole dump fails. Or 
something along those lines.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Umm, nobody has attributed ridiculousness to anyone. Please don't put 
 words in my mouth. But I think this is a perfectly reasonable discussion 
 to have. Nobody gets to come along and get the features they want 
 without some sort of consensus, not me, not you, not Joachim, not Tom.

In particular, this issue *has* been discussed before, and there was a
consensus that preserving dump consistency was a requirement.  I don't
think that Joachim gets to bypass that decision just by submitting a
patch that ignores it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 12/02/2010 09:09 PM, Tom Lane wrote:
 Now, process 3 is blocked behind process 2 is blocked behind process 1
 which is waiting for 3 to complete.  Can you say undetectable deadlock?

 Hmm. Yeah. Maybe we could get around it if we prefork the workers and 
 they all acquire locks on everything to be dumped up front in nowait 
 mode, right after the parent, and if they can't the whole dump fails. Or 
 something along those lines.

[ thinks for a bit... ]  Actually it might be good enough if a child
simply takes the lock it needs in nowait mode, and reports failure on
error.  We know the parent already has that lock, so the only way that
the child's request can fail is if something conflicting with
AccessShareLock is queued up behind the parent's lock.  So failure to
get the child lock immediately proves that the deadlock case applies.
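
In other words, each child would do something along these lines (table
name only illustrative):

LOCK TABLE foo IN ACCESS SHARE MODE NOWAIT;
-- granted: nothing conflicting is queued behind the parent's lock,
--   so it is safe to go ahead and dump foo
-- ERROR 55P03 (lock_not_available): a conflicting request is already
--   queued behind the parent's lock, i.e. the deadlock scenario described
--   upthread; the child reports failure and the dump is aborted instead
--   of hanging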

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Andrew Dunstan



On 12/02/2010 09:41 PM, Tom Lane wrote:

Andrew Dunstanand...@dunslane.net  writes:

On 12/02/2010 09:09 PM, Tom Lane wrote:

Now, process 3 is blocked behind process 2 is blocked behind process 1
which is waiting for 3 to complete.  Can you say undetectable deadlock?

Hmm. Yeah. Maybe we could get around it if we prefork the workers and
they all acquire locks on everything to be dumped up front in nowait
mode, right after the parent, and if they can't the whole dump fails. Or
something along those lines.

[ thinks for a bit... ]  Actually it might be good enough if a child
simply takes the lock it needs in nowait mode, and reports failure on
error.  We know the parent already has that lock, so the only way that
the child's request can fail is if something conflicting with
AccessShareLock is queued up behind the parent's lock.  So failure to
get the child lock immediately proves that the deadlock case applies.





Yeah, that would be a whole lot simpler. It would avoid the deadlock, 
but it would have lots more chances for failure. But it would at least 
be a good place to start.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-02 Thread Joachim Wieland
On Thu, Dec 2, 2010 at 9:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 In particular, this issue *has* been discussed before, and there was a
 consensus that preserving dump consistency was a requirement.  I don't
 think that Joachim gets to bypass that decision just by submitting a
 patch that ignores it.

I am not trying to bypass anything here :)  Regarding the locking
issue I probably haven't done sufficient research, at least I managed
to miss the emails that mentioned it. Anyway, that seems to be solved
now fortunately, I'm going to implement your idea over the weekend.

Regarding snapshot cloning and dump consistency, I brought this up
already several months ago and asked if the feature is considered
useful even without snapshot cloning. And actually it was you who
motivated me to work on it even without having snapshot consistency...

http://archives.postgresql.org/pgsql-hackers/2010-03/msg01181.php

In my patch pg_dump emits a warning when called with -j, if you feel
better with an extra option
--i-know-that-i-have-no-synchronized-snapshots, fine with me :-)

In the end we provide a tool with limitations, it might not serve all
use cases but there are use cases that would benefit a lot. I
personally think this is better than to provide no tool at all...


Joachim

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers