Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-09-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 After further reflection, I think that the above reasoning is wrong.
 GetNewRelFileNode() guarantees that the OID doesn't collide with the
 OID column of pg_class, not the relfilenode column of pg_class; and,
 per the comments to that function, it only guarantees this when
 creating a new relation (to which we're therefore assigning an OID)
 and not when rewriting an existing relation.  So it provides no
 protection against the scenario where backend #1 drops a table that
 lacks physical storage just as backend #2 assigns that OID to some
 other relation and begins creating files, which backend #1 then blows
 away.

 However, I think we're safe for a different reason [ omitted ]

The above scenario is only a risk if you suppose that dropping a
relation that lacks physical storage will nonetheless result in
attempted unlink()s.  I think that that's probably not the case;
but it seems related to a question that was bothering me recently.
Namely: why do we assign relfilenode values to relations that have
no physical storage?  If we did not do so, then relation drop would
see a zero in relfilenode and would certainly not attempt an incorrect
unlink().

So I'd like to look into setting relfilenode to zero for relation
relkinds that lack storage.  I'm not exactly sure why the code doesn't
do that already, though.

This came to mind after reading a commit from Bruce in which he had to
hack up pg_upgrade to preserve relfilenode numbers for storage-less
relations.  Presumably that hack could get undone.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-09-15 Thread Robert Haas
On Wed, Sep 15, 2010 at 12:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 After further reflection, I think that the above reasoning is wrong.
 GetNewRelFileNode() guarantees that the OID doesn't collide with the
 OID column of pg_class, not the relfilenode column of pg_class; and,
 per the comments to that function, it only guarantees this when
 creating a new relation (to which we're therefore assigning an OID)
 and not when rewriting an existing relation.  So it provides no
 protection against the scenario where backend #1 drops a table that
 lacks physical storage just as backend #2 assigns that OID to some
 other relation and begins creating files, which backend #1 then blows
 away.

 However, I think we're safe for a different reason [ omitted ]

 The above scenario is only a risk if you suppose that dropping a
 relation that lacks physical storage will nonetheless result in
 attempted unlink()s.  I think that that's probably not the case;

Why?  How would we know that it didn't have physical storage prior to
attempting the unlinks?

 but it seems related to a question that was bothering me recently.
 Namely: why do we assign relfilenode values to relations that have
 no physical storage?  If we did not do so, then relation drop would
 see a zero in relfilenode and would certainly not attempt an incorrect
 unlink().

I agree that setting relfilenode to 0 for relkinds that lack physical
storage is a good idea, but that's obviously only going to handle that
specific case.

 This came to mind after reading a commit from Bruce in which he had to
 hack up pg_upgrade to preserve relfilenode numbers for storage-less
 relations.  Presumably that hack could get undone.

Seems like a good thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-09-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Sep 15, 2010 at 12:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The above scenario is only a risk if you suppose that dropping a
 relation that lacks physical storage will nonetheless result in
 attempted unlink()s.  I think that that's probably not the case;

 Why?  How would we know that it didn't have physical storage prior to
 attempting the unlinks?

From the relkind.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-09-15 Thread Robert Haas
On Wed, Sep 15, 2010 at 12:33 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Sep 15, 2010 at 12:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The above scenario is only a risk if you suppose that dropping a
 relation that lacks physical storage will nonetheless result in
 attempted unlink()s.  I think that that's probably not the case;

 Why?  How would we know that it didn't have physical storage prior to
 attempting the unlinks?

 From the relkind.

Oh, sure, I agree with you in that specific case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-09-11 Thread Robert Haas
On Thu, Aug 12, 2010 at 1:29 PM, Robert Haas robertmh...@gmail.com wrote:
 Lastly, it bothers me that you've put in code to delete files belonging
 to temp rels during crash restart, without any code to clean up their
 catalog entries.  This will therefore lead to dangling pg_class
 references, with uncertain but probably not very nice consequences.
 I think that probably shouldn't go in until there's code to drop the
 catalog entries too.

 I thought about this pretty carefully, and I don't believe that there
 are any unpleasant consequences.  The code that assigns relfilenode
 numbers is pretty careful to check that the newly assigned value is
 unused BOTH in pg_class and in the directory where the file will be
 created, so there should be no danger of a number getting used over
 again while the catalog entries remain.  Also, the drop-object code
 doesn't mind that the physical storage doesn't exist; it's perfectly
 happy with that situation.  It is true that you will get an ERROR if
 you try to SELECT from a table whose physical storage has been
 removed, but that seems OK.

After further reflection, I think that the above reasoning is wrong.
GetNewRelFileNode() guarantees that the OID doesn't collide with the
OID column of pg_class, not the relfilenode column of pg_class; and,
per the comments to that function, it only guarantees this when
creating a new relation (to which we're therefore assigning an OID)
and not when rewriting an existing relation.  So it provides no
protection against the scenario where backend #1 drops a table that
lacks physical storage just as backend #2 assigns that OID to some
other relation and begins creating files, which backend #1 then blows
away.

However, I think we're safe for a different reason.  The only time we
unlink the physical storage of a relation without nuking its catalog
entries is during the startup sequence, when we wipe out the physical
storage for any leftover temp tables.  So if there's a race condition,
it can only apply to temp tables.  But temp tables for a particular
backend ID can only be created by that backend, and before a backend
will create any temp tables it will drop any previously existing temp
schema.  Thus, by the time a temp table can get created, there CAN'T
be any leftover catalog entries from previous sessions for it to
potentially collide with.

I think the reasoning above probably should be added to the comment at
the beginning of GetNewRelFileNode(), and I'll go do that unless
someone thinks it's incorrect.  The first sentence of that comment is
also now technically incorrect: what it now does is generate a
relfilenode such that tablespace, relfilenode, backendid is unique.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Here is an updated patch.  It's in context-diff format this time,

Thanks, I appreciate that ;-)

This looks committable to me, with a couple of tiny suggestions:

In the text added to storage.sgml, s/temporary relation/temporary relations/.
Also, it'd be better if BBB and FFF were marked up as replaceable
rather than literal, see examples elsewhere in that file.

The comment for local_buffer_write_error_callback() probably meant to
say during local buffer writes.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-13 Thread Robert Haas
On Fri, Aug 13, 2010 at 12:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Here is an updated patch.  It's in context-diff format this time,

 Thanks, I appreciate that ;-)

 This looks committable to me, with a couple of tiny suggestions:

Woo hoo!

 In the text added to storage.sgml, s/temporary relation/temporary relations/.
 Also, it'd be better if BBB and FFF were marked up as replaceable
 rather than literal, see examples elsewhere in that file.

I see.  How should I mark tBBB_FFF?

I actually didn't like that way of explaining it very much, but I
couldn't think of anything clearer.  Saying the name will consist of
a lowercase t, followed by the backend ID, followed by an underscore,
followed by the filenode did not seem better.

 The comment for local_buffer_write_error_callback() probably meant to
 say during local buffer writes.

No doubt.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 13, 2010 at 12:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Also, it'd be better if BBB and FFF were marked up as replaceable
 rather than literal, see examples elsewhere in that file.

 I see.  How should I mark tBBB_FFF?

I'd do
literaltreplaceableBBB/replaceable_replaceableFFF/replaceable/literal

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Here's an updated patch, with the invalidation changes merged in and
 hopefully-suitable adjustments elsewhere.

I haven't tested this patch, but I read through it (and have I mentioned
how unbelievably illegible -u format patches are?).  It seems to be
close to commitable, but I found a few issues.  In no particular order:

storage.sgml needs to be updated to describe this file-naming scheme.

BackendRelFileNode isn't a particularly nice struct name; it seems like
it puts the most important aspect of the struct's purpose somewhere in
the middle of the name.  Maybe RelFileNodeBackend would be better, or
RelFileNodeFull, or something like that.

In GetNewRelFileNode, you've changed some code that is commented
/* This logic should match RelationInitPhysicalAddr */, but there
is not any corresponding change in RelationInitPhysicalAddr.  I find
this bothersome.  I think the problem is that you've made it look
like the assignment of rnode.backend is part of the logic that ought
to match RelationInitPhysicalAddr.  Perhaps set that off, at least
by a blank line, if not its own comment.

The relpath() and relpathperm() macros are a tad underdocumented for my
taste; at least put comments noting that one takes a RelFileNode and the
other a BackendRelFileNode.  And I wonder if you couldn't find better
names for relpathperm and relpathbackend.

assign_maxconnections and assign_autovacuum_max_workers need to be fixed
for MAX_BACKENDS; in fact I think all the occurrences of INT_MAX / 4 in
guc.c need to be replaced.  (And if I were you I'd grep to see if
anyplace outside guc.c knows that value...)  Also, as a matter of style,
the comment currently attached to max_connections needs to be attached
to the definition of the constant, not just modified where it is.

As an old bit-shaver this sorta bothers me:

+++ b/src/include/utils/rel.h
@@ -127,7 +127,7 @@ typedef struct RelationData
struct SMgrRelationData *rd_smgr;   /* cached file handle, or NULL 
*/
int rd_refcnt;  /* reference count */
boolrd_istemp;  /* rel is a temporary relation 
*/
-   boolrd_islocaltemp; /* rel is a temp rel of this session */
+   BackendId   rd_backend; /* owning backend id, if 
temporary relation */
boolrd_isnailed;/* rel is nailed in cache */
boolrd_isvalid; /* relcache entry is valid */
charrd_indexvalid;  /* state of rd_indexlist: 0 = not 
valid, 1 =

The struct is going to be less efficiently packed with that field order.
Maybe swap rd_istemp and rd_backend?

+   Assert(relation-rd_backend != InvalidOid);
ought to be InvalidBackendId, no?  Both new calls of GetTempNamespaceBackendId
have this wrong.  You might also want to change the comment of
GetTempNamespaceBackendId to note that its failure result is not just -1
but InvalidBackendId.

Lastly, it bothers me that you've put in code to delete files belonging
to temp rels during crash restart, without any code to clean up their
catalog entries.  This will therefore lead to dangling pg_class
references, with uncertain but probably not very nice consequences.
I think that probably shouldn't go in until there's code to drop the
catalog entries too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread David Fetter
On Thu, Aug 12, 2010 at 12:27:45PM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  Here's an updated patch, with the invalidation changes merged in and
  hopefully-suitable adjustments elsewhere.
 
 I haven't tested this patch, but I read through it (and have I mentioned
 how unbelievably illegible -u format patches are?).

I have every confidence that you, of all people, can arrange to use
'filterdiff --format=context' on attached patches automatically,
saving you some time and us some boredom :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Joshua D. Drake
On Thu, 2010-08-12 at 09:44 -0700, David Fetter wrote:
 On Thu, Aug 12, 2010 at 12:27:45PM -0400, Tom Lane wrote:
  Robert Haas robertmh...@gmail.com writes:
   Here's an updated patch, with the invalidation changes merged in and
   hopefully-suitable adjustments elsewhere.
  
  I haven't tested this patch, but I read through it (and have I mentioned
  how unbelievably illegible -u format patches are?).
 
 I have every confidence that you, of all people, can arrange to use
 'filterdiff --format=context' on attached patches automatically,
 saving you some time and us some boredom :)

I was under the impression that the project guideline was that we only
accepted context diffs?

Joshua D. Drake

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread David Fetter
On Thu, Aug 12, 2010 at 09:53:54AM -0700, Joshua D. Drake wrote:
 On Thu, 2010-08-12 at 09:44 -0700, David Fetter wrote:
  On Thu, Aug 12, 2010 at 12:27:45PM -0400, Tom Lane wrote:
   Robert Haas robertmh...@gmail.com writes:
Here's an updated patch, with the invalidation changes merged
in and hopefully-suitable adjustments elsewhere.
   
   I haven't tested this patch, but I read through it (and have I
   mentioned how unbelievably illegible -u format patches are?).
  
  I have every confidence that you, of all people, can arrange to
  use 'filterdiff --format=context' on attached patches
  automatically, saving you some time and us some boredom :)
 
 I was under the impression that the project guideline was that we
 only accepted context diffs?

Since they're trivially producible from unified diffs, this is a
pretty silly reason to bounce--or even comment on--patches.  It's less
a guideline than a personal preference, namely Tom's.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Alvaro Herrera
Excerpts from David Fetter's message of jue ago 12 12:59:33 -0400 2010:
 On Thu, Aug 12, 2010 at 09:53:54AM -0700, Joshua D. Drake wrote:

  I was under the impression that the project guideline was that we
  only accepted context diffs?
 
 Since they're trivially producible from unified diffs, this is a
 pretty silly reason to bounce--or even comment on--patches.  It's less
 a guideline than a personal preference, namely Tom's.

Not that I review that many patches, but I do dislike unified diffs too.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Robert Haas
On Thu, Aug 12, 2010 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Here's an updated patch, with the invalidation changes merged in and
 hopefully-suitable adjustments elsewhere.

 I haven't tested this patch, but I read through it (and have I mentioned
 how unbelievably illegible -u format patches are?).

Sorry, I'll run it through filterdiff for you next time.  As you know,
I have the opposite preference, but since I was producing this
primarily for you to read

I can clean up the rest of this stuff, but not this:

 Lastly, it bothers me that you've put in code to delete files belonging
 to temp rels during crash restart, without any code to clean up their
 catalog entries.  This will therefore lead to dangling pg_class
 references, with uncertain but probably not very nice consequences.
 I think that probably shouldn't go in until there's code to drop the
 catalog entries too.

I thought about this pretty carefully, and I don't believe that there
are any unpleasant consequences.  The code that assigns relfilenode
numbers is pretty careful to check that the newly assigned value is
unused BOTH in pg_class and in the directory where the file will be
created, so there should be no danger of a number getting used over
again while the catalog entries remain.  Also, the drop-object code
doesn't mind that the physical storage doesn't exist; it's perfectly
happy with that situation.  It is true that you will get an ERROR if
you try to SELECT from a table whose physical storage has been
removed, but that seems OK.

We have two existing mechanisms for removing the catalog entries: when
a backend is first asked to access a temporary file, it does a DROP
SCHEMA ... CASCADE on any pre-existing temp schema.  And a table is in
wraparound trouble and the owning backend is no longer running,
autovacuum will drop it.  Improving on this seems difficult: if you
wanted to *guarantee* that the catalog entries were removed before we
started letting in connections, you'd need to fork a backend per
database and have each one iterate through all the temp schemas and
drop them.  Considering that the existing code seems to have been
pretty careful about how this stuff gets handled, I don't think it's
worth making the whole startup sequence slower for it.  What might be
worth considering is changing the autovacuum policy to eliminate the
wraparound check, and just have it drop temp table catalog entries for
any backend not currently running, period.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 12, 2010 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Lastly, it bothers me that you've put in code to delete files belonging
 to temp rels during crash restart, without any code to clean up their
 catalog entries.  This will therefore lead to dangling pg_class
 references, with uncertain but probably not very nice consequences.

 I thought about this pretty carefully, and I don't believe that there
 are any unpleasant consequences.  The code that assigns relfilenode
 numbers is pretty careful to check that the newly assigned value is
 unused BOTH in pg_class and in the directory where the file will be
 created, so there should be no danger of a number getting used over
 again while the catalog entries remain.  Also, the drop-object code
 doesn't mind that the physical storage doesn't exist; it's perfectly
 happy with that situation.

Well, okay, but I'd suggest adding comments to the drop-table code
pointing out that it is now NECESSARY for it to not complain if the file
isn't there.  This was never a design goal before, AFAIR --- the fact
that it works like that is kind of accidental.  I am also pretty sure
that there used to be at least warning messages for that case, which we
would now not want.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Robert Haas
On Thu, Aug 12, 2010 at 2:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 12, 2010 at 12:27 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Lastly, it bothers me that you've put in code to delete files belonging
 to temp rels during crash restart, without any code to clean up their
 catalog entries.  This will therefore lead to dangling pg_class
 references, with uncertain but probably not very nice consequences.

 I thought about this pretty carefully, and I don't believe that there
 are any unpleasant consequences.  The code that assigns relfilenode
 numbers is pretty careful to check that the newly assigned value is
 unused BOTH in pg_class and in the directory where the file will be
 created, so there should be no danger of a number getting used over
 again while the catalog entries remain.  Also, the drop-object code
 doesn't mind that the physical storage doesn't exist; it's perfectly
 happy with that situation.

 Well, okay, but I'd suggest adding comments to the drop-table code
 pointing out that it is now NECESSARY for it to not complain if the file
 isn't there.  This was never a design goal before, AFAIR --- the fact
 that it works like that is kind of accidental.  I am also pretty sure
 that there used to be at least warning messages for that case, which we
 would now not want.

That seems like a good idea.  I'll post an updated patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Alvaro Herrera
Excerpts from Robert Haas's message of jue ago 12 13:29:57 -0400 2010:

 We have two existing mechanisms for removing the catalog entries: when
 a backend is first asked to access a temporary file, it does a DROP
 SCHEMA ... CASCADE on any pre-existing temp schema.  And a table is in
 wraparound trouble and the owning backend is no longer running,
 autovacuum will drop it.  Improving on this seems difficult: if you
 wanted to *guarantee* that the catalog entries were removed before we
 started letting in connections, you'd need to fork a backend per
 database and have each one iterate through all the temp schemas and
 drop them.  Considering that the existing code seems to have been
 pretty careful about how this stuff gets handled, I don't think it's
 worth making the whole startup sequence slower for it.  What might be
 worth considering is changing the autovacuum policy to eliminate the
 wraparound check, and just have it drop temp table catalog entries for
 any backend not currently running, period.

What about having autovacuum silenty drop the catalog entry if it's a
temp entry for which the underlying file does not exist?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Robert Haas
On Thu, Aug 12, 2010 at 5:29 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of jue ago 12 13:29:57 -0400 2010:

 We have two existing mechanisms for removing the catalog entries: when
 a backend is first asked to access a temporary file, it does a DROP
 SCHEMA ... CASCADE on any pre-existing temp schema.  And a table is in
 wraparound trouble and the owning backend is no longer running,
 autovacuum will drop it.  Improving on this seems difficult: if you
 wanted to *guarantee* that the catalog entries were removed before we
 started letting in connections, you'd need to fork a backend per
 database and have each one iterate through all the temp schemas and
 drop them.  Considering that the existing code seems to have been
 pretty careful about how this stuff gets handled, I don't think it's
 worth making the whole startup sequence slower for it.  What might be
 worth considering is changing the autovacuum policy to eliminate the
 wraparound check, and just have it drop temp table catalog entries for
 any backend not currently running, period.

 What about having autovacuum silenty drop the catalog entry if it's a
 temp entry for which the underlying file does not exist?

I think that would be subject to race conditions.  The current
mechanism is actually pretty good, and I think we can build on it if
we want to do more, rather than inventing something new.  We just need
to be specific about what problem we're trying to solve.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 12, 2010 at 5:29 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 What about having autovacuum silenty drop the catalog entry if it's a
 temp entry for which the underlying file does not exist?

 I think that would be subject to race conditions.

Well, autovacuum's willingness to drop sufficiently old temp tables
would already risk any such race conditions.  However ...

 The current
 mechanism is actually pretty good, and I think we can build on it if
 we want to do more, rather than inventing something new.  We just need
 to be specific about what problem we're trying to solve.

... I agree with this point.  This idea wouldn't fix the concern I had
about the existence of pg_class entries with no underlying file: if that
causes any issues we'd have to fix them anyway.  So I'm not sure what
the gain is.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Tom Lane
One other thought about all this: in the past, one objection that's been
raised to deleting files during crash restart is the possible loss of
forensic evidence.  It might be a good idea to provide some fairly
simple way for developers to disable that deletion subroutine.  I'm not
sure that it rises to the level of needing a GUC, though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Robert Haas
On Thu, Aug 12, 2010 at 6:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 12, 2010 at 5:29 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 What about having autovacuum silenty drop the catalog entry if it's a
 temp entry for which the underlying file does not exist?

 I think that would be subject to race conditions.

 Well, autovacuum's willingness to drop sufficiently old temp tables
 would already risk any such race conditions.  However ...

I don't think we do.  It only drops them if the backend isn't running.
 And even if the backend starts running after we check and before we
drop it, that's OK.  We're only dropping the OLD table, which by
definition isn't relevant to the new session.  Perhaps we should be
taking a lock on the relation first, but I think that can only really
bite us if the relation were dropped and a new relation were created
with the same OID - in that case, we might drop an unrelated table,
though it's vanishingly unlikely in practice.

 The current
 mechanism is actually pretty good, and I think we can build on it if
 we want to do more, rather than inventing something new.  We just need
 to be specific about what problem we're trying to solve.

 ... I agree with this point.  This idea wouldn't fix the concern I had
 about the existence of pg_class entries with no underlying file: if that
 causes any issues we'd have to fix them anyway.  So I'm not sure what
 the gain is.

Right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Robert Haas
On Thu, Aug 12, 2010 at 7:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 One other thought about all this: in the past, one objection that's been
 raised to deleting files during crash restart is the possible loss of
 forensic evidence.  It might be a good idea to provide some fairly
 simple way for developers to disable that deletion subroutine.  I'm not
 sure that it rises to the level of needing a GUC, though.

See http://archives.postgresql.org/pgsql-hackers/2010-06/msg00622.php :

In this version of the patch, I've improved the temporary file cleanup
mechanism.  In CVS HEAD, when a transaction commits or aborts, we
write an XLOG record with all relations that must be unlinked,
including temporary relations.  With this patch, we no longer include
temporary relations in the XLOG records (which may be a tiny
performance benefit for some people); instead, on every startup of the
database cluster, we just nuke all temporary relation files (which is
now easy to do, since they are named differently than files for
non-temporary relations) at the same time that we nuke other temp
files.  This produces slightly different behavior.  In CVS HEAD,
temporary files get removed whenever an xlog redo happens, so either
at cluster start or after a backend crash, but only to the extent that
they appear in WAL.  With this patch, we can be sure of removing ALL
stray files, which is maybe ever-so-slightly leaky in CVS HEAD.  But
on the other hand, because it hooks into the existing temporary file
cleanup code, it only happens at cluster startup, NOT after a backend
crash.  The existing coding leaves temporary sort files and similar
around after a backend crash for forensics purposes.  That might or
might not be worth rethinking for non-debug builds, but I don't think
there's any very good reason to think that temporary relation files
need to be handled differently than temporary work files.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 12, 2010 at 7:04 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 One other thought about all this: in the past, one objection that's been
 raised to deleting files during crash restart is the possible loss of
 forensic evidence.

 ...  With this patch, we can be sure of removing ALL
 stray files, which is maybe ever-so-slightly leaky in CVS HEAD.  But
 on the other hand, because it hooks into the existing temporary file
 cleanup code, it only happens at cluster startup, NOT after a backend
 crash.

Doh.  Thanks.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-12 Thread Robert Haas
On Thu, Aug 12, 2010 at 2:12 PM, Robert Haas robertmh...@gmail.com wrote:
 That seems like a good idea.  I'll post an updated patch.

Here is an updated patch.  It's in context-diff format this time, but
that made it go over 100K, so I gzip'd it because I can't remember
what the attachment size limit is these days.  I'm not sure whether
that works out to a win or not.

I think this addresses all previous review comments with the exception
that I've been unable to devise a more clever name for relpathperm()
and relpathbackend().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


temprelnames-v5.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Robert Haas
On Thu, Aug 5, 2010 at 7:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 [ BackendRelFileNode patch ]

 One thing that I find rather distressing about this is the 25% bloat
 in sizeof(SharedInvalidationMessage).  Couldn't we avoid that?  Is it
 really necessary to *ever* send an SI message for a backend-local rel?

It can be dropped or unlinked by another backend, so, yes.

 I agree that one needs to send relcache inval sometimes for temp rels,
 but I don't see why each backend couldn't interpret that as a flush
 on its own local version.

The problem is that receipt of the inval message results in a call to
smgrclosenode(), which has to look up the (Backend)RelFileNode in
SMgrRelationHash.  If the backend ID isn't present, we can't search
the hash table efficiently.  This is not only a problem for
smgrclosenode(), either; that hash is used in a bunch of places, so
changing the hash key isn't really an option.  One possibility would
be to have two separate shared-invalidation message types for smgr.
One would indicate that a non-temporary relation was flushed, so the
backend ID field is known to be -1 and we can search the hash quickly.
 The other would indicate that a temporary relation was flushed and
you'd have to scan the whole hash table to find and flush all matching
entries.  That still doesn't sound great, though.  Another thought I
had was that if we could count on the backend ID to fit into an int16
we could fit it in to what's currently padding space.  That would
require rather dramatically lowering the maximum number of backends
(currently INT_MAX/4), but it's a little hard to imagine that we can
really support more than 32,767 simultaneous backends anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 5, 2010 at 7:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 One thing that I find rather distressing about this is the 25% bloat
 in sizeof(SharedInvalidationMessage).  Couldn't we avoid that?  Is it
 really necessary to *ever* send an SI message for a backend-local rel?

 It can be dropped or unlinked by another backend, so, yes.

Really?  Surely that should be illegal during normal operation.  We
might be doing such during crash recovery, but we don't need to
broadcast sinval messages then.

It might be sufficient to consider that there are local and global
smgr inval messages, where the former never get out of the generating
backend, so a bool is enough in the message struct.

 had was that if we could count on the backend ID to fit into an int16
 we could fit it in to what's currently padding space.  That would
 require rather dramatically lowering the maximum number of backends
 (currently INT_MAX/4), but it's a little hard to imagine that we can
 really support more than 32,767 simultaneous backends anyway.

Yeah, that occurred to me too.  A further thought is that the id field
could probably be reduced to 1 byte, leaving 3 for backendid, which
would certainly be plenty.  However representing that in a portable
struct declaration would be a bit painful I fear.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Robert Haas
On Fri, Aug 6, 2010 at 1:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Aug 5, 2010 at 7:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 One thing that I find rather distressing about this is the 25% bloat
 in sizeof(SharedInvalidationMessage).  Couldn't we avoid that?  Is it
 really necessary to *ever* send an SI message for a backend-local rel?

 It can be dropped or unlinked by another backend, so, yes.

 Really?  Surely that should be illegal during normal operation. We
 might be doing such during crash recovery, but we don't need to
 broadcast sinval messages then.

autovacuum.c does it when we start to worry about XID wraparound, but
you can also do it from any normal backend.  Just DROP TABLE
pg_temp_2.foo or whatever and away you go.

 It might be sufficient to consider that there are local and global
 smgr inval messages, where the former never get out of the generating
 backend, so a bool is enough in the message struct.

It would be nice to be able to do it that way, but I don't believe
it's the case, per the above.

 had was that if we could count on the backend ID to fit into an int16
 we could fit it in to what's currently padding space.  That would
 require rather dramatically lowering the maximum number of backends
 (currently INT_MAX/4), but it's a little hard to imagine that we can
 really support more than 32,767 simultaneous backends anyway.

 Yeah, that occurred to me too.  A further thought is that the id field
 could probably be reduced to 1 byte, leaving 3 for backendid, which
 would certainly be plenty.  However representing that in a portable
 struct declaration would be a bit painful I fear.

Well, presumably we'd just represent it as a 1-byte field followed by
a 2-byte field, and do a bit of math.  But I don't really see the
point.  The whole architecture of a shared invalidation queue is
fundamentally non-scalable because it's a broadcast medium.  If we
wanted to efficiently support even thousands of backends (let alone
tens or hundreds of thousands) I assume we would need to rearchitect
this completely with more fine-grained queues, and have backends
subscribe to the queues pertaining to the objects they want to access
before touching them.  Or maybe something else entirely. But I don't
think broadcasting to 30,000 backends is going to work for the same
reason that plugging 30,000 machines into an Ethernet *hub* doesn't
work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 6, 2010 at 1:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Really?  Surely that should be illegal during normal operation. We
 might be doing such during crash recovery, but we don't need to
 broadcast sinval messages then.

 autovacuum.c does it when we start to worry about XID wraparound, but
 you can also do it from any normal backend.  Just DROP TABLE
 pg_temp_2.foo or whatever and away you go.

Mph.  I'm still not convinced that the sinval message needs to carry
backendid.  But maybe the first-cut solution should just be to squeeze
the id into the padding area.  You should be able to get up to 65535
allowed backends, not 32k --- or perhaps use different message type IDs
for local and global backendid, so that all 65536 bitpatterns are
allowed for a non-global backendid.

 Well, presumably we'd just represent it as a 1-byte field followed by
 a 2-byte field, and do a bit of math.  But I don't really see the
 point.  The whole architecture of a shared invalidation queue is
 fundamentally non-scalable because it's a broadcast medium.

Sure, it tops out somewhere, but 32K is way too close to configurations
we know work well enough in the field (I've seen multiple reports of
people using a couple thousand backends).  In any case, sinval readers
don't block each other in the current implementation, so I'm really
dubious that there's any inherent scalability limitation there.  I'll
hold still for 64K but I think it might be better to go for 2^24.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Jaime Casanova
On Fri, Aug 6, 2010 at 12:50 PM, Robert Haas robertmh...@gmail.com wrote:

 Just DROP TABLE
 pg_temp_2.foo or whatever and away you go.


wow! that's true... and certainly a bug...
we shouldn't allow any session to drop other session's temp tables, or
is there a reason for this misbehavior?

-- 
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Tom Lane
Jaime Casanova ja...@2ndquadrant.com writes:
 On Fri, Aug 6, 2010 at 12:50 PM, Robert Haas robertmh...@gmail.com wrote:
 Just DROP TABLE pg_temp_2.foo or whatever and away you go.

 wow! that's true... and certainly a bug...

No, it's not a bug.  You'll find only superusers can do it.

 we shouldn't allow any session to drop other session's temp tables, or
 is there a reason for this misbehavior?

It is intentional, though I'd be willing to give it up if we had more
bulletproof crash-cleanup of temp tables --- which is one of the things
this patch is supposed to result in.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Robert Haas
On Fri, Aug 6, 2010 at 2:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 6, 2010 at 1:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Really?  Surely that should be illegal during normal operation. We
 might be doing such during crash recovery, but we don't need to
 broadcast sinval messages then.

 autovacuum.c does it when we start to worry about XID wraparound, but
 you can also do it from any normal backend.  Just DROP TABLE
 pg_temp_2.foo or whatever and away you go.

 Mph.  I'm still not convinced that the sinval message needs to carry
 backendid.

Hey, if you have an idea... I'd love to get rid of it, too, but I
don't see how to do it.

 But maybe the first-cut solution should just be to squeeze
 the id into the padding area.  You should be able to get up to 65535
 allowed backends, not 32k --- or perhaps use different message type IDs
 for local and global backendid, so that all 65536 bitpatterns are
 allowed for a non-global backendid.

 Well, presumably we'd just represent it as a 1-byte field followed by
 a 2-byte field, and do a bit of math.  But I don't really see the
 point.  The whole architecture of a shared invalidation queue is
 fundamentally non-scalable because it's a broadcast medium.

 Sure, it tops out somewhere, but 32K is way too close to configurations
 we know work well enough in the field (I've seen multiple reports of
 people using a couple thousand backends).  In any case, sinval readers
 don't block each other in the current implementation, so I'm really
 dubious that there's any inherent scalability limitation there.  I'll
 hold still for 64K but I think it might be better to go for 2^24.

Well, I wouldn't expect anyone to use an exclusive lock for readers
without a good reason, but you still have n backends that each have to
read, presumably, about O(n) messages, so eventually that's going to
start to pinch.

Do you think it's worth worrying about the reduction in the number of
possible SI message types?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Robert Haas
On Fri, Aug 6, 2010 at 2:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jaime Casanova ja...@2ndquadrant.com writes:
 On Fri, Aug 6, 2010 at 12:50 PM, Robert Haas robertmh...@gmail.com wrote:
 Just DROP TABLE pg_temp_2.foo or whatever and away you go.

 wow! that's true... and certainly a bug...

 No, it's not a bug.  You'll find only superusers can do it.

 we shouldn't allow any session to drop other session's temp tables, or
 is there a reason for this misbehavior?

 It is intentional, though I'd be willing to give it up if we had more
 bulletproof crash-cleanup of temp tables --- which is one of the things
 this patch is supposed to result in.

This patch only directly addresses the issue of cleaning up the
storage, so there are still the catalog entries to worry about.  But
it doesn't seem impossible to think about building on this approach to
eventually handle that part of the problem better, too.  I haven't
thought too much about what that would look like, but I think it could
be done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 6, 2010 at 2:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sure, it tops out somewhere, but 32K is way too close to configurations
 we know work well enough in the field (I've seen multiple reports of
 people using a couple thousand backends).

 Well, I wouldn't expect anyone to use an exclusive lock for readers
 without a good reason, but you still have n backends that each have to
 read, presumably, about O(n) messages, so eventually that's going to
 start to pinch.

Sure, but I don't see much to be gained from multiple queues either.
There are few (order of zero, in fact) cases where sinval messages
are transmitted that aren't of potential interest to all backends.
Maybe you could do something useful with a very large number of
dynamically-defined queues (like one per relation) ... but managing that
would probably swamp any savings.

 Do you think it's worth worrying about the reduction in the number of
 possible SI message types?

IIRC the number of message types is the number of catalog caches plus
half a dozen or so.  We're a long way from exhausting even a 1-byte
ID field; and we could play more games if we had to, since there would
be a padding byte free in the message types that refer to a catalog
cache.  IOW, 1-byte id doesn't bother me.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 This patch only directly addresses the issue of cleaning up the
 storage, so there are still the catalog entries to worry about.  But
 it doesn't seem impossible to think about building on this approach to
 eventually handle that part of the problem better, too.  I haven't
 thought too much about what that would look like, but I think it could
 be done.

Perhaps run through pg_class after restart and flush anything marked
relistemp?  Although the ideal solution, probably, would be for temp
tables to not have persistent catalog entries in the first place.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread David Fetter
On Fri, Aug 06, 2010 at 03:00:35PM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  This patch only directly addresses the issue of cleaning up the
  storage, so there are still the catalog entries to worry about.
  But it doesn't seem impossible to think about building on this
  approach to eventually handle that part of the problem better,
  too.  I haven't thought too much about what that would look like,
  but I think it could be done.
 
 Perhaps run through pg_class after restart and flush anything marked
 relistemp?  Although the ideal solution, probably, would be for temp
 tables to not have persistent catalog entries in the first place.

For the upcoming global temp tables, which are visible in other
sessions, how would this work?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Tom Lane
David Fetter da...@fetter.org writes:
 On Fri, Aug 06, 2010 at 03:00:35PM -0400, Tom Lane wrote:
 Perhaps run through pg_class after restart and flush anything marked
 relistemp?  Although the ideal solution, probably, would be for temp
 tables to not have persistent catalog entries in the first place.

 For the upcoming global temp tables, which are visible in other
 sessions, how would this work?

Well, that's a totally different matter.  Those would certainly have
persistent entries, at least for the master version.  I don't think
anyone's really figured out what the best implementation looks like;
but maybe there would be per-backend child versions that could act
just like the current kind of temp table (and again would ideally not
have any persistent catalog entries).

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Robert Haas
On Fri, Aug 6, 2010 at 3:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 This patch only directly addresses the issue of cleaning up the
 storage, so there are still the catalog entries to worry about.  But
 it doesn't seem impossible to think about building on this approach to
 eventually handle that part of the problem better, too.  I haven't
 thought too much about what that would look like, but I think it could
 be done.

 Perhaps run through pg_class after restart and flush anything marked
 relistemp?

The trouble is that you have to bind to a database before you can run
through pg_class, and the postmaster doesn't.  Of course, if it could
attach to a database and then detach again, this might be feasible,
although perhaps still a bit overly complex for the postmaster, but in
any event it doesn't.  We seem to already have a mechanism in place
where a backend that creates a temprel nukes any pre-existing temprel
schema for its own backend, but of course that's not fool-proof.

 Although the ideal solution, probably, would be for temp
 tables to not have persistent catalog entries in the first place.

I've been thinking about that, but it's a bit challenging to imagine
how it could work.  It's not just the pg_class entries you have to
think about, but also pg_attrdef, pg_attribute, pg_constraint,
pg_description, pg_index, pg_rewrite, and pg_trigger.  An even
stickier problem is that we have lots of places in the backend code
that refer to objects by OID.  We'd either need to change all of that
code (which seems like a non-starter) or somehow guarantee that the
OIDs assigned to any given backend's private objects would be
different from those assigned to any public object (which I also don't
see how to do).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Robert Haas
On Fri, Aug 6, 2010 at 2:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 6, 2010 at 2:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Sure, it tops out somewhere, but 32K is way too close to configurations
 we know work well enough in the field (I've seen multiple reports of
 people using a couple thousand backends).

 Well, I wouldn't expect anyone to use an exclusive lock for readers
 without a good reason, but you still have n backends that each have to
 read, presumably, about O(n) messages, so eventually that's going to
 start to pinch.

 Sure, but I don't see much to be gained from multiple queues either.
 There are few (order of zero, in fact) cases where sinval messages
 are transmitted that aren't of potential interest to all backends.
 Maybe you could do something useful with a very large number of
 dynamically-defined queues (like one per relation) ... but managing that
 would probably swamp any savings.

Well, what I was thinking is that if you could guarantee that a
certain backend COULDN'T have a particular relfilenode open at the
smgr level, for example, then it needn't read the invalidation
messages for that relfilenode.  Precisely how to slice that up is
another matter.  For the present case, for instance, you could
creating one queue per backend.  In the normal course of events, each
backend would subscribe only to its own queue, but if one backend
wanted to drop a temporary relation belonging to some other backend,
it would temporarily subscribe to that backend's queue, do whatever it
needed to do, and then flush all the smgr references before
unsubscribing from the queue.  That's a bit silly because we doubtless
wouldn't invent such a complicated mechanism just for this case, but I
think it illustrates the kind of thing that one could do.  Or if you
wanted to optimize for the case of a large number of databases running
in a single cluster, perhaps you'd want one queue per database plus a
shared queue for the shared catalogs.  I don't know.  This is just pie
in the sky.

 Do you think it's worth worrying about the reduction in the number of
 possible SI message types?

 IIRC the number of message types is the number of catalog caches plus
 half a dozen or so.  We're a long way from exhausting even a 1-byte
 ID field; and we could play more games if we had to, since there would
 be a padding byte free in the message types that refer to a catalog
 cache.  IOW, 1-byte id doesn't bother me.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-06 Thread Alvaro Herrera
Excerpts from Robert Haas's message of vie ago 06 15:32:21 -0400 2010:

  Perhaps run through pg_class after restart and flush anything marked
  relistemp?
 
 The trouble is that you have to bind to a database before you can run
 through pg_class, and the postmaster doesn't.  Of course, if it could
 attach to a database and then detach again, this might be feasible,
 although perhaps still a bit overly complex for the postmaster, but in
 any event it doesn't.

A simpler idea seems to run a process specifically to connect to the
database to scan pg_class there, and then die.  It sounds a tad
expensive though.

 I've been thinking about that, but it's a bit challenging to imagine
 how it could work.  It's not just the pg_class entries you have to
 think about, but also pg_attrdef, pg_attribute, pg_constraint,
 pg_description, pg_index, pg_rewrite, and pg_trigger.  An even
 stickier problem is that we have lots of places in the backend code
 that refer to objects by OID.  We'd either need to change all of that
 code (which seems like a non-starter) or somehow guarantee that the
 OIDs assigned to any given backend's private objects would be
 different from those assigned to any public object (which I also don't
 see how to do).

Maybe we could reserve one of the 32 bits of OID to indicate private-ness.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-08-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 [ BackendRelFileNode patch ]

One thing that I find rather distressing about this is the 25% bloat
in sizeof(SharedInvalidationMessage).  Couldn't we avoid that?  Is it
really necessary to *ever* send an SI message for a backend-local rel?

I agree that one needs to send relcache inval sometimes for temp rels,
but I don't see why each backend couldn't interpret that as a flush
on its own local version.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-07-29 Thread Jaime Casanova
On Sun, Jul 25, 2010 at 4:32 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Jul 25, 2010 at 2:37 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 but i have a few questions, maybe is right what you did i only want to
 understand it:
 - you added this in include/storage/smgr.h, so why is safe to assume
 that if the backend != InvalidBackendId it must be a temp relation?

 +#define SmgrIsTemp(smgr) \
 +   ((smgr)-smgr_rnode.backend != InvalidBackendId)

 That's pretty much the whole point of the patch.  Instead of
 identifying relations as simply temporary or not temporary, we
 identify as a temporary relation owned by backend X or as not
 temporary.


Ok, this one seems good enough... i'm marking it as ready for committer

-- 
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-07-25 Thread Jaime Casanova
On Fri, Jul 23, 2010 at 10:05 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Thu, Jun 10, 2010 at 3:39 PM, Robert Haas robertmh...@gmail.com wrote:


 I believe that this patch will clear away one major obstacle to
 implementing global temporary and unlogged tables: it enables us to be
 sure of cleaning up properly after a crash without relying on catalog
 entries or XLOG.  Based on previous discussions, however, I believe
 there is support for making this change independently of what becomes
 of that project, just for the benefit of having a more robust cleanup
 mechanism.


 Hi,

 i was looking at v3 of this patch...


Ok, i like what you did in smgrextend, smgrwrite, and others...
changing isTemp for skipFsync is more descriptive

but i have a few questions, maybe is right what you did i only want to
understand it:
- you added this in include/storage/smgr.h, so why is safe to assume
that if the backend != InvalidBackendId it must be a temp relation?

+#define SmgrIsTemp(smgr) \
+   ((smgr)-smgr_rnode.backend != InvalidBackendId)


- you added a question like this if (rel-rd_backend == MyBackendId)
in a few places... why are you assuming that? that couldn't be a new
created relation (in current session of course)? is that safe?

-- 
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-07-25 Thread Robert Haas
On Sun, Jul 25, 2010 at 2:37 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 but i have a few questions, maybe is right what you did i only want to
 understand it:
 - you added this in include/storage/smgr.h, so why is safe to assume
 that if the backend != InvalidBackendId it must be a temp relation?

 +#define SmgrIsTemp(smgr) \
 +   ((smgr)-smgr_rnode.backend != InvalidBackendId)

That's pretty much the whole point of the patch.  Instead of
identifying relations as simply temporary or not temporary, we
identify as a temporary relation owned by backend X or as not
temporary.

 - you added a question like this if (rel-rd_backend == MyBackendId)
 in a few places... why are you assuming that? that couldn't be a new
 created relation (in current session of course)? is that safe?

Again, rd_backend is not the creating backend ID unless the relation
is a temprel.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] including backend ID in relpath of temp rels - updated patch

2010-07-23 Thread Jaime Casanova
On Thu, Jun 10, 2010 at 3:39 PM, Robert Haas robertmh...@gmail.com wrote:


 I believe that this patch will clear away one major obstacle to
 implementing global temporary and unlogged tables: it enables us to be
 sure of cleaning up properly after a crash without relying on catalog
 entries or XLOG.  Based on previous discussions, however, I believe
 there is support for making this change independently of what becomes
 of that project, just for the benefit of having a more robust cleanup
 mechanism.


Hi,

i was looking at v3 of this patch...

it looks good, works as advertised, pass regression tests, and all
tests i could think of...
haven't looked the code at much detail but seems ok, to me at least...

-- 
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers