Re: [HACKERS] Implementation of global temporary tables?

2015-07-22 Thread Robert Haas
On Wed, Jul 15, 2015 at 11:52 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 15 July 2015 at 16:44, Andres Freund and...@anarazel.de wrote:
 On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:
  On 15 July 2015 at 16:28, Andres Freund and...@anarazel.de wrote:
   I think that's generally a fair point. But here we're discussing to
   add
   a fair amount of wrinkles with the copy approach. The fact alone that
   the oid is different will have some ugly consequences.
  
 
  Why? We are creating a local temp table LIKE the global temp table. That
  is
  already a supported operation. So there is no different oid.

 Then your locking against ALTER, DROP etc. isn't going to work.

 There would be two objects, both locked. The temp table is just nice and
 simple. No problem.

 Your optimization may work; I hope it does. My approach definitely will. So
 we could choose either.

It's not really an optimization; it's a whole different approach.  I
looked at the create-a-temp-table-on-the-fly idea back when I
implemented unlogged tables and concluded it was an unworkable mess.
Deep down in the guts of name resolution code is not the place where
you want to suddenly decide that you need to run some DDL.  So I
believe in what Andres is proposing.  I'm not necessarily going to
shout it down if somebody finds a way to make the
temp-table-on-the-fly approach work, but my view is that making that
work, although it may look superficially appealing, will eventually
make whoever has to do it hate their life; and that even if they get
it to where it sorta works, it's going to have ugly corner cases that
are almost impossible to file down.

Another advantage of Andres's approach, BTW, is that it could
potentially eventually be extended to work on Hot Standby machines.
For that to work, we'd need a separate XID space for temporary tables,
but Noah proposed that before, and I don't think it's a completely
crazy idea (just mostly crazy).  Now, maybe nobody's going to care
about that any more in 5 years if we have full-blown logical
replication deeply integrated into core, but there's a lot to like
about a design that keeps our options in that area open.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Andres Freund
On 2015-07-20 15:33:32 +1200, Gavin Flower wrote:
 Would it be difficult to add the ability for one user to share the contents
 with a list of named other users (roles)?

No need. That feature is called unlogged tables and grants.

Doing this for temporary tables would be horrible. They live in process
local memory and not shared memory. Because that provides higher
isolation, not even though it does.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Alvaro Herrera
Pavel Stehule wrote:
 2015-07-20 5:33 GMT+02:00 Gavin Flower gavinflo...@archidevsys.co.nz:
 
  Would it be difficult to add the ability for one user to share the
  contents with a list of named other users (roles)?
 
 Probably it is possible, but not for temporary data - short data are in
 process memory, so it are not accessible from other sessions.
 
 This sharing tables needs:
 
 1. some infrastructure to hold data about sharing - who can share with what
 2. who will clean data? temporary data are cleaned on end of transaction or
 end of session
 3. data should be saved in shared memory instead process memory
 
 So it is possible, but partially different

To me this gets in the crazy ideas list.  Please add it to the TODO
page in the wiki, so that we're sure we never implement it.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Pavel Stehule
2015-07-20 11:07 GMT+02:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:
  2015-07-20 5:33 GMT+02:00 Gavin Flower gavinflo...@archidevsys.co.nz:
 
   Would it be difficult to add the ability for one user to share the
   contents with a list of named other users (roles)?
 
  Probably it is possible, but not for temporary data - short data are in
  process memory, so it are not accessible from other sessions.
 
  This sharing tables needs:
 
  1. some infrastructure to hold data about sharing - who can share with
 what
  2. who will clean data? temporary data are cleaned on end of transaction
 or
  end of session
  3. data should be saved in shared memory instead process memory
 
  So it is possible, but partially different

 To me this gets in the crazy ideas list.  Please add it to the TODO
 page in the wiki, so that we're sure we never implement it.


yes, it is pretty crazy - Have no plan to implement it :)

Pavel


 --
 Álvaro Herrerahttp://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services



Re: [HACKERS] Implementation of global temporary tables?

2015-07-20 Thread Zhaomo Yang

 Just to be clear, the idea of a global temp table is that the table def
 is available to all users, but the data is private to each session?


The table def is visible to all sessions and persistent, but the data is
private to each session and temporary.

Thanks,
Zhaomo


Re: [HACKERS] Implementation of global temporary tables?

2015-07-19 Thread Josh Berkus
Pavel, All:

Just to be clear, the idea of a global temp table is that the table def
is available to all users, but the data is private to each session?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-19 Thread Pavel Stehule
2015-07-19 21:39 GMT+02:00 Josh Berkus j...@agliodbs.com:

 Pavel, All:

 Just to be clear, the idea of a global temp table is that the table def
 is available to all users, but the data is private to each session?


yes.

Pavel



 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com



Re: [HACKERS] Implementation of global temporary tables?

2015-07-19 Thread Gavin Flower

On 20/07/15 15:00, Pavel Stehule wrote:



2015-07-19 21:39 GMT+02:00 Josh Berkus j...@agliodbs.com 
mailto:j...@agliodbs.com:


Pavel, All:

Just to be clear, the idea of a global temp table is that the
table def
is available to all users, but the data is private to each session?


yes.

Pavel


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Just wondering...

Would it be difficult to add the ability for one user to share the 
contents with a list of named other users (roles)?


-Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-19 Thread Pavel Stehule
2015-07-20 5:33 GMT+02:00 Gavin Flower gavinflo...@archidevsys.co.nz:

 On 20/07/15 15:00, Pavel Stehule wrote:



 2015-07-19 21:39 GMT+02:00 Josh Berkus j...@agliodbs.com mailto:
 j...@agliodbs.com:

 Pavel, All:

 Just to be clear, the idea of a global temp table is that the
 table def
 is available to all users, but the data is private to each session?


 yes.

 Pavel


 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com


  Just wondering...

 Would it be difficult to add the ability for one user to share the
 contents with a list of named other users (roles)?


Probably it is possible, but not for temporary data - short data are in
process memory, so it are not accessible from other sessions.

This sharing tables needs:

1. some infrastructure to hold data about sharing - who can share with what
2. who will clean data? temporary data are cleaned on end of transaction or
end of session
3. data should be saved in shared memory instead process memory

So it is possible, but partially different


 -Gavin



Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Andrew Dunstan


On 07/15/2015 07:58 AM, Simon Riggs wrote:



For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table 
but with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, 
if it does not exist we create it as a TEMP table of the same name, 
using the Global's pg_class entry as a template


That meets the SQL Standard and doesn't contain any visibility 
problems or need for new internals.


The purpose of this feature is to automatically create a temp table 
with the same definition whenever needed. The discussion of bloat is 
just wrong. We create exactly the same amount of bloat as if we had 
typed CREATE TEMP TABLE. Optimising temp table entries in the catalog 
is another, separate patch, if we care.





Sounds fine in general. I'm a bit curious to know what are the locking 
implications of vivifying the table on access.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Simon Riggs
On 14 July 2015 at 23:20, Jim Nasby jim.na...@bluetreble.com wrote:

 On 7/9/15 12:45 AM, Pavel Stehule wrote:


 2015-07-09 7:32 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu
 mailto:zhy...@cs.ucsd.edu:

   I am not sure, if it is not useless work.

 I don't understand why an implementation taking approach 2.a would
 be useless. As I said, its performance will be no worse than current
 temp tables and it will provide a lot of convenience to users who
 need to create temp tables in every session.


 Surely it should be step forward. But you will to have to solve lot of
 problems with duplicated tables in system catalogue, and still it
 doesn't solve the main problem with temporary tables - the bloating
 catalogue - and related performance degradation.


 That being the main problem is strictly a matter of opinion based on
 your experience. Many people don't have a performance problem today, but do
 have to deal with all the pain of handling this manually (as well as all
 the limitations that go with that).

 If it's easy to fix the bloat problem at the same time as adding GLOBAL
 TEMP then great! But there's no reason to reject this just because it
 doesn't fix that issue.


Agreed

There are some good arguments for why we need this feature.

Pavel's original description of how to do this seem valid, and from the
link Tom agreed in 2009.

For me the design summary is this

* CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
with different relkind
* When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it
does not exist we create it as a TEMP table of the same name, using the
Global's pg_class entry as a template

That meets the SQL Standard and doesn't contain any visibility problems or
need for new internals.

The purpose of this feature is to automatically create a temp table with
the same definition whenever needed. The discussion of bloat is just
wrong. We create exactly the same amount of bloat as if we had typed CREATE
TEMP TABLE. Optimising temp table entries in the catalog is another,
separate patch, if we care.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Pavel Stehule
2015-07-15 13:58 GMT+02:00 Simon Riggs si...@2ndquadrant.com:

 On 14 July 2015 at 23:20, Jim Nasby jim.na...@bluetreble.com wrote:

 On 7/9/15 12:45 AM, Pavel Stehule wrote:


 2015-07-09 7:32 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu
 mailto:zhy...@cs.ucsd.edu:

   I am not sure, if it is not useless work.

 I don't understand why an implementation taking approach 2.a would
 be useless. As I said, its performance will be no worse than current
 temp tables and it will provide a lot of convenience to users who
 need to create temp tables in every session.


 Surely it should be step forward. But you will to have to solve lot of
 problems with duplicated tables in system catalogue, and still it
 doesn't solve the main problem with temporary tables - the bloating
 catalogue - and related performance degradation.


 That being the main problem is strictly a matter of opinion based on
 your experience. Many people don't have a performance problem today, but do
 have to deal with all the pain of handling this manually (as well as all
 the limitations that go with that).

 If it's easy to fix the bloat problem at the same time as adding GLOBAL
 TEMP then great! But there's no reason to reject this just because it
 doesn't fix that issue.


 Agreed

 There are some good arguments for why we need this feature.

 Pavel's original description of how to do this seem valid, and from the
 link Tom agreed in 2009.

 For me the design summary is this

 * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
 with different relkind
 * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
 it does not exist we create it as a TEMP table of the same name, using the
 Global's pg_class entry as a template

 That meets the SQL Standard and doesn't contain any visibility problems or
 need for new internals.

 The purpose of this feature is to automatically create a temp table with
 the same definition whenever needed. The discussion of bloat is just
 wrong. We create exactly the same amount of bloat as if we had typed CREATE
 TEMP TABLE. Optimising temp table entries in the catalog is another,
 separate patch, if we care.


The optimization of local temp tables is little bit harder - you cannot to
share pg_class and pg_attribute - although some memory entries can be used
too.

Regards

Pavel



 --
 Simon Riggshttp://www.2ndQuadrant.com/
 http://www.2ndquadrant.com/
 PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services



Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Simon Riggs
On 15 July 2015 at 13:26, Andrew Dunstan and...@dunslane.net wrote:


 On 07/15/2015 07:58 AM, Simon Riggs wrote:


 For me the design summary is this

 * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table
 but with different relkind
 * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
 it does not exist we create it as a TEMP table of the same name, using the
 Global's pg_class entry as a template

 That meets the SQL Standard and doesn't contain any visibility problems
 or need for new internals.

 The purpose of this feature is to automatically create a temp table with
 the same definition whenever needed. The discussion of bloat is just
 wrong. We create exactly the same amount of bloat as if we had typed CREATE
 TEMP TABLE. Optimising temp table entries in the catalog is another,
 separate patch, if we care.



 Sounds fine in general. I'm a bit curious to know what are the locking
 implications of vivifying the table on access.


We would lock the Global Temp Table at the same lock level for the
activity, just as we do for INSERT, SELECT etc.. That prevents concurrent
DDL like DROP or ALTER on the Global Temp Table.

The Local temp table created is a copy of the Global Temp Table. The Local
temp table is only locally locked, so no worries.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Pavel Stehule
2015-07-15 15:53 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu:

   there is other question - what is effect of ALTER TABLE of global temp
 table on
   instances of this table in active sessions?

 As I said, we need to first agree on the behaviors of the existing
 commands. I can think of two options now for ALTER TABLE: 1) only allow
 ALTER TABLE when there is no other active sessions (this is how Oracle
 deals with it.) 2) handle it as if session copies inherit from the global
 copy and ALTER TABLE executes on the global copy.


There are two possible kinds of GLOBAL TEMP tables -  session related and
transation related. Transaction related tables has very short life - and @1
needs outage, @2 requires stronger locks and can slow and less effective -
because a) some changes can be invisible in other transactions (depends on
isolation levels), b) the structure can be changed, but function code not
(without dependency on isolation levels) - so it can be non consistent, c)
why to change table if this table will be dropped in next milisecond. For
this case the behave like PL functions can be very practical ~ third option
for ALTER TABLE

Regards

Pavel


 Thanks,
 Zhaomo



Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Simon Riggs
On 15 July 2015 at 16:44, Andres Freund and...@anarazel.de wrote:

 On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:
  On 15 July 2015 at 16:28, Andres Freund and...@anarazel.de wrote:
   I think that's generally a fair point. But here we're discussing to add
   a fair amount of wrinkles with the copy approach. The fact alone that
   the oid is different will have some ugly consequences.
  
 
  Why? We are creating a local temp table LIKE the global temp table. That
 is
  already a supported operation. So there is no different oid.

 Then your locking against ALTER, DROP etc. isn't going to work.


There would be two objects, both locked. The temp table is just nice and
simple. No problem.

Your optimization may work; I hope it does. My approach definitely will. So
we could choose either.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Pavel Stehule
2015-07-15 15:21 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu:

  Sounds fine in general. I'm a bit curious to know what are the locking
 implications of  vivifying the table on access.

 The locking implications depend on how we interpret the existing commands
 in the context of global temp tables and I think we should discuss and
 agree on the behaviors of the commands with global temp tables, but I think
 in general we can follow these rules:

 If the command executes on the global temp table's metadata, for example
 an ALTER TABLE command, then we lock the global copy at the same level as
 we do a regular table.


there is other question - what is effect of ALTER TABLE of global temp
table on instances of this table in active sessions?



 If the command executes on the global temp table's data (which is actually
 stored in the session copy), for example an DML command, then the global
 copy is locked at the AccessShareLock level to prevent concurrent
 modifications to the global temp table's definition from other sessions.

 Thanks,
 Zhaomo

 On Wed, Jul 15, 2015 at 4:26 AM, Andrew Dunstan and...@dunslane.net
 wrote:


 On 07/15/2015 07:58 AM, Simon Riggs wrote:


 For me the design summary is this

 * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table
 but with different relkind
 * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
 it does not exist we create it as a TEMP table of the same name, using the
 Global's pg_class entry as a template

 That meets the SQL Standard and doesn't contain any visibility problems
 or need for new internals.

 The purpose of this feature is to automatically create a temp table with
 the same definition whenever needed. The discussion of bloat is just
 wrong. We create exactly the same amount of bloat as if we had typed CREATE
 TEMP TABLE. Optimising temp table entries in the catalog is another,
 separate patch, if we care.



 Sounds fine in general. I'm a bit curious to know what are the locking
 implications of vivifying the table on access.

 cheers

 andrew





Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Zhaomo Yang
  there is other question - what is effect of ALTER TABLE of global temp
table on
  instances of this table in active sessions?

As I said, we need to first agree on the behaviors of the existing
commands. I can think of two options now for ALTER TABLE: 1) only allow
ALTER TABLE when there is no other active sessions (this is how Oracle
deals with it.) 2) handle it as if session copies inherit from the global
copy and ALTER TABLE executes on the global copy.

Thanks,
Zhaomo


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Simon Riggs
On 15 July 2015 at 15:57, Andres Freund and...@anarazel.de wrote:

 On 2015-07-15 16:52:49 +0200, Andres Freund wrote:
  Why do we need to create that copy? We can just use the relfilenode in
  all backends by having the backendid in the filename? Yes, there's a
  some amount of additional code needed, but it's not that much?  I
  actually think it might end up being less additional code than having a
  copy, because with the copy you'll have two different oids for global
  entry and the local copy.

 Hm, yes. Brainfart. Transaction table rewrites/truncations need to
 change the relfilenode.

 To fix We could add a backend local mapping table from global temp table
 id to the backend local relfilenode. The code to lookup the relfilenode
 is already mostly isolated.


It may be possible to do this, though I'm sure there's a wrinkle somewhere.
But there doesn't seem to be a need to overload the main feature request
with additional requirements. Doing that is just scope creep that prevents
us getting features out. Nice, simple patches from newer developers. Later
tuning and tweaking from more expert community members.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Zhaomo Yang
 Sounds fine in general. I'm a bit curious to know what are the locking
implications of  vivifying the table on access.

The locking implications depend on how we interpret the existing commands
in the context of global temp tables and I think we should discuss and
agree on the behaviors of the commands with global temp tables, but I think
in general we can follow these rules:

If the command executes on the global temp table's metadata, for example an
ALTER TABLE command, then we lock the global copy at the same level as we
do a regular table.

If the command executes on the global temp table's data (which is actually
stored in the session copy), for example an DML command, then the global
copy is locked at the AccessShareLock level to prevent concurrent
modifications to the global temp table's definition from other sessions.

Thanks,
Zhaomo

On Wed, Jul 15, 2015 at 4:26 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 07/15/2015 07:58 AM, Simon Riggs wrote:


 For me the design summary is this

 * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table
 but with different relkind
 * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if
 it does not exist we create it as a TEMP table of the same name, using the
 Global's pg_class entry as a template

 That meets the SQL Standard and doesn't contain any visibility problems
 or need for new internals.

 The purpose of this feature is to automatically create a temp table with
 the same definition whenever needed. The discussion of bloat is just
 wrong. We create exactly the same amount of bloat as if we had typed CREATE
 TEMP TABLE. Optimising temp table entries in the catalog is another,
 separate patch, if we care.



 Sounds fine in general. I'm a bit curious to know what are the locking
 implications of vivifying the table on access.

 cheers

 andrew



Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Andres Freund
On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
 It may be possible to do this, though I'm sure there's a wrinkle somewhere.
 But there doesn't seem to be a need to overload the main feature request
 with additional requirements. Doing that is just scope creep that prevents
 us getting features out. Nice, simple patches from newer developers. Later
 tuning and tweaking from more expert community members.

I think that's generally a fair point. But here we're discussing to add
a fair amount of wrinkles with the copy approach. The fact alone that
the oid is different will have some ugly consequences.

So we add complexity, just to shift it into different places later? I'm
not sure that's a good idea.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Simon Riggs
On 15 July 2015 at 16:28, Andres Freund and...@anarazel.de wrote:

 On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
  It may be possible to do this, though I'm sure there's a wrinkle
 somewhere.
  But there doesn't seem to be a need to overload the main feature request
  with additional requirements. Doing that is just scope creep that
 prevents
  us getting features out. Nice, simple patches from newer developers.
 Later
  tuning and tweaking from more expert community members.

 I think that's generally a fair point. But here we're discussing to add
 a fair amount of wrinkles with the copy approach. The fact alone that
 the oid is different will have some ugly consequences.


Why? We are creating a local temp table LIKE the global temp table. That is
already a supported operation. So there is no different oid.


 So we add complexity, just to shift it into different places later? I'm
 not sure that's a good idea.


There's no complexity in a simple temp table like. We can do this now with
triggers.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On 2015-07-15 16:24:52 +0100, Simon Riggs wrote:
 It may be possible to do this, though I'm sure there's a wrinkle somewhere.
 But there doesn't seem to be a need to overload the main feature request
 with additional requirements. Doing that is just scope creep that prevents
 us getting features out. Nice, simple patches from newer developers. Later
 tuning and tweaking from more expert community members.

 I think that's generally a fair point. But here we're discussing to add
 a fair amount of wrinkles with the copy approach. The fact alone that
 the oid is different will have some ugly consequences.

 So we add complexity, just to shift it into different places later? I'm
 not sure that's a good idea.

With all due respect, there are features that are beyond the abilities of
some newer developers, and reducing the scope isn't a good way to fix
that.  It just leaves a bigger mess to be cleaned up later.

I think Andres' idea of a per-backend filenode mapping table might work.
The existing relfilenode mapper solves a somewhat related problem, namely
how do you replace the filenode for shared system catalogs whose pg_class
entries can't be changed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 15 July 2015 at 16:28, Andres Freund and...@anarazel.de wrote:
 I think that's generally a fair point. But here we're discussing to add
 a fair amount of wrinkles with the copy approach. The fact alone that
 the oid is different will have some ugly consequences.

 Why? We are creating a local temp table LIKE the global temp table. That is
 already a supported operation. So there is no different oid.

You're presuming a specific implementation decision, one that has not been
made yet, and isn't all that attractive because of the catalog bloat issues.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Andres Freund
On 2015-07-15 16:36:12 +0100, Simon Riggs wrote:
 On 15 July 2015 at 16:28, Andres Freund and...@anarazel.de wrote:
  I think that's generally a fair point. But here we're discussing to add
  a fair amount of wrinkles with the copy approach. The fact alone that
  the oid is different will have some ugly consequences.
 
 
 Why? We are creating a local temp table LIKE the global temp table. That is
 already a supported operation. So there is no different oid.

Then your locking against ALTER, DROP etc. isn't going to work.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Andres Freund
On 2015-07-15 12:58:51 +0100, Simon Riggs wrote:
 On 14 July 2015 at 23:20, Jim Nasby jim.na...@bluetreble.com wrote:
 Pavel's original description of how to do this seem valid, and from the
 link Tom agreed in 2009.
 
 For me the design summary is this
 
 * CREATE GLOBAL TEMP TABLE creates catalog entries like a normal table but
 with different relkind
 * When we see a request to INSERT, DEL, UPD, SEL from the temp table, if it
 does not exist we create it as a TEMP table of the same name, using the
 Global's pg_class entry as a template

Why do we need to create that copy? We can just use the relfilenode in
all backends by having the backendid in the filename? Yes, there's a
some amount of additional code needed, but it's not that much?  I
actually think it might end up being less additional code than having a
copy, because with the copy you'll have two different oids for global
entry and the local copy.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-15 Thread Andres Freund
On 2015-07-15 16:52:49 +0200, Andres Freund wrote:
 Why do we need to create that copy? We can just use the relfilenode in
 all backends by having the backendid in the filename? Yes, there's a
 some amount of additional code needed, but it's not that much?  I
 actually think it might end up being less additional code than having a
 copy, because with the copy you'll have two different oids for global
 entry and the local copy.

Hm, yes. Brainfart. Transaction table rewrites/truncations need to
change the relfilenode.

To fix We could add a backend local mapping table from global temp table
id to the backend local relfilenode. The code to lookup the relfilenode
is already mostly isolated.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-14 Thread Jim Nasby

On 7/9/15 12:45 AM, Pavel Stehule wrote:


2015-07-09 7:32 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu
mailto:zhy...@cs.ucsd.edu:

  I am not sure, if it is not useless work.

I don't understand why an implementation taking approach 2.a would
be useless. As I said, its performance will be no worse than current
temp tables and it will provide a lot of convenience to users who
need to create temp tables in every session.


Surely it should be step forward. But you will to have to solve lot of
problems with duplicated tables in system catalogue, and still it
doesn't solve the main problem with temporary tables - the bloating
catalogue - and related performance degradation.


That being the main problem is strictly a matter of opinion based on 
your experience. Many people don't have a performance problem today, but 
do have to deal with all the pain of handling this manually (as well as 
all the limitations that go with that).


If it's easy to fix the bloat problem at the same time as adding GLOBAL 
TEMP then great! But there's no reason to reject this just because it 
doesn't fix that issue.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-07-09 Thread Zhaomo Yang
  I am not sure, if it is not useless work.

I don't understand why an implementation taking approach 2.a would be
useless. As I said, its performance will be no worse than current temp
tables and it will provide a lot of convenience to users who need to create
temp tables in every session.

Thanks,
Zhaomo

On Tue, Jul 7, 2015 at 11:53 PM, Pavel Stehule pavel.steh...@gmail.com
wrote:

 Hi


 2015-07-08 9:08 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu:

   more global temp tables are little bit comfortable for developers,
 I'd like to emphasize this point. This feature does much more than saving
 a developer from issuing a CREATE TEMP TABLE statement in every session.
 Here are two common use cases and I'm sure there are more.

 (1)
 Imagine in a web application scenario, a developer wants to cache some
 session information in a temp table. What's more, he also wants to specify
 some rules which reference the session information. Without this feature,
 the rules will be removed at the end of every session since they depend on
 a temporary object. Global temp tables will allow the developer to define
 the temp table and the rules once.

 (2)
 The second case is mentioned by Tom Lane back in 2010 in a thread about
 global temp tables.
 (http://www.postgresql.org/message-id/9319.1272130...@sss.pgh.pa.us)
 The context that I've seen it come up in is that people don't want to
 clutter their functions with
  create-it-if-it-doesn't-exist logic, which you have to have given the
 current behavior of temp tables.

   2.a - using on demand created temp tables - most simple solution, but
   doesn't help with catalogue bloating

 I've read the thread and people disapprove this approach because of the
 potential catalog bloat. However, I'd like to champion it. Indeed, this
 approach may have a bloat issue. But for users who needs global temp
 tables, they now have to create a new temp table in every session, which
 means they already have the bloat problem and presumably they have some
 policy to deal with it. In other words, implementing global temp tables by
 this approach gives users the same performance, plus the convenience the
 feature brings.

 The root problem here is that whether whether having the unoptimized
 feature is better than
 having no feature at all. Actually, there was a very similar discussion
 back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
 Lane's arguments here.

 Kevin Grittner's argument:

 http://www.postgresql.org/message-id/49f82aea.ee98.002...@wicourts.gov
 ... If you're saying we can implement the standard's global temporary
 tables in a way that performs better than current temporary tables, that's
 cool.  That would be a nice bonus in addition to the application
 programmer convenience and having another tick-mark on the standards
 compliance charts.  Do you think that's feasible?  If not, the feature
 would be useful to some with the same performance that temporary tables
 currently provide.

 Tom Lane's arguments:

 http://www.postgresql.org/message-id/24110.1241035...@sss.pgh.pa.us
 I'm all for eliminating catalog overheads, if we can find a way to do
 that.  I don't think that you get to veto implementation of the feature
 until we can find a way to optimize it better.  The question is not about
 whether having the optimization would be better than not having it --- it's
 about whether having the unoptimized feature is better than having no
 feature at all (which means people have to implement the same behavior by
 hand, and they'll *still* not get the optimization).

 There have been several threads here discussing global temp table since
 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
 metadata of the session copy in the catalog. However, it seems that none of
 them has been implemented, or even has a feasible design. So why don't we
 implement it in a unoptimized way first?


 I am not sure, if it is not useless work.

 Now, I am thinking so best implementation of global temp tables is
 enhancing unlogged tables to have local content. All local data can be
 saved in session memory. Usually it is less than 2KB with statistic, and
 you don't need to store it in catalogue. When anybody is working with any
 table, related data are copied to system cache - and there can be injected
 a implementation of global temp tables.

 regards

 Pavel Stehule



   Is there still interest about this feature?
 I'm very interested in this feature. I'm thinking about one
 implementation which is similar to Pavel's 2009 proposal (
 http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc78...@mail.gmail.com).
 Here are the major ideas of my design:

 (1)
 Creating the cross-session persistent schema as a regular table and
 creating session-private temp tables when a session first accesses it.

 (2)
 For DML queries, The global temp table is overloaded by its session copy
 after the relation is opened by an oid or a rangevar. For 

Re: [HACKERS] Implementation of global temporary tables?

2015-07-08 Thread Pavel Stehule
Hi


2015-07-08 9:08 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu:

   more global temp tables are little bit comfortable for developers,
 I'd like to emphasize this point. This feature does much more than saving
 a developer from issuing a CREATE TEMP TABLE statement in every session.
 Here are two common use cases and I'm sure there are more.

 (1)
 Imagine in a web application scenario, a developer wants to cache some
 session information in a temp table. What's more, he also wants to specify
 some rules which reference the session information. Without this feature,
 the rules will be removed at the end of every session since they depend on
 a temporary object. Global temp tables will allow the developer to define
 the temp table and the rules once.

 (2)
 The second case is mentioned by Tom Lane back in 2010 in a thread about
 global temp tables.
 (http://www.postgresql.org/message-id/9319.1272130...@sss.pgh.pa.us)
 The context that I've seen it come up in is that people don't want to
 clutter their functions with
  create-it-if-it-doesn't-exist logic, which you have to have given the
 current behavior of temp tables.

   2.a - using on demand created temp tables - most simple solution, but
   doesn't help with catalogue bloating

 I've read the thread and people disapprove this approach because of the
 potential catalog bloat. However, I'd like to champion it. Indeed, this
 approach may have a bloat issue. But for users who needs global temp
 tables, they now have to create a new temp table in every session, which
 means they already have the bloat problem and presumably they have some
 policy to deal with it. In other words, implementing global temp tables by
 this approach gives users the same performance, plus the convenience the
 feature brings.

 The root problem here is that whether whether having the unoptimized
 feature is better than
 having no feature at all. Actually, there was a very similar discussion
 back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
 Lane's arguments here.

 Kevin Grittner's argument:

 http://www.postgresql.org/message-id/49f82aea.ee98.002...@wicourts.gov
 ... If you're saying we can implement the standard's global temporary
 tables in a way that performs better than current temporary tables, that's
 cool.  That would be a nice bonus in addition to the application
 programmer convenience and having another tick-mark on the standards
 compliance charts.  Do you think that's feasible?  If not, the feature
 would be useful to some with the same performance that temporary tables
 currently provide.

 Tom Lane's arguments:

 http://www.postgresql.org/message-id/24110.1241035...@sss.pgh.pa.us
 I'm all for eliminating catalog overheads, if we can find a way to do
 that.  I don't think that you get to veto implementation of the feature
 until we can find a way to optimize it better.  The question is not about
 whether having the optimization would be better than not having it --- it's
 about whether having the unoptimized feature is better than having no
 feature at all (which means people have to implement the same behavior by
 hand, and they'll *still* not get the optimization).

 There have been several threads here discussing global temp table since
 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
 metadata of the session copy in the catalog. However, it seems that none of
 them has been implemented, or even has a feasible design. So why don't we
 implement it in a unoptimized way first?


I am not sure, if it is not useless work.

Now, I am thinking so best implementation of global temp tables is
enhancing unlogged tables to have local content. All local data can be
saved in session memory. Usually it is less than 2KB with statistic, and
you don't need to store it in catalogue. When anybody is working with any
table, related data are copied to system cache - and there can be injected
a implementation of global temp tables.

regards

Pavel Stehule



   Is there still interest about this feature?
 I'm very interested in this feature. I'm thinking about one implementation
 which is similar to Pavel's 2009 proposal (
 http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc78...@mail.gmail.com).
 Here are the major ideas of my design:

 (1)
 Creating the cross-session persistent schema as a regular table and
 creating session-private temp tables when a session first accesses it.

 (2)
 For DML queries, The global temp table is overloaded by its session copy
 after the relation is opened by an oid or a rangevar. For DDL queries,
 which copy is used depends on whether the query needs to access the data or
 metadata of the global temp table.

 There are more differences between this design and Pavel's 2009 proposal
 and I'd like to send a detailed proposal to the mailing list but first I
 want to know if our community would accept a global temp table
 implementation which provides the same performance as 

Re: [HACKERS] Implementation of global temporary tables?

2015-07-08 Thread Zhaomo Yang
  more global temp tables are little bit comfortable for developers,
I'd like to emphasize this point. This feature does much more than saving a
developer from issuing a CREATE TEMP TABLE statement in every session. Here
are two common use cases and I'm sure there are more.

(1)
Imagine in a web application scenario, a developer wants to cache some
session information in a temp table. What's more, he also wants to specify
some rules which reference the session information. Without this feature,
the rules will be removed at the end of every session since they depend on
a temporary object. Global temp tables will allow the developer to define
the temp table and the rules once.

(2)
The second case is mentioned by Tom Lane back in 2010 in a thread about
global temp tables.
(http://www.postgresql.org/message-id/9319.1272130...@sss.pgh.pa.us)
The context that I've seen it come up in is that people don't want to
clutter their functions with
 create-it-if-it-doesn't-exist logic, which you have to have given the
current behavior of temp tables.

  2.a - using on demand created temp tables - most simple solution, but
  doesn't help with catalogue bloating

I've read the thread and people disapprove this approach because of the
potential catalog bloat. However, I'd like to champion it. Indeed, this
approach may have a bloat issue. But for users who needs global temp
tables, they now have to create a new temp table in every session, which
means they already have the bloat problem and presumably they have some
policy to deal with it. In other words, implementing global temp tables by
this approach gives users the same performance, plus the convenience the
feature brings.

The root problem here is that whether whether having the unoptimized
feature is better than
having no feature at all. Actually, there was a very similar discussion
back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
Lane's arguments here.

Kevin Grittner's argument:

http://www.postgresql.org/message-id/49f82aea.ee98.002...@wicourts.gov
... If you're saying we can implement the standard's global temporary
tables in a way that performs better than current temporary tables, that's
cool.  That would be a nice bonus in addition to the application
programmer convenience and having another tick-mark on the standards
compliance charts.  Do you think that's feasible?  If not, the feature
would be useful to some with the same performance that temporary tables
currently provide.

Tom Lane's arguments:

http://www.postgresql.org/message-id/24110.1241035...@sss.pgh.pa.us
I'm all for eliminating catalog overheads, if we can find a way to do
that.  I don't think that you get to veto implementation of the feature
until we can find a way to optimize it better.  The question is not about
whether having the optimization would be better than not having it --- it's
about whether having the unoptimized feature is better than having no
feature at all (which means people have to implement the same behavior by
hand, and they'll *still* not get the optimization).

There have been several threads here discussing global temp table since
2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
metadata of the session copy in the catalog. However, it seems that none of
them has been implemented, or even has a feasible design. So why don't we
implement it in a unoptimized way first?

  Is there still interest about this feature?
I'm very interested in this feature. I'm thinking about one implementation
which is similar to Pavel's 2009 proposal (
http://www.postgresql.org/message-id/162867790904271344s1ec96d90j6cde295fdcc78...@mail.gmail.com).
Here are the major ideas of my design:

(1)
Creating the cross-session persistent schema as a regular table and
creating session-private temp tables when a session first accesses it.

(2)
For DML queries, The global temp table is overloaded by its session copy
after the relation is opened by an oid or a rangevar. For DDL queries,
which copy is used depends on whether the query needs to access the data or
metadata of the global temp table.

There are more differences between this design and Pavel's 2009 proposal
and I'd like to send a detailed proposal to the mailing list but first I
want to know if our community would accept a global temp table
implementation which provides the same performance as currently temp tables
do.

Thanks,
Zhaomo


Re: [HACKERS] Implementation of global temporary tables?

2015-07-08 Thread Pavel Stehule
2015-07-09 7:32 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu:

   I am not sure, if it is not useless work.

 I don't understand why an implementation taking approach 2.a would be
 useless. As I said, its performance will be no worse than current temp
 tables and it will provide a lot of convenience to users who need to create
 temp tables in every session.


Surely it should be step forward. But you will to have to solve lot of
problems with duplicated tables in system catalogue, and still it doesn't
solve the main problem with temporary tables - the bloating catalogue - and
related performance degradation.

Although global temp tables is nice to have feature (for PLpgSQL
developers), we can live without it - and with some patterns and
extensions, we are living well. But the performance issue is not be fixed
by any pattern. So the major motivation for introduction of global temp
tables is performance - from 90%.  It should be a primary target to merge
this feature to upstream. I believe, when bloating will be solved, then the
chance to accept this patch will be pretty high.

Regards

Pavel



 Thanks,
 Zhaomo

 On Tue, Jul 7, 2015 at 11:53 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 Hi


 2015-07-08 9:08 GMT+02:00 Zhaomo Yang zhy...@cs.ucsd.edu:

   more global temp tables are little bit comfortable for developers,
 I'd like to emphasize this point. This feature does much more than
 saving a developer from issuing a CREATE TEMP TABLE statement in every
 session. Here are two common use cases and I'm sure there are more.

 (1)
 Imagine in a web application scenario, a developer wants to cache some
 session information in a temp table. What's more, he also wants to specify
 some rules which reference the session information. Without this feature,
 the rules will be removed at the end of every session since they depend on
 a temporary object. Global temp tables will allow the developer to define
 the temp table and the rules once.

 (2)
 The second case is mentioned by Tom Lane back in 2010 in a thread about
 global temp tables.
 (http://www.postgresql.org/message-id/9319.1272130...@sss.pgh.pa.us)
 The context that I've seen it come up in is that people don't want to
 clutter their functions with
  create-it-if-it-doesn't-exist logic, which you have to have given the
 current behavior of temp tables.

   2.a - using on demand created temp tables - most simple solution, but
   doesn't help with catalogue bloating

 I've read the thread and people disapprove this approach because of the
 potential catalog bloat. However, I'd like to champion it. Indeed, this
 approach may have a bloat issue. But for users who needs global temp
 tables, they now have to create a new temp table in every session, which
 means they already have the bloat problem and presumably they have some
 policy to deal with it. In other words, implementing global temp tables by
 this approach gives users the same performance, plus the convenience the
 feature brings.

 The root problem here is that whether whether having the unoptimized
 feature is better than
 having no feature at all. Actually, there was a very similar discussion
 back in 2009 on global temp tables. Let me borrow Kevin Grittner's and Tom
 Lane's arguments here.

 Kevin Grittner's argument:

 http://www.postgresql.org/message-id/49f82aea.ee98.002...@wicourts.gov
 ... If you're saying we can implement the standard's global temporary
 tables in a way that performs better than current temporary tables, that's
 cool.  That would be a nice bonus in addition to the application
 programmer convenience and having another tick-mark on the standards
 compliance charts.  Do you think that's feasible?  If not, the feature
 would be useful to some with the same performance that temporary tables
 currently provide.

 Tom Lane's arguments:

 http://www.postgresql.org/message-id/24110.1241035...@sss.pgh.pa.us
 I'm all for eliminating catalog overheads, if we can find a way to do
 that.  I don't think that you get to veto implementation of the feature
 until we can find a way to optimize it better.  The question is not about
 whether having the optimization would be better than not having it --- it's
 about whether having the unoptimized feature is better than having no
 feature at all (which means people have to implement the same behavior by
 hand, and they'll *still* not get the optimization).

 There have been several threads here discussing global temp table since
 2007. Quite a few ideas aimed to avoid the bloat issue by not storing the
 metadata of the session copy in the catalog. However, it seems that none of
 them has been implemented, or even has a feasible design. So why don't we
 implement it in a unoptimized way first?


 I am not sure, if it is not useless work.

 Now, I am thinking so best implementation of global temp tables is
 enhancing unlogged tables to have local content. All local data can be
 saved in session memory. Usually it is less than 2KB with statistic, 

[HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Pavel Stehule
Hello

Six years ago we did discuss about global temporary tables - persistent
schema, ephemeral data.

http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html

I am thinking so some reasons why implement this feature are valid:

* we can get some performance benefit against current temp tables - less
the catalogue bloating,

* we can simplify a static validation of plpgsql functions when temp tables
are used,
  more global temp tables are little bit comfortable for developers,

* we can simplify migration from some other databases, where global temp
tables are default.

Some was changed from 2009:

* We have updatable CTE

* We have unlogged tables

Opened questions:

1. Name and conformance with standard of this feature - because we doesn't
support modules, a mapping ANSI - PG should not be trivial

2. Implementation

I see three possible ways how to implement it:

2.a - using on demand created temp tables - most simple solution, but
doesn't help with catalogue bloating

2.b - using unlogged tables for holding statistics, relfilenode, and all
necessary data

3.c - store ephemeral metadata only in memory without MVCC

Is there still interest about this feature?

Comments, notes?

Pavel


Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Atri Sharma
 Some was changed from 2009:

 * We have updatable CTE

 * We have unlogged tables

 Opened questions:

 1. Name and conformance with standard of this feature - because we doesn't
 support modules, a mapping ANSI - PG should not be trivial

 2. Implementation

 I see three possible ways how to implement it:

 2.a - using on demand created temp tables - most simple solution, but
 doesn't help with catalogue bloating

 2.b - using unlogged tables for holding statistics, relfilenode, and all
 necessary data

 3.c - store ephemeral metadata only in memory without MVCC


With 2.a, essentially, we are defining a global definition of a temp table,
but the actual per session objects still follow the same rules are our
current temp tables do?

2.b seems like a lot of new data, and if we are defining new paths for e.g.
statistics for global temp tables, we might end up adding new logic in
planner to use those tables. I am not seeing how this will work.

Could you elaborate a bit on 3.c please?

Something that really bothers me here, on a different note, is the catalog
churn this could cause. Of course, you mentioned in as a point in your
email, but I feel that 2.a's showstopper could be the massive catalog churn
it causes. Maybe have a way to manage such tables without getting pg_class
to bloat pretty quickly (I am assuming you do not mean metadata as the
catalog metadata in 3.c).

Regards,

Atri


Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Pavel Stehule
2015-02-02 11:51 GMT+01:00 Atri Sharma atri.j...@gmail.com:


 Some was changed from 2009:

 * We have updatable CTE

 * We have unlogged tables

 Opened questions:

 1. Name and conformance with standard of this feature - because we
 doesn't support modules, a mapping ANSI - PG should not be trivial

 2. Implementation

 I see three possible ways how to implement it:

 2.a - using on demand created temp tables - most simple solution, but
 doesn't help with catalogue bloating

 2.b - using unlogged tables for holding statistics, relfilenode, and all
 necessary data

 3.c - store ephemeral metadata only in memory without MVCC


 With 2.a, essentially, we are defining a global definition of a temp
 table, but the actual per session objects still follow the same rules are
 our current temp tables do?


yes .. it means global temp table is template for local temp table


 2.b seems like a lot of new data, and if we are defining new paths for
 e.g. statistics for global temp tables, we might end up adding new logic in
 planner to use those tables. I am not seeing how this will work.


The advantages of this method is transactional behave and moving some
bloating content to specific smaller and unlogged tables.



 Could you elaborate a bit on 3.c please?

 Something that really bothers me here, on a different note, is the catalog
 churn this could cause. Of course, you mentioned in as a point in your
 email, but I feel that 2.a's showstopper could be the massive catalog churn
 it causes. Maybe have a way to manage such tables without getting pg_class
 to bloat pretty quickly (I am assuming you do not mean metadata as the
 catalog metadata in 3.c).


3.c is good protection against catalog bloating - on second hand -
implementation will be probably more complex.



 Regards,

 Atri



Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Atri Sharma


  1. Main catalogue will be stable.
  2. There is not necessary to implement new storage and it can helps with
  transaction support.

 The amount of complexity that'd be involved to store catalog data in a
 separate relation around the caches and accesses would be significant. I
 don't think that's a realistic option.


Not to mention the problems we might end up in. We still have corner cases
in our cache code, and a new heap on top of it all might be just too
painful.


3.c - store ephemeral metadata only in memory without MVCC
  
   I think that's not an option. That'd end up being a massive amount of
   duplication at a low rate of functionality.
  
 
  I don't plan to implement a storage - I expect only few functions for
  store/read data from session memory context

 What does it have to do with temp tables then?


I think what Pavel means here is that we do not need a full fledged heap
layer and rather only a minimal API from a per session memory context.
However, that might be still as painful because we will eventually end up
inventing mechanisms for syscache and typcache to work with this storage,
which IMO is the biggest pain point around this idea.


Regards,

Atri

Regards,

Atri
*l'apprenant*


Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Pavel Stehule
2015-02-02 13:36 GMT+01:00 Atri Sharma atri.j...@gmail.com:


  1. Main catalogue will be stable.
  2. There is not necessary to implement new storage and it can helps with
  transaction support.

 The amount of complexity that'd be involved to store catalog data in a
 separate relation around the caches and accesses would be significant. I
 don't think that's a realistic option.


 Not to mention the problems we might end up in. We still have corner cases
 in our cache code, and a new heap on top of it all might be just too
 painful.


3.c - store ephemeral metadata only in memory without MVCC
  
   I think that's not an option. That'd end up being a massive amount of
   duplication at a low rate of functionality.
  
 
  I don't plan to implement a storage - I expect only few functions for
  store/read data from session memory context

 What does it have to do with temp tables then?


 I think what Pavel means here is that we do not need a full fledged heap
 layer and rather only a minimal API from a per session memory context.
 However, that might be still as painful because we will eventually end up
 inventing mechanisms for syscache and typcache to work with this storage,
 which IMO is the biggest pain point around this idea.


It should be solvable - I see another risk - if we accelerate a work with
temp tables, then 4 byte oid should not be enough.




 Regards,

 Atri

 Regards,

 Atri
 *l'apprenant*



Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Atri Sharma
On Mon, Feb 2, 2015 at 6:34 PM, Pavel Stehule pavel.steh...@gmail.com
wrote:



 2015-02-02 13:36 GMT+01:00 Atri Sharma atri.j...@gmail.com:


  1. Main catalogue will be stable.
  2. There is not necessary to implement new storage and it can helps
 with
  transaction support.

 The amount of complexity that'd be involved to store catalog data in a
 separate relation around the caches and accesses would be significant. I
 don't think that's a realistic option.


 Not to mention the problems we might end up in. We still have corner
 cases in our cache code, and a new heap on top of it all might be just too
 painful.


3.c - store ephemeral metadata only in memory without MVCC
  
   I think that's not an option. That'd end up being a massive amount of
   duplication at a low rate of functionality.
  
 
  I don't plan to implement a storage - I expect only few functions for
  store/read data from session memory context

 What does it have to do with temp tables then?


 I think what Pavel means here is that we do not need a full fledged heap
 layer and rather only a minimal API from a per session memory context.
 However, that might be still as painful because we will eventually end up
 inventing mechanisms for syscache and typcache to work with this storage,
 which IMO is the biggest pain point around this idea.


 It should be solvable - I see another risk - if we accelerate a work with
 temp tables, then 4 byte oid should not be enough.




Hrm, that might well be true. It might be worth the effort to find a better
way to materialize global temp tables then, like having a single OID and
only materializing a relfilenode for a session when the session inserts
into the temp table. Not sure here at all...



-- 
Regards,

Atri
*l'apprenant*


Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Pavel Stehule
2015-02-02 13:15 GMT+01:00 Andres Freund and...@2ndquadrant.com:

 On 2015-02-02 12:24:44 +0100, Pavel Stehule wrote:
  2015-02-02 12:04 GMT+01:00 Andres Freund and...@2ndquadrant.com:
2.b - using unlogged tables for holding statistics, relfilenode, and
 all
necessary data
  
   I can't follow why that'd achieve anything?
  
 
  1. Main catalogue will be stable.
  2. There is not necessary to implement new storage and it can helps with
  transaction support.

 The amount of complexity that'd be involved to store catalog data in a
 separate relation around the caches and accesses would be significant. I
 don't think that's a realistic option.

3.c - store ephemeral metadata only in memory without MVCC
  
   I think that's not an option. That'd end up being a massive amount of
   duplication at a low rate of functionality.
  
 
  I don't plan to implement a storage - I expect only few functions for
  store/read data from session memory context

 What does it have to do with temp tables then?


it is mechanism how to store a session metadata related to global temp
tables



   I think it's more realistic way to implement is to have a separate
   'relpersistence' setting for global temp tables. The first access to
   such one in a session (or xact if truncate on commit) copies the table
   from the _init fork. By having the backend id in all filenames (besides
   the init fork) they're unique between sessions.
  
  
  If I understand well, it is similar to my fast implementation from 2008.
 It
  works partially,  because it doesn't solve other (session) property -
 like
  relpages, reltuples and related data from pg_statistics

 I'm honestly not particularly concerned about that problem. For one, we
 don't auto-analyze/vacuum temp tables. For another, it'd be
 comparatively easy to gather reltuples/relpages/stats from session local
 state if necessary. Those are all only accessed from a few places.


so I don't see a big differences from 3.c - all session metadata will
stored in session memory.

I didn't write code, so I have not a knowledge about details.


 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Pavel Stehule
2015-02-02 12:04 GMT+01:00 Andres Freund and...@2ndquadrant.com:

 Hi,

 On 2015-02-02 11:15:22 +0100, Pavel Stehule wrote:
  Six years ago we did discuss about global temporary tables - persistent
  schema, ephemeral data.
 
  http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html
 
  I am thinking so some reasons why implement this feature are valid:
 
  * we can get some performance benefit against current temp tables - less
  the catalogue bloating,
 
  * we can simplify a static validation of plpgsql functions when temp
 tables
  are used,
more global temp tables are little bit comfortable for developers,
 
  * we can simplify migration from some other databases, where global temp
  tables are default.

 I agree that the feature would be interesting.

  2. Implementation
 
  I see three possible ways how to implement it:
 
  2.a - using on demand created temp tables - most simple solution, but
  doesn't help with catalogue bloating

 Yea, that's no good.

  2.b - using unlogged tables for holding statistics, relfilenode, and all
  necessary data

 I can't follow why that'd achieve anything?


1. Main catalogue will be stable.
2. There is not necessary to implement new storage and it can helps with
transaction support.



  3.c - store ephemeral metadata only in memory without MVCC

 I think that's not an option. That'd end up being a massive amount of
 duplication at a low rate of functionality.


I don't plan to implement a storage - I expect only few functions for
store/read data from session memory context



 I think it's more realistic way to implement is to have a separate
 'relpersistence' setting for global temp tables. The first access to
 such one in a session (or xact if truncate on commit) copies the table
 from the _init fork. By having the backend id in all filenames (besides
 the init fork) they're unique between sessions.


If I understand well, it is similar to my fast implementation from 2008. It
works partially,  because it doesn't solve other (session) property - like
relpages, reltuples and related data from pg_statistics


 Or something roughly like that.

 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Andres Freund
On 2015-02-02 12:24:44 +0100, Pavel Stehule wrote:
 2015-02-02 12:04 GMT+01:00 Andres Freund and...@2ndquadrant.com:
   2.b - using unlogged tables for holding statistics, relfilenode, and all
   necessary data
 
  I can't follow why that'd achieve anything?
 
 
 1. Main catalogue will be stable.
 2. There is not necessary to implement new storage and it can helps with
 transaction support.

The amount of complexity that'd be involved to store catalog data in a
separate relation around the caches and accesses would be significant. I
don't think that's a realistic option.

   3.c - store ephemeral metadata only in memory without MVCC
 
  I think that's not an option. That'd end up being a massive amount of
  duplication at a low rate of functionality.
 
 
 I don't plan to implement a storage - I expect only few functions for
 store/read data from session memory context

What does it have to do with temp tables then?

  I think it's more realistic way to implement is to have a separate
  'relpersistence' setting for global temp tables. The first access to
  such one in a session (or xact if truncate on commit) copies the table
  from the _init fork. By having the backend id in all filenames (besides
  the init fork) they're unique between sessions.
 
 
 If I understand well, it is similar to my fast implementation from 2008. It
 works partially,  because it doesn't solve other (session) property - like
 relpages, reltuples and related data from pg_statistics

I'm honestly not particularly concerned about that problem. For one, we
don't auto-analyze/vacuum temp tables. For another, it'd be
comparatively easy to gather reltuples/relpages/stats from session local
state if necessary. Those are all only accessed from a few places.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Pavel Stehule
2015-02-02 11:15 GMT+01:00 Pavel Stehule pavel.steh...@gmail.com:

 Hello

 Six years ago we did discuss about global temporary tables - persistent
 schema, ephemeral data.

 http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html

 I am thinking so some reasons why implement this feature are valid:

 * we can get some performance benefit against current temp tables - less
 the catalogue bloating,

 * we can simplify a static validation of plpgsql functions when temp
 tables are used,
   more global temp tables are little bit comfortable for developers,

 * we can simplify migration from some other databases, where global temp
 tables are default.


I forgot other possible benefit:

* using temp tables on slaves - (needs 3c implementation)



 Some was changed from 2009:

 * We have updatable CTE

 * We have unlogged tables

 Opened questions:

 1. Name and conformance with standard of this feature - because we doesn't
 support modules, a mapping ANSI - PG should not be trivial

 2. Implementation

 I see three possible ways how to implement it:

 2.a - using on demand created temp tables - most simple solution, but
 doesn't help with catalogue bloating

 2.b - using unlogged tables for holding statistics, relfilenode, and all
 necessary data

 3.c - store ephemeral metadata only in memory without MVCC

 Is there still interest about this feature?

 Comments, notes?

 Pavel



Re: [HACKERS] Implementation of global temporary tables?

2015-02-02 Thread Andres Freund
Hi,

On 2015-02-02 11:15:22 +0100, Pavel Stehule wrote:
 Six years ago we did discuss about global temporary tables - persistent
 schema, ephemeral data.
 
 http://postgresql.nabble.com/idea-global-temp-tables-td2007217.html
 
 I am thinking so some reasons why implement this feature are valid:
 
 * we can get some performance benefit against current temp tables - less
 the catalogue bloating,
 
 * we can simplify a static validation of plpgsql functions when temp tables
 are used,
   more global temp tables are little bit comfortable for developers,
 
 * we can simplify migration from some other databases, where global temp
 tables are default.

I agree that the feature would be interesting.

 2. Implementation
 
 I see three possible ways how to implement it:
 
 2.a - using on demand created temp tables - most simple solution, but
 doesn't help with catalogue bloating

Yea, that's no good.

 2.b - using unlogged tables for holding statistics, relfilenode, and all
 necessary data

I can't follow why that'd achieve anything?

 3.c - store ephemeral metadata only in memory without MVCC

I think that's not an option. That'd end up being a massive amount of
duplication at a low rate of functionality.


I think it's more realistic way to implement is to have a separate
'relpersistence' setting for global temp tables. The first access to
such one in a session (or xact if truncate on commit) copies the table
from the _init fork. By having the backend id in all filenames (besides
the init fork) they're unique between sessions.

Or something roughly like that.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers