Re: [HACKERS] TopPlan, again

2007-02-19 Thread Simon Riggs
On Sun, 2007-02-18 at 18:19 -0500, Tom Lane wrote:
> While thinking about having a centralized plan cache for managing plan
> invalidation, I got annoyed again about the fact that the executor needs
> access to the Query tree.  This means that we'll be storing *three*
> representations of any cached query: raw parsetree for possible
> regeneration, plus parsed Query tree and Plan tree.
...
> After looking over the code it seems that the executor needs a limited
> subset of the Query fields, namely
...
>   into
>   intoOptions
>   intoOnCommit (why is this separate from intoOptions?)
>   intoTableSpaceName
...
>
> which I think we should put into a new TopPlan node type.

All else sounds good, but why would we be caching a plan that used these
fields? Anybody re-executing a CREATE TABLE AS SELECT on the same table
isn't somebody we should be helping. ISTM that we'd be able to exclude
them from the TopPlan on that basis, possibly creating an Into node to
reduce the clutter.



A couple of incidental points on plan invalidation:
- We need to consider how the planner uses parameter values. Currently
the unnamed query uses the first set of bind parameters to plan the query.
With a central plan cache, that will definitely cause problems for
applications in which each session repeatedly re-specifies the same
parameter value, but the value differs across sessions. It sounds
bizarre, but assuming that all users of the same query want it optimised
the same way is not a good assumption in all cases. I'm completely in
favour of a centralized plan cache in all other ways...

- I'd like to make it impossible to re-plan the output columns of
queries with unspecified output columns, e.g. * or foo.*.
Re-planning them makes it possible for the result columns to change
during re-execution. I've never seen an application using dynamic
queries that allowed for the possibility that the result metadata might
change as we re-execute, and allowing it would seem likely to break more
applications than we'd really want. Fixing the output columns would also
allow us to remove the Metadata call from the v3 Protocol at Exec time,
as David Strong suggested last year on pgsql-jdbc.

- It would be good to allow for exec-time constraint exclusion, which
would allow caching plans that use CE and stable functions
(e.g. col > CURRENT_DATE). That may change the design, even though
that's not an 8.3 thing at all.
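
To illustrate the exec-time constraint exclusion idea, here is a minimal
sketch in C. Everything in it is an assumption made for illustration: the
Partition struct, the integer "day number" representation, and the function
names are hypothetical, and the predicate is taken to be "col > bound",
where bound stands for a stable value (such as CURRENT_DATE) that is only
known once execution starts. The cached plan keeps every child table, and
the pruning decision is made per execution.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical child table whose CHECK constraint is
 *     lo <= col AND col < hi
 * with col expressed as an integer day number.
 */
typedef struct Partition
{
    const char *name;
    int         lo;
    int         hi;
} Partition;

/*
 * True if no value allowed by the partition's constraint can also
 * satisfy "col > bound".  The largest allowed value is hi - 1.
 */
bool
excluded_by_gt(const Partition *p, int bound)
{
    return p->hi - 1 <= bound;
}

/*
 * Run once per execution, after the stable expression has been
 * evaluated to 'bound'.  Stores the partitions that still need to be
 * scanned into 'out' (which must have room for n entries) and returns
 * how many were kept.
 */
size_t
prune_at_exec(const Partition *parts, size_t n, int bound,
              const Partition **out)
{
    size_t      kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (!excluded_by_gt(&parts[i], bound))
            out[kept++] = &parts[i];
    }
    return kept;
}
```

Because the bound is an input to prune_at_exec rather than something baked
into the plan, the same cached plan can be reused on different days and
still skip the partitions that cannot match.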

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] TopPlan, again

2007-02-19 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
>> After looking over the code it seems that the executor needs a limited
>> subset of the Query fields, namely
>> ...
>> which I think we should put into a new TopPlan node type.
>
> All else sounds good, but why would we be caching a plan that used these
> fields?

Um, what's your point?  I certainly have no desire to support two
different Executor APIs depending on whether we think the command might
be worth caching or not.

regards, tom lane



[HACKERS] TopPlan, again

2007-02-18 Thread Tom Lane
While thinking about having a centralized plan cache for managing plan
invalidation, I got annoyed again about the fact that the executor needs
access to the Query tree.  This means that we'll be storing *three*
representations of any cached query: raw parsetree for possible
regeneration, plus parsed Query tree and Plan tree.

We've repeatedly discussed getting rid of execution-time access to the
Query structure --- here's one old message about it:
http://archives.postgresql.org/pgsql-hackers/1999-02/msg00388.php
and here's a recent one:
http://archives.postgresql.org/pgsql-hackers/2006-08/msg00734.php
I think it's time to bite the bullet and do that.

After looking over the code it seems that the executor needs a limited
subset of the Query fields, namely

commandType
canSetTag
rtable
returningList
returningLists
into
intoOptions
intoOnCommit (why is this separate from intoOptions?)
intoTableSpaceName
rowMarks
resultRelation
resultRelations
nParamExec (currently in topmost Plan node)

which I think we should put into a new TopPlan node type.
returningLists and resultRelations could be removed from Query;
also, we might need only the list forms and not the singleton
returningList/resultRelation fields in TopPlan.
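
As a rough illustration of the list above, such a node might look something
like the following C sketch. This is not an actual PostgreSQL definition:
List, Plan and RangeVar are reduced to opaque stand-ins, CmdType is a
cut-down copy, the field types are modelled on the corresponding Query
fields, the planTree member and the topplan_init helper are assumptions
added for illustration, and only the list forms of resultRelations and
returningLists are kept.

```c
#include <stdbool.h>
#include <stddef.h>

/* Opaque stand-ins for the real types; only pointers to them are used. */
typedef struct List List;
typedef struct Plan Plan;
typedef struct RangeVar RangeVar;

/* Cut-down version of the command type enum. */
typedef enum CmdType
{
    CMD_UNKNOWN,
    CMD_SELECT,
    CMD_UPDATE,
    CMD_INSERT,
    CMD_DELETE
} CmdType;

typedef enum OnCommitAction
{
    ONCOMMIT_NOOP,
    ONCOMMIT_PRESERVE_ROWS,
    ONCOMMIT_DELETE_ROWS,
    ONCOMMIT_DROP
} OnCommitAction;

/* Everything the executor needs, without a reference to the Query tree. */
typedef struct TopPlan
{
    CmdType     commandType;
    bool        canSetTag;
    Plan       *planTree;           /* assumption: link to the topmost Plan */
    List       *rtable;
    List       *resultRelations;    /* list form only */
    List       *returningLists;     /* list form only */
    RangeVar   *into;
    List       *intoOptions;
    OnCommitAction intoOnCommit;
    char       *intoTableSpaceName;
    List       *rowMarks;
    int         nParamExec;         /* moved here from the topmost Plan node */
} TopPlan;

/* Hypothetical helper: zero the node, then fill in the basics. */
void
topplan_init(TopPlan *tp, CmdType cmd, Plan *plan)
{
    *tp = (TopPlan) {0};
    tp->commandType = cmd;
    tp->canSetTag = true;
    tp->planTree = plan;
    tp->intoOnCommit = ONCOMMIT_NOOP;
}
```

Note that the struct does not begin with a Plan, so code that walks plan
trees could not treat it as just another Plan node.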

The other big problem is the rangetable (rtable): currently it contains
Query trees for subqueries (including views) so unless we clean that up
we aren't going to be all that far ahead in terms of reducing the
overhead.  I'm envisioning creating a compact rangetable entry struct
with just the fields the executor needs:

rtekind
relid
eref (might only need the table alias name, not the column names)
requiredPerms
checkAsUser

and flattening subquery rangetables into the main list, so that there's
just one list and rangetable indexes are unique throughout a plan tree.
That will allow subqueries to execute with the same EState as the main
query and thus simplify nodeSubplan and nodeSubqueryScan.  This list
will also provide a simple way for the plan cache module to know which
relations to lock before determining whether the plan has been invalidated.
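
A minimal sketch of the flattening idea in C follows. It is a toy under
stated assumptions rather than a proposed implementation: a fixed-size
array stands in for a List, RTEKind is cut down to two values, eref is
reduced to an alias name, and the FlatRangeTable type and the
flat_rtable_* functions are hypothetical names. ExecRangeTblEntry is used
here only as a placeholder name for the compact struct.

```c
#include <stddef.h>

typedef unsigned int Oid;

typedef enum RTEKind
{
    RTE_RELATION,
    RTE_SUBQUERY
} RTEKind;

/* Compact entry holding only the fields listed above. */
typedef struct ExecRangeTblEntry
{
    RTEKind     rtekind;
    Oid         relid;          /* meaningful for RTE_RELATION only */
    const char *aliasname;      /* stands in for eref */
    unsigned int requiredPerms;
    Oid         checkAsUser;
} ExecRangeTblEntry;

#define MAX_RTES 64

typedef struct FlatRangeTable
{
    ExecRangeTblEntry entries[MAX_RTES];
    int         nentries;
} FlatRangeTable;

/*
 * Append a (sub)query's rangetable to the flat list.  Returns the offset
 * that must be added to that query's 1-based rangetable indexes so they
 * stay unique across the whole plan tree, or -1 if there is no room.
 */
int
flat_rtable_append(FlatRangeTable *flat, const ExecRangeTblEntry *rtes, int n)
{
    int         offset = flat->nentries;

    if (n < 0 || offset + n > MAX_RTES)
        return -1;
    for (int i = 0; i < n; i++)
        flat->entries[offset + i] = rtes[i];
    flat->nentries += n;
    return offset;
}

/* Fetch by 1-based index, as rangetable indexes are 1-based. */
const ExecRangeTblEntry *
flat_rtable_fetch(const FlatRangeTable *flat, int rti)
{
    if (rti < 1 || rti > flat->nentries)
        return NULL;
    return &flat->entries[rti - 1];
}

/*
 * Collect the OIDs of all plain relations: the set a plan cache would
 * lock before checking whether the plan is still valid.
 */
int
flat_rtable_relids(const FlatRangeTable *flat, Oid *out, int max)
{
    int         n = 0;

    for (int i = 0; i < flat->nentries && n < max; i++)
    {
        if (flat->entries[i].rtekind == RTE_RELATION)
            out[n++] = flat->entries[i].relid;
    }
    return n;
}
```

With one flat list, a subquery's index 1 simply becomes 1 + offset, and a
single pass over the list yields every relation that needs locking.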

Comments, objections?  Also, any thoughts about the names to use for
these new node types?  As I commented last year, I'm not completely
happy with TopPlan because it won't actually be a subtype of Plan,
but I don't have a better idea.  Also I'm unsure what to call the
cut-down RangeTblEntry struct; maybe RunTimeRangeTblEntry?

regards, tom lane



Re: [HACKERS] TopPlan, again

2007-02-18 Thread Gavin Sherry
On Sun, 18 Feb 2007, Tom Lane wrote:

> We've repeatedly discussed getting rid of execution-time access to the
> Query structure --- here's one old message about it:
> http://archives.postgresql.org/pgsql-hackers/1999-02/msg00388.php
> and here's a recent one:
> http://archives.postgresql.org/pgsql-hackers/2006-08/msg00734.php
> I think it's time to bite the bullet and do that.

Great.

> The other big problem is the rangetable (rtable): currently it contains
> Query trees for subqueries (including views) so unless we clean that up
> we aren't going to be all that far ahead in terms of reducing the
> overhead.  I'm envisioning creating a compact rangetable entry struct
> with just the fields the executor needs:
>
>   rtekind
>   relid
>   eref (might only need the table alias name, not the column names)
>   requiredPerms
>   checkAsUser
>
> and flattening subquery rangetables into the main list, so that there's
> just one list and rangetable indexes are unique throughout a plan tree.
> That will allow subqueries to execute with the same EState as the main
> query and thus simplify nodeSubplan and nodeSubqueryScan.  This list
> will also provide a simple way for the plan cache module to know which
> relations to lock before determining whether the plan has been invalidated.

Cool.

> Comments, objections?  Also, any thoughts about the names to use for
> these new node types?  As I commented last year, I'm not completely
> happy with TopPlan because it won't actually be a subtype of Plan,
> but I don't have a better idea.  Also I'm unsure what to call the
> cut-down RangeTblEntry struct; maybe RunTimeRangeTblEntry?

I think TopPlan is misleading. What about MetaPlan instead of TopPlan? I
think RunTimeRangeTblEntry is okay, though long. ExecRangeTblEntry?

Thanks,

Gavin



Re: [HACKERS] TopPlan, again

2007-02-18 Thread Mark Kirkwood

Gavin Sherry wrote:

> On Sun, 18 Feb 2007, Tom Lane wrote:
>
>> Comments, objections?  Also, any thoughts about the names to use for
>> these new node types?  As I commented last year, I'm not completely
>> happy with TopPlan because it won't actually be a subtype of Plan,
>> but I don't have a better idea.  Also I'm unsure what to call the
>> cut-down RangeTblEntry struct; maybe RunTimeRangeTblEntry?
>
> I think TopPlan is misleading. What about MetaPlan instead of TopPlan? I
> think RunTimeRangeTblEntry is okay, though long. ExecRangeTblEntry?

Would ExecPlan be better? - matches ExecRangeTblEntry.

Cheers

Mark



Re: [HACKERS] TopPlan, again

2007-02-18 Thread Tom Lane
Mark Kirkwood [EMAIL PROTECTED] writes:
> Gavin Sherry wrote:
>> On Sun, 18 Feb 2007, Tom Lane wrote:
>>> Comments, objections?  Also, any thoughts about the names to use for
>>> these new node types?  As I commented last year, I'm not completely
>>> happy with TopPlan because it won't actually be a subtype of Plan,
>>> but I don't have a better idea.  Also I'm unsure what to call the
>>> cut-down RangeTblEntry struct; maybe RunTimeRangeTblEntry?
>>
>> I think TopPlan is misleading. What about MetaPlan instead of TopPlan? I
>> think RunTimeRangeTblEntry is okay, though long. ExecRangeTblEntry?
>
> Would ExecPlan be better? - matches ExecRangeTblEntry.

Neither of these seems to answer my worry that the node isn't a
subtype of Plan.

One thought is that in some contexts this node type will probably appear
in lists that might also contain utility statement nodes.  (Currently,
we represent such lists as Query lists that might or might not have
utilityStmt set, but I don't want a utilityStmt field in this node
type.)  So maybe we should pick something based off statement.
Perhaps PlannedStmt or ExecutableStmt?
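
To illustrate the mixed-list point, here is a minimal C sketch of a list
whose elements are either plannable statements or utility statements,
distinguished by a leading node tag instead of a utilityStmt field. The
names are assumptions for illustration only: PlannedStmt is just one of the
candidate names above, and UtilityStmt, Route and route_statement are
hypothetical.

```c
#include <stddef.h>

/* Every node starts with a tag, so a list can hold mixed node types. */
typedef enum NodeTag
{
    T_PlannedStmt,
    T_UtilityStmt
} NodeTag;

typedef struct Node
{
    NodeTag     type;
} Node;

/* Stand-in for the new executor-facing node; body elided to one field. */
typedef struct PlannedStmt
{
    NodeTag     type;
    int         commandType;
} PlannedStmt;

/* Hypothetical stand-in for any utility statement node. */
typedef struct UtilityStmt
{
    NodeTag     type;
    const char *commandName;
} UtilityStmt;

#define nodeTag(nodeptr) (((const Node *) (nodeptr))->type)

typedef enum Route
{
    ROUTE_EXECUTOR,
    ROUTE_UTILITY
} Route;

/* Decide where a list element goes by looking only at its tag. */
Route
route_statement(const Node *stmt)
{
    return nodeTag(stmt) == T_PlannedStmt ? ROUTE_EXECUTOR : ROUTE_UTILITY;
}

/* Count the elements of a mixed list that would go to the executor. */
int
count_planned(Node *const *stmts, size_t n)
{
    int         count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (route_statement(stmts[i]) == ROUTE_EXECUTOR)
            count++;
    }
    return count;
}
```

Since the tag alone identifies the element, the planned-statement node
needs no field that exists only to carry a utility statement.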

ExecRangeTblEntry sounds good to me for the other thing.

regards, tom lane



Re: [HACKERS] TopPlan, again

2007-02-18 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

> Comments, objections?  Also, any thoughts about the names to use for
> these new node types?  As I commented last year, I'm not completely
> happy with TopPlan because it won't actually be a subtype of Plan,
> but I don't have a better idea.  Also I'm unsure what to call the
> cut-down RangeTblEntry struct; maybe RunTimeRangeTblEntry?

My only thought is that I suspect this will somehow relate to the CTE stuff I
was doing for recursive queries. I'm not exactly clear how yet, though.

I think this has more to do with the RangeTable stuff than the TopPlan, though.
I was probably going to need a new kind of RangeTable entry representing a
subquery that is a reference to a CTE rather than a separate subquery.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
