Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] [HACKERS] 8.2 features?)

2006-08-01 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 Is this intentional:

 template1=# values(1), (2);
  column1
 -
1
2
 (2 rows)

You bet.  VALUES is parallel to SELECT in the SQL grammar, so AFAICS
it should be legal anywhere you can write SELECT.

The basic productions in the spec's grammar are respectively

 query specification ::=
  SELECT [ set quantifier ] select list
table expression

and

 table value constructor ::=
  VALUES row value expression list

and both of them link into the rest of the grammar here:

 simple table ::=
query specification
  | table value constructor
  | explicit table

There is no construct I can find in the spec grammar that allows
query specification but not table value constructor.  QED.

Try some stuff like
DECLARE c CURSOR FOR VALUES ...
WHERE foo IN (VALUES ...


regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: Values list-of-targetlists patch for comments (was Re: [PATCHES] [HACKERS] 8.2 features?)

2006-07-24 Thread Tom Lane
Joe Conway [EMAIL PROTECTED] writes:
 Good feedback -- thanks! But without the RTE, how would VALUES in the 
 FROM clause work?

Is it different from INSERT?  I'm just imagining a Values node in
the jointree and nothing in the rangetable.

If I'm reading the spec correctly, VALUES is exactly parallel to SELECT
in the grammar, which means that to use it in FROM you would need
parentheses and an alias:

SELECT ... FROM (SELECT ...) AS foo

SELECT ... FROM (VALUES ...) AS foo

ISTM that this should be represented using an RTE_SUBQUERY node in the
outer query; the alias attaches to that node, not to the VALUES itself.
So I don't think you need that alias field in the jointree entry either.

If we stick with the plan of representing VALUES as if it were SELECT *
FROM (valuesnode), then this approach would make the second query above
have a structure like

Query
  .rtable -RTE_SUBQUERY
  .subquery -  Query
  .jointree -  Values

(leaving out a ton of detail of course, but those are the key nodes).

To get this to reverse-list in the expected form, we'd need a small
kluge in ruleutils.c that short-circuits the display of SELECT
... FROM etc when it sees a Values node at the top of the jointree.
This seems like a fairly small price to pay for keeping Query in
approximately its present form, though.

One thought is that we might allow Query.jointree to point to either
a FromExpr or a Values node, and disallow Values from appearing further
down in the jointree (except perhaps after flattening of subqueries
in the planner).  The alternative is that there's a FromExpr atop
the Values node in the jointree even in the simple case; which seems
uglier but it might avoid breaking some code that expects the top level
to always be FromExpr.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Values list-of-targetlists patch for comments (was Re: [PATCHES] [HACKERS] 8.2 features?)

2006-07-23 Thread Joe Conway

Tom Lane wrote:

Joe Conway [EMAIL PROTECTED] writes:

I'm liking this too. But when you say jointree node, are you saying to 
model the new node type after NestLoop/MergeJoin/HashJoin nodes? These 
are referred to as join nodes in ExecInitNode. Or as you mentioned a 
couple of times, should this look more like an Append node?



No, I guess I confused you by talking about the executor representation
at the same time.  This is really unrelated to the executor.  The join
tree I'm thinking of here is the data structure that dangles off
Query.jointree --- it's a representation of the query's FROM clause,
and (at present) can contain RangeTblRef, FromExpr, and JoinExpr nodes.
See the last hundred or so lines of primnodes.h for some details.
The jointree is used by the planner to compute the plan node tree that
the executor will run, but it's not the same thing.

There are basically two ways you could go about this:
1. Make a new jointree leaf node type to represent a VALUES construct,
   and dangle the list of lists of expressions off that.
2. Make a new RangeTblEntry type to represent a VALUES construct, and
   just put a RangeTblRef to it into the jointree.  The expressions
   dangle off the RangeTblEntry.

Offhand I'm not certain which of these would be cleanest.  The second
way has some similarities to the way we handle set operation trees
(UNION et al), so it might be worth looking at that stuff.  However,
being a RangeTblEntry has a lot of baggage (eg, various routines expect
to find an RTE alias, column names, column types, etc) and maybe we
don't need all that for VALUES.


Since the feature freeze is only about a week off, I wanted to post this 
patch even though it is not yet ready to be applied.


Executive summary:
==
1. The patch is now large and invasive based on adding new node
   types and associated infrastructure. I modelled the nodes largely
   on RangeFunction and FunctionScan.
2. Performance is close enough to mysql to not be a big issue (I think,
   more data below) as long as the machine does not get into a memory
   swapping regime. Memory usage is now better, but not as good as
   mysql.
3. I specifically coded with the intent of preserving current insert
   statement behavior and code paths for current functionality. So there
   *should* be no performance degradation or subtle semantics changes
   for INSERT DEFAULT VALUES, INSERT ... VALUES (with one target
   list), INSERT ... SELECT  Even Tom's recently discovered
   insert into foo values (tenk1.*) still works ;-)

Performance:

On my development machine (dual core amd64, 2GB RAM) I get the following 
results using the php script posted earlier:


Postgres:
-
$loopcount = 10;
multi-INSERT-at-once Elapsed time is 1 second

$loopcount = 30;
multi-INSERT-at-once Elapsed time is 5 seconds

$loopcount = 50;
multi-INSERT-at-once Elapsed time is 9 seconds

$loopcount = 80;
multi-INSERT-at-once Elapsed time is 14 seconds

$loopcount = 90;
multi-INSERT-at-once Elapsed time is 17 seconds

$loopcount = 100;
multi-INSERT-at-once Elapsed time is 42 seconds

$loopcount = 200;
killed after 5 minutes due to swapping

MySQL:
--
$loopcount = 10;
multi-INSERT-at-once Elapsed time is 2 seconds

$loopcount = 30;
INSERT failed:Got a packet bigger than 'max_allowed_packet' bytes
changed max_allowed_packet=64M
multi-INSERT-at-once Elapsed time is 5 seconds

$loopcount = 50;
multi-INSERT-at-once Elapsed time is 8 seconds

$loopcount = 80;
multi-INSERT-at-once Elapsed time is 13 seconds

$loopcount = 90;
multi-INSERT-at-once Elapsed time is 15 seconds

$loopcount = 100;
multi-INSERT-at-once Elapsed time is 17 seconds

$loopcount = 200;
multi-INSERT-at-once Elapsed time is 36 seconds

$loopcount = 300;
multi-INSERT-at-once Elapsed time is 54 seconds

$loopcount = 400;
multi-INSERT-at-once Elapsed time is 134 seconds

table value constructor:
==
Included in this patch is support for table value constructor in the 
FROM clause, e.g.:


regression=# select * from {values (1,array[1,2]),(2,array[3,4])};
 ?column? | array
--+---
1 | {1,2}
2 | {3,4}
(2 rows)

The strange syntax is a temporary hack to eliminate shift/reduce 
conflicts. I'm not entirely sure we want to try to support this (or 
something like it) for 8.2, but much of what is needed is now readily 
available. More on known issues next.


Known Issues:
=

General:

1. Several comments in the patch are marked FIXME. These are areas
   where I was uncertain what was the right thing to do. Any advice
   on these specific spots would be very much appreciated.
2. I broke the rules regression test -- still need to look at what I
   did to mess that up. Somewhere in the reconstruction of VALUES ...
   according to the diff.

VALUES multi-targetlist INSERTS:

3. Not yet quite sure how to get DEFAULT