Re: [HACKERS] TODO idea - implicit constraints across child tables with a common column as primary key (but obviously not a shared index)

2007-04-24 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

> Gregory Stark [EMAIL PROTECTED] writes:
>> The main data from the statistics that's of interest here are the extreme
>> values of the histogram. If we're not interested in any values in that range
>> then we can exclude the partition entirely.
>
> Except that there is *no* guarantee that the histogram includes the
> extreme values --- to promise that would require ANALYZE to scan every
> table row.

That's why I said:

  a subsequent VACUUM ANALYZE could mark the resulting statistics as
  authoritative

Not just plain analyze.

There's another issue here too. One of the other motivations is to be able to
put read-only tables on read-only media. Doing that would require freezing
every tuple, which at the very least means looking at every tuple. (It would
also mean waiting until all tuples are freezable.)

So there's a natural step in which to gather these authoritative statistics
anyway.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] TODO idea - implicit constraints across child tables with a common column as primary key (but obviously not a shared index)

2007-04-23 Thread Gregory Stark

Andrew Hammond [EMAIL PROTECTED] writes:

> If you have a table with a bunch of children, and these children all
> have a primary key which is generated from the same sequence, assuming
> that you're partitioning based on date (ie, this is a transaction
> record table), it would be nice if the planner could spot that all
> tables have a primary key on a column used as a join condition, check
> the min / max to see if there is overlap between tables, then apply
> CBE as if constraints existed.

The problem is that it's not really true that sequences and time move
together. It's quite possible to have two transactions which both start just
before the date-based partition cutoff but have one land in each partition
with the greater sequence number landing in the old partition.

It would be rare (but still possible) if you always insert using quick
autocommitted inserts with nextval() in a values list. But it would be quite
likely if you use one of the other coding styles such as doing one query to
look up the nextval() and then doing various inserts based on that value in
multiple statements within a single transaction.
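
To make the race concrete, here is a minimal sketch of it in Python rather than
SQL. Everything in it is an assumption for illustration: the cutoff date, the
nextval() stand-in for a real sequence, and the "old"/"new" partition names.
Transaction A fetches its sequence value first but inserts after the cutoff;
transaction B fetches later and inserts before it.

```python
from datetime import datetime
from itertools import count

CUTOFF = datetime(2007, 5, 1)  # assumed boundary between the two partitions

_sequence = count(1)  # hypothetical stand-in for a database sequence


def nextval():
    """Hand out the next sequence value, independent of insert time."""
    return next(_sequence)


def partition_for(ts):
    """Route a row by its timestamp, as a date-partitioned setup would."""
    return "old" if ts < CUTOFF else "new"


partitions = {"old": [], "new": []}


def insert(row_id, ts):
    partitions[partition_for(ts)].append(row_id)


# Two ordinary rows well before the cutoff.
insert(nextval(), datetime(2007, 4, 30, 12, 0))
insert(nextval(), datetime(2007, 4, 30, 13, 0))

# Both transactions start just before the cutoff; A fetches its id first.
id_a = nextval()  # 3
id_b = nextval()  # 4

# B inserts right away, still before the cutoff.
insert(id_b, datetime(2007, 4, 30, 23, 59, 59))
# A does other work first and only inserts after the cutoff.
insert(id_a, datetime(2007, 5, 1, 0, 0, 1))

# A later row that belongs to the new partition.
insert(nextval(), datetime(2007, 5, 2, 9, 0))

old_range = (min(partitions["old"]), max(partitions["old"]))
new_range = (min(partitions["new"]), max(partitions["new"]))
ranges_overlap = old_range[1] > new_range[0]
print(old_range, new_range, ranges_overlap)
```

The old partition ends up holding id 4 while the new one holds id 3, so the
min/max ranges of the two partitions overlap even though every row was routed
correctly by date.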

What I've been considering instead is using the statistics. If we provided a
way to mark partitions read-only, then once a table (or partition) is marked,
a subsequent VACUUM ANALYZE could mark the resulting statistics as
authoritative. Now that we have plan invalidation, we could use this kind of
information in planning.

The main data from the statistics that's of interest here are the extreme
values of the histogram. If we're not interested in any values in that range
then we can exclude the partition entirely.

This has a number of nice properties. It requires little additional work for
the DBA and read-only is a nice simple concept for a DBA to understand. It's
even a useful feature for other purposes. It also can catch a lot more cases
than the one you describe. In particular it would eliminate the parent table
if it has no rows, which gives us a chance to eliminate the Append node
altogether.
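
A rough sketch of the exclusion test I have in mind, written in Python only to
show the logic. None of this is existing planner code: PartitionStats, its
fields, and the function names are hypothetical, and it assumes a simple
"lo <= key <= hi" range condition on a single integer column.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionStats:
    """Hypothetical per-table statistics record (not a real catalog)."""
    name: str
    read_only: bool
    authoritative: bool        # set by a full scan after marking read-only
    row_count: int
    min_value: Optional[int]   # None when the table has no rows
    max_value: Optional[int]


def can_exclude(stats: PartitionStats, lo: int, hi: int) -> bool:
    """True only if no row can possibly satisfy lo <= key <= hi."""
    if not (stats.read_only and stats.authoritative):
        return False  # sampled or stale statistics prove nothing
    if stats.row_count == 0:
        return True   # e.g. an empty parent table
    return stats.max_value < lo or stats.min_value > hi


def partitions_to_scan(all_stats: List[PartitionStats],
                       lo: int, hi: int) -> List[str]:
    return [s.name for s in all_stats if not can_exclude(s, lo, hi)]


tables = [
    PartitionStats("parent", True, True, 0, None, None),
    PartitionStats("p2006", True, True, 1000, 1, 1000),
    PartitionStats("p2007q1", True, True, 1000, 1001, 2000),
    PartitionStats("current", False, False, 500, 2001, 2500),
]

# The writable partition is always kept because its stats are not trusted.
with_current = partitions_to_scan(tables, 1500, 1600)

# With only read-only tables, a single survivor needs no Append node.
read_only_only = partitions_to_scan(tables[:3], 500, 600)
print(with_current, read_only_only)
```

The important design point is the first check: the min/max comparison is only
allowed when the statistics are both exact and guaranteed not to change.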

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] TODO idea - implicit constraints across child tables with a common column as primary key (but obviously not a shared index)

2007-04-23 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
> The main data from the statistics that's of interest here are the extreme
> values of the histogram. If we're not interested in any values in that range
> then we can exclude the partition entirely.

Except that there is *no* guarantee that the histogram includes the
extreme values --- to promise that would require ANALYZE to scan every
table row.
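
As a toy illustration of the point (not ANALYZE's actual algorithm): the
sketch below builds equi-depth histogram bounds from a sample of the rows.
The sample_every() helper is a deterministic stand-in, assumed here in place
of real random sampling, and the table is just the integers 1 to 10000.

```python
def sample_every(rows, step):
    """Deterministic stand-in for a random row sample: one row per step."""
    return rows[step // 2::step]


def histogram_bounds(sample, buckets):
    """Equi-depth bucket boundaries taken from the sorted sample."""
    ordered = sorted(sample)
    last = len(ordered) - 1
    return [ordered[i * last // buckets] for i in range(buckets + 1)]


rows = list(range(1, 10001))        # the "table": keys 1..10000
sample = sample_every(rows, 100)    # 100 sampled rows
bounds = histogram_bounds(sample, 10)

true_min, true_max = min(rows), max(rows)
hist_min, hist_max = bounds[0], bounds[-1]
print(true_min, true_max, hist_min, hist_max)
```

Here the histogram runs from 51 to 9951 while the table runs from 1 to 10000,
so a query for key = 10000 would be wrongly excluded if the histogram ends
were treated as hard bounds.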

regards, tom lane
